path: root/drivers/acpi/processor_idle.c
Age | Commit message | Author | Files | Lines
2009-04-05 | Merge branch 'pmtimer-overflow' into release | Len Brown | 1 | -36/+27
2009-04-03 | ACPI: Remove R40e c-state blacklist | Thomas Renninger | 1 | -51/+0

The recent ACPICA patch (ACPICA: FADT: Favor 32-bit register addresses for compatibility) makes the machine use the right FADT HW addresses, and C-states now work fine.

http://bugzilla.kernel.org/show_bug.cgi?id=8246

Signed-off-by: Thomas Renninger <trenn@suse.de>
Tested-by: Mark Doughty <me@markdoughty.co.uk>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-03-26 | ACPICA: Rename ACPI bit register access functions | Bob Moore | 1 | -5/+5

Rename acpi_get_register and acpi_set_register to clarify the purpose of these functions. New names are acpi_read_bit_register and acpi_write_bit_register.

Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-03-26 | ACPICA: Optimize ACPI register locking | Bob Moore | 1 | -1/+1

Removed locking for reads from the ACPI bit registers in PM1 Status, Enable, Control, and PM2 Control. The lock is not required when reading the single-bit registers. The acpi_get_register_unlocked function is no longer needed and has been removed. This will improve performance for reads on these registers. ACPICA BZ 760.

http://www.acpica.org/bugzilla/show_bug.cgi?id=760

Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-03-17 | acpi: fix of pmtimer overflow that make Cx states time incorrect | alex.shi | 1 | -36/+27

We found abnormal Cx state times on some of our machines with 16 logical CPUs: C0 takes too much time while the system is really idle when the kernel has tickless and highres enabled. The powertop output is below:

    PowerTOP version 1.9       (C) 2007 Intel Corporation

    Cn                Avg residency       P-states (frequencies)
    C0 (cpu running)        (40.5%)         2.53 Ghz     0.0%
    C1                0.0ms ( 0.0%)         2.53 Ghz     0.0%
    C2              128.8ms (59.5%)         2.40 Ghz     0.0%
                                            1.60 Ghz   100.0%

    Wakeups-from-idle per second :  4.7     interval: 20.0s
    no ACPI power usage estimate available

    Top causes for wakeups:
      41.4% ( 24.9)       <interrupt> : extra timer interrupt
      20.2% ( 12.2)     <kernel core> : usb_hcd_poll_rh_status (rh_timer_func)

After tracking this issue down in detail, Yakui and I found that it is due to the 24-bit PM timer overflowing when a CPU sleeps for more than 4 seconds. With a tickless kernel, the CPU wants to sleep as long as possible when the system is idle, but the Cx sleep time is recorded by the pmtimer, whose width is determined by the BIOS.

The current Cx time is obtained by the following function from drivers/acpi/processor_idle.c:

    static inline u32 ticks_elapsed(u32 t1, u32 t2)
    {
            if (t2 >= t1)
                    return (t2 - t1);
            else if (!(acpi_gbl_FADT.flags & ACPI_FADT_32BIT_TIMER))
                    return (((0x00FFFFFF - t1) + t2) & 0x00FFFFFF);
            else
                    return ((0xFFFFFFFF - t1) + t2);
    }

If the pmtimer is 24 bits wide and 5 seconds elapse between t1 and t2, the function above records only the ticks left over after the wrap (less than 1 second's worth), so the Cx time is reduced by more than 4 seconds. This is why we see the powertop output above.

To resolve this problem, Yakui and I use ktime_get() instead of the PM timer to record the Cx state times, as in the following patch. The patch was tested in i386 and x86_64 modes on several platforms.

Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Tested-by: Alex Shi <alex.shi@intel.com>
Signed-off-by: Alex Shi <alex.shi@intel.com>
Signed-off-by: Yakui.zhao <yakui.zhao@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
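The under-count described in this commit message can be reproduced in user space. The following is a minimal sketch, not kernel code: ticks_elapsed_24() copies only the 24-bit branch of the quoted function, while pm_timer_after() and ticks_to_us() are hypothetical helpers added for illustration. The 3.579545 MHz rate is the ACPI PM timer frequency.

```c
#include <stdint.h>

/* ACPI PM timer rate in Hz (3.579545 MHz). */
#define PM_TIMER_HZ      3579545ULL
/* Mask for a 24-bit PM timer counter. */
#define PM_TIMER_MASK_24 0x00FFFFFFu

/* User-space copy of the 24-bit branch of ticks_elapsed() quoted above. */
static uint32_t ticks_elapsed_24(uint32_t t1, uint32_t t2)
{
    if (t2 >= t1)
        return t2 - t1;
    return ((PM_TIMER_MASK_24 - t1) + t2) & PM_TIMER_MASK_24;
}

/*
 * Hypothetical helper: the value a 24-bit counter shows 'usecs'
 * microseconds after it showed 'start'. Wraps are silently lost,
 * exactly as they are for the real hardware counter.
 */
static uint32_t pm_timer_after(uint32_t start, uint64_t usecs)
{
    uint64_t ticks = usecs * PM_TIMER_HZ / 1000000ULL;
    return (uint32_t)((start + ticks) & PM_TIMER_MASK_24);
}

/* Hypothetical helper: convert PM timer ticks to microseconds. */
static uint64_t ticks_to_us(uint32_t ticks)
{
    return (uint64_t)ticks * 1000000ULL / PM_TIMER_HZ;
}
```

Calling ticks_elapsed_24(0, pm_timer_after(0, 5000000)) models a 5 second sleep. The counter wraps once after 2^24 ticks (about 4.69 seconds), so only the remainder is reported, which is why a monotonic clock such as ktime_get() is the safer source for long idle periods.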
2009-02-06 | ACPI: delete CPU_IDLE=n code | Len Brown | 1 | -608/+0

CPU_IDLE=y has been default for ACPI=y since Nov-2007, and has shipped in many distributions since then. Here we delete the CPU_IDLE=n ACPI idle code, since nobody should be using it, and we don't want to maintain two versions.

Signed-off-by: Len Brown <len.brown@intel.com>
2009-01-28 | ACPI: remove BM_RLD access from idle entry path | Len Brown | 1 | -48/+9

It is true that BM_RLD needs to be set to enable bus master activity to wake an older chipset (e.g. PIIX4) from C3. This is contrary to the erroneous wording in the ACPI 2.0 and 3.0 specifications, which suggests that BM_RLD is an indicator rather than a control bit. ACPI 1.0's correct wording should be restored in ACPI 4.0:

http://www.acpica.org/bugzilla/show_bug.cgi?id=689

But the kernel should not have to clear BM_RLD when entering a non C3-type state just to set it again when entering a C3-type C-state. We should be able to set BM_RLD at boot time and leave it alone -- removing the overhead of accessing this IO register from the idle entry path.

Signed-off-by: Len Brown <len.brown@intel.com>
2009-01-28 | ACPI: remove locking from PM1x_STS register reads | Len Brown | 1 | -2/+2

PM1a_STS and PM1b_STS are twins that get OR'd together on reads, and all writes are repeated to both. The fields in PM1x_STS are single bits only, there are no multi-bit fields. So it is not necessary to lock PM1x_STS reads against writes because it is impossible to read an intermediate value of a single bit. It will either be 0 or 1, even if a write is in progress during the read. Reads are asynchronous to writes no matter if a lock is used or not.

Signed-off-by: Len Brown <len.brown@intel.com>
2009-01-06 | remove linux/hardirq.h from asm-generic/local.h | Russell King | 1 | -0/+1

While looking at reducing the amount of architecture namespace pollution in the generic kernel, I found that asm/irq.h is included in the vast majority of compilations on ARM (around 650 files.)

Since asm/irq.h includes a sub-architecture include file on ARM, this causes a negative impact on the ccache's ability to re-use the build results from other sub-architectures, so we have a desire to reduce the dependencies on asm/irq.h.

It turns out that a major cause of this is the needless include of linux/hardirq.h into asm-generic/local.h. The patch below removes this include, resulting in some 250 to 300 files (around half) of the kernel then omitting asm/irq.h.

My test builds still succeed, provided two ARM files are fixed (arch/arm/kernel/traps.c and arch/arm/mm/fault.c) - so there may be negative impacts for this on other architectures.

Note that x86 does not include asm/irq.h nor linux/hardirq.h in its asm/local.h, so this patch can be viewed as bringing the generic version into line with the x86 version.

[kosaki.motohiro@jp.fujitsu.com: add #include <linux/irqflags.h> to acpi/processor_idle.c]
[adobriyan@gmail.com: fix sparc64]
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-16 | x86: support always running TSC on Intel CPUs | Venki Pallipadi | 1 | -3/+3

Impact: reward non-stop TSCs with good TSC-based clocksources, etc.

Add support for CPUID_0x80000007_Bit8 on Intel CPUs as well. This bit means that the TSC is invariant with C/P/T states and always runs at constant frequency.

With Intel CPUs, we have 3 classes:
* CPUs where TSC runs at constant rate and does not stop in C-states
* CPUs where TSC runs at constant rate, but will stop in deep C-states
* CPUs where TSC rate will vary based on P/T-states and TSC will stop in deep C-states.

To cover these 3, one feature bit (CONSTANT_TSC) is not enough. So, add a second bit (NONSTOP_TSC). CONSTANT_TSC indicates that the TSC runs at constant frequency irrespective of P/T-states, and NONSTOP_TSC indicates that TSC does not stop in deep C-states.

CPUID_0x80000007_Bit8 indicates that both of these feature bits can be set. We still have CONSTANT_TSC _set_ and NONSTOP_TSC _not_set_ on some older Intel CPUs, based on model checks. We can use TSC on such CPUs for time, as long as those CPUs do not support/enter deep C-states.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
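The three classes map onto the two feature bits roughly as follows. This is a user-space sketch only: struct tsc_caps, enum tsc_class, and both functions are hypothetical names standing in for the CONSTANT_TSC and NONSTOP_TSC bits, not the kernel's feature-flag API.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the two feature bits described above. */
struct tsc_caps {
    bool constant_tsc; /* rate does not vary with P/T-states */
    bool nonstop_tsc;  /* keeps counting in deep C-states */
};

enum tsc_class {
    TSC_CLASS_NONSTOP,       /* constant rate, never stops in C-states */
    TSC_CLASS_CONSTANT_ONLY, /* constant rate, stops in deep C-states */
    TSC_CLASS_VARIABLE       /* rate varies, stops in deep C-states */
};

/* Map the two bits onto the three classes listed in the commit message. */
static enum tsc_class tsc_classify(struct tsc_caps caps)
{
    if (caps.constant_tsc && caps.nonstop_tsc)
        return TSC_CLASS_NONSTOP;
    if (caps.constant_tsc)
        return TSC_CLASS_CONSTANT_ONLY;
    /* Anything without a constant rate is treated as the third class. */
    return TSC_CLASS_VARIABLE;
}

/*
 * The TSC is usable for timekeeping if it never stops, or if it is
 * constant-rate and the CPU does not enter deep C-states.
 */
static bool tsc_usable_for_time(struct tsc_caps caps, bool enters_deep_cstates)
{
    switch (tsc_classify(caps)) {
    case TSC_CLASS_NONSTOP:
        return true;
    case TSC_CLASS_CONSTANT_ONLY:
        return !enters_deep_cstates;
    default:
        return false;
    }
}
```

A single bit could not distinguish the first two classes, which is the reason the commit gives for adding NONSTOP_TSC.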
2008-11-07 | ACPI: consolidate ACPI_*_COMPONENT definitions in acpi_drivers.h | Bjorn Helgaas | 1 | -1/+0

Move all the component definitions for drivers to a single shared place, include/acpi/acpi_drivers.h.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-10-16 | cpuidle: update the last_state acpi cpuidle reflecting actual state entered | Venkatesh Pallipadi | 1 | -0/+1

Reflect the actual state entered in dev->last_state when the actual state entered is different from the intended one.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-08-15 | acpi: trivial cleanups | Pavel Machek | 1 | -1/+0

Trivial cleanups for ACPI. Fix misspelling in printk(), fix mismerge, add file header.

AK: removed file header

Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-28 | ACPI/CPUIDLE: prevent setting pm_idle to NULL | Thomas Gleixner | 1 | -4/+11

pm_idle_save or pm_idle_old can be NULL when the restore code in acpi_processor_cst_has_changed() or cpuidle_uninstall_idle_handler(), respectively, is called. This can set pm_idle unconditionally to NULL, which causes the kernel to panic when calling pm_idle in the x86 idle code. This was covered by an extra check for !pm_idle in the x86 idle code, which was removed during the x86 idle code refactoring.

Instead of restoring the pm_idle check in the x86 code, prevent the acpi/cpuidle code from setting pm_idle to NULL.

Reported by: Dhaval Giani http://lkml.org/lkml/2008/7/2/309

Based on a debug patch from Ingo Molnar

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
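The guard this commit describes can be illustrated with a small user-space model. This is a sketch under stated assumptions, not the actual patch: pm_idle and pm_idle_save are modelled as plain function pointers, and the two stub handlers and restore_idle_handler() are hypothetical names.

```c
#include <stddef.h>

typedef void (*idle_handler_t)(void);

/* User-space models of the idle hook and the handler saved at install time. */
static idle_handler_t pm_idle;
static idle_handler_t pm_idle_save; /* may legitimately be NULL */

/* Hypothetical handlers, present only so the pointers have targets. */
static void default_idle_stub(void) { }
static void acpi_idle_stub(void) { }

/*
 * Restore the saved handler only when one was actually saved, so that
 * pm_idle can never be overwritten with NULL.
 */
static void restore_idle_handler(void)
{
    if (pm_idle_save != NULL)
        pm_idle = pm_idle_save;
}
```

With pm_idle_save left NULL, restore_idle_handler() keeps whatever handler is currently installed, so a later call through pm_idle stays safe without an extra NULL check in the idle loop.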
2008-07-26 | ftrace: disable tracing on acpi idle calls | Steven Rostedt | 1 | -0/+6

The ACPI idle wait code calls local_irq_save and then uses mwait to go into idle. The tracer gets reenabled at local_irq_save but does not detect that the idle allows for wake ups.

This patch adds code to disable the tracing when acpi puts the CPU to idle.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 | ACPI : Create "idle=nomwait" bootparam | Zhao Yakui | 1 | -1/+5

"idle=nomwait" disables the use of the MWAIT instruction from both C1 (C1_FFH) and deeper (C2C3_FFH) C-states.

When MWAIT is unavailable, the BIOS and OS generally negotiate to use the HALT instruction for C1, and use IO accesses for deeper C-states.

This option is useful for power and performance comparisons, and also to work around BIOS bugs where broken MWAIT support is advertised.

http://bugzilla.kernel.org/show_bug.cgi?id=10807
http://bugzilla.kernel.org/show_bug.cgi?id=10914

Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Li Shaohua <shaohua.li@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 | ACPI: Create "idle=halt" bootparam | Zhao Yakui | 1 | -0/+22

"idle=halt" limits the idle loop to using the halt instruction. No MWAIT, no IO accesses, no C-states deeper than C1.

If something is broken in the idle code, "idle=halt" is a less severe workaround than "idle=poll" which disables all power savings.

Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 | ACPI: change processors from array to per_cpu variable | Mike Travis | 1 | -4/+4

Change processors from an array sized by NR_CPUS to a per_cpu variable.

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-06-26 | smp_call_function: get rid of the unused nonatomic/retry argument | Jens Axboe | 1 | -1/+1

It's never used and the comments refer to nonatomic and retry interchangeably. So get rid of it.

Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-06-11 | cpuidle acpi driver: fix oops on AC<->DC | Venkatesh Pallipadi | 1 | -6/+7

cpuidle and acpi driver interaction bug with the way cpuidle_register_driver() is called. Due to this bug, there will be an oops on AC<->DC on some systems that support C-states in one mode (DC) and not in the other (AC).

The current code does:

ON BOOT:
Look at CST and other C-state info to see whether more than C1 is supported. If it is, then acpi processor_idle does a cpuidle_register_driver() call, which internally enables the device.

ON CST change notification (AC<->DC) and on suspend-resume:
acpi driver temporarily disables the device, updates the device with any new C-states, and reenables the device.

The problem arises when, on boot, there are no C2 or C3 states supported and we skip the register. Later on AC<->DC, we may get a CST notification and we try to reevaluate CST and enable the device, without actually registering it. This causes breakage as we try to create the /sys fs sub directory without the parent directory, which is created at register time.

Thanks to Sanjeev for reporting the problem here.
http://bugzilla.kernel.org/show_bug.cgi?id=10394

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-04-30 | Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 | Linus Torvalds | 1 | -2/+16

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (179 commits)
  ACPI: Fix acpi_processor_idle and idle= boot parameters interaction
  acpi: fix section mismatch warning in pnpacpi
  intel_menlo: fix build warning
  ACPI: Cleanup: Remove unneeded, multiple local dummy variables
  ACPI: video - fix permissions on some proc entries
  ACPI: video - properly handle errors when registering proc elements
  ACPI: video - do not store invalid entries in attached_array list
  ACPI: re-name acpi_pm_ops to acpi_suspend_ops
  ACER_WMI/ASUS_LAPTOP: fix build bug
  thinkpad_acpi: fix possible NULL pointer dereference if kstrdup failed
  ACPI: check a return value correctly in acpi_power_get_context()
  #if 0 acpi/bay.c:eject_removable_drive()
  eeepc-laptop: add hwmon fan control
  eeepc-laptop: add backlight
  eeepc-laptop: add base driver
  ACPI: thinkpad-acpi: bump up version to 0.20
  ACPI: thinkpad-acpi: fix selects in Kconfig
  ACPI: thinkpad-acpi: use a private workqueue
  ACPI: thinkpad-acpi: fluff really minor fix
  ACPI: thinkpad-acpi: use uppercase for "LED" on user documentation
  ...

Fixed conflicts in drivers/acpi/video.c and drivers/misc/intel_menlow.c manually.
2008-04-30 | Merge branches 'release', 'acpica', 'bugzilla-10224', 'bugzilla-9772', 'bugzilla-9916', 'ec', 'eeepc', 'idle', 'misc', 'pm-legacy', 'sysfs-links-2.6.26', 'thermal', 'thinkpad' and 'video' into release | Len Brown | 1 | -12/+25
2008-04-30 | ACPI: Fix acpi_processor_idle and idle= boot parameters interaction | Venkatesh Pallipadi | 1 | -2/+12

acpi_processor_idle and "idle=" boot parameter interaction is broken. The problem is that, at boot time, the acpi driver checks for the "idle=" boot option and does not register the acpi idle handler. But when there is a CST changed callback (typically when switching AC <-> battery or on suspend-resume) there are no checks for boot_option_idle_override, and the acpi idle handler tries to get installed, with nasty side effects.

With CPU_IDLE configured this issue results in a nasty oops on the CST change callback. Without CPU_IDLE there is no oops, but the "idle=" boot option gets ignored and the acpi idle handler gets installed.

Change the behavior to not do anything in the acpi idle handler when there is an "idle=" boot option.

Note that the problem is only there when the "idle=" boot option is used.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-04-29 | acpi: use non-racy method for proc entries creation | Denis V. Lunev | 1 | -8/+5

Use proc_create()/proc_create_data() to make sure that ->proc_fops and ->data are set up before gluing the PDE to the main tree.

Add correct ->owner to proc_fops to fix reading/module unloading race.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-27 | fix idle (arch, acpi and apm) and lockdep | Peter Zijlstra | 1 | -10/+9

OK, so 25-mm1 gave a lockdep error which made me look into this.

The first thing that I noticed was the horrible mess; the second thing I saw was hacks like: 71e93d15612c61c2e26a169567becf088e71b8ff

The problem is that arch idle routines are somewhat inconsistent with their IRQ state handling and instead of fixing _that_, we go paper over the problem.

So the thing I've tried to do is set a standard for idle routines and fix them all up to adhere to that. So the rules are:

  idle routines are entered with IRQs disabled
  idle routines will exit with IRQs enabled

Nearly all already did this in one form or another.

Merge the 32 and 64 bit bits so they no longer have different bugs.

As for the actual lockdep warning; __sti_mwait() did a plainly un-annotated irq-enable.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
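The two rules can be modelled in user space with a flag standing in for the CPU interrupt state. This is only an illustrative sketch: every name ending in _sim is hypothetical, and the violation counter exists purely so the contract can be checked.

```c
#include <stdbool.h>

/* Models the CPU interrupt-enable state; starts enabled. */
static bool irqs_enabled_sim = true;
/* Counts breaches of the entry/exit rules, for checking only. */
static int contract_violations_sim;

static void local_irq_disable_sim(void) { irqs_enabled_sim = false; }
static void local_irq_enable_sim(void)  { irqs_enabled_sim = true; }

/* An idle routine that follows both rules from the commit message. */
static void idle_routine_sim(void)
{
    /* Rule 1: idle routines are entered with IRQs disabled. */
    if (irqs_enabled_sim)
        contract_violations_sim++;

    /* ... a real routine would halt or mwait here ... */

    /* Rule 2: idle routines exit with IRQs enabled. */
    local_irq_enable_sim();
}

/* One pass of an idle loop that honours the contract as the caller. */
static void idle_loop_iteration_sim(void)
{
    local_irq_disable_sim();
    idle_routine_sim();
    if (!irqs_enabled_sim)
        contract_violations_sim++;
}
```

Giving every idle routine the same entry and exit state is what lets the caller drop per-routine special cases such as an unconditional local_irq_enable() after the call.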
2008-04-24 | 2.6.25 regression: powertop says 120K wakeups/sec | Venkatesh Pallipadi | 1 | -0/+4

Patch to fix the huge number of wakeups reported due to recent changes in processor_idle.c. The problem was that the entry_method determination was broken by one of the recent commits (bc71bec91f987), causing C1 entry not to go to halt.

http://lkml.org/lkml/2008/3/22/124

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-03-26 | cpuidle: fix 100% C0 statistics regression | Venki Pallipadi | 1 | -1/+3

commit 9b12e18cdc1553de62d931e73443c806347cd974 'ACPI: cpuidle: Support C1 idle time accounting' was implicated in a 100% C0 idle regression.
http://bugzilla.kernel.org/show_bug.cgi?id=10076

It pointed out a potential problem where the menu governor may get confused by the C-state residency time from poll idle or C1 idle, where this timing info is not accurate. This inaccuracy is due to interrupts being handled before we account for C-state exit.

Do not mark TIME_VALID for the C0 poll state. Mark C1 time as valid only with the MWAIT (CSTATE_FFH) entry method.

This makes governors use the timing information only when it is correct and eliminates any wrong policy decisions that may result from invalid timing information.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
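The validity rule can be sketched as a predicate on the state and its entry method. The enum, struct, and function names below are hypothetical user-space stand-ins for the driver's state descriptors. Treating states deeper than C1 as always valid is an assumption of this sketch; the commit message only covers the poll state and C1.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the ways a C-state can be entered. */
enum entry_method_sim {
    ENTRY_POLL_SIM,     /* busy-poll, the C0 poll state */
    ENTRY_HALT_SIM,     /* "hlt" based entry */
    ENTRY_FFH_SIM,      /* MWAIT (CSTATE_FFH) entry */
    ENTRY_SYSTEMIO_SIM  /* IO port read entry */
};

struct idle_state_sim {
    int cstate;                   /* 0 = poll, 1 = C1, 2 = C2, ... */
    enum entry_method_sim method;
};

/* Decide whether a governor may trust the measured residency time. */
static bool residency_time_valid_sim(const struct idle_state_sim *s)
{
    if (s->cstate == 0)
        return false;                      /* never valid for poll */
    if (s->cstate == 1)
        return s->method == ENTRY_FFH_SIM; /* C1: only with MWAIT */
    return true; /* assumption: deeper states keep their timing flag */
}
```

A governor built on this predicate would simply skip samples for which it returns false, rather than feeding inflated residency values into its next prediction.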
2008-03-26 | ACPI: fix mis-merge -- invoke acpi_unlazy_tlb() only on C3 entry | Venki Pallipadi | 1 | -1/+2

The original patch
http://ussg.iu.edu/hypermail/linux/kernel/0712.2/1451.html
was intended to add acpi_unlazy_tlb() to acpi_idle_enter_bm(), which is used for C3 entry.

But it was merged incorrectly as commit bde6f5f59c2b2b48a7a849c129d5b48838fe77ee 'x86: voluntary leave_mm before entering ACPI C3', so the call was instead added to acpi_idle_enter_simple() (which is the C2 entry routine), probably due to identical context in that function.

Move the call back to acpi_idle_enter_bm().

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-03-14 | ACPI: lockdep warning on boot, 2.6.25-rc5 | Venki Pallipadi | 1 | -3/+9

This avoids the harmless WARNING by lockdep in acpi_processor_idle().

The reason for the WARNING is that, deep in the idle handling code, some of the idle handlers sometimes disable interrupts while returning from the idle handler. After return, acpi_processor_idle and a few other routines in the file did an unconditional local_irq_enable(). With LOCKDEP, enabling irq when it is already enabled generates the WARNING below.

> > [ 0.593038] ------------[ cut here ]------------
> > [ 0.593267] WARNING: at kernel/lockdep.c:2035 trace_hardirqs_on+0xa0/0x115()
> > [ 0.593596] Modules linked in:
> > [ 0.593756] Pid: 0, comm: swapper Not tainted 2.6.25-rc5 #8
> > [ 0.594017]
> > [ 0.594017] Call Trace:
> > [ 0.594216] [<ffffffff80231663>] warn_on_slowpath+0x58/0x6b
> > [ 0.594495] [<ffffffff80495966>] ? _spin_unlock_irqrestore+0x38/0x47
> > [ 0.594809] [<ffffffff80329a86>] ? acpi_os_release_lock+0x9/0xb
> > [ 0.595103] [<ffffffff80337840>] ? acpi_set_register+0x161/0x173
> > [ 0.595401] [<ffffffff8034c8d4>] ? acpi_processor_idle+0x1de/0x546
> > [ 0.595706] [<ffffffff8020a23b>] ? default_idle+0x0/0x73
> > [ 0.595970] [<ffffffff8024fc0e>] trace_hardirqs_on+0xa0/0x115
> > [ 0.596049] [<ffffffff8034c6f6>] ? acpi_processor_idle+0x0/0x546
> > [ 0.596346] [<ffffffff8034c8d4>] acpi_processor_idle+0x1de/0x546
> > [ 0.596642] [<ffffffff8020a23b>] ? default_idle+0x0/0x73
> > [ 0.596912] [<ffffffff8034c6f6>] ? acpi_processor_idle+0x0/0x546
> > [ 0.597209] [<ffffffff8020a23b>] ? default_idle+0x0/0x73
> > [ 0.597472] [<ffffffff8020a355>] cpu_idle+0xa7/0xd1
> > [ 0.597717] [<ffffffff80485fa1>] rest_init+0x55/0x57
> > [ 0.597957] [<ffffffff8062fb49>] start_kernel+0x29d/0x2a8
> > [ 0.598215] [<ffffffff8062f1da>] _sinittext+0x1da/0x1e1
> > [ 0.598464]
> > [ 0.598546] ---[ end trace 778e504de7e3b1e3 ]---

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-02-19 | ACPI: TSC breaks atkbd suspend | Pavel Machek | 1 | -5/+5

TSC is used even on machines when CONFIG_X86_TSC is not set (X86_TSC means _require_ TSC), but it is not properly disabled when it is unusable, because ACPI code understood the config switch as "may use TSC".

This actually fixes suspend problems on my x60.

Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-02-14 | Merge branches 'release', 'dmi', 'idle' and 'misc' into release | Len Brown | 1 | -0/+19
2008-02-14 | ACPI, cpuidle: Clarify C-state description in sysfs | Venkatesh Pallipadi | 1 | -0/+11

Add a new sysfs entry under cpuidle states: desc, which can be used by a driver to communicate to userspace any specific information about the state. This helps in identifying the exact hardware C-states behind the ACPI C-state definition.

The idea is to export this through powertop, which will help to map the C-state reported by powertop to the actual hardware C-state.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-02-13 | ACPI: fix suspend regression due to idle update | Venkatesh Pallipadi | 1 | -0/+8

An earlier patch (bc71bec91f9875ef825d12104acf3bf4ca215fa4) broke suspend-resume on many laptops. The problem was reported by Carlos R. Mafra and Calvin Walton, who bisected the issue to the above patch.

The problem was that the C2 and C3 code called acpi_idle_enter_c1 directly, with C2 or C3 as the state parameter, while suspend/resume was in progress. The patch bc71bec started making use of that state information, assuming that it would always refer to the C1 state. This caused the problem with suspend-resume, as we ended up using the C2/C3 state indirectly.

Fix this by adding an acpi_idle_suspend check in enter_c1.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-02-07 | Merge branches 'release', 'cpuidle-2.6.25' and 'idle' into release | Len Brown | 1 | -11/+36
2008-02-07 | cpuidle: Add a poll_idle method | venkatesh.pallipadi@intel.com | 1 | -1/+3

Add a default poll idle state with 0 latency. Provides an option to users to use poll_idle by using 0 as the latency requirement.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-02-07 | ACPI: cpuidle: Support C1 idle time accounting | venkatesh.pallipadi@intel.com | 1 | -1/+6

Show C1 idle time in the sysfs cpuidle interface. C1 idle time may not be entirely accurate in all cases. It includes the time spent in the interrupt handler after wakeup with "hlt" based C1. But it will be accurate with "mwait" based C1.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-02-07 | ACPI: enable MWAIT for C1 idle | venkatesh.pallipadi@intel.com | 1 | -8/+12

Add MWAIT idle for C1 state instead of halt, on platforms that support C1 state with MWAIT.

Renames cx->space_id to something more appropriate.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-02-07 | ACPI: idle: Fix acpi_safe_halt usages and interrupt enabling/disabling | venkatesh.pallipadi@intel.com | 1 | -0/+11

acpi_safe_halt() needs interrupts to be disabled for atomic need_resched check and safe halt. Otherwise we may miss an interrupt and go into halt.

acpi_safe_halt() also does not enable interrupts on all return paths. So the callers should handle enable and disable interrupts around it.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-02-05 | latency.c: use QoS infrastructure | Mark Gross | 1 | -7/+11

Replace latency.c use with pm_qos_params use.

Signed-off-by: mark gross <mgross@linux.intel.com>
Cc: "John W. Linville" <linville@tuxdriver.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Jaroslav Kysela <perex@suse.cz>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-30 | x86: don't disable TSC in any C states on AMD Fam10h | Andi Kleen | 1 | -4/+28

The ACPI code currently disables TSC use in any C2 and C3 states. But the AMD Fam10h BKDG documents that the TSC will never stop in any C states when the CONSTANT_TSC bit is set. Make this disabling conditional on CONSTANT_TSC not set on AMD.

I actually think this is true on Intel too for C2 states on CPUs with p-state invariant TSC, but this needs further discussions with Len to really confirm :-)

So far it is only enabled on AMD.

Cc: lenb@kernel.org
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 | x86: voluntary leave_mm before entering ACPI C3 | Venki Pallipadi | 1 | -0/+2

Avoid TLB flush IPIs during C3 states by voluntary leave_mm() before entering C3.

The performance impact of TLB flush on C3 should not be significant with respect to C3 wakeup latency. Also, CPUs tend to flush TLB in hardware while in C3 anyways.

On an 8 logical CPU system, running make -j2, the number of tlbflush IPIs goes down from 40 per second to ~ 0. Total number of interrupts during the run of this workload was ~1200 per second, which makes it ~3% savings in wakeups. There was no measurable performance or power impact however.

[ akpm@linux-foundation.org: symbol export fixes. ]
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-07 | ACPI: Reintroduce run time configurable max_cstate for !CPU_IDLE case | Venki Pallipadi | 1 | -0/+4

This was writeable in 2.6.23 but the cpuidle merge made it read-only. But some people's scripts (e.g. Mark's) were writing to it.

As an unhappy compromise, make max_cstate writeable again if the kernel was configured without CONFIG_CPU_IDLE.

http://bugzilla.kernel.org/show_bug.cgi?id=9683

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Mark Lord <lkml@rtr.ca>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2007-12-14 | cpuidle: default processor.latency_factor=2 | Len Brown | 1 | -1/+1

More aggressively request deep C-states.

Note that the job of the OS is to minimize latency impact to expected break events such as interrupts.

It is not the job of the OS to try to calculate if the C-state will reach energy break-even. The platform doesn't give the OS enough information for it to make that calculation. Thus, it is up to the platform to decide if it is worth it to go as deep as the OS requested it to, or if it should internally demote to a more shallow C-state.

But the converse is not true. The platform can not promote into a deeper C-state than the OS requested else it may violate latency constraints. So it is important that the OS be aggressive in giving the platform permission to enter deep C-states.

Signed-off-by: Len Brown <len.brown@intel.com>
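The effect of lowering the factor from 6 to 2 can be sketched with a toy selection routine. This is not the menu governor: it assumes, for illustration only, that latency_factor multiplies a state's exit latency to give the shortest idle period for which that state is requested. The struct, function names, and latency figures are all hypothetical.

```c
/* Hypothetical description of one C-state. */
struct cstate_lf_sim {
    const char  *name;
    unsigned int exit_latency_us;
};

/* Assumed relationship: target residency = exit latency * latency_factor. */
static unsigned int target_residency_us(const struct cstate_lf_sim *s,
                                        unsigned int latency_factor)
{
    return s->exit_latency_us * latency_factor;
}

/*
 * Return the index of the deepest state whose target residency fits in
 * the expected idle period. 'states' is ordered shallow to deep; index 0
 * is returned when nothing deeper qualifies.
 */
static int pick_state_sim(const struct cstate_lf_sim *states, int count,
                          unsigned int expected_idle_us,
                          unsigned int latency_factor)
{
    int i, best = 0;

    for (i = 0; i < count; i++) {
        if (target_residency_us(&states[i], latency_factor) <= expected_idle_us)
            best = i;
    }
    return best;
}
```

With illustrative exit latencies of 1, 20 and 100 microseconds and an expected idle period of 300 microseconds, a factor of 6 stops at the middle state while a factor of 2 requests the deepest one. The platform remains free to demote internally, as the commit message notes.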
2007-12-14 | cpuidle: create processor.latency_factor tunable | Len Brown | 1 | -1/+4

Start with default value of 6, so by default, there is no functional change in this patch.

Signed-off-by: Len Brown <len.brown@intel.com>
2007-12-07 | ACPI: move timer broadcast before busmaster disable | Thomas Gleixner | 1 | -5/+14

The timer broadcast code might access HPET, which should not be accessed after the busmaster disable.

In acpi_idle_enter_simple() this change also prevents us from modifying the busmaster state without actually going idle. That could leave the ACPI bm state stale when we leave the function early in the need_resched() check.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
2007-11-26 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86 | Linus Torvalds | 1 | -0/+1

* git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86:
  x86: fix APIC related bootup crash on Athlon XP CPUs
  time: add ADJ_OFFSET_SS_READ
  x86: export the symbol empty_zero_page on the 32-bit x86 architecture
  x86: fix kprobes_64.c inlining borkage
  pci: use pci=bfsort for HP DL385 G2, DL585 G2
  x86: correctly set UTS_MACHINE for "make ARCH=x86"
  lockdep: annotate do_debug() trap handler
  x86: turn off iommu merge by default
  x86: fix ACPI compile for LOCAL_APIC=n
  x86: printk kernel version in WARN_ON and other dump_stack users
  ACPI: Set max_cstate to 1 for early Opterons.
  x86: fix NMI watchdog & 'stopped time' problem
2007-11-26 | ACPI: Set max_cstate to 1 for early Opterons. | Alexey Starikovskiy | 1 | -0/+1

AMD Opteron processors before CG revision don't like C-states > 1. This solves the long standing bugzilla #5303 and probably some more on affected machines:

http://bugzilla.kernel.org/show_bug.cgi?id=5303

[ tglx@linutronix.de: reworked the patch so it does not wreck ia64 ]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-11-20 | Pull cpuidle into release branch | Len Brown | 1 | -48/+63
2007-11-19 | cpuidle: fix HP nx6125 regression | Venkatesh Pallipadi | 1 | -70/+55

Fix for http://bugzilla.kernel.org/show_bug.cgi?id=9355

cpuidle always used to fall back to C2 if there is some bm activity while entering C3. But the presence of C2 is not always guaranteed. Change the cpuidle algorithm to detect a safe_state to fall back to in case of bm_activity and use that state instead of C2.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
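One way to picture the safe_state idea is a scan for the deepest valid state that is not C3-type. The sketch below is a user-space illustration under that assumption. The enum, struct, and function names are hypothetical, and the real driver may choose its safe state differently.

```c
/* Hypothetical C-state types for this sketch. */
enum cx_type_sim { CX_C1_SIM = 1, CX_C2_SIM, CX_C3_SIM };

struct cx_sim {
    enum cx_type_sim type;
    int valid; /* non-zero if the platform actually offers this state */
};

/* Return the index of the deepest valid non-C3 state, or -1 if none. */
static int find_safe_state_sim(const struct cx_sim *states, int count)
{
    int i, safe = -1;

    for (i = 0; i < count; i++) {
        if (states[i].valid && states[i].type != CX_C3_SIM)
            safe = i;
    }
    return safe;
}

/*
 * Pick the state to enter: when C3 is requested but bus master activity
 * was seen, fall back to the detected safe state instead of assuming
 * that C2 exists.
 */
static int state_to_enter_sim(const struct cx_sim *states, int count,
                              int requested, int bm_activity)
{
    if (states[requested].type == CX_C3_SIM && bm_activity)
        return find_safe_state_sim(states, count);
    return requested;
}
```

On a platform that exposes only C1 and C3, the fallback lands on C1 rather than on a C2 entry that does not exist, which matches the situation the commit message describes.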
2007-11-19 | cpuidle: add sched_clock_idle_[sleep|wakeup]_event() hooks | Venkatesh Pallipadi | 1 | -2/+17

Port 2aa44d0567ed21b47b87d68819415d48194cb923 (sched: sched_clock_idle_[sleep|wakeup]_event()) to cpuidle.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>