summaryrefslogtreecommitdiff
path: root/drivers/cpuidle
AgeCommit message (Collapse)AuthorFilesLines
2018-08-25cpuidle: menu: Retain tick when shallow state is selectedRafael J. Wysocki1-1/+12
The case addressed by commit 5ef499cd571c (cpuidle: menu: Handle stopped tick more aggressively) in the stopped tick case is present when the tick has not been stopped yet too. Namely, if only two CPU idle states, shallow state A with target residency significantly below the tick boundary and deep state B with target residency significantly above it, are available and the predicted idle duration is above the tick boundary, but below the target residency of state B, state A will be selected and the CPU may spend indefinite amount of time in it, which is not quite energy-efficient. However, if the tick has not been stopped yet and the governor is about to select a shallow idle state for the CPU even though the idle duration predicted by it is above the tick boundary, it should be fine to wake up the CPU early, so the tick can be retained then and the governor will have a chance to select a deeper state when it runs next time. [Note that when this really happens, it will make the idle duration predictor believe that the CPU might be idle longer than predicted, which will make it more likely to predict longer idle durations going forward, but that will also cause deeper idle states to be selected going forward, on average, which is what's needed here.] Fixes: 87c9fe6ee495 (cpuidle: menu: Avoid selecting shallow states with stopped tick) Reported-by: Leo Yan <leo.yan@linaro.org> Cc: 4.17+ <stable@vger.kernel.org> # 4.17+: 5ef499cd571c (cpuidle: menu: Handle ...) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-08-22Merge tag 'pm-4.19-rc1-2' of ↵Linus Torvalds1-17/+28
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more power management updates from Rafael Wysocki: "These fix the main idle loop and the menu cpuidle governor, clean up the latter, fix a mistake in the PCI bus type's support for system suspend and resume, fix the ondemand and conservative cpufreq governors, address a build issue in the system wakeup framework and make the ACPI C-states desciptions less confusing. Specifics: - Make the idle loop handle stopped scheduler tick correctly (Rafael Wysocki). - Prevent the menu cpuidle governor from letting CPUs spend too much time in shallow idle states when it is invoked with scheduler tick stopped and clean it up somewhat (Rafael Wysocki). - Avoid invoking the platform firmware to make the platform enter the ACPI S3 sleep state with suspended PCIe root ports which may confuse the firmware and cause it to crash (Rafael Wysocki). - Fix sysfs-related race in the ondemand and conservative cpufreq governors which may cause the system to crash if the governor module is removed during an update of CPU frequency limits (Henry Willard). - Select SRCU when building the system wakeup framework to avoid a build issue in it (zhangyi). - Make the descriptions of ACPI C-states vendor-neutral to avoid confusion (Prarit Bhargava)" * tag 'pm-4.19-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpuidle: menu: Handle stopped tick more aggressively sched: idle: Avoid retaining the tick when it has been stopped PCI / ACPI / PM: Resume all bridges on suspend-to-RAM cpuidle: menu: Update stale polling override comment cpufreq: governor: Avoid accessing invalid governor_data x86/ACPI/cstate: Make APCI C1 FFH MWAIT C-state description vendor-neutral cpuidle: menu: Fix white space PM / sleep: wakeup: Fix build error caused by missing SRCU support
2018-08-20cpuidle: menu: Handle stopped tick more aggressivelyRafael J. Wysocki1-12/+24
Commit 87c9fe6ee495 (cpuidle: menu: Avoid selecting shallow states with stopped tick) missed the case when the target residencies of deep idle states of CPUs are above the tick boundary which may cause the CPU to get stuck in a shallow idle state for a long time. Say there are two CPU idle states available: one shallow, with the target residency much below the tick boundary and one deep, with the target residency significantly above the tick boundary. In that case, if the tick has been stopped already and the expected next timer event is relatively far in the future, the governor will assume the idle duration to be equal to TICK_USEC and it will select the idle state for the CPU accordingly. However, that will cause the shallow state to be selected even though it would have been more energy-efficient to select the deep one. To address this issue, modify the governor to always use the time till the closest timer event instead of the predicted idle duration if the latter is less than the tick period length and the tick has been stopped already. Also make it extend the search for a matching idle state if the tick is stopped to avoid settling on a shallow state if deep states with target residencies above the tick period length are available. In addition, make it always indicate that the tick should be stopped if it has been stopped already for consistency. Fixes: 87c9fe6ee495 (cpuidle: menu: Avoid selecting shallow states with stopped tick) Reported-by: Leo Yan <leo.yan@linaro.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: 4.17+ <stable@vger.kernel.org> # 4.17+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-08-17Merge tag 'powerpc-4.19-1' of ↵Linus Torvalds1-132/+26
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Notable changes: - A fix for a bug in our page table fragment allocator, where a page table page could be freed and reallocated for something else while still in use, leading to memory corruption etc. The fix reuses pt_mm in struct page (x86 only) for a powerpc only refcount. - Fixes to our pkey support. Several are user-visible changes, but bring us in to line with x86 behaviour and/or fix outright bugs. Thanks to Florian Weimer for reporting many of these. - A series to improve the hvc driver & related OPAL console code, which have been seen to cause hardlockups at times. The hvc driver changes in particular have been in linux-next for ~month. - Increase our MAX_PHYSMEM_BITS to 128TB when SPARSEMEM_VMEMMAP=y. - Remove Power8 DD1 and Power9 DD1 support, neither chip should be in use anywhere other than as a paper weight. - An optimised memcmp implementation using Power7-or-later VMX instructions - Support for barrier_nospec on some NXP CPUs. - Support for flushing the count cache on context switch on some IBM CPUs (controlled by firmware), as a Spectre v2 mitigation. - A series to enhance the information we print on unhandled signals to bring it into line with other arches, including showing the offending VMA and dumping the instructions around the fault. Thanks to: Aaro Koskinen, Akshay Adiga, Alastair D'Silva, Alexey Kardashevskiy, Alexey Spirkov, Alistair Popple, Andrew Donnellan, Aneesh Kumar K.V, Anju T Sudhakar, Arnd Bergmann, Bartosz Golaszewski, Benjamin Herrenschmidt, Bharat Bhushan, Bjoern Noetel, Boqun Feng, Breno Leitao, Bryant G. Ly, Camelia Groza, Christophe Leroy, Christoph Hellwig, Cyril Bur, Dan Carpenter, Daniel Klamt, Darren Stevens, Dave Young, David Gibson, Diana Craciun, Finn Thain, Florian Weimer, Frederic Barrat, Gautham R. Shenoy, Geert Uytterhoeven, Geoff Levand, Guenter Roeck, Gustavo Romero, Haren Myneni, Hari Bathini, Joel Stanley, Jonathan Neuschäfer, Kees Cook, Madhavan Srinivasan, Mahesh Salgaonkar, Markus Elfring, Mathieu Malaterre, Mauro S. M. Rodrigues, Michael Hanselmann, Michael Neuling, Michael Schmitz, Mukesh Ojha, Murilo Opsfelder Araujo, Nicholas Piggin, Parth Y Shah, Paul Mackerras, Paul Menzel, Ram Pai, Randy Dunlap, Rashmica Gupta, Reza Arbab, Rodrigo R. Galvao, Russell Currey, Sam Bobroff, Scott Wood, Shilpasri G Bhat, Simon Guo, Souptick Joarder, Stan Johnson, Thiago Jung Bauermann, Tyrel Datwyler, Vaibhav Jain, Vasant Hegde, Venkat Rao, zhong jiang" * tag 'powerpc-4.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (234 commits) powerpc/mm/book3s/radix: Add mapping statistics powerpc/uaccess: Enable get_user(u64, *p) on 32-bit powerpc/mm/hash: Remove unnecessary do { } while(0) loop powerpc/64s: move machine check SLB flushing to mm/slb.c powerpc/powernv/idle: Fix build error powerpc/mm/tlbflush: update the mmu_gather page size while iterating address range powerpc/mm: remove warning about ‘type’ being set powerpc/32: Include setup.h header file to fix warnings powerpc: Move `path` variable inside DEBUG_PROM powerpc/powermac: Make some functions static powerpc/powermac: Remove variable x that's never read cxl: remove a dead branch powerpc/powermac: Add missing include of header pmac.h powerpc/kexec: Use common error handling code in setup_new_fdt() powerpc/xmon: Add address lookup for percpu symbols powerpc/mm: remove huge_pte_offset_and_shift() prototype powerpc/lib: Use patch_site to patch copy_32 functions once cache is enabled powerpc/pseries: Fix endianness while restoring of r3 in MCE handler. powerpc/fadump: merge adjacent memory ranges to reduce PT_LOAD segements powerpc/fadump: handle crash memory ranges array index overflow ...
2018-08-16cpuidle: menu: Update stale polling override commentRafael J. Wysocki1-3/+2
The comment to explain why the menu governor uses idle state 1 instead of idle state 0 as the first one sometimes is stale (among other things it mentions a user setting not present any more), so update it. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-08-15cpuidle: menu: Fix white spaceRafael J. Wysocki1-2/+2
Fix some damaged white space in menu_select(). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-07-31powernv/cpuidle: Use parsed device tree values for cpuidle_initAkshay Adiga1-132/+26
Export pnv_idle_states and nr_pnv_idle_states so that its accessible to cpuidle driver. Use properties from pnv_idle_states structure for powernv cpuidle_init. Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-06-25ARM: cpuidle: silence error on driver registration failureSudeep Holla1-1/+2
It's perfectly fine to have multiple cpuidle driver compiled in the build configuration. However, it's not good to throw error on driver registration failure if some other driver is already initialised and assigned. In such cases, __cpuidle_register_driver returns -EBUSY and we can check for such error before throwing the error. Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-06-07Merge tag 'powerpc-4.18-1' of ↵Linus Torvalds1-6/+26
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Notable changes: - Support for split PMD page table lock on 64-bit Book3S (Power8/9). - Add support for HAVE_RELIABLE_STACKTRACE, so we properly support live patching again. - Add support for patching barrier_nospec in copy_from_user() and syscall entry. - A couple of fixes for our data breakpoints on Book3S. - A series from Nick optimising TLB/mm handling with the Radix MMU. - Numerous small cleanups to squash sparse/gcc warnings from Mathieu Malaterre. - Several series optimising various parts of the 32-bit code from Christophe Leroy. - Removal of support for two old machines, "SBC834xE" and "C2K" ("GEFanuc,C2K"), which is why the diffstat has so many deletions. And many other small improvements & fixes. There's a few out-of-area changes. Some minor ftrace changes OK'ed by Steve, and a fix to our powernv cpuidle driver. Then there's a series touching mm, x86 and fs/proc/task_mmu.c, which cleans up some details around pkey support. It was ack'ed/reviewed by Ingo & Dave and has been in next for several weeks. Thanks to: Akshay Adiga, Alastair D'Silva, Alexey Kardashevskiy, Al Viro, Andrew Donnellan, Aneesh Kumar K.V, Anju T Sudhakar, Arnd Bergmann, Balbir Singh, Cédric Le Goater, Christophe Leroy, Christophe Lombard, Colin Ian King, Dave Hansen, Fabio Estevam, Finn Thain, Frederic Barrat, Gautham R. Shenoy, Haren Myneni, Hari Bathini, Ingo Molnar, Jonathan Neuschäfer, Josh Poimboeuf, Kamalesh Babulal, Madhavan Srinivasan, Mahesh Salgaonkar, Mark Greer, Mathieu Malaterre, Matthew Wilcox, Michael Neuling, Michal Suchanek, Naveen N. Rao, Nicholas Piggin, Nicolai Stange, Olof Johansson, Paul Gortmaker, Paul Mackerras, Peter Rosin, Pridhiviraj Paidipeddi, Ram Pai, Rashmica Gupta, Ravi Bangoria, Russell Currey, Sam Bobroff, Samuel Mendoza-Jonas, Segher Boessenkool, Shilpasri G Bhat, Simon Guo, Souptick Joarder, Stewart Smith, Thiago Jung Bauermann, Torsten Duwe, Vaibhav Jain, Wei Yongjun, Wolfram Sang, Yisheng Xie, YueHaibing" * tag 'powerpc-4.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (251 commits) powerpc/64s/radix: Fix missing ptesync in flush_cache_vmap cpuidle: powernv: Fix promotion from snooze if next state disabled powerpc: fix build failure by disabling attribute-alias warning in pci_32 ocxl: Fix missing unlock on error in afu_ioctl_enable_p9_wait() powerpc-opal: fix spelling mistake "Uniterrupted" -> "Uninterrupted" powerpc: fix spelling mistake: "Usupported" -> "Unsupported" powerpc/pkeys: Detach execute_only key on !PROT_EXEC powerpc/powernv: copy/paste - Mask SO bit in CR powerpc: Remove core support for Marvell mv64x60 hostbridges powerpc/boot: Remove core support for Marvell mv64x60 hostbridges powerpc/boot: Remove support for Marvell mv64x60 i2c controller powerpc/boot: Remove support for Marvell MPSC serial controller powerpc/embedded6xx: Remove C2K board support powerpc/lib: optimise PPC32 memcmp powerpc/lib: optimise 32 bits __clear_user() powerpc/time: inline arch_vtime_task_switch() powerpc/Makefile: set -mcpu=860 flag for the 8xx powerpc: Implement csum_ipv6_magic in assembly powerpc/32: Optimise __csum_partial() powerpc/lib: Adjust .balign inside string functions for PPC32 ...
2018-06-05cpuidle: powernv: Fix promotion from snooze if next state disabledGautham R. Shenoy1-6/+26
The commit 78eaa10f027c ("cpuidle: powernv/pseries: Auto-promotion of snooze to deeper idle state") introduced a timeout for the snooze idle state so that it could be eventually be promoted to a deeper idle state. The snooze timeout value is static and set to the target residency of the next idle state, which would train the cpuidle governor to pick the next idle state eventually. The unfortunate side-effect of this is that if the next idle state(s) is disabled, the CPU will forever remain in snooze, despite the fact that the system is completely idle, and other deeper idle states are available. This patch fixes the issue by dynamically setting the snooze timeout to the target residency of the next enabled state on the device. Before Patch: POWER8 : Only nap disabled. $ cpupower monitor sleep 30 sleep took 30.01297 seconds and exited with status 0 |Idle_Stats PKG |CORE|CPU | snoo | Nap | Fast 0| 8| 0| 96.41| 0.00| 0.00 0| 8| 1| 96.43| 0.00| 0.00 0| 8| 2| 96.47| 0.00| 0.00 0| 8| 3| 96.35| 0.00| 0.00 0| 8| 4| 96.37| 0.00| 0.00 0| 8| 5| 96.37| 0.00| 0.00 0| 8| 6| 96.47| 0.00| 0.00 0| 8| 7| 96.47| 0.00| 0.00 POWER9: Shallow states (stop0lite, stop1lite, stop2lite, stop0, stop1, stop2) disabled: $ cpupower monitor sleep 30 sleep took 30.05033 seconds and exited with status 0 |Idle_Stats PKG |CORE|CPU | snoo | stop | stop | stop | stop | stop | stop | stop | stop 0| 16| 0| 89.79| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00 0| 16| 1| 90.12| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00 0| 16| 2| 90.21| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00 0| 16| 3| 90.29| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00 After Patch: POWER8 : Only nap disabled. $ cpupower monitor sleep 30 sleep took 30.01200 seconds and exited with status 0 |Idle_Stats PKG |CORE|CPU | snoo | Nap | Fast 0| 8| 0| 16.58| 0.00| 77.21 0| 8| 1| 18.42| 0.00| 75.38 0| 8| 2| 4.70| 0.00| 94.09 0| 8| 3| 17.06| 0.00| 81.73 0| 8| 4| 3.06| 0.00| 95.73 0| 8| 5| 7.00| 0.00| 96.80 0| 8| 6| 1.00| 0.00| 98.79 0| 8| 7| 5.62| 0.00| 94.17 POWER9: Shallow states (stop0lite, stop1lite, stop2lite, stop0, stop1, stop2) disabled: $ cpupower monitor sleep 30 sleep took 30.02110 seconds and exited with status 0 |Idle_Stats PKG |CORE|CPU | snoo | stop | stop | stop | stop | stop | stop | stop | stop 0| 0| 0| 0.69| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 9.39| 89.70 0| 0| 1| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.05| 93.21 0| 0| 2| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 89.93 0| 0| 3| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 0.00| 93.26 Fixes: 78eaa10f027c ("cpuidle: powernv/pseries: Auto-promotion of snooze to deeper idle state") Cc: stable@vger.kernel.org # v4.2+ Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Reviewed-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-05-30cpuidle: governors: Consolidate PM QoS handlingRafael J. Wysocki3-17/+18
There is some code duplication related to the PM QoS handling between the existing cpuidle governors, so move that code to a common helper function and call that from the governors. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-30cpuidle: governors: Drop redundant checks related to PM QoSRafael J. Wysocki2-4/+2
PM_QOS_RESUME_LATENCY_NO_CONSTRAINT is defined as the 32-bit integer maximum, so it is not necessary to test the return value of dev_pm_qos_raw_read_value() against it directly in the menu and ladder cpuidle governors. Drop these redundant checks. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-04-09cpuidle: menu: Avoid selecting shallow states with stopped tickRafael J. Wysocki1-7/+22
If the scheduler tick has been stopped already and the governor selects a shallow idle state, the CPU can spend a long time in that state if the selection is based on an inaccurate prediction of idle time. That effect turns out to be relevant, so it needs to be mitigated. To that end, modify the menu governor to discard the result of the idle time prediction if the tick is stopped and the predicted idle time is less than the tick period length, unless the tick timer is going to expire soon. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2018-04-09cpuidle: menu: Refine idle state selection for running tickRafael J. Wysocki1-2/+25
If the tick isn't stopped, the target residency of the state selected by the menu governor may be greater than the actual time to the next tick and that means lost energy. To avoid that, make tick_nohz_get_sleep_length() return the current time to the next event (before stopping the tick) in addition to the estimated one via an extra pointer argument and make menu_select() use that value to refine the state selection when necessary. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2018-04-06cpuidle: Return nohz hint from cpuidle_select()Rafael J. Wysocki3-15/+57
Add a new pointer argument to cpuidle_select() and to the ->select cpuidle governor callback to allow a boolean value indicating whether or not the tick should be stopped before entering the selected state to be returned from there. Make the ladder governor ignore that pointer (to preserve its current behavior) and make the menu governor return 'false" through it if: (1) the idle exit latency is constrained at 0, or (2) the selected state is a polling one, or (3) the expected idle period duration is within the tick period range. In addition to that, the correction factor computations in the menu governor need to take the possibility that the tick may not be stopped into account to avoid artificially small correction factor values. To that end, add a mechanism to record tick wakeups, as suggested by Peter Zijlstra, and use it to modify the menu_update() behavior when tick wakeup occurs. Namely, if the CPU is woken up by the tick and the return value of tick_nohz_get_sleep_length() is not within the tick boundary, the predicted idle duration is likely too short, so make menu_update() try to compensate for that by updating the governor statistics as though the CPU was idle for a long time. Since the value returned through the new argument pointer of cpuidle_select() is not used by its caller yet, this change by itself is not expected to alter the functionality of the code. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2018-03-29cpuidle: poll_state: Avoid invoking local_clock() too oftenRafael J. Wysocki1-0/+6
Rik reports that he sees an increase in CPU use in one benchmark due to commit 612f1a22f067 "cpuidle: poll_state: Add time limit to poll_idle()" that caused poll_idle() to call local_clock() in every iteration of the loop. Utilization increase generally means more non-idle time with respect to total CPU time (on the average) which implies reduced CPU frequency. Doug reports that limiting the rate of local_clock() invocations in there causes much less power to be drawn during a CPU-intensive parallel workload (with idle states 1 and 2 disabled to enforce more state 0 residency). These two reports together suggest that executing local_clock() on multiple CPUs in parallel at a high rate may cause chips to get hot and trigger thermal/power limits on them to kick in, so reduce the rate of local_clock() invocations in poll_idle() to avoid that issue. Fixes: 612f1a22f067 "cpuidle: poll_state: Add time limit to poll_idle()" Reported-by: Rik van Riel <riel@surriel.com> Reported-by: Doug Smythies <dsmythies@telus.net> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Rik van Riel <riel@surriel.com> Reviewed-by: Rik van Riel <riel@surriel.com>
2018-03-29PM: cpuidle/suspend: Add s2idle usage and time state attributesRafael J. Wysocki2-0/+63
Add a new attribute group called "s2idle" under the sysfs directory of each cpuidle state that supports the ->enter_s2idle callback and put two new attributes, "usage" and "time", into that group to represent the number of times the given state was requested for suspend-to-idle and the total time spent in suspend-to-idle after requesting that state, respectively. That will allow diagnostic information related to suspend-to-idle to be collected without enabling advanced debug features and analyzing dmesg output. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-03-29cpuidle: Enable coupled cpuidle support on Exynos3250 platformMarek Szyprowski1-1/+2
All the needed code has been already merged to mach-exynos core in commit af9971144dde ("ARM: EXYNOS: add coupled cpuidle support for Exynos3250"), so enable support for coupled variant also for Exynos3250 SoCs. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Acked-by: Krzysztof Kozlowski <krzk@kernel.org> Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-03-29cpuidle: poll_state: Add time limit to poll_idle()Rafael J. Wysocki1-1/+10
If poll_idle() is allowed to spin until need_resched() returns 'true', it may actually spin for a much longer time than expected by the idle governor, since set_tsk_need_resched() is not always called by the timer interrupt handler. If that happens, the CPU may spend much more time than anticipated in the "polling" state. To prevent that from happening, limit the time of the spinning loop in poll_idle(). Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Doug Smythies <dsmythies@telus.net>
2018-03-05ARM: cpuidle: Drop memory allocation error message from arm_idle_init_cpu()Markus Elfring1-1/+0
Omit an extra message for a memory allocation failure in this function. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-02-02Merge tag 'powerpc-4.16-1' of ↵Linus Torvalds2-4/+14
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Highlights: - Enable support for memory protection keys aka "pkeys" on Power7/8/9 when using the hash table MMU. - Extend our interrupt soft masking to support masking PMU interrupts as well as "normal" interrupts, and then use that to implement local_t for a ~4x speedup vs the current atomics-based implementation. - A new driver "ocxl" for "Open Coherent Accelerator Processor Interface (OpenCAPI)" devices. - Support for new device tree properties on PowerVM to describe hotpluggable memory and devices. - Add support for CLOCK_{REALTIME/MONOTONIC}_COARSE to the 64-bit VDSO. - Freescale updates from Scott: fixes for CPM GPIO and an FSL PCI erratum workaround, plus a minor cleanup patch. As well as quite a lot of other changes all over the place, and small fixes and cleanups as always. Thanks to: Alan Modra, Alastair D'Silva, Alexey Kardashevskiy, Alistair Popple, Andreas Schwab, Andrew Donnellan, Aneesh Kumar K.V, Anju T Sudhakar, Anshuman Khandual, Anton Blanchard, Arnd Bergmann, Balbir Singh, Benjamin Herrenschmidt, Bhaktipriya Shridhar, Bryant G. Ly, Cédric Le Goater, Christophe Leroy, Christophe Lombard, Cyril Bur, David Gibson, Desnes A. Nunes do Rosario, Dmitry Torokhov, Frederic Barrat, Geert Uytterhoeven, Guilherme G. Piccoli, Gustavo A. R. Silva, Gustavo Romero, Ivan Mikhaylov, Joakim Tjernlund, Joe Perches, Josh Poimboeuf, Juan J. Alvarez, Julia Cartwright, Kamalesh Babulal, Madhavan Srinivasan, Mahesh Salgaonkar, Mathieu Malaterre, Michael Bringmann, Michael Hanselmann, Michael Neuling, Nathan Fontenot, Naveen N. Rao, Nicholas Piggin, Paul Mackerras, Philippe Bergheaud, Ram Pai, Russell Currey, Santosh Sivaraj, Scott Wood, Seth Forshee, Simon Guo, Stewart Smith, Sukadev Bhattiprolu, Thiago Jung Bauermann, Vaibhav Jain, Vasyl Gomonovych" * tag 'powerpc-4.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (199 commits) powerpc/mm/radix: Fix build error when RADIX_MMU=n macintosh/ams-input: Use true and false for boolean values macintosh: change some data types from int to bool powerpc/watchdog: Print the NIP in soft_nmi_interrupt() powerpc/watchdog: regs can't be null in soft_nmi_interrupt() powerpc/watchdog: Tweak watchdog printks powerpc/cell: Remove axonram driver rtc-opal: Fix handling of firmware error codes, prevent busy loops powerpc/mpc52xx_gpt: make use of raw_spinlock variants macintosh/adb: Properly mark continued kernel messages powerpc/pseries: Fix cpu hotplug crash with memoryless nodes powerpc/numa: Ensure nodes initialized for hotplug powerpc/numa: Use ibm,max-associativity-domains to discover possible nodes powerpc/kernel: Block interrupts when updating TIDR powerpc/powernv/idoa: Remove unnecessary pcidev from pci_dn powerpc/mm/nohash: do not flush the entire mm when range is a single page powerpc/pseries: Add Initialization of VF Bars powerpc/pseries/pci: Associate PEs to VFs in configure SR-IOV powerpc/eeh: Add EEH notify resume sysfs powerpc/eeh: Add EEH operations to notify resume ...
2018-01-18powerpc/pseries/cpuidle: add polling idle for shared processor guestsNicholas Piggin1-2/+8
For shared processor guests (e.g., KVM), add an idle polling mode rather than immediately returning to the hypervisor when the guest CPU goes idle. Test setup is a 2 socket POWER9 with 4 guests running, each with vCPUs equal to 1/2 of real of CPUs. Saturated each guest with tbench. Using polling idle gives about 1.4x throughput. Kernel compile speed was not changed significantly. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-18cpuidle/powernv: avoid double irq enable coming out of idleNicholas Piggin1-2/+4
Since e1689795a7 ("cpuidle: Add common time keeping and irq enabling"), cpuidle drivers are expected to return from ->enter with irqs disabled. Update the cpuidle-powernv snooze and cede loops to disable irqs before returning. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-18cpuidle/powernv: avoid double irq enable coming out of idleNicholas Piggin1-0/+2
Since e1689795a7 ("cpuidle: Add common time keeping and irq enabling"), cpuidle drivers are expected to return from ->enter with irqs disabled. Update the cpuidle-powernv snooze loop to disable irqs before returning. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-05cpuidle: Avoid NULL argument in cpuidle_switch_governor()gaurav jindal1-2/+3
Checks if the new governor is NULL before updating the cupidle_curr_governor. Signed-off-by: gaurav jindal <gauravjindal1104@gmail.com> [ rjw : Subject ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-11-16Merge tag 'powerpc-4.15-1' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "A bit of a small release, I suspect in part due to me travelling for KS. But my backlog of patches to review is smaller than usual, so I think in part folks just didn't send as much this cycle. Non-highlights: - Five fixes for the >128T address space handling, both to fix bugs in our implementation and to bring the semantics exactly into line with x86. Highlights: - Support for a new OPAL call on bare metal machines which gives us a true NMI (ie. is not masked by MSR[EE]=0) for debugging etc. - Support for Power9 DD2 in the CXL driver. - Improvements to machine check handling so that uncorrectable errors can be reported into the generic memory_failure() machinery. - Some fixes and improvements for VPHN, which is used under PowerVM to notify the Linux partition of topology changes. - Plumbing to enable TM (transactional memory) without suspend on some Power9 processors (PPC_FEATURE2_HTM_NO_SUSPEND). - Support for emulating vector loads form cache-inhibited memory, on some Power9 revisions. - Disable the fast-endian switch "syscall" by default (behind a CONFIG), we believe it has never had any users. - A major rework of the API drivers use when initiating and waiting for long running operations performed by OPAL firmware, and changes to the powernv_flash driver to use the new API. - Several fixes for the handling of FP/VMX/VSX while processes are using transactional memory. - Optimisations of TLB range flushes when using the radix MMU on Power9. - Improvements to the VAS facility used to access coprocessors on Power9, and related improvements to the way the NX crypto driver handles requests. - Implementation of PMEM_API and UACCESS_FLUSHCACHE for 64-bit. Thanks to: Alexey Kardashevskiy, Alistair Popple, Allen Pais, Andrew Donnellan, Aneesh Kumar K.V, Arnd Bergmann, Balbir Singh, Benjamin Herrenschmidt, Breno Leitao, Christophe Leroy, Christophe Lombard, Cyril Bur, Frederic Barrat, Gautham R. Shenoy, Geert Uytterhoeven, Guilherme G. Piccoli, Gustavo Romero, Haren Myneni, Joel Stanley, Kamalesh Babulal, Kautuk Consul, Markus Elfring, Masami Hiramatsu, Michael Bringmann, Michael Neuling, Michal Suchanek, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran, Paul Mackerras, Pedro Miraglia Franco de Carvalho, Philippe Bergheaud, Sandipan Das, Seth Forshee, Shriya, Stephen Rothwell, Stewart Smith, Sukadev Bhattiprolu, Tyrel Datwyler, Vaibhav Jain, Vaidyanathan Srinivasan, and William A. Kennington III" * tag 'powerpc-4.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (151 commits) powerpc/64s: Fix Power9 DD2.0 workarounds by adding DD2.1 feature powerpc/64s: Fix masking of SRR1 bits on instruction fault powerpc/64s: mm_context.addr_limit is only used on hash powerpc/64s/radix: Fix 128TB-512TB virtual address boundary case allocation powerpc/64s/hash: Allow MAP_FIXED allocations to cross 128TB boundary powerpc/64s/hash: Fix fork() with 512TB process address space powerpc/64s/hash: Fix 128TB-512TB virtual address boundary case allocation powerpc/64s/hash: Fix 512T hint detection to use >= 128T powerpc: Fix DABR match on hash based systems powerpc/signal: Properly handle return value from uprobe_deny_signal() powerpc/fadump: use kstrtoint to handle sysfs store powerpc/lib: Implement UACCESS_FLUSHCACHE API powerpc/lib: Implement PMEM API powerpc/powernv/npu: Don't explicitly flush nmmu tlb powerpc/powernv/npu: Use flush_all_mm() instead of flush_tlb_mm() powerpc/powernv/idle: Round up latency and residency values powerpc/kprobes: refactor kprobe_lookup_name for safer string operations powerpc/kprobes: Blacklist emulate_update_regs() from kprobes powerpc/kprobes: Do not disable interrupts for optprobes and kprobes_on_ftrace powerpc/kprobes: Disable preemption before invoking probe handler for optprobes ...
2017-11-13Merge branch 'pm-cpuidle'Rafael J. Wysocki3-69/+105
* pm-cpuidle: intel_idle: Graceful probe failure when MWAIT is disabled cpuidle: Avoid assignment in if () argument cpuidle: Clean up cpuidle_enable_device() error handling a bit cpuidle: ladder: Add per CPU PM QoS resume latency support ARM: cpuidle: Refactor rollback operations if init fails ARM: cpuidle: Correct driver unregistration if init fails intel_idle: replace conditionals with static_cpu_has(X86_FEATURE_ARAT) cpuidle: fix broadcast control when broadcast can not be entered Conflicts: drivers/idle/intel_idle.c
2017-11-13Merge branch 'pm-qos'Rafael J. Wysocki1-2/+2
* pm-qos: PM / QoS: Fix device resume latency framework PM / QoS: Drop PM_QOS_FLAG_REMOTE_WAKEUP
2017-11-13powerpc/powernv/idle: Round up latency and residency valuesVaidyanathan Srinivasan1-2/+2
On PowerNV platforms, firmware provides exit latency and target residency for each of the idle states in nano seconds. Cpuidle framework expects the values in micro seconds. Round up to nearest micro seconds to avoid errors in cases where the values are defined as fractional micro seconds. Default idle state of 'snooze' has exit latency of zero. If other states have fractional micro second exit latency, they would get rounded down to zero micro second and make cpuidle framework choose deeper idle state when snooze loop is the right choice. Reported-by: Anton Blanchard <anton@samba.org> Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-08cpuidle: Avoid assignment in if () argumentGaurav Jindal1-3/+5
Clean up cpuidle_enable_device() to avoid doing an assignment in an expression evaluated as an argument of if (), which also makes the code in question more readable. Signed-off-by: Gaurav Jindal <gauravjindal1104@gmail.com> [ rjw: Subject & changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-11-08cpuidle: Clean up cpuidle_enable_device() error handling a bitGaurav Jindal1-1/+4
Do not fetch per CPU drv if cpuidle_curr_governor is NULL to avoid useless per CPU processing. Signed-off-by: Gaurav Jindal <gauravjindal1104@gmail.com> [ rjw: Subject & changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-11-08cpuidle: ladder: Add per CPU PM QoS resume latency supportRamesh Thomas1-0/+7
Individual CPUs may have special requirements to not enter deep idle states. For example, a CPU running real time applications would not want to enter deep idle states to avoid latency impacts. At the same time other CPUs that do not have such a requirement could allow deep idle states to save power. This was already implemented in the menu governor. Implementing similar changes in the ladder governor which gets selected when CONFIG_NO_HZ and CONFIG_NO_HZ_IDLE are not set. Refer following commits for the menu governor changes. Signed-off-by: Ramesh Thomas <ramesh.thomas@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-11-08Merge branch 'pm-qos' into pm-cpuidleRafael J. Wysocki1-2/+2
2017-11-08PM / QoS: Fix device resume latency frameworkRafael J. Wysocki1-2/+2
The special value of 0 for device resume latency PM QoS means "no restriction", but there are two problems with that. First, device resume latency PM QoS requests with 0 as the value are always put in front of requests with positive values in the priority lists used internally by the PM QoS framework, causing 0 to be chosen as an effective constraint value. However, that 0 is then interpreted as "no restriction" effectively overriding the other requests with specific restrictions which is incorrect. Second, the users of device resume latency PM QoS have no way to specify that *any* resume latency at all should be avoided, which is an artificial limitation in general. To address these issues, modify device resume latency PM QoS to use S32_MAX as the "no constraint" value and 0 as the "no latency at all" one and rework its users (the cpuidle menu governor, the genpd QoS governor and the runtime PM framework) to follow these changes. Also add a special "n/a" value to the corresponding user space I/F to allow user space to indicate that it cannot accept any resume latencies at all for the given device. Fixes: 85dc0b8a4019 (PM / QoS: Make it possible to expose PM QoS latency constraints) Link: https://bugzilla.kernel.org/show_bug.cgi?id=197323 Reported-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Tested-by: Tero Kristo <t-kristo@ti.com> Reviewed-by: Ramesh Thomas <ramesh.thomas@intel.com>
2017-11-03Update MIPS email addressesPaul Burton1-1/+1
MIPS will soon not be a part of Imagination Technologies, and as such many @imgtec.com email addresses will no longer be valid. This patch updates the addresses for those who: - Have 10 or more patches in mainline authored using an @imgtec.com email address, or any patches dated within the past year. - Are still with Imagination but leaving as part of the MIPS business unit, as determined from an internal email address list. - Haven't already updated their email address (ie. JamesH) or expressed a desire to be excluded (ie. Maciej). - Acked v2 or earlier of this patch, which leaves Deng-Cheng, Matt & myself. New addresses are of the form firstname.lastname@mips.com, and all verified against an internal email address list. An entry is added to .mailmap for each person such that get_maintainer.pl will report the new addresses rather than @imgtec.com addresses which will soon be dead. Instances of the affected addresses throughout the tree are then mechanically replaced with the new @mips.com address. Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@imgtec.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@mips.com> Acked-by: Dengcheng Zhu <dengcheng.zhu@mips.com> Cc: Matt Redfearn <matt.redfearn@imgtec.com> Cc: Matt Redfearn <matt.redfearn@mips.com> Acked-by: Matt Redfearn <matt.redfearn@mips.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: linux-kernel@vger.kernel.org Cc: linux-mips@linux-mips.org Cc: trivial@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-02License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman5-0/+5
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-14ARM: cpuidle: Refactor rollback operations if init failsLeo Yan1-63/+82
If init fails, we need execute two levels rollback operations: the first level is for the failed CPU rollback operations, the second level is to iterate all succeeded CPUs to cancel their registration; currently the code uses one function to finish these two levels rollback operations. This commit is to refactor rollback operations, so it adds a new function arm_idle_init_cpu() to encapsulate one specified CPU driver registration and rollback the first level operations; and use function arm_idle_init() to iterate all CPUs and finish the second level's rollback operations. Suggested-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-10-14ARM: cpuidle: Correct driver unregistration if init failsLeo Yan1-9/+13
If cpuidle init fails, the code misses to unregister the driver for current CPU. Furthermore, we also need to rollback to cancel all previous CPUs registration; but the code retrieves driver handler by using function cpuidle_get_driver(), this function returns back current CPU driver handler but not previous CPU's handler, which leads to the failure handling code cannot unregister previous CPUs driver. This commit fixes two mentioned issues, it adds error handling path 'goto out_unregister_drv' for current CPU driver unregistration; and it is to replace cpuidle_get_driver() with cpuidle_get_cpu_driver(), the later function can retrieve driver handler for previous CPUs according to the CPU device handler so can unregister the driver properly. This patch also adds extra error handling paths 'goto out_kfree_dev' and 'goto out_kfree_drv' and adjusts the freeing sentences for previous CPUs; so make the code more readable for freeing 'dev' and 'drv' structures. Suggested-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Leo Yan <leo.yan@linaro.org> Fixes: d50a7d8acd78 (ARM: cpuidle: Support asymmetric idle definition) Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-09-28cpuidle: fix broadcast control when broadcast can not be enteredNicholas Piggin1-0/+1
When failing to enter broadcast timer mode for an idle state that requires it, a new state is selected that does not require broadcast, but the broadcast variable remains set. This causes tick_broadcast_exit to be called despite not having entered broadcast mode. This causes the WARN_ON_ONCE(!irqs_disabled()) to trigger in some cases. It does not appear to cause problems for code today, but seems to violate the interface so should be fixed. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-09-19ARM: cpuidle: Avoid memleak if init failStefan Wahren1-2/+4
In case there are no DT idle states defined or cpuidle_register_driver() fails, the copy of the idle driver is leaked: unreferenced object 0xede0dc00 (size 1024): comm "swapper/0", pid 1, jiffies 4294937431 (age 744.510s) hex dump (first 32 bytes): 94 9e 0b c1 00 00 00 00 00 00 00 00 00 00 00 00 ................ 57 46 49 00 00 00 00 00 00 00 00 00 00 00 00 00 WFI............. backtrace: [<c1295f04>] arm_idle_init+0x44/0x1ac [<c0301e6c>] do_one_initcall+0x3c/0x16c [<c1200d70>] kernel_init_freeable+0x110/0x1d0 [<c0cb3624>] kernel_init+0x8/0x114 [<c0307a98>] ret_from_fork+0x14/0x3c So fix this by freeing the unregistered copy in error case. Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com> Fixes: d50a7d8acd78 (ARM: cpuidle: Support asymmetric idle definition) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-09-15Merge branch '4.14-features' of ↵Linus Torvalds1-1/+1
git://git.linux-mips.org/pub/scm/ralf/upstream-linus Pull MIPS updates from Ralf Baechle: "This is the main pull request for 4.14 for MIPS; below a summary of the non-merge commits: CM: - Rename mips_cm_base to mips_gcr_base - Specify register size when generating accessors - Use BIT/GENMASK for register fields, order & drop shifts - Add cluster & block args to mips_cm_lock_other() CPC: - Use common CPS accessor generation macros - Use BIT/GENMASK for register fields, order & drop shifts - Introduce register modify (set/clear/change) accessors - Use change_*, set_* & clear_* where appropriate - Add CM/CPC 3.5 register definitions - Use GlobalNumber macros rather than magic numbers - Have asm/mips-cps.h include CM & CPC headers - Cluster support for topology functions - Detect CPUs in secondary clusters CPS: - Read GIC_VL_IDENT directly, not via irqchip driver DMA: - Consolidate coherent and non-coherent dma_alloc code - Don't use dma_cache_sync to implement fd_cacheflush FPU emulation / FP assist code: - Another series of 14 commits fixing corner cases such as NaN propgagation and other special input values. - Zero bits 32-63 of the result for a CLASS.D instruction. - Enhanced statics via debugfs - Do not use bools for arithmetic. GCC 7.1 moans about this. - Correct user fault_addr type Generic MIPS: - Enhancement of stack backtraces - Cleanup from non-existing options - Handle non word sized instructions when examining frame - Fix detection and decoding of ADDIUSP instruction - Fix decoding of SWSP16 instruction - Refactor handling of stack pointer in get_frame_info - Remove unreachable code from force_fcr31_sig() - Convert to using %pOF instead of full_name - Remove the R6000 support. - Move FP code from *_switch.S to *_fpu.S - Remove unused ST_OFF from r2300_switch.S - Allow platform to specify multiple its.S files - Add #includes to various files to ensure code builds reliable and without warning.. - Remove __invalidate_kernel_vmap_range - Remove plat_timer_setup - Declare various variables & functions static - Abstract CPU core & VP(E) ID access through accessor functions - Store core & VP IDs in GlobalNumber-style variable - Unify checks for sibling CPUs - Add CPU cluster number accessors - Prevent direct use of generic_defconfig - Make CONFIG_MIPS_MT_SMP default y - Add __ioread64_copy - Remove unnecessary inclusions of linux/irqchip/mips-gic.h GIC: - Introduce asm/mips-gic.h with accessor functions - Use new GIC accessor functions in mips-gic-timer - Remove counter access functions from irq-mips-gic.c - Remove gic_read_local_vp_id() from irq-mips-gic.c - Simplify shared interrupt pending/mask reads in irq-mips-gic.c - Simplify gic_local_irq_domain_map() in irq-mips-gic.c - Drop gic_(re)set_mask() functions in irq-mips-gic.c - Remove gic_set_polarity(), gic_set_trigger(), gic_set_dual_edge(), gic_map_to_pin() and gic_map_to_vpe() from irq-mips-gic.c. - Convert remaining shared reg access, local int mask access and remaining local reg access to new accessors - Move GIC_LOCAL_INT_* to asm/mips-gic.h - Remove GIC_CPU_INT* macros from irq-mips-gic.c - Move various definitions to the driver - Remove gic_get_usm_range() - Remove __gic_irq_dispatch() forward declaration - Remove gic_init() - Use mips_gic_present() in place of gic_present and remove gic_present - Move gic_get_c0_*_int() to asm/mips-gic.h - Remove linux/irqchip/mips-gic.h - Inline __gic_init() - Inline gic_basic_init() - Make pcpu_masks a per-cpu variable - Use pcpu_masks to avoid reading GIC_SH_MASK* - Clean up mti, reserved-cpu-vectors handling - Use cpumask_first_and() in gic_set_affinity() - Let the core set struct irq_common_data affinity microMIPS: - Fix microMIPS stack unwinding on big endian systems MIPS-GIC: - SYNC after enabling GIC region NUMA: - Remove the unused parent_node() macro R6: - Constify r2_decoder_tables - Add accessor & bit definitions for GlobalNumber SMP: - Constify smp ops - Allow boot_secondary SMP op to return errors VDSO: - Drop gic_get_usm_range() usage - Avoid use of linux/irqchip/mips-gic.h Platform changes: Alchemy: - Add devboard machine type to cpuinfo - update cpu feature overrides - Threaded carddetect irqs for devboards AR7: - allow NULL clock for clk_get_rate BCM63xx: - Fix ENETDMA_6345_MAXBURST_REG offset - Allow NULL clock for clk_get_rate CI20: - Enable GPIO and RTC drivers in defconfig - Add ethernet and fixed-regulator nodes to DTS Generic platform: - Move Boston and NI 169445 FIT image source to their own files - Include asm/bootinfo.h for plat_fdt_relocated() - Include asm/time.h for get_c0_*_int() - Include asm/bootinfo.h for plat_fdt_relocated() - Include asm/time.h for get_c0_*_int() - Allow filtering enabled boards by requirements - Don't explicitly disable CONFIG_USB_SUPPORT - Bump default NR_CPUS to 16 JZ4700: - Probe the jz4740-rtc driver from devicetree Lantiq: - Drop check of boot select from the spi-falcon driver. - Drop check of boot select from the lantiq-flash MTD driver. - Access boot cause register in the watchdog driver through regmap - Add device tree binding documentation for the watchdog driver - Add docs for the RCU DT bindings. - Convert the fpi bus driver to a platform_driver - Remove ltq_reset_cause() and ltq_boot_select( - Switch to a proper reset driver - Switch to a new drivers/soc GPHY driver - Add an USB PHY driver for the Lantiq SoCs using the RCU module - Use of_platform_default_populate instead of __dt_register_buses - Enable MFD_SYSCON to be able to use it for the RCU MFD - Replace ltq_boot_select() with dummy implementation. Loongson 2F: - Allow NULL clock for clk_get_rate Malta: - Use new GIC accessor functions NI 169445: - Add support for NI 169445 board. - Only include in 32r2el kernels Octeon: - Add support for watchdog of 78XX SOCs. - Add support for watchdog of CN68XX SOCs. - Expose support for mips32r1, mips32r2 and mips64r1 - Enable more drivers in config file - Add support for accessing the boot vector. - Remove old boot vector code from watchdog driver - Define watchdog registers for 70xx, 73xx, 78xx, F75xx. - Make CSR functions node aware. - Allow access to CIU3 IRQ domains. - Misc cleanups in the watchdog driver Omega2+: - New board, add support and defconfig Pistachio: - Enable Root FS on NFS in defconfig Ralink: - Add Mediatek MT7628A SoC - Allow NULL clock for clk_get_rate - Explicitly request exclusive reset control in the pci-mt7620 PCI driver. SEAD3: - Only include in 32 bit kernels by default VoCore: - Add VoCore as a vendor t0 dt-bindings - Add defconfig file" * '4.14-features' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (167 commits) MIPS: Refactor handling of stack pointer in get_frame_info MIPS: Stacktrace: Fix microMIPS stack unwinding on big endian systems MIPS: microMIPS: Fix decoding of swsp16 instruction MIPS: microMIPS: Fix decoding of addiusp instruction MIPS: microMIPS: Fix detection of addiusp instruction MIPS: Handle non word sized instructions when examining frame MIPS: ralink: allow NULL clock for clk_get_rate MIPS: Loongson 2F: allow NULL clock for clk_get_rate MIPS: BCM63XX: allow NULL clock for clk_get_rate MIPS: AR7: allow NULL clock for clk_get_rate MIPS: BCM63XX: fix ENETDMA_6345_MAXBURST_REG offset mips: Save all registers when saving the frame MIPS: Add DWARF unwinding to assembly MIPS: Make SAVE_SOME more standard MIPS: Fix issues in backtraces MIPS: jz4780: DTS: Probe the jz4740-rtc driver from devicetree MIPS: Ci20: Enable RTC driver watchdog: octeon-wdt: Add support for 78XX SOCs. watchdog: octeon-wdt: Add support for cn68XX SOCs. watchdog: octeon-wdt: File cleaning. ...
2017-09-05Merge tag 'pm-4.14-rc1' of ↵Linus Torvalds7-67/+72
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "This time (again) cpufreq gets the majority of changes which mostly are driver updates (including a major consolidation of intel_pstate), some schedutil governor modifications and core cleanups. There also are some changes in the system suspend area, mostly related to diagnostics and debug messages plus some renames of things related to suspend-to-idle. One major change here is that suspend-to-idle is now going to be preferred over S3 on systems where the ACPI tables indicate to do so and provide requsite support (the Low Power Idle S0 _DSM in particular). The system sleep documentation and the tools related to it are updated too. The rest is a few cpuidle changes (nothing major), devfreq updates, generic power domains (genpd) framework updates and a few assorted modifications elsewhere. Specifics: - Drop the P-state selection algorithm based on a PID controller from intel_pstate and make it use the same P-state selection method (based on the CPU load) for all types of systems in the active mode (Rafael Wysocki, Srinivas Pandruvada). - Rework the cpufreq core and governors to make it possible to take cross-CPU utilization updates into account and modify the schedutil governor to actually do so (Viresh Kumar). - Clean up the handling of transition latency information in the cpufreq core and untangle it from the information on which drivers cannot do dynamic frequency switching (Viresh Kumar). - Add support for new SoCs (MT2701/MT7623 and MT7622) to the mediatek cpufreq driver and update its DT bindings (Sean Wang). - Modify the cpufreq dt-platdev driver to autimatically create cpufreq devices for the new (v2) Operating Performance Points (OPP) DT bindings and update its whitelist of supported systems (Viresh Kumar, Shubhrajyoti Datta, Marc Gonzalez, Khiem Nguyen, Finley Xiao). - Add support for Ux500 to the cpufreq-dt driver and drop the obsolete dbx500 cpufreq driver (Linus Walleij, Arnd Bergmann). - Add new SoC (R8A7795) support to the cpufreq rcar driver (Khiem Nguyen). - Fix and clean up assorted issues in the cpufreq drivers and core (Arvind Yadav, Christophe Jaillet, Colin Ian King, Gustavo Silva, Julia Lawall, Leonard Crestez, Rob Herring, Sudeep Holla). - Update the IO-wait boost handling in the schedutil governor to make it less aggressive (Joel Fernandes). - Rework system suspend diagnostics to make it print fewer messages to the kernel log by default, add a sysfs knob to allow more suspend-related messages to be printed and add Low Power S0 Idle constraints checks to the ACPI suspend-to-idle code (Rafael Wysocki, Srinivas Pandruvada). - Prefer suspend-to-idle over S3 on ACPI-based systems with the ACPI_FADT_LOW_POWER_S0 flag set and the Low Power Idle S0 _DSM interface present in the ACPI tables (Rafael Wysocki). - Update documentation related to system sleep and rename a number of items in the code to make it cleare that they are related to suspend-to-idle (Rafael Wysocki). - Export a variable allowing device drivers to check the target system sleep state from the core system suspend code (Florian Fainelli). - Clean up the cpuidle subsystem to handle the polling state on x86 in a more straightforward way and to use %pOF instead of full_name (Rafael Wysocki, Rob Herring). - Update the devfreq framework to fix and clean up a few minor issues (Chanwoo Choi, Rob Herring). - Extend diagnostics in the generic power domains (genpd) framework and clean it up slightly (Thara Gopinath, Rob Herring). - Fix and clean up a couple of issues in the operating performance points (OPP) framework (Viresh Kumar, Waldemar Rymarkiewicz). - Add support for RV1108 to the rockchip-io Adaptive Voltage Scaling (AVS) driver (David Wu). - Fix the usage of notifiers in CPU power management on some platforms (Alex Shi). - Update the pm-graph system suspend/hibernation and boot profiling utility (Todd Brandt). - Make it possible to run the cpupower utility without CPU0 (Prarit Bhargava)" * tag 'pm-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (87 commits) cpuidle: Make drivers initialize polling state cpuidle: Move polling state initialization code to separate file cpuidle: Eliminate the CPUIDLE_DRIVER_STATE_START symbol cpufreq: imx6q: Fix imx6sx low frequency support cpufreq: speedstep-lib: make several arrays static, makes code smaller PM: docs: Delete the obsolete states.txt document PM: docs: Describe high-level PM strategies and sleep states PM / devfreq: Fix memory leak when fail to register device PM / devfreq: Add dependency on PM_OPP PM / devfreq: Move private devfreq_update_stats() into devfreq PM / devfreq: Convert to using %pOF instead of full_name PM / AVS: rockchip-io: add io selectors and supplies for RV1108 cpufreq: ti: Fix 'of_node_put' being called twice in error handling path cpufreq: dt-platdev: Drop few entries from whitelist cpufreq: dt-platdev: Automatically create cpufreq device with OPP v2 ARM: ux500: don't select CPUFREQ_DT cpuidle: Convert to using %pOF instead of full_name cpufreq: Convert to using %pOF instead of full_name PM / Domains: Convert to using %pOF instead of full_name cpufreq: Cap the default transition delay value to 10 ms ...
2017-09-04Merge branch 'pm-sleep'Rafael J. Wysocki2-11/+11
* pm-sleep: ACPI / PM: Check low power idle constraints for debug only PM / s2idle: Rename platform operations structure PM / s2idle: Rename ->enter_freeze to ->enter_s2idle PM / s2idle: Rename freeze_state enum and related items PM / s2idle: Rename PM_SUSPEND_FREEZE to PM_SUSPEND_TO_IDLE ACPI / PM: Prefer suspend-to-idle over S3 on some systems platform/x86: intel-hid: Wake up Dell Latitude 7275 from suspend-to-idle PM / suspend: Define pr_fmt() in suspend.c PM / suspend: Use mem_sleep_labels[] strings in messages PM / sleep: Put pm_test under CONFIG_PM_SLEEP_DEBUG PM / sleep: Check pm_wakeup_pending() in __device_suspend_noirq() PM / core: Add error argument to dpm_show_time() PM / core: Split dpm_suspend_noirq() and dpm_resume_noirq() PM / s2idle: Rearrange the main suspend-to-idle loop PM / timekeeping: Print debug messages when requested PM / sleep: Mark suspend/hibernation start and finish PM / sleep: Do not print debug messages by default PM / suspend: Export pm_suspend_target_state
2017-08-30cpuidle: Make drivers initialize polling stateRafael J. Wysocki2-3/+2
Make the drivers that want to include the polling state into their states table initialize it explicitly and drop the initialization of it (which in fact is conditional, but that is not obvious from the code) from the core. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-08-30cpuidle: Move polling state initialization code to separate fileRafael J. Wysocki3-31/+37
Move the polling state initialization code to a separate file built conditionally on CONFIG_ARCH_HAS_CPU_RELAX to get rid of the #ifdef in driver.c. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-08-30cpuidle: Eliminate the CPUIDLE_DRIVER_STATE_START symbolRafael J. Wysocki3-14/+14
On some architectures the first (index 0) idle state is a polling one and it doesn't really save energy, so there is the CPUIDLE_DRIVER_STATE_START symbol allowing some pieces of cpuidle code to avoid using that state. However, this makes the code rather hard to follow. It is better to explicitly avoid the polling state, so add a new cpuidle state flag CPUIDLE_FLAG_POLLING to mark it and make the relevant code check that flag for the first state instead of using the CPUIDLE_DRIVER_STATE_START symbol. In the ACPI processor driver that cannot always rely on the state flags (like before the states table has been set up) define a new internal symbol ACPI_IDLE_STATE_START equivalent to the CPUIDLE_DRIVER_STATE_START one and drop the latter. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-08-30MIPS: Unify checks for sibling CPUsPaul Burton1-1/+1
Up until now we have open-coded checks for whether CPUs are siblings, with slight variations on whether we consider the package ID or not. This will only get more complex when we introduce cluster support, so in preparation for that this patch introduces a cpus_are_siblings() function which can be used to check whether or not 2 CPUs are siblings in a consistent manner. By checking globalnumber with the VP ID masked out this also has the neat side effect of being ready for multi-cluster systems already. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Acked-by: Rafael J. Wysocki <rjw@rjwysocki.net> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/17011/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2017-08-30MIPS: Abstract CPU core & VP(E) ID access through accessor functionsPaul Burton1-1/+1
We currently have fields in struct cpuinfo_mips for the core & VP(E) ID of a particular CPU, and various pieces of code directly access those fields. This patch abstracts such access by introducing accessor functions cpu_core(), cpu_set_core(), cpu_vpe_id() & cpu_set_vpe_id() and having code that needs to access these values call those functions rather than directly accessing the struct cpuinfo_mips fields. This prepares us for changes to the way in which those values are stored in later patches. The cpu_vpe_id() function is introduced even though we already had a cpu_vpe_id() macro for a couple of reasons: 1) It's more consistent with the core, and future cluster, accessors. 2) It ensures a sensible return type without explicit casts. 3) It's generally preferable to use functions rather than macros. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/17009/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2017-08-29smp: Avoid using two cache lines for struct call_single_dataYing Huang1-5/+5
struct call_single_data is used in IPIs to transfer information between CPUs. Its size is bigger than sizeof(unsigned long) and less than cache line size. Currently it is not allocated with any explicit alignment requirements. This makes it possible for allocated call_single_data to cross two cache lines, which results in double the number of the cache lines that need to be transferred among CPUs. This can be fixed by requiring call_single_data to be aligned with the size of call_single_data. Currently the size of call_single_data is the power of 2. If we add new fields to call_single_data, we may need to add padding to make sure the size of new definition is the power of 2 as well. Fortunately, this is enforced by GCC, which will report bad sizes. To set alignment requirements of call_single_data to the size of call_single_data, a struct definition and a typedef is used. To test the effect of the patch, I used the vm-scalability multiple thread swap test case (swap-w-seq-mt). The test will create multiple threads and each thread will eat memory until all RAM and part of swap is used, so that huge number of IPIs are triggered when unmapping memory. In the test, the throughput of memory writing improves ~5% compared with misaligned call_single_data, because of faster IPIs. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Huang, Ying <ying.huang@intel.com> [ Add call_single_data_t and align with size of call_single_data. ] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Aaron Lu <aaron.lu@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/87bmnqd6lz.fsf@yhuang-mobile.sh.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-25cpuidle: Convert to using %pOF instead of full_nameRob Herring1-10/+10
Now that we have a custom printf format specifier, convert users of full_name to use %pOF instead. This is preparation to remove storing of the full path string for each node. Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>