summaryrefslogtreecommitdiff
path: root/kernel/softirq.c
AgeCommit message (Collapse)AuthorFilesLines
2019-02-10softirq: Don't skip softirq execution when softirq thread is parkingMatthias Kaehlcke1-1/+2
When a CPU is unplugged the kernel threads of this CPU are parked (see smpboot_park_threads()). kthread_park() is used to mark each thread as parked and wake it up, so it can complete the process of parking itselfs (see smpboot_thread_fn()). If local softirqs are pending on interrupt exit invoke_softirq() is called to process the softirqs, however it skips processing when the softirq kernel thread of the local CPU is scheduled to run. The softirq kthread is one of the threads that is parked when a CPU is unplugged. Parking the kthread wakes it up, however only to complete the parking process, not to process the pending softirqs. Hence processing of softirqs at the end of an interrupt is skipped, but not done elsewhere, which can result in warnings about pending softirqs when a CPU is unplugged: /sys/devices/system/cpu # echo 0 > cpu4/online [ ... ] NOHZ: local_softirq_pending 02 [ ... ] NOHZ: local_softirq_pending 202 [ ... ] CPU4: shutdown [ ... ] psci: CPU4 killed. Don't skip processing of softirqs at the end of an interrupt when the softirq thread of the CPU is parking. Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Douglas Anderson <dianders@chromium.org> Cc: Stephen Boyd <swboyd@chromium.org> Link: https://lkml.kernel.org/r/20190128234625.78241-3-mka@chromium.org
2018-10-25Merge branch 'irq-core-for-linus' of ↵Linus Torvalds1-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "The interrupt brigade came up with the following updates: - Driver for the Marvell System Error Interrupt machinery - Overhaul of the GIC-V3 ITS driver - Small updates and fixes all over the place" * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits) genirq: Fix race on spurious interrupt detection softirq: Fix typo in __do_softirq() comments genirq: Fix grammar s/an /a / irqchip/gic: Unify GIC priority definitions irqchip/gic-v3: Remove acknowledge loop dt-bindings/interrupt-controller: Add documentation for Marvell SEI controller dt-bindings/interrupt-controller: Update Marvell ICU bindings irqchip/irq-mvebu-icu: Add support for System Error Interrupts (SEI) arm64: marvell: Enable SEI driver irqchip/irq-mvebu-sei: Add new driver for Marvell SEI irqchip/irq-mvebu-icu: Support ICU subnodes irqchip/irq-mvebu-icu: Disociate ICU and NSR irqchip/irq-mvebu-icu: Clarify the reset operation of configured interrupts irqchip/irq-mvebu-icu: Fix wrong private data retrieval dt-bindings/interrupt-controller: Fix Marvell ICU length in the example genirq/msi: Allow creation of a tree-based irqdomain for platform-msi dt-bindings: irqchip: renesas-irqc: Document r8a7744 support dt-bindings: irqchip: renesas-irqc: Document R-Car E3 support irqchip/pdc: Setup all edge interrupts as rising edge at GIC irqchip/gic-v3-its: Allow use of LPI tables in reserved memory ...
2018-10-18softirq: Fix typo in __do_softirq() commentsYangtao Li1-3/+3
s/s/as [ mingo: Also add a missing 'the', add proper punctuation and clarify what 'swap' means here. ] Signed-off-by: Yangtao Li <tiny.windzz@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: alexander.levin@verizon.com Cc: frederic@kernel.org Cc: joel@joelfernandes.org Cc: paulmck@linux.vnet.ibm.com Cc: rostedt@goodmis.org Link: http://lkml.kernel.org/r/20181018142133.12341-1-tiny.windzz@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-08-30rcu: Define RCU-bh update API in terms of RCUPaul E. McKenney1-1/+0
Now that the main RCU API knows about softirq disabling and softirq's quiescent states, the RCU-bh update code can be dispensed with. This commit therefore removes the RCU-bh update-side implementation and defines RCU-bh's update-side API in terms of that of either RCU-preempt or RCU-sched, depending on the setting of the CONFIG_PREEMPT Kconfig option. In kernels built with CONFIG_RCU_NOCB_CPU=y this has the knock-on effect of reducing by one the number of rcuo kthreads per CPU. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30rcu: Apply RCU-bh QSes to RCU-sched and RCU-preempt when safePaul E. McKenney1-0/+2
One necessary step towards consolidating the three flavors of RCU is to make sure that the resulting consolidated "one flavor to rule them all" correctly handles networking denial-of-service attacks. One thing that allows RCU-bh to do so is that __do_softirq() invokes rcu_bh_qs() every so often, and so something similar has to happen for consolidated RCU. This must be done carefully. For example, if a preemption-disabled region of code takes an interrupt which does softirq processing before returning, consolidated RCU must ignore the resulting rcu_bh_qs() invocations -- preemption is still disabled, and that means an RCU reader for the consolidated flavor. This commit therefore creates a new rcu_softirq_qs() that is called only from the ksoftirqd task, thus avoiding the interrupted-a-preempted-region problem. This new rcu_softirq_qs() function invokes rcu_sched_qs(), rcu_preempt_qs(), and rcu_preempt_deferred_qs(). The latter call handles any deferred quiescent states. Note that __do_softirq() still invokes rcu_bh_qs(). It will continue to do so until a later stage of cleanup when the RCU-bh flavor is removed. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> [ paulmck: Fix !SMP issue located by kbuild test robot. ]
2018-08-03nohz: Fix missing tick reprogram when interrupting an inline softirqFrederic Weisbecker1-1/+1
The full nohz tick is reprogrammed in irq_exit() only if the exit is not in a nesting interrupt. This stands as an optimization: whether a hardirq or a softirq is interrupted, the tick is going to be reprogrammed when necessary at the end of the inner interrupt, with even potential new updates on the timer queue. When soft interrupts are interrupted, it's assumed that they are executing on the tail of an interrupt return. In that case tick_nohz_irq_exit() is called after softirq processing to take care of the tick reprogramming. But the assumption is wrong: softirqs can be processed inline as well, ie: outside of an interrupt, like in a call to local_bh_enable() or from ksoftirqd. Inline softirqs don't reprogram the tick once they are done, as opposed to interrupt tail softirq processing. So if a tick interrupts an inline softirq processing, the next timer will neither be reprogrammed from the interrupting tick's irq_exit() nor after the interrupted softirq processing. This situation may leave the tick unprogrammed while timers are armed. To fix this, simply keep reprogramming the tick even if a softirq has been interrupted. That can be optimized further, but for now correctness is more important. Note that new timers enqueued in nohz_full mode after a softirq gets interrupted will still be handled just fine through self-IPIs triggered by the timer code. Reported-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Cc: stable@vger.kernel.org # 4.14+ Link: https://lkml.kernel.org/r/1533303094-15855-1-git-send-email-frederic@kernel.org
2018-07-17Mark HI and TASKLET softirq synchronousLinus Torvalds1-4/+8
Way back in 4.9, we committed 4cd13c21b207 ("softirq: Let ksoftirqd do its job"), and ever since we've had small nagging issues with it. For example, we've had: 1ff688209e2e ("watchdog: core: make sure the watchdog_worker is not deferred") 8d5755b3f77b ("watchdog: softdog: fire watchdog even if softirqs do not get to run") 217f69743681 ("net: busy-poll: allow preemption in sk_busy_loop()") all of which worked around some of the effects of that commit. The DVB people have also complained that the commit causes excessive USB URB latencies, which seems to be due to the USB code using tasklets to schedule USB traffic. This seems to be an issue mainly when already living on the edge, but waiting for ksoftirqd to handle it really does seem to cause excessive latencies. Now Hanna Hawa reports that this issue isn't just limited to USB URB and DVB, but also causes timeout problems for the Marvell SoC team: "I'm facing kernel panic issue while running raid 5 on sata disks connected to Macchiatobin (Marvell community board with Armada-8040 SoC with 4 ARMv8 cores of CA72) Raid 5 built with Marvell DMA engine and async_tx mechanism (ASYNC_TX_DMA [=y]); the DMA driver (mv_xor_v2) uses a tasklet to clean the done descriptors from the queue" The latency problem causes a panic: mv_xor_v2 f0400000.xor: dma_sync_wait: timeout! Kernel panic - not syncing: async_tx_quiesce: DMA error waiting for transaction We've discussed simply just reverting the original commit entirely, and also much more involved solutions (with per-softirq threads etc). This patch is intentionally stupid and fairly limited, because the issue still remains, and the other solutions either got sidetracked or had other issues. We should probably also consider the timer softirqs to be synchronous and not be delayed to ksoftirqd (since they were the issue with the earlier watchdog problems), but that should be done as a separate patch. This does only the tasklet cases. Reported-and-tested-by: Hanna Hawa <hannah@marvell.com> Reported-and-tested-by: Josef Griebichler <griebichler.josef@gmx.at> Reported-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-06-21softirq: Reorder trace_softirqs_on to prevent lockdep splatJoel Fernandes (Google)1-1/+5
I'm able to reproduce a lockdep splat with config options: CONFIG_PROVE_LOCKING=y, CONFIG_DEBUG_LOCK_ALLOC=y and CONFIG_PREEMPTIRQ_EVENTS=y $ echo 1 > /d/tracing/events/preemptirq/preempt_enable/enable [ 26.112609] DEBUG_LOCKS_WARN_ON(current->softirqs_enabled) [ 26.112636] WARNING: CPU: 0 PID: 118 at kernel/locking/lockdep.c:3854 [...] [ 26.144229] Call Trace: [ 26.144926] <IRQ> [ 26.145506] lock_acquire+0x55/0x1b0 [ 26.146499] ? __do_softirq+0x46f/0x4d9 [ 26.147571] ? __do_softirq+0x46f/0x4d9 [ 26.148646] trace_preempt_on+0x8f/0x240 [ 26.149744] ? trace_preempt_on+0x4d/0x240 [ 26.150862] ? __do_softirq+0x46f/0x4d9 [ 26.151930] preempt_count_sub+0x18a/0x1a0 [ 26.152985] __do_softirq+0x46f/0x4d9 [ 26.153937] irq_exit+0x68/0xe0 [ 26.154755] smp_apic_timer_interrupt+0x271/0x280 [ 26.156056] apic_timer_interrupt+0xf/0x20 [ 26.157105] </IRQ> The issue was this: preempt_count = 1 << SOFTIRQ_SHIFT __local_bh_enable(cnt = 1 << SOFTIRQ_SHIFT) { if (softirq_count() == (cnt && SOFTIRQ_MASK)) { trace_softirqs_on() { current->softirqs_enabled = 1; } } preempt_count_sub(cnt) { trace_preempt_on() { tracepoint() { rcu_read_lock_sched() { // jumps into lockdep Where preempt_count still has softirqs disabled, but current->softirqs_enabled is true, and we get a splat. Link: http://lkml.kernel.org/r/20180607201143.247775-1-joel@joelfernandes.org Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Tom Zanussi <tom.zanussi@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Glexiner <tglx@linutronix.de> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Todd Kjos <tkjos@google.com> Cc: Erick Reyes <erickreyes@google.com> Cc: Julia Cartwright <julia@ni.com> Cc: Byungchul Park <byungchul.park@lge.com> Cc: stable@vger.kernel.org Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Fixes: d59158162e032 ("tracing: Add support for preempt and irq enable/disable events") Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-06-04Merge branch 'irq-core-for-linus' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: - Consolidation of softirq pending: The softirq mask and its accessors/mutators have many implementations scattered around many architectures. Most do the same things consisting in a field in a per-cpu struct (often irq_cpustat_t) accessed through per-cpu ops. We can provide instead a generic efficient version that most of them can use. In fact s390 is the only exception because the field is stored in lowcore. - Support for level!?! triggered MSI (ARM) Over the past couple of years, we've seen some SoCs coming up with ways of signalling level interrupts using a new flavor of MSIs, where the MSI controller uses two distinct messages: one that raises a virtual line, and one that lowers it. The target MSI controller is in charge of maintaining the state of the line. This allows for a much simplified HW signal routing (no need to have hundreds of discrete lines to signal level interrupts if you already have a memory bus), but results in a departure from the current idea the kernel has of MSIs. - Support for Meson-AXG GPIO irqchip - Large stm32 irqchip rework (suspend/resume, hierarchical domains) - More SPDX conversions * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits) ARM: dts: stm32: Add exti support to stm32mp157 pinctrl ARM: dts: stm32: Add exti support for stm32mp157c pinctrl/stm32: Add irq_eoi for stm32gpio irqchip irqchip/stm32: Add suspend/resume support for hierarchy domain irqchip/stm32: Add stm32mp1 support with hierarchy domain irqchip/stm32: Prepare common functions irqchip/stm32: Add host and driver data structures irqchip/stm32: Add suspend support irqchip/stm32: Add falling pending register support irqchip/stm32: Checkpatch fix irqchip/stm32: Optimizes and cleans up stm32-exti irq_domain irqchip/meson-gpio: Add support for Meson-AXG SoCs dt-bindings: interrupt-controller: New binding for Meson-AXG SoC dt-bindings: interrupt-controller: Fix the double quotes softirq/s390: Move default mutators of overwritten softirq mask to s390 softirq/x86: Switch to generic local_softirq_pending() implementation softirq/sparc: Switch to generic local_softirq_pending() implementation softirq/powerpc: Switch to generic local_softirq_pending() implementation softirq/parisc: Switch to generic local_softirq_pending() implementation softirq/ia64: Switch to generic local_softirq_pending() implementation ...
2018-05-15softirq: Eliminate unused cond_resched_softirq() macroPaul E. McKenney1-2/+1
The cond_resched_softirq() macro is not used anywhere in mainline, so this commit simplifies the kernel by eliminating it. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Tested-by: Nicholas Piggin <npiggin@gmail.com>
2018-05-14softirq/core: Turn default irq_cpustat_t to standard per-cpuFrederic Weisbecker1-2/+2
In order to optimize and consolidate softirq mask accesses, let's convert the default irq_cpustat_t implementation to per-CPU standard API. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David S. Miller <davem@davemloft.net> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Rich Felker <dalias@libc.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/1525786706-22846-5-git-send-email-frederic@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-09softirq: Consolidate common code in tasklet_[hi]_action()Ingo Molnar1-39/+15
tasklet_action() + tasklet_hi_action() are almost identical. Move the common code from both function into __tasklet_action_common() and let both functions invoke it with different arguments. [ bigeasy: Splitted out from RT's "tasklet: Prevent tasklets from going into infinite spin in RT" and added commit message] Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Julia Cartwright <juliac@eso.teric.us> Link: https://lkml.kernel.org/r/20180227164808.10093-3-bigeasy@linutronix.de
2018-03-09softirq: Consolidate common code in __tasklet_[hi]_schedule()Ingo Molnar1-12/+16
__tasklet_schedule() and __tasklet_hi_schedule() are almost identical. Move the common code from both function into __tasklet_schedule_common() and let both functions invoke it with different arguments. [ bigeasy: Splitted out from RT's "tasklet: Prevent tasklets from going into infinite spin in RT" and added commit message. Use this_cpu_ptr(headp) in __tasklet_schedule_common() as suggested by Julia Cartwright ] Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Julia Cartwright <juliac@eso.teric.us> Link: https://lkml.kernel.org/r/20180227164808.10093-2-bigeasy@linutronix.de
2017-12-04softirq: Eliminate cond_resched_rcu_qs() in favor of cond_resched()Paul E. McKenney1-1/+1
Now that cond_resched() also provides RCU quiescent states when needed, it can be used in place of cond_resched_rcu_qs(). This commit therefore makes this change. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: NeilBrown <neilb@suse.com> Cc: Ingo Molnar <mingo@kernel.org>
2017-11-15kmemcheck: rip it outLevin, Alexander (Sasha Levin)1-10/+0
Fix up makefiles, remove references, and git rm kmemcheck. Link: http://lkml.kernel.org/r/20171007030159.22241-4-alexander.levin@verizon.com Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Vegard Nossum <vegardno@ifi.uio.no> Cc: Pekka Enberg <penberg@kernel.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Alexander Potapenko <glider@google.com> Cc: Tim Hansen <devtimhansen@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-08irq/softirqs: Use lockdep to assert IRQs are disabled/enabledFrederic Weisbecker1-5/+5
Use lockdep to check that IRQs are enabled or disabled as expected. This way the sanity check only shows overhead when concurrency correctness debug code is enabled. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: David S . Miller <davem@davemloft.net> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tejun Heo <tj@kernel.org> Link: http://lkml.kernel.org/r/1509980490-4285-3-git-send-email-frederic@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-11sched/core: Remove 'task' parameter and rename tsk_restore_flags() to ↵NeilBrown1-1/+1
current_restore_flags() It is not safe for one thread to modify the ->flags of another thread as there is no locking that can protect the update. So tsk_restore_flags(), which takes a task pointer and modifies the flags, is an invitation to do the wrong thing. All current users pass "current" as the task, so no developers have accepted that invitation. It would be best to ensure it remains that way. So rename tsk_restore_flags() to current_restore_flags() and don't pass in a task_struct pointer. Always operate on current->flags. Signed-off-by: NeilBrown <neilb@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-21softirq: Display IRQ_POLL for irq-poll statisticsSagi Grimberg1-1/+1
This library was moved to the generic area and was renamed to irq-poll. Hence, update proc/softirqs output accordingly. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-10-15Merge tag 'gcc-plugins-v4.9-rc1' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull gcc plugins update from Kees Cook: "This adds a new gcc plugin named "latent_entropy". It is designed to extract as much possible uncertainty from a running system at boot time as possible, hoping to capitalize on any possible variation in CPU operation (due to runtime data differences, hardware differences, SMP ordering, thermal timing variation, cache behavior, etc). At the very least, this plugin is a much more comprehensive example for how to manipulate kernel code using the gcc plugin internals" * tag 'gcc-plugins-v4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: latent_entropy: Mark functions with __latent_entropy gcc-plugins: Add latent_entropy plugin
2016-10-10latent_entropy: Mark functions with __latent_entropyEmese Revfy1-2/+2
The __latent_entropy gcc attribute can be used only on functions and variables. If it is on a function then the plugin will instrument it for gathering control-flow entropy. If the attribute is on a variable then the plugin will initialize it with random contents. The variable must be an integer, an integer array type or a structure with integer fields. These specific functions have been selected because they are init functions (to help gather boot-time entropy), are called at unpredictable times, or they have variable loops, each of which provide some level of latent entropy. Signed-off-by: Emese Revfy <re.emese@gmail.com> [kees: expanded commit message] Signed-off-by: Kees Cook <keescook@chromium.org>
2016-10-03Merge branch 'smp-hotplug-for-linus' of ↵Linus Torvalds1-21/+6
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull CPU hotplug updates from Thomas Gleixner: "Yet another batch of cpu hotplug core updates and conversions: - Provide core infrastructure for multi instance drivers so the drivers do not have to keep custom lists. - Convert custom lists to the new infrastructure. The block-mq custom list conversion comes through the block tree and makes the diffstat tip over to more lines removed than added. - Handle unbalanced hotplug enable/disable calls more gracefully. - Remove the obsolete CPU_STARTING/DYING notifier support. - Convert another batch of notifier users. The relayfs changes which conflicted with the conversion have been shipped to me by Andrew. The remaining lot is targeted for 4.10 so that we finally can remove the rest of the notifiers" * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits) cpufreq: Fix up conversion to hotplug state machine blk/mq: Reserve hotplug states for block multiqueue x86/apic/uv: Convert to hotplug state machine s390/mm/pfault: Convert to hotplug state machine mips/loongson/smp: Convert to hotplug state machine mips/octeon/smp: Convert to hotplug state machine fault-injection/cpu: Convert to hotplug state machine padata: Convert to hotplug state machine cpufreq: Convert to hotplug state machine ACPI/processor: Convert to hotplug state machine virtio scsi: Convert to hotplug state machine oprofile/timer: Convert to hotplug state machine block/softirq: Convert to hotplug state machine lib/irq_poll: Convert to hotplug state machine x86/microcode: Convert to hotplug state machine sh/SH-X3 SMP: Convert to hotplug state machine ia64/mca: Convert to hotplug state machine ARM/OMAP/wakeupgen: Convert to hotplug state machine ARM/shmobile: Convert to hotplug state machine arm64/FP/SIMD: Convert to hotplug state machine ...
2016-09-30softirq: Let ksoftirqd do its jobEric Dumazet1-1/+15
A while back, Paolo and Hannes sent an RFC patch adding threaded-able napi poll loop support : (https://patchwork.ozlabs.org/patch/620657/) The problem seems to be that softirqs are very aggressive and are often handled by the current process, even if we are under stress and that ksoftirqd was scheduled, so that innocent threads would have more chance to make progress. This patch makes sure that if ksoftirq is running, we let it perform the softirq work. Jonathan Corbet summarized the issue in https://lwn.net/Articles/687617/ Tested: - NIC receiving traffic handled by CPU 0 - UDP receiver running on CPU 0, using a single UDP socket. - Incoming flood of UDP packets targeting the UDP socket. Before the patch, the UDP receiver could almost never get CPU cycles and could only receive ~2,000 packets per second. After the patch, CPU cycles are split 50/50 between user application and ksoftirqd/0, and we can effectively read ~900,000 packets per second, a huge improvement in DOS situation. (Note that more packets are now dropped by the NIC itself, since the BH handlers get less CPU cycles to drain RX ring buffer) Since the load runs in well identified threads context, an admin can more easily tune process scheduling parameters if needed. Reported-by: Paolo Abeni <pabeni@redhat.com> Reported-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: David Miller <davem@davemloft.net> Cc: Hannes Frederic Sowa <hannes@redhat.com> Cc: Jesper Dangaard Brouer <jbrouer@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1472665349.14381.356.camel@edumazet-glaptop3.roam.corp.google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-06kernel/softirq: Convert to hotplug state machineSebastian Andrzej Siewior1-21/+6
Install the callbacks via the state machine. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160818125731.27256-7-bigeasy@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-03-25arch, ftrace: for KASAN put hard/soft IRQ entries into separate sectionsAlexander Potapenko1-1/+1
KASAN needs to know whether the allocation happens in an IRQ handler. This lets us strip everything below the IRQ entry point to reduce the number of unique stack traces needed to be stored. Move the definition of __irq_entry to <linux/interrupt.h> so that the users don't need to pull in <linux/ftrace.h>. Also introduce the __softirq_entry macro which is similar to __irq_entry, but puts the corresponding functions to the .softirqentry.text section. Signed-off-by: Alexander Potapenko <glider@google.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Andrey Konovalov <adech.fo@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Konstantin Serebryany <kcc@google.com> Cc: Dmitry Chernenkov <dmitryc@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-29sched/debug: Fix preempt_disable_ip recording for preempt_disable()Sebastian Andrzej Siewior1-2/+2
The preempt_disable() invokes preempt_count_add() which saves the caller in ->preempt_disable_ip. It uses CALLER_ADDR1 which does not look for its caller but for the parent of the caller. Which means we get the correct caller for something like spin_lock() unless the architectures inlines those invocations. It is always wrong for preempt_disable() or local_bh_disable(). This patch makes the function get_lock_parent_ip() which tries CALLER_ADDR0,1,2 if the former is a locking function. This seems to record the preempt_disable() caller properly for preempt_disable() itself as well as for get_cpu_var() or local_bh_disable(). Steven asked for the get_parent_ip() -> get_lock_parent_ip() rename. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20160226135456.GB18244@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-09Merge branch 'locking-core-for-linus' of ↵Linus Torvalds1-1/+5
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core locking updates from Ingo Molnar: "The main changes are: - mutex, completions and rtmutex micro-optimizations - lock debugging fix - various cleanups in the MCS and the futex code" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/rtmutex: Optimize setting task running after being blocked locking/rwsem: Use task->state helpers sched/completion: Add lock-free checking of the blocking case sched/completion: Remove unnecessary ->wait.lock serialization when reading completion state locking/mutex: Explicitly mark task as running after wakeup futex: Fix argument handling in futex_lock_pi() calls doc: Fix misnamed FUTEX_CMP_REQUEUE_PI op constants locking/Documentation: Update code path softirq/preempt: Add missing current->preempt_disable_ip update locking/osq: No need for load/acquire when acquire-polling locking/mcs: Better differentiate between MCS variants locking/mutex: Introduce ww_mutex_set_context_slowpath() locking/mutex: Move MCS related comments to proper location locking/mutex: Checking the stamp is WW only
2015-01-14ksoftirqd: Use new cond_resched_rcu_qs() functionPaul E. McKenney1-6/+1
Simplify run_ksoftirqd() by using the new cond_resched_rcu_qs() function that conditionally reschedules, but unconditionally supplies an RCU quiescent state. This commit is separate from the previous commit by Calvin Owens because Calvin's approach can be backported, while this commit cannot be. The reason that this commit cannot be backported is that cond_resched_rcu_qs() does not always provide the needed quiescent state in earlier kernels. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-14ksoftirqd: Enable IRQs and call cond_resched() before poking RCUCalvin Owens1-1/+5
While debugging an issue with excessive softirq usage, I encountered the following note in commit 3e339b5dae24a706 ("softirq: Use hotplug thread infrastructure"): [ paulmck: Call rcu_note_context_switch() with interrupts enabled. ] ...but despite this note, the patch still calls RCU with IRQs disabled. This seemingly innocuous change caused a significant regression in softirq CPU usage on the sending side of a large TCP transfer (~1 GB/s): when introducing 0.01% packet loss, the softirq usage would jump to around 25%, spiking as high as 50%. Before the change, the usage would never exceed 5%. Moving the call to rcu_note_context_switch() after the cond_sched() call, as it was originally before the hotplug patch, completely eliminated this problem. Signed-off-by: Calvin Owens <calvinowens@fb.com> Cc: stable@vger.kernel.org Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-14softirq/preempt: Add missing current->preempt_disable_ip updateHeiko Carstens1-1/+5
While debugging some "sleeping function called from invalid context" bug I realized that the debugging message "Preemption disabled at:" pointed to an incorrect function. In particular if the last function/action that disabled preemption was spin_lock_bh() then current->preempt_disable_ip won't be updated. The reason for this is that __local_bh_disable_ip() will increase preempt_count manually instead of calling preempt_count_add(), which would handle the update correctly. It look like the manual handling was done to work around some lockdep issue. So add the missing update of current->preempt_disable_ip to __local_bh_disable_ip() as well. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20150107090441.GC4365@osiris Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-11-03rcu: Remove "cpu" argument to rcu_note_context_switch()Paul E. McKenney1-1/+1
The "cpu" argument to rcu_note_context_switch() is always the current CPU, so drop it. This in turn allows the "cpu" argument to rcu_preempt_note_context_switch() to be removed, which allows the sole use of "cpu" in both functions to be replaced with a this_cpu_ptr(). Again, the anticipated cross-CPU uses of these functions has been replaced by NO_HZ_FULL. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
2014-10-15Merge branch 'for-3.18-consistent-ops' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu Pull percpu consistent-ops changes from Tejun Heo: "Way back, before the current percpu allocator was implemented, static and dynamic percpu memory areas were allocated and handled separately and had their own accessors. The distinction has been gone for many years now; however, the now duplicate two sets of accessors remained with the pointer based ones - this_cpu_*() - evolving various other operations over time. During the process, we also accumulated other inconsistent operations. This pull request contains Christoph's patches to clean up the duplicate accessor situation. __get_cpu_var() uses are replaced with with this_cpu_ptr() and __this_cpu_ptr() with raw_cpu_ptr(). Unfortunately, the former sometimes is tricky thanks to C being a bit messy with the distinction between lvalues and pointers, which led to a rather ugly solution for cpumask_var_t involving the introduction of this_cpu_cpumask_var_ptr(). This converts most of the uses but not all. Christoph will follow up with the remaining conversions in this merge window and hopefully remove the obsolete accessors" * 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (38 commits) irqchip: Properly fetch the per cpu offset percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t -fix ia64: sn_nodepda cannot be assigned to after this_cpu conversion. Use __this_cpu_write. percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t Revert "powerpc: Replace __get_cpu_var uses" percpu: Remove __this_cpu_ptr clocksource: Replace __this_cpu_ptr with raw_cpu_ptr sparc: Replace __get_cpu_var uses avr32: Replace __get_cpu_var with __this_cpu_write blackfin: Replace __get_cpu_var uses tile: Use this_cpu_ptr() for hardware counters tile: Replace __get_cpu_var uses powerpc: Replace __get_cpu_var uses alpha: Replace __get_cpu_var ia64: Replace __get_cpu_var uses s390: cio driver &__get_cpu_var replacements s390: Replace __get_cpu_var uses mips: Replace __get_cpu_var uses MIPS: Replace __get_cpu_var uses in FPU emulator. arm: Replace __this_cpu_ptr with raw_cpu_ptr ...
2014-09-07rcu: Per-CPU operation cleanups to rcu_*_qs() functionsPaul E. McKenney1-1/+1
The rcu_bh_qs(), rcu_preempt_qs(), and rcu_sched_qs() functions use old-style per-CPU variable access and write to ->passed_quiesce even if it is already set. This commit therefore updates to use the new-style per-CPU variable access functions and avoids the spurious writes. This commit also eliminates the "cpu" argument to these functions because they are always invoked on the indicated CPU. Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-08-26time: Replace __get_cpu_var usesChristoph Lameter1-2/+2
Convert uses of __get_cpu_var for creating a address from a percpu offset to this_cpu_ptr. The two cases where get_cpu_var is used to actually access a percpu variable are changed to use this_cpu_read/raw_cpu_read. Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2014-05-22Merge branch 'rcu/next' of ↵Ingo Molnar1-3/+1
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU updates from Paul E. McKenney: " 1. Update RCU documentation. These were posted to LKML at https://lkml.org/lkml/2014/4/28/634. 2. Miscellaneous fixes. These were posted to LKML at https://lkml.org/lkml/2014/4/28/645. 3. Torture-test changes. These were posted to LKML at https://lkml.org/lkml/2014/4/28/667. 4. Variable-name renaming cleanup, sent separately due to conflicts. This was posted to LKML at https://lkml.org/lkml/2014/5/13/854. 5. Patch to suppress RCU stall warnings while sysrq requests are being processed. This patch is the RCU portions of the patch that Rik posted to LKML at https://lkml.org/lkml/2014/4/29/457. The reason for pushing this patch ahead instead of waiting until 3.17 is that the NMI-based stack traces are messing up sysrq output, and in some cases also messing up the system as well." Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-05-05asmlinkage: Add explicit __visible to drivers/*, lib/*, kernel/*Andi Kleen1-2/+2
As requested by Linus add explicit __visible to the asmlinkage users. This marks functions visible to assembler. Tree sweep for rest of tree. Signed-off-by: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1398984278-29319-4-git-send-email-andi@firstfloor.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-04-29softirq: A single rcu_bh_qs() call per softirq set is enoughEric Dumazet1-3/+1
Calling rcu_bh_qs() after every softirq action is not really needed. What RCU needs is at least one rcu_bh_qs() per softirq round to note a quiescent state was passed for rcu_bh. Note for Paul and myself : this could be inlined as a single instruction and avoid smp_processor_id() (sone this_cpu_write(rcu_bh_data.passed_quiesce, 1)) Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2014-04-28genirq: x86: Ensure that dynamic irq allocation does not conflictThomas Gleixner1-0/+5
On x86 the allocation of irq descriptors may allocate interrupts which are in the range of the GSI interrupts. That's wrong as those interrupts are hardwired and we don't have the irq domain translation like PPC. So one of these interrupts can be hooked up later to one of the devices which are hard wired to it and the io_apic init code for that particular interrupt line happily reuses that descriptor with a completely different configuration so hell breaks lose. Inside x86 we allocate dynamic interrupts from above nr_gsi_irqs, except for a few usage sites which have not yet blown up in our face for whatever reason. But for drivers which need an irq range, like the GPIO drivers, we have no limit in place and we don't want to expose such a detail to a driver. To cure this introduce a function which an architecture can implement to impose a lower bound on the dynamic interrupt allocations. Implement it for x86 and set the lower bound to nr_gsi_irqs, which is the end of the hardwired interrupt space, so all dynamic allocations happen above. That not only allows the GPIO driver to work sanely, it also protects the bogus callsites of create_irq_nr() in hpet, uv, irq_remapping and htirq code. They need to be cleaned up as well, but that's a separate issue. Reported-by: Jin Yao <yao.jin@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Mathias Nyman <mathias.nyman@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Grant Likely <grant.likely@linaro.org> Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Krogerus Heikki <heikki.krogerus@intel.com> Cc: Linus Walleij <linus.walleij@linaro.org> Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1404241617360.28206@ionos.tec.linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-03-19softirq: Add linux/irq.h to make it compile againThomas Gleixner1-0/+1
On Sparc and S390 the removal of irq.h from kernel_stat.h causes: kernel/softirq.c:774:9: error: 'NR_IRQS_LEGACY' undeclared Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-01-31Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer/dynticks updates from Ingo Molnar: "This tree contains misc dynticks updates: a fix and three cleanups" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/nohz: Fix overflow error in scheduler_tick_max_deferment() nohz_full: fix code style issue of tick_nohz_full_stop_tick nohz: Get timekeeping max deferment outside jiffies_lock tick: Rename tick_check_idle() to tick_irq_enter()
2014-01-27softirq: use const char * const for softirq_to_name, whitespace neateningJoe Perches1-16/+9
Reduce data size a little. Reduce checkpatch noise. $ size kernel/softirq.o* text data bss dec hex filename 11554 6013 4008 21575 5447 kernel/softirq.o.new 11474 6093 4008 21575 5447 kernel/softirq.o.old Signed-off-by: Joe Perches <joe@perches.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-27softirq: convert printks to pr_<level>Joe Perches1-2/+4
Use a more current logging style. Signed-off-by: Joe Perches <joe@perches.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-27softirq: use ffs() in __do_softirq()Joe Perches1-21/+22
Possible speed improvement of __do_softirq() by using ffs() instead of using a while loop with an & 1 test then single bit shift. Signed-off-by: Joe Perches <joe@perches.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-25Merge branch 'timers/core' of ↵Ingo Molnar1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks into timers/urgent Pull dynticks cleanups from Frederic Weisbecker. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-20Merge branch 'timers-core-for-linus' of ↵Linus Torvalds1-3/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer changes from Ingo Molnar: - ARM clocksource/clockevent improvements and fixes - generic timekeeping updates: TAI fixes/improvements, cleanups - Posix cpu timer cleanups and improvements - dynticks updates: full dynticks bugfixes, optimizations and cleanups * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits) clocksource: Timer-sun5i: Switch to sched_clock_register() timekeeping: Remove comment that's mostly out of date rtc-cmos: Add an alarm disable quirk timekeeper: fix comment typo for tk_setup_internals() timekeeping: Fix missing timekeeping_update in suspend path timekeeping: Fix CLOCK_TAI timer/nanosleep delays tick/timekeeping: Call update_wall_time outside the jiffies lock timekeeping: Avoid possible deadlock from clock_was_set_delayed timekeeping: Fix potential lost pv notification of time change timekeeping: Fix lost updates to tai adjustment clocksource: sh_cmt: Add clk_prepare/unprepare support clocksource: bcm_kona_timer: Remove unused bcm_timer_ids clocksource: vt8500: Remove deprecated IRQF_DISABLED clocksource: tegra: Remove deprecated IRQF_DISABLED clocksource: misc drivers: Remove deprecated IRQF_DISABLED clocksource: sh_mtu2: Remove unnecessary platform_set_drvdata() clocksource: sh_tmu: Remove unnecessary platform_set_drvdata() clocksource: armada-370-xp: Enable timer divider only when needed clocksource: clksrc-of: Warn if no clock sources are found clocksource: orion: Switch to sched_clock_register() ...
2014-01-20Merge branch 'sched-core-for-linus' of ↵Linus Torvalds1-31/+8
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler changes from Ingo Molnar: - Add the initial implementation of SCHED_DEADLINE support: a real-time scheduling policy where tasks that meet their deadlines and periodically execute their instances in less than their runtime quota see real-time scheduling and won't miss any of their deadlines. Tasks that go over their quota get delayed (Available to privileged users for now) - Clean up and fix preempt_enable_no_resched() abuse all around the tree - Do sched_clock() performance optimizations on x86 and elsewhere - Fix and improve auto-NUMA balancing - Fix and clean up the idle loop - Apply various cleanups and fixes * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits) sched: Fix __sched_setscheduler() nice test sched: Move SCHED_RESET_ON_FORK into attr::sched_flags sched: Fix up attr::sched_priority warning sched: Fix up scheduler syscall LTP fails sched: Preserve the nice level over sched_setscheduler() and sched_setparam() calls sched/core: Fix htmldocs warnings sched/deadline: No need to check p if dl_se is valid sched/deadline: Remove unused variables sched/deadline: Fix sparse static warnings m68k: Fix build warning in mac_via.h sched, thermal: Clean up preempt_enable_no_resched() abuse sched, net: Fixup busy_loop_us_clock() sched, net: Clean up preempt_enable_no_resched() abuse sched/preempt: Fix up missed PREEMPT_NEED_RESCHED folding sched/preempt, locking: Rework local_bh_{dis,en}able() sched/clock, x86: Avoid a runtime condition in native_sched_clock() sched/clock: Fix up clear_sched_clock_stable() sched/clock, x86: Use a static_key for sched_clock_stable sched/clock: Remove local_irq_disable() from the clocks sched/clock, x86: Rewrite cyc2ns() to avoid the need to disable IRQs ...
2014-01-15tick: Rename tick_check_idle() to tick_irq_enter()Frederic Weisbecker1-1/+1
This makes the code more symetric against the existing tick functions called on irq exit: tick_irq_exit() and tick_nohz_irq_exit(). These function are also symetric as they mirror each other's action: we start to account idle time on irq exit and we stop this accounting on irq entry. Also the tick is stopped on irq exit and timekeeping catches up with the tickless time elapsed until we reach irq entry. This rename was suggested by Peter Zijlstra a long while ago but it got forgotten in the mass. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Alex Shi <alex.shi@linaro.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Kevin Hilman <khilman@linaro.org> Link: http://lkml.kernel.org/r/1387320692-28460-2-git-send-email-fweisbec@gmail.com Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2014-01-13sched/preempt, locking: Rework local_bh_{dis,en}able()Peter Zijlstra1-29/+6
Currently local_bh_disable() is out-of-line for no apparent reason. So inline it to save a few cycles on call/return nonsense, the function body is a single add on x86 (a few loads and store extra on load/store archs). Also expose two new local_bh functions: __local_bh_{dis,en}able_ip(unsigned long ip, unsigned int cnt); Which implement the actual local_bh_{dis,en}able() behaviour. The next patch uses the exposed @cnt argument to optimize bh lock functions. With build fixes from Jacob Pan. Cc: rjw@rjwysocki.net Cc: rui.zhang@intel.com Cc: jacob.jun.pan@linux.intel.com Cc: Mike Galbraith <bitbucket@online.de> Cc: hpa@zytor.com Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: lenb@kernel.org Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131119151338.GF3694@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-13locking: Optimize lock_bh functionsPeter Zijlstra1-2/+2
Currently all _bh_ lock functions do two preempt_count operations: local_bh_disable(); preempt_disable(); and for the unlock: preempt_enable_no_resched(); local_bh_enable(); Since its a waste of perfectly good cycles to modify the same variable twice when you can do it in one go; use the new __local_bh_{dis,en}able_ip() functions that allow us to provide a preempt_count value to add/sub. So define SOFTIRQ_LOCK_OFFSET as the offset a _bh_ lock needs to add/sub to be done in one go. As a bonus it gets rid of the preempt_enable_no_resched() usage. This reduces a 1000 loops of: spin_lock_bh(&bh_lock); spin_unlock_bh(&bh_lock); from 53596 cycles to 51995 cycles. I didn't do enough measurements to say for absolute sure that the result is significant but the the few runs I did for each suggest it is so. Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: jacob.jun.pan@linux.intel.com Cc: Mike Galbraith <bitbucket@online.de> Cc: hpa@zytor.com Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: lenb@kernel.org Cc: rjw@rjwysocki.net Cc: rui.zhang@intel.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20131119151338.GF3694@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-12-02nohz: Convert a few places to use local per cpu accessesFrederic Weisbecker1-3/+1
A few functions use remote per CPU access APIs when they deal with local values. Just do the right conversion to improve performance, code readability and debug checks. While at it, lets extend some of these function names with *_this_cpu() suffix in order to display their purpose more clearly. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org>
2013-11-27lockdep: Simplify a bit hardirq <-> softirq transitionsFrederic Weisbecker1-21/+16
Instead of saving the hardirq state on a per CPU variable, which require an explicit call before the softirq handling and some complication, just save and restore the hardirq tracing state through functions return values and parameters. It simplifies a bit the black magic that works around the fact that softirqs can be called from hardirqs while hardirqs can nest on softirqs but those two cases have very different semantics and only the latter case assume both states. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1384906054-30676-1-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>