summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2015-04-03clockevents: Cleanup dead cpu explicitelyThomas Gleixner2-6/+2
clockevents_notify() is a leftover from the early design of the clockevents facility. It's really not a notification mechanism, it's a multiplex call. We are way better off to have explicit calls instead of this monstrosity. Split out the cleanup function for a dead cpu and invoke it directly from the cpu down code. Make it conditional on CPU_HOTPLUG as well. Temporary change, will be refined in the future. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> [ Rebased, added clockevents_notify() removal ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1735025.raBZdQHM3m@vostro.rjw.lan Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-03clockevents: Make tick handover explicitThomas Gleixner2-1/+2
clockevents_notify() is a leftover from the early design of the clockevents facility. It's really not a notification mechanism, it's a multiplex call. We are way better off to have explicit calls instead of this monstrosity. Split out the tick_handover call and invoke it explicitely from the hotplug code. Temporary solution will be cleaned up in later patches. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> [ Rebase ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: John Stultz <john.stultz@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/1658173.RkEEILFiQZ@vostro.rjw.lan Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-03clockevents: Remove broadcast oneshot control leftoversRafael J. Wysocki1-2/+0
Now that all users are converted over to explicit calls into the clockevents state machine, remove the notification chain leftovers. Original-from: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/14018863.NQUzkFuafr@vostro.rjw.lan Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-03clockevents: Provide explicit broadcast oneshot control functionsThomas Gleixner1-0/+19
clockevents_notify() is a leftover from the early design of the clockevents facility. It's really not a notification mechanism, it's a multiplex call. We are way better off to have explicit calls instead of this monstrosity. Split out the broadcast oneshot control into a separate function and provide inline helpers. Switch clockevents_notify() over. This will go away once all callers are converted. This also gets rid of the nested locking of clockevents_lock and broadcast_lock. The broadcast oneshot control functions do not require clockevents_lock. Only the managing functions (setup/shutdown/suspend/resume of the broadcast device require clockevents_lock. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Alexandre Courbot <gnurou@gmail.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Len Brown <lenb@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Warren <swarren@wwwdotorg.org> Cc: Thierry Reding <thierry.reding@gmail.com> Cc: Tony Lindgren <tony@atomide.com> Link: http://lkml.kernel.org/r/13000649.8qZuEDV0OA@vostro.rjw.lan Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-03clockevents: Remove the broadcast control leftoversThomas Gleixner1-3/+0
All users converted. Remove the notify leftovers. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/2076318.76XJZ8QYP3@vostro.rjw.lan Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-03clockevents: Provide explicit broadcast control functionsThomas Gleixner1-0/+25
clockevents_notify() is a leftover from the early design of the clockevents facility. It's really not a notification mechanism, it's a multiplex call. We are way better off to have explicit calls instead of this monstrosity. Split out the broadcast control into a separate function and provide inline helpers. Switch clockevents_notify() over. This will go away once all callers are converted. This also gets rid of the nested locking of clockevents_lock and broadcast_lock. The broadcast control functions do not require clockevents_lock. Only the managing functions (setup/shutdown/suspend/resume of the broadcast device require clockevents_lock. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Len Brown <lenb@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tony Lindgren <tony@atomide.com> Link: http://lkml.kernel.org/r/8086559.ttsuS0n1Xr@vostro.rjw.lan Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-03time, drivers/rtc: Don't bother with rtc_resume() for the nonstop clocksourceXunlei Pang1-6/+3
If a system does not provide a persistent_clock(), the time will be updated on resume by rtc_resume(). With the addition of the non-stop clocksources for suspend timing, those systems set the time on resume in timekeeping_resume(), but may not provide a valid persistent_clock(). This results in the rtc_resume() logic thinking no one has set the time and it then will over-write the suspend time again, which is not necessary and only increases clock error. So, fix this for rtc_resume(). This patch also improves the name of persistent_clock_exist to make it more grammatical. Signed-off-by: Xunlei Pang <pang.xunlei@linaro.org> Signed-off-by: John Stultz <john.stultz@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1427945681-29972-19-git-send-email-john.stultz@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-03drivers/rtc: Provide y2038 safe rtc_class_ops.set_mmss() replacementXunlei Pang1-0/+1
Currently the rtc_class_op's set_mmss() function takes a 32-bit second value (on 32-bit systems), which is problematic for dates past y2038. This patch provides a safe version named set_mmss64() using y2038 safe time64_t. After this patch, set_mmss() is deprecated and all its users will be fixed to use set_mmss64(), it can be removed when having no users. Signed-off-by: Xunlei Pang <pang.xunlei@linaro.org> [jstultz: Add whitespace fix for checkpatch] Signed-off-by: John Stultz <john.stultz@linaro.org> Acked-by: Alessandro Zummo <a.zummo@towertech.it> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1427945681-29972-8-git-send-email-john.stultz@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-03time: Add y2038 safe update_persistent_clock64()Xunlei Pang1-0/+1
As part of addressing in-kernel y2038 issues, this patch adds update_persistent_clock64() and replaces all the call sites of update_persistent_clock() with this function. This is a __weak implementation, which simply calls the existing y2038 unsafe update_persistent_clock(). This allows architecture specific implementations to be converted independently, and eventually y2038-unsafe update_persistent_clock() can be removed after all its architecture specific implementations have been converted to update_persistent_clock64(). Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Xunlei Pang <pang.xunlei@linaro.org> Signed-off-by: John Stultz <john.stultz@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1427945681-29972-4-git-send-email-john.stultz@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-03time: Add y2038 safe read_persistent_clock64()Xunlei Pang1-0/+1
As part of addressing in-kernel y2038 issues, this patch adds read_persistent_clock64() and replaces all the call sites of read_persistent_clock() with this function. This is a __weak implementation, which simply calls the existing y2038 unsafe read_persistent_clock(). This allows architecture specific implementations to be converted independently, and eventually the y2038 unsafe read_persistent_clock() can be removed after all its architecture specific implementations have been converted to read_persistent_clock64(). Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Xunlei Pang <pang.xunlei@linaro.org> Signed-off-by: John Stultz <john.stultz@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1427945681-29972-3-git-send-email-john.stultz@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-03time: Add y2038 safe read_boot_clock64()Xunlei Pang1-0/+1
As part of addressing in-kernel y2038 issues, this patch adds read_boot_clock64() and replaces all the call sites of read_boot_clock() with this function. This is a __weak implementation, which simply calls the existing y2038 unsafe read_boot_clock(). This allows architecture specific implementations to be converted independently, and eventually the y2038 unsafe read_boot_clock() can be removed after all its architecture specific implementations have been converted to read_boot_clock64(). Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Xunlei Pang <pang.xunlei@linaro.org> Signed-off-by: John Stultz <john.stultz@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1427945681-29972-2-git-send-email-john.stultz@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-02clockevents: Fix cpu_down() race for hrtimer based broadcastingPreeti U Murthy1-0/+6
It was found when doing a hotplug stress test on POWER, that the machine either hit softlockups or rcu_sched stall warnings. The issue was traced to commit: 7cba160ad789 ("powernv/cpuidle: Redesign idle states management") which exposed the cpu_down() race with hrtimer based broadcast mode: 5d1638acb9f6 ("tick: Introduce hrtimer based broadcast") The race is the following: Assume CPU1 is the CPU which holds the hrtimer broadcasting duty before it is taken down. CPU0 CPU1 cpu_down() take_cpu_down() disable_interrupts() cpu_die() while (CPU1 != CPU_DEAD) { msleep(100); switch_to_idle(); stop_cpu_timer(); schedule_broadcast(); } tick_cleanup_cpu_dead() take_over_broadcast() So after CPU1 disabled interrupts it cannot handle the broadcast hrtimer anymore, so CPU0 will be stuck forever. Fix this by explicitly taking over broadcast duty before cpu_die(). This is a temporary workaround. What we really want is a callback in the clockevent device which allows us to do that from the dying CPU by pushing the hrtimer onto a different cpu. That might involve an IPI and is definitely more complex than this immediate fix. Changelog was picked up from: https://lkml.org/lkml/2015/2/16/213 Suggested-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Preeti U. Murthy <preeti@linux.vnet.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Cc: mpe@ellerman.id.au Cc: nicolas.pitre@linaro.org Cc: peterz@infradead.org Cc: rjw@rjwysocki.net Fixes: http://linuxppc.10917.n7.nabble.com/offlining-cpus-breakage-td88619.html Link: http://lkml.kernel.org/r/20150330092410.24979.59887.stgit@preeti.in.ibm.com [ Merged it to the latest timer tree, renamed the callback, tidied up the changelog. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-02clockevents: Clean up clockchips.hIngo Molnar1-46/+41
Do various cleanups on the clockchips.h file: - indent preprocessor blocks to make it more clear which block we are in, this also makes merge resolution easier - comment larger preprocessor blocks consistently, using the: #if FOO ... #else /* !FOO: */ ... #endif /* !FOO */ notation. - unbreak lines - etc. No change in functionality. Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-01arm/bL_switcher: Kill tick suspend hackeryThomas Gleixner2-21/+4
Use the new tick_suspend/resume_local() and get rid of the homebrewn implementation of these in the ARM bL switcher. The check for the cpumask is completely pointless. There is no harm to suspend a per cpu tick device unconditionally. If that's a real issue then we fix it proper at the core level and not with some completely undocumented hacks in some random core code. Move the tick internals to the core code, now that this nuisance is gone. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> [ rjw: Rebase, changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Nicolas Pitre <nicolas.pitre@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Link: http://lkml.kernel.org/r/1655112.Ws17YsMfN7@vostro.rjw.lan Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-01tick/xen: Provide and use tick_suspend_local() and tick_resume_local()Thomas Gleixner1-3/+3
Xen calls on every cpu into tick_resume() which is just wrong. tick_resume() is for the syscore global suspend/resume invocation. What XEN really wants is a per cpu local resume function. Provide a tick_resume_local() function and use it in XEN. Also provide a complementary tick_suspend_local() and modify tick_unfreeze() and tick_freeze(), respectively, to use the new local tick resume/suspend functions. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> [ Combined two patches, rebased, modified subject/changelog. ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1698741.eezk9tnXtG@vostro.rjw.lan [ Merged to latest timers/core. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-01clockevents: Make suspend/resume calls explicitThomas Gleixner2-2/+3
clockevents_notify() is a leftover from the early design of the clockevents facility. It's really not a notification mechanism, it's a multiplex call. We are way better off to have explicit calls instead of this monstrosity. Split out the suspend/resume() calls and invoke them directly from the call sites. No locking required at this point because these calls happen with interrupts disabled and a single cpu online. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> [ Rebased on top of 4.0-rc5. ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/713674030.jVm1qaHuPf@vostro.rjw.lan [ Rebased on top of latest timers/core. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-01tick: Move core only declarations and functions to coreThomas Gleixner2-125/+24
No point to expose everything to the world. People just believe such functions can be abused for whatever purposes. Sigh. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> [ Rebased on top of 4.0-rc5 ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Nicolas Pitre <nico@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/28017337.VbCUc39Gme@vostro.rjw.lan [ Merged to latest timers/core ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-01clockevents: Remove CONFIG_GENERIC_CLOCKEVENTS_BUILDThomas Gleixner1-6/+3
This option was for simpler migration to the clock events code. Most architectures have been converted and the option has been disfunctional as a standalone option for quite some time. Remove it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/5021859.jl9OC1medj@vostro.rjw.lan Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-31Merge tag 'v4.0-rc6' into timers/core, before applying new patchesIngo Molnar46-152/+265
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27clockevents: Manage device's state separately for the coreViresh Kumar1-13/+31
'enum clock_event_mode' is used for two purposes today: - to pass mode to the driver of clockevent device::set_mode(). - for managing state of the device for clockevents core. For supporting new modes/states we have moved away from the legacy set_mode() callback to new per-mode/state callbacks. New modes/states shouldn't be exposed to the legacy (now OBSOLOTE) callbacks and so we shouldn't add new states to 'enum clock_event_mode'. Lets have separate enums for the two use cases mentioned above. Keep using the earlier enum for legacy set_mode() callback and mark it OBSOLETE. And add another enum to clearly specify the possible states of a clockevent device. This also renames the newly added per-mode callbacks to reflect state changes. We haven't got rid of 'mode' member of 'struct clock_event_device' as it is used by some of the clockevent drivers and it would automatically die down once we migrate those drivers to the new interface. It ('mode') is only updated now for the drivers using the legacy interface. Suggested-by: Peter Zijlstra <peterz@infradead.org> Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Kevin Hilman <khilman@linaro.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: linaro-kernel@lists.linaro.org Cc: linaro-networking@linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/b6b0143a8a57bd58352ad35e08c25424c879c0cb.1425037853.git.viresh.kumar@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27clockevents: Handle tick device's resume separatelyViresh Kumar1-2/+2
Upcoming patch will redefine possible states of a clockevent device. The RESUME mode is a special case only for tick's clockevent devices. In future it can be replaced by ->resume() callback already available for clockevent devices. Lets handle it separately so that clockevents_set_mode() only handles states valid across all devices. This also renames set_mode_resume() to tick_resume() to make it more explicit. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Kevin Hilman <khilman@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: linaro-kernel@lists.linaro.org Cc: linaro-networking@linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/c1b0112410870f49e7bf06958e1483eac6c15e20.1425037853.git.viresh.kumar@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27time: Add ktime_get_tai_ns()Peter Zijlstra1-0/+5
Because it was the only clock for which we didn't have a _ns() accessor yet. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: John Stultz <john.stultz@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27time: Introduce tk_fast_rawPeter Zijlstra1-0/+1
Add the NMI safe CLOCK_MONOTONIC_RAW accessor.. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: John Stultz <john.stultz@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20150319093400.562746929@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27time: Add timerkeeper::tkr_rawPeter Zijlstra1-2/+2
Introduce tkr_raw and make use of it. base_raw -> tkr_raw.base clock->{mult,shift} -> tkr_raw.{mult.shift} Kill timekeeping_get_ns_raw() in favour of timekeeping_get_ns(&tkr_raw), this removes all mono_raw special casing. Duplicate the updates to tkr_mono.cycle_last into tkr_raw.cycle_last, both need the same value. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: John Stultz <john.stultz@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20150319093400.422589590@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27time: Rename timekeeper::tkr to timekeeper::tkr_monoPeter Zijlstra1-6/+6
In preparation of adding another tkr field, rename this one to tkr_mono. Also rename tk_read_base::base_mono to tk_read_base::base, since the structure is not specific to CLOCK_MONOTONIC and the mono name got added to the tk_read_base instance. Lots of trivial churn. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: John Stultz <john.stultz@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20150319093400.344679419@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-25mm: numa: slow PTE scan rate if migration failures occurMel Gorman1-4/+5
Dave Chinner reported the following on https://lkml.org/lkml/2015/3/1/226 Across the board the 4.0-rc1 numbers are much slower, and the degradation is far worse when using the large memory footprint configs. Perf points straight at the cause - this is from 4.0-rc1 on the "-o bhash=101073" config: - 56.07% 56.07% [kernel] [k] default_send_IPI_mask_sequence_phys - default_send_IPI_mask_sequence_phys - 99.99% physflat_send_IPI_mask - 99.37% native_send_call_func_ipi smp_call_function_many - native_flush_tlb_others - 99.85% flush_tlb_page ptep_clear_flush try_to_unmap_one rmap_walk try_to_unmap migrate_pages migrate_misplaced_page - handle_mm_fault - 99.73% __do_page_fault trace_do_page_fault do_async_page_fault + async_page_fault 0.63% native_send_call_func_single_ipi generic_exec_single smp_call_function_single This is showing excessive migration activity even though excessive migrations are meant to get throttled. Normally, the scan rate is tuned on a per-task basis depending on the locality of faults. However, if migrations fail for any reason then the PTE scanner may scan faster if the faults continue to be remote. This means there is higher system CPU overhead and fault trapping at exactly the time we know that migrations cannot happen. This patch tracks when migration failures occur and slows the PTE scanner. Signed-off-by: Mel Gorman <mgorman@suse.de> Reported-by: Dave Chinner <david@fromorbit.com> Tested-by: Dave Chinner <david@fromorbit.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-03-24Merge branch 'for-4.0-fixes' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata Pull libata fix from Tejun Heo: "One patch to fix a regression from the recent switch to blk-mq tag allocation which can cause oops on SAS-attached SATA drives" * 'for-4.0-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: ata: Add a new flag to destinguish sas controller
2015-03-24Merge tag 'regulator-fix-v4.0-rc5' of ↵Linus Torvalds2-1/+4
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fixes from Mark Brown: "Two fixes here, one typo fix in the documentation and one fix for a system hang with one of the Palmas chips caused by the use of an incorrect offset being provided for one of the registers" * tag 'regulator-fix-v4.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: Fix documentation for regmap in the config regulator: palmas: Correct TPS659038 register definition for REGEN2
2015-03-24Merge tag 'regmap-fix-v4.0-rc5' of ↵Linus Torvalds1-62/+61
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap Pull regmap fix from Mark Brown: "This patch fixes a bad interaction between the support that was added for having regmaps without devices for early system controller initialization and the trace support. There's a very good analysis of the actual issue in the commit message for the change" * tag 'regmap-fix-v4.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap: regmap: introduce regmap_name to fix syscon regmap trace events
2015-03-23Merge remote-tracking branches 'regulator/fix/doc' and ↵Mark Brown2-1/+4
'regulator/fix/palmas' into regulator-linus
2015-03-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds1-0/+10
Pull networking fixes from David Miller: 1) Validate iov ranges before feeding them into iov_iter_init(), from Al Viro. 2) We changed copy_from_msghdr_from_user() to zero out the msg_namelen is a NULL pointer is given for the msg_name. Do the same in the compat code too. From Catalin Marinas. 3) Fix partially initialized tuples in netfilter conntrack helper, from Ian Wilson. 4) Missing continue; statement in nft_hash walker can lead to crashes, from Herbert Xu. 5) tproxy_tg6_check looks for IP6T_INV_PROTO in ->flags instead of ->invflags, fix from Pablo Neira Ayuso. 6) Incorrect memory account of TCP FINs can result in negative socket memory accounting values. Fix from Josh Hunt. 7) Don't allow virtual functions to enable VLAN promiscuous mode in be2net driver, from Vasundhara Volam. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: netfilter: nft_compat: set IP6T_F_PROTO flag if protocol is set cx82310_eth: wait for firmware to become ready net: validate the range we feed to iov_iter_init() in sys_sendto/sys_recvfrom net: compat: Update get_compat_msghdr() to match copy_msghdr_from_user() behaviour be2net: use PCI MMIO read instead of config read for errors be2net: restrict MODIFY_EQ_DELAY cmd to a max of 8 EQs be2net: Prevent VFs from enabling VLAN promiscuous mode tcp: fix tcp fin memory accounting ipv6: fix backtracking for throw routes net: ethernet: pcnet32: Setup the SRAM and NOUFLO on Am79C97{3, 5} ipv6: call ipv6_proxy_select_ident instead of ipv6_select_ident in udp6_ufo_fragment netfilter: xt_TPROXY: fix invflags check in tproxy_tg6_check() netfilter: restore rule tracing via nfnetlink_log netfilter: nf_tables: allow to change chain policy without hook if it exists netfilter: Fix potential crash in nft_hash walker netfilter: Zero the tuple in nfnl_cthelper_parse_tuple()
2015-03-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller1-0/+10
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for your net tree, they are: 1) Fix missing initialization of tuple structure in nfnetlink_cthelper to avoid mismatches when looking up to attach userspace helpers to flows, from Ian Wilson. 2) Fix potential crash in nft_hash when we hit -EAGAIN in nft_hash_walk(), from Herbert Xu. 3) We don't need to indicate the hook information to update the basechain default policy in nf_tables. 4) Restore tracing over nfnetlink_log due to recent rework to accomodate logging infrastructure into nf_tables. 5) Fix wrong IP6T_INV_PROTO check in xt_TPROXY. 6) Set IP6T_F_PROTO flag in nft_compat so we can use SYNPROXY6 and REJECT6 from xt over nftables. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pendingLinus Torvalds1-0/+1
Pull SCSI target fixes from Nicholas Bellinger: "Here are current target-pending fixes for v4.0-rc5 code that have made their way into the queue over the last weeks. The fixes this round include: - Fix long-standing iser-target logout bug related to early conn_logout_comp completion, resulting in iscsi_conn use-after-tree OOpsen. (Sagi + nab) - Fix long-standing tcm_fc bug in ft_invl_hw_context() failure handing for DDP hw offload. (DanC) - Fix incorrect use of unprotected __transport_register_session() in tcm_qla2xxx + other single local se_node_acl fabrics. (Bart) - Fix reference leak in target_submit_cmd() -> target_get_sess_cmd() for ack_kref=1 failure path. (Bart) - Fix pSCSI backend ->get_device_type() statistics OOPs with un-configured device. (Olaf + nab) - Fix virtual LUN=0 target_configure_device failure OOPs at modprobe time. (Claudio + nab) - Fix FUA write false positive failure regression in v4.0-rc1 code. (Christophe Vu-Brugier + HCH)" * git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: target: do not reject FUA CDBs when write cache is enabled but emulate_write_cache is 0 target: Fix virtual LUN=0 target_configure_device failure OOPs target/pscsi: Fix NULL pointer dereference in get_device_type tcm_fc: missing curly braces in ft_invl_hw_context() target: Fix reference leak in target_get_sess_cmd() error path loop/usb/vhost-scsi/xen-scsiback: Fix use of __transport_register_session tcm_qla2xxx: Fix incorrect use of __transport_register_session iscsi-target: Avoid early conn_logout_comp for iser connections Revert "iscsi-target: Avoid IN_LOGOUT failure case for iser-target" target: Disallow changing of WRITE cache/FUA attrs after export
2015-03-21Merge tag 'dm-4.0-fixes' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull devicemapper fixes from Mike Snitzer: "A handful of stable fixes for DM: - fix thin target to always zero-fill reads to unprovisioned blocks - fix to interlock device destruction's suspend from internal suspends - fix 2 snapshot exception store handover bugs - fix dm-io to cope with DISCARD and WRITE_SAME capabilities changing" * tag 'dm-4.0-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm io: deal with wandering queue limits when handling REQ_DISCARD and REQ_WRITE_SAME dm snapshot: suspend merging snapshot when doing exception handover dm snapshot: suspend origin when doing exception handover dm: hold suspend_lock while suspending device during device deletion dm thin: fix to consistently zero-fill reads to unprovisioned blocks
2015-03-19target: do not reject FUA CDBs when write cache is enabled but ↵Christophe Vu-Brugier1-0/+1
emulate_write_cache is 0 A check that rejects a CDB with FUA bit set if no write cache is emulated was added by the following commit: fde9f50 target: Add sanity checks for DPO/FUA bit usage The condition is as follows: if (!dev->dev_attrib.emulate_fua_write || !dev->dev_attrib.emulate_write_cache) However, this check is wrong if the backend device supports WCE but "emulate_write_cache" is disabled. This patch uses se_dev_check_wce() (previously named spc_check_dev_wce) to invoke transport->get_write_cache() if the device has a write cache or check the "emulate_write_cache" attribute otherwise. Reported-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christophe Vu-Brugier <cvubrugier@fastmail.fm> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-03-19Merge tag 'pinctrl-v4.0-2' of ↵Linus Torvalds1-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control fixes from Linus Walleij: "Here is a slew of pin control fixes I've accumulated for the v4.0 kernel. Nothing special, just driver fixes (mainly embedded Intel it seems) and a misunderstanding regarding the stub functions was reverted: - Fix up consumer return values on pin control stubs. - Four patches fixing up the interrupt handling and sleep context save in the Baytrail driver. - Make default output directions work properly in the Cherryview driver. - Fix interrupt locking in the AT91 driver. - Fix setting interrupt generating lines as input in the sunxi driver" * tag 'pinctrl-v4.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: sun4i: GPIOs configured as irq must be set to input before reading pinctrl: at91: move lock/unlock_as_irq calls into request/release pinctrl: update direction_output function of cherryview driver pinctrl: baytrail: Save pin context over system sleep pinctrl: baytrail: Rework interrupt handling pinctrl: baytrail: Clear interrupt triggering from pins that are in GPIO mode pinctrl: baytrail: Relax GPIO request rules Revert "pinctrl: consumer: use correct retval for placeholder functions"
2015-03-19regmap: introduce regmap_name to fix syscon regmap trace eventsPhilipp Zabel1-62/+61
This patch fixes a NULL pointer dereference when enabling regmap event tracing in the presence of a syscon regmap, introduced by commit bdb0066df96e ("mfd: syscon: Decouple syscon interface from platform devices"). That patch introduced syscon regmaps that have their dev field set to NULL. The regmap trace events expect it to point to a valid struct device and feed it to dev_name(): $ echo 1 > /sys/kernel/debug/tracing/events/regmap/enable Unable to handle kernel NULL pointer dereference at virtual address 0000002c pgd = 80004000 [0000002c] *pgd=00000000 Internal error: Oops: 17 [#1] SMP ARM Modules linked in: coda videobuf2_vmalloc CPU: 0 PID: 304 Comm: kworker/0:2 Not tainted 4.0.0-rc2+ #9197 Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) Workqueue: events_freezable thermal_zone_device_check task: 9f25a200 ti: 9f1ee000 task.ti: 9f1ee000 PC is at ftrace_raw_event_regmap_block+0x3c/0xe4 LR is at _regmap_raw_read+0x1bc/0x1cc pc : [<803636e8>] lr : [<80365f2c>] psr: 600f0093 sp : 9f1efd78 ip : 9f1efdb8 fp : 9f1efdb4 r10: 00000004 r9 : 00000001 r8 : 00000001 r7 : 00000180 r6 : 00000000 r5 : 9f00e3c0 r4 : 00000003 r3 : 00000001 r2 : 00000180 r1 : 00000000 r0 : 9f00e3c0 Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment kernel Control: 10c5387d Table: 2d91004a DAC: 00000015 Process kworker/0:2 (pid: 304, stack limit = 0x9f1ee210) Stack: (0x9f1efd78 to 0x9f1f0000) fd60: 9f1efda4 9f1efd88 fd80: 800708c0 805f9510 80927140 800f0013 9f1fc800 9eb2f490 00000000 00000180 fda0: 808e3840 00000001 9f1efdfc 9f1efdb8 80365f2c 803636b8 805f8958 800708e0 fdc0: a00f0013 803636ac 9f16de00 00000180 80927140 9f1fc800 9f1fc800 9f1efe6c fde0: 9f1efe6c 9f732400 00000000 00000000 9f1efe1c 9f1efe00 80365f70 80365d7c fe00: 80365f3c 9f1fc800 9f1fc800 00000180 9f1efe44 9f1efe20 803656a4 80365f48 fe20: 9f1fc800 00000180 9f1efe6c 9f1efe6c 9f732400 00000000 9f1efe64 9f1efe48 fe40: 803657bc 80365634 00000001 9e95f910 9f1fc800 9f1efeb4 9f1efe8c 9f1efe68 fe60: 80452ac0 80365778 9f1efe8c 9f1efe78 9e93d400 9e93d5e8 9f1efeb4 9f72ef40 fe80: 9f1efeac 9f1efe90 8044e11c 80452998 8045298c 9e93d608 9e93d400 808e1978 fea0: 9f1efecc 9f1efeb0 8044fd14 8044e0d0 ffffffff 9f25a200 9e93d608 9e481380 fec0: 9f1efedc 9f1efed0 8044fde8 8044fcec 9f1eff1c 9f1efee0 80038d50 8044fdd8 fee0: 9f1ee020 9f72ef40 9e481398 00000000 00000008 9f72ef54 9f1ee020 9f72ef40 ff00: 9e481398 9e481380 00000008 9f72ef40 9f1eff5c 9f1eff20 80039754 80038bfc ff20: 00000000 9e481380 80894100 808e1662 00000000 9e4f2ec0 00000000 9e481380 ff40: 800396f8 00000000 00000000 00000000 9f1effac 9f1eff60 8003e020 80039704 ff60: ffffffff 00000000 ffffffff 9e481380 00000000 00000000 9f1eff78 9f1eff78 ff80: 00000000 00000000 9f1eff88 9f1eff88 9e4f2ec0 8003df30 00000000 00000000 ffa0: 00000000 9f1effb0 8000eb60 8003df3c 00000000 00000000 00000000 00000000 ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ffe0: 00000000 00000000 00000000 00000000 00000013 00000000 ffffffff ffffffff Backtrace: [<803636ac>] (ftrace_raw_event_regmap_block) from [<80365f2c>] (_regmap_raw_read+0x1bc/0x1cc) r9:00000001 r8:808e3840 r7:00000180 r6:00000000 r5:9eb2f490 r4:9f1fc800 [<80365d70>] (_regmap_raw_read) from [<80365f70>] (_regmap_bus_read+0x34/0x6c) r10:00000000 r9:00000000 r8:9f732400 r7:9f1efe6c r6:9f1efe6c r5:9f1fc800 r4:9f1fc800 [<80365f3c>] (_regmap_bus_read) from [<803656a4>] (_regmap_read+0x7c/0x144) r6:00000180 r5:9f1fc800 r4:9f1fc800 r3:80365f3c [<80365628>] (_regmap_read) from [<803657bc>] (regmap_read+0x50/0x70) r9:00000000 r8:9f732400 r7:9f1efe6c r6:9f1efe6c r5:00000180 r4:9f1fc800 [<8036576c>] (regmap_read) from [<80452ac0>] (imx_get_temp+0x134/0x1a4) r6:9f1efeb4 r5:9f1fc800 r4:9e95f910 r3:00000001 [<8045298c>] (imx_get_temp) from [<8044e11c>] (thermal_zone_get_temp+0x58/0x74) r7:9f72ef40 r6:9f1efeb4 r5:9e93d5e8 r4:9e93d400 [<8044e0c4>] (thermal_zone_get_temp) from [<8044fd14>] (thermal_zone_device_update+0x34/0xec) r6:808e1978 r5:9e93d400 r4:9e93d608 r3:8045298c [<8044fce0>] (thermal_zone_device_update) from [<8044fde8>] (thermal_zone_device_check+0x1c/0x20) r5:9e481380 r4:9e93d608 [<8044fdcc>] (thermal_zone_device_check) from [<80038d50>] (process_one_work+0x160/0x3d4) [<80038bf0>] (process_one_work) from [<80039754>] (worker_thread+0x5c/0x4f4) r10:9f72ef40 r9:00000008 r8:9e481380 r7:9e481398 r6:9f72ef40 r5:9f1ee020 r4:9f72ef54 [<800396f8>] (worker_thread) from [<8003e020>] (kthread+0xf0/0x108) r10:00000000 r9:00000000 r8:00000000 r7:800396f8 r6:9e481380 r5:00000000 r4:9e4f2ec0 [<8003df30>] (kthread) from [<8000eb60>] (ret_from_fork+0x14/0x34) r7:00000000 r6:00000000 r5:8003df30 r4:9e4f2ec0 Code: e3140040 1a00001a e3140020 1a000016 (e596002c) ---[ end trace 193c15c2494ec960 ]--- Fixes: bdb0066df96e (mfd: syscon: Decouple syscon interface from platform devices) Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org
2015-03-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds4-1/+13
Pull networking fixes from David Miller: 1) Fix packet header offset calculation in _decode_session6(), from Hajime Tazaki. 2) Fix route leak in error paths of xfrm_lookup(), from Huaibin Wang. 3) Be sure to clear state properly when scans fail in iwlwifi mvm code, from Luciano Coelho. 4) iwlwifi tries to stop scans that aren't actually running, also from Luciano Coelho. 5) mac80211 should drop mesh frames that are not encrypted, fix from Bob Copeland. 6) Add new device ID to b43 wireless driver for BCM432228 chips, from Rafał Miłecki. 7) Fix accidental addition of members after variable sized array in struct tc_u_hnode, from WANG Cong. 8) Don't re-enable interrupts until after we call napi_complete() in ibmveth and WIZnet drivers, frm Yongbae Park. 9) Fix regression in vlan tag handling of fec driver, from Fugang Duan. 10) If a network namespace change fails during rtnl_newlink(), we don't unwind the device registry properly. 11) Fix two TCP regressions, from Neal Cardwell: - Don't allow snd_cwnd_cnt to accumulate huge values due to missing test in tcp_cong_avoid_ai(). - Restore CUBIC back to advancing cwnd by 1.5x packets per RTT. 12) Fix performance regression in xne-netback involving push TX notifications, from David Vrabel. 13) __skb_tstamp_tx() can be called with a NULL sk pointer, do not dereference blindly. From Willem de Bruijn. 14) Fix potential stack overflow in RDS protocol stack, from Arnd Bergmann. 15) VXLAN_VID_MASK used incorrectly in new remote checksum offload support of VXLAN driver. Fix from Alexey Kodanev. 16) Fix too small netlink SKB allocation in inet_diag layer, from Eric Dumazet. 17) ieee80211_check_combinations() does not count interfaces correctly, from Andrei Otcheretianski. 18) Hardware feature determination in bxn2x driver references a piece of software state that actually isn't initialized yet, fix from Michal Schmidt. 19) inet_csk_wait_for_connect() needs a sched_annotate_sleep() annoation, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (56 commits) Revert "net: cx82310_eth: use common match macro" net/mlx4_en: Set statistics bitmap at port init IB/mlx4: Saturate RoCE port PMA counters in case of overflow net/mlx4_en: Fix off-by-one in ethtool statistics display IB/mlx4: Verify net device validity on port change event act_bpf: allow non-default TC_ACT opcodes as BPF exec outcome Revert "smc91x: retrieve IRQ and trigger flags in a modern way" inet: Clean up inet_csk_wait_for_connect() vs. might_sleep() ip6_tunnel: fix error code when tunnel exists netdevice.h: fix ndo_bridge_* comments bnx2x: fix encapsulation features on 57710/57711 mac80211: ignore CSA to same channel nl80211: ignore HT/VHT capabilities without QoS/WMM mac80211: ask for ECSA IE to be considered for beacon parse CRC mac80211: count interfaces correctly for combination checks isdn: icn: use strlcpy() when parsing setup options rxrpc: bogus MSG_PEEK test in rxrpc_recvmsg() caif: fix MSG_OOB test in caif_seqpkt_recvmsg() bridge: reset bridge mtu after deleting an interface can: kvaser_usb: Fix tx queue start/stop race conditions ...
2015-03-19ata: Add a new flag to destinguish sas controllerShaohua Li1-0/+1
SAS controller has its own tag allocation, which doesn't directly match to ATA tag, so SAS and SATA have different code path for ata tags. Originally we use port->scsi_host (98bd4be1) to destinguish SAS controller, but libsas set ->scsi_host too, so we can't use it for the destinguish, we add a new flag for this purpose. Without this patch, the following oops can happen because scsi-mq uses a host-wide tag map shared among all devices with some integer tag values >= ATA_MAX_QUEUE. These unexpectedly high tag values cause __ata_qc_from_tag() to return NULL, which is then dereferenced in ata_qc_new_init(). BUG: unable to handle kernel NULL pointer dereference at 0000000000000058 IP: [<ffffffff804fd46e>] ata_qc_new_init+0x3e/0x120 PGD 32adf0067 PUD 32adf1067 PMD 0 Oops: 0002 [#1] SMP DEBUG_PAGEALLOC Modules linked in: iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi igb i2c_algo_bit ptp pps_core pm80xx libsas scsi_transport_sas sg coretemp eeprom w83795 i2c_i801 CPU: 4 PID: 1450 Comm: cydiskbench Not tainted 4.0.0-rc3 #1 Hardware name: Supermicro X8DTH-i/6/iF/6F/X8DTH, BIOS 2.1b 05/04/12 task: ffff8800ba86d500 ti: ffff88032a064000 task.ti: ffff88032a064000 RIP: 0010:[<ffffffff804fd46e>] [<ffffffff804fd46e>] ata_qc_new_init+0x3e/0x120 RSP: 0018:ffff88032a067858 EFLAGS: 00010046 RAX: 0000000000000000 RBX: ffff8800ba0d2230 RCX: 000000000000002a RDX: ffffffff80505ae0 RSI: 0000000000000020 RDI: ffff8800ba0d2230 RBP: ffff88032a067868 R08: 0000000000000201 R09: 0000000000000001 R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800ba0d0000 R13: ffff8800ba0d2230 R14: ffffffff80505ae0 R15: ffff8800ba0d0000 FS: 0000000041223950(0063) GS:ffff88033e480000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000058 CR3: 000000032a0a3000 CR4: 00000000000006e0 Stack: ffff880329eee758 ffff880329eee758 ffff88032a0678a8 ffffffff80502dad ffff8800ba167978 ffff880329eee758 ffff88032bf9c520 ffff8800ba167978 ffff88032bf9c520 ffff88032bf9a290 ffff88032a0678b8 ffffffff80506909 Call Trace: [<ffffffff80502dad>] ata_scsi_translate+0x3d/0x1b0 [<ffffffff80506909>] ata_sas_queuecmd+0x149/0x2a0 [<ffffffffa0046650>] sas_queuecommand+0xa0/0x1f0 [libsas] [<ffffffff804ea544>] scsi_dispatch_cmd+0xd4/0x1a0 [<ffffffff804eb50f>] scsi_queue_rq+0x66f/0x7f0 [<ffffffff803e5098>] __blk_mq_run_hw_queue+0x208/0x3f0 [<ffffffff803e54b8>] blk_mq_run_hw_queue+0x88/0xc0 [<ffffffff803e5c74>] blk_mq_insert_request+0xc4/0x130 [<ffffffff803e0b63>] blk_execute_rq_nowait+0x73/0x160 [<ffffffffa0023fca>] sg_common_write+0x3da/0x720 [sg] [<ffffffffa0025100>] sg_new_write+0x250/0x360 [sg] [<ffffffffa0025feb>] sg_write+0x13b/0x450 [sg] [<ffffffff8032ec91>] vfs_write+0xd1/0x1b0 [<ffffffff8032ee54>] SyS_write+0x54/0xc0 [<ffffffff80689932>] system_call_fastpath+0x12/0x17 tj: updated description. Fixes: 12cb5ce101ab ("libata: use blk taging") Reported-and-tested-by: Tony Battersby <tonyb@cybernetics.com> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2015-03-19netfilter: restore rule tracing via nfnetlink_logPablo Neira Ayuso1-0/+10
Since fab4085 ("netfilter: log: nf_log_packet() as real unified interface"), the loginfo structure that is passed to nf_log_packet() is used to explicitly indicate the logger type you want to use. This is a problem for people tracing rules through nfnetlink_log since packets are always routed to the NF_LOG_TYPE logger after the aforementioned patch. We can fix this by removing the trace loginfo structures, but that still changes the log level from 4 to 5 for tracing messages and there may be someone relying on this outthere. So let's just introduce a new nf_log_trace() function that restores the former behaviour. Reported-by: Markus Kötter <koetter@rrzn.uni-hannover.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-03-18Merge branch 'for-linus' of ↵Linus Torvalds1-0/+4
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching Pull livepatching fix from Jiri Kosina: - fix for potential race with module loading, from Petr Mladek. The race is very unlikely to be seen in real world and has been found by code inspection, but should be fixed for 4.0 anyway. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching: livepatch: Fix subtle race with coming and going modules
2015-03-18regulator: Fix documentation for regmap in the configAxel Lin1-1/+1
dev_get_regulator() does not exist, fix the typo. Signed-off-by: Axel Lin <axel.lin@ingics.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2015-03-17netdevice.h: fix ndo_bridge_* commentsNicolas Dichtel1-1/+4
The argument 'flags' was missing in ndo_bridge_setlink(). ndo_bridge_dellink() was missing. Fixes: 407af3299ef1 ("bridge: Add netlink interface to configure vlans on bridge ports") Fixes: add511b38266 ("bridge: add flags argument to ndo_bridge_setlink and ndo_bridge_dellink") CC: Vlad Yasevich <vyasevic@redhat.com> CC: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-17Merge tag 'virtio-next-for-linus' of ↵Linus Torvalds2-4/+16
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull virtio fixes from Rusty Russell: "Not entirely surprising: the ongoing QEMU work on virtio 1.0 has revealed more minor issues with our virtio 1.0 drivers just introduced in the kernel. (I would normally use my fixes branch for this, but there were a batch of them...)" * tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: virtio_mmio: fix access width for mmio uapi/virtio_scsi: allow overriding CDB/SENSE size virtio_mmio: generation support virtio_rpmsg: set DRIVER_OK before using device 9p/trans_virtio: fix hot-unplug virtio-balloon: do not call blocking ops when !TASK_RUNNING virtio_blk: fix comment for virtio 1.0 virtio_blk: typo fix virtio_balloon: set DRIVER_OK before using device virtio_console: avoid config access from irq virtio_console: init work unconditionally
2015-03-17Merge git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-0/+1
Pull kvm fixes from Marcelo Tosatti: "KVM bug fixes (ARM and x86)" * git://git.kernel.org/pub/scm/virt/kvm/kvm: arm/arm64: KVM: Keep elrsr/aisr in sync with software model KVM: VMX: Set msr bitmap correctly if vcpu is in guest mode arm/arm64: KVM: fix missing unlock on error in kvm_vgic_create() kvm: x86: i8259: return initialized data on invalid-size read arm64: KVM: Fix outdated comment about VTCR_EL2.PS arm64: KVM: Do not use pgd_index to index stage-2 pgd arm64: KVM: Fix stage-2 PGD allocation to have per-page refcounting kvm: move advertising of KVM_CAP_IRQFD to common code
2015-03-17regulator: palmas: Correct TPS659038 register definition for REGEN2Keerthy1-0/+3
The register offset for REGEN2_CTRL in different for TPS659038 chip as when compared with other Palmas family PMICs. In the case of TPS659038 the wrong offset pointed to PLLEN_CTRL and was causing a hang. Correcting the same. Signed-off-by: Keerthy <j-keerthy@ti.com> Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org
2015-03-17livepatch: Fix subtle race with coming and going modulesPetr Mladek1-0/+4
There is a notifier that handles live patches for coming and going modules. It takes klp_mutex lock to avoid races with coming and going patches but it does not keep the lock all the time. Therefore the following races are possible: 1. The notifier is called sometime in STATE_MODULE_COMING. The module is visible by find_module() in this state all the time. It means that new patch can be registered and enabled even before the notifier is called. It might create wrong order of stacked patches, see below for an example. 2. New patch could still see the module in the GOING state even after the notifier has been called. It will try to initialize the related object structures but the module could disappear at any time. There will stay mess in the structures. It might even cause an invalid memory access. This patch solves the problem by adding a boolean variable into struct module. The value is true after the coming and before the going handler is called. New patches need to be applied when the value is true and they need to ignore the module when the value is false. Note that we need to know state of all modules on the system. The races are related to new patches. Therefore we do not know what modules will get patched. Also note that we could not simply ignore going modules. The code from the module could be called even in the GOING state until mod->exit() finishes. If we start supporting patches with semantic changes between function calls, we need to apply new patches to any still usable code. See below for an example. Finally note that the patch solves only the situation when a new patch is registered. There are no such problems when the patch is being removed. It does not matter who disable the patch first, whether the normal disable_patch() or the module notifier. There is nothing to do once the patch is disabled. Alternative solutions: ====================== + reject new patches when a patched module is coming or going; this is ugly + wait with adding new patch until the module leaves the COMING and GOING states; this might be dangerous and complicated; we would need to release kgr_lock in the middle of the patch registration to avoid a deadlock with the coming and going handlers; also we might need a waitqueue for each module which seems to be even bigger overhead than the boolean + stop modules from entering COMING and GOING states; wait until modules leave these states when they are already there; looks complicated; we would need to ignore the module that asked to stop the others to avoid a deadlock; also it is unclear what to do when two modules asked to stop others and both are in COMING state (situation when two new patches are applied) + always register/enable new patches and fix up the potential mess (registered patches order) in klp_module_init(); this is nasty and prone to regressions in the future development + add another MODULE_STATE where the kallsyms are visible but the module is not used yet; this looks too complex; the module states are checked on "many" locations Example of patch stacking breakage: =================================== The notifier could _not_ _simply_ ignore already initialized module objects. For example, let's have three patches (P1, P2, P3) for functions a() and b() where a() is from vmcore and b() is from a module M. Something like: a() b() P1 a1() b1() P2 a2() b2() P3 a3() b3(3) If you load the module M after all patches are registered and enabled. The ftrace ops for function a() and b() has listed the functions in this order: ops_a->func_stack -> list(a3,a2,a1) ops_b->func_stack -> list(b3,b2,b1) , so the pointer to b3() is the first and will be used. Then you might have the following scenario. Let's start with state when patches P1 and P2 are registered and enabled but the module M is not loaded. Then ftrace ops for b() does not exist. Then we get into the following race: CPU0 CPU1 load_module(M) complete_formation() mod->state = MODULE_STATE_COMING; mutex_unlock(&module_mutex); klp_register_patch(P3); klp_enable_patch(P3); # STATE 1 klp_module_notify(M) klp_module_notify_coming(P1); klp_module_notify_coming(P2); klp_module_notify_coming(P3); # STATE 2 The ftrace ops for a() and b() then looks: STATE1: ops_a->func_stack -> list(a3,a2,a1); ops_b->func_stack -> list(b3); STATE2: ops_a->func_stack -> list(a3,a2,a1); ops_b->func_stack -> list(b2,b1,b3); therefore, b2() is used for the module but a3() is used for vmcore because they were the last added. Example of the race with going modules: ======================================= CPU0 CPU1 delete_module() #SYSCALL try_stop_module() mod->state = MODULE_STATE_GOING; mutex_unlock(&module_mutex); klp_register_patch() klp_enable_patch() #save place to switch universe b() # from module that is going a() # from core (patched) mod->exit(); Note that the function b() can be called until we call mod->exit(). If we do not apply patch against b() because it is in MODULE_STATE_GOING, it will call patched a() with modified semantic and things might get wrong. [jpoimboe@redhat.com: use one boolean instead of two] Signed-off-by: Petr Mladek <pmladek@suse.cz> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2015-03-16Merge tag 'kvm-arm-fixes-4.0-rc5' of ↵Marcelo Tosatti1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm Fixes for KVM/ARM for 4.0-rc5. Fixes page refcounting issues in our Stage-2 page table management code, fixes a missing unlock in a gicv3 error path, and fixes a race that can cause lost interrupts if signals are pending just prior to entering the guest.
2015-03-16Merge branch 'master' of ↵David S. Miller1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2015-03-16 1) Fix the network header offset in _decode_session6 when multiple IPv6 extension headers are present. From Hajime Tazaki. 2) Fix an interfamily tunnel crash. We set outer mode protocol too early and may dispatch to the wrong address family. Move the setting of the outer mode protocol behind the last accessing of the inner mode to fix the crash. 3) Most callers of xfrm_lookup() expect that dst_orig is released on error. But xfrm_lookup_route() may need dst_orig to handle certain error cases. So introduce a flag that tells what should be done in case of error. From Huaibin Wang. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-15Merge tag 'clk-fixes-for-linus' of ↵Linus Torvalds1-0/+18
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clock framework fixes from Michael Turquette: "The clk fixes for 4.0-rc4 comprise three themes. First are the usual driver fixes for new regressions since v3.19. Second are fixes to the common clock divider type caused by recent changes to how we round clock rates. This affects many clock drivers that use this common code. Finally there are fixes for drivers that improperly compared struct clk pointers (drivers must not deref these pointers). While some of these drivers have done this for a long time, this did not cause a problem until we started generating unique struct clk pointers for every consumer. A new function, clk_is_match was introduced to get these drivers working again and they are fixed up to no longer deref the pointers themselves" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: ASoC: kirkwood: fix struct clk pointer comparing ASoC: fsl_spdif: fix struct clk pointer comparing ARM: imx: fix struct clk pointer comparing clk: introduce clk_is_match clk: don't export static symbol clk: divider: fix calculation of initial best divider when rounding to closest clk: divider: fix selection of divider when rounding to closest clk: divider: fix calculation of maximal parent rate for a given divider clk: divider: return real rate instead of divider value clk: qcom: fix platform_no_drv_owner.cocci warnings clk: qcom: fix platform_no_drv_owner.cocci warnings clk: qcom: Add PLL4 vote clock clk: qcom: lcc-msm8960: Fix PLL rate detection clk: qcom: Fix slimbus n and m val offsets clk: ti: Fix FAPLL parent enable bit handling