summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2018-06-30Merge tag 'for-linus-20180629' of git://git.kernel.dk/linux-blockLinus Torvalds2-20/+2
Pull block fixes from Jens Axboe: "Small set of fixes for this series. Mostly just minor fixes, the only oddball in here is the sg change. The sg change came out of the stall fix for NVMe, where we added a mempool and limited us to a single page allocation. CONFIG_SG_DEBUG sort-of ruins that, since we'd need to account for that. That's actually a generic problem, since lots of drivers need to allocate SG lists. So this just removes support for CONFIG_SG_DEBUG, which I added back in 2007 and to my knowledge it was never useful. Anyway, outside of that, this pull contains: - clone of request with special payload fix (Bart) - drbd discard handling fix (Bart) - SATA blk-mq stall fix (me) - chunk size fix (Keith) - double free nvme rdma fix (Sagi)" * tag 'for-linus-20180629' of git://git.kernel.dk/linux-block: sg: remove ->sg_magic member drbd: Fix drbd_request_prepare() discard handling blk-mq: don't queue more if we get a busy return block: Fix cloning of requests with a special payload nvme-rdma: fix possible double free of controller async event buffer block: Fix transfer when chunk sectors exceeds max
2018-06-29sg: remove ->sg_magic memberJens Axboe1-18/+0
This was introduced more than a decade ago when sg chaining was added, but we never really caught anything with it. The scatterlist entry size can be critical, since drivers allocate it, so remove the magic member. Recently it's been triggering allocation stalls and failures in NVMe. Tested-by: Jordan Glover <Golden_Miller83@protonmail.ch> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-29Merge tag 'pm-4.18-rc3' of ↵Linus Torvalds1-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix up recently added features (the Kryo cpufreq driver and performance states coverage in the generic power domains framework), add missing documentation for a recently added sysfs knob in the intel_pstate driver and fix an error in its documentation. Specifics: - Fix the initialization time error handling in the recently added Kryo cpufreq driver (Dan Carpenter). - Fix up the recently added coverage of performance states in the generic power domains (genpd) framework (Viresh Kumar). - Add missing documentation of the new hwp_dynamic_boost sysfs knob in the intel_pstate driver (Rafael Wysocki). - Fix incorrect sysfs path in the intel_pstate driver documentation (Rafael Wysocki)" * tag 'pm-4.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: Documentation: intel_pstate: Describe hwp_dynamic_boost sysfs knob Documentation: admin-guide: intel_pstate: Fix sysfs path PM / Domains: Rename opp_node to np PM / Domains: Fix return value of of_genpd_opp_to_performance_state() cpufreq: qcom-kryo: Fix error handling in probe()
2018-06-29aio: mark __aio_sigset::sigmask constAvi Kivity1-1/+1
io_pgetevents() will not change the signal mask. Mark it const to make it clear and to reduce the need for casts in user code. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Avi Kivity <avi@scylladb.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> [hch: reapply the patch that got incorrectly reverted] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-06-28Merge branch 'akpm' (patches from Andrew)Linus Torvalds2-1/+5
Merge fixes from Andrew Morton: "7 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: proc: add Alexey to MAINTAINERS kasan: depend on CONFIG_SLUB_DEBUG include/linux/dax.h: dax_iomap_fault() returns vm_fault_t x86/e820: put !E820_TYPE_RAM regions into memblock.reserved slub: fix failure when we delete and create a slab cache Revert mm/vmstat.c: fix vmstat_update() preemption BUG lib/percpu_ida.c: don't do alloc from per-CPU list if there is none
2018-06-28include/linux/dax.h: dax_iomap_fault() returns vm_fault_tSouptick Joarder1-1/+1
Commit 1c8f422059ae ("mm: change return type to vm_fault_t") missed a conversion. It's not a big problem at present because mainline is still using typedef int vm_fault_t; Fixes: 1c8f422059ae ("mm: change return type to vm_fault_t") Link: http://lkml.kernel.org/r/20180620172046.GA27894@jordon-HP-15-Notebook-PC Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-06-28slub: fix failure when we delete and create a slab cacheMikulas Patocka1-0/+4
In kernel 4.17 I removed some code from dm-bufio that did slab cache merging (commit 21bb13276768: "dm bufio: remove code that merges slab caches") - both slab and slub support merging caches with identical attributes, so dm-bufio now just calls kmem_cache_create and relies on implicit merging. This uncovered a bug in the slub subsystem - if we delete a cache and immediatelly create another cache with the same attributes, it fails because of duplicate filename in /sys/kernel/slab/. The slub subsystem offloads freeing the cache to a workqueue - and if we create the new cache before the workqueue runs, it complains because of duplicate filename in sysfs. This patch fixes the bug by moving the call of kobject_del from sysfs_slab_remove_workfn to shutdown_cache. kobject_del must be called while we hold slab_mutex - so that the sysfs entry is deleted before a cache with the same attributes could be created. Running device-mapper-test-suite with: dmtest run --suite thin-provisioning -n /commit_failure_causes_fallback/ triggered: Buffer I/O error on dev dm-0, logical block 1572848, async page read device-mapper: thin: 253:1: metadata operation 'dm_pool_alloc_data_block' failed: error = -5 device-mapper: thin: 253:1: aborting current metadata transaction sysfs: cannot create duplicate filename '/kernel/slab/:a-0000144' CPU: 2 PID: 1037 Comm: kworker/u48:1 Not tainted 4.17.0.snitm+ #25 Hardware name: Supermicro SYS-1029P-WTR/X11DDW-L, BIOS 2.0a 12/06/2017 Workqueue: dm-thin do_worker [dm_thin_pool] Call Trace: dump_stack+0x5a/0x73 sysfs_warn_dup+0x58/0x70 sysfs_create_dir_ns+0x77/0x80 kobject_add_internal+0xba/0x2e0 kobject_init_and_add+0x70/0xb0 sysfs_slab_add+0xb1/0x250 __kmem_cache_create+0x116/0x150 create_cache+0xd9/0x1f0 kmem_cache_create_usercopy+0x1c1/0x250 kmem_cache_create+0x18/0x20 dm_bufio_client_create+0x1ae/0x410 [dm_bufio] dm_block_manager_create+0x5e/0x90 [dm_persistent_data] __create_persistent_data_objects+0x38/0x940 [dm_thin_pool] dm_pool_abort_metadata+0x64/0x90 [dm_thin_pool] metadata_operation_failed+0x59/0x100 [dm_thin_pool] alloc_data_block.isra.53+0x86/0x180 [dm_thin_pool] process_cell+0x2a3/0x550 [dm_thin_pool] do_worker+0x28d/0x8f0 [dm_thin_pool] process_one_work+0x171/0x370 worker_thread+0x49/0x3f0 kthread+0xf8/0x130 ret_from_fork+0x35/0x40 kobject_add_internal failed for :a-0000144 with -EEXIST, don't try to register things with the same name in the same directory. kmem_cache_create(dm_bufio_buffer-16) failed with error -17 Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1806151817130.6333@file01.intranet.prod.int.rdu2.redhat.com Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reported-by: Mike Snitzer <snitzer@redhat.com> Tested-by: Mike Snitzer <snitzer@redhat.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-06-28Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLLLinus Torvalds12-20/+27
The poll() changes were not well thought out, and completely unexplained. They also caused a huge performance regression, because "->poll()" was no longer a trivial file operation that just called down to the underlying file operations, but instead did at least two indirect calls. Indirect calls are sadly slow now with the Spectre mitigation, but the performance problem could at least be largely mitigated by changing the "->get_poll_head()" operation to just have a per-file-descriptor pointer to the poll head instead. That gets rid of one of the new indirections. But that doesn't fix the new complexity that is completely unwarranted for the regular case. The (undocumented) reason for the poll() changes was some alleged AIO poll race fixing, but we don't make the common case slower and more complex for some uncommon special case, so this all really needs way more explanations and most likely a fundamental redesign. [ This revert is a revert of about 30 different commits, not reverted individually because that would just be unnecessarily messy - Linus ] Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-06-27Merge tag 'scsi-fixes' of ↵Linus Torvalds1-1/+3
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Three small bug fixes (barrier elimination, memory leak on unload, spinlock recursion) and a technical enhancement left over from the merge window: the TCMU read length support is required for tape devices read when the length of the read is greater than the tape block size" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: scsi_debug: Fix memory leak on module unload scsi: qla2xxx: Spinlock recursion in qla_target scsi: ipr: Eliminate duplicate barriers scsi: target: tcmu: add read length support
2018-06-27Merge branch 'for-linus' of ↵Linus Torvalds2-1/+3
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input updates from Dmitry Torokhov: - the main change is a fix for my brain-dead patch to PS/2 button reporting for some protocols that made it in 4.17 - there is a new driver for Spreadtum vibrator that I intended to send during merge window but ended up not sending the 2nd pull request. Given that this is a brand new driver we should not see regressions here - a fixup to Elantech PS/2 driver to avoid decoding errors on Thinkpad P52 - addition of few more ACPI IDs for Silead and Elan drivers - RMI4 is switched to using IRQ domain code instead of rolling its own implementation * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: psmouse - fix button reporting for basic protocols Input: xpad - fix GPD Win 2 controller name Input: elan_i2c_smbus - fix more potential stack buffer overflows Input: elan_i2c - add ELAN0618 (Lenovo v330 15IKB) ACPI ID Input: elantech - fix V4 report decoding for module with middle key Input: elantech - enable middle button of touchpads on ThinkPad P52 Input: do not assign new tracking ID when changing tool type Input: make input_report_slot_state() return boolean Input: synaptics-rmi4 - fix axis-swap behavior Input: synaptics-rmi4 - fix the error return code in rmi_probe_interrupts() Input: synaptics-rmi4 - convert irq distribution to irq_domain Input: silead - add MSSL0002 ACPI HID Input: goldfish_events - fix checkpatch warnings Input: add Spreadtrum vibrator driver
2018-06-26block: Fix transfer when chunk sectors exceeds maxKeith Busch1-2/+2
A device may have boundary restrictions where the number of sectors between boundaries exceeds its max transfer size. In this case, we need to cap the max size to the smaller of the two limits. Reported-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com> Tested-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com> Cc: <stable@vger.kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds1-1/+3
Pull networking fixes from David Miller: 1) Fix netpoll OOPS in r8169, from Ville Syrjälä. 2) Fix bpf instruction alignment on powerpc et al., from Eric Dumazet. 3) Don't ignore IFLA_MTU attribute when creating new ipvlan links. From Xin Long. 4) Fix use after free in AF_PACKET, from Eric Dumazet. 5) Mis-matched RTNL unlock in xen-netfront, from Ross Lagerwall. 6) Fix VSOCK loopback on big-endian, from Claudio Imbrenda. 7) Missing RX buffer offset correction when computing DMA addresses in mvneta driver, from Antoine Tenart. 8) Fix crashes in DCCP's ccid3_hc_rx_send_feedback, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (34 commits) sfc: make function efx_rps_hash_bucket static strparser: Corrected typo in documentation. qmi_wwan: add support for the Dell Wireless 5821e module cxgb4: when disabling dcb set txq dcb priority to 0 net_sched: remove a bogus warning in hfsc net: dccp: switch rx_tstamp_last_feedback to monotonic clock net: dccp: avoid crash in ccid3_hc_rx_send_feedback() net: Remove depends on HAS_DMA in case of platform dependency MAINTAINERS: Add file patterns for dsa device tree bindings net: mscc: make sparse happy net: mvneta: fix the Rx desc DMA address in the Rx path Documentation: e1000: Fix docs build error Documentation: e100: Fix docs build error Documentation: e1000: Use correct heading adornment Documentation: e100: Use correct heading adornment ipv6: mcast: fix unsolicited report interval after receiving querys vhost_net: validate sock before trying to put its fd VSOCK: fix loopback on big-endian systems net: ethernet: ti: davinci_cpdma: make function cpdma_desc_pool_create static xen-netfront: Update features after registering netdev ...
2018-06-25PM / Domains: Rename opp_node to npViresh Kumar1-2/+2
The DT node passed here isn't necessarily an OPP node, as this routine can also be used for cases where the "required-opps" property is present directly in the device's node. Rename it. This also removes a stale comment. Acked-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-06-25PM / Domains: Fix return value of of_genpd_opp_to_performance_state()Viresh Kumar1-1/+1
of_genpd_opp_to_performance_state() should return 0 for errors, but the dummy routine isn't doing that. Fix it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-06-24Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds1-11/+12
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull rseq fixes from Thomas Gleixer: "A pile of rseq related fixups: - Prevent infinite recursion when delivering SIGSEGV - Remove the abort of rseq critical section on fork() as syscalls inside rseq critical sections are explicitely forbidden. So no point in doing the abort on the child. - Align the rseq structure on 32 bytes in the ARM selftest code. - Fix file permissions of the test script" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: rseq: Avoid infinite recursion when delivering SIGSEGV rseq/cleanup: Do not abort rseq c.s. in child on fork() rseq/selftests/arm: Align 'struct rseq_cs' on 32 bytes rseq/selftests: Make run_param_test.sh executable
2018-06-24Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core fixes from Thomas Gleixner: "Two tiny fixes: - Add the missing machine_real_restart() to objtools noreturn list so it stops complaining - Fix a trivial comment typo" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: kernel.h: Fix a typo in comment objtool: Add machine_real_restart() to the noreturn list
2018-06-24Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "A set of fixes for x86: - Make Xen PV guest deal with speculative store bypass correctly - Address more fallout from the 5-Level pagetable handling. Undo an __initdata annotation to avoid section mismatch and malfunction when post init code would touch the freed variable. - Handle exception fixup in math_error() before calling notify_die(). The reverse call order incorrectly triggers notify_die() listeners for soemthing which is handled correctly at the site which issues the floating point instruction. - Fix an off by one in the LLC topology calculation on AMD - Handle non standard memory block sizes gracefully un UV platforms - Plug a memory leak in the microcode loader - Sanitize the purgatory build magic - Add the x86 specific device tree bindings directory to the x86 MAINTAINER file patterns" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: Fix 'no5lvl' handling Revert "x86/mm: Mark __pgtable_l5_enabled __initdata" x86/CPU/AMD: Fix LLC ID bit-shift calculation MAINTAINERS: Add file patterns for x86 device tree bindings x86/microcode/intel: Fix memleak in save_microcode_patch() x86/platform/UV: Add kernel parameter to set memory block size x86/platform/UV: Use new set memory block size function x86/platform/UV: Add adjustable set memory block size function x86/build: Remove unnecessary preparation for purgatory Revert "kexec/purgatory: Add clean-up for purgatory directory" x86/xen: Add call of speculative_store_bypass_ht_init() to PV paths x86: Call fixup_exception() before notify_die() in math_error()
2018-06-24Merge branch 'locking-urgent-for-linus' of ↵Linus Torvalds3-2/+9
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Thomas Gleixner: "A set of fixes and updates for the locking code: - Prevent lockdep from updating irq state within its own code and thereby confusing itself. - Buid fix for older GCCs which mistreat anonymous unions - Add a missing lockdep annotation in down_read_non_onwer() which causes up_read_non_owner() to emit a lockdep splat - Remove the custom alpha dec_and_lock() implementation which is incorrect in terms of ordering and use the generic one. The remaining two commits are not strictly fixes. They provide irqsave variants of atomic_dec_and_lock() and refcount_dec_and_lock(). These are required to merge the relevant updates and cleanups into different maintainer trees for 4.19, so routing them into mainline without actual users is the sanest approach. They should have been in -rc1, but last weekend I took the liberty to just avoid computers in order to regain some mental sanity" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/qspinlock: Fix build for anonymous union in older GCC compilers locking/lockdep: Do not record IRQ state within lockdep code locking/rwsem: Fix up_read_non_owner() warning with DEBUG_RWSEMS locking/refcounts: Implement refcount_dec_and_lock_irqsave() atomic: Add irqsave variant of atomic_dec_and_lock() alpha: Remove custom dec_and_lock() implementation
2018-06-24Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds2-5/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Thomas Gleixner: "A set of fixes mostly for the ARM/GIC world: - Fix the MSI affinity handling in the ls-scfg irq chip driver so it updates and uses the effective affinity mask correctly - Prevent binding LPIs to offline CPUs and respect the Cavium erratum which requires that LPIs which belong to an offline NUMA node are not bound to a CPU on a different NUMA node. - Free only the amount of allocated interrupts in the GIC-V2M driver instead of trying to free log2(nrirqs). - Prevent emitting SYNC and VSYNC targetting non existing interrupt collections in the GIC-V3 ITS driver - Ensure that the GIV-V3 interrupt redistributor is correctly reprogrammed on CPU hotplug - Remove a stale unused helper function" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqdesc: Delete irq_desc_get_msi_desc() irqchip/gic-v3-its: Fix reprogramming of redistributors on CPU hotplug irqchip/gic-v3-its: Only emit VSYNC if targetting a valid collection irqchip/gic-v3-its: Only emit SYNC if targetting a valid collection irqchip/gic-v3-its: Don't bind LPI to unavailable NUMA node irqchip/gic-v2m: Fix SPI release on error path irqchip/ls-scfg-msi: Fix MSI affinity handling genirq/debugfs: Add missing IRQCHIP_SUPPORTS_LEVEL_MSI debug
2018-06-24Merge tag 'for-linus-20180623' of git://git.kernel.dk/linux-blockLinus Torvalds2-1/+4
Pull block fixes from Jens Axboe: - Further timeout fixes. We aren't quite there yet, so expect another round of fixes for that to completely close some of the IRQ vs completion races. (Christoph/Bart) - Set of NVMe fixes from the usual suspects, mostly error handling - Two off-by-one fixes (Dan) - Another bdi race fix (Jan) - Fix nbd reconfigure with NBD_DISCONNECT_ON_CLOSE (Doron) * tag 'for-linus-20180623' of git://git.kernel.dk/linux-block: blk-mq: Fix timeout handling in case the timeout handler returns BLK_EH_DONE bdi: Fix another oops in wb_workfn() lightnvm: Remove depends on HAS_DMA in case of platform dependency nvme-pci: limit max IO size and segments to avoid high order allocations nvme-pci: move nvme_kill_queues to nvme_remove_dead_ctrl nvme-fc: release io queues to allow fast fail nbd: Add the nbd NBD_DISCONNECT_ON_CLOSE config flag. block: sed-opal: Fix a couple off by one bugs blk-mq-debugfs: Off by one in blk_mq_rq_state_name() nvmet: reset keep alive timer in controller enable nvme-rdma: don't override opts->queue_size nvme-rdma: Fix command completion race at error recovery nvme-rdma: fix possible free of a non-allocated async event buffer nvme-rdma: fix possible double free condition when failing to create a controller Revert "block: Add warning for bi_next not NULL in bio_endio()" block: fix timeout changes for legacy request drivers
2018-06-23Merge tag 'for-linus-4.18-rc2-tag' of ↵Linus Torvalds1-1/+5
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: "This contains the following fixes/cleanups: - the removal of a BUG_ON() which wasn't necessary and which could trigger now due to a recent change - a correction of a long standing bug happening very rarely in Xen dom0 when a hypercall buffer from user land was not accessible by the hypervisor for very short periods of time due to e.g. page migration or compaction - usage of EXPORT_SYMBOL_GPL() instead of EXPORT_SYMBOL() in a Xen-related driver (no breakage possible as using those symbols without others already exported via EXPORT-SYMBOL_GPL() wouldn't make any sense) - a simplification for Xen PVH or Xen ARM guests - some additional error handling for callers of xenbus_printf()" * tag 'for-linus-4.18-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: Remove unnecessary BUG_ON from __unbind_from_irq() xen: add new hypercall buffer mapping device xen/scsiback: add error handling for xenbus_printf scsi: xen-scsifront: add error handling for xenbus_printf xen/grant-table: Export gnttab_{alloc|free}_pages as GPL xen: add error handling for xenbus_printf xen: share start flags between PV and PVH
2018-06-22Merge branch 'linus' into x86/urgentThomas Gleixner665-6184/+16708
Required to queue a dependent fix.
2018-06-22bdi: Fix another oops in wb_workfn()Jan Kara1-1/+1
syzbot is reporting NULL pointer dereference at wb_workfn() [1] due to wb->bdi->dev being NULL. And Dmitry confirmed that wb->state was WB_shutting_down after wb->bdi->dev became NULL. This indicates that unregister_bdi() failed to call wb_shutdown() on one of wb objects. The problem is in cgwb_bdi_unregister() which does cgwb_kill() and thus drops bdi's reference to wb structures before going through the list of wbs again and calling wb_shutdown() on each of them. This way the loop iterating through all wbs can easily miss a wb if that wb has already passed through cgwb_remove_from_bdi_list() called from wb_shutdown() from cgwb_release_workfn() and as a result fully shutdown bdi although wb_workfn() for this wb structure is still running. In fact there are also other ways cgwb_bdi_unregister() can race with cgwb_release_workfn() leading e.g. to use-after-free issues: CPU1 CPU2 cgwb_bdi_unregister() cgwb_kill(*slot); cgwb_release() queue_work(cgwb_release_wq, &wb->release_work); cgwb_release_workfn() wb = list_first_entry(&bdi->wb_list, ...) spin_unlock_irq(&cgwb_lock); wb_shutdown(wb); ... kfree_rcu(wb, rcu); wb_shutdown(wb); -> oops use-after-free We solve these issues by synchronizing writeback structure shutdown from cgwb_bdi_unregister() with cgwb_release_workfn() using a new mutex. That way we also no longer need synchronization using WB_shutting_down as the mutex provides it for CONFIG_CGROUP_WRITEBACK case and without CONFIG_CGROUP_WRITEBACK wb_shutdown() can be called only once from bdi_unregister(). Reported-by: syzbot <syzbot+4a7438e774b21ddd8eca@syzkaller.appspotmail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-22rseq: Avoid infinite recursion when delivering SIGSEGVWill Deacon1-7/+11
When delivering a signal to a task that is using rseq, we call into __rseq_handle_notify_resume() so that the registers pushed in the sigframe are updated to reflect the state of the restartable sequence (for example, ensuring that the signal returns to the abort handler if necessary). However, if the rseq management fails due to an unrecoverable fault when accessing userspace or certain combinations of RSEQ_CS_* flags, then we will attempt to deliver a SIGSEGV. This has the potential for infinite recursion if the rseq code continuously fails on signal delivery. Avoid this problem by using force_sigsegv() instead of force_sig(), which is explicitly designed to reset the SEGV handler to SIG_DFL in the case of a recursive fault. In doing so, remove rseq_signal_deliver() from the internal rseq API and have an optional struct ksignal * parameter to rseq_handle_notify_resume() instead. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: peterz@infradead.org Cc: paulmck@linux.vnet.ibm.com Cc: boqun.feng@gmail.com Link: https://lkml.kernel.org/r/1529664307-983-1-git-send-email-will.deacon@arm.com
2018-06-22irqdesc: Delete irq_desc_get_msi_desc()John Garry1-5/+0
Function irq_desc_get_msi_desc() is not referenced in the kernel (and does not seem to have been referenced since e39758e0ea76, 3 years ago), so delete it. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: <marc.zyngier@arm.com> Cc: <will.deacon@arm.com> Cc: <kstewart@linuxfoundation.org> Cc: <julien.thierry@arm.com> Cc: <andrew@lunn.ch> Cc: <trivial@kernel.org> Link: https://lkml.kernel.org/r/1529667333-92959-1-git-send-email-john.garry@huawei.com
2018-06-22genirq/debugfs: Add missing IRQCHIP_SUPPORTS_LEVEL_MSI debugMarc Zyngier1-0/+1
Debug is missing the IRQCHIP_SUPPORTS_LEVEL_MSI debug entry, making debugfs slightly less useful. Take this opportunity to also add a missing comment in the definition of IRQCHIP_SUPPORTS_LEVEL_MSI. Fixes: 6988e0e0d283 ("genirq/msi: Limit level-triggered MSI to platform devices") Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: Yang Yingliang <yangyingliang@huawei.com> Cc: Sumit Garg <sumit.garg@linaro.org> Link: https://lkml.kernel.org/r/20180622095254.5906-2-marc.zyngier@arm.com
2018-06-22locking/qspinlock: Fix build for anonymous union in older GCC compilersSteven Rostedt (VMware)1-1/+1
One of my tests compiles the kernel with gcc 4.5.3, and I hit the following build error: include/linux/semaphore.h: In function 'sema_init': include/linux/semaphore.h:35:17: error: unknown field 'val' specified in initializer include/linux/semaphore.h:35:17: warning: missing braces around initializer include/linux/semaphore.h:35:17: warning: (near initialization for '(anonymous).raw_lock.<anonymous>.val') I bisected it down to: 625e88be1f41 ("locking/qspinlock: Merge 'struct __qspinlock' into 'struct qspinlock'") ... which makes qspinlock have an anonymous union, which makes initializing it special for older compilers. By adding strategic brackets, it makes the build happy again. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Acked-by: Waiman Long <longman@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Fixes: 625e88be1f41 ("locking/qspinlock: Merge 'struct __qspinlock' into 'struct qspinlock'") Link: http://lkml.kernel.org/r/20180621203526.172ab5c4@vmware.local.home Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-06-22Merge tag 'nfs-for-4.18-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds1-0/+2
Pull NFS client bugfixes from Trond Myklebust: "Hightlights include: - fix an rcu deadlock in nfs_delegation_find_inode() - fix NFSv4 deadlocks due to not freeing the session slot in layoutget - don't send layoutreturn if the layout is already invalid - prevent duplicate XID allocation - flexfiles: Don't tie up all the rpciod threads in resends" * tag 'nfs-for-4.18-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: pNFS/flexfiles: Process writeback resends from nfsiod context as well pNFS/flexfiles: Don't tie up all the rpciod threads in resends sunrpc: Prevent duplicate XID allocation pNFS: Don't send layoutreturn if the layout is already invalid pNFS: Always free the session slot on error in nfs4_layoutget_handle_exception NFS: Fix an rcu deadlock in nfs_delegation_find_inode()
2018-06-22Merge tag 'acpi-4.18-rc2' of ↵Linus Torvalds1-2/+1
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "These fix a suspend/resume regression in the ACPI driver for Intel SoCs (LPSS), add a new system wakeup quirk to the ACPI EC driver and fix an inline stub of a function in the ACPI processor driver that diverged from the original. Specifics: - Fix a suspend/resume regression in the ACPI driver for Intel SoCs (LPSS) to make it work on systems where some power management quirks should only be applied for runtime PM and suspend-to-idle and not for suspend-to-RAM (Rafael Wysocki). - Add a system wakeup quirk for Thinkpad X1 Carbon 6th to the ACPI EC driver to avoid drainig battery too fast while suspended to idle on those systems (Mika Westerberg). - Fix an inline stub of acpi_processor_ppc_has_changed() to match the original function definition (Brian Norris)" * tag 'acpi-4.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI / processor: Finish making acpi_processor_ppc_has_changed() void ACPI / EC: Use ec_no_wakeup on Thinkpad X1 Carbon 6th ACPI / LPSS: Avoid PM quirks on suspend and resume from S3
2018-06-21kernel.h: Fix a typo in commentWei Wang1-1/+1
Signed-off-by: Wei Wang <wvw@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@suse.de> Cc: Crt Mori <cmo@melexis.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: gregkh@linuxfoundation.org Cc: wei.vince.wang@gmail.com Link: https://lkml.kernel.org/lkml/20180424212241.16013-1-wvw@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-06-21x86/platform/UV: Add adjustable set memory block size functionmike.travis@hpe.com1-0/+1
Add a new function to "adjust" the current fixed UV memory block size of 2GB so it can be changed to a different physical boundary. This is out of necessity so arch dependent code can accommodate specific BIOS requirements which can align these new PMEM modules at less than the default boundaries. A "set order" type of function was used to insure that the memory block size will be a power of two value without requiring a validity check. 64GB was chosen as the upper limit for memory block size values to accommodate upcoming 4PB systems which have 6 more bits of physical address space (46 becoming 52). Signed-off-by: Mike Travis <mike.travis@hpe.com> Reviewed-by: Andrew Banman <andrew.banman@hpe.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russ Anderson <russ.anderson@hpe.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dan.j.williams@intel.com Cc: jgross@suse.com Cc: kirill.shutemov@linux.intel.com Cc: mhocko@suse.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/lkml/20180524201711.609546602@stormcage.americas.sgi.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-06-21rseq/cleanup: Do not abort rseq c.s. in child on fork()Mathieu Desnoyers1-4/+1
Considering that we explicitly forbid system calls in rseq critical sections, it is not valid to issue a fork or clone system call within a rseq critical section, so rseq_fork() is not required to restart an active rseq c.s. in the child process. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Ben Maurer <bmaurer@fb.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Lameter <cl@linux.com> Cc: Dave Watson <davejwatson@fb.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Paul E . McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Shuah Khan <shuahkh@osg.samsung.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-api@vger.kernel.org Cc: linux-kselftest@vger.kernel.org Link: https://lore.kernel.org/lkml/20180619133230.4087-4-mathieu.desnoyers@efficios.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-06-21bpf: enforce correct alignment for instructionsEric Dumazet1-1/+3
After commit 9facc336876f ("bpf: reject any prog that failed read-only lock") offsetof(struct bpf_binary_header, image) became 3 instead of 4, breaking powerpc BPF badly, since instructions need to be word aligned. Fixes: 9facc336876f ("bpf: reject any prog that failed read-only lock") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-20nbd: Add the nbd NBD_DISCONNECT_ON_CLOSE config flag.Doron Roberts-Kedes1-0/+3
If NBD_DISCONNECT_ON_CLOSE is set on a device, then the driver will issue a disconnect from nbd_release if the device has no remaining bdev->bd_openers. Fix ret val so reconfigure with only setting the flag succeeds. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Doron Roberts-Kedes <doronrk@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-21Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds1-5/+8
Pull rdma fixes from Jason Gunthorpe: "Here are eight fairly small fixes collected over the last two weeks. Regression and crashing bug fixes: - mlx4/5: Fixes for issues found from various checkers - A resource tracking and uverbs regression in the core code - qedr: NULL pointer regression found during testing - rxe: Various small bugs" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: IB/rxe: Fix missing completion for mem_reg work requests RDMA/core: Save kernel caller name when creating CQ using ib_create_cq() IB/uverbs: Fix ordering of ucontext check in ib_uverbs_write IB/mlx4: Fix an error handling path in 'mlx4_ib_rereg_user_mr()' RDMA/qedr: Fix NULL pointer dereference when running over iWARP without RDMA-CM IB/mlx5: Fix return value check in flow_counters_set_data() IB/mlx5: Fix memory leak in mlx5_ib_create_flow IB/rxe: avoid double kfree skb
2018-06-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds4-24/+87
Pull networking fixes from David Miller: 1) Fix crash on bpf_prog_load() errors, from Daniel Borkmann. 2) Fix ATM VCC memory accounting, from David Woodhouse. 3) fib6_info objects need RCU freeing, from Eric Dumazet. 4) Fix SO_BINDTODEVICE handling for TCP sockets, from David Ahern. 5) Fix clobbered error code in enic_open() failure path, from Govindarajulu Varadarajan. 6) Propagate dev_get_valid_name() error returns properly, from Li RongQing. 7) Fix suspend/resume in davinci_emac driver, from Bartosz Golaszewski. 8) Various act_ife fixes (recursive locking, IDR leaks, etc.) from Davide Caratti. 9) Fix buggy checksum handling in sungem driver, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (40 commits) ip: limit use of gso_size to udp stmmac: fix DMA channel hang in half-duplex mode net: stmmac: socfpga: add additional ocp reset line for Stratix10 net: sungem: fix rx checksum support bpfilter: ignore binary files bpfilter: fix build error net/usb/drivers: Remove useless hrtimer_active check net/sched: act_ife: preserve the action control in case of error net/sched: act_ife: fix recursive lock and idr leak net: ethernet: fix suspend/resume in davinci_emac net: propagate dev_get_valid_name return code enic: do not overwrite error code net/tcp: Fix socket lookups with SO_BINDTODEVICE ptp: replace getnstimeofday64() with ktime_get_real_ts64() net/ipv6: respect rcu grace period before freeing fib6_info net: net_failover: fix typo in net_failover_slave_register() ipvlan: use ETH_MAX_MTU as max mtu net: hamradio: use eth_broadcast_addr enic: initialize enic->rfs_h.lock in enic_probe MAINTAINERS: Add Sam as the maintainer for NCSI ...
2018-06-20ACPI / processor: Finish making acpi_processor_ppc_has_changed() voidBrian Norris1-2/+1
Commit bca5f557dcea "ACPI / processor: Make acpi_processor_ppc_has_changed() void" changed one of the declarations of acpi_processor_ppc_has_changed() to return void, but the !CPU_FREQ version still returns int. Let's return void to be consistent. Fixes: bca5f557dcea "ACPI / processor: Make acpi_processor_ppc_has_changed() void" Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-06-20Merge tag 'dma-rename-4.18' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds1-1/+1
Pull dma-mapping rename from Christoph Hellwig: "Move all the dma-mapping code to kernel/dma and lose their dma-* prefixes" * tag 'dma-rename-4.18' of git://git.infradead.org/users/hch/dma-mapping: dma-mapping: move all DMA mapping code to kernel/dma dma-mapping: use obj-y instead of lib-y for generic dma ops
2018-06-20net/ipv6: respect rcu grace period before freeing fib6_infoEric Dumazet1-2/+3
syzbot reported use after free that is caused by fib6_info being freed without a proper RCU grace period. CPU: 0 PID: 1407 Comm: udevd Not tainted 4.17.0+ #39 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1b9/0x294 lib/dump_stack.c:113 print_address_description+0x6c/0x20b mm/kasan/report.c:256 kasan_report_error mm/kasan/report.c:354 [inline] kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433 __read_once_size include/linux/compiler.h:188 [inline] find_rr_leaf net/ipv6/route.c:705 [inline] rt6_select net/ipv6/route.c:761 [inline] fib6_table_lookup+0x12b7/0x14d0 net/ipv6/route.c:1823 ip6_pol_route+0x1c2/0x1020 net/ipv6/route.c:1856 ip6_pol_route_output+0x54/0x70 net/ipv6/route.c:2082 fib6_rule_lookup+0x211/0x6d0 net/ipv6/fib6_rules.c:122 ip6_route_output_flags+0x2c5/0x350 net/ipv6/route.c:2110 ip6_route_output include/net/ip6_route.h:82 [inline] icmpv6_xrlim_allow net/ipv6/icmp.c:211 [inline] icmp6_send+0x147c/0x2da0 net/ipv6/icmp.c:535 icmpv6_send+0x17a/0x300 net/ipv6/ip6_icmp.c:43 ip6_link_failure+0xa5/0x790 net/ipv6/route.c:2244 dst_link_failure include/net/dst.h:427 [inline] ndisc_error_report+0xd1/0x1c0 net/ipv6/ndisc.c:695 neigh_invalidate+0x246/0x550 net/core/neighbour.c:892 neigh_timer_handler+0xaf9/0xde0 net/core/neighbour.c:978 call_timer_fn+0x230/0x940 kernel/time/timer.c:1326 expire_timers kernel/time/timer.c:1363 [inline] __run_timers+0x79e/0xc50 kernel/time/timer.c:1666 run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692 __do_softirq+0x2e0/0xaf5 kernel/softirq.c:284 invoke_softirq kernel/softirq.c:364 [inline] irq_exit+0x1d1/0x200 kernel/softirq.c:404 exiting_irq arch/x86/include/asm/apic.h:527 [inline] smp_apic_timer_interrupt+0x17e/0x710 arch/x86/kernel/apic/apic.c:1052 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:863 </IRQ> RIP: 0010:strlen+0x5e/0xa0 lib/string.c:482 Code: 24 00 74 3b 48 bb 00 00 00 00 00 fc ff df 4c 89 e0 48 83 c0 01 48 89 c2 48 89 c1 48 c1 ea 03 83 e1 07 0f b6 14 1a 38 ca 7f 04 <84> d2 75 23 80 38 00 75 de 48 83 c4 08 4c 29 e0 5b 41 5c 5d c3 48 RSP: 0018:ffff8801af117850 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 RAX: ffff880197f53bd0 RBX: dffffc0000000000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffffffff81c5b06c RDI: ffff880197f53bc0 RBP: ffff8801af117868 R08: ffff88019a976540 R09: 0000000000000000 R10: ffff88019a976540 R11: 0000000000000000 R12: ffff880197f53bc0 R13: ffff880197f53bc0 R14: ffffffff899e4e90 R15: ffff8801d91c6a00 strlen include/linux/string.h:267 [inline] getname_kernel+0x24/0x370 fs/namei.c:218 open_exec+0x17/0x70 fs/exec.c:882 load_elf_binary+0x968/0x5610 fs/binfmt_elf.c:780 search_binary_handler+0x17d/0x570 fs/exec.c:1653 exec_binprm fs/exec.c:1695 [inline] __do_execve_file.isra.35+0x16fe/0x2710 fs/exec.c:1819 do_execveat_common fs/exec.c:1866 [inline] do_execve fs/exec.c:1883 [inline] __do_sys_execve fs/exec.c:1964 [inline] __se_sys_execve fs/exec.c:1959 [inline] __x64_sys_execve+0x8f/0xc0 fs/exec.c:1959 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7f1576a46207 Code: 77 19 f4 48 89 d7 44 89 c0 0f 05 48 3d 00 f0 ff ff 76 e0 f7 d8 64 41 89 01 eb d8 f7 d8 64 41 89 01 eb df b8 3b 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 02 f3 c3 48 8b 15 00 8c 2d 00 f7 d8 64 89 02 RSP: 002b:00007ffff2784568 EFLAGS: 00000202 ORIG_RAX: 000000000000003b RAX: ffffffffffffffda RBX: 00000000ffffffff RCX: 00007f1576a46207 RDX: 0000000001215b10 RSI: 00007ffff2784660 RDI: 00007ffff2785670 RBP: 0000000000625500 R08: 000000000000589c R09: 000000000000589c R10: 0000000000000000 R11: 0000000000000202 R12: 0000000001215b10 R13: 0000000000000007 R14: 0000000001204250 R15: 0000000000000005 Allocated by task 12188: save_stack+0x43/0xd0 mm/kasan/kasan.c:448 set_track mm/kasan/kasan.c:460 [inline] kasan_kmalloc+0xc4/0xe0 mm/kasan/kasan.c:553 kmem_cache_alloc_trace+0x152/0x780 mm/slab.c:3620 kmalloc include/linux/slab.h:513 [inline] kzalloc include/linux/slab.h:706 [inline] fib6_info_alloc+0xbb/0x280 net/ipv6/ip6_fib.c:152 ip6_route_info_create+0x782/0x2b50 net/ipv6/route.c:3013 ip6_route_add+0x23/0xb0 net/ipv6/route.c:3154 ipv6_route_ioctl+0x5a5/0x760 net/ipv6/route.c:3660 inet6_ioctl+0x100/0x1f0 net/ipv6/af_inet6.c:546 sock_do_ioctl+0xe4/0x3e0 net/socket.c:973 sock_ioctl+0x30d/0x680 net/socket.c:1097 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:500 [inline] do_vfs_ioctl+0x1cf/0x16f0 fs/ioctl.c:684 ksys_ioctl+0xa9/0xd0 fs/ioctl.c:701 __do_sys_ioctl fs/ioctl.c:708 [inline] __se_sys_ioctl fs/ioctl.c:706 [inline] __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:706 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe Freed by task 1402: save_stack+0x43/0xd0 mm/kasan/kasan.c:448 set_track mm/kasan/kasan.c:460 [inline] __kasan_slab_free+0x11a/0x170 mm/kasan/kasan.c:521 kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528 __cache_free mm/slab.c:3498 [inline] kfree+0xd9/0x260 mm/slab.c:3813 fib6_info_destroy+0x29b/0x350 net/ipv6/ip6_fib.c:207 fib6_info_release include/net/ip6_fib.h:286 [inline] __ip6_del_rt_siblings net/ipv6/route.c:3235 [inline] ip6_route_del+0x11c4/0x13b0 net/ipv6/route.c:3316 ipv6_route_ioctl+0x616/0x760 net/ipv6/route.c:3663 inet6_ioctl+0x100/0x1f0 net/ipv6/af_inet6.c:546 sock_do_ioctl+0xe4/0x3e0 net/socket.c:973 sock_ioctl+0x30d/0x680 net/socket.c:1097 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:500 [inline] do_vfs_ioctl+0x1cf/0x16f0 fs/ioctl.c:684 ksys_ioctl+0xa9/0xd0 fs/ioctl.c:701 __do_sys_ioctl fs/ioctl.c:708 [inline] __se_sys_ioctl fs/ioctl.c:706 [inline] __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:706 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe The buggy address belongs to the object at ffff8801b5df2580 which belongs to the cache kmalloc-256 of size 256 The buggy address is located 8 bytes inside of 256-byte region [ffff8801b5df2580, ffff8801b5df2680) The buggy address belongs to the page: page:ffffea0006d77c80 count:1 mapcount:0 mapping:ffff8801da8007c0 index:0xffff8801b5df2e40 flags: 0x2fffc0000000100(slab) raw: 02fffc0000000100 ffffea0006c5cc48 ffffea0007363308 ffff8801da8007c0 raw: ffff8801b5df2e40 ffff8801b5df2080 0000000100000006 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff8801b5df2480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8801b5df2500: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc > ffff8801b5df2580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff8801b5df2600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8801b5df2680: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb Fixes: a64efe142f5e ("net/ipv6: introduce fib6_info struct and helpers") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: David Ahern <dsahern@gmail.com> Reported-by: syzbot+9e6d75e3edef427ee888@syzkaller.appspotmail.com Acked-by: David Ahern <dsahern@gmail.com> Tested-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-19pNFS/flexfiles: Don't tie up all the rpciod threads in resendsTrond Myklebust1-0/+2
We do not want to have rpciod threads perform recursive calls into the RPC layer since that can deadlock. In particular, having to wait for a layoutget can be nasty... We want rather to defer scheduling those retries until we're in the rpc_release() callback, since that is called from the nfsiod workqueue. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-06-19xen: share start flags between PV and PVHRoger Pau Monne1-1/+5
Use a global variable to store the start flags for both PV and PVH. This allows the xen_initial_domain macro to work properly on PVH. Note that ARM is also switched to use the new variable. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2018-06-18scsi: target: tcmu: add read length supportbstroesser@ts.fujitsu.com1-1/+3
Generally target core and TCMUser seem to work fine for tape devices and media changers. But there is at least one situation where TCMUser is not able to support sequential access device emulation correctly. The situation is when an initiator sends a SCSI READ CDB with a length that is greater than the length of the tape block to read. We can distinguish two subcases: A) The initiator sent the READ CDB with the SILI bit being set. In this case the sequential access device has to transfer the data from the tape block (only the length of the tape block) and transmit a good status. The current interface between TCMUser and the userspace does not support reduction of the read data size by the userspace program. The patch below fixes this subcase by allowing the userspace program to specify a reduced data size in read direction. B) The initiator sent the READ CDB with the SILI bit not being set. In this case the sequential access device has to transfer the data from the tape block as in A), but additionally has to transmit CHECK CONDITION with the ILI bit set and NO SENSE in the sensebytes. The information field in the sensebytes must contain the residual count. With the below patch a user space program can specify the real read data length and appropriate sensebytes. TCMUser then uses the se_cmd flag SCF_TREAT_READ_AS_NORMAL, to force target core to transmit the real data size and the sensebytes. Note: the flag SCF_TREAT_READ_AS_NORMAL is introduced by Lee Duncan's patch "[PATCH v4] target: transport should handle st FM/EOM/ILI reads" from Tue, 15 May 2018 18:25:24 -0700. Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com> Acked-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-06-18RDMA/core: Save kernel caller name when creating CQ using ib_create_cq()Bharat Potnuri1-5/+8
Few kernel applications like SCST-iSER create CQ using ib_create_cq(), where accessing CQ structures using rdma restrack tool leads to below NULL pointer dereference. This patch saves caller kernel module name similar to ib_alloc_cq(). BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff8132ca70>] skip_spaces+0x30/0x30 PGD 738bac067 PUD 8533f0067 PMD 0 Oops: 0000 [#1] SMP R10: ffff88017fc03300 R11: 0000000000000246 R12: 0000000000000000 R13: ffff88082fa5a668 R14: ffff88017475a000 R15: 0000000000000000 FS: 00002b32726582c0(0000) GS:ffff88087fc40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000008491a1000 CR4: 00000000003607e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: [<ffffffffc05af69c>] ? fill_res_name_pid+0x7c/0x90 [ib_core] [<ffffffffc05af79f>] fill_res_cq_entry+0xef/0x170 [ib_core] [<ffffffffc05af4c4>] res_get_common_dumpit+0x3c4/0x480 [ib_core] [<ffffffffc05af5d3>] nldev_res_get_cq_dumpit+0x13/0x20 [ib_core] [<ffffffff815bc1e7>] netlink_dump+0x117/0x2e0 [<ffffffff815bcb8b>] __netlink_dump_start+0x1ab/0x230 [<ffffffffc059fead>] ibnl_rcv_msg+0x11d/0x1f0 [ib_core] [<ffffffffc05af5c0>] ? nldev_res_get_mr_dumpit+0x20/0x20 [ib_core] [<ffffffffc059fd90>] ? rdma_nl_multicast+0x30/0x30 [ib_core] [<ffffffff815bea49>] netlink_rcv_skb+0xa9/0xc0 [<ffffffffc05a0018>] ibnl_rcv+0x98/0xb0 [ib_core] [<ffffffff815be132>] netlink_unicast+0xf2/0x1b0 [<ffffffff815be50f>] netlink_sendmsg+0x31f/0x6a0 [<ffffffff8156b580>] sock_sendmsg+0xb0/0xf0 [<ffffffff816ace9e>] ? _raw_spin_unlock_bh+0x1e/0x20 [<ffffffff8156f998>] ? release_sock+0x118/0x170 [<ffffffff8156b731>] SYSC_sendto+0x121/0x1c0 [<ffffffff81568340>] ? sock_alloc_file+0xa0/0x140 [<ffffffff81221265>] ? __fd_install+0x25/0x60 [<ffffffff8156c2ce>] SyS_sendto+0xe/0x10 [<ffffffff816b6c2a>] system_call_fastpath+0x16/0x1b RIP [<ffffffff8132ca70>] skip_spaces+0x30/0x30 RSP <ffff88072be97760> CR2: 0000000000000000 Cc: <stable@vger.kernel.org> Fixes: f66c8ba4c9fa ("RDMA/core: Save kernel caller name when creating PD and CQ objects") Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-18Merge branch 'dmi-for-linus' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging Pull dmi update from Jean Delvare: "Expose SKU ID string as a DMI attribute" * 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging: firmware: dmi: Add access to the SKU ID string
2018-06-17firmware: dmi: Add access to the SKU ID stringSimon Glass1-0/+1
This is used in some systems from user space for determining the identity of the device. Expose this as a file so that that user-space tools don't need to read from /sys/firmware/dmi/tables/DMI Signed-off-by: Simon Glass <sjg@chromium.org> Signed-off-by: Jean Delvare <jdelvare@suse.de>
2018-06-17atm: Preserve value of skb->truesize when accounting to vccDavid Woodhouse1-0/+15
ATM accounts for in-flight TX packets in sk_wmem_alloc of the VCC on which they are to be sent. But it doesn't take ownership of those packets from the sock (if any) which originally owned them. They should remain owned by their actual sender until they've left the box. There's a hack in pskb_expand_head() to avoid adjusting skb->truesize for certain skbs, precisely to avoid messing up sk_wmem_alloc accounting. Ideally that hack would cover the ATM use case too, but it doesn't — skbs which aren't owned by any sock, for example PPP control frames, still get their truesize adjusted when the low-level ATM driver adds headroom. This has always been an issue, it seems. The truesize of a packet increases, and sk_wmem_alloc on the VCC goes negative. But this wasn't for normal traffic, only for control frames. So I think we just got away with it, and we probably needed to send 2GiB of LCP echo frames before the misaccounting would ever have caused a problem and caused atm_may_send() to start refusing packets. Commit 14afee4b609 ("net: convert sock.sk_wmem_alloc from atomic_t to refcount_t") did exactly what it was intended to do, and turned this mostly-theoretical problem into a real one, causing PPPoATM to fail immediately as sk_wmem_alloc underflows and atm_may_send() *immediately* starts refusing to allow new packets. The least intrusive solution to this problem is to stash the value of skb->truesize that was accounted to the VCC, in a new member of the ATM_SKB(skb) structure. Then in atm_pop_raw() subtract precisely that value instead of the then-current value of skb->truesize. Fixes: 158f323b9868 ("net: adjust skb->truesize in pskb_expand_head()") Signed-off-by: David Woodhouse <dwmw2@infradead.org> Tested-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller2-22/+69
Daniel Borkmann says: ==================== pull-request: bpf 2018-06-16 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Fix a panic in devmap handling in generic XDP where return type of __devmap_lookup_elem() got changed recently but generic XDP code missed the related update, from Toshiaki. 2) Fix a freeze when BPF progs are loaded that include BPF to BPF calls when JIT is enabled where we would later bail out via error path w/o dropping kallsyms, and another one to silence syzkaller splats from locking prog read-only, from Daniel. 3) Fix a bug in test_offloads.py BPF selftest which must not assume that the underlying system have no BPF progs loaded prior to test, and one in bpftool to fix accuracy of program load time, from Jakub. 4) Fix a bug in bpftool's probe for availability of the bpf(2) BPF_TASK_FD_QUERY subcommand, from Yonghong. 5) Fix a regression in AF_XDP's XDP_SKB receive path where queue id check got erroneously removed, from Björn. 6) Fix missing state cleanup in BPF's xfrm tunnel test, from William. 7) Check tunnel type more accurately in BPF's tunnel collect metadata kselftest, from Jian. 8) Fix missing Kconfig fragments for BPF kselftests, from Anders. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-17Merge tag 'for-linus-20180616' of git://git.kernel.dk/linux-blockLinus Torvalds2-4/+2
Pull block fixes from Jens Axboe: "A collection of fixes that should go into -rc1. This contains: - bsg_open vs bsg_unregister race fix (Anatoliy) - NVMe pull request from Christoph, with fixes for regressions in this window, FC connect/reconnect path code unification, and a trace point addition. - timeout fix (Christoph) - remove a few unused functions (Christoph) - blk-mq tag_set reinit fix (Roman)" * tag 'for-linus-20180616' of git://git.kernel.dk/linux-block: bsg: fix race of bsg_open and bsg_unregister block: remov blk_queue_invalidate_tags nvme-fabrics: fix and refine state checks in __nvmf_check_ready nvme-fabrics: handle the admin-only case properly in nvmf_check_ready nvme-fabrics: refactor queue ready check blk-mq: remove blk_mq_tagset_iter nvme: remove nvme_reinit_tagset nvme-fc: fix nulling of queue data on reconnect nvme-fc: remove reinit_request routine blk-mq: don't time out requests again that are in the timeout handler nvme-fc: change controllers first connect to use reconnect path nvme: don't rely on the changed namespace list log nvmet: free smart-log buffer after use nvme-rdma: fix error flow during mapping request data nvme: add bio remapping tracepoint nvme: fix NULL pointer dereference in nvme_init_subsystem blk-mq: reinit q->tag_set_list entry only after grace period
2018-06-17Merge tag 'docs-broken-links' of git://linuxtv.org/mchehab/experimentalLinus Torvalds11-11/+11
Pull documentation fixes from Mauro Carvalho Chehab: "This solves a series of broken links for files under Documentation, and improves a script meant to detect such broken links (see scripts/documentation-file-ref-check). The changes on this series are: - can.rst: fix a footnote reference; - crypto_engine.rst: Fix two parsing warnings; - Fix a lot of broken references to Documentation/*; - improve the scripts/documentation-file-ref-check script, in order to help detecting/fixing broken references, preventing false-positives. After this patch series, only 33 broken references to doc files are detected by scripts/documentation-file-ref-check" * tag 'docs-broken-links' of git://linuxtv.org/mchehab/experimental: (26 commits) fix a series of Documentation/ broken file name references Documentation: rstFlatTable.py: fix a broken reference ABI: sysfs-devices-system-cpu: remove a broken reference devicetree: fix a series of wrong file references devicetree: fix name of pinctrl-bindings.txt devicetree: fix some bindings file names MAINTAINERS: fix location of DT npcm files MAINTAINERS: fix location of some display DT bindings kernel-parameters.txt: fix pointers to sound parameters bindings: nvmem/zii: Fix location of nvmem.txt docs: Fix more broken references scripts/documentation-file-ref-check: check tools/*/Documentation scripts/documentation-file-ref-check: get rid of false-positives scripts/documentation-file-ref-check: hint: dash or underline scripts/documentation-file-ref-check: add a fix logic for DT scripts/documentation-file-ref-check: accept more wildcards at filenames scripts/documentation-file-ref-check: fix help message media: max2175: fix location of driver's companion documentation media: v4l: fix broken video4linux docs locations media: dvb: point to the location of the old README.dvb-usb file ...
2018-06-17Merge tag 'fsnotify_for_v4.18-rc1' of ↵Linus Torvalds1-10/+69
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull fsnotify updates from Jan Kara: "fsnotify cleanups unifying handling of different watch types. This is the shortened fsnotify series from Amir with the last five patches pulled out. Amir has modified those patches to not change struct inode but obviously it's too late for those to go into this merge window" * tag 'fsnotify_for_v4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: fsnotify: add fsnotify_add_inode_mark() wrappers fanotify: generalize fanotify_should_send_event() fsnotify: generalize send_to_group() fsnotify: generalize iteration of marks by object type fsnotify: introduce marks iteration helpers fsnotify: remove redundant arguments to handle_event() fsnotify: use type id to identify connector object type