summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2016-09-26nfsd: randomize SETCLIENTID reply to help distinguish serversJ. Bruce Fields1-0/+2
NFSv4.1 has built-in trunking support that allows a client to determine whether two connections to two different IP addresses are actually to the same server. NFSv4.0 does not, but RFC 7931 attempts to provide clients a means to do this, basically by performing a SETCLIENTID to one address and confirming it with a SETCLIENTID_CONFIRM to the other. Linux clients since 05f4c350ee02 "NFS: Discover NFSv4 server trunking when mounting" implement a variation on this suggestion. It is possible that other clients do too. This depends on the clientid and verifier not being accepted by an unrelated server. Since both are 64-bit values, that would be very unlikely if they were random numbers. But they aren't: knfsd generates the 64-bit clientid by concatenating the 32-bit boot time (in seconds) and a counter. This makes collisions between clientids generated by the same server extremely unlikely. But collisions are very likely between clientids generated by servers that boot at the same time, and it's quite common for multiple servers to boot at the same time. The verifier is a concatenation of the SETCLIENTID time (in seconds) and a counter, so again collisions between different servers are likely if multiple SETCLIENTIDs are done at the same time, which is a common case. Therefore recent NFSv4.0 clients may decide two different servers are really the same, and mount a filesystem from the wrong server. Fortunately the Linux client, since 55b9df93ddd6 "nfsv4/v4.1: Verify the client owner id during trunking detection", only does this when given the non-default "migration" mount option. The fault is really with RFC 7931, and needs a client fix, but in the meantime we can mitigate the chance of these collisions by randomizing the starting value of the counters used to generate clientids and verifiers. Reported-by: Frank Sorenson <fsorenso@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-09-26nfsd: set the MAY_NOTIFY_LOCK flag in OPEN repliesJeff Layton1-2/+4
If we are using v4.1+, then we can send notification when contended locks become free. Inform the client of that fact. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-09-26nfs: add a new NFS4_OPEN_RESULT_MAY_NOTIFY_LOCK constantJeff Layton1-2/+3
As defined in RFC 5661, section 18.16. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-09-26nfsd: add a LRU list for blocked locksJeff Layton3-0/+65
It's possible for a client to call in on a lock that is blocked for a long time, but discontinue polling for it. A malicious client could even set a lock on a file, and then spam the server with failing lock requests from different lockowners that pile up in a DoS attack. Add the blocked lock structures to a per-net namespace LRU when hashing them, and timestamp them. If the lock request is not revisited after a lease period, we'll drop it under the assumption that the client is no longer interested. This also gives us a mechanism to clean up these objects at server shutdown time as well. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-09-26nfsd: have nfsd4_lock use blocking locks for v4.1+ locksJeff Layton2-20/+156
Create a new per-lockowner+per-inode structure that contains a file_lock. Have nfsd4_lock add this structure to the lockowner's list prior to setting the lock. Then call the vfs and request a blocking lock (by setting FL_SLEEP). If we get anything besides FILE_LOCK_DEFERRED back, then we dequeue the block structure and free it. When the next lock request comes in, we'll look for an existing block for the same filehandle and dequeue and reuse it if there is one. When the lock comes free (a'la an lm_notify call), we dequeue it from the lockowner's list and kick off a CB_NOTIFY_LOCK callback to inform the client that it should retry the lock request. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-09-26nfsd: plumb in a CB_NOTIFY_LOCK operationJeff Layton3-0/+73
Add the encoding/decoding for CB_NOTIFY_LOCK operations. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-09-26NFSD: fix corruption in notifier registrationVasily Averin1-4/+14
By design notifier can be registered once only, however nfsd registers the same inetaddr notifiers per net-namespace. When this happen it corrupts list of notifiers, as result some notifiers can be not called on proper event, traverse on list can be cycled forever, and second unregister can access already freed memory. Cc: stable@vger.kernel.org fixes: 36684996 ("nfsd: Register callbacks on the inetaddr_chain and inet6addr_chain") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-09-23svcrdma: support Remote InvalidationChuck Lever3-6/+65
Support Remote Invalidation. A private message is exchanged with the client upon RDMA transport connect that indicates whether Send With Invalidation may be used by the server to send RPC replies. The invalidate_rkey is arbitrarily chosen from among rkeys present in the RPC-over-RDMA header's chunk lists. Send With Invalidate improves performance only when clients can recognize, while processing an RPC reply, that an rkey has already been invalidated. That has been submitted as a separate change. In the future, the RPC-over-RDMA protocol might support Remote Invalidation properly. The protocol needs to enable signaling between peers to indicate when Remote Invalidation can be used for each individual RPC. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-09-23svcrdma: Server-side support for rpcrdma_connect_privateChuck Lever1-4/+30
Prepare to receive an RDMA-CM private message when handling a new connection attempt, and send a similar message as part of connection acceptance. Both sides can communicate their various implementation limits. Implementations that don't support this sideband protocol ignore it. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-09-23rpcrdma: RDMA/CM private message data structureChuck Lever1-0/+35
Introduce data structure used by both client and server to exchange implementation details during RDMA/CM connection establishment. This is an experimental out-of-band exchange between Linux RPC-over-RDMA Version One implementations, replacing the deprecated CCP (see RFC 5666bis). The purpose of this extension is to enable prototyping of features that might be introduced in a subsequent version of RPC-over-RDMA. Suggested by Christoph Hellwig and Devesh Sharma. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-09-23svcrdma: Skip put_page() when send_reply() failsChuck Lever1-1/+1
Message from syslogd@klimt at Aug 18 17:00:37 ... kernel:page:ffffea0020639b00 count:0 mapcount:0 mapping: (null) index:0x0 Aug 18 17:00:37 klimt kernel: flags: 0x2fffff80000000() Aug 18 17:00:37 klimt kernel: page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) == 0) Aug 18 17:00:37 klimt kernel: kernel BUG at /home/cel/src/linux/linux-2.6/include/linux/mm.h:445! Aug 18 17:00:37 klimt kernel: RIP: 0010:[<ffffffffa05c21c1>] svc_rdma_sendto+0x641/0x820 [rpcrdma] send_reply() assigns its page argument as the first page of ctxt. On error, send_reply() already invokes svc_rdma_put_context(ctxt, 1); which does a put_page() on that very page. No need to do that again as svc_rdma_sendto exits. Fixes: 3e1eeb980822 ("svcrdma: Close connection when a send error occurs") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-09-23svcrdma: Tail iovec leaves an orphaned DMA mappingChuck Lever5-26/+27
The ctxt's count field is overloaded to mean the number of pages in the ctxt->page array and the number of SGEs in the ctxt->sge array. Typically these two numbers are the same. However, when an inline RPC reply is constructed from an xdr_buf with a tail iovec, the head and tail often occupy the same page, but each are DMA mapped independently. In that case, ->count equals the number of pages, but it does not equal the number of SGEs. There's one more SGE, for the tail iovec. Hence there is one more DMA mapping than there are pages in the ctxt->page array. This isn't a real problem until the server's iommu is enabled. Then each RPC reply that has content in that iovec orphans a DMA mapping that consists of real resources. krb5i and krb5p always populate that tail iovec. After a couple million sent krb5i/p RPC replies, the NFS server starts behaving erratically. Reboot is needed to clear the problem. Fixes: 9d11b51ce7c1 ("svcrdma: Fix send_reply() scatter/gather set-up") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-09-23nfsd: fix dprintk in nfsd4_encode_getdeviceinfoJeff Layton1-1/+1
nfserr is big-endian, so we should convert it to host-endian before printing it. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-09-16nfsd: eliminate cb_minorversion fieldJeff Layton2-5/+3
We already have that info in the client pointer. No need to pass around a copy. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-09-16nfsd: don't set a FL_LAYOUT lease for flexfiles layoutsJeff Layton3-1/+7
We currently can hit a deadlock (of sorts) when trying to use flexfiles layouts with XFS. XFS will call break_layout when something wants to write to the file. In the case of the (super-simple) flexfiles layout driver in knfsd, the MDS and DS are the same machine. The client can get a layout and then issue a v3 write to do its I/O. XFS will then call xfs_break_layouts, which will cause a CB_LAYOUTRECALL to be issued to the client. The client however can't return the layout until the v3 WRITE completes, but XFS won't allow the write to proceed until the layout is returned. Christoph says: XFS only cares about block-like layouts where the client has direct access to the file blocks. I'd need to look how to propagate the flag into break_layout, but in principle we don't need to do any recalls on truncate ever for file and flexfile layouts. If we're never going to recall the layout, then we don't even need to set the lease at all. Just skip doing so on flexfiles layouts by adding a new flag to struct nfsd4_layout_ops and skipping the lease setting and removal when that flag is true. Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-09-12svcauth_gss: Revert 64c59a3726f2 ("Remove unnecessary allocation")Chuck Lever1-2/+3
rsc_lookup steals the passed-in memory to avoid doing an allocation of its own, so we can't just pass in a pointer to memory that someone else is using. If we really want to avoid allocation there then maybe we should preallocate somwhere, or reference count these handles. For now we should revert. On occasion I see this on my server: kernel: kernel BUG at /home/cel/src/linux/linux-2.6/mm/slub.c:3851! kernel: invalid opcode: 0000 [#1] SMP kernel: Modules linked in: cts rpcsec_gss_krb5 sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd btrfs xor iTCO_wdt iTCO_vendor_support raid6_pq pcspkr i2c_i801 i2c_smbus lpc_ich mfd_core mei_me sg mei shpchp wmi ioatdma ipmi_si ipmi_msghandler acpi_pad acpi_power_meter rpcrdma ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm nfsd nfs_acl lockd grace auth_rpcgss sunrpc ip_tables xfs libcrc32c mlx4_ib mlx4_en ib_core sr_mod cdrom sd_mod ast drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crc32c_intel igb mlx4_core ahci libahci libata ptp pps_core dca i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod kernel: CPU: 7 PID: 145 Comm: kworker/7:2 Not tainted 4.8.0-rc4-00006-g9d06b0b #15 kernel: Hardware name: Supermicro Super Server/X10SRL-F, BIOS 1.0c 09/09/2015 kernel: Workqueue: events do_cache_clean [sunrpc] kernel: task: ffff8808541d8000 task.stack: ffff880854344000 kernel: RIP: 0010:[<ffffffff811e7075>] [<ffffffff811e7075>] kfree+0x155/0x180 kernel: RSP: 0018:ffff880854347d70 EFLAGS: 00010246 kernel: RAX: ffffea0020fe7660 RBX: ffff88083f9db064 RCX: 146ff0f9d5ec5600 kernel: RDX: 000077ff80000000 RSI: ffff880853f01500 RDI: ffff88083f9db064 kernel: RBP: ffff880854347d88 R08: ffff8808594ee000 R09: ffff88087fdd8780 kernel: R10: 0000000000000000 R11: ffffea0020fe76c0 R12: ffff880853f01500 kernel: R13: ffffffffa013cf76 R14: ffffffffa013cff0 R15: ffffffffa04253a0 kernel: FS: 0000000000000000(0000) GS:ffff88087fdc0000(0000) knlGS:0000000000000000 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 kernel: CR2: 00007fed60b020c3 CR3: 0000000001c06000 CR4: 00000000001406e0 kernel: Stack: kernel: ffff8808589f2f00 ffff880853f01500 0000000000000001 ffff880854347da0 kernel: ffffffffa013cf76 ffff8808589f2f00 ffff880854347db8 ffffffffa013d006 kernel: ffff8808589f2f20 ffff880854347e00 ffffffffa0406f60 0000000057c7044f kernel: Call Trace: kernel: [<ffffffffa013cf76>] rsc_free+0x16/0x90 [auth_rpcgss] kernel: [<ffffffffa013d006>] rsc_put+0x16/0x30 [auth_rpcgss] kernel: [<ffffffffa0406f60>] cache_clean+0x2e0/0x300 [sunrpc] kernel: [<ffffffffa04073ee>] do_cache_clean+0xe/0x70 [sunrpc] kernel: [<ffffffff8109a70f>] process_one_work+0x1ff/0x3b0 kernel: [<ffffffff8109b15c>] worker_thread+0x2bc/0x4a0 kernel: [<ffffffff8109aea0>] ? rescuer_thread+0x3a0/0x3a0 kernel: [<ffffffff810a0ba4>] kthread+0xe4/0xf0 kernel: [<ffffffff8169c47f>] ret_from_fork+0x1f/0x40 kernel: [<ffffffff810a0ac0>] ? kthread_stop+0x110/0x110 kernel: Code: f7 ff ff eb 3b 65 8b 05 da 30 e2 7e 89 c0 48 0f a3 05 a0 38 b8 00 0f 92 c0 84 c0 0f 85 d1 fe ff ff 0f 1f 44 00 00 e9 f5 fe ff ff <0f> 0b 49 8b 03 31 f6 f6 c4 40 0f 85 62 ff ff ff e9 61 ff ff ff kernel: RIP [<ffffffff811e7075>] kfree+0x155/0x180 kernel: RSP <ffff880854347d70> kernel: ---[ end trace 3fdec044969def26 ]--- It seems to be most common after a server reboot where a client has been using a Kerberos mount, and reconnects to continue its workload. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-09-11Linux 4.8-rc6v4.8-rc6Linus Torvalds1-1/+1
2016-09-11nvme: make NVME_RDMA depend on BLOCKLinus Torvalds1-1/+1
Commit aa71987472a9 ("nvme: fabrics drivers don't need the nvme-pci driver") removed the dependency on BLK_DEV_NVME, but the cdoe does depend on the block layer (which used to be an implicit dependency through BLK_DEV_NVME). Otherwise you get various errors from the kbuild test robot random config testing when that happens to hit a configuration with BLOCK device support disabled. Cc: Christoph Hellwig <hch@lst.de> Cc: Jay Freyensee <james_p_freyensee@linux.intel.com> Cc: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-11Merge tag 'staging-4.8-rc6' of ↵Linus Torvalds6-8/+19
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull IIO fixes from Greg KH: "Here are a few small IIO fixes for 4.8-rc6. Nothing major, full details are in the shortlog, all of these have been in linux-next with no reported issues" * tag 'staging-4.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: iio:core: fix IIO_VAL_FRACTIONAL sign handling iio: ensure ret is initialized to zero before entering do loop iio: accel: kxsd9: Fix scaling bug iio: accel: bmc150: reset chip at init time iio: fix pressure data output unit in hid-sensor-attributes tools:iio:iio_generic_buffer: fix trigger-less mode
2016-09-11Merge tag 'usb-4.8-rc6' of ↵Linus Torvalds8-8/+39
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are some small USB gadget, phy, and xhci fixes for 4.8-rc6. All of these resolve minor issues that have been reported, and all have been in linux-next with no reported issues" * tag 'usb-4.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: usb: chipidea: udc: fix NULL ptr dereference in isr_setup_status_phase xhci: fix null pointer dereference in stop command timeout function usb: dwc3: pci: fix build warning on !PM_SLEEP usb: gadget: prevent potenial null pointer dereference on skb->len usb: renesas_usbhs: fix clearing the {BRDY,BEMP}STS condition usb: phy: phy-generic: Check clk_prepare_enable() error usb: gadget: udc: renesas-usb3: clear VBOUT bit in DRD_CON Revert "usb: dwc3: gadget: always decrement by 1"
2016-09-10Merge branch 'libnvdimm-fixes' of ↵Linus Torvalds7-12/+30
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm fixes from Dan Williams: "nvdimm fixes for v4.8, two of them are tagged for -stable: - Fix devm_memremap_pages() to use track_pfn_insert(). Otherwise, DAX pmd mappings end up with an uncached pgprot, and unusable performance for the device-dax interface. The device-dax interface appeared in 4.7 so this is tagged for -stable. - Fix a couple VM_BUG_ON() checks in the show_smaps() path to understand DAX pmd entries. This fix is tagged for -stable. - Fix a mis-merge of the nfit machine-check handler to flip the polarity of an if() to match the final version of the patch that Vishal sent for 4.8-rc1. Without this the nfit machine check handler never detects / inserts new 'badblocks' entries which applications use to identify lost portions of files. - For test purposes, fix the nvdimm_clear_poison() path to operate on legacy / simulated nvdimm memory ranges. Without this fix a test can set badblocks, but never clear them on these ranges. - Fix the range checking done by dax_dev_pmd_fault(). This is not tagged for -stable since this problem is mitigated by specifying aligned resources at device-dax setup time. These patches have appeared in a next release over the past week. The recent rebase you can see in the timestamps was to drop an invalid fix as identified by the updated device-dax unit tests [1]. The -mm touches have an ack from Andrew" [1]: "[ndctl PATCH 0/3] device-dax test for recent kernel bugs" https://lists.01.org/pipermail/linux-nvdimm/2016-September/006855.html * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: libnvdimm: allow legacy (e820) pmem region to clear bad blocks nfit, mce: Fix SPA matching logic in MCE handler mm: fix cache mode of dax pmd mappings mm: fix show_smap() for zone_device-pmd ranges dax: fix mapping size check
2016-09-10Merge branch 'i2c/for-current' of ↵Linus Torvalds8-15/+43
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "Mostly driver bugfixes, but also a few cleanups which are nice to have out of the way" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: rk3x: Restore clock settings at resume time i2c: Spelling s/acknowedge/acknowledge/ i2c: designware: save the preset value of DW_IC_SDA_HOLD Documentation: i2c: slave-interface: add note for driver development i2c: mux: demux-pinctrl: run properly with multiple instances i2c: bcm-kona: fix inconsistent indenting i2c: rcar: use proper device with dma_mapping_error i2c: sh_mobile: use proper device with dma_mapping_error i2c: mux: demux-pinctrl: invalidate properly when switching fails
2016-09-10Merge tag 'for_linus_stable' of ↵Linus Torvalds4-24/+33
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull fscrypto fixes fromTed Ts'o: "Fix some brown-paper-bag bugs for fscrypto, including one one which allows a malicious user to set an encryption policy on an empty directory which they do not own" * tag 'for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: fscrypto: require write access to mount to set encryption policy fscrypto: only allow setting encryption policy on directories fscrypto: add authorization check for setting encryption policy
2016-09-10fscrypto: require write access to mount to set encryption policyEric Biggers4-25/+29
Since setting an encryption policy requires writing metadata to the filesystem, it should be guarded by mnt_want_write/mnt_drop_write. Otherwise, a user could cause a write to a frozen or readonly filesystem. This was handled correctly by f2fs but not by ext4. Make fscrypt_process_policy() handle it rather than relying on the filesystem to get it right. Signed-off-by: Eric Biggers <ebiggers@google.com> Cc: stable@vger.kernel.org # 4.1+; check fs/{ext4,f2fs} Signed-off-by: Theodore Ts'o <tytso@mit.edu> Acked-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-09-09fscrypto: only allow setting encryption policy on directoriesEric Biggers1-0/+2
The FS_IOC_SET_ENCRYPTION_POLICY ioctl allowed setting an encryption policy on nondirectory files. This was unintentional, and in the case of nonempty regular files did not behave as expected because existing data was not actually encrypted by the ioctl. In the case of ext4, the user could also trigger filesystem errors in ->empty_dir(), e.g. due to mismatched "directory" checksums when the kernel incorrectly tried to interpret a regular file as a directory. This bug affected ext4 with kernels v4.8-rc1 or later and f2fs with kernels v4.6 and later. It appears that older kernels only permitted directories and that the check was accidentally lost during the refactoring to share the file encryption code between ext4 and f2fs. This patch restores the !S_ISDIR() check that was present in older kernels. Signed-off-by: Eric Biggers <ebiggers@google.com> Cc: stable@vger.kernel.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-09-09fscrypto: add authorization check for setting encryption policyEric Biggers1-0/+3
On an ext4 or f2fs filesystem with file encryption supported, a user could set an encryption policy on any empty directory(*) to which they had readonly access. This is obviously problematic, since such a directory might be owned by another user and the new encryption policy would prevent that other user from creating files in their own directory (for example). Fix this by requiring inode_owner_or_capable() permission to set an encryption policy. This means that either the caller must own the file, or the caller must have the capability CAP_FOWNER. (*) Or also on any regular file, for f2fs v4.6 and later and ext4 v4.8-rc1 and later; a separate bug fix is coming for that. Signed-off-by: Eric Biggers <ebiggers@google.com> Cc: stable@vger.kernel.org # 4.1+; check fs/{ext4,f2fs} Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-09-09libnvdimm: allow legacy (e820) pmem region to clear bad blocksDave Jiang1-1/+5
Bad blocks can be injected via /sys/block/pmemN/badblocks. In a situation where legacy pmem is being used or a pmem region created by using memmap kernel parameter, the injected bad blocks are not cleared due to nvdimm_clear_poison() failing from lack of ndctl function pointer. In this case we need to just return as handled and allow the bad blocks to be cleared rather than fail. Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-09-09nfit, mce: Fix SPA matching logic in MCE handlerVishal Verma1-1/+1
The check for a 'pmem' type SPA in the MCE handler was inverted due to a merge/rebase error. Fixes: 6839a6d nfit: do an ARS scrub on hitting a latent media error Cc: linux-acpi@vger.kernel.org Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-09-09mm: fix cache mode of dax pmd mappingsDan Williams2-7/+19
track_pfn_insert() in vmf_insert_pfn_pmd() is marking dax mappings as uncacheable rendering them impractical for application usage. DAX-pte mappings are cached and the goal of establishing DAX-pmd mappings is to attain more performance, not dramatically less (3 orders of magnitude). track_pfn_insert() relies on a previous call to reserve_memtype() to establish the expected page_cache_mode for the range. While memremap() arranges for reserve_memtype() to be called, devm_memremap_pages() does not. So, teach track_pfn_insert() and untrack_pfn() how to handle tracking without a vma, and arrange for devm_memremap_pages() to establish the write-back-cache reservation in the memtype tree. Cc: <stable@vger.kernel.org> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Nilesh Choudhury <nilesh.choudhury@oracle.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reported-by: Toshi Kani <toshi.kani@hpe.com> Reported-by: Kai Zhang <kai.ka.zhang@oracle.com> Acked-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-09-09mm: fix show_smap() for zone_device-pmd rangesDan Williams2-2/+4
Attempting to dump /proc/<pid>/smaps for a process with pmd dax mappings currently results in the following VM_BUG_ONs: kernel BUG at mm/huge_memory.c:1105! task: ffff88045f16b140 task.stack: ffff88045be14000 RIP: 0010:[<ffffffff81268f9b>] [<ffffffff81268f9b>] follow_trans_huge_pmd+0x2cb/0x340 [..] Call Trace: [<ffffffff81306030>] smaps_pte_range+0xa0/0x4b0 [<ffffffff814c2755>] ? vsnprintf+0x255/0x4c0 [<ffffffff8123c46e>] __walk_page_range+0x1fe/0x4d0 [<ffffffff8123c8a2>] walk_page_vma+0x62/0x80 [<ffffffff81307656>] show_smap+0xa6/0x2b0 kernel BUG at fs/proc/task_mmu.c:585! RIP: 0010:[<ffffffff81306469>] [<ffffffff81306469>] smaps_pte_range+0x499/0x4b0 Call Trace: [<ffffffff814c2795>] ? vsnprintf+0x255/0x4c0 [<ffffffff8123c46e>] __walk_page_range+0x1fe/0x4d0 [<ffffffff8123c8a2>] walk_page_vma+0x62/0x80 [<ffffffff81307696>] show_smap+0xa6/0x2b0 These locations are sanity checking page flags that must be set for an anonymous transparent huge page, but are not set for the zone_device pages associated with dax mappings. Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-09-09Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds2-9/+16
Pull virtio fixes from Michael Tsirkin: "This includes a couple of bugfixs for virtio. The virtio console patch is actually also in x86/tip targeting 4.9 because it helps vmap stacks, but it also fixes IOMMU_PLATFORM which was added in 4.8, and it seems important not to ship that in a broken configuration" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: virtio_console: Stop doing DMA on the stack virtio: mark vring_dma_dev() static
2016-09-09Merge tag 'pm-4.8-rc6' of ↵Linus Torvalds2-2/+11
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "This includes a PM QoS framework fix from Tejun to prevent interrupts from being enabled unexpectedly during early boot and a cpufreq documentation fix. Specifics: - If the PM QoS framework invokes cancel_delayed_work_sync() during early boot, it will enable interrupts which is not expected at that point, so prevent it from happening (Tejun Heo) - Fix cpufreq statistic documentation to follow a recent change in behavior that forgot to update it as appropriate (Jean Delvare)" * tag 'pm-4.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq-stats: Minor documentation fix PM / QoS: avoid calling cancel_delayed_work_sync() during early boot
2016-09-09Merge branches 'pm-core-fixes' and 'pm-cpufreq-fixes'Rafael J. Wysocki1-1/+1
* pm-core-fixes: PM / QoS: avoid calling cancel_delayed_work_sync() during early boot * pm-cpufreq-fixes: cpufreq-stats: Minor documentation fix
2016-09-09Merge tag 'gpio-v4.8-3' of ↵Linus Torvalds4-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO fixes from Linus Walleij: "Some GPIO fixes that have been boiling the last two weeks or so. Nothing special, I'm trying to sort out some Kconfig business and Russell needs a fix in for -his SA1100 rework. Summary: - Revert a pointless attempt to add an include to solve the UM allyes compilation problem. - Make the mcp23s08 depend on OF_GPIO as it uses it and doesn't compile properly without it. - Fix a probing problem for ucb1x00" * tag 'gpio-v4.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: gpio: sa1100: fix irq probing for ucb1x00 gpio: mcp23s08: make driver depend on OF_GPIO Revert "gpio: include <linux/io-mapping.h> in gpiolib-of"
2016-09-09Merge branch 'for-linus' of ↵Linus Torvalds1-3/+4
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse fix from Miklos Szeredi: "This fixes a deadlock when fuse, direct I/O and loop device are combined" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: direct-io: don't dirty ITER_BVEC pages
2016-09-09Merge branch 'overlayfs-linus' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs Pull overlayfs fix from Miklos Szeredi: "This fixes a regression caused by the last pull request" * 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: ovl: fix workdir creation
2016-09-09Merge branch 'for-linus-4.8' of ↵Linus Torvalds3-8/+17
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes from Chris Mason: "I'm not proud of how long it took me to track down that one liner in btrfs_sync_log(), but the good news is the patches I was trying to blame for these problems were actually fine (sorry Filipe)" * 'for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: btrfs: introduce tickets_id to determine whether asynchronous metadata reclaim work makes progress btrfs: remove root_log_ctx from ctx list before btrfs_sync_log returns btrfs: do not decrease bytes_may_use when replaying extents
2016-09-09Merge tag 'sound-4.8-rc6' of ↵Linus Torvalds9-50/+118
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "We've got quite a few fixes at this time, and all are stable patches. syzkaller strikes back again (episode 19 or so), and we had to plug some holes in ALSA core part (mostly timer). In addition, a couple of FireWire audio fixes for the invalid copy user calls in locks, and a few quirks for HD-audio and USB-audio as usual are included" * tag 'sound-4.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: rawmidi: Fix possible deadlock with virmidi registration ALSA: timer: Fix zero-division by continue of uninitialized instance ALSA: timer: fix NULL pointer dereference in read()/ioctl() race ALSA: fireworks: accessing to user space outside spinlock ALSA: firewire-tascam: accessing to user space outside spinlock ALSA: hda - Enable subwoofer on Dell Inspiron 7559 ALSA: hda - Add headset mic quirk for Dell Inspiron 5468 ALSA: usb-audio: Add sample rate inquiry quirk for B850V3 CP2114 ALSA: timer: fix NULL pointer dereference on memory allocation failure ALSA: timer: fix division by zero after SNDRV_TIMER_IOCTL_CONTINUE
2016-09-09virtio_console: Stop doing DMA on the stackAndy Lutomirski1-8/+15
virtio_console uses a small DMA buffer for control requests. Move that buffer into heap memory. Doing virtio DMA on the stack is normally okay on non-DMA-API virtio systems (which is currently most of them), but it breaks completely if the stack is virtually mapped. Tested by typing both directions using picocom aimed at /dev/hvc0. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com>
2016-09-09virtio: mark vring_dma_dev() staticBaoyou Xie1-1/+1
We get 1 warning when building kernel with W=1: drivers/virtio/virtio_ring.c:170:16: warning: no previous prototype for 'vring_dma_dev' [-Wmissing-prototypes] In fact, this function is only used in the file in which it is declared and don't need a declaration, but can be made static. so this patch marks this function with 'static'. Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-09-09Merge tag 'arm64-fixes' of ↵Linus Torvalds2-4/+14
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - smp_mb__before_spinlock() changed to smp_mb() on arm64 since the generic definition to smp_wmb() is not sufficient - avoid a recursive loop with the graph tracer by using using preempt_(enable|disable)_notrace in _percpu_(read|write) * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: use preempt_disable_notrace in _percpu_read/write arm64: spinlocks: implement smp_mb__before_spinlock() as smp_mb()
2016-09-09Merge tag 'powerpc-4.8-5' of ↵Linus Torvalds5-17/+29
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "Fixes marked for stable: - Don't alias user region to other regions below PAGE_OFFSET from Paul Mackerras - Fix again csum_partial_copy_generic() on 32-bit from Christophe Leroy - Fix corrupted PE allocation bitmap on releasing PE from Gavin Shan Fixes for code merged this cycle: - Fix crash on releasing compound PE from Gavin Shan - Fix processor numbers in OPAL ICP from Benjamin Herrenschmidt - Fix little endian build with CONFIG_KEXEC=n from Thiago Jung Bauermann" * tag 'powerpc-4.8-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/mm: Don't alias user region to other regions below PAGE_OFFSET powerpc/32: Fix again csum_partial_copy_generic() powerpc/powernv: Fix corrupted PE allocation bitmap on releasing PE powerpc/powernv: Fix crash on releasing compound PE powerpc/xics/opal: Fix processor numbers in OPAL ICP powerpc/pseries: Fix little endian build with CONFIG_KEXEC=n
2016-09-09Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds5-2/+23
Pull ARM fixes from Russell King: "A few ARM fixes: - Robin Murphy noticed that the non-secure privileged entry was relying on undefined behaviour, which needed to be fixed. - Vladimir Murzin noticed that prov-v7 fails to build for MMUless configurations because a required header file wasn't included. - A bunch of fixes for StrongARM regressions found while testing 4.8-rc on such platforms" * 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: sa1100: clear reset status prior to reboot ARM: 8600/1: Enforce some NS-SVC initialisation ARM: 8599/1: mm: pull asm/memory.h explicitly ARM: sa1100: register clocks early ARM: sa1100: fix 3.6864MHz clock
2016-09-09Merge tag 'fixes-for-v4.8-rc6' of ↵Greg Kroah-Hartman6-7/+25
git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb into usb-linus Felipe writes: usb: fixes for v4.8-rc6 Unfortunately we have a bogus dwc3 patch leaked through the cracks and got merged into Linus' HEAD. That patch ended up causing off-by-1 error in our TRB accounting logic. Thankfully John Youn found out the problem and we provided a revert to the bogus dwc3 patch in no time. Apart from this off-by-1 error, we have two fixes to the Renesas drivers, a small fix to our generic phy driver, a NULL pointer dereference fix for f_eem and a build warning fix in dwc3.
2016-09-09Merge tag 'usb-ci-v4.8-rc6' of ↵Greg Kroah-Hartman1-0/+9
git://git.kernel.org/pub/scm/linux/kernel/git/peter.chen/usb into usb-linus Peter writes: Fix the possible kernel panic when the hardware signal is bad for chipidea udc.
2016-09-09Merge tag 'iio-fixes-for-4.8b' of ↵Greg Kroah-Hartman6-8/+19
git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-linus Jonathan writes: Second set of IIO fixes for the 4.8 cycle. We have a big rework of the kxsd9 driver queued up behind the fix below and a fix for a recent fix that was marked for stable. Hence this fix series is perhaps a little more urgent than average for IIO. * core - a fix for a fix in the last set. The recent fix for blocking ops when ! task running left a path (unlikely one) in which the function return value was not set - so initialise it to 0. - The IIO_TYPE_FRACTIONAL code previously didn't cope with negative fractions. Turned out a fix for this was in Analog's tree but hadn't made it upstream. * bmc150 - reset chip at init time. At least one board out there ends up coming up in an unstable state due to noise during power up. The reset does no harm on other boards. * kxsd9 - Fix a bug in the reported scaling due to failing to set the integer part to 0. * hid-sensors-pressure - Output was in the wrong units to comply with the IIO ABI. * tools - iio_generic_buffer: Fix the trigger-less mode by ensuring we don't fault out for having no trigger when we explicitly said we didn't want to have one.
2016-09-09arm64: use preempt_disable_notrace in _percpu_read/writeChunyan Zhang1-4/+4
When debug preempt or preempt tracer is enabled, preempt_count_add/sub() can be traced by function and function graph tracing, and preempt_disable/enable() would call preempt_count_add/sub(), so in Ftrace subsystem we should use preempt_disable/enable_notrace instead. In the commit 345ddcc882d8 ("ftrace: Have set_ftrace_pid use the bitmap like events do") the function this_cpu_read() was added to trace_graph_entry(), and if this_cpu_read() calls preempt_disable(), graph tracer will go into a recursive loop, even if the tracing_on is disabled. So this patch change to use preempt_enable/disable_notrace instead in this_cpu_read(). Since Yonghui Yang helped a lot to find the root cause of this problem, so also add his SOB. Signed-off-by: Yonghui Yang <mark.yang@spreadtrum.com> Signed-off-by: Chunyan Zhang <zhang.chunyan@linaro.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-09-09arm64: spinlocks: implement smp_mb__before_spinlock() as smp_mb()Will Deacon1-0/+10
smp_mb__before_spinlock() is intended to upgrade a spin_lock() operation to a full barrier, such that prior stores are ordered with respect to loads and stores occuring inside the critical section. Unfortunately, the core code defines the barrier as smp_wmb(), which is insufficient to provide the required ordering guarantees when used in conjunction with our load-acquire-based spinlock implementation. This patch overrides the arm64 definition of smp_mb__before_spinlock() to map to a full smp_mb(). Cc: <stable@vger.kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Reported-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-09-09usb: chipidea: udc: fix NULL ptr dereference in isr_setup_status_phaseClemens Gruber1-0/+9
Problems with the signal integrity of the high speed USB data lines or noise on reference ground lines can cause the i.MX6 USB controller to violate USB specs and exhibit unexpected behavior. It was observed that USBi_UI interrupts were triggered first and when isr_setup_status_phase was called, ci->status was NULL, which lead to a NULL pointer dereference kernel panic. This patch fixes the kernel panic, emits a warning once and returns -EPIPE to halt the device and let the host get stalled. It also adds a comment to point people, who are experiencing this issue, to their USB hardware design. Cc: <stable@vger.kernel.org> #4.1+ Signed-off-by: Clemens Gruber <clemens.gruber@pqgruber.com> Signed-off-by: Peter Chen <peter.chen@nxp.com>
2016-09-08cpufreq-stats: Minor documentation fixJean Delvare1-1/+1
The cpufreq-stats code can no longer be built as a module, so it now appears with square brackets in menuconfig. Signed-off-by: Jean Delvare <jdelvare@suse.de> Fixes: 1aefc75b2449 (cpufreq: stats: Make the stats code non-modular) Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>