summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2016-01-15page-flags: introduce page flags policies wrt compound pagesKirill A. Shutemov3-64/+116
This patch adds a third argument to macros which create function definitions for page flags. This argument defines how page-flags helpers behave on compound functions. For now we define four policies: - PF_ANY: the helper function operates on the page it gets, regardless if it's non-compound, head or tail. - PF_HEAD: the helper function operates on the head page of the compound page if it gets tail page. - PF_NO_TAIL: only head and non-compond pages are acceptable for this helper function. - PF_NO_COMPOUND: only non-compound pages are acceptable for this helper function. For now we use policy PF_ANY for all helpers, which matches current behaviour. We do not enforce the policy for TESTPAGEFLAG, because we have flags checked for random pages all over the kernel. Noticeable exception to this is PageTransHuge() which triggers VM_BUG_ON() for tail page. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Christoph Lameter <cl@linux.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Steve Capper <steve.capper@linaro.org> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: Jérôme Glisse <jglisse@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15page-flags: move code aroundKirill A. Shutemov1-20/+21
The preparation patch: we are going to use compound_head(), PageTail() and PageCompound() to define page-flags helpers. Let's define them before macros. We cannot user PageHead() helper in PageCompound() as it's not yet defined -- use test_bit(PG_head, &page->flags) instead. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Christoph Lameter <cl@linux.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Steve Capper <steve.capper@linaro.org> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: Jérôme Glisse <jglisse@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15page-flags: trivial cleanup for PageTrans* helpersKirill A. Shutemov1-15/+3
Use TESTPAGEFLAG_FALSE() to get it a bit cleaner. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Christoph Lameter <cl@linux.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Steve Capper <steve.capper@linaro.org> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: Jérôme Glisse <jglisse@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds29-66/+158
Pull networking fixes from David Miller: "A quick set of bug fixes after there initial networking merge: 1) Netlink multicast group storage allocator only was tested with nr_groups equal to 1, make it work for other values too. From Matti Vaittinen. 2) Check build_skb() return value in macb and hip04_eth drivers, from Weidong Wang. 3) Don't leak x25_asy on x25_asy_open() failure. 4) More DMA map/unmap fixes in 3c59x from Neil Horman. 5) Don't clobber IP skb control block during GSO segmentation, from Konstantin Khlebnikov. 6) ECN helpers for ipv6 don't fixup the checksum, from Eric Dumazet. 7) Fix SKB segment utilization estimation in xen-netback, from David Vrabel. 8) Fix lockdep splat in bridge addrlist handling, from Nikolay Aleksandrov" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (26 commits) bgmac: Fix reversed test of build_skb() return value. bridge: fix lockdep addr_list_lock false positive splat net: smsc: Add support h8300 xen-netback: free queues after freeing the net device xen-netback: delete NAPI instance when queue fails to initialize xen-netback: use skb to determine number of required guest Rx requests net: sctp: Move sequence start handling into sctp_transport_get_idx() ipv6: update skb->csum when CE mark is propagated net: phy: turn carrier off on phy attach net: macb: clear interrupts when disabling them sctp: support to lookup with ep+paddr in transport rhashtable net: hns: fixes no syscon error when init mdio dts: hisi: fixes no syscon fault when init mdio net: preserve IP control block during GSO segmentation fsl/fman: Delete one function call "put_device" in dtsec_config() hip04_eth: fix missing error handle for build_skb failed 3c59x: fix another page map/single unmap imbalance 3c59x: balance page maps and unmaps x25_asy: Free x25_asy on x25_asy_open() failure. mlxsw: fix SWITCHDEV_OBJ_ID_PORT_MDB ...
2016-01-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparcLinus Torvalds2-8/+9
Pull sparc fixes from David Miller: "Two sparc bug fixes" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sparc64: Fix numa node distance initialization sparc64: fix incorrect sign extension in sys_sparc64_personality
2016-01-15Merge tag 'powerpc-4.5-1' of ↵Linus Torvalds251-3848/+6827
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Core: - Ground work for the new Power9 MMU from Aneesh Kumar K.V - Optimise FP/VMX/VSX context switching from Anton Blanchard Misc: - Various cleanups from Krzysztof Kozlowski, John Ogness, Rashmica Gupta, Russell Currey, Gavin Shan, Daniel Axtens, Michael Neuling, Andrew Donnellan - Allow wrapper to work on non-english system from Laurent Vivier - Add rN aliases to the pt_regs_offset table from Rashmica Gupta - Fix module autoload for rackmeter & axonram drivers from Luis de Bethencourt - Include KVM guest test in all interrupt vectors from Paul Mackerras - Fix DSCR inheritance over fork() from Anton Blanchard - Make value-returning atomics & {cmp}xchg* & their atomic_ versions fully ordered from Boqun Feng - Print MSR TM bits in oops messages from Michael Neuling - Add TM signal return & invalid stack selftests from Michael Neuling - Limit EPOW reset event warnings from Vipin K Parashar - Remove the Cell QPACE code from Rashmica Gupta - Append linux_banner to exception information in xmon from Rashmica Gupta - Add selftest to check if VSRs are corrupted from Rashmica Gupta - Remove broken GregorianDay() from Daniel Axtens - Import Anton's context_switch2 benchmark into selftests from Michael Ellerman - Add selftest script to test HMI functionality from Daniel Axtens - Remove obsolete OPAL v2 support from Stewart Smith - Make enter_rtas() private from Michael Ellerman - PPR exception cleanups from Michael Ellerman - Add page soft dirty tracking from Laurent Dufour - Add support for Nvlink NPUs from Alistair Popple - Add support for kexec on 476fpe from Alistair Popple - Enable kernel CPU dlpar from sysfs from Nathan Fontenot - Copy only required pieces of the mm_context_t to the paca from Michael Neuling - Add a kmsg_dumper that flushes OPAL console output on panic from Russell Currey - Implement save_stack_trace_regs() to enable kprobe stack tracing from Steven Rostedt - Add HWCAP bits for Power9 from Michael Ellerman - Fix _PAGE_PTE breaking swapoff from Aneesh Kumar K.V - Fix _PAGE_SWP_SOFT_DIRTY breaking swapoff from Hugh Dickins - scripts/recordmcount.pl: support data in text section on powerpc from Ulrich Weigand - Handle R_PPC64_ENTRY relocations in modules from Ulrich Weigand cxl: - cxl: Fix possible idr warning when contexts are released from Vaibhav Jain - cxl: use correct operator when writing pcie config space values from Andrew Donnellan - cxl: Fix DSI misses when the context owning task exits from Vaibhav Jain - cxl: fix build for GCC 4.6.x from Brian Norris - cxl: use -Werror only with CONFIG_PPC_WERROR from Brian Norris - cxl: Enable PCI device ID for future IBM CXL adapter from Uma Krishnan Freescale: - Freescale updates from Scott: Highlights include moving QE code out of arch/powerpc (to be shared with arm), device tree updates, and minor fixes" * tag 'powerpc-4.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (149 commits) powerpc/module: Handle R_PPC64_ENTRY relocations scripts/recordmcount.pl: support data in text section on powerpc powerpc/powernv: Fix OPAL_CONSOLE_FLUSH prototype and usages powerpc/mm: fix _PAGE_SWP_SOFT_DIRTY breaking swapoff powerpc/mm: Fix _PAGE_PTE breaking swapoff cxl: Enable PCI device ID for future IBM CXL adapter cxl: use -Werror only with CONFIG_PPC_WERROR cxl: fix build for GCC 4.6.x powerpc: Add HWCAP bits for Power9 powerpc/powernv: Reserve PE#0 on NPU powerpc/powernv: Change NPU PE# assignment powerpc/powernv: Fix update of NVLink DMA mask powerpc/powernv: Remove misleading comment in pci.c powerpc: Implement save_stack_trace_regs() to enable kprobe stack tracing powerpc: Fix build break due to paca mm_context_t changes cxl: Fix DSI misses when the context owning task exits MAINTAINERS: Update Scott Wood's e-mail address powerpc/powernv: Fix minor off-by-one error in opal_mce_check_early_recovery() powerpc: Fix style of self-test config prompts powerpc/powernv: Only delay opal_rtc_read() retry when necessary ...
2016-01-15bgmac: Fix reversed test of build_skb() return value.David S. Miller1-1/+1
Fixes: f1640c3ddeec ("bgmac: fix a missing check for build_skb") Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-15Merge tag 'vfio-v4.5-rc1' of git://github.com/awilliam/linux-vfioLinus Torvalds7-9/+214
Pull VFIO updates from Alex Williamson: - Fixes in AMD xgbe reset, spapr structure padding, type 1 flags (Dan Carpenter, Alexey Kardashevskiy, Pierre Morel) - Re-introduce no-iommu mode, with a user this time (Alex Williamson) * tag 'vfio-v4.5-rc1' of git://github.com/awilliam/linux-vfio: vfio/iommu_type1: make use of info.flags vfio: Include No-IOMMU mode vfio: Add explicit alignments in vfio_iommu_spapr_tce_create VFIO: platform: reset: fix a warning message condition
2016-01-15Merge tag 'nfsd-4.5' of git://linux-nfs.org/~bfields/linuxLinus Torvalds19-57/+370
Pull nfsd updates from Bruce Fields: "Smaller bugfixes and cleanup, including a fix for a failures of kerberized NFSv4.1 mounts, and Scott Mayhew's work addressing ACK storms that can affect some high-availability NFS setups" * tag 'nfsd-4.5' of git://linux-nfs.org/~bfields/linux: nfsd: add new io class tracepoint nfsd: give up on CB_LAYOUTRECALLs after two lease periods nfsd: Fix nfsd leaks sunrpc module references lockd: constify nlmsvc_binding structure lockd: use to_delayed_work nfsd: use to_delayed_work Revert "svcrdma: Do not send XDR roundup bytes for a write chunk" lockd: Register callbacks on the inetaddr_chain and inet6addr_chain nfsd: Register callbacks on the inetaddr_chain and inet6addr_chain sunrpc: Add a function to close temporary transports immediately nfsd: don't base cl_cb_status on stale information nfsd4: fix gss-proxy 4.1 mounts for some AD principals nfsd: fix unlikely NULL deref in mach_creds_match nfsd: minor consolidation of mach_cred handling code nfsd: helper for dup of possibly NULL string svcrpc: move some initialization to common code nfsd: fix a warning message nfsd: constify nfsd4_callback_ops structure nfsd: recover: constify nfsd4_client_tracking_ops structures svcrdma: Do not send XDR roundup bytes for a write chunk
2016-01-15Merge branch 'for-linus' of ↵Linus Torvalds1-2/+1
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs regression fix from Al Viro: "Fix for braino introduced in vfs.git#work.misc" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: amdkfd: Copy from the proper user command pointer
2016-01-15bridge: fix lockdep addr_list_lock false positive splatNikolay Aleksandrov1-0/+8
After promisc mode management was introduced a bridge device could do dev_set_promiscuity from its ndo_change_rx_flags() callback which in turn can be called after the bridge's addr_list_lock has been taken (e.g. by dev_uc_add). This causes a false positive lockdep splat because the port interfaces' addr_list_lock is taken when br_manage_promisc() runs after the bridge's addr list lock was already taken. To remove the false positive introduce a custom bridge addr_list_lock class and set it on bridge init. A simple way to reproduce this is with the following: $ brctl addbr br0 $ ip l add l br0 br0.100 type vlan id 100 $ ip l set br0 up $ ip l set br0.100 up $ echo 1 > /sys/class/net/br0/bridge/vlan_filtering $ brctl addif br0 eth0 Splat: [ 43.684325] ============================================= [ 43.684485] [ INFO: possible recursive locking detected ] [ 43.684636] 4.4.0-rc8+ #54 Not tainted [ 43.684755] --------------------------------------------- [ 43.684906] brctl/1187 is trying to acquire lock: [ 43.685047] (_xmit_ETHER){+.....}, at: [<ffffffff8150169e>] dev_set_rx_mode+0x1e/0x40 [ 43.685460] but task is already holding lock: [ 43.685618] (_xmit_ETHER){+.....}, at: [<ffffffff815072a7>] dev_uc_add+0x27/0x80 [ 43.686015] other info that might help us debug this: [ 43.686316] Possible unsafe locking scenario: [ 43.686743] CPU0 [ 43.686967] ---- [ 43.687197] lock(_xmit_ETHER); [ 43.687544] lock(_xmit_ETHER); [ 43.687886] *** DEADLOCK *** [ 43.688438] May be due to missing lock nesting notation [ 43.688882] 2 locks held by brctl/1187: [ 43.689134] #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff81510317>] rtnl_lock+0x17/0x20 [ 43.689852] #1: (_xmit_ETHER){+.....}, at: [<ffffffff815072a7>] dev_uc_add+0x27/0x80 [ 43.690575] stack backtrace: [ 43.690970] CPU: 0 PID: 1187 Comm: brctl Not tainted 4.4.0-rc8+ #54 [ 43.691270] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014 [ 43.691770] ffffffff826a25c0 ffff8800369fb8e0 ffffffff81360ceb ffffffff826a25c0 [ 43.692425] ffff8800369fb9b8 ffffffff810d0466 ffff8800369fb968 ffffffff81537139 [ 43.693071] ffff88003a08c880 0000000000000000 00000000ffffffff 0000000002080020 [ 43.693709] Call Trace: [ 43.693931] [<ffffffff81360ceb>] dump_stack+0x4b/0x70 [ 43.694199] [<ffffffff810d0466>] __lock_acquire+0x1e46/0x1e90 [ 43.694483] [<ffffffff81537139>] ? netlink_broadcast_filtered+0x139/0x3e0 [ 43.694789] [<ffffffff8153b5da>] ? nlmsg_notify+0x5a/0xc0 [ 43.695064] [<ffffffff810d10f5>] lock_acquire+0xe5/0x1f0 [ 43.695340] [<ffffffff8150169e>] ? dev_set_rx_mode+0x1e/0x40 [ 43.695623] [<ffffffff815edea5>] _raw_spin_lock_bh+0x45/0x80 [ 43.695901] [<ffffffff8150169e>] ? dev_set_rx_mode+0x1e/0x40 [ 43.696180] [<ffffffff8150169e>] dev_set_rx_mode+0x1e/0x40 [ 43.696460] [<ffffffff8150189c>] dev_set_promiscuity+0x3c/0x50 [ 43.696750] [<ffffffffa0586845>] br_port_set_promisc+0x25/0x50 [bridge] [ 43.697052] [<ffffffffa05869aa>] br_manage_promisc+0x8a/0xe0 [bridge] [ 43.697348] [<ffffffffa05826ee>] br_dev_change_rx_flags+0x1e/0x20 [bridge] [ 43.697655] [<ffffffff81501532>] __dev_set_promiscuity+0x132/0x1f0 [ 43.697943] [<ffffffff81501672>] __dev_set_rx_mode+0x82/0x90 [ 43.698223] [<ffffffff815072de>] dev_uc_add+0x5e/0x80 [ 43.698498] [<ffffffffa05b3c62>] vlan_device_event+0x542/0x650 [8021q] [ 43.698798] [<ffffffff8109886d>] notifier_call_chain+0x5d/0x80 [ 43.699083] [<ffffffff810988b6>] raw_notifier_call_chain+0x16/0x20 [ 43.699374] [<ffffffff814f456e>] call_netdevice_notifiers_info+0x6e/0x80 [ 43.699678] [<ffffffff814f4596>] call_netdevice_notifiers+0x16/0x20 [ 43.699973] [<ffffffffa05872be>] br_add_if+0x47e/0x4c0 [bridge] [ 43.700259] [<ffffffffa058801e>] add_del_if+0x6e/0x80 [bridge] [ 43.700548] [<ffffffffa0588b5f>] br_dev_ioctl+0xaf/0xc0 [bridge] [ 43.700836] [<ffffffff8151a7ac>] dev_ifsioc+0x30c/0x3c0 [ 43.701106] [<ffffffff8151aac9>] dev_ioctl+0xf9/0x6f0 [ 43.701379] [<ffffffff81254345>] ? mntput_no_expire+0x5/0x450 [ 43.701665] [<ffffffff812543ee>] ? mntput_no_expire+0xae/0x450 [ 43.701947] [<ffffffff814d7b02>] sock_do_ioctl+0x42/0x50 [ 43.702219] [<ffffffff814d8175>] sock_ioctl+0x1e5/0x290 [ 43.702500] [<ffffffff81242d0b>] do_vfs_ioctl+0x2cb/0x5c0 [ 43.702771] [<ffffffff81243079>] SyS_ioctl+0x79/0x90 [ 43.703033] [<ffffffff815eebb6>] entry_SYSCALL_64_fastpath+0x16/0x7a CC: Vlad Yasevich <vyasevic@redhat.com> CC: Stephen Hemminger <stephen@networkplumber.org> CC: Bridge list <bridge@lists.linux-foundation.org> CC: Andy Gospodarek <gospo@cumulusnetworks.com> CC: Roopa Prabhu <roopa@cumulusnetworks.com> Fixes: 2796d0c648c9 ("bridge: Automatically manage port promiscuous mode.") Reported-by: Andy Gospodarek <gospo@cumulusnetworks.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-15Merge tag 'md/4.5' of git://neil.brown.name/mdLinus Torvalds13-231/+649
Pull md updates from Neil Brown: "Mostly clustered-raid1 and raid5 journal updates. one Y2038 fix and other minor stuff. One patch removes me from the MAINTAINERS file and adds a record of my md maintainership to Credits" Many thanks to Neil, who has been around for a _looong_ time. * tag 'md/4.5' of git://neil.brown.name/md: (26 commits) md/raid: only permit hot-add of compatible integrity profiles Remove myself as MD Maintainer, and add to Credits. raid5-cache: handle journal hotadd in quiesce MD: add journal with array suspended md: set MD_HAS_JOURNAL in correct places md: Remove 'ready' field from mddev. md: remove unnecesary md_new_event_inintr raid5: allow r5l_io_unit allocations to fail raid5-cache: use a mempool for the metadata block raid5-cache: use a bio_set raid5-cache: add journal hot add/remove support drivers: md: use ktime_get_real_seconds() md: avoid warning for 32-bit sector_t raid5-cache: free meta_page earlier raid5-cache: simplify r5l_move_io_unit_list md: update comment for md_allow_write md-cluster: update comments for MD_CLUSTER_SEND_LOCKED_ALREADY md-cluster: Protect communication with mutexes md-cluster: Defer MD reloading to mddev->thread md-cluster: update the documentation ...
2016-01-15Merge tag 'regulator-v4.5' of ↵Linus Torvalds40-248/+2414
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator updates from Mark Brown: "Aside from a fix for a spurious warning (which caused more problems than it fixed in the fixing really) this is all driver updates, including new drivers for Dialog PV88060/90 and TI LM363x and TPS65086 devices. The qcom_smd driver has had PM8916 and PMA8084 support added" * tag 'regulator-v4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: (36 commits) regulator: core: remove some dead code regulator: core: use dev_to_rdev regulator: lp872x: Get rid of duplicate reference to DVS GPIO regulator: lp872x: Add missing of_match in regulators descriptions regulator: axp20x: Fix GPIO LDO enable value for AXP22x regulator: lp8788: constify regulator_ops structures regulator: wm8*: constify regulator_ops structures regulator: da9*: constify regulator_ops structures regulator: mt6311: Use REGCACHE_RBTREE regulator: tps65917/palmas: Add bypass ops for LDOs with bypass capability regulator: qcom-smd: Add support for PMA8084 regulator: qcom-smd: Add PM8916 support soc: qcom: documentation: Update SMD/RPM Docs regulator: pv88090: logical vs bitwise AND typo regulator: pv88090: Fix irq leak regulator: pv88090: new regulator driver regulator: wm831x-ldo: Use platform_register/unregister_drivers() regulator: wm831x-dcdc: Use platform_register/unregister_drivers() regulator: lp8788-ldo: Use platform_register/unregister_drivers() regulator: core: Fix nested locking of supplies ...
2016-01-15net: smsc: Add support h8300Yoshinori Sato2-2/+13
Add H8/300 platform support for smc91x Signed-off-by: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-15amdkfd: Copy from the proper user command pointerBorislav Petkov1-2/+1
8f1d57c17248 ("amdkfd: don't open-code memdup_user()") mistakenly uses an uninitialized local pointer, gcc complains: drivers/gpu/drm/amd/amdkfd/kfd_chardev.c: In function ‘kfd_ioctl_dbg_address_watch’: drivers/gpu/drm/amd/amdkfd/kfd_chardev.c:562:12: warning: ‘args_buff’ may be used uninitialized in this function [-Wmaybe-uninitialized] args_buff = memdup_user(args_buff, ^ Fix it. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-01-15Merge branch 'mailbox-for-next' of ↵Linus Torvalds1-1/+1
git://git.linaro.org/landing-teams/working/fujitsu/integration Pull mailbox fixlet from Jussi Brar. * 'mailbox-for-next' of git://git.linaro.org/landing-teams/working/fujitsu/integration: mailbox: constify mbox_chan_ops structure
2016-01-15Merge branch 'xen-netback-fixes'David S. Miller2-22/+15
David Vrabel says: ==================== xen-netback: use skb to determine number of required (etc.) "xen-netback: use skb to determine number of required" plus two other minor fixes I found down the back of the sofa. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-15xen-netback: free queues after freeing the net deviceDavid Vrabel1-11/+5
If a queue still has a NAPI instance added to the net device, freeing the queues early results in a use-after-free. The shouldn't ever happen because we disconnect and tear down all queues before freeing the net device, but doing this makes it obviously safe. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-15xen-netback: delete NAPI instance when queue fails to initializeDavid Vrabel1-0/+1
When xenvif_connect() fails it may leave a stale NAPI instance added to the device. Make sure we delete it in the error path. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-15xen-netback: use skb to determine number of required guest Rx requestsDavid Vrabel1-11/+9
Using the MTU or GSO size to determine the number of required guest Rx requests for an skb was subtly broken since these value may change at runtime. After 1650d5455bd2dc6b5ee134bd6fc1a3236c266b5b (xen-netback: always fully coalesce guest Rx packets) we always fully pack a packet into its guest Rx slots. Calculating the number of required slots from the packet length is then easy. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-15net: sctp: Move sequence start handling into sctp_transport_get_idx()Geert Uytterhoeven1-3/+3
net/sctp/proc.c: In function ‘sctp_transport_get_idx’: net/sctp/proc.c:313: warning: ‘obj’ may be used uninitialized in this function This is currently a false positive, as all callers check for a zero offset first, and handle this case in the exact same way. Move the check and handling into sctp_transport_get_idx() to kill the compiler warning, and avoid future bugs. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-15ipv6: update skb->csum when CE mark is propagatedEric Dumazet2-4/+17
When a tunnel decapsulates the outer header, it has to comply with RFC 6080 and eventually propagate CE mark into inner header. It turns out IP6_ECN_set_ce() does not correctly update skb->csum for CHECKSUM_COMPLETE packets, triggering infamous "hw csum failure" messages and stack traces. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-15Merge branch 'for_linus' of ↵Linus Torvalds11-201/+196
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull UDF fixes and quota cleanups from Jan Kara: "Several UDF fixes and some minor quota cleanups" * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: udf: Check output buffer length when converting name to CS0 udf: Prevent buffer overrun with multi-byte characters quota: constify qtree_fmt_operations structures udf: avoid uninitialized variable use udf: Fix lost indirect extent block udf: Factor out code for creating indirect extent udf: limit the maximum number of indirect extents in a row udf: limit the maximum number of TD redirections fs: make quota/dquot.c explicitly non-modular fs: make quota/netlink.c explicitly non-modular
2016-01-15net: phy: turn carrier off on phy attachSjoerd Simons1-0/+5
The operstate of a networking device initially IF_OPER_UNKNOWN aka "unknown", updated on carrier state changes (with carrier state being on by default). This means it will stay unknown unless the carrier state goes to off at some point, which is not the case if the phy is already up/connected at startup. Explicitly turn off the carrier on phy attach, leaving the phy state machine to turn the carrier on when it has done the initial negotiation. Signed-off-by: Sjoerd Simons <sjoerd.simons@collabora.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-15net: macb: clear interrupts when disabling themNathan Sullivan1-0/+4
Disabling interrupts with the IDR register does not stop the macb hardware from asserting its interrupt line if there are interrupts pending. Always clear the interrupts using ISR, and be sure to write it on hardware that is not read-to-clear, like Zynq. Not doing so will cause interrupts when the driver doesn't expect them. Signed-off-by: Nathan Sullivan <nathan.sullivan@ni.com> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-15Merge branch 'akpm' (patches from Andrew)Linus Torvalds176-1284/+1852
Merge first patch-bomb from Andrew Morton: - A few hotfixes which missed 4.4 becasue I was asleep. cc'ed to -stable - A few misc fixes - OCFS2 updates - Part of MM. Including pretty large changes to page-flags handling and to thp management which have been buffered up for 2-3 cycles now. I have a lot of MM material this time. [ It turns out the THP part wasn't quite ready, so that got dropped from this series - Linus ] * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (117 commits) zsmalloc: reorganize struct size_class to pack 4 bytes hole mm/zbud.c: use list_last_entry() instead of list_tail_entry() zram/zcomp: do not zero out zcomp private pages zram: pass gfp from zcomp frontend to backend zram: try vmalloc() after kmalloc() zram/zcomp: use GFP_NOIO to allocate streams mm: add tracepoint for scanning pages drivers/base/memory.c: fix kernel warning during memory hotplug on ppc64 mm/page_isolation: use macro to judge the alignment mm: fix noisy sparse warning in LIBCFS_ALLOC_PRE() mm: rework virtual memory accounting include/linux/memblock.h: fix ordering of 'flags' argument in comments mm: move lru_to_page to mm_inline.h Documentation/filesystems: describe the shared memory usage/accounting memory-hotplug: don't BUG() in register_memory_resource() hugetlb: make mm and fs code explicitly non-modular mm/swapfile.c: use list_for_each_entry_safe in free_swap_count_continuations mm: /proc/pid/clear_refs: no need to clear VM_SOFTDIRTY in clear_soft_dirty_pmd() mm: make sure isolate_lru_page() is never called for tail page vmstat: make vmstat_updater deferrable again and shut down on idle ...
2016-01-15sctp: support to lookup with ep+paddr in transport rhashtableXin Long1-15/+23
Now, when we sendmsg, we translate the ep to laddr by selecting the first element of the list, and then do a lookup for a transport. But sctp_hash_cmp() will compare it against asoc addr_list, which may be a subset of ep addr_list, meaning that this chosen laddr may not be there, and thus making it impossible to find the transport. So we fix it by using ep + paddr to lookup transports in hashtable. In sctp_hash_cmp, if .ep is set, we will check if this ep == asoc->ep, or we will do the laddr check. Fixes: d6c0256a60e6 ("sctp: add the rhashtable apis for sctp global transport hashtable") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Reported-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-15zsmalloc: reorganize struct size_class to pack 4 bytes holeWeijie Yang1-2/+2
Reoder the pages_per_zspage field in struct size_class which can eliminate the 4 bytes hole between it and stats field. Signed-off-by: Weijie Yang <weijie.yang@samsung.com> Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15mm/zbud.c: use list_last_entry() instead of list_tail_entry()Geliang Tang1-4/+1
list_last_entry*( has been defined in list.h, so replace list_tail_entry() with it. Signed-off-by: Geliang Tang <geliangtang@163.com> Cc: Seth Jennings <sjennings@variantweb.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15zram/zcomp: do not zero out zcomp private pagesSergey Senozhatsky2-4/+4
Do not __GFP_ZERO allocated zcomp ->private pages. We keep allocated streams around and use them for read/write requests, so we supply a zeroed out ->private to compression algorithm as a scratch buffer only once -- the first time we use that stream. For the rest of IO requests served by this stream ->private usually contains some temporarily data from the previous requests. Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Acked-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15zram: pass gfp from zcomp frontend to backendMinchan Kim4-35/+23
Each zcomp backend uses own gfp flag but it's pointless because the context they could be called is driven by upper layer(ie, zcomp frontend). As well, zcomp frondend could call them in different context. One context(ie, zram init part) is it should be better to make sure successful allocation other context(ie, further stream allocation part for accelarating I/O speed) is just optional so let's pass gfp down from driver (ie, zcomp frontend) like normal MM convention. [sergey.senozhatsky@gmail.com: add missing __vmalloc zero and highmem gfps] Signed-off-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15zram: try vmalloc() after kmalloc()Kyeongdon Kim2-4/+42
When we're using LZ4 multi compression streams for zram swap, we found out page allocation failure message in system running test. That was not only once, but a few(2 - 5 times per test). Also, some failure cases were continually occurring to try allocation order 3. In order to make parallel compression private data, we should call kzalloc() with order 2/3 in runtime(lzo/lz4). But if there is no order 2/3 size memory to allocate in that time, page allocation fails. This patch makes to use vmalloc() as fallback of kmalloc(), this prevents page alloc failure warning. After using this, we never found warning message in running test, also It could reduce process startup latency about 60-120ms in each case. For reference a call trace : Binder_1: page allocation failure: order:3, mode:0x10c0d0 CPU: 0 PID: 424 Comm: Binder_1 Tainted: GW 3.10.49-perf-g991d02b-dirty #20 Call trace: dump_backtrace+0x0/0x270 show_stack+0x10/0x1c dump_stack+0x1c/0x28 warn_alloc_failed+0xfc/0x11c __alloc_pages_nodemask+0x724/0x7f0 __get_free_pages+0x14/0x5c kmalloc_order_trace+0x38/0xd8 zcomp_lz4_create+0x2c/0x38 zcomp_strm_alloc+0x34/0x78 zcomp_strm_multi_find+0x124/0x1ec zcomp_strm_find+0xc/0x18 zram_bvec_rw+0x2fc/0x780 zram_make_request+0x25c/0x2d4 generic_make_request+0x80/0xbc submit_bio+0xa4/0x15c __swap_writepage+0x218/0x230 swap_writepage+0x3c/0x4c shrink_page_list+0x51c/0x8d0 shrink_inactive_list+0x3f8/0x60c shrink_lruvec+0x33c/0x4cc shrink_zone+0x3c/0x100 try_to_free_pages+0x2b8/0x54c __alloc_pages_nodemask+0x514/0x7f0 __get_free_pages+0x14/0x5c proc_info_read+0x50/0xe4 vfs_read+0xa0/0x12c SyS_read+0x44/0x74 DMA: 3397*4kB (MC) 26*8kB (RC) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 13796kB [minchan@kernel.org: change vmalloc gfp and adding comment about gfp] [sergey.senozhatsky@gmail.com: tweak comments and styles] Signed-off-by: Kyeongdon Kim <kyeongdon.kim@lge.com> Signed-off-by: Minchan Kim <minchan@kernel.org> Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15zram/zcomp: use GFP_NOIO to allocate streamsSergey Senozhatsky3-4/+4
We can end up allocating a new compression stream with GFP_KERNEL from within the IO path, which may result is nested (recursive) IO operations. That can introduce problems if the IO path in question is a reclaimer, holding some locks that will deadlock nested IOs. Allocate streams and working memory using GFP_NOIO flag, forbidding recursive IO and FS operations. An example: inconsistent {IN-RECLAIM_FS-W} -> {RECLAIM_FS-ON-W} usage. git/20158 [HC0[0]:SC0[0]:HE1:SE1] takes: (jbd2_handle){+.+.?.}, at: start_this_handle+0x4ca/0x555 {IN-RECLAIM_FS-W} state was registered at: __lock_acquire+0x8da/0x117b lock_acquire+0x10c/0x1a7 start_this_handle+0x52d/0x555 jbd2__journal_start+0xb4/0x237 __ext4_journal_start_sb+0x108/0x17e ext4_dirty_inode+0x32/0x61 __mark_inode_dirty+0x16b/0x60c iput+0x11e/0x274 __dentry_kill+0x148/0x1b8 shrink_dentry_list+0x274/0x44a prune_dcache_sb+0x4a/0x55 super_cache_scan+0xfc/0x176 shrink_slab.part.14.constprop.25+0x2a2/0x4d3 shrink_zone+0x74/0x140 kswapd+0x6b7/0x930 kthread+0x107/0x10f ret_from_fork+0x3f/0x70 irq event stamp: 138297 hardirqs last enabled at (138297): debug_check_no_locks_freed+0x113/0x12f hardirqs last disabled at (138296): debug_check_no_locks_freed+0x33/0x12f softirqs last enabled at (137818): __do_softirq+0x2d3/0x3e9 softirqs last disabled at (137813): irq_exit+0x41/0x95 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(jbd2_handle); <Interrupt> lock(jbd2_handle); *** DEADLOCK *** 5 locks held by git/20158: #0: (sb_writers#7){.+.+.+}, at: [<ffffffff81155411>] mnt_want_write+0x24/0x4b #1: (&type->i_mutex_dir_key#2/1){+.+.+.}, at: [<ffffffff81145087>] lock_rename+0xd9/0xe3 #2: (&sb->s_type->i_mutex_key#11){+.+.+.}, at: [<ffffffff8114f8e2>] lock_two_nondirectories+0x3f/0x6b #3: (&sb->s_type->i_mutex_key#11/4){+.+.+.}, at: [<ffffffff8114f909>] lock_two_nondirectories+0x66/0x6b #4: (jbd2_handle){+.+.?.}, at: [<ffffffff811e31db>] start_this_handle+0x4ca/0x555 stack backtrace: CPU: 2 PID: 20158 Comm: git Not tainted 4.1.0-rc7-next-20150615-dbg-00016-g8bdf555-dirty #211 Call Trace: dump_stack+0x4c/0x6e mark_lock+0x384/0x56d mark_held_locks+0x5f/0x76 lockdep_trace_alloc+0xb2/0xb5 kmem_cache_alloc_trace+0x32/0x1e2 zcomp_strm_alloc+0x25/0x73 [zram] zcomp_strm_multi_find+0xe7/0x173 [zram] zcomp_strm_find+0xc/0xe [zram] zram_bvec_rw+0x2ca/0x7e0 [zram] zram_make_request+0x1fa/0x301 [zram] generic_make_request+0x9c/0xdb submit_bio+0xf7/0x120 ext4_io_submit+0x2e/0x43 ext4_bio_write_page+0x1b7/0x300 mpage_submit_page+0x60/0x77 mpage_map_and_submit_buffers+0x10f/0x21d ext4_writepages+0xc8c/0xe1b do_writepages+0x23/0x2c __filemap_fdatawrite_range+0x84/0x8b filemap_flush+0x1c/0x1e ext4_alloc_da_blocks+0xb8/0x117 ext4_rename+0x132/0x6dc ? mark_held_locks+0x5f/0x76 ext4_rename2+0x29/0x2b vfs_rename+0x540/0x636 SyS_renameat2+0x359/0x44d SyS_rename+0x1e/0x20 entry_SYSCALL_64_fastpath+0x12/0x6f [minchan@kernel.org: add stable mark] Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Acked-by: Minchan Kim <minchan@kernel.org> Cc: Kyeongdon Kim <kyeongdon.kim@lge.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15Merge branch 'hisi-fixes'David S. Miller4-3/+24
Kejian Yan says: ==================== dts: hisi: fixes no syscon fault when init mdio This patchset fixes the bug that eth can't initial successful on hip05-D02 because the dts files doesn't match the source code. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-15net: hns: fixes no syscon error when init mdioyankejian1-1/+1
As dtsi files use the normal naming conventions using '-' instead of '_' inside of property names, the driver needs to update the phandle name strings of the of_parse_phandle func. Signed-off-by: Kejian Yan <yankejian@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-15dts: hisi: fixes no syscon fault when init mdioyankejian3-2/+23
When linux start up, we get the log below: "Hi-HNS_MDIO 803c0000.mdio: no syscon hisilicon,peri-c-subctrl mdio_bus mdio@803c0000: mdio sys ctl reg has not maped" The source code about the subctrl is dealt syscon, but dts doesn't. It cause such fault, so this patch adds the syscon info on dts files to fixes it. Signed-off-by: Kejian Yan <yankejian@huawei.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-15net: preserve IP control block during GSO segmentationKonstantin Khlebnikov5-5/+11
Skb_gso_segment() uses skb control block during segmentation. This patch adds 32-bytes room for previous control block which will be copied into all resulting segments. This patch fixes kernel crash during fragmenting forwarded packets. Fragmentation requires valid IP CB in skb for clearing ip options. Also patch removes custom save/restore in ovs code, now it's redundant. Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com> Link: http://lkml.kernel.org/r/CALYGNiP-0MZ-FExV2HutTvE9U-QQtkKSoE--KN=JQE5STYsjAA@mail.gmail.com Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-14Merge branch 'for-linus' of ↵Linus Torvalds22-29/+30
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial tree updates from Jiri Kosina. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: floppy: make local variable non-static exynos: fixes an incorrect header guard dt-bindings: fixes some incorrect header guards cpufreq-dt: correct dead link in documentation cpufreq: ARM big LITTLE: correct dead link in documentation treewide: Fix typos in printk Documentation: filesystem: Fix typo in fs/eventfd.c fs/super.c: use && instead of & for warn_on condition Documentation: fix sysfs-ptp lib: scatterlist: fix Kconfig description
2016-01-14Merge branch 'for-linus' of ↵Linus Torvalds18-427/+340
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching Pull livepatching updates from Jiri Kosina: - RO/NX attribute fixes for patch module relocations from Josh Poimboeuf. As part of this effort, module.c has been cleaned up as well and livepatching is piggy-backing on this cleanup. Rusty is OK with this whole lot going through livepatching tree. - symbol disambiguation support from Chris J Arges. That series is also Reviewed-by: Miroslav Benes <mbenes@suse.cz> but this came in only after I've alredy pushed out. Didn't want to rebase because of that, hence I am mentioning it here. - symbol lookup fix from Miroslav Benes * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching: livepatch: Cleanup module page permission changes module: keep percpu symbols in module's symtab module: clean up RO/NX handling. module: use a structure to encapsulate layout. gcov: use within_module() helper. module: Use the same logic for setting and unsetting RO/NX livepatch: function,sympos scheme in livepatch sysfs directory livepatch: add sympos as disambiguator field to klp_reloc livepatch: add old_sympos as disambiguator field to klp_func
2016-01-14Merge branch 'for-linus' of ↵Linus Torvalds35-574/+819
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid Pull HID updates from Jiri Kosina: - appoint Benjamin Tissoires as co-maintainer / designated reviewer - sysfs report_descriptor visibility fix for unclaimed devices, from Andy Lutomirski - suspend/resume fixes for Sony driver from Frank Praznik - IRQ deadlock fix from Ioan-Adrian Ratiu - hid-i2c fixes affecting (at least) Yoga 900 from Mika Westerberg and Srinivas Pandruvada - a lot of new device support (especially, but not limited to, Wacom) and assorted small misc fixes - almost complete G920 support; the only bit that is missing is switching the device to HID mode automatically; Simon Wood and Michal Maly are working on it. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (46 commits) Revert "INPUT: xpad: switch Logitech G920 Wheel into HID mode" HID: sensor-hub: Add quirk for Lenovo Yoga 900 with ITE Chips HID: Add new PID for Microchip Pick16F1454 HID: wacom: Use correct report to query pen ID from INTUOSHT2 devices HID: i2c-hid: Prevent sending reports from racing with device reset HID: use kobj_to_dev() HID: wiimote: use dev_to_wii() HID: add a new helper to_hid_driver() HID: use to_hid_device() HID: move to_hid_device() to hid.h HID: usbhid: use to_usb_device HID: corsair: Convert to use module_hid_driver HID: input: ignore the battery in OKLICK Laser BTmouse HID: wacom: Fix pad button range for CINTIQ_COMPANION_2 HID: wacom: Fix touchring value reporting HID: wacom: Report 'strip2' values in ABS_RY HID: wacom: Limit touchstrip data to 13 bits HID: wacom: bitwise vs logical ORs HID: wacom: Apply lowres quirk to BAMBOO_TOUCH devices HID: enable hid device to suspend/resume asynchronously ...
2016-01-14Merge tag 'nfs-for-4.5-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds36-519/+1480
Pull NFS client updates from Trond Myklebust: "Highlights include: Stable fixes: - Fix a regression in the SunRPC socket polling code - Fix the attribute cache revalidation code - Fix race in __update_open_stateid() - Fix an lo->plh_block_lgets imbalance in layoutreturn - Fix an Oopsable typo in ff_mirror_match_fh() Features: - pNFS layout recall performance improvements. - pNFS/flexfiles: Support server-supplied layoutstats sampling period Bugfixes + cleanups: - NFSv4: Don't perform cached access checks before we've OPENed the file - Fix starvation issues with background flushes - Reclaim writes should be flushed as unstable writes if there are already entries in the commit lists - Various bugfixes from Chuck to fix NFS/RDMA send queue ordering problems - Ensure that we propagate fatal layoutget errors back to the application - Fixes for sundry flexfiles layoutstats bugs - Fix files/flexfiles to not cache invalidated layouts in the DS commit buckets" * tag 'nfs-for-4.5-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (68 commits) NFS: Fix a compile warning about unused variable in nfs_generic_pg_pgios() NFSv4: Fix a compile warning about no prototype for nfs4_ioctl() NFS: Use wait_on_atomic_t() for unlock after readahead SUNRPC: Fixup socket wait for memory NFSv4.1/pNFS: Cleanup constify struct pnfs_layout_range arguments NFSv4.1/pnfs: Cleanup copying of pnfs_layout_range structures NFSv4.1/pNFS: Cleanup pnfs_mark_matching_lsegs_invalid() NFSv4.1/pNFS: Fix a race in initiate_file_draining() NFSv4.1/pNFS: pnfs_error_mark_layout_for_return() must always return layout NFSv4.1/pNFS: pnfs_mark_matching_lsegs_return() should set the iomode NFSv4.1/pNFS: Use nfs4_stateid_copy for copying stateids NFSv4.1/pNFS: Don't pass stateids by value to pnfs_send_layoutreturn() NFS: Relax requirements in nfs_flush_incompatible NFSv4.1/pNFS: Don't queue up a new commit if the layout segment is invalid NFS: Allow multiple commit requests in flight per file NFS/pNFS: Fix up pNFS write reschedule layering violations and bugs SUNRPC: Fix a missing break in rpc_anyaddr() pNFS/flexfiles: Fix an Oopsable typo in ff_mirror_match_fh() NFS: Fix attribute cache revalidation NFS: Ensure we revalidate attributes before using execute_ok() ...
2016-01-14Merge branch 'for-linus' of ↵Linus Torvalds3-5/+8
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fix from Al Viro: "Don't put symlink bodies in pagecache into highmem" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: Make sure that highmem pages are not added to symlink page cache
2016-01-14mm: add tracepoint for scanning pagesEbru Akagunduz2-32/+270
This patch series makes swapin readahead up to a certain number to gain more thp performance and adds tracepoint for khugepaged_scan_pmd, collapse_huge_page, __collapse_huge_page_isolate. This patch series was written to deal with programs that access most, but not all, of their memory after they get swapped out. Currently these programs do not get their memory collapsed into THPs after the system swapped their memory out, while they would get THPs before swapping happened. This patch series was tested with a test program, it allocates 400MB of memory, writes to it, and then sleeps. I force the system to swap out all. Afterwards, the test program touches the area by writing and leaves a piece of it without writing. This shows how much swap in readahead made by the patch. Test results: After swapped out ------------------------------------------------------------------- | Anonymous | AnonHugePages | Swap | Fraction | ------------------------------------------------------------------- With patch | 90076 kB | 88064 kB | 309928 kB | %99 | ------------------------------------------------------------------- Without patch | 194068 kB | 192512 kB | 205936 kB | %99 | ------------------------------------------------------------------- After swapped in ------------------------------------------------------------------- | Anonymous | AnonHugePages | Swap | Fraction | ------------------------------------------------------------------- With patch | 201408 kB | 198656 kB | 198596 kB | %98 | ------------------------------------------------------------------- Without patch | 292624 kB | 192512 kB | 107380 kB | %65 | ------------------------------------------------------------------- This patch (of 3): Using static tracepoints, data of functions is recorded. It is good to automatize debugging without doing a lot of changes in the source code. This patch adds tracepoint for khugepaged_scan_pmd, collapse_huge_page and __collapse_huge_page_isolate. [dan.carpenter@oracle.com: add a missing tab] Signed-off-by: Ebru Akagunduz <ebru.akagunduz@gmail.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Xie XiuQi <xiexiuqi@huawei.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Mel Gorman <mgorman@suse.de> Cc: David Rientjes <rientjes@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Hugh Dickins <hughd@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-14drivers/base/memory.c: fix kernel warning during memory hotplug on ppc64John Allen1-10/+6
Fix a bug where a kernel warning is triggered when performing a memory hotplug on ppc64. This warning may also occur on any architecture that uses the memory_probe_store interface. WARNING: at drivers/base/memory.c:200 CPU: 9 PID: 13042 Comm: systemd-udevd Not tainted 4.4.0-rc4-00113-g0bd0f1e-dirty #7 NIP [c00000000055e034] pages_correctly_reserved+0x134/0x1b0 LR [c00000000055e7f8] memory_subsys_online+0x68/0x140 Call Trace: memory_subsys_online+0x68/0x140 device_online+0xb4/0x120 store_mem_state+0xb0/0x180 dev_attr_store+0x34/0x60 sysfs_kf_write+0x64/0xa0 kernfs_fop_write+0x17c/0x1e0 __vfs_write+0x40/0x160 vfs_write+0xb8/0x200 SyS_write+0x60/0x110 system_call+0x38/0xd0 The warning is triggered because there is a udev rule that automatically tries to online memory after it has been added. The udev rule varies from distro to distro, but will generally look something like: SUBSYSTEM=="memory", ACTION=="add", ATTR{state}=="offline", ATTR{state}="online" On any architecture that uses memory_probe_store to reserve memory, the udev rule will be triggered after the first section of the block is reserved and will subsequently attempt to online the entire block, interrupting the memory reservation process and causing the warning. This patch modifies memory_probe_store to add a block of memory with a single call to add_memory as opposed to looping through and adding each section individually. A single call to add_memory is protected by the mem_hotplug mutex which will prevent the udev rule from onlining memory until the reservation of the entire block is complete. Signed-off-by: John Allen <jallen@linux.vnet.ibm.com> Acked-by: Dave Hansen <dave.hansen@intel.com> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-14mm/page_isolation: use macro to judge the alignmentNaoya Horiguchi1-2/+2
Signed-off-by: Wang Xiaoqiang <wangxq10@lzu.edu.cn> Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-14mm: fix noisy sparse warning in LIBCFS_ALLOC_PRE()Joshua Clayton1-1/+1
Running sparse on drivers/staging/lustre results in dozens of warnings: include/linux/gfp.h:281:41: warning: odd constant _Bool cast (400000 becomes 1) Use "!!" to explicitly convert to bool and get rid of the warning. Signed-off-by: Joshua Clayton <stillcompiling@gmail.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-14mm: rework virtual memory accountingKonstantin Khlebnikov9-58/+54
When inspecting a vague code inside prctl(PR_SET_MM_MEM) call (which testing the RLIMIT_DATA value to figure out if we're allowed to assign new @start_brk, @brk, @start_data, @end_data from mm_struct) it's been commited that RLIMIT_DATA in a form it's implemented now doesn't do anything useful because most of user-space libraries use mmap() syscall for dynamic memory allocations. Linus suggested to convert RLIMIT_DATA rlimit into something suitable for anonymous memory accounting. But in this patch we go further, and the changes are bundled together as: * keep vma counting if CONFIG_PROC_FS=n, will be used for limits * replace mm->shared_vm with better defined mm->data_vm * account anonymous executable areas as executable * account file-backed growsdown/up areas as stack * drop struct file* argument from vm_stat_account * enforce RLIMIT_DATA for size of data areas This way code looks cleaner: now code/stack/data classification depends only on vm_flags state: VM_EXEC & ~VM_WRITE -> code (VmExe + VmLib in proc) VM_GROWSUP | VM_GROWSDOWN -> stack (VmStk) VM_WRITE & ~VM_SHARED & !stack -> data (VmData) The rest (VmSize - VmData - VmStk - VmExe - VmLib) could be called "shared", but that might be strange beast like readonly-private or VM_IO area. - RLIMIT_AS limits whole address space "VmSize" - RLIMIT_STACK limits stack "VmStk" (but each vma individually) - RLIMIT_DATA now limits "VmData" Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Willy Tarreau <w@1wt.eu> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Kees Cook <keescook@google.com> Cc: Vladimir Davydov <vdavydov@virtuozzo.com> Cc: Pavel Emelyanov <xemul@virtuozzo.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-14include/linux/memblock.h: fix ordering of 'flags' argument in commentsFlorian Fainelli1-2/+2
for_each_free_mem_range() and for_each_free_mem_range_reverse() both accept a 'flags' argument, the comment surrounding the macro placed the 'flags' documentation at the very end, while 'flags' is in fact the 3rd argument to the macro, so let's preserve natural ordering here. Fixes: fc6daaf931518 ("mm/memblock: add extra "flags" to memblock to allow selection of memory based on attribute") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-14mm: move lru_to_page to mm_inline.hGeliang Tang3-2/+3
Move lru_to_page() from internal.h to mm_inline.h. Signed-off-by: Geliang Tang <geliangtang@163.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-14Documentation/filesystems: describe the shared memory usage/accountingRodrigo Freire2-4/+6
The Shared Memory accounting support is present in Kernel since commit 4b02108ac1b3 ("mm: oom analysis: add shmem vmstat") and in userland free(1) since 2014. This patch updates the Documentation to reflect this change. Signed-off-by: Rodrigo Freire <rfreire@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>