summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2017-07-12kexec: move vmcoreinfo out of the kernel's .bss sectionXunlei Pang8-21/+29
As Eric said, "what we need to do is move the variable vmcoreinfo_note out of the kernel's .bss section. And modify the code to regenerate and keep this information in something like the control page. Definitely something like this needs a page all to itself, and ideally far away from any other kernel data structures. I clearly was not watching closely the data someone decided to keep this silly thing in the kernel's .bss section." This patch allocates extra pages for these vmcoreinfo_XXX variables, one advantage is that it enhances some safety of vmcoreinfo, because vmcoreinfo now is kept far away from other kernel data structures. Link: http://lkml.kernel.org/r/1493281021-20737-1-git-send-email-xlpang@redhat.com Signed-off-by: Xunlei Pang <xlpang@redhat.com> Tested-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Reviewed-by: Juergen Gross <jgross@suse.com> Suggested-by: Eric Biederman <ebiederm@xmission.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Dave Young <dyoung@redhat.com> Cc: Hari Bathini <hbathini@linux.vnet.ibm.com> Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12kernel/fork.c: virtually mapped stacks: do not disable interruptsChristoph Lameter1-11/+5
The reason to disable interrupts seems to be to avoid switching to a different processor while handling per cpu data using individual loads and stores. If we use per cpu RMV primitives we will not have to disable interrupts. Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1705171055130.5898@east.gentwo.org Signed-off-by: Christoph Lameter <cl@linux.com> Cc: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12mm/memory.c: mark create_huge_pmd() inline to prevent build failureGeert Uytterhoeven1-1/+1
With gcc 4.1.2: mm/memory.o: In function `create_huge_pmd': memory.c:(.text+0x93e): undefined reference to `do_huge_pmd_anonymous_page' Interestingly, create_huge_pmd() is emitted in the assembler output, but never called. Converting transparent_hugepage_enabled() from a macro to a static inline function reduced the ability of the compiler to remove unused code. Fix this by marking create_huge_pmd() inline. Fixes: 16981d763501c0e0 ("mm: improve readability of transparent_hugepage_enabled()") Link: http://lkml.kernel.org/r/1499842660-10665-1-git-send-email-geert@linux-m68k.org Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12kernel.h: handle pointers to arrays better in container_of()Ian Abbott1-3/+7
If the first parameter of container_of() is a pointer to a non-const-qualified array type (and the third parameter names a non-const-qualified array member), the local variable __mptr will be defined with a const-qualified array type. In ISO C, these types are incompatible. They work as expected in GNU C, but some versions will issue warnings. For example, GCC 4.9 produces the warning "initialization from incompatible pointer type". Here is an example of where the problem occurs: ------------------------------------------------------- #include <linux/kernel.h> #include <linux/module.h> MODULE_LICENSE("GPL"); struct st { int a; char b[16]; }; static int __init example_init(void) { struct st t = { .a = 101, .b = "hello" }; char (*p)[16] = &t.b; struct st *x = container_of(p, struct st, b); printk(KERN_DEBUG "%p %p\n", (void *)&t, (void *)x); return 0; } static void __exit example_exit(void) { } module_init(example_init); module_exit(example_exit); ------------------------------------------------------- Building the module with gcc-4.9 results in these warnings (where '{m}' is the module source and '{k}' is the kernel source): ------------------------------------------------------- In file included from {m}/example.c:1:0: {m}/example.c: In function `example_init': {k}/include/linux/kernel.h:854:48: warning: initialization from incompatible pointer type const typeof( ((type *)0)->member ) *__mptr = (ptr); \ ^ {m}/example.c:14:17: note: in expansion of macro `container_of' struct st *x = container_of(p, struct st, b); ^ {k}/include/linux/kernel.h:854:48: warning: (near initialization for `x') const typeof( ((type *)0)->member ) *__mptr = (ptr); \ ^ {m}/example.c:14:17: note: in expansion of macro `container_of' struct st *x = container_of(p, struct st, b); ^ ------------------------------------------------------- Replace the type checking performed by the macro to avoid these warnings. Make sure `*(ptr)` either has type compatible with the member, or has type compatible with `void`, ignoring qualifiers. Raise compiler errors if this is not true. This is stronger than the previous behaviour, which only resulted in compiler warnings for a type mismatch. [arnd@arndb.de: fix new warnings for container_of()] Link: http://lkml.kernel.org/r/20170620200940.90557-1-arnd@arndb.de Link: http://lkml.kernel.org/r/20170525120316.24473-7-abbotti@mev.co.uk Signed-off-by: Ian Abbott <abbotti@mev.co.uk> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Michal Nazarewicz <mina86@mina86.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com> Cc: Borislav Petkov <bp@suse.de> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Johannes Berg <johannes.berg@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Alexander Potapenko <glider@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12include/linux/dcache.h: use unsigned chars in struct name_snapshotStephen Rothwell1-2/+2
"kernel.h: handle pointers to arrays better in container_of()" triggers: In file included from include/uapi/linux/stddef.h:1:0, from include/linux/stddef.h:4, from include/uapi/linux/posix_types.h:4, from include/uapi/linux/types.h:13, from include/linux/types.h:5, from include/linux/syscalls.h:71, from fs/dcache.c:17: fs/dcache.c: In function 'release_dentry_name_snapshot': include/linux/compiler.h:542:38: error: call to '__compiletime_assert_305' declared with attribute error: pointer type mismatch in container_of() _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__) ^ include/linux/compiler.h:525:4: note: in definition of macro '__compiletime_assert' prefix ## suffix(); \ ^ include/linux/compiler.h:542:2: note: in expansion of macro '_compiletime_assert' _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__) ^ include/linux/build_bug.h:46:37: note: in expansion of macro 'compiletime_assert' #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg) ^ include/linux/kernel.h:860:2: note: in expansion of macro 'BUILD_BUG_ON_MSG' BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) && \ ^ fs/dcache.c:305:7: note: in expansion of macro 'container_of' p = container_of(name->name, struct external_name, name[0]); Switch name_snapshot to use unsigned chars, matching struct qstr and struct external_name. Link: http://lkml.kernel.org/r/20170710152134.0f78c1e6@canb.auug.org.au Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12Merge branch 'i2c/for-4.13' of ↵Linus Torvalds41-2483/+4944
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c updates from Wolfram Sang: "This pull request contains: - i2c core reorganization. One source file became too monolithic. It is now split up, yet we still have the same named object as the final output. This should ease maintenance. - new drivers: ZTE ZX2967 family, ASPEED 24XX/25XX - designware driver gained slave mode support - xgene-slimpro driver gained ACPI support - bigger overhaul for pca-platform driver - the algo-bit module now supports messages with enforced STOP - slightly bigger than usual set of driver updates and improvements and with much appreciated quality assurance from Andy Shevchenko" * 'i2c/for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (51 commits) i2c: Provide a stub for i2c_detect_slave_mode() i2c: designware: Let slave adapter support be optional i2c: designware: Make HW init functions static i2c: designware: fix spelling mistakes i2c: pca-platform: propagate error from i2c_pca_add_numbered_bus i2c: pca-platform: correctly set algo_data.reset_chip i2c: acpi: Do not create i2c-clients for LNXVIDEO ACPI devices i2c: designware: enable SLAVE in platform module i2c: designware: add SLAVE mode functions i2c: zx2967: drop COMPILE_TEST dependency i2c: zx2967: always use the same device when printing errors i2c: pca-platform: use dev_warn/dev_info instead of printk i2c: pca-platform: use device managed allocations i2c: pca-platform: add devicetree awareness i2c: pca-platform: switch to struct gpio_desc dt-bindings: add bindings for i2c-pca-platform i2c: cadance: fix ctrl/addr reg write order i2c: zx2967: add i2c controller driver for ZTE's zx2967 family dt: bindings: add documentation for zx2967 family i2c controller i2c: algo-bit: add support for I2C_M_STOP ...
2017-07-12Merge tag 'iommu-updates-v4.13' of ↵Linus Torvalds22-548/+1233
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU updates from Joerg Roedel: "This update comes with: - Support for lockless operation in the ARM io-pgtable code. This is an important step to solve the scalability problems in the common dma-iommu code for ARM - Some Errata workarounds for ARM SMMU implemenations - Rewrite of the deferred IO/TLB flush code in the AMD IOMMU driver. The code suffered from very high flush rates, with the new implementation the flush rate is down to ~1% of what it was before - Support for amd_iommu=off when booting with kexec. The problem here was that the IOMMU driver bailed out early without disabling the iommu hardware, if it was enabled in the old kernel - The Rockchip IOMMU driver is now available on ARM64 - Align the return value of the iommu_ops->device_group call-backs to not miss error values - Preempt-disable optimizations in the Intel VT-d and common IOVA code to help Linux-RT - Various other small cleanups and fixes" * tag 'iommu-updates-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (60 commits) iommu/vt-d: Constify intel_dma_ops iommu: Warn once when device_group callback returns NULL iommu/omap: Return ERR_PTR in device_group call-back iommu: Return ERR_PTR() values from device_group call-backs iommu/s390: Use iommu_group_get_for_dev() in s390_iommu_add_device() iommu/vt-d: Don't disable preemption while accessing deferred_flush() iommu/iova: Don't disable preempt around this_cpu_ptr() iommu/arm-smmu-v3: Add workaround for Cavium ThunderX2 erratum #126 iommu/arm-smmu-v3: Enable ACPI based HiSilicon CMD_PREFETCH quirk(erratum 161010701) iommu/arm-smmu-v3: Add workaround for Cavium ThunderX2 erratum #74 ACPI/IORT: Fixup SMMUv3 resource size for Cavium ThunderX2 SMMUv3 model iommu/arm-smmu-v3, acpi: Add temporary Cavium SMMU-V3 IORT model number definitions iommu/io-pgtable-arm: Use dma_wmb() instead of wmb() when publishing table iommu/io-pgtable: depend on !GENERIC_ATOMIC64 when using COMPILE_TEST with LPAE iommu/arm-smmu-v3: Remove io-pgtable spinlock iommu/arm-smmu: Remove io-pgtable spinlock iommu/io-pgtable-arm-v7s: Support lockless operation iommu/io-pgtable-arm: Support lockless operation iommu/io-pgtable: Introduce explicit coherency iommu/io-pgtable-arm-v7s: Refactor split_blk_unmap ...
2017-07-12Merge branch 'overlayfs-linus' of ↵Linus Torvalds12-383/+1456
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs Pull overlayfs updates from Miklos Szeredi: "This work from Amir introduces the inodes index feature, which provides: - hardlinks are not broken on copy up - infrastructure for overlayfs NFS export This also fixes constant st_ino for samefs case for lower hardlinks" * 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: (33 commits) ovl: mark parent impure and restore timestamp on ovl_link_up() ovl: document copying layers restrictions with inodes index ovl: cleanup orphan index entries ovl: persistent overlay inode nlink for indexed inodes ovl: implement index dir copy up ovl: move copy up lock out ovl: rearrange copy up ovl: add flag for upper in ovl_entry ovl: use struct copy_up_ctx as function argument ovl: base tmpfile in workdir too ovl: factor out ovl_copy_up_inode() helper ovl: extract helper to get temp file in copy up ovl: defer upper dir lock to tempfile link ovl: hash overlay non-dir inodes by copy up origin ovl: cleanup bad and stale index entries on mount ovl: lookup index entry for copy up origin ovl: verify index dir matches upper dir ovl: verify upper root dir matches lower root dir ovl: introduce the inodes index dir feature ovl: generalize ovl_create_workdir() ...
2017-07-12fix a braino in compat_sys_getrlimit()Al Viro1-1/+1
Reported-and-tested-by: Meelis Roos <mroos@linux.ee> Fixes: commit d9e968cb9f84 "getrlimit()/setrlimit(): move compat to native" Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparcLinus Torvalds7-20/+57
Pull sparc fixes from David Miller: - Fix symbol version generation for assembler on sparc, from Nagarathnam Muthusamy. - Fix compound page handling in gup_huge_pmd(), from Nitin Gupta. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sparc64: Fix gup_huge_pmd Adding the type of exported symbols sed regex in Makefile.build requires line break between exported symbols Adding asm-prototypes.h for genksyms to generate crc
2017-07-11Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds40-440/+600
Pull more block updates from Jens Axboe: "This is a followup for block changes, that didn't make the initial pull request. It's a bit of a mixed bag, this contains: - A followup pull request from Sagi for NVMe. Outside of fixups for NVMe, it also includes a series for ensuring that we properly quiesce hardware queues when browsing live tags. - Set of integrity fixes from Dmitry (mostly), fixing various issues for folks using DIF/DIX. - Fix for a bug introduced in cciss, with the req init changes. From Christoph. - Fix for a bug in BFQ, from Paolo. - Two followup fixes for lightnvm/pblk from Javier. - Depth fix from Ming for blk-mq-sched. - Also from Ming, performance fix for mtip32xx that was introduced with the dynamic initialization of commands" * 'for-linus' of git://git.kernel.dk/linux-block: (44 commits) block: call bio_uninit in bio_endio nvmet: avoid unneeded assignment of submit_bio return value nvme-pci: add module parameter for io queue depth nvme-pci: compile warnings in nvme_alloc_host_mem() nvmet_fc: Accept variable pad lengths on Create Association LS nvme_fc/nvmet_fc: revise Create Association descriptor length lightnvm: pblk: remove unnecessary checks lightnvm: pblk: control I/O flow also on tear down cciss: initialize struct scsi_req null_blk: fix error flow for shared tags during module_init block: Fix __blkdev_issue_zeroout loop nvme-rdma: unconditionally recycle the request mr nvme: split nvme_uninit_ctrl into stop and uninit virtio_blk: quiesce/unquiesce live IO when entering PM states mtip32xx: quiesce request queues to make sure no submissions are inflight nbd: quiesce request queues to make sure no submissions are inflight nvme: kick requeue list when requeueing a request instead of when starting the queues nvme-pci: quiesce/unquiesce admin_q instead of start/stop its hw queues nvme-loop: quiesce/unquiesce admin_q instead of start/stop its hw queues nvme-fc: quiesce/unquiesce admin_q instead of start/stop its hw queues ...
2017-07-11Merge tag 'smb3-security-fixes-for-4.13' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds15-179/+180
Pull cifs fixes and sane default from Steve French: "Upgrade default dialect to more secure SMB3 from older cifs dialect" * tag 'smb3-security-fixes-for-4.13' of git://git.samba.org/sfrench/cifs-2.6: cifs: Clean up unused variables in smb2pdu.c [SMB3] Improve security, move default dialect to SMB3 from old CIFS [SMB3] Remove ifdef since SMB3 (and later) now STRONGLY preferred CIFS: Reconnect expired SMB sessions CIFS: Display SMB2 error codes in the hex format cifs: Use smb 2 - 3 and cifsacl mount options setacl function cifs: prototype declaration and definition to set acl for smb 2 - 3 and cifsacl mount options
2017-07-11Merge tag 'ceph-for-4.13-rc1' of git://github.com/ceph/ceph-clientLinus Torvalds28-476/+2308
Pull ceph updates from Ilya Dryomov: "The main item here is support for v12.y.z ("Luminous") clusters: RESEND_ON_SPLIT, RADOS_BACKOFF, OSDMAP_PG_UPMAP and CRUSH_CHOOSE_ARGS feature bits, and various other changes in the RADOS client protocol. On top of that we have a new fsc mount option to allow supplying fscache uniquifier (similar to NFS) and the usual pile of filesystem fixes from Zheng" * tag 'ceph-for-4.13-rc1' of git://github.com/ceph/ceph-client: (44 commits) libceph: advertise support for NEW_OSDOP_ENCODING and SERVER_LUMINOUS libceph: osd_state is 32 bits wide in luminous crush: remove an obsolete comment crush: crush_init_workspace starts with struct crush_work libceph, crush: per-pool crush_choose_arg_map for crush_do_rule() crush: implement weight and id overrides for straw2 libceph: apply_upmap() libceph: compute actual pgid in ceph_pg_to_up_acting_osds() libceph: pg_upmap[_items] infrastructure libceph: ceph_decode_skip_* helpers libceph: kill __{insert,lookup,remove}_pg_mapping() libceph: introduce and switch to decode_pg_mapping() libceph: don't pass pgid by value libceph: respect RADOS_BACKOFF backoffs libceph: make DEFINE_RB_* helpers more general libceph: avoid unnecessary pi lookups in calc_target() libceph: use target pi for calc_target() calculations libceph: always populate t->target_{oid,oloc} in calc_target() libceph: make sure need_resend targets reflect latest map libceph: delete from need_resend_linger before check_linger_pool_dne() ...
2017-07-11Merge git://www.linux-watchdog.org/linux-watchdogLinus Torvalds26-646/+1077
Pull watchdog updates from Wim Van Sebroeck: - Add Renesas RZ/A WDT Watchdog driver - STM32 Independent WatchDoG (IWDG) support - UniPhier watchdog support - Add F71868 support - Add support for NCT6793D and NCT6795D - dw_wdt: add reset lines support - core: add option to avoid early handling of watchdog - core: introduce watchdog_worker_should_ping helper - Cleanups and improvements for sama5d4, intel-mid_wdt, s3c2410_wdt, orion_wdt, gpio_wdt, it87_wdt, meson_wdt, davinci_wdt, bcm47xx_wdt, zx2967_wdt, cadence_wdt * git://www.linux-watchdog.org/linux-watchdog: (32 commits) watchdog: introduce watchdog_worker_should_ping helper watchdog: uniphier: add UniPhier watchdog driver dt-bindings: watchdog: add description for UniPhier WDT controller watchdog: cadence_wdt: make of_device_ids const. watchdog: zx2967: constify zx2967_wdt_ops. watchdog: bcm47xx_wdt: constify bcm47xx_wdt_hard_ops and bcm47xx_wdt_soft_ops watchdog: davinci: Add missing clk_disable_unprepare(). watchdog: davinci: Handle return value of clk_prepare_enable watchdog: meson: Handle return value of clk_prepare_enable watchdog: it87: Add support for various Super-IO chips watchdog: it87: Use infrastructure to stop watchdog on reboot watchdog: it87: Drop support for resetting watchdog though CIR and Game port watchdog: it87: Convert to use watchdog core infrastructure watchdog: it87: Drop FSF mailing address watchdog: dw_wdt: get reset lines from dt watchdog: bindings: dw_wdt: add reset lines watchdog: w83627hf: Add support for NCT6793D and NCT6795D watchdog: core: add option to avoid early handling of watchdog watchdog: f71808e_wdt: Add F71868 support watchdog: Add STM32 IWDG driver ...
2017-07-11Merge tag 'chrome-platform-for-linus-4.13' of ↵Linus Torvalds17-84/+1393
git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform Pull chrome platform updates from Benson Leung: "Changes in this pull request are around catching up cros_ec with the internal chromeos-kernel versions of cros_ec, cros_ec_lpc, and cros_ec_lightbar. Also, switching maintainership from olof to bleung" * tag 'chrome-platform-for-linus-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform: platform/chrome : Add myself as Maintainer platform/chrome: cros_ec_lightbar - hide unused PM functions cros_ec: Don't signal wake event for non-wake host events cros_ec: Fix deadlock when EC is not responsive at probe cros_ec: Don't return error when checking command version platform/chrome: cros_ec_lightbar - Avoid I2C xfer to EC during suspend platform/chrome: cros_ec_lightbar - Add userspace lightbar control bit to EC platform/chrome: cros_ec_lightbar - Control of suspend/resume lightbar sequence platform/chrome: cros_ec_lightbar - Add lightbar program feature to sysfs platform/chrome: cros_ec_lpc: Add MKBP events support over ACPI platform/chrome: cros_ec_lpc: Add power management ops platform/chrome: cros_ec_lpc: Add support for GOOG004 ACPI device platform/chrome: cros_ec_lpc: Add support for mec1322 EC platform/chrome: cros_ec_lpc: Add R/W helpers to LPC protocol variants mfd: cros_ec: Add support for dumping panic information cros_ec_debugfs: Pass proper struct sizes to cros_ec_cmd_xfer() mfd: cros_ec: add debugfs, console log file mfd: cros_ec: Add EC console read structures definitions mfd: cros_ec: Add helper for event notifier.
2017-07-11Merge branch 'for-next' of ↵Linus Torvalds6-6/+0
git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu Pull x86nommu update from Greg Ungerer: "Only a single change, to remove old Kconfig options from defconfigs" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu: m68k: defconfig: Cleanup from old Kconfig options
2017-07-10Merge branch 'akpm' (patches from Andrew)Linus Torvalds103-1249/+1538
Merge more updates from Andrew Morton: - most of the rest of MM - KASAN updates - lib/ updates - checkpatch updates - some binfmt_elf changes - various misc bits * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (115 commits) kernel/exit.c: avoid undefined behaviour when calling wait4() kernel/signal.c: avoid undefined behaviour in kill_something_info binfmt_elf: safely increment argv pointers s390: reduce ELF_ET_DYN_BASE powerpc: move ELF_ET_DYN_BASE to 4GB / 4MB arm64: move ELF_ET_DYN_BASE to 4GB / 4MB arm: move ELF_ET_DYN_BASE to 4MB binfmt_elf: use ELF_ET_DYN_BASE only for PIE fs, epoll: short circuit fetching events if thread has been killed checkpatch: improve multi-line alignment test checkpatch: improve macro reuse test checkpatch: change format of --color argument to --color[=WHEN] checkpatch: silence perl 5.26.0 unescaped left brace warnings checkpatch: improve tests for multiple line function definitions checkpatch: remove false warning for commit reference checkpatch: fix stepping through statements with $stat and ctx_statement_block checkpatch: [HLP]LIST_HEAD is also declaration checkpatch: warn when a MAINTAINERS entry isn't [A-Z]:\t checkpatch: improve the unnecessary OOM message test lib/bsearch.c: micro-optimize pivot position calculation ...
2017-07-10kernel/exit.c: avoid undefined behaviour when calling wait4()zhongjiang1-0/+4
wait4(-2147483648, 0x20, 0, 0xdd0000) triggers: UBSAN: Undefined behaviour in kernel/exit.c:1651:9 The related calltrace is as follows: negation of -2147483648 cannot be represented in type 'int': CPU: 9 PID: 16482 Comm: zj Tainted: G B ---- ------- 3.10.0-327.53.58.71.x86_64+ #66 Hardware name: Huawei Technologies Co., Ltd. Tecal RH2285 /BC11BTSA , BIOS CTSAV036 04/27/2011 Call Trace: dump_stack+0x19/0x1b ubsan_epilogue+0xd/0x50 __ubsan_handle_negate_overflow+0x109/0x14e SyS_wait4+0x1cb/0x1e0 system_call_fastpath+0x16/0x1b Exclude the overflow to avoid the UBSAN warning. Link: http://lkml.kernel.org/r/1497264618-20212-1-git-send-email-zhongjiang@huawei.com Signed-off-by: zhongjiang <zhongjiang@huawei.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Xishi Qiu <qiuxishi@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-10kernel/signal.c: avoid undefined behaviour in kill_something_infozhongjiang1-0/+4
When running kill(72057458746458112, 0) in userspace I hit the following issue. UBSAN: Undefined behaviour in kernel/signal.c:1462:11 negation of -2147483648 cannot be represented in type 'int': CPU: 226 PID: 9849 Comm: test Tainted: G B ---- ------- 3.10.0-327.53.58.70.x86_64_ubsan+ #116 Hardware name: Huawei Technologies Co., Ltd. RH8100 V3/BC61PBIA, BIOS BLHSV028 11/11/2014 Call Trace: dump_stack+0x19/0x1b ubsan_epilogue+0xd/0x50 __ubsan_handle_negate_overflow+0x109/0x14e SYSC_kill+0x43e/0x4d0 SyS_kill+0xe/0x10 system_call_fastpath+0x16/0x1b Add code to avoid the UBSAN detection. [akpm@linux-foundation.org: tweak comment] Link: http://lkml.kernel.org/r/1496670008-59084-1-git-send-email-zhongjiang@huawei.com Signed-off-by: zhongjiang <zhongjiang@huawei.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Xishi Qiu <qiuxishi@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-10binfmt_elf: safely increment argv pointersKees Cook1-11/+9
When building the argv/envp pointers, the envp is needlessly pre-incremented instead of just continuing after the argv pointers are finished. In some (likely impossible) race where the strings could be changed from userspace between copy_strings() and here, it might be possible to confuse the envp position. Instead, just use sp like everything else. Link: http://lkml.kernel.org/r/20170622173838.GA43308@beast Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Rik van Riel <riel@redhat.com> Cc: Daniel Micay <danielmicay@gmail.com> Cc: Qualys Security Advisory <qsa@qualys.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Dmitry Safonov <dsafonov@virtuozzo.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-10s390: reduce ELF_ET_DYN_BASEKees Cook1-8/+7
Now that explicitly executed loaders are loaded in the mmap region, we have more freedom to decide where we position PIE binaries in the address space to avoid possible collisions with mmap or stack regions. For 64-bit, align to 4GB to allow runtimes to use the entire 32-bit address space for 32-bit pointers. On 32-bit use 4MB, which is the traditional x86 minimum load location, likely to avoid historically requiring a 4MB page table entry when only a portion of the first 4MB would be used (since the NULL address is avoided). For s390 the position could be 0x10000, but that is needlessly close to the NULL address. Link: http://lkml.kernel.org/r/1498154792-49952-5-git-send-email-keescook@chromium.org Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Pratyush Anand <panand@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-10powerpc: move ELF_ET_DYN_BASE to 4GB / 4MBKees Cook1-6/+7
Now that explicitly executed loaders are loaded in the mmap region, we have more freedom to decide where we position PIE binaries in the address space to avoid possible collisions with mmap or stack regions. For 64-bit, align to 4GB to allow runtimes to use the entire 32-bit address space for 32-bit pointers. On 32-bit use 4MB, which is the traditional x86 minimum load location, likely to avoid historically requiring a 4MB page table entry when only a portion of the first 4MB would be used (since the NULL address is avoided). Link: http://lkml.kernel.org/r/1498154792-49952-4-git-send-email-keescook@chromium.org Signed-off-by: Kees Cook <keescook@chromium.org> Tested-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Pratyush Anand <panand@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-10arm64: move ELF_ET_DYN_BASE to 4GB / 4MBKees Cook1-6/+6
Now that explicitly executed loaders are loaded in the mmap region, we have more freedom to decide where we position PIE binaries in the address space to avoid possible collisions with mmap or stack regions. For 64-bit, align to 4GB to allow runtimes to use the entire 32-bit address space for 32-bit pointers. On 32-bit use 4MB, to match ARM. This could be 0x8000, the standard ET_EXEC load address, but that is needlessly close to the NULL address, and anyone running arm compat PIE will have an MMU, so the tight mapping is not needed. Link: http://lkml.kernel.org/r/1498251600-132458-4-git-send-email-keescook@chromium.org Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-10arm: move ELF_ET_DYN_BASE to 4MBKees Cook1-6/+2
Now that explicitly executed loaders are loaded in the mmap region, we have more freedom to decide where we position PIE binaries in the address space to avoid possible collisions with mmap or stack regions. 4MB is chosen here mainly to have parity with x86, where this is the traditional minimum load location, likely to avoid historically requiring a 4MB page table entry when only a portion of the first 4MB would be used (since the NULL address is avoided). For ARM the position could be 0x8000, the standard ET_EXEC load address, but that is needlessly close to the NULL address, and anyone running PIE on 32-bit ARM will have an MMU, so the tight mapping is not needed. Link: http://lkml.kernel.org/r/1498154792-49952-2-git-send-email-keescook@chromium.org Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Pratyush Anand <panand@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Daniel Micay <danielmicay@gmail.com> Cc: Dmitry Safonov <dsafonov@virtuozzo.com> Cc: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Qualys Security Advisory <qsa@qualys.com> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-10binfmt_elf: use ELF_ET_DYN_BASE only for PIEKees Cook2-14/+58
The ELF_ET_DYN_BASE position was originally intended to keep loaders away from ET_EXEC binaries. (For example, running "/lib/ld-linux.so.2 /bin/cat" might cause the subsequent load of /bin/cat into where the loader had been loaded.) With the advent of PIE (ET_DYN binaries with an INTERP Program Header), ELF_ET_DYN_BASE continued to be used since the kernel was only looking at ET_DYN. However, since ELF_ET_DYN_BASE is traditionally set at the top 1/3rd of the TASK_SIZE, a substantial portion of the address space is unused. For 32-bit tasks when RLIMIT_STACK is set to RLIM_INFINITY, programs are loaded above the mmap region. This means they can be made to collide (CVE-2017-1000370) or nearly collide (CVE-2017-1000371) with pathological stack regions. Lowering ELF_ET_DYN_BASE solves both by moving programs below the mmap region in all cases, and will now additionally avoid programs falling back to the mmap region by enforcing MAP_FIXED for program loads (i.e. if it would have collided with the stack, now it will fail to load instead of falling back to the mmap region). To allow for a lower ELF_ET_DYN_BASE, loaders (ET_DYN without INTERP) are loaded into the mmap region, leaving space available for either an ET_EXEC binary with a fixed location or PIE being loaded into mmap by the loader. Only PIE programs are loaded offset from ELF_ET_DYN_BASE, which means architectures can now safely lower their values without risk of loaders colliding with their subsequently loaded programs. For 64-bit, ELF_ET_DYN_BASE is best set to 4GB to allow runtimes to use the entire 32-bit address space for 32-bit pointers. Thanks to PaX Team, Daniel Micay, and Rik van Riel for inspiration and suggestions on how to implement this solution. Fixes: d1fd836dcf00 ("mm: split ET_DYN ASLR from mmap ASLR") Link: http://lkml.kernel.org/r/20170621173201.GA114489@beast Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Rik van Riel <riel@redhat.com> Cc: Daniel Micay <danielmicay@gmail.com> Cc: Qualys Security Advisory <qsa@qualys.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Dmitry Safonov <dsafonov@virtuozzo.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Pratyush Anand <panand@redhat.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-10fs, epoll: short circuit fetching events if thread has been killedDavid Rientjes1-0/+10
We've encountered zombies that are waiting for a thread to exit that are looping in ep_poll() almost endlessly although there is a pending SIGKILL as a result of a group exit. This happens because we always find ep_events_available() and fetch more events and never are able to check for signal_pending() that would break from the loop and return -EINTR. Special case fatal signals and break immediately to guarantee that we loop to fetch more events and delay making a timely exit. It would also be possible to simply move the check for signal_pending() higher than checking for ep_events_available(), but there have been no reports of delayed signal handling other than SIGKILL preventing zombies from exiting that would be fixed by this. It fixes an issue for us where we have witnessed zombies sticking around for at least O(minutes), but considering the code has been like this forever and nobody else has complained that I have found, I would simply queue it up for 4.12. Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1705031722350.76784@chino.kir.corp.google.com Signed-off-by: David Rientjes <rientjes@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Jan Kara <jack@suse.cz> Cc: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-10checkpatch: improve multi-line alignment testJoe Perches1-1/+1
The current test fails to warn about improper alignment with code like foo->bar = func(arg1, arg2); because foo->bar is not a single identifier. Convert the $Ident to $Lval which allows for multiple dereferences. Link: http://lkml.kernel.org/r/01c35b9b6a12a415e57746d45d589bfaad39952a.1498841563.git.joe@perches.com Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-10checkpatch: improve macro reuse testJoe Perches1-6/+6
checkpatch reports a false positive when using token pasting argument multiple times in a macro. Fix it. Miscellanea: o Make the $tmp variable name used in the macro argument tests a bit more descriptive Link: http://lkml.kernel.org/r/cf434ae7602838388c7cb49d42bca93ab88527e7.1498483044.git.joe@perches.com Signed-off-by: Joe Perches <joe@perches.com> Reported-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-10checkpatch: change format of --color argument to --color[=WHEN]John Brooks1-6/+29
The boolean --color argument did not offer the ability to force colourized output even if stdout is not a terminal. Change the format of the argument to the familiar --color[=WHEN] construct as seen in common Linux utilities such as git, ls and dmesg, which allows the user to specify whether to colourize output "always", "never", or "auto" when the output is a terminal. The default is "auto". The old command-line uses of --color and --no-color are unchanged. Link: http://lkml.kernel.org/r/efe43bdbad400f39ba691ae663044462493b0773.1496799721.git.joe@perches.com Signed-off-by: John Brooks <john@fastquake.com> Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-10checkpatch: silence perl 5.26.0 unescaped left brace warningsCyril Bur1-3/+3
As of perl 5, version 26, subversion 0 (v5.26.0) some new warnings have occurred when running checkpatch. Unescaped left brace in regex is deprecated here (and will be fatal in Perl 5.30), passed through in regex; marked by <-- HERE in m/^(.\s*){ <-- HERE \s*/ at scripts/checkpatch.pl line 3544. Unescaped left brace in regex is deprecated here (and will be fatal in Perl 5.30), passed through in regex; marked by <-- HERE in m/^(.\s*){ <-- HERE \s*/ at scripts/checkpatch.pl line 3885. Unescaped left brace in regex is deprecated here (and will be fatal in Perl 5.30), passed through in regex; marked by <-- HERE in m/^(\+.*(?:do|\))){ <-- HERE / at scripts/checkpatch.pl line 4374. It seems perfectly reasonable to do as the warning suggests and simply escape the left brace in these three locations. Link: http://lkml.kernel.org/r/20170607060135.17384-1-cyrilbur@gmail.com Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Acked-by: Joe Perches <joe@perches.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-10checkpatch: improve tests for multiple line function definitionsJoe Perches1-1/+25
Add a block that identifies multiple line function definitions. Save the function name into $context_function to improve the embedded function name test. Look for misplaced open brace on the function definition. Emit an OPEN_BRACE error when the function definition is similar to void foo(int arg1, int arg2) { Miscellanea: o Remove the $realfile test in function declaration w/o named arguments test o Comment the function declaration w/o named arguments test Link: http://lkml.kernel.org/r/de620ed6ebab75fdfa323741ada2134a0f545892.1496835238.git.joe@perches.com Signed-off-by: Joe Perches <joe@perches.com> Tested-by: David Kershner <david.kershner@unisys.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-10checkpatch: remove false warning for commit referenceHeinrich Schuchardt1-1/+3
Checkpatch warns of an incorrect commit reference style for any hexadecimal number of 12 digits and more. Numbers of 12 digits are not necessarily commit ids. For an example provoking the problem see https://patchwork.kernel.org/patch/9170897/ Checkpatch should only warn if the number refers to an existing commit. Link: http://lkml.kernel.org/r/20170607184008.5869-1-xypron.glpk@gmx.de Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de> Acked-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-10checkpatch: fix stepping through statements with $stat and ctx_statement_blockJoe Perches1-1/+1
Fix the off-by-one in the suppression of lines in a statement block. This means that for multiple line statements like foo(bar, baz, qux); $stat has been inspected first correctly for the entire statement, and subsequently incorrectly just for qux); This fix will help make tracking appropriate indentation a little easier. Link: http://lkml.kernel.org/r/71b25979c90412133c717084036c9851cd2b7bcb.1496862585.git.joe@perches.com Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-10checkpatch: [HLP]LIST_HEAD is also declarationSteffen Maier1-1/+1
Fixes the following false warning among others for LLIST_HEAD and PLIST_HEAD: WARNING: Missing a blank line after declarations #71: FILE: drivers/s390/scsi/zfcp_fsf.c:422: + struct hlist_node *tmp; + HLIST_HEAD(remove_queue); Link: http://lkml.kernel.org/r/20170614133512.89425-1-maier@linux.vnet.ibm.com Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Acked-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-10checkpatch: warn when a MAINTAINERS entry isn't [A-Z]:\tJoe Perches1-0/+11
For consistency, MAINTAINERS entries should be an upper case letter, then a colon, then a tab, then the value. Warn when an entry doesn't have this form. --fix it too. Link: http://lkml.kernel.org/r/9aaaf03ec10adf3888b5e98dd2176b7fe9b5fad8.1496343345.git.joe@perches.com Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-10checkpatch: improve the unnecessary OOM message testJoe Perches1-1/+1
Use the context around a patch to avoid missing some candidates. Link: http://lkml.kernel.org/r/865e874fbae5decc331a849bd8d71c325db6bc80.1496343345.git.joe@perches.com Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-10lib/bsearch.c: micro-optimize pivot position calculationSergey Senozhatsky1-10/+12
There is a slightly faster way (in terms of the number of instructions being used) to calculate the position of a middle element, preserving integer overflow safeness. ./scripts/bloat-o-meter lib/bsearch.o.old lib/bsearch.o.new add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-24 (-24) function old new delta bsearch 122 98 -24 TEST INT array of size 100001, elements [0..100000]. gcc 7.1, Os, x86_64. a) bsearch() of existing key "100001 - 2": BASE ==== $ perf stat ./a.out Performance counter stats for './a.out': 619.445196 task-clock:u (msec) # 0.999 CPUs utilized 0 context-switches:u # 0.000 K/sec 0 cpu-migrations:u # 0.000 K/sec 133 page-faults:u # 0.215 K/sec 1,949,517,279 cycles:u # 3.147 GHz (83.06%) 181,017,938 stalled-cycles-frontend:u # 9.29% frontend cycles idle (83.05%) 82,959,265 stalled-cycles-backend:u # 4.26% backend cycles idle (67.02%) 4,355,706,383 instructions:u # 2.23 insn per cycle # 0.04 stalled cycles per insn (83.54%) 1,051,539,242 branches:u # 1697.550 M/sec (83.54%) 15,263,381 branch-misses:u # 1.45% of all branches (83.43%) 0.620082548 seconds time elapsed PATCHED ======= $ perf stat ./a.out Performance counter stats for './a.out': 475.097316 task-clock:u (msec) # 0.999 CPUs utilized 0 context-switches:u # 0.000 K/sec 0 cpu-migrations:u # 0.000 K/sec 135 page-faults:u # 0.284 K/sec 1,487,467,717 cycles:u # 3.131 GHz (82.95%) 186,537,162 stalled-cycles-frontend:u # 12.54% frontend cycles idle (82.93%) 28,797,869 stalled-cycles-backend:u # 1.94% backend cycles idle (67.10%) 3,807,564,203 instructions:u # 2.56 insn per cycle # 0.05 stalled cycles per insn (83.57%) 1,049,344,291 branches:u # 2208.693 M/sec (83.60%) 5,485 branch-misses:u # 0.00% of all branches (83.58%) 0.475760235 seconds time elapsed b) bsearch() of un-existing key "100001 + 2": BASE ==== $ perf stat ./a.out Performance counter stats for './a.out': 499.244480 task-clock:u (msec) # 0.999 CPUs utilized 0 context-switches:u # 0.000 K/sec 0 cpu-migrations:u # 0.000 K/sec 132 page-faults:u # 0.264 K/sec 1,571,194,855 cycles:u # 3.147 GHz (83.18%) 13,450,980 stalled-cycles-frontend:u # 0.86% frontend cycles idle (83.18%) 21,256,072 stalled-cycles-backend:u # 1.35% backend cycles idle (66.78%) 4,171,197,909 instructions:u # 2.65 insn per cycle # 0.01 stalled cycles per insn (83.68%) 1,009,175,281 branches:u # 2021.405 M/sec (83.79%) 3,122 branch-misses:u # 0.00% of all branches (83.37%) 0.499871144 seconds time elapsed PATCHED ======= $ perf stat ./a.out Performance counter stats for './a.out': 399.023499 task-clock:u (msec) # 0.998 CPUs utilized 0 context-switches:u # 0.000 K/sec 0 cpu-migrations:u # 0.000 K/sec 134 page-faults:u # 0.336 K/sec 1,245,793,991 cycles:u # 3.122 GHz (83.39%) 11,529,273 stalled-cycles-frontend:u # 0.93% frontend cycles idle (83.46%) 12,116,311 stalled-cycles-backend:u # 0.97% backend cycles idle (66.92%) 3,679,710,005 instructions:u # 2.95 insn per cycle # 0.00 stalled cycles per insn (83.47%) 1,009,792,625 branches:u # 2530.660 M/sec (83.46%) 2,590 branch-misses:u # 0.00% of all branches (83.12%) 0.399733539 seconds time elapsed Link: http://lkml.kernel.org/r/20170607150457.5905-1-sergey.senozhatsky@gmail.com Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-10lib/extable.c: use bsearch() library function in search_extable()Thomas Meyer8-56/+63
[thomas@m3y3r.de: v3: fix arch specific implementations] Link: http://lkml.kernel.org/r/1497890858.12931.7.camel@m3y3r.de Signed-off-by: Thomas Meyer <thomas@m3y3r.de> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-10lib/rhashtable.c: use kvzalloc() in bucket_table_alloc() when possibleMichal Hocko1-4/+3
bucket_table_alloc() can be currently called with GFP_KERNEL or GFP_ATOMIC. For the former we basically have an open coded kvzalloc() while the later only uses kzalloc(). Let's simplify the code a bit by the dropping the open coded path and replace it with kvzalloc(). Link: http://lkml.kernel.org/r/20170531155145.17111-3-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Cc: Thomas Graf <tgraf@suug.ch> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-10lib/interval_tree_test.c: allow full tree searchDavidlohr Bueso1-5/+10
... such that a user can specify visiting all the nodes in the tree (intersects with the world). This is a nice opposite from the very basic default query which is a single point. Link: http://lkml.kernel.org/r/20170518174936.20265-5-dave@stgolabs.net Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-10lib/interval_tree_test.c: allow users to limit scope of endpointDavidlohr Bueso1-10/+13
Add a 'max_endpoint' parameter such that users may easily limit the size of the intervals that are randomly generated. Link: http://lkml.kernel.org/r/20170518174936.20265-4-dave@stgolabs.net Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-10lib/interval_tree_test.c: make test options module parametersDavidlohr Bueso1-17/+40
Allows for more flexible debugging. Link: http://lkml.kernel.org/r/20170518174936.20265-3-dave@stgolabs.net Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-10lib/interval_tree_test.c: allow the module to be compiled-inDavidlohr Bueso1-1/+1
Patch series "lib/interval_tree_test: some debugging improvements". Here are some patches that update the interval_tree_test module allowing users to pass finer grained options to run the actual test. This patch (of 4): It is a tristate after all, and also serves well for quick debugging. Link: http://lkml.kernel.org/r/20170518174936.20265-2-dave@stgolabs.net Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-10lib/kstrtox.c: use "unsigned int" moreAlexey Dobriyan1-4/+6
gcc does generates stupid code sign extending data back and forth. Help by using "unsigned int". add/remove: 0/0 grow/shrink: 0/3 up/down: 0/-61 (-61) function old new delta _parse_integer 128 123 -5 It _still_ does generate useless MOVSX but I don't know how to delete it: 0000000000000070 <_parse_integer>: ... a0: 89 c2 mov edx,eax a2: 83 e8 30 sub eax,0x30 a5: 83 f8 09 cmp eax,0x9 a8: 76 11 jbe bb <_parse_integer+0x4b> aa: 83 ca 20 or edx,0x20 ad: 0f be c2 ===> movsx eax,dl <=== useless b0: 8d 50 9f lea edx,[rax-0x61] b3: 83 fa 05 cmp edx,0x5 Patch also helps on embedded archs which generally only like "int". On arm "and 0xff" is generated which is waste because all values used in comparisons are positive. Link: http://lkml.kernel.org/r/20170514194720.GB32563@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-10lib/kstrtox.c: delete end-of-string testAlexey Dobriyan1-1/+1
Standard "while (*s)" test is unnecessary because NUL won't pass valid-digit test anyway. Save one branch per parsed character. Link: http://lkml.kernel.org/r/20170514193756.GA32563@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-10bitmap: use memcmp optimisation in more situationsMatthew Wilcox2-3/+2
Commit 7dd968163f7c ("bitmap: bitmap_equal memcmp optimization") was rather more restrictive than necessary; we can use memcmp() to implement bitmap_equal() as long as the number of bits can be proved to be a multiple of 8. And architectures other than s390 may be able to make good use of this optimisation. [arnd@arndb.de: fix build: add a memcmp() declaration] Link: http://lkml.kernel.org/r/20170630153908.3439707-1-arnd@arndb.de Link: http://lkml.kernel.org/r/20170628153221.11322-5-willy@infradead.org Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-10include/linux/bitmap.h: turn bitmap_set and bitmap_clear into memset when ↵Matthew Wilcox1-0/+6
possible Several callers have constant 'start' and an 'nbits' that is a multiple of 8, so we can turn them into calls to memset. We don't need the entirety of 'start' and 'nbits' to be constant, we just need to know whether they're divisible by 8. Link: http://lkml.kernel.org/r/20170628153221.11322-4-willy@infradead.org Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-10bitmap: optimise bitmap_set and bitmap_clear of a single bitMatthew Wilcox3-10/+24
We have eight users calling bitmap_clear for a single bit and seventeen calling bitmap_set for a single bit. Rather than fix all of them to call __clear_bit or __set_bit, turn bitmap_clear and bitmap_set into inline functions and make this special case efficient. Link: http://lkml.kernel.org/r/20170628153221.11322-3-willy@infradead.org Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-10lib/test_bitmap.c: add optimisation testsMatthew Wilcox1-0/+32
Patch series "Bitmap optimisations", v2. These three bitmap patches use more efficient specialisations when the compiler can figure out that it's safe to do so. Thanks to Rasmus's eagle eyes, a nasty bug in v1 was avoided, and I've added a test case which would have caught it. This patch (of 4): This version of the test is actually a no-op; the next patch will enable it. Link: http://lkml.kernel.org/r/20170628153221.11322-2-willy@infradead.org Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-10MAINTAINERS: give proc sysctl some maintainer loveLuis R. Rodriguez1-0/+11
We poke at proc sysctl enough that really we should declare it maintained. We'll just be Cc'd and sending updates / ACK'ing changes through akpm's tree. Link: http://lkml.kernel.org/r/20170524231305.8649-1-mcgrof@kernel.org Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>