summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)AuthorFilesLines
2017-07-20x86/defconfig: Remove stale, old Kconfig optionsKrzysztof Kozlowski2-6/+0
Remove old, dead Kconfig options (in order appearing in this commit): - EXPERIMENTAL is gone since v3.9; - IP_NF_TARGET_ULOG: commit d4da843e6fad ("netfilter: kill remnants of ulog targets"); - USB_LIBUSUAL: commit f61870ee6f8c ("usb: remove libusual"); Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1500526885-4341-1-git-send-email-krzk@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-20x86/ioapic: Pass the correct data to unmask_ioapic_irq()Seunghun Han1-1/+1
One of the rarely executed code pathes in check_timer() calls unmask_ioapic_irq() passing irq_get_chip_data(0) as argument. That's wrong as unmask_ioapic_irq() expects a pointer to the irq data of interrupt 0. irq_get_chip_data(0) returns NULL, so the following dereference in unmask_ioapic_irq() causes a kernel panic. The issue went unnoticed in the first place because irq_get_chip_data() returns a void pointer so the compiler cannot do a type check on the argument. The code path was added for machines with broken configuration, but it seems that those machines are either not running current kernels or simply do not longer exist. Hand in irq_get_irq_data(0) as argument which provides the correct data. [ tglx: Rewrote changelog ] Fixes: 4467715a44cc ("x86/irq: Move irq_cfg.irq_2_pin into io_apic.c") Signed-off-by: Seunghun Han <kkamagui@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1500369644-45767-1-git-send-email-kkamagui@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-20x86/acpi: Prevent out of bound access caused by broken ACPI tablesSeunghun Han1-0/+8
The bus_irq argument of mp_override_legacy_irq() is used as the index into the isa_irq_to_gsi[] array. The bus_irq argument originates from ACPI_MADT_TYPE_IO_APIC and ACPI_MADT_TYPE_INTERRUPT items in the ACPI tables, but is nowhere sanity checked. That allows broken or malicious ACPI tables to overwrite memory, which might cause malfunction, panic or arbitrary code execution. Add a sanity check and emit a warning when that triggers. [ tglx: Added warning and rewrote changelog ] Signed-off-by: Seunghun Han <kkamagui@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: security@kernel.org Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: stable@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-18x86/mm, KVM: Fix warning when !CONFIG_PREEMPT_COUNTRoman Kagan1-1/+1
A recent commit: d6e41f1151fe ("x86/mm, KVM: Teach KVM's VMX code that CR3 isn't a constant") introduced a VM_WARN_ON(!in_atomic()) which generates false positives on every VM entry on !CONFIG_PREEMPT_COUNT kernels. Replace it with a test for preemptible(), which appears to match the original intent and works across different CONFIG_PREEMPT* variations. Signed-off-by: Roman Kagan <rkagan@virtuozzo.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Nadav Amit <namit@vmware.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kvm@vger.kernel.org Cc: linux-mm@kvack.org Fixes: d6e41f1151fe ("x86/mm, KVM: Teach KVM's VMX code that CR3 isn't a constant") Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-16x86/platform/uv/BAU: Fix congested_response_us not taking effectJustin Ernst1-4/+2
Bug fix for the BAU tunable congested_cycles not being set to the user defined value. Instead of referencing a global variable when deciding on BAU shutdown, a node will reference its own tunable set value ( cong_response_us). This results in the user set tunable value congested_response_us taking effect correctly. Signed-off-by: Justin Ernst <justin.ernst@hpe.com> Acked-by: Andrew Banman <abanman@hpe.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: mike.travis@hpe.com Cc: sivanich@hpe.com Link: http://lkml.kernel.org/r/1499970803-282432-1-git-send-email-justin.ernst@hpe.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-16x86/cpu: Use indirect call to measure performance in init_amd_k6()Mikulas Patocka1-0/+1
This old piece of code is supposed to measure the performance of indirect calls to determine if the processor is buggy or not, however the compiler optimizer turns it into a direct call. Use the OPTIMIZER_HIDE_VAR() macro to thwart the optimization, so that a real indirect call is generated. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1707110737530.8746@file01.intranet.prod.int.rdu2.redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-15Merge branch 'work.mount' of ↵Linus Torvalds1-3/+19
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull ->s_options removal from Al Viro: "Preparations for fsmount/fsopen stuff (coming next cycle). Everything gets moved to explicit ->show_options(), killing ->s_options off + some cosmetic bits around fs/namespace.c and friends. Basically, the stuff needed to work with fsmount series with minimum of conflicts with other work. It's not strictly required for this merge window, but it would reduce the PITA during the coming cycle, so it would be nice to have those bits and pieces out of the way" * 'work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: isofs: Fix isofs_show_options() VFS: Kill off s_options and helpers orangefs: Implement show_options 9p: Implement show_options isofs: Implement show_options afs: Implement show_options affs: Implement show_options befs: Implement show_options spufs: Implement show_options bpf: Implement show_options ramfs: Implement show_options pstore: Implement show_options omfs: Implement show_options hugetlbfs: Implement show_options VFS: Don't use save/replace_mount_options if not using generic_show_options VFS: Provide empty name qstr VFS: Make get_filesystem() return the affected filesystem VFS: Clean up whitespace in fs/namespace.c and fs/super.c Provide a function to create a NUL-terminated string from unterminated data
2017-07-15Merge branch 'work.uaccess-unaligned' of ↵Linus Torvalds21-406/+145
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull uacess-unaligned removal from Al Viro: "That stuff had just one user, and an exotic one, at that - binfmt_flat on arm and m68k" * 'work.uaccess-unaligned' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: kill {__,}{get,put}_user_unaligned() binfmt_flat: flat_{get,put}_addr_from_rp() should be able to fail
2017-07-15Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linusLinus Torvalds127-1771/+2206
Pull MIPS updates from Ralf Baechle: "Boston platform support: - Document DT bindings - Add CLK driver for board clocks CM: - Avoid per-core locking with CM3 & higher - WARN on attempt to lock invalid VP, not BUG CPS: - Select CONFIG_SYS_SUPPORTS_SCHED_SMT for MIPSr6 - Prevent multi-core with dcache aliasing - Handle cores not powering down more gracefully - Handle spurious VP starts more gracefully DSP: - Add lwx & lhx missaligned access support eBPF: - Add MIPS support along with many supporting change to add the required infrastructure Generic arch code: - Misc sysmips MIPS_ATOMIC_SET fixes - Drop duplicate HAVE_SYSCALL_TRACEPOINTS - Negate error syscall return in trace - Correct forced syscall errors - Traced negative syscalls should return -ENOSYS - Allow samples/bpf/tracex5 to access syscall arguments for sane traces - Cleanup from old Kconfig options in defconfigs - Fix PREF instruction usage by memcpy for MIPS R6 - Fix various special cases in the FPU eulation - Fix some special cases in MIPS16e2 support - Fix MIPS I ISA /proc/cpuinfo reporting - Sort MIPS Kconfig alphabetically - Fix minimum alignment requirement of IRQ stack as required by ABI / GCC - Fix special cases in the module loader - Perform post-DMA cache flushes on systems with MAARs - Probe the I6500 CPU - Cleanup cmpxchg and add support for 1 and 2 byte operations - Use queued read/write locks (qrwlock) - Use queued spinlocks (qspinlock) - Add CPU shared FTLB feature detection - Handle tlbex-tlbp race condition - Allow storing pgd in C0_CONTEXT for MIPSr6 - Use current_cpu_type() in m4kc_tlbp_war() - Support Boston in the generic kernel Generic platform: - yamon-dt: Pull YAMON DT shim code out of SEAD-3 board - yamon-dt: Support > 256MB of RAM - yamon-dt: Use serial* rather than uart* aliases - Abstract FDT fixup application - Set RTC_ALWAYS_BCD to 0 - Add a MAINTAINERS entry core kernel: - qspinlock.c: include linux/prefetch.h Loongson 3: - Add support Perf: - Add I6500 support SEAD-3: - Remove GIC timer from DT - Set interrupt-parent per-device, not at root node - Fix GIC interrupt specifiers SMP: - Skip IPI setup if we only have a single CPU VDSO: - Make comment match reality - Improvements to time code in VDSO" * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (86 commits) locking/qspinlock: Include linux/prefetch.h MIPS: Fix MIPS I ISA /proc/cpuinfo reporting MIPS: Fix minimum alignment requirement of IRQ stack MIPS: generic: Support MIPS Boston development boards MIPS: DTS: img: Don't attempt to build-in all .dtb files clk: boston: Add a driver for MIPS Boston board clocks dt-bindings: Document img,boston-clock binding MIPS: Traced negative syscalls should return -ENOSYS MIPS: Correct forced syscall errors MIPS: Negate error syscall return in trace MIPS: Drop duplicate HAVE_SYSCALL_TRACEPOINTS select MIPS16e2: Provide feature overrides for non-MIPS16 systems MIPS: MIPS16e2: Report ASE presence in /proc/cpuinfo MIPS: MIPS16e2: Subdecode extended LWSP/SWSP instructions MIPS: MIPS16e2: Identify ASE presence MIPS: VDSO: Fix a mismatch between comment and preprocessor constant MIPS: VDSO: Add implementation of gettimeofday() fallback MIPS: VDSO: Add implementation of clock_gettime() fallback MIPS: VDSO: Fix conversions in do_monotonic()/do_monotonic_coarse() MIPS: Use current_cpu_type() in m4kc_tlbp_war() ...
2017-07-15Merge branch 'for-linus-4.13-rc1' of ↵Linus Torvalds21-82/+201
git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml Pull UML updates from Richard Weinberger: "Mostly fixes for UML: - First round of fixes for PTRACE_GETRESET/SETREGSET - A printf vs printk cleanup - Minor improvements" * 'for-linus-4.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml: um: Correctly check for PTRACE_GETRESET/SETREGSET um: v2: Use generic NOTES macro um: Add kerneldoc for userspace_tramp() and start_userspace() um: Add kerneldoc for segv_handler um: stub-data.h: remove superfluous include um: userspace - be more verbose in ptrace set regs error um: add dummy ioremap and iounmap functions um: Allow building and running on older hosts um: Avoid longjmp/setjmp symbol clashes with libpthread.a um: console: Ignore console= option um: Use os_warn to print out pre-boot warning/error messages um: Add os_warn() for pre-boot warning/error messages um: Use os_info for the messages on normal path um: Add os_info() for pre-boot information messages um: Use printk instead of printf in make_uml_dir
2017-07-15Merge tag 'kvm-4.13-2' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds14-172/+320
Pull more KVM updates from Radim Krčmář: "Second batch of KVM updates for v4.13 Common: - add uevents for VM creation/destruction - annotate and properly access RCU-protected objects s390: - rename IOCTL added in the first v4.13 merge x86: - emulate VMLOAD VMSAVE feature in SVM - support paravirtual asynchronous page fault while nested - add Hyper-V userspace interfaces for better migration - improve master clock corner cases - extend internal error reporting after EPT misconfig - correct single-stepping of emulated instructions in SVM - handle MCE during VM entry - fix nVMX VM entry checks and nVMX VMCS shadowing" * tag 'kvm-4.13-2' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (28 commits) kvm: x86: hyperv: make VP_INDEX managed by userspace KVM: async_pf: Let guest support delivery of async_pf from guest mode KVM: async_pf: Force a nested vmexit if the injected #PF is async_pf KVM: async_pf: Add L1 guest async_pf #PF vmexit handler KVM: x86: Simplify kvm_x86_ops->queue_exception parameter list kvm: x86: hyperv: add KVM_CAP_HYPERV_SYNIC2 KVM: x86: make backwards_tsc_observed a per-VM variable KVM: trigger uevents when creating or destroying a VM KVM: SVM: Enable Virtual VMLOAD VMSAVE feature KVM: SVM: Add Virtual VMLOAD VMSAVE feature definition KVM: SVM: Rename lbr_ctl field in the vmcb control area KVM: SVM: Prepare for new bit definition in lbr_ctl KVM: SVM: handle singlestep exception when skipping emulated instructions KVM: x86: take slots_lock in kvm_free_pit KVM: s390: Fix KVM_S390_GET_CMMA_BITS ioctl definition kvm: vmx: Properly handle machine check during VM-entry KVM: x86: update master clock before computing kvmclock_offset kvm: nVMX: Shadow "high" parts of shadowed 64-bit VMCS fields kvm: nVMX: Fix nested_vmx_check_msr_bitmap_controls kvm: nVMX: Validate the I/O bitmaps on nested VM-entry ...
2017-07-14Merge branch 'linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: - fix new compiler warnings in cavium - set post-op IV properly in caam (this fixes chaining) - fix potential use-after-free in atmel in case of EBUSY - fix sleeping in softirq path in chcr - disable buggy sha1-avx2 driver (may overread and page fault) - fix use-after-free on signals in caam * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: cavium - make several functions static crypto: chcr - Avoid algo allocation in softirq. crypto: caam - properly set IV after {en,de}crypt crypto: atmel - only treat EBUSY as transient if backlog crypto: af_alg - Avoid sock_graft call warning crypto: caam - fix signals handling crypto: sha1-ssse3 - Disable avx2
2017-07-14Merge branch 'akpm' (patches from Andrew)Linus Torvalds2-12/+1
Merge even more updates from Andrew Morton: - a few leftovers - fault-injector rework - add a module loader test driver * emailed patches from Andrew Morton <akpm@linux-foundation.org>: kmod: throttle kmod thread limit kmod: add test driver to stress test the module loader MAINTAINERS: give kmod some maintainer love xtensa: use generic fb.h fault-inject: add /proc/<pid>/fail-nth fault-inject: simplify access check for fail-nth fault-inject: make fail-nth read/write interface symmetric fault-inject: parse as natural 1-based value for fail-nth write interface fault-inject: automatically detect the number base for fail-nth write interface kernel/watchdog.c: use better pr_fmt prefix MAINTAINERS: move the befs tree to kernel.org lib/atomic64_test.c: add a test that atomic64_inc_not_zero() returns an int mm: fix overflow check in expand_upwards()
2017-07-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tileLinus Torvalds3-75/+74
Pull arch/tile updates from Chris Metcalf: "This adds support for an <arch/intreg.h> to help with removing __need_xxx #defines from glibc, and removes some dead code in arch/tile/mm/init.c" * git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile: mm, tile: drop arch_{add,remove}_memory tile: prefer <arch/intreg.h> to __need_int_reg_t
2017-07-14Merge tag 'powerpc-4.13-2' of ↵Linus Torvalds14-31/+162
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "Nothing that really stands out, just a bunch of fixes that have come in in the last couple of weeks. None of these are actually fixes for code that is new in 4.13. It's roughly half older bugs, with fixes going to stable, and half fixes/updates for Power9. Thanks to: Aneesh Kumar K.V, Anton Blanchard, Balbir Singh, Benjamin Herrenschmidt, Madhavan Srinivasan, Michael Neuling, Nicholas Piggin, Oliver O'Halloran" * tag 'powerpc-4.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/64: Fix atomic64_inc_not_zero() to return an int powerpc: Fix emulation of mfocrf in emulate_step() powerpc: Fix emulation of mcrf in emulate_step() powerpc/perf: Add POWER9 alternate PM_RUN_CYC and PM_RUN_INST_CMPL events powerpc/perf: Fix SDAR_MODE value for continous sampling on Power9 powerpc/asm: Mark cr0 as clobbered in mftb() powerpc/powernv: Fix local TLB flush for boot and MCE on POWER9 powerpc/mm/radix: Synchronize updates to the process table powerpc/mm/radix: Properly clear process table entry powerpc/powernv: Tell OPAL about our MMU mode on POWER9 powerpc/kexec: Fix radix to hash kexec due to IAMR/AMOR
2017-07-14xtensa: use generic fb.hTobias Klauser2-12/+1
The arch uses a verbatim copy of the asm-generic version and does not add any own implementations to the header, so use asm-generic/fb.h instead of duplicating code. Link: http://lkml.kernel.org/r/20170517083545.2115-1-tklauser@distanz.ch Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Acked-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-14Merge tag 'pci-v4.13-fixes-1' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: - fix a typo that broke Rockchip enumeration - fix a new memory leak in the ARM host bridge failure path * tag 'pci-v4.13-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: rockchip: Check for pci_scan_root_bus_bridge() failure correctly ARM/PCI: Fix pcibios_init_resource() struct pci_host_bridge leak
2017-07-14kvm: x86: hyperv: make VP_INDEX managed by userspaceRoman Kagan4-19/+40
Hyper-V identifies vCPUs by Virtual Processor Index, which can be queried via HV_X64_MSR_VP_INDEX msr. It is defined by the spec as a sequential number which can't exceed the maximum number of vCPUs per VM. APIC ids can be sparse and thus aren't a valid replacement for VP indices. Current KVM uses its internal vcpu index as VP_INDEX. However, to make it predictable and persistent across VM migrations, the userspace has to control the value of VP_INDEX. This patch achieves that, by storing vp_index explicitly on vcpu, and allowing HV_X64_MSR_VP_INDEX to be set from the host side. For compatibility it's initialized to KVM vcpu index. Also a few variables are renamed to make clear distinction betweed this Hyper-V vp_index and KVM vcpu_id (== APIC id). Besides, a new capability, KVM_CAP_HYPERV_VP_INDEX, is added to allow the userspace to skip attempting msr writes where unsupported, to avoid spamming error logs. Signed-off-by: Roman Kagan <rkagan@virtuozzo.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-07-14KVM: async_pf: Let guest support delivery of async_pf from guest modeWanpeng Li6-5/+13
Adds another flag bit (bit 2) to MSR_KVM_ASYNC_PF_EN. If bit 2 is 1, async page faults are delivered to L1 as #PF vmexits; if bit 2 is 0, kvm_can_do_async_pf returns 0 if in guest mode. This is similar to what svm.c wanted to do all along, but it is only enabled for Linux as L1 hypervisor. Foreign hypervisors must never receive async page faults as vmexits, because they'd probably be very confused about that. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-07-14KVM: async_pf: Force a nested vmexit if the injected #PF is async_pfWanpeng Li5-10/+35
Add an nested_apf field to vcpu->arch.exception to identify an async page fault, and constructs the expected vm-exit information fields. Force a nested VM exit from nested_vmx_check_exception() if the injected #PF is async page fault. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-07-14KVM: async_pf: Add L1 guest async_pf #PF vmexit handlerWanpeng Li5-37/+51
This patch adds the L1 guest async page fault #PF vmexit handler, such by L1 similar to ordinary async page fault. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> [Passed insn parameters to kvm_mmu_page_fault().] Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-07-14KVM: x86: Simplify kvm_x86_ops->queue_exception parameter listWanpeng Li4-13/+12
This patch removes all arguments except the first in kvm_x86_ops->queue_exception since they can extract the arguments from vcpu->arch.exception themselves. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-07-13Merge tag 'kbuild-v4.13-2' of ↵Linus Torvalds42-404/+370
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull more Kbuild updates from Masahiro Yamada: - Move generic-y of exported headers to uapi/asm/Kbuild for complete de-coupling of UAPI - Clean up scripts/Makefile.headersinst - Fix host programs for 32 bit machine with XFS file system * tag 'kbuild-v4.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (29 commits) kbuild: Enable Large File Support for hostprogs kbuild: remove wrapper files handling from Makefile.headersinst kbuild: split exported generic header creation into uapi-asm-generic kbuild: do not include old-kbuild-file from Makefile.headersinst xtensa: move generic-y of exported headers to uapi/asm/Kbuild unicore32: move generic-y of exported headers to uapi/asm/Kbuild tile: move generic-y of exported headers to uapi/asm/Kbuild sparc: move generic-y of exported headers to uapi/asm/Kbuild sh: move generic-y of exported headers to uapi/asm/Kbuild parisc: move generic-y of exported headers to uapi/asm/Kbuild openrisc: move generic-y of exported headers to uapi/asm/Kbuild nios2: move generic-y of exported headers to uapi/asm/Kbuild nios2: remove unneeded arch/nios2/include/(generated/)asm/signal.h microblaze: move generic-y of exported headers to uapi/asm/Kbuild metag: move generic-y of exported headers to uapi/asm/Kbuild m68k: move generic-y of exported headers to uapi/asm/Kbuild m32r: move generic-y of exported headers to uapi/asm/Kbuild ia64: remove redundant generic-y += kvm_para.h from asm/Kbuild hexagon: move generic-y of exported headers to uapi/asm/Kbuild h8300: move generic-y of exported headers to uapi/asm/Kbuild ...
2017-07-13kvm: x86: hyperv: add KVM_CAP_HYPERV_SYNIC2Roman Kagan4-6/+17
There is a flaw in the Hyper-V SynIC implementation in KVM: when message page or event flags page is enabled by setting the corresponding msr, KVM zeroes it out. This is problematic because on migration the corresponding MSRs are loaded on the destination, so the content of those pages is lost. This went unnoticed so far because the only user of those pages was in-KVM hyperv synic timers, which could continue working despite that zeroing. Newer QEMU uses those pages for Hyper-V VMBus implementation, and zeroing them breaks the migration. Besides, in newer QEMU the content of those pages is fully managed by QEMU, so zeroing them is undesirable even when writing the MSRs from the guest side. To support this new scheme, introduce a new capability, KVM_CAP_HYPERV_SYNIC2, which, when enabled, makes sure that the synic pages aren't zeroed out in KVM. Signed-off-by: Roman Kagan <rkagan@virtuozzo.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-07-13KVM: x86: make backwards_tsc_observed a per-VM variableLadi Prosek2-4/+3
The backwards_tsc_observed global introduced in commit 16a9602 is never reset to false. If a VM happens to be running while the host is suspended (a common source of the TSC jumping backwards), master clock will never be enabled again for any VM. In contrast, if no VM is running while the host is suspended, master clock is unaffected. This is inconsistent and unnecessarily strict. Let's track the backwards_tsc_observed variable separately and let each VM start with a clean slate. Real world impact: My Windows VMs get slower after my laptop undergoes a suspend/resume cycle. The only way to get the perf back is unloading and reloading the kvm module. Signed-off-by: Ladi Prosek <lprosek@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-07-12x86/efi: move asmlinkage before return typeJoe Perches1-2/+2
Make the code like the rest of the kernel. Link: http://lkml.kernel.org/r/1cd3d401626e51ea0e2333a860e76e80bc560a4c.1499284835.git.joe@perches.com Signed-off-by: Joe Perches <joe@perches.com> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12sh: move inline before return typeJoe Perches1-1/+1
Make the code like the rest of the kernel. Link: http://lkml.kernel.org/r/f81bb2a67a97b1fd8b6ea99bd350d8a0f6864fb1.1499284835.git.joe@perches.com Signed-off-by: Joe Perches <joe@perches.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12MIPS: SMP: move asmlinkage before return typeJoe Perches1-1/+1
Make the code like the rest of the kernel. Link: http://lkml.kernel.org/r/756d3fb543e981b9284e756fa27616725a354b28.1499284835.git.joe@perches.com Signed-off-by: Joe Perches <joe@perches.com> Cc: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12m68k: coldfire: move inline before return typeJoe Perches1-2/+2
Make the code like the rest of the kernel. Link: http://lkml.kernel.org/r/14db9c166d5b68efa77e337cfe49bb9b29bca3f7.1499284835.git.joe@perches.com Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Greg Ungerer <gerg@linux-m68k.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12ia64: sn: pci: move inline before typeJoe Perches2-3/+3
Make the use of inline like the rest of the kernel. Link: http://lkml.kernel.org/r/f42b2202bd0d4e7ccf79ce5348bb255a035e67bb.1499284835.git.joe@perches.com Signed-off-by: Joe Perches <joe@perches.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12ia64: move inline before return typeJoe Perches1-1/+1
Make the use of inline like the rest of the kernel. Link: http://lkml.kernel.org/r/d47074493af80ce12590340294bc49618165c30d.1499284835.git.joe@perches.com Signed-off-by: Joe Perches <joe@perches.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12FRV: tlbflush: move asmlinkage before return typeJoe Perches1-4/+4
Make the use of asmlinkage like the rest of the kernel. Link: http://lkml.kernel.org/r/efb2dfed4d9315bf68ec0334c81b65af176a0174.1499284835.git.joe@perches.com Signed-off-by: Joe Perches <joe@perches.com> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12CRIS: gpio: move inline before return typeJoe Perches1-2/+2
Move inline to be like the rest of the kernel. Link: http://lkml.kernel.org/r/6bf1bec049897c4158f698b866810f47c728f233.1499284835.git.joe@perches.com Signed-off-by: Joe Perches <joe@perches.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12ARM: HP Jornada 7XX: move inline before return typeJoe Perches1-1/+1
Convert 'u8 inline' to 'inline u8' to be the same style used by the rest of the kernel. Miscellanea: jornada_ssp_reverse is an odd function. It is declared inline but is also EXPORT_SYMBOL. It is also apparently only used by jornada720_ssp.c Likely the EXPORT_SYMBOL could be removed and the function converted to static. The addition of static and removal of EXPORT_SYMBOL was not done. Link: http://lkml.kernel.org/r/5bd3b2bf39c6c9caf773949f18158f8f5ec08582.1499284835.git.joe@perches.com Signed-off-by: Joe Perches <joe@perches.com> Cc: Russell King <linux@armlinux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12ARM: KVM: move asmlinkage before typeJoe Perches1-4/+4
asmlinkage is either 'extern "C"' or blank. Move the uses of asmlinkage before the return types to be similar to the rest of the kernel. Link: http://lkml.kernel.org/r/005b8e120650c6a13b541e420f4e3605603fe9e6.1499284835.git.joe@perches.com Signed-off-by: Joe Perches <joe@perches.com> Cc: Christoffer Dall <christoffer.dall@linaro.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krcmar <rkrcmar@redhat.com> Cc: Russell King <linux@armlinux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12mm, tree wide: replace __GFP_REPEAT by __GFP_RETRY_MAYFAIL with more useful ↵Michal Hocko3-3/+3
semantic __GFP_REPEAT was designed to allow retry-but-eventually-fail semantic to the page allocator. This has been true but only for allocations requests larger than PAGE_ALLOC_COSTLY_ORDER. It has been always ignored for smaller sizes. This is a bit unfortunate because there is no way to express the same semantic for those requests and they are considered too important to fail so they might end up looping in the page allocator for ever, similarly to GFP_NOFAIL requests. Now that the whole tree has been cleaned up and accidental or misled usage of __GFP_REPEAT flag has been removed for !costly requests we can give the original flag a better name and more importantly a more useful semantic. Let's rename it to __GFP_RETRY_MAYFAIL which tells the user that the allocator would try really hard but there is no promise of a success. This will work independent of the order and overrides the default allocator behavior. Page allocator users have several levels of guarantee vs. cost options (take GFP_KERNEL as an example) - GFP_KERNEL & ~__GFP_RECLAIM - optimistic allocation without _any_ attempt to free memory at all. The most light weight mode which even doesn't kick the background reclaim. Should be used carefully because it might deplete the memory and the next user might hit the more aggressive reclaim - GFP_KERNEL & ~__GFP_DIRECT_RECLAIM (or GFP_NOWAIT)- optimistic allocation without any attempt to free memory from the current context but can wake kswapd to reclaim memory if the zone is below the low watermark. Can be used from either atomic contexts or when the request is a performance optimization and there is another fallback for a slow path. - (GFP_KERNEL|__GFP_HIGH) & ~__GFP_DIRECT_RECLAIM (aka GFP_ATOMIC) - non sleeping allocation with an expensive fallback so it can access some portion of memory reserves. Usually used from interrupt/bh context with an expensive slow path fallback. - GFP_KERNEL - both background and direct reclaim are allowed and the _default_ page allocator behavior is used. That means that !costly allocation requests are basically nofail but there is no guarantee of that behavior so failures have to be checked properly by callers (e.g. OOM killer victim is allowed to fail currently). - GFP_KERNEL | __GFP_NORETRY - overrides the default allocator behavior and all allocation requests fail early rather than cause disruptive reclaim (one round of reclaim in this implementation). The OOM killer is not invoked. - GFP_KERNEL | __GFP_RETRY_MAYFAIL - overrides the default allocator behavior and all allocation requests try really hard. The request will fail if the reclaim cannot make any progress. The OOM killer won't be triggered. - GFP_KERNEL | __GFP_NOFAIL - overrides the default allocator behavior and all allocation requests will loop endlessly until they succeed. This might be really dangerous especially for larger orders. Existing users of __GFP_REPEAT are changed to __GFP_RETRY_MAYFAIL because they already had their semantic. No new users are added. __alloc_pages_slowpath is changed to bail out for __GFP_RETRY_MAYFAIL if there is no progress and we have already passed the OOM point. This means that all the reclaim opportunities have been exhausted except the most disruptive one (the OOM killer) and a user defined fallback behavior is more sensible than keep retrying in the page allocator. [akpm@linux-foundation.org: fix arch/sparc/kernel/mdesc.c] [mhocko@suse.com: semantic fix] Link: http://lkml.kernel.org/r/20170626123847.GM11534@dhcp22.suse.cz [mhocko@kernel.org: address other thing spotted by Vlastimil] Link: http://lkml.kernel.org/r/20170626124233.GN11534@dhcp22.suse.cz Link: http://lkml.kernel.org/r/20170623085345.11304-3-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Alex Belits <alex.belits@cavium.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Christoph Hellwig <hch@infradead.org> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: David Daney <david.daney@cavium.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Mel Gorman <mgorman@suse.de> Cc: NeilBrown <neilb@suse.com> Cc: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12MIPS: do not use __GFP_REPEAT for order-0 requestMichal Hocko1-1/+1
Patch series "mm: give __GFP_REPEAT a better semantic". The main motivation for the change is that the current implementation of __GFP_REPEAT is not very much useful. The documentation says: * __GFP_REPEAT: Try hard to allocate the memory, but the allocation attempt * _might_ fail. This depends upon the particular VM implementation. It just fails to mention that this is true only for large (costly) high order which has been the case since the flag was introduced. A similar semantic would be really helpful for smal orders as well, though, because we have places where a failure with a specific fallback error handling is preferred to a potential endless loop inside the page allocator. The earlier cleanup dropped __GFP_REPEAT usage for low (!costly) order users so only those which might use larger orders have stayed. One new user added in the meantime is addressed in patch 1. Let's rename the flag to something more verbose and use it for existing users. Semantic for those will not change. Then implement low (!costly) orders failure path which is hit after the page allocator is about to invoke the oom killer. With that we have a good counterpart for __GFP_NORETRY and finally can tell try as hard as possible without the OOM killer. Xfs code already has an existing annotation for allocations which are allowed to fail and we can trivially map them to the new gfp flag because it will provide the semantic KM_MAYFAIL wants. Christoph didn't consider the new flag really necessary but didn't respond to the OOM killer aspect of the change so I have kept the patch. If this is still seen as not really needed I can drop the patch. kvmalloc will allow also !costly high order allocations to retry hard before falling back to the vmalloc. drm/i915 asked for the new semantic explicitly. Memory migration code, especially for the memory hotplug, should back off rather than invoking the OOM killer as well. This patch (of 6): Commit 3377e227af44 ("MIPS: Add 48-bit VA space (and 4-level page tables) for 4K pages.") has added a new __GFP_REPEAT user but using this flag doesn't really make any sense for order-0 request which is the case here because PUD_ORDER is 0. __GFP_REPEAT has historically effect only on allocation requests with order > PAGE_ALLOC_COSTLY_ORDER. This doesn't introduce any functional change. This is a preparatory patch for later work which renames the flag and redefines its semantic. Link: http://lkml.kernel.org/r/20170623085345.11304-2-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Alex Belits <alex.belits@cavium.com> Cc: David Daney <david.daney@cavium.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Mel Gorman <mgorman@suse.de> Cc: NeilBrown <neilb@suse.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12powerpc,mmap: properly account for stack randomization in mmap_baseRik van Riel1-9/+19
When RLIMIT_STACK is, for example, 256MB, the current code results in a gap between the top of the task and mmap_base of 256MB, failing to take into account the amount by which the stack address was randomized. In other words, the stack gets less than RLIMIT_STACK space. Ensure that the gap between the stack and mmap_base always takes stack randomization and the stack guard gap into account. Inspired by Daniel Micay's linux-hardened tree. Link: http://lkml.kernel.org/r/20170622200033.25714-4-riel@redhat.com Signed-off-by: Rik van Riel <riel@redhat.com> Reported-by: Florian Weimer <fweimer@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Daniel Micay <danielmicay@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Hugh Dickins <hughd@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12arm64/mmap: properly account for stack randomization in mmap_baseRik van Riel1-1/+6
When RLIMIT_STACK is, for example, 256MB, the current code results in a gap between the top of the task and mmap_base of 256MB, failing to take into account the amount by which the stack address was randomized. In other words, the stack gets less than RLIMIT_STACK space. Ensure that the gap between the stack and mmap_base always takes stack randomization and the stack guard gap into account. Obtained from Daniel Micay's linux-hardened tree. Link: http://lkml.kernel.org/r/20170622200033.25714-3-riel@redhat.com Signed-off-by: Daniel Micay <danielmicay@gmail.com> Signed-off-by: Rik van Riel <riel@redhat.com> Reported-by: Florian Weimer <fweimer@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Daniel Micay <danielmicay@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Hugh Dickins <hughd@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12x86/mmap: properly account for stack randomization in mmap_baseRik van Riel1-1/+6
When RLIMIT_STACK is, for example, 256MB, the current code results in a gap between the top of the task and mmap_base of 256MB, failing to take into account the amount by which the stack address was randomized. In other words, the stack gets less than RLIMIT_STACK space. Ensure that the gap between the stack and mmap_base always takes stack randomization and the stack guard gap into account. Obtained from Daniel Micay's linux-hardened tree. Link: http://lkml.kernel.org/r/20170622200033.25714-2-riel@redhat.com Signed-off-by: Daniel Micay <danielmicay@gmail.com> Signed-off-by: Rik van Riel <riel@redhat.com> Reported-by: Florian Weimer <fweimer@redhat.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Daniel Micay <danielmicay@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Hugh Dickins <hughd@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12sh64: ascii armor the sh64 boot init stack canaryRik van Riel1-0/+1
Use the ascii-armor canary to prevent unterminated C string overflows from being able to successfully overwrite the canary, even if they somehow obtain the canary value. Inspired by execshield ascii-armor and Daniel Micay's linux-hardened tree. Link: http://lkml.kernel.org/r/20170524123446.78510066@annuminas.surriel.com Signed-off-by: Rik van Riel <riel@redhat.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: Daniel Micay <danielmicay@gmail.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Ingo Molnar <mingo@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12arm64: ascii armor the arm64 boot init stack canaryRik van Riel1-0/+1
Use the ascii-armor canary to prevent unterminated C string overflows from being able to successfully overwrite the canary, even if they somehow obtain the canary value. Inspired by execshield ascii-armor and Daniel Micay's linux-hardened tree. Link: http://lkml.kernel.org/r/20170524155751.424-5-riel@redhat.com Signed-off-by: Rik van Riel <riel@redhat.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: Daniel Micay <danielmicay@gmail.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Ingo Molnar <mingo@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12x86: ascii armor the x86_64 boot init stack canaryRik van Riel1-0/+1
Use the ascii-armor canary to prevent unterminated C string overflows from being able to successfully overwrite the canary, even if they somehow obtain the canary value. Inspired by execshield ascii-armor and Daniel Micay's linux-hardened tree. Link: http://lkml.kernel.org/r/20170524155751.424-4-riel@redhat.com Signed-off-by: Rik van Riel <riel@redhat.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: Daniel Micay <danielmicay@gmail.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Ingo Molnar <mingo@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12sh: mark end of BUG() implementation as unreachableKees Cook1-0/+1
When building the sh architecture, the compiler doesn't realize that BUG() doesn't return, so it will complain about functions using BUG() that are marked with the noreturn attribute: lib/string.c: In function 'fortify_panic': >> lib/string.c:986:1: warning: 'noreturn' function does return } ^ Link: http://lkml.kernel.org/r/20170627192050.GA66784@beast Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: Daniel Micay <danielmicay@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12include/linux/string.h: add the option of fortified string.h functionsDaniel Micay9-1/+36
This adds support for compiling with a rough equivalent to the glibc _FORTIFY_SOURCE=1 feature, providing compile-time and runtime buffer overflow checks for string.h functions when the compiler determines the size of the source or destination buffer at compile-time. Unlike glibc, it covers buffer reads in addition to writes. GNU C __builtin_*_chk intrinsics are avoided because they would force a much more complex implementation. They aren't designed to detect read overflows and offer no real benefit when using an implementation based on inline checks. Inline checks don't add up to much code size and allow full use of the regular string intrinsics while avoiding the need for a bunch of _chk functions and per-arch assembly to avoid wrapper overhead. This detects various overflows at compile-time in various drivers and some non-x86 core kernel code. There will likely be issues caught in regular use at runtime too. Future improvements left out of initial implementation for simplicity, as it's all quite optional and can be done incrementally: * Some of the fortified string functions (strncpy, strcat), don't yet place a limit on reads from the source based on __builtin_object_size of the source buffer. * Extending coverage to more string functions like strlcat. * It should be possible to optionally use __builtin_object_size(x, 1) for some functions (C strings) to detect intra-object overflows (like glibc's _FORTIFY_SOURCE=2), but for now this takes the conservative approach to avoid likely compatibility issues. * The compile-time checks should be made available via a separate config option which can be enabled by default (or always enabled) once enough time has passed to get the issues it catches fixed. Kees said: "This is great to have. While it was out-of-tree code, it would have blocked at least CVE-2016-3858 from being exploitable (improper size argument to strlcpy()). I've sent a number of fixes for out-of-bounds-reads that this detected upstream already" [arnd@arndb.de: x86: fix fortified memcpy] Link: http://lkml.kernel.org/r/20170627150047.660360-1-arnd@arndb.de [keescook@chromium.org: avoid panic() in favor of BUG()] Link: http://lkml.kernel.org/r/20170626235122.GA25261@beast [keescook@chromium.org: move from -mm, add ARCH_HAS_FORTIFY_SOURCE, tweak Kconfig help] Link: http://lkml.kernel.org/r/20170526095404.20439-1-danielmicay@gmail.com Link: http://lkml.kernel.org/r/1497903987-21002-8-git-send-email-keescook@chromium.org Signed-off-by: Daniel Micay <danielmicay@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Kees Cook <keescook@chromium.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Daniel Axtens <dja@axtens.net> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12powerpc: make feature-fixup tests fortify-safeDaniel Axtens1-90/+90
Testing the fortified string functions[1] would cause a kernel panic on boot in test_feature_fixups() due to a buffer overflow in memcmp. This boils down to things like this: extern unsigned int ftr_fixup_test1; extern unsigned int ftr_fixup_test1_orig; check(memcmp(&ftr_fixup_test1, &ftr_fixup_test1_orig, size) == 0); We know that these are asm labels so it is safe to read up to 'size' bytes at those addresses. However, because we have passed the address of a single unsigned int to memcmp, the compiler believes the underlying object is in fact a single unsigned int. So if size > sizeof(unsigned int), there will be a panic at runtime. We can fix this by changing the types: instead of calling the asm labels unsigned ints, call them unsigned int[]s. Therefore the size isn't incorrectly determined at compile time and we get a regular unsafe memcmp and no panic. [1] http://openwall.com/lists/kernel-hardening/2017/05/09/2 Link: http://lkml.kernel.org/r/1497903987-21002-7-git-send-email-keescook@chromium.org Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Kees Cook <keescook@chromium.org> Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Cc: Kees Cook <keescook@chromium.org> Cc: Daniel Micay <danielmicay@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12powerpc: don't fortify prom_initDaniel Axtens1-0/+3
prom_init is a bit special; in theory it should be able to be linked separately to the kernel. To keep this from getting too complex, the symbols that prom_init.c uses are checked. Fortification adds symbols, and it gets quite messy as it includes things like panic(). So just don't fortify prom_init.c for now. Link: http://lkml.kernel.org/r/1497903987-21002-6-git-send-email-keescook@chromium.org Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Daniel Micay <danielmicay@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12powerpc/64s: implement arch-specific hardlockup watchdogNicholas Piggin9-25/+458
Implement an arch-speicfic watchdog rather than use the perf-based hardlockup detector. The new watchdog takes the soft-NMI directly, rather than going through perf. Perf interrupts are to be made maskable in future, so that would prevent the perf detector from working in those regions. Additionally, implement a SMP based detector where all CPUs watch one another by pinging a shared cpumask. This is because powerpc Book3S does not have a true periodic local NMI, but some platforms do implement a true NMI IPI. If a CPU is stuck with interrupts hard disabled, the soft-NMI watchdog does not work, but the SMP watchdog will. Even on platforms without a true NMI IPI to get a good trace from the stuck CPU, other CPUs will notice the lockup sufficiently to report it and panic. [npiggin@gmail.com: honor watchdog disable at boot/hotplug] Link: http://lkml.kernel.org/r/20170621001346.5bb337c9@roar.ozlabs.ibm.com [npiggin@gmail.com: fix false positive warning at CPU unplug] Link: http://lkml.kernel.org/r/20170630080740.20766-1-npiggin@gmail.com [akpm@linux-foundation.org: coding-style fixes] Link: http://lkml.kernel.org/r/20170616065715.18390-6-npiggin@gmail.com Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Don Zickus <dzickus@redhat.com> Tested-by: Babu Moger <babu.moger@oracle.com> [sparc] Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12kernel/watchdog: split up config optionsNicholas Piggin5-5/+26
Split SOFTLOCKUP_DETECTOR from LOCKUP_DETECTOR, and split HARDLOCKUP_DETECTOR_PERF from HARDLOCKUP_DETECTOR. LOCKUP_DETECTOR implies the general boot, sysctl, and programming interfaces for the lockup detectors. An architecture that wants to use a hard lockup detector must define HAVE_HARDLOCKUP_DETECTOR_PERF or HAVE_HARDLOCKUP_DETECTOR_ARCH. Alternatively an arch can define HAVE_NMI_WATCHDOG, which provides the minimum arch_touch_nmi_watchdog, and it otherwise does its own thing and does not implement the LOCKUP_DETECTOR interfaces. sparc is unusual in that it has started to implement some of the interfaces, but not fully yet. It should probably be converted to a full HAVE_HARDLOCKUP_DETECTOR_ARCH. [npiggin@gmail.com: fix] Link: http://lkml.kernel.org/r/20170617223522.66c0ad88@roar.ozlabs.ibm.com Link: http://lkml.kernel.org/r/20170616065715.18390-4-npiggin@gmail.com Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Don Zickus <dzickus@redhat.com> Reviewed-by: Babu Moger <babu.moger@oracle.com> Tested-by: Babu Moger <babu.moger@oracle.com> [sparc] Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12kernel/watchdog: introduce arch_touch_nmi_watchdog()Nicholas Piggin7-10/+13
For architectures that define HAVE_NMI_WATCHDOG, instead of having them provide the complete touch_nmi_watchdog() function, just have them provide arch_touch_nmi_watchdog(). This gives the generic code more flexibility in implementing this function, and arch implementations don't miss out on touching the softlockup watchdog or other generic details. Link: http://lkml.kernel.org/r/20170616065715.18390-3-npiggin@gmail.com Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Don Zickus <dzickus@redhat.com> Reviewed-by: Babu Moger <babu.moger@oracle.com> Tested-by: Babu Moger <babu.moger@oracle.com> [sparc] Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>