summaryrefslogtreecommitdiff
path: root/include/asm-generic/vmlinux.lds.h
AgeCommit message (Collapse)AuthorFilesLines
2020-10-27vmlinux.lds.h: Keep .ctors.* with .ctorsKees Cook1-0/+1
Under some circumstances, the compiler generates .ctors.* sections. This is seen doing a cross compile of x86_64 from a powerpc64el host: x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from `kernel/trace/trace_clock.o' being placed in section `.ctors.65435' x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from `kernel/trace/ftrace.o' being placed in section `.ctors.65435' x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from `kernel/trace/ring_buffer.o' being placed in section `.ctors.65435' Include these orphans along with the regular .ctors section. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Tested-by: Stephen Rothwell <sfr@canb.auug.org.au> Fixes: 83109d5d5fba ("x86/build: Warn on orphan section placement") Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Nick Desaulniers <ndesaulniers@google.com> Link: https://lore.kernel.org/r/20201005025720.2599682-1-keescook@chromium.org
2020-10-18Merge tag 'linux-kselftest-kunit-5.10-rc1' of ↵Linus Torvalds1-1/+9
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull more Kunit updates from Shuah Khan: - add Kunit to kernel_init() and remove KUnit from init calls entirely. This addresses the concern that Kunit would not work correctly during late init phase. - add a linker section where KUnit can put references to its test suites. This is the first step in transitioning to dispatching all KUnit tests from a centralized executor rather than having each as its own separate late_initcall. - add a centralized executor to dispatch tests rather than relying on late_initcall to schedule each test suite separately. Centralized execution is for built-in tests only; modules will execute tests when loaded. - convert bitfield test to use KUnit framework - Documentation updates for naming guidelines and how kunit_test_suite() works. - add test plan to KUnit TAP format * tag 'linux-kselftest-kunit-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: lib: kunit: Fix compilation test when using TEST_BIT_FIELD_COMPILE lib: kunit: add bitfield test conversion to KUnit Documentation: kunit: add a brief blurb about kunit_test_suite kunit: test: add test plan to KUnit TAP format init: main: add KUnit to kernel init kunit: test: create a single centralized executor for all tests vmlinux.lds.h: add linker section for KUnit test suites Documentation: kunit: Add naming guidelines
2020-10-12Merge tag 'core-static_call-2020-10-12' of ↵Linus Torvalds1-0/+13
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull static call support from Ingo Molnar: "This introduces static_call(), which is the idea of static_branch() applied to indirect function calls. Remove a data load (indirection) by modifying the text. They give the flexibility of function pointers, but with better performance. (This is especially important for cases where retpolines would otherwise be used, as retpolines can be pretty slow.) API overview: DECLARE_STATIC_CALL(name, func); DEFINE_STATIC_CALL(name, func); DEFINE_STATIC_CALL_NULL(name, typename); static_call(name)(args...); static_call_cond(name)(args...); static_call_update(name, func); x86 is supported via text patching, otherwise basic indirect calls are used, with function pointers. There's a second variant using inline code patching, inspired by jump-labels, implemented on x86 as well. The new APIs are utilized in the x86 perf code, a heavy user of function pointers, where static calls speed up the PMU handler by 4.2% (!). The generic implementation is not really excercised on other architectures, outside of the trivial test_static_call_init() self-test" * tag 'core-static_call-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits) static_call: Fix return type of static_call_init tracepoint: Fix out of sync data passing by static caller tracepoint: Fix overly long tracepoint names x86/perf, static_call: Optimize x86_pmu methods tracepoint: Optimize using static_call() static_call: Allow early init static_call: Add some validation static_call: Handle tail-calls static_call: Add static_call_cond() x86/alternatives: Teach text_poke_bp() to emulate RET static_call: Add simple self-test for static calls x86/static_call: Add inline static call implementation for x86-64 x86/static_call: Add out-of-line static call implementation static_call: Avoid kprobes on inline static_call()s static_call: Add inline static call infrastructure static_call: Add basic static call infrastructure compiler.h: Make __ADDRESSABLE() symbol truly unique jump_label,module: Fix module lifetime for __jump_label_mod_text_reserved() module: Properly propagate MODULE_STATE_COMING failure module: Fix up module_notifier return values ...
2020-10-12Merge tag 'core-build-2020-10-12' of ↵Linus Torvalds1-7/+42
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull orphan section checking from Ingo Molnar: "Orphan link sections were a long-standing source of obscure bugs, because the heuristics that various linkers & compilers use to handle them (include these bits into the output image vs discarding them silently) are both highly idiosyncratic and also version dependent. Instead of this historically problematic mess, this tree by Kees Cook (et al) adds build time asserts and build time warnings if there's any orphan section in the kernel or if a section is not sized as expected. And because we relied on so many silent assumptions in this area, fix a metric ton of dependencies and some outright bugs related to this, before we can finally enable the checks on the x86, ARM and ARM64 platforms" * tag 'core-build-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits) x86/boot/compressed: Warn on orphan section placement x86/build: Warn on orphan section placement arm/boot: Warn on orphan section placement arm/build: Warn on orphan section placement arm64/build: Warn on orphan section placement x86/boot/compressed: Add missing debugging sections to output x86/boot/compressed: Remove, discard, or assert for unwanted sections x86/boot/compressed: Reorganize zero-size section asserts x86/build: Add asserts for unwanted sections x86/build: Enforce an empty .got.plt section x86/asm: Avoid generating unused kprobe sections arm/boot: Handle all sections explicitly arm/build: Assert for unwanted sections arm/build: Add missing sections arm/build: Explicitly keep .ARM.attributes sections arm/build: Refactor linker script headers arm64/build: Assert for unwanted sections arm64/build: Add missing DWARF sections arm64/build: Use common DISCARDS in linker script arm64/build: Remove .eh_frame* sections due to unwind tables ...
2020-10-09vmlinux.lds.h: add linker section for KUnit test suitesBrendan Higgins1-1/+9
Add a linker section where KUnit can put references to its test suites. This patch is the first step in transitioning to dispatching all KUnit tests from a centralized executor rather than having each as its own separate late_initcall. Co-developed-by: Iurii Zaikin <yzaikin@google.com> Signed-off-by: Iurii Zaikin <yzaikin@google.com> Signed-off-by: Brendan Higgins <brendanhiggins@google.com> Reviewed-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2020-09-21bpf: Prevent .BTF section eliminationTony Ambardar1-1/+1
Systems with memory or disk constraints often reduce the kernel footprint by configuring LD_DEAD_CODE_DATA_ELIMINATION. However, this can result in removal of any BTF information. Use the KEEP() macro to preserve the BTF data as done with other important sections, while still allowing for smaller kernels. Fixes: 90ceddcb4950 ("bpf: Support llvm-objcopy for vmlinux BTF") Signed-off-by: Tony Ambardar <Tony.Ambardar@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/a635b5d3e2da044e7b51ec1315e8910fbce0083f.1600417359.git.Tony.Ambardar@gmail.com
2020-09-01x86/static_call: Add inline static call implementation for x86-64Josh Poimboeuf1-0/+6
Add the inline static call implementation for x86-64. The generated code is identical to the out-of-line case, except we move the trampoline into it's own section. Objtool uses the trampoline naming convention to detect all the call sites. It then annotates those call sites in the .static_call_sites section. During boot (and module init), the call sites are patched to call directly into the destination function. The temporary trampoline is then no longer used. [peterz: merged trampolines, put trampoline in section] Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20200818135804.864271425@infradead.org
2020-09-01static_call: Add inline static call infrastructureJosh Poimboeuf1-0/+7
Add infrastructure for an arch-specific CONFIG_HAVE_STATIC_CALL_INLINE option, which is a faster version of CONFIG_HAVE_STATIC_CALL. At runtime, the static call sites are patched directly, rather than using the out-of-line trampolines. Compared to out-of-line static calls, the performance benefits are more modest, but still measurable. Steven Rostedt did some tracepoint measurements: https://lkml.kernel.org/r/20181126155405.72b4f718@gandalf.local.home This code is heavily inspired by the jump label code (aka "static jumps"), as some of the concepts are very similar. For more details, see the comments in include/linux/static_call.h. [peterz: simplified interface; merged trampolines] Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20200818135804.684334440@infradead.org
2020-09-01vmlinux.lds.h: Add PGO and AutoFDO input sectionsNick Desaulniers1-1/+4
Basically, consider .text.{hot|unlikely|unknown}.* part of .text, too. When compiling with profiling information (collected via PGO instrumentations or AutoFDO sampling), Clang will separate code into .text.hot, .text.unlikely, or .text.unknown sections based on profiling information. After D79600 (clang-11), these sections will have a trailing `.` suffix, ie. .text.hot., .text.unlikely., .text.unknown.. When using -ffunction-sections together with profiling infomation, either explicitly (FGKASLR) or implicitly (LTO), code may be placed in sections following the convention: .text.hot.<foo>, .text.unlikely.<bar>, .text.unknown.<baz> where <foo>, <bar>, and <baz> are functions. (This produces one section per function; we generally try to merge these all back via linker script so that we don't have 50k sections). For the above cases, we need to teach our linker scripts that such sections might exist and that we'd explicitly like them grouped together, otherwise we can wind up with code outside of the _stext/_etext boundaries that might not be mapped properly for some architectures, resulting in boot failures. If the linker script is not told about possible input sections, then where the section is placed as output is a heuristic-laiden mess that's non-portable between linkers (ie. BFD and LLD), and has resulted in many hard to debug bugs. Kees Cook is working on cleaning this up by adding --orphan-handling=warn linker flag used in ARCH=powerpc to additional architectures. In the case of linker scripts, borrowing from the Zen of Python: explicit is better than implicit. Also, ld.bfd's internal linker script considers .text.hot AND .text.hot.* to be part of .text, as well as .text.unlikely and .text.unlikely.*. I didn't see support for .text.unknown.*, and didn't see Clang producing such code in our kernel builds, but I see code in LLVM that can produce such section names if profiling information is missing. That may point to a larger issue with generating or collecting profiles, but I would much rather be safe and explicit than have to debug yet another issue related to orphan section placement. Reported-by: Jian Cai <jiancai@google.com> Suggested-by: Fāng-ruì Sòng <maskray@google.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Luis Lozano <llozano@google.com> Tested-by: Manoj Gupta <manojgupta@google.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: linux-arch@vger.kernel.org Cc: stable@vger.kernel.org Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=add44f8d5c5c05e08b11e033127a744d61c26aee Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=1de778ed23ce7492c523d5850c6c6dbb34152655 Link: https://reviews.llvm.org/D79600 Link: https://bugs.chromium.org/p/chromium/issues/detail?id=1084760 Link: https://lore.kernel.org/r/20200821194310.3089815-7-keescook@chromium.org Debugged-by: Luis Lozano <llozano@google.com>
2020-09-01vmlinux.lds.h: Add .symtab, .strtab, and .shstrtab to ELF_DETAILSKees Cook1-1/+4
When linking vmlinux with LLD, the synthetic sections .symtab, .strtab, and .shstrtab are listed as orphaned. Add them to the ELF_DETAILS section so there will be no warnings when --orphan-handling=warn is used more widely. (They are added above comment as it is the more common order[1].) ld.lld: warning: <internal>:(.symtab) is being placed in '.symtab' ld.lld: warning: <internal>:(.shstrtab) is being placed in '.shstrtab' ld.lld: warning: <internal>:(.strtab) is being placed in '.strtab' [1] https://lore.kernel.org/lkml/20200622224928.o2a7jkq33guxfci4@google.com/ Reported-by: Fangrui Song <maskray@google.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: linux-arch@vger.kernel.org Link: https://lore.kernel.org/r/20200821194310.3089815-6-keescook@chromium.org
2020-09-01vmlinux.lds.h: Split ELF_DETAILS from STABS_DEBUGKees Cook1-2/+6
The .comment section doesn't belong in STABS_DEBUG. Split it out into a new macro named ELF_DETAILS. This will gain other non-debug sections that need to be accounted for when linking with --orphan-handling=warn. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: linux-arch@vger.kernel.org Link: https://lore.kernel.org/r/20200821194310.3089815-5-keescook@chromium.org
2020-09-01vmlinux.lds.h: Avoid KASAN and KCSAN's unwanted sectionsKees Cook1-0/+20
KASAN (-fsanitize=kernel-address) and KCSAN (-fsanitize=thread) produce unwanted[1] .eh_frame and .init_array.* sections. Add them to COMMON_DISCARDS, except with CONFIG_CONSTRUCTORS, which wants to keep .init_array.* sections. [1] https://bugs.llvm.org/show_bug.cgi?id=46478 Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Marco Elver <elver@google.com> Cc: linux-arch@vger.kernel.org Link: https://lore.kernel.org/r/20200821194310.3089815-4-keescook@chromium.org
2020-09-01vmlinux.lds.h: Add .gnu.version* to COMMON_DISCARDSKees Cook1-1/+3
For vmlinux linking, no architecture uses the .gnu.version* sections, so remove it via the COMMON_DISCARDS macro in preparation for adding --orphan-handling=warn more widely. This is a work-around for what appears to be a bug[1] in ld.bfd which warns for this synthetic section even when none is found in input objects, and even when no section is emitted for an output object[2]. [1] https://sourceware.org/bugzilla/show_bug.cgi?id=26153 [2] https://lore.kernel.org/lkml/202006221524.CEB86E036B@keescook/ Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Fangrui Song <maskray@google.com> Cc: linux-arch@vger.kernel.org Link: https://lore.kernel.org/r/20200821194310.3089815-3-keescook@chromium.org
2020-09-01vmlinux.lds.h: Create COMMON_DISCARDSKees Cook1-3/+6
Collect the common DISCARD sections for architectures that need more specialized discard control than what the standard DISCARDS section provides. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: linux-arch@vger.kernel.org Link: https://lore.kernel.org/r/20200821194310.3089815-2-keescook@chromium.org
2020-08-14include/asm-generic/vmlinux.lds.h: align ro_after_initRomain Naour1-0/+1
Since the patch [1], building the kernel using a toolchain built with binutils 2.33.1 prevents booting a sh4 system under Qemu. Apply the patch provided by Alan Modra [2] that fix alignment of rodata. [1] https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=ebd2263ba9a9124d93bbc0ece63d7e0fae89b40e [2] https://www.sourceware.org/ml/binutils/2019-12/msg00112.html Signed-off-by: Romain Naour <romain.naour@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alan Modra <amodra@gmail.com> Cc: Bin Meng <bin.meng@windriver.com> Cc: Chen Zhou <chenzhou10@huawei.com> Cc: Geert Uytterhoeven <geert+renesas@glider.be> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Krzysztof Kozlowski <krzk@kernel.org> Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Rich Felker <dalias@libc.org> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Arnd Bergmann <arnd@arndb.de> Cc: <stable@vger.kernel.org> Link: https://marc.info/?l=linux-sh&m=158429470221261 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-nextLinus Torvalds1-0/+4
Pull networking updates from David Miller: 1) Support 6Ghz band in ath11k driver, from Rajkumar Manoharan. 2) Support UDP segmentation in code TSO code, from Eric Dumazet. 3) Allow flashing different flash images in cxgb4 driver, from Vishal Kulkarni. 4) Add drop frames counter and flow status to tc flower offloading, from Po Liu. 5) Support n-tuple filters in cxgb4, from Vishal Kulkarni. 6) Various new indirect call avoidance, from Eric Dumazet and Brian Vazquez. 7) Fix BPF verifier failures on 32-bit pointer arithmetic, from Yonghong Song. 8) Support querying and setting hardware address of a port function via devlink, use this in mlx5, from Parav Pandit. 9) Support hw ipsec offload on bonding slaves, from Jarod Wilson. 10) Switch qca8k driver over to phylink, from Jonathan McDowell. 11) In bpftool, show list of processes holding BPF FD references to maps, programs, links, and btf objects. From Andrii Nakryiko. 12) Several conversions over to generic power management, from Vaibhav Gupta. 13) Add support for SO_KEEPALIVE et al. to bpf_setsockopt(), from Dmitry Yakunin. 14) Various https url conversions, from Alexander A. Klimov. 15) Timestamping and PHC support for mscc PHY driver, from Antoine Tenart. 16) Support bpf iterating over tcp and udp sockets, from Yonghong Song. 17) Support 5GBASE-T i40e NICs, from Aleksandr Loktionov. 18) Add kTLS RX HW offload support to mlx5e, from Tariq Toukan. 19) Fix the ->ndo_start_xmit() return type to be netdev_tx_t in several drivers. From Luc Van Oostenryck. 20) XDP support for xen-netfront, from Denis Kirjanov. 21) Support receive buffer autotuning in MPTCP, from Florian Westphal. 22) Support EF100 chip in sfc driver, from Edward Cree. 23) Add XDP support to mvpp2 driver, from Matteo Croce. 24) Support MPTCP in sock_diag, from Paolo Abeni. 25) Commonize UDP tunnel offloading code by creating udp_tunnel_nic infrastructure, from Jakub Kicinski. 26) Several pci_ --> dma_ API conversions, from Christophe JAILLET. 27) Add FLOW_ACTION_POLICE support to mlxsw, from Ido Schimmel. 28) Add SK_LOOKUP bpf program type, from Jakub Sitnicki. 29) Refactor a lot of networking socket option handling code in order to avoid set_fs() calls, from Christoph Hellwig. 30) Add rfc4884 support to icmp code, from Willem de Bruijn. 31) Support TBF offload in dpaa2-eth driver, from Ioana Ciornei. 32) Support XDP_REDIRECT in qede driver, from Alexander Lobakin. 33) Support PCI relaxed ordering in mlx5 driver, from Aya Levin. 34) Support TCP syncookies in MPTCP, from Flowian Westphal. 35) Fix several tricky cases of PMTU handling wrt. briding, from Stefano Brivio. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2056 commits) net: thunderx: initialize VF's mailbox mutex before first usage usb: hso: remove bogus check for EINPROGRESS usb: hso: no complaint about kmalloc failure hso: fix bailout in error case of probe ip_tunnel_core: Fix build for archs without _HAVE_ARCH_IPV6_CSUM selftests/net: relax cpu affinity requirement in msg_zerocopy test mptcp: be careful on subflow creation selftests: rtnetlink: make kci_test_encap() return sub-test result selftests: rtnetlink: correct the final return value for the test net: dsa: sja1105: use detected device id instead of DT one on mismatch tipc: set ub->ifindex for local ipv6 address ipv6: add ipv6_dev_find() net: openvswitch: silence suspicious RCU usage warning Revert "vxlan: fix tos value before xmit" ptp: only allow phase values lower than 1 period farsync: switch from 'pci_' to 'dma_' API wan: wanxl: switch from 'pci_' to 'dma_' API hv_netvsc: do not use VF device if link is down dpaa2-eth: Fix passing zero to 'PTR_ERR' warning net: macb: Properly handle phylink on at91sam9x ...
2020-08-05Merge tag 'char-misc-5.9-rc1' of ↵Linus Torvalds1-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver updates from Greg KH: "Here is the large set of char and misc and other driver subsystem patches for 5.9-rc1. Lots of new driver submissions in here, and cleanups and features for existing drivers. Highlights are: - habanalabs driver updates - coresight driver updates - nvmem driver updates - huge number of "W=1" build warning cleanups from Lee Jones - dyndbg updates - virtbox driver fixes and updates - soundwire driver updates - mei driver updates - phy driver updates - fpga driver updates - lots of smaller individual misc/char driver cleanups and fixes Full details are in the shortlog. All of these have been in linux-next with no reported issues" * tag 'char-misc-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (322 commits) habanalabs: remove unused but set variable 'ctx_asid' nvmem: qcom-spmi-sdam: Enable multiple devices dt-bindings: nvmem: SID: add binding for A100's SID controller nvmem: update Kconfig description nvmem: qfprom: Add fuse blowing support dt-bindings: nvmem: Add properties needed for blowing fuses dt-bindings: nvmem: qfprom: Convert to yaml nvmem: qfprom: use NVMEM_DEVID_AUTO for multiple instances nvmem: core: add support to auto devid nvmem: core: Add nvmem_cell_read_u8() nvmem: core: Grammar fixes for help text nvmem: sc27xx: add sc2730 efuse support nvmem: Enforce nvmem stride in the sysfs interface MAINTAINERS: Add git tree for NVMEM FRAMEWORK nvmem: sprd: Fix return value of sprd_efuse_probe() drivers: android: Fix the SPDX comment style drivers: android: Fix a variable declaration coding style issue drivers: android: Remove braces for a single statement if-else block drivers: android: Remove the use of else after return drivers: android: Fix a variable declaration coding style issue ...
2020-08-03Merge tag 'sched-core-2020-08-03' of ↵Linus Torvalds1-2/+22
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: - Improve uclamp performance by using a static key for the fast path - Add the "sched_util_clamp_min_rt_default" sysctl, to optimize for better power efficiency of RT tasks on battery powered devices. (The default is to maximize performance & reduce RT latencies.) - Improve utime and stime tracking accuracy, which had a fixed boundary of error, which created larger and larger relative errors as the values become larger. This is now replaced with more precise arithmetics, using the new mul_u64_u64_div_u64() helper in math64.h. - Improve the deadline scheduler, such as making it capacity aware - Improve frequency-invariant scheduling - Misc cleanups in energy/power aware scheduling - Add sched_update_nr_running tracepoint to track changes to nr_running - Documentation additions and updates - Misc cleanups and smaller fixes * tag 'sched-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits) sched/doc: Factorize bits between sched-energy.rst & sched-capacity.rst sched/doc: Document capacity aware scheduling sched: Document arch_scale_*_capacity() arm, arm64: Fix selection of CONFIG_SCHED_THERMAL_PRESSURE Documentation/sysctl: Document uclamp sysctl knobs sched/uclamp: Add a new sysctl to control RT default boost value sched/uclamp: Fix a deadlock when enabling uclamp static key sched: Remove duplicated tick_nohz_full_enabled() check sched: Fix a typo in a comment sched/uclamp: Remove unnecessary mutex_init() arm, arm64: Select CONFIG_SCHED_THERMAL_PRESSURE sched: Cleanup SCHED_THERMAL_PRESSURE kconfig entry arch_topology, sched/core: Cleanup thermal pressure definition trace/events/sched.h: fix duplicated word linux/sched/mm.h: drop duplicated words in comments smp: Fix a potential usage of stale nr_cpus sched/fair: update_pick_idlest() Select group with lowest group_util when idle_cpus are equal sched: nohz: stop passing around unused "ticks" parameter. sched: Better document ttwu() sched: Add a tracepoint to track rq->nr_running ...
2020-07-27Merge 5.8-rc7 into char-misc-nextGreg Kroah-Hartman1-1/+4
This should resolve the merge/build issues reported when trying to create linux-next. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller1-1/+4
The UDP reuseport conflict was a little bit tricky. The net-next code, via bpf-next, extracted the reuseport handling into a helper so that the BPF sk lookup code could invoke it. At the same time, the logic for reuseport handling of unconnected sockets changed via commit efc6b6f6c3113e8b203b9debfb72d81e0f3dcace which changed the logic to carry on the reuseport result into the rest of the lookup loop if we do not return immediately. This requires moving the reuseport_has_conns() logic into the callers. While we are here, get rid of inline directives as they do not belong in foo.c files. The other changes were cases of more straightforward overlapping modifications. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24dyndbg: rename __verbose section to __dyndbgJim Cromie1-3/+3
dyndbg populates its callsite info into __verbose section, change that to a more specific and descriptive name, __dyndbg. Also, per checkpatch: simplify __attribute(..) to __section(__dyndbg) declaration. and 1 spelling fix, decriptor Acked-by: <jbaron@akamai.com> Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Link: https://lore.kernel.org/r/20200719231058.1586423-6-jim.cromie@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-22x86, vmlinux.lds: Page-align end of ..page_aligned sectionsJoerg Roedel1-1/+4
On x86-32 the idt_table with 256 entries needs only 2048 bytes. It is page-aligned, but the end of the .bss..page_aligned section is not guaranteed to be page-aligned. As a result, objects from other .bss sections may end up on the same 4k page as the idt_table, and will accidentially get mapped read-only during boot, causing unexpected page-faults when the kernel writes to them. This could be worked around by making the objects in the page aligned sections page sized, but that's wrong. Explicit sections which store only page aligned objects have an implicit guarantee that the object is alone in the page in which it is placed. That works for all objects except the last one. That's inconsistent. Enforcing page sized objects for these sections would wreckage memory sanitizers, because the object becomes artificially larger than it should be and out of bound access becomes legit. Align the end of the .bss..page_aligned and .data..page_aligned section on page-size so all objects places in these sections are guaranteed to have their own page. [ tglx: Amended changelog ] Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20200721093448.10417-1-joro@8bytes.org
2020-07-13bpf: Add BTF_ID_LIST/BTF_ID/BTF_ID_UNUSED macrosJiri Olsa1-0/+4
Adding support to generate .BTF_ids section that will hold BTF ID lists for verifier. Adding macros that will help to define lists of BTF ID values placed in .BTF_ids section. They are initially filled with zeros (during compilation) and resolved later during the linking phase by resolve_btfids tool. Following defines list of one BTF ID value: BTF_ID_LIST(bpf_skb_output_btf_ids) BTF_ID(struct, sk_buff) It also defines following variable to access the list: extern u32 bpf_skb_output_btf_ids[]; The BTF_ID_UNUSED macro defines 4 zero bytes. It's used when we want to define 'unused' entry in BTF_ID_LIST, like: BTF_ID_LIST(bpf_skb_output_btf_ids) BTF_ID(struct, sk_buff) BTF_ID_UNUSED BTF_ID(struct, task_struct) Suggested-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200711215329.41165-4-jolsa@kernel.org
2020-07-08sched, vmlinux.lds: Increase STRUCT_ALIGNMENT to 64 bytes for GCC-4.9Peter Zijlstra1-7/+11
For some mysterious reason GCC-4.9 has a 64 byte section alignment for structures, all other GCC versions (and Clang) tested (including 4.8 and 5.0) are fine with the 32 bytes alignment. Getting this right is important for the new SCHED_DATA macro that creates an explicitly ordered array of 'struct sched_class' in the linker script and expect pointer arithmetic to work. Fixes: c3a340f7e7ea ("sched: Have sched_class_highest define by vmlinux.lds.h") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200630144905.GX4817@hirez.programming.kicks-ass.net
2020-06-25sched: Have sched_class_highest define by vmlinux.lds.hSteven Rostedt (VMware)1-1/+4
Now that the sched_class descriptors are defined by the linker script, and this needs to be aware of the existance of stop_sched_class when SMP is enabled or not, as it is used as the "highest" priority when defined. Move the declaration of sched_class_highest to the same location in the linker script that inserts stop_sched_class, and this will also make it easier to see what should be defined as the highest class, as this linker script location defines the priorities as well. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20191219214558.682913590@goodmis.org
2020-06-25sched: Force the address order of each sched class descriptorSteven Rostedt (VMware)1-0/+13
In order to make a micro optimization in pick_next_task(), the order of the sched class descriptor address must be in the same order as their priority to each other. That is: &idle_sched_class < &fair_sched_class < &rt_sched_class < &dl_sched_class < &stop_sched_class In order to guarantee this order of the sched class descriptors, add each one into their own data section and force the order in the linker script. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/157675913272.349305.8936736338884044103.stgit@localhost.localdomain
2020-05-19vmlinux.lds.h: Create section for protection against instrumentationThomas Gleixner1-0/+10
Some code pathes, especially the low level entry code, must be protected against instrumentation for various reasons: - Low level entry code can be a fragile beast, especially on x86. - With NO_HZ_FULL RCU state needs to be established before using it. Having a dedicated section for such code allows to validate with tooling that no unsafe functions are invoked. Add the .noinstr.text section and the noinstr attribute to mark functions. noinstr implies notrace. Kprobes will gain a section check later. Provide also a set of markers: instrumentation_begin()/end() These are used to mark code inside a noinstr function which calls into regular instrumentable text section as safe. The instrumentation markers are only active when CONFIG_DEBUG_ENTRY is enabled as the end marker emits a NOP to prevent the compiler from merging the annotation points. This means the objtool verification requires a kernel compiled with this option. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200505134100.075416272@linutronix.de
2020-03-31Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-nextLinus Torvalds1-0/+15
Pull networking updates from David Miller: "Highlights: 1) Fix the iwlwifi regression, from Johannes Berg. 2) Support BSS coloring and 802.11 encapsulation offloading in hardware, from John Crispin. 3) Fix some potential Spectre issues in qtnfmac, from Sergey Matyukevich. 4) Add TTL decrement action to openvswitch, from Matteo Croce. 5) Allow paralleization through flow_action setup by not taking the RTNL mutex, from Vlad Buslov. 6) A lot of zero-length array to flexible-array conversions, from Gustavo A. R. Silva. 7) Align XDP statistics names across several drivers for consistency, from Lorenzo Bianconi. 8) Add various pieces of infrastructure for offloading conntrack, and make use of it in mlx5 driver, from Paul Blakey. 9) Allow using listening sockets in BPF sockmap, from Jakub Sitnicki. 10) Lots of parallelization improvements during configuration changes in mlxsw driver, from Ido Schimmel. 11) Add support to devlink for generic packet traps, which report packets dropped during ACL processing. And use them in mlxsw driver. From Jiri Pirko. 12) Support bcmgenet on ACPI, from Jeremy Linton. 13) Make BPF compatible with RT, from Thomas Gleixnet, Alexei Starovoitov, and your's truly. 14) Support XDP meta-data in virtio_net, from Yuya Kusakabe. 15) Fix sysfs permissions when network devices change namespaces, from Christian Brauner. 16) Add a flags element to ethtool_ops so that drivers can more simply indicate which coalescing parameters they actually support, and therefore the generic layer can validate the user's ethtool request. Use this in all drivers, from Jakub Kicinski. 17) Offload FIFO qdisc in mlxsw, from Petr Machata. 18) Support UDP sockets in sockmap, from Lorenz Bauer. 19) Fix stretch ACK bugs in several TCP congestion control modules, from Pengcheng Yang. 20) Support virtual functiosn in octeontx2 driver, from Tomasz Duszynski. 21) Add region operations for devlink and use it in ice driver to dump NVM contents, from Jacob Keller. 22) Add support for hw offload of MACSEC, from Antoine Tenart. 23) Add support for BPF programs that can be attached to LSM hooks, from KP Singh. 24) Support for multiple paths, path managers, and counters in MPTCP. From Peter Krystad, Paolo Abeni, Florian Westphal, Davide Caratti, and others. 25) More progress on adding the netlink interface to ethtool, from Michal Kubecek" * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2121 commits) net: ipv6: rpl_iptunnel: Fix potential memory leak in rpl_do_srh_inline cxgb4/chcr: nic-tls stats in ethtool net: dsa: fix oops while probing Marvell DSA switches net/bpfilter: remove superfluous testing message net: macb: Fix handling of fixed-link node net: dsa: ksz: Select KSZ protocol tag netdevsim: dev: Fix memory leak in nsim_dev_take_snapshot_write net: stmmac: add EHL 2.5Gbps PCI info and PCI ID net: stmmac: add EHL PSE0 & PSE1 1Gbps PCI info and PCI ID net: stmmac: create dwmac-intel.c to contain all Intel platform net: dsa: bcm_sf2: Support specifying VLAN tag egress rule net: dsa: bcm_sf2: Add support for matching VLAN TCI net: dsa: bcm_sf2: Move writing of CFP_DATA(5) into slicing functions net: dsa: bcm_sf2: Check earlier for FLOW_EXT and FLOW_MAC_EXT net: dsa: bcm_sf2: Disable learning for ASP port net: dsa: b53: Deny enslaving port 7 for 7278 into a bridge net: dsa: b53: Prevent tagged VLAN on port 7 for 7278 net: dsa: b53: Restore VLAN entries upon (re)configuration net: dsa: bcm_sf2: Fix overflow checks hv_netvsc: Remove unnecessary round_up for recv_completion_cnt ...
2020-03-27x86, vmlinux.lds: Add RUNTIME_DISCARD_EXIT to generic DISCARDSH.J. Lu1-2/+9
In the x86 kernel, .exit.text and .exit.data sections are discarded at runtime, not by the linker. Add RUNTIME_DISCARD_EXIT to generic DISCARDS and define it in the x86 kernel linker script to keep them. The sections are added before the DISCARD directive so document here only the situation explicitly as this change doesn't have any effect on the generated kernel. Also, other architectures like ARM64 will use it too so generalize the approach with the RUNTIME_DISCARD_EXIT define. [ bp: Massage and extend commit message. ] Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20200326193021.255002-1-hjl.tools@gmail.com
2020-03-19bpf: Support llvm-objcopy for vmlinux BTFFangrui Song1-0/+15
Simplify gen_btf logic to make it work with llvm-objcopy. The existing 'file format' and 'architecture' parsing logic is brittle and does not work with llvm-objcopy/llvm-objdump. 'file format' output of llvm-objdump>=11 will match GNU objdump, but 'architecture' (bfdarch) may not. .BTF in .tmp_vmlinux.btf is non-SHF_ALLOC. Add the SHF_ALLOC flag because it is part of vmlinux image used for introspection. C code can reference the section via linker script defined __start_BTF and __stop_BTF. This fixes a small problem that previous .BTF had the SHF_WRITE flag (objcopy -I binary -O elf* synthesized .data). Additionally, `objcopy -I binary` synthesized symbols _binary__btf_vmlinux_bin_start and _binary__btf_vmlinux_bin_stop (not used elsewhere) are replaced with more commonplace __start_BTF and __stop_BTF. Add 2>/dev/null because GNU objcopy (but not llvm-objcopy) warns "empty loadable segment detected at vaddr=0xffffffff81000000, is this intentional?" We use a dd command to change the e_type field in the ELF header from ET_EXEC to ET_REL so that lld will accept .btf.vmlinux.bin.o. Accepting ET_EXEC as an input file is an extremely rare GNU ld feature that lld does not intend to support, because this is error-prone. The output section description .BTF in include/asm-generic/vmlinux.lds.h avoids potential subtle orphan section placement issues and suppresses --orphan-handling=warn warnings. Fixes: df786c9b9476 ("bpf: Force .BTF section start to zero when dumping from vmlinux") Fixes: cb0cc635c7a9 ("powerpc: Include .BTF section") Reported-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Fangrui Song <maskray@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Stanislav Fomichev <sdf@google.com> Tested-by: Andrii Nakryiko <andriin@fb.com> Reviewed-by: Stanislav Fomichev <sdf@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Link: https://github.com/ClangBuiltLinux/linux/issues/871 Link: https://lore.kernel.org/bpf/20200318222746.173648-1-maskray@google.com
2019-11-27Merge tag 'trace-v5.5' of ↵Linus Torvalds1-2/+11
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "New tracing features: - New PERMANENT flag to ftrace_ops when attaching a callback to a function. As /proc/sys/kernel/ftrace_enabled when set to zero will disable all attached callbacks in ftrace, this has a detrimental impact on live kernel tracing, as it disables all that it patched. If a ftrace_ops is registered to ftrace with the PERMANENT flag set, it will prevent ftrace_enabled from being disabled, and if ftrace_enabled is already disabled, it will prevent a ftrace_ops with PREMANENT flag set from being registered. - New register_ftrace_direct(). As eBPF would like to register its own trampolines to be called by the ftrace nop locations directly, without going through the ftrace trampoline, this function has been added. This allows for eBPF trampolines to live along side of ftrace, perf, kprobe and live patching. It also utilizes the ftrace enabled_functions file that keeps track of functions that have been modified in the kernel, to allow for security auditing. - Allow for kernel internal use of ftrace instances. Subsystems in the kernel can now create and destroy their own tracing instances which allows them to have their own tracing buffer, and be able to record events without worrying about other users from writing over their data. - New seq_buf_hex_dump() that lets users use the hex_dump() in their seq_buf usage. - Notifications now added to tracing_max_latency to allow user space to know when a new max latency is hit by one of the latency tracers. - Wider spread use of generic compare operations for use of bsearch and friends. - More synthetic event fields may be defined (32 up from 16) - Use of xarray for architectures with sparse system calls, for the system call trace events. This along with small clean ups and fixes" * tag 'trace-v5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (51 commits) tracing: Enable syscall optimization for MIPS tracing: Use xarray for syscall trace events tracing: Sample module to demonstrate kernel access to Ftrace instances. tracing: Adding new functions for kernel access to Ftrace instances tracing: Fix Kconfig indentation ring-buffer: Fix typos in function ring_buffer_producer ftrace: Use BIT() macro ftrace: Return ENOTSUPP when DYNAMIC_FTRACE_WITH_DIRECT_CALLS is not configured ftrace: Rename ftrace_graph_stub to ftrace_stub_graph ftrace: Add a helper function to modify_ftrace_direct() to allow arch optimization ftrace: Add helper find_direct_entry() to consolidate code ftrace: Add another check for match in register_ftrace_direct() ftrace: Fix accounting bug with direct->count in register_ftrace_direct() ftrace/selftests: Fix spelling mistake "wakeing" -> "waking" tracing: Increase SYNTH_FIELDS_MAX for synthetic_events ftrace/samples: Add a sample module that implements modify_ftrace_direct() ftrace: Add modify_ftrace_direct() tracing: Add missing "inline" in stub function of latency_fsnotify() tracing: Remove stray tab in TRACE_EVAL_MAP_FILE's help text tracing: Use seq_buf_hex_dump() to dump buffers ...
2019-11-26Merge branch 'x86-asm-for-linus' of ↵Linus Torvalds1-14/+39
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 asm updates from Ingo Molnar: "The main changes in this cycle were: - Cross-arch changes to move the linker sections for NOTES and EXCEPTION_TABLE into the RO_DATA area, where they belong on most architectures. (Kees Cook) - Switch the x86 linker fill byte from x90 (NOP) to 0xcc (INT3), to trap jumps into the middle of those padding areas instead of sliding execution. (Kees Cook) - A thorough cleanup of symbol definitions within x86 assembler code. The rather randomly named macros got streamlined around a (hopefully) straightforward naming scheme: SYM_START(name, linkage, align...) SYM_END(name, sym_type) SYM_FUNC_START(name) SYM_FUNC_END(name) SYM_CODE_START(name) SYM_CODE_END(name) SYM_DATA_START(name) SYM_DATA_END(name) etc - with about three times of these basic primitives with some label, local symbol or attribute variant, expressed via postfixes. No change in functionality intended. (Jiri Slaby) - Misc other changes, cleanups and smaller fixes" * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (67 commits) x86/entry/64: Remove pointless jump in paranoid_exit x86/entry/32: Remove unused resume_userspace label x86/build/vdso: Remove meaningless CFLAGS_REMOVE_*.o m68k: Convert missed RODATA to RO_DATA x86/vmlinux: Use INT3 instead of NOP for linker fill bytes x86/mm: Report actual image regions in /proc/iomem x86/mm: Report which part of kernel image is freed x86/mm: Remove redundant address-of operators on addresses xtensa: Move EXCEPTION_TABLE to RO_DATA segment powerpc: Move EXCEPTION_TABLE to RO_DATA segment parisc: Move EXCEPTION_TABLE to RO_DATA segment microblaze: Move EXCEPTION_TABLE to RO_DATA segment ia64: Move EXCEPTION_TABLE to RO_DATA segment h8300: Move EXCEPTION_TABLE to RO_DATA segment c6x: Move EXCEPTION_TABLE to RO_DATA segment arm64: Move EXCEPTION_TABLE to RO_DATA segment alpha: Move EXCEPTION_TABLE to RO_DATA segment x86/vmlinux: Move EXCEPTION_TABLE to RO_DATA segment x86/vmlinux: Actually use _etext for the end of the text segment vmlinux.lds.h: Allow EXCEPTION_TABLE to live in RO_DATA ...
2019-11-18ftrace: Rename ftrace_graph_stub to ftrace_stub_graphSteven Rostedt (VMware)1-4/+4
The ftrace_graph_stub was created and points to ftrace_stub as a way to assign the functon graph tracer function pointer to a stub function with a different prototype than what ftrace_stub has and not trigger the C verifier. The ftrace_graph_stub was created via the linker script vmlinux.lds.h. Unfortunately, powerpc already uses the name ftrace_graph_stub for its internal implementation of the function graph tracer, and even though powerpc would still build, the change via the linker script broke function tracer on powerpc from working. By using the name ftrace_stub_graph, which does not exist anywhere else in the kernel, this should not be a problem. Link: https://lore.kernel.org/r/1573849732.5937.136.camel@lca.pw Fixes: b83b43ffc6e4 ("fgraph: Fix function type mismatches of ftrace_graph_return using ftrace_stub") Reorted-by: Qian Cai <cai@lca.pw> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-11-14fgraph: Fix function type mismatches of ftrace_graph_return using ftrace_stubSteven Rostedt (VMware)1-3/+14
The C compiler is allowing more checks to make sure that function pointers are assigned to the correct prototype function. Unfortunately, the function graph tracer uses a special name with its assigned ftrace_graph_return function pointer that maps to a stub function used by the function tracer (ftrace_stub). The ftrace_graph_return variable is compared to the ftrace_stub in some archs to know if the function graph tracer is enabled or not. This means we can not just simply create a new function stub that compares it without modifying all the archs. Instead, have the linker script create a function_graph_stub that maps to ftrace_stub, and this way we can define the prototype for it to match the prototype of ftrace_graph_return, and make the compiler checks all happy! Link: http://lkml.kernel.org/r/20191015090055.789a0aed@gandalf.local.home Cc: linux-sh@vger.kernel.org Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Reported-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-11-06module/ftrace: handle patchable-function-entryMark Rutland1-7/+7
When using patchable-function-entry, the compiler will record the callsites into a section named "__patchable_function_entries" rather than "__mcount_loc". Let's abstract this difference behind a new FTRACE_CALLSITE_SECTION, so that architectures don't have to handle this explicitly (e.g. with custom module linker scripts). As parisc currently handles this explicitly, it is fixed up accordingly, with its custom linker script removed. Since FTRACE_CALLSITE_SECTION is only defined when DYNAMIC_FTRACE is selected, the parisc module loading code is updated to only use the definition in that case. When DYNAMIC_FTRACE is not selected, modules shouldn't have this section, so this removes some redundant work in that case. To make sure that this is keep up-to-date for modules and the main kernel, a comment is added to vmlinux.lds.h, with the existing ifdeffery simplified for legibility. I built parisc generic-{32,64}bit_defconfig with DYNAMIC_FTRACE enabled, and verified that the section made it into the .ko files for modules. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Helge Deller <deller@gmx.de> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Torsten Duwe <duwe@suse.de> Tested-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Tested-by: Sven Schnelle <svens@stackframe.org> Tested-by: Torsten Duwe <duwe@suse.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> Cc: Jessica Yu <jeyu@kernel.org> Cc: linux-parisc@vger.kernel.org
2019-11-04vmlinux.lds.h: Allow EXCEPTION_TABLE to live in RO_DATAKees Cook1-0/+12
Many architectures have an EXCEPTION_TABLE that needs to be only readable. As such, it should live in RO_DATA. Create a macro to identify this case for the architectures that can move EXCEPTION_TABLE into RO_DATA. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Will Deacon <will@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: linux-alpha@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-c6x-dev@linux-c6x.org Cc: linux-ia64@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-s390@vger.kernel.org Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Cc: x86-ml <x86@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: https://lkml.kernel.org/r/20191029211351.13243-15-keescook@chromium.org
2019-11-04vmlinux.lds.h: Replace RW_DATA_SECTION with RW_DATAKees Cook1-2/+2
Rename RW_DATA_SECTION to RW_DATA. (Calling this a "section" is a lie, since it's multiple sections and section flags cannot be applied to the macro.) Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # s390 Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: linux-alpha@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-c6x-dev@linux-c6x.org Cc: linux-ia64@vger.kernel.org Cc: linux-s390@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Cc: Will Deacon <will@kernel.org> Cc: x86-ml <x86@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: https://lkml.kernel.org/r/20191029211351.13243-14-keescook@chromium.org
2019-11-04vmlinux.lds.h: Replace RO_DATA_SECTION with RO_DATAKees Cook1-5/+2
Finish renaming RO_DATA_SECTION to RO_DATA. (Calling this a "section" is a lie, since it's multiple sections and section flags cannot be applied to the macro.) Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # s390 Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: linux-alpha@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-c6x-dev@linux-c6x.org Cc: linux-ia64@vger.kernel.org Cc: linux-s390@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Cc: Will Deacon <will@kernel.org> Cc: x86-ml <x86@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: https://lkml.kernel.org/r/20191029211351.13243-13-keescook@chromium.org
2019-11-04vmlinux.lds.h: Replace RODATA with RO_DATAKees Cook1-3/+1
There's no reason to keep the RODATA macro: replace the callers with the expected RO_DATA macro. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: linux-alpha@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-c6x-dev@linux-c6x.org Cc: linux-ia64@vger.kernel.org Cc: linux-s390@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Cc: Will Deacon <will@kernel.org> Cc: x86-ml <x86@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: https://lkml.kernel.org/r/20191029211351.13243-12-keescook@chromium.org
2019-11-04vmlinux.lds.h: Move NOTES into RO_DATAKees Cook1-4/+5
The .notes section should be non-executable read-only data. As such, move it to the RO_DATA macro instead of being per-architecture defined. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # s390 Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: linux-alpha@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-c6x-dev@linux-c6x.org Cc: linux-ia64@vger.kernel.org Cc: linux-s390@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Cc: Will Deacon <will@kernel.org> Cc: x86-ml <x86@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: https://lkml.kernel.org/r/20191029211351.13243-11-keescook@chromium.org
2019-11-04vmlinux.lds.h: Move Program Header restoration into NOTES macroKees Cook1-2/+11
In preparation for moving NOTES into RO_DATA, make the Program Header assignment restoration be part of the NOTES macro itself. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # s390 Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: linux-alpha@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-c6x-dev@linux-c6x.org Cc: linux-ia64@vger.kernel.org Cc: linux-s390@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Cc: Will Deacon <will@kernel.org> Cc: x86-ml <x86@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: https://lkml.kernel.org/r/20191029211351.13243-10-keescook@chromium.org
2019-11-04vmlinux.lds.h: Provide EMIT_PT_NOTE to indicate export of .notesKees Cook1-0/+8
In preparation for moving NOTES into RO_DATA, provide a mechanism for architectures that want to emit a PT_NOTE Program Header to do so. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # s390 Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: linux-alpha@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-c6x-dev@linux-c6x.org Cc: linux-ia64@vger.kernel.org Cc: linux-s390@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Cc: Will Deacon <will@kernel.org> Cc: x86-ml <x86@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: https://lkml.kernel.org/r/20191029211351.13243-9-keescook@chromium.org
2019-09-28Merge branch 'next-lockdown' of ↵Linus Torvalds1-1/+7
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull kernel lockdown mode from James Morris: "This is the latest iteration of the kernel lockdown patchset, from Matthew Garrett, David Howells and others. From the original description: This patchset introduces an optional kernel lockdown feature, intended to strengthen the boundary between UID 0 and the kernel. When enabled, various pieces of kernel functionality are restricted. Applications that rely on low-level access to either hardware or the kernel may cease working as a result - therefore this should not be enabled without appropriate evaluation beforehand. The majority of mainstream distributions have been carrying variants of this patchset for many years now, so there's value in providing a doesn't meet every distribution requirement, but gets us much closer to not requiring external patches. There are two major changes since this was last proposed for mainline: - Separating lockdown from EFI secure boot. Background discussion is covered here: https://lwn.net/Articles/751061/ - Implementation as an LSM, with a default stackable lockdown LSM module. This allows the lockdown feature to be policy-driven, rather than encoding an implicit policy within the mechanism. The new locked_down LSM hook is provided to allow LSMs to make a policy decision around whether kernel functionality that would allow tampering with or examining the runtime state of the kernel should be permitted. The included lockdown LSM provides an implementation with a simple policy intended for general purpose use. This policy provides a coarse level of granularity, controllable via the kernel command line: lockdown={integrity|confidentiality} Enable the kernel lockdown feature. If set to integrity, kernel features that allow userland to modify the running kernel are disabled. If set to confidentiality, kernel features that allow userland to extract confidential information from the kernel are also disabled. This may also be controlled via /sys/kernel/security/lockdown and overriden by kernel configuration. New or existing LSMs may implement finer-grained controls of the lockdown features. Refer to the lockdown_reason documentation in include/linux/security.h for details. The lockdown feature has had signficant design feedback and review across many subsystems. This code has been in linux-next for some weeks, with a few fixes applied along the way. Stephen Rothwell noted that commit 9d1f8be5cf42 ("bpf: Restrict bpf when kernel lockdown is in confidentiality mode") is missing a Signed-off-by from its author. Matthew responded that he is providing this under category (c) of the DCO" * 'next-lockdown' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (31 commits) kexec: Fix file verification on S390 security: constify some arrays in lockdown LSM lockdown: Print current->comm in restriction messages efi: Restrict efivar_ssdt_load when the kernel is locked down tracefs: Restrict tracefs when the kernel is locked down debugfs: Restrict debugfs when the kernel is locked down kexec: Allow kexec_file() with appropriate IMA policy when locked down lockdown: Lock down perf when in confidentiality mode bpf: Restrict bpf when kernel lockdown is in confidentiality mode lockdown: Lock down tracing and perf kprobes when in confidentiality mode lockdown: Lock down /proc/kcore x86/mmiotrace: Lock down the testmmiotrace module lockdown: Lock down module params that specify hardware parameters (eg. ioport) lockdown: Lock down TIOCSSERIAL lockdown: Prohibit PCMCIA CIS storage when the kernel is locked down acpi: Disable ACPI table override if the kernel is locked down acpi: Ignore acpi_rsdp kernel param when the kernel has been locked down ACPI: Limit access to custom_method when the kernel is locked down x86/msr: Restrict MSR access when the kernel is locked down x86: Lock down IO port access when the kernel is locked down ...
2019-08-19security: Support early LSMsMatthew Garrett1-1/+7
The lockdown module is intended to allow for kernels to be locked down early in boot - sufficiently early that we don't have the ability to kmalloc() yet. Add support for early initialisation of some LSMs, and then add them to the list of names when we do full initialisation later. Early LSMs are initialised in link order and cannot be overridden via boot parameters, and cannot make use of kmalloc() (since the allocator isn't initialised yet). (Fixed by Stephen Rothwell to include a stub to fix builds when !CONFIG_SECURITY) Signed-off-by: Matthew Garrett <mjg59@google.com> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: James Morris <jmorris@namei.org>
2019-07-17Merge branch 'next' of ↵Linus Torvalds1-0/+11
git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux Pull thermal management updates from Zhang Rui: - Convert thermal documents to ReST (Mauro Carvalho Chehab) - Fix a cyclic depedency in between thermal core and governors (Daniel Lezcano) - Fix processor_thermal_device driver to re-evaluate power limits after resume (Srinivas Pandruvada, Zhang Rui) * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: drivers: thermal: processor_thermal_device: Fix build warning docs: thermal: convert to ReST thermal/drivers/core: Use governor table to initialize thermal/drivers/core: Add init section table for self-encapsulation drivers: thermal: processor_thermal: Read PPCC on resume
2019-06-27thermal/drivers/core: Add init section table for self-encapsulationDaniel Lezcano1-0/+11
Currently the governors are declared in their respective files but they export their [un]register functions which in turn call the [un]register governors core's functions. That implies a cyclic dependency which is not desirable. There is a way to self-encapsulate the governors by letting them to declare themselves in a __init section table. Define the table in the asm generic linker description like the other tables and provide the specific macros to deal with. Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2019-06-08parisc: add dynamic ftraceSven Schnelle1-0/+7
This patch implements dynamic ftrace for PA-RISC. The required mcount call sequences can get pretty long, so instead of patching the whole call sequence out of the functions, we are using -fpatchable-function-entry from gcc. This puts a configurable amount of NOPS before/at the start of the function. Taking do_sys_open() as example, which would look like this when the call is patched out: 1036b248: 08 00 02 40 nop 1036b24c: 08 00 02 40 nop 1036b250: 08 00 02 40 nop 1036b254: 08 00 02 40 nop 1036b258 <do_sys_open>: 1036b258: 08 00 02 40 nop 1036b25c: 08 03 02 41 copy r3,r1 1036b260: 6b c2 3f d9 stw rp,-14(sp) 1036b264: 08 1e 02 43 copy sp,r3 1036b268: 6f c1 01 00 stw,ma r1,80(sp) When ftrace gets enabled for this function the kernel will patch these NOPs to: 1036b248: 10 19 57 20 <address of ftrace> 1036b24c: 6f c1 00 80 stw,ma r1,40(sp) 1036b250: 48 21 3f d1 ldw -18(r1),r1 1036b254: e8 20 c0 02 bv,n r0(r1) 1036b258 <do_sys_open>: 1036b258: e8 3f 1f df b,l,n .-c,r1 1036b25c: 08 03 02 41 copy r3,r1 1036b260: 6b c2 3f d9 stw rp,-14(sp) 1036b264: 08 1e 02 43 copy sp,r3 1036b268: 6f c1 01 00 stw,ma r1,80(sp) So the first NOP in do_sys_open() will be patched to jump backwards into some minimal trampoline code which pushes a stackframe, saves r1 which holds the return address, loads the address of the real ftrace function, and branches to that location. For 64 Bit things are getting a bit more complicated (and longer) because we must make sure that the address of ftrace location is 8 byte aligned, and the offset passed to ldd for fetching the address is 8 byte aligned as well. Note that gcc has a bug which misplaces the function label, and needs a patch to make dynamic ftrace work. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90751 for details. Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>
2019-05-14Merge tag 'modules-for-v5.2' of ↵Linus Torvalds1-1/+0
git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux Pull modules updates from Jessica Yu: - Use a separate table to store symbol types instead of hijacking fields in struct Elf_Sym - Trivial code cleanups * tag 'modules-for-v5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux: module: add stubs for within_module functions kallsyms: store type information in its own array vmlinux.lds.h: drop unused __vermagic
2019-05-07moduleparam: Save information about built-in modules in separate fileAlexey Gladkov1-0/+1
Problem: When a kernel module is compiled as a separate module, some important information about the kernel module is available via .modinfo section of the module. In contrast, when the kernel module is compiled into the kernel, that information is not available. Information about built-in modules is necessary in the following cases: 1. When it is necessary to find out what additional parameters can be passed to the kernel at boot time. 2. When you need to know which module names and their aliases are in the kernel. This is very useful for creating an initrd image. Proposal: The proposed patch does not remove .modinfo section with module information from the vmlinux at the build time and saves it into a separate file after kernel linking. So, the kernel does not increase in size and no additional information remains in it. Information is stored in the same format as in the separate modules (null-terminated string array). Because the .modinfo section is already exported with a separate modules, we are not creating a new API. It can be easily read in the userspace: $ tr '\0' '\n' < modules.builtin.modinfo ext4.softdep=pre: crc32c ext4.license=GPL ext4.description=Fourth Extended Filesystem ext4.author=Remy Card, Stephen Tweedie, Andrew Morton, Andreas Dilger, Theodore Ts'o and others ext4.alias=fs-ext4 ext4.alias=ext3 ext4.alias=fs-ext3 ext4.alias=ext2 ext4.alias=fs-ext2 md_mod.alias=block-major-9-* md_mod.alias=md md_mod.description=MD RAID framework md_mod.license=GPL md_mod.parmtype=create_on_open:bool md_mod.parmtype=start_dirty_degraded:int ... Co-Developed-by: Gleb Fotengauer-Malinovskiy <glebfm@altlinux.org> Signed-off-by: Gleb Fotengauer-Malinovskiy <glebfm@altlinux.org> Signed-off-by: Alexey Gladkov <gladkov.alexey@gmail.com> Acked-by: Jessica Yu <jeyu@kernel.org> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-03-18vmlinux.lds.h: drop unused __vermagicMathias Krause1-1/+0
The reference to '__vermagic' is a relict from v2.5 times. And even there it had a very short life time, from v2.5.59 (commit 1d411b80ee18 ("Module Sanity Check") in the historic tree) to v2.5.69 (commit 67ac5b866bda ("[PATCH] complete modinfo section")). Neither current kernels nor modules contain a '__vermagic' section any more, so get rid of it. Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jessica Yu <jeyu@kernel.org> Signed-off-by: Jessica Yu <jeyu@kernel.org>