summaryrefslogtreecommitdiff
path: root/include
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2023-04-27 16:36:55 -0700
committerLinus Torvalds <torvalds@linux-foundation.org>2023-04-27 16:36:55 -0700
commitb6a7828502dc769e1a5329027bc5048222fa210a (patch)
tree60418229584831505036bd2d368320b7387e7b3a /include
parentd06f5a3f7140921ada47d49574ae6fa4de5e2a89 (diff)
parent8660484ed1cf3261e89e0bad94c6395597e87599 (diff)
Merge tag 'modules-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux
Pull module updates from Luis Chamberlain: "The summary of the changes for this pull requests is: - Song Liu's new struct module_memory replacement - Nick Alcock's MODULE_LICENSE() removal for non-modules - My cleanups and enhancements to reduce the areas where we vmalloc module memory for duplicates, and the respective debug code which proves the remaining vmalloc pressure comes from userspace. Most of the changes have been in linux-next for quite some time except the minor fixes I made to check if a module was already loaded prior to allocating the final module memory with vmalloc and the respective debug code it introduces to help clarify the issue. Although the functional change is small it is rather safe as it can only *help* reduce vmalloc space for duplicates and is confirmed to fix a bootup issue with over 400 CPUs with KASAN enabled. I don't expect stable kernels to pick up that fix as the cleanups would have also had to have been picked up. Folks on larger CPU systems with modules will want to just upgrade if vmalloc space has been an issue on bootup. Given the size of this request, here's some more elaborate details: The functional change change in this pull request is the very first patch from Song Liu which replaces the 'struct module_layout' with a new 'struct module_memory'. The old data structure tried to put together all types of supported module memory types in one data structure, the new one abstracts the differences in memory types in a module to allow each one to provide their own set of details. This paves the way in the future so we can deal with them in a cleaner way. If you look at changes they also provide a nice cleanup of how we handle these different memory areas in a module. This change has been in linux-next since before the merge window opened for v6.3 so to provide more than a full kernel cycle of testing. It's a good thing as quite a bit of fixes have been found for it. Jason Baron then made dynamic debug a first class citizen module user by using module notifier callbacks to allocate / remove module specific dynamic debug information. Nick Alcock has done quite a bit of work cross-tree to remove module license tags from things which cannot possibly be module at my request so to: a) help him with his longer term tooling goals which require a deterministic evaluation if a piece a symbol code could ever be part of a module or not. But quite recently it is has been made clear that tooling is not the only one that would benefit. Disambiguating symbols also helps efforts such as live patching, kprobes and BPF, but for other reasons and R&D on this area is active with no clear solution in sight. b) help us inch closer to the now generally accepted long term goal of automating all the MODULE_LICENSE() tags from SPDX license tags In so far as a) is concerned, although module license tags are a no-op for non-modules, tools which would want create a mapping of possible modules can only rely on the module license tag after the commit 8b41fc4454e ("kbuild: create modules.builtin without Makefile.modbuiltin or tristate.conf"). Nick has been working on this *for years* and AFAICT I was the only one to suggest two alternatives to this approach for tooling. The complexity in one of my suggested approaches lies in that we'd need a possible-obj-m and a could-be-module which would check if the object being built is part of any kconfig build which could ever lead to it being part of a module, and if so define a new define -DPOSSIBLE_MODULE [0]. A more obvious yet theoretical approach I've suggested would be to have a tristate in kconfig imply the same new -DPOSSIBLE_MODULE as well but that means getting kconfig symbol names mapping to modules always, and I don't think that's the case today. I am not aware of Nick or anyone exploring either of these options. Quite recently Josh Poimboeuf has pointed out that live patching, kprobes and BPF would benefit from resolving some part of the disambiguation as well but for other reasons. The function granularity KASLR (fgkaslr) patches were mentioned but Joe Lawrence has clarified this effort has been dropped with no clear solution in sight [1]. In the meantime removing module license tags from code which could never be modules is welcomed for both objectives mentioned above. Some developers have also welcomed these changes as it has helped clarify when a module was never possible and they forgot to clean this up, and so you'll see quite a bit of Nick's patches in other pull requests for this merge window. I just picked up the stragglers after rc3. LWN has good coverage on the motivation behind this work [2] and the typical cross-tree issues he ran into along the way. The only concrete blocker issue he ran into was that we should not remove the MODULE_LICENSE() tags from files which have no SPDX tags yet, even if they can never be modules. Nick ended up giving up on his efforts due to having to do this vetting and backlash he ran into from folks who really did *not understand* the core of the issue nor were providing any alternative / guidance. I've gone through his changes and dropped the patches which dropped the module license tags where an SPDX license tag was missing, it only consisted of 11 drivers. To see if a pull request deals with a file which lacks SPDX tags you can just use: ./scripts/spdxcheck.py -f \ $(git diff --name-only commid-id | xargs echo) You'll see a core module file in this pull request for the above, but that's not related to his changes. WE just need to add the SPDX license tag for the kernel/module/kmod.c file in the future but it demonstrates the effectiveness of the script. Most of Nick's changes were spread out through different trees, and I just picked up the slack after rc3 for the last kernel was out. Those changes have been in linux-next for over two weeks. The cleanups, debug code I added and final fix I added for modules were motivated by David Hildenbrand's report of boot failing on a systems with over 400 CPUs when KASAN was enabled due to running out of virtual memory space. Although the functional change only consists of 3 lines in the patch "module: avoid allocation if module is already present and ready", proving that this was the best we can do on the modules side took quite a bit of effort and new debug code. The initial cleanups I did on the modules side of things has been in linux-next since around rc3 of the last kernel, the actual final fix for and debug code however have only been in linux-next for about a week or so but I think it is worth getting that code in for this merge window as it does help fix / prove / evaluate the issues reported with larger number of CPUs. Userspace is not yet fixed as it is taking a bit of time for folks to understand the crux of the issue and find a proper resolution. Worst come to worst, I have a kludge-of-concept [3] of how to make kernel_read*() calls for modules unique / converge them, but I'm currently inclined to just see if userspace can fix this instead" Link: https://lore.kernel.org/all/Y/kXDqW+7d71C4wz@bombadil.infradead.org/ [0] Link: https://lkml.kernel.org/r/025f2151-ce7c-5630-9b90-98742c97ac65@redhat.com [1] Link: https://lwn.net/Articles/927569/ [2] Link: https://lkml.kernel.org/r/20230414052840.1994456-3-mcgrof@kernel.org [3] * tag 'modules-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: (121 commits) module: add debugging auto-load duplicate module support module: stats: fix invalid_mod_bytes typo module: remove use of uninitialized variable len module: fix building stats for 32-bit targets module: stats: include uapi/linux/module.h module: avoid allocation if module is already present and ready module: add debug stats to help identify memory pressure module: extract patient module check into helper modules/kmod: replace implementation with a semaphore Change DEFINE_SEMAPHORE() to take a number argument module: fix kmemleak annotations for non init ELF sections module: Ignore L0 and rename is_arm_mapping_symbol() module: Move is_arm_mapping_symbol() to module_symbol.h module: Sync code of is_arm_mapping_symbol() scripts/gdb: use mem instead of core_layout to get the module address interconnect: remove module-related code interconnect: remove MODULE_LICENSE in non-modules zswap: remove MODULE_LICENSE in non-modules zpool: remove MODULE_LICENSE in non-modules x86/mm/dump_pagetables: remove MODULE_LICENSE in non-modules ...
Diffstat (limited to 'include')
-rw-r--r--include/linux/dynamic_debug.h68
-rw-r--r--include/linux/kallsyms.h7
-rw-r--r--include/linux/module.h141
-rw-r--r--include/linux/module_symbol.h17
-rw-r--r--include/linux/semaphore.h10
5 files changed, 171 insertions, 72 deletions
diff --git a/include/linux/dynamic_debug.h b/include/linux/dynamic_debug.h
index 41682278d2e8..061dd84d09f3 100644
--- a/include/linux/dynamic_debug.h
+++ b/include/linux/dynamic_debug.h
@@ -128,17 +128,16 @@ struct ddebug_class_param {
const struct ddebug_class_map *map;
};
-#if defined(CONFIG_DYNAMIC_DEBUG_CORE)
-
-int ddebug_add_module(struct _ddebug_info *dyndbg, const char *modname);
+/*
+ * pr_debug() and friends are globally enabled or modules have selectively
+ * enabled them.
+ */
+#if defined(CONFIG_DYNAMIC_DEBUG) || \
+ (defined(CONFIG_DYNAMIC_DEBUG_CORE) && defined(DYNAMIC_DEBUG_MODULE))
-extern int ddebug_remove_module(const char *mod_name);
extern __printf(2, 3)
void __dynamic_pr_debug(struct _ddebug *descriptor, const char *fmt, ...);
-extern int ddebug_dyndbg_module_param_cb(char *param, char *val,
- const char *modname);
-
struct device;
extern __printf(3, 4)
@@ -287,10 +286,6 @@ void __dynamic_ibdev_dbg(struct _ddebug *descriptor,
KERN_DEBUG, prefix_str, prefix_type, \
rowsize, groupsize, buf, len, ascii)
-struct kernel_param;
-int param_set_dyndbg_classes(const char *instr, const struct kernel_param *kp);
-int param_get_dyndbg_classes(char *buffer, const struct kernel_param *kp);
-
/* for test only, generally expect drm.debug style macro wrappers */
#define __pr_debug_cls(cls, fmt, ...) do { \
BUILD_BUG_ON_MSG(!__builtin_constant_p(cls), \
@@ -298,21 +293,38 @@ int param_get_dyndbg_classes(char *buffer, const struct kernel_param *kp);
dynamic_pr_debug_cls(cls, fmt, ##__VA_ARGS__); \
} while (0)
-#else /* !CONFIG_DYNAMIC_DEBUG_CORE */
+#else /* !(CONFIG_DYNAMIC_DEBUG || (CONFIG_DYNAMIC_DEBUG_CORE && DYNAMIC_DEBUG_MODULE)) */
#include <linux/string.h>
#include <linux/errno.h>
#include <linux/printk.h>
-static inline int ddebug_add_module(struct _ddebug_info *dinfo, const char *modname)
-{
- return 0;
-}
+#define DEFINE_DYNAMIC_DEBUG_METADATA(name, fmt)
+#define DYNAMIC_DEBUG_BRANCH(descriptor) false
-static inline int ddebug_remove_module(const char *mod)
-{
- return 0;
-}
+#define dynamic_pr_debug(fmt, ...) \
+ do { if (0) printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__); } while (0)
+#define dynamic_dev_dbg(dev, fmt, ...) \
+ do { if (0) dev_printk(KERN_DEBUG, dev, fmt, ##__VA_ARGS__); } while (0)
+#define dynamic_hex_dump(prefix_str, prefix_type, rowsize, \
+ groupsize, buf, len, ascii) \
+ do { if (0) \
+ print_hex_dump(KERN_DEBUG, prefix_str, prefix_type, \
+ rowsize, groupsize, buf, len, ascii); \
+ } while (0)
+
+#endif /* CONFIG_DYNAMIC_DEBUG || (CONFIG_DYNAMIC_DEBUG_CORE && DYNAMIC_DEBUG_MODULE) */
+
+
+#ifdef CONFIG_DYNAMIC_DEBUG_CORE
+
+extern int ddebug_dyndbg_module_param_cb(char *param, char *val,
+ const char *modname);
+struct kernel_param;
+int param_set_dyndbg_classes(const char *instr, const struct kernel_param *kp);
+int param_get_dyndbg_classes(char *buffer, const struct kernel_param *kp);
+
+#else
static inline int ddebug_dyndbg_module_param_cb(char *param, char *val,
const char *modname)
@@ -326,25 +338,15 @@ static inline int ddebug_dyndbg_module_param_cb(char *param, char *val,
return -EINVAL;
}
-#define dynamic_pr_debug(fmt, ...) \
- do { if (0) printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__); } while (0)
-#define dynamic_dev_dbg(dev, fmt, ...) \
- do { if (0) dev_printk(KERN_DEBUG, dev, fmt, ##__VA_ARGS__); } while (0)
-#define dynamic_hex_dump(prefix_str, prefix_type, rowsize, \
- groupsize, buf, len, ascii) \
- do { if (0) \
- print_hex_dump(KERN_DEBUG, prefix_str, prefix_type, \
- rowsize, groupsize, buf, len, ascii); \
- } while (0)
-
struct kernel_param;
static inline int param_set_dyndbg_classes(const char *instr, const struct kernel_param *kp)
{ return 0; }
static inline int param_get_dyndbg_classes(char *buffer, const struct kernel_param *kp)
{ return 0; }
-#endif /* !CONFIG_DYNAMIC_DEBUG_CORE */
+#endif
+
extern const struct kernel_param_ops param_ops_dyndbg_classes;
-#endif
+#endif /* _DYNAMIC_DEBUG_H */
diff --git a/include/linux/kallsyms.h b/include/linux/kallsyms.h
index 0065209cc004..fe3c9993b5bf 100644
--- a/include/linux/kallsyms.h
+++ b/include/linux/kallsyms.h
@@ -67,8 +67,7 @@ static inline void *dereference_symbol_descriptor(void *ptr)
#ifdef CONFIG_KALLSYMS
unsigned long kallsyms_sym_address(int idx);
-int kallsyms_on_each_symbol(int (*fn)(void *, const char *, struct module *,
- unsigned long),
+int kallsyms_on_each_symbol(int (*fn)(void *, const char *, unsigned long),
void *data);
int kallsyms_on_each_match_symbol(int (*fn)(void *, unsigned long),
const char *name, void *data);
@@ -166,8 +165,8 @@ static inline bool kallsyms_show_value(const struct cred *cred)
return false;
}
-static inline int kallsyms_on_each_symbol(int (*fn)(void *, const char *, struct module *,
- unsigned long), void *data)
+static inline int kallsyms_on_each_symbol(int (*fn)(void *, const char *, unsigned long),
+ void *data)
{
return -EOPNOTSUPP;
}
diff --git a/include/linux/module.h b/include/linux/module.h
index 3730ed99e5f7..9e56763dff81 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -27,6 +27,7 @@
#include <linux/tracepoint-defs.h>
#include <linux/srcu.h>
#include <linux/static_call_types.h>
+#include <linux/dynamic_debug.h>
#include <linux/percpu.h>
#include <asm/module.h>
@@ -320,17 +321,47 @@ struct mod_tree_node {
struct latch_tree_node node;
};
-struct module_layout {
- /* The actual code + data. */
+enum mod_mem_type {
+ MOD_TEXT = 0,
+ MOD_DATA,
+ MOD_RODATA,
+ MOD_RO_AFTER_INIT,
+ MOD_INIT_TEXT,
+ MOD_INIT_DATA,
+ MOD_INIT_RODATA,
+
+ MOD_MEM_NUM_TYPES,
+ MOD_INVALID = -1,
+};
+
+#define mod_mem_type_is_init(type) \
+ ((type) == MOD_INIT_TEXT || \
+ (type) == MOD_INIT_DATA || \
+ (type) == MOD_INIT_RODATA)
+
+#define mod_mem_type_is_core(type) (!mod_mem_type_is_init(type))
+
+#define mod_mem_type_is_text(type) \
+ ((type) == MOD_TEXT || \
+ (type) == MOD_INIT_TEXT)
+
+#define mod_mem_type_is_data(type) (!mod_mem_type_is_text(type))
+
+#define mod_mem_type_is_core_data(type) \
+ (mod_mem_type_is_core(type) && \
+ mod_mem_type_is_data(type))
+
+#define for_each_mod_mem_type(type) \
+ for (enum mod_mem_type (type) = 0; \
+ (type) < MOD_MEM_NUM_TYPES; (type)++)
+
+#define for_class_mod_mem_type(type, class) \
+ for_each_mod_mem_type(type) \
+ if (mod_mem_type_is_##class(type))
+
+struct module_memory {
void *base;
- /* Total size. */
unsigned int size;
- /* The size of the executable code. */
- unsigned int text_size;
- /* Size of RO section of the module (text+rodata) */
- unsigned int ro_size;
- /* Size of RO after init section */
- unsigned int ro_after_init_size;
#ifdef CONFIG_MODULES_TREE_LOOKUP
struct mod_tree_node mtn;
@@ -339,9 +370,9 @@ struct module_layout {
#ifdef CONFIG_MODULES_TREE_LOOKUP
/* Only touch one cacheline for common rbtree-for-core-layout case. */
-#define __module_layout_align ____cacheline_aligned
+#define __module_memory_align ____cacheline_aligned
#else
-#define __module_layout_align
+#define __module_memory_align
#endif
struct mod_kallsyms {
@@ -426,12 +457,7 @@ struct module {
/* Startup function. */
int (*init)(void);
- /* Core layout: rbtree is accessed frequently, so keep together. */
- struct module_layout core_layout __module_layout_align;
- struct module_layout init_layout;
-#ifdef CONFIG_ARCH_WANTS_MODULES_DATA_IN_VMALLOC
- struct module_layout data_layout;
-#endif
+ struct module_memory mem[MOD_MEM_NUM_TYPES] __module_memory_align;
/* Arch-specific module values */
struct mod_arch_specific arch;
@@ -554,6 +580,9 @@ struct module {
struct error_injection_entry *ei_funcs;
unsigned int num_ei_funcs;
#endif
+#ifdef CONFIG_DYNAMIC_DEBUG_CORE
+ struct _ddebug_info dyndbg_info;
+#endif
} ____cacheline_aligned __randomize_layout;
#ifndef MODULE_ARCH_INIT
#define MODULE_ARCH_INIT {}
@@ -581,23 +610,35 @@ bool __is_module_percpu_address(unsigned long addr, unsigned long *can_addr);
bool is_module_percpu_address(unsigned long addr);
bool is_module_text_address(unsigned long addr);
+static inline bool within_module_mem_type(unsigned long addr,
+ const struct module *mod,
+ enum mod_mem_type type)
+{
+ unsigned long base, size;
+
+ base = (unsigned long)mod->mem[type].base;
+ size = mod->mem[type].size;
+ return addr - base < size;
+}
+
static inline bool within_module_core(unsigned long addr,
const struct module *mod)
{
-#ifdef CONFIG_ARCH_WANTS_MODULES_DATA_IN_VMALLOC
- if ((unsigned long)mod->data_layout.base <= addr &&
- addr < (unsigned long)mod->data_layout.base + mod->data_layout.size)
- return true;
-#endif
- return (unsigned long)mod->core_layout.base <= addr &&
- addr < (unsigned long)mod->core_layout.base + mod->core_layout.size;
+ for_class_mod_mem_type(type, core) {
+ if (within_module_mem_type(addr, mod, type))
+ return true;
+ }
+ return false;
}
static inline bool within_module_init(unsigned long addr,
const struct module *mod)
{
- return (unsigned long)mod->init_layout.base <= addr &&
- addr < (unsigned long)mod->init_layout.base + mod->init_layout.size;
+ for_class_mod_mem_type(type, init) {
+ if (within_module_mem_type(addr, mod, type))
+ return true;
+ }
+ return false;
}
static inline bool within_module(unsigned long addr, const struct module *mod)
@@ -622,10 +663,46 @@ void symbol_put_addr(void *addr);
to handle the error case (which only happens with rmmod --wait). */
extern void __module_get(struct module *module);
-/* This is the Right Way to get a module: if it fails, it's being removed,
- * so pretend it's not there. */
+/**
+ * try_module_get() - take module refcount unless module is being removed
+ * @module: the module we should check for
+ *
+ * Only try to get a module reference count if the module is not being removed.
+ * This call will fail if the module is already being removed.
+ *
+ * Care must also be taken to ensure the module exists and is alive prior to
+ * usage of this call. This can be gauranteed through two means:
+ *
+ * 1) Direct protection: you know an earlier caller must have increased the
+ * module reference through __module_get(). This can typically be achieved
+ * by having another entity other than the module itself increment the
+ * module reference count.
+ *
+ * 2) Implied protection: there is an implied protection against module
+ * removal. An example of this is the implied protection used by kernfs /
+ * sysfs. The sysfs store / read file operations are guaranteed to exist
+ * through the use of kernfs's active reference (see kernfs_active()) and a
+ * sysfs / kernfs file removal cannot happen unless the same file is not
+ * active. Therefore, if a sysfs file is being read or written to the module
+ * which created it must still exist. It is therefore safe to use
+ * try_module_get() on module sysfs store / read ops.
+ *
+ * One of the real values to try_module_get() is the module_is_live() check
+ * which ensures that the caller of try_module_get() can yield to userspace
+ * module removal requests and gracefully fail if the module is on its way out.
+ *
+ * Returns true if the reference count was successfully incremented.
+ */
extern bool try_module_get(struct module *module);
+/**
+ * module_put() - release a reference count to a module
+ * @module: the module we should release a reference count for
+ *
+ * If you successfully bump a reference count to a module with try_module_get(),
+ * when you are finished you must call module_put() to release that reference
+ * count.
+ */
extern void module_put(struct module *module);
#else /*!CONFIG_MODULE_UNLOAD*/
@@ -782,7 +859,7 @@ void *dereference_module_function_descriptor(struct module *mod, void *ptr)
#ifdef CONFIG_SYSFS
extern struct kset *module_kset;
-extern struct kobj_type module_ktype;
+extern const struct kobj_type module_ktype;
#endif /* CONFIG_SYSFS */
#define symbol_request(x) try_then_request_module(symbol_get(x), "symbol:" #x)
@@ -836,8 +913,7 @@ static inline bool module_sig_ok(struct module *module)
#if defined(CONFIG_MODULES) && defined(CONFIG_KALLSYMS)
int module_kallsyms_on_each_symbol(const char *modname,
- int (*fn)(void *, const char *,
- struct module *, unsigned long),
+ int (*fn)(void *, const char *, unsigned long),
void *data);
/* For kallsyms to ask for address resolution. namebuf should be at
@@ -870,8 +946,7 @@ unsigned long find_kallsyms_symbol_value(struct module *mod, const char *name);
#else /* CONFIG_MODULES && CONFIG_KALLSYMS */
static inline int module_kallsyms_on_each_symbol(const char *modname,
- int (*fn)(void *, const char *,
- struct module *, unsigned long),
+ int (*fn)(void *, const char *, unsigned long),
void *data)
{
return -EOPNOTSUPP;
diff --git a/include/linux/module_symbol.h b/include/linux/module_symbol.h
new file mode 100644
index 000000000000..7ace7ba30203
--- /dev/null
+++ b/include/linux/module_symbol.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef _LINUX_MODULE_SYMBOL_H
+#define _LINUX_MODULE_SYMBOL_H
+
+/* This ignores the intensely annoying "mapping symbols" found in ELF files. */
+static inline int is_mapping_symbol(const char *str)
+{
+ if (str[0] == '.' && str[1] == 'L')
+ return true;
+ if (str[0] == 'L' && str[1] == '0')
+ return true;
+ return str[0] == '$' &&
+ (str[1] == 'a' || str[1] == 'd' || str[1] == 't' || str[1] == 'x')
+ && (str[2] == '\0' || str[2] == '.');
+}
+
+#endif /* _LINUX_MODULE_SYMBOL_H */
diff --git a/include/linux/semaphore.h b/include/linux/semaphore.h
index 6694d0019a68..04655faadc2d 100644
--- a/include/linux/semaphore.h
+++ b/include/linux/semaphore.h
@@ -25,8 +25,14 @@ struct semaphore {
.wait_list = LIST_HEAD_INIT((name).wait_list), \
}
-#define DEFINE_SEMAPHORE(name) \
- struct semaphore name = __SEMAPHORE_INITIALIZER(name, 1)
+/*
+ * Unlike mutexes, binary semaphores do not have an owner, so up() can
+ * be called in a different thread from the one which called down().
+ * It is also safe to call down_trylock() and up() from interrupt
+ * context.
+ */
+#define DEFINE_SEMAPHORE(_name, _n) \
+ struct semaphore _name = __SEMAPHORE_INITIALIZER(_name, _n)
static inline void sema_init(struct semaphore *sem, int val)
{