summaryrefslogtreecommitdiff
path: root/lib
diff options
context:
space:
mode:
authorNicholas Piggin <npiggin@gmail.com>2023-02-03 17:18:36 +1000
committerAndrew Morton <akpm@linux-foundation.org>2023-03-28 16:20:08 -0700
commit2655421ae69fa479df1575cb2630af9131d28939 (patch)
tree6e0520cb3d3ddcf9844743bc3440878f4474fcf7 /lib
parent88e3009b5283bbd41447f0352d0b9df16cf6f183 (diff)
lazy tlb: shoot lazies, non-refcounting lazy tlb mm reference handling scheme
On big systems, the mm refcount can become highly contented when doing a lot of context switching with threaded applications. user<->idle switch is one of the important cases. Abandoning lazy tlb entirely slows this switching down quite a bit in the common uncontended case, so that is not viable. Implement a scheme where lazy tlb mm references do not contribute to the refcount, instead they get explicitly removed when the refcount reaches zero. The final mmdrop() sends IPIs to all CPUs in the mm_cpumask and they switch away from this mm to init_mm if it was being used as the lazy tlb mm. Enabling the shoot lazies option therefore requires that the arch ensures that mm_cpumask contains all CPUs that could possibly be using mm. A DEBUG_VM option IPIs every CPU in the system after this to ensure there are no references remaining before the mm is freed. Shootdown IPIs cost could be an issue, but they have not been observed to be a serious problem with this scheme, because short-lived processes tend not to migrate CPUs much, therefore they don't get much chance to leave lazy tlb mm references on remote CPUs. There are a lot of options to reduce them if necessary, described in comments. The near-worst-case can be benchmarked with will-it-scale: context_switch1_threads -t $(($(nproc) / 2)) This will create nproc threads (nproc / 2 switching pairs) all sharing the same mm that spread over all CPUs so each CPU does thread->idle->thread switching. [ Rik came up with basically the same idea a few years ago, so credit to him for that. ] Link: https://lore.kernel.org/linux-mm/20230118080011.2258375-1-npiggin@gmail.com/ Link: https://lore.kernel.org/all/20180728215357.3249-11-riel@surriel.com/ Link: https://lkml.kernel.org/r/20230203071837.1136453-5-npiggin@gmail.com Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Diffstat (limited to 'lib')
-rw-r--r--lib/Kconfig.debug10
1 files changed, 10 insertions, 0 deletions
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 39d1d93164bd..5cd8183bb4c1 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -791,6 +791,16 @@ config DEBUG_VM
If unsure, say N.
+config DEBUG_VM_SHOOT_LAZIES
+ bool "Debug MMU_LAZY_TLB_SHOOTDOWN implementation"
+ depends on DEBUG_VM
+ depends on MMU_LAZY_TLB_SHOOTDOWN
+ help
+ Enable additional IPIs that ensure lazy tlb mm references are removed
+ before the mm is freed.
+
+ If unsure, say N.
+
config DEBUG_VM_MAPLE_TREE
bool "Debug VM maple trees"
depends on DEBUG_VM