summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorFrederic Weisbecker <frederic@kernel.org>2021-05-26 01:58:49 +0200
committerIngo Molnar <mingo@kernel.org>2021-05-26 08:58:53 +0200
commita8ea6fc9b089156d9230bfeef964dd9be101a4a9 (patch)
tree53e75046d2c0c7b2b06b45edf6a6cb47b5ce318d
parent1699949d3314e5d1956fb082e4cd4798bf6149fc (diff)
sched: Stop PF_NO_SETAFFINITY from being inherited by various init system threads
Commit: 00b89fe0197f ("sched: Make the idle task quack like a per-CPU kthread") ... added PF_KTHREAD | PF_NO_SETAFFINITY to the idle kernel threads. Unfortunately these properties are inherited to the init/0 children through kernel_thread() calls: init/1 and kthreadd. There are several side effects to that: 1) kthreadd affinity can not be reset anymore from userspace. Also PF_NO_SETAFFINITY propagates to all kthreadd children, including the unbound kthreads Therefore it's not possible anymore to overwrite the affinity of any of them. Here is an example of warning reported by rcutorture: WARNING: CPU: 0 PID: 116 at kernel/rcu/tree_nocb.h:1306 rcu_bind_current_to_nocb+0x31/0x40 Call Trace: rcu_torture_fwd_prog+0x62/0x730 kthread+0x122/0x140 ret_from_fork+0x22/0x30 2) init/1 does an exec() in the end which clears both PF_KTHREAD and PF_NO_SETAFFINITY so we are fine once kernel_init() escapes to userspace. But until then, no initcall or init code can successfully call sched_setaffinity() to init/1. Also PF_KTHREAD looks legit on init/1 before it calls exec() but we better be careful with unknown introduced side effects. One way to solve the PF_NO_SETAFFINITY issue is to not inherit this flag on copy_process() at all. The cases where it matters are: * fork_idle(): explicitly set the flag already. * fork() syscalls: userspace tasks that shouldn't be concerned by that. * create_io_thread(): the callers explicitly attribute the flag to the newly created tasks. * kernel_thread(): - Fix the issues on init/1 and kthreadd - Fix the issues on kthreadd children. - Usermode helper created by an unbound workqueue. This shouldn't matter. In the worst case it gives more control to userspace on setting affinity to these short living tasks although this can be tuned with inherited unbound workqueues affinity already. Fixes: 00b89fe0197f ("sched: Make the idle task quack like a per-CPU kthread") Reported-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Paul E. McKenney <paulmck@kernel.org> Link: https://lore.kernel.org/r/20210525235849.441842-1-frederic@kernel.org
-rw-r--r--kernel/fork.c2
1 files changed, 1 insertions, 1 deletions
diff --git a/kernel/fork.c b/kernel/fork.c
index ace4631b5b54..e595e77913eb 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2000,7 +2000,7 @@ static __latent_entropy struct task_struct *copy_process(
goto bad_fork_cleanup_count;
delayacct_tsk_init(p); /* Must remain after dup_task_struct() */
- p->flags &= ~(PF_SUPERPRIV | PF_WQ_WORKER | PF_IDLE);
+ p->flags &= ~(PF_SUPERPRIV | PF_WQ_WORKER | PF_IDLE | PF_NO_SETAFFINITY);
p->flags |= PF_FORKNOEXEC;
INIT_LIST_HEAD(&p->children);
INIT_LIST_HEAD(&p->sibling);