diff options
Diffstat (limited to 'Documentation/cgroups/unified-hierarchy.txt')
-rw-r--r-- | Documentation/cgroups/unified-hierarchy.txt | 102 |
1 files changed, 84 insertions, 18 deletions
diff --git a/Documentation/cgroups/unified-hierarchy.txt b/Documentation/cgroups/unified-hierarchy.txt index eb102fb72213..86847a7647ab 100644 --- a/Documentation/cgroups/unified-hierarchy.txt +++ b/Documentation/cgroups/unified-hierarchy.txt @@ -17,15 +17,18 @@ CONTENTS 3. Structural Constraints 3-1. Top-down 3-2. No internal tasks -4. Other Changes - 4-1. [Un]populated Notification - 4-2. Other Core Changes - 4-3. Per-Controller Changes - 4-3-1. blkio - 4-3-2. cpuset - 4-3-3. memory -5. Planned Changes - 5-1. CAP for resource control +4. Delegation + 4-1. Model of delegation + 4-2. Common ancestor rule +5. Other Changes + 5-1. [Un]populated Notification + 5-2. Other Core Changes + 5-3. Per-Controller Changes + 5-3-1. blkio + 5-3-2. cpuset + 5-3-3. memory +6. Planned Changes + 6-1. CAP for resource control 1. Background @@ -245,9 +248,72 @@ cgroup must create children and transfer all its tasks to the children before enabling controllers in its "cgroup.subtree_control" file. -4. Other Changes +4. Delegation -4-1. [Un]populated Notification +4-1. Model of delegation + +A cgroup can be delegated to a less privileged user by granting write +access of the directory and its "cgroup.procs" file to the user. Note +that the resource control knobs in a given directory concern the +resources of the parent and thus must not be delegated along with the +directory. + +Once delegated, the user can build sub-hierarchy under the directory, +organize processes as it sees fit and further distribute the resources +it got from the parent. The limits and other settings of all resource +controllers are hierarchical and regardless of what happens in the +delegated sub-hierarchy, nothing can escape the resource restrictions +imposed by the parent. + +Currently, cgroup doesn't impose any restrictions on the number of +cgroups in or nesting depth of a delegated sub-hierarchy; however, +this may in the future be limited explicitly. + + +4-2. Common ancestor rule + +On the unified hierarchy, to write to a "cgroup.procs" file, in +addition to the usual write permission to the file and uid match, the +writer must also have write access to the "cgroup.procs" file of the +common ancestor of the source and destination cgroups. This prevents +delegatees from smuggling processes across disjoint sub-hierarchies. + +Let's say cgroups C0 and C1 have been delegated to user U0 who created +C00, C01 under C0 and C10 under C1 as follows. + + ~~~~~~~~~~~~~ - C0 - C00 + ~ cgroup ~ \ C01 + ~ hierarchy ~ + ~~~~~~~~~~~~~ - C1 - C10 + +C0 and C1 are separate entities in terms of resource distribution +regardless of their relative positions in the hierarchy. The +resources the processes under C0 are entitled to are controlled by +C0's ancestors and may be completely different from C1. It's clear +that the intention of delegating C0 to U0 is allowing U0 to organize +the processes under C0 and further control the distribution of C0's +resources. + +On traditional hierarchies, if a task has write access to "tasks" or +"cgroup.procs" file of a cgroup and its uid agrees with the target, it +can move the target to the cgroup. In the above example, U0 will not +only be able to move processes in each sub-hierarchy but also across +the two sub-hierarchies, effectively allowing it to violate the +organizational and resource restrictions implied by the hierarchical +structure above C0 and C1. + +On the unified hierarchy, let's say U0 wants to write the pid of a +process which has a matching uid and is currently in C10 into +"C00/cgroup.procs". U0 obviously has write access to the file and +migration permission on the process; however, the common ancestor of +the source cgroup C10 and the destination cgroup C00 is above the +points of delegation and U0 would not have write access to its +"cgroup.procs" and thus be denied with -EACCES. + + +5. Other Changes + +5-1. [Un]populated Notification cgroup users often need a way to determine when a cgroup's subhierarchy becomes empty so that it can be cleaned up. cgroup @@ -289,7 +355,7 @@ supported and the interface files "release_agent" and "notify_on_release" do not exist. -4-2. Other Core Changes +5-2. Other Core Changes - None of the mount options is allowed. @@ -306,14 +372,14 @@ supported and the interface files "release_agent" and - The "cgroup.clone_children" file is removed. -4-3. Per-Controller Changes +5-3. Per-Controller Changes -4-3-1. blkio +5-3-1. blkio - blk-throttle becomes properly hierarchical. -4-3-2. cpuset +5-3-2. cpuset - Tasks are kept in empty cpusets after hotplug and take on the masks of the nearest non-empty ancestor, instead of being moved to it. @@ -322,7 +388,7 @@ supported and the interface files "release_agent" and masks of the nearest non-empty ancestor. -4-3-3. memory +5-3-3. memory - use_hierarchy is on by default and the cgroup file for the flag is not created. @@ -407,9 +473,9 @@ supported and the interface files "release_agent" and memory.low, memory.high, and memory.max will use the string "max" to indicate and set the highest possible value. -5. Planned Changes +6. Planned Changes -5-1. CAP for resource control +6-1. CAP for resource control Unified hierarchy will require one of the capabilities(7), which is yet to be decided, for all resource control related knobs. Process |