author | Michal Hocko <mhocko@suse.cz> | 2014-05-22 10:54:36 +1000 |
---|---|---|
committer | Stephen Rothwell <sfr@canb.auug.org.au> | 2014-06-03 18:50:35 +1000 |
commit | 3eeb44cfe228b5d17d225d4b3dba25ed26ab65d7 | |
tree | 4413bc5cd01834ac324da3b0a8ac16ab7fd141c3 /mm | |
parent | 8a841d658cb74970ee4d99067ee49c10bc51be94 | |
vmscan: memcg: check whether the low limit should be ignored
Low-limit (aka guarantee) is ignored when there is no group scanned during
the first round of __shrink_zone. This approach doesn't work when multiple
reclaimers race and reclaim the same hierarchy (e.g. kswapd vs. direct
reclaim or multiple tasks hitting the hard limit) because memcg iterator
makes sure that multiple reclaimers are interleaved in the hierarchy.
This means that some reclaimers can see 0 scanned groups although there
are groups which are above the low-limit and they were reclaimed on behalf
of other reclaimers. This leads to a premature low-limit break.
This patch adds mem_cgroup_all_within_guarantee() which will check whether
all the groups in the reclaimed hierarchy are within their low limit and
shrink_zone will allow the fallback reclaim only when that is true. This
alone is still not sufficient however because it would lead to another
problem. If a reclaimer constantly fails to scan anything because it sees
only groups within their guarantees while others do the reclaim then the
reclaim priority would drop down very quickly. shrink_zone has to be
careful to preserve the scan-at-least-one-group semantic, so __shrink_zone
has to be retried until at least one group is scanned.
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Roman Gushchin <klamm@yandex-team.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Diffstat (limited to 'mm')
-rw-r--r-- | mm/memcontrol.c | 13 |
-rw-r--r-- | mm/vmscan.c | 17 |
2 files changed, 25 insertions, 5 deletions
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index df96e7d28c1..982301deb10 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2821,6 +2821,19 @@ bool mem_cgroup_within_guarantee(struct mem_cgroup *memcg,
 	return false;
 }
 
+bool mem_cgroup_all_within_guarantee(struct mem_cgroup *root)
+{
+	struct mem_cgroup *iter;
+
+	for_each_mem_cgroup_tree(iter, root)
+		if (!mem_cgroup_within_guarantee(iter, root)) {
+			mem_cgroup_iter_break(root, iter);
+			return false;
+		}
+
+	return true;
+}
+
 struct mem_cgroup *try_get_mem_cgroup_from_page(struct page *page)
 {
 	struct mem_cgroup *memcg = NULL;
diff --git a/mm/vmscan.c b/mm/vmscan.c
index ff8da5e759b..9d2c08833e9 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2308,13 +2308,20 @@ static unsigned __shrink_zone(struct zone *zone, struct scan_control *sc,
 
 static void shrink_zone(struct zone *zone, struct scan_control *sc)
 {
-	if (!__shrink_zone(zone, sc, true)) {
+	bool honor_guarantee = true;
+
+	while (!__shrink_zone(zone, sc, honor_guarantee)) {
 		/*
-		 * First round of reclaim didn't find anything to reclaim
-		 * because of the memory guantees for all memcgs in the
-		 * reclaim target so try again and ignore guarantees this time.
+		 * The previous round of reclaim didn't find anything to scan
+		 * because
+		 * a) the whole reclaimed hierarchy is within guarantee so
+		 *    we fallback to ignore the guarantee because other option
+		 *    would be the OOM
+		 * b) multiple reclaimers are racing and so the first round
+		 *    should be retried
 		 */
-		__shrink_zone(zone, sc, false);
+		if (mem_cgroup_all_within_guarantee(sc->target_mem_cgroup))
+			honor_guarantee = false;
 	}
 }
 
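The retry logic in shrink_zone can be hard to follow from the diff alone, so here is a minimal user-space sketch of it. It is not kernel code. The types struct group and struct hierarchy, the claimed_by_other flag (a crude stand-in for the shared memcg iterator handing a group to a racing reclaimer for one round), and the helpers scan_round and shrink_model are all hypothetical names invented for this illustration. The model also assumes that "within guarantee" simply means usage <= low_limit for a single group, without the parent walk the kernel helper performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memcg: only what the model needs. */
struct group {
	unsigned long usage;     /* pages currently charged */
	unsigned long low_limit; /* guarantee, in pages */
	bool claimed_by_other;   /* a racing reclaimer takes this group for one round */
};

/* Hypothetical flat hierarchy; the kernel walks a tree instead. */
struct hierarchy {
	struct group *groups;
	size_t nr;
};

/* Assumption: a group is within its guarantee when usage <= low_limit. */
static bool within_guarantee(const struct group *g)
{
	return g->usage <= g->low_limit;
}

/* Models mem_cgroup_all_within_guarantee(): stop at the first offender. */
static bool all_within_guarantee(const struct hierarchy *h)
{
	for (size_t i = 0; i < h->nr; i++)
		if (!within_guarantee(&h->groups[i]))
			return false;
	return true;
}

/* Models one __shrink_zone() round; returns the number of groups scanned. */
static unsigned scan_round(struct hierarchy *h, bool honor_guarantee)
{
	unsigned scanned = 0;

	for (size_t i = 0; i < h->nr; i++) {
		struct group *g = &h->groups[i];

		if (g->claimed_by_other) {
			/* The other reclaimer holds this group for this round only. */
			g->claimed_by_other = false;
			continue;
		}
		if (honor_guarantee && within_guarantee(g))
			continue;
		scanned++;
	}
	return scanned;
}

/*
 * Models the patched shrink_zone(): retry until at least one group is
 * scanned, and drop the guarantee only when every group is within it.
 * Returns the number of rounds needed.
 */
static unsigned shrink_model(struct hierarchy *h, bool *ignored_guarantee)
{
	bool honor_guarantee = true;
	unsigned rounds = 1;

	assert(h->nr > 0); /* an empty hierarchy would never be scanned */
	while (!scan_round(h, honor_guarantee)) {
		if (all_within_guarantee(h))
			honor_guarantee = false;
		rounds++;
	}
	*ignored_guarantee = !honor_guarantee;
	return rounds;
}
```

In this model, a reclaimer that loses the race for the only group above its low limit needs a second round but keeps honoring the guarantee, whereas a hierarchy in which every group is within its guarantee falls back to ignoring it on the second round.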