mm: memcontrol: charge swap to cgroup2

This patchset introduces swap accounting to cgroup2.

This patch (of 7):

In the legacy hierarchy we charge memsw, which is dubious, because:

 - memsw.limit must be >= memory.limit, so it is impossible to limit
   swap usage less than memory usage. Taking into account the fact that
   the primary limiting mechanism in the unified hierarchy is
   memory.high while memory.limit is either left unset or set to a very
   large value, moving memsw.limit knob to the unified hierarchy would
   effectively make it impossible to limit swap usage according to the
   user preference.

 - memsw.usage != memory.usage + swap.usage, because a page occupying
   both swap entry and a swap cache page is charged only once to memsw
   counter. As a result, it is possible to effectively eat up to
   memory.limit of memory pages *and* memsw.limit of swap entries, which
   looks unexpected.

That said, we should provide a different swap limiting mechanism for
cgroup2.

This patch adds mem_cgroup->swap counter, which charges the actual number
of swap entries used by a cgroup. It is only charged in the unified
hierarchy, while the legacy hierarchy memsw logic is left intact.

The swap usage can be monitored using new memory.swap.current file and
limited using memory.swap.max.

Note, to charge swap resource properly in the unified hierarchy, we have
to make swap_entry_free uncharge swap only when ->usage reaches zero, not
just ->count, i.e. when all references to a swap entry, including the one
taken by swap cache, are gone. This is necessary, because otherwise
swap-in could result in uncharging swap even if the page is still in swap
cache and hence still occupies a swap entry. At the same time, this
shouldn't break memsw counter logic, where a page is never charged twice
for using both memory and swap, because in case of legacy hierarchy we
uncharge swap on commit (see mem_cgroup_commit_charge).

Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
author: Vladimir Davydov <vdavydov@virtuozzo.com> 2016-01-20 15:02:56 -0800
committer: Linus Torvalds <torvalds@linux-foundation.org> 2016-01-20 17:09:18 -0800
commit: 37e84351198be087335ad2b2253b35c7cc76a5ad (patch)
tree: 3f7cfe687fdc86bea76f2e47787ff1f7c79bef23 /mm/swapfile.c
parent: 0b8f73e104285a4badf9d768d1c39b06d77d1f97 (diff)
1 files changed, 1 insertions, 3 deletions
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 2bb30aa3a412..22a7a1fc1e47 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -785,14 +785,12 @@ static unsigned char swap_entry_free(struct swap_info_struct *p,
 			count--;
 	}
 
-	if (!count)
-		mem_cgroup_uncharge_swap(entry);
-
 	usage = count | has_cache;
 	p->swap_map[offset] = usage;
 
 	/* free if no reference */
 	if (!usage) {
+		mem_cgroup_uncharge_swap(entry);
 		dec_cluster_info_page(p, p->cluster_info, offset);
 		if (offset < p->lowest_bit)
 			p->lowest_bit = offset;
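
The effect of moving the uncharge call can be illustrated with a small userspace toy model. This is only a sketch, not kernel code: struct toy_slot, toy_entry_free() and toy_charged_after_swapin() are hypothetical names, the per-slot charged flag is a simplification of the cgroup swap counter, and TOY_HAS_CACHE is assumed to mirror the kernel's SWAP_HAS_CACHE bit. The uncharge_on_count parameter selects the pre-patch check (!count) instead of the post-patch check (!usage).

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed to mirror the kernel's SWAP_HAS_CACHE bit; the rest is a toy. */
#define TOY_HAS_CACHE 0x40

/* Hypothetical stand-in for one swap_map slot plus its cgroup charge. */
struct toy_slot {
	unsigned char map;	/* count | has_cache, like p->swap_map[offset] */
	bool charged;		/* true while the slot is charged to the cgroup */
};

/*
 * Drop one reference from the slot and return the new usage value.
 * drop_cache selects the swap cache reference, otherwise one map count
 * reference is dropped.  uncharge_on_count models the pre-patch
 * behaviour (uncharge when count hits zero); when false, the post-patch
 * behaviour is modelled (uncharge only when usage hits zero).
 */
static unsigned char toy_entry_free(struct toy_slot *s, bool drop_cache,
				    bool uncharge_on_count)
{
	unsigned char has_cache = (unsigned char)(s->map & TOY_HAS_CACHE);
	unsigned char count = (unsigned char)(s->map & ~TOY_HAS_CACHE);
	unsigned char usage;

	if (drop_cache) {
		assert(has_cache);	/* there must be a cache reference */
		has_cache = 0;
	} else {
		assert(count > 0);	/* there must be a map reference */
		count--;
	}

	usage = (unsigned char)(count | has_cache);
	s->map = usage;

	if (uncharge_on_count ? (count == 0) : (usage == 0))
		s->charged = false;

	return usage;
}

/*
 * Swap-in scenario: the slot starts with one map reference plus the
 * swap cache reference, and the map reference is dropped first.
 * Returns whether the slot is still charged while the swap cache
 * still occupies it.
 */
static bool toy_charged_after_swapin(bool uncharge_on_count)
{
	struct toy_slot s = { .map = 1 | TOY_HAS_CACHE, .charged = true };
	unsigned char usage = toy_entry_free(&s, false, uncharge_on_count);

	assert(usage == TOY_HAS_CACHE);	/* slot is still occupied */
	return s.charged;
}
```

In this model toy_charged_after_swapin(true) reports the slot as already uncharged even though the swap cache still holds it, while toy_charged_after_swapin(false) keeps the charge until the last reference is dropped, which is the behaviour the commit message asks for.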