author:    Domenico Cerasuolo <cerasuolodomenico@gmail.com>	2023-11-30 11:40:20 -0800
committer: Andrew Morton <akpm@linux-foundation.org>	2023-12-12 10:57:01 -0800
commit:    a65b0e7607ccb5e5184591f73e48512f25c76061
tree:      677026f6d0d29bdbc3242fd2c628c1f385432b54 /mm/swap.h
parent:    fdc4161ff6a5e96222e159c1f1b28d31a985130d
zswap: make shrinking memcg-aware
Currently, we only have a single global LRU for zswap. This makes it
impossible to perform workload-specific shrinking - a memcg cannot
determine which pages in the pool it owns, and often ends up writing back
pages from other memcgs. This issue has been previously observed in
practice and mitigated by simply disabling memcg-initiated shrinking:
https://lore.kernel.org/all/20230530232435.3097106-1-nphamcs@gmail.com/T/#u
This patch fully resolves the issue by replacing the global zswap LRU
with memcg- and NUMA-specific LRUs, and modifying the reclaim logic:
a) When a store attempt hits a memcg limit, it now triggers a
   synchronous reclaim attempt that, if successful, allows the new
   hotter page to be accepted by zswap.
b) If the store attempt instead hits the global zswap limit, it will
   trigger an asynchronous reclaim attempt, in which a memcg is
   selected for reclaim in a round-robin-like fashion (a simplified
   sketch of this selection follows below).
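As a rough, hedged sketch of (b): the global-limit shrinker could walk
memcgs with mem_cgroup_iter() and write back from whichever online memcg
the round-robin cursor lands on. The names zswap_next_shrink,
shrink_memcg(), zswap_is_full() and MAX_SHRINK_FAILURES below are
illustrative placeholders, and the reference counting and locking the real
patch needs are omitted:

#include <linux/memcontrol.h>
#include <linux/workqueue.h>

#define MAX_SHRINK_FAILURES	8	/* give up after this many fruitless passes */

static struct mem_cgroup *zswap_next_shrink;		/* round-robin cursor */

static int shrink_memcg(struct mem_cgroup *memcg);	/* placeholder: write back one LRU entry */
static bool zswap_is_full(void);			/* placeholder: global limit still exceeded? */

static void shrink_worker(struct work_struct *work)
{
	int failures = 0;

	do {
		/* Advance the cursor; a NULL return means we wrapped past the last memcg. */
		zswap_next_shrink = mem_cgroup_iter(NULL, zswap_next_shrink, NULL);

		/* Reclaim only from online memcgs; offline ones are handled elsewhere. */
		if (zswap_next_shrink && mem_cgroup_online(zswap_next_shrink) &&
		    shrink_memcg(zswap_next_shrink) == 0)
			failures = 0;			/* made progress, keep going */
		else if (++failures > MAX_SHRINK_FAILURES)
			break;				/* stop hammering an empty pool */
	} while (zswap_is_full());
}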
[nphamcs@gmail.com: use correct function for the onlineness check, use mem_cgroup_iter_break()]
Link: https://lkml.kernel.org/r/20231205195419.2563217-1-nphamcs@gmail.com
[nphamcs@gmail.com: drop the pool's reference at the end of the writeback step]
Link: https://lkml.kernel.org/r/20231206030627.4155634-1-nphamcs@gmail.com
Link: https://lkml.kernel.org/r/20231130194023.4102148-4-nphamcs@gmail.com
Signed-off-by: Domenico Cerasuolo <cerasuolodomenico@gmail.com>
Co-developed-by: Nhat Pham <nphamcs@gmail.com>
Signed-off-by: Nhat Pham <nphamcs@gmail.com>
Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
Cc: Chris Li <chrisl@kernel.org>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Vitaly Wool <vitaly.wool@konsulko.com>
Cc: Yosry Ahmed <yosryahmed@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Diffstat (limited to 'mm/swap.h')
-rw-r--r-- | mm/swap.h | 3 |
1 file changed, 2 insertions, 1 deletion
diff --git a/mm/swap.h b/mm/swap.h
index 73c332ee4d91..c0dc73e10e91 100644
--- a/mm/swap.h
+++ b/mm/swap.h
@@ -51,7 +51,8 @@ struct page *read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
 				   struct swap_iocb **plug);
 struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
 				     struct mempolicy *mpol, pgoff_t ilx,
-				     bool *new_page_allocated);
+				     bool *new_page_allocated,
+				     bool skip_if_exists);
 struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t flag,
 				    struct mempolicy *mpol, pgoff_t ilx);
 struct page *swapin_readahead(swp_entry_t entry, gfp_t flag,
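For context on the header change above, here is a hedged sketch of a caller
passing the new skip_if_exists argument. The function name, the mempolicy
choice, the ilx value of 0, and the error handling are illustrative
assumptions, not the exact writeback code added by this series:

/* assumed to live in mm/, next to the internal header it includes */
#include <linux/gfp.h>
#include <linux/mempolicy.h>
#include <linux/swap.h>
#include "swap.h"

static int writeback_one_entry(swp_entry_t swpentry)
{
	bool page_was_allocated;
	struct page *page;

	page = __read_swap_cache_async(swpentry, GFP_KERNEL,
				       get_task_policy(current), 0 /* ilx */,
				       &page_was_allocated,
				       true /* skip_if_exists */);
	if (!page)
		return -ENOMEM;

	/*
	 * With skip_if_exists == true, an entry already present in the swap
	 * cache is someone else's business: back off rather than wait on it.
	 */
	if (!page_was_allocated) {
		put_page(page);
		return -EEXIST;
	}

	/* ... decompress the zswap entry into the page and submit I/O ... */
	return 0;
}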