author    Changbin Du <changbin.du@huawei.com>  2024-02-27 10:35:46 +0800
committer Andrew Morton <akpm@linux-foundation.org>  2024-03-04 17:01:27 -0800
commit    8f8cd6c0a43ed637e620bbe45a8d0e0c2f4d5130
tree      8bc526035c32bc77f122440fe31786264e3fcfe5
parent    63b774993dd02b17127cb404b7362fc436632995
modules: wait do_free_init correctly
The synchronization here ensures that a module's init memory is freed
before W+X checking runs. Note that the freeing itself was always
happening; the problem is that our sanity checkers raced against the
permission checkers, which assume init memory is already gone.

Commit 1a7b7d922081 ("modules: Use vmalloc special flag") moved the call
to do_free_init() into a global workqueue instead of relying on it being
called through call_rcu(..., do_free_init), which used to allow us to call
do_free_init() asynchronously after the end of a subsequent grace period.
The move to a global workqueue broke the guarantees for code which needed
to be sure that do_free_init() would complete with rcu_barrier(). To fix
this, callers which used to rely on rcu_barrier() must now instead use
flush_work(&init_free_wq).

Without this fix, we could still encounter false positive reports in W+X
checking, since the rcu_barrier() here can no longer ensure the ordering.

Even worse, the rcu_barrier() can introduce significant delay. Eric
Chanudet reported that the rcu_barrier() introduces a ~0.1s delay on a
PREEMPT_RT kernel.

  [    0.291444] Freeing unused kernel memory: 5568K
  [    0.402442] Run /sbin/init as init process

With this fix, the above delay can be eliminated.

Link: https://lkml.kernel.org/r/20240227023546.2490667-1-changbin.du@huawei.com
Fixes: 1a7b7d922081 ("modules: Use vmalloc special flag")
Signed-off-by: Changbin Du <changbin.du@huawei.com>
Tested-by: Eric Chanudet <echanude@redhat.com>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Xiaoyi Su <suxiaoyi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Diffstat (limited to 'kernel/module')
-rw-r--r-- kernel/module/main.c 9
1 files changed, 7 insertions, 2 deletions
diff --git a/kernel/module/main.c b/kernel/module/main.c
index 36681911c05a..b0b99348e1a8 100644
--- a/kernel/module/main.c
+++ b/kernel/module/main.c
@@ -2489,6 +2489,11 @@ static void do_free_init(struct work_struct *w)
 	}
 }
 
+void flush_module_init_free_work(void)
+{
+	flush_work(&init_free_wq);
+}
+
 #undef MODULE_PARAM_PREFIX
 #define MODULE_PARAM_PREFIX "module."
 /* Default value for module->async_probe_requested */
@@ -2593,8 +2598,8 @@ static noinline int do_init_module(struct module *mod)
 	 * Note that module_alloc() on most architectures creates W+X page
 	 * mappings which won't be cleaned up until do_free_init() runs.  Any
 	 * code such as mark_rodata_ro() which depends on those mappings to
-	 * be cleaned up needs to sync with the queued work - ie
-	 * rcu_barrier()
+	 * be cleaned up needs to sync with the queued work by invoking
+	 * flush_module_init_free_work().
 	 */
 	if (llist_add(&freeinit->node, &init_free_list))
 		schedule_work(&init_free_wq);
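
The ordering problem described in the commit message can be illustrated with a small userspace toy model. This is only a sketch, not kernel code: every `model_*` name below is hypothetical, the two queues merely stand in for RCU callbacks and for init_free_wq, and the model pessimistically assumes the queued work has not run yet when the checker is reached. It shows why waiting on the RCU queue says nothing about work queued elsewhere, while flushing the queue that actually holds the free does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model (all names hypothetical): deferred callbacks that only run
 * when their queue is explicitly drained. */
#define MODEL_MAX_PENDING 8

typedef void (*model_fn)(void *arg);

struct model_queue {
	model_fn fn[MODEL_MAX_PENDING];
	void *arg[MODEL_MAX_PENDING];
	size_t count;
};

/* Two independent queues: draining one does not touch the other. */
static struct model_queue model_rcu_callbacks; /* stands in for call_rcu() */
static struct model_queue model_init_free_wq;  /* stands in for init_free_wq */

static bool model_enqueue(struct model_queue *q, model_fn fn, void *arg)
{
	if (q->count == MODEL_MAX_PENDING)
		return false;
	q->fn[q->count] = fn;
	q->arg[q->count] = arg;
	q->count++;
	return true;
}

/* Run everything pending on q and leave it empty. */
static void model_drain(struct model_queue *q)
{
	for (size_t i = 0; i < q->count; i++)
		q->fn[i](q->arg[i]);
	q->count = 0;
}

static void model_rcu_barrier(void)
{
	model_drain(&model_rcu_callbacks);
}

static void model_flush_module_init_free_work(void)
{
	model_drain(&model_init_free_wq);
}

/* Stand-in for a module whose init mapping is still present. */
struct model_module {
	bool init_mapped;
};

static void model_do_free_init(void *arg)
{
	struct model_module *mod = arg;

	mod->init_mapped = false;
}

/* Stand-in for the W+X checker: true means a stale init mapping was seen. */
static bool model_wx_check_fails(const struct model_module *mod)
{
	return mod->init_mapped;
}

/* Old behaviour: the free sits on the workqueue, the caller waits on RCU. */
bool model_check_after_rcu_barrier(void)
{
	struct model_module mod = { .init_mapped = true };
	bool queued = model_enqueue(&model_init_free_wq, model_do_free_init, &mod);
	bool stale;

	assert(queued);
	(void)queued;
	model_rcu_barrier();                 /* wrong queue: nothing is freed */
	stale = model_wx_check_fails(&mod);
	model_flush_module_init_free_work(); /* drain before mod goes out of scope */
	return stale;
}

/* Fixed behaviour: the caller flushes the queue that holds the free. */
bool model_check_after_flush(void)
{
	struct model_module mod = { .init_mapped = true };
	bool queued = model_enqueue(&model_init_free_wq, model_do_free_init, &mod);

	assert(queued);
	(void)queued;
	model_flush_module_init_free_work();
	return model_wx_check_fails(&mod);
}
```

In this model, `model_check_after_rcu_barrier()` reports a stale mapping and `model_check_after_flush()` does not. In a real kernel the work may already have run by the time the checker starts, which is why the original failure showed up only as an occasional false positive.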