From 7dab174e2e27eeaf10273e597ffbef4f8ea032bb Mon Sep 17 00:00:00 2001 From: Dan Williams Date: Fri, 10 Feb 2023 01:07:07 -0800 Subject: dax/hmem: Move hmem device registration to dax_hmem.ko In preparation for the CXL region driver to take over the responsibility of registering device-dax instances for CXL regions, move the registration of "hmem" devices to dax_hmem.ko. Previously the builtin component of this enabling (drivers/dax/hmem/device.o) would register platform devices for each address range and trigger the dax_hmem.ko module to load and attach device-dax instances to those devices. Now, the ranges are collected from the HMAT and EFI memory map walking, but the device creation is deferred. A new "hmem_platform" device is created which triggers dax_hmem.ko to load and register the platform devices. Tested-by: Fan Ni Reviewed-by: Dave Jiang Reviewed-by: Jonathan Cameron Link: https://lore.kernel.org/r/167602002771.1924368.5653558226424530127.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams --- drivers/dax/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'drivers/dax/Kconfig') diff --git a/drivers/dax/Kconfig b/drivers/dax/Kconfig index 5fdf269a822e..d13c889c2a64 100644 --- a/drivers/dax/Kconfig +++ b/drivers/dax/Kconfig @@ -46,7 +46,7 @@ config DEV_DAX_HMEM Say M if unsure. config DEV_DAX_HMEM_DEVICES - depends on DEV_DAX_HMEM && DAX=y + depends on DEV_DAX_HMEM && DAX def_bool y config DEV_DAX_KMEM -- cgit v1.2.3 From e9ee9fe3a9d4ae0e1e935fc2ec1218b66a043cae Mon Sep 17 00:00:00 2001 From: Dan Williams Date: Fri, 10 Feb 2023 01:07:13 -0800 Subject: dax: Assign RAM regions to memory-hotplug by default The default mode for device-dax instances is backwards for RAM-regions as evidenced by the fact that it tends to catch end users by surprise. "Where is my memory?". Recall that platforms are increasingly shipping with performance-differentiated memory pools beyond typical DRAM and NUMA effects. This includes HBM (high-bandwidth-memory) and CXL (dynamic interleave, varied media types, and future fabric attached possibilities). For this reason the EFI_MEMORY_SP (EFI Special Purpose Memory => Linux 'Soft Reserved') attribute is expected to be applied to all memory-pools that are not the general purpose pool. This designation gives an Operating System a chance to defer usage of a memory pool until later in the boot process where its performance properties can be interrogated and administrator policy can be applied. 'Soft Reserved' memory can be anything from too limited and precious to be part of the general purpose pool (HBM), too slow to host hot kernel data structures (some PMEM media), or anything in between. However, in the absence of an explicit policy, the memory should at least be made usable by default. The current device-dax default hides all non-general-purpose memory behind a device interface. The expectation is that the distribution of users that want the memory online by default vs device-dedicated-access by default follows the Pareto principle. A small number of enlightened users may want to do userspace memory management through a device, but general users just want the kernel to make the memory available with an option to get more advanced later. Arrange for all device-dax instances not backed by PMEM to default to attaching to the dax_kmem driver. From there the baseline memory hotplug policy (CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE / memhp_default_state=) gates whether the memory comes online or stays offline. Where, if it stays offline, it can be reliably converted back to device-mode where it can be partitioned, or fronted by a userspace allocator. So, if someone wants device-dax instances for their 'Soft Reserved' memory: 1/ Build a kernel with CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE=n or boot with memhp_default_state=offline, or roll the dice and hope that the kernel has not pinned a page in that memory before step 2. 2/ Write a udev rule to convert the target dax device(s) from 'system-ram' mode to 'devdax' mode: daxctl reconfigure-device $dax -m devdax -f Cc: Michal Hocko Cc: David Hildenbrand Cc: Dave Hansen Reviewed-by: Gregory Price Tested-by: Fan Ni Reviewed-by: Dave Jiang Link: https://lore.kernel.org/r/167602003336.1924368.6809503401422267885.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams --- drivers/dax/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'drivers/dax/Kconfig') diff --git a/drivers/dax/Kconfig b/drivers/dax/Kconfig index d13c889c2a64..1163eb62e5f6 100644 --- a/drivers/dax/Kconfig +++ b/drivers/dax/Kconfig @@ -50,7 +50,7 @@ config DEV_DAX_HMEM_DEVICES def_bool y config DEV_DAX_KMEM - tristate "KMEM DAX: volatile-use of persistent memory" + tristate "KMEM DAX: map dax-devices as System-RAM" default DEV_DAX depends on DEV_DAX depends on MEMORY_HOTPLUG # for add_memory() and friends -- cgit v1.2.3 From 09d09e04d2fcf88c4620dd28097e0e2a8f720eac Mon Sep 17 00:00:00 2001 From: Dan Williams Date: Fri, 10 Feb 2023 01:07:19 -0800 Subject: cxl/dax: Create dax devices for CXL RAM regions While platform firmware takes some responsibility for mapping the RAM capacity of CXL devices present at boot, the OS is responsible for mapping the remainder and hot-added devices. Platform firmware is also responsible for identifying the platform general purpose memory pool, typically DDR attached DRAM, and arranging for the remainder to be 'Soft Reserved'. That reservation allows the CXL subsystem to route the memory to core-mm via memory-hotplug (dax_kmem), or leave it for dedicated access (device-dax). The new 'struct cxl_dax_region' object allows for a CXL memory resource (region) to be published, but also allow for udev and module policy to act on that event. It also prevents cxl_core.ko from having a module loading dependency on any drivers/dax/ modules. Tested-by: Fan Ni Reviewed-by: Dave Jiang Reviewed-by: Jonathan Cameron Link: https://lore.kernel.org/r/167602003896.1924368.10335442077318970468.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams --- drivers/dax/Kconfig | 13 +++++++++++++ 1 file changed, 13 insertions(+) (limited to 'drivers/dax/Kconfig') diff --git a/drivers/dax/Kconfig b/drivers/dax/Kconfig index 1163eb62e5f6..bd06e16c7ac8 100644 --- a/drivers/dax/Kconfig +++ b/drivers/dax/Kconfig @@ -45,6 +45,19 @@ config DEV_DAX_HMEM Say M if unsure. +config DEV_DAX_CXL + tristate "CXL DAX: direct access to CXL RAM regions" + depends on CXL_REGION && DEV_DAX + default CXL_REGION && DEV_DAX + help + CXL RAM regions are either mapped by platform-firmware + and published in the initial system-memory map as "System RAM", mapped + by platform-firmware as "Soft Reserved", or dynamically provisioned + after boot by the CXL driver. In the latter two cases a device-dax + instance is created to access that unmapped-by-default address range. + Per usual it can remain as dedicated access via a device interface, or + converted to "System RAM" via the dax_kmem facility. + config DEV_DAX_HMEM_DEVICES depends on DEV_DAX_HMEM && DAX def_bool y -- cgit v1.2.3 From 0c16c83ed57fc66b033306ba426a5b324966a33e Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Tue, 14 Feb 2023 11:30:49 +0100 Subject: dax: cxl: add CXL_REGION dependency There is already a dependency on CXL_REGION, which depends on CXL_BUS, but since CXL_REGION is a 'bool' symbol, it's possible to configure DAX as built-in even though CXL itself is a loadable module: x86_64-linux-ld: drivers/dax/cxl.o: in function `cxl_dax_region_probe': cxl.c:(.text+0xb): undefined reference to `to_cxl_dax_region' x86_64-linux-ld: drivers/dax/cxl.o: in function `cxl_dax_region_driver_init': cxl.c:(.init.text+0x10): undefined reference to `__cxl_driver_register' x86_64-linux-ld: drivers/dax/cxl.o: in function `cxl_dax_region_driver_exit': cxl.c:(.exit.text+0x9): undefined reference to `cxl_driver_unregister' Prevent this with another depndency on the tristate symbol. Fixes: 09d09e04d2fc ("cxl/dax: Create dax devices for CXL RAM regions") Signed-off-by: Arnd Bergmann Link: https://lore.kernel.org/r/20230214103054.1082908-1-arnd@kernel.org Signed-off-by: Dan Williams --- drivers/dax/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'drivers/dax/Kconfig') diff --git a/drivers/dax/Kconfig b/drivers/dax/Kconfig index bd06e16c7ac8..7e1008d756b8 100644 --- a/drivers/dax/Kconfig +++ b/drivers/dax/Kconfig @@ -47,7 +47,7 @@ config DEV_DAX_HMEM config DEV_DAX_CXL tristate "CXL DAX: direct access to CXL RAM regions" - depends on CXL_REGION && DEV_DAX + depends on CXL_BUS && CXL_REGION && DEV_DAX default CXL_REGION && DEV_DAX help CXL RAM regions are either mapped by platform-firmware -- cgit v1.2.3