linux-stable.git - Linux kernel stable tree

	Commit message	Author	Age	Files	Lines
*	Merge branch 'for-6.7/cxl-rch-eh' into cxl/next	Dan Williams	2023-10-31	1	-3/+1
\|\ \| \| \| \| \| \| \| \| \| \| \| \|	Restricted CXL Host (RCH) Error Handling undoes the topology munging of CXL 1.1 to enable some AER recovery, and lands some base infrastructure for handling Root-Complex-Event-Collectors (RCECs) with CXL. Include this long-running series finally for v6.7.
\| *	cxl/pci: Remove Component Register base address from struct cxl_dev_state	Robert Richter	2023-10-27	1	-3/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The Component Register base address @component_reg_phys is no longer used after the rework of the Component Register setup which now uses struct member @reg_map instead. Remove the base address. Signed-off-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20231018171713.1883517-9-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* \|	tools/testing/cxl: Slow down the mock firmware transfer	Vishal Verma	2023-10-27	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The cxl-cli unit test for firmware update does operations like starting an asynchronous firmware update, making sure it is in progress, and attempting to cancel it. In some cases, such as with no or minimal dynamic debugging turned on, the firmware update completes too quickly, not allowing the test to have a chance to verify it was in progress. This caused a failure of the signature: expected fw_update_in_progress:true test/cxl-update-firmware.sh: failed at line 88 Fix this by adding a delay (~1.5 - 2 ms) to each firmware transfer request handled by the mocked interface. Reported-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/20231026-vv-fw_upd_test_fix-v2-1-5282fd193883@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* \|	cxl/region: Fix x1 root-decoder granularity calculations	Jim Harris	2023-10-27	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Root decoder granularity must match the value from the CFMWS, which may not be the region's granularity for non-interleaved root decoders. So when calculating granularities for host bridge decoders, use the region's granularity instead of the root decoder's granularity to ensure the correct granularities are set for the host bridge decoders and any downstream switch decoders. Test configuration is 1 host bridge * 2 switches * 2 endpoints per switch. Region created with 2048 granularity using following command line: cxl create-region -m -d decoder0.0 -w 4 mem0 mem2 mem1 mem3 \ -g 2048 -s 2048M Use "cxl list -PDE \| grep granularity" to get a view of the granularity set at each level of the topology. Before this patch: "interleave_granularity":2048, "interleave_granularity":2048, "interleave_granularity":512, "interleave_granularity":2048, "interleave_granularity":2048, "interleave_granularity":512, "interleave_granularity":256, After: "interleave_granularity":2048, "interleave_granularity":2048, "interleave_granularity":4096, "interleave_granularity":2048, "interleave_granularity":2048, "interleave_granularity":4096, "interleave_granularity":2048, Fixes: 27b3f8d13830 ("cxl/region: Program target lists") Cc: <stable@vger.kernel.org> Signed-off-by: Jim Harris <jim.harris@samsung.com> Link: https://lore.kernel.org/r/169824893473.1403938.16110924262989774582.stgit@bgt-140510-bm03.eng.stellus.in [djbw: fixup the prebuilt cxl_test region] Signed-off-by: Dan Williams <dan.j.williams@intel.com>
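The before/after listings in this message are consistent with a simple model in which each decoder level's granularity is the parent's granularity multiplied by the parent's interleave ways. The sketch below shows only that model and is not the kernel code: `child_granularity()` and `switch_granularity()` are hypothetical helpers, and the root decoder granularity of 256 used in the usage note is an assumption inferred from the "before" output.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function). Assumed model: a decoder's
 * interleave granularity is its parent's granularity times the parent's
 * interleave ways.
 */
uint32_t child_granularity(uint32_t parent_ig, uint32_t parent_iw)
{
	assert(parent_ig > 0 && parent_iw > 0);
	return parent_ig * parent_iw;
}

/*
 * Granularity at the switch level for a root -> host bridge -> switch
 * chain. start_ig is whichever value seeds the calculation: the root
 * decoder's granularity (old behaviour) or the region's (new behaviour).
 */
uint32_t switch_granularity(uint32_t start_ig, uint32_t root_iw,
			    uint32_t hb_iw)
{
	uint32_t hb_ig = child_granularity(start_ig, root_iw);

	return child_granularity(hb_ig, hb_iw);
}
```

With an x1 root decoder and two switches below the host bridge, seeding the chain with the assumed root value of 256 gives 256 at the host bridge and 512 at the switches, while seeding it with the region's 2048 gives 2048 and 4096, which matches the values that change between the two listings.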
* \|	tools/testing/cxl: Add 'sanitize notifier' support	Dan Williams	2023-10-09	1	-1/+67
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Allow for cxl_test regression of the sanitize notifier. Reuse the core setup infrastructure, and trigger notifications upon any sanitize submission with a programmable notification delay. Cc: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* \|	tools/testing/cxl: Make cxl_memdev_state available to other command emulation	Dan Williams	2023-10-09	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move @mds out of the event specific 'struct mock_event_store' and into the base 'struct cxl_mockmem_data' directly. This is in preparation for enabling cxl_test to exercise the notifier flow for 'sanitize' operation completion. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* \|	cxl/pci: Clarify devm host for memdev relative setup	Dan Williams	2023-10-06	1	-2/+2
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It is all too easy to get confused about @dev usage in the CXL driver stack. Before adding a new cxl_pci_probe() setup operation that has a devm lifetime dependent on @cxlds->dev binding, but also references @cxlmd->dev, and prints messages, rework the devm_cxl_add_memdev() and cxl_memdev_setup_fw_upload() function signatures to make this distinction explicit. I.e. pass in the devm context as an @host argument rather than infer it from other objects. This is in preparation for adding a devm_cxl_sanitize_setup_notifier(). Note the whitespace fixup near the change of the devm_cxl_add_memdev() signature. That uncaught typo originated in the patch that added cxl_memdev_security_init(). Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
*	tools/testing/cxl: Remove unused SZ_512G macro	Xiao Yang	2023-07-20	1	-4/+0
\| \| \| \| \| \| \| \| \| \| \|	SZ_512G macro has become useless since commit b2f3b74e1072 ("tools/testing/cxl: Move cxl_test resources to the top of memory") so remove it directly. Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com> Link: https://lore.kernel.org/r/20230719163103.3392-1-yangx.jy@fujitsu.com Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
*	Merge branch 'for-6.5/cxl-rch-eh' into for-6.5/cxl	Dan Williams	2023-06-25	4	-35/+45
\|\ \| \| \| \| \| \| \| \| \| \|	Pick up the first half of the RCH error handling series. The back half needs some fixups for test regressions. Small conflicts with the PMU work around register enumeration and setup helpers.
\| *	cxl: Rename 'uport' to 'uport_dev'	Dan Williams	2023-06-25	2	-15/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For symmetry with the recent rename of ->dport_dev for a 'struct cxl_dport', add the "_dev" suffix to the ->uport property of a 'struct cxl_port'. These devices represent the downstream-port-device and upstream-port-device respectively in the CXL/PCIe topology. Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-6-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| *	cxl/rch: Prepare for caching the MMIO mapped PCIe AER capability	Dan Williams	2023-06-25	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Prepare cxl_probe_rcrb() for retrieving more than just the component register block. The RCH AER handling code wants to get back to the AER capability that happens to be MMIO mapped rather than accessed via configuration cycles. Move RCRB specific downstream port data, like the RCRB base and the AER capability offset, into its own data structure ('struct cxl_rcrb_info') for cxl_probe_rcrb() to fill. Extend 'struct cxl_dport' to include a 'struct cxl_rcrb_info' attribute. This centralizes all RCRB scanning in one routine. Co-developed-by: Robert Richter <rrichter@amd.com> Signed-off-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-4-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| *	cxl/acpi: Probe RCRB later during RCH downstream port creation	Robert Richter	2023-06-25	4	-21/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The RCRB is already extracted during ACPI CEDT table parsing, although its data is not needed until dport creation. This implementation comes with drawbacks: During ACPI table scan there is already MMIO access including mapping and unmapping, but only ACPI data should be collected here. The collected data must be transferred through a couple of interfaces until it is finally consumed when creating the dport. This causes complex data structures and function interfaces. Additionally, RCRB parsing will be extended to also extract AER data; it would be much easier to do this at a later point during port and dport creation when the data structures are available to hold that data. To simplify all that, probe the RCRB at a later point during RCH downstream port creation. Change the ACPI table parser to only extract the base address of either the component registers or the RCRB. Parse and extract the RCRB in devm_cxl_add_rch_dport(). This is in preparation for centralizing all RCRB scanning. Signed-off-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-2-terry.bowman@amd.com Co-developed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/20230622205523.85375-3-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* \|	Merge branch 'for-6.5/cxl-perf' into for-6.5/cxl	Dan Williams	2023-06-25	1	-0/+1
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pick up initial support for the CXL 3.0 performance monitoring definition. Small conflicts with the firmware update work as they both placed their init code in the same location.
\| * \|	cxl/pci: Find and register CXL PMU devices	Jonathan Cameron	2023-05-30	1	-0/+1
\| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	CXL PMU devices can be found from entries in the Register Locator DVSEC. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230526095824.16336-4-Jonathan.Cameron@huawei.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* \|	Merge branch 'for-6.5/cxl-type-2' into for-6.5/cxl	Dan Williams	2023-06-25	4	-104/+79
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pick up the driver cleanups identified in preparation for CXL "type-2" (accelerator) device support. The major change here from a conflict generation perspective is the split of 'struct cxl_memdev_state' from the core 'struct cxl_dev_state', since an accelerator may not care about all the optional features that are standard on a CXL "type-3" (host-only memory expander) device. A silent conflict also occurs with the move of the endpoint port to be a formal property of a 'struct cxl_memdev' rather than drvdata.
\| * \|	Revert "cxl/port: Enable the HDM decoder capability for switch ports"	Dan Williams	2023-06-25	2	-16/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit eb0764b822b9 ("cxl/port: Enable the HDM decoder capability for switch ports") ...was added on the observation of CXL memory not being accessible after setting up a region on a "cold-plugged" device. A "cold-plugged" CXL device is one that was not present at boot, so platform-firmware/BIOS has no chance to set it up. While it is true that the debug found the enable bit clear in the host-bridge's instance of the global control register (CXL 3.0 8.2.4.19.2 CXL HDM Decoder Global Control Register), that bit is described as: "This bit is only applicable to CXL.mem devices and shall return 0 on CXL Host Bridges and Upstream Switch Ports." So it is meant to be zero, and further testing confirmed that this "fix" had no effect on the failure. Revert it, and be more vigilant about proposed fixes in the future. Since the original copied stable@, flag this revert for stable@ as well. Cc: <stable@vger.kernel.org> Fixes: eb0764b822b9 ("cxl/port: Enable the HDM decoder capability for switch ports") Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168685882012.3475336.16733084892658264991.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| * \|	cxl/port: Rename CXL_DECODER_{EXPANDER, ACCELERATOR} => {HOSTONLYMEM, DEVMEM}	Dan Williams	2023-06-25	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In preparation for support for HDM-D and HDM-DB configuration (device-memory, and device-memory with back-invalidate). Rename the current type designators to use HOSTONLYMEM and DEVMEM as a suffix. HDM-DB can be supported by devices that are not accelerators, so DEVMEM is a more generic term for that case. Fixup one location where this type value was open coded. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168679261369.3436160.7042443847605280593.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| * \|	cxl/mbox: Move mailbox related driver state to its own data structure	Dan Williams	2023-06-25	1	-19/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	'struct cxl_dev_state' makes too many assumptions about the capabilities of a CXL device. In particular it assumes a CXL device has a mailbox and all of the infrastructure and state that comes along with that. In preparation for supporting accelerator / Type-2 devices that may not have a mailbox and in general maintain a minimal core context structure, make mailbox functionality a super-set of 'struct cxl_dev_state' with 'struct cxl_memdev_state'. With this reorganization it allows for CXL devices that support HDM decoder mapping, but not other general-expander / Type-3 capabilities, to only enable that subset without the rest of the mailbox infrastructure coming along for the ride. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168679260240.3436160.15520641540463704524.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| * \|	tools/testing/cxl: Remove unused @cxlds argument	Dan Williams	2023-06-25	1	-47/+39
\| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In preparation for plumbing a 'struct cxl_memdev_state' as a superset of a 'struct cxl_dev_state', clean up the usage of @cxlds in the unit test infrastructure. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168679258640.3436160.7641308222525246728.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* \|	Merge branch 'for-6.5/cxl-fwupd' into for-6.5/cxl	Dan Williams	2023-06-25	1	-9/+183
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add the first typical (non-sanitization) consumer of the new background command infrastructure, firmware update. Given both firmware-update and sanitization were developed in parallel from the common background-command baseline, resolve some minor context conflicts.
\| * \|	tools/testing/cxl: add firmware update emulation to CXL memdevs	Vishal Verma	2023-06-25	1	-0/+160
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add emulation for the 'Get FW Info', 'Transfer FW', and 'Activate FW' CXL mailbox commands to the cxl_test emulated memdevs to enable end-to-end unit testing of a firmware update flow. For now, only advertise an 'offline activation' capability as that is all the CXL memdev driver currently implements. Add some canned values for the serial number fields, and create a platform device sysfs knob to calculate the sha256sum of the firmware image that was received, so a unit test can compare it with the original file that was uploaded. Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Jonathan Cameron <Jonathan.Cameron@Huawei.com> Cc: Russ Weight <russell.h.weight@intel.com> Cc: Alison Schofield <alison.schofield@intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Ben Widawsky <bwidawsk@kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/20230602-vv-fw_update-v4-4-c6265bd7343b@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| * \|	tools/testing/cxl: Use named effects for the Command Effect Log	Vishal Verma	2023-06-25	1	-9/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As more emulated mailbox commands are added to cxl_test, it is a pain point to look up command effect numbers for each effect. Replace the bare numbers in the mock driver with an enum that lists all possible effects. Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Jonathan Cameron <Jonathan.Cameron@Huawei.com> Cc: Russ Weight <russell.h.weight@intel.com> Cc: Alison Schofield <alison.schofield@intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Ben Widawsky <bwidawsk@kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Suggested-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/20230602-vv-fw_update-v4-3-c6265bd7343b@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| * \|	tools/testing/cxl: Fix command effects for inject/clear poison	Vishal Verma	2023-06-25	1	-2/+2
\| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The CXL spec (3.0, section 8.2.9.8.4) lists Inject Poison and Clear Poison as having the effect of "Immediate Data Change". Fix this in the mock driver so that the command effect log is populated correctly. Fixes: 371c16101ee8 ("tools/testing/cxl: Mock the Inject Poison mailbox command") Cc: Alison Schofield <alison.schofield@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/20230602-vv-fw_update-v4-2-c6265bd7343b@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* \|	cxl/test: Add Secure Erase opcode support	Davidlohr Bueso	2023-06-25	1	-0/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support to emulate the CXL "Secure Erase" operation. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Link: https://lore.kernel.org/r/20230612181038.14421-8-dave@stgolabs.net Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* \|	cxl/test: Add Sanitize opcode support	Davidlohr Bueso	2023-06-25	1	-0/+25
\|/ \| \| \| \| \| \| \| \| \| \|	Add support to emulate the "Sanitize" operation, without running it as a background operation. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Link: https://lore.kernel.org/r/20230612181038.14421-6-dave@stgolabs.net Signed-off-by: Dan Williams <dan.j.williams@intel.com>
*	cxl: Move cxl_await_media_ready() to before capacity info retrieval	Dave Jiang	2023-05-18	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move cxl_await_media_ready() to cxl_pci probe before driver starts issuing IDENTIFY and retrieving memory device information to ensure that the device is ready to provide the information. Allow cxl_pci_probe() to succeed even if media is not ready. Cache the media failure in cxlds and don't ask the device for any media information. The rationale for proceeding in the !media_ready case is to allow for mailbox operations to interrogate and/or remediate the device. After media is repaired then rebinding the cxl_pci driver is expected to restart the capacity scan. Suggested-by: Dan Williams <dan.j.williams@intel.com> Fixes: b39cb1052a5c ("cxl/mem: Register CXL memX devices") Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168445310026.3251520.8124296540679268206.stgit@djiang5-mobl3 [djbw: fixup cxl_test] Signed-off-by: Dan Williams <dan.j.williams@intel.com>
*	cxl/port: Enable the HDM decoder capability for switch ports	Dan Williams	2023-05-18	2	-0/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Derick noticed, when testing hot plug, that hot-add behaves nominally after a removal. However, if the hot-add is done without a prior removal, CXL.mem accesses fail. It turns out that the original implementation of the port driver and region programming wrongly assumed that platform-firmware always enables the host-bridge HDM decoder capability. Add support for turning on switch-level HDM decoders in the case where platform-firmware has not. The implementation is careful to only arrange for the enable to be undone if the current instance of the driver was the one that did the enable. This is to interoperate with platform-firmware that may expect CXL.mem to remain active after the driver is shut down. This comes at the cost of potentially not shutting down the enable on kexec flows, but it is mitigated by the fact that the related HDM decoders still need to be enabled on an individual basis. Cc: <stable@vger.kernel.org> Reported-by: Derick Marks <derick.w.marks@intel.com> Fixes: 54cdbf845cf7 ("cxl/port: Add a driver for 'struct cxl_port' objects") Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/168437998331.403037.15719879757678389217.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
*	tools/testing/cxl: Use DEFINE_STATIC_SRCU()	Dan Williams	2023-05-13	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Starting with commit: 95433f726301 ("srcu: Begin offloading srcu_struct fields to srcu_update") ...it is no longer possible to do: static DEFINE_SRCU(x) Switch to DEFINE_STATIC_SRCU(x) to fix: tools/testing/cxl/test/mock.c:22:1: error: duplicate ‘static’ 22 \| static DEFINE_SRCU(cxl_mock_srcu); \| ^~~~~~ Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168392709546.1135523.10424917245934547117.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
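The "duplicate 'static'" error quoted in this message is what one would expect if the definer macro starts emitting a `static` declaration of its own before the object being defined. The sketch below models that situation in plain C. The `DEMO_*` macros and `demo_*` types are hypothetical stand-ins, not the kernel's SRCU definitions, and the two-object expansion is an assumption made for illustration.

```c
#include <assert.h>

/* Hypothetical stand-in types; not the kernel's SRCU structures. */
struct demo_usage {
	int users;
};

struct demo_srcu {
	struct demo_usage *usage;
};

/*
 * Assumed shape of the definer: it emits two objects, and the first one
 * is always static. Writing "static DEMO_DEFINE(x);" would therefore
 * expand to "static static struct demo_usage ...", which the compiler
 * rejects as a duplicate storage-class specifier.
 */
#define DEMO_DEFINE_IMPL(name, is_static)				\
	static struct demo_usage name##_usage = { 0 };			\
	is_static struct demo_srcu name = { &name##_usage }

#define DEMO_DEFINE(name)		DEMO_DEFINE_IMPL(name, /* extern */)
#define DEMO_DEFINE_STATIC(name)	DEMO_DEFINE_IMPL(name, static)

/* The static variant passes "static" into the macro instead of prefixing it. */
DEMO_DEFINE_STATIC(demo_mock_srcu);

int demo_users(void)
{
	/* The second object points at the first, hidden, static object. */
	assert(demo_mock_srcu.usage == &demo_mock_srcu_usage);
	return demo_mock_srcu.usage->users;
}
```

Passing the storage class as a macro argument lets the macro place it on the one declaration where it belongs, which is the design the dedicated static variant reflects.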
*	cxl/test: Add mock test for set_timestamp	Davidlohr Bueso	2023-04-24	1	-0/+24
\| \| \| \| \| \| \| \| \|	Support the command testing in a unit-test fashion. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Link: https://lore.kernel.org/r/20230423221231.6357-1-dave@stgolabs.net Signed-off-by: Dan Williams <dan.j.williams@intel.com>
*	tools/testing/cxl: Require CONFIG_DEBUG_FS	Alison Schofield	2023-04-23	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \|	The cxl_mem driver uses debugfs to support poison inject and clear. Add debugfs to the list of required symbols so that cxl_test can emulate those poison operations. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/4f3aab57fbf1cc3ccde2eb887c5d90566c8d0e90.1681874357.git.alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
*	tools/testing/cxl: Add a sysfs attr to test poison inject limits	Alison Schofield	2023-04-23	1	-4/+48
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	CXL devices may report a maximum number of addresses that a device allows to be poisoned using poison injection. When cxl_test creates mock CXL memory devices, it defaults to MOCK_INJECT_DEV_MAX==8 for all mocked memdevs. Add a sysfs attribute, poison_inject_max to module cxl_mock_mem so that users can set a custom device injection limit. Fail, and return -EBUSY, if the mock poison list is not empty. /sys/bus/platform/drivers/cxl_mock_mem/poison_inject_max A simple usage model is to set the attribute before running a test in order to emulate a device's poison handling. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/0f25b2862b90013545450222d2199e435c6cc11a.1681874357.git.alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
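The rule described in this message, that the limit can only be changed while the mock poison list is empty, can be sketched as a small userspace model. `struct mock_state` and `set_poison_inject_max()` are hypothetical names for illustration, not the driver's own symbols, and the real attribute may apply additional validation that is not modelled here.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical model of the state behind the sysfs attribute. */
struct mock_state {
	int poison_inject_dev_max;	/* current per-device injection limit */
	int nr_poison_entries;		/* entries currently in the mock poison list */
};

/*
 * Model of the attribute's store path: refuse the update with -EBUSY
 * while any poison is recorded, otherwise adopt the new limit.
 */
int set_poison_inject_max(struct mock_state *st, int val)
{
	assert(st);

	if (st->nr_poison_entries != 0)
		return -EBUSY;

	st->poison_inject_dev_max = val;
	return 0;
}
```

Rejecting the write while entries exist avoids a state where a device already holds more injected addresses than the newly requested limit would allow.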
*	tools/testing/cxl: Use injected poison for get poison list	Alison Schofield	2023-04-23	1	-19/+43
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Prior to poison inject support, the mock of 'Get Poison List' returned a poison list containing a single mocked error record. Following the addition of poison inject and clear support to the mock driver, use the mock_poison_list[], rather than faking an error record. The mock_poison_list[] tracks the actual poison inject and clear requests issued by userspace. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/0f4242c81821f4982b02cb1009c22783ef66b2f1.1681874357.git.alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
*	tools/testing/cxl: Mock the Clear Poison mailbox command	Alison Schofield	2023-04-23	1	-0/+36
\| \| \| \| \| \| \| \| \| \| \| \| \|	Mock the clear of poison by deleting the device:address entry from the mock_poison_list[]. Behave like a real CXL device and do not fail if the address is not in the poison list, but offer a dev_dbg() message. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/ecf19743c6572e60971bbd078f67d520cf5bca5d.1681874357.git.alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
*	tools/testing/cxl: Mock the Inject Poison mailbox command	Alison Schofield	2023-04-23	1	-0/+77
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Mock the injection of poison by storing the device:address entries in mock_poison_list[]. Enforce a limit of 8 poison injections per memdev device and 128 total entries for the cxl_test mock driver. Introducing the mock_poison_list[] here makes it available for use in the mock of Clear Poison, and the mock of Get Poison List. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/f6b7f03541eaa8c2260d3eafadd04afe3f0d7962.1681874357.git.alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
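The two limits named in this message (8 injections per memdev, 128 entries in total) can be illustrated with a minimal userspace model of a fixed-size device:address list. Everything here is a sketch under assumptions: `mock_inject()`, `struct poison_entry`, and the integer `dev_id` are hypothetical, the choice of `-EBUSY` and `-ENOSPC` as error codes is arbitrary, and duplicate addresses are not detected.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define DEV_MAX		8	/* per-memdev injection limit from the text */
#define TEST_MAX	128	/* total list entries from the text */

/* Hypothetical device:address record; dev_id stands in for the memdev. */
struct poison_entry {
	bool used;
	int dev_id;
	uint64_t dpa;
};

static struct poison_entry poison_list[TEST_MAX];

/* Count how many entries the given device currently holds. */
static int count_for_dev(int dev_id)
{
	int i, n = 0;

	for (i = 0; i < TEST_MAX; i++)
		if (poison_list[i].used && poison_list[i].dev_id == dev_id)
			n++;
	return n;
}

/*
 * Record one injected address. Error codes are illustrative choices:
 * -EBUSY when the device is at its limit, -ENOSPC when the list is full.
 */
int mock_inject(int dev_id, uint64_t dpa)
{
	int i;

	if (count_for_dev(dev_id) >= DEV_MAX)
		return -EBUSY;

	for (i = 0; i < TEST_MAX; i++) {
		if (!poison_list[i].used) {
			poison_list[i].used = true;
			poison_list[i].dev_id = dev_id;
			poison_list[i].dpa = dpa;
			return 0;
		}
	}
	return -ENOSPC;
}
```

Because 16 devices at 8 entries each fill all 128 slots, a seventeenth device in this model is turned away by the total limit even though it has injected nothing itself.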
*	tools/testing/cxl: Mock support for Get Poison List	Alison Schofield	2023-04-23	1	-0/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make mock memdevs support the Get Poison List mailbox command. Return a fake poison error record when the get poison list command is issued. This supports testing the kernel tracing and cxl list capabilities for media errors. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/14d661ce3e3a32b7d8e76b8ecc5eb88343b3d09c.1681838292.git.alison.schofield@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
*	Merge branch 'for-6.3/cxl-rr-emu' into cxl/next	Dan Williams	2023-02-14	4	-12/+37
\|\ \| \| \| \| \| \| \| \| \| \|	Pick up the CXL DVSEC range register emulation for v6.3, and resolve conflicts with the cxl_port_probe() split (from for-6.3/cxl-ram-region) and event handling (from for-6.3/cxl-events).
\| *	cxl/hdm: Create emulated cxl_hdm for devices that do not have HDM decoders	Dave Jiang	2023-02-14	3	-5/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Per CXL rev3 spec 8.1.3, RCDs may not have HDM register blocks. Create a fake HDM with information from the CXL PCIe DVSEC registers. The decoder count will be set to the HDM count retrieved from the DVSEC cap register. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/167640368994.935665.15831225724059704620.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| *	cxl/hdm: Emulate HDM decoder from DVSEC range registers	Dave Jiang	2023-02-14	3	-5/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the case where HDM decoder register block exists but is not programmed and at the same time the DVSEC range register range is active, populate the CXL decoder object 'cxl_decoder' with info from DVSEC range registers. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/167640368454.935665.13806415120298330717.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| *	cxl/port: Export cxl_dvsec_rr_decode() to cxl_port	Dave Jiang	2023-02-14	2	-2/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Call cxl_dvsec_rr_decode() in the beginning of cxl_port_probe() and preserve the decoded information in a local 'struct cxl_endpoint_dvsec_info'. This info can be passed to various functions later on in order to support the HDM decoder emulation. The invocation of cxl_dvsec_rr_decode() in cxl_hdm_decode_init() is removed and a pointer to the 'struct cxl_endpoint_dvsec_info' is passed in. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/167640367377.935665.2848747799651019676.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* \|	Merge branch 'for-6.3/cxl-ram-region' into cxl/next	Dan Williams	2023-02-10	1	-10/+137
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Include the support for enumerating and provisioning ram regions for v6.3. This also includes a default policy change for ram / volatile device-dax instances to assign them to the dax_kmem driver by default.
\| * \|	tools/testing/cxl: Define a fixed volatile configuration to parse	Dan Williams	2023-02-10	1	-10/+137
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Take two endpoints attached to the first switch on the first host-bridge in the cxl_test topology and define a pre-initialized region. This is a x2 interleave underneath a x1 CXL Window. $ modprobe cxl_test $ # cxl list -Ru { "region":"region3", "resource":"0xf010000000", "size":"512.00 MiB (536.87 MB)", "interleave_ways":2, "interleave_granularity":4096, "decode_state":"commit" } Tested-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/167602000547.1924368.11613151863880268868.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
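As a rough illustration of what a x2 interleave at 4096-byte granularity means for the region listed above, the sketch below maps a host physical address to a position in the interleave set using plain modulo arithmetic. The `struct region` type and the `interleave_position()` helper are hypothetical names made up for this example; they are not part of the CXL driver, and the XOR interleave arithmetic that cxl_test also supports is not modeled.

```c
#include <stdint.h>

/* Hypothetical description of a region, mirroring the `cxl list -Ru` fields. */
struct region {
	uint64_t base;        /* "resource" */
	uint64_t size;        /* "size" in bytes */
	unsigned ways;        /* "interleave_ways" */
	unsigned granularity; /* "interleave_granularity" in bytes */
};

/*
 * Return the position (0 .. ways-1) of the endpoint that would service
 * @hpa under modulo interleave math, or -1 if @hpa is outside the region.
 * Assumes the region base is aligned to ways * granularity.
 */
int interleave_position(const struct region *r, uint64_t hpa)
{
	if (hpa < r->base || hpa - r->base >= r->size)
		return -1;
	return (int)(((hpa - r->base) / r->granularity) % r->ways);
}
```

With the values from the listing (base 0xf010000000, 512 MiB, 2 ways, 4096-byte granularity), consecutive 4 KiB chunks alternate between position 0 and position 1.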
* \| \|	Merge branch 'for-6.3/cxl-events' into cxl/next	Dan Williams	2023-02-07	2	-1/+353
\|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \|	Add the CXL event and interrupt support for the v6.3 update.
\| * \| \|	cxl/test: Simulate event log overflow	Ira Weiny	2023-01-26	1	-1/+49
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Log overflow is marked by a separate trace message. Simulate a log with lots of messages and flag overflow until space is cleared. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20221216-cxl-ev-log-v7-8-2316a5c8f7d8@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| * \| \|	cxl/test: Add specific events	Ira Weiny	2023-01-26	1	-0/+73
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Each type of event has different trace point outputs. Add mock General Media Event, DRAM event, and Memory Module Event records to the mock list of events returned. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20221216-cxl-ev-log-v7-7-2316a5c8f7d8@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
\| * \| \|	cxl/test: Add generic mock events	Ira Weiny	2023-01-26	2	-1/+232
\| \|/ / \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Facilitate testing basic Get/Clear Event functionality by creating multiple logs and generic events with made up UUID's. Data is completely made up with data patterns which should be easy to spot in trace output. A single sysfs entry resets the event data and triggers collecting the events for testing. Test traces are easy to obtain with a small script such as this: #!/bin/bash -x devices=`find /sys/devices/platform -name cxl_mem*` # Turn on tracing echo "" > /sys/kernel/tracing/trace echo 1 > /sys/kernel/tracing/events/cxl/enable echo 1 > /sys/kernel/tracing/tracing_on # Generate fake interrupt for device in $devices; do echo 1 > $device/event_trigger done # Turn off tracing and report events echo 0 > /sys/kernel/tracing/tracing_on cat /sys/kernel/tracing/trace Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20221216-cxl-ev-log-v7-6-2316a5c8f7d8@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* / /	tools/testing/cxl: Remove cxl_test module math loading message	Alison Schofield	2023-01-26	1	-3/+1
\|/ / \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit "tools/testing/cxl: Add XOR Math support to cxl_test" added a module parameter to cxl_test for the interleave_arithmetic option. In doing so, it also added this dev_dbg() message describing which option cxl_test used during load: "[ 111.743246] (NULL device ): cxl_test loading modulo math option" That "(NULL device )" has raised needless user concern. Remove the dev_dbg() message and make the module_param readable via sysfs for users that need to know which math option is active. Suggested-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/20230126170555.701240-1-alison.schofield@intel.com Signed-off-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* \|	tools/testing/cxl: require 64-bit	Luis Chamberlain	2023-01-25	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On 32-bit builds, size_t is limited to 32 bits, so the gen_pool_alloc() using the size of SZ_64G would map to 0, triggering a low allocation which is not expected. Force the dependency on 64-bit for cxl_test as that is what it was designed for. This issue was found by build test reports when converting this driver to a proper upstream driver. Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Link: https://lore.kernel.org/r/20221219195050.325959-1-mcgrof@kernel.org Signed-off-by: Dan Williams <dan.j.williams@intel.com>
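The truncation described above can be reproduced in a few lines of userspace C. This is only a sketch: `size32_t` and `to_size32()` are hypothetical stand-ins for what happens when a 64-bit byte count is passed through a 32-bit size_t, and the SZ_64G value is written out here rather than taken from kernel headers.

```c
#include <stdint.h>

/* 64 GiB = 2^36 bytes, written out for this example. */
#define SZ_64G 0x1000000000ULL

/* Stand-in for size_t on a 32-bit build (assumption for illustration). */
typedef uint32_t size32_t;

/*
 * Narrow a 64-bit byte count to the 32-bit size type. Unsigned
 * conversion keeps only the low 32 bits, so any multiple of 2^32
 * collapses to 0.
 */
size32_t to_size32(uint64_t bytes)
{
	return (size32_t)bytes;
}
```

Because 2^36 is a multiple of 2^32, `to_size32(SZ_64G)` yields 0, which is the size the allocator would end up seeing on a 32-bit build.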
* \|	tools/testing/cxl: Prevent cxl_test from confusing production modules	Dan Williams	2023-01-05	8	-0/+69
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The cxl_test machinery builds modified versions of the modules in drivers/cxl/ and intercepts some of their calls to allow cxl_test to inject mock CXL topologies for test. However, if cxl_test attempts the same with production modules, fireworks ensue as Luis discovered [1]. Prevent that scenario by arranging for cxl_test to check for a "watermark" symbol in each of the modules it expects to be modified before the test can run. This turns undefined runtime behavior or crashes into a safer failure to load the cxl_test module. Link: http://lore.kernel.org/r/20221209062919.1096779-1-mcgrof@kernel.org [1] Reported-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
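The idea of refusing to run unless every expected module carries a "watermark" can be modeled in userspace as below. This is a loose sketch of the concept only, not the mechanism the commit uses inside the kernel: `struct mock_module`, its `has_watermark` flag, and `first_unmodified()` are all hypothetical names invented for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record of a module the test harness expects to be modified. */
struct mock_module {
	const char *name;
	bool has_watermark; /* true only for the test-built variant */
};

/*
 * Return the name of the first module lacking the watermark, or NULL
 * when every module is a test build and it is safe to proceed.
 */
const char *first_unmodified(const struct mock_module *mods, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!mods[i].has_watermark)
			return mods[i].name;
	}
	return NULL;
}
```

A caller would treat a non-NULL result as a reason to fail early with a clear error instead of continuing against a production module.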
* \|	cxl/pci: Move tracepoint definitions to drivers/cxl/core/	Dan Williams	2023-01-04	1	-0/+2
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	CXL is using tracepoints for reporting RAS capability register payloads for AER events, and has plans to use tracepoints for the output payload of Get Poison List and Get Event Records commands. For organization purposes it would be nice to keep those all under a single + local CXL trace system. This organization also potentially helps in the future when CXL drivers expand beyond generic memory expanders; however, that would also entail a move away from the expander-specific cxl_dev_state context, so save that for later. Note that the powerpc-specific drivers/misc/cxl/ also defines a 'cxl' trace system; however, it is unlikely that a single platform will ever load both drivers simultaneously. Cc: Steven Rostedt <rostedt@goodmis.org> Tested-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/167051869176.436579.9728373544811641087.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
*	tools/testing/cxl: Require cache invalidation bypass	Dan Williams	2022-12-05	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \|	The typical environment where cxl_test is run, QEMU, does not support cpu_cache_invalidate_memregion(). Add the 'test' bypass symbols to the configuration check. Reported-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/167026948179.3527561.4535373655515827457.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>