summaryrefslogtreecommitdiffstats
path: root/arch/s390/include
Commit message (Collapse)AuthorAgeFilesLines
...
| * | | s390/uaccess: Define INLINE_COPY_FROM_USER and INLINE_COPY_TO_USERHeiko Carstens2025-03-041-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Inline copy_from_user() and copy_to_user(). With the shortened inline assemblies of raw_copy_to_user() and raw_copy_from_user() the additional kernel text size is acceptable, considering that this avoids function calls on hot paths. This increases the kernel text size by ~90kb (defconfig, gcc 14.2.0): add/remove: 13/4 grow/shrink: 650/14 up/down: 93484/-3254 (90230) Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
| * | | s390/uaccess: Separate key uaccess functionsHeiko Carstens2025-03-041-52/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement separate raw_copy_to_user_key() and raw_copy_from_user_key() functions, which allows to remove the open-coded operand access control handling from the normal raw_copy_to_user() / raw_copy_from_user() functions - they are simplified to use immediate instructions to load hard-coded operand access control values into register zero, which saves one instruction. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
| * | | s390/uaccess: Shorten raw_copy_from_user() / raw_copy_to_user() inline ↵Heiko Carstens2025-03-042-64/+58
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | assemblies Add specific exception handler for copy_to_user() / copy_from_user() mvcos fault handling, which allows to shorten the inline assemblies to three instructions. On fault the exception handler adjusts the length used by the mvcos instruction in a way that the instruction completes with condition code zero, indicating the number of bytes copied with the input/output operand 'size'. This allows to calculate and return the number of bytes not copied, if any, like required. Loop and return value handling is changed to C so that the compiler may optimize the code. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* | | Merge tag 'pci-v6.15-changes' of ↵Linus Torvalds2025-03-281-0/+3
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci updates from Bjorn Helgaas: "Enumeration: - Enable Configuration RRS SV, which makes device readiness visible, early instead of during child bus scanning (Bjorn Helgaas) - Log debug messages about reset methods being used (Bjorn Helgaas) - Avoid reset when it has been disabled via sysfs (Nishanth Aravamudan) - Add common pci-ep-bus.yaml schema for exporting several peripherals of a single PCI function via devicetree (Andrea della Porta) - Create DT nodes for PCI host bridges to enable loading device tree overlays to create platform devices for PCI devices that have several features that require multiple drivers (Herve Codina) Resource management: - Enlarge devres table[] to accommodate bridge windows, ROM, IOV BARs, etc., and validate BAR index in devres interfaces (Philipp Stanner) - Fix typo that repeatedly distributed resources to a bridge instead of iterating over subordinate bridges, which resulted in too little space to assign some BARs (Kai-Heng Feng) - Relax bridge window tail sizing for optional resources, e.g., IOV BARs, to avoid failures when removing and re-adding devices (Ilpo Järvinen) - Allow drivers to enable devices even if we haven't assigned optional IOV resources to them (Ilpo Järvinen) - Rework handling of optional resources (IOV BARs, ROMs) to reduce failures if we can't allocate them (Ilpo Järvinen) - Fix a NULL dereference in the SR-IOV VF creation error path (Shay Drory) - Fix s390 mmio_read/write syscalls, which didn't cause page faults in some cases, which broke vfio-pci lazy mapping on first access (Niklas Schnelle) - Add pdev->non_mappable_bars to replace CONFIG_VFIO_PCI_MMAP, which was disabled only for s390 (Niklas Schnelle) - Support mmap of PCI resources on s390 except for ISM devices (Niklas Schnelle) ASPM: - Delay pcie_link_state deallocation to avoid dangling pointers that cause invalid references during hot-unplug (Daniel Stodden) Power management: - Allow PCI bridges to go to D3Hot when suspending on all non-x86 systems (Manivannan Sadhasivam) Power control: - Create pwrctrl devices in pci_scan_device() to make it more symmetric with pci_pwrctrl_unregister() and make pwrctrl devices for PCI bridges possible (Manivannan Sadhasivam) - Unregister pwrctrl devices in pci_destroy_dev() so DOE, ASPM, etc. can still access devices after pci_stop_dev() (Manivannan Sadhasivam) - If there's a pwrctrl device for a PCI device, skip scanning it because the pwrctrl core will rescan the bus after the device is powered on (Manivannan Sadhasivam) - Add a pwrctrl driver for PCI slots based on voltage regulators described via devicetree (Manivannan Sadhasivam) Bandwidth control: - Add set_pcie_speed.sh to TEST_PROGS to fix issue when executing the set_pcie_cooling_state.sh test case (Yi Lai) - Avoid a NULL pointer dereference when we run out of bus numbers to assign for a bridge secondary bus (Lukas Wunner) Hotplug: - Drop superfluous pci_hotplug_slot_list, try_module_get() calls, and NULL pointer checks (Lukas Wunner) - Drop shpchp module init/exit logging, replace shpchp dbg() with ctrl_dbg(), and remove unused dbg(), err(), info(), warn() wrappers (Ilpo Järvinen) - Drop 'shpchp_debug' module parameter in favor of standard dynamic debugging (Ilpo Järvinen) - Drop unused cpcihp .get_power(), .set_power() function pointers (Guilherme Giacomo Simoes) - Disable hotplug interrupts in portdrv only when pciehp is not enabled to avoid issuing two hotplug commands too close together (Feng Tang) - Skip pciehp 'device replaced' check if the device has been removed to address a deadlock when resuming after a device was removed during system sleep (Lukas Wunner) - Don't enable pciehp hotplug interupt when resuming in poll mode (Ilpo Järvinen) Virtualization: - Fix bugs in 'pci=config_acs=' kernel command line parameter (Tushar Dave) DOE: - Expose supported DOE features via sysfs (Alistair Francis) - Allow DOE support to be enabled even if CXL isn't enabled (Alistair Francis) Endpoint framework: - Convert PCI device data so pci-epf-test works correctly on big-endian endpoint systems (Niklas Cassel) - Add BAR_RESIZABLE type to endpoint framework and add DWC core support for EPF drivers to set BAR_RESIZABLE type and size (Niklas Cassel) - Fix pci-epf-test double free that causes an oops if the host reboots and PERST# deassertion restarts endpoint BAR allocation (Christian Bruel) - Fix endpoint BAR testing so tests can skip disabled BARs instead of reporting them as failures (Niklas Cassel) - Widen endpoint test BAR size variable to accommodate BARs larger than INT_MAX (Niklas Cassel) - Remove unused tools 'pci' build target left over after moving tests to tools/testing/selftests/pci_endpoint (Jianfeng Liu) Altera PCIe controller driver: - Add DT binding and driver support for Agilex family (P-Tile, F-Tile, R-Tile) (Matthew Gerlach and D M, Sharath Kumar) AMD MDB PCIe controller driver: - Add DT binding and driver for AMD MDB (Multimedia DMA Bridge) (Thippeswamy Havalige) Broadcom STB PCIe controller driver: - Add BCM2712 MSI-X DT binding and interrupt controller drivers and add softdep on irq_bcm2712_mip driver to ensure that it is loaded first (Stanimir Varbanov) - Expand inbound window map to 64GB so it can accommodate BCM2712 (Stanimir Varbanov) - Add BCM2712 support and DT updates (Stanimir Varbanov) - Apply link speed restriction before bringing link up, not after (Jim Quinlan) - Update Max Link Speed in Link Capabilities via the internal writable register, not the read-only config register (Jim Quinlan) - Handle regulator_bulk_get() error to avoid panic when we call regulator_bulk_free() later (Jim Quinlan) - Disable regulators only when removing the bus immediately below a Root Port because we don't support regulators deeper in the hierarchy (Jim Quinlan) - Make const read-only arrays static (Colin Ian King) Cadence PCIe endpoint driver: - Correct MSG TLP generation so endpoints can generate INTx messages (Hans Zhang) Freescale i.MX6 PCIe controller driver: - Identify the second controller on i.MX8MQ based on devicetree 'linux,pci-domain' instead of DBI 'reg' address (Richard Zhu) - Remove imx_pcie_cpu_addr_fixup() since dwc core can now derive the ATU input address (using parent_bus_offset) from devicetree (Frank Li) Freescale Layerscape PCIe controller driver: - Drop deprecated 'num-ib-windows' and 'num-ob-windows' and unnecessary 'status' from example (Krzysztof Kozlowski) - Correct the syscon_regmap_lookup_by_phandle_args("fsl,pcie-scfg") arg_count to fix probe failure on LS1043A (Ioana Ciornei) HiSilicon STB PCIe controller driver: - Call phy_exit() to clean up if histb_pcie_probe() fails (Christophe JAILLET) Intel Gateway PCIe controller driver: - Remove intel_pcie_cpu_addr() since dwc core can now derive the ATU input address (using parent_bus_offset) from devicetree (Frank Li) Intel VMD host bridge driver: - Convert vmd_dev.cfg_lock from spinlock_t to raw_spinlock_t so pci_ops.read() will never sleep, even on PREEMPT_RT where spinlock_t becomes a sleepable lock, to avoid calling a sleeping function from invalid context (Ryo Takakura) MediaTek PCIe Gen3 controller driver: - Remove leftover mac_reset assert for Airoha EN7581 SoC (Lorenzo Bianconi) - Add EN7581 PBUS controller 'mediatek,pbus-csr' DT property and program host bridge memory aperture to this syscon node (Lorenzo Bianconi) Qualcomm PCIe controller driver: - Add qcom,pcie-ipq5332 binding (Varadarajan Narayanan) - Add qcom i.MX8QM and i.MX8QXP/DXP optional DMA interrupt (Alexander Stein) - Add optional dma-coherent DT property for Qualcomm SA8775P (Dmitry Baryshkov) - Make DT iommu property required for SA8775P and prohibited for SDX55 (Dmitry Baryshkov) - Add DT IOMMU and DMA-related properties for Qualcomm SM8450 (Dmitry Baryshkov) - Add endpoint DT properties for SAR2130P and enable endpoint mode in driver (Dmitry Baryshkov) - Describe endpoint BAR0 and BAR2 as 64-bit only and BAR1 and BAR3 as RESERVED (Manivannan Sadhasivam) Rockchip DesignWare PCIe controller driver: - Describe rk3568 and rk3588 BARs as Resizable, not Fixed (Niklas Cassel) Synopsys DesignWare PCIe controller driver: - Add debugfs-based Silicon Debug, Error Injection, Statistical Counter support for DWC (Shradha Todi) - Add debugfs property to expose LTSSM status of DWC PCIe link (Hans Zhang) - Add Rockchip support for DWC debugfs features (Niklas Cassel) - Add dw_pcie_parent_bus_offset() to look up the parent bus address of a specified 'reg' property and return the offset from the CPU physical address (Frank Li) - Use dw_pcie_parent_bus_offset() to derive CPU -> ATU addr offset via 'reg[config]' for host controllers and 'reg[addr_space]' for endpoint controllers (Frank Li) - Apply struct dw_pcie.parent_bus_offset in ATU users to remove use of .cpu_addr_fixup() when programming ATU (Frank Li) TI J721E PCIe driver: - Correct the 'link down' interrupt bit for J784S4 (Siddharth Vadapalli) TI Keystone PCIe controller driver: - Describe AM65x BARs 2 and 5 as Resizable (not Fixed) and reduce alignment requirement from 1MB to 64KB (Niklas Cassel) Xilinx Versal CPM PCIe controller driver: - Free IRQ domain in probe error path to avoid leaking it (Thippeswamy Havalige) - Add DT .compatible "xlnx,versal-cpm5nc-host" and driver support for Versal Net CPM5NC Root Port controller (Thippeswamy Havalige) - Add driver support for CPM5_HOST1 (Thippeswamy Havalige) Miscellaneous: - Convert fsl,mpc83xx-pcie binding to YAML (J. Neuschäfer) - Use for_each_available_child_of_node_scoped() to simplify apple, kirin, mediatek, mt7621, tegra drivers (Zhang Zekun)" * tag 'pci-v6.15-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (197 commits) PCI: layerscape: Fix arg_count to syscon_regmap_lookup_by_phandle_args() PCI: j721e: Fix the value of .linkdown_irq_regfield for J784S4 misc: pci_endpoint_test: Add support for PCITEST_IRQ_TYPE_AUTO PCI: endpoint: pci-epf-test: Expose supported IRQ types in CAPS register PCI: dw-rockchip: Endpoint mode cannot raise INTx interrupts PCI: endpoint: Add intx_capable to epc_features struct dt-bindings: PCI: Add common schema for devices accessible through PCI BARs PCI: intel-gw: Remove intel_pcie_cpu_addr() PCI: imx6: Remove imx_pcie_cpu_addr_fixup() PCI: dwc: Use parent_bus_offset to remove need for .cpu_addr_fixup() PCI: dwc: ep: Ensure proper iteration over outbound map windows PCI: dwc: ep: Use devicetree 'reg[addr_space]' to derive CPU -> ATU addr offset PCI: dwc: ep: Consolidate devicetree handling in dw_pcie_ep_get_resources() PCI: dwc: ep: Call epc_create() early in dw_pcie_ep_init() PCI: dwc: Use devicetree 'reg[config]' to derive CPU -> ATU addr offset PCI: dwc: Add dw_pcie_parent_bus_offset() checking and debug PCI: dwc: Add dw_pcie_parent_bus_offset() PCI/bwctrl: Fix NULL pointer dereference on bus number exhaustion PCI: xilinx-cpm: Add cpm_csr register mapping for CPM5_HOST1 variant PCI: brcmstb: Make const read-only arrays static ...
| * | | s390/pci: Support mmap() of PCI resources except for ISM devicesNiklas Schnelle2025-03-211-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | So far s390 does not allow mmap() of PCI resources to user-space via the usual mechanisms, though it does use it for RDMA. For the PCI sysfs resource files and /proc/bus/pci it defines neither HAVE_PCI_MMAP nor ARCH_GENERIC_PCI_MMAP_RESOURCE. For vfio-pci s390 previously relied on disabled VFIO_PCI_MMAP and now relies on setting pdev->non_mappable_bars for all devices. This is partly because access to mapped PCI resources from user-space requires special PCI load/store memory-I/O (MIO) instructions, or the special MMIO syscalls when these are not available. Still, such access is possible and useful not just for RDMA, in fact not being able to mmap() PCI resources has previously caused extra work when testing devices. One thing that doesn't work with PCI resources mapped to user-space though is the s390 specific virtual ISM device. Not only because the BAR size of 256 TiB prevents mapping the whole BAR but also because access requires use of the legacy PCI instructions which are not accessible to user-space on systems with the newer MIO PCI instructions. Now with the pdev->non_mappable_bars flag ISM can be excluded from mapping its resources while making this functionality available for all other PCI devices. To this end introduce a minimal implementation of PCI_QUIRKS and use that to set pdev->non_mappable_bars for ISM devices only. Then also set ARCH_GENERIC_PCI_MMAP_RESOURCE to take advantage of the generic implementation of pci_mmap_resource_range() enabling only the newer sysfs mmap() interface. This follows the recommendation in Documentation/PCI/sysfs-pci.rst. Link: https://lore.kernel.org/r/20250226-vfio_pci_mmap-v7-3-c5c0f1d26efd@linux.ibm.com Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* | | | Merge tag 'net-next-6.15' of ↵Linus Torvalds2025-03-261-1/+0
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core & protocols: - Continue Netlink conversions to per-namespace RTNL lock (IPv4 routing, routing rules, routing next hops, ARP ioctls) - Continue extending the use of netdev instance locks. As a driver opt-in protect queue operations and (in due course) ethtool operations with the instance lock and not RTNL lock. - Support collecting TCP timestamps (data submitted, sent, acked) in BPF, allowing for transparent (to the application) and lower overhead tracking of TCP RPC performance. - Tweak existing networking Rx zero-copy infra to support zero-copy Rx via io_uring. - Optimize MPTCP performance in single subflow mode by 29%. - Enable GRO on packets which went thru XDP CPU redirect (were queued for processing on a different CPU). Improving TCP stream performance up to 2x. - Improve performance of contended connect() by 200% by searching for an available 4-tuple under RCU rather than a spin lock. Bring an additional 229% improvement by tweaking hash distribution. - Avoid unconditionally touching sk_tsflags on RX, improving performance under UDP flood by as much as 10%. - Avoid skb_clone() dance in ping_rcv() to improve performance under ping flood. - Avoid FIB lookup in netfilter if socket is available, 20% perf win. - Rework network device creation (in-kernel) API to more clearly identify network namespaces and their roles. There are up to 4 namespace roles but we used to have just 2 netns pointer arguments, interpreted differently based on context. - Use sysfs_break_active_protection() instead of trylock to avoid deadlocks between unregistering objects and sysfs access. - Add a new sysctl and sockopt for capping max retransmit timeout in TCP. - Support masking port and DSCP in routing rule matches. - Support dumping IPv4 multicast addresses with RTM_GETMULTICAST. - Support specifying at what time packet should be sent on AF_XDP sockets. - Expose TCP ULP diagnostic info (for TLS and MPTCP) to non-admin users. - Add Netlink YAML spec for WiFi (nl80211) and conntrack. - Introduce EXPORT_IPV6_MOD() and EXPORT_IPV6_MOD_GPL() for symbols which only need to be exported when IPv6 support is built as a module. - Age FDB entries based on Rx not Tx traffic in VxLAN, similar to normal bridging. - Allow users to specify source port range for GENEVE tunnels. - netconsole: allow attaching kernel release, CPU ID and task name to messages as metadata Driver API: - Continue rework / fixing of Energy Efficient Ethernet (EEE) across the SW layers. Delegate the responsibilities to phylink where possible. Improve its handling in phylib. - Support symmetric OR-XOR RSS hashing algorithm. - Support tracking and preserving IRQ affinity by NAPI itself. - Support loopback mode speed selection for interface selftests. Device drivers: - Remove the IBM LCS driver for s390 - Remove the sb1000 cable modem driver - Add support for SFP module access over SMBus - Add MCTP transport driver for MCTP-over-USB - Enable XDP metadata support in multiple drivers - Ethernet high-speed NICs: - Broadcom (bnxt): - add PCIe TLP Processing Hints (TPH) support for new AMD platforms - support dumping RoCE queue state for debug - opt into instance locking - Intel (100G, ice, idpf): - ice: rework MSI-X IRQ management and distribution - ice: support for E830 devices - iavf: add support for Rx timestamping - iavf: opt into instance locking - nVidia/Mellanox: - mlx4: use page pool memory allocator for Rx - mlx5: support for one PTP device per hardware clock - mlx5: support for 200Gbps per-lane link modes - mlx5: move IPSec policy check after decryption - AMD/Solarflare: - support FW flashing via devlink - Cisco (enic): - use page pool memory allocator for Rx - enable 32, 64 byte CQEs - get max rx/tx ring size from the device - Meta (fbnic): - support flow steering and RSS configuration - report queue stats - support TCP segmentation - support IRQ coalescing - support ring size configuration - Marvell/Cavium: - support AF_XDP - Wangxun: - support for PTP clock and timestamping - Huawei (hibmcge): - checksum offload - add more statistics - Ethernet virtual: - VirtIO net: - aggressively suppress Tx completions, improve perf by 96% with 1 CPU and 55% with 2 CPUs - expose NAPI to IRQ mapping and persist NAPI settings - Google (gve): - support XDP in DQO RDA Queue Format - opt into instance locking - Microsoft vNIC: - support BIG TCP - Ethernet NICs consumer, and embedded: - Synopsys (stmmac): - cleanup Tx and Tx clock setting and other link-focused cleanups - enable SGMII and 2500BASEX mode switching for Intel platforms - support Sophgo SG2044 - Broadcom switches (b53): - support for BCM53101 - TI: - iep: add perout configuration support - icssg: support XDP - Cadence (macb): - implement BQL - Xilinx (axinet): - support dynamic IRQ moderation and changing coalescing at runtime - implement BQL - report standard stats - MediaTek: - support phylink managed EEE - Intel: - igc: don't restart the interface on every XDP program change - RealTek (r8169): - support reading registers of internal PHYs directly - increase max jumbo packet size on RTL8125/RTL8126 - Airoha: - support for RISC-V NPU packet processing unit - enable scatter-gather and support MTU up to 9kB - Tehuti (tn40xx): - support cards with TN4010 MAC and an Aquantia AQR105 PHY - Ethernet PHYs: - support for TJA1102S, TJA1121 - dp83tg720: add randomized polling intervals for link detection - dp83822: support changing the transmit amplitude voltage - support for LEDs on 88q2xxx - CAN: - canxl: support Remote Request Substitution bit access - flexcan: add S32G2/S32G3 SoC - WiFi: - remove cooked monitor support - strict mode for better AP testing - basic EPCS support - OMI RX bandwidth reduction support - batman-adv: add support for jumbo frames - WiFi drivers: - RealTek (rtw88): - support RTL8814AE and RTL8814AU - RealTek (rtw89): - switch using wiphy_lock and wiphy_work - add BB context to manipulate two PHY as preparation of MLO - improve BT-coexistence mechanism to play A2DP smoothly - Intel (iwlwifi): - add new iwlmld sub-driver for latest HW/FW combinations - MediaTek (mt76): - preparation for mt7996 Multi-Link Operation (MLO) support - Qualcomm/Atheros (ath12k): - continued work on MLO - Silabs (wfx): - Wake-on-WLAN support - Bluetooth: - add support for skb TX SND/COMPLETION timestamping - hci_core: enable buffer flow control for SCO/eSCO - coredump: log devcd dumps into the monitor - Bluetooth drivers: - intel: add support to configure TX power - nxp: handle bootloader error during cmd5 and cmd7" * tag 'net-next-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1681 commits) unix: fix up for "apparmor: add fine grained af_unix mediation" mctp: Fix incorrect tx flow invalidation condition in mctp-i2c net: usb: asix: ax88772: Increase phy_name size net: phy: Introduce PHY_ID_SIZE — minimum size for PHY ID string net: libwx: fix Tx L4 checksum net: libwx: fix Tx descriptor content for some tunnel packets atm: Fix NULL pointer dereference net: tn40xx: add pci-id of the aqr105-based Tehuti TN4010 cards net: tn40xx: prepare tn40xx driver to find phy of the TN9510 card net: tn40xx: create swnode for mdio and aqr105 phy and add to mdiobus net: phy: aquantia: add essential functions to aqr105 driver net: phy: aquantia: search for firmware-name in fwnode net: phy: aquantia: add probe function to aqr105 for firmware loading net: phy: Add swnode support to mdiobus_scan gve: add XDP DROP and PASS support for DQ gve: update XDP allocation path support RX buffer posting gve: merge packet buffer size fields gve: update GQ RX to use buf_size gve: introduce config-based allocation for XDP gve: remove xdp_xsk_done and xdp_xsk_wakeup statistics ...
| * \ \ \ Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2025-03-061-4/+12
| |\ \ \ \ | | | |_|/ | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Cross-merge networking fixes after downstream PR (net-6.14-rc6). Conflicts: net/ethtool/cabletest.c 2bcf4772e45a ("net: ethtool: try to protect all callback with netdev instance lock") 637399bf7e77 ("net: ethtool: netlink: Allow NULL nlattrs when getting a phy_device") No Adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2025-02-201-1/+5
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Cross-merge networking fixes after downstream PR (net-6.14-rc4). No conflicts or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * \ \ \ \ Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2025-02-134-21/+32
| |\ \ \ \ \ | | | |_|_|/ | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Cross-merge networking fixes after downstream PR (net-6.14-rc3). No conflicts or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | | | | s390/net: Remove LCS driverAswin Karuvally2025-02-051-1/+0
| | |_|_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The original Open Systems Adapter (OSA) was introduced by IBM in the mid-90s. These were then superseded by OSA-Express in 1999 which used Queued Direct IO to greatly improve throughput. The newer cards retained the older, slower non-QDIO (OSE) modes for compatibility with older systems. In Linux, the lcs driver was responsible for cards operating in the older OSE mode and the qeth driver was introduced to allow the OSA-Express cards to operate in the newer QDIO (OSD) mode. For an S390 machine from 1998 or later, there is no reason to use the OSE mode and lcs driver as all OSA cards since 1999 provide the faster OSD mode. As a result, it's been years since we have heard of a customer configuration involving the lcs driver. This patch removes the lcs driver. The technology it supports has been obsolete for past 25+ years and is irrelevant for current use cases. Reviewed-by: Alexandra Winter <wintera@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Acked-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Aswin Karuvally <aswin@linux.ibm.com> Signed-off-by: Alexandra Winter <wintera@linux.ibm.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250204103135.1619097-1-wintera@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | | | | Merge tag 'iommu-updates-v6.15' of ↵Linus Torvalds2025-03-262-2/+6
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux Pull iommu updates from Joerg Roedel: "Core iommufd dependencies from Jason: - Change the iommufd fault handle into an always present hwpt handle in the domain - Give iommufd its own SW_MSI implementation along with some IRQ layer rework - Improvements to the handle attach API Core fixes for probe-issues from Robin Intel VT-d changes: - Checking for SVA support in domain allocation and attach paths - Move PCI ATS and PRI configuration into probe paths - Fix a pentential hang on reboot -f - Miscellaneous cleanups AMD-Vi changes: - Support for up to 2k IRQs per PCI device function - Set of smaller fixes ARM-SMMU changes: - SMMUv2 devicetree binding updates for Qualcomm implementations (QCS8300 GPU and MSM8937) - Clean up SMMUv2 runtime PM implementation to help with wider rework of pm_runtime_put_autosuspend() Rockchip driver changes: - Driver adjustments for recent DT probing changes S390 IOMMU changes: - Support for IOMMU passthrough Apple Dart changes: - Driver adjustments to meet ISP device requirements - Null-ptr deref fix - Disable subpage protection for DART 1" * tag 'iommu-updates-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: (54 commits) iommu/vt-d: Fix possible circular locking dependency iommu/vt-d: Don't clobber posted vCPU IRTE when host IRQ affinity changes iommu/vt-d: Put IRTE back into posted MSI mode if vCPU posting is disabled iommu: apple-dart: fix potential null pointer deref iommu/rockchip: Retire global dma_dev workaround iommu/rockchip: Register in a sensible order iommu/rockchip: Allocate per-device data sensibly iommu/mediatek-v1: Support COMPILE_TEST iommu/amd: Enable support for up to 2K interrupts per function iommu/amd: Rename DTE_INTTABLEN* and MAX_IRQS_PER_TABLE macro iommu/amd: Replace slab cache allocator with page allocator iommu/amd: Introduce generic function to set multibit feature value iommu: Don't warn prematurely about dodgy probes iommu/arm-smmu: Set rpm auto_suspend once during probe dt-bindings: arm-smmu: Document QCS8300 GPU SMMU iommu: Get DT/ACPI parsing into the proper probe path iommu: Keep dev->iommu state consistent iommu: Resolve ops in iommu_init_device() iommu: Handle race with default domain setup iommu: Unexport iommu_fwspec_free() ...
| | \ \ \ \
| | \ \ \ \
| *-. \ \ \ \ Merge branches 'apple/dart', 'arm/smmu/updates', 'arm/smmu/bindings', ↵Joerg Roedel2025-03-202-2/+6
| |\ \ \ \ \ \ | | | |_|_|/ / | | |/| | | / | | |_|_|_|/ | |/| | | | 'rockchip', 's390', 'core', 'intel/vt-d' and 'amd/amd-vi' into next
| | | * | | iommu/s390: handle IOAT registration based on domainMatthew Rosato2025-02-211-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | At this point, the dma_table is really a property of the s390-iommu domain. Rather than checking its contents elsewhere in the codebase, move the code that registers the table with firmware into s390-iommu and make a decision what to register with firmware based upon the type of domain in use for the device in question. Tested-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20250212213418.182902-4-mjrosato@linux.ibm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | * | | s390/pci: check for relaxed translation capabilityMatthew Rosato2025-02-212-2/+4
| | |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For each zdev, record whether or not CLP indicates relaxed translation capability for the associated device group. Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Tested-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20250212213418.182902-2-mjrosato@linux.ibm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
* | | | | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2025-03-251-1/+0
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull kvm updates from Paolo Bonzini: "ARM: - Nested virtualization support for VGICv3, giving the nested hypervisor control of the VGIC hardware when running an L2 VM - Removal of 'late' nested virtualization feature register masking, making the supported feature set directly visible to userspace - Support for emulating FEAT_PMUv3 on Apple silicon, taking advantage of an IMPLEMENTATION DEFINED trap that covers all PMUv3 registers - Paravirtual interface for discovering the set of CPU implementations where a VM may run, addressing a longstanding issue of guest CPU errata awareness in big-little systems and cross-implementation VM migration - Userspace control of the registers responsible for identifying a particular CPU implementation (MIDR_EL1, REVIDR_EL1, AIDR_EL1), allowing VMs to be migrated cross-implementation - pKVM updates, including support for tracking stage-2 page table allocations in the protected hypervisor in the 'SecPageTable' stat - Fixes to vPMU, ensuring that userspace updates to the vPMU after KVM_RUN are reflected into the backing perf events LoongArch: - Remove unnecessary header include path - Assume constant PGD during VM context switch - Add perf events support for guest VM RISC-V: - Disable the kernel perf counter during configure - KVM selftests improvements for PMU - Fix warning at the time of KVM module removal x86: - Add support for aging of SPTEs without holding mmu_lock. Not taking mmu_lock allows multiple aging actions to run in parallel, and more importantly avoids stalling vCPUs. This includes an implementation of per-rmap-entry locking; aging the gfn is done with only a per-rmap single-bin spinlock taken, whereas locking an rmap for write requires taking both the per-rmap spinlock and the mmu_lock. Note that this decreases slightly the accuracy of accessed-page information, because changes to the SPTE outside aging might not use atomic operations even if they could race against a clear of the Accessed bit. This is deliberate because KVM and mm/ tolerate false positives/negatives for accessed information, and testing has shown that reducing the latency of aging is far more beneficial to overall system performance than providing "perfect" young/old information. - Defer runtime CPUID updates until KVM emulates a CPUID instruction, to coalesce updates when multiple pieces of vCPU state are changing, e.g. as part of a nested transition - Fix a variety of nested emulation bugs, and add VMX support for synthesizing nested VM-Exit on interception (instead of injecting #UD into L2) - Drop "support" for async page faults for protected guests that do not set SEND_ALWAYS (i.e. that only want async page faults at CPL3) - Bring a bit of sanity to x86's VM teardown code, which has accumulated a lot of cruft over the years. Particularly, destroy vCPUs before the MMU, despite the latter being a VM-wide operation - Add common secure TSC infrastructure for use within SNP and in the future TDX - Block KVM_CAP_SYNC_REGS if guest state is protected. It does not make sense to use the capability if the relevant registers are not available for reading or writing - Don't take kvm->lock when iterating over vCPUs in the suspend notifier to fix a largely theoretical deadlock - Use the vCPU's actual Xen PV clock information when starting the Xen timer, as the cached state in arch.hv_clock can be stale/bogus - Fix a bug where KVM could bleed PVCLOCK_GUEST_STOPPED across different PV clocks; restrict PVCLOCK_GUEST_STOPPED to kvmclock, as KVM's suspend notifier only accounts for kvmclock, and there's no evidence that the flag is actually supported by Xen guests - Clean up the per-vCPU "cache" of its reference pvclock, and instead only track the vCPU's TSC scaling (multipler+shift) metadata (which is moderately expensive to compute, and rarely changes for modern setups) - Don't write to the Xen hypercall page on MSR writes that are initiated by the host (userspace or KVM) to fix a class of bugs where KVM can write to guest memory at unexpected times, e.g. during vCPU creation if userspace has set the Xen hypercall MSR index to collide with an MSR that KVM emulates - Restrict the Xen hypercall MSR index to the unofficial synthetic range to reduce the set of possible collisions with MSRs that are emulated by KVM (collisions can still happen as KVM emulates Hyper-V MSRs, which also reside in the synthetic range) - Clean up and optimize KVM's handling of Xen MSR writes and xen_hvm_config - Update Xen TSC leaves during CPUID emulation instead of modifying the CPUID entries when updating PV clocks; there is no guarantee PV clocks will be updated between TSC frequency changes and CPUID emulation, and guest reads of the TSC leaves should be rare, i.e. are not a hot path x86 (Intel): - Fix a bug where KVM unnecessarily reads XFD_ERR from hardware and thus modifies the vCPU's XFD_ERR on a #NM due to CR0.TS=1 - Pass XFD_ERR as the payload when injecting #NM, as a preparatory step for upcoming FRED virtualization support - Decouple the EPT entry RWX protection bit macros from the EPT Violation bits, both as a general cleanup and in anticipation of adding support for emulating Mode-Based Execution Control (MBEC) - Reject KVM_RUN if userspace manages to gain control and stuff invalid guest state while KVM is in the middle of emulating nested VM-Enter - Add a macro to handle KVM's sanity checks on entry/exit VMCS control pairs in anticipation of adding sanity checks for secondary exit controls (the primary field is out of bits) x86 (AMD): - Ensure the PSP driver is initialized when both the PSP and KVM modules are built-in (the initcall framework doesn't handle dependencies) - Use long-term pins when registering encrypted memory regions, so that the pages are migrated out of MIGRATE_CMA/ZONE_MOVABLE and don't lead to excessive fragmentation - Add macros and helpers for setting GHCB return/error codes - Add support for Idle HLT interception, which elides interception if the vCPU has a pending, unmasked virtual IRQ when HLT is executed - Fix a bug in INVPCID emulation where KVM fails to check for a non-canonical address - Don't attempt VMRUN for SEV-ES+ guests if the vCPU's VMSA is invalid, e.g. because the vCPU was "destroyed" via SNP's AP Creation hypercall - Reject SNP AP Creation if the requested SEV features for the vCPU don't match the VM's configured set of features Selftests: - Fix again the Intel PMU counters test; add a data load and do CLFLUSH{OPT} on the data instead of executing code. The theory is that modern Intel CPUs have learned new code prefetching tricks that bypass the PMU counters - Fix a flaw in the Intel PMU counters test where it asserts that an event is counting correctly without actually knowing what the event counts on the underlying hardware - Fix a variety of flaws, bugs, and false failures/passes dirty_log_test, and improve its coverage by collecting all dirty entries on each iteration - Fix a few minor bugs related to handling of stats FDs - Add infrastructure to make vCPU and VM stats FDs available to tests by default (open the FDs during VM/vCPU creation) - Relax an assertion on the number of HLT exits in the xAPIC IPI test when running on a CPU that supports AMD's Idle HLT (which elides interception of HLT if a virtual IRQ is pending and unmasked)" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (216 commits) RISC-V: KVM: Optimize comments in kvm_riscv_vcpu_isa_disable_allowed RISC-V: KVM: Teardown riscv specific bits after kvm_exit LoongArch: KVM: Register perf callbacks for guest LoongArch: KVM: Implement arch-specific functions for guest perf LoongArch: KVM: Add stub for kvm_arch_vcpu_preempted_in_kernel() LoongArch: KVM: Remove PGD saving during VM context switch LoongArch: KVM: Remove unnecessary header include path KVM: arm64: Tear down vGIC on failed vCPU creation KVM: arm64: PMU: Reload when resetting KVM: arm64: PMU: Reload when user modifies registers KVM: arm64: PMU: Fix SET_ONE_REG for vPMC regs KVM: arm64: PMU: Assume PMU presence in pmu-emul.c KVM: arm64: PMU: Set raw values from user to PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR} KVM: arm64: Create each pKVM hyp vcpu after its corresponding host vcpu KVM: arm64: Factor out pKVM hyp vcpu creation to separate function KVM: arm64: Initialize HCRX_EL2 traps in pKVM KVM: arm64: Factor out setting HCRX_EL2 traps into separate function KVM: x86: block KVM_CAP_SYNC_REGS if guest state is protected KVM: x86: Add infrastructure for secure TSC KVM: x86: Push down setting vcpu.arch.user_set_tsc ...
| * \ \ \ \ Merge branch 'kvm-nvmx-and-vm-teardown' into HEADPaolo Bonzini2025-03-201-1/+0
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The immediate issue being fixed here is a nVMX bug where KVM fails to detect that, after nested VM-Exit, L1 has a pending IRQ (or NMI). However, checking for a pending interrupt accesses the legacy PIC, and x86's kvm_arch_destroy_vm() currently frees the PIC before destroying vCPUs, i.e. checking for IRQs during the forced nested VM-Exit results in a NULL pointer deref; that's a prerequisite for the nVMX fix. The remaining patches attempt to bring a bit of sanity to x86's VM teardown code, which has accumulated a lot of cruft over the years. E.g. KVM currently unloads each vCPU's MMUs in a separate operation from destroying vCPUs, all because when guest SMP support was added, KVM had a kludgy MMU teardown flow that broke when a VM had more than one 1 vCPU. And that oddity lived on, for 18 years... Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| | * | | | | KVM: Drop kvm_arch_sync_events() now that all implementations are nopsSean Christopherson2025-02-261-1/+0
| | | |/ / / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove kvm_arch_sync_events() now that x86 no longer uses it (no other arch has ever used it). No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Acked-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Bibo Mao <maobibo@loongson.cn> Message-ID: <20250224235542.2562848-8-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | | | | | Merge tag 'timers-vdso-2025-03-23' of ↵Linus Torvalds2025-03-254-49/+2
|\ \ \ \ \ \ | |/ / / / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull VDSO infrastructure updates from Thomas Gleixner: - Consolidate the VDSO storage The VDSO data storage and data layout has been largely architecture specific for historical reasons. That increases the maintenance effort and causes inconsistencies over and over. There is no real technical reason for architecture specific layouts and implementations. The architecture specific details can easily be integrated into a generic layout, which also reduces the amount of duplicated code for managing the mappings. Convert all architectures over to a unified layout and common mapping infrastructure. This splits the VDSO data layout into subsystem specific blocks, timekeeping, random and architecture parts, which provides a better structure and allows to improve and update the functionalities without conflict and interaction. - Rework the timekeeping data storage The current implementation is designed for exposing system timekeeping accessors, which was good enough at the time when it was designed. PTP and Time Sensitive Networking (TSN) change that as there are requirements to expose independent PTP clocks, which are not related to system timekeeping. Replace the monolithic data storage by a structured layout, which allows to add support for independent PTP clocks on top while reusing both the data structures and the time accessor implementations. * tag 'timers-vdso-2025-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (55 commits) sparc/vdso: Always reject undefined references during linking x86/vdso: Always reject undefined references during linking vdso: Rework struct vdso_time_data and introduce struct vdso_clock vdso: Move architecture related data before basetime data powerpc/vdso: Prepare introduction of struct vdso_clock arm64/vdso: Prepare introduction of struct vdso_clock x86/vdso: Prepare introduction of struct vdso_clock time/namespace: Prepare introduction of struct vdso_clock vdso/namespace: Rename timens_setup_vdso_data() to reflect new vdso_clock struct vdso/vsyscall: Prepare introduction of struct vdso_clock vdso/gettimeofday: Prepare helper functions for introduction of struct vdso_clock vdso/gettimeofday: Prepare do_coarse_timens() for introduction of struct vdso_clock vdso/gettimeofday: Prepare do_coarse() for introduction of struct vdso_clock vdso/gettimeofday: Prepare do_hres_timens() for introduction of struct vdso_clock vdso/gettimeofday: Prepare do_hres() for introduction of struct vdso_clock vdso/gettimeofday: Prepare introduction of struct vdso_clock vdso/helpers: Prepare introduction of struct vdso_clock vdso/datapage: Define vdso_clock to prepare for multiple PTP clocks vdso: Make vdso_time_data cacheline aligned arm64: Make asm/cache.h compatible with vDSO ...
| * | | | | s390/vdso: Switch to generic storage implementationThomas Weißschuh2025-02-214-49/+2
| | |_|_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The generic storage implementation provides the same features as the custom one. However it can be shared between architectures, making maintenance easier. Co-developed-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Heiko Carstens <hca@linux.ibm.com> Link: https://lore.kernel.org/all/20250204-vdso-store-rng-v3-12-13a4669dfc8c@linutronix.de
* | | | | KVM: s390: pv: fix race when making a page secureClaudio Imbrenda2025-03-142-2/+1
| |_|/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Holding the pte lock for the page that is being converted to secure is needed to avoid races. A previous commit removed the locking, which caused issues. Fix by locking the pte again. Fixes: 5cbe24350b7d ("KVM: s390: move pv gmap functions into kvm") Reported-by: David Hildenbrand <david@redhat.com> Tested-by: David Hildenbrand <david@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> [david@redhat.com: replace use of get_locked_pte() with folio_walk_start()] Link: https://lore.kernel.org/r/20250312184912.269414-2-imbrenda@linux.ibm.com Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Message-ID: <20250312184912.269414-2-imbrenda@linux.ibm.com>
* | | | mm: hugetlb: Add huge page size param to huge_ptep_get_and_clear()Ryan Roberts2025-02-271-4/+12
| |/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to fix a bug, arm64 needs to be told the size of the huge page for which the huge_pte is being cleared in huge_ptep_get_and_clear(). Provide for this by adding an `unsigned long sz` parameter to the function. This follows the same pattern as huge_pte_clear() and set_huge_pte_at(). This commit makes the required interface modifications to the core mm as well as all arches that implement this function (arm64, loongarch, mips, parisc, powerpc, riscv, s390, sparc). The actual arm64 bug will be fixed in a separate commit. Cc: stable@vger.kernel.org Fixes: 66b3923a1a0f ("arm64: hugetlb: add support for PTE contiguous bit") Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> # riscv Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> # s390 Link: https://lore.kernel.org/r/20250226120656.2400136-2-ryan.roberts@arm.com Signed-off-by: Will Deacon <will@kernel.org>
* | | s390/bitops: Disable arch_test_bit() optimization for PROFILE_ALL_BRANCHESHeiko Carstens2025-02-111-1/+5
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With PROFILE_ALL_BRANCHES enabled gcc sometimes fails to handle __builtin_constant_p() correctly: In function 'arch_test_bit', inlined from 'node_state' at include/linux/nodemask.h:423:9, inlined from 'warn_if_node_offline' at include/linux/gfp.h:252:2, inlined from '__alloc_pages_node_noprof' at include/linux/gfp.h:267:2, inlined from 'alloc_pages_node_noprof' at include/linux/gfp.h:296:9, inlined from 'vm_area_alloc_pages.constprop' at mm/vmalloc.c:3591:11: >> arch/s390/include/asm/bitops.h:60:17: warning: 'asm' operand 2 probably does not match constraints 60 | asm volatile( | ^~~ >> arch/s390/include/asm/bitops.h:60:17: error: impossible constraint in 'asm' Therefore disable the optimization for this case. This is similar to commit 63678eecec57 ("s390/preempt: disable __preempt_count_add() optimization for PROFILE_ALL_BRANCHES") Fixes: b2bc1b1a77c0 ("s390/bitops: Provide optimized arch_test_bit()") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202502091912.xL2xTCGw-lkp@intel.com/ Acked-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* | Merge tag 'kvm-s390-next-6.14-2' of ↵Paolo Bonzini2025-02-044-21/+32
|\ \ | |/ |/| | | | | | | | | | | | | https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD - some selftest fixes - move some kvm-related functions from mm into kvm - remove all usage of page->index and page->lru from kvm - fixes and cleanups for vsie
| * KVM: s390: remove the last user of page->indexClaudio Imbrenda2025-01-311-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Shadow page tables use page->index to keep the g2 address of the guest page table being shadowed. Instead of keeping the information in page->index, split the address and smear it over the 16-bit softbits areas of 4 PGSTEs. This removes the last s390 user of page->index. Reviewed-by: Steffen Eiden <seiden@linux.ibm.com> Reviewed-by: Christoph Schlameuss <schlameuss@linux.ibm.com> Link: https://lore.kernel.org/r/20250123144627.312456-16-imbrenda@linux.ibm.com Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Message-ID: <20250123144627.312456-16-imbrenda@linux.ibm.com>
| * KVM: s390: move PGSTE softbitsClaudio Imbrenda2025-01-311-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | Move the softbits in the PGSTEs to the other usable area. This leaves the 16-bit block of usable bits free, which will be used in the next patch for something else. Reviewed-by: Steffen Eiden <seiden@linux.ibm.com> Reviewed-by: Christoph Schlameuss <schlameuss@linux.ibm.com> Link: https://lore.kernel.org/r/20250123144627.312456-15-imbrenda@linux.ibm.com Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Message-ID: <20250123144627.312456-15-imbrenda@linux.ibm.com>
| * KVM: s390: move gmap_shadow_pgt_lookup() into kvmClaudio Imbrenda2025-01-311-2/+1
| | | | | | | | | | | | | | | | | | | | | | Move gmap_shadow_pgt_lookup() from mm/gmap.c into kvm/gaccess.c . Reviewed-by: Steffen Eiden <seiden@linux.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Christoph Schlameuss <schlameuss@linux.ibm.com> Link: https://lore.kernel.org/r/20250123144627.312456-13-imbrenda@linux.ibm.com Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Message-ID: <20250123144627.312456-13-imbrenda@linux.ibm.com>
| * KVM: s390: stop using lists to keep track of used dat tablesClaudio Imbrenda2025-01-311-5/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Until now, every dat table allocated to map a guest was put in a linked list. The page->lru field of struct page was used to keep track of which pages were being used, and when the gmap is torn down, the list was walked and all pages freed. This patch gets rid of the usage of page->lru. Page tables are now freed by recursively walking the dat table tree. Since s390_unlist_old_asce() becomes useless now, remove it. Acked-by: Steffen Eiden <seiden@linux.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Christoph Schlameuss <schlameuss@linux.ibm.com> Link: https://lore.kernel.org/r/20250123144627.312456-12-imbrenda@linux.ibm.com Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Message-ID: <20250123144627.312456-12-imbrenda@linux.ibm.com>
| * KVM: s390: move some gmap shadowing functions away from mm/gmap.cClaudio Imbrenda2025-01-311-5/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | Move some gmap shadowing functions from mm/gmap.c to kvm/kvm-s390.c and the newly created kvm/gmap-vsie.c This is a step toward removing gmap from mm. Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Christoph Schlameuss <schlameuss@linux.ibm.com> Link: https://lore.kernel.org/r/20250123144627.312456-10-imbrenda@linux.ibm.com Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Message-ID: <20250123144627.312456-10-imbrenda@linux.ibm.com>
| * KVM: s390: get rid of gmap_translate()Claudio Imbrenda2025-01-311-1/+0
| | | | | | | | | | | | | | | | | | | | | | Add gpa_to_hva(), which uses memslots, and use it to replace all uses of gmap_translate(). Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Christoph Schlameuss <schlameuss@linux.ibm.com> Link: https://lore.kernel.org/r/20250123144627.312456-9-imbrenda@linux.ibm.com Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Message-ID: <20250123144627.312456-9-imbrenda@linux.ibm.com>
| * KVM: s390: get rid of gmap_fault()Claudio Imbrenda2025-01-311-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All gmap page faults are already handled in kvm by the function kvm_s390_handle_dat_fault(); only few users of gmap_fault remained, all within kvm. Convert those calls to use kvm_s390_handle_dat_fault() instead. Remove gmap_fault() entirely since it has no more users. Acked-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Christoph Schlameuss <schlameuss@linux.ibm.com> Link: https://lore.kernel.org/r/20250123144627.312456-8-imbrenda@linux.ibm.com Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Message-ID: <20250123144627.312456-8-imbrenda@linux.ibm.com>
| * KVM: s390: move pv gmap functions into kvmClaudio Imbrenda2025-01-312-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | Move gmap related functions from kernel/uv into kvm. Create a new file to collect gmap-related functions. Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Christoph Schlameuss <schlameuss@linux.ibm.com> [fixed unpack_one(), thanks mhartmay@linux.ibm.com] Link: https://lore.kernel.org/r/20250123144627.312456-6-imbrenda@linux.ibm.com Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Message-ID: <20250123144627.312456-6-imbrenda@linux.ibm.com>
| * KVM: s390: fake memslot for ucontrol VMsClaudio Imbrenda2025-01-311-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Create a fake memslot for ucontrol VMs. The fake memslot identity-maps userspace. Now memslots will always be present, and ucontrol is not a special case anymore. Suggested-by: Sean Christopherson <seanjc@google.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Link: https://lore.kernel.org/r/20250123144627.312456-4-imbrenda@linux.ibm.com Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Message-ID: <20250123144627.312456-4-imbrenda@linux.ibm.com>
| * KVM: s390: vsie: stop using "struct page" for vsie pageDavid Hildenbrand2025-01-311-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | Now that we no longer use page->index and the page refcount explicitly, let's avoid messing with "struct page" completely. Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Christoph Schlameuss <schlameuss@linux.ibm.com> Tested-by: Christoph Schlameuss <schlameuss@linux.ibm.com> Message-ID: <20250107154344.1003072-5-david@redhat.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
* | Merge tag 's390-6.14-3' of ↵Linus Torvalds2025-01-301-0/+1
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Alexander Gordeev: - Architecutre-specific ftrace recursion trylock tests were removed in favour of the generic function_graph_enter(), but s390 got missed. Remove this test for s390 as well. - Add ftrace_get_symaddr() for s390, which returns the symbol address from ftrace 'ip' parameter * tag 's390-6.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/tracing: Define ftrace_get_symaddr() for s390 s390/fgraph: Fix to remove ftrace_test_recursion_trylock()
| * | s390/tracing: Define ftrace_get_symaddr() for s390Masami Hiramatsu (Google)2025-01-291-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add ftrace_get_symaddr() for s390, which returns the symbol address from ftrace's 'ip' parameter. Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Link: https://lore.kernel.org/r/173807818869.1854334.15474589105952793986.stgit@devnote2 Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
* | | Merge tag 's390-6.14-2' of ↵Linus Torvalds2025-01-3010-398/+518
|\ \ \ | |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull more s390 updates from Alexander Gordeev: - The rework that uncoupled physical and virtual address spaces inadvertently prevented KASAN shadow mappings from using large pages. Restore large page mappings for KASAN shadows - Add decompressor routine physmem_alloc() that may fail, unlike physmem_alloc_or_die(). This allows callers to implement fallback paths - Allow falling back from large pages to smaller pages (1MB or 4KB) if the allocation of 2GB pages in the decompressor can not be fulfilled - Add to the decompressor boot print support of "%%" format string, width and padding hadnling, length modifiers and decimal conversion specifiers - Add to the decompressor message severity levels similar to kernel ones. Support command-line options that control console output verbosity - Replaces boot_printk() calls with appropriate loglevel- specific helpers such as boot_emerg(), boot_warn(), and boot_debug(). - Collect all boot messages into a ring buffer independent of the current log level. This is particularly useful for early crash analysis - If 'earlyprintk' command line parameter is not specified, store decompressor boot messages in a ring buffer to be printed later by the kernel, once the console driver is registered - Add 'bootdebug' command line parameter to enable printing of decompressor debug messages when needed. That parameters allows message suppressing and filtering - Dump boot messages on a decompressor crash, but only if 'bootdebug' command line parameter is enabled - When CONFIG_PRINTK_TIME is enabled, add timestamps to boot messages in the same format as regular printk() - Dump physical memory tracking information on boot: online ranges, reserved areas and vmem allocations - Dump virtual memory layout and randomization details - Improve decompression error reporting and dump the message ring buffer in case the boot failed and system halted - Add an exception handler which handles exceptions when FPU control register is attempted to be set to an invalid value. Remove '.fixup' section as result of this change - Use 'A', 'O', and 'R' inline assembly format flags, which allows recent Clang compilers to generate better FPU code - Rework uaccess code so it reads better and generates more efficient code - Cleanup futex inline assembly code - Disable KMSAN instrumention for futex inline assemblies, which contain dereferenced user pointers. Otherwise, shadows for the user pointers would be accessed - PFs which are not initially configured but in standby create only a single-function PCI domain. If they are configured later on, sibling PFs and their child VFs will not be added to their PCI domain breaking SR-IOV expectations. Fix that by allowing initially configured but in standby PFs create multi-function PCI domains - Add '-std=gnu11' to decompressor and purgatory CFLAGS to avoid compile errors caused by kernel's own definitions of 'bool', 'false', and 'true' conflicting with the C23 reserved keywords - Fix sclp subsystem failure when a sclp console is not present - Fix misuse of non-NULL terminated strings in vmlogrdr driver - Various other small improvements, cleanups and fixes * tag 's390-6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (53 commits) s390/vmlogrdr: Use array instead of string initializer s390/vmlogrdr: Use internal_name for error messages s390/sclp: Initialize sclp subsystem via arch_cpu_finalize_init() s390/tools: Use array instead of string initializer s390/vmem: Fix null-pointer-arithmetic warning in vmem_map_init() s390: Add '-std=gnu11' to decompressor and purgatory CFLAGS s390/bitops: Use correct constraint for arch_test_bit() inline assembly s390/pci: Fix SR-IOV for PFs initially in standby s390/futex: Avoid KMSAN instrumention for user pointers s390/uaccess: Rename get_put_user_noinstr_attributes to uaccess_kmsan_or_inline s390/futex: Cleanup futex_atomic_cmpxchg_inatomic() s390/futex: Generate futex atomic op functions s390/uaccess: Remove INLINE_COPY_FROM_USER and INLINE_COPY_TO_USER s390/uaccess: Use asm goto for put_user()/get_user() s390/uaccess: Remove usage of the oac specifier s390/uaccess: Replace EX_TABLE_UA_LOAD_MEM exception handling s390/uaccess: Cleanup noinstr __put_user()/__get_user() inline assembly constraints s390/uaccess: Remove __put_user_fn()/__get_user_fn() wrappers s390/uaccess: Move put_user() / __put_user() close to put_user() asm code s390/uaccess: Use asm goto for __mvc_kernel_nofault() ...
| * | s390/sclp: Initialize sclp subsystem via arch_cpu_finalize_init()Heiko Carstens2025-01-281-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the switch to GENERIC_CPU_DEVICES an early call to the sclp subsystem was added to smp_prepare_cpus(). This will usually succeed since the sclp subsystem is implicitly initialized early enough if an sclp based console is present. If no such console is present the initialization happens with an arch_initcall(); in such cases calls to the sclp subsystem will fail. For CPU detection this means that the fallback sigp loop will be used permanently to detect CPUs instead of the preferred READ_CPU_INFO sclp request. Fix this by adding an explicit early sclp_init() call via arch_cpu_finalize_init(). Reported-by: Sheshu Ramanandan <sheshu.ramanandan@ibm.com> Fixes: 4a39f12e753d ("s390/smp: Switch to GENERIC_CPU_DEVICES") Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
| * | s390/bitops: Use correct constraint for arch_test_bit() inline assemblyHeiko Carstens2025-01-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use the "Q" instead of "R" constraint to correctly reflect the instruction format of the tm instruction: the first operand is a memory reference without index register and short displacement. The "R" constraint indicates a memory reference with index register instead. This may lead to compile errors like: arch/s390/include/asm/bitops.h: Assembler messages: arch/s390/include/asm/bitops.h:60: Error: operand 1: syntax error; missing ')' after base register arch/s390/include/asm/bitops.h:60: Error: operand 2: syntax error; ')' not allowed here arch/s390/include/asm/bitops.h:60: Error: junk at end of line: `,4' Reported-by: Nathan Chancellor <nathan@kernel.org> Closes: https://lore.kernel.org/r/20250122-s390-fix-std-for-gcc-15-v1-1-8b00cadee083@kernel.org Fixes: b2bc1b1a77c0 ("s390/bitops: Provide optimized arch_test_bit()") Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
| * | s390/futex: Avoid KMSAN instrumention for user pointersHeiko Carstens2025-01-261-3/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Similar to commit eb6efdfeaeca ("s390/uaccess: add KMSAN support to put_user() and get_user()") disable KMSAN instrumention for futex inline assemblies, which contain dereferenced user pointers. With KMSAN instrumentation this would lead to accesses of shadows for user pointers, which should not happen. Handle the futex operations like they copy a value (old) from user space to kernel space. Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
| * | s390/uaccess: Rename get_put_user_noinstr_attributes to uaccess_kmsan_or_inlineHeiko Carstens2025-01-261-7/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename get_put_user_noinstr_attributes to a more generic uaccess_kmsan_or_inline name. This allows to use it for other non put_user()/get_user() uaccess functions withour causing confusion. Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
| * | s390/futex: Cleanup futex_atomic_cmpxchg_inatomic()Heiko Carstens2025-01-261-10/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | Cleanup the futex_atomic_cmpxchg_inatomic() inline assembly to make it a bit more readable. Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
| * | s390/futex: Generate futex atomic op functionsHeiko Carstens2025-01-261-34/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | Cleanup the futex atomic op inline assembly and generate a function for each futex atomic op. This makes the code hopefully a bit more readable. Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
| * | s390/uaccess: Remove INLINE_COPY_FROM_USER and INLINE_COPY_TO_USERHeiko Carstens2025-01-261-30/+109
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The s390 implementations of raw_copy_from_user() and raw_copy_to_user() are never inlined. However INLINE_COPY_FROM_USER and INLINE_COPY_TO_USER are still set. This leads to the odd situation that only the error handling (memset to zero of the not copied bytes) of copy_from_user() is inlined, while the actual fast path code is out-of-line. This would make sense if raw_copy_from_user() and raw_copy_to_user() were implemented in assembler files, where inlining is not possible. But the current s390 setup does not make any sense. Address this by moving the raw uaccess copy inline assemblies to the uaccess header file, and remove INLINE_COPY_FROM_USER and INLINE_COPY_TO_USER definitions. This way the uaccess code, but now including error handling, is still out-of-line with the common code _copy_from_user() and _copy_to_user() variants, which inline the raw uaccess functions via _inline_copy_from_user() and _inline_copy_to_user(). This reduces the size of the kernel image by ~17kb. (defconfig, gcc 14.2.0) Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
| * | s390/uaccess: Use asm goto for put_user()/get_user()Heiko Carstens2025-01-261-6/+77
| | | | | | | | | | | | | | | | | | | | | | | | | | | Use asm goto if available for put_user() and get_user(). This generates slightly better code. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
| * | s390/uaccess: Remove usage of the oac specifierHeiko Carstens2025-01-261-12/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove usage of the operand access control specifier for put_user() and get_user() (again). Instead hardcode the specifier for both inline assemblies. This saves one instruction. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
| * | s390/uaccess: Replace EX_TABLE_UA_LOAD_MEM exception handlingHeiko Carstens2025-01-262-10/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove EX_TABLE_UA_LOAD_MEM exception handling and replace it with EX_TABLE_UA_FAULT. Open code the return code check, and also open code setting of the destination to zero in case of an error. In almost all cases the compiler is able to optimize the open coded checks away, since all users of get_users() must check the return code, and are not supposed to use the result in case of an error. In addition this allows to change the get_user() inline assembly so that the "Q" constraint can be used for the destination, instead of only an "a" constraint. This generates slightly better code. Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
| * | s390/uaccess: Cleanup noinstr __put_user()/__get_user() inline assembly ↵Heiko Carstens2025-01-261-14/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | constraints Remove superfluous underscores, brackets, and early clobber to make the code a bit more readable. Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
| * | s390/uaccess: Remove __put_user_fn()/__get_user_fn() wrappersHeiko Carstens2025-01-261-81/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The __put_user_fn() and __get_user_fn() wrappers are leftovers from the time where the kernel was compiled with or without mvcos support plus they were later used to workaround the problems that came with asm register constructs. Both reasons do not exist anymore, therefore remove the wrappers and shorten the code. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
| * | s390/uaccess: Move put_user() / __put_user() close to put_user() asm codeHeiko Carstens2025-01-261-30/+30
| | | | | | | | | | | | | | | | | | | | | | | | Keep put_user() and get_user() code separated. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
| * | s390/uaccess: Use asm goto for __mvc_kernel_nofault()Heiko Carstens2025-01-261-15/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use asm goto for __mvc_kernel_nofault() if available. This generates slightly better code, since the error checking happens implicitly with the goto (aka exception) and the good path comes without any checks and branches. Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>