path: root/drivers/vdpa
Commit message / Author / Age / Files / Lines
* vDPA: report virtio-blk flush info to user spaceZhu Lingshan2024-03-191-0/+14
| | | | | | | | | This commit reports to user space whether a virtio-blk device supports the cache flush command. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20240218185606.13509-11-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* vDPA: report virtio-block read-only info to user spaceZhu Lingshan2024-03-191-0/+14
| | | | | | | | | This commit reports read-only information of virtio-blk devices to user space. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20240218185606.13509-10-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* vDPA: report virtio-block write zeroes configuration to user spaceZhu Lingshan2024-03-191-0/+23
| | | | | | | | | | | This commit reports the write zeroes configuration of virtio-block devices to user space, including: 1) the maximum number of write zeroes sectors 2) the maximum number of write zeroes segments Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20240218185606.13509-9-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* vDPA: report virtio-block discarding configuration to user spaceZhu Lingshan2024-03-191-0/+26
| | | | | | | | | | | | This commit reports the virtio-blk discard configuration to user space, including: 1) the maximum discard sectors 2) the maximum number of discard segments for the block driver to use 3) the alignment for splitting a discard request Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20240218185606.13509-8-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* vDPA: report virtio-block topology info to user spaceZhu Lingshan2024-03-191-0/+32
| | | | | | | | | | | | | This commit allows vDPA to report topology information of virtio-blk devices to user space, including: 1) the number of logical blocks per physical block 2) the offset of the first aligned logical block 3) the suggested minimum I/O size in blocks 4) the optimal (suggested maximum) I/O size in blocks Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20240218185606.13509-7-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* vDPA: report virtio-block MQ info to user spaceZhu Lingshan2024-03-191-0/+16
| | | | | | | | | This commit allows vDPA to report the virtio-block multi-queue configuration to user space. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20240218185606.13509-6-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* vDPA: report virtio-block max segments in a request to user spaceZhu Lingshan2024-03-191-0/+17
| | | | | | | | | | This commit allows vDPA to report the maximum number of segments in a request of virtio-block devices to user space. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20240218185606.13509-5-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* vDPA: report virtio-block block-size to user spaceZhu Lingshan2024-03-191-0/+18
| | | | | | | | | This commit allows reporting the block size of a virtio-block device to user space. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20240218185606.13509-4-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* vDPA: report virtio-block max segment size to user spaceZhu Lingshan2024-03-191-0/+17
| | | | | | | | | This commit allows reporting the max size of any single segment of virtio-block devices to user space. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20240218185606.13509-3-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* vDPA: report virtio-block capacity to user spaceZhu Lingshan2024-03-191-0/+35
| | | | | | | | | This commit allows userspace to query capacity of a virtio-block device. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20240218185606.13509-2-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* vdpa: make vdpa_bus constRicardo B. Marliere2024-03-191-1/+1
| | | | | | | | | | | | | Now that the driver core can properly handle constant struct bus_type, move the vdpa_bus variable to be a constant structure as well, placing it into read-only memory which can not be modified at runtime. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net> Message-Id: <20240204-bus_cleanup-vdpa-v1-1-1745eccb0a5c@marliere.net> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* vDPA/ifcvf: implement vdpa_config_ops.get_vq_num_minZhu Lingshan2024-03-192-0/+7
| | | | | | | | | | | | | | | | | IFCVF HW supports operation with a vq size smaller than the max size, as the spec requires. This commit implements vdpa_config_ops.get_vq_num_min to report the minimal size of the virtqueues, which gives the vDPA framework a chance to reduce the vring size. We need at least one descriptor to be functional, but it is better to use no fewer than 64 to meet certain performance requirements. In practice the framework allocates at least a PAGE for the vq. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20240202163905.8834-11-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* vDPA/ifcvf: get_max_vq_size to return max sizeZhu Lingshan2024-03-191-5/+1
| | | | | | | | | | Since vdpa_config_ops.get_vq_size is already implemented, get_max_vq_size can return the actual max size of the virtqueues rather than the max allowed safe size. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20240202163905.8834-10-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* vduse: implement vdpa_config_ops.get_vq_size for vduseZhu Lingshan2024-03-191-0/+12
| | | | | | | | | This commit implements get_vq_size for vdpa_config_ops. This new interface is used to report per vq size. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20240202163905.8834-8-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* vdpa_sim: implement vdpa_config_ops.get_vq_size for vDPA simulatorZhu Lingshan2024-03-191-0/+12
| | | | | | | | | This commit implements vdpa_config_ops.get_vq_size for vDPA simulator, this new interface can help report per vq size. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20240202163905.8834-7-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* eni_vdpa: implement vdpa_config_ops.get_vq_sizeZhu Lingshan2024-03-191-0/+8
| | | | | | | | | This commit implements get_vq_size in vdpa_config_ops, which reports the per vq size. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20240202163905.8834-6-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* vp_vdpa: implement vdpa_config_ops.get_vq_sizeZhu Lingshan2024-03-191-0/+8
| | | | | | | | | This commit implements vdpa_config_ops.get_vq_size in vp_vdpa, which reports per virtqueue size. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20240202163905.8834-5-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* vDPA/ifcvf: implement vdpa_config_ops.get_vq_sizeZhu Lingshan2024-03-193-1/+14
| | | | | | | | | This commit implements vdpa_ops.get_vq_size to report the size of a specific virtqueue. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20240202163905.8834-4-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* vdpa/pds: fixes for VF vdpa flr-aer handlingShannon Nelson2024-03-193-4/+19
| | | | | | | | | | | | This addresses a couple of things found while testing the FLR and AER handling with the VFs. - release irqs before calling vp_modern_remove() - make sure we have a valid struct pointer before using it to release irqs - make sure the FW is alive before trying to add a new device Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Message-Id: <20240220011050.30913-1-shannon.nelson@amd.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* vduse: implement DMA sync callbacksMaxime Coquelin2024-03-193-3/+54
| | | | | | | | | | | | | | | | | | Since commit 295525e29a5b ("virtio_net: merge dma operations when filling mergeable buffers"), VDUSE devices require support for DMA's .sync_single_for_cpu() operation, as the memory is non-coherent between the device and CPU because of the use of a bounce buffer. This patch implements both the .sync_single_for_cpu() and .sync_single_for_device() callbacks, and also skips bounce buffer copies during DMA map and unmap operations if the DMA_ATTR_SKIP_CPU_SYNC attribute is set, to avoid extra copies of the same buffer. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Message-Id: <20240219170606.587290-1-maxime.coquelin@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
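The bounce-buffer behaviour described in this commit can be sketched as a small user-space model. This is only an illustration under stated assumptions, not the VDUSE code: `model_bounce_map`, `model_sync_for_cpu`, `model_sync_for_device`, `model_unmap` and `MODEL_ATTR_SKIP_CPU_SYNC` are hypothetical names, and the model covers only a buffer the device writes into.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's DMA_ATTR_SKIP_CPU_SYNC attribute. */
#define MODEL_ATTR_SKIP_CPU_SYNC (1UL << 0)

/* One mapping of a driver buffer onto a device-visible bounce buffer. */
struct model_bounce_map {
	unsigned char *orig;    /* buffer the driver (CPU) reads and writes */
	unsigned char *bounce;  /* buffer the device reads and writes */
	size_t len;
	unsigned int copies;    /* copies performed, counted for illustration */
};

/* Explicit sync: make device-written bytes visible to the CPU. */
static void model_sync_for_cpu(struct model_bounce_map *m, size_t off, size_t len)
{
	assert(off + len <= m->len);
	memcpy(m->orig + off, m->bounce + off, len);
	m->copies++;
}

/* Explicit sync: make CPU-written bytes visible to the device. */
static void model_sync_for_device(struct model_bounce_map *m, size_t off, size_t len)
{
	assert(off + len <= m->len);
	memcpy(m->bounce + off, m->orig + off, len);
	m->copies++;
}

/*
 * Unmap of a from-device buffer. When the caller passes the skip attribute
 * it has already synced what it needs, so the full copy is not repeated.
 */
static void model_unmap(struct model_bounce_map *m, unsigned long attrs)
{
	if (attrs & MODEL_ATTR_SKIP_CPU_SYNC)
		return;
	model_sync_for_cpu(m, 0, m->len);
}
```

In this model, a caller that syncs part of the buffer and then unmaps with the skip attribute pays for one partial copy instead of a partial copy plus a full one.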
* vdpa/mlx5: Allow CVQ size changesJonah Palmer2024-03-191-4/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | The MLX driver was not updating its control virtqueue size at set_vq_num and instead always initialized to MLX5_CVQ_MAX_ENT (16) at setup_cvq_vring. Qemu would try to set the size to 64 by default, however, because the CVQ size always was initialized to 16, an error would be thrown when sending >16 control messages (as used-ring entry 17 is initialized to 0). For example, starting a guest with x-svq=on and then executing the following command would produce the error below: # for i in {1..20}; do ifconfig eth0 hw ether XX:xx:XX:xx:XX:XX; done qemu-system-x86_64: Insufficient written data (0) [ 435.331223] virtio_net virtio0: Failed to set mac address by vq command. SIOCSIFHWADDR: Invalid argument Acked-by: Dragos Tatulea <dtatulea@nvidia.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com> Message-Id: <20240216142502.78095-1-jonah.palmer@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com> Fixes: 5262912ef3cf ("vdpa/mlx5: Add support for control VQ and MAC setting")
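The size mismatch described above can be illustrated with a user-space model. It is a sketch under assumptions, not the mlx5 driver: `model_cvq` and the `model_*` functions are hypothetical, and only the value 16 for the fixed size is taken from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the value 16 quoted for MLX5_CVQ_MAX_ENT in the commit message. */
#define MODEL_CVQ_MAX_ENT 16

struct model_cvq {
	uint16_t requested;  /* size passed in through set_vq_num */
	uint16_t ring_size;  /* size the device side actually uses */
};

static void model_set_vq_num(struct model_cvq *cvq, uint16_t num)
{
	cvq->requested = num;
}

/* Behaviour before the fix: the requested size is ignored. */
static void model_setup_cvq_fixed(struct model_cvq *cvq)
{
	cvq->ring_size = MODEL_CVQ_MAX_ENT;
}

/* Behaviour after the fix (assumed shape): honour the requested size. */
static void model_setup_cvq_honor(struct model_cvq *cvq)
{
	cvq->ring_size = cvq->requested ? cvq->requested : MODEL_CVQ_MAX_ENT;
}

/* Used-ring slot the device side writes for the n-th completion (0-based). */
static uint16_t model_used_slot(const struct model_cvq *cvq, uint16_t n)
{
	assert(cvq->ring_size != 0);
	return (uint16_t)(n % cvq->ring_size);
}
```

With a requested size of 64, the guest looks for the 17th completion in slot 16, while the fixed-size model writes it to slot 0. That matches the symptom in the message, where used-ring entry 17 still reads as 0.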
* vdpa_sim: reset must not runSteve Sistare2024-03-191-1/+2
| | | | | | | | | | | | | | vdpasim_do_reset sets running to true, which is wrong, as it allows vdpasim_kick_vq to post work requests before the device has been configured. To fix, do not set running until VIRTIO_CONFIG_S_DRIVER_OK is set. Fixes: 0c89e2a3a9d0 ("vdpa_sim: Implement suspend vdpa op") Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <1707517807-137331-1-git-send-email-steven.sistare@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
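A minimal user-space sketch of the rule this fix enforces follows. It assumes a much simplified device model: `model_sim` and its functions are hypothetical names, and the real simulator's kick path does more than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* VIRTIO_CONFIG_S_DRIVER_OK has the value 4 in the virtio status field. */
#define MODEL_S_DRIVER_OK 4

struct model_sim {
	uint8_t status;
	bool running;
	unsigned int queued;  /* work requests posted so far */
};

/* Reset clears the status and must leave the device not running. */
static void model_reset(struct model_sim *s)
{
	s->status = 0;
	s->running = false;
}

/* The device starts running only once the driver sets DRIVER_OK. */
static void model_set_status(struct model_sim *s, uint8_t status)
{
	s->status = status;
	if (status & MODEL_S_DRIVER_OK)
		s->running = true;
}

/* A kick posts work only while the device is running. */
static bool model_kick(struct model_sim *s)
{
	/* Invariant in this model: running implies DRIVER_OK. */
	assert(!s->running || (s->status & MODEL_S_DRIVER_OK));
	if (!s->running)
		return false;
	s->queued++;
	return true;
}
```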
* Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds2024-01-185-41/+257
|\
| | Pull virtio updates from Michael Tsirkin:
| |  - vdpa/mlx5: support for resumable vqs
| |  - virtio_scsi: mq_poll support
| |  - virtio_pmem: support SHMEM_REGION
| |  - virtio_balloon: stay awake while adjusting balloon
| |  - virtio: support for no-reset virtio PCI PM
| |  - Fixes, cleanups
| |
| | * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
| |   vdpa/mlx5: Add mkey leak detection
| |   vdpa/mlx5: Introduce reference counting to mrs
| |   vdpa/mlx5: Use vq suspend/resume during .set_map
| |   vdpa/mlx5: Mark vq state for modification in hw vq
| |   vdpa/mlx5: Mark vq addrs for modification in hw vq
| |   vdpa/mlx5: Introduce per vq and device resume
| |   vdpa/mlx5: Allow modifying multiple vq fields in one modify command
| |   vdpa/mlx5: Expose resumable vq capability
| |   vdpa: Block vq property changes in DRIVER_OK
| |   vdpa: Track device suspended state
| |   scsi: virtio_scsi: Add mq_poll support
| |   virtio_pmem: support feature SHMEM_REGION
| |   virtio_balloon: stay awake while adjusting balloon
| |   vdpa: Remove usage of the deprecated ida_simple_xx() API
| |   virtio: Add support for no-reset virtio PCI PM
| |   virtio_net: fix missing dma unmap for resize
| |   vhost-vdpa: account iommu allocations
| |   vdpa: Fix an error handling path in eni_vdpa_probe()
| * vdpa/mlx5: Add mkey leak detectionDragos Tatulea2024-01-103-0/+27
| | | | | | | | | | | | | | | | | | | | | | Track allocated mrs in a list and show warning when leaks are detected on device free or reset. Reviewed-by: Gal Pressman <gal@nvidia.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Message-Id: <20231225151203.152687-9-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * vdpa/mlx5: Introduce reference counting to mrsDragos Tatulea2024-01-103-25/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Deleting the old mr during mr update (.set_map) and then modifying the vqs with the new mr is not a good flow for firmware. The firmware expects that mkeys are deleted after there are no more vqs referencing them. Introduce reference counting for mrs to fix this. It is the only way to make sure that mkeys are not in use by vqs. An mr reference is taken when the mr is associated to the mr asid table and when the mr is linked to the vq on create/modify. The reference is released when the mkey is unlinked from the vq (through modify/destroy) and from the mr asid table. To make things consistent, get rid of mlx5_vdpa_destroy_mr and use get/put semantics everywhere. Reviewed-by: Gal Pressman <gal@nvidia.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Message-Id: <20231225151203.152687-8-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
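The get/put lifetime rule described above can be sketched in user space. The names `model_mr`, `model_mr_get` and `model_mr_put` are hypothetical, and the model ignores locking, which the real driver has to handle.

```c
#include <assert.h>
#include <stdbool.h>

/* A memory region whose hardware mkey must outlive every user. */
struct model_mr {
	int refcount;
	bool mkey_alive;
};

static void model_mr_init(struct model_mr *mr)
{
	mr->refcount = 0;
	mr->mkey_alive = true;
}

/* Taken when the mr enters the ASID table or is linked to a vq. */
static void model_mr_get(struct model_mr *mr)
{
	mr->refcount++;
}

/* Dropped on unlink; the mkey is destroyed only when nobody uses it. */
static void model_mr_put(struct model_mr *mr)
{
	assert(mr->refcount > 0);
	if (--mr->refcount == 0)
		mr->mkey_alive = false;
}
```

In this model, replacing the mr in the ASID table drops one reference, but the mkey stays alive until the vq that still points at it is modified or destroyed. That is the ordering the commit message says the firmware expects.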
| * vdpa/mlx5: Use vq suspend/resume during .set_mapDragos Tatulea2024-01-101-8/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of tearing down and setting up vq resources, use vq suspend/resume during .set_map to speed things up a bit. The vq mr is updated with the new mapping while the vqs are suspended. If the device doesn't support resumable vqs, do the old teardown and setup dance. Reviewed-by: Gal Pressman <gal@nvidia.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Message-Id: <20231225151203.152687-7-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * vdpa/mlx5: Mark vq state for modification in hw vqDragos Tatulea2024-01-101-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | .set_vq_state will set the indices and mark the fields to be modified in the hw vq. Advertise that the device supports changing the vq state when the device is in DRIVER_OK state and suspended. Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20231225151203.152687-6-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * vdpa/mlx5: Mark vq addrs for modification in hw vqDragos Tatulea2024-01-101-0/+9
| | | | | | | | | | | | | | | | | | | | Addresses get set by .set_vq_address. hw vq addresses will be updated on next modify_virtqueue. Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Message-Id: <20231225151203.152687-5-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * vdpa/mlx5: Introduce per vq and device resumeDragos Tatulea2024-01-101-7/+62
| | | | | | | | | | | | | | | | | | | | | | | | Implement vdpa vq and device resume if capability detected. Add support for suspend -> ready state change. Reviewed-by: Gal Pressman <gal@nvidia.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Message-Id: <20231225151203.152687-4-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * vdpa/mlx5: Allow modifying multiple vq fields in one modify commandDragos Tatulea2024-01-101-8/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a bitmask variable that tracks hw vq field changes that are supposed to be modified on next hw vq change command. This will be useful to set multiple vq fields when resuming the vq. Reviewed-by: Gal Pressman <gal@nvidia.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Message-Id: <20231225151203.152687-3-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
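The bitmask approach can be sketched as follows. This is a user-space illustration with hypothetical names (`model_hw_vq`, `MODEL_VQ_MOD_*`, `model_modify_virtqueue`), not the driver's actual field layout or firmware command.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flags, one per hw vq field that can be modified. */
enum {
	MODEL_VQ_MOD_ADDRS     = 1u << 0,
	MODEL_VQ_MOD_AVAIL_IDX = 1u << 1,
	MODEL_VQ_MOD_USED_IDX  = 1u << 2,
	MODEL_VQ_MOD_ALL       = (1u << 3) - 1,
};

struct model_hw_vq {
	uint32_t modified_fields;  /* pending changes, as a bitmask */
	uint64_t desc_addr;
	uint16_t avail_idx;
	uint16_t used_idx;
	unsigned int commands_sent;
	uint32_t last_sent_mask;
};

/* Setters only record the new value and mark the field as pending. */
static void model_set_vq_address(struct model_hw_vq *vq, uint64_t desc_addr)
{
	vq->desc_addr = desc_addr;
	vq->modified_fields |= MODEL_VQ_MOD_ADDRS;
}

static void model_set_vq_state(struct model_hw_vq *vq, uint16_t avail, uint16_t used)
{
	vq->avail_idx = avail;
	vq->used_idx = used;
	vq->modified_fields |= MODEL_VQ_MOD_AVAIL_IDX | MODEL_VQ_MOD_USED_IDX;
}

/* One modify command carries every pending field; returns 1 if one was sent. */
static int model_modify_virtqueue(struct model_hw_vq *vq)
{
	assert((vq->modified_fields & ~(uint32_t)MODEL_VQ_MOD_ALL) == 0);
	if (!vq->modified_fields)
		return 0;
	vq->last_sent_mask = vq->modified_fields;
	vq->commands_sent++;
	vq->modified_fields = 0;
	return 1;
}
```

Batching this way means a resume that changes both the addresses and the indices issues a single command in the model, rather than one per field.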
| * vdpa: Remove usage of the deprecated ida_simple_xx() APIChristophe JAILLET2023-12-271-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | ida_alloc() and ida_free() should be preferred to the deprecated ida_simple_get() and ida_simple_remove(). This is less verbose. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Message-Id: <d7534cc4caf4ff9d6b072744352c1b69487779ea.1702230703.git.christophe.jaillet@wanadoo.fr> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
| * vdpa: Fix an error handling path in eni_vdpa_probe()Christophe JAILLET2023-12-251-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | After a successful vp_legacy_probe() call, vp_legacy_remove() should be called in the error handling path, as already done in the remove function. Add the missing call. Fixes: e85087beedca ("eni_vdpa: add vDPA driver for Alibaba ENI") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Message-Id: <a7b0ef1eabd081f1c7c894e9b11de01678e85dee.1666293559.git.christophe.jaillet@wanadoo.fr> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
* | Merge tag 'vfs-6.8.misc' of ↵Linus Torvalds2024-01-081-4/+4
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull misc vfs updates from Christian Brauner: "This contains the usual miscellaneous features, cleanups, and fixes for vfs and individual fses. Features: - Add Jan Kara as VFS reviewer - Show correct device and inode numbers in proc/<pid>/maps for vma files on stacked filesystems. This is now easily doable thanks to the backing file work from the last cycles. This comes with selftests Cleanups: - Remove a redundant might_sleep() from wait_on_inode() - Initialize pointer with NULL, not 0 - Clarify comment on access_override_creds() - Rework and simplify eventfd_signal() and eventfd_signal_mask() helpers - Process aio completions in batches to avoid needless wakeups - Completely decouple struct mnt_idmap from namespaces. We now only keep the actual idmapping around and don't stash references to namespaces - Reformat maintainer entries to indicate that a given subsystem belongs to fs/ - Simplify fput() for files that were never opened - Get rid of various pointless file helpers - Rename various file helpers - Rename struct file members after SLAB_TYPESAFE_BY_RCU switch from last cycle - Make relatime_need_update() return bool - Use GFP_KERNEL instead of GFP_USER when allocating superblocks - Replace deprecated ida_simple_*() calls with their current ida_*() counterparts Fixes: - Fix comments on user namespace id mapping helpers. 
They aren't kernel doc comments so they shouldn't be using /** - s/Retuns/Returns/g in various places - Add missing parameter documentation on can_move_mount_beneath() - Rename i_mapping->private_data to i_mapping->i_private_data - Fix a false-positive lockdep warning in pipe_write() for watch queues - Improve __fget_files_rcu() code generation to improve performance - Only notify writer that pipe resizing has finished after setting pipe->max_usage otherwise writers are never notified that the pipe has been resized and hang - Fix some kernel docs in hfsplus - s/passs/pass/g in various places - Fix kernel docs in ntfs - Fix kcalloc() arguments order reported by gcc 14 - Fix uninitialized value in reiserfs" * tag 'vfs-6.8.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (36 commits) reiserfs: fix uninit-value in comp_keys watch_queue: fix kcalloc() arguments order ntfs: dir.c: fix kernel-doc function parameter warnings fs: fix doc comment typo fs tree wide selftests/overlayfs: verify device and inode numbers in /proc/pid/maps fs/proc: show correct device and inode numbers in /proc/pid/maps eventfd: Remove usage of the deprecated ida_simple_xx() API fs: super: use GFP_KERNEL instead of GFP_USER for super block allocation fs/hfsplus: wrapper.c: fix kernel-doc warnings fs: add Jan Kara as reviewer fs/inode: Make relatime_need_update return bool pipe: wakeup wr_wait after setting max_usage file: remove __receive_fd() file: stop exposing receive_fd_user() fs: replace f_rcuhead with f_task_work file: remove pointless wrapper file: s/close_fd_get_file()/file_close_fd()/g Improve __fget_files_rcu() code generation (and thus __fget_light()) file: massage cleanup of files that failed to open fs/pipe: Fix lockdep false-positive in watchqueue pipe_write() ...
| * Merge branch 'vfs.file'Christian Brauner2023-12-211-1/+1
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Bring in the changes to the file infrastructure for this cycle. Mostly cleanups and some performance tweaks. * file: remove __receive_fd() * file: stop exposing receive_fd_user() * fs: replace f_rcuhead with f_task_work * file: remove pointless wrapper * file: s/close_fd_get_file()/file_close_fd()/g * Improve __fget_files_rcu() code generation (and thus __fget_light()) * file: massage cleanup of files that failed to open Signed-off-by: Christian Brauner <brauner@kernel.org>
| | * file: remove __receive_fd()Christian Brauner2023-12-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Honestly, there's little value in having a helper with and without that int __user *ufd argument. It's just messy and doesn't really give us anything. Just expose receive_fd() with that argument and get rid of that helper. Link: https://lore.kernel.org/r/20231130-vfs-files-fixes-v1-5-e73ca6f4ea83@kernel.org Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Christian Brauner <brauner@kernel.org>
| * | eventfd: simplify eventfd_signal()Christian Brauner2023-11-281-3/+3
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ever since the eventfd type was introduced back in 2007 in commit e1ad7468c77d ("signal/timer/event: eventfd core") the eventfd_signal() function only ever passed 1 as a value for @n. There's no point in keeping that additional argument. Link: https://lore.kernel.org/r/20231122-vfs-eventfd-signal-v2-2-bd549b14ce0c@kernel.org Acked-by: Xu Yilun <yilun.xu@intel.com> Acked-by: Andrew Donnellan <ajd@linux.ibm.com> # ocxl Acked-by: Eric Farman <farman@linux.ibm.com> # s390 Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Christian Brauner <brauner@kernel.org>
* | pds_vdpa: set features orderShannon Nelson2023-12-011-2/+1
| | | | | | | | | | | | | | | | | | | | | | Fix up the order that the device and negotiated features are checked to get a more reliable difference when things get changed. Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Message-Id: <20231110221802.46841-4-shannon.nelson@amd.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
* | pds_vdpa: clear config callback when status goes to 0Shannon Nelson2023-12-011-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | If the client driver is setting status to 0, something is getting shut down and possibly removed. Make sure we clear the config_cb so that it doesn't end up crashing when trying to call a bogus callback. Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Message-Id: <20231110221802.46841-3-shannon.nelson@amd.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
* | pds_vdpa: fix up format-truncation complaintShannon Nelson2023-12-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Our friendly kernel test robot has recently been pointing out some format-truncation issues. Here's a fix for one of them. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202311040109.RfgJoE7L-lkp@intel.com/ Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Message-Id: <20231110221802.46841-2-shannon.nelson@amd.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
* | vdpa/mlx5: preserve CVQ vringh indexSteve Sistare2023-12-011-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mlx5_vdpa does not preserve userland's view of vring base for the control queue in the following sequence: ioctl VHOST_SET_VRING_BASE ioctl VHOST_VDPA_SET_STATUS VIRTIO_CONFIG_S_DRIVER_OK mlx5_vdpa_set_status() setup_cvq_vring() vringh_init_iotlb() vringh_init_kern() vrh->last_avail_idx = 0; ioctl VHOST_GET_VRING_BASE To fix, restore the value of cvq->vring.last_avail_idx after calling vringh_init_iotlb. Fixes: 5262912ef3cf ("vdpa/mlx5: Add support for control VQ and MAC setting") Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <1699014387-194368-1-git-send-email-steven.sistare@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
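The save-and-restore fix can be sketched in user space as below. `model_vringh`, `model_vringh_init` and `model_setup_cvq` are hypothetical stand-ins for the structures and calls named in the message. The only behaviour taken from the source is that the init path zeroes `last_avail_idx`.

```c
#include <assert.h>
#include <stdint.h>

struct model_vringh {
	uint16_t last_avail_idx;
	uint32_t num;
};

/* Stand-in for the init path, which resets last_avail_idx to 0. */
static void model_vringh_init(struct model_vringh *vrh, uint32_t num)
{
	/* Assumed precondition in this model: ring size is a power of two. */
	assert(num != 0 && (num & (num - 1)) == 0);
	vrh->num = num;
	vrh->last_avail_idx = 0;
}

/* Fixed setup: remember the index set earlier and put it back after init. */
static void model_setup_cvq(struct model_vringh *vrh, uint32_t num)
{
	uint16_t last_avail_idx = vrh->last_avail_idx;

	model_vringh_init(vrh, num);
	vrh->last_avail_idx = last_avail_idx;
}
```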
* | vdpa_sim_blk: allocate the buffer zeroedStefano Garzarella2023-11-011-2/+2
|/ | | | | | | | | | | | | | Deleting and recreating a device can lead to having the same content as the old device, so let's always allocate buffers completely zeroed out. Fixes: abebb16254b3 ("vdpa_sim_blk: support shared backend") Suggested-by: Qing Wang <qinwang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20231031144339.121453-1-sgarzare@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
* vdpa_sim: implement .reset_map supportSi-Wei Liu2023-11-011-9/+43
| | | | | | | | | | | | | | | | | | | | | In order to reduce excessive memory mapping cost in live migration and VM reboot, it is desirable to decouple the vhost-vdpa IOTLB abstraction from the virtio device life cycle, i.e. mappings can be kept intact across virtio device reset. Leverage the .reset_map callback, which is meant to destroy the iotlb on the given ASID and recreate the 1:1 passthrough/identity mapping. To be consistent, the mapping on device creation is initialized to passthrough/identity with PA 1:1 mapped as IOVA. With this the device .reset op doesn't have to maintain and clean up memory mappings by itself. Additionally, implement .compat_reset to cater for older userspace, which may expect the mapping to be cleared during reset. Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Tested-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <1697880319-4937-8-git-send-email-si-wei.liu@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com>
* vdpa/mlx5: implement .reset_map driver opSi-Wei Liu2023-11-013-3/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Since commit 6f5312f80183 ("vdpa/mlx5: Add support for running with virtio_vdpa"), mlx5_vdpa starts with a preallocated 1:1 DMA MR at device creation time. This 1:1 DMA MR will be implicitly destroyed when the first .set_map call is invoked, in which case callers like vhost-vdpa will start to set up custom mappings. When the .reset callback is invoked, the custom mappings will be cleared and the 1:1 DMA MR will be re-created. In order to reduce excessive memory mapping cost in live migration, it is desirable to decouple the vhost-vdpa IOTLB abstraction from the virtio device life cycle, i.e. mappings can be kept intact across virtio device reset. Leverage the .reset_map callback, which is meant to destroy the regular MR (including cvq mapping) on the given ASID and recreate the initial DMA mapping. That way, the device .reset op no longer has to maintain and clean up memory mappings by itself. Additionally, implement .compat_reset to cater for older userspace, which may expect the mapping to be cleared during reset. Co-developed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Message-Id: <1697880319-4937-7-git-send-email-si-wei.liu@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com>
* vduse: make vduse_class constantGreg Kroah-Hartman2023-11-011-19/+21
| | | | | | | | | | | | | | | | | Now that the driver core allows for struct class to be in read-only memory, we should make all 'class' structures declared at build time placing them into read-only memory, instead of having to be dynamically allocated at runtime. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: Xie Yongji <xieyongji@bytedance.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Message-Id: <2023100643-tricolor-citizen-6c2d@gregkh> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Xie Yongji <xieyongji@bytedance.com> Acked-by: Jason Wang <jasowang@redhat.com>
* mlx5_vdpa: offer VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OKEugenio Pérez2023-11-011-0/+7
| | | | | | | | | | Offer this backend feature as mlx5 is compatible with it. It allows it to do live migration with CVQ, dynamically switching between passthrough and shadow virtqueue. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20230703142514.363256-1-eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* vdpa/mlx5: Update cvq iotlb mapping on ASID changeDragos Tatulea2023-11-013-1/+36
| | | | | | | | | | | | | | | | | | | | | | For the following sequence: - cvq group is in ASID 0 - .set_map(1, cvq_iotlb) - .set_group_asid(cvq_group, 1) ... the cvq mapping from ASID 0 will be used. This is not always correct behaviour. This patch adds support for the above mentioned flow by saving the iotlb on each .set_map and updating the cvq iotlb with it on a cvq group change. Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Message-Id: <20231018171456.1624030-18-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> Tested-by: Si-Wei Liu <si-wei.liu@oracle.com> Tested-by: Lei Yang <leiyang@redhat.com>
* vdpa/mlx5: Make iotlb helper functions more genericDragos Tatulea2023-11-011-8/+11
| | | | | | | | | | | | | | | They will be used in a follow-up patch. For dup_iotlb, avoid the src == dst case. This is an error. Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Message-Id: <20231018171456.1624030-17-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> Tested-by: Si-Wei Liu <si-wei.liu@oracle.com> Tested-by: Lei Yang <leiyang@redhat.com>
* vdpa/mlx5: Enable hw support for vq descriptor mappingDragos Tatulea2023-11-011-1/+23
| | | | | | | | | | | | | | | | | | | | | | | Vq descriptor mappings are supported in hardware by filling in an additional mkey which contains the descriptor mappings to the hw vq. A previous patch in this series added support for hw mkey (mr) creation for ASID 1. This patch fills in both the vq data and vq descriptor mkeys based on group ASID mapping. The feature is signaled to the vdpa core through the presence of the .get_vq_desc_group op. Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Message-Id: <20231018171456.1624030-16-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> Tested-by: Si-Wei Liu <si-wei.liu@oracle.com> Tested-by: Lei Yang <leiyang@redhat.com>
* vdpa/mlx5: Introduce mr for vq descriptorDragos Tatulea2023-11-013-14/+25
| | | | | | | | | | | | | | | | Introduce the vq descriptor group and mr per ASID. Until now .set_map on ASID 1 was only updating the cvq iotlb. From now on it also creates a mkey for it. The current patch doesn't use it but follow-up patches will add hardware support for mapping the vq descriptors. Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Message-Id: <20231018171456.1624030-15-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> Tested-by: Si-Wei Liu <si-wei.liu@oracle.com> Tested-by: Lei Yang <leiyang@redhat.com>
* vdpa/mlx5: Improve mr update flowDragos Tatulea2023-11-013-72/+82
| | | | | | | | | | | | | | | | | | | | | | | | The current flow for updating an mr works directly on mvdev->mr which makes it cumbersome to handle multiple new mr structs. This patch makes the flow more straightforward by having mlx5_vdpa_create_mr return a new mr which will update the old mr (if any). The old mr will be deleted and unlinked from mvdev. For the case when the iotlb is empty (not NULL), the old mr will be cleared. This change paves the way for adding mrs for different ASIDs. The initialized bool is no longer needed as mr is now a pointer in the mlx5_vdpa_dev struct which will be NULL when not initialized. Acked-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Message-Id: <20231018171456.1624030-14-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> Tested-by: Si-Wei Liu <si-wei.liu@oracle.com> Tested-by: Lei Yang <leiyang@redhat.com>