summaryrefslogtreecommitdiffstats
path: root/drivers/block/xen-blkfront.c
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'for-6.5/block-2023-06-23' of git://git.kernel.dk/linuxLinus Torvalds2023-06-261-1/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block updates from Jens Axboe: - NVMe pull request via Keith: - Various cleanups all around (Irvin, Chaitanya, Christophe) - Better struct packing (Christophe JAILLET) - Reduce controller error logs for optional commands (Keith) - Support for >=64KiB block sizes (Daniel Gomez) - Fabrics fixes and code organization (Max, Chaitanya, Daniel Wagner) - bcache updates via Coly: - Fix a race at init time (Mingzhe Zou) - Misc fixes and cleanups (Andrea, Thomas, Zheng, Ye) - use page pinning in the block layer for dio (David) - convert old block dio code to page pinning (David, Christoph) - cleanups for pktcdvd (Andy) - cleanups for rnbd (Guoqing) - use the unchecked __bio_add_page() for the initial single page additions (Johannes) - fix overflows in the Amiga partition handling code (Michael) - improve mq-deadline zoned device support (Bart) - keep passthrough requests out of the IO schedulers (Christoph, Ming) - improve support for flush requests, making them less special to deal with (Christoph) - add bdev holder ops and shutdown methods (Christoph) - fix the name_to_dev_t() situation and use cases (Christoph) - decouple the block open flags from fmode_t (Christoph) - ublk updates and cleanups, including adding user copy support (Ming) - BFQ sanity checking (Bart) - convert brd from radix to xarray (Pankaj) - constify various structures (Thomas, Ivan) - more fine grained persistent reservation ioctl capability checks (Jingbo) - misc fixes and cleanups (Arnd, Azeem, Demi, Ed, Hengqi, Hou, Jan, Jordy, Li, Min, Yu, Zhong, Waiman) * tag 'for-6.5/block-2023-06-23' of git://git.kernel.dk/linux: (266 commits) scsi/sg: don't grab scsi host module reference ext4: Fix warning in blkdev_put() block: don't return -EINVAL for not found names in devt_from_devname cdrom: Fix spectre-v1 gadget block: Improve kernel-doc headers blk-mq: don't insert passthrough request into sw queue bsg: make bsg_class a static const structure ublk: make ublk_chr_class a static const structure aoe: make aoe_class a static const structure block/rnbd: make all 'class' structures const block: fix the exclusive open mask in disk_scan_partitions block: add overflow checks for Amiga partition support block: change all __u32 annotations to __be32 in affs_hardblocks.h block: fix signed int overflow in Amiga partition support block: add capacity validation in bdev_add_partition() block: fine-granular CAP_SYS_ADMIN for Persistent Reservation block: disallow Persistent Reservation on partitions reiserfs: fix blkdev_put() warning from release_journal_dev() block: fix wrong mode for blkdev_get_by_dev() from disk_scan_partitions() block: document the holder argument to blkdev_get_by_path ...
| * block: replace fmode_t with a block-specific type for block open flagsChristoph Hellwig2023-06-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The only overlap between the block open flags mapped into the fmode_t and other uses of fmode_t are FMODE_READ and FMODE_WRITE. Define a new blk_mode_t instead for use in blkdev_get_by_{dev,path}, ->open and ->ioctl and stop abusing fmode_t. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Jack Wang <jinpu.wang@ionos.com> [rnbd] Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20230608110258.189493-28-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
* | xen/blkfront: Only check REQ_FUA for writesRoss Lagerwall2023-05-241-1/+2
|/ | | | | | | | | | | | | | | | | The existing code silently converts read operations with the REQ_FUA bit set into write-barrier operations. This results in data loss as the backend scribbles zeroes over the data instead of returning it. While the REQ_FUA bit doesn't make sense on a read operation, at least one well-known out-of-tree kernel module does set it and since it results in data loss, let's be safe here and only look at REQ_FUA for writes. Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Acked-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20230426164005.2213139-1-ross.lagerwall@citrix.com Signed-off-by: Juergen Gross <jgross@suse.com>
* Merge tag 'for-linus-6.2-rc4-tag' of ↵Linus Torvalds2023-01-121-2/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: - two cleanup patches - a fix of a memory leak in the Xen pvfront driver - a fix of a locking issue in the Xen hypervisor console driver * tag 'for-linus-6.2-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/pvcalls: free active map buffer on pvcalls_front_free_map hvc/xen: lock console list traversal x86/xen: Remove the unused function p2m_index() xen: make remove callback of xen driver void returned
| * xen: make remove callback of xen driver void returnedDawei Li2022-12-151-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Since commit fc7a6209d571 ("bus: Make remove callback return void") forces bus_type::remove be void-returned, it doesn't make much sense for any bus based driver implementing remove callbalk to return non-void to its caller. This change is for xen bus based drivers. Acked-by: Juergen Gross <jgross@suse.com> Signed-off-by: Dawei Li <set_pte_at@outlook.com> Link: https://lore.kernel.org/r/TYCP286MB23238119AB4DF190997075C9CAE39@TYCP286MB2323.JPNP286.PROD.OUTLOOK.COM Signed-off-by: Juergen Gross <jgross@suse.com>
* | block: set the disk capacity to 0 in blk_mark_disk_deadChristoph Hellwig2022-11-021-1/+0
|/ | | | | | | | | | | | | | | | | | | | nvme and xen-blkfront are already doing this to stop buffered writes from creating dirty pages that can't be written out later. Move it to the common code. This also removes the comment about the ordering from nvme, as bd_mutex not only is gone entirely, but also hasn't been used for locking updates to the disk size long before that, and thus the ordering requirement documented there doesn't apply any more. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Chao Leng <lengchao@huawei.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20221101150050.3510-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
* Merge tag 'for-linus-6.0-rc4-tag' of ↵Linus Torvalds2022-09-031-8/+12
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: - a minor fix for the Xen grant driver - a small series fixing a recently introduced problem in the Xen blkfront/blkback drivers with negotiation of feature usage * tag 'for-linus-6.0-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/grants: prevent integer overflow in gnttab_dma_alloc_pages() xen-blkfront: Cache feature_persistent value before advertisement xen-blkfront: Advertise feature-persistent as user requested xen-blkback: Advertise feature-persistent as user requested
| * xen-blkfront: Cache feature_persistent value before advertisementSeongJae Park2022-09-021-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Xen blkfront advertises its support of the persistent grants feature when it first setting up and when resuming in 'talk_to_blkback()'. Then, blkback reads the advertised value when it connects with blkfront and decides if it will use the persistent grants feature or not, and advertises its decision to blkfront. Blkfront reads the blkback's decision and it also makes the decision for the use of the feature. Commit 402c43ea6b34 ("xen-blkfront: Apply 'feature_persistent' parameter when connect"), however, made the blkfront's read of the parameter for disabling the advertisement, namely 'feature_persistent', to be done when it negotiate, not when advertise. Therefore blkfront advertises without reading the parameter. As the field for caching the parameter value is zero-initialized, it always advertises as the feature is disabled, so that the persistent grants feature becomes always disabled. This commit fixes the issue by making the blkfront does parmeter caching just before the advertisement. Fixes: 402c43ea6b34 ("xen-blkfront: Apply 'feature_persistent' parameter when connect") Cc: <stable@vger.kernel.org> # 5.10.x Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Signed-off-by: SeongJae Park <sj@kernel.org> Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20220831165824.94815-4-sj@kernel.org Signed-off-by: Juergen Gross <jgross@suse.com>
| * xen-blkfront: Advertise feature-persistent as user requestedSeongJae Park2022-09-021-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The advertisement of the persistent grants feature (writing 'feature-persistent' to xenbus) should mean not the decision for using the feature but only the availability of the feature. However, commit 74a852479c68 ("xen-blkfront: add a parameter for disabling of persistent grants") made a field of blkfront, which was a place for saving only the negotiation result, to be used for yet another purpose: caching of the 'feature_persistent' parameter value. As a result, the advertisement, which should follow only the parameter value, becomes inconsistent. This commit fixes the misuse of the semantic by making blkfront saves the parameter value in a separate place and advertises the support based on only the saved value. Fixes: 74a852479c68 ("xen-blkfront: add a parameter for disabling of persistent grants") Cc: <stable@vger.kernel.org> # 5.10.x Suggested-by: Juergen Gross <jgross@suse.com> Signed-off-by: SeongJae Park <sj@kernel.org> Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20220831165824.94815-3-sj@kernel.org Signed-off-by: Juergen Gross <jgross@suse.com>
* | Merge tag 'for-linus-6.0-rc1b-tag' of ↵Linus Torvalds2022-08-141-3/+1
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull more xen updates from Juergen Gross: - fix the handling of the "persistent grants" feature negotiation between Xen blkfront and Xen blkback drivers - a cleanup of xen.config and adding xen.config to Xen section in MAINTAINERS - support HVMOP_set_evtchn_upcall_vector, which is more compliant to "normal" interrupt handling than the global callback used up to now - further small cleanups * tag 'for-linus-6.0-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: MAINTAINERS: add xen config fragments to XEN HYPERVISOR sections xen: remove XEN_SCRUB_PAGES in xen.config xen/pciback: Fix comment typo xen/xenbus: fix return type in xenbus_file_read() xen-blkfront: Apply 'feature_persistent' parameter when connect xen-blkback: Apply 'feature_persistent' parameter when connect xen-blkback: fix persistent grants negotiation x86/xen: Add support for HVMOP_set_evtchn_upcall_vector
| * xen-blkfront: Apply 'feature_persistent' parameter when connectSeongJae Park2022-08-121-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In some use cases[1], the backend is created while the frontend doesn't support the persistent grants feature, but later the frontend can be changed to support the feature and reconnect. In the past, 'blkback' enabled the persistent grants feature since it unconditionally checked if frontend supports the persistent grants feature for every connect ('connect_ring()') and decided whether it should use persistent grans or not. However, commit aac8a70db24b ("xen-blkback: add a parameter for disabling of persistent grants") has mistakenly changed the behavior. It made the frontend feature support check to not be repeated once it shown the 'feature_persistent' as 'false', or the frontend doesn't support persistent grants. Similar behavioral change has made on 'blkfront' by commit 74a852479c68 ("xen-blkfront: add a parameter for disabling of persistent grants"). This commit changes the behavior of the parameter to make effect for every connect, so that the previous behavior of 'blkfront' can be restored. [1] https://lore.kernel.org/xen-devel/CAJwUmVB6H3iTs-C+U=v-pwJB7-_ZRHPxHzKRJZ22xEPW7z8a=g@mail.gmail.com/ Fixes: 74a852479c68 ("xen-blkfront: add a parameter for disabling of persistent grants") Cc: <stable@vger.kernel.org> # 5.10.x Signed-off-by: SeongJae Park <sj@kernel.org> Reviewed-by: Maximilian Heyne <mheyne@amazon.de> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20220715225108.193398-4-sj@kernel.org Signed-off-by: Juergen Gross <jgross@suse.com>
* | Merge tag 'for-5.20/block-2022-07-29' of git://git.kernel.dk/linux-blockLinus Torvalds2022-08-021-2/+2
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block updates from Jens Axboe: - Improve the type checking of request flags (Bart) - Ensure queue mapping for a single queues always picks the right queue (Bart) - Sanitize the io priority handling (Jan) - rq-qos race fix (Jinke) - Reserved tags handling improvements (John) - Separate memory alignment from file/disk offset aligment for O_DIRECT (Keith) - Add new ublk driver, userspace block driver using io_uring for communication with the userspace backend (Ming) - Use try_cmpxchg() to cleanup the code in various spots (Uros) - Finally remove bdevname() (Christoph) - Clean up the zoned device handling (Christoph) - Clean up independent access range support (Christoph) - Clean up and improve block sysfs handling (Christoph) - Clean up and improve teardown of block devices. This turns the usual two step process into something that is simpler to implement and handle in block drivers (Christoph) - Clean up chunk size handling (Christoph) - Misc cleanups and fixes (Bart, Bo, Dan, GuoYong, Jason, Keith, Liu, Ming, Sebastian, Yang, Ying) * tag 'for-5.20/block-2022-07-29' of git://git.kernel.dk/linux-block: (178 commits) ublk_drv: fix double shift bug ublk_drv: make sure that correct flags(features) returned to userspace ublk_drv: fix error handling of ublk_add_dev ublk_drv: fix lockdep warning block: remove __blk_get_queue block: call blk_mq_exit_queue from disk_release for never added disks blk-mq: fix error handling in __blk_mq_alloc_disk ublk: defer disk allocation ublk: rewrite ublk_ctrl_get_queue_affinity to not rely on hctx->cpumask ublk: fold __ublk_create_dev into ublk_ctrl_add_dev ublk: cleanup ublk_ctrl_uring_cmd ublk: simplify ublk_ch_open and ublk_ch_release ublk: remove the empty open and release block device operations ublk: remove UBLK_IO_F_PREFLUSH ublk: add a MAINTAINERS entry block: don't allow the same type rq_qos add more than once mmc: fix disk/queue leak in case of adding disk failure ublk_drv: fix an IS_ERR() vs NULL check ublk: remove UBLK_IO_F_INTEGRITY ublk_drv: remove unneeded semicolon ...
| * block: remove blk_cleanup_diskChristoph Hellwig2022-06-281-2/+2
| | | | | | | | | | | | | | | | | | | | blk_cleanup_disk is nothing but a trivial wrapper for put_disk now, so remove it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20220619060552.1850436-7-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
* | xen/blkfront: force data bouncing when backend is untrustedRoger Pau Monne2022-07-011-15/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Split the current bounce buffering logic used with persistent grants into it's own option, and allow enabling it independently of persistent grants. This allows to reuse the same code paths to perform the bounce buffering required to avoid leaking contiguous data in shared pages not part of the request fragments. Reporting whether the backend is to be trusted can be done using a module parameter, or from the xenstore frontend path as set by the toolstack when adding the device. This is CVE-2022-33742, part of XSA-403. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
* | xen/blkfront: fix leaking data in shared pagesRoger Pau Monne2022-07-011-2/+3
|/ | | | | | | | | | | | | When allocating pages to be used for shared communication with the backend always zero them, this avoids leaking unintended data present on the pages. This is CVE-2022-26365, part of XSA-403. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
* xen-blkfront: Handle NULL gendiskJason Andryuk2022-06-211-7/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a VBD is not fully created and then closed, the kernel can have a NULL pointer dereference: The reproducer is trivial: [user@dom0 ~]$ sudo xl block-attach work backend=sys-usb vdev=xvdi target=/dev/sdz [user@dom0 ~]$ xl block-list work Vdev BE handle state evt-ch ring-ref BE-path 51712 0 241 4 -1 -1 /local/domain/0/backend/vbd/241/51712 51728 0 241 4 -1 -1 /local/domain/0/backend/vbd/241/51728 51744 0 241 4 -1 -1 /local/domain/0/backend/vbd/241/51744 51760 0 241 4 -1 -1 /local/domain/0/backend/vbd/241/51760 51840 3 241 3 -1 -1 /local/domain/3/backend/vbd/241/51840 ^ note state, the /dev/sdz doesn't exist in the backend [user@dom0 ~]$ sudo xl block-detach work xvdi [user@dom0 ~]$ xl block-list work Vdev BE handle state evt-ch ring-ref BE-path work is an invalid domain identifier And its console has: BUG: kernel NULL pointer dereference, address: 0000000000000050 PGD 80000000edebb067 P4D 80000000edebb067 PUD edec2067 PMD 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 1 PID: 52 Comm: xenwatch Not tainted 5.16.18-2.43.fc32.qubes.x86_64 #1 RIP: 0010:blk_mq_stop_hw_queues+0x5/0x40 Code: 00 48 83 e0 fd 83 c3 01 48 89 85 a8 00 00 00 41 39 5c 24 50 77 c0 5b 5d 41 5c 41 5d c3 c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 <8b> 47 50 85 c0 74 32 41 54 49 89 fc 55 53 31 db 49 8b 44 24 48 48 RSP: 0018:ffffc90000bcfe98 EFLAGS: 00010293 RAX: ffffffffc0008370 RBX: 0000000000000005 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000005 RDI: 0000000000000000 RBP: ffff88800775f000 R08: 0000000000000001 R09: ffff888006e620b8 R10: ffff888006e620b0 R11: f000000000000000 R12: ffff8880bff39000 R13: ffff8880bff39000 R14: 0000000000000000 R15: ffff88800604be00 FS: 0000000000000000(0000) GS:ffff8880f3300000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000050 CR3: 00000000e932e002 CR4: 00000000003706e0 Call Trace: <TASK> blkback_changed+0x95/0x137 [xen_blkfront] ? read_reply+0x160/0x160 xenwatch_thread+0xc0/0x1a0 ? do_wait_intr_irq+0xa0/0xa0 kthread+0x16b/0x190 ? set_kthread_struct+0x40/0x40 ret_from_fork+0x22/0x30 </TASK> Modules linked in: snd_seq_dummy snd_hrtimer snd_seq snd_seq_device snd_timer snd soundcore ipt_REJECT nf_reject_ipv4 xt_state xt_conntrack nft_counter nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat nf_tables nfnetlink intel_rapl_msr intel_rapl_common crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel xen_netfront pcspkr xen_scsiback target_core_mod xen_netback xen_privcmd xen_gntdev xen_gntalloc xen_blkback xen_evtchn ipmi_devintf ipmi_msghandler fuse bpf_preload ip_tables overlay xen_blkfront CR2: 0000000000000050 ---[ end trace 7bc9597fd06ae89d ]--- RIP: 0010:blk_mq_stop_hw_queues+0x5/0x40 Code: 00 48 83 e0 fd 83 c3 01 48 89 85 a8 00 00 00 41 39 5c 24 50 77 c0 5b 5d 41 5c 41 5d c3 c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 <8b> 47 50 85 c0 74 32 41 54 49 89 fc 55 53 31 db 49 8b 44 24 48 48 RSP: 0018:ffffc90000bcfe98 EFLAGS: 00010293 RAX: ffffffffc0008370 RBX: 0000000000000005 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000005 RDI: 0000000000000000 RBP: ffff88800775f000 R08: 0000000000000001 R09: ffff888006e620b8 R10: ffff888006e620b0 R11: f000000000000000 R12: ffff8880bff39000 R13: ffff8880bff39000 R14: 0000000000000000 R15: ffff88800604be00 FS: 0000000000000000(0000) GS:ffff8880f3300000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000050 CR3: 00000000e932e002 CR4: 00000000003706e0 Kernel panic - not syncing: Fatal exception Kernel Offset: disabled info->rq and info->gd are only set in blkfront_connect(), which is called for state 4 (XenbusStateConnected). Guard against using NULL variables in blkfront_closing() to avoid the issue. The rest of blkfront_closing looks okay. If info->nr_rings is 0, then for_each_rinfo won't do anything. blkfront_remove also needs to check for non-NULL pointers before cleaning up the gendisk and request queue. Fixes: 05d69d950d9d "xen-blkfront: sanitize the removal state machine" Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Signed-off-by: Jason Andryuk <jandryuk@gmail.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20220601195341.28581-1-jandryuk@gmail.com Signed-off-by: Juergen Gross <jgross@suse.com>
* Merge tag 'for-linus-5.19-rc1b-tag' of ↵Linus Torvalds2022-06-041-3/+3
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull more xen updates from Juergen Gross: "Two cleanup patches for Xen related code and (more important) an update of MAINTAINERS for Xen, as Boris Ostrovsky decided to step down" * tag 'for-linus-5.19-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: replace xen_remap() with memremap() MAINTAINERS: Update Xen maintainership xen: switch gnttab_end_foreign_access() to take a struct page pointer
| * xen: switch gnttab_end_foreign_access() to take a struct page pointerJuergen Gross2022-05-271-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of a virtual kernel address use a pointer of the associated struct page as second parameter of gnttab_end_foreign_access(). Most users have that pointer available already and are creating the virtual address from it, risking problems in case the memory is located in highmem. gnttab_end_foreign_access() itself won't need to get the struct page from the address again. Suggested-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
* | Merge tag 'for-linus-5.19-rc1-tag' of ↵Linus Torvalds2022-05-231-40/+17
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from Juergen Gross: - decouple the PV interface from kernel internals in the Xen scsifront/scsiback pv drivers - harden the Xen scsifront PV driver against a malicious backend driver - simplify Xen PV frontend driver ring page setup - support Xen setups with multiple domains created at boot time to tolerate Xenstore coming up late - two small cleanup patches * tag 'for-linus-5.19-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (29 commits) xen: add support for initializing xenstore later as HVM domain xen: sync xs_wire.h header with upstream xen x86: xen: remove STACK_FRAME_NON_STANDARD from xen_cpuid xen-blk{back,front}: Update contact points for buffer_squeeze_duration_ms and feature_persistent xen/xenbus: eliminate xenbus_grant_ring() xen/sndfront: use xenbus_setup_ring() and xenbus_teardown_ring() xen/usbfront: use xenbus_setup_ring() and xenbus_teardown_ring() xen/scsifront: use xenbus_setup_ring() and xenbus_teardown_ring() xen/pcifront: use xenbus_setup_ring() and xenbus_teardown_ring() xen/drmfront: use xenbus_setup_ring() and xenbus_teardown_ring() xen/tpmfront: use xenbus_setup_ring() and xenbus_teardown_ring() xen/netfront: use xenbus_setup_ring() and xenbus_teardown_ring() xen/blkfront: use xenbus_setup_ring() and xenbus_teardown_ring() xen/xenbus: add xenbus_setup_ring() service function xen: update ring.h xen/shbuf: switch xen-front-pgdir-shbuf to use INVALID_GRANT_REF xen/dmabuf: switch gntdev-dmabuf to use INVALID_GRANT_REF xen/sound: switch xen_snd_front to use INVALID_GRANT_REF xen/drm: switch xen_drm_front to use INVALID_GRANT_REF xen/usb: switch xen-hcd to use INVALID_GRANT_REF ...
| * xen/blkfront: use xenbus_setup_ring() and xenbus_teardown_ring()Juergen Gross2022-05-191-29/+8
| | | | | | | | | | | | | | | | | | | | Simplify blkfront's ring creation and removal via xenbus_setup_ring() and xenbus_teardown_ring(). Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com>
| * xen/blkfront: switch blkfront to use INVALID_GRANT_REFJuergen Gross2022-05-191-14/+12
| | | | | | | | | | | | | | | | | | | | Instead of using a private macro for an invalid grant reference use the common one. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com>
* | block: decouple REQ_OP_SECURE_ERASE from REQ_OP_DISCARDChristoph Hellwig2022-04-171-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Secure erase is a very different operation from discard in that it is a data integrity operation vs hint. Fully split the limits and helper infrastructure to make the separation more clear. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> [drbd] Acked-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> [nifs2] Acked-by: Jaegeuk Kim <jaegeuk@kernel.org> [f2fs] Acked-by: Coly Li <colyli@suse.de> [bcache] Acked-by: David Sterba <dsterba@suse.com> [btrfs] Acked-by: Chao Yu <chao@kernel.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220415045258.199825-27-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
* | block: remove QUEUE_FLAG_DISCARDChristoph Hellwig2022-04-171-2/+1
|/ | | | | | | | | | | | | | | | | | | Just use a non-zero max_discard_sectors as an indicator for discard support, similar to what is done for write zeroes. The only places where needs special attention is the RAID5 driver, which must clear discard support for security reasons by default, even if the default stacking rules would allow for it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> [drbd] Acked-by: Jan Höppner <hoeppner@linux.ibm.com> [s390] Acked-by: Coly Li <colyli@suse.de> [bcache] Acked-by: David Sterba <dsterba@suse.com> [btrfs] Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220415045258.199825-25-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
* Merge tag 'for-5.18/drivers-2022-04-01' of git://git.kernel.dk/linux-blockLinus Torvalds2022-04-011-1/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block driver fixes from Jens Axboe: "Followup block driver updates and fixes for the 5.18-rc1 merge window. In detail: - NVMe pull request - Fix multipath hang when disk goes live over reconnect (Anton Eidelman) - fix RCU hole that allowed for endless looping in multipath round robin (Chris Leech) - remove redundant assignment after left shift (Colin Ian King) - add quirks for Samsung X5 SSDs (Monish Kumar R) - fix the read-only state for zoned namespaces with unsupposed features (Pankaj Raghav) - use a private workqueue instead of the system workqueue in nvmet (Sagi Grimberg) - allow duplicate NSIDs for private namespaces (Sungup Moon) - expose use_threaded_interrupts read-only in sysfs (Xin Hao)" - nbd minor allocation fix (Zhang) - drbd fixes and maintainer addition (Lars, Jakob, Christoph) - n64cart build fix (Jackie) - loop compat ioctl fix (Carlos) - misc fixes (Colin, Dongli)" * tag 'for-5.18/drivers-2022-04-01' of git://git.kernel.dk/linux-block: drbd: remove check of list iterator against head past the loop body drbd: remove usage of list iterator variable after loop nbd: fix possible overflow on 'first_minor' in nbd_dev_add() MAINTAINERS: add drbd co-maintainer drbd: fix potential silent data corruption loop: fix ioctl calls using compat_loop_info nvme-multipath: fix hang when disk goes live over reconnect nvme: fix RCU hole that allowed for endless looping in multipath round robin nvme: allow duplicate NSIDs for private namespaces nvmet: remove redundant assignment after left shift nvmet: use a private workqueue instead of the system workqueue nvme-pci: add quirks for Samsung X5 SSDs nvme-pci: expose use_threaded_interrupts read-only in sysfs nvme: fix the read-only state for zoned namespaces with unsupposed features n64cart: convert bi_disk to bi_bdev->bd_disk fix build xen/blkfront: fix comment for need_copy xen-blkback: remove redundant assignment to variable i
| * xen/blkfront: fix comment for need_copyDongli Zhang2022-03-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The 'need_copy' is set when rq_data_dir(req) returns WRITE, in order to copy the written data to persistent page. ".need_copy = rq_data_dir(req) && info->feature_persistent," Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Fixes: c004a6fe0c40 ('block/xen-blkfront: Make it running on 64KB page granularity') Acked-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220317220930.5698-1-dongli.zhang@oracle.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
* | Merge tag 'for-linus-5.18-rc1-tag' of ↵Linus Torvalds2022-03-281-4/+4
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from Juergen Gross: - A bunch of minor cleanups - A fix for kexec in Xen dom0 when executed on a high cpu number - A fix for resuming after suspend of a Xen guest with assigned PCI devices - A fix for a crash due to not disabled preemption when resuming as Xen dom0 * tag 'for-linus-5.18-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: fix is_xen_pmu() xen: don't hang when resuming PCI device arch:x86:xen: Remove unnecessary assignment in xen_apic_read() xen/grant-table: remove readonly parameter from functions xen/grant-table: remove gnttab_*transfer*() functions drivers/xen: use helper macro __ATTR_RW x86/xen: Fix kerneldoc warning xen: delay xen_hvm_init_time_ops() if kdump is boot on vcpu>=32 xen: use time_is_before_eq_jiffies() instead of open coding it
| * | xen/grant-table: remove readonly parameter from functionsJuergen Gross2022-03-151-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The gnttab_end_foreign_access() family of functions is taking a "readonly" parameter, which isn't used. Remove it from the function parameters. Signed-off-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20220311103429.12845-3-jgross@suse.com Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
* | | Merge tag 'for-5.18/drivers-2022-03-18' of git://git.kernel.dk/linux-blockLinus Torvalds2022-03-211-1/+4
|\ \ \ | |/ / |/| / | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block driver updates from Jens Axboe: - NVMe updates via Christoph: - add vectored-io support for user-passthrough (Kanchan Joshi) - add verbose error logging (Alan Adamson) - support buffered I/O on block devices in nvmet (Chaitanya Kulkarni) - central discovery controller support (Martin Belanger) - fix and extended the globally unique idenfier validation (Christoph) - move away from the deprecated IDA APIs (Sagi Grimberg) - misc code cleanup (Keith Busch, Max Gurtovoy, Qinghua Jin, Chaitanya Kulkarni) - add lockdep annotations for in-kernel sockets (Chris Leech) - use vmalloc for ANA log buffer (Hannes Reinecke) - kerneldoc fixes (Chaitanya Kulkarni) - cleanups (Guoqing Jiang, Chaitanya Kulkarni, Christoph) - warn about shared namespaces without multipathing (Christoph) - MD updates via Song with a set of cleanups (Christoph, Mariusz, Paul, Erik, Dirk) - loop cleanups and queue depth configuration (Chaitanya) - null_blk cleanups and fixes (Chaitanya) - Use descriptive init/exit names in virtio_blk (Randy) - Use bvec_kmap_local() in drivers (Christoph) - bcache fixes (Mingzhe) - xen blk-front persistent grant speedups (Juergen) - rnbd fix and cleanup (Gioh) - Misc fixes (Christophe, Colin) * tag 'for-5.18/drivers-2022-03-18' of git://git.kernel.dk/linux-block: (76 commits) virtio_blk: eliminate anonymous module_init & module_exit nvme: warn about shared namespaces without CONFIG_NVME_MULTIPATH nvme: remove nvme_alloc_request and nvme_alloc_request_qid nvme: cleanup how disk->disk_name is assigned nvmet: move the call to nvmet_ns_changed out of nvmet_ns_revalidate nvmet: use snprintf() with PAGE_SIZE in configfs nvmet: don't fold lines nvmet-rdma: fix kernel-doc warning for nvmet_rdma_device_removal nvmet-fc: fix kernel-doc warning for nvmet_fc_unregister_targetport nvmet-fc: fix kernel-doc warning for nvmet_fc_register_targetport nvme-tcp: lockdep: annotate in-kernel sockets nvme-tcp: don't fold the line nvme-tcp: don't initialize ret variable nvme-multipath: call bio_io_error in nvme_ns_head_submit_bio nvme-multipath: use vmalloc for ANA log buffer xen/blkfront: speed up purge_persistent_grants() raid5: initialize the stripe_head embeeded bios as needed raid5-cache: statically allocate the recovery ra bio raid5-cache: fully initialize flush_bio when needed raid5-ppl: fully initialize the bio in ppl_new_iounit ...
| * xen/blkfront: speed up purge_persistent_grants()Juergen Gross2022-03-111-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | purge_persistent_grants() is scanning the grants list for persistent grants being no longer in use by the backend. When having found such a grant, it will be set to "invalid" and pushed to the tail of the list. Instead of pushing it directly to the end of the list, add it first to a temporary list, avoiding to scan those entries again in the main list traversal. After having finished the scan, append the temporary list to the grant list. Suggested-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Link: https://lore.kernel.org/r/20220311103527.12931-1-jgross@suse.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
* | xen/blkfront: don't use gnttab_query_foreign_access() for mapped statusJuergen Gross2022-03-071-26/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It isn't enough to check whether a grant is still being in use by calling gnttab_query_foreign_access(), as a mapping could be realized by the other side just after having called that function. In case the call was done in preparation of revoking a grant it is better to do so via gnttab_end_foreign_access_ref() and check the success of that operation instead. For the ring allocation use alloc_pages_exact() in order to avoid high order pages in case of a multi-page ring. If a grant wasn't unmapped by the backend without persistent grants being used, set the device state to "error". This is CVE-2022-23036 / part of XSA-396. Reported-by: Demi Marie Obenour <demi@invisiblethingslab.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> --- V2: - use gnttab_try_end_foreign_access() V4: - use alloc_pages_exact() and free_pages_exact() - set state to error if backend didn't unmap (Roger Pau Monné)
* | block: fix surprise removal for drivers calling blk_set_queue_dyingChristoph Hellwig2022-02-171-1/+1
|/ | | | | | | | | | | | | | | | | Various block drivers call blk_set_queue_dying to mark a disk as dead due to surprise removal events, but since commit 8e141f9eb803 that doesn't work given that the GD_DEAD flag needs to be set to stop I/O. Replace the driver calls to blk_set_queue_dying with a new (and properly documented) blk_mark_disk_dead API, and fold blk_set_queue_dying into the only remaining caller. Fixes: 8e141f9eb803 ("block: drain file system I/O on del_gendisk") Reported-by: Markus Blöchl <markus.bloechl@ipetronik.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Link: https://lore.kernel.org/r/20220217075231.1140-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
* Merge tag 'for-5.17/block-2022-01-11' of git://git.kernel.dk/linux-blockLinus Torvalds2022-01-121-15/+11
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block updates from Jens Axboe: - Unify where the struct request handling code is located in the blk-mq code (Christoph) - Header cleanups (Christoph) - Clean up the io_context handling code (Christoph, me) - Get rid of ->rq_disk in struct request (Christoph) - Error handling fix for add_disk() (Christoph) - request allocation cleanusp (Christoph) - Documentation updates (Eric, Matthew) - Remove trivial crypto unregister helper (Eric) - Reduce shared tag overhead (John) - Reduce poll_stats memory overhead (me) - Known indirect function call for dio (me) - Use atomic references for struct request (me) - Support request list issue for block and NVMe (me) - Improve queue dispatch pinning (Ming) - Improve the direct list issue code (Keith) - BFQ improvements (Jan) - Direct completion helper and use it in mmc block (Sebastian) - Use raw spinlock for the blktrace code (Wander) - fsync error handling fix (Ye) - Various fixes and cleanups (Lukas, Randy, Yang, Tetsuo, Ming, me) * tag 'for-5.17/block-2022-01-11' of git://git.kernel.dk/linux-block: (132 commits) MAINTAINERS: add entries for block layer documentation docs: block: remove queue-sysfs.rst docs: sysfs-block: document virt_boundary_mask docs: sysfs-block: document stable_writes docs: sysfs-block: fill in missing documentation from queue-sysfs.rst docs: sysfs-block: add contact for nomerges docs: sysfs-block: sort alphabetically docs: sysfs-block: move to stable directory block: don't protect submit_bio_checks by q_usage_counter block: fix old-style declaration nvme-pci: fix queue_rqs list splitting block: introduce rq_list_move block: introduce rq_list_for_each_safe macro block: move rq_list macros to blk-mq.h block: drop needless assignment in set_task_ioprio() block: remove unnecessary trailing '\' bio.h: fix kernel-doc warnings block: check minor range in device_add_disk() block: use "unsigned long" for blk_validate_block_size(). block: fix error unwinding in device_add_disk ...
| * block: remove GENHD_FL_CDChristoph Hellwig2021-11-291-15/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | GENHD_FL_CD marks a gendisk as a vaguely CD-ROM like device. Besides being used internally inside of sunvdc.c an xen-blkfront it is used by xen-blkback as a hint to claim a device exported to a guest is a CD-ROM like device. Just check for disk->cdi instead which is the right indicator for "real" CD-ROM or DVD drivers. This will miss the paravirtualized guest drivers, but those make little sense to report anyway. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20211122130625.1136848-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
* | xen/blkfront: harden blkfront against event channel stormsJuergen Gross2021-12-161-3/+12
|/ | | | | | | | | | | The Xen blkfront driver is still vulnerable for an attack via excessive number of events sent by the backend. Fix that by using lateeoi event channels. This is part of XSA-391 Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
* xen-blkfront: add error handling support for add_disk()Luis Chamberlain2021-10-211-1/+7
| | | | | | | | | | | | | | | | | | We never checked for errors on device_add_disk() as this function returned void. Now that this is fixed, use the shiny new error handling. The function xlvbd_alloc_gendisk() typically does the unwinding on error on allocating the disk and creating the tag, but since all that error handling was stuffed inside xlvbd_alloc_gendisk() we must repeat the tag free'ing as well. We set the info->rq to NULL to ensure blkif_free() doesn't crash on blk_mq_stop_hw_queues() on device_add_disk() error as the queue will be long gone by then. Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Link: https://lore.kernel.org/r/20211015233028.2167651-6-mcgrof@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
* block: drop unused includes in <linux/genhd.h>Christoph Hellwig2021-10-181-0/+1
| | | | | | | | | | Drop various include not actually used in genhd.h itself, and move the remaning includes closer together. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20210920123328.1399408-15-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
* Merge tag 'for-linus-5.15-rc1-tag' of ↵Linus Torvalds2021-09-021-42/+84
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from Juergen Gross: - some small cleanups - a fix for a bug when running as Xen PV guest which could result in not all memory being transferred in case of a migration of the guest - a small series for getting rid of code for supporting very old Xen hypervisor versions nobody should be using since many years now - a series for hardening the Xen block frontend driver - a fix for Xen PV boot code issuing warning messages due to a stray preempt_disable() on the non-boot processors * tag 'for-linus-5.15-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: remove stray preempt_disable() from PV AP startup code xen/pcifront: Removed unnecessary __ref annotation x86: xen: platform-pci-unplug: use pr_err() and pr_warn() instead of raw printk() drivers/xen/xenbus/xenbus_client.c: fix bugon.cocci warnings xen/blkfront: don't trust the backend response data blindly xen/blkfront: don't take local copy of a request from the ring page xen/blkfront: read response from backend only once xen: assume XENFEAT_gnttab_map_avail_bits being set for pv guests xen: assume XENFEAT_mmu_pt_update_preserve_ad being set for pv guests xen: check required Xen features xen: fix setting of max_pfn in shared_info
| * xen/blkfront: don't trust the backend response data blindlyJuergen Gross2021-08-301-17/+53
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Today blkfront will trust the backend to send only sane response data. In order to avoid privilege escalations or crashes in case of malicious backends verify the data to be within expected limits. Especially make sure that the response always references an outstanding request. Introduce a new state of the ring BLKIF_STATE_ERROR which will be switched to in case an inconsistency is being detected. Recovering from this state is possible only via removing and adding the virtual device again (e.g. via a suspend/resume cycle). Make all warning messages issued due to valid error responses rate limited in order to avoid message floods being triggered by a malicious backend. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Link: https://lore.kernel.org/r/20210730103854.12681-4-jgross@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
| * xen/blkfront: don't take local copy of a request from the ring pageJuergen Gross2021-08-301-10/+15
| | | | | | | | | | | | | | | | | | | | | | | | In order to avoid a malicious backend being able to influence the local copy of a request build the request locally first and then copy it to the ring page instead of doing it the other way round as today. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Link: https://lore.kernel.org/r/20210730103854.12681-3-jgross@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
| * xen/blkfront: read response from backend only onceJuergen Gross2021-08-301-17/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | In order to avoid problems in case the backend is modifying a response on the ring page while the frontend has already seen it, just read the response into a local buffer in one go and then operate on that buffer only. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Link: https://lore.kernel.org/r/20210730103854.12681-2-jgross@suse.com Signed-off-by: Juergen Gross <jgross@suse.com>
* | xen-blkfront: Remove redundant assignment to variable errColin Ian King2021-08-091-1/+0
|/ | | | | | | | | | | The variable err is being assigned a value that is never read, the assignment is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/20210806110601.11386-1-colin.king@canonical.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
* xen-blkfront: sanitize the removal state machineChristoph Hellwig2021-07-151-198/+26
| | | | | | | | | | | | | | | | | xen-blkfront has a weird protocol where close message from the remote side can be delayed, and where hot removals are treated somewhat differently from regular removals, all leading to potential NULL pointer removals, and a del_gendisk from the block device release method, which will deadlock. Fix this by just performing normal hot removals even when the device is opened like all other Linux block drivers. Fixes: c76f48eb5c08 ("block: take bd_mutex around delete_partitions in del_gendisk") Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Vitaly Kuznetsov <vkuznets@redhat.com> Link: https://lore.kernel.org/r/20210715141711.1257293-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
* xen-blkfront: use blk_mq_alloc_disk and blk_cleanup_diskChristoph Hellwig2021-06-111-57/+39
| | | | | | | | | | Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and request_queue allocation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Link: https://lore.kernel.org/r/20210602065345.355274-26-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
* block: move bd_mutex to struct gendiskChristoph Hellwig2021-06-011-4/+4
| | | | | | | | | | | Replace the per-block device bd_mutex with a per-gendisk open_mutex, thus simplifying locking wherever we deal with partitions. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Link: https://lore.kernel.org/r/20210525061301.2242282-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
* Merge tag 'for-5.13/drivers-2021-04-27' of git://git.kernel.dk/linux-blockLinus Torvalds2021-04-281-3/+3
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block driver updates from Jens Axboe: - MD changes via Song: - raid5 POWER fix - raid1 failure fix - UAF fix for md cluster - mddev_find_or_alloc() clean up - Fix NULL pointer deref with external bitmap - Performance improvement for raid10 discard requests - Fix missing information of /proc/mdstat - rsxx const qualifier removal (Arnd) - Expose allocated brd pages (Calvin) - rnbd via Gioh Kim: - Change maintainer - Change domain address of maintainers' email - Add polling IO mode and document update - Fix memory leak and some bug detected by static code analysis tools - Code refactoring - Series of floppy cleanups/fixes (Denis) - s390 dasd fixes (Julian) - kerneldoc fixes (Lee) - null_blk double free (Lv) - null_blk virtual boundary addition (Max) - Remove xsysace driver (Michal) - umem driver removal (Davidlohr) - ataflop fixes (Dan) - Revalidate disk removal (Christoph) - Bounce buffer cleanups (Christoph) - Mark lightnvm as deprecated (Christoph) - mtip32xx init cleanups (Shixin) - Various fixes (Tian, Gustavo, Coly, Yang, Zhang, Zhiqiang) * tag 'for-5.13/drivers-2021-04-27' of git://git.kernel.dk/linux-block: (143 commits) async_xor: increase src_offs when dropping destination page drivers/block/null_blk/main: Fix a double free in null_init. md/raid1: properly indicate failure when ending a failed write request md-cluster: fix use-after-free issue when removing rdev nvme: introduce generic per-namespace chardev nvme: cleanup nvme_configure_apst nvme: do not try to reconfigure APST when the controller is not live nvme: add 'kato' sysfs attribute nvme: sanitize KATO setting nvmet: avoid queuing keep-alive timer if it is disabled brd: expose number of allocated pages in debugfs ataflop: fix off by one in ataflop_probe() ataflop: potential out of bounds in do_format() drbd: Fix fall-through warnings for Clang block/rnbd: Use strscpy instead of strlcpy block/rnbd-clt-sysfs: Remove copy buffer overlap in rnbd_clt_get_path_name block/rnbd-clt: Remove max_segment_size block/rnbd-clt: Generate kobject_uevent when the rnbd device state changes block/rnbd-srv: Remove unused arguments of rnbd_srv_rdma_ev Documentation/ABI/rnbd-clt: Add description for nr_poll_queues ...
| * block: xen-blkfront: Demote kernel-doc abusesLee Jones2021-04-061-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes the following W=1 kernel build warning(s): drivers/block/xen-blkfront.c:1960: warning: Function parameter or member 'dev' not described in 'blkfront_probe' drivers/block/xen-blkfront.c:1960: warning: Function parameter or member 'id' not described in 'blkfront_probe' drivers/block/xen-blkfront.c:1960: warning: expecting prototype for Allocate the basic(). Prototype was for blkfront_probe() instead drivers/block/xen-blkfront.c:2085: warning: Function parameter or member 'dev' not described in 'blkfront_resume' drivers/block/xen-blkfront.c:2085: warning: expecting prototype for or a backend(). Prototype was for blkfront_resume() instead drivers/block/xen-blkfront.c:2444: warning: wrong kernel-doc identifier on line: Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: "Roger Pau Monné" <roger.pau@citrix.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Juergen Gross <jgross@suse.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: xen-devel@lists.xenproject.org Cc: linux-block@vger.kernel.org Signed-off-by: Lee Jones <lee.jones@linaro.org> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Link: https://lore.kernel.org/r/20210312105530.2219008-11-lee.jones@linaro.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
* | xen-blkfront: Fix 'physical' typosBjorn Helgaas2021-04-231-1/+1
|/ | | | | | | | | Fix misspelling of "physical". Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Link: https://lore.kernel.org/r/20210126205509.2917606-1-helgaas@kernel.org Signed-off-by: Juergen Gross <jgross@suse.com>
* Merge tag 'for-linus-5.11-rc6-tag' of ↵Linus Torvalds2021-01-281-13/+7
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: - A fix for a regression introduced in 5.11 resulting in Xen dom0 having problems to correctly initialize Xenstore. - A fix for avoiding WARN splats when booting as Xen dom0 with CONFIG_AMD_MEM_ENCRYPT enabled due to a missing trap handler for the #VC exception (even if the handler should never be called). - A fix for the Xen bklfront driver adapting to the correct but unexpected behavior of new qemu. * tag 'for-linus-5.11-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: x86/xen: avoid warning in Xen pv guest with CONFIG_AMD_MEM_ENCRYPT enabled xen: Fix XenStore initialisation for XS_LOCAL xen-blkfront: allow discard-* nodes to be optional
| * xen-blkfront: allow discard-* nodes to be optionalRoger Pau Monne2021-01-261-13/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is inline with the specification described in blkif.h: * discard-granularity: should be set to the physical block size if node is not present. * discard-alignment, discard-secure: should be set to 0 if node not present. This was detected as QEMU would only create the discard-granularity node but not discard-alignment, and thus the setup done in blkfront_setup_discard would fail. Fix blkfront_setup_discard to not fail on missing nodes, and also fix blkif_set_queue_limits to set the discard granularity to the physical block size if none is specified in xenbus. Fixes: ed30bf317c5ce ('xen-blkfront: Handle discard requests.') Reported-by: Arthur Borsboom <arthurborsboom@gmail.com> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com> Tested-By: Arthur Borsboom <arthurborsboom@gmail.com> Link: https://lore.kernel.org/r/20210119105727.95173-1-roger.pau@citrix.com Signed-off-by: Juergen Gross <jgross@suse.com>
* | Merge tag 'for-linus-5.11-rc1b-tag' of ↵Linus Torvalds2020-12-191-0/+1
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull more xen updates from Juergen Gross: "Some minor cleanup patches and a small series disentangling some Xen related Kconfig options" * tag 'for-linus-5.11-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: Kconfig: remove X86_64 depends from XEN_512GB xen/manage: Fix fall-through warnings for Clang xen-blkfront: Fix fall-through warnings for Clang xen: remove trailing semicolon in macro definition xen: Kconfig: nest Xen guest options xen: Remove Xen PVH/PVHVM dependency on PCI x86/xen: Convert to DEFINE_SHOW_ATTRIBUTE