summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'x86_asm_for_v5.10' of ↵Linus Torvalds2020-10-134-25/+18
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 asm updates from Borislav Petkov: "Two asm wrapper fixes: - Use XORL instead of XORQ to avoid a REX prefix and save some bytes in the .fixup section, by Uros Bizjak. - Replace __force_order dummy variable with a memory clobber to fix LLVM requiring a definition for former and to prevent memory accesses from still being cached/reordered, by Arvind Sankar" * tag 'x86_asm_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/asm: Replace __force_order with a memory clobber x86/uaccess: Use XORL %0,%0 in __get_user_asm()
| * x86/asm: Replace __force_order with a memory clobberArvind Sankar2020-10-013-24/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The CRn accessor functions use __force_order as a dummy operand to prevent the compiler from reordering CRn reads/writes with respect to each other. The fact that the asm is volatile should be enough to prevent this: volatile asm statements should be executed in program order. However GCC 4.9.x and 5.x have a bug that might result in reordering. This was fixed in 8.1, 7.3 and 6.5. Versions prior to these, including 5.x and 4.9.x, may reorder volatile asm statements with respect to each other. There are some issues with __force_order as implemented: - It is used only as an input operand for the write functions, and hence doesn't do anything additional to prevent reordering writes. - It allows memory accesses to be cached/reordered across write functions, but CRn writes affect the semantics of memory accesses, so this could be dangerous. - __force_order is not actually defined in the kernel proper, but the LLVM toolchain can in some cases require a definition: LLVM (as well as GCC 4.9) requires it for PIE code, which is why the compressed kernel has a definition, but also the clang integrated assembler may consider the address of __force_order to be significant, resulting in a reference that requires a definition. Fix this by: - Using a memory clobber for the write functions to additionally prevent caching/reordering memory accesses across CRn writes. - Using a dummy input operand with an arbitrary constant address for the read functions, instead of a global variable. This will prevent reads from being reordered across writes, while allowing memory loads to be cached/reordered across CRn reads, which should be safe. Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> Tested-by: Nathan Chancellor <natechancellor@gmail.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82602 Link: https://lore.kernel.org/lkml/20200527135329.1172644-1-arnd@arndb.de/ Link: https://lkml.kernel.org/r/20200902232152.3709896-1-nivedita@alum.mit.edu
| * x86/uaccess: Use XORL %0,%0 in __get_user_asm()Uros Bizjak2020-09-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | XORL %0,%0 is equivalent to XORQ %0,%0 as both will zero the entire register. Use XORL %0,%0 for all operand sizes to avoid REX prefix byte when legacy registers are used and to avoid size prefix byte when 16bit registers are used. Zeroing the full register is OK in this use case. As a result, the size of the .fixup section decreases by 20 bytes. [ bp: Massage commit message. ] Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: H. Peter Anvin (Intel) <hpa@zytor.com> Link: https://lkml.kernel.org/r/20200827180904.96399-1-ubizjak@gmail.com
* | Merge tag 'drivers-5.10-2020-10-12' of git://git.kernel.dk/linux-blockLinus Torvalds2020-10-1358-1269/+2091
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block driver updates from Jens Axboe: "Here are the driver updates for 5.10. A few SCSI updates in here too, in coordination with Martin as they depend on core block changes for the shared tag bitmap. This contains: - NVMe pull requests via Christoph: - fix keep alive timer modification (Amit Engel) - order the PCI ID list more sensibly (Andy Shevchenko) - cleanup the open by controller helper (Chaitanya Kulkarni) - use an xarray for the CSE log lookup (Chaitanya Kulkarni) - support ZNS in nvmet passthrough mode (Chaitanya Kulkarni) - fix nvme_ns_report_zones (Christoph Hellwig) - add a sanity check to nvmet-fc (James Smart) - fix interrupt allocation when too many polled queues are specified (Jeffle Xu) - small nvmet-tcp optimization (Mark Wunderlich) - fix a controller refcount leak on init failure (Chaitanya Kulkarni) - misc cleanups (Chaitanya Kulkarni) - major refactoring of the scanning code (Christoph Hellwig) - MD updates via Song: - Bug fixes in bitmap code, from Zhao Heming - Fix a work queue check, from Guoqing Jiang - Fix raid5 oops with reshape, from Song Liu - Clean up unused code, from Jason Yan - Discard improvements, from Xiao Ni - raid5/6 page offset support, from Yufen Yu - Shared tag bitmap for SCSI/hisi_sas/null_blk (John, Kashyap, Hannes) - null_blk open/active zone limit support (Niklas) - Set of bcache updates (Coly, Dongsheng, Qinglang)" * tag 'drivers-5.10-2020-10-12' of git://git.kernel.dk/linux-block: (78 commits) md/raid5: fix oops during stripe resizing md/bitmap: fix memory leak of temporary bitmap md: fix the checking of wrong work queue md/bitmap: md_bitmap_get_counter returns wrong blocks md/bitmap: md_bitmap_read_sb uses wrong bitmap blocks md/raid0: remove unused function is_io_in_chunk_boundary() nvme-core: remove extra condition for vwc nvme-core: remove extra variable nvme: remove nvme_identify_ns_list nvme: refactor nvme_validate_ns nvme: move nvme_validate_ns nvme: query namespace identifiers before adding the namespace nvme: revalidate zone bitmaps in nvme_update_ns_info nvme: remove nvme_update_formats nvme: update the known admin effects nvme: set the queue limits in nvme_update_ns_info nvme: remove the 0 lba_shift check in nvme_update_ns_info nvme: clean up the check for too large logic block sizes nvme: freeze the queue over ->lba_shift updates nvme: factor out a nvme_configure_metadata helper ...
| * \ Merge branch 'md-next' of ↵Jens Axboe2020-10-095-24/+9
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | https://git.kernel.org/pub/scm/linux/kernel/git/song/md into for-5.10/drivers Pull MD updates from Song: "The main changes are: - Bug fixes in bitmap code, from Zhao Heming. - Fix a work queue check, from Guoqing Jiang. - Fix raid5 oops with reshape, from Song Liu. - Clean up unused code, from Jason Yan." * 'md-next' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md: md/raid5: fix oops during stripe resizing md/bitmap: fix memory leak of temporary bitmap md: fix the checking of wrong work queue md/bitmap: md_bitmap_get_counter returns wrong blocks md/bitmap: md_bitmap_read_sb uses wrong bitmap blocks md/raid0: remove unused function is_io_in_chunk_boundary()
| | * | md/raid5: fix oops during stripe resizingSong Liu2020-10-081-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | KoWei reported crash during raid5 reshape: [ 1032.252932] Oops: 0002 [#1] SMP PTI [...] [ 1032.252943] RIP: 0010:memcpy_erms+0x6/0x10 [...] [ 1032.252947] RSP: 0018:ffffba1ac0c03b78 EFLAGS: 00010286 [ 1032.252949] RAX: 0000784ac0000000 RBX: ffff91bec3d09740 RCX: 0000000000001000 [ 1032.252951] RDX: 0000000000001000 RSI: ffff91be6781c000 RDI: 0000784ac0000000 [ 1032.252953] RBP: ffffba1ac0c03bd8 R08: 0000000000001000 R09: ffffba1ac0c03bf8 [ 1032.252954] R10: 0000000000000000 R11: 0000000000000000 R12: ffffba1ac0c03bf8 [ 1032.252955] R13: 0000000000001000 R14: 0000000000000000 R15: 0000000000000000 [ 1032.252958] FS: 0000000000000000(0000) GS:ffff91becf500000(0000) knlGS:0000000000000000 [ 1032.252959] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1032.252961] CR2: 0000784ac0000000 CR3: 000000031780a002 CR4: 00000000001606e0 [ 1032.252962] Call Trace: [ 1032.252969] ? async_memcpy+0x179/0x1000 [async_memcpy] [ 1032.252977] ? raid5_release_stripe+0x8e/0x110 [raid456] [ 1032.252982] handle_stripe_expansion+0x15a/0x1f0 [raid456] [ 1032.252988] handle_stripe+0x592/0x1270 [raid456] [ 1032.252993] handle_active_stripes.isra.0+0x3cb/0x5a0 [raid456] [ 1032.252999] raid5d+0x35c/0x550 [raid456] [ 1032.253002] ? schedule+0x42/0xb0 [ 1032.253006] ? schedule_timeout+0x10e/0x160 [ 1032.253011] md_thread+0x97/0x160 [ 1032.253015] ? wait_woken+0x80/0x80 [ 1032.253019] kthread+0x104/0x140 [ 1032.253022] ? md_start_sync+0x60/0x60 [ 1032.253024] ? kthread_park+0x90/0x90 [ 1032.253027] ret_from_fork+0x35/0x40 This is because cache_size_mutex was unlocked too early in resize_stripes, which races with grow_one_stripe() that grow_one_stripe() allocates a stripe with wrong pool_size. Fix this issue by unlocking cache_size_mutex after updating pool_size. Cc: <stable@vger.kernel.org> # v4.4+ Reported-by: KoWei Sung <winders@amazon.com> Signed-off-by: Song Liu <songliubraving@fb.com>
| | * | md/bitmap: fix memory leak of temporary bitmapZhao Heming2020-10-082-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Callers of get_bitmap_from_slot() are responsible to free the bitmap. Suggested-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Zhao Heming <heming.zhao@suse.com> Signed-off-by: Song Liu <songliubraving@fb.com>
| | * | md: fix the checking of wrong work queueGuoqing Jiang2020-10-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It should check md_rdev_misc_wq instead of md_misc_wq. Fixes: cc1ffe61c026 ("md: add new workqueue for delete rdev") Cc: <stable@vger.kernel.org> # v5.8+ Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com>
| | * | md/bitmap: md_bitmap_get_counter returns wrong blocksZhao Heming2020-10-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | md_bitmap_get_counter() has code: ``` if (bitmap->bp[page].hijacked || bitmap->bp[page].map == NULL) csize = ((sector_t)1) << (bitmap->chunkshift + PAGE_COUNTER_SHIFT - 1); ``` The minus 1 is wrong, this branch should report 2048 bits of space. With "-1" action, this only report 1024 bit of space. This bug code returns wrong blocks, but it doesn't inflence bitmap logic: 1. Most callers focus this function return value (the counter of offset), not the parameter blocks. 2. The bug is only triggered when hijacked is true or map is NULL. the hijacked true condition is very rare. the "map == null" only true when array is creating or resizing. 3. Even the caller gets wrong blocks, current code makes caller just to call md_bitmap_get_counter() one more time. Signed-off-by: Zhao Heming <heming.zhao@suse.com> Signed-off-by: Song Liu <songliubraving@fb.com>
| | * | md/bitmap: md_bitmap_read_sb uses wrong bitmap blocksZhao Heming2020-10-081-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The patched code is used to get chunks number, should use round-up div to replace current sector_div. The same code is in md_bitmap_resize(): ``` chunks = DIV_ROUND_UP_SECTOR_T(blocks, 1 << chunkshift); ``` Signed-off-by: Zhao Heming <heming.zhao@suse.com> Signed-off-by: Song Liu <songliubraving@fb.com>
| | * | md/raid0: remove unused function is_io_in_chunk_boundary()Jason Yan2020-10-081-17/+0
| |/ / | | | | | | | | | | | | | | | | | | | | | This function is no longger needed after commit 20d0189b1012 ("block: Introduce new bio_split()"). Signed-off-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Song Liu <songliubraving@fb.com>
| * | Merge tag 'nvme-5.10-2020-10-08' of git://git.infradead.org/nvme into ↵Jens Axboe2020-10-085-268/+220
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | for-5.10/drivers Pull NVMe updates from Christoph: "nvme update for 5.10: - fix a controller refcount leak on init failure (Chaitanya Kulkarni) - misc cleanups (Chaitanya Kulkarni) - major refactoring of the scanning code (me)" * tag 'nvme-5.10-2020-10-08' of git://git.infradead.org/nvme: (23 commits) nvme-core: remove extra condition for vwc nvme-core: remove extra variable nvme: remove nvme_identify_ns_list nvme: refactor nvme_validate_ns nvme: move nvme_validate_ns nvme: query namespace identifiers before adding the namespace nvme: revalidate zone bitmaps in nvme_update_ns_info nvme: remove nvme_update_formats nvme: update the known admin effects nvme: set the queue limits in nvme_update_ns_info nvme: remove the 0 lba_shift check in nvme_update_ns_info nvme: clean up the check for too large logic block sizes nvme: freeze the queue over ->lba_shift updates nvme: factor out a nvme_configure_metadata helper nvme: call nvme_identify_ns as the first thing in nvme_alloc_ns_block nvme: lift the check for an unallocated namespace into nvme_identify_ns nvme: rename __nvme_revalidate_disk nvme: rename _nvme_revalidate_disk nvme: rename nvme_validate_ns to nvme_validate_or_alloc_ns nvme: remove the disk argument to nvme_update_zone_info ...
| | * | nvme-core: remove extra condition for vwcChaitanya Kulkarni2020-10-071-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In nvme_set_queue_limits() we initialize vwc to false and later add a condition to set vwc true. The value of the vwc can be declare initialized which makes all the blk_queue_XXX() calls uniform. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
| | * | nvme-core: remove extra variableChaitanya Kulkarni2020-10-071-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In nvme_validate_ns() the exra variable ctrl is used only twice. Using ns->ctrl directly still maintains the redability and original length of the lines in the code. Get rid of the extra variable ctrl & use ns->ctrl directly. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| | * | nvme: remove nvme_identify_ns_listChristoph Hellwig2020-10-071-12/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Just fold it into the only caller. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
| | * | nvme: refactor nvme_validate_nsChristoph Hellwig2020-10-071-20/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the logic to revalidate the block_device size or remove the namespace from the caller into nvme_validate_ns. This removes the return value and thus the status code translation. Additionally it also catches non-permanent errors from nvme_update_ns_info using the existing logic. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
| | * | nvme: move nvme_validate_nsChristoph Hellwig2020-10-071-37/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move nvme_validate_ns just above its only remaining caller. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
| | * | nvme: query namespace identifiers before adding the namespaceChristoph Hellwig2020-10-071-63/+53
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Check the namespace identifier list first thing when scanning namespaces. This keeps the code to query the CSI common between the alloc and validate path, and helps to structure the code better for multiple command set support. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
| | * | nvme: revalidate zone bitmaps in nvme_update_ns_infoChristoph Hellwig2020-10-071-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Consolidate the two calls into a single place. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org>
| | * | nvme: remove nvme_update_formatsChristoph Hellwig2020-10-071-30/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that the queue is frozen before updating ->lba_shift we can't hit the invalid references mentioned in the comment any more. More importantly this code would not have helped us if the format was changed by another controller or through implementation defined back channels. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
| | * | nvme: update the known admin effectsChristoph Hellwig2020-10-071-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A Format NVM command can change the capabilities of namespaces, while Sanitize does change the Logical Block Content and must be serialized. Also remove CSUPP bit for Format - it is not a mandatory command, and we don't check for the bit anyway. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
| | * | nvme: set the queue limits in nvme_update_ns_infoChristoph Hellwig2020-10-071-25/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Only set the queue limits once we have the real block size. This also updates the limits on a rescan if needed. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
| | * | nvme: remove the 0 lba_shift check in nvme_update_ns_infoChristoph Hellwig2020-10-071-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can no longer reach this code if Identify Namespace failed. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
| | * | nvme: clean up the check for too large logic block sizesChristoph Hellwig2020-10-071-8/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use a single statement to set both the capacity and fake block size instead of two. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org>
| | * | nvme: freeze the queue over ->lba_shift updatesChristoph Hellwig2020-10-071-6/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ensure that there can't be any I/O in flight went we change the disk geometry in nvme_update_ns_info, most notable the LBA size by lifting the queue free from nvme_update_disk_info into the caller Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
| | * | nvme: factor out a nvme_configure_metadata helperChristoph Hellwig2020-10-071-31/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Factor out a helper from nvme_update_ns_info that configures the per-namespaces metadata and PI settings. Also make sure the helpers clear the flags explicitly instead of all of ->features to allow for potentially reusing ->features for future non-metadata flags. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
| | * | nvme: call nvme_identify_ns as the first thing in nvme_alloc_ns_blockChristoph Hellwig2020-10-071-8/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Check if the namespace actually exists as the very first thing and don't bother with any extra work if not. This should speed up and simplify the sequential scanning for NVMe 1.0 devices. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
| | * | nvme: lift the check for an unallocated namespace into nvme_identify_nsChristoph Hellwig2020-10-071-9/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the check from the two callers into the common helper. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
| | * | nvme: rename __nvme_revalidate_diskChristoph Hellwig2020-10-071-5/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename __nvme_revalidate_disk to nvme_update_ns_info and pass a namespace instead of the gendisk. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
| | * | nvme: rename _nvme_revalidate_diskChristoph Hellwig2020-10-071-7/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename _nvme_revalidate_disk to nvme_validate_ns to better describe what the function does, and pass the struct nvme_ns instead of the gendisk to better match the call chain. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
| | * | nvme: rename nvme_validate_ns to nvme_validate_or_alloc_nsChristoph Hellwig2020-10-071-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use a slightly more descriptive name to enable reusing nvme_validate_ns in the next patch for a lower level function. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
| | * | nvme: remove the disk argument to nvme_update_zone_infoChristoph Hellwig2020-10-073-10/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The queue can trivially be derived from the nvme_ns structure. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
| | * | nvme: fix initialization of the zone bitmapsChristoph Hellwig2020-10-073-23/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The removal of the ->revalidate_disk method broke the initialization of the zone bitmaps, as nvme_revalidate_disk now never gets called during initialization. Move the zone related code from nvme_revalidate_disk into a new helper in zns.c, and call it from nvme_alloc_ns in addition to nvme_validate_ns to ensure the zone bitmaps are initialized during probe. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
| | * | block: optimize blk_queue_zoned_model for !CONFIG_BLK_DEV_ZONEDChristoph Hellwig2020-10-071-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Always return BLK_ZONED_NONE if zoned device support is not enabled. This allows various compiler optimizations including the dead code elimination that we so like for avoiding ifdefs. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
| | * | nvme-loop: don't put ctrl on nvme_init_ctrl errorChaitanya Kulkarni2020-10-071-2/+2
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function nvme_init_ctrl() gets the ctrl reference & when it fails it does put the ctrl reference in the error unwind code. When creating loop ctrl in nvme_loop_create_ctrl() if nvme_init_ctrl() returns non zero (i.e. error) value it jumps to the "out_put_ctrl" label which calls nvme_put_ctrl(), that will lead to douple ctrl put in error unwind path. Update nvme_loop_create_ctrl() such that this patch removes the "out_put_ctrl" label, add a new "out" label after nvme_put_ctrl() in error unwind path and jump to newly added label when nvme_init_ctrl() call retuns an error. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * | scsi: megaraid_sas: Added support for shared host tagset for cpuhotplugKashyap Desai2020-10-062-13/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fusion adapters can steer completions to individual queues, and we now have support for shared host-wide tags. So we can enable multiqueue support for fusion adapters. Once driver enable shared host-wide tags, cpu hotplug feature is also supported as it was enabled using below patchsets - commit bf0beec0607d ("blk-mq: drain I/O when all CPUs in a hctx are offline") Currently driver has provision to disable host-wide tags using "host_tagset_enable" module parameter. Once we do not have any major performance regression using host-wide tags, we will drop the hand-crafted interrupt affinity settings. Performance is also meeting the expecatation - (used both none and mq-deadline scheduler) 24 Drive SSD on Aero with/without this patch can get 3.1M IOPs 3 VDs consist of 8 SAS SSD on Aero with/without this patch can get 3.1M IOPs. Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Hannes Reinecke <hare@suse.com> Signed-off-by: John Garry <john.garry@huawei.com> Tested-by: Douglas Gilbert <dgilbert@interlog.com> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | scsi: scsi_debug: Support host tagsetJohn Garry2020-10-061-18/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When host_max_queue is set (> 0), set the Scsi_Host.host_tagset such that blk-mq will use a hostwide tagset over all SCSI host submission queues. This means that we may expose all submission queues and always use the hwq chosen by blk-mq. And since if sdebug_host_max_queue is set, sdebug_max_queue is fixed to the same value, we can simplify how sdebug_driver_template.can_queue is set. Signed-off-by: John Garry <john.garry@huawei.com> Tested-by: Douglas Gilbert <dgilbert@interlog.com> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | scsi: hisi_sas: Switch v3 hw to MQJohn Garry2020-10-063-70/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that the block layer provides a shared tag, we can switch the driver to expose all HW queues. Signed-off-by: John Garry <john.garry@huawei.com> Tested-by: Douglas Gilbert <dgilbert@interlog.com> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | scsi: core: Show nr_hw_queues in sysfsJohn Garry2020-10-061-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | So that we don't use a value of 0 for when Scsi_Host.nr_hw_queues is unset, use the tag_set->nr_hw_queues (which holds 1 for this case). Signed-off-by: John Garry <john.garry@huawei.com> Tested-by: Douglas Gilbert <dgilbert@interlog.com> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | scsi: Add host and host template flag 'host_tagset'Hannes Reinecke2020-10-063-1/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add Host and host template flag 'host_tagset' so hostwide tagset can be shared on multiple reply queues after the SCSI device's reply queue is converted to blk-mq hw queue. [jpg: Update comment on .can_queue and add Scsi_Host.host_tagset] Signed-off-by: Hannes Reinecke <hare@suse.com> Signed-off-by: John Garry <john.garry@huawei.com> Tested-by: Don Brace<don.brace@microsemi.com> #SCSI resv cmds patches used Tested-by: Douglas Gilbert <dgilbert@interlog.com> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | block: scsi_ioctl: Avoid the use of one-element arraysGustavo A. R. Silva2020-10-022-4/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | One-element arrays are being deprecated[1]. Replace the one-element array with a simple object of type compat_caddr_t: 'compat_caddr_t unused'[2], once it seems this field is actually never used. Also, update struct cdrom_generic_command in UAPI by adding an anonimous union to avoid using the one-element array _reserved_. [1] https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays [2] https://github.com/KSPP/linux/issues/86 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://lore.kernel.org/lkml/5f76f5d0.qJ4t%2FHWuRzSW7bTa%25lkp@intel.com/ Build-tested-by: kernel test robot <lkp@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | rsxx: Use fallthrough pseudo-keywordGustavo A. R. Silva2020-10-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace /* Fall through. */ comment with the new pseudo-keyword macro fallthrough[1]. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | bcache: remove embedded struct cache_sb from struct cache_setColy Li2020-10-0211-59/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since bcache code was merged into mainline kerrnel, each cache set only as one single cache in it. The multiple caches framework is here but the code is far from completed. Considering the multiple copies of cached data can also be stored on e.g. md raid1 devices, it is unnecessary to support multiple caches in one cache set indeed. The previous preparation patches fix the dependencies of explicitly making a cache set only have single cache. Now we don't have to maintain an embedded partial super block in struct cache_set, the in-memory super block can be directly referenced from struct cache. This patch removes the embedded struct cache_sb from struct cache_set, and fixes all locations where the superb lock was referenced from this removed super block by referencing the in-memory super block of struct cache. Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | bcache: check and set sync status on cache's in-memory super blockColy Li2020-10-024-10/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the cache's sync status is checked and set on cache set's in- memory partial super block. After removing the embedded struct cache_sb from cache set and reference cache's in-memory super block from struct cache_set, the sync status can set and check directly on cache's super block. This patch checks and sets the cache sync status directly on cache's in-memory super block. This is a preparation for later removing embedded struct cache_sb from struct cache_set. Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | bcache: remove can_attach_cache()Coly Li2020-10-021-10/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After removing the embedded struct cache_sb from struct cache_set, cache set will directly reference the in-memory super block of struct cache. It is unnecessary to compare block_size, bucket_size and nr_in_set from the identical in-memory super block in can_attach_cache(). This is a preparation patch for latter removing cache_set->sb from struct cache_set. Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | bcache: don't check seq numbers in register_cache_set()Coly Li2020-10-021-15/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to update the partial super block of cache set, the seq numbers of cache and cache set are checked in register_cache_set(). If cache's seq number is larger than cache set's seq number, cache set must update its partial super block from cache's super block. It is unncessary when the embedded struct cache_sb is removed from struct cache set. This patch removed the seq numbers checking from register_cache_set(), because later there will be no such partial super block in struct cache set, the cache set will directly reference in-memory super block from struct cache. This is a preparation patch for removing embedded struct cache_sb from struct cache_set. Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | bcache: only use bucket_bytes() on struct cacheColy Li2020-10-022-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Because struct cache_set and struct cache both have struct cache_sb, macro bucket_bytes() currently are used on both of them. When removing the embedded struct cache_sb from struct cache_set, this macro won't be used on struct cache_set anymore. This patch unifies all bucket_bytes() usage only on struct cache, this is one of the preparation to remove the embedded struct cache_sb from struct cache_set. Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | bcache: remove useless bucket_pages()Coly Li2020-10-021-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It seems alloc_bucket_pages() is the only user of bucket_pages(). Considering alloc_bucket_pages() is removed from bcache code, it is safe to remove the useless macro bucket_pages() now. Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | bcache: remove useless alloc_bucket_pages()Coly Li2020-10-021-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | Now no one uses alloc_bucket_pages() anymore, remove it from bcache.h. Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | bcache: only use block_bytes() on struct cacheColy Li2020-10-027-24/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Because struct cache_set and struct cache both have struct cache_sb, therefore macro block_bytes() can be used on both of them. When removing the embedded struct cache_sb from struct cache_set, this macro won't be used on struct cache_set anymore. This patch unifies all block_bytes() usage only on struct cache, this is one of the preparation to remove the embedded struct cache_sb from struct cache_set. Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>