summaryrefslogtreecommitdiffstats
path: root/drivers/md
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'block-5.10-2020-12-12' of git://git.kernel.dk/linux-blockLinus Torvalds2020-12-136-393/+82
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block fixes from Jens Axboe: "This should be it for 5.10. Mike and Song looked into the warning case, and thankfully it appears the fix was pretty trivial - we can just change the md device chunk type to unsigned int to get rid of it. They cannot currently be < 0, and nobody is checking for that either. We're reverting the discard changes as the corruption reports came in very late, and there's just no time to attempt to deal with it at this point. Reverting the changes in question is the right call for 5.10" * tag 'block-5.10-2020-12-12' of git://git.kernel.dk/linux-block: md: change mddev 'chunk_sectors' from int to unsigned Revert "md: add md_submit_discard_bio() for submitting discard bio" Revert "md/raid10: extend r10bio devs to raid disks" Revert "md/raid10: pull codes that wait for blocked dev into one function" Revert "md/raid10: improve raid10 discard request" Revert "md/raid10: improve discard request for far layout" Revert "dm raid: remove unnecessary discard limits for raid10"
| * md: change mddev 'chunk_sectors' from int to unsignedMike Snitzer2020-12-121-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit e2782f560c29 ("Revert "dm raid: remove unnecessary discard limits for raid10"") exposed compiler warnings introduced by commit e0910c8e4f87 ("dm raid: fix discard limits for raid1 and raid10"): In file included from ./include/linux/kernel.h:14, from ./include/asm-generic/bug.h:20, from ./arch/x86/include/asm/bug.h:93, from ./include/linux/bug.h:5, from ./include/linux/mmdebug.h:5, from ./include/linux/gfp.h:5, from ./include/linux/slab.h:15, from drivers/md/dm-raid.c:8: drivers/md/dm-raid.c: In function ‘raid_io_hints’: ./include/linux/minmax.h:18:28: warning: comparison of distinct pointer types lacks a cast (!!(sizeof((typeof(x) *)1 == (typeof(y) *)1))) ^~ ./include/linux/minmax.h:32:4: note: in expansion of macro ‘__typecheck’ (__typecheck(x, y) && __no_side_effects(x, y)) ^~~~~~~~~~~ ./include/linux/minmax.h:42:24: note: in expansion of macro ‘__safe_cmp’ __builtin_choose_expr(__safe_cmp(x, y), \ ^~~~~~~~~~ ./include/linux/minmax.h:51:19: note: in expansion of macro ‘__careful_cmp’ #define min(x, y) __careful_cmp(x, y, <) ^~~~~~~~~~~~~ ./include/linux/minmax.h:84:39: note: in expansion of macro ‘min’ __x == 0 ? __y : ((__y == 0) ? __x : min(__x, __y)); }) ^~~ drivers/md/dm-raid.c:3739:33: note: in expansion of macro ‘min_not_zero’ limits->max_discard_sectors = min_not_zero(rs->md.chunk_sectors, ^~~~~~~~~~~~ Fix this by changing the chunk_sectors member of 'struct mddev' from int to 'unsigned int' to match the type used for the 'chunk_sectors' member of 'struct queue_limits'. Various MD code still uses 'int' but none of it appears to ever make use of signed int; and storing positive signed int in unsigned is perfectly safe. Reported-by: Song Liu <songliubraving@fb.com> Fixes: e2782f560c29 ("Revert "dm raid: remove unnecessary discard limits for raid10"") Fixes: e0910c8e4f87 ("dm raid: fix discard limits for raid1 and raid10") Cc: stable@vger,kernel.org # e0910c8e4f87 was marked for stable@ Signed-off-by: Mike Snitzer <snitzer@redhat.com> Reviewed-by: Song Liu <song@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * Revert "md: add md_submit_discard_bio() for submitting discard bio"Song Liu2020-12-093-24/+12
| | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 2628089b74d5a64bd0bcb5d247a18f78d7b6f4d0. Matthew Ruffell reported data corruption in raid10 due to the changes in discard handling [1]. Revert these changes before we find a proper fix. [1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1907262/ Cc: Matthew Ruffell <matthew.ruffell@canonical.com> Cc: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <songliubraving@fb.com>
| * Revert "md/raid10: extend r10bio devs to raid disks"Song Liu2020-12-091-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 8650a889017cb1f6ea6813ccf83a2e9f6fa49dd3. Matthew Ruffell reported data corruption in raid10 due to the changes in discard handling [1]. Revert these changes before we find a proper fix. [1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1907262/ Cc: Matthew Ruffell <matthew.ruffell@canonical.com> Cc: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <songliubraving@fb.com>
| * Revert "md/raid10: pull codes that wait for blocked dev into one function"Song Liu2020-12-091-67/+51
| | | | | | | | | | | | | | | | | | | | | | | | This reverts commit f046f5d0d79cdb968f219ce249e497fd1accf484. Matthew Ruffell reported data corruption in raid10 due to the changes in discard handling [1]. Revert these changes before we find a proper fix. [1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1907262/ Cc: Matthew Ruffell <matthew.ruffell@canonical.com> Cc: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <songliubraving@fb.com>
| * Revert "md/raid10: improve raid10 discard request"Song Liu2020-12-091-255/+1
| | | | | | | | | | | | | | | | | | | | | | | | This reverts commit bcc90d280465ebd51ab8688be86e1f00c62dccf9. Matthew Ruffell reported data corruption in raid10 due to the changes in discard handling [1]. Revert these changes before we find a proper fix. [1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1907262/ Cc: Matthew Ruffell <matthew.ruffell@canonical.com> Cc: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <songliubraving@fb.com>
| * Revert "md/raid10: improve discard request for far layout"Song Liu2020-12-092-64/+23
| | | | | | | | | | | | | | | | | | | | | | | | This reverts commit d3ee2d8415a6256c1c41e1be36e80e640c3e6359. Matthew Ruffell reported data corruption in raid10 due to the changes in discard handling [1]. Revert these changes before we find a proper fix. [1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1907262/ Cc: Matthew Ruffell <matthew.ruffell@canonical.com> Cc: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <songliubraving@fb.com>
| * Revert "dm raid: remove unnecessary discard limits for raid10"Song Liu2020-12-091-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit f0e90b6c663a7e3b4736cb318c6c7c589f152c28. Matthew Ruffell reported data corruption in raid10 due to the changes in discard handling [1]. Revert these changes before we find a proper fix. [1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1907262/ Cc: Matthew Ruffell <matthew.ruffell@canonical.com> Cc: Xiao Ni <xni@redhat.com> Cc: Mike Snitzer <snitzer@redhat.com> Acked-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Song Liu <songliubraving@fb.com>
* | dm: remove invalid sparse __acquires and __releases annotationsMike Snitzer2020-12-041-2/+0
| | | | | | | | | | | | | | | | | | | | Fixes sparse warnings: drivers/md/dm.c:508:12: warning: context imbalance in 'dm_prepare_ioctl' - wrong count at exit drivers/md/dm.c:543:13: warning: context imbalance in 'dm_unprepare_ioctl' - wrong count at exit Fixes: 971888c46993f ("dm: hold DM table for duration of ioctl rather than use blkdev_get") Cc: stable@vger.kernel.org Signed-off-by: Mike Snitzer <snitzer@redhat.com>
* | dm: fix double RCU unlock in dm_dax_zero_page_range() error pathMike Snitzer2020-12-041-2/+0
| | | | | | | | | | | | | | | | | | | | Remove redundant dm_put_live_table() in dm_dax_zero_page_range() error path to fix sparse warning: drivers/md/dm.c:1208:9: warning: context imbalance in 'dm_dax_zero_page_range' - unexpected unlock Fixes: cdf6cdcd3b99a ("dm,dax: Add dax zero_page_range operation") Cc: stable@vger.kernel.org Signed-off-by: Mike Snitzer <snitzer@redhat.com>
* | dm: fix IO splittingMike Snitzer2020-12-042-13/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 882ec4e609c1 ("dm table: stack 'chunk_sectors' limit to account for target-specific splitting") caused a couple regressions: 1) Using lcm_not_zero() when stacking chunk_sectors was a bug because chunk_sectors must reflect the most limited of all devices in the IO stack. 2) DM targets that set max_io_len but that do _not_ provide an .iterate_devices method no longer had there IO split properly. And commit 5091cdec56fa ("dm: change max_io_len() to use blk_max_size_offset()") also caused a regression where DM no longer supported varied (per target) IO splitting. The implication being the potential for severely reduced performance for IO stacks that use a DM target like dm-cache to hide performance limitations of a slower device (e.g. one that requires 4K IO splitting). Coming full circle: Fix all these issues by discontinuing stacking chunk_sectors up using ti->max_io_len in dm_calculate_queue_limits(), add optional chunk_sectors override argument to blk_max_size_offset() and update DM's max_io_len() to pass ti->max_io_len to its blk_max_size_offset() call. Passing in an optional chunk_sectors override to blk_max_size_offset() allows for code reuse of block's centralized calculation for max IO size based on provided offset and split boundary. Fixes: 882ec4e609c1 ("dm table: stack 'chunk_sectors' limit to account for target-specific splitting") Fixes: 5091cdec56fa ("dm: change max_io_len() to use blk_max_size_offset()") Cc: stable@vger.kernel.org Reported-by: John Dorminy <jdorminy@redhat.com> Reported-by: Bruce Johnston <bjohnsto@redhat.com> Reported-by: Kirill Tkhai <ktkhai@virtuozzo.com> Reviewed-by: John Dorminy <jdorminy@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Reviewed-by: Jens Axboe <axboe@kernel.dk>
* | dm writecache: remove BUG() and fail gracefully insteadMike Snitzer2020-12-011-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Building on arch/s390/ results in this build error: cc1: some warnings being treated as errors ../drivers/md/dm-writecache.c: In function 'persistent_memory_claim': ../drivers/md/dm-writecache.c:323:1: error: no return statement in function returning non-void [-Werror=return-type] Fix this by replacing the BUG() with an -EOPNOTSUPP return. Fixes: 48debafe4f2f ("dm: add writecache target") Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
* | dm table: Remove BUG_ON(in_interrupt())Thomas Gleixner2020-12-011-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The BUG_ON(in_interrupt()) in dm_table_event() is a historic leftover from a rework of the dm table code which changed the calling context. Issuing a BUG for a wrong calling context is frowned upon and in_interrupt() is deprecated and only covering parts of the wrong contexts. The sanity check for the context is covered by CONFIG_DEBUG_ATOMIC_SLEEP and other debug facilities already. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
* | dm: fix bug with RCU locking in dm_blk_report_zonesSergei Shtepa2020-12-011-2/+4
| | | | | | | | | | | | | | | | | | | | The dm_get_live_table() function makes RCU read lock so dm_put_live_table() must be called even if dm_table map is not found. Fixes: e76239a3748c9 ("block: add a report_zones method") Cc: stable@vger.kernel.org Signed-off-by: Sergei Shtepa <sergei.shtepa@veeam.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
* | Revert "dm cache: fix arm link errors with inline"Nick Desaulniers2020-12-011-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 43aeaa29573924df76f44eda2bbd94ca36e407b5. Since commit 0bddd227f3dc ("Documentation: update for gcc 4.9 requirement") the minimum supported version of GCC is gcc-4.9. It's now safe to remove this code. Link: https://github.com/ClangBuiltLinux/linux/issues/427 Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
* | dm writecache: fix the maximum number of argumentsMikulas Patocka2020-11-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | Advance the maximum number of arguments to 16. This fixes issue where certain operations, combined with table configured args, exceed 10 arguments. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Fixes: 48debafe4f2f ("dm: add writecache target") Cc: stable@vger.kernel.org # v4.18+ Signed-off-by: Mike Snitzer <snitzer@redhat.com>
* | dm writecache: advance the number of arguments when reporting max_ageMikulas Patocka2020-11-171-0/+2
| | | | | | | | | | | | | | | | | | | | When reporting the "max_age" value the number of arguments must advance by two. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Fixes: 3923d4854e18 ("dm writecache: implement gradual cleanup") Cc: stable@vger.kernel.org # v5.7+ Signed-off-by: Mike Snitzer <snitzer@redhat.com>
* | dm integrity: don't use drivers that have CRYPTO_ALG_ALLOCATES_MEMORYMikulas Patocka2020-11-171-2/+2
|/ | | | | | | | | Don't use crypto drivers that have the flag CRYPTO_ALG_ALLOCATES_MEMORY set. These drivers allocate memory and thus they are not suitable for block I/O processing. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
* Merge tag 'for-5.10/dm-changes' of ↵Linus Torvalds2020-10-1413-393/+222
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper updates from Mike Snitzer: - Improve DM core's bio splitting to use blk_max_size_offset(). Also fix bio splitting for bios that were deferred to the worker thread due to a DM device being suspended. - Remove DM core's special handling of NVMe devices now that block core has internalized efficiencies drivers previously needed to be concerned about (via now removed direct_make_request). - Fix request-based DM to not bounce through indirect dm_submit_bio; instead have block core make direct call to blk_mq_submit_bio(). - Various DM core cleanups to simplify and improve code. - Update DM cryot to not use drivers that set CRYPTO_ALG_ALLOCATES_MEMORY. - Fix DM raid's raid1 and raid10 discard limits for the purposes of linux-stable. But then remove DM raid's discard limits settings now that MD raid can efficiently handle large discards. - A couple small cleanups across various targets. * tag 'for-5.10/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm: fix request-based DM to not bounce through indirect dm_submit_bio dm: remove special-casing of bio-based immutable singleton target on NVMe dm: export dm_copy_name_and_uuid dm: fix comment in __dm_suspend() dm: fold dm_process_bio() into dm_submit_bio() dm: fix missing imposition of queue_limits from dm_wq_work() thread dm snap persistent: simplify area_io() dm thin metadata: Remove unused local variable when create thin and snap dm raid: remove unnecessary discard limits for raid10 dm raid: fix discard limits for raid1 and raid10 dm crypt: don't use drivers that have CRYPTO_ALG_ALLOCATES_MEMORY dm: use dm_table_get_device_name() where appropriate in targets dm table: make 'struct dm_table' definition accessible to all of DM core dm: eliminate need for start_io_acct() forward declaration dm: simplify __process_abnormal_io() dm: push use of on-stack flush_bio down to __send_empty_flush() dm: optimize max_io_len() by inlining max_io_len_target_boundary() dm: push md->immutable_target optimization down to __process_bio() dm: change max_io_len() to use blk_max_size_offset() dm table: stack 'chunk_sectors' limit to account for target-specific splitting
| * dm: fix request-based DM to not bounce through indirect dm_submit_bioMike Snitzer2020-10-071-13/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is unnecessary to force request-based DM to call into bio-based dm_submit_bio (via indirect disk->fops->submit_bio) only to have it then call blk_mq_submit_bio(). Fix this by establishing a request-based DM block_device_operations (dm_rq_blk_dops, which doesn't have .submit_bio) and update dm_setup_md_queue() to set md->disk->fops to it for DM_TYPE_REQUEST_BASED. Remove DM_TYPE_REQUEST_BASED conditional in dm_submit_bio and unexport blk_mq_submit_bio. Fixes: c62b37d96b6eb ("block: move ->make_request_fn to struct block_device_operations") Signed-off-by: Mike Snitzer <snitzer@redhat.com>
| * dm: remove special-casing of bio-based immutable singleton target on NVMeMike Snitzer2020-10-072-80/+7
| | | | | | | | | | | | | | | | Since commit 5a6c35f9af416 ("block: remove direct_make_request") there is no benefit to DM special-casing NVMe. Remove all code used to establish DM_TYPE_NVME_BIO_BASED. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
| * dm: export dm_copy_name_and_uuidMike Snitzer2020-10-011-1/+1
| | | | | | | | | | | | | | Allow DM targets to access the configured name and uuid. Also, bump DM ioctl version. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
| * dm: fix comment in __dm_suspend()Mike Snitzer2020-10-011-5/+4
| | | | | | | | | | | | Fix stale references to functions that have been renamed and fix typo. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
| * dm: fold dm_process_bio() into dm_submit_bio()Mike Snitzer2020-10-011-30/+22
| | | | | | | | | | | | | | | | | | | | | | dm_process_bio() is only called by dm_submit_bio(), there is no benefit to keeping dm_process_bio() factored out, so fold it. While at it, cleanup dm_submit_bio()'s DMF_BLOCK_IO_FOR_SUSPEND related branching and expand scope of dm_get_live_table() rcu reference on map via common 'out' label to dm_put_live_table(). Signed-off-by: Mike Snitzer <snitzer@redhat.com>
| * dm: fix missing imposition of queue_limits from dm_wq_work() threadMike Snitzer2020-09-301-25/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a DM device was suspended when bios were issued to it, those bios would be deferred using queue_io(). Once the DM device was resumed dm_process_bio() could be called by dm_wq_work() for original bio that still needs splitting. dm_process_bio()'s check for current->bio_list (meaning call chain is within ->submit_bio) as a prerequisite for calling blk_queue_split() for "abnormal IO" would result in dm_process_bio() never imposing corresponding queue_limits (e.g. discard_granularity, discard_max_bytes, etc). Fix this by always having dm_wq_work() resubmit deferred bios using submit_bio_noacct(). Side-effect is blk_queue_split() is always called for "abnormal IO" from ->submit_bio, be it from application thread or dm_wq_work() workqueue, so proper bio splitting and depth-first bio submission is performed. For sake of clarity, remove current->bio_list check before call to blk_queue_split(). Also, remove dm_wq_work()'s use of dm_{get,put}_live_table() -- no longer needed since IO will be reissued in terms of ->submit_bio. And rename bio variable from 'c' to 'bio'. Fixes: cf9c37865557 ("dm: fix comment in dm_process_bio()") Reported-by: Jeffle Xu <jefflexu@linux.alibaba.com> Reviewed-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
| * dm snap persistent: simplify area_io()Qinglang Miao2020-09-291-9/+2
| | | | | | | | | | Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
| * dm thin metadata: Remove unused local variable when create thin and snapHuaisheng Ye2020-09-292-5/+4
| | | | | | | | | | | | | | | | | | | | The local variable disk details is not used during the creating of thin & snap devices. Remove them from dm-thin-metadata, and add pointer validity check for pointer value in btree_lookup_raw. Skip memory copy when the caller doesn't need the value. Signed-off-by: Huaisheng Ye <yehs1@lenovo.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
| * dm raid: remove unnecessary discard limits for raid10Mike Snitzer2020-09-291-11/+0
| | | | | | | | | | | | | | | | Commit bcc90d280465e ("md/raid10: improve raid10 discard request") removes raid10's inability to properly handle large discards. So eliminate associated constraint from dm-raid's raid10 support. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
| * dm raid: fix discard limits for raid1 and raid10Mike Snitzer2020-09-291-5/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Block core warned that discard_granularity was 0 for dm-raid with personality of raid1. Reason is that raid_io_hints() was incorrectly special-casing raid1 rather than raid0. But since commit 29efc390b9462 ("md/md0: optimize raid0 discard handling") even raid0 properly handles large discards. Fix raid_io_hints() by removing discard limits settings for raid1. Also, fix limits for raid10 by properly stacking underlying limits as done in blk_stack_limits(). Depends-on: 29efc390b9462 ("md/md0: optimize raid0 discard handling") Fixes: 61697a6abd24a ("dm: eliminate 'split_discard_bios' flag from DM target interface") Cc: stable@vger.kernel.org Reported-by: Zdenek Kabelac <zkabelac@redhat.com> Reported-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
| * dm crypt: don't use drivers that have CRYPTO_ALG_ALLOCATES_MEMORYMikulas Patocka2020-09-291-6/+11
| | | | | | | | | | | | | | | | | | Don't use crypto drivers that have the flag CRYPTO_ALG_ALLOCATES_MEMORY set. These drivers allocate memory and thus they are unsuitable for block I/O processing. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
| * dm: use dm_table_get_device_name() where appropriate in targetsMike Snitzer2020-09-292-10/+8
| | | | | | | | | | | | | | dm_table_get_device_name() avoids calling dm_table_get_md() followed by dm_device_name() -- saves intermediate dm_table_get_md() call. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
| * dm table: make 'struct dm_table' definition accessible to all of DM coreMike Snitzer2020-09-295-70/+61
| | | | | | | | | | | | | | | | | | Move 'struct dm_table' definition from dm-table.c to dm-core.h and update DM core to access its members directly. Helps optimize max_io_len() and other methods slightly. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
| * dm: eliminate need for start_io_acct() forward declarationMike Snitzer2020-09-291-40/+38
| | | | | | | | Signed-off-by: Mike Snitzer <snitzer@redhat.com>
| * dm: simplify __process_abnormal_io()Mike Snitzer2020-09-291-51/+17
| | | | | | | | | | | | | | Only call bio_op() once in switch statement. Also remove the excessive factoring out to one line functions. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
| * dm: push use of on-stack flush_bio down to __send_empty_flush()Mike Snitzer2020-09-291-24/+13
| | | | | | | | | | | | Eliminates duplicate code, no functional change. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
| * dm: optimize max_io_len() by inlining max_io_len_target_boundary()Mike Snitzer2020-09-291-9/+10
| | | | | | | | | | | | | | | | | | Saves redundant dm_target_offset() math. Also, reverse argument order for max_io_len() to be consistent with other similar functions. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
| * dm: push md->immutable_target optimization down to __process_bio()Mike Snitzer2020-09-291-13/+9
| | | | | | | | | | | | Also, update associated stale comment in __bind(). Signed-off-by: Mike Snitzer <snitzer@redhat.com>
| * dm: change max_io_len() to use blk_max_size_offset()Mike Snitzer2020-09-291-12/+8
| | | | | | | | | | | | | | | | Using blk_max_size_offset() enables DM core's splitting to impose ti->max_io_len (via q->limits.chunk_sectors) and also fallback to respecting q->limits.max_sectors if chunk_sectors isn't set. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
| * dm table: stack 'chunk_sectors' limit to account for target-specific splittingMike Snitzer2020-09-291-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | If target set ti->max_io_len it must be used when stacking DM device's queue_limits to establish a 'chunk_sectors' that is compatible with the IO stack. By using lcm_not_zero() care is taken to avoid blindly overriding the chunk_sectors limit stacked up by blk_stack_limits(). Depends-on: 07d098e6bbad ("block: allow 'chunk_sectors' to be non-power-of-2") Signed-off-by: Mike Snitzer <snitzer@redhat.com>
| * Merge remote-tracking branch 'jens/for-5.10/block' into dm-5.10Mike Snitzer2020-09-2913-116/+102
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | DM depends on these block 5.10 commits: 22ada802ede8 block: use lcm_not_zero() when stacking chunk_sectors 07d098e6bbad block: allow 'chunk_sectors' to be non-power-of-2 021a24460dc2 block: add QUEUE_FLAG_NOWAIT 6abc49468eea dm: add support for REQ_NOWAIT and enable it for linear target Signed-off-by: Mike Snitzer <snitzer@redhat.com>
* | \ Merge tag 'drivers-5.10-2020-10-12' of git://git.kernel.dk/linux-blockLinus Torvalds2020-10-1324-620/+1032
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block driver updates from Jens Axboe: "Here are the driver updates for 5.10. A few SCSI updates in here too, in coordination with Martin as they depend on core block changes for the shared tag bitmap. This contains: - NVMe pull requests via Christoph: - fix keep alive timer modification (Amit Engel) - order the PCI ID list more sensibly (Andy Shevchenko) - cleanup the open by controller helper (Chaitanya Kulkarni) - use an xarray for the CSE log lookup (Chaitanya Kulkarni) - support ZNS in nvmet passthrough mode (Chaitanya Kulkarni) - fix nvme_ns_report_zones (Christoph Hellwig) - add a sanity check to nvmet-fc (James Smart) - fix interrupt allocation when too many polled queues are specified (Jeffle Xu) - small nvmet-tcp optimization (Mark Wunderlich) - fix a controller refcount leak on init failure (Chaitanya Kulkarni) - misc cleanups (Chaitanya Kulkarni) - major refactoring of the scanning code (Christoph Hellwig) - MD updates via Song: - Bug fixes in bitmap code, from Zhao Heming - Fix a work queue check, from Guoqing Jiang - Fix raid5 oops with reshape, from Song Liu - Clean up unused code, from Jason Yan - Discard improvements, from Xiao Ni - raid5/6 page offset support, from Yufen Yu - Shared tag bitmap for SCSI/hisi_sas/null_blk (John, Kashyap, Hannes) - null_blk open/active zone limit support (Niklas) - Set of bcache updates (Coly, Dongsheng, Qinglang)" * tag 'drivers-5.10-2020-10-12' of git://git.kernel.dk/linux-block: (78 commits) md/raid5: fix oops during stripe resizing md/bitmap: fix memory leak of temporary bitmap md: fix the checking of wrong work queue md/bitmap: md_bitmap_get_counter returns wrong blocks md/bitmap: md_bitmap_read_sb uses wrong bitmap blocks md/raid0: remove unused function is_io_in_chunk_boundary() nvme-core: remove extra condition for vwc nvme-core: remove extra variable nvme: remove nvme_identify_ns_list nvme: refactor nvme_validate_ns nvme: move nvme_validate_ns nvme: query namespace identifiers before adding the namespace nvme: revalidate zone bitmaps in nvme_update_ns_info nvme: remove nvme_update_formats nvme: update the known admin effects nvme: set the queue limits in nvme_update_ns_info nvme: remove the 0 lba_shift check in nvme_update_ns_info nvme: clean up the check for too large logic block sizes nvme: freeze the queue over ->lba_shift updates nvme: factor out a nvme_configure_metadata helper ...
| * | | md/raid5: fix oops during stripe resizingSong Liu2020-10-081-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | KoWei reported crash during raid5 reshape: [ 1032.252932] Oops: 0002 [#1] SMP PTI [...] [ 1032.252943] RIP: 0010:memcpy_erms+0x6/0x10 [...] [ 1032.252947] RSP: 0018:ffffba1ac0c03b78 EFLAGS: 00010286 [ 1032.252949] RAX: 0000784ac0000000 RBX: ffff91bec3d09740 RCX: 0000000000001000 [ 1032.252951] RDX: 0000000000001000 RSI: ffff91be6781c000 RDI: 0000784ac0000000 [ 1032.252953] RBP: ffffba1ac0c03bd8 R08: 0000000000001000 R09: ffffba1ac0c03bf8 [ 1032.252954] R10: 0000000000000000 R11: 0000000000000000 R12: ffffba1ac0c03bf8 [ 1032.252955] R13: 0000000000001000 R14: 0000000000000000 R15: 0000000000000000 [ 1032.252958] FS: 0000000000000000(0000) GS:ffff91becf500000(0000) knlGS:0000000000000000 [ 1032.252959] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1032.252961] CR2: 0000784ac0000000 CR3: 000000031780a002 CR4: 00000000001606e0 [ 1032.252962] Call Trace: [ 1032.252969] ? async_memcpy+0x179/0x1000 [async_memcpy] [ 1032.252977] ? raid5_release_stripe+0x8e/0x110 [raid456] [ 1032.252982] handle_stripe_expansion+0x15a/0x1f0 [raid456] [ 1032.252988] handle_stripe+0x592/0x1270 [raid456] [ 1032.252993] handle_active_stripes.isra.0+0x3cb/0x5a0 [raid456] [ 1032.252999] raid5d+0x35c/0x550 [raid456] [ 1032.253002] ? schedule+0x42/0xb0 [ 1032.253006] ? schedule_timeout+0x10e/0x160 [ 1032.253011] md_thread+0x97/0x160 [ 1032.253015] ? wait_woken+0x80/0x80 [ 1032.253019] kthread+0x104/0x140 [ 1032.253022] ? md_start_sync+0x60/0x60 [ 1032.253024] ? kthread_park+0x90/0x90 [ 1032.253027] ret_from_fork+0x35/0x40 This is because cache_size_mutex was unlocked too early in resize_stripes, which races with grow_one_stripe() that grow_one_stripe() allocates a stripe with wrong pool_size. Fix this issue by unlocking cache_size_mutex after updating pool_size. Cc: <stable@vger.kernel.org> # v4.4+ Reported-by: KoWei Sung <winders@amazon.com> Signed-off-by: Song Liu <songliubraving@fb.com>
| * | | md/bitmap: fix memory leak of temporary bitmapZhao Heming2020-10-082-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Callers of get_bitmap_from_slot() are responsible to free the bitmap. Suggested-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Zhao Heming <heming.zhao@suse.com> Signed-off-by: Song Liu <songliubraving@fb.com>
| * | | md: fix the checking of wrong work queueGuoqing Jiang2020-10-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It should check md_rdev_misc_wq instead of md_misc_wq. Fixes: cc1ffe61c026 ("md: add new workqueue for delete rdev") Cc: <stable@vger.kernel.org> # v5.8+ Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com>
| * | | md/bitmap: md_bitmap_get_counter returns wrong blocksZhao Heming2020-10-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | md_bitmap_get_counter() has code: ``` if (bitmap->bp[page].hijacked || bitmap->bp[page].map == NULL) csize = ((sector_t)1) << (bitmap->chunkshift + PAGE_COUNTER_SHIFT - 1); ``` The minus 1 is wrong, this branch should report 2048 bits of space. With "-1" action, this only report 1024 bit of space. This bug code returns wrong blocks, but it doesn't inflence bitmap logic: 1. Most callers focus this function return value (the counter of offset), not the parameter blocks. 2. The bug is only triggered when hijacked is true or map is NULL. the hijacked true condition is very rare. the "map == null" only true when array is creating or resizing. 3. Even the caller gets wrong blocks, current code makes caller just to call md_bitmap_get_counter() one more time. Signed-off-by: Zhao Heming <heming.zhao@suse.com> Signed-off-by: Song Liu <songliubraving@fb.com>
| * | | md/bitmap: md_bitmap_read_sb uses wrong bitmap blocksZhao Heming2020-10-081-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The patched code is used to get chunks number, should use round-up div to replace current sector_div. The same code is in md_bitmap_resize(): ``` chunks = DIV_ROUND_UP_SECTOR_T(blocks, 1 << chunkshift); ``` Signed-off-by: Zhao Heming <heming.zhao@suse.com> Signed-off-by: Song Liu <songliubraving@fb.com>
| * | | md/raid0: remove unused function is_io_in_chunk_boundary()Jason Yan2020-10-081-17/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This function is no longger needed after commit 20d0189b1012 ("block: Introduce new bio_split()"). Signed-off-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Song Liu <songliubraving@fb.com>
| * | | bcache: remove embedded struct cache_sb from struct cache_setColy Li2020-10-0211-59/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since bcache code was merged into mainline kerrnel, each cache set only as one single cache in it. The multiple caches framework is here but the code is far from completed. Considering the multiple copies of cached data can also be stored on e.g. md raid1 devices, it is unnecessary to support multiple caches in one cache set indeed. The previous preparation patches fix the dependencies of explicitly making a cache set only have single cache. Now we don't have to maintain an embedded partial super block in struct cache_set, the in-memory super block can be directly referenced from struct cache. This patch removes the embedded struct cache_sb from struct cache_set, and fixes all locations where the superb lock was referenced from this removed super block by referencing the in-memory super block of struct cache. Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | bcache: check and set sync status on cache's in-memory super blockColy Li2020-10-024-10/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the cache's sync status is checked and set on cache set's in- memory partial super block. After removing the embedded struct cache_sb from cache set and reference cache's in-memory super block from struct cache_set, the sync status can set and check directly on cache's super block. This patch checks and sets the cache sync status directly on cache's in-memory super block. This is a preparation for later removing embedded struct cache_sb from struct cache_set. Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | bcache: remove can_attach_cache()Coly Li2020-10-021-10/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After removing the embedded struct cache_sb from struct cache_set, cache set will directly reference the in-memory super block of struct cache. It is unnecessary to compare block_size, bucket_size and nr_in_set from the identical in-memory super block in can_attach_cache(). This is a preparation patch for latter removing cache_set->sb from struct cache_set. Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>