path: root/drivers/md/dm-cache-target.c

Commit log (subject, author, date; files changed, lines -removed/+added):

* Merge tag 'for-6.10/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm (Linus Torvalds, 2024-05-21; 1 file changed, -3/+2)

  Pull device mapper fixes from Mike Snitzer:

  - Fix DM discard regressions due to DM core switching over to using
    queue_limits_set() without DM core and targets first being updated to set
    (and stack) discard limits in terms of max_hw_discard_sectors and not
    max_discard_sectors

  - Fix stable@ DM integrity discard support to set device's
    discard_granularity limit to the device's logical block size

  * tag 'for-6.10/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
    dm: always manage discard support in terms of max_hw_discard_sectors
    dm-integrity: set discard_granularity to logical block size

* dm: always manage discard support in terms of max_hw_discard_sectors (Mike Snitzer, 2024-05-20; 1 file changed, -3/+2)

  Commit 4f563a64732d ("block: add a max_user_discard_sectors queue limit")
  changed block core to set max_discard_sectors to:

    min(lim->max_hw_discard_sectors, lim->max_user_discard_sectors)

  Since commit 1c0e720228ad ("dm: use queue_limits_set") it was reported that
  dm-thinp was failing in a few fstests (generic/347 and generic/405) with the
  first WARN_ON_ONCE in dm_cell_key_has_valid_range() being reported, e.g.:

    WARNING: CPU: 1 PID: 30 at drivers/md/dm-bio-prison-v1.c:128 dm_cell_key_has_valid_range+0x3d/0x50

  blk_set_stacking_limits() sets max_user_discard_sectors to UINT_MAX, so given
  how block core now sets max_discard_sectors (detailed above) it follows that
  blk_stack_limits() stacks up the underlying device's max_hw_discard_sectors
  and max_discard_sectors is set to match it. If max_hw_discard_sectors exceeds
  dm's BIO_PRISON_MAX_RANGE, then dm_cell_key_has_valid_range() will trigger
  the warning with:

    WARN_ON_ONCE(key->block_end - key->block_begin > BIO_PRISON_MAX_RANGE)

  Aside from this warning, the discard will fail.

  Fix this and other DM issues by governing discard support in terms of
  max_hw_discard_sectors instead of max_discard_sectors.

  Reported-by: Theodore Ts'o <tytso@mit.edu>
  Fixes: 1c0e720228ad ("dm: use queue_limits_set")
  Signed-off-by: Mike Snitzer <snitzer@kernel.org>

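  As a rough illustration of the rule this fix establishes (a hedged sketch,
  not the actual dm-cache hunk; the helper name and parameters are made up),
  a stacking target now advertises only its hardware discard capability and
  lets block core derive the effective limit:

    #include <linux/blkdev.h>

    static void cache_set_discard_limits_example(struct queue_limits *limits,
                                                 unsigned int discard_block_sectors,
                                                 unsigned int max_discard_blocks)
    {
            /*
             * Advertise only the hardware capability; block core derives
             * max_discard_sectors as
             * min(max_hw_discard_sectors, max_user_discard_sectors).
             */
            limits->max_hw_discard_sectors = discard_block_sectors * max_discard_blocks;
            /* no direct assignment to limits->max_discard_sectors any more */
    }
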
* dm: use bio_list_merge_init (Christoph Hellwig, 2024-04-01; 1 file changed, -8/+4)

  Use bio_list_merge_init instead of open coding bio_list_merge and
  bio_list_init.

  Signed-off-by: Christoph Hellwig <hch@lst.de>
  Reviewed-by: Mike Snitzer <snitzer@kernel.org>
  Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
  Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
  Link: https://lore.kernel.org/r/20240328084147.2954434-4-hch@lst.de
  Signed-off-by: Jens Axboe <axboe@kernel.dk>

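  For context, a hedged before/after sketch of the pattern being replaced
  (illustrative only, with a made-up function name): bio_list_merge_init()
  merges the source list into the destination and reinitialises the source.

    #include <linux/bio.h>

    static void take_deferred_bios(struct bio_list *dst, struct bio_list *deferred)
    {
            /*
             * before, open coded:
             *      bio_list_merge(dst, deferred);
             *      bio_list_init(deferred);
             */
            bio_list_merge_init(dst, deferred);     /* merge and reinitialise the source list */
    }
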
* block: replace fmode_t with a block-specific type for block open flags (Christoph Hellwig, 2023-06-12; 1 file changed, -6/+6)

  The only overlap between the block open flags mapped into the fmode_t and
  other uses of fmode_t is FMODE_READ and FMODE_WRITE. Define a new blk_mode_t
  instead for use in blkdev_get_by_{dev,path}, ->open and ->ioctl, and stop
  abusing fmode_t.

  Signed-off-by: Christoph Hellwig <hch@lst.de>
  Acked-by: Jack Wang <jinpu.wang@ionos.com> [rnbd]
  Reviewed-by: Hannes Reinecke <hare@suse.de>
  Reviewed-by: Christian Brauner <brauner@kernel.org>
  Link: https://lore.kernel.org/r/20230608110258.189493-28-hch@lst.de
  Signed-off-by: Jens Axboe <axboe@kernel.dk>

* dm: push error reporting down to dm_register_target() (Yangtao Li, 2023-04-11; 1 file changed, -1/+0)

  Simplifies each DM target's init method by making dm_register_target()
  responsible for its error reporting (on behalf of targets).

  Signed-off-by: Yangtao Li <frank.li@vivo.com>
  Signed-off-by: Mike Snitzer <snitzer@kernel.org>

* dm cache: add cond_resched() to various workqueue loops (Mike Snitzer, 2023-02-17; 1 file changed, -0/+4)

  Otherwise, on resource-constrained systems, these workqueues may be too
  greedy.

  Signed-off-by: Mike Snitzer <snitzer@kernel.org>

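  A minimal hedged sketch of what such a loop looks like (hypothetical names,
  not the actual dm-cache code): drain a deferred bio list while yielding the
  CPU once per iteration so the worker cannot monopolise a small machine.

    #include <linux/bio.h>
    #include <linux/sched.h>

    struct example_cache {                          /* hypothetical, for illustration only */
            struct bio_list deferred_bios;
    };

    static void handle_deferred_bio(struct example_cache *c, struct bio *bio); /* hypothetical */

    static void process_deferred_bios_example(struct example_cache *c)
    {
            struct bio *bio;

            while ((bio = bio_list_pop(&c->deferred_bios))) {
                    handle_deferred_bio(c, bio);
                    cond_resched();                 /* yield between iterations */
            }
    }
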
* dm: declare variables static when sensible (Heinz Mauelshagen, 2023-02-14; 1 file changed, -1/+1)

  Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
  Signed-off-by: Mike Snitzer <snitzer@kernel.org>

* dm: fix suspect indent whitespace (Heinz Mauelshagen, 2023-02-14; 1 file changed, -1/+1)

  Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
  Signed-off-by: Mike Snitzer <snitzer@kernel.org>

* dm: add missing empty lines (Heinz Mauelshagen, 2023-02-14; 1 file changed, -1/+11)

  Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
  Signed-off-by: Mike Snitzer <snitzer@kernel.org>

* dm: avoid spaces before function arguments or in favour of tabs (Heinz Mauelshagen, 2023-02-14; 1 file changed, -6/+6)

  Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
  Signed-off-by: Mike Snitzer <snitzer@kernel.org>

* dm: correct block comments format (Heinz Mauelshagen, 2023-02-14; 1 file changed, -21/+37)

  Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
  Signed-off-by: Mike Snitzer <snitzer@kernel.org>

* dm: change "unsigned" to "unsigned int" (Heinz Mauelshagen, 2023-02-14; 1 file changed, -25/+25)

  Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
  Signed-off-by: Mike Snitzer <snitzer@kernel.org>

* dm: add missing SPDX-License-Identifiers (Heinz Mauelshagen, 2023-02-14; 1 file changed, -0/+1)

  'GPL-2.0-only' is used instead of 'GPL-2.0' because SPDX has deprecated its
  use.

  Suggested-by: John Wiele <jwiele@redhat.com>
  Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
  Signed-off-by: Mike Snitzer <snitzer@kernel.org>

* dm cache: set needs_check flag after aborting metadata (Mike Snitzer, 2022-12-01; 1 file changed, -5/+5)

  Otherwise the commit that is about to be aborted will be associated with the
  metadata objects that are about to be torn down. The needs_check flag must be
  written to metadata with a reset block manager.

  Found through code inspection (and compared against dm-thin.c).

  Cc: stable@vger.kernel.org
  Fixes: 028ae9f76f29 ("dm cache: add fail io mode and needs_check flag")
  Signed-off-by: Mike Snitzer <snitzer@kernel.org>

* dm cache: Fix UAF in destroy() (Luo Meng, 2022-11-30; 1 file changed, -0/+1)

  dm-cache has the same use-after-free problem when dm_resume() and
  dm_destroy() are concurrent. Therefore, cancel the timer again in destroy().

  Cc: stable@vger.kernel.org
  Fixes: c6b4fcbad044e ("dm: add cache target")
  Signed-off-by: Luo Meng <luomeng12@huawei.com>
  Signed-off-by: Mike Snitzer <snitzer@kernel.org>

* dm cache: fix typo in 2 comment blocks (Steven Lung, 2022-07-07; 1 file changed, -1/+1)

  Replace "neccessarily" with "necessarily".

  Signed-off-by: Steven Lung <1030steven@gmail.com>
  Signed-off-by: Mike Snitzer <snitzer@kernel.org>

* block: remove QUEUE_FLAG_DISCARD (Christoph Hellwig, 2022-04-17; 1 file changed, -8/+1)

  Just use a non-zero max_discard_sectors as an indicator for discard support,
  similar to what is done for write zeroes.

  The only place that needs special attention is the RAID5 driver, which must
  clear discard support for security reasons by default, even if the default
  stacking rules would allow for it.

  Signed-off-by: Christoph Hellwig <hch@lst.de>
  Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
  Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> [drbd]
  Acked-by: Jan Höppner <hoeppner@linux.ibm.com> [s390]
  Acked-by: Coly Li <colyli@suse.de> [bcache]
  Acked-by: David Sterba <dsterba@suse.com> [btrfs]
  Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
  Link: https://lore.kernel.org/r/20220415045258.199825-25-hch@lst.de
  Signed-off-by: Jens Axboe <axboe@kernel.dk>

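  In concrete terms (a hedged sketch, not an actual hunk; the helper name is
  made up), a driver that used to test the queue flag now just checks whether
  the device advertises a non-zero discard limit:

    #include <linux/blkdev.h>

    static bool dev_supports_discard(struct block_device *bdev)
    {
            /* before: blk_queue_discard(bdev_get_queue(bdev)) tested QUEUE_FLAG_DISCARD */
            return bdev_max_discard_sectors(bdev) > 0;
    }
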
* dm cache: use dm_submit_bio_remap (Mike Snitzer, 2022-03-10; 1 file changed, -3/+4)

  Signed-off-by: Mike Snitzer <snitzer@redhat.com>

* dm: stop using bdevname (Christoph Hellwig, 2022-03-02; 1 file changed, -6/+4)

  Just use the %pg format specifier instead.

  Signed-off-by: Christoph Hellwig <hch@lst.de>
  Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
  Signed-off-by: Mike Snitzer <snitzer@redhat.com>

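  A hedged illustration of the %pg change (made-up message, prefix and helper
  names; not the actual dm-cache hunk): the printk family can print a
  block_device's name directly, which removes the on-stack name buffer.

    #define DM_MSG_PREFIX "cache-example"   /* hypothetical prefix for DMERR() */
    #include <linux/device-mapper.h>

    static void complain_about_size(struct block_device *bdev)
    {
            /*
             * before:
             *      char b[BDEVNAME_SIZE];
             *      DMERR("device %s is too small", bdevname(bdev, b));
             */
            DMERR("device %pg is too small", bdev);
    }
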
* block: pass a block_device to bio_clone_fast (Christoph Hellwig, 2022-02-04; 1 file changed, -2/+2)

  Pass a block_device to bio_clone_fast and __bio_clone_fast and give the
  functions more suitable names.

  Signed-off-by: Christoph Hellwig <hch@lst.de>
  Reviewed-by: Mike Snitzer <snitzer@redhat.com>
  Link: https://lore.kernel.org/r/20220202160109.108149-14-hch@lst.de
  Signed-off-by: Jens Axboe <axboe@kernel.dk>

* dm-cache: remove __remap_to_origin_clear_discard (Christoph Hellwig, 2022-02-04; 1 file changed, -16/+8)

  Fold __remap_to_origin_clear_discard into the two callers to prepare for bio
  cloning refactoring.

  Signed-off-by: Christoph Hellwig <hch@lst.de>
  Reviewed-by: Mike Snitzer <snitzer@redhat.com>
  Link: https://lore.kernel.org/r/20220202160109.108149-10-hch@lst.de
  Signed-off-by: Jens Axboe <axboe@kernel.dk>

* dm: use bdev_nr_sectors and bdev_nr_bytes instead of open coding them (Christoph Hellwig, 2021-10-18; 1 file changed, -1/+1)

  Use the proper helpers to read the block device size.

  Signed-off-by: Christoph Hellwig <hch@lst.de>
  Reviewed-by: Kees Cook <keescook@chromium.org>
  Acked-by: Mike Snitzer <snitzer@redhat.com>
  Link: https://lore.kernel.org/r/20211018101130.1838532-6-hch@lst.de
  Signed-off-by: Jens Axboe <axboe@kernel.dk>

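  A hedged sketch of the kind of open coding these helpers replace
  (illustrative only; DM targets commonly wrap this in a small
  get_dev_size()-style helper):

    #include <linux/blkdev.h>

    static sector_t origin_dev_size(struct block_device *bdev)
    {
            /* before: i_size_read(bdev->bd_inode) >> SECTOR_SHIFT */
            return bdev_nr_sectors(bdev);
    }
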
* dm: update target status functions to support IMA measurement (Tushar Sugandhi, 2021-08-10; 1 file changed, -0/+24)

  For device mapper targets to take advantage of IMA's measurement
  capabilities, the status functions for the individual targets need to be
  updated to handle the status_type_t case for value STATUSTYPE_IMA.

  Update the status functions for the following target types to log their
  respective attributes to be measured using IMA: cache, crypt, integrity,
  linear, mirror, multipath, raid, snapshot, striped, and verity.

  For the rest of the targets, handle the STATUSTYPE_IMA case by setting the
  measurement buffer to NULL.

  For IMA to measure the data on a given system, the IMA policy on the system
  needs to contain the following line in /etc/ima/ima-policy, and the system
  needs to be restarted for the measurements to take effect:

    measure func=CRITICAL_DATA label=device-mapper template=ima-buf

  The measurements will be reflected in the IMA logs, which are located at:

    /sys/kernel/security/integrity/ima/ascii_runtime_measurements
    /sys/kernel/security/integrity/ima/binary_runtime_measurements

  These IMA logs can later be consumed by various attestation clients running
  on the system, which can send them to external services for attesting the
  system.

  The DM target data measured by the IMA subsystem can alternatively be
  queried from userspace by setting DM_IMA_MEASUREMENT_FLAG with
  DM_TABLE_STATUS_CMD.

  Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
  Signed-off-by: Mike Snitzer <snitzer@redhat.com>

* dm io tracker: factor out IO tracker (Mike Snitzer, 2021-06-25; 1 file changed, -76/+6)

  Allow other code to use dm_io_tracker.

  Signed-off-by: Mike Snitzer <snitzer@redhat.com>

* dm cache: remove needless request_queue NULL pointer checks (Xu Wang, 2021-03-26; 1 file changed, -1/+1)

  Since commit ff9ea323816d ("block, bdi: an active gendisk always has a
  request_queue associated with it") the request_queue pointer returned from
  bdev_get_queue() shall never be NULL.

  Signed-off-by: Xu Wang <vulab@iscas.ac.cn>
  Signed-off-by: Mike Snitzer <snitzer@redhat.com>

* dm cache: simplify the return expression of load_mapping() (Zheng Yongjun, 2020-12-22; 1 file changed, -6/+1)

  Simplify the return expression.

  Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
  Signed-off-by: Mike Snitzer <snitzer@redhat.com>

* Revert "dm cache: fix arm link errors with inline" (Nick Desaulniers, 2020-12-01; 1 file changed, -4/+0)

  This reverts commit 43aeaa29573924df76f44eda2bbd94ca36e407b5.

  Since commit 0bddd227f3dc ("Documentation: update for gcc 4.9 requirement")
  the minimum supported version of GCC is gcc-4.9. It's now safe to remove
  this code.

  Link: https://github.com/ClangBuiltLinux/linux/issues/427
  Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
  Acked-by: Mikulas Patocka <mpatocka@redhat.com>
  Signed-off-by: Mike Snitzer <snitzer@redhat.com>

* dm: use dm_table_get_device_name() where appropriate in targets (Mike Snitzer, 2020-09-29; 1 file changed, -1/+1)

  dm_table_get_device_name() avoids calling dm_table_get_md() followed by
  dm_device_name() -- it saves the intermediate dm_table_get_md() call.

  Signed-off-by: Mike Snitzer <snitzer@redhat.com>

* writeback: remove bdi->congested_fn (Christoph Hellwig, 2020-07-08; 1 file changed, -19/+0)

  Except for pktdvd, the only places setting congested bits are file systems
  that allocate their own backing_dev_info structures. And pktdvd is a
  deprecated driver that isn't useful in a stacked setup either. So remove the
  dead congested_fn stacking infrastructure.

  Signed-off-by: Christoph Hellwig <hch@lst.de>
  Acked-by: Song Liu <song@kernel.org>
  Acked-by: David Sterba <dsterba@suse.com>
  [axboe: fixup unused variables in bcache/request.c]
  Signed-off-by: Jens Axboe <axboe@kernel.dk>

* block: rename generic_make_request to submit_bio_noacct (Christoph Hellwig, 2020-07-01; 1 file changed, -3/+3)

  generic_make_request has always been very confusingly misnamed, so rename it
  to submit_bio_noacct to make it clear that it is submit_bio minus accounting
  and a few checks.

  Signed-off-by: Christoph Hellwig <hch@lst.de>
  Signed-off-by: Jens Axboe <axboe@kernel.dk>

* dm: bump version of core and various targets (Mike Snitzer, 2020-03-03; 1 file changed, -1/+1)

  Changes made during the 5.6 cycle warrant bumping the version number for DM
  core and the targets modified by this commit. It should be noted that
  dm-thin, dm-crypt and dm-raid already had their target version bumped during
  the 5.6 merge window.

  Signed-off-by: Mike Snitzer <snitzer@redhat.com>

* dm cache: fix a crash due to incorrect work item cancelling (Mikulas Patocka, 2020-02-27; 1 file changed, -2/+2)

  The crash can be reproduced by running the lvm2 testsuite test
  lvconvert-thin-external-cache.sh for several minutes, e.g.:

    while :; do make check T=shell/lvconvert-thin-external-cache.sh; done

  The crash happens in this call chain:

    do_waker -> policy_tick -> smq_tick -> end_hotspot_period -> clear_bitset
      -> memset -> __memset

  which accesses an invalid pointer in the vmalloc area.

  The work entry on the workqueue is executed even after the bitmap was freed.
  The problem is that cancel_delayed_work doesn't wait for the running work
  item to finish, so the work item can continue running and re-submitting
  itself even after cache_postsuspend. In order to make sure that the work
  item won't be running, we must use cancel_delayed_work_sync.

  Also, change flush_workqueue to drain_workqueue, so that if some work item
  submits itself or another work item, we are properly waiting for both of
  them.

  Fixes: c6b4fcbad044 ("dm: add cache target")
  Cc: stable@vger.kernel.org # v3.9
  Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
  Signed-off-by: Mike Snitzer <snitzer@redhat.com>

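  A hedged sketch of the shutdown ordering described above (hypothetical
  structure and field names, not the actual dm-cache code): the _sync cancel
  waits for a running, self-re-arming delayed work item, and drain_workqueue()
  also waits for any work queued by the items being drained.

    #include <linux/workqueue.h>

    struct example_cache {                          /* hypothetical, for illustration only */
            struct delayed_work waker;              /* periodically re-arms itself */
            struct workqueue_struct *wq;
    };

    static void shut_down_background_work(struct example_cache *c)
    {
            /* cancel_delayed_work() could return while the handler was still
             * running and re-queueing itself; the _sync variant waits for it */
            cancel_delayed_work_sync(&c->waker);

            /* flush_workqueue() does not wait for work submitted by the items
             * it flushes; drain_workqueue() does */
            drain_workqueue(c->wq);
    }
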
* dm cache: replace spin_lock_irqsave with spin_lock_irq (Mikulas Patocka, 2019-11-05; 1 file changed, -49/+28)

  If we are in a place where it is known that interrupts are enabled, the
  functions spin_lock_irq/spin_unlock_irq should be used instead of
  spin_lock_irqsave/spin_unlock_irqrestore.

  spin_lock_irq and spin_unlock_irq are faster because they don't need to push
  and pop the flags register.

  Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
  Signed-off-by: Mike Snitzer <snitzer@redhat.com>

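  For illustration, a hedged before/after sketch of the pattern (structure and
  field names are assumptions modelled on dm-cache, not the exact hunk):

    #include <linux/bio.h>
    #include <linux/spinlock.h>

    struct example_cache {                          /* hypothetical, for illustration only */
            spinlock_t lock;
            struct bio_list deferred_bios;
    };

    static void defer_bio_example(struct example_cache *c, struct bio *bio)
    {
            /*
             * before:
             *      unsigned long flags;
             *      spin_lock_irqsave(&c->lock, flags);
             *      bio_list_add(&c->deferred_bios, bio);
             *      spin_unlock_irqrestore(&c->lock, flags);
             */
            spin_lock_irq(&c->lock);        /* caller is known to run with interrupts enabled */
            bio_list_add(&c->deferred_bios, bio);
            spin_unlock_irq(&c->lock);
    }
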
* dm cache: fix bugs when a GFP_NOWAIT allocation fails (Mikulas Patocka, 2019-10-17; 1 file changed, -26/+2)

  A GFP_NOWAIT allocation can fail anytime - it doesn't wait for memory to
  become available, and it fails if the mempool is exhausted and there is not
  enough memory.

  If we go down this path:

    map_bio -> mg_start -> alloc_migration -> mempool_alloc(GFP_NOWAIT)

  we can see that map_bio() doesn't check the return value of mg_start(), and
  the bio is leaked.

  If we go down this path:

    map_bio -> mg_start -> mg_lock_writes -> alloc_prison_cell ->
      dm_bio_prison_alloc_cell_v2 -> mempool_alloc(GFP_NOWAIT) ->
      mg_lock_writes -> mg_complete

  the bio is ended with an error - this is unacceptable because it could cause
  filesystem corruption if the machine ran out of memory temporarily.

  Change GFP_NOWAIT to GFP_NOIO, so that the mempool code will properly wait
  until memory becomes available. mempool_alloc with GFP_NOIO can't fail, so
  remove the code paths that deal with allocation failure.

  Cc: stable@vger.kernel.org
  Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
  Signed-off-by: Mike Snitzer <snitzer@redhat.com>

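  A hedged sketch of the resulting allocation path (structure and field names
  are stand-ins loosely modelled on dm-cache's migration mempool): with
  GFP_NOIO the mempool call may sleep but is guaranteed to return an object,
  so the NULL handling disappears.

    #include <linux/mempool.h>
    #include <linux/string.h>

    struct example_cache {                          /* hypothetical, for illustration only */
            mempool_t migration_pool;
    };

    struct example_migration {
            struct example_cache *cache;
    };

    static struct example_migration *alloc_migration_example(struct example_cache *cache)
    {
            struct example_migration *mg;

            /* before: mempool_alloc(..., GFP_NOWAIT) could return NULL and every
             * caller had to cope with (or, as it turned out, leak on) failure */
            mg = mempool_alloc(&cache->migration_pool, GFP_NOIO);
            memset(mg, 0, sizeof(*mg));
            mg->cache = cache;

            return mg;
    }
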
* dm cache: add support for discard passdown to the origin device (Mike Snitzer, 2019-03-05; 1 file changed, -26/+100)

  DM cache now defaults to passing discards down to the origin device. The
  user may disable this using the "no_discard_passdown" feature when creating
  the cache device.

  If the cache's underlying origin device doesn't support discards then
  passdown is disabled (with a warning). Similarly, if the underlying origin
  device's max_discard_sectors is less than a cache block, discard passdown
  will be disabled (this is required because sizing of the cache's internal
  discard bitset depends on it).

  Signed-off-by: Mike Snitzer <snitzer@redhat.com>

* dm: eliminate 'split_discard_bios' flag from DM target interface (Mike Snitzer, 2019-02-20; 1 file changed, -1/+0)

  There is no need to have DM core split discards on behalf of a DM target now
  that blk_queue_split() handles splitting discards based on the queue_limits.
  A DM target just needs to set max_discard_sectors, discard_granularity, etc,
  in queue_limits.

  Signed-off-by: Mike Snitzer <snitzer@redhat.com>

* dm cache: destroy migration_cache if cache target registration failed (Shenghui Wang, 2018-10-09; 1 file changed, -3/+2)

  Commit 7e6358d244e47 ("dm: fix various targets to dm_register_target after
  module __init resources created") inadvertently introduced this bug when it
  moved dm_register_target() after the call to KMEM_CACHE().

  Fixes: 7e6358d244e47 ("dm: fix various targets to dm_register_target after module __init resources created")
  Cc: stable@vger.kernel.org
  Signed-off-by: Shenghui Wang <shhuiw@foxmail.com>
  Signed-off-by: Mike Snitzer <snitzer@redhat.com>

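  A hedged sketch of the module init ordering these fixes converge on
  (hypothetical names; treat the details as an assumption, not the actual
  dm-cache-target.c code): create the __init-time kmem cache first, register
  the target last, and destroy the cache again if registration fails.

    #include <linux/device-mapper.h>
    #include <linux/init.h>
    #include <linux/slab.h>

    struct example_migration { int dummy; };        /* stand-in for the real migration struct */

    static struct kmem_cache *migration_cache;
    static struct target_type cache_target;         /* assumed to be populated elsewhere */

    static int __init dm_cache_init_example(void)
    {
            int r;

            /* create every __init-time resource first ... */
            migration_cache = KMEM_CACHE(example_migration, 0);
            if (!migration_cache)
                    return -ENOMEM;

            /* ... and register the target last, undoing the allocation on failure */
            r = dm_register_target(&cache_target);
            if (r) {
                    kmem_cache_destroy(migration_cache);
                    return r;
            }

            return 0;
    }
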
* dm cache: fix resize crash if user doesn't reload cache table (Mike Snitzer, 2018-10-04; 1 file changed, -2/+7)

  A reload of the cache's DM table is needed during resize because otherwise a
  crash will occur when attempting to access smq policy entries associated
  with the portion of the cache that was recently extended.

  The reason is that cache-size based data structures in the policy will not
  be resized; the only way to safely extend the cache is to allow for a proper
  cache policy initialization that occurs when the cache table is loaded. For
  example, the smq policy's space_init(), init_allocator(), and
  calc_hotspot_params() must be sized based on the extended cache size.

  The fix for this is to disallow cache resizes of this pattern:
    1) suspend the "cache" target's device
    2) resize the fast device used for the cache
    3) resume the "cache" target's device

  Instead, the last step must be a full reload of the cache's DM table.

  Fixes: 66a636356 ("dm cache: add stochastic-multi-queue (smq) policy")
  Cc: stable@vger.kernel.org
  Signed-off-by: Mike Snitzer <snitzer@redhat.com>

* dm kcopyd: return void from dm_kcopyd_copy() (Mike Snitzer, 2018-07-31; 1 file changed, -12/+4)

  dm_kcopyd_copy() only ever returns 0, so there is no need for callers to
  account for possible failure. The same goes for dm_kcopyd_zero().

  Signed-off-by: Mike Snitzer <snitzer@redhat.com>

* dm cache: only allow a single io_mode cache feature to be requested (John Pittman, 2018-07-27; 1 file changed, -4/+15)

  More than one io_mode feature can be requested when creating a dm cache
  device (as is: the last one wins). The io_mode selections are incompatible
  with one another; we should force them to be selected exclusively. Add a
  counter to check for more than one io_mode selection.

  Fixes: 629d0a8a1a10 ("dm cache metadata: add "metadata2" feature")
  Signed-off-by: John Pittman <jpittman@redhat.com>
  Signed-off-by: Mike Snitzer <snitzer@redhat.com>

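  A hedged sketch of the guard this adds (the function shape, enum names and
  error string are illustrative stand-ins, not the actual dm-cache parsing
  code): count the io_mode selections on the table line and reject more than
  one.

    #include <linux/errno.h>
    #include <linux/string.h>

    enum example_io_mode { IO_WRITEBACK, IO_WRITETHROUGH, IO_PASSTHROUGH };

    static int parse_io_mode_example(unsigned int argc, const char **argv,
                                     enum example_io_mode *mode, const char **error)
    {
            unsigned int i, mode_ctr = 0;

            for (i = 0; i < argc; i++) {
                    if (!strcasecmp(argv[i], "writeback")) {
                            *mode = IO_WRITEBACK;
                            mode_ctr++;
                    } else if (!strcasecmp(argv[i], "writethrough")) {
                            *mode = IO_WRITETHROUGH;
                            mode_ctr++;
                    } else if (!strcasecmp(argv[i], "passthrough")) {
                            *mode = IO_PASSTHROUGH;
                            mode_ctr++;
                    }
            }

            if (mode_ctr > 1) {
                    *error = "Only one of writeback, writethrough and passthrough may be requested";
                    return -EINVAL;
            }

            return 0;
    }
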
* dm: adjust structure members to improve alignment (Mike Snitzer, 2018-06-08; 1 file changed, -29/+32)

  Eliminate most holes in DM data structures that were modified by commit
  6f1c819c21 ("dm: convert to bioset_init()/mempool_init()"). Also prevent
  structure members from unnecessarily spanning cache lines.

  Signed-off-by: Mike Snitzer <snitzer@redhat.com>

* dm: convert to bioset_init()/mempool_init() (Kent Overstreet, 2018-05-30; 1 file changed, -13/+12)

  Convert dm to embedded bio sets.

  Acked-by: Mike Snitzer <snitzer@redhat.com>
  Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
  Signed-off-by: Jens Axboe <axboe@kernel.dk>

* dm: allow targets to return output from messages they are sent (Mike Snitzer, 2018-04-03; 1 file changed, -1/+2)

  This could be useful for a target to return stats or other information. If a
  target does DMEMIT() anything to @result from its .message method then it
  must return 1 to the caller.

  Signed-off-by: Mike Snitzer <snitzer@redhat.com>

* dm: fix various targets to dm_register_target after module __init resources created (monty_pavel@sina.com, 2017-12-04; 1 file changed, -6/+6)

  A NULL pointer is seen if two concurrent "vgchange -ay -K <vg name>"
  processes race to load the dm-thin-pool module:

    PID: 25992  TASK: ffff883cd7d23500  CPU: 4  COMMAND: "vgchange"
     #0 [ffff883cd743d600] machine_kexec at ffffffff81038fa9
     #1 [ffff883cd743d660] crash_kexec at ffffffff810c5992
     #2 [ffff883cd743d730] oops_end at ffffffff81515c90
     #3 [ffff883cd743d760] no_context at ffffffff81049f1b
     #4 [ffff883cd743d7b0] __bad_area_nosemaphore at ffffffff8104a1a5
     #5 [ffff883cd743d800] bad_area at ffffffff8104a2ce
     #6 [ffff883cd743d830] __do_page_fault at ffffffff8104aa6f
     #7 [ffff883cd743d950] do_page_fault at ffffffff81517bae
     #8 [ffff883cd743d980] page_fault at ffffffff81514f95
        [exception RIP: kmem_cache_alloc+108]
        RIP: ffffffff8116ef3c  RSP: ffff883cd743da38  RFLAGS: 00010046
        RAX: 0000000000000004  RBX: ffffffff81121b90  RCX: ffff881bf1e78cc0
        RDX: 0000000000000000  RSI: 00000000000000d0  RDI: 0000000000000000
        RBP: ffff883cd743da68  R8:  ffff881bf1a4eb00  R9:  0000000080042000
        R10: 0000000000002000  R11: 0000000000000000  R12: 00000000000000d0
        R13: 0000000000000000  R14: 00000000000000d0  R15: 0000000000000246
        ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
     #9 [ffff883cd743da70] mempool_alloc_slab at ffffffff81121ba5
    #10 [ffff883cd743da80] mempool_create_node at ffffffff81122083
    #11 [ffff883cd743dad0] mempool_create at ffffffff811220f4
    #12 [ffff883cd743dae0] pool_ctr at ffffffffa08de049 [dm_thin_pool]
    #13 [ffff883cd743dbd0] dm_table_add_target at ffffffffa0005f2f [dm_mod]
    #14 [ffff883cd743dc30] table_load at ffffffffa0008ba9 [dm_mod]
    #15 [ffff883cd743dc90] ctl_ioctl at ffffffffa0009dc4 [dm_mod]

  The race results in a NULL pointer because:

  Process A (vgchange -ay -K):
    a. sends the DM_LIST_VERSIONS_CMD ioctl;
    b. pool_target is not registered;
    c. modprobes dm_thin_pool and waits until it finishes.

  Process B (vgchange -ay -K):
    a. sends the DM_LIST_VERSIONS_CMD ioctl;
    b. pool_target is registered;
    c. table_load -> dm_table_add_target -> pool_ctr;
    d. _new_mapping_cache is NULL and panics.

  Note:
    1. process A and process B are two concurrent processes.
    2. pool_target can be detected by process B but _new_mapping_cache
       initialization has not finished.

  To fix dm-thin-pool, and other targets (cache, multipath, and snapshot) with
  the same problem, simply call dm_register_target() after all resources
  created during module init (as labelled with __init) are finished.

  Cc: stable@vger.kernel.org
  Signed-off-by: monty <monty_pavel@sina.com>
  Signed-off-by: Mike Snitzer <snitzer@redhat.com>

* dm cache: lift common migration preparation code to alloc_migration() (Mike Snitzer, 2017-11-10; 1 file changed, -10/+7)

  Signed-off-by: Mike Snitzer <snitzer@redhat.com>

* dm cache: remove unused deferred_cells member from struct cache (Joe Thornber, 2017-11-10; 1 file changed, -2/+0)

  Signed-off-by: Joe Thornber <ejt@redhat.com>
  Signed-off-by: Mike Snitzer <snitzer@redhat.com>

* dm cache: simplify get_per_bio_data() by removing data_size argument (Mike Snitzer, 2017-11-10; 1 file changed, -39/+22)

  There is only one per_bio_data size now that writethrough-specific data was
  removed from the per_bio_data structure.

  Signed-off-by: Mike Snitzer <snitzer@redhat.com>

* dm cache: remove all obsolete writethrough-specific code (Mike Snitzer, 2017-11-10; 1 file changed, -81/+1)

  Now that the writethrough code is much simpler there is no need to track so
  much state or cascade bio submission (as was done, via writethrough_endio(),
  to issue origin then cache IO in series). As such, the obsolete writethrough
  list and workqueue are also removed.

  Signed-off-by: Mike Snitzer <snitzer@redhat.com>

* dm cache: submit writethrough writes in parallel to origin and cache (Mike Snitzer, 2017-11-10; 1 file changed, -17/+37)

  Discontinue issuing writethrough write IO in series to the origin and then
  cache.

  Use bio_clone_fast() to create a new origin clone bio that will be mapped to
  the origin device and then bio_chain() it to the bio that gets remapped to
  the cache device. The origin clone bio does _not_ have a copy of the
  per_bio_data -- as such check_if_tick_bio_needed() will not be called.

  The cache bio (parent bio) will not complete until the origin bio has
  completed -- this fulfills bio_clone_fast()'s requirements as well as the
  requirement to not complete the original IO until the write IO has completed
  to both the origin and cache device.

  Signed-off-by: Mike Snitzer <snitzer@redhat.com>

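  A hedged sketch of that submission path, written against the bio API of that
  era (bio_clone_fast() was later renamed; the structure, field and helper
  names here are assumptions, not the actual dm-cache code):

    #include <linux/bio.h>

    struct example_cache {                          /* hypothetical, for illustration only */
            struct bio_set *bs;
            struct block_device *origin_bdev;
            struct block_device *cache_bdev;
    };

    static void writethrough_write_example(struct example_cache *c, struct bio *bio)
    {
            /* clone the write for the origin device */
            struct bio *origin_bio = bio_clone_fast(bio, GFP_NOIO, c->bs);

            /* the parent (cache-bound) bio completes only after origin_bio does */
            bio_chain(origin_bio, bio);

            bio_set_dev(origin_bio, c->origin_bdev);
            submit_bio(origin_bio);

            /* the parent bio carries on to the cache device */
            bio_set_dev(bio, c->cache_bdev);
    }
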
* dm cache: pass cache structure to mode functions (Mike Snitzer, 2017-11-10; 1 file changed, -16/+16)

  No functional changes, just a bit cleaner than passing cache_features
  structure.

  Signed-off-by: Mike Snitzer <snitzer@redhat.com>