summaryrefslogtreecommitdiffstats
path: root/drivers/md
Commit message (Collapse)AuthorAgeFilesLines
...
| * | | md/md-linear: cleanup linear_add()Yu Kuai2023-10-101-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that caller already suspend the array, there is no need to suspend array in liner_add(). Note that mddev_suspend/resume() is not used anymore. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20231010151958.145896-16-yukuai1@huaweicloud.com
| * | | md: cleanup mddev_create/destroy_serial_pool()Yu Kuai2023-10-103-31/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that except for stopping the array, all the callers already suspend the array, there is no need to suspend anymore, hence remove the second parameter. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20231010151958.145896-15-yukuai1@huaweicloud.com
| * | | md: use new apis to suspend array before mddev_create/destroy_serial_poolYu Kuai2023-10-103-16/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mddev_create/destroy_serial_pool() will be called from several places where mddev_suspend() will be called later. Prepare to remove the mddev_suspend() from mddev_create/destroy_serial_pool(). Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20231010151958.145896-14-yukuai1@huaweicloud.com
| * | | md: use new apis to suspend array for ioctls involed array reconfigurationYu Kuai2023-10-101-10/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'reconfig_mutex' will be grabbed before these ioctls, suspend array before holding the lock, so that io won't concurrent with array reconfiguration through ioctls. This is not hot path, so performance is not concerned. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20231010151958.145896-13-yukuai1@huaweicloud.com
| * | | md: use new apis to suspend array for adding/removing rdev from state_store()Yu Kuai2023-10-101-8/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | User can write 'remove' and 're-add' to trigger array reconfiguration through sysfs, suspend array in this case so that io won't concurrent with array reconfiguration. And now that all the caller of add_bound_rdev() alread suspend the array, remove mddev_suspend/resume() from add_bound_rdev() as well. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20231010151958.145896-12-yukuai1@huaweicloud.com
| * | | md: use new apis to suspend array for sysfs apisYu Kuai2023-10-101-16/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert to use new apis in following sysfs apis: - level_store - suspend_lo_store - suspend_hi_store - serialize_policy_store These are not hot path, so performance is not concerned. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20231010151958.145896-11-yukuai1@huaweicloud.com
| * | | md/raid5: use new apis to suspend arrayYu Kuai2023-10-101-26/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert to use new apis, the old apis will be removed eventually. These are not hot path, so performance is not concerned. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20231010151958.145896-10-yukuai1@huaweicloud.com
| * | | md/raid5-cache: use new apis to suspend arrayYu Kuai2023-10-101-11/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert to use new apis, the old apis will be removed eventually. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20231010151958.145896-9-yukuai1@huaweicloud.com
| * | | md/md-bitmap: use new apis to suspend array for location_store()Yu Kuai2023-10-101-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert to use new apis, the old apis will be removed eventually. This is not hot path, so performance is not concerned. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20231010151958.145896-8-yukuai1@huaweicloud.com
| * | | md/dm-raid: use new apis to suspend arrayYu Kuai2023-10-101-8/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert to use new apis, the old apis will be removed eventually. These are not hot path, so performance is not concerned. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20231010151958.145896-7-yukuai1@huaweicloud.com
| * | | md: add new helpers to suspend/resume and lock/unlock arrayYu Kuai2023-10-101-0/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The new helpers suspend the array first and then lock the array, Prepare to refactor from: mddev_lock/lock_nointr mddev_suspend ... mddev_resuem mddev_lock With: mddev_suspend_and_lock/lock_nointr ... mddev_unlock_and_resume After all the use cases is refactored, mddev_suspend/resume() will be removed. And mddev_suspend_and_lock() will also replace mddev_lock() for the case that the array will be reconfigured, in order to synchronize with io to prevent problems in many corner cases. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20231010151958.145896-6-yukuai1@huaweicloud.com
| * | | md: add new helpers to suspend/resume arrayYu Kuai2023-10-102-2/+103
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Advantages for new apis: - reconfig_mutex is not required; - the weird logical that suspend array hold 'reconfig_mutex' for mddev_check_recovery() to update superblock is not needed; - the specail handling, 'pers->prepare_suspend', for raid456 is not needed; - It's safe to be called at any time once mddev is allocated, and it's designed to be used from slow path where array configuration is changed; - the new helpers is designed to be called before mddev_lock(), hence it support to be interrupted by user as well. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20231010151958.145896-5-yukuai1@huaweicloud.com
| * | | md: replace is_md_suspended() with 'mddev->suspended' in md_check_recovery()Yu Kuai2023-10-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prepare to cleanup pers->prepare_suspend(), which is used to fix a deadlock in raid456 by returning error for io that is waiting for reshape to make progress in mddev_suspend(). This change will allow reshape to make progress while waiting for io to be done in mddev_suspend() in following patches. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20231010151958.145896-4-yukuai1@huaweicloud.com
| * | | md/raid5-cache: use READ_ONCE/WRITE_ONCE for 'conf->log'Yu Kuai2023-10-101-22/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'conf->log' is set with 'reconfig_mutex' grabbed, however, readers are not procted, hence protect it with READ_ONCE/WRITE_ONCE to prevent reading abnormal values. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20231010151958.145896-3-yukuai1@huaweicloud.com
| * | | md: use READ_ONCE/WRITE_ONCE for 'suspend_lo' and 'suspend_hi'Yu Kuai2023-10-101-7/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Protect 'suspend_lo' and 'suspend_hi' with READ_ONCE/WRITE_ONCE to prevent reading abnormal values. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20231010151958.145896-2-yukuai1@huaweicloud.com
| * | | md/raid1: don't split discard io for write behindYu Kuai2023-10-091-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, discad io is treated the same as normal write io, and for write behind case, io size is limited to: BIO_MAX_VECS * (PAGE_SIZE >> 9) For 0.5KB sector size and 4KB PAGE_SIZE, this is just 1MB. For consequence, if 'WriteMostly' is set to one of the underlying disks, then diskcard io will be splited into 1MB and it will take a long time for the diskcard to finish. Fix this problem by disable write behind for discard io. Reported-by: Roman Mamedov <rm@romanrm.net> Closes: https://lore.kernel.org/all/6a1165f7-c792-c054-b8f0-1ad4f7b8ae01@ultracoder.org/ Reported-and-tested-by: Kirill Kirilenko <kirill@ultracoder.org> Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20231007112105.407449-1-yukuai1@huaweicloud.com
| * | | md: do not require mddev_lock() for all options in array_state_store()Mariusz Tkaczyk2023-09-281-17/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We don't need to lock device to reject not supported request in array_state_store(). No functional changes intended. There are differences between ioctl and sysfs handling during stopping. With this change, it will be easier to add additional steps which needs to be completed before mddev_lock() is taken. Reviewed-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230928125517.12356-1-mariusz.tkaczyk@linux.intel.com
| * | | md: simplify md_seq_opsYu Kuai2023-09-271-78/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before this patch, the implementation is hacky and hard to understand: 1) md_seq_start set pos to 1; 2) md_seq_show found pos is 1, then print Personalities; 3) md_seq_next found pos is 1, then it update pos to the first mddev; 4) md_seq_show found pos is not 1 or 2, show mddev; 5) md_seq_next found pos is not 1 or 2, update pos to next mddev; 6) loop 4-5 until the last mddev, then md_seq_next update pos to 2; 7) md_seq_show found pos is 2, then print unused devices; 8) md_seq_next found pos is 2, stop; This patch remove the magic value and use seq_list_start/next/stop() directly, and move printing "Personalities" to md_seq_start(), "unsed devices" to md_seq_stop(): 1) md_seq_start print Personalities, and then set pos to first mddev; 2) md_seq_show show mddev; 3) md_seq_next update pos to next mddev; 4) loop 2-3 until the last mddev; 5) md_seq_stop print unsed devices; Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230927061241.1552837-3-yukuai1@huaweicloud.com
| * | | md: factor out a helper from mddev_put()Yu Kuai2023-09-271-12/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are no functional changes, prepare to simplify md_seq_ops in next patch. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230927061241.1552837-2-yukuai1@huaweicloud.com
| * | | md: replace deprecated strncpy with memcpyJustin Stitt2023-09-251-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | `strncpy` is deprecated for use on NUL-terminated destination strings [1] and as such we should prefer more robust and less ambiguous string interfaces. There are three such strncpy uses that this patch addresses: The respective destination buffers are: 1) mddev->clevel 2) clevel 3) mddev->metadata_type We expect mddev->clevel to be NUL-terminated due to its use with format strings: | ret = sprintf(page, "%s\n", mddev->clevel); Furthermore, we can see that mddev->clevel is not expected to be NUL-padded as `md_clean()` merely set's its first byte to NULL -- not the entire buffer: | static void md_clean(struct mddev *mddev) | { | mddev->array_sectors = 0; | mddev->external_size = 0; | ... | mddev->level = LEVEL_NONE; | mddev->clevel[0] = 0; | ... A suitable replacement for this instance is `memcpy` as we know the number of bytes to copy and perform manual NUL-termination at a specified offset. This really decays to just a byte copy from one buffer to another. `strscpy` is also a considerable replacement but using `slen` as the length argument would result in truncation of the last byte unless something like `slen + 1` was provided which isn't the most idiomatic strscpy usage. For the next case, the destination buffer `clevel` is expected to be NUL-terminated based on its usage within kstrtol() which expects NUL-terminated strings. Note that, in context, this code removes a trailing newline which is seemingly not required as kstrtol() can handle trailing newlines implicitly. However, there exists further usage of clevel (or buf) that would also like to have the newline removed. All in all, with similar reasoning to the first case, let's just use memcpy as this is just a byte copy and NUL-termination is handled manually. The third and final case concerning `mddev->metadata_type` is more or less the same as the other two. We expect that it be NUL-terminated based on its usage with seq_printf: | seq_printf(seq, " super external:%s", | mddev->metadata_type); ... and we can surmise that NUL-padding isn't required either due to how it is handled in md_clean(): | static void md_clean(struct mddev *mddev) | { | ... | mddev->metadata_type[0] = 0; | ... So really, all these instances have precisely calculated lengths and purposeful NUL-termination so we can just use memcpy to remove ambiguity surrounding strncpy. Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1] Link: https://github.com/KSPP/linux/issues/90 Cc: linux-hardening@vger.kernel.org Signed-off-by: Justin Stitt <justinstitt@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230925-strncpy-drivers-md-md-c-v1-1-2b0093b89c2b@google.com
| * | | md/md-linear: Annotate struct linear_conf with __counted_byKees Cook2023-09-222-14/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct linear_conf. Additionally, since the element count member must be set before accessing the annotated flexible array member, move its initialization earlier. [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci Cc: Song Liu <song@kernel.org> Cc: linux-raid@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230915200328.never.064-kees@kernel.org
| * | | md: don't check 'mddev->pers' and 'pers->quiesce' from suspend_lo_store()Yu Kuai2023-09-221-7/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that mddev_suspend() doean't rely on 'mddev->pers' to be set, it's safe to remove such checking. This will also allow the array to be suspended even before the array is ran. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230825030956.1527023-8-yukuai1@huaweicloud.com
| * | | md: don't check 'mddev->pers' from suspend_hi_store()Yu Kuai2023-09-221-6/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that mddev_suspend() doean't rely on 'mddev->pers' to be set, it's safe to remove such checking. This will also allow the array to be suspended even before the array is ran. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230825030956.1527023-7-yukuai1@huaweicloud.com
| * | | md-bitmap: suspend array earlier in location_store()Yu Kuai2023-09-221-23/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that mddev_suspend() doean't rely on 'mddev->pers' to be set, it's safe to call mddev_suspend() earlier. This will also be helper to refactor mddev_suspend() later. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230825030956.1527023-6-yukuai1@huaweicloud.com
| * | | md-bitmap: remove the checking of 'pers->quiesce' from location_store()Yu Kuai2023-09-221-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After commit 4d27e927344a ("md: don't quiesce in mddev_suspend()"), there is no need to check 'pers->quiesce' before calling mddev_suspend(). Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230825030956.1527023-5-yukuai1@huaweicloud.com
| * | | md: don't rely on 'mddev->pers' to be set in mddev_suspend()Yu Kuai2023-09-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'active_io' used to be initialized while the array is running, and 'mddev->pers' is set while the array is running as well. Hence caller must hold 'reconfig_mutex' and guarantee 'mddev->pers' is set before calling mddev_suspend(). Now that 'active_io' is initialized when mddev is allocated, such restriction doesn't exist anymore. In the meantime, follow up patches will refactor mddev_suspend(), hence add checking for 'mddev->pers' to prevent null-ptr-deref. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230825030956.1527023-4-yukuai1@huaweicloud.com
| * | | md: initialize 'writes_pending' while allocating mddevYu Kuai2023-09-225-26/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently 'writes_pending' is initialized in pers->run for raid1/5/10, and it's freed while deleing mddev, instead of pers->free. pers->run can be called multiple times before mddev is deleted, and a helper mddev_init_writes_pending() is used to prevent 'writes_pending' to be initialized multiple times, this usage is safe but a litter weird. On the other hand, 'writes_pending' is only initialized for raid1/5/10, however, it's used in common layer, for example: array_state_store set_in_sync if (!mddev->in_sync) -> in_sync is used for all levels // access writes_pending There might be some implicit dependency that I don't recognized to make sure 'writes_pending' can only be accessed for raid1/5/10, but there are no comments about that. By the way, it make sense to initialize 'writes_pending' in common layer because there are already three levels use it. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230825030956.1527023-3-yukuai1@huaweicloud.com
| * | | md: initialize 'active_io' while allocating mddevYu Kuai2023-09-223-22/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'active_io' is used for mddev_suspend() and it's initialized in md_run(), this restrict that 'reconfig_mutex' must be held and "mddev->pers" must be set before calling mddev_suspend(). Initialize 'active_io' early so that mddev_suspend() is safe to call once mddev is allocated, this will be helpful to refactor mddev_suspend() in following patches. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230825030956.1527023-2-yukuai1@huaweicloud.com
| * | | md: delay remove_and_add_spares() for read only array to md_start_sync()Yu Kuai2023-09-221-10/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before this patch, for read-only array: md_check_recovery() check that 'MD_RECOVERY_NEEDED' is set, then it will call remove_and_add_spares() directly to try to remove and add rdevs from array. After this patch: 1) md_check_recovery() check that 'MD_RECOVERY_NEEDED' is set, and the worker 'sync_work' is not pending, and there are rdevs can be added or removed, then it will queue new work md_start_sync(); 2) md_start_sync() will call remove_and_add_spares() and exist; This change make sure that array reconfiguration is independent from daemon, and it'll be much easier to synchronize it with io, consier that io may rely on daemon thread to be done. Also fix a problem that 'pers->spars_active' is called after remove_and_add_spares(), which order is wrong, because spares must active first, and then remove_and_add_spares() can add spares to the array, like what read-write case does: 1) daemon set 'MD_RECOVERY_RUNNING', register new sync thread to do recovery; 2) recovery is done, md_do_sync() set 'MD_RECOVERY_DONE' before return; 3) daemon call 'pers->spars_active', and clear 'MD_RECOVERY_RUNNING'; 4) in the next round of daemon, call remove_and_add_spares() to add spares to the array. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230825031622.1530464-8-yukuai1@huaweicloud.com
| * | | md: factor out a helper rdev_addable() from remove_and_add_spares()Yu Kuai2023-09-221-12/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are no functional changes, just to make the code simpler and prepare to delay remove_and_add_spares() to md_start_sync(). Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230825031622.1530464-7-yukuai1@huaweicloud.com
| * | | md: factor out a helper rdev_is_spare() from remove_and_add_spares()Yu Kuai2023-09-221-5/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are no functional changes, just to make the code simpler and prepare to delay remove_and_add_spares() to md_start_sync(). Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230825031622.1530464-6-yukuai1@huaweicloud.com
| * | | md: factor out a helper rdev_removeable() from remove_and_add_spares()Yu Kuai2023-09-221-6/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are no functional changes, just to make the code simpler and prepare to delay remove_and_add_spares() to md_start_sync(). Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230825031622.1530464-5-yukuai1@huaweicloud.com
| * | | md: delay choosing sync action to md_start_sync()Yu Kuai2023-09-221-34/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before this patch, for read-write array: 1) md_check_recover() found that something need to be done, and it'll try to grab 'reconfig_mutex'. The case that md_check_recover() need to do something: - array is not suspend; - super_block need to be updated; - 'MD_RECOVERY_NEEDED' or 'MD_RECOVERY_DONE' is set; - unusual case related to safemode; 2) if 'MD_RECOVERY_RUNNING' is not set, and 'MD_RECOVERY_NEEDED' is set, md_check_recover() will try to choose a sync action, and then queue a work md_start_sync(). 3) md_start_sync() register sync_thread; After this patch, 1) is the same; 2) if 'MD_RECOVERY_RUNNING' is not set, and 'MD_RECOVERY_NEEDED' is set, queue a work md_start_sync() directly; 3) md_start_sync() will try to choose a sync action, and then register sync_thread(); Because 'MD_RECOVERY_RUNNING' is cleared when sync_thread is done, 2) and 3) and md_do_sync() is always ran in serial and they can never concurrent, this change should not introduce any behavior change for now. Also fix a problem that md_start_sync() can clear 'MD_RECOVERY_RUNNING' without protection in error path, which might affect the logical in md_check_recovery(). The advantage to change this is that array reconfiguration is independent from daemon now, and it'll be much easier to synchronize it with io, consider that io may rely on daemon thread to be done. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230825031622.1530464-4-yukuai1@huaweicloud.com
| * | | md: factor out a helper to choose sync action from md_check_recovery()Yu Kuai2023-09-221-25/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are no functional changes, on the one hand make the code cleaner, on the other hand prevent following checkpatch error in the next patch to delay choosing sync action to md_start_sync(). ERROR: do not use assignment in if condition + } else if ((spares = remove_and_add_spares(mddev, NULL))) { Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230825031622.1530464-3-yukuai1@huaweicloud.com
| * | | md: use separate work_struct for md_start_sync()Yu Kuai2023-09-222-5/+10
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | It's a little weird to borrow 'del_work' for md_start_sync(), declare a new work_struct 'sync_work' for md_start_sync(). Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230825031622.1530464-2-yukuai1@huaweicloud.com
* | | Merge tag 'hardening-v6.7-rc1' of ↵Linus Torvalds2023-10-305-5/+5
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening updates from Kees Cook: "One of the more voluminous set of changes is for adding the new __counted_by annotation[1] to gain run-time bounds checking of dynamically sized arrays with UBSan. - Add LKDTM test for stuck CPUs (Mark Rutland) - Improve LKDTM selftest behavior under UBSan (Ricardo Cañuelo) - Refactor more 1-element arrays into flexible arrays (Gustavo A. R. Silva) - Analyze and replace strlcpy and strncpy uses (Justin Stitt, Azeem Shaikh) - Convert group_info.usage to refcount_t (Elena Reshetova) - Add __counted_by annotations (Kees Cook, Gustavo A. R. Silva) - Add Kconfig fragment for basic hardening options (Kees Cook, Lukas Bulwahn) - Fix randstruct GCC plugin performance mode to stay in groups (Kees Cook) - Fix strtomem() compile-time check for small sources (Kees Cook)" * tag 'hardening-v6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (56 commits) hwmon: (acpi_power_meter) replace open-coded kmemdup_nul reset: Annotate struct reset_control_array with __counted_by kexec: Annotate struct crash_mem with __counted_by virtio_console: Annotate struct port_buffer with __counted_by ima: Add __counted_by for struct modsig and use struct_size() MAINTAINERS: Include stackleak paths in hardening entry string: Adjust strtomem() logic to allow for smaller sources hardening: x86: drop reference to removed config AMD_IOMMU_V2 randstruct: Fix gcc-plugin performance mode to stay in group mailbox: zynqmp: Annotate struct zynqmp_ipi_pdata with __counted_by drivers: thermal: tsens: Annotate struct tsens_priv with __counted_by irqchip/imx-intmux: Annotate struct intmux_data with __counted_by KVM: Annotate struct kvm_irq_routing_table with __counted_by virt: acrn: Annotate struct vm_memory_region_batch with __counted_by hwmon: Annotate struct gsc_hwmon_platform_data with __counted_by sparc: Annotate struct cpuinfo_tree with __counted_by isdn: kcapi: replace deprecated strncpy with strscpy_pad isdn: replace deprecated strncpy with strscpy NFS/flexfiles: Annotate struct nfs4_ff_layout_segment with __counted_by nfs41: Annotate struct nfs4_file_layout_dsaddr with __counted_by ...
| * | | dm: Annotate struct dm_bio_prison with __counted_byKees Cook2023-10-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct dm_bio_prison. [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci Cc: Alasdair Kergon <agk@redhat.com> Cc: Mike Snitzer <snitzer@kernel.org> Cc: dm-devel@redhat.com Reviewed-by: "Gustavo A. R. Silva" <gustavoars@kernel.org> Link: https://lore.kernel.org/r/20230915200407.never.611-kees@kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>
| * | | dm: Annotate struct dm_stat with __counted_byKees Cook2023-10-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct dm_stat. [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci Cc: Alasdair Kergon <agk@redhat.com> Cc: Mike Snitzer <snitzer@kernel.org> Cc: dm-devel@redhat.com Reviewed-by: "Gustavo A. R. Silva" <gustavoars@kernel.org> Link: https://lore.kernel.org/r/20230915200400.never.585-kees@kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>
| * | | dm: Annotate struct stripe_c with __counted_byKees Cook2023-10-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct stripe_c. [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci Cc: Alasdair Kergon <agk@redhat.com> Cc: Mike Snitzer <snitzer@kernel.org> Cc: dm-devel@redhat.com Reviewed-by: "Gustavo A. R. Silva" <gustavoars@kernel.org> Link: https://lore.kernel.org/r/20230915200352.never.118-kees@kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>
| * | | dm crypt: Annotate struct crypt_config with __counted_byKees Cook2023-10-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct crypt_config. [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci Cc: Alasdair Kergon <agk@redhat.com> Cc: Mike Snitzer <snitzer@kernel.org> Cc: dm-devel@redhat.com Reviewed-by: "Gustavo A. R. Silva" <gustavoars@kernel.org> Link: https://lore.kernel.org/r/20230915200344.never.272-kees@kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>
| * | | dm raid: Annotate struct raid_set with __counted_byKees Cook2023-10-021-1/+1
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct raid_set. [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci Cc: Alasdair Kergon <agk@redhat.com> Cc: Mike Snitzer <snitzer@kernel.org> Cc: dm-devel@redhat.com Reviewed-by: "Gustavo A. R. Silva" <gustavoars@kernel.org> Link: https://lore.kernel.org/r/20230915200335.never.098-kees@kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>
* | | Merge tag 'bcachefs-2023-10-30' of https://evilpiepirate.org/git/bcachefsLinus Torvalds2023-10-307-600/+5
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull initial bcachefs updates from Kent Overstreet: "Here's the bcachefs filesystem pull request. One new patch since last week: the exportfs constants ended up conflicting with other filesystems that are also getting added to the global enum, so switched to new constants picked by Amir. The only new non fs/bcachefs/ patch is the objtool patch that adds bcachefs functions to the list of noreturns. The patch that exports osq_lock() has been dropped for now, per Ingo" * tag 'bcachefs-2023-10-30' of https://evilpiepirate.org/git/bcachefs: (2781 commits) exportfs: Change bcachefs fid_type enum to avoid conflicts bcachefs: Refactor memcpy into direct assignment bcachefs: Fix drop_alloc_keys() bcachefs: snapshot_create_lock bcachefs: Fix snapshot skiplists during snapshot deletion bcachefs: bch2_sb_field_get() refactoring bcachefs: KEY_TYPE_error now counts towards i_sectors bcachefs: Fix handling of unknown bkey types bcachefs: Switch to unsafe_memcpy() in a few places bcachefs: Use struct_size() bcachefs: Correctly initialize new buckets on device resize bcachefs: Fix another smatch complaint bcachefs: Use strsep() in split_devs() bcachefs: Add iops fields to bch_member bcachefs: Rename bch_sb_field_members -> bch_sb_field_members_v1 bcachefs: New superblock section members_v2 bcachefs: Add new helper to retrieve bch_member from sb bcachefs: bucket_lock() is now a sleepable lock bcachefs: fix crc32c checksum merge byte order problem bcachefs: Fix bch2_inode_delete_keys() ...
| * | | bcache: move closures to lib/Kent Overstreet2023-10-197-600/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prep work for bcachefs - being a fork of bcache it also uses closures Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> Acked-by: Coly Li <colyli@suse.de> Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
* | | | bcache: Fixup error handling in register_cache()Jan Kara2023-10-281-13/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Coverity has noticed that the printing of error message in register_cache() uses already freed bdev_handle to get to bdev. In fact the problem has been there even before commit "bcache: Convert to bdev_open_by_path()" just a bit more subtle one - cache object itself could have been freed by the time we looked at ca->bdev and we don't hold any reference to bdev either so even that could in principle go away (due to device unplug or similar). Fix all these problems by printing the error message before closing the bdev. Fixes: dc893f51d24a ("bcache: Convert to bdev_open_by_path()") Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20231004093757.11560-1-jack@suse.cz Asked-by: Coly Li <colyli@suse.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
* | | | md: Convert to bdev_open_by_dev()Jan Kara2023-10-282-18/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert md to use bdev_open_by_dev() and pass the handle around. We also don't need the 'Holder' flag anymore so remove it. CC: linux-raid@vger.kernel.org CC: Song Liu <song@kernel.org> Acked-by: Song Liu <song@kernel.org> Acked-by: Christoph Hellwig <hch@lst.de> Acked-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230927093442.25915-11-jack@suse.cz Signed-off-by: Christian Brauner <brauner@kernel.org>
* | | | dm: Convert to bdev_open_by_dev()Jan Kara2023-10-281-9/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert device mapper to use bdev_open_by_dev() and pass the handle around. CC: Alasdair Kergon <agk@redhat.com> CC: Mike Snitzer <snitzer@kernel.org> CC: dm-devel@redhat.com Acked-by: Christoph Hellwig <hch@lst.de> Acked-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230927093442.25915-10-jack@suse.cz Signed-off-by: Christian Brauner <brauner@kernel.org>
* | | | bcache: Convert to bdev_open_by_path()Jan Kara2023-10-282-37/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert bcache to use bdev_open_by_path() and pass the handle around. CC: linux-bcache@vger.kernel.org CC: Coly Li <colyli@suse.de> CC: Kent Overstreet <kent.overstreet@gmail.com> Acked-by: Christoph Hellwig <hch@lst.de> Acked-by: Christian Brauner <brauner@kernel.org> Acked-by: Coly Li <colyli@suse.de> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230927093442.25915-9-jack@suse.cz Signed-off-by: Christian Brauner <brauner@kernel.org>
* | | | Merge tag 'v6.6-p4' of ↵Linus Torvalds2023-10-101-1/+2
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fix from Herbert Xu: "Fix a regression in dm-crypt" * tag 'v6.6-p4' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: dm crypt: Fix reqsize in crypt_iv_eboiv_gen
| * | | | dm crypt: Fix reqsize in crypt_iv_eboiv_genHerbert Xu2023-10-061-1/+2
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A skcipher_request object is made up of struct skcipher_request followed by a variable-sized trailer. The allocation of the skcipher_request and IV in crypt_iv_eboiv_gen is missing the memory for struct skcipher_request. Fix it by adding it to reqsize. Fixes: e3023094dffb ("dm crypt: Avoid using MAX_CIPHER_BLOCKSIZE") Cc: <stable@vger.kernel.org> #6.5+ Reported-by: Tatu Heikkilä <tatu.heikkila@gmail.com> Reviewed-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* | | | Merge tag 'for-6.6/dm-fixes-2' of ↵Linus Torvalds2023-10-071-8/+7
|\ \ \ \ | | |_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: - Fix memory leak when freeing dm zoned target device - Update dm-devel mailing list address in MAINTAINERS * tag 'for-6.6/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: MAINTAINERS: update the dm-devel mailing list dm zoned: free dmz->ddev array in dmz_put_zoned_devices