summaryrefslogtreecommitdiffstats
path: root/drivers/mtd/ubi
Commit message (Collapse)AuthorAgeFilesLines
* ubi: fastmap: Add module parameter to control reserving filling pool PEBsZhihao Cheng2023-10-283-6/+25
| | | | | | | | Adding 6th module parameter in 'mtd=xxx' to control whether or not reserving PEBs for filling pool/wl_pool. Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
* ubi: fastmap: Fix lapsed wear leveling for first 64 PEBsZhihao Cheng2023-10-284-3/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The anchor PEB must be picked from first 64 PEBs, these PEBs could have large erase counter greater than other PEBs especially when free space is nearly running out. The ubi_update_fastmap will be called as long as pool/wl_pool is empty, old anchor PEB is erased when updating fastmap. Given an UBI device with N PEBs, free PEBs is nearly running out and pool will be filled with 1 PEB every time ubi_update_fastmap invoked. So t=N/POOL_SIZE[1]/64 means that in worst case the erase counter of first 64 PEBs is t times greater than other PEBs in theory. After running fsstress for 24h, the erase counter statistics for two UBI devices shown as follow(CONFIG_MTD_UBI_WL_THRESHOLD=128): Device A(1024 PEBs, pool=50, wl_pool=25): ========================================================= from to count min avg max --------------------------------------------------------- 0 .. 9: 0 0 0 0 10 .. 99: 0 0 0 0 100 .. 999: 0 0 0 0 1000 .. 9999: 0 0 0 0 10000 .. 99999: 960 29224 29282 29362 100000 .. inf: 64 117897 117934 117940 --------------------------------------------------------- Total : 1024 29224 34822 117940 Device B(8192 PEBs, pool=256, wl_pool=128): ========================================================= from to count min avg max --------------------------------------------------------- 0 .. 9: 0 0 0 0 10 .. 99: 0 0 0 0 100 .. 999: 0 0 0 0 1000 .. 9999: 8128 2253 2321 2387 10000 .. 99999: 64 35387 35387 35388 100000 .. inf: 0 0 0 0 --------------------------------------------------------- Total : 8192 2253 2579 35388 The key point is reducing fastmap updating frequency by enlarging POOL_SIZE, so let UBI reserve ubi->fm_pool.max_size PEBs during attaching. Then POOL_SIZE will become ubi->fm_pool.max_size/2 even in free space running out case. Given an UBI device with 8192 PEBs(16384\8192\4096 is common large-capacity flash), t=8192/128/64=1. The fastmap updating will happen in either wl_pool or pool is empty, so setting fm_pool_rsv_cnt as ubi->fm_pool.max_size can fill wl_pool in full state. After pool reservation, running fsstress for 24h: Device A(1024 PEBs, pool=50, wl_pool=25): ========================================================= from to count min avg max --------------------------------------------------------- 0 .. 9: 0 0 0 0 10 .. 99: 0 0 0 0 100 .. 999: 0 0 0 0 1000 .. 9999: 0 0 0 0 10000 .. 99999: 1024 33801 33997 34056 100000 .. inf: 0 0 0 0 --------------------------------------------------------- Total : 1024 33801 33997 34056 Device B(8192 PEBs, pool=256, wl_pool=128): ========================================================= from to count min avg max --------------------------------------------------------- 0 .. 9: 0 0 0 0 10 .. 99: 0 0 0 0 100 .. 999: 0 0 0 0 1000 .. 9999: 8192 2205 2397 2460 10000 .. 99999: 0 0 0 0 100000 .. inf: 0 0 0 0 --------------------------------------------------------- Total : 8192 2205 2397 2460 The difference of erase counter between first 64 PEBs and others is under WL_FREE_MAX_DIFF(2*UBI_WL_THRESHOLD=2*128=256). Device A: 34056 - 33801 = 255 Device B: 2460 - 2205 = 255 Next patch will add a switch to control whether UBI needs to reserve PEBs for filling pool. Fixes: dbb7d2a88d2a ("UBI: Add fastmap core") Link: https://bugzilla.kernel.org/show_bug.cgi?id=217787 Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
* ubi: fastmap: Get wl PEB even ec beyonds the 'max' if free PEBs are run outZhihao Cheng2023-10-282-16/+44
| | | | | | | | | | | | | | | | | | | This is the part 2 to fix cyclically reusing single fastmap data PEBs. Consider one situation, if there are four free PEBs for fm_anchor, pool, wl_pool and fastmap data PEB with erase counter 100, 100, 100, 5096 (ubi->beb_rsvd_pebs is 0). PEB with erase counter 5096 is always picked for fastmap data according to the realization of find_wl_entry(), since fastmap data PEB is not scheduled for wl, finally there are two PEBs (fm data) with great erase counter than other PEBS. Get wl PEB even its erase counter exceeds the 'max' in find_wl_entry() when free PEBs are run out after filling pools and fm data. Then the PEB with biggest erase conter is taken as wl PEB, it can be scheduled for wl. Fixes: dbb7d2a88d2a ("UBI: Add fastmap core") Link: https://bugzilla.kernel.org/show_bug.cgi?id=217787 Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
* ubi: fastmap: may_reserve_for_fm: Don't reserve PEB if fm_anchor existsZhihao Cheng2023-10-282-4/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the part 1 to fix cyclically reusing single fastmap data PEBs. After running fsstress on UBIFS for a while, UBI (16384 blocks, fastmap takes 2 blocks) has an erase block(PEB: 8031) with big erase counter greater than any other pebs: ========================================================= from to count min avg max --------------------------------------------------------- 0 .. 9: 0 0 0 0 10 .. 99: 532 84 92 99 100 .. 999: 15787 100 147 229 1000 .. 9999: 64 4699 4765 4826 10000 .. 99999: 0 0 0 0 100000 .. inf: 1 272935 272935 272935 --------------------------------------------------------- Total : 16384 84 180 272935 Not like fm_anchor, there is no candidate PEBs for fastmap data area, so old fastmap data pebs will be reused after all free pebs are filled into pool/wl_pool: ubi_update_fastmap for (i = 1; i < new_fm->used_blocks; i++) erase_block(ubi, old_fm->e[i]->pnum) new_fm->e[i] = old_fm->e[i] According to wear leveling algorithm, UBI selects one small erase counter PEB from ubi->used and one big erase counter PEB from wl_pool, the reused fastmap data PEB is not in these trees. UBI won't schedule this PEB for wl even it is in ubi->used because wl algorithm expects small erase counter for used PEB. Don't reserve PEB for fastmap in may_reserve_for_fm() if fm_anchor already exists. Otherwise, when UBI is running out of free PEBs, the only one free PEB (pnum < 64) will be skipped and fastmap data will be written on the same old PEB. Fixes: dbb7d2a88d2a ("UBI: Add fastmap core") Link: https://bugzilla.kernel.org/show_bug.cgi?id=217787 Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
* ubi: fastmap: Remove unneeded break condition while filling poolsZhihao Cheng2023-10-281-3/+2
| | | | | | | | | | | | Change pool filling stop condition. Commit d09e9a2bddba ("ubi: fastmap: Fix high cpu usage of ubi_bgt by making sure wl_pool not empty") reserves fastmap data PEBs after filling 1 PEB in wl_pool. Now wait_free_pebs_for_pool() makes enough free PEBs before filling pool, there will still be at least 1 PEB in pool and 1 PEB in wl_pool after doing ubi_refill_pools(). Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
* ubi: fastmap: Wait until there are enough free PEBs before filling poolsZhihao Cheng2023-10-285-16/+65
| | | | | | | | | | | | | | | | | | | | | Wait until there are enough free PEBs before filling pool/wl_pool, sometimes erase_worker is not scheduled in time, which causes two situations: A. There are few PEBs filled in pool, which makes ubi_update_fastmap is frequently called and leads first 64 PEBs are erased more times than other PEBs. So waiting free PEBs before filling pool reduces fastmap updating frequency and prolongs flash service life. B. In situation that space is nearly running out, ubi_refill_pools() cannot make sure pool and wl_pool are filled with free PEBs, caused by the delay of erase_worker. After this patch applied, there must exist free PEBs in pool after one call of ubi_update_fastmap. Besides, this patch is a preparetion for fixing large erase counter in fastmap data block and fixing lapsed wear leveling for first 64 PEBs. Link: https://bugzilla.kernel.org/show_bug.cgi?id=217787 Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
* ubi: fastmap: Use free pebs reserved for bad block handlingZhihao Cheng2023-10-281-11/+5
| | | | | | | | | | | | | | | | | | | | | If new bad PEBs occur, UBI firstly consumes ubi->beb_rsvd_pebs, and then ubi->avail_pebs, finally UBI becomes read-only if above two items are 0, which means that the amount of PEBs for user volumes is not effected. Besides, UBI reserves count of free PBEs is ubi->beb_rsvd_pebs while filling wl pool or getting free PEBs, but ubi->avail_pebs is not reserved. So ubi->beb_rsvd_pebs and ubi->avail_pebs have nothing to do with the usage of free PEBs, UBI can use all free PEBs. Commit 78d6d497a648 ("UBI: Move fastmap specific functions out of wl.c") has removed beb_rsvd_pebs checking while filling pool. Now, don't reserve ubi->beb_rsvd_pebs while filling wl_pool. This will fill more PEBs in pool and also reduce fastmap updating frequency. Also remove beb_rsvd_pebs checking in ubi_wl_get_fm_peb. Link: https://bugzilla.kernel.org/show_bug.cgi?id=217787 Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
* ubi: Replace erase_block() with sync_erase()Zhihao Cheng2023-10-283-51/+7
| | | | | | | | Since erase_block() has same logic with sync_erase(), just replace it with sync_erase(), also rename 'sync_erase()' to 'ubi_sync_erase()'. Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
* ubi: fastmap: Allocate memory with GFP_NOFS in ubi_update_fastmapZhihao Cheng2023-10-281-5/+5
| | | | | | | | | | | | | | | | | | | | | | Function ubi_update_fastmap could be called in IO context, for example: ubifs_writepage do_writepage ubifs_jnl_write_data write_head ubifs_wbuf_write_nolock ubifs_leb_write ubi_leb_write ubi_eba_write_leb try_write_vid_and_data ubi_wl_get_peb ubi_update_fastmap erase_block So it's better to allocate memory with GFP_NOFS mode, in case waiting page writeback(dead loop). Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
* ubi: fastmap: erase_block: Get erase counter from wl_entry rather than flashZhihao Cheng2023-10-281-18/+9
| | | | | | | | Just like sync_erase() does, getting erase counter from wl_entry is faster than reading from flash. Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
* ubi: fastmap: Fix missed ec updating after erasing old fastmap data blockZhihao Cheng2023-10-281-10/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After running fsstress on ubifs for a long time, UBI(16384 blocks, fastmap takes 2 blocks) has an erase block with different erase counters displayed from two views: From ubiscan view: PEB 8031 has erase counter 31581 ========================================================= from to count min avg max --------------------------------------------------------- 0 .. 9: 0 0 0 0 10 .. 99: 0 0 0 0 100 .. 999: 16383 290 315 781 1000 .. 9999: 0 0 0 0 10000 .. 99999: 1 31581 31581 31581 100000 .. inf: 0 0 0 0 --------------------------------------------------------- Total : 16384 290 317 31581 From detailed_erase_block_info view: PEB 8031 has erase counter 7 physical_block_number erase_count 8030 421 8031 7 # mem info is different from disk info 8032 434 8033 425 8034 431 Following process missed updating erase counter in wl_entry(in memory): ubi_update_fastmap for (i = 1; i < new_fm->used_blocks; i++) // update fastmap data if (!tmp_e) if (old_fm && old_fm->e[i]) erase_block(ubi, old_fm->e[i]->pnum) ret = ubi_io_sync_erase(ubi, pnum, 0) ec = be64_to_cpu(ec_hdr->ec) ec += ret ec_hdr->ec = cpu_to_be64(ec) ubi_io_write_ec_hdr(ubi, pnum, ec_hdr) // ec is updated on flash // ec is not updated in old_fm->e[i] (in memory) Fix it by passing wl_enter into erase_block() and updating erase counter in erase_block(). Fixes: dbb7d2a88d2a ("UBI: Add fastmap core") Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
* ubi: Refuse attaching if mtd's erasesize is 0Zhihao Cheng2023-09-071-0/+7
| | | | | | | | | | | | | There exists mtd devices with zero erasesize, which will trigger a divide-by-zero exception while attaching ubi device. Fix it by refusing attaching if mtd's erasesize is 0. Fixes: 801c135ce73d ("UBI: Unsorted Block Images") Reported-by: Yu Hao <yhao016@ucr.edu> Link: https://lore.kernel.org/lkml/977347543.226888.1682011999468.JavaMail.zimbra@nod.at/T/ Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Richard Weinberger <richard@nod.at>
* block: replace fmode_t with a block-specific type for block open flagsChristoph Hellwig2023-06-121-3/+2
| | | | | | | | | | | | | | The only overlap between the block open flags mapped into the fmode_t and other uses of fmode_t are FMODE_READ and FMODE_WRITE. Define a new blk_mode_t instead for use in blkdev_get_by_{dev,path}, ->open and ->ioctl and stop abusing fmode_t. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Jack Wang <jinpu.wang@ionos.com> [rnbd] Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20230608110258.189493-28-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
* block: remove the unused mode argument to ->releaseChristoph Hellwig2023-06-121-1/+1
| | | | | | | | | | | | The mode argument to the ->release block_device_operation is never used, so remove it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Christian Brauner <brauner@kernel.org> Acked-by: Jack Wang <jinpu.wang@ionos.com> [rnbd] Link: https://lore.kernel.org/r/20230608110258.189493-10-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
* block: pass a gendisk to ->openChristoph Hellwig2023-06-121-2/+2
| | | | | | | | | | | | ->open is only called on the whole device. Make that explicit by passing a gendisk instead of the block_device. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Christian Brauner <brauner@kernel.org> Acked-by: Jack Wang <jinpu.wang@ionos.com> [rnbd] Link: https://lore.kernel.org/r/20230608110258.189493-9-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
* Merge tag 'ubifs-for-linus-6.4-rc1' of ↵Linus Torvalds2023-05-032-6/+15
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs Pull UBI and UBIFS updates from Richard Weinberger: "UBI: - Fix error value for try_write_vid_and_data() - Minor cleanups UBIFS: - Fixes for various memory leaks - Minor cleanups" * tag 'ubifs-for-linus-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs: ubifs: Fix memleak when insert_old_idx() failed Revert "ubifs: dirty_cow_znode: Fix memleak in error handling path" ubifs: Fix memory leak in do_rename ubifs: Free memory for tmpfile name ubi: Fix return value overwrite issue in try_write_vid_and_data() ubifs: Remove return in compr_exit() ubi: Simplify bool conversion
| * ubi: Fix return value overwrite issue in try_write_vid_and_data()Wang YanQing2023-04-211-5/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The commit 2d78aee426d8 ("UBI: simplify LEB write and atomic LEB change code") adds helper function, try_write_vid_and_data(), to simplify the code, but this helper function has bug, it will return 0 (success) when ubi_io_write_vid_hdr() or the ubi_io_write_data() return error number (-EIO, etc), because the return value of ubi_wl_put_peb() will overwrite the original return value. This issue will cause unexpected data loss issue, because the caller of this function and UBIFS willn't know the data is lost. Fixes: 2d78aee426d8 ("UBI: simplify LEB write and atomic LEB change code") Cc: stable@vger.kernel.org Signed-off-by: Wang YanQing <udknight@gmail.com> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
| * ubi: Simplify bool conversionYang Li2023-04-211-1/+1
| | | | | | | | | | | | | | | | | | | | ./drivers/mtd/ubi/build.c:1261:33-38: WARNING: conversion to bool not needed here Reported-by: Abaci Robot <abaci@linux.alibaba.com> Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=4061 Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
* | Merge tag 'driver-core-6.4-rc1' of ↵Linus Torvalds2023-04-271-2/+1
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the large set of driver core changes for 6.4-rc1. Once again, a busy development cycle, with lots of changes happening in the driver core in the quest to be able to move "struct bus" and "struct class" into read-only memory, a task now complete with these changes. This will make the future rust interactions with the driver core more "provably correct" as well as providing more obvious lifetime rules for all busses and classes in the kernel. The changes required for this did touch many individual classes and busses as many callbacks were changed to take const * parameters instead. All of these changes have been submitted to the various subsystem maintainers, giving them plenty of time to review, and most of them actually did so. Other than those changes, included in here are a small set of other things: - kobject logging improvements - cacheinfo improvements and updates - obligatory fw_devlink updates and fixes - documentation updates - device property cleanups and const * changes - firwmare loader dependency fixes. All of these have been in linux-next for a while with no reported problems" * tag 'driver-core-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (120 commits) device property: make device_property functions take const device * driver core: update comments in device_rename() driver core: Don't require dynamic_debug for initcall_debug probe timing firmware_loader: rework crypto dependencies firmware_loader: Strip off \n from customized path zram: fix up permission for the hot_add sysfs file cacheinfo: Add use_arch[|_cache]_info field/function arch_topology: Remove early cacheinfo error message if -ENOENT cacheinfo: Check cache properties are present in DT cacheinfo: Check sib_leaf in cache_leaves_are_shared() cacheinfo: Allow early level detection when DT/ACPI info is missing/broken cacheinfo: Add arm64 early level initializer implementation cacheinfo: Add arch specific early level initializer tty: make tty_class a static const structure driver core: class: remove struct class_interface * from callbacks driver core: class: mark the struct class in struct class_interface constant driver core: class: make class_register() take a const * driver core: class: mark class_release() as taking a const * driver core: remove incorrect comment for device_create* MIPS: vpe-cmp: remove module owner pointer from struct class usage. ...
| * Merge 6.3-rc5 into driver-core-nextGreg Kroah-Hartman2023-04-031-1/+4
| |\ | | | | | | | | | | | | | | | | | | We need the fixes in here for testing, as well as the driver core changes for documentation updates to build on. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | driver core: class: mark the struct class for sysfs callbacks as constantGreg Kroah-Hartman2023-03-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | struct class should never be modified in a sysfs callback as there is nothing in the structure to modify, and frankly, the structure is almost never used in a sysfs callback, so mark it as constant to allow struct class to be moved to read-only memory. While we are touching all class sysfs callbacks also mark the attribute as constant as it can not be modified. The bonding code still uses this structure so it can not be removed from the function callbacks. Cc: "David S. Miller" <davem@davemloft.net> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Bartosz Golaszewski <brgl@bgdev.pl> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Russ Weight <russell.h.weight@intel.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Steve French <sfrench@samba.org> Cc: Vignesh Raghavendra <vigneshr@ti.com> Cc: linux-cifs@vger.kernel.org Cc: linux-gpio@vger.kernel.org Cc: linux-mtd@lists.infradead.org Cc: linux-rdma@vger.kernel.org Cc: linux-s390@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: netdev@vger.kernel.org Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Link: https://lore.kernel.org/r/20230325084537.3622280-1-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | drivers: remove struct module * setting from struct classGreg Kroah-Hartman2023-03-171-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is no need to manually set the owner of a struct class, as the registering function does it automatically, so remove all of the explicit settings from various drivers that did so as it is unneeded. This allows us to remove this pointer entirely from this structure going forward. Cc: "Rafael J. Wysocki" <rafael@kernel.org> Link: https://lore.kernel.org/r/20230313181843.1207845-2-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | | Merge tag 'ubifs-for-linus-6.3-rc7' of ↵Linus Torvalds2023-04-152-8/+17
|\ \ \ | |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs Pull UBI fixes from Richard Weinberger: - Fix failure to attach when vid_hdr offset equals the (sub)page size - Fix for a deadlock in UBI's worker thread * tag 'ubifs-for-linus-6.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs: ubi: Fix failure attaching when vid_hdr offset equals to (sub)page size ubi: Fix deadlock caused by recursively holding work_sem
| * | ubi: Fix failure attaching when vid_hdr offset equals to (sub)page sizeZhihao Cheng2023-03-291-6/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Following process will make ubi attaching failed since commit 1b42b1a36fc946 ("ubi: ensure that VID header offset ... size"): ID="0xec,0xa1,0x00,0x15" # 128M 128KB 2KB modprobe nandsim id_bytes=$ID flash_eraseall /dev/mtd0 modprobe ubi mtd="0,2048" # set vid_hdr offset as 2048 (one page) (dmesg): ubi0 error: ubi_attach_mtd_dev [ubi]: VID header offset 2048 too large. UBI error: cannot attach mtd0 UBI error: cannot initialize UBI, error -22 Rework original solution, the key point is making sure 'vid_hdr_shift + UBI_VID_HDR_SIZE < ubi->vid_hdr_alsize', so we should check vid_hdr_shift rather not vid_hdr_offset. Then, ubi still support (sub)page aligined VID header offset. Fixes: 1b42b1a36fc946 ("ubi: ensure that VID header offset ... size") Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Tested-by: Nicolas Schichan <nschichan@freebox.fr> Tested-by: Miquel Raynal <miquel.raynal@bootlin.com> # v5.10, v4.19 Signed-off-by: Richard Weinberger <richard@nod.at>
| * | ubi: Fix deadlock caused by recursively holding work_semZhaoLong Wang2023-03-041-2/+2
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During the processing of the bgt, if the sync_erase() return -EBUSY or some other error code in __erase_worker(),schedule_erase() called again lead to the down_read(ubi->work_sem) hold twice and may get block by down_write(ubi->work_sem) in ubi_update_fastmap(), which cause deadlock. ubi bgt other task do_work down_read(&ubi->work_sem) ubi_update_fastmap erase_worker # Blocked by down_read __erase_worker down_write(&ubi->work_sem) schedule_erase schedule_ubi_work down_read(&ubi->work_sem) Fix this by changing input parameter @nested of the schedule_erase() to 'true' to avoid recursively acquiring the down_read(&ubi->work_sem). Also, fix the incorrect comment about @nested parameter of the schedule_erase() because when down_write(ubi->work_sem) is held, the @nested is also need be true. Link: https://bugzilla.kernel.org/show_bug.cgi?id=217093 Fixes: 2e8f08deabbc ("ubi: Fix races around ubi_refill_pools()") Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
* / ubi: block: Fix missing blk_mq_end_requestRichard Weinberger2023-03-111-1/+4
|/ | | | | | | | | | | | | | Switching to BLK_MQ_F_BLOCKING wrongly removed the call to blk_mq_end_request(). Add it back to have our IOs finished Fixes: 91cc8fbcc8c7 ("ubi: block: set BLK_MQ_F_BLOCKING") Analyzed-by: Linus Torvalds <torvalds@linux-foundation.org> Reported-by: Daniel Palmer <daniel@0x0f.com> Link: https://lore.kernel.org/linux-mtd/CAHk-=wi29bbBNh3RqJKu3PxzpjDN5D5K17gEVtXrb7-6bfrnMQ@mail.gmail.com/ Signed-off-by: Richard Weinberger <richard@nod.at> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Daniel Palmer <daniel@0x0f.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ubi: block: Fix a possible use-after-free bug in ubiblock_create()Harshit Mogalapalli2023-02-141-0/+1
| | | | | | | | | | | | | | | Smatch warns: drivers/mtd/ubi/block.c:438 ubiblock_create() warn: '&dev->list' not removed from list 'dev' is freed in 'out_free_dev:, but it is still on the list. To fix this, delete the list item before freeing. Fixes: 91cc8fbcc8c7 ("ubi: block: set BLK_MQ_F_BLOCKING") Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Richard Weinberger <richard@nod.at>
* mtd: ubi: block: wire-up device parentDaniel Golle2023-02-132-1/+2
| | | | | | | | | | ubiblock devices were previously only identifyable by their name, but not connected to their parent UBI volume device e.g. in sysfs. Properly parent ubiblock device as descendant of a UBI volume device to reflect device model hierachy. Signed-off-by: Daniel Golle <daniel@makrotopia.org> Signed-off-by: Richard Weinberger <richard@nod.at>
* mtd: ubi: wire-up parent MTD deviceDaniel Golle2023-02-131-0/+1
| | | | | | | | | | | | | | | | | Wire up the device parent pointer of UBI devices to their lower MTD device, typically an MTD partition or whole-chip device. The most noticeable change is that in sysfs, previously ubi devices would be could in /sys/devices/virtual/ubi while after this change they would be correctly attached to their parent MTD device, e.g. /sys/devices/platform/1100d000.spi/spi_master/spi1/spi1.0/mtd/mtd2/ubi0. Locating UBI devices using /sys/class/ubi/ of course still works as well. Signed-off-by: Daniel Golle <daniel@makrotopia.org> Signed-off-by: Richard Weinberger <richard@nod.at>
* ubi: use correct names in function kernel-doc commentsRandy Dunlap2023-02-053-3/+3
| | | | | | | | | | | | | | | | Fix kernel-doc warnings by using the correct function names in their kernel-doc notation: drivers/mtd/ubi/eba.c:72: warning: expecting prototype for next_sqnum(). Prototype was for ubi_next_sqnum() instead drivers/mtd/ubi/wl.c:176: warning: expecting prototype for wl_tree_destroy(). Prototype was for wl_entry_destroy() instead drivers/mtd/ubi/misc.c:24: warning: expecting prototype for calc_data_len(). Prototype was for ubi_calc_data_len() instead Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Richard Weinberger <richard@nod.at> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Vignesh Raghavendra <vigneshr@ti.com> Cc: linux-mtd@lists.infradead.org Signed-off-by: Richard Weinberger <richard@nod.at>
* ubi: block: set BLK_MQ_F_BLOCKINGChristoph Hellwig2023-02-051-69/+28
| | | | | | | | Set BLK_MQ_F_BLOCKING so that the block layer always calls ->queue_rq from process context and drop the driver internal workqueue. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Richard Weinberger <richard@nod.at>
* UBI: Fastmap: Fix kernel-docJiapeng Chong2023-02-021-1/+1
| | | | | | | | | drivers/mtd/ubi/fastmap.c:104: warning: expecting prototype for new_fm_vhdr(). Prototype was for new_fm_vbuf() instead. Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=2289 Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Richard Weinberger <richard@nod.at>
* ubi: ubi_wl_put_peb: Fix infinite loop when wear-leveling work failedZhihao Cheng2023-02-021-2/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Following process will trigger an infinite loop in ubi_wl_put_peb(): ubifs_bgt ubi_bgt ubifs_leb_unmap ubi_leb_unmap ubi_eba_unmap_leb ubi_wl_put_peb wear_leveling_worker e1 = rb_entry(rb_first(&ubi->used) e2 = get_peb_for_wl(ubi) ubi_io_read_vid_hdr // return err (flash fault) out_error: ubi->move_from = ubi->move_to = NULL wl_entry_destroy(ubi, e1) ubi->lookuptbl[e->pnum] = NULL retry: e = ubi->lookuptbl[pnum]; // return NULL if (e == ubi->move_from) { // NULL == NULL gets true goto retry; // infinite loop !!! $ top PID USER PR NI VIRT RES SHR S %CPU %MEM COMMAND 7676 root 20 0 0 0 0 R 100.0 0.0 ubifs_bgt0_0 Fix it by: 1) Letting ubi_wl_put_peb() returns directly if wearl leveling entry has been removed from 'ubi->lookuptbl'. 2) Using 'ubi->wl_lock' protecting wl entry deletion to preventing an use-after-free problem for wl entry in ubi_wl_put_peb(). Fetch a reproducer in [Link]. Fixes: 43f9b25a9cdd7b1 ("UBI: bugfix: protect from volume removal") Fixes: ee59ba8b064f692 ("UBI: Fix stale pointers in ubi->lookuptbl") Link: https://bugzilla.kernel.org/show_bug.cgi?id=216111 Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
* ubi: Fix UAF wear-leveling entry in eraseblk_count_seq_show()Zhihao Cheng2023-02-021-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | Wear-leveling entry could be freed in error path, which may be accessed again in eraseblk_count_seq_show(), for example: __erase_worker eraseblk_count_seq_show wl = ubi->lookuptbl[*block_number] if (wl) wl_entry_destroy ubi->lookuptbl[e->pnum] = NULL kmem_cache_free(ubi_wl_entry_slab, e) erase_count = wl->ec // UAF! Wear-leveling entry updating/accessing in ubi->lookuptbl should be protected by ubi->wl_lock, fix it by adding ubi->wl_lock to serialize wl entry accessing between wl_entry_destroy() and eraseblk_count_seq_show(). Fetch a reproducer in [Link]. Link: https://bugzilla.kernel.org/show_bug.cgi?id=216305 Fixes: 7bccd12d27b7e3 ("ubi: Add debugfs file for tracking PEB state") Fixes: 801c135ce73d5d ("UBI: Unsorted Block Images") Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
* ubi: fastmap: Fix missed fm_anchor PEB in wear-leveling after disabling fastmapZhihao Cheng2023-02-021-5/+7
| | | | | | | | | | | | | | | | | | | | | | | | After disabling fastmap(ubi->fm_disabled = 1), fastmap won't be updated, fm_anchor PEB is missed being scheduled for erasing. Besides, fm_anchor PEB may have smallest erase count, it doesn't participate wear-leveling. The difference of erase count between fm_anchor PEB and other PEBs will be larger and larger later on. In which situation fastmap can be disabled? Initially, we have an UBI image with fastmap. Then the image will be atttached without module parameter 'fm_autoconvert', ubi turns to full scanning mode in one random attaching process(eg. bad fastmap caused by powercut), ubi fastmap is disabled since then. Fix it by not getting fm_anchor if fastmap is disabled in ubi_refill_pools(). Fetch a reproducer in [Link]. Link: https://bugzilla.kernel.org/show_bug.cgi?id=216341 Fixes: 4b68bf9a69d22d ("ubi: Select fastmap anchor PEBs considering ...") Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
* ubi: Fix permission display of the debugfs filesZhaoLong Wang2023-02-021-9/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some interface files in debugfs support the read method dfs_file_read(), but their rwx permissions is shown as unreadable. In the user mode, the following problem can be clearly seen: # ls -l /sys/kernel/debug/ubi/ubi0/ total 0 --w------- 1 root root 0 Oct 22 16:26 chk_fastmap --w------- 1 root root 0 Oct 22 16:26 chk_gen --w------- 1 root root 0 Oct 22 16:26 chk_io -r-------- 1 root root 0 Oct 22 16:26 detailed_erase_block_info --w------- 1 root root 0 Oct 22 16:26 tst_disable_bgt --w------- 1 root root 0 Oct 22 16:26 tst_emulate_bitflips --w------- 1 root root 0 Oct 22 16:26 tst_emulate_io_failures --w------- 1 root root 0 Oct 22 16:26 tst_emulate_power_cut --w------- 1 root root 0 Oct 22 16:26 tst_emulate_power_cut_max --w------- 1 root root 0 Oct 22 16:26 tst_emulate_power_cut_min It shows that these files do not have read permission 'r', but we can actually read their contents. # echo 1 > /sys/kernel/debug/ubi/ubi0/chk_io # cat /sys/kernel/debug/ubi/ubi0/chk_io 1 User's permission access is determined by capabilities. Of course, the root user is not restricted from reading these files. When reading a debugfs file, the process is as follows: ksys_read() vfs_read() if (file->f_op->read) file->f_op->read() full_proxy_open() real_fops->read() dfs_file_read() -- Read method of debugfs file. else if (file->f_op->read_iter) new_sync_read() else ret = -EINVAL -- Return -EINVAL if no read method. This indicates that the debugfs file can be read as long as the read method of the debugfs file is registered. This patch adds the read permission display for file that support the read method. Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
* ubi: Fix possible null-ptr-deref in ubi_free_volume()Yang Yingliang2023-02-022-6/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | It willl cause null-ptr-deref in the following case: uif_init() ubi_add_volume() cdev_add() -> if it fails, call kill_volumes() device_register() kill_volumes() -> if ubi_add_volume() fails call this function ubi_free_volume() cdev_del() device_unregister() -> trying to delete a not added device, it causes null-ptr-deref So in ubi_free_volume(), it delete devices whether they are added or not, it will causes null-ptr-deref. Handle the error case whlie calling ubi_add_volume() to fix this problem. If add volume fails, set the corresponding vol to null, so it can not be accessed in kill_volumes() and release the resource in ubi_add_volume() error path. Fixes: 801c135ce73d ("UBI: Unsorted Block Images") Suggested-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
* ubi: fastmap: Add fastmap control support for module parameterZhaoLong Wang2023-02-021-4/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The UBI driver can use the IOCTL to disable the fastmap after the mainline 669d204469c4 ("ubi: fastmap: Add fastmap control support for 'UBI_IOCATT' ioctl"). To destroy the fastmap on a old image, we need to reattach the device in user space. However, if the UBI driver build in kernel and the UBI volume is the root partition, the UBI device cannot be reattached in user space. To disable fastmap in this case, the UBI must provide the kernel cmdline parameters to disable fastmap during attach. This patch add 'enable_fm' as 5th module init parameter of mtd=xx to control fastmap enable or not. When the value is 0, fastmap will not create and existed fastmap will destroyed for the given ubi device. Default value is 0. To enable or disable fastmap during module loading, fm_autoconvert must be set to non-zero. +-----------------+---------------+---------------------------+ | \ | enable_fm=0 | enable_fm=1 | +-----------------+---------------+---------------------------+ |fm_autoconvert=Y | disable fm | enable fm | +---------------------------------+---------------------------+ |fm_autoconvert=N | disable fm | Enable fastmap if fastmap | | | | exists on the old image | +-------------------------------------------------------------+ Example: # - Attach mtd1 to ubi1, disable fastmap, mtd2 to ubi2, enable fastmap. # modprobe ubi mtd=1,0,0,1,0 mtd=2,0,0,2,1 fm_autoconvert=1 # - If 5th parameter is not specified, the value is 0, fastmap is disable # modprobe ubi mtd=1 fm_autoconvert=1 Link: https://bugzilla.kernel.org/show_bug.cgi?id=216623 Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
* ubi: Fix unreferenced object reported by kmemleak in ubi_resize_volume()Li Zetao2023-02-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a memory leaks problem reported by kmemleak: unreferenced object 0xffff888102007a00 (size 128): comm "ubirsvol", pid 32090, jiffies 4298464136 (age 2361.231s) hex dump (first 32 bytes): ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................ ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ................ backtrace: [<ffffffff8176cecd>] __kmalloc+0x4d/0x150 [<ffffffffa02a9a36>] ubi_eba_create_table+0x76/0x170 [ubi] [<ffffffffa029764e>] ubi_resize_volume+0x1be/0xbc0 [ubi] [<ffffffffa02a3321>] ubi_cdev_ioctl+0x701/0x1850 [ubi] [<ffffffff81975d2d>] __x64_sys_ioctl+0x11d/0x170 [<ffffffff83c142a5>] do_syscall_64+0x35/0x80 [<ffffffff83e0006a>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 This is due to a mismatch between create and destroy interfaces, and in detail that "new_eba_tbl" created by ubi_eba_create_table() but destroyed by kfree(), while will causing "new_eba_tbl->entries" not freed. Fix it by replacing kfree(new_eba_tbl) with ubi_eba_destroy_table(new_eba_tbl) Fixes: 799dca34ac54 ("UBI: hide EBA internals") Signed-off-by: Li Zetao <lizetao1@huawei.com> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
* ubi: Fix use-after-free when volume resizing failedLi Zetao2023-02-021-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is an use-after-free problem reported by KASAN: ================================================================== BUG: KASAN: use-after-free in ubi_eba_copy_table+0x11f/0x1c0 [ubi] Read of size 8 at addr ffff888101eec008 by task ubirsvol/4735 CPU: 2 PID: 4735 Comm: ubirsvol Not tainted 6.1.0-rc1-00003-g84fa3304a7fc-dirty #14 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x34/0x44 print_report+0x171/0x472 kasan_report+0xad/0x130 ubi_eba_copy_table+0x11f/0x1c0 [ubi] ubi_resize_volume+0x4f9/0xbc0 [ubi] ubi_cdev_ioctl+0x701/0x1850 [ubi] __x64_sys_ioctl+0x11d/0x170 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 </TASK> When ubi_change_vtbl_record() returns an error in ubi_resize_volume(), "new_eba_tbl" will be freed on error handing path, but it is holded by "vol->eba_tbl" in ubi_eba_replace_table(). It means that the liftcycle of "vol->eba_tbl" and "vol" are different, so when resizing volume in next time, it causing an use-after-free fault. Fix it by not freeing "new_eba_tbl" after it replaced in ubi_eba_replace_table(), while will be freed in next volume resizing. Fixes: 801c135ce73d ("UBI: Unsorted Block Images") Signed-off-by: Li Zetao <lizetao1@huawei.com> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
* ubi: block: Reduce warning print to info for static volumesMårten Lindahl2023-02-021-3/+6
| | | | | | | | | | | | | | | If volume size is not multiple of the sector size 512 a warning is printed saying that the last non-sector aligned bytes will be ignored. This should be valid for resizable volumes, but when creating static volumes which are read only this will always be printed even if the unaligned data is deliberate. The message is still valid but the severity should be lowered for static volumes. Signed-off-by: Mårten Lindahl <marten.lindahl@axis.com> Signed-off-by: Richard Weinberger <richard@nod.at>
* ubi: ensure that VID header offset + VID header size <= alloc, sizeGeorge Kennedy2023-02-021-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ensure that the VID header offset + VID header size does not exceed the allocated area to avoid slab OOB. BUG: KASAN: slab-out-of-bounds in crc32_body lib/crc32.c:111 [inline] BUG: KASAN: slab-out-of-bounds in crc32_le_generic lib/crc32.c:179 [inline] BUG: KASAN: slab-out-of-bounds in crc32_le_base+0x58c/0x626 lib/crc32.c:197 Read of size 4 at addr ffff88802bb36f00 by task syz-executor136/1555 CPU: 2 PID: 1555 Comm: syz-executor136 Tainted: G W 6.0.0-1868 #1 Hardware name: Red Hat KVM, BIOS 1.13.0-2.module+el8.3.0+7860+a7792d29 04/01/2014 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x85/0xad lib/dump_stack.c:106 print_address_description mm/kasan/report.c:317 [inline] print_report.cold.13+0xb6/0x6bb mm/kasan/report.c:433 kasan_report+0xa7/0x11b mm/kasan/report.c:495 crc32_body lib/crc32.c:111 [inline] crc32_le_generic lib/crc32.c:179 [inline] crc32_le_base+0x58c/0x626 lib/crc32.c:197 ubi_io_write_vid_hdr+0x1b7/0x472 drivers/mtd/ubi/io.c:1067 create_vtbl+0x4d5/0x9c4 drivers/mtd/ubi/vtbl.c:317 create_empty_lvol drivers/mtd/ubi/vtbl.c:500 [inline] ubi_read_volume_table+0x67b/0x288a drivers/mtd/ubi/vtbl.c:812 ubi_attach+0xf34/0x1603 drivers/mtd/ubi/attach.c:1601 ubi_attach_mtd_dev+0x6f3/0x185e drivers/mtd/ubi/build.c:965 ctrl_cdev_ioctl+0x2db/0x347 drivers/mtd/ubi/cdev.c:1043 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:870 [inline] __se_sys_ioctl fs/ioctl.c:856 [inline] __x64_sys_ioctl+0x193/0x213 fs/ioctl.c:856 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3e/0x86 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0x0 RIP: 0033:0x7f96d5cf753d Code: RSP: 002b:00007fffd72206f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f96d5cf753d RDX: 0000000020000080 RSI: 0000000040186f40 RDI: 0000000000000003 RBP: 0000000000400cd0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000400be0 R13: 00007fffd72207e0 R14: 0000000000000000 R15: 0000000000000000 </TASK> Allocated by task 1555: kasan_save_stack+0x20/0x3d mm/kasan/common.c:38 kasan_set_track mm/kasan/common.c:45 [inline] set_alloc_info mm/kasan/common.c:437 [inline] ____kasan_kmalloc mm/kasan/common.c:516 [inline] __kasan_kmalloc+0x88/0xa3 mm/kasan/common.c:525 kasan_kmalloc include/linux/kasan.h:234 [inline] __kmalloc+0x138/0x257 mm/slub.c:4429 kmalloc include/linux/slab.h:605 [inline] ubi_alloc_vid_buf drivers/mtd/ubi/ubi.h:1093 [inline] create_vtbl+0xcc/0x9c4 drivers/mtd/ubi/vtbl.c:295 create_empty_lvol drivers/mtd/ubi/vtbl.c:500 [inline] ubi_read_volume_table+0x67b/0x288a drivers/mtd/ubi/vtbl.c:812 ubi_attach+0xf34/0x1603 drivers/mtd/ubi/attach.c:1601 ubi_attach_mtd_dev+0x6f3/0x185e drivers/mtd/ubi/build.c:965 ctrl_cdev_ioctl+0x2db/0x347 drivers/mtd/ubi/cdev.c:1043 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:870 [inline] __se_sys_ioctl fs/ioctl.c:856 [inline] __x64_sys_ioctl+0x193/0x213 fs/ioctl.c:856 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3e/0x86 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0x0 The buggy address belongs to the object at ffff88802bb36e00 which belongs to the cache kmalloc-256 of size 256 The buggy address is located 0 bytes to the right of 256-byte region [ffff88802bb36e00, ffff88802bb36f00) The buggy address belongs to the physical page: page:00000000ea4d1263 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x2bb36 head:00000000ea4d1263 order:1 compound_mapcount:0 compound_pincount:0 flags: 0xfffffc0010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff) raw: 000fffffc0010200 ffffea000066c300 dead000000000003 ffff888100042b40 raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88802bb36e00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff88802bb36e80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff88802bb36f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ^ ffff88802bb36f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88802bb37000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== Fixes: 801c135ce73d ("UBI: Unsorted Block Images") Reported-by: syzkaller <syzkaller@googlegroups.com> Signed-off-by: George Kennedy <george.kennedy@oracle.com> Signed-off-by: Richard Weinberger <richard@nod.at>
* treewide: use get_random_u32_below() instead of deprecated functionJason A. Donenfeld2022-11-182-4/+4
| | | | | | | | | | | | | | | | | | | | This is a simple mechanical transformation done by: @@ expression E; @@ - prandom_u32_max + get_random_u32_below (E) Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs Reviewed-by: SeongJae Park <sj@kernel.org> # for damon Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> # for infiniband Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> # for arm Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* Merge tag 'random-6.1-rc1-for-linus' of ↵Linus Torvalds2022-10-162-4/+4
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull more random number generator updates from Jason Donenfeld: "This time with some large scale treewide cleanups. The intent of this pull is to clean up the way callers fetch random integers. The current rules for doing this right are: - If you want a secure or an insecure random u64, use get_random_u64() - If you want a secure or an insecure random u32, use get_random_u32() The old function prandom_u32() has been deprecated for a while now and is just a wrapper around get_random_u32(). Same for get_random_int(). - If you want a secure or an insecure random u16, use get_random_u16() - If you want a secure or an insecure random u8, use get_random_u8() - If you want secure or insecure random bytes, use get_random_bytes(). The old function prandom_bytes() has been deprecated for a while now and has long been a wrapper around get_random_bytes() - If you want a non-uniform random u32, u16, or u8 bounded by a certain open interval maximum, use prandom_u32_max() I say "non-uniform", because it doesn't do any rejection sampling or divisions. Hence, it stays within the prandom_*() namespace, not the get_random_*() namespace. I'm currently investigating a "uniform" function for 6.2. We'll see what comes of that. By applying these rules uniformly, we get several benefits: - By using prandom_u32_max() with an upper-bound that the compiler can prove at compile-time is ≤65536 or ≤256, internally get_random_u16() or get_random_u8() is used, which wastes fewer batched random bytes, and hence has higher throughput. - By using prandom_u32_max() instead of %, when the upper-bound is not a constant, division is still avoided, because prandom_u32_max() uses a faster multiplication-based trick instead. - By using get_random_u16() or get_random_u8() in cases where the return value is intended to indeed be a u16 or a u8, we waste fewer batched random bytes, and hence have higher throughput. This series was originally done by hand while I was on an airplane without Internet. Later, Kees and I worked on retroactively figuring out what could be done with Coccinelle and what had to be done manually, and then we split things up based on that. So while this touches a lot of files, the actual amount of code that's hand fiddled is comfortably small" * tag 'random-6.1-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: prandom: remove unused functions treewide: use get_random_bytes() when possible treewide: use get_random_u32() when possible treewide: use get_random_{u8,u16}() when possible, part 2 treewide: use get_random_{u8,u16}() when possible, part 1 treewide: use prandom_u32_max() when possible, part 2 treewide: use prandom_u32_max() when possible, part 1
| * treewide: use prandom_u32_max() when possible, part 1Jason A. Donenfeld2022-10-112-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rather than incurring a division or requesting too many random bytes for the given range, use the prandom_u32_max() function, which only takes the minimum required bytes from the RNG and avoids divisions. This was done mechanically with this coccinelle script: @basic@ expression E; type T; identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32"; typedef u64; @@ ( - ((T)get_random_u32() % (E)) + prandom_u32_max(E) | - ((T)get_random_u32() & ((E) - 1)) + prandom_u32_max(E * XXX_MAKE_SURE_E_IS_POW2) | - ((u64)(E) * get_random_u32() >> 32) + prandom_u32_max(E) | - ((T)get_random_u32() & ~PAGE_MASK) + prandom_u32_max(PAGE_SIZE) ) @multi_line@ identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32"; identifier RAND; expression E; @@ - RAND = get_random_u32(); ... when != RAND - RAND %= (E); + RAND = prandom_u32_max(E); // Find a potential literal @literal_mask@ expression LITERAL; type T; identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32"; position p; @@ ((T)get_random_u32()@p & (LITERAL)) // Add one to the literal. @script:python add_one@ literal << literal_mask.LITERAL; RESULT; @@ value = None if literal.startswith('0x'): value = int(literal, 16) elif literal[0] in '123456789': value = int(literal, 10) if value is None: print("I don't know how to handle %s" % (literal)) cocci.include_match(False) elif value == 2**32 - 1 or value == 2**31 - 1 or value == 2**24 - 1 or value == 2**16 - 1 or value == 2**8 - 1: print("Skipping 0x%x for cleanup elsewhere" % (value)) cocci.include_match(False) elif value & (value + 1) != 0: print("Skipping 0x%x because it's not a power of two minus one" % (value)) cocci.include_match(False) elif literal.startswith('0x'): coccinelle.RESULT = cocci.make_expr("0x%x" % (value + 1)) else: coccinelle.RESULT = cocci.make_expr("%d" % (value + 1)) // Replace the literal mask with the calculated result. @plus_one@ expression literal_mask.LITERAL; position literal_mask.p; expression add_one.RESULT; identifier FUNC; @@ - (FUNC()@p & (LITERAL)) + prandom_u32_max(RESULT) @collapse_ret@ type T; identifier VAR; expression E; @@ { - T VAR; - VAR = (E); - return VAR; + return E; } @drop_var@ type T; identifier VAR; @@ { - T VAR; ... when != VAR } Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Yury Norov <yury.norov@gmail.com> Reviewed-by: KP Singh <kpsingh@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> # for ext4 and sbitmap Reviewed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> # for drbd Acked-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Heiko Carstens <hca@linux.ibm.com> # for s390 Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* | ubi: fastmap: Add fastmap control support for 'UBI_IOCATT' ioctlZhihao Cheng2022-09-213-6/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [1] suggests that fastmap is suitable for large flash devices. Module parameter 'fm_autoconvert' is a coarse grained switch to enable all ubi devices to generate fastmap, which may turn on fastmap even for small flash devices. This patch imports a new field 'disable_fm' in struct 'ubi_attach_req' to support following situations by ioctl 'UBI_IOCATT'. [old functions] A. Disable 'fm_autoconvert': Disbable fastmap for all ubi devices B. Enable 'fm_autoconvert': Enable fastmap for all ubi devices [new function] C. Enable 'fm_autoconvert', set 'disable_fm' for given device: Don't create new fastmap and do full scan (existed fastmap will be destroyed) for the given ubi device. A simple test case in [2]. [1] http://www.linux-mtd.infradead.org/doc/ubi.html#L_fastmap [2] https://bugzilla.kernel.org/show_bug.cgi?id=216278 Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
* | ubi: fastmap: Use the bitmap API to allocate bitmapsChristophe JAILLET2022-09-211-6/+4
| | | | | | | | | | | | | | | | | | | | Use bitmap_zalloc()/bitmap_free() instead of hand-writing them. It is less verbose and it improves the semantic. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>
* | mtd: ubi: drop unexpected word 'a' in commentsJiang Jian2022-09-211-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | there is an unexpected word 'a' in the comments that need to be dropped file - drivers/mtd/ubi/vmt.c line - 626,779 * Returns zero if volume is all right and a a negative error code if not. changed to: * Returns zero if volume is all right and a negative error code if not. Signed-off-by: Jiang Jian <jiangjian@cdjrlc.com> Signed-off-by: Richard Weinberger <richard@nod.at>
* | ubi: block: Fix typos in commentsJulia Lawall2022-09-211-1/+1
| | | | | | | | | | | | | | | | Various spelling mistakes in comments. Detected with the help of Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by: Richard Weinberger <richard@nod.at>
* | ubi: fastmap: Fix typo in commentsZhang Jiaming2022-09-211-1/+1
| | | | | | | | | | | | | | | | | | There are a typo(dont't) in comments. Fix it. Signed-off-by: Zhang Jiaming <jiaming@nfschina.com> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at>