path: root/drivers/md/bitmap.c
Commit message | Author | Age | Files | Lines
* sysfs: clean up sysfs_get_dirent() | Tejun Heo | 2013-09-26 | 1 | -2/+2
  The pre-existing sysfs interfaces which take an explicit namespace argument are weird in that they place the optional @ns in front of @name, which is contrary to the established convention. For example, we end up forcing the vast majority of sysfs_get_dirent() users to do sysfs_get_dirent(parent, NULL, name), which is silly and error-prone, especially as @ns and @name may be interchanged without causing a compilation warning.

  This renames sysfs_get_dirent() to sysfs_get_dirent_ns() and swaps the positions of @name and @ns; sysfs_get_dirent() is now a wrapper around sysfs_get_dirent_ns(). This makes confusion much less likely.

  There are other interfaces which take @ns before @name. They'll be updated by following patches. This patch doesn't introduce any functional changes.

  v2: EXPORT_SYMBOL_GPL() wasn't updated, leading to an undefined symbol error on module builds. Reported by build test robot. Fixed.

  Signed-off-by: Tejun Heo <tj@kernel.org>
  Cc: Eric W. Biederman <ebiederm@xmission.com>
  Cc: Kay Sievers <kay@vrfy.org>
  Cc: Fengguang Wu <fengguang.wu@intel.com>
  Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
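The argument reordering described in this commit can be illustrated with a small userspace sketch. Everything here is hypothetical: `struct dirent_sketch`, `get_dirent_ns()` and `get_dirent()` are stand-ins invented for illustration, and the real sysfs functions also take a parent argument that is omitted here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a sysfs directory entry; not the kernel's struct. */
struct dirent_sketch {
    const char *name;
    const void *ns;   /* namespace tag, NULL when the entry has none */
};

static struct dirent_sketch entries[] = {
    { "bitmap", NULL },
    { "sync_action", NULL },
};

/* Namespace-aware lookup: @name comes first, the optional @ns comes last. */
static struct dirent_sketch *get_dirent_ns(const char *name, const void *ns)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof(entries) / sizeof(entries[0]); i++)
        if (entries[i].ns == ns && strcmp(entries[i].name, name) == 0)
            return &entries[i];
    return NULL;
}

/* Common case: no namespace, so callers never spell out a bare NULL. */
static struct dirent_sketch *get_dirent(const char *name)
{
    return get_dirent_ns(name, NULL);
}
```

With the wrapper in place, most callers write `get_dirent("bitmap")` and there is no second pointer argument to swap by accident.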
* md: replace strict_strto*() with kstrto*() | Jingoo Han | 2013-06-14 | 1 | -4/+4
  The usage of strict_strtoul() is not preferred, because strict_strtoul() is obsolete. Thus, kstrtoul() should be used.
  Signed-off-by: Jingoo Han <jg1.han@samsung.com>
  Signed-off-by: NeilBrown <neilb@suse.de>
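For readers outside the kernel, the idea of strict numeric parsing can be approximated in userspace. The sketch below is an assumption-laden analogue built on the standard `strtoul()`; the helper name `parse_ulong_strict()` is hypothetical and the behaviour is only meant to resemble, not reproduce, what kstrtoul() does.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Userspace approximation of strict parsing: returns 0 on success,
 * -EINVAL for empty input, a leading '-' or whitespace, or trailing junk,
 * and -ERANGE on overflow. One trailing newline is tolerated, since
 * values written through sysfs usually carry one.
 */
static int parse_ulong_strict(const char *s, int base, unsigned long *res)
{
    char *end;
    unsigned long val;

    assert(s != NULL && res != NULL);
    if (*s == '-' || isspace((unsigned char)*s))
        return -EINVAL;

    errno = 0;
    val = strtoul(s, &end, base);
    if (end == s)
        return -EINVAL;          /* no digits were consumed */
    if (errno == ERANGE)
        return -ERANGE;          /* value does not fit in unsigned long */
    if (*end == '\n')
        end++;
    if (*end != '\0')
        return -EINVAL;          /* trailing characters after the number */

    *res = val;
    return 0;
}
```

The point of the strict form is that input such as "128k" is rejected outright instead of being silently read as 128.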
* md: use set_bit_le and clear_bit_le | Akinobu Mita | 2013-04-24 | 1 | -2/+2
  The value returned by test_and_set_bit_le() in drivers/md/bitmap.c is not used, so just use set_bit_le(). The same goes for test_and_clear_bit_le().
  Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
  Cc: Neil Brown <neilb@suse.de>
  Cc: linux-raid@vger.kernel.org
  Signed-off-by: NeilBrown <neilb@suse.de>
* new helper: file_inode(file) | Al Viro | 2013-02-22 | 1 | -2/+2
  Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* md/bitmap: Don't use IS_ERR to judge alloc_page(). | Jianpeng Ma | 2012-10-11 | 1 | -6/+2
  Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
  Signed-off-by: NeilBrown <neilb@suse.de>
* raid: replace list_for_each_continue_rcu with new interface | Michael Wang | 2012-10-11 | 1 | -6/+3
  This patch replaces list_for_each_continue_rcu() with list_for_each_entry_continue_rcu() to save a few lines of code and allow removing list_for_each_continue_rcu().
  Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
  Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com>
  Signed-off-by: NeilBrown <neilb@suse.de>
* md/raid1: submit IO from originating thread instead of md thread. | NeilBrown | 2012-08-02 | 1 | -1/+1
  Queuing writes to the md thread means that all requests go through the one processor, which may not be able to keep up with very high request rates. So use the plugging infrastructure to submit all requests on unplug. If a 'schedule' is needed, we fall back on the old approach of handing the requests to the thread for it to handle.
  Signed-off-by: NeilBrown <neilb@suse.de>
* md/bitmap: record the space available for the bitmap in the superblock. | NeilBrown | 2012-05-22 | 1 | -0/+7
  Now that bitmaps can grow and shrink, it is best if we record how much space is available. This means that when we reduce the size of the bitmap we won't "lose" the space for later, when we might want to increase the size of the bitmap again.
  Signed-off-by: NeilBrown <neilb@suse.de>
* md/bitmap: make sure reshape requests are reflected in superblock. | NeilBrown | 2012-05-22 | 1 | -0/+3
  As a reshape may change the sync_size and/or chunk_size, we need to update these whenever we write out the bitmap superblock.
  Signed-off-by: NeilBrown <neilb@suse.de>
* md/bitmap: add bitmap_resize function to allow bitmap resizing. | NeilBrown | 2012-05-22 | 1 | -30/+169
  This function will allocate the new data structures and copy bits across from old to new, allowing for the possibility that the chunksize has changed.

  Use the same function for performing the initial allocation of the structures. This improves test coverage.

  When bitmap_resize is used to resize an existing bitmap, it only copies '1' bits in, not '0' bits. So when allocating the bitmap, ensure everything is initialised to ZERO.

  Signed-off-by: NeilBrown <neilb@suse.de>
* md/bitmap: use DIV_ROUND_UP instead of open-code | NeilBrown | 2012-05-22 | 1 | -3/+2
  Also take the opportunity to simplify CHUNK_BLOCK_RATIO.
  Signed-off-by: NeilBrown <neilb@suse.de>
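As a rough illustration of what replacing an open-coded round-up with DIV_ROUND_UP looks like, here is a userspace sketch. The macro is defined locally with the same shape as the kernel's; the helper names and the "pages from bytes" scenario are assumptions made for the example, not taken from the patch.

```c
#include <assert.h>

/* Same shape as the kernel macro, defined locally for userspace use. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Open-coded round-up of the general kind such a cleanup replaces. */
static unsigned long pages_open_coded(unsigned long bytes, unsigned long page_size)
{
    unsigned long pages = bytes / page_size;

    if (bytes % page_size)
        pages++;
    return pages;
}

/* The same computation expressed with the macro. */
static unsigned long pages_rounded(unsigned long bytes, unsigned long page_size)
{
    assert(page_size != 0);
    return DIV_ROUND_UP(bytes, page_size);
}
```

Both helpers agree for every input where `bytes + page_size - 1` does not overflow; the macro form is simply shorter and harder to get subtly wrong.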
* md/bitmap: create a 'struct bitmap_counts' substructure of 'struct bitmap' | NeilBrown | 2012-05-22 | 1 | -67/+71
  The new "struct bitmap_counts" contains all the fields that are related to counting the number of active writes in each bitmap chunk. Having this separate will make it easier to change the chunksize or overall size of a bitmap atomically.
  Signed-off-by: NeilBrown <neilb@suse.de>
* md/bitmap: make bitmap bitops atomic. | NeilBrown | 2012-05-22 | 1 | -4/+2
  This allows us to remove spinlock protection which is more heavy-weight than simple atomics.
  Signed-off-by: NeilBrown <neilb@suse.de>
* md/bitmap: make _page_attr bitops atomic. | NeilBrown | 2012-05-22 | 1 | -32/+23
  Using e.g. set_bit instead of __set_bit, and using test_and_clear_bit, allows us to remove some locking and contract other locked ranges. It is rare that we set or clear a lot of these bits, so the gain should outweigh any cost.
  Signed-off-by: NeilBrown <neilb@suse.de>
* md/bitmap: merge bitmap_file_unmap and bitmap_file_put. | NeilBrown | 2012-05-22 | 1 | -24/+10
  These functions really do one thing together: release the 'bitmap_storage'. So make them just one function. Since we removed the locking (previous patch), we don't need to zero any fields before freeing them, so it all becomes a bit simpler.
  Signed-off-by: NeilBrown <neilb@suse.de>
* md/bitmap: remove async freeing of bitmap file. | NeilBrown | 2012-05-22 | 1 | -12/+6
  There is no real value in freeing things the moment there is an error. It is just as good to free the bitmap file and pages when the bitmap is explicitly removed (and replaced?) or at shutdown.

  With this gone, the bitmap will only disappear when the array is quiescent, so we can remove some locking.

  As the 'filemap' doesn't disappear now, include extra checks before trying to write any of it out. Also remove the check for "has it disappeared" in bitmap_daemon_write().

  Signed-off-by: NeilBrown <neilb@suse.de>
* md/bitmap: convert some spin_lock_irqsave to spin_lock_irq | NeilBrown | 2012-05-22 | 1 | -18/+14
  All of these sites can only be called from process context with irqs enabled, so using irqsave/irqrestore just adds noise. Remove it.
  Signed-off-by: NeilBrown <neilb@suse.de>
* md/bitmap: use set_bit, test_bit, etc for operation on bitmap->flags. | NeilBrown | 2012-05-22 | 1 | -25/+21
  We currently use '&' and '|' which isn't the norm in the kernel and doesn't allow easy atomicity. So change to bit numbers and {set,clear,test}_bit.
  This allows us to remove a spinlock/unlock (which was dubious anyway) and some other simplifications.
  Signed-off-by: NeilBrown <neilb@suse.de>
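The move from mask-based '&' and '|' to bit numbers with atomic helpers can be sketched in userspace with C11 atomics. This is only an analogue: the kernel's set_bit()/clear_bit()/test_bit() are not used here, and the flag names and `flag_*` helpers are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Flags are identified by bit number, not by mask. Names are illustrative. */
enum sketch_flag { SKETCH_STALE = 0, SKETCH_WRITE_ERROR = 1, SKETCH_HOSTENDIAN = 2 };

static unsigned long bit_mask(unsigned int nr)
{
    assert(nr < sizeof(unsigned long) * CHAR_BIT);
    return 1UL << nr;
}

/* Atomically set one bit; no external lock is needed. */
static void flag_set(atomic_ulong *flags, unsigned int nr)
{
    atomic_fetch_or(flags, bit_mask(nr));
}

/* Atomically clear one bit. */
static void flag_clear(atomic_ulong *flags, unsigned int nr)
{
    atomic_fetch_and(flags, ~bit_mask(nr));
}

/* Read one bit. */
static bool flag_test(atomic_ulong *flags, unsigned int nr)
{
    return (atomic_load(flags) & bit_mask(nr)) != 0;
}

/* Clear one bit and report whether it was set beforehand. */
static bool flag_test_and_clear(atomic_ulong *flags, unsigned int nr)
{
    unsigned long old = atomic_fetch_and(flags, ~bit_mask(nr));

    return (old & bit_mask(nr)) != 0;
}
```

Because each helper is a single atomic read-modify-write, two threads updating different flags in the same word cannot lose each other's update, which is the property a plain `flags |= MASK` lacks without a lock.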
* md/bitmap: remove single-bit manipulation on sb->state | NeilBrown | 2012-05-22 | 1 | -2/+2
  Just do single-bit manipulations on bitmap->flags and copy the whole value between that and sb->state.

  This will allow the next patch, which changes how bit manipulations are performed on bitmap->flags.

  This does result in BITMAP_STALE not being set in sb by bitmap_read_sb; however, as the setting is determined by other information in the 'sb', we do not lose information this way. Normally, bitmap_load will be called shortly, which will clear BITMAP_STALE anyway.

  Signed-off-by: NeilBrown <neilb@suse.de>
* md/bitmap: remove bitmap_mask_state | NeilBrown | 2012-05-22 | 1 | -34/+3
  This function isn't really needed. It sets or clears a flag in both bitmap->flags and sb->state. However both times it is called, bitmap_update_sb is called soon afterwards which copies bitmap->flags to sb->state. So just make changes to bitmap->flags, and open-code those rather than hiding in a function.
  Signed-off-by: NeilBrown <neilb@suse.de>
* md/bitmap: move storage allocation from bitmap_load to bitmap_create. | NeilBrown | 2012-05-22 | 1 | -5/+6
  We should allocate memory for the storage-bitmap at create-time, not load time.
  Signed-off-by: NeilBrown <neilb@suse.de>
* md/bitmap: separate bitmap file allocation to its own function. | NeilBrown | 2012-05-22 | 1 | -46/+67
  This will allow allocation before swapping in a new bitmap.
  Signed-off-by: NeilBrown <neilb@suse.de>
* md/bitmap: store bytes in file rather than just in last page. | NeilBrown | 2012-05-22 | 1 | -7/+9
  This number is more generally useful, and bytes-in-last-page is easily extracted from it.
  Signed-off-by: NeilBrown <neilb@suse.de>
* md/bitmap: move some fields of 'struct bitmap' into a 'storage' substruct. | NeilBrown | 2012-05-22 | 1 | -86/+94
  This new 'struct bitmap_storage' reflects the external storage of the bitmap. Having this clearly defined will make it easier to change the storage used while the array is active.
  Signed-off-by: NeilBrown <neilb@suse.de>
* md/bitmap: change *_page_attr() to take a page number, not a page. | NeilBrown | 2012-05-22 | 1 | -29/+26
  Most often we have the page number, not the page. And that is what the *_page_attr() functions really want. So change the arguments to take that number.
  Signed-off-by: NeilBrown <neilb@suse.de>
* md/bitmap: centralise allocation of bitmap file pages. | NeilBrown | 2012-05-22 | 1 | -81/+68
  Instead of allocating pages in read_sb_page, read_page and bitmap_read_sb, allocate them all in bitmap_init_from_disk.

  Also replace the hack of calling "attach_page_buffers(page, NULL)" to ensure that free_buffer() won't complain, by putting a test for PagePrivate in free_buffer().

  Signed-off-by: NeilBrown <neilb@suse.de>
* md/bitmap: allow a bitmap with no backing storage. | NeilBrown | 2012-05-22 | 1 | -60/+76
  An md bitmap comprises two parts:
  - internal counting of active writes per 'chunk'.
  - external storage of whether there are any active writes on each chunk.

  The second requires the first, but the first doesn't require the second.

  Not having backing storage means that the bitmap cannot expedite resync after a crash, but it still allows us to expedite the recovery of a recently-removed device.

  So: allow a bitmap to exist even if there is no backing device. In that case we default to 128M chunks.

  A particular value of this is that we can remove and re-add a bitmap (possibly of a different granularity) on a degraded array, and not lose the information needed to fast-recover the missing device.

  We don't actually activate these bitmaps yet - that will come in a later patch.

  Signed-off-by: NeilBrown <neilb@suse.de>
* md/bitmap: add new 'space' attribute for bitmaps. | NeilBrown | 2012-05-22 | 1 | -0/+39
  If we are to allow bitmaps to be resized when the array is resized, we need to know how much space there is.

  So create an attribute to store this information and set appropriate defaults.

  It can be set more precisely via sysfs, or future metadata extensions may allow it to be recorded.

  Signed-off-by: NeilBrown <neilb@suse.de>
* md/bitmap: disentangle two different 'pending' flags. | NeilBrown | 2012-05-22 | 1 | -101/+112
  There are two different 'pending' concepts in the handling of the write intent bitmap.

  Firstly, a 'page' from the bitmap (which contains PAGE_SIZE*8 bits) may have changes (bits cleared) that should be written in due course. There is no hurry for these and the page will transition from PENDING to NEEDWRITE and will then be written, though if it ever becomes DIRTY it will be written much sooner and PENDING will be cleared.

  Secondly, a page of counters - which contains PAGE_SIZE/2 counters, one for each bit - can usefully have a 'pending' flag which indicates if any of the counters are low (2 or 1) and ready to be processed by bitmap_daemon_work(). If this flag is clear we can skip the whole page.

  These two concepts are currently combined in the bitmap-file flag. This causes a tighter connection between the counters and the bitmap file than I would like - as I want to add some flexibility to the bitmap file.

  So introduce a new flag with the page-of-counters, and rewrite bitmap_daemon_work() so that it handles the two different 'pending' concepts separately.

  This also allows us to clear BITMAP_PAGE_PENDING when we write out a dirty page, which may occasionally reduce the number of times we write a page.

  Signed-off-by: NeilBrown <neilb@suse.de>
* md/bitmap: fix calculation of 'chunks' - missing shift. | NeilBrown | 2012-05-04 | 1 | -2/+1
  commit 61a0d80c "md/bitmap: discard CHUNK_BLOCK_SHIFT macro" replaced CHUNK_BLOCK_RATIO() by the same text that was replacing CHUNK_BLOCK_SHIFT() - which is clearly wrong.

  The result is that 'chunks' is often too small by 1, which can sometimes result in a crash (not sure how).

  So use the correct replacement, and get rid of CHUNK_BLOCK_RATIO which is no longer used.

  Reported-by: Karl Newman <siliconfiend@gmail.com>
  Tested-by: Karl Newman <siliconfiend@gmail.com>
  Signed-off-by: NeilBrown <neilb@suse.de>
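The "too small by 1" symptom is what a truncating shift produces when the size is not a whole number of chunks. The sketch below contrasts a truncating and a rounding-up chunk count, assuming (as the referenced commit describes) that chunkshift is a sectors-to-chunks shift. The function names are hypothetical and this is not the patched kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the chunk that covers a given sector. */
static uint64_t sector_to_chunk(uint64_t sector, unsigned int chunkshift)
{
    assert(chunkshift < 64);
    return sector >> chunkshift;
}

/* Truncating count: one chunk short whenever a partial chunk remains. */
static uint64_t chunks_truncated(uint64_t sectors, unsigned int chunkshift)
{
    assert(chunkshift < 64);
    return sectors >> chunkshift;
}

/* Rounding count: the trailing partial chunk gets its own slot. */
static uint64_t chunks_needed(uint64_t sectors, unsigned int chunkshift)
{
    uint64_t per_chunk;

    assert(chunkshift < 64);
    per_chunk = UINT64_C(1) << chunkshift;
    return (sectors + per_chunk - 1) >> chunkshift;
}
```

With 1000 sectors and a chunkshift of 7 (128 sectors per chunk), the last sector falls in chunk index 7, so 8 chunks are required; the truncating form yields only 7, and an access to the last chunk would then land outside the allocated range.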
* md/bitmap: prevent bitmap_daemon_work running while initialising bitmap | NeilBrown | 2012-04-12 | 1 | -0/+2
  If a bitmap is added while the array is active, it is possible for bitmap_daemon_work to run while the bitmap is being initialised.

  This is particularly a problem if bitmap_daemon_work sees bitmap->filemap as non-NULL before it has been filled in properly. So hold bitmap_info.mutex while filling in ->filemap to prevent problems.

  This patch is suitable for any -stable kernel, though it might not apply cleanly before about 3.1.

  Cc: stable@vger.kernel.org
  Signed-off-by: NeilBrown <neilb@suse.de>
* MD: Bitmap version cleanup. | Andrei Warkentin | 2012-04-12 | 1 | -3/+0
  bitmap_new_disk_sb() would still create V3 bitmap superblock with host-endian layout.

  Perhaps I'm confused, but shouldn't bitmap_new_disk_sb() be creating a V4 bitmap superblock instead, that is portable, as per comment in bitmap.h?

  Signed-off-by: Andrei Warkentin <andrey.warkentin@gmail.com>
  Signed-off-by: NeilBrown <neilb@suse.de>
* Merge tag 'md-3.4' of git://neil.brown.name/md | Linus Torvalds | 2012-03-22 | 1 | -77/+75
|\
| | Pull md updates for 3.4 from Neil Brown:
| |  "Mostly tidying up code in preparation for some bigger changes next time.
| |   A few bug fixes tagged for -stable.
| |   Main functionality change is that some RAID10 arrays can now grow to use extra space that may have been made available on the individual devices."
| |
| | Fixed up trivial conflicts with the k[un]map_atomic() cleanups in drivers/md/bitmap.c.
| |
| | * tag 'md-3.4' of git://neil.brown.name/md: (22 commits)
| |   md: Add judgement bb->unacked_exist in function md_ack_all_badblocks().
| |   md: fix clearing of the 'changed' flags for the bad blocks list.
| |   md/bitmap: discard CHUNK_BLOCK_SHIFT macro
| |   md/bitmap: remove unnecessary indirection when allocating.
| |   md/bitmap: remove some pointless locking.
| |   md/bitmap: change a 'goto' to a normal 'if' construct.
| |   md/bitmap: move printing of bitmap status to bitmap.c
| |   md/bitmap: remove some unused noise from bitmap.h
| |   md/raid10 - support resizing some RAID10 arrays.
| |   md/raid1: handle merge_bvec_fn in member devices.
| |   md/raid10: handle merge_bvec_fn in member devices.
| |   md: add proper merge_bvec handling to RAID0 and Linear.
| |   md: tidy up rdev_for_each usage.
| |   md/raid1,raid10: avoid deadlock during resync/recovery.
| |   md/bitmap: ensure to load bitmap when creating via sysfs.
| |   md: don't set md arrays to readonly on shutdown.
| |   md: allow re-add to failed arrays.
| |   md/raid5: use atomic_dec_return() instead of atomic_dec() and atomic_read().
| |   md: Use existed macros instead of numbers
| |   md/raid5: removed unused 'added_devices' variable.
| |   ...
| * md/bitmap: discard CHUNK_BLOCK_SHIFT macro | NeilBrown | 2012-03-19 | 1 | -17/+18
|   By redefining ->chunkshift as the shift from sectors to chunks rather than bytes to chunks, we can just use "bitmap->chunkshift" which is shorter than the macro call, and less indirect.
|   Signed-off-by: NeilBrown <neilb@suse.de>
| * md/bitmap: remove unnecessary indirection when allocating. | NeilBrown | 2012-03-19 | 1 | -28/+3
|   These functions don't add anything useful except possibly the trace points, and I don't think they are worth the extra indirection. So remove them.
|   Signed-off-by: NeilBrown <neilb@suse.de>
| * md/bitmap: remove some pointless locking. | NeilBrown | 2012-03-19 | 1 | -12/+2
|   There is nothing gained by holding a lock while we check if a pointer is NULL or not. If there could be a race, then it could become NULL immediately after the unlock - but there is no race here.
|   So just remove the locking.
|   Signed-off-by: NeilBrown <neilb@suse.de>
| * md/bitmap: change a 'goto' to a normal 'if' construct. | NeilBrown | 2012-03-19 | 1 | -19/+21
|   The use of a goto makes the control flow more obscure here.
|   So make it a normal:
|     if (x) { Y; }
|   No functional change.
|   Signed-off-by: NeilBrown <neilb@suse.de>
| * md/bitmap: move printing of bitmap status to bitmap.c | NeilBrown | 2012-03-19 | 1 | -0/+28
|   The part of /proc/mdstat which describes the bitmap should really be generated by code in bitmap.c. So move it there.
|   Signed-off-by: NeilBrown <neilb@suse.de>
| * md: tidy up rdev_for_each usage. | NeilBrown | 2012-03-19 | 1 | -1/+1
|   md.h has an 'rdev_for_each()' macro for iterating the rdevs in an mddev. However it uses the 'safe' version of list_for_each_entry, and so requires the extra variable, but doesn't include 'safe' in the name, which is useful documentation.
|   Consequently some places use this safe version without needing it, and many use an explicit list_for_each_entry.
|   So:
|   - rename rdev_for_each to rdev_for_each_safe
|   - create a new rdev_for_each which uses the plain list_for_each_entry,
|   - use the 'safe' version only where needed, and convert all other list_for_each_entry calls to use rdev_for_each.
|   Signed-off-by: NeilBrown <neilb@suse.de>
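The difference between a plain and a 'safe' iterator, and why the safe one needs an extra variable, can be shown with a minimal userspace list. The macros and `struct node` below are hypothetical analogues written for this sketch; they are not the kernel's list_for_each_entry family.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct node {
    int id;
    struct node *next;
};

/* Plain iteration: the loop body must not free or unlink `pos`. */
#define node_for_each(pos, head) \
    for ((pos) = (head); (pos) != NULL; (pos) = (pos)->next)

/* 'Safe' iteration: `tmp` caches the successor, so the body may free `pos`. */
#define node_for_each_safe(pos, tmp, head) \
    for ((pos) = (head); \
         (pos) != NULL && ((tmp) = (pos)->next, 1); \
         (pos) = (tmp))

/* Prepend a node and return the new head. */
static struct node *push(struct node *head, int id)
{
    struct node *n = malloc(sizeof(*n));

    assert(n != NULL);
    n->id = id;
    n->next = head;
    return n;
}

/* Read-only walk: the plain iterator is sufficient. */
static int count_nodes(struct node *head)
{
    struct node *pos;
    int n = 0;

    node_for_each(pos, head)
        n++;
    return n;
}

/* Destructive walk: requires the safe iterator. Returns the number freed. */
static int free_all(struct node *head)
{
    struct node *pos, *tmp;
    int n = 0;

    node_for_each_safe(pos, tmp, head) {
        free(pos);
        n++;
    }
    return n;
}
```

Using the plain iterator in `free_all()` would read `pos->next` after `pos` has been freed, which is exactly the hazard the 'safe' name is meant to document.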
| * md/bitmap: ensure to load bitmap when creating via sysfs. | NeilBrown | 2012-03-19 | 1 | -0/+2
|   When commit 69e51b449d383e (md/bitmap: separate out loading a bitmap...) created bitmap_load, it missed calling it after bitmap_create when a bitmap is created through the sysfs interface.
|   So if a bitmap is added this way, we don't allocate memory properly and can crash.
|   This is suitable for any -stable release since 2.6.35.
|   Cc: stable@vger.kernel.org
|   Signed-off-by: NeilBrown <neilb@suse.de>
* | md: remove the second argument of k[un]map_atomic() | Cong Wang | 2012-03-20 | 1 | -21/+21
|/
  Acked-by: NeilBrown <neilb@suse.de>
  Signed-off-by: Cong Wang <amwang@redhat.com>
* md/bitmap: be more consistent when setting new bits in memory bitmap. | NeilBrown | 2011-12-23 | 1 | -1/+1
  For each active region corresponding to a bit in the bitmap we have a 14bit counter (and some flags). This counts:
    number of active writes + bit in the on-disk bitmap + delay-needed.

  The "delay-needed" is because we always want a delay before clearing a bit. So the number here is normally the number of active writes plus 2. If there have been no writes for a while, we drop to 1. If there are still no writes, we clear the bit and drop to 0.

  So for consistency, when setting a bit from the on-disk bitmap or by request from user-space, it is best to set the counter to '2' to start with.

  In particular we might also set the NEEDED_MASK flag at this time, and in all other cases NEEDED_MASK is only set when the counter is 2 or more.

  Signed-off-by: NeilBrown <neilb@suse.de>
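The counter lifecycle described in this message (active writes plus 2, ageing down to 1 and then 0) can be modelled as a small state machine. This is a simplified sketch under stated assumptions: the flag bits that share the counter word are ignored, overflow handling is reduced to an assertion, and every function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define COUNTER_MAX 0x3fffu   /* 14-bit counter, per the commit text */

/* Setting a bit from disk or by user request starts the counter at 2. */
static uint16_t set_from_disk(void)
{
    return 2;
}

/* A write begins: an idle chunk (0 or 1) is raised to 2, then counted. */
static uint16_t write_start(uint16_t c)
{
    assert(c < COUNTER_MAX);
    if (c < 2)
        c = 2;
    return (uint16_t)(c + 1);
}

/* A write completes: only valid while at least one write is active. */
static uint16_t write_end(uint16_t c)
{
    assert(c > 2);
    return (uint16_t)(c - 1);
}

/*
 * One periodic pass over a chunk with no active writes: 2 -> 1 -> 0.
 * Returns true on the step where the on-disk bit may be cleared.
 */
static bool daemon_pass(uint16_t *c)
{
    if (*c == 2) {
        *c = 1;
        return false;
    }
    if (*c == 1) {
        *c = 0;
        return true;
    }
    return false;
}
```

Starting at 2 rather than 1 means a freshly set bit always survives one full ageing pass before it can be cleared, matching the "delay-needed" term in the message.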
* md/bitmap: daemon_work cleanup. | NeilBrown | 2011-12-23 | 1 | -5/+5
  We have a variable 'mddev' in this function, but repeatedly get the same value by dereferencing bitmap->mddev. There is room for simplification here...
  Signed-off-by: NeilBrown <neilb@suse.de>
* md/bitmap: It is OK to clear bits during recovery. | NeilBrown | 2011-12-23 | 1 | -4/+1
  commit d0a4bb492772ce5c4bdfba3744a99ed6f6fb238f introduced a regression which is annoying but fairly harmless.

  When writing to an array that is undergoing recovery (a spare is being integrated into the array), writing to the array will set bits in the bitmap, but they will not be cleared when the write completes.

  For bits covering areas that have not been recovered yet this is not a problem, as the recovery will clear the bits. However, bits set in an already-recovered region will stay set and never be cleared. This doesn't risk data integrity. The only negatives are:
  - next time there is a crash, more resyncing than necessary will be done.
  - the bitmap doesn't look clean, which is confusing.

  While an array is recovering we don't want to update the 'events_cleared' setting in the bitmap, but we do still want to clear bits that have very recently been set - providing they were written to the recovering device.

  So split those two needs - which previously both depended on 'success' - and always clear the bit if the write went to all devices.

  Signed-off-by: NeilBrown <neilb@suse.de>
* md/lock: ensure updates to page_attrs are properly locked. | NeilBrown | 2011-11-23 | 1 | -0/+4
  Page attributes are set using __set_bit rather than set_bit, as it is normally called under a spinlock so the extra atomicity is not needed.

  However there are two places where we might set or clear page attributes without holding the spinlock. So add the spinlock in those cases.

  This might be the cause of occasional reports that bits aren't getting cleared properly - the theory is that BITMAP_PAGE_PENDING gets lost when BITMAP_PAGE_NEEDWRITE is set or cleared. This is an inconvenience, not a threat to data safety.

  Signed-off-by: NeilBrown <neilb@suse.de>
* md/bitmap remove fault injection options. | NeilBrown | 2011-10-11 | 1 | -33/+1
  These are too hard to use to be much more than noise.
  Signed-off-by: NeilBrown <neilb@suse.de>
* md: remove typedefs: mddev_t -> struct mddev | NeilBrown | 2011-10-11 | 1 | -22/+22
  Having mddev_t and 'struct mddev_s' is ugly and not preferred
  Signed-off-by: NeilBrown <neilb@suse.de>
* md: removing typedefs: mdk_rdev_t -> struct md_rdev | NeilBrown | 2011-10-11 | 1 | -4/+4
  The typedefs are just annoying. 'mdk' probably refers to 'md_k.h' which used to be an include file that defined this thing.
  Signed-off-by: NeilBrown <neilb@suse.de>
* md: remove PRINTK and dprintk debugging and use pr_debug | NeilBrown | 2011-10-07 | 1 | -20/+13
  Being able to dynamically enable these makes them much more useful.
  Signed-off-by: NeilBrown <neilb@suse.de>
* md/bitmap: improve handling of 'allclean'. | NeilBrown | 2011-09-21 | 1 | -15/+20
  The 'allclean' flag is used to cache the fact that there is nothing to do, so we can avoid waking up and scanning the bitmap regularly.

  The two sorts of pages that might need the attention of the bitmap daemon are BITMAP_PAGE_PENDING and BITMAP_PAGE_NEEDWRITE pages. So make sure allclean reflects exactly when there are none of those.

  So:
  - set it before scanning all pages with either bit set.
  - clear it whenever these bits are set
  - clear it when we desire not to clear one of these bits.
  - don't clear it any other time.

  Signed-off-by: NeilBrown <neilb@suse.de>