path: root/fs/reiserfs/bitmap.c
Commit message, Author, Age, Files, Lines
* reiserfs: Fix possible recursive lock
  Author: Frederic Weisbecker, Age: 2009-12-14, Files: 1, Lines: -0/+3

    While allocating the bitmap using vmalloc, we hold the reiserfs lock,
    which later makes lockdep report a possible deadlock, as we may swap
    out pages to allocate memory and then take the reiserfs lock
    recursively:

    inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
    kswapd0/312 [HC0[0]:SC0[0]:HE1:SE1] takes:
     (&REISERFS_SB(s)->lock){+.+.?.}, at: [<c11108a8>] reiserfs_write_lock+0x28/0x40
    {RECLAIM_FS-ON-W} state was registered at:
      [<c104e1c2>] mark_held_locks+0x62/0x90
      [<c104e28a>] lockdep_trace_alloc+0x9a/0xc0
      [<c108e396>] kmem_cache_alloc+0x26/0xf0
      [<c10850ec>] __get_vm_area_node+0x6c/0xf0
      [<c10857de>] __vmalloc_node+0x7e/0xa0
      [<c108597b>] vmalloc+0x2b/0x30
      [<c10e00b9>] reiserfs_init_bitmap_cache+0x39/0x70
      [<c10f8178>] reiserfs_fill_super+0x2e8/0xb90
      [<c1094345>] get_sb_bdev+0x145/0x180
      [<c10f5a11>] get_super_block+0x21/0x30
      [<c10931f0>] vfs_kern_mount+0x40/0xd0
      [<c10932d9>] do_kern_mount+0x39/0xd0
      [<c10a9857>] do_mount+0x2c7/0x6b0
      [<c10a9ca6>] sys_mount+0x66/0xa0
      [<c161589b>] mount_block_root+0xc4/0x245
      [<c1615a75>] mount_root+0x59/0x5f
      [<c1615b8c>] prepare_namespace+0x111/0x14b
      [<c1615269>] kernel_init+0xcf/0xdb
      [<c10031fb>] kernel_thread_helper+0x7/0x1c

    This is actually fine for two reasons: we call vmalloc at mount time,
    so it is not in the swapping-out path. Also, the reiserfs lock can be
    acquired recursively, but since its implementation depends on a mutex,
    it's hard and not necessarily worth it to teach that to lockdep.

    The lock is useless at mount time anyway, at least until we replay the
    journal. But let's remove it from this path later, as this needs more
    thinking and is a sensitive change.

    For now we can just relax the lock around vmalloc.

    Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
    Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    Cc: Chris Mason <chris.mason@oracle.com>
    Cc: Ingo Molnar <mingo@elte.hu>
    Cc: Thomas Gleixner <tglx@linutronix.de>
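The "relax the lock around vmalloc" step above can be pictured with a minimal userspace sketch. It assumes a pthread mutex standing in for the reiserfs write lock and calloc() standing in for vmalloc(); the names fs_write_lock and alloc_bitmap_cache_relaxed are hypothetical and this is not the kernel's code.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-superblock reiserfs write lock. */
static pthread_mutex_t fs_write_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Allocate a zeroed bitmap cache. The caller must hold fs_write_lock.
 * The lock is dropped around the allocation, which may block, and is
 * taken again before returning, so the caller's locking state is
 * unchanged on return.
 */
static void *alloc_bitmap_cache_relaxed(size_t nr_bitmaps, size_t entry_size)
{
	void *cache;

	pthread_mutex_unlock(&fs_write_lock);	/* relax before we may block */
	cache = calloc(nr_bitmaps, entry_size);
	pthread_mutex_lock(&fs_write_lock);	/* restore the caller's lock */

	return cache;
}
```

The point of the pattern is that the allocator never runs with the lock held, so a reclaim path that needs the same lock cannot be reported as recursing on it.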
* kill-the-BKL/reiserfs: release the write lock inside reiserfs_read_bitmap_block()
  Author: Frederic Weisbecker, Age: 2009-09-14, Files: 1, Lines: -0/+2

    reiserfs_read_bitmap_block() uses sb_bread() to read the bitmap block.
    This helper might sleep.

    When the bkl was used, it was released at this point, so we can relax
    the write lock here too.

    [ Impact: release the reiserfs write lock when it is not needed ]

    Cc: Jeff Mahoney <jeffm@suse.com>
    Cc: Chris Mason <chris.mason@oracle.com>
    Cc: Alexander Beregalov <a.beregalov@gmail.com>
    Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
* reiserfs: kill-the-BKL
  Author: Frederic Weisbecker, Age: 2009-09-14, Files: 1, Lines: -0/+2

    This patch is an attempt to remove the BKL-based locking scheme from
    reiserfs.

    It is partly inspired by an old attempt by Peter Zijlstra:

       http://lkml.indiana.edu/hypermail/linux/kernel/0704.2/2174.html

    The bkl is heavily used in this filesystem to prevent concurrent write
    accesses on the filesystem.

    Reiserfs makes deep use of the specific properties of the bkl:

    - It can be acquired recursively by the same task
    - It is released on the schedule() calls and reacquired when
      schedule() returns

    The two properties above are a roadmap for the reiserfs write locking,
    so it's very hard to simply replace it with a common mutex.

    - We need a recursive-able locking unless we want to restructure
      several blocks of the code.
    - We need to identify the sites where the bkl was implicitly relaxed
      (schedule, wait, sync, etc...) so that we can in turn release and
      reacquire our new lock explicitly. Such implicit releases of the
      lock are often required to let other resource producers/consumers
      do their job, or we can suffer unexpected starvations or deadlocks.

    So the new lock that replaces the bkl here is a per superblock mutex
    with a specific property: it can be acquired recursively by the same
    task, like the bkl. For this purpose, we integrate a lock owner and a
    lock depth field in the superblock information structure.

    The first axis of this patch is to turn the reiserfs_write_(un)lock()
    functions into wrappers that manage this mutex. Also, some explicit
    calls to lock_kernel() have been converted to reiserfs_write_lock()
    helpers.

    The second axis is to find the important blocking sites
    (schedule...(), wait_on_buffer(), sync_dirty_buffer(), etc...) and
    then apply an explicit release of the write lock at these locations
    before blocking. Then we can safely wait for those who can give us
    resources or those who need some. Typically this is a fight between
    the current writer, the reiserfs workqueue (aka the async committer)
    and the pdflush threads.

    The third axis is a consequence of the second. The write lock is
    usually on top of a lock dependency chain which can include the
    journal lock, the flush lock or the commit lock. So it's dangerous to
    release and try to reacquire the write lock while we still hold other
    locks.

    This is fine with the bkl:

          T1                       T2

    lock_kernel()
        mutex_lock(A)
        unlock_kernel()
        // do something
                                lock_kernel()
                                    mutex_lock(A) -> already locked by T1
                                    schedule() (and then unlock_kernel())
        lock_kernel()
        mutex_unlock(A)
        ....

    This is not fine with a mutex:

          T1                       T2

    mutex_lock(write)
        mutex_lock(A)
        mutex_unlock(write)
        // do something
                               mutex_lock(write)
                                  mutex_lock(A) -> already locked by T1
                                  schedule()

        mutex_lock(write) -> already locked by T2
        deadlock

    The solution in this patch is to provide a helper which releases the
    write lock and sleeps a bit if we can't lock a mutex that depends on
    it. It's another simulation of the bkl behaviour.

    The last axis is to locate the fs callbacks that are called with the
    bkl held, according to Documentation/filesystem/Locking.

    Those are:

    - reiserfs_remount
    - reiserfs_fill_super
    - reiserfs_put_super

    Reiserfs didn't need to explicitly lock because of the context of
    these callbacks. But now we must take care of that with the new
    locking.

    After this patch, reiserfs suffers from a slight performance
    regression (for now). On UP, a high volume write with dd reports an
    average of 27 MB/s instead of 30 MB/s without the patch applied.

    Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    Reviewed-by: Ingo Molnar <mingo@elte.hu>
    Cc: Jeff Mahoney <jeffm@suse.com>
    Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Cc: Bron Gondwana <brong@fastmail.fm>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Alexander Viro <viro@zeniv.linux.org.uk>
    LKML-Reference: <1239070789-13354-1-git-send-email-fweisbec@gmail.com>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
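The two ideas in this commit message, a lock that the same task can take recursively and a helper that drops the write lock before blocking on a dependent mutex, can be sketched in userspace C with pthreads. This is only an illustration under stated assumptions: struct rec_lock, rec_lock_acquire(), rec_lock_release() and lock_dependent_safe() are hypothetical names, and the sketch uses a guard mutex plus a condition variable rather than the kernel's mutex with owner and depth fields in the superblock info.

```c
#include <pthread.h>

/* Hypothetical userspace analogue of a recursive per-superblock lock. */
struct rec_lock {
	pthread_mutex_t guard;		/* protects owner and depth */
	pthread_cond_t released;	/* signalled when depth drops to 0 */
	pthread_t owner;		/* meaningful only while depth > 0 */
	int depth;			/* nesting level, 0 means unlocked */
};

#define REC_LOCK_INIT { .guard = PTHREAD_MUTEX_INITIALIZER, \
			.released = PTHREAD_COND_INITIALIZER, .depth = 0 }

/* Take the lock; the current owner only bumps the nesting depth. */
static void rec_lock_acquire(struct rec_lock *l)
{
	pthread_mutex_lock(&l->guard);
	if (l->depth > 0 && pthread_equal(l->owner, pthread_self())) {
		l->depth++;
	} else {
		while (l->depth > 0)
			pthread_cond_wait(&l->released, &l->guard);
		l->owner = pthread_self();
		l->depth = 1;
	}
	pthread_mutex_unlock(&l->guard);
}

/* Drop one nesting level; wake a waiter when fully released. */
static void rec_lock_release(struct rec_lock *l)
{
	pthread_mutex_lock(&l->guard);
	if (--l->depth == 0)
		pthread_cond_signal(&l->released);
	pthread_mutex_unlock(&l->guard);
}

/*
 * Lock a mutex that sits below the write lock in the dependency chain.
 * The caller must hold the write lock (depth >= 1). All nesting levels
 * are dropped first so that the current holder of dep can take the
 * write lock and make progress; the saved depth is restored afterwards.
 */
static void lock_dependent_safe(struct rec_lock *write, pthread_mutex_t *dep)
{
	int saved;

	pthread_mutex_lock(&write->guard);
	saved = write->depth;
	write->depth = 0;
	pthread_cond_signal(&write->released);
	pthread_mutex_unlock(&write->guard);

	pthread_mutex_lock(dep);	/* may sleep without the write lock */

	rec_lock_acquire(write);	/* depth is 1 after this call */
	pthread_mutex_lock(&write->guard);
	write->depth = saved;		/* restore the original nesting */
	pthread_mutex_unlock(&write->guard);
}
```

With this helper, the T1/T2 interleaving shown above no longer deadlocks in the sketch: T2 gives up the write lock before sleeping on A, so T1 can reacquire the write lock and eventually unlock A.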
* Merge branch 'reiserfs-updates' from Jeff Mahoney
  Author: Linus Torvalds, Age: 2009-03-30, Files: 1, Lines: -38/+34
|\
    * reiserfs-updates: (35 commits)
      reiserfs: rename [cn]_* variables
      reiserfs: rename p_._ variables
      reiserfs: rename p_s_tb to tb
      reiserfs: rename p_s_inode to inode
      reiserfs: rename p_s_bh to bh
      reiserfs: rename p_s_sb to sb
      reiserfs: strip trailing whitespace
      reiserfs: cleanup path functions
      reiserfs: factor out buffer_info initialization
      reiserfs: add atomic addition of selinux attributes during inode creation
      reiserfs: use generic readdir for operations across all xattrs
      reiserfs: journaled xattrs
      reiserfs: use generic xattr handlers
      reiserfs: remove i_has_xattr_dir
      reiserfs: make per-inode xattr locking more fine grained
      reiserfs: eliminate per-super xattr lock
      reiserfs: simplify xattr internal file lookups/opens
      reiserfs: Clean up xattrs when REISERFS_FS_XATTR is unset
      reiserfs: remove IS_PRIVATE helpers
      reiserfs: remove link detection code
      ...

    Fixed up conflicts manually due to:
     - quota name cleanups vs variable naming changes:
        fs/reiserfs/inode.c
        fs/reiserfs/namei.c
        fs/reiserfs/stree.c
        fs/reiserfs/xattr.c
     - exported include header cleanups
        include/linux/reiserfs_fs.h
| * reiserfs: use reiserfs_error()
    Author: Jeff Mahoney, Age: 2009-03-30, Files: 1, Lines: -27/+29

    This patch makes many paths that currently only issue warnings handle
    the error instead.

    Signed-off-by: Jeff Mahoney <jeffm@suse.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * reiserfs: rework reiserfs_warning
    Author: Jeff Mahoney, Age: 2009-03-30, Files: 1, Lines: -29/+23

    ReiserFS warnings can be somewhat inconsistent. In some cases:

    * a unique identifier may be associated with it
    * the function name may be included
    * the device may be printed separately

    This patch aims to make warnings more consistent. reiserfs_warning()
    prints the device name, so printing it a second time is not required.
    The function name for a warning is always helpful in debugging, so it
    is now automatically inserted into the output. Hans has stated that
    every warning should have a unique identifier. Some cases lack them,
    others really shouldn't have them. reiserfs_warning() now expects an
    id associated with each message. In the rare case where one isn't
    needed, "" will suffice.

    Signed-off-by: Jeff Mahoney <jeffm@suse.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * reiserfs: make some warnings informational
    Author: Jeff Mahoney, Age: 2009-03-30, Files: 1, Lines: -3/+3

    In several places, reiserfs_warning is used when there is no warning,
    just a notice. This patch changes some of them to indicate that the
    message is merely informational.

    Signed-off-by: Jeff Mahoney <jeffm@suse.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | reiserfs: Use lowercase names of quota functions
    Author: Jan Kara, Age: 2009-03-26, Files: 1, Lines: -6/+8
|/
    Use lowercase names of quota functions instead of old uppercase ones.

    Signed-off-by: Jan Kara <jack@suse.cz>
    CC: reiserfs-devel@vger.kernel.org
* reiserfs: replace remaining __FUNCTION__ occurrences
  Author: Harvey Harrison, Age: 2008-04-28, Files: 1, Lines: -4/+4

    __FUNCTION__ is gcc-specific, use __func__

    Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
    Cc: Chris Mason <chris.mason@oracle.com>
    Cc: Jeff Mahoney <jeffm@suse.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
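As a small illustration of the replacement, __func__ is the standard C99 predefined identifier that holds the name of the enclosing function. The macro fmt_warning and the function check_block below are hypothetical and only show the mechanism; they are not reiserfs code.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format a message prefixed with the name of the
 * function that uses the macro. __func__ expands in the caller's body.
 */
#define fmt_warning(buf, size, msg) \
	snprintf((buf), (size), "%s: %s", __func__, (msg))

/* Writes "check_block: block out of range" into buf. */
static int check_block(char *buf, size_t size)
{
	return fmt_warning(buf, size, "block out of range");
}
```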
* fs/: Spelling fixes
  Author: Joe Perches, Age: 2008-02-03, Files: 1, Lines: -3/+3

    Signed-off-by: Joe Perches <joe@perches.com>
    Signed-off-by: Adrian Bunk <bunk@kernel.org>
* reiserfs: ignore on disk s_bmap_nr value
  Author: Jeff Mahoney, Age: 2007-10-19, Files: 1, Lines: -17/+22

    Implement support for file systems larger than 8 TiB.

    The reiserfs superblock contains a 16 bit value for counting the
    number of bitmap blocks. The rest of the disk format supports file
    systems up to 2^32 blocks, but the bitmap block limitation
    artificially limits this to 8 TiB with a 4KiB block size.

    Rather than trust the superblock's 16-bit bitmap block count, we
    calculate it dynamically based on the number of blocks in the file
    system. When an incorrect value is observed in the superblock, it is
    zeroed out, ensuring that older kernels will not be able to mount the
    file system.

    Userspace support has already been implemented and shipped in
    reiserfsprogs 3.6.20.

    Signed-off-by: Jeff Mahoney <jeffm@suse.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
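The dynamic calculation described above can be sketched as follows, assuming one bit per block and therefore blocksize * 8 blocks described by each bitmap block. The function names bmap_count and fits_in_u16 are hypothetical, and the sketch assumes block_count is at least 1.

```c
#include <stdint.h>

/*
 * Number of bitmap blocks needed to describe block_count blocks.
 * Written as (n - 1) / d + 1 so that block counts close to 2^32 do not
 * overflow. Assumes block_count >= 1.
 */
static uint32_t bmap_count(uint32_t block_count, uint32_t blocksize)
{
	uint32_t blocks_per_bitmap = blocksize * 8;	/* one bit per block */

	return (block_count - 1) / blocks_per_bitmap + 1;
}

/* True when the count still fits in a 16-bit on-disk field. */
static int fits_in_u16(uint32_t count)
{
	return count <= UINT16_MAX;
}
```

With 4 KiB blocks each bitmap block covers 32768 blocks, so a 16-bit count tops out at 65535 * 32768 blocks of 4 KiB each, just under 8 TiB. A file system using nearly all 2^32 block numbers needs 131072 bitmap blocks, which no longer fits.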
* reiserfs: remove first_zero_hint
  Author: Jeff Mahoney, Age: 2007-10-19, Files: 1, Lines: -17/+12

    The first_zero_hint metadata caching was never actually used, and
    it's of dubious optimization quality. This patch removes it.

    It doesn't actually shrink the size of the reiserfs_bitmap_info
    struct, since that doesn't work with block sizes larger than 8K.
    There was a big fixme in there, and with all the work lately in
    allowing block size > page size, I might as well kill the fixme as
    well.

    Signed-off-by: Jeff Mahoney <jeffm@suse.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* reiserfs: fix usage of signed ints for block numbers
  Author: Jeff Mahoney, Age: 2007-10-19, Files: 1, Lines: -11/+13

    Do a quick signedness check for block numbers. There are a number of
    places where signed integers are used for block numbers, which limits
    the usable file system size to 8 TiB. The disk format, excepting a
    problem which will be fixed in the following patch, supports file
    systems up to 16 TiB in size. This patch cleans up those sites so
    that we can enable the full usable size.

    Signed-off-by: Jeff Mahoney <jeffm@suse.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
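The 8 TiB and 16 TiB figures follow from the number of usable bits in a block number. The sketch below assumes a 4 KiB block size, which the neighbouring commit message uses for its own figure; max_fs_bytes is a hypothetical helper name.

```c
#include <stdint.h>

/*
 * Largest addressable file system size in bytes when block numbers
 * have the given number of usable bits. A signed 32-bit int leaves 31
 * usable bits, an unsigned 32-bit int leaves 32.
 */
static uint64_t max_fs_bytes(unsigned int block_number_bits, uint64_t blocksize)
{
	return ((uint64_t)1 << block_number_bits) * blocksize;
}
```

With 4096-byte blocks, 31 bits give 2^31 * 2^12 = 2^43 bytes (8 TiB) and 32 bits give 2^44 bytes (16 TiB).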
* reiserfs: use is_reusable to catch corruption
  Author: Jeff Mahoney, Age: 2007-10-19, Files: 1, Lines: -8/+13

    Build in is_reusable() unconditionally and use it to catch corruption
    before it reaches the block freeing paths.

    Signed-off-by: Jeff Mahoney <jeffm@suse.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fs/reiserfs/: cleanups
  Author: Adrian Bunk, Age: 2007-10-17, Files: 1, Lines: -57/+0

    - remove the following no longer used functions:
      - bitmap.c: reiserfs_claim_blocks_to_be_allocated()
      - bitmap.c: reiserfs_release_claimed_blocks()
      - bitmap.c: reiserfs_can_fit_pages()

    - make the following functions static:
      - inode.c: restart_transaction()
      - journal.c: reiserfs_async_progress_wait()

    Signed-off-by: Adrian Bunk <bunk@stusta.de>
    Acked-by: Vladimir V. Saveliev <vs@namesys.com>
    Cc: Nick Piggin <npiggin@suse.de>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [PATCH] struct path: rename Reiserfs's struct path
  Author: Josef "Jeff" Sipek, Age: 2006-12-08, Files: 1, Lines: -1/+1

    Rename Reiserfs's struct path to struct treepath to prevent name
    collision between it and struct path from fs/namei.c.

    Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu>
    Cc: <reiserfs-dev@namesys.com>
    Signed-off-by: Andrew Morton <akpm@osdl.org>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] reiserfs: null pointer dereferencing in reiserfs_read_bitmap_block
  Author: Eric Eric Sesterhenn, Age: 2006-10-07, Files: 1, Lines: -2/+2

    null pointer dereferencing in reiserfs_read_bitmap_block.

    Signed-off-by: Alexander Zarochentsev <zam@namesys.com>
    Cc: Jeff Mahoney <jeffm@suse.com>
    Signed-off-by: Andrew Morton <akpm@osdl.org>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] reiserfs: eliminate minimum window size for bitmap searching
  Author: Jeff Mahoney, Age: 2006-10-01, Files: 1, Lines: -22/+1

    When a file system becomes fragmented (using MythTV, for example),
    the bigalloc window searching ends up causing huge performance
    problems. In a file system presented by a user experiencing this bug,
    the file system was 90% free, but no 32-block free windows existed on
    the entire file system. This causes the allocator to scan the entire
    file system for each 128k write before backing down to searching for
    individual blocks.

    In the end, finding a contiguous window for all the blocks in a write
    is an advantageous special case, but one that can be found naturally
    when such a window exists anyway.

    This patch removes the bigalloc window searching, and has been proven
    to fix the test case described above.

    Signed-off-by: Jeff Mahoney <jeffm@suse.com>
    Signed-off-by: Andrew Morton <akpm@osdl.org>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] reiserfs: on-demand bitmap loading
  Author: Jeff Mahoney, Age: 2006-10-01, Files: 1, Lines: -53/+44

    This is the patch the three previous ones have been leading up to.

    It changes the behavior of ReiserFS from loading and caching all the
    bitmaps as special, to treating the bitmaps like any other bit of
    metadata and just letting the system-wide caches figure out what to
    hang on to.

    Buffer heads are allocated on the fly, so there is no need to retain
    pointers to all of them. The caching of the metadata occurs when the
    data is read and updated, and is considered invalid and uncached
    until then.

    I needed to remove the vs-4040 check for performing a duplicate
    operation on a particular bit. The reason is that while the other
    sites for working with bitmaps are allowed to schedule, is_reusable()
    is called from do_balance(), which will panic if a schedule occurs in
    certain places.

    The benefit of on-demand bitmaps clearly outweighs a sanity check
    that depends on a compile-time option that is discouraged.

    [akpm@osdl.org: warning fix]
    Signed-off-by: Jeff Mahoney <jeffm@suse.com>
    Cc: <reiserfs-dev@namesys.com>
    Signed-off-by: Andrew Morton <akpm@osdl.org>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] reiserfs: reorganize bitmap loading functions
  Author: Jeff Mahoney, Age: 2006-10-01, Files: 1, Lines: -0/+88

    This patch moves the bitmap loading code from super.c to bitmap.c.

    The code is also restructured somewhat. The only difference between
    new format bitmaps and old format bitmaps is where they are. That's a
    two liner before loading the block to use the correct one. There's no
    need for an entirely separate code path.

    The load path is generally the same, with the pattern being to throw
    out a bunch of requests and then wait for them, then cache the
    metadata from the contents.

    Again, like the previous patches, the purpose is to set up for later
    ones.

    Update: There was a bug in the previously posted version of this that
    resulted in corruption. The problem was that bitmap 0 on new format
    file systems must be treated specially, and wasn't. A stupid bug with
    an easy fix.

    This is hopefully the last fix for the disaster that is the reiserfs
    bitmap patch set.

    If a bitmap block was full, first_zero_hint would end up at zero
    since it would never be changed from its zeroed out value. This just
    sets it beyond the end of the bitmap block. If any bits are freed, it
    will be reset to a valid bit. When info->free_count = 0, then we
    already know it's full.

    Signed-off-by: Jeff Mahoney <jeffm@suse.com>
    Cc: <reiserfs-dev@namesys.com>
    Signed-off-by: Andrew Morton <akpm@osdl.org>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
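The "two liner" that picks the right block for old and new formats, together with the special case for bitmap 0, might look like the sketch below. The layout is an assumption made for illustration and is not stated in the commit message: old-format bitmaps are taken to sit contiguously right after the superblock, and a new-format bitmap is taken to sit at the first block of the range it describes, except bitmap 0, which follows the superblock. The name bitmap_block_location and its parameters are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Block number holding bitmap number `bitmap` (assumed layout, see the
 * lead-in). sb_block is the block that contains the superblock.
 */
static uint32_t bitmap_block_location(uint32_t bitmap, uint32_t blocksize,
				      uint32_t sb_block, bool old_format)
{
	if (old_format)
		return sb_block + 1 + bitmap;	/* packed after the superblock */
	if (bitmap == 0)
		return sb_block + 1;		/* special case for bitmap 0 */
	return bitmap * (blocksize * 8);	/* start of the range it covers */
}
```

Under these assumptions the two formats share the rest of the load path: only the block number passed to the read differs.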
* [PATCH] reiserfs: clean up bitmap block buffer head references
  Author: Jeff Mahoney, Age: 2006-10-01, Files: 1, Lines: -24/+36

    Similar to the SB_JOURNAL cleanup that was accepted a while ago, this
    patch uses a temporary variable for buffer head references from the
    bitmap info array.

    This makes the code much more readable in some areas.

    It also uses proper reference counting, doing a get_bh() after using
    the pointer from the array and brelse()'ing it later. This may seem
    silly, but a later patch will replace the simple temporary variables
    with an actual read, so the reference freeing will be used then.

    Signed-off-by: Jeff Mahoney <jeffm@suse.com>
    Cc: <reiserfs-dev@namesys.com>
    Signed-off-by: Andrew Morton <akpm@osdl.org>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] reiserfs: fix is_reusable bitmap check to not traverse the bitmap info array
  Author: Jeff Mahoney, Age: 2006-10-01, Files: 1, Lines: -15/+25

    There is a check in is_reusable to determine if a particular block is
    a bitmap block. It verifies this by going through the array of bitmap
    block buffer heads and comparing the block number to each one.

    Bitmap blocks are at defined locations on the disk in both old and
    current formats. Simply checking against the known good values is
    enough.

    This is a trivial optimization for a non-production codepath, but
    this is the first in a series of patches that will ultimately remove
    the buffer heads from that array.

    Signed-off-by: Jeff Mahoney <jeffm@suse.com>
    Cc: <reiserfs-dev@namesys.com>
    Signed-off-by: Andrew Morton <akpm@osdl.org>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* Remove obsolete #include <linux/config.h>
  Author: Jörn Engel, Age: 2006-06-30, Files: 1, Lines: -1/+0

    Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de>
    Signed-off-by: Adrian Bunk <bunk@stusta.de>
* reiserfs: run scripts/Lindent on reiserfs code
  Author: Linus Torvalds, Age: 2005-07-12, Files: 1, Lines: -873/+969

    This was a pure indentation change, using:

        scripts/Lindent fs/reiserfs/*.c include/linux/reiserfs_*.h

    to make reiserfs match the regular Linux indentation style. As Jeff
    Mahoney <jeffm@suse.com> writes:

     The ReiserFS code is a mix of a number of different coding styles,
     sometimes different even from line-to-line. Since the code has been
     relatively stable for quite some time and there are few outstanding
     patches to be applied, it is time to reformat the code to conform to
     the Linux style standard outlined in Documentation/CodingStyle.

     This patch contains the result of running scripts/Lindent against
     fs/reiserfs/*.c and include/linux/reiserfs_*.h. There are places
     where the code can be made to look better, but I'd rather keep those
     patches separate so that there isn't a subtle by-hand accident in
     the middle of a huge patch. To be clear: This patch is reformatting
     *only*.

     A number of patches may follow that continue to make the code more
     consistent with the Linux coding style.

     Hans wasn't particularly enthusiastic about these patches, but said
     he wouldn't really oppose them either.

    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] reiserfs endianness: annotate little-endian objects
  Author: Al Viro, Age: 2005-05-01, Files: 1, Lines: -3/+4

    little-endian objects annotated as such; again, obviously no changes
    of resulting code, we only replace __u16 with __le16, etc. in
    relevant places.

    Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
    Cc: <reiserfs-dev@namesys.com>
    Signed-off-by: Andrew Morton <akpm@osdl.org>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] reiserfs endianness: clone struct reiserfs_key
  Author: Al Viro, Age: 2005-05-01, Files: 1, Lines: -2/+2

    struct reiserfs_key cloned; (currently) identical struct in_core_key
    added. Places that expect host-endian data in reiserfs_key switched
    to in_core_key. Basically, we get annotation of reiserfs_key users
    and keep the resulting tree obviously equivalent to original.

    Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
    Cc: <reiserfs-dev@namesys.com>
    Signed-off-by: Andrew Morton <akpm@osdl.org>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* Linux-2.6.12-rc2 (tag: v2.6.12-rc2)
  Author: Linus Torvalds, Age: 2005-04-16, Files: 1, Lines: -0/+1169

    Initial git repository build. I'm not bothering with the full
    history, even though we have it. We can create a separate
    "historical" git archive of that later if we want to, and in the
    meantime it's about 3.2GB when imported into git - space that would
    just make the early git days unnecessarily complicated, when we don't
    have a lot of good infrastructure for it.

    Let it rip!