summaryrefslogtreecommitdiffstats
path: root/fs/gfs2
Commit message (Collapse)AuthorAgeFilesLines
...
| * | GFS2: Make gfs2_dir_del update link count when requiredSteven Whitehouse2011-05-093-157/+71
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we remove an entry from a directory, we can save ourselves some trouble if we know the type of the entry in question, since if it is itself a directory, we can update the link count of the parent at the same time as removing the directory entry. In addition this patch also merges the rmdir and unlink code which was almost identical anyway. This eliminates the calls to remove the . and .. directory entries on each rmdir (not needed since the directory will be deallocated, anyway) which was the only thing preventing passing the dentry to gfs2_dir_del(). The passing of the dentry rather than just the name allows us to figure out the type of the entry which is being removed, and thus adjust the link count when required. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * | GFS2: Don't use gfs2_change_nlink in link syscallSteven Whitehouse2011-05-091-2/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | There are three users of gfs2_change_nlink which add to the link count. Two of these are about to be removed in later patches, so this means that there will no callers, when that happens allowing removal of that function, also in a later patch. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * | GFS2: Don't use a try lock when promoting to a higher modeSteven Whitehouse2011-05-051-5/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously we marked all locks being promoted to a higher mode with the try flag to avoid any potential deadlocks issues. The DLM is able to detect these and report them in way that GFS2 can deal with them correctly. So we can just request the required mode and wait for a response without needing to perform this check. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * | GFS2: Double check link count under glockSteven Whitehouse2011-05-052-8/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To avoid any possible races relating to the link count, we need to recheck it under the inode's glock in all cases where it matters. Also to ensure we never get any nasty surprises, this patch also ensures that once the link count has hit zero it can never be elevated by rereading in data from disk. The only place we cannot provide a proper solution is in rename in the case where we are removing a target inode and we discover that the target inode has been already unlinked on another node. The race window is very small, and we return EAGAIN in this case to indicate what has happened. The proper solution would be to move the lookup parts of rename from the vfs into library calls which the fs could call directly, but that is potentially a very big job and this fix should cover most cases for now. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * | GFS2: Improve bug trap code in ->releasepage()Steven Whitehouse2011-05-031-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the buffer is dirty or pinned, then as well as printing a warning, we should also refuse to release the page in question. Currently this can occur if there is a race between mmap()ed writers and O_DIRECT on the same file. With the addition of ->launder_page() in the future, we should be able to close this gap. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * | GFS2: Fix ail list traversalSteven Whitehouse2011-05-031-6/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | In the recent patches to update the AIL list code, I managed to forget that the ail list lock got dropped, even though I added a comment specifically to remind myself :( Reported-by: Barry Marson <bmarson@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * | GFS2: make sure fallocate bytes is a multiple of blksizeBenjamin Marzinski2011-05-031-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The GFS2 fallocate code chooses a target size to for allocating chunks of space. Whenever it can't find any resource groups with enough space free, it halves its target. Since this target is in bytes, eventually it will no longer be a multiple of blksize. As long as there is more space available in the resource group than the target, this isn't a problem, since gfs2 will use the actual space available, which is always a multiple of blksize. However, when gfs couldn't fallocate a bigger chunk than the target, it was using the non-blksize aligned number. This caused a BUG in later code that required blksize aligned offsets. GFS2 now ensures that bytes is always a multiple of blksize Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * | GFS2: Add an AIL writeback tracepointSteven Whitehouse2011-04-202-0/+30
| | | | | | | | | | | | | | | | | | Add a tracepoint for monitoring writeback of the AIL. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * | GFS2: Make writeback more responsive to system conditionsSteven Whitehouse2011-04-208-90/+98
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds writeback_control to writing back the AIL list. This means that we can then take advantage of the information we get in ->write_inode() in order to set off some pre-emptive writeback. In addition, the AIL code is cleaned up a bit to make it a bit simpler to understand. There is still more which can usefully be done in this area, but this is a good start at least. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * | GFS2: Optimise glock lru and end of life inodesSteven Whitehouse2011-04-207-89/+119
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The GLF_LRU flag introduced in the previous patch can be used to check if a glock is on the lru list when a new holder is queued and if so remove it, without having first to get the lru_lock. The main purpose of this patch however is to optimise the glocks left over when an inode at end of life is being evicted. Previously such glocks were left with the GLF_LFLUSH flag set, so that when reclaimed, each one required a log flush. This patch resets the GLF_LFLUSH flag when there is nothing left to flush thus preventing later log flushes as glocks are reused or demoted. In order to do this, we need to keep track of the number of revokes which are outstanding, and also to clear the GLF_LFLUSH bit after a log commit when only revokes have been processed. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * | GFS2: Improve tracing support (adds two flags)Steven Whitehouse2011-04-204-6/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds support for two new flags. One keeps track of whether the glock is on the LRU list or not. The other isn't really a flag as such, but an indication of whether the glock has an attached object or not. This indication is reported without any locking, which is ok since we do not dereference the object pointer but merely report whether it is NULL or not. Also, this fixes one place where a tracepoint was missing, which was at the point we remove deallocated blocks from the journal. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * | GFS2: Clean up fsync()Steven Whitehouse2011-04-203-40/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch is designed to clean up GFS2's fsync implementation and ensure that it really does get everything on disk. Since ->write_inode() has been updated, we can call that via the vfs library function sync_inode_metadata() and the only remaining thing that has to be done is to ensure that we get any revoke records in the log after the inode has been written back. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * | GFS2: Remove unused macroSteven Whitehouse2011-04-201-2/+0
| | | | | | | | | | | | | | | | | | | | | The buffer_in_io() macro has been unused for some time, so remove it. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * | GFS2: Alter point of entry to glock lru list for glocks with an address_spaceSteven Whitehouse2011-04-205-27/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rather than allowing the glocks to be scheduled for possible reclaim as soon as they have exited the journal, this patch delays their entry to the list until the glocks in question are no longer in use. This means that we will rely on the vm for writeback of all dirty data and metadata from now on. When glocks are added to the lru list they should be freeable much faster since all the I/O required to free them should have already been completed. This should lead to much better I/O patterns under low memory conditions. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * | GFS2: Use filemap_fdatawrite() to write back the AILSteven Whitehouse2011-04-201-10/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to ensure that the mapping stats (and thus the bdi) are correctly updated, this patch changes the AIL writeback to use the filemap_datawrite function. This helps prevent stalls in balance_dirty_pages() due to large amounts of dirty metadata when there is little or no dirty data around. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * | GFS2: Make ->write_inode() really writeSteven Whitehouse2011-04-201-6/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The GFS2 ->write_inode function should be more aggressive at writing back to the filesystem. This adopts the XFS system of returning -EAGAIN when the writeback has not been completely done. Also, we now kick off in-place writeback when called with WB_SYNC_NONE, but we only wait for it and flush the log when WB_SYNC_ALL is requested. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * | GFS2: move function foreach_leaf to gfs2_dir_exhash_deallocBob Peterson2011-04-201-81/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | | The previous patches made function gfs2_dir_exhash_dealloc do nothing but call function foreach_leaf. This patch simplifies the code by moving the entire function foreach_leaf into gfs2_dir_exhash_dealloc. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * | GFS2: pass leaf_bh into leaf_deallocBob Peterson2011-04-201-11/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Function foreach_leaf used to look up the leaf block address and get a buffer_head. Then it would call leaf_dealloc which did the same lookup. This patch combines the two operations by making foreach_leaf pass the leaf bh to leaf_dealloc. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * | GFS2: Combine transaction from gfs2_dir_exhash_deallocBob Peterson2011-04-201-35/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | At the end of function gfs2_dir_exhash_dealloc, it was setting the dinode type to "file" to prevent directory corruption in case of a crash. It was doing so in its own journal transaction. This patch makes the change occur when the last call is make to leaf_dealloc, since it needs to rewrite the directory dinode at that time anyway. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * | GFS2: remove *leaf_call_t and simplify leaf_deallocBob Peterson2011-04-201-8/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since foreach_leaf is only called with leaf_dealloc as its only possible call function, we can simplify the code by making it call leaf_dealloc directly. This simplifies the code and eliminates the need for leaf_call_t, the generic call method. This is a first small step in simplifying the directory leaf deallocation code. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * | GFS2: Dump better debug info if a bitmap inconsistency is detectedBob Peterson2011-04-201-4/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On rare occasions we encounter gfs2 problems where an invalid bitmap state transition is attempted. For example, trying to "unlink" a free block. In these cases, there is really no useful information logged to debug the problem. This patch adds more debug details that should allow us to more closely examine the problem and possibly solve it. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* | | add hlist_bl_lock/unlock helpersChristoph Hellwig2011-04-251-4/+2
|/ / | | | | | | | | | | | | | | | | | | | | | | Now that the whole dcache_hash_bucket crap is gone, go all the way and also remove the weird locking layering violations for locking the hash buckets. Add hlist_bl_lock/unlock helpers to move the locking into the list abstraction instead of requiring each caller to open code it. After all allowing for the bit locks is the whole point of these helpers over the plain hlist variant. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | GFS2: filesystem hang caused by incorrect lock orderBob Peterson2011-04-186-21/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a deadlock in GFS2 where two processes are trying to reclaim an unlinked dinode: One holds the inode glock and calls gfs2_lookup_by_inum trying to look up the inode, which it can't, due to I_FREEING. The other has set I_FREEING from vfs and is at the beginning of gfs2_delete_inode waiting for the glock, which is held by the first. The solution is to add a new non_block parameter to the gfs2_iget function that causes it to return -ENOENT if the inode is being freed. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* | GFS2: Don't try to deallocate unlinked inodes when mounted roSteven Whitehouse2011-04-182-2/+7
| | | | | | | | | | | | | | | | This adds a couple of missing tests to avoid read-only nodes from attempting to deallocate unlinked inodes. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Reported-by: Michel Andre de la Porte <madelaporte@ubi.com>
* | GFS2: directly write blocks past i_sizeBenjamin Marzinski2011-04-181-10/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | GFS2 was relying on the writepage code to write out the zeroed data for fallocate. However, with FALLOC_FL_KEEP_SIZE set, this may be past i_size. If it is, it will be ignored. To work around this, gfs2 now calls write_dirty_buffer directly on the buffer_heads when FALLOC_FL_KEEP_SIZE is set, and it's writing past i_size. This version is just a cleanup of my last version Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* | GFS2: write_end error path fails to unlock transaction lockBob Peterson2011-04-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | I did an audit of gfs2's transaction glock for bugzilla bug 658619 and ran across this: In function gfs2_write_end, in the unlikely event that gfs2_meta_inode_buffer returns an error, the code may forget to unlock the transaction lock because the "failed" label appears after the call to function gfs2_trans_end. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* | Fix common misspellingsLucas De Marchi2011-03-313-3/+3
|/ | | | | | Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
* Merge branch 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds2011-03-244-13/+9
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block: (65 commits) Documentation/iostats.txt: bit-size reference etc. cfq-iosched: removing unnecessary think time checking cfq-iosched: Don't clear queue stats when preempt. blk-throttle: Reset group slice when limits are changed blk-cgroup: Only give unaccounted_time under debug cfq-iosched: Don't set active queue in preempt block: fix non-atomic access to genhd inflight structures block: attempt to merge with existing requests on plug flush block: NULL dereference on error path in __blkdev_get() cfq-iosched: Don't update group weights when on service tree fs: assign sb->s_bdi to default_backing_dev_info if the bdi is going away block: Require subsystems to explicitly allocate bio_set integrity mempool jbd2: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging jbd: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging fs: make fsync_buffers_list() plug mm: make generic_writepages() use plugging blk-cgroup: Add unaccounted time to timeslice_used. block: fixup plugging stubs for !CONFIG_BLOCK block: remove obsolete comments for blkdev_issue_zeroout. blktrace: Use rq->cmd_flags directly in blk_add_trace_rq. ... Fix up conflicts in fs/{aio.c,super.c}
| * Merge branch 'for-2.6.39/stack-plug' into for-2.6.39/coreJens Axboe2011-03-104-13/+9
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: block/blk-core.c block/blk-flush.c drivers/md/raid1.c drivers/md/raid10.c drivers/md/raid5.c fs/nilfs2/btnode.c fs/nilfs2/mdt.c Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| | * block: kill off REQ_UNPLUGJens Axboe2011-03-103-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the plugging now being explicitly controlled by the submitter, callers need not pass down unplugging hints to the block layer. If they want to unplug, it's because they manually plugged on their own - in which case, they should just unplug at will. Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| | * block: remove per-queue pluggingJens Axboe2011-03-102-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | Code has been converted over to the new explicit on-stack plugging, and delay users have been converted to use the new API for that. So lets kill off the old plugging along with aops->sync_page(). Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
* | | userns: rename is_owner_or_cap to inode_owner_or_capableSerge E. Hallyn2011-03-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | And give it a kernel-doc comment. [akpm@linux-foundation.org: btrfs changed in linux-next] Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Daniel Lezcano <daniel.lezcano@free.fr> Acked-by: David Howells <dhowells@redhat.com> Cc: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | Merge branch 'trivial' of ↵Linus Torvalds2011-03-201-1/+1
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6 * 'trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6: (25 commits) video: change to new flag variable scsi: change to new flag variable rtc: change to new flag variable rapidio: change to new flag variable pps: change to new flag variable net: change to new flag variable misc: change to new flag variable message: change to new flag variable memstick: change to new flag variable isdn: change to new flag variable ieee802154: change to new flag variable ide: change to new flag variable hwmon: change to new flag variable dma: change to new flag variable char: change to new flag variable fs: change to new flag variable xtensa: change to new flag variable um: change to new flag variables s390: change to new flag variable mips: change to new flag variable ... Fix up trivial conflict in drivers/hwmon/Makefile
| * | | fs: change to new flag variablematt mooney2011-03-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace EXTRA_CFLAGS with ccflags-y. And change ntfs-objs to ntfs-y for cleaner conditional inclusion. Signed-off-by: matt mooney <mfm@muteddisk.com> Acked-by: WANG Cong <xiyou.wangcong@gmail.com> Signed-off-by: Michal Marek <mmarek@suse.cz>
* | | | Merge branch 'for-linus' of ↵Linus Torvalds2011-03-161-3/+4
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (33 commits) AppArmor: kill unused macros in lsm.c AppArmor: cleanup generated files correctly KEYS: Add an iovec version of KEYCTL_INSTANTIATE KEYS: Add a new keyctl op to reject a key with a specified error code KEYS: Add a key type op to permit the key description to be vetted KEYS: Add an RCU payload dereference macro AppArmor: Cleanup make file to remove cruft and make it easier to read SELinux: implement the new sb_remount LSM hook LSM: Pass -o remount options to the LSM SELinux: Compute SID for the newly created socket SELinux: Socket retains creator role and MLS attribute SELinux: Auto-generate security_is_socket_class TOMOYO: Fix memory leak upon file open. Revert "selinux: simplify ioctl checking" selinux: drop unused packet flow permissions selinux: Fix packet forwarding checks on postrouting selinux: Fix wrong checks for selinux_policycap_netpeer selinux: Fix check for xfrm selinux context algorithm ima: remove unnecessary call to ima_must_measure IMA: remove IMA imbalance checking ...
| * \ \ \ Merge branch 'next' into for-linusJames Morris2011-03-161-3/+4
| |\ \ \ \ | | |/ / / | |/| | |
| | * | | Merge branch 'master' of git://git.infradead.org/users/eparis/selinux into nextJames Morris2011-03-081-3/+4
| | |\ \ \ | | | |/ / | | |/| |
| | | * | fs/vfs/security: pass last path component to LSM on inode creationEric Paris2011-02-011-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SELinux would like to implement a new labeling behavior of newly created inodes. We currently label new inodes based on the parent and the creating process. This new behavior would also take into account the name of the new object when deciding the new label. This is not the (supposed) full path, just the last component of the path. This is very useful because creating /etc/shadow is different than creating /etc/passwd but the kernel hooks are unable to differentiate these operations. We currently require that userspace realize it is doing some difficult operation like that and than userspace jumps through SELinux hoops to get things set up correctly. This patch does not implement new behavior, that is obviously contained in a seperate SELinux patch, but it does pass the needed name down to the correct LSM hook. If no such name exists it is fine to pass NULL. Signed-off-by: Eric Paris <eparis@redhat.com>
* | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmwLinus Torvalds2011-03-1618-378/+351
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: GFS2: Don't use _raw version of RCU dereference GFS2: Adding missing unlock_page() GFS2: Update to AIL list locking GFS2: introduce AIL lock GFS2: fix block allocation check for fallocate GFS2: Optimize glock multiple-dequeue code GFS2: Remove potential race in flock code GFS2: Fix glock deallocation race GFS2: quota allows exceeding hard limit GFS2: deallocation performance patch GFS2: panics on quotacheck update GFS2: Improve cluster mmap scalability GFS2: Fix glock queue trace point GFS2: Post-VFS scale update for RCU path walk GFS2: Use RCU for glock hash table
| * | | | | GFS2: Don't use _raw version of RCU dereferenceSteven Whitehouse2011-03-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As per RCU glock patch review comments, don't use the _raw version of this function here. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
| * | | | | GFS2: Adding missing unlock_page()Maxim2011-03-141-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | gfs2_write_begin() calls grab_cache_page_write_begin() that returns *locked* page. Correspondent error-handling path lacks for unlock_page() call: > out: > if (error == 0) > return 0; > > page_cache_release(page); The whole system hangs if gfs2_unstuff_dinode() called from gfs2_write_begin() failed for some reason. Reported-by: Maxim <maxim.patlasov@gmail.com> Signed-off-by: Maxim <maxim.patlasov@gmail.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * | | | | GFS2: Update to AIL list lockingSteven Whitehouse2011-03-143-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The previous patch missed a couple of places where the AIL list needed locking, so this fixes up those places, plus a comment is corrected too. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Dave Chinner <dchinner@redhat.com>
| * | | | | GFS2: introduce AIL lockDave Chinner2011-03-115-18/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The log lock is currently used to protect the AIL lists and the movements of buffers into and out of them. The lists are self contained and no log specific items outside the lists are accessed when starting or emptying the AIL lists. Hence the operation of the AIL does not require the protection of the log lock so split them out into a new AIL specific lock to reduce the amount of traffic on the log lock. This will also reduce the amount of serialisation that occurs when the gfs2_logd pushes on the AIL to move it forward. This reduces the impact of log pushing on sequential write throughput. Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * | | | | GFS2: fix block allocation check for fallocateBenjamin Marzinski2011-03-111-25/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | GFS2 fallocate wasn't properly checking if a blocks were already allocated. In write_empty_blocks(), if a page didn't have buffer_heads attached, GFS2 was always treating it as if there were no blocks allocated for that page. GFS2 now calls gfs2_block_map() to check if the blocks are allocated before writing them out. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * | | | | GFS2: Optimize glock multiple-dequeue codeBob Peterson2011-03-111-8/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a small patch that optimizes multiple glock dequeue operations. It changes the unlock order to be more efficient and makes it easier for lock debugging tools to unravel. It also eliminates the need for the temp variable x, although that would likely be optimized out. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * | | | | GFS2: Remove potential race in flock codeSteven Whitehouse2011-03-091-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch ensures that we always wait for glock demotion when dropping flocks on a file in order to prevent any race conditions associated with further flock calls or closing the file. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * | | | | GFS2: Fix glock deallocation raceSteven Whitehouse2011-03-094-11/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a race in deallocating glocks which was introduced in the RCU glock patch. We need to ensure that the glock count is kept correct even in the case that there is a race to add a new glock into the hash table. Also, to avoid having to wait for an RCU grace period, the glock counter can be decremented before call_rcu() is called. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * | | | | GFS2: quota allows exceeding hard limitAbhijith Das2011-03-092-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Immediately after being synced to disk, cached quotas are zeroed out and a subsequent access of the cached quotas results in incorrect zero values. This meant that gfs2 assumed the actual usage to be the zero (or near-zero) usage values it found in the cached quotas and comparison against warn/limits never triggered a quota violation. This patch adds a new flag QDF_REFRESH that is set after a sync so that the cached quotas are forcefully refreshed from disk on a subsequent access on seeing this flag set. Resolves: rhbz#675944 Signed-off-by: Abhi Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * | | | | GFS2: deallocation performance patchBob Peterson2011-02-243-8/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch is a performance improvement to GFS2's dealloc code. Rather than update the quota file and statfs file for every single block that's stripped off in unlink function do_strip, this patch keeps track and updates them once for every layer that's stripped. This is done entirely inside the existing transaction, so there should be no risk of corruption. The other functions that deallocate blocks will be unaffected because they are using wrapper functions that do the same thing that they do today. I tested this code on my roth cluster by creating 200 files in a directory, each of which is 100MB, then on four nodes, I simultaneously deleted the files, thus competing for GFS2 resources (but different files). The commands I used were: [root@roth-01]# time for i in `seq 1 4 200` ; do rm /mnt/gfs2/bigdir/gfs2.$i; done [root@roth-02]# time for i in `seq 2 4 200` ; do rm /mnt/gfs2/bigdir/gfs2.$i; done [root@roth-03]# time for i in `seq 3 4 200` ; do rm /mnt/gfs2/bigdir/gfs2.$i; done [root@roth-05]# time for i in `seq 4 4 200` ; do rm /mnt/gfs2/bigdir/gfs2.$i; done The performance increase was significant: roth-01 roth-02 roth-03 roth-05 --------- --------- --------- --------- old: real 0m34.027 0m25.021s 0m23.906s 0m35.646s new: real 0m22.379s 0m24.362s 0m24.133s 0m18.562s Total time spent deleting: old: 118.6s new: 89.4 For this particular case, this showed a 25% performance increase for GFS2 unlinks. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
| * | | | | GFS2: panics on quotacheck updateAbhijith Das2011-02-071-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Handle block allocation for forceful unstuffing of quota dinode during quota update using quotactl(). Also fix block reservation for special cases when quotas cross over block boundaries and update 2 blocks instead of 1. Signed-off-by: Abhi Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>