summaryrefslogtreecommitdiffstats
path: root/fs/xfs/xfs_log.c
Commit message (Collapse)AuthorAgeFilesLines
* xfs: fix fs/xfs/xfs_log.c:1740:39: error: 'B_TRUE' undeclaredBen Myers2013-01-181-1/+1
| | | | | | | | | | Commit 667a9291c5b3 "xfs: Remove boolean_t typedef completely." didn't. Remove a stray B_TRUE that breaks CONFIG_XFS_DEBUG=y. Signed-off-by: Ben Myers <bpm@sgi.com> Reported-by: Wu Fengguang <fengguang.wu@intel.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com>
* xfs: Remove boolean_t typedef completely.Thiago Farina2013-01-171-4/+4
| | | | | | | | | | | | Since we are using C99 we have one builtin defined in include/linux/types.h, use that instead. v2: you missed one in fs/xfs/xfs_qm_bhv.c, cleaned up. -bpm Signed-off-by: Thiago Farina <tfarina@chromium.org> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: fix sparse reported log CRC endian issueDave Chinner2012-12-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | Not a bug as such, just warning noise from the xlog_cksum() returning a __be32 type when it should be returning a __le32 type. On Wed, Nov 28, 2012 at 08:30:59AM -0500, Christoph Hellwig wrote: > But why are we storing the crc field little endian while all other on > disk formats are big endian? (And yes I realize it might as well have > been me who did that back in the idea, but I still have no idea why) Because the CRC always returns the calcuation LE format, even on BE systems. So rather than always having to byte swap it everywhere and have all the force casts and anootations for sparse, it seems simpler to just make it a __le32 everywhere.... Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: fix direct IO nested transaction deadlock.Dave Chinner2012-11-291-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The direct IO path can do a nested transaction reservation when writing past the EOF. The first transaction is the append transaction for setting the filesize at IO completion, but we can also need a transaction for allocation of blocks. If the log is low on space due to reservations and small log, the append transaction can be granted after wating for space as the only active transaction in the system. This then attempts a reservation for an allocation, which there isn't space in the log for, and the reservation sleeps. The result is that there is nothing left in the system to wake up all the processes waiting for log space to come free. The stack trace that shows this deadlock is relatively innocuous: xlog_grant_head_wait xlog_grant_head_check xfs_log_reserve xfs_trans_reserve xfs_iomap_write_direct __xfs_get_blocks xfs_get_blocks_direct do_blockdev_direct_IO __blockdev_direct_IO xfs_vm_direct_IO generic_file_direct_write xfs_file_dio_aio_writ xfs_file_aio_write do_sync_write vfs_write This was discovered on a filesystem with a log of only 10MB, and a log stripe unit of 256k whih increased the base reservations by 512k. Hence a allocation transaction requires 1.2MB of log space to be available instead of only 260k, and so greatly increased the chance that there wouldn't be enough log space available for the nested transaction to succeed. The key to reproducing it is this mkfs command: mkfs.xfs -f -d agcount=16,su=256k,sw=12 -l su=256k,size=2560b $SCRATCH_DEV The test case was a 1000 fsstress processes running with random freeze and unfreezes every few seconds. Thanks to Eryu Guan (eguan@redhat.com) for writing the test that found this on a system with a somewhat unique default configuration.... cc: <stable@vger.kernel.org> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Andrew Dahl <adahl@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: add CRC checks to the logChristoph Hellwig2012-11-191-23/+109
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement CRCs for the log buffers. We re-use a field in struct xlog_rec_header that was used for a weak checksum of the log buffer payload in debug builds before. The new checksumming uses the crc32c checksum we will use elsewhere in XFS, and also protects the record header and addition cycle data. Due to this there are some interesting changes in xlog_sync, as we need to do the cycle wrapping for the split buffer case much earlier, as we would touch the buffer after generating the checksum otherwise. The CRC calculation is always enabled, even for non-CRC filesystems, as adding this CRC does not change the log format. On non-CRC filesystems, only issue an alert if a CRC mismatch is found and allow recovery to continue - this will act as an indicator that log recovery problems are a result of log corruption. On CRC enabled filesystems, however, log recovery will fail. Note that existing debug kernels will write a simple checksum value to the log, so the first time this is run on a filesystem taht was last used on a debug kernel it will through CRC mismatch warning errors. These can be ignored. Initially based on a patch from Dave Chinner, then modified significantly by Christoph Hellwig. Modified again by Dave Chinner to get to this version. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: make buffer read verication an IO completion functionDave Chinner2012-11-151-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | Add a verifier function callback capability to the buffer read interfaces. This will be used by the callers to supply a function that verifies the contents of the buffer when it is read from disk. This patch does not provide callback functions, but simply modifies the interfaces to allow them to be called. The reason for adding this to the read interfaces is that it is very difficult to tell fom the outside is a buffer was just read from disk or whether we just pulled it out of cache. Supplying a callbck allows the buffer cache to use it's internal knowledge of the buffer to execute it only when the buffer is read from disk. It is intended that the verifier functions will mark the buffer with an EFSCORRUPTED error when verification fails. This allows the reading context to distinguish a verification error from an IO error, and potentially take further actions on the buffer (e.g. attempt repair) based on the error reported. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Phil White <pwhite@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: only update the last_sync_lsn when a transaction completesDave Chinner2012-10-171-3/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The log write code stamps each iclog with the current tail LSN in the iclog header so that recovery knows where to find the tail of thelog once it has found the head. Normally this is taken from the first item on the AIL - the log item that corresponds to the oldest active item in the log. The problem is that when the AIL is empty, the tail lsn is dervied from the the l_last_sync_lsn, which is the LSN of the last iclog to be written to the log. In most cases this doesn't happen, because the AIL is rarely empty on an active filesystem. However, when it does, it opens up an interesting case when the transaction being committed to the iclog spans multiple iclogs. That is, the first iclog is stamped with the l_last_sync_lsn, and IO is issued. Then the next iclog is setup, the changes copied into the iclog (takes some time), and then the l_last_sync_lsn is stamped into the header and IO is issued. This is still the same transaction, so the tail lsn of both iclogs must be the same for log recovery to find the entire transaction to be able to replay it. The problem arises in that the iclog buffer IO completion updates the l_last_sync_lsn with it's own LSN. Therefore, If the first iclog completes it's IO before the second iclog is filled and has the tail lsn stamped in it, it will stamp the LSN of the first iclog into it's tail lsn field. If the system fails at this point, log recovery will not see a complete transaction, so the transaction will no be replayed. The fix is simple - the l_last_sync_lsn is updated when a iclog buffer IO completes, and this is incorrect. The l_last_sync_lsn shoul dbe updated when a transaction is completed by a iclog buffer IO. That is, only iclog buffers that have transaction commit callbacks attached to them should update the l_last_sync_lsn. This means that the last_sync_lsn will only move forward when a commit record it written, not in the middle of a large transaction that is rolling through multiple iclog buffers. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: xfs_quiesce_attr() should quiesce the log like unmountDave Chinner2012-10-171-7/+18
| | | | | | | | | | | | | | xfs_quiesce_attr() is supposed to leave the log empty with an unmount record written. Right now it does not wait for the AIL to be emptied before writing the unmount record, not does it wait for metadata IO completion, either. Fix it to use the same method and code as xfs_log_unmount(). Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: syncd workqueue is no moreDave Chinner2012-10-171-1/+1
| | | | | | | | | | | | | | | With the syncd functions moved to the log and/or removed, the syncd workqueue is the only remaining bit left. It is used by the log covering/ail pushing work, as well as by the inode reclaim work. Given how cheap workqueues are these days, give the log and inode reclaim work their own work queues and kill the syncd work queue. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: Bring some sanity to log unmountingDave Chinner2012-10-171-3/+26
| | | | | | | | | | | | | | | | | | | When unmounting the filesystem, there are lots of operations that need to be done in a specific order, and they are spread across across a couple of functions. We have to drain the AIL before we write the unmount record, and we have to shut down the background log work before we do either of them. But this is all split haphazardly across xfs_unmountfs() and xfs_log_unmount(). Move all the AIL flushing and log manipulations to xfs_log_unmount() so that the responisbilities of each function is clear and the operations they perform obvious. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: sync work is now only periodic log workDave Chinner2012-10-171-10/+51
| | | | | | | | | | | | | | | | The only thing the periodic sync work does now is flush the AIL and idle the log. These are really functions of the log code, so move the work to xfs_log.c and rename it appropriately. The only wart that this leaves behind is the xfssyncd_centisecs sysctl, otherwise the xfssyncd is dead. Clean up any comments that related to xfssyncd to reflect it's passing. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: remove xlog_t typedefMark Tinguely2012-06-211-86/+137
| | | | | | | | Remove the xlog_t type definitions. Signed-off-by: Mark Tinguely <tinguely@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: rename log structure to xlogMark Tinguely2012-06-211-32/+44
| | | | | | | | | | Rename the XFS log structure to xlog to help crash distinquish it from the other logs in Linux. Signed-off-by: Mark Tinguely <tinguely@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: shutdown xfs_sync_worker before the logBen Myers2012-06-211-0/+1
| | | | | | | | | | | | | Revert commit 1307bbd, which uses the s_umount semaphore to provide exclusion between xfs_sync_worker and unmount, in favor of shutting down the sync worker before freeing the log in xfs_log_unmount. This is a cleaner way of resolving the race between xfs_sync_worker and unmount than using s_umount. Signed-off-by: Ben Myers <bpm@sgi.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: switch to proper __bitwise type for KM_... flagsAl Viro2012-05-291-1/+1
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* xfs: add trace points for log forcesDave Chinner2012-05-211-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To enable easy tracing of the location of log forces and the frequency of them via perf, add a pair of trace points to the log force functions. This will help debug where excessive log forces are being issued from by simple perf commands like: # ~/perf/perf top -e xfs:xfs_log_force -G -U Which gives this sort of output: Events: 141 xfs:xfs_log_force - 100.00% [kernel] [k] xfs_log_force - xfs_log_force 87.04% xfsaild kthread kernel_thread_helper - 12.87% xfs_buf_lock _xfs_buf_find xfs_buf_get xfs_trans_get_buf xfs_da_do_buf xfs_da_get_buf xfs_dir2_data_init xfs_dir2_leaf_addname xfs_dir_createname xfs_create xfs_vn_mknod xfs_vn_create vfs_create do_last.isra.41 path_openat do_filp_open do_sys_open sys_open system_call_fastpath Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sig.com>
* xfs: clean up xfs_bit.h includesDave Chinner2012-05-141-1/+0
| | | | | | | | | | | | | | With the removal of xfs_rw.h and other changes over time, xfs_bit.h is being included in many files that don't actually need it. Clean up the includes as necessary. Also move the only-used-once xfs_ialloc_find_free() static inline function out of a header file that is widely included to reduce the number of needless dependencies on xfs_bit.h. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: move xfs_get_extsz_hint() and kill xfs_rw.hDave Chinner2012-05-141-1/+0
| | | | | | | | | | | | | The only thing left in xfs_rw.h is a function prototype for an inode function. Move that to xfs_inode.h, and kill xfs_rw.h. Also move the function implementing the prototype from xfs_rw.c to xfs_inode.c so we only have one function left in xfs_rw.c Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: move xfsagino_t to xfs_types.hDave Chinner2012-05-141-1/+0
| | | | | | | | | | | Untangle the header file includes a bit by moving the definition of xfs_agino_t to xfs_types.h. This removes the dependency that xfs_ag.h has on xfs_inum.h, meaning we don't need to include xfs_inum.h everywhere we include xfs_ag.h. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: use blocks for storing the desired IO sizeDave Chinner2012-05-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Now that we pass block counts everywhere, and index buffers by block number and length in units of blocks, convert the desired IO size into block counts rather than bytes. Convert the code to use block counts, and those that need byte counts get converted at the time of use. Rename the b_desired_count variable to something closer to it's purpose - b_io_length - as it is only used to specify the length of an IO for a subset of the buffer. The only time this is used is for log IO - both writing iclogs and during log recovery. In all other cases, the b_io_length matches b_length, and hence a lot of code confuses the two. e.g. the buf item code uses the io count exclusively when it should be using the buffer length. Fix these apprpriately as they are found. Also, remove the XFS_BUF_{SET_}COUNT() macros that are just wrappers around the desired IO length. They only serve to make the code shouty loud, don't actually add any real value, and are often used incorrectly. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: use blocks for counting length of buffersDave Chinner2012-05-141-4/+1
| | | | | | | | | | | | | | | | Now that we pass block counts everywhere, and index buffers by block number, track the length of the buffer in units of blocks rather than bytes. Convert the code to use block counts, and those that need byte counts get converted at the time of use. Also, remove the XFS_BUF_{SET_}SIZE() macros that are just wrappers around the buffer length. They only serve to make the code shouty loud and don't actually add any real value. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: clean up buffer get/read call APIDave Chinner2012-05-141-3/+3
| | | | | | | | | | | | | | | | | | | | | The xfs_buf_get/read API is not consistent in the units it uses, and does not use appropriate or consistent units/types for the variables. Convert the API to use disk addresses and block counts for all buffer get and read calls. Use consistent naming for all the functions and their declarations, and convert the internal functions to use disk addresses and block counts to avoid need to convert them from one type to another and back again. Fix all the callers to use disk addresses and block counts. In many cases, this removes an additional conversion from the function call as the callers already have a block count. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: allow assigning the tail lsn with the AIL lock heldChristoph Hellwig2012-05-141-8/+23
| | | | | | | | | | | | | | | | | | Provide a variant of xlog_assign_tail_lsn that has the AIL lock already held. By doing so we do an additional atomic_read + atomic_set under the lock, which comes down to two instructions. Switch xfs_trans_ail_update_bulk and xfs_trans_ail_delete_bulk to the new version to reduce the number of lock roundtrips, and prepare for a new addition that would require a third lock roundtrip in xfs_trans_ail_delete_bulk. This addition is also the reason for slightly rearranging the conditionals and relying on xfs_log_space_wake for checking that the filesystem has been shut down internally. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: Account log unmount transaction correctlyDave Chinner2012-03-261-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | There have been a few reports of this warning appearing recently: XFS (dm-4): xlog_space_left: head behind tail tail_cycle = 129, tail_bytes = 20163072 GH cycle = 129, GH bytes = 20162880 The common cause appears to be lots of freeze and unfreeze cycles, and the output from the warnings indicates that we are leaking around 8 bytes of log space per freeze/unfreeze cycle. When we freeze the filesystem, we write an unmount record and that uses xlog_write directly - a special type of transaction, effectively. What it doesn't do, however, is correctly account for the log space it uses. The unmount record writes an 8 byte structure with a special magic number into the log, and the space this consumes is not accounted for in the log ticket tracking the operation. Hence we leak 8 bytes every unmount record that is written. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: split and cleanup xfs_log_reserveChristoph Hellwig2012-02-221-141/+124
| | | | | | | | | | | | | Split the log regrant case out of xfs_log_reserve into a separate function, and merge xlog_grant_log_space and xlog_regrant_write_log_space into their respective callers. Also replace the XFS_LOG_PERM_RESERV flag, which easily got misused before the previous cleanups with a simple boolean parameter. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: share code for grant head availability checksChristoph Hellwig2012-02-221-73/+60
| | | | | | | | Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: share code for grant head wakeupsChristoph Hellwig2012-02-221-30/+20
| | | | | | | | Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: share code for grant head waitingChristoph Hellwig2012-02-221-45/+18
| | | | | | | | Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: add xlog_grant_head_wake_allChristoph Hellwig2012-02-221-10/+14
| | | | | | | | Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: add xlog_grant_head_initChristoph Hellwig2012-02-221-6/+12
| | | | | | | | Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: add the xlog_grant_head structureChristoph Hellwig2012-02-221-56/+56
| | | | | | | | | | | Add a new data structure to allow sharing code between the log grant and regrant code. Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: remove log space waitqueuesChristoph Hellwig2012-02-221-9/+15
| | | | | | | | | | | | The tic->t_wait waitqueues can never have more than a single waiter on them, so we can easily replace them with a task_struct pointer and wake_up_process. Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: cleanup xfs_log_space_wakeChristoph Hellwig2012-02-221-30/+5
| | | | | | | | | | | | Remove the now unused opportunistic parameter, and use the the xlog_writeq_wake and xlog_reserveq_wake helpers now that we don't have to care about the opportunistic wakeups. Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: do exact log space wakeups in xlog_ungrant_log_spaceChristoph Hellwig2012-02-221-1/+1
| | | | | | | | | | | | | The only reason that xfs_log_space_wake had to do opportunistic wakeups was that the old xfs_log_move_tail calling convention didn't allow for exact wakeups when not updating the log tail LSN. Since this issue has been fixed we can do exact wakeups now. Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: split tail_lsn assignments from log space wakeupsChristoph Hellwig2012-02-221-43/+31
| | | | | | | | | | | | | | | | | | | | | | | | | Currently xfs_log_move_tail has a tail_lsn argument that is horribly overloaded: it may contain either an actual lsn to assign to the log tail, 0 as a special case to use the last sync LSN, or 1 to indicate that no tail LSN assignment should be performed, and we should opportunisticly wake up at one task waiting for log space even if we did not move the LSN. Remove the tail lsn assigned from xfs_log_move_tail and make the two callers use xlog_assign_tail_lsn instead of the current variant of partially using the code in xfs_log_move_tail and partially opencoding it. Note that means we grow an addition lock roundtrip on the AIL lock for each bulk update or delete, which is still far less than what we had before introducing the bulk operations. If this proves to be a problem we can still add a variant of xlog_assign_tail_lsn that expects the lock to be held already. Also rename the remainder of xfs_log_move_tail to xfs_log_space_wake as that name describes its functionality much better. Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: remove the deprecated nodelaylog optionChristoph Hellwig2011-12-081-58/+21
| | | | | | | | | | | The delaylog mode has been the default for a long time, and the nodelaylog option has been scheduled for removal in Linux 3.3. Remove it and code only used by it now that we have opened the 3.3 window. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: fix the logspace waiting algorithmChristoph Hellwig2011-12-061-175/+173
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Apply the scheme used in log_regrant_write_log_space to wake up any other threads waiting for log space before the newly added one to log_regrant_write_log_space as well, and factor the code into readable helpers. For each of the queues we have add two helpers: - one to try to wake up all waiting threads. This helper will also be usable by xfs_log_move_tail once we remove the current opportunistic wakeups in it. - one to sleep on t_wait until enough log space is available, loosely modelled after Linux waitqueues. And use them to reimplement the guts of log_regrant_write_log_space and log_regrant_write_log_space. These two function now use one and the same algorithm for waiting on log space instead of subtly different ones before, with an option to completely unify them in the near future. Also move the filesystem shutdown handling to the common caller given that we had to touch it anyway. Based on hard debugging and an earlier patch from Chandra Seetharaman <sekharan@us.ibm.com>. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chandra Seetharaman <sekharan@us.ibm.com> Tested-by: Chandra Seetharaman <sekharan@us.ibm.com> Signed-off-by: Ben Myers <bpm@sgi.com>
* xfs: constify xfs_item_opsChristoph Hellwig2011-11-081-1/+1
| | | | | | | | | | | The log item ops aren't nessecarily the biggest exploit vector, but marking them const is easy enough. Also remove the unused xfs_item_ops_t typedef while we're at it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Alex Elder <aelder@sgi.com>
* xfs: clean up xfs_ioerror_alertChristoph Hellwig2011-10-111-7/+7
| | | | | | | | | | | | | | | | | Instead of passing the block number and mount structure explicitly get them off the bp and fix make the argument order more natural. Also move it to xfs_buf.c and stop printing the device name given that we already get the fs name as part of xfs_alert, and we know what device is operates on because of the caller that gets printed, finally rename it to xfs_buf_ioerror_alert and pass __func__ as argument where it makes sense. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com>
* xfs: clean up buffer allocationChristoph Hellwig2011-10-111-1/+1
| | | | | | | | | | | | Change _xfs_buf_initialize to allocate the buffer directly and rename it to xfs_buf_alloc now that is the only buffer allocation routine. Also remove the xfs_buf_deallocate wrapper around the kmem_zone_free calls for buffers. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com>
* xfs: remove XFS_BUF_STALE and XFS_BUF_SUPER_STALEChristoph Hellwig2011-10-111-2/+2
| | | | | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com>
* xfs: Remove the macro XFS_BUF_SET_PTRChandra Seetharaman2011-07-251-2/+2
| | | | | | | | Remove the definition and usages of the macro XFS_BUF_SET_PTR. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
* xfs: Remove the macro XFS_BUF_PTRChandra Seetharaman2011-07-251-1/+1
| | | | | | | | Remove the definition and usages of the macro XFS_BUF_PTR. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
* xfs: Remove macro XFS_BUF_BUSY and familyChandra Seetharaman2011-07-251-4/+0
| | | | | | | | | Remove the definitions and uses of the macros XFS_BUF_BUSY, XFS_BUF_UNBUSY, and XFS_BUF_ISBUSY. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
* xfs: Remove the macro XFS_BUF_ERROR and familyChandra Seetharaman2011-07-251-2/+2
| | | | | | | | | Remove the definitions and usage of the macros XFS_BUF_ERROR, XFS_BUF_GETERROR and XFS_BUF_ISERROR. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
* xfs: remove wrappers around b_iodoneChristoph Hellwig2011-07-131-2/+2
| | | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: remove wrappers around b_fsprivChristoph Hellwig2011-07-131-11/+6
| | | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: add a proper transaction pointer to struct xfs_bufChristoph Hellwig2011-07-131-9/+0
| | | | | | | | | | | Replace the typeless b_fspriv2 and the ugly macros around it with a properly typed transaction pointer. As a fallout the log buffer state debug checks are also removed. We could have kept them using casts, but as they do not have a real purpose we can as well just remove them. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: avoid a few disk cache flushesChristoph Hellwig2011-07-081-6/+12
| | | | | | | | | | | There is no need for a pre-flush when doing writing the second part of a split log buffer, and if we are using an external log there is no need to do a full cache flush of the log device at all given that all writes to it use the FUA flag. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
* xfs: cleanup I/O-related buffer flagsChristoph Hellwig2011-07-081-5/+4
| | | | | | | | | | | | Remove the unused and misnamed _XBF_RUN_QUEUES flag, rename XBF_LOG_BUFFER to the more fitting XBF_SYNCIO, and split XBF_ORDERED into XBF_FUA and XBF_FLUSH to allow more fine grained control over the bio flags. Also cleanup processing of the flags in _xfs_buf_ioapply to make more sense, and renumber the sparse flag number space to group flags by purpose. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>