| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 1e2e547a93a00ebc21582c06ca3c6cfea2a309ee upstream.
For anything NFS-exported we do _not_ want to unlock new inode
before it has grown an alias; original set of fixes got the
ordering right, but missed the nasty complication in case of
lockdep being enabled - unlock_new_inode() does
lockdep_annotate_inode_mutex_key(inode)
which can only be done before anyone gets a chance to touch
->i_mutex. Unfortunately, flipping the order and doing
unlock_new_inode() before d_instantiate() opens a window when
mkdir can race with open-by-fhandle on a guessed fhandle, leading
to multiple aliases for a directory inode and all the breakage
that follows from that.
Correct solution: a new primitive (d_instantiate_new())
combining these two in the right order - lockdep annotate, then
d_instantiate(), then the rest of unlock_new_inode(). All
combinations of d_instantiate() with unlock_new_inode() should
be converted to that.
Cc: stable@kernel.org # 2.6.29 and later
Tested-by: Mike Marshall <hubcap@omnibond.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
| |
commit 006351ac8ead0d4a67dd3845e3ceffe650a23212 upstream.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
| |
commit 940ef1a0ed939c2ca029fca715e25e7778ce1e34 upstream.
... and it really needs splitting into "new" and "extend" cases, but that's for
later
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
| |
commit 6b0d144fa758869bdd652c50aa41aaf601232550 upstream.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
| |
commit eb315d2ae614493fd1ebb026c75a80573d84f7ad upstream.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
| |
commit 414cf7186dbec29bd946c138d6b5c09da5955a08 upstream.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
| |
commit 8785d84d002c2ce0f68fbcd6c2c86be859802c7e upstream.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
| |
Followup to the UFS series - with the way we clear the new blocks (via
buffer cache, possibly on more than a page worth of file) we really
should not insert a reference to new block into inode block tree until
after we'd cleared it.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Just pass NULL as locked_page in case of first block in the indirect
chain. Old calling conventions aside, a reason for having 'phys'
was that ufs_inode_getfrag() used to be able to do _two_ allocations
- indirect block and extending/reallocating a tail. We needed
locked_page for the latter (it's a data), but we also needed to
figure out that indirect block is metadata. So we used to pass
non-NULL locked_page in all cases *and* used NULL phys as
indication of being asked to allocate an indirect.
With tail unpacking taken into a separate function we don't need
those convolutions anymore.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
| |
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
| |
... and not "write to beginning of the disk", TYVM...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
| |
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
| |
same story as with ufs_inode_getblock()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
| |
ufs_extend_tail() is handling that now.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
| |
... instead of messing with buffer_head. We can bloody well do
sb_bread() in there.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The value passed to ufs_inode_getblock() as the 3rd argument
had lower bits ignored; the upper bits were shifted down
and used and they actually make sense - those are _lower_ bits
of index in indirect block (i.e. they form the index within
a fragment within an indirect block).
Pass those as argument. Upper bits of index (i.e. the number
of fragment within indirect block) will join them shortly.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
| |
just return the damn block number
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
| |
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
|
|
|
| |
These calling conventions are rudiments of pre-2.3 times; they
really need to be sanitized. This is the first step; next
will be _always_ returning a block number, instead of this
"return a pointer to buffer_head, except when we get to the
actual data" crap.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
| |
we'd already calculated it...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
|
|
| |
... and massage ufs_frag_map() to take those instead of fragment number.
As it is, we duplicate the damn thing on the write side, open-coded and
bloody hard to follow.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
|
|
| |
We are holding ->truncate_mutex, so nobody else can alter our
block pointers. Rechecks/retries were needed back when we
only held BKL there, and had to cope with write_begin/writepage
and writepage/truncate races. Can't happen anymore...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There's a case when an indirect block gets dirtied for no good
reason - when there's a hole starting in the middle of area
covered by it and spanning past its end, and truncate() is done
precisely to the beginning of the hole.
The block is obviously not modified at all - all removals happen
beyond it. However, existing code ends up dirtying it just in
case. It's trivial to fix and while it's not a real bug by any
stretch of imagination, it makes the damn thing harder to follow.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
| |
Note that it's already made unreachable from the inode, so we don't have
to worry about ufs_frag_map() walking into something already freed.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
| |
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
| |
Have caller fetch the block number *and* remove it from wherever
it was. Pass the block number instead.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
| |
turn recursion into a pair of loops
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We always have 0 < depth2 <= depth in there, so
if (--depth) {
if (--depth2)
A
B
} else {
C // not using depth2
}
D // not using depth2
is equivalent to
if (--depth2)
A with s/depth/depth - 1/
if (--depth)
B
else
C
D
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
| |
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
| |
open-coded in several places...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
| |
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
|
|
|
| |
For calls in __ufs_truncate_blocks() it's just a matter of not
incrementing offsets[0] and not making that call - immediately
following loop will be executed one extra time and we'll be just
fine. For recursive call in ufs_trunc_branch() itself, just
assing NULL to offsets if we would be about to make such call.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
| |
... and turn the switch into if (), since all cases with
depth != 1 have just become identical.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
| |
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
| |
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of manually checking that the array contains only zeroes,
find the position of the last non-zero (in __ufs_truncate(), where
we can conveniently do that) and use that to tell if there's
any non-zero in the array tail passed to ufs_trunc_...indirect().
The goal of all that clumsiness is to get fold these functions
together.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
|
| |
rather than bitslicing the offset just formed as sum of shifted indices,
pass the array of those indices itself. NULL is used as equivalent
of "all zeroes" (== free the entire branch).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
| |
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
| |
same as the previous two.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
| |
... instead of file offset. Same cleanups as in the tindirect
conversion in previous commit.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
IOW, the distance of cutoff from the begining of the branch
(in blocks).
That (and the fact that block just prior to cutoff is guaranteed to
be present) allows to tell whether to free triple indirect block
just by looking at the offset.
While we are at it, using u64 for index in the block is wrong -
those should be unsigned int.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
|
|
| |
Use ufs_block_to_path() to find the cutoff path in the block pointers' tree.
For now just use the information about the depth (to bypass the fully
preserved subtrees); subsequent commits will use the information about actual
path.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
|
| |
type makes no sense - those are indices in block number arrays, not
block numbers. And no, UFS is not likely to grow indirect blocks with
4Gpointers in them...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
|
|
| |
It is closely tied to block pointers handling there, can benefit
from existing helpers, etc. - no point keeping them apart.
Trimmed the trailing whitespaces in inode.c at the same time.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
| |
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently - on lock_ufs(), eventually - on per-inode mutex.
lock_ufs() used to be mere BKL, which is much weaker, so it needed
those rechecks. BKL doesn't provide any exclusion once we lose CPU;
its blind replacement, OTOH, _does_. Making that per-filesystem was
an atrocity, but at least we can simplify life here. And yes, we
certainly need to make that sucker per-inode - these days inode.c and
truncate.c uses are needed only to protect the block pointers.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
| |
make it return void
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
|
|
|
| |
There were 3 remaining users; in two of them we took ->s_lock immediately
after lock_ufs() and held it until just before unlock_ufs(); the third
one (statfs) could not be called from itself or from other two (remount
and sync_fs). Just use ->s_lock in statfs and don't bother with lock_ufs
at all.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* stores to block pointers are under per-inode seqlock (meta_lock) and
mutex (truncate_mutex)
* fetches of block pointers are either under truncate_mutex, or wrapped
into seqretry loop on meta_lock
* all changes of ->i_size are under truncate_mutex and i_mutex
* all changes of ->i_lastfrag are under truncate_mutex
It's similar to what ext2 is doing; the main difference is that unlike
ext2 we can't rely upon the atomicity of stores into block pointers -
on UFS2 they are 64bit. So we can't cut the corner when switching
a pointer from NULL to non-NULL as we could in ext2_splice_branch()
and need to use meta_lock on all modifications.
We use seqlock where ext2 uses rwlock; ext2 could probably also benefit
from such change...
Another non-trivial difference is that with UFS we *cannot* have reader
grab truncate_mutex in case of race - it has to keep retrying. That
might be possible to change, but not until we lift tail unpacking
several levels up in call chain.
After that commit we do *NOT* hold fs-wide serialization on accesses
to block pointers anymore. Moreover, lock_ufs() can become a normal
mutex now - it's only used on statfs, remount and sync_fs and none
of those uses are recursive. As the matter of fact, *now* it can be
collapsed with ->s_lock, and be eventually replaced with saner
per-cylinder-group spinlocks, but that's a separate story.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
|
|
|
|
|
| |
right now it doesn't matter (lock_ufs() serializes everything),
but when we switch to per-inode locking, it will be needed.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|