| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Introduce a block group type bit for a global reserve and fill the space
info for SPACE_INFO ioctl. This should replace the newly added ioctl
(01e219e8069516cdb98594d417b8bb8d906ed30d) to get just the 'size' part
of the global reserve, while the actual usage can be now visible in the
'btrfs fi df' output during ENOSPC stress.
The unpatched userspace tools will show the blockgroup as 'unknown'.
CC: Jeff Mahoney <jeffm@suse.com>
CC: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Reproducer:
mount /dev/ubda /mnt
mount -oremount,thread_pool=42 /mnt
Gives a crash:
? btrfs_workqueue_set_max+0x0/0x70
btrfs_resize_thread_pool+0xe3/0xf0
? sync_filesystem+0x0/0xc0
? btrfs_resize_thread_pool+0x0/0xf0
btrfs_remount+0x1d2/0x570
? kern_path+0x0/0x80
do_remount_sb+0xd9/0x1c0
do_mount+0x26a/0xbf0
? kfree+0x0/0x1b0
SyS_mount+0xc4/0x110
It's a call
btrfs_workqueue_set_max(fs_info->scrub_wr_completion_workers, new_pool_size);
with
fs_info->scrub_wr_completion_workers = NULL;
as scrub wqs get created only on user's demand.
Patch skips not-created-yet workqueues.
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
CC: Qu Wenruo <quwenruo@cn.fujitsu.com>
CC: Chris Mason <clm@fb.com>
CC: Josef Bacik <jbacik@fb.com>
CC: linux-btrfs@vger.kernel.org
Signed-off-by: Chris Mason <clm@fb.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
I'm not sure why we weren't aborting here in the first place, it is obviously a
bad time from the fact that we print the leaf and yell loudly about it. Fix
this up, otherwise we panic because our path could be pointing into oblivion.
Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
btrfs_drop_extents can now return -EINVAL, but only one caller
in btrfs_clone was checking for it. This adds it to the
caller for inline extents, which is where we really need it.
Signed-off-by: Chris Mason <clm@fb.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
This patch fix a regression caused by the following patch:
Btrfs: don't flush all delalloc inodes when we doesn't get s_umount lock
break while loop will make us call @spin_unlock() without
calling @spin_lock() before, fix it.
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Steps to reproduce:
# mkfs.btrfs -f /dev/sda[8-11] -m raid5 -d raid5
# mount /dev/sda8 /mnt
# btrfs scrub start -BR /mnt
# echo $? <--unverified errors make return value be 3
This is because we don't setup right mapping between physical
and logical address for raid56, which makes checksum mismatch.
But we will find everthing is fine later when rechecking using
btrfs_map_block().
This patch fixed the problem by settuping right mappings and
we only verify data stripes' checksums.
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
To compress a small file range(<=blocksize) that is not
an inline extent can not save disk space at all. skip it can
save us some cpu time.
This patch can also fix wrong setting nocompression flag for
inode, say a case when @total_in is 4096, and then we get
@total_compressed 52,because we do aligment to page cache size
firstly, and then we get into conclusion @total_in=@total_compressed
thus we will clear this inode's compression flag.
An exception comes from inserting inline extent failure but we
still have @total_compressed < @total_in,so we will still reset
inode's flag, this is ok, because we don't have good compression
effect.
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
If we don't reschedule use rb_next to find the next extent state
instead of a full tree search, which is more efficient and safe
since we didn't release the io tree's lock.
Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
There's no point building the path string in each iteration of the
send_hole loop, as it produces always the same string.
Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Originally following cmds will work:
# btrfs fi resize -10A <mnt>
# btrfs fi resize -10Gaha <mnt>
Filter the arg by checking the return pointer of memparse.
Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
During an incremental send, when we finish processing an inode (corresponding to
a regular file) we would assume the gap between the end of the last processed file
extent and the file's size corresponded to a file hole, and therefore incorrectly
send a bunch of zero bytes to overwrite that region in the file.
This affects only kernel 3.14.
Reproducer:
mkfs.btrfs -f /dev/sdc
mount /dev/sdc /mnt
xfs_io -f -c "falloc -k 0 268435456" /mnt/foo
btrfs subvolume snapshot -r /mnt /mnt/mysnap0
xfs_io -c "pwrite -S 0x01 -b 9216 16190218 9216" /mnt/foo
xfs_io -c "pwrite -S 0x02 -b 1121 198720104 1121" /mnt/foo
xfs_io -c "pwrite -S 0x05 -b 9216 107887439 9216" /mnt/foo
xfs_io -c "pwrite -S 0x06 -b 9216 225520207 9216" /mnt/foo
xfs_io -c "pwrite -S 0x07 -b 67584 102138300 67584" /mnt/foo
xfs_io -c "pwrite -S 0x08 -b 7000 94897484 7000" /mnt/foo
xfs_io -c "pwrite -S 0x09 -b 113664 245083212 113664" /mnt/foo
xfs_io -c "pwrite -S 0x10 -b 123 17937788 123" /mnt/foo
xfs_io -c "pwrite -S 0x11 -b 39936 229573311 39936" /mnt/foo
xfs_io -c "pwrite -S 0x12 -b 67584 174792222 67584" /mnt/foo
xfs_io -c "pwrite -S 0x13 -b 9216 249253213 9216" /mnt/foo
xfs_io -c "pwrite -S 0x16 -b 67584 150046083 67584" /mnt/foo
xfs_io -c "pwrite -S 0x17 -b 39936 118246040 39936" /mnt/foo
xfs_io -c "pwrite -S 0x18 -b 67584 215965442 67584" /mnt/foo
xfs_io -c "pwrite -S 0x19 -b 33792 97096725 33792" /mnt/foo
xfs_io -c "pwrite -S 0x20 -b 125952 166300596 125952" /mnt/foo
xfs_io -c "pwrite -S 0x21 -b 123 1078957 123" /mnt/foo
xfs_io -c "pwrite -S 0x25 -b 9216 212044492 9216" /mnt/foo
xfs_io -c "pwrite -S 0x26 -b 7000 265037146 7000" /mnt/foo
xfs_io -c "pwrite -S 0x27 -b 42757 215922685 42757" /mnt/foo
xfs_io -c "pwrite -S 0x28 -b 7000 69865411 7000" /mnt/foo
xfs_io -c "pwrite -S 0x29 -b 67584 67948958 67584" /mnt/foo
xfs_io -c "pwrite -S 0x30 -b 39936 266967019 39936" /mnt/foo
xfs_io -c "pwrite -S 0x31 -b 1121 19582453 1121" /mnt/foo
xfs_io -c "pwrite -S 0x32 -b 17408 257710255 17408" /mnt/foo
xfs_io -c "pwrite -S 0x33 -b 39936 3895518 39936" /mnt/foo
xfs_io -c "pwrite -S 0x34 -b 125952 12045847 125952" /mnt/foo
xfs_io -c "pwrite -S 0x35 -b 17408 19156379 17408" /mnt/foo
xfs_io -c "pwrite -S 0x36 -b 39936 50160066 39936" /mnt/foo
xfs_io -c "pwrite -S 0x37 -b 113664 9549793 113664" /mnt/foo
xfs_io -c "pwrite -S 0x38 -b 105472 94391506 105472" /mnt/foo
xfs_io -c "pwrite -S 0x39 -b 23552 143632863 23552" /mnt/foo
xfs_io -c "pwrite -S 0x40 -b 39936 241283845 39936" /mnt/foo
xfs_io -c "pwrite -S 0x41 -b 113664 199937606 113664" /mnt/foo
xfs_io -c "pwrite -S 0x42 -b 67584 67380093 67584" /mnt/foo
xfs_io -c "pwrite -S 0x43 -b 67584 26793129 67584" /mnt/foo
xfs_io -c "pwrite -S 0x44 -b 39936 14421913 39936" /mnt/foo
xfs_io -c "pwrite -S 0x45 -b 123 253097405 123" /mnt/foo
xfs_io -c "pwrite -S 0x46 -b 1121 128233424 1121" /mnt/foo
xfs_io -c "pwrite -S 0x47 -b 105472 91577959 105472" /mnt/foo
xfs_io -c "pwrite -S 0x48 -b 1121 7245381 1121" /mnt/foo
xfs_io -c "pwrite -S 0x49 -b 113664 182414694 113664" /mnt/foo
xfs_io -c "pwrite -S 0x50 -b 9216 32750608 9216" /mnt/foo
xfs_io -c "pwrite -S 0x51 -b 67584 266546049 67584" /mnt/foo
xfs_io -c "pwrite -S 0x52 -b 67584 87969398 67584" /mnt/foo
xfs_io -c "pwrite -S 0x53 -b 9216 260848797 9216" /mnt/foo
xfs_io -c "pwrite -S 0x54 -b 39936 119461243 39936" /mnt/foo
xfs_io -c "pwrite -S 0x55 -b 7000 200178693 7000" /mnt/foo
xfs_io -c "pwrite -S 0x56 -b 9216 243316029 9216" /mnt/foo
xfs_io -c "pwrite -S 0x57 -b 7000 209658229 7000" /mnt/foo
xfs_io -c "pwrite -S 0x58 -b 101376 179745192 101376" /mnt/foo
xfs_io -c "pwrite -S 0x59 -b 9216 64012300 9216" /mnt/foo
xfs_io -c "pwrite -S 0x60 -b 125952 181705139 125952" /mnt/foo
xfs_io -c "pwrite -S 0x61 -b 23552 235737348 23552" /mnt/foo
xfs_io -c "pwrite -S 0x62 -b 113664 106021355 113664" /mnt/foo
xfs_io -c "pwrite -S 0x63 -b 67584 135753552 67584" /mnt/foo
xfs_io -c "pwrite -S 0x64 -b 23552 95730888 23552" /mnt/foo
xfs_io -c "pwrite -S 0x65 -b 11 17311415 11" /mnt/foo
xfs_io -c "pwrite -S 0x66 -b 33792 120695553 33792" /mnt/foo
xfs_io -c "pwrite -S 0x67 -b 9216 17164631 9216" /mnt/foo
xfs_io -c "pwrite -S 0x68 -b 9216 136065853 9216" /mnt/foo
xfs_io -c "pwrite -S 0x69 -b 67584 37752198 67584" /mnt/foo
xfs_io -c "pwrite -S 0x70 -b 101376 189717473 101376" /mnt/foo
xfs_io -c "pwrite -S 0x71 -b 7000 227463698 7000" /mnt/foo
xfs_io -c "pwrite -S 0x72 -b 9216 12655137 9216" /mnt/foo
xfs_io -c "pwrite -S 0x73 -b 7000 7488866 7000" /mnt/foo
xfs_io -c "pwrite -S 0x74 -b 113664 87813649 113664" /mnt/foo
xfs_io -c "pwrite -S 0x75 -b 33792 25802183 33792" /mnt/foo
xfs_io -c "pwrite -S 0x76 -b 39936 93524024 39936" /mnt/foo
xfs_io -c "pwrite -S 0x77 -b 33792 113336388 33792" /mnt/foo
xfs_io -c "pwrite -S 0x78 -b 105472 184955320 105472" /mnt/foo
xfs_io -c "pwrite -S 0x79 -b 101376 225691598 101376" /mnt/foo
xfs_io -c "pwrite -S 0x80 -b 23552 77023155 23552" /mnt/foo
xfs_io -c "pwrite -S 0x81 -b 11 201888192 11" /mnt/foo
xfs_io -c "pwrite -S 0x82 -b 11 115332492 11" /mnt/foo
xfs_io -c "pwrite -S 0x83 -b 67584 230278015 67584" /mnt/foo
xfs_io -c "pwrite -S 0x84 -b 11 120589073 11" /mnt/foo
xfs_io -c "pwrite -S 0x85 -b 125952 202207819 125952" /mnt/foo
xfs_io -c "pwrite -S 0x86 -b 113664 86672080 113664" /mnt/foo
xfs_io -c "pwrite -S 0x87 -b 17408 208459603 17408" /mnt/foo
xfs_io -c "pwrite -S 0x88 -b 7000 73372211 7000" /mnt/foo
xfs_io -c "pwrite -S 0x89 -b 7000 42252122 7000" /mnt/foo
xfs_io -c "pwrite -S 0x90 -b 23552 46784881 23552" /mnt/foo
xfs_io -c "pwrite -S 0x91 -b 101376 63172351 101376" /mnt/foo
xfs_io -c "pwrite -S 0x92 -b 23552 59341931 23552" /mnt/foo
xfs_io -c "pwrite -S 0x93 -b 39936 239599283 39936" /mnt/foo
xfs_io -c "pwrite -S 0x94 -b 67584 175643105 67584" /mnt/foo
xfs_io -c "pwrite -S 0x97 -b 23552 105534880 23552" /mnt/foo
xfs_io -c "pwrite -S 0x98 -b 113664 8236844 113664" /mnt/foo
xfs_io -c "pwrite -S 0x99 -b 125952 144489686 125952" /mnt/foo
xfs_io -c "pwrite -S 0xa0 -b 7000 73273112 7000" /mnt/foo
xfs_io -c "pwrite -S 0xa1 -b 125952 194580243 125952" /mnt/foo
xfs_io -c "pwrite -S 0xa2 -b 123 56296779 123" /mnt/foo
xfs_io -c "pwrite -S 0xa3 -b 11 233066845 11" /mnt/foo
xfs_io -c "pwrite -S 0xa4 -b 39936 197727090 39936" /mnt/foo
xfs_io -c "pwrite -S 0xa5 -b 101376 53579812 101376" /mnt/foo
xfs_io -c "pwrite -S 0xa6 -b 9216 85669738 9216" /mnt/foo
xfs_io -c "pwrite -S 0xa7 -b 125952 21266322 125952" /mnt/foo
xfs_io -c "pwrite -S 0xa8 -b 23552 125726568 23552" /mnt/foo
xfs_io -c "pwrite -S 0xa9 -b 9216 18423680 9216" /mnt/foo
xfs_io -c "pwrite -S 0xb0 -b 1121 165901483 1121" /mnt/foo
btrfs subvolume snapshot -r /mnt /mnt/mysnap1
xfs_io -c "pwrite -S 0xff -b 10 16190218 10" /mnt/foo
btrfs subvolume snapshot -r /mnt /mnt/mysnap2
md5sum /mnt/foo # returns 79e53f1466bfc09fd82b450689e6119e
md5sum /mnt/mysnap2/foo # returns 79e53f1466bfc09fd82b450689e6119e too
btrfs send /mnt/mysnap1 -f /tmp/1.snap
btrfs send -p /mnt/mysnap1 /mnt/mysnap2 -f /tmp/2.snap
mkfs.btrfs -f /dev/sdc
mount /dev/sdc /mnt
btrfs receive /mnt -f /tmp/1.snap
btrfs receive /mnt -f /tmp/2.snap
md5sum /mnt/mysnap2/foo # returns 2bb414c5155767cedccd7063e51beabd !!
A testcase for xfstests follows soon too.
Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
The error handling was copy and pasted from memdup_user(). It should be
checking for NULL obviously.
Fixes: abccd00f8af2 ('btrfs: Fix 32/64-bit problem with BTRFS_SET_RECEIVED_SUBVOL ioctl')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
While running fsstress and snapshots concurrently, we will hit something
like followings:
Thread 1 Thread 2
|->fallocate
|->write pages
|->join transaction
|->add ordered extent
|->end transaction
|->flushing data
|->creating pending snapshots
|->write data into src root's
fallocated space
After above work flows finished, we will get a state that source and
snapshot root share same space, but source root have written data into
fallocated space, this will make fsck fail to verify checksums for
snapshot root's preallocating file extent data.Nocow writting also
has this same problem.
Fix this problem by syncing snapshots with nocow writting:
1.for nocow writting,if there are pending snapshots, we will
fall into COW way.
2.if there are pending nocow writes, snapshots for this root
will be blocked until nocow writting finish.
Reported-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
When testing fsstress with snapshot making background, some snapshot
following problem.
Snapshot 270:
inode 323: size 0
Snapshot 271:
inode 323: size 349145
|-------Hole---|---------Empty gap-------|-------Hole-----|
0 122880 172032 349145
Snapshot 272:
inode 323: size 349145
|-------Hole---|------------Data---------|-------Hole-----|
0 122880 172032 349145
The fsstress operation on inode 323 is the following:
write: offset 126832 len 43124
truncate: size 349145
Since the write with offset is consist of 2 operations:
1. punch hole
2. write data
Hole punching is faster than data write, so hole punching in write
and truncate is done first and then buffered write, so the snapshot 271 got
empty gap, which will not pass btrfsck.
To fix the bug, this patch will change the write sequence which will
first punch a hole covering the write end if a hole is needed.
Reported-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Print the message only when the device is seen for the first time.
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
When encountering memory pressure, testers have run into the following
lockdep warning. It was caused by __link_block_group calling kobject_add
with the groups_sem held. kobject_add calls kvasprintf with GFP_KERNEL,
which gets us into reclaim context. The kobject doesn't actually need
to be added under the lock -- it just needs to ensure that it's only
added for the first block group to be linked.
=========================================================
[ INFO: possible irq lock inversion dependency detected ]
3.14.0-rc8-default #1 Not tainted
---------------------------------------------------------
kswapd0/169 just changed the state of lock:
(&delayed_node->mutex){+.+.-.}, at: [<ffffffffa018baea>] __btrfs_release_delayed_node+0x3a/0x200 [btrfs]
but this lock took another, RECLAIM_FS-unsafe lock in the past:
(&found->groups_sem){+++++.}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
Possible interrupt unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&found->groups_sem);
local_irq_disable();
lock(&delayed_node->mutex);
lock(&found->groups_sem);
<Interrupt>
lock(&delayed_node->mutex);
*** DEADLOCK ***
2 locks held by kswapd0/169:
#0: (shrinker_rwsem){++++..}, at: [<ffffffff81159e8a>] shrink_slab+0x3a/0x160
#1: (&type->s_umount_key#27){++++..}, at: [<ffffffff811bac6f>] grab_super_passive+0x3f/0x90
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
We currently rely too heavily on roots being read-only to save us from just
accessing root->commit_root. We can easily balance blocks out from underneath a
read only root, so to save us from getting screwed make sure we only access
root->commit_root under the commit root sem. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Lets try this again. We can deadlock the box if we send on a box and try to
write onto the same fs with the app that is trying to listen to the send pipe.
This is because the writer could get stuck waiting for a transaction commit
which is being blocked by the send. So fix this by making sure looking at the
commit roots is always going to be consistent. We do this by keeping track of
which roots need to have their commit roots swapped during commit, and then
taking the commit_root_sem and swapping them all at once. Then make sure we
take a read lock on the commit_root_sem in cases where we search the commit root
to make sure we're always looking at a consistent view of the commit roots.
Previously we had problems with this because we would swap a fs tree commit root
and then swap the extent tree commit root independently which would cause the
backref walking code to screw up sometimes. With this patch we no longer
deadlock and pass all the weird send/receive corner cases. Thanks,
Reportedy-by: Hugo Mills <hugo@carfax.org.uk>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
So I have an awful exercise script that will run snapshot, balance and
send/receive in parallel. This sometimes would crash spectacularly and when it
came back up the fs would be completely hosed. Turns out this is because of a
bad interaction of balance and send/receive. Send will hold onto its entire
path for the whole send, but its blocks could get relocated out from underneath
it, and because it doesn't old tree locks theres nothing to keep this from
happening. So it will go to read in a slot with an old transid, and we could
have re-allocated this block for something else and it could have a completely
different transid. But because we think it is invalid we clear uptodate and
re-read in the block. If we do this before we actually write out the new block
we could write back stale data to the fs, and boom we're screwed.
Now we definitely need to fix this disconnect between send and balance, but we
really really need to not allow ourselves to accidently read in stale data over
new data. So make sure we check if the extent buffer is not under io before
clearing uptodate, this will kick back EIO to the caller instead of reading in
stale data and keep us from corrupting the fs. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
We could have possibly added an extent_op to the locked_ref while we dropped
locked_ref->lock, so check for this case as well and loop around. Otherwise we
could lose flag updates which would lead to extent tree corruption. Thanks,
cc: stable@vger.kernel.org
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
This was done to allow NO_COW to continue to be NO_COW after relocation but it
is not right. When relocating we will convert blocks to FULL_BACKREF that we
relocate. We can leave some of these full backref blocks behind if they are not
cow'ed out during the relocation, like if we fail the relocation with ENOSPC and
then just drop the reloc tree. Then when we go to cow the block again we won't
lookup the extent flags because we won't think there has been a snapshot
recently which means we will do our normal ref drop thing instead of adding back
a tree ref and dropping the shared ref. This will cause btrfs_free_extent to
blow up because it can't find the ref we are trying to free. This was found
with my ref verifying tool. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
|
|\ \ \ \ \ \
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Pull exofs updates from Boaz Harrosh:
"Trivial updates to exofs for 3.15-rc1
Just a few fixes sent by people"
* 'for-linus' of git://git.open-osd.org/linux-open-osd:
MAINTAINERS: Update email address for bhalevy
fs: Mark functions as static in exofs/ore_raid.c
fs: Mark function as static in exofs/super.c
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Mark functions as static in exofs/ore_raid.c because they are not used
outside this file.
This also eliminates the following warning in exofs/ore_raid.c:
fs/exofs/ore_raid.c:24:14: warning: no previous prototype for _raid_page_alloc [-Wmissing-prototypes]
fs/exofs/ore_raid.c:29:6: warning: no previous prototype for _raid_page_free [-Wmissing-prototypes]
Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
Mark function as static in exofs/super.c because it is not used outside
this file.
This also eliminates the following warning in exofs/super.c:
fs/exofs/super.c:546:5: warning: no previous prototype \
for __alloc_dev_table[-Wmissing-prototypes]
Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
|
|\ \ \ \ \ \ \
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
Pull block layer fixes from Jens Axboe:
"A small collection of fixes that should go in before -rc1. The pull
request contains:
- A two patch fix for a regression with block enabled tagging caused
by a commit in the initial pull request. One patch is from Martin
and ensures that SCSI doesn't truncate 64-bit block flags, the
other one is from me and prevents us from double using struct
request queuelist for both completion and busy tags. This caused
anything from a boot crash for some, to crashes under load.
- A blk-mq fix for a potential soft stall when hot unplugging CPUs
with busy IO.
- percpu_counter fix is listed in here, that caused a suspend issue
with virtio-blk due to percpu counters having an inconsistent state
during CPU removal. Andrew sent this in separately a few days ago,
but it's here. JFYI.
- A few fixes for block integrity from Martin.
- A ratelimit fix for loop from Mike Galbraith, to avoid spewing too
much in error cases"
* 'for-linus' of git://git.kernel.dk/linux-block:
block: fix regression with block enabled tagging
scsi: Make sure cmd_flags are 64-bit
block: Ensure we only enable integrity metadata for reads and writes
block: Fix integrity verification
block: Fix for_each_bvec()
drivers/block/loop.c: ratelimit error messages
blk-mq: fix potential stall during CPU unplug with IO pending
percpu_counter: fix bad counter state during suspend
|
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
We'd occasionally attempt to generate protection information for flushes
and other requests with a zero payload. Make sure we only attempt to
enable integrity for reads and writes.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
Commit bf36f9cfa6d3d caused a regression by effectively reverting Nic's
fix from 5837c80e870b that ensures we traverse the full bio_vec list
upon completion.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Cc: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|\ \ \ \ \ \ \ \
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
Pull nfsd updates from Bruce Fields:
"Highlights:
- server-side nfs/rdma fixes from Jeff Layton and Tom Tucker
- xdr fixes (a larger xdr rewrite has been posted but I decided it
would be better to queue it up for 3.16).
- miscellaneous fixes and cleanup from all over (thanks especially to
Kinglong Mee)"
* 'for-3.15' of git://linux-nfs.org/~bfields/linux: (36 commits)
nfsd4: don't create unnecessary mask acl
nfsd: revert v2 half of "nfsd: don't return high mode bits"
nfsd4: fix memory leak in nfsd4_encode_fattr()
nfsd: check passed socket's net matches NFSd superblock's one
SUNRPC: Clear xpt_bc_xprt if xs_setup_bc_tcp failed
NFSD/SUNRPC: Check rpc_xprt out of xs_setup_bc_tcp
SUNRPC: New helper for creating client with rpc_xprt
NFSD: Free backchannel xprt in bc_destroy
NFSD: Clear wcc data between compound ops
nfsd: Don't return NFS4ERR_STALE_STATEID for NFSv4.1+
nfsd4: fix nfs4err_resource in 4.1 case
nfsd4: fix setclientid encode size
nfsd4: remove redundant check from nfsd4_check_resp_size
nfsd4: use more generous NFS4_ACL_MAX
nfsd4: minor nfsd4_replay_cache_entry cleanup
nfsd4: nfsd4_replay_cache_entry should be static
nfsd4: update comments with obsolete function name
rpc: Allow xdr_buf_subsegment to operate in-place
NFSD: Using free_conn free connection
SUNRPC: fix memory leak of peer addresses in XPRT
...
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
Any setattr of the ACL attribute, even if it sets just the basic 3-ACE
ACL exactly as it was returned from a file with only mode bits, creates
a mask entry, and it is only the mask, not group, entry that is changed
by subsequent modifications of the mode bits.
So, for example, it's surprising that GROUP@ is left without read or
write permissions after a chmod 0666:
touch test
chmod 0600 test
nfs4_getfacl test
A::OWNER@:rwatTcCy
A::GROUP@:tcy
A::EVERYONE@:tcy
nfs4_getfacl test | nfs4_setfacl -S - test #
chmod 0666 test
nfs4_getfacl test
A::OWNER@:rwatTcCy
A::GROUP@:tcy
D::GROUP@:rwa
A::EVERYONE@:rwatcy
So, let's stop creating the unnecessary mask ACL.
A mask will still be created on non-trivial ACLs (ACLs with actual named
user and group ACEs), so the odd posix-acl behavior of chmod modifying
only the mask will still be left in that case; but that's consistent
with local behavior.
Reported-by: Soumya Koduri <skoduri@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
This reverts the part of commit 6e14b46b91fee8a049b0940333ce13a820beaaa5
that changes NFSv2 behavior.
Mark Lord found that it broke nfs-root for Linux clients, because it
broke NFSv2.
In fact, from RFC 1094:
"Notice that the file type is specified both in the mode bits
and in the file type. This is really a bug in the protocol and
will be fixed in future versions."
So NFSv2 clients really are expected to depend on the high bits of the
mode.
Cc: stable@kernel.org
Reported-by: Mark Lord <mlord@pobox.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
fh_put() does not free the temporary file handle.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
There could be a case, when NFSd file system is mounted in network, different
to socket's one, like below:
"ip netns exec" creates new network and mount namespace, which duplicates NFSd
mount point, created in init_net context. And thus NFS server stop in nested
network context leads to RPCBIND client destruction in init_net.
Then, on NFSd start in nested network context, rpc.nfsd process creates socket
in nested net and passes it into "write_ports", which leads to RPCBIND sockets
creation in init_net context because of the same reason (NFSd monut point was
created in init_net context). An attempt to register passed socket in nested
net leads to panic, because no RPCBIND client present in nexted network
namespace.
This patch add check that passed socket's net matches NFSd superblock's one.
And returns -EINVAL error to user psace otherwise.
v2: Put socket on exit.
Reported-by: Weng Meiling <wengmeiling.weng@huawei.com>
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
Besides checking rpc_xprt out of xs_setup_bc_tcp,
increase it's reference (it's important).
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
Testing NFS4.0 by pynfs, I got some messeages as,
"nfsd: inode locked twice during operation."
When one compound RPC contains two or more ops that locks
the filehandle,the second op will cause the message.
As two SETATTR ops, after the first SETATTR, nfsd will not call
fh_put() to release current filehandle, it means filehandle have
unlocked with fh_post_saved = 1.
The second SETATTR find fh_post_saved = 1, and printk the message.
v2: introduce helper fh_clear_wcc().
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
RFC5661 obsoletes NFS4ERR_STALE_STATEID in favour of NFS4ERR_BAD_STATEID.
Note that because nfsd encodes the clientid boot time in the stateid, we
can hit this error case in certain scenarios where the Linux client
state management thread exits early, before it has finished recovering
all state.
Reported-by: Idan Kedar <idank@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
encode_getattr, for example, can return nfserr_resource to indicate it
ran out of buffer space. That's not a legal error in the 4.1 case. And
in the 4.1 case, if we ran out of buffer space, we should have exceeded
a session limit too.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
cstate->slot and ->session are each set together in nfsd4_sequence. If
one is non-NULL, so is the other.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
Maybe this is comment true, who cares? Handle this like any other
error.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
This isn't actually used anywhere else.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
Connection from alloc_conn must be freed through free_conn,
otherwise, the reference of svc_xprt will never be put.
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
during lockd_up
We had a Fedora ABRT report with a stack trace like this:
kernel BUG at net/sunrpc/svc.c:550!
invalid opcode: 0000 [#1] SMP
[...]
CPU: 2 PID: 913 Comm: rpc.nfsd Not tainted 3.13.6-200.fc20.x86_64 #1
Hardware name: Hewlett-Packard HP ProBook 4740s/1846, BIOS 68IRR Ver. F.40 01/29/2013
task: ffff880146b00000 ti: ffff88003f9b8000 task.ti: ffff88003f9b8000
RIP: 0010:[<ffffffffa0305fa8>] [<ffffffffa0305fa8>] svc_destroy+0x128/0x130 [sunrpc]
RSP: 0018:ffff88003f9b9de0 EFLAGS: 00010206
RAX: ffff88003f829628 RBX: ffff88003f829600 RCX: 00000000000041ee
RDX: 0000000000000000 RSI: 0000000000000286 RDI: 0000000000000286
RBP: ffff88003f9b9de8 R08: 0000000000017360 R09: ffff88014fa97360
R10: ffffffff8114ce57 R11: ffffea00051c9c00 R12: ffff88003f829600
R13: 00000000ffffff9e R14: ffffffff81cc7cc0 R15: 0000000000000000
FS: 00007f4fde284840(0000) GS:ffff88014fa80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f4fdf5192f8 CR3: 00000000a569a000 CR4: 00000000001407e0
Stack:
ffff88003f792300 ffff88003f9b9e18 ffffffffa02de02a 0000000000000000
ffffffff81cc7cc0 ffff88003f9cb000 0000000000000008 ffff88003f9b9e60
ffffffffa033bb35 ffffffff8131c86c ffff88003f9cb000 ffff8800a5715008
Call Trace:
[<ffffffffa02de02a>] lockd_up+0xaa/0x330 [lockd]
[<ffffffffa033bb35>] nfsd_svc+0x1b5/0x2f0 [nfsd]
[<ffffffff8131c86c>] ? simple_strtoull+0x2c/0x50
[<ffffffffa033c630>] ? write_pool_threads+0x280/0x280 [nfsd]
[<ffffffffa033c6bb>] write_threads+0x8b/0xf0 [nfsd]
[<ffffffff8114efa4>] ? __get_free_pages+0x14/0x50
[<ffffffff8114eff6>] ? get_zeroed_page+0x16/0x20
[<ffffffff811dec51>] ? simple_transaction_get+0xb1/0xd0
[<ffffffffa033c098>] nfsctl_transaction_write+0x48/0x80 [nfsd]
[<ffffffff811b8b34>] vfs_write+0xb4/0x1f0
[<ffffffff811c3f99>] ? putname+0x29/0x40
[<ffffffff811b9569>] SyS_write+0x49/0xa0
[<ffffffff810fc2a6>] ? __audit_syscall_exit+0x1f6/0x2a0
[<ffffffff816962e9>] system_call_fastpath+0x16/0x1b
Code: 31 c0 e8 82 db 37 e1 e9 2a ff ff ff 48 8b 07 8b 57 14 48 c7 c7 d5 c6 31 a0 48 8b 70 20 31 c0 e8 65 db 37 e1 e9 f4 fe ff ff 0f 0b <0f> 0b 66 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 41 56 41 55
RIP [<ffffffffa0305fa8>] svc_destroy+0x128/0x130 [sunrpc]
RSP <ffff88003f9b9de0>
Evidently, we created some lockd sockets and then failed to create
others. make_socks then returned an error and we tried to tear down the
svc, but svc->sv_permsocks was not empty so we ended up tripping over
the BUG() in svc_destroy().
Fix this by ensuring that we tear down any live sockets we created when
socket creation is going to return an error.
Fixes: 786185b5f8abefa (SUNRPC: move per-net operations from...)
Reported-by: Raphos <raphoszap@laposte.net>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
When stopping nfsd, I got BUG messages, and soft lockup messages,
The problem is cuased by double rb_erase() in nfs4_state_destroy_net()
and destroy_client().
This patch just let nfsd traversing unconfirmed client through
hash-table instead of rbtree.
[ 2325.021995] BUG: unable to handle kernel NULL pointer dereference at
(null)
[ 2325.022809] IP: [<ffffffff8133c18c>] rb_erase+0x14c/0x390
[ 2325.022982] PGD 7a91b067 PUD 7a33d067 PMD 0
[ 2325.022982] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
[ 2325.022982] Modules linked in: nfsd(OF) cfg80211 rfkill bridge stp
llc snd_intel8x0 snd_ac97_codec ac97_bus auth_rpcgss nfs_acl serio_raw
e1000 i2c_piix4 ppdev snd_pcm snd_timer lockd pcspkr joydev parport_pc
snd parport i2c_core soundcore microcode sunrpc ata_generic pata_acpi
[last unloaded: nfsd]
[ 2325.022982] CPU: 1 PID: 2123 Comm: nfsd Tainted: GF O
3.14.0-rc8+ #2
[ 2325.022982] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS
VirtualBox 12/01/2006
[ 2325.022982] task: ffff88007b384800 ti: ffff8800797f6000 task.ti:
ffff8800797f6000
[ 2325.022982] RIP: 0010:[<ffffffff8133c18c>] [<ffffffff8133c18c>]
rb_erase+0x14c/0x390
[ 2325.022982] RSP: 0018:ffff8800797f7d98 EFLAGS: 00010246
[ 2325.022982] RAX: ffff880079c1f010 RBX: ffff880079f4c828 RCX:
0000000000000000
[ 2325.022982] RDX: 0000000000000000 RSI: ffff880079bcb070 RDI:
ffff880079f4c810
[ 2325.022982] RBP: ffff8800797f7d98 R08: 0000000000000000 R09:
ffff88007964fc70
[ 2325.022982] R10: 0000000000000000 R11: 0000000000000400 R12:
ffff880079f4c800
[ 2325.022982] R13: ffff880079bcb000 R14: ffff8800797f7da8 R15:
ffff880079f4c860
[ 2325.022982] FS: 0000000000000000(0000) GS:ffff88007f900000(0000)
knlGS:0000000000000000
[ 2325.022982] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 2325.022982] CR2: 0000000000000000 CR3: 000000007a3ef000 CR4:
00000000000006e0
[ 2325.022982] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[ 2325.022982] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[ 2325.022982] Stack:
[ 2325.022982] ffff8800797f7de0 ffffffffa0191c6e ffff8800797f7da8
ffff8800797f7da8
[ 2325.022982] ffff880079f4c810 ffff880079bcb000 ffffffff81cc26c0
ffff880079c1f010
[ 2325.022982] ffff880079bcb070 ffff8800797f7e28 ffffffffa01977f2
ffff8800797f7df0
[ 2325.022982] Call Trace:
[ 2325.022982] [<ffffffffa0191c6e>] destroy_client+0x32e/0x3b0 [nfsd]
[ 2325.022982] [<ffffffffa01977f2>] nfs4_state_shutdown_net+0x1a2/0x220
[nfsd]
[ 2325.022982] [<ffffffffa01700b8>] nfsd_shutdown_net+0x38/0x70 [nfsd]
[ 2325.022982] [<ffffffffa017013e>] nfsd_last_thread+0x4e/0x80 [nfsd]
[ 2325.022982] [<ffffffffa001f1eb>] svc_shutdown_net+0x2b/0x30 [sunrpc]
[ 2325.022982] [<ffffffffa017064b>] nfsd_destroy+0x5b/0x80 [nfsd]
[ 2325.022982] [<ffffffffa0170773>] nfsd+0x103/0x130 [nfsd]
[ 2325.022982] [<ffffffffa0170670>] ? nfsd_destroy+0x80/0x80 [nfsd]
[ 2325.022982] [<ffffffff810a8232>] kthread+0xd2/0xf0
[ 2325.022982] [<ffffffff810a8160>] ? insert_kthread_work+0x40/0x40
[ 2325.022982] [<ffffffff816c493c>] ret_from_fork+0x7c/0xb0
[ 2325.022982] [<ffffffff810a8160>] ? insert_kthread_work+0x40/0x40
[ 2325.022982] Code: 48 83 e1 fc 48 89 10 0f 84 02 01 00 00 48 3b 41 10
0f 84 08 01 00 00 48 89 51 08 48 89 fa e9 74 ff ff ff 0f 1f 40 00 48 8b
50 10 <f6> 02 01 0f 84 93 00 00 00 48 8b 7a 10 48 85 ff 74 05 f6 07 01
[ 2325.022982] RIP [<ffffffff8133c18c>] rb_erase+0x14c/0x390
[ 2325.022982] RSP <ffff8800797f7d98>
[ 2325.022982] CR2: 0000000000000000
[ 2325.022982] ---[ end trace 28c27ed011655e57 ]---
[ 228.064071] BUG: soft lockup - CPU#0 stuck for 22s! [nfsd:558]
[ 228.064428] Modules linked in: ip6t_rpfilter ip6t_REJECT cfg80211
xt_conntrack rfkill ebtable_nat ebtable_broute bridge stp llc
ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6
nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw
ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4
nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security
iptable_raw nfsd(OF) auth_rpcgss nfs_acl lockd snd_intel8x0
snd_ac97_codec ac97_bus joydev snd_pcm snd_timer e1000 sunrpc snd ppdev
parport_pc serio_raw pcspkr i2c_piix4 microcode parport soundcore
i2c_core ata_generic pata_acpi
[ 228.064539] CPU: 0 PID: 558 Comm: nfsd Tainted: GF O
3.14.0-rc8+ #2
[ 228.064539] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS
VirtualBox 12/01/2006
[ 228.064539] task: ffff880076adec00 ti: ffff880074616000 task.ti:
ffff880074616000
[ 228.064539] RIP: 0010:[<ffffffff8133ba17>] [<ffffffff8133ba17>]
rb_next+0x27/0x50
[ 228.064539] RSP: 0018:ffff880074617de0 EFLAGS: 00000282
[ 228.064539] RAX: ffff880074478010 RBX: ffff88007446f860 RCX:
0000000000000014
[ 228.064539] RDX: ffff880074478010 RSI: 0000000000000000 RDI:
ffff880074478010
[ 228.064539] RBP: ffff880074617de0 R08: 0000000000000000 R09:
0000000000000012
[ 228.064539] R10: 0000000000000001 R11: ffffffffffffffec R12:
ffffea0001d11a00
[ 228.064539] R13: ffff88007f401400 R14: ffff88007446f800 R15:
ffff880074617d50
[ 228.064539] FS: 0000000000000000(0000) GS:ffff88007f800000(0000)
knlGS:0000000000000000
[ 228.064539] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 228.064539] CR2: 00007fe9ac6ec000 CR3: 000000007a5d6000 CR4:
00000000000006f0
[ 228.064539] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[ 228.064539] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[ 228.064539] Stack:
[ 228.064539] ffff880074617e28 ffffffffa01ab7db ffff880074617df0
ffff880074617df0
[ 228.064539] ffff880079273000 ffffffff81cc26c0 ffffffff81cc26c0
0000000000000000
[ 228.064539] 0000000000000000 ffff880074617e48 ffffffffa01840b8
ffffffff81cc26c0
[ 228.064539] Call Trace:
[ 228.064539] [<ffffffffa01ab7db>] nfs4_state_shutdown_net+0x18b/0x220
[nfsd]
[ 228.064539] [<ffffffffa01840b8>] nfsd_shutdown_net+0x38/0x70 [nfsd]
[ 228.064539] [<ffffffffa018413e>] nfsd_last_thread+0x4e/0x80 [nfsd]
[ 228.064539] [<ffffffffa00aa1eb>] svc_shutdown_net+0x2b/0x30 [sunrpc]
[ 228.064539] [<ffffffffa018464b>] nfsd_destroy+0x5b/0x80 [nfsd]
[ 228.064539] [<ffffffffa0184773>] nfsd+0x103/0x130 [nfsd]
[ 228.064539] [<ffffffffa0184670>] ? nfsd_destroy+0x80/0x80 [nfsd]
[ 228.064539] [<ffffffff810a8232>] kthread+0xd2/0xf0
[ 228.064539] [<ffffffff810a8160>] ? insert_kthread_work+0x40/0x40
[ 228.064539] [<ffffffff816c493c>] ret_from_fork+0x7c/0xb0
[ 228.064539] [<ffffffff810a8160>] ? insert_kthread_work+0x40/0x40
[ 228.064539] Code: 1f 44 00 00 55 48 8b 17 48 89 e5 48 39 d7 74 3b 48
8b 47 08 48 85 c0 75 0e eb 25 66 0f 1f 84 00 00 00 00 00 48 89 d0 48 8b
50 10 <48> 85 d2 75 f4 5d c3 66 90 48 3b 78 08 75 f6 48 8b 10 48 89 c7
Fixes: ac55fdc408039 (nfsd: move the confirmed and unconfirmed hlists...)
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Cc: stable@vger.kernel.org
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
We have a WARN_ON in the nfsd4_decode_write() that tells us when the
client has sent a request that is not padded out properly according to
RFC4506. A WARN_ON really isn't appropriate in this case though since
this indicates a client bug, not a server one.
Move this check out to the top-level compound decoder and have it just
explicitly return an error. Also add a dprintk() that shows the client
address and xid to help track down clients and frames that trigger it.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
Looks like this bug has been here since these write counts were
introduced, not sure why it was just noticed now.
Thanks also to Jan Kara for pointing out the problem.
Cc: stable@vger.kernel.org
Reported-by: Matthew Rahtz <mrahtz@rapitasystems.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
If the entire operation fails then there's nothing to encode.
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|