| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit d9b22cf9f5466a057f2a4f1e642b469fa9d73117 ]
When a filesystem is created using:
mkfs.ext4 -b 4096 -E stride=512 <dev>
and we try to allocate 64MB extent, we will end up directly in
ext4_mb_complex_scan_group(). This is because the request is detected
as power-of-two allocation (so we start in ext4_mb_regular_allocator()
with ac_criteria == 0) however the check before
ext4_mb_simple_scan_group() refuses the direct buddy scan because the
allocation request is too large. Since cr == 0, the check whether we
should use ext4_mb_scan_aligned() fails as well and we fall back to
ext4_mb_complex_scan_group().
Fix the problem by checking for upper limit on power-of-two requests
directly when detecting them.
Reported-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 105ddc93f06ebe3e553f58563d11ed63dbcd59f0 upstream.
The first cluster group descriptor is not stored at the start of the
group but at an offset from the start. We need to take this into
account while doing fstrim on the first cluster group. Otherwise we
will wrongly start fstrim a few blocks after the desired start block and
the range can cross over into the next cluster group and zero out the
group descriptor there. This can cause filesytem corruption that cannot
be fixed by fsck.
Link: http://lkml.kernel.org/r/1507835579-7308-1-git-send-email-ashish.samant@oracle.com
Signed-off-by: Ashish Samant <ashish.samant@oracle.com>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Joseph Qi <jiangqi903@gmail.com>
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit f74bc7c6679200a4a83156bb89cbf6c229fe8ec0 upstream.
And fix tcon leak in error path.
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: David Disseldorp <ddiss@samba.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit f66665c09ab489a11ca490d6a82df57cfc1bea3e upstream.
In eCryptfs, we failed to verify that the authentication token keys are
not revoked before dereferencing their payloads, which is problematic
because the payload of a revoked key is NULL. request_key() *does* skip
revoked keys, but there is still a window where the key can be revoked
before we acquire the key semaphore.
Fix it by updating ecryptfs_get_key_payload_data() to return
-EKEYREVOKED if the key payload is NULL. For completeness we check this
for "encrypted" keys as well as "user" keys, although encrypted keys
cannot be revoked currently.
Alternatively we could use key_validate(), but since we'll also need to
fix ecryptfs_get_key_payload_data() to validate the payload length, it
seems appropriate to just check the payload pointer.
Fixes: 237fead61998 ("[PATCH] ecryptfs: fs/Makefile and fs/Kconfig")
Reviewed-by: James Morris <james.l.morris@oracle.com>
Cc: Michael Halcrow <mhalcrow@google.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit c6cdd51404b7ac12dd95173ddfc548c59ecf037f upstream.
Marios Titas running a Haskell program noticed a problem with fuse's
readdirplus: when it is interrupted by a signal, it skips one directory
entry.
The reason is that fuse erronously updates ctx->pos after a failed
dir_emit().
The issue originates from the patch adding readdirplus support.
Reported-by: Jakob Unterwurzacher <jakobunt@gmail.com>
Tested-by: Marios Titas <redneb@gmx.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 0b05b18381ee ("fuse: implement NFS-like readdirplus support")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 6c2838fbdedb9b72a81c931d49e56b229b6cdbca upstream.
sparse warns:
fs/ceph/caps.c:2042:9: warning: context imbalance in 'try_flush_caps' - wrong count at exit
We need to exit this function with the lock unlocked, but a couple of
cases leave it locked.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit d124b2c53c7bee6569d2a2d0b18b4a1afde00134 upstream.
When the file /proc/fs/fscache/objects (available with
CONFIG_FSCACHE_OBJECT_LIST=y) is opened, we request a user key with
description "fscache:objlist", then access its payload. However, a
revoked key has a NULL payload, and we failed to check for this.
request_key() *does* skip revoked keys, but there is still a window
where the key can be revoked before we access its payload.
Fix it by checking for a NULL payload, treating it like a key which was
already revoked at the time it was requested.
Fixes: 4fbf4291aa15 ("FS-Cache: Allow the current state of all objects to be dumped")
Reviewed-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 439a36b8ef38657f765b80b775e2885338d72451 ]
We are in the situation that we have to avoid recursive cluster locking,
but there is no way to check if a cluster lock has been taken by a precess
already.
Mostly, we can avoid recursive locking by writing code carefully.
However, we found that it's very hard to handle the routines that are
invoked directly by vfs code. For instance:
const struct inode_operations ocfs2_file_iops = {
.permission = ocfs2_permission,
.get_acl = ocfs2_iop_get_acl,
.set_acl = ocfs2_iop_set_acl,
};
Both ocfs2_permission() and ocfs2_iop_get_acl() call ocfs2_inode_lock(PR):
do_sys_open
may_open
inode_permission
ocfs2_permission
ocfs2_inode_lock() <=== first time
generic_permission
get_acl
ocfs2_iop_get_acl
ocfs2_inode_lock() <=== recursive one
A deadlock will occur if a remote EX request comes in between two of
ocfs2_inode_lock(). Briefly describe how the deadlock is formed:
On one hand, OCFS2_LOCK_BLOCKED flag of this lockres is set in
BAST(ocfs2_generic_handle_bast) when downconvert is started on behalf of
the remote EX lock request. Another hand, the recursive cluster lock
(the second one) will be blocked in in __ocfs2_cluster_lock() because of
OCFS2_LOCK_BLOCKED. But, the downconvert never complete, why? because
there is no chance for the first cluster lock on this node to be
unlocked - we block ourselves in the code path.
The idea to fix this issue is mostly taken from gfs2 code.
1. introduce a new field: struct ocfs2_lock_res.l_holders, to keep track
of the processes' pid who has taken the cluster lock of this lock
resource;
2. introduce a new flag for ocfs2_inode_lock_full:
OCFS2_META_LOCK_GETBH; it means just getting back disk inode bh for
us if we've got cluster lock.
3. export a helper: ocfs2_is_locked_by_me() is used to check if we have
got the cluster lock in the upper code path.
The tracking logic should be used by some of the ocfs2 vfs's callbacks,
to solve the recursive locking issue cuased by the fact that vfs
routines can call into each other.
The performance penalty of processing the holder list should only be
seen at a few cases where the tracking logic is used, such as get/set
acl.
You may ask what if the first time we got a PR lock, and the second time
we want a EX lock? fortunately, this case never happens in the real
world, as far as I can see, including permission check,
(get|set)_(acl|attr), and the gfs2 code also do so.
[sfr@canb.auug.org.au remove some inlines]
Link: http://lkml.kernel.org/r/20170117100948.11657-2-zren@suse.com
Signed-off-by: Eric Ren <zren@suse.com>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Joseph Qi <jiangqi903@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 4dd9920d991745c4a16f53a8f615f706fbe4b3f7 ]
Under certain situations, an incremental send operation can fail due to a
premature attempt to create a new top level inode (a direct child of the
subvolume/snapshot root) whose name collides with another inode that was
removed from the send snapshot.
Consider the following example scenario.
Parent snapshot:
. (ino 256, gen 8)
|---- a1/ (ino 257, gen 9)
|---- a2/ (ino 258, gen 9)
Send snapshot:
. (ino 256, gen 3)
|---- a2/ (ino 257, gen 7)
In this scenario, when receiving the incremental send stream, the btrfs
receive command fails like this (ran in verbose mode, -vv argument):
rmdir a1
mkfile o257-7-0
rename o257-7-0 -> a2
ERROR: rename o257-7-0 -> a2 failed: Is a directory
What happens when computing the incremental send stream is:
1) An operation to remove the directory with inode number 257 and
generation 9 is issued.
2) An operation to create the inode with number 257 and generation 7 is
issued. This creates the inode with an orphanized name of "o257-7-0".
3) An operation rename the new inode 257 to its final name, "a2", is
issued. This is incorrect because inode 258, which has the same name
and it's a child of the same parent (root inode 256), was not yet
processed and therefore no rmdir operation for it was yet issued.
The rename operation is issued because we fail to detect that the
name of the new inode 257 collides with inode 258, because their
parent, a subvolume/snapshot root (inode 256) has a different
generation in both snapshots.
So fix this by ignoring the generation value of a parent directory that
matches a root inode (number 256) when we are checking if the name of the
inode currently being processed collides with the name of some other
inode that was not yet processed.
We can achieve this scenario of different inodes with the same number but
different generation values either by mounting a filesystem with the inode
cache option (-o inode_cache) or by creating and sending snapshots across
different filesystems, like in the following example:
$ mkfs.btrfs -f /dev/sdb
$ mount /dev/sdb /mnt
$ mkdir /mnt/a1
$ mkdir /mnt/a2
$ btrfs subvolume snapshot -r /mnt /mnt/snap1
$ btrfs send /mnt/snap1 -f /tmp/1.snap
$ umount /mnt
$ mkfs.btrfs -f /dev/sdc
$ mount /dev/sdc /mnt
$ touch /mnt/a2
$ btrfs subvolume snapshot -r /mnt /mnt/snap2
$ btrfs receive /mnt -f /tmp/1.snap
# Take note that once the filesystem is created, its current
# generation has value 7 so the inode from the second snapshot has
# a generation value of 7. And after receiving the first snapshot
# the filesystem is at a generation value of 10, because the call to
# create the second snapshot bumps the generation to 8 (the snapshot
# creation ioctl does a transaction commit), the receive command calls
# the snapshot creation ioctl to create the first snapshot, which bumps
# the filesystem's generation to 9, and finally when the receive
# operation finishes it calls an ioctl to transition the first snapshot
# (snap1) from RW mode to RO mode, which does another transaction commit
# and bumps the filesystem's generation to 10.
$ rm -f /tmp/1.snap
$ btrfs send /mnt/snap1 -f /tmp/1.snap
$ btrfs send -p /mnt/snap1 /mnt/snap2 -f /tmp/2.snap
$ umount /mnt
$ mkfs.btrfs -f /dev/sdd
$ mount /dev/sdd /mnt
$ btrfs receive /mnt /tmp/1.snap
# Receive of snapshot snap2 used to fail.
$ btrfs receive /mnt /tmp/2.snap
Signed-off-by: Robbie Ko <robbieko@synology.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
[Rewrote changelog to be more precise and clear]
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 2e81a4eeedcaa66e35f58b81e0755b87057ce392 ]
When we need to move xattrs into external xattr block, we call
ext4_xattr_block_set() from ext4_expand_extra_isize_ea(). That may end
up calling ext4_mark_inode_dirty() again which will recurse back into
the inode expansion code leading to deadlocks.
Protect from recursion using EXT4_STATE_NO_EXPAND inode flag and move
its management into ext4_expand_extra_isize_ea() since its manipulation
is safe there (due to xattr_sem) from possible races with
ext4_xattr_set_handle() which plays with it as well.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 899f0429c7d3eed886406cd72182bee3b96aa1f9 upstream.
In the code added to function submit_page_section by commit b1058b981,
sdio->bio can currently be NULL when calling dio_bio_submit. This then
leads to a NULL pointer access in dio_bio_submit, so check for a NULL
bio in submit_page_section before trying to submit it instead.
Fixes xfstest generic/250 on gfs2.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 511c54a2f69195b28afb9dd119f03787b1625bb4 upstream.
According to the MS-SMB2 spec (3.2.5.1.6) once the client receives
STATUS_NETWORK_SESSION_EXPIRED error code from a server it should
reconnect the current SMB session. Currently the client doesn't do
that. This can result in subsequent client requests failing by
the server. The patch adds an additional logic to the demultiplex
thread to identify expired sessions and reconnect them.
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 1bd8d6cd3e413d64e543ec3e69ff43e75a1cf1ea upstream.
In the ext4 implementations of SEEK_HOLE and SEEK_DATA, make sure we
return -ENXIO for negative offsets instead of banging around inside
the extent code and returning -EFSCORRUPTED.
Reported-by: Mateusz S <muttdini@gmail.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 2ba3e6e8afc9b6188b471f27cf2b5e3cf34e7af2 upstream.
It is OK for s_first_meta_bg to be equal to the number of block group
descriptor blocks. (It rarely happens, but it shouldn't cause any
problems.)
https://bugzilla.kernel.org/show_bug.cgi?id=194567
Fixes: 3a4b77cd47bb837b8557595ec7425f281f2ca1fe
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 3a4b77cd47bb837b8557595ec7425f281f2ca1fe upstream.
Ralf Spenneberg reported that he hit a kernel crash when mounting a
modified ext4 image. And it turns out that kernel crashed when
calculating fs overhead (ext4_calculate_overhead()), this is because
the image has very large s_first_meta_bg (debug code shows it's
842150400), and ext4 overruns the memory in count_overhead() when
setting bitmap buffer, which is PAGE_SIZE.
ext4_calculate_overhead():
buf = get_zeroed_page(GFP_NOFS); <=== PAGE_SIZE buffer
blks = count_overhead(sb, i, buf);
count_overhead():
for (j = ext4_bg_num_gdb(sb, grp); j > 0; j--) { <=== j = 842150400
ext4_set_bit(EXT4_B2C(sbi, s++), buf); <=== buffer overrun
count++;
}
This can be reproduced easily for me by this script:
#!/bin/bash
rm -f fs.img
mkdir -p /mnt/ext4
fallocate -l 16M fs.img
mke2fs -t ext4 -O bigalloc,meta_bg,^resize_inode -F fs.img
debugfs -w -R "ssv first_meta_bg 842150400" fs.img
mount -o loop fs.img /mnt/ext4
Fix it by validating s_first_meta_bg first at mount time, and
refusing to mount if its value exceeds the largest possible meta_bg
number.
Reported-by: Ralf Spenneberg <ralf@os-t.de>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit a3bb2d5587521eea6dab2d05326abb0afb460abd upstream.
When new directory 'DIR1' is created in a directory 'DIR0' with SGID bit
set, DIR1 is expected to have SGID bit set (and owning group equal to
the owning group of 'DIR0'). However when 'DIR0' also has some default
ACLs that 'DIR1' inherits, setting these ACLs will result in SGID bit on
'DIR1' to get cleared if user is not member of the owning group.
Fix the problem by moving posix_acl_update_mode() out of
__ext4_set_acl() into ext4_set_acl(). That way the function will not be
called when inheriting ACLs which is what we want as it prevents SGID
bit clearing and the mode has been properly set by posix_acl_create()
anyway.
Fixes: 073931017b49d9458aa351605b43a7e34598caef
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit a056bdaae7a181f7dcc876cfab2f94538e508709 upstream.
mpage_submit_page() can race with another process growing i_size and
writing data via mmap to the written-back page. As mpage_submit_page()
samples i_size too early, it may happen that ext4_bio_write_page()
zeroes out too large tail of the page and thus corrupts user data.
Fix the problem by sampling i_size only after the page has been
write-protected in page tables by clear_page_dirty_for_io() call.
Reported-by: Michael Zimmer <michael@swarm64.com>
CC: stable@vger.kernel.org
Fixes: cb20d5188366f04d96d2e07b1240cc92170ade40
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 89f39af129382a40d7cd1f6914617282cfeee28e upstream.
Change thaw_super() to check frozen != SB_FREEZE_COMPLETE rather than
frozen == SB_UNFROZEN, otherwise it can race with freeze_super() which
drops sb->s_umount after SB_FREEZE_WRITE to preserve the lock ordering.
In this case thaw_super() will wrongly call s_op->unfreeze_fs() before
it was actually frozen, and call sb_freeze_unlock() which leads to the
unbalanced percpu_up_write(). Unfortunately lockdep can't detect this,
so this triggers misc BUG_ON()'s in kernel/rcu/sync.c.
Reported-and-tested-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 3da40c7b089810ac9cf2bb1e59633f619f3a7312 upstream.
At LSF we decided that if we truncate up from isize we shouldn't trim
fallocated blocks that were fallocated with KEEP_SIZE and are past the
new i_size. This patch fixes ext4 to do this.
[ Completely reworked patch so that i_disksize would actually get set
when truncating up. Also reworked the code for handling truncate so
that it's easier to handle. -- tytso ]
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 57e7ba04d422c3d41c8426380303ec9b7533ded9 upstream.
security_inode_getsecurity() provides the text string value
of a security attribute. It does not provide a "secctx".
The code in xattr_getsecurity() that calls security_inode_getsecurity()
and then calls security_release_secctx() happened to work because
SElinux and Smack treat the attribute and the secctx the same way.
It fails for cap_inode_getsecurity(), because that module has no
secctx that ever needs releasing. It turns out that Smack is the
one that's doing things wrong by not allocating memory when instructed
to do so by the "alloc" parameter.
The fix is simple enough. Change the security_release_secctx() to
kfree() because it isn't a secctx being returned by
security_inode_getsecurity(). Change Smack to allocate the string when
told to do so.
Note: this also fixes memory leaks for LSMs which implement
inode_getsecurity but not release_secctx, such as capabilities.
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Reported-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[ Upstream commit 08b005f1333154ae5b404ca28766e0ffb9f1c150 ]
The sole remaining caller of kmem_zalloc_greedy is bulkstat, which uses
it to grab 1-4 pages for staging of inobt records. The infinite loop in
the greedy allocation function is causing hangs[1] in generic/269, so
just get rid of the greedy allocator in favor of kmem_zalloc_large.
This makes bulkstat somewhat more likely to ENOMEM if there's really no
pages to spare, but eliminates a source of hangs.
[1] http://lkml.kernel.org/r/20170301044634.rgidgdqqiiwsmfpj%40XZHOUW.usersys.redhat.com
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 6d6d282932d1a609e60dc4467677e0e863682f57 upstream.
`btrfs sub set-default` succeeds to set an ID which isn't corresponding to any
fs/file tree. If such the bad ID is set to a filesystem, we can't mount this
filesystem without specifying `subvol` or `subvolid` mount options.
Fixes: 6ef5ed0d386b ("Btrfs: add ioctl and incompat flag to set the default mount subvol")
Signed-off-by: Satoru Takeuchi <satoru.takeuchi@gmail.com>
Reviewed-by: Qu Wenruo <quwenruo.btrfs@gmx.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit fc46820b27a2d9a46f7e90c9ceb4a64a1bc5fab8 upstream.
In generic_file_llseek_size, return -ENXIO for negative offsets as well
as offsets beyond EOF. This affects filesystems which don't implement
SEEK_HOLE / SEEK_DATA internally, possibly because they don't support
holes.
Fixes xfstest generic/448.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
| |
commit 1013e760d10e614dc10b5624ce9fc41563ba2e65 upstream.
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 0603c96f3af50e2f9299fa410c224ab1d465e0f9 upstream.
As long as signing is supported (ie not a guest user connection) and
connection is SMB3 or SMB3.02, then validate negotiate (protect
against man in the middle downgrade attacks). We had been doing this
only when signing was required, not when signing was just enabled,
but this more closely matches recommended SMB3 behavior and is
better security. Suggested by Metze.
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Jeremy Allison <jra@samba.org>
Acked-by: Stefan Metzmacher <metze@samba.org>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit f5c4ba816315d3b813af16f5571f86c8d4e897bd upstream.
There is a race that cause cifs reconnect in cifs_mount,
- cifs_mount
- cifs_get_tcp_session
- [ start thread cifs_demultiplex_thread
- cifs_read_from_socket: -ECONNABORTED
- DELAY_WORK smb2_reconnect_server ]
- cifs_setup_session
- [ smb2_reconnect_server ]
auth_key.response was allocated in cifs_setup_session, and
will release when the session destoried. So when session re-
connect, auth_key.response should be check and released.
Tested with my system:
CIFS VFS: Free previous auth_key.response = ffff8800320bbf80
A simple auth_key.response allocation call trace:
- cifs_setup_session
- SMB2_sess_setup
- SMB2_sess_auth_rawntlmssp_authenticate
- build_ntlmssp_auth_blob
- setup_ntlmv2_rsp
Signed-off-by: Shu Wang <shuwang@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 94183331e815617246b1baa97e0916f358c794bb upstream.
memory leak was found by kmemleak. exit_cifs_spnego
should be called before cifs module removed, or
cifs root_cred will not be released.
kmemleak report:
unreferenced object 0xffff880070a3ce40 (size 192):
backtrace:
kmemleak_alloc+0x4a/0xa0
kmem_cache_alloc+0xc7/0x1d0
prepare_kernel_cred+0x20/0x120
init_cifs_spnego+0x2d/0x170 [cifs]
0xffffffffc07801f3
do_one_initcall+0x51/0x1b0
do_init_module+0x60/0x1fd
load_module+0x161e/0x1b60
SYSC_finit_module+0xa9/0x100
SyS_finit_module+0xe/0x10
Signed-off-by: Shu Wang <shuwang@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit b0a5a9589decd07db755d6a8d9c0910d96ff7992 upstream.
Current ext4 quota should always "usage enabled" if the
quota feautre is enabled. But in ext4_orphan_cleanup(), it
turn quotas off directly (used for the older journaled
quota), so we cannot turn it on again via "quotaon" unless
umount and remount ext4.
Simple reproduce:
mkfs.ext4 -O project,quota /dev/vdb1
mount -o prjquota /dev/vdb1 /mnt
chattr -p 123 /mnt
chattr +P /mnt
touch /mnt/aa /mnt/bb
exec 100<>/mnt/aa
rm -f /mnt/aa
sync
echo c > /proc/sysrq-trigger
#reboot and mount
mount -o prjquota /dev/vdb1 /mnt
#query status
quotaon -Ppv /dev/vdb1
#output
quotaon: Cannot find mountpoint for device /dev/vdb1
quotaon: No correct mountpoint specified.
This patch add check for journaled quotas to avoid incorrect
quotaoff when ext4 has quota feautre.
Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 125c9fb1ccb53eb2ea9380df40f3c743f3fb2fed upstream.
We need to check HOT_DATA to truncate any previous data block when doing
roll-forward recovery.
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit b31ff3cdf540110da4572e3e29bd172087af65cc upstream.
If using a kernel with CONFIG_XFS_RT=y and we set the RHINHERIT flag on
a directory in a filesystem that does not have a realtime device and
create a new file in that directory, it gets marked as a real time file.
When data is written and a fsync is issued, the filesystem attempts to
flush a non-existent rt device during the fsync process.
This results in a crash dereferencing a null buftarg pointer in
xfs_blkdev_issue_flush():
BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
IP: xfs_blkdev_issue_flush+0xd/0x20
.....
Call Trace:
xfs_file_fsync+0x188/0x1c0
vfs_fsync_range+0x3b/0xa0
do_fsync+0x3d/0x70
SyS_fsync+0x10/0x20
do_syscall_64+0x4d/0xb0
entry_SYSCALL64_slow_path+0x25/0x25
Setting RT inode flags does not require special privileges so any
unprivileged user can cause this oops to occur. To reproduce, confirm
kernel is compiled with CONFIG_XFS_RT=y and run:
# mkfs.xfs -f /dev/pmem0
# mount /dev/pmem0 /mnt/test
# mkdir /mnt/test/foo
# xfs_io -c 'chattr +t' /mnt/test/foo
# xfs_io -f -c 'pwrite 0 5m' -c fsync /mnt/test/foo/bar
Or just run xfstests with MKFS_OPTIONS="-d rtinherit=1" and wait.
Kernels built with CONFIG_XFS_RT=n are not exposed to this bug.
Fixes: f538d4da8d52 ("[XFS] write barrier support")
Signed-off-by: Richard Wareing <rwareing@fb.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 6c6b5a39c4bf3dbd8cf629c9f5450e983c19dbb9 upstream.
Several distributions mount the "proper root" as ro during initrd and
then remount it as rw before pivot_root(2). Thus, if a rescan had been
aborted by a previous shutdown, the rescan would never be resumed.
This issue would manifest itself as several btrfs ioctl(2)s causing the
entire machine to hang when btrfs_qgroup_wait_for_completion was hit
(due to the fs_info->qgroup_rescan_running flag being set but the rescan
itself not being resumed). Notably, Docker's btrfs storage driver makes
regular use of BTRFS_QUOTA_CTL_DISABLE and BTRFS_IOC_QUOTA_RESCAN_WAIT
(causing this problem to be manifested on boot for some machines).
Cc: Jeff Mahoney <jeffm@suse.com>
Fixes: b382a324b60f ("Btrfs: fix qgroup rescan resume on mount")
Signed-off-by: Aleksa Sarai <asarai@suse.de>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Tested-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 55acdd926f6b21a5cdba23da98a48aedf19ac9c3 upstream.
Can be reproduced when running dlm_controld (tested on 4.4.x, 4.12.4):
# seq 1 100 | xargs -P0 -n1 dlm_tool join
# seq 1 100 | xargs -P0 -n1 dlm_tool leave
misc_register fails due to duplicate sysfs entry, which causes
dlm_device_register to free ls->ls_device.name.
In dlm_device_deregister the name was freed again, causing memory
corruption.
According to the comment in dlm_device_deregister the name should've been
set to NULL when registration fails,
so this patch does that.
sysfs: cannot create duplicate filename '/dev/char/10:1'
------------[ cut here ]------------
warning: cpu: 1 pid: 4450 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x56/0x70
modules linked in: msr rfcomm dlm ccm bnep dm_crypt uvcvideo
videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core videodev
btusb media btrtl btbcm btintel bluetooth ecdh_generic intel_rapl
x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm
snd_hda_codec_hdmi irqbypass crct10dif_pclmul crc32_pclmul
ghash_clmulni_intel thinkpad_acpi pcbc nvram snd_seq_midi
snd_seq_midi_event aesni_intel snd_hda_codec_realtek snd_hda_codec_generic
snd_rawmidi aes_x86_64 crypto_simd glue_helper snd_hda_intel snd_hda_codec
cryptd intel_cstate arc4 snd_hda_core snd_seq snd_seq_device snd_hwdep
iwldvm intel_rapl_perf mac80211 joydev input_leds iwlwifi serio_raw
cfg80211 snd_pcm shpchp snd_timer snd mac_hid mei_me lpc_ich mei soundcore
sunrpc parport_pc ppdev lp parport autofs4 i915 psmouse
e1000e ahci libahci i2c_algo_bit sdhci_pci ptp drm_kms_helper sdhci
pps_core syscopyarea sysfillrect sysimgblt fb_sys_fops drm wmi video
cpu: 1 pid: 4450 comm: dlm_test.exe not tainted 4.12.4-041204-generic
hardware name: lenovo 232425u/232425u, bios g2et82ww (2.02 ) 09/11/2012
task: ffff96b0cbabe140 task.stack: ffffb199027d0000
rip: 0010:sysfs_warn_dup+0x56/0x70
rsp: 0018:ffffb199027d3c58 eflags: 00010282
rax: 0000000000000038 rbx: ffff96b0e2c49158 rcx: 0000000000000006
rdx: 0000000000000000 rsi: 0000000000000086 rdi: ffff96b15e24dcc0
rbp: ffffb199027d3c70 r08: 0000000000000001 r09: 0000000000000721
r10: ffffb199027d3c00 r11: 0000000000000721 r12: ffffb199027d3cd1
r13: ffff96b1592088f0 r14: 0000000000000001 r15: ffffffffffffffef
fs: 00007f78069c0700(0000) gs:ffff96b15e240000(0000)
knlgs:0000000000000000
cs: 0010 ds: 0000 es: 0000 cr0: 0000000080050033
cr2: 000000178625ed28 cr3: 0000000091d3e000 cr4: 00000000001406e0
call trace:
sysfs_do_create_link_sd.isra.2+0x9e/0xb0
sysfs_create_link+0x25/0x40
device_add+0x5a9/0x640
device_create_groups_vargs+0xe0/0xf0
device_create_with_groups+0x3f/0x60
? snprintf+0x45/0x70
misc_register+0x140/0x180
device_write+0x6a8/0x790 [dlm]
__vfs_write+0x37/0x160
? apparmor_file_permission+0x1a/0x20
? security_file_permission+0x3b/0xc0
vfs_write+0xb5/0x1a0
sys_write+0x55/0xc0
? sys_fcntl+0x5d/0xb0
entry_syscall_64_fastpath+0x1e/0xa9
rip: 0033:0x7f78083454bd
rsp: 002b:00007f78069bbd30 eflags: 00000293 orig_rax: 0000000000000001
rax: ffffffffffffffda rbx: 0000000000000006 rcx: 00007f78083454bd
rdx: 000000000000009c rsi: 00007f78069bee00 rdi: 0000000000000005
rbp: 00007f77f8000a20 r08: 000000000000fcf0 r09: 0000000000000032
r10: 0000000000000024 r11: 0000000000000293 r12: 00007f78069bde00
r13: 00007f78069bee00 r14: 000000000000000a r15: 00007f78069bbd70
code: 85 c0 48 89 c3 74 12 b9 00 10 00 00 48 89 c2 31 f6 4c 89 ef e8 2c c8
ff ff 4c 89 e2 48 89 de 48 c7 c7 b0 8e 0c a8 e8 41 e8 ed ff <0f> ff 48 89
df e8 00 d5 f4 ff 5b 41 5c 41 5d 5d c3 66 0f 1f 84
---[ end trace 40412246357cc9e0 ]---
dlm: 59f24629-ae39-44e2-9030-397ebc2eda26: leaving the lockspace group...
bug: unable to handle kernel null pointer dereference at 0000000000000001
ip: [<ffffffff811a3b4a>] kmem_cache_alloc+0x7a/0x140
pgd 0
oops: 0000 [#1] smp
modules linked in: dlm 8021q garp mrp stp llc openvswitch nf_defrag_ipv6
nf_conntrack libcrc32c iptable_filter dm_multipath crc32_pclmul dm_mod
aesni_intel psmouse aes_x86_64 sg ablk_helper cryptd lrw gf128mul
glue_helper i2c_piix4 nls_utf8 tpm_tis tpm isofs nfsd auth_rpcgss
oid_registry nfs_acl lockd grace sunrpc xen_wdt ip_tables x_tables autofs4
hid_generic usbhid hid sr_mod cdrom sd_mod ata_generic pata_acpi 8139too
serio_raw ata_piix 8139cp mii uhci_hcd ehci_pci ehci_hcd libata
scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc scsi_dh_alua scsi_mod ipv6
cpu: 0 pid: 394 comm: systemd-udevd tainted: g w 4.4.0+0 #1
hardware name: xen hvm domu, bios 4.7.2-2.2 05/11/2017
task: ffff880002410000 ti: ffff88000243c000 task.ti: ffff88000243c000
rip: e030:[<ffffffff811a3b4a>] [<ffffffff811a3b4a>]
kmem_cache_alloc+0x7a/0x140
rsp: e02b:ffff88000243fd90 eflags: 00010202
rax: 0000000000000000 rbx: ffff8800029864d0 rcx: 000000000007b36c
rdx: 000000000007b36b rsi: 00000000024000c0 rdi: ffff880036801c00
rbp: ffff88000243fdc0 r08: 0000000000018880 r09: 0000000000000054
r10: 000000000000004a r11: ffff880034ace6c0 r12: 00000000024000c0
r13: ffff880036801c00 r14: 0000000000000001 r15: ffffffff8118dcc2
fs: 00007f0ab77548c0(0000) gs:ffff880036e00000(0000) knlgs:0000000000000000
cs: e033 ds: 0000 es: 0000 cr0: 0000000080050033
cr2: 0000000000000001 cr3: 000000000332d000 cr4: 0000000000040660
stack:
ffffffff8118dc90 ffff8800029864d0 0000000000000000 ffff88003430b0b0
ffff880034b78320 ffff88003430b0b0 ffff88000243fdf8 ffffffff8118dcc2
ffff8800349c6700 ffff8800029864d0 000000000000000b 00007f0ab7754b90
call trace:
[<ffffffff8118dc90>] ? anon_vma_fork+0x60/0x140
[<ffffffff8118dcc2>] anon_vma_fork+0x92/0x140
[<ffffffff8107033e>] copy_process+0xcae/0x1a80
[<ffffffff8107128b>] _do_fork+0x8b/0x2d0
[<ffffffff81071579>] sys_clone+0x19/0x20
[<ffffffff815a30ae>] entry_syscall_64_fastpath+0x12/0x71
] code: f6 75 1c 4c 89 fa 44 89 e6 4c 89 ef e8 a7 e4 00 00 41 f7 c4 00 80
00 00 49 89 c6 74 47 eb 32 49 63 45 20 48 8d 4a 01 4d 8b 45 00 <49> 8b 1c
06 4c 89 f0 65 49 0f c7 08 0f 94 c0 84 c0 74 ac 49 63
rip [<ffffffff811a3b4a>] kmem_cache_alloc+0x7a/0x140
rsp <ffff88000243fd90>
cr2: 0000000000000001
--[ end trace 70cb9fd1b164a0e8 ]--
Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 138e4ad67afd5c6c318b056b4d17c17f2c0ca5c0 upstream.
The race was introduced by me in commit 971316f0503a ("epoll:
ep_unregister_pollwait() can use the freed pwq->whead"). I did not
realize that nothing can protect eventpoll after ep_poll_callback() sets
->whead = NULL, only whead->lock can save us from the race with
ep_free() or ep_remove().
Move ->whead = NULL to the end of ep_poll_callback() and add the
necessary barriers.
TODO: cleanup the ewake/EPOLLEXCLUSIVE logic, it was confusing even
before this patch.
Hopefully this explains use-after-free reported by syzcaller:
BUG: KASAN: use-after-free in debug_spin_lock_before
...
_raw_spin_lock_irqsave+0x4a/0x60 kernel/locking/spinlock.c:159
ep_poll_callback+0x29f/0xff0 fs/eventpoll.c:1148
this is spin_lock(eventpoll->lock),
...
Freed by task 17774:
...
kfree+0xe8/0x2c0 mm/slub.c:3883
ep_free+0x22c/0x2a0 fs/eventpoll.c:865
Fixes: 971316f0503a ("epoll: ep_unregister_pollwait() can use the freed pwq->whead")
Reported-by: 范龙飞 <long7573@126.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 6e3c1529c39e92ed64ca41d53abadabbaa1d5393 upstream.
Recent patch had an endian warning ie
cifs: return ENAMETOOLONG for overlong names in cifs_open()/cifs_lookup()
Signed-off-by: Steve French <smfrench@gmail.com>
CC: Ronnie Sahlberg <lsahlber@redhat.com>
Acked-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 9e37b1784f2be9397a903307574ee565bbadfd75 upstream.
Currently the maximum size of SMB2/3 header is set incorrectly which
leads to hanging of directory listing operations on encrypted SMB3
connections. Fix this by setting the maximum size to 170 bytes that
is calculated as RFC1002 length field size (4) + transform header
size (52) + SMB2 header size (64) + create response size (56).
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Acked-by: Sachin Prabhu <sprabhu@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit fc788f64f1f3eb31e87d4f53bcf1ab76590d5838 upstream.
When processing an NFSv4 WRITE operation, argp->end should never
point past the end of the data in the final page of the page list.
Otherwise, nfsd4_decode_compound can walk into uninitialized memory.
More critical, nfsd4_decode_write is failing to increment argp->pagelen
when it increments argp->pagelist. This can cause later xdr decoders
to assume more data is available than really is, which can cause server
crashes on malformed requests.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit d3edede29f74d335f81d95a4588f5f136a9f7dcf upstream.
Add checking for the path component length and verify it is <= the maximum
that the server advertizes via FileFsAttributeInformation.
With this patch cifs.ko will now return ENAMETOOLONG instead of ENOENT
when users to access an overlong path.
To test this, try to cd into a (non-existing) directory on a CIFS share
that has a too long name:
cd /mnt/aaaaaaaaaaaaaaa...
and it now should show a good error message from the shell:
bash: cd: /mnt/aaaaaaaaaaaaaaaa...aaaaaa: File name too long
rh bz 1153996
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 42bec214d8bd432be6d32a1acb0a9079ecd4d142 upstream.
The df for a SMB2 share triggers a GetInfo call for
FS_FULL_SIZE_INFORMATION. The values returned are used to populate
struct statfs.
The problem is that none of the information returned by the call
contains the total blocks available on the filesystem. Instead we use
the blocks available to the user ie. quota limitation when filling out
statfs.f_blocks. The information returned does contain Actual free units
on the filesystem and is used to populate statfs.f_bfree. For users with
quota enabled, it can lead to situations where the total free space
reported is more than the total blocks on the system ending up with df
reports like the following
# df -h /mnt/a
Filesystem Size Used Avail Use% Mounted on
//192.168.22.10/a 2.5G -2.3G 2.5G - /mnt/a
To fix this problem, we instead populate both statfs.f_bfree with the
same value as statfs.f_bavail ie. CallerAvailableAllocationUnits. This
is similar to what is done already in the code for cifs and df now
reports the quota information for the user used to mount the share.
# df --si /mnt/a
Filesystem Size Used Avail Use% Mounted on
//192.168.22.10/a 2.7G 101M 2.6G 4% /mnt/a
Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Signed-off-by: Pierguido Lambri <plambri@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 8a9d6e964d318533ba3d2901ce153ba317c99a89 upstream.
The blocklayout code does not compile cleanly for a 32-bit sector_t,
and also has no reliable checks for devices sizes, which makes it
unsafe to use with a kernel that doesn't support large block devices.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 5c83746a0cf2 ("pnfs/blocklayout: in-kernel GETDEVICEINFO XDR parsing")
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 68227c03cba84a24faf8a7277d2b1a03c8959c2c upstream.
Before the patch, the flock flag could remain uninitialized for the
lifespan of the fuse_file allocation. Unless set to true in
fuse_file_flock(), it would remain in an indeterminate state until read in
an if statement in fuse_release_common(). This could consequently lead to
taking an unexpected branch in the code.
The bug was discovered by a runtime instrumentation designed to detect use
of uninitialized memory in the kernel.
Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com>
Fixes: 37fb3a30b462 ("fuse: fix flock")
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 15d3042a937c13f5d9244241c7a9c8416ff6e82a upstream.
Make sure segno and blkoff read from raw image are valid.
Cc: stable@vger.kernel.org
Signed-off-by: Jin Qian <jinqian@google.com>
[Jaegeuk Kim: adjust minor coding style]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
[AmitP: Found in Android Security bulletin for Aug'17, fixes CVE-2017-10663]
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit aec51758ce10a9c847a62a48a168f8c804c6e053 upstream.
On a 32-bit platform, the value of n_blcoks_count may be wrong during
the file system is resized to size larger than 2^32 blocks. This may
caused the superblock being corrupted with zero blocks count.
Fixes: 1c6bd7173d66
Signed-off-by: Jerry Lee <jerrylee@qnap.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit fcf5ea10992fbac3c7473a1db33d56a139333cd1 upstream.
ext4_find_unwritten_pgoff() does not properly handle a situation when
starting index is in the middle of a page and blocksize < pagesize. The
following command shows the bug on filesystem with 1k blocksize:
xfs_io -f -c "falloc 0 4k" \
-c "pwrite 1k 1k" \
-c "pwrite 3k 1k" \
-c "seek -a -r 0" foo
In this example, neither lseek(fd, 1024, SEEK_HOLE) nor lseek(fd, 2048,
SEEK_DATA) will return the correct result.
Fix the problem by neglecting buffers in a page before starting offset.
Reported-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit e9a330c4289f2ba1ca4bf98c2b430ab165a8931b upstream.
The per-prz spinlock should be using the dynamic initializer so that
lockdep can correctly track it. Without this, under lockdep, we get a
warning at boot that the lock is in non-static memory.
Fixes: 109704492ef6 ("pstore: Make spinlock per zone instead of global")
Fixes: 76d5692a5803 ("pstore: Correctly initialize spinlock and flags")
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 76d5692a58031696e282384cbd893832bc92bd76 upstream.
The ram backend wasn't always initializing its spinlock correctly. Since
it was coming from kzalloc memory, though, it was harmless on
architectures that initialize unlocked spinlocks to 0 (at least x86 and
ARM). This also fixes a possibly ignored flag setting too.
When running under CONFIG_DEBUG_SPINLOCK, the following Oops was visible:
[ 0.760836] persistent_ram: found existing buffer, size 29988, start 29988
[ 0.765112] persistent_ram: found existing buffer, size 30105, start 30105
[ 0.769435] persistent_ram: found existing buffer, size 118542, start 118542
[ 0.785960] persistent_ram: found existing buffer, size 0, start 0
[ 0.786098] persistent_ram: found existing buffer, size 0, start 0
[ 0.786131] pstore: using zlib compression
[ 0.790716] BUG: spinlock bad magic on CPU#0, swapper/0/1
[ 0.790729] lock: 0xffffffc0d1ca9bb0, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
[ 0.790742] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.10.0-rc2+ #913
[ 0.790747] Hardware name: Google Kevin (DT)
[ 0.790750] Call trace:
[ 0.790768] [<ffffff900808ae88>] dump_backtrace+0x0/0x2bc
[ 0.790780] [<ffffff900808b164>] show_stack+0x20/0x28
[ 0.790794] [<ffffff9008460ee0>] dump_stack+0xa4/0xcc
[ 0.790809] [<ffffff9008113cfc>] spin_dump+0xe0/0xf0
[ 0.790821] [<ffffff9008113d3c>] spin_bug+0x30/0x3c
[ 0.790834] [<ffffff9008113e28>] do_raw_spin_lock+0x50/0x1b8
[ 0.790846] [<ffffff9008a2d2ec>] _raw_spin_lock_irqsave+0x54/0x6c
[ 0.790862] [<ffffff90083ac3b4>] buffer_size_add+0x48/0xcc
[ 0.790875] [<ffffff90083acb34>] persistent_ram_write+0x60/0x11c
[ 0.790888] [<ffffff90083aab1c>] ramoops_pstore_write_buf+0xd4/0x2a4
[ 0.790900] [<ffffff90083a9d3c>] pstore_console_write+0xf0/0x134
[ 0.790912] [<ffffff900811c304>] console_unlock+0x48c/0x5e8
[ 0.790923] [<ffffff900811da18>] register_console+0x3b0/0x4d4
[ 0.790935] [<ffffff90083aa7d0>] pstore_register+0x1a8/0x234
[ 0.790947] [<ffffff90083ac250>] ramoops_probe+0x6b8/0x7d4
[ 0.790961] [<ffffff90085ca548>] platform_drv_probe+0x7c/0xd0
[ 0.790972] [<ffffff90085c76ac>] driver_probe_device+0x1b4/0x3bc
[ 0.790982] [<ffffff90085c7ac8>] __device_attach_driver+0xc8/0xf4
[ 0.790996] [<ffffff90085c4bfc>] bus_for_each_drv+0xb4/0xe4
[ 0.791006] [<ffffff90085c7414>] __device_attach+0xd0/0x158
[ 0.791016] [<ffffff90085c7b18>] device_initial_probe+0x24/0x30
[ 0.791026] [<ffffff90085c648c>] bus_probe_device+0x50/0xe4
[ 0.791038] [<ffffff90085c35b8>] device_add+0x3a4/0x76c
[ 0.791051] [<ffffff90087d0e84>] of_device_add+0x74/0x84
[ 0.791062] [<ffffff90087d19b8>] of_platform_device_create_pdata+0xc0/0x100
[ 0.791073] [<ffffff90087d1a2c>] of_platform_device_create+0x34/0x40
[ 0.791086] [<ffffff900903c910>] of_platform_default_populate_init+0x58/0x78
[ 0.791097] [<ffffff90080831fc>] do_one_initcall+0x88/0x160
[ 0.791109] [<ffffff90090010ac>] kernel_init_freeable+0x264/0x31c
[ 0.791123] [<ffffff9008a25bd0>] kernel_init+0x18/0x11c
[ 0.791133] [<ffffff9008082ec0>] ret_from_fork+0x10/0x50
[ 0.793717] console [pstore-1] enabled
[ 0.797845] pstore: Registered ramoops as persistent store backend
[ 0.804647] ramoops: attached 0x100000@0xf7edc000, ecc: 0/0
Fixes: 663deb47880f ("pstore: Allow prz to control need for locking")
Fixes: 109704492ef6 ("pstore: Make spinlock per zone instead of global")
Reported-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 663deb47880f2283809669563c5a52ac7c6aef1a upstream.
In preparation of not locking at all for certain buffers depending on if
there's contention, make locking optional depending on the initialization
of the prz.
Signed-off-by: Joel Fernandes <joelaf@google.com>
[kees: moved locking flag into prz instead of via caller arguments]
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 49d31c2f389acfe83417083e1208422b4091cd9e upstream.
take_dentry_name_snapshot() takes a safe snapshot of dentry name;
if the name is a short one, it gets copied into caller-supplied
structure, otherwise an extra reference to external name is grabbed
(those are never modified). In either case the pointer to stable
string is stored into the same structure.
dentry must be held by the caller of take_dentry_name_snapshot(),
but may be freely dropped afterwards - the snapshot will stay
until destroyed by release_dentry_name_snapshot().
Intended use:
struct name_snapshot s;
take_dentry_name_snapshot(&s, dentry);
...
access s.name
...
release_dentry_name_snapshot(&s);
Replaces fsnotify_oldname_...(), gets used in fsnotify to obtain the name
to pass down with event.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 109704492ef637956265ec2eb72ae7b3b39eb6f4 upstream.
Currently pstore has a global spinlock for all zones. Since the zones
are independent and modify different areas of memory, there's no need
to have a global lock, so we should use a per-zone lock as introduced
here. Also, when ramoops's ftrace use-case has a FTRACE_PER_CPU flag
introduced later, which splits the ftrace memory area into a single zone
per CPU, it will eliminate the need for locking. In preparation for this,
make the locking optional.
Signed-off-by: Joel Fernandes <joelaf@google.com>
[kees: updated commit message]
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit f2e95355891153f66d4156bf3a142c6489cd78c6 upstream.
udf_setsize() called truncate_setsize() with i_data_sem held. Thus
truncate_pagecache() called from truncate_setsize() could lock a page
under i_data_sem which can deadlock as page lock ranks below
i_data_sem - e. g. writeback can hold page lock and try to acquire
i_data_sem to map a block.
Fix the problem by moving truncate_setsize() calls from under
i_data_sem. It is safe for us to change i_size without holding
i_data_sem as all the places that depend on i_size being stable already
hold inode_lock.
Fixes: 7e49b6f2480cb9a9e7322a91592e56a5c85361f5
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit cc89684c9a265828ce061037f1f79f4a68ccd3f7 upstream.
Since commit bafc9b754f75 ("vfs: More precise tests in d_invalidate")
in v3.18, a return of '0' from ->d_revalidate() will cause the dentry
to be invalidated even if it has filesystems mounted on or it or on a
descendant. The mounted filesystem is unmounted.
This means we need to be careful not to return 0 unless the directory
referred to truly is invalid. So -ESTALE or -ENOENT should invalidate
the directory. Other errors such a -EPERM or -ERESTARTSYS should be
returned from ->d_revalidate() so they are propagated to the caller.
A particular problem can be demonstrated by:
1/ mount an NFS filesystem using NFSv3 on /mnt
2/ mount any other filesystem on /mnt/foo
3/ ls /mnt/foo
4/ turn off network, or otherwise make the server unable to respond
5/ ls /mnt/foo &
6/ cat /proc/$!/stack # note that nfs_lookup_revalidate is in the call stack
7/ kill -9 $! # this results in -ERESTARTSYS being returned
8/ observe that /mnt/foo has been unmounted.
This patch changes nfs_lookup_revalidate() to only treat
-ESTALE from nfs_lookup_verify_inode() and
-ESTALE or -ENOENT from ->lookup()
as indicating an invalid inode. Other errors are returned.
Also nfs_check_inode_attributes() is changed to return -ESTALE rather
than -EIO. This is consistent with the error returned in similar
circumstances from nfs_update_inode().
As this bug allows any user to unmount a filesystem mounted on an NFS
filesystem, this fix is suitable for stable kernels.
Fixes: bafc9b754f75 ("vfs: More precise tests in d_invalidate")
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|