summaryrefslogtreecommitdiffstats
path: root/fs
Commit message (Collapse)AuthorAgeFilesLines
...
| * | | | xfs: clean up xfs_trans_brelse()Brian Foster2018-09-291-71/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | xfs_trans_brelse() is a bit of a historical mess, similar to xfs_buf_item_unlock(). It is unnecessarily verbose, has snippets of commented out code, inconsistency with regard to stale items, etc. Clean up xfs_trans_brelse() to use similar logic and flow as xfs_buf_item_unlock() with regard to bli reference count handling. This patch makes no functional changes, but facilitates further refactoring of the common bli reference count handling code. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| * | | | xfs: don't unlock invalidated buf on aborted tx commitBrian Foster2018-09-292-50/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | xfstests generic/388,475 occasionally reproduce assertion failures in xfs_buf_item_unpin() when the final bli reference is dropped on an invalidated buffer and the buffer is not locked as it is expected to be. Invalidated buffers should remain locked on transaction commit until the final unpin, at which point the buffer is removed from the AIL and the bli is freed since stale buffers are not written back. The assert failures are associated with filesystem shutdown, typically due to log I/O errors injected by the test. The problematic situation can occur if the shutdown happens to cause a race between an active transaction that has invalidated a particular buffer and an I/O error on a log buffer that contains the bli associated with the same (now stale) buffer. Both transaction and log contexts acquire a bli reference. If the transaction has already invalidated the buffer by the time the I/O error occurs and ends up aborting due to shutdown, the transaction and log hold the last two references to a stale bli. If the transaction cancel occurs first, it treats the buffer as non-stale due to the aborted state: the bli reference is dropped and the buffer is released/unlocked. The log buffer I/O error handling eventually calls into xfs_buf_item_unpin(), drops the final reference to the bli and treats it as stale. The buffer wasn't left locked by xfs_buf_item_unlock(), however, so the assert fails and the buffer is double unlocked. The latter problem is mitigated by the fact that the fs is shutdown and no further damage is possible. ->iop_unlock() of an invalidated buffer should behave consistently with respect to the bli refcount, regardless of aborted state. If the refcount remains elevated on commit, we know the bli is awaiting an unpin (since it can't be in another transaction) and will be handled appropriately on log buffer completion. If the final bli reference of an invalidated buffer is dropped in ->iop_unlock(), we can assume the transaction has aborted because invalidation implies a dirty transaction. In the non-abort case, the log would have acquired a bli reference in ->iop_pin() and prevented bli release at ->iop_unlock() time. In the abort case the item must be freed and buffer unlocked because it wasn't pinned by the log. Rework xfs_buf_item_unlock() to simplify the currently circuitous and duplicate logic and leave invalidated buffers locked based on bli refcount, regardless of aborted state. This ensures that a pinned, stale buffer is always found locked when eventually unpinned. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| * | | | xfs: remove last of unnecessary xfs_defer_cancel() callersBrian Foster2018-09-294-44/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that deferred operations are completely managed via transactions, it's no longer necessary to cancel the dfops in error paths that already cancel the associated transaction. There are a few such calls lingering throughout the codebase. Remove all remaining unnecessary calls to xfs_defer_cancel(). This leaves xfs_defer_cancel() calls in two places. The first is the call in the transaction cancel path itself, which facilitates this patch. The second is made via the xfs_defer_finish() error path to provide consistent error semantics with transaction commit. For example, xfs_trans_commit() expects an xfs_defer_finish() failure to clean up the dfops structure before it returns. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
| * | | | xfs: don't crash the vfs on a garbage inline symlinkDarrick J. Wong2018-09-291-1/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The VFS routine that calls ->get_link blindly copies whatever's returned into the user's buffer. If we return a NULL pointer, the vfs will crash on the null pointer. Therefore, return -EFSCORRUPTED instead of blowing up the kernel. [dgc: clean up with hch's suggestions] Reported-by: wen.xu@gatech.edu Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* | | | | Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsGreg Kroah-Hartman2018-10-031-11/+13
|\ \ \ \ \ | |_|_|/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Al writes: "xattrs regression fix from Andreas; sat in -next for quite a while." * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: sysfs: Do not return POSIX ACL xattrs via listxattr
| * | | | sysfs: Do not return POSIX ACL xattrs via listxattrAndreas Gruenbacher2018-09-181-11/+13
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 786534b92f3c introduced a regression that caused listxattr to return the POSIX ACL attribute names even though sysfs doesn't support POSIX ACLs. This happens because simple_xattr_list checks for NULL i_acl / i_default_acl, but inode_init_always initializes those fields to ACL_NOT_CACHED ((void *)-1). For example: $ getfattr -m- -d /sys /sys: system.posix_acl_access: Operation not supported /sys: system.posix_acl_default: Operation not supported Fix this in simple_xattr_list by checking if the filesystem supports POSIX ACLs. Fixes: 786534b92f3c ("tmpfs: listxattr should include POSIX ACL xattrs") Reported-by: Marc Aurèle La France <tsi@tuyoix.net> Tested-by: Marc Aurèle La France <tsi@tuyoix.net> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Cc: stable@vger.kernel.org # v4.5+ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | | | Merge tag 'pstore-v4.19-rc7' of ↵Greg Kroah-Hartman2018-10-011-4/+25
|\ \ \ \ | |_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Kees writes: "Pstore fixes for v4.19-rc7 - Fix failure-path memory leak in ramoops_init (nixiaoming)" * tag 'pstore-v4.19-rc7' of https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: pstore/ram: Fix failure-path memory leak in ramoops_init
| * | | pstore/ram: Fix failure-path memory leak in ramoops_initKees Cook2018-09-301-4/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As reported by nixiaoming, with some minor clarifications: 1) memory leak in ramoops_register_dummy(): dummy_data = kzalloc(sizeof(*dummy_data), GFP_KERNEL); but no kfree() if platform_device_register_data() fails. 2) memory leak in ramoops_init(): Missing platform_device_unregister(dummy) and kfree(dummy_data) if platform_driver_register(&ramoops_driver) fails. I've clarified the purpose of ramoops_register_dummy(), and added a common cleanup routine for all three failure paths to call. Reported-by: nixiaoming <nixiaoming@huawei.com> Cc: stable@vger.kernel.org Cc: Anton Vorontsov <anton@enomsg.org> Cc: Colin Cross <ccross@android.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org>
* | | | Merge tag 'libnvdimm-fixes2-4.19-rc6' of ↵Greg Kroah-Hartman2018-09-301-0/+1
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Dan writes: "filesystem-dax for 4.19-rc6 Fix a deadlock in the new for 4.19 dax_lock_mapping_entry() routine." * tag 'libnvdimm-fixes2-4.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: dax: Fix deadlock in dax_lock_mapping_entry()
| * | | | dax: Fix deadlock in dax_lock_mapping_entry()Jan Kara2018-09-271-0/+1
| | |_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When dax_lock_mapping_entry() has to sleep to obtain entry lock, it will fail to unlock mapping->i_pages spinlock and thus immediately deadlock against itself when retrying to grab the entry lock again. Fix the problem by unlocking mapping->i_pages before retrying. Fixes: c2a7d2a11552 ("filesystem-dax: Introduce dax_lock_mapping_entry()") Reported-by: Barret Rhoden <brho@google.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* | | | Merge tag 'for_v4.19-rc6' of ↵Greg Kroah-Hartman2018-09-271-1/+1
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Jan writes: "an ext2 patch fixing fsync(2) for DAX mounts." * tag 'for_v4.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: ext2, dax: set ext2_dax_aops for dax files
| * | | | ext2, dax: set ext2_dax_aops for dax filesToshi Kani2018-09-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sync syscall to DAX file needs to flush processor cache, but it currently does not flush to existing DAX files. This is because 'ext2_da_aops' is set to address_space_operations of existing DAX files, instead of 'ext2_dax_aops', since S_DAX flag is set after ext2_set_aops() in the open path. Similar to ext4, change ext2_iget() to initialize i_flags before ext2_set_aops(). Fixes: fb094c90748f ("ext2, dax: introduce ext2_dax_aops") Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Suggested-by: Jan Kara <jack@suse.cz> Cc: Jan Kara <jack@suse.cz> Cc: Dan Williams <dan.j.williams@intel.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: <stable@vger.kernel.org> Signed-off-by: Jan Kara <jack@suse.cz>
* | | | | erge tag 'libnvdimm-fixes-4.19-rc6' of ↵Greg Kroah-Hartman2018-09-251-11/+2
|\ \ \ \ \ | |_|/ / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Dan writes: "libnvdimm/dax for 4.19-rc6 * (2) fixes for the dax error handling updates that were merged for v4.19-rc1. My mails to Al have been bouncing recently, so I do not have his ack but the uaccess change is of the trivial / obviously correct variety. The address_space_operations fixes a regression. * A filesystem-dax fix to correct the zero page lookup to be compatible with non-x86 (mips and s390) architectures." * tag 'libnvdimm-fixes-4.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: device-dax: Add missing address_space_operations uaccess: Fix is_source param for check_copy_size() in copy_to_iter_mcsafe() filesystem-dax: Fix use of zero page
| * | | | filesystem-dax: Fix use of zero pageMatthew Wilcox2018-09-111-11/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use my_zero_pfn instead of ZERO_PAGE(), and pass the vaddr to it instead of zero so it works on MIPS and s390 who reference the vaddr to select a zero page. Cc: <stable@vger.kernel.org> Fixes: 91d25ba8a6b0 ("dax: use common 4k zero page for dax mmap reads") Signed-off-by: Matthew Wilcox <willy@infradead.org> Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* | | | | Merge tag 'upstream-4.19-rc4' of git://git.infradead.org/linux-ubifsGreg Kroah-Hartman2018-09-212-25/+6
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Richard writes: "This pull request contains fixes for UBIFS: - A wrong UBIFS assertion in mount code - Fix for a NULL pointer deref in mount code - Revert of a bad fix for xattrs" * tag 'upstream-4.19-rc4' of git://git.infradead.org/linux-ubifs: Revert "ubifs: xattr: Don't operate on deleted inodes" ubifs: drop false positive assertion ubifs: Check for name being NULL while mounting
| * | | | | Revert "ubifs: xattr: Don't operate on deleted inodes"Richard Weinberger2018-09-201-24/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 11a6fc3dc743e22fb50f2196ec55bee5140d3c52. UBIFS wants to assert that xattr operations are only issued on files with positive link count. The said patch made this operations return -ENOENT for unlinked files such that the asserts will no longer trigger. This was wrong since xattr operations are perfectly fine on unlinked files. Instead the assertions need to be fixed/removed. Cc: <stable@vger.kernel.org> Fixes: 11a6fc3dc743 ("ubifs: xattr: Don't operate on deleted inodes") Reported-by: Koen Vandeputte <koen.vandeputte@ncentric.com> Tested-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Richard Weinberger <richard@nod.at>
| * | | | | ubifs: drop false positive assertionSascha Hauer2018-09-201-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The following sequence triggers ubifs_assert(c, c->lst.taken_empty_lebs > 0); at the end of ubifs_remount_fs(): mount -t ubifs /dev/ubi0_0 /mnt echo 1 > /sys/kernel/debug/ubifs/ubi0_0/ro_error umount /mnt mount -t ubifs -o ro /dev/ubix_y /mnt mount -o remount,ro /mnt The resulting UBIFS assert failed in ubifs_remount_fs at 1878 (pid 161) is a false positive. In the case above c->lst.taken_empty_lebs has never been changed from its initial zero value. This will only happen when the deferred recovery is done. Fix this by doing the assertion only when recovery has been done already. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Richard Weinberger <richard@nod.at>
| * | | | | ubifs: Check for name being NULL while mountingRichard Weinberger2018-09-201-0/+3
| | |/ / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The requested device name can be NULL or an empty string. Check for that and refuse to continue. UBIFS has to do this manually since we cannot use mount_bdev(), which checks for this condition. Fixes: 1e51764a3c2ac ("UBIFS: add new flash file system") Reported-by: syzbot+38bd0f7865e5c6379280@syzkaller.appspotmail.com Signed-off-by: Richard Weinberger <richard@nod.at>
* | | | | ocfs2: fix ocfs2 read block panicJunxiao Bi2018-09-201-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While reading block, it is possible that io error return due to underlying storage issue, in this case, BH_NeedsValidate was left in the buffer head. Then when reading the very block next time, if it was already linked into journal, that will trigger the following panic. [203748.702517] kernel BUG at fs/ocfs2/buffer_head_io.c:342! [203748.702533] invalid opcode: 0000 [#1] SMP [203748.702561] Modules linked in: ocfs2 ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs sunrpc dm_switch dm_queue_length dm_multipath bonding be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i iw_cxgb4 cxgb4 cxgb3i libcxgbi iw_cxgb3 cxgb3 mdio ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ipmi_devintf iTCO_wdt iTCO_vendor_support dcdbas ipmi_ssif i2c_core ipmi_si ipmi_msghandler acpi_pad pcspkr sb_edac edac_core lpc_ich mfd_core shpchp sg tg3 ptp pps_core ext4 jbd2 mbcache2 sr_mod cdrom sd_mod ahci libahci megaraid_sas wmi dm_mirror dm_region_hash dm_log dm_mod [203748.703024] CPU: 7 PID: 38369 Comm: touch Not tainted 4.1.12-124.18.6.el6uek.x86_64 #2 [203748.703045] Hardware name: Dell Inc. PowerEdge R620/0PXXHP, BIOS 2.5.2 01/28/2015 [203748.703067] task: ffff880768139c00 ti: ffff88006ff48000 task.ti: ffff88006ff48000 [203748.703088] RIP: 0010:[<ffffffffa05e9f09>] [<ffffffffa05e9f09>] ocfs2_read_blocks+0x669/0x7f0 [ocfs2] [203748.703130] RSP: 0018:ffff88006ff4b818 EFLAGS: 00010206 [203748.703389] RAX: 0000000008620029 RBX: ffff88006ff4b910 RCX: 0000000000000000 [203748.703885] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 00000000023079fe [203748.704382] RBP: ffff88006ff4b8d8 R08: 0000000000000000 R09: ffff8807578c25b0 [203748.704877] R10: 000000000f637376 R11: 000000003030322e R12: 0000000000000000 [203748.705373] R13: ffff88006ff4b910 R14: ffff880732fe38f0 R15: 0000000000000000 [203748.705871] FS: 00007f401992c700(0000) GS:ffff880bfebc0000(0000) knlGS:0000000000000000 [203748.706370] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [203748.706627] CR2: 00007f4019252440 CR3: 00000000a621e000 CR4: 0000000000060670 [203748.707124] Stack: [203748.707371] ffff88006ff4b828 ffffffffa0609f52 ffff88006ff4b838 0000000000000001 [203748.707885] 0000000000000000 0000000000000000 ffff880bf67c3800 ffffffffa05eca00 [203748.708399] 00000000023079ff ffffffff81c58b80 0000000000000000 0000000000000000 [203748.708915] Call Trace: [203748.709175] [<ffffffffa0609f52>] ? ocfs2_inode_cache_io_unlock+0x12/0x20 [ocfs2] [203748.709680] [<ffffffffa05eca00>] ? ocfs2_empty_dir_filldir+0x80/0x80 [ocfs2] [203748.710185] [<ffffffffa05ec0cb>] ocfs2_read_dir_block_direct+0x3b/0x200 [ocfs2] [203748.710691] [<ffffffffa05f0fbf>] ocfs2_prepare_dx_dir_for_insert.isra.57+0x19f/0xf60 [ocfs2] [203748.711204] [<ffffffffa065660f>] ? ocfs2_metadata_cache_io_unlock+0x1f/0x30 [ocfs2] [203748.711716] [<ffffffffa05f4f3a>] ocfs2_prepare_dir_for_insert+0x13a/0x890 [ocfs2] [203748.712227] [<ffffffffa05f442e>] ? ocfs2_check_dir_for_entry+0x8e/0x140 [ocfs2] [203748.712737] [<ffffffffa061b2f2>] ocfs2_mknod+0x4b2/0x1370 [ocfs2] [203748.713003] [<ffffffffa061c385>] ocfs2_create+0x65/0x170 [ocfs2] [203748.713263] [<ffffffff8121714b>] vfs_create+0xdb/0x150 [203748.713518] [<ffffffff8121b225>] do_last+0x815/0x1210 [203748.713772] [<ffffffff812192e9>] ? path_init+0xb9/0x450 [203748.714123] [<ffffffff8121bca0>] path_openat+0x80/0x600 [203748.714378] [<ffffffff811bcd45>] ? handle_pte_fault+0xd15/0x1620 [203748.714634] [<ffffffff8121d7ba>] do_filp_open+0x3a/0xb0 [203748.714888] [<ffffffff8122a767>] ? __alloc_fd+0xa7/0x130 [203748.715143] [<ffffffff81209ffc>] do_sys_open+0x12c/0x220 [203748.715403] [<ffffffff81026ddb>] ? syscall_trace_enter_phase1+0x11b/0x180 [203748.715668] [<ffffffff816f0c9f>] ? system_call_after_swapgs+0xe9/0x190 [203748.715928] [<ffffffff8120a10e>] SyS_open+0x1e/0x20 [203748.716184] [<ffffffff816f0d5e>] system_call_fastpath+0x18/0xd7 [203748.716440] Code: 00 00 48 8b 7b 08 48 83 c3 10 45 89 f8 44 89 e1 44 89 f2 4c 89 ee e8 07 06 11 e1 48 8b 03 48 85 c0 75 df 8b 5d c8 e9 4d fa ff ff <0f> 0b 48 8b 7d a0 e8 dc c6 06 00 48 b8 00 00 00 00 00 00 00 10 [203748.717505] RIP [<ffffffffa05e9f09>] ocfs2_read_blocks+0x669/0x7f0 [ocfs2] [203748.717775] RSP <ffff88006ff4b818> Joesph ever reported a similar panic. Link: https://oss.oracle.com/pipermail/ocfs2-devel/2013-May/008931.html Link: http://lkml.kernel.org/r/20180912063207.29484-1-junxiao.bi@oracle.com Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <jiangqi903@gmail.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Changwei Ge <ge.changwei@h3c.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | | | | fs/proc/kcore.c: fix invalid memory access in multi-page read optimizationDominique Martinet2018-09-201-0/+1
|/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The 'm' kcore_list item could point to kclist_head, and it is incorrect to look at m->addr / m->size in this case. There is no choice but to run through the list of entries for every address if we did not find any entry in the previous iteration Reset 'm' to NULL in that case at Omar Sandoval's suggestion. [akpm@linux-foundation.org: add comment] Link: http://lkml.kernel.org/r/1536100702-28706-1-git-send-email-asmadeus@codewreck.org Fixes: bf991c2231117 ("proc/kcore: optimize multiple page reads") Signed-off-by: Dominique Martinet <asmadeus@codewreck.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Omar Sandoval <osandov@osandov.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: James Morse <james.morse@arm.com> Cc: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | | | Merge tag 'ext4_for_linus_stable' of ↵Greg Kroah-Hartman2018-09-178-26/+72
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Ted writes: Various ext4 bug fixes; primarily making ext4 more robust against maliciously crafted file systems, and some DAX fixes. * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4, dax: set ext4_dax_aops for dax files ext4, dax: add ext4_bmap to ext4_dax_aops ext4: don't mark mmp buffer head dirty ext4: show test_dummy_encryption mount option in /proc/mounts ext4: close race between direct IO and ext4_break_layouts() ext4: fix online resizing for bigalloc file systems with a 1k block size ext4: fix online resize's handling of a too-small final block group ext4: recalucate superblock checksum after updating free blocks/inodes ext4: avoid arithemetic overflow that can trigger a BUG ext4: avoid divide by zero fault when deleting corrupted inline directories ext4: check to make sure the rename(2)'s destination is not freed ext4: add nonstring annotations to ext4.h
| * | | | ext4, dax: set ext4_dax_aops for dax filesToshi Kani2018-09-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sync syscall to DAX file needs to flush processor cache, but it currently does not flush to existing DAX files. This is because 'ext4_da_aops' is set to address_space_operations of existing DAX files, instead of 'ext4_dax_aops', since S_DAX flag is set after ext4_set_aops() in the open path. New file -------- lookup_open ext4_create __ext4_new_inode ext4_set_inode_flags // Set S_DAX flag ext4_set_aops // Set aops to ext4_dax_aops Existing file ------------- lookup_open ext4_lookup ext4_iget ext4_set_aops // Set aops to ext4_da_aops ext4_set_inode_flags // Set S_DAX flag Change ext4_iget() to initialize i_flags before ext4_set_aops(). Fixes: 5f0663bb4a64 ("ext4, dax: introduce ext4_dax_aops") Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Suggested-by: Jan Kara <jack@suse.cz> Cc: stable@vger.kernel.org
| * | | | ext4, dax: add ext4_bmap to ext4_dax_aopsToshi Kani2018-09-151-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ext4 mount path calls .bmap to the journal inode. This currently works for the DAX mount case because ext4_iget() always set 'ext4_da_aops' to any regular files. In preparation to fix ext4_iget() to set 'ext4_dax_aops' for ext4 DAX files, add ext4_bmap() to 'ext4_dax_aops', since bmap works for DAX inodes. Fixes: 5f0663bb4a64 ("ext4, dax: introduce ext4_dax_aops") Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Suggested-by: Jan Kara <jack@suse.cz> Cc: stable@vger.kernel.org
| * | | | ext4: don't mark mmp buffer head dirtyLi Dongyang2018-09-151-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Marking mmp bh dirty before writing it will make writeback pick up mmp block later and submit a write, we don't want the duplicate write as kmmpd thread should have full control of reading and writing the mmp block. Another reason is we will also have random I/O error on the writeback request when blk integrity is enabled, because kmmpd could modify the content of the mmp block(e.g. setting new seq and time) while the mmp block is under I/O requested by writeback. Signed-off-by: Li Dongyang <dongyangli@ddn.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Cc: stable@vger.kernel.org
| * | | | ext4: show test_dummy_encryption mount option in /proc/mountsEric Biggers2018-09-151-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When in effect, add "test_dummy_encryption" to _ext4_show_options() so that it is shown in /proc/mounts and other relevant procfs files. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
| * | | | ext4: close race between direct IO and ext4_break_layouts()Ross Zwisler2018-09-111-6/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the refcount of a page is lowered between the time that it is returned by dax_busy_page() and when the refcount is again checked in ext4_break_layouts() => ___wait_var_event(), the waiting function ext4_wait_dax_page() will never be called. This means that ext4_break_layouts() will still have 'retry' set to false, so we'll stop looping and never check the refcount of other pages in this inode. Instead, always continue looping as long as dax_layout_busy_page() gives us a page which it found with an elevated refcount. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
| * | | | ext4: fix online resizing for bigalloc file systems with a 1k block sizeTheodore Ts'o2018-09-031-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | An online resize of a file system with the bigalloc feature enabled and a 1k block size would be refused since ext4_resize_begin() did not understand s_first_data_block is 0 for all bigalloc file systems, even when the block size is 1k. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
| * | | | ext4: fix online resize's handling of a too-small final block groupTheodore Ts'o2018-09-031-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Avoid growing the file system to an extent so that the last block group is too small to hold all of the metadata that must be stored in the block group. This problem can be triggered with the following reproducer: umount /mnt mke2fs -F -m0 -b 4096 -t ext4 -O resize_inode,^has_journal \ -E resize=1073741824 /tmp/foo.img 128M mount /tmp/foo.img /mnt truncate --size 1708M /tmp/foo.img resize2fs /dev/loop0 295400 umount /mnt e2fsck -fy /tmp/foo.img Reported-by: Torsten Hilbrich <torsten.hilbrich@secunet.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
| * | | | ext4: recalucate superblock checksum after updating free blocks/inodesTheodore Ts'o2018-09-011-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When mounting the superblock, ext4_fill_super() calculates the free blocks and free inodes and stores them in the superblock. It's not strictly necessary, since we don't use them any more, but it's nice to keep them roughly aligned to reality. Since it's not critical for file system correctness, the code doesn't call ext4_commit_super(). The problem is that it's in ext4_commit_super() that we recalculate the superblock checksum. So if we're not going to call ext4_commit_super(), we need to call ext4_superblock_csum_set() to make sure the superblock checksum is consistent. Most of the time, this doesn't matter, since we end up calling ext4_commit_super() very soon thereafter, and definitely by the time the file system is unmounted. However, it doesn't work in this sequence: mke2fs -Fq -t ext4 /dev/vdc 128M mount /dev/vdc /vdc cp xfstests/git-versions /vdc godown /vdc umount /vdc mount /dev/vdc tune2fs -l /dev/vdc With this commit, the "tune2fs -l" no longer fails. Reported-by: Chengguang Xu <cgxu519@gmx.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
| * | | | ext4: avoid arithemetic overflow that can trigger a BUGTheodore Ts'o2018-09-012-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A maliciously crafted file system can cause an overflow when the results of a 64-bit calculation is stored into a 32-bit length parameter. https://bugzilla.kernel.org/show_bug.cgi?id=200623 Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reported-by: Wen Xu <wen.xu@gatech.edu> Cc: stable@vger.kernel.org
| * | | | ext4: avoid divide by zero fault when deleting corrupted inline directoriesTheodore Ts'o2018-08-272-12/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A specially crafted file system can trick empty_inline_dir() into reading past the last valid entry in a inline directory, and then run into the end of xattr marker. This will trigger a divide by zero fault. Fix this by using the size of the inline directory instead of dir->i_size. Also clean up error reporting in __ext4_check_dir_entry so that the message is clearer and more understandable --- and avoids the division by zero trap if the size passed in is zero. (I'm not sure why we coded it that way in the first place; printing offset % size is actually more confusing and less useful.) https://bugzilla.kernel.org/show_bug.cgi?id=200933 Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reported-by: Wen Xu <wen.xu@gatech.edu> Cc: stable@vger.kernel.org
| * | | | ext4: check to make sure the rename(2)'s destination is not freedTheodore Ts'o2018-08-271-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the destination of the rename(2) system call exists, the inode's link count (i_nlinks) must be non-zero. If it is, the inode can end up on the orphan list prematurely, leading to all sorts of hilarity, including a use-after-free. https://bugzilla.kernel.org/show_bug.cgi?id=200931 Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reported-by: Wen Xu <wen.xu@gatech.edu> Cc: stable@vger.kernel.org
| * | | | ext4: add nonstring annotations to ext4.hTheodore Ts'o2018-08-271-3/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This suppresses some false positives in gcc 8's -Wstringop-truncation Suggested by Miguel Ojeda (hopefully the __nonstring definition will eventually get accepted in the compiler-gcc.h header file). Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
* | | | | Merge tag '4.19-rc3-smb3-cifs' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds2018-09-145-17/+43
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull cifs fixes from Steve French: "Fixes for four CIFS/SMB3 potential pointer overflow issues, one minor build fix, and a build warning cleanup" * tag '4.19-rc3-smb3-cifs' of git://git.samba.org/sfrench/cifs-2.6: cifs: read overflow in is_valid_oplock_break() cifs: integer overflow in in SMB2_ioctl() CIFS: fix wrapping bugs in num_entries() cifs: prevent integer overflow in nxt_dir_entry() fs/cifs: require sha512 fs/cifs: suppress a string overflow warning
| * | | | | cifs: read overflow in is_valid_oplock_break()Dan Carpenter2018-09-121-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We need to verify that the "data_offset" is within bounds. Reported-by: Dr Silvio Cesare of InfoSect <silvio.cesare@gmail.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com>
| * | | | | cifs: integer overflow in in SMB2_ioctl()Dan Carpenter2018-09-121-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The "le32_to_cpu(rsp->OutputOffset) + *plen" addition can overflow and wrap around to a smaller value which looks like it would lead to an information leak. Fixes: 4a72dafa19ba ("SMB2 FSCTL and IOCTL worker function") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com> CC: Stable <stable@vger.kernel.org>
| * | | | | CIFS: fix wrapping bugs in num_entries()Dan Carpenter2018-09-121-10/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The problem is that "entryptr + next_offset" and "entryptr + len + size" can wrap. I ended up changing the type of "entryptr" because it makes the math easier when we don't have to do so much casting. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> CC: Stable <stable@vger.kernel.org>
| * | | | | cifs: prevent integer overflow in nxt_dir_entry()Dan Carpenter2018-09-121-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The "old_entry + le32_to_cpu(pDirInfo->NextEntryOffset)" can wrap around so I have added a check for integer overflow. Reported-by: Dr Silvio Cesare of InfoSect <silvio.cesare@gmail.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steve French <stfrench@microsoft.com> CC: Stable <stable@vger.kernel.org>
| * | | | | fs/cifs: require sha512Stefan Metzmacher2018-09-091-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This got lost in commit 0fdfef9aa7ee68ddd508aef7c98630cfc054f8d6, which removed CONFIG_CIFS_SMB311. Signed-off-by: Stefan Metzmacher <metze@samba.org> Fixes: 0fdfef9aa7ee68ddd ("smb3: simplify code by removing CONFIG_CIFS_SMB311") CC: Stable <stable@vger.kernel.org> CC: linux-cifs@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
| * | | | | fs/cifs: suppress a string overflow warningStephen Rothwell2018-09-091-3/+8
| | |/ / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A powerpc build of cifs with gcc v8.2.0 produces this warning: fs/cifs/cifssmb.c: In function ‘CIFSSMBNegotiate’: fs/cifs/cifssmb.c:605:3: warning: ‘strncpy’ writing 16 bytes into a region of size 1 overflows the destination [-Wstringop-overflow=] strncpy(pSMB->DialectsArray+count, protocols[i].name, 16); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Since we are already doing a strlen() on the source, change the strncpy to a memcpy(). Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Steve French <stfrench@microsoft.com>
* | | | | Merge tag 'nfs-for-4.19-2' of git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds2018-09-144-24/+39
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull NFS client bugfixes from Anna Schumaker: "These are a handful of fixes for problems that Trond found. Patch #1 and #3 have the same name, a second issue was found after applying the first patch. Stable bugfixes: - v4.17+: Fix tracepoint Oops in initiate_file_draining() - v4.11+: Fix an infinite loop on I/O Other fixes: - Return errors if a waiting layoutget is killed - Don't open code clearing of delegation state" * tag 'nfs-for-4.19-2' of git://git.linux-nfs.org/projects/anna/linux-nfs: NFS: Don't open code clearing of delegation state NFSv4.1 fix infinite loop on I/O. NFSv4: Fix a tracepoint Oops in initiate_file_draining() pNFS: Ensure we return the error if someone kills a waiting layoutget NFSv4: Fix a tracepoint Oops in initiate_file_draining()
| * | | | | NFS: Don't open code clearing of delegation stateTrond Myklebust2018-09-141-9/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a helper for the case when the nfs4 open state has been set to use a delegation stateid, and we want to revert to using the open stateid. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| * | | | | NFSv4.1 fix infinite loop on I/O.Trond Myklebust2018-09-142-3/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The previous fix broke recovery of delegated stateids because it assumes that if we did not mark the delegation as suspect, then the delegation has effectively been revoked, and so it removes that delegation irrespectively of whether or not it is valid and still in use. While this is "mostly harmless" for ordinary I/O, we've seen pNFS fail with LAYOUTGET spinning in an infinite loop while complaining that we're using an invalid stateid (in this case the all-zero stateid). What we rather want to do here is ensure that the delegation is always correctly marked as needing testing when that is the case. So we want to close the loophole offered by nfs4_schedule_stateid_recovery(), which marks the state as needing to be reclaimed, but not the delegation that may be backing it. Fixes: 0e3d3e5df07dc ("NFSv4.1 fix infinite loop on IO BAD_STATEID error") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: stable@vger.kernel.org # v4.11+ Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| * | | | | NFSv4: Fix a tracepoint Oops in initiate_file_draining()Trond Myklebust2018-09-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that the value of 'ino' can be NULL or an ERR_PTR(), we need to change the test in the tracepoint. Fixes: ce5624f7e6675 ("NFSv4: Return NFS4ERR_DELAY when a layout fails...") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: stable@vger.kernel.org # v4.17+ Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| * | | | | pNFS: Ensure we return the error if someone kills a waiting layoutgetTrond Myklebust2018-09-141-10/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If someone interrupts a wait on one or more outstanding layoutgets in pnfs_update_layout() then return the ERESTARTSYS/EINTR error. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| * | | | | NFSv4: Fix a tracepoint Oops in initiate_file_draining()Trond Myklebust2018-09-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that the value of 'ino' can be NULL or an ERR_PTR(), we need to change the test in the tracepoint. Fixes: ce5624f7e6675 ("NFSv4: Return NFS4ERR_DELAY when a layout fails...") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: stable@vger.kernel.org # v4.17+ Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* | | | | | Merge tag 'ovl-fixes-4.19-rc4' of ↵Linus Torvalds2018-09-133-15/+44
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs Pull overlayfs fixes from Miklos Szeredi: "This fixes a regression in the recent file stacking update, reported and fixed by Amir Goldstein. The fix is fairly trivial, but involves adding a fadvise() f_op and the associated churn in the vfs. As discussed on -fsdevel, there are other possible uses for this method, than allowing proper stacking for overlays. And there's one other fix for a syzkaller detected oops" * tag 'ovl-fixes-4.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: ovl: fix oopses in ovl_fill_super() failure paths ovl: add ovl_fadvise() vfs: implement readahead(2) using POSIX_FADV_WILLNEED vfs: add the fadvise() file operation Documentation/filesystems: update documentation of file_operations ovl: fix GPF in swapfile_activate of file from overlayfs over xfs ovl: respect FIEMAP_FLAG_SYNC flag
| * | | | | | ovl: fix oopses in ovl_fill_super() failure pathsMiklos Szeredi2018-09-101-12/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ovl_free_fs() dereferences ofs->workbasedir and ofs->upper_mnt in cases when those might not have been initialized yet. Fix the initialization order for these fields. Reported-by: syzbot+c75f181dc8429d2eb887@syzkaller.appspotmail.com Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Cc: <stable@vger.kernel.org> # v4.15 Fixes: 95e6d4177cb7 ("ovl: grab reference to workbasedir early") Fixes: a9075cdb467d ("ovl: factor out ovl_free_fs() helper")
| * | | | | | ovl: add ovl_fadvise()Amir Goldstein2018-09-031-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement stacked fadvise to fix syscalls readahead(2) and fadvise64(2) on an overlayfs file. Suggested-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: d1d04ef8572b ("ovl: stack file ops") Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
| * | | | | | ovl: fix GPF in swapfile_activate of file from overlayfs over xfsAmir Goldstein2018-08-302-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since overlayfs implements stacked file operations, the underlying filesystems are not supposed to be exposed to the overlayfs file, whose f_inode is an overlayfs inode. Assigning an overlayfs file to swap_file results in an attempt of xfs code to dereference an xfs_inode struct from an ovl_inode pointer: CPU: 0 PID: 2462 Comm: swapon Not tainted 4.18.0-xfstests-12721-g33e17876ea4e #3402 RIP: 0010:xfs_find_bdev_for_inode+0x23/0x2f Call Trace: xfs_iomap_swapfile_activate+0x1f/0x43 __se_sys_swapon+0xb1a/0xee9 Fix this by not assigning the real inode mapping to f_mapping, which will cause swapon() to return an error (-EINVAL). Although it makes sense not to allow setting swpafile on an overlayfs file, some users may depend on it, so we may need to fix this up in the future. Keeping f_mapping pointing to overlay inode mapping will cause O_DIRECT open to fail. Fix this by installing ovl_aops with noop_direct_IO in overlay inode mapping. Keeping f_mapping pointing to overlay inode mapping will cause other a_ops related operations to fail (e.g. readahead()). Those will be fixed by follow up patches. Suggested-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: f7c72396d0de ("ovl: add O_DIRECT support") Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>