summaryrefslogtreecommitdiffstats
path: root/fs/overlayfs
Commit message (Collapse)AuthorAgeFilesLines
* ovl: simplify file spliceMiklos Szeredi2021-10-201-44/+2
| | | | | | | | | | | | | | | | | | | commit 82a763e61e2b601309d696d4fa514c77d64ee1be upstream. generic_file_splice_read() and iter_file_splice_write() will call back into f_op->iter_read() and f_op->iter_write() respectively. These already do the real file lookup and cred override. So the code in ovl_splice_read() and ovl_splice_write() is redundant. In addition the ovl_file_accessed() call in ovl_splice_write() is incorrect, though probably harmless. Fix by calling generic_file_splice_read() and iter_file_splice_write() directly. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> [reported to resolve issues with 1a980b8cbf00 ("ovl: add splice file read write helper")] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* ovl: fix missing negative dentry check in ovl_rename()Zheng Liang2021-10-131-3/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit a295aef603e109a47af355477326bd41151765b6 upstream. The following reproducer mkdir lower upper work merge touch lower/old touch lower/new mount -t overlay overlay -olowerdir=lower,upperdir=upper,workdir=work merge rm merge/new mv merge/old merge/new & unlink upper/new may result in this race: PROCESS A: rename("merge/old", "merge/new"); overwrite=true,ovl_lower_positive(old)=true, ovl_dentry_is_whiteout(new)=true -> flags |= RENAME_EXCHANGE PROCESS B: unlink("upper/new"); PROCESS A: lookup newdentry in new_upperdir call vfs_rename() with negative newdentry and RENAME_EXCHANGE Fix by adding the missing check for negative newdentry. Signed-off-by: Zheng Liang <zhengliang6@huawei.com> Fixes: e9be9d5e76e3 ("overlay filesystem") Cc: <stable@vger.kernel.org> # v3.18 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* ovl: fix BUG_ON() in may_delete() when called from ovl_cleanup()chenying2021-09-221-2/+4
| | | | | | | | | | | | | | | | commit 52d5a0c6bd8a89f460243ed937856354f8f253a3 upstream. If function ovl_instantiate() returns an error, ovl_cleanup will be called and try to remove newdentry from wdir, but the newdentry has been moved to udir at this time. This will causes BUG_ON(victim->d_parent->d_inode != dir) in fs/namei.c:may_delete. Signed-off-by: chenying <chenying.kernel@bytedance.com> Fixes: 01b39dcc9568 ("ovl: use inode_insert5() to hash a newly created inode") Link: https://lore.kernel.org/linux-unionfs/e6496a94-a161-dc04-c38a-d2544633acb4@bytedance.com/ Cc: <stable@vger.kernel.org> # v4.18 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* ovl: fix uninitialized pointer read in ovl_lookup_real_one()Miklos Szeredi2021-09-031-1/+1
| | | | | | | | | | | | | | [ Upstream commit 580c610429b3994e8db24418927747cf28443cde ] One error path can result in release_dentry_name_snapshot() being called before "name" was initialized by take_dentry_name_snapshot(). Fix by moving the release_dentry_name_snapshot() to immediately after the only use. Reported-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* ovl: add splice file read write helperMurphy Zhou2021-08-261-0/+47
| | | | | | | | | | | | | | [ Upstream commit 1a980b8cbf0059a5308eea61522f232fd03002e2 ] Now overlayfs falls back to use default file splice read and write, which is not compatiple with overlayfs, returning EFAULT. xfstests generic/591 can reproduce part of this. Tested this patch with xfstests auto group tests. Signed-off-by: Murphy Zhou <jencce.kernel@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* ovl: fix missing revert_creds() on error pathDan Carpenter2021-05-141-1/+2
| | | | | | | | | | | | | commit 7b279bbfd2b230c7a210ff8f405799c7e46bbf48 upstream. Smatch complains about missing that the ovl_override_creds() doesn't have a matching revert_creds() if the dentry is disconnected. Fix this by moving the ovl_override_creds() until after the disconnected check. Fixes: aa3ff3c152ff ("ovl: copy up of disconnected dentries") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* ovl: allow upperdir inside lowerdirMiklos Szeredi2021-05-071-5/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | commit 708fa01597fa002599756bf56a96d0de1677375c upstream. Commit 146d62e5a586 ("ovl: detect overlapping layers") made sure we don't have overlapping layers, but it also broke the arguably valid use case of mount -olowerdir=/,upperdir=/subdir,.. where upperdir overlaps lowerdir on the same filesystem. This has been causing regressions. Revert the check, but only for the specific case where upperdir and/or workdir are subdirectories of lowerdir. Any other overlap (e.g. lowerdir is subdirectory of upperdir, etc) case is crazy, so leave the check in place for those. Overlaps are detected at lookup time too, so reverting the mount time check should be safe. Fixes: 146d62e5a586 ("ovl: detect overlapping layers") Cc: <stable@vger.kernel.org> # v5.2 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* ovl: expand warning in ovl_d_real()Miklos Szeredi2021-02-171-5/+8
| | | | | | | | | commit cef4cbff06fbc3be54d6d79ee139edecc2ee8598 upstream. There was a syzbot report with this warning but insufficient information... Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* ovl: skip getxattr of security labelsAmir Goldstein2021-02-171-7/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 03fedf93593c82538b18476d8c4f0e8f8435ea70 ] When inode has no listxattr op of its own (e.g. squashfs) vfs_listxattr calls the LSM inode_listsecurity hooks to list the xattrs that LSMs will intercept in inode_getxattr hooks. When selinux LSM is installed but not initialized, it will list the security.selinux xattr in inode_listsecurity, but will not intercept it in inode_getxattr. This results in -ENODATA for a getxattr call for an xattr returned by listxattr. This situation was manifested as overlayfs failure to copy up lower files from squashfs when selinux is built-in but not initialized, because ovl_copy_xattr() iterates the lower inode xattrs by vfs_listxattr() and vfs_getxattr(). ovl_copy_xattr() skips copy up of security labels that are indentified by inode_copy_up_xattr LSM hooks, but it does that after vfs_getxattr(). Since we are not going to copy them, skip vfs_getxattr() of the security labels. Reported-by: Michael Labriola <michael.d.labriola@gmail.com> Tested-by: Michael Labriola <michael.d.labriola@gmail.com> Link: https://lore.kernel.org/linux-unionfs/2nv9d47zt7.fsf@aldarion.sourceruckus.org/ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* ovl: perform vfs_getxattr() with mounter credsMiklos Szeredi2021-02-171-0/+2
| | | | | | | | | | | | | | | [ Upstream commit 554677b97257b0b69378bd74e521edb7e94769ff ] The vfs_getxattr() in ovl_xattr_set() is used to check whether an xattr exist on a lower layer file that is to be removed. If the xattr does not exist, then no need to copy up the file. This call of vfs_getxattr() wasn't wrapped in credential override, and this is probably okay. But for consitency wrap this instance as well. Reported-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* ovl: fix dentry leak in ovl_get_redirectLiangyan2021-02-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit e04527fefba6e4e66492f122cf8cc6314f3cf3bf upstream. We need to lock d_parent->d_lock before dget_dlock, or this may have d_lockref updated parallelly like calltrace below which will cause dentry->d_lockref leak and risk a crash. CPU 0 CPU 1 ovl_set_redirect lookup_fast ovl_get_redirect __d_lookup dget_dlock //no lock protection here spin_lock(&dentry->d_lock) dentry->d_lockref.count++ dentry->d_lockref.count++ [   49.799059] PGD 800000061fed7067 P4D 800000061fed7067 PUD 61fec5067 PMD 0 [   49.799689] Oops: 0002 [#1] SMP PTI [   49.800019] CPU: 2 PID: 2332 Comm: node Not tainted 4.19.24-7.20.al7.x86_64 #1 [   49.800678] Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 8a46cfe 04/01/2014 [   49.801380] RIP: 0010:_raw_spin_lock+0xc/0x20 [   49.803470] RSP: 0018:ffffac6fc5417e98 EFLAGS: 00010246 [   49.803949] RAX: 0000000000000000 RBX: ffff93b8da3446c0 RCX: 0000000a00000000 [   49.804600] RDX: 0000000000000001 RSI: 000000000000000a RDI: 0000000000000088 [   49.805252] RBP: 0000000000000000 R08: 0000000000000000 R09: ffffffff993cf040 [   49.805898] R10: ffff93b92292e580 R11: ffffd27f188a4b80 R12: 0000000000000000 [   49.806548] R13: 00000000ffffff9c R14: 00000000fffffffe R15: ffff93b8da3446c0 [   49.807200] FS:  00007ffbedffb700(0000) GS:ffff93b927880000(0000) knlGS:0000000000000000 [   49.807935] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [   49.808461] CR2: 0000000000000088 CR3: 00000005e3f74006 CR4: 00000000003606a0 [   49.809113] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [   49.809758] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [   49.810410] Call Trace: [   49.810653]  d_delete+0x2c/0xb0 [   49.810951]  vfs_rmdir+0xfd/0x120 [   49.811264]  do_rmdir+0x14f/0x1a0 [   49.811573]  do_syscall_64+0x5b/0x190 [   49.811917]  entry_SYSCALL_64_after_hwframe+0x44/0xa9 [   49.812385] RIP: 0033:0x7ffbf505ffd7 [   49.814404] RSP: 002b:00007ffbedffada8 EFLAGS: 00000297 ORIG_RAX: 0000000000000054 [   49.815098] RAX: ffffffffffffffda RBX: 00007ffbedffb640 RCX: 00007ffbf505ffd7 [   49.815744] RDX: 0000000004449700 RSI: 0000000000000000 RDI: 0000000006c8cd50 [   49.816394] RBP: 00007ffbedffaea0 R08: 0000000000000000 R09: 0000000000017d0b [   49.817038] R10: 0000000000000000 R11: 0000000000000297 R12: 0000000000000012 [   49.817687] R13: 00000000072823d8 R14: 00007ffbedffb700 R15: 00000000072823d8 [   49.818338] Modules linked in: pvpanic cirrusfb button qemu_fw_cfg atkbd libps2 i8042 [   49.819052] CR2: 0000000000000088 [   49.819368] ---[ end trace 4e652b8aa299aa2d ]--- [   49.819796] RIP: 0010:_raw_spin_lock+0xc/0x20 [   49.821880] RSP: 0018:ffffac6fc5417e98 EFLAGS: 00010246 [   49.822363] RAX: 0000000000000000 RBX: ffff93b8da3446c0 RCX: 0000000a00000000 [   49.823008] RDX: 0000000000000001 RSI: 000000000000000a RDI: 0000000000000088 [   49.823658] RBP: 0000000000000000 R08: 0000000000000000 R09: ffffffff993cf040 [   49.825404] R10: ffff93b92292e580 R11: ffffd27f188a4b80 R12: 0000000000000000 [   49.827147] R13: 00000000ffffff9c R14: 00000000fffffffe R15: ffff93b8da3446c0 [   49.828890] FS:  00007ffbedffb700(0000) GS:ffff93b927880000(0000) knlGS:0000000000000000 [   49.830725] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [   49.832359] CR2: 0000000000000088 CR3: 00000005e3f74006 CR4: 00000000003606a0 [   49.834085] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [   49.835792] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Cc: <stable@vger.kernel.org> Fixes: a6c606551141 ("ovl: redirect on rename-dir") Signed-off-by: Liangyan <liangyan.peng@linux.alibaba.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* ovl: fix unneeded call to ovl_change_flags()Amir Goldstein2020-07-221-4/+6
| | | | | | | | | | | | | | commit 81a33c1ee941c3bb9ffc6bac8f676be13351344e upstream. The check if user has changed the overlay file was wrong, causing unneeded call to ovl_change_flags() including taking f_lock on every file access. Fixes: d989903058a8 ("ovl: do not generate duplicate fsnotify events for "fake" path") Cc: <stable@vger.kernel.org> # v4.19+ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* ovl: relax WARN_ON() when decoding lower directory file handleAmir Goldstein2020-07-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 124c2de2c0aee96271e4ddab190083d8aa7aa71a upstream. Decoding a lower directory file handle to overlay path with cold inode/dentry cache may go as follows: 1. Decode real lower file handle to lower dir path 2. Check if lower dir is indexed (was copied up) 3. If indexed, get the upper dir path from index 4. Lookup upper dir path in overlay 5. If overlay path found, verify that overlay lower is the lower dir from step 1 On failure to verify step 5 above, user will get an ESTALE error and a WARN_ON will be printed. A mismatch in step 5 could be a result of lower directory that was renamed while overlay was offline, after that lower directory has been copied up and indexed. This is a scripted reproducer based on xfstest overlay/052: # Create lower subdir create_dirs create_test_files $lower/lowertestdir/subdir mount_dirs # Copy up lower dir and encode lower subdir file handle touch $SCRATCH_MNT/lowertestdir test_file_handles $SCRATCH_MNT/lowertestdir/subdir -p -o $tmp.fhandle # Rename lower dir offline unmount_dirs mv $lower/lowertestdir $lower/lowertestdir.new/ mount_dirs # Attempt to decode lower subdir file handle test_file_handles $SCRATCH_MNT -p -i $tmp.fhandle Since this WARN_ON() can be triggered by user we need to relax it. Fixes: 4b91c30a5a19 ("ovl: lookup connected ancestor of dir in inode cache") Cc: <stable@vger.kernel.org> # v4.16+ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* ovl: inode reference leak in ovl_is_inuse true case.youngjun2020-07-221-1/+10
| | | | | | | | | | | | | | | commit 24f14009b8f1754ec2ae4c168940c01259b0f88a upstream. When "ovl_is_inuse" true case, trap inode reference not put. plus adding the comment explaining sequence of ovl_is_inuse after ovl_setup_trap. Fixes: 0be0bfd2de9d ("ovl: fix regression caused by overlapping layers detection") Cc: <stable@vger.kernel.org> # v4.19+ Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: youngjun <her0gyugyu@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* ovl: fix regression with re-formatted lower squashfsAmir Goldstein2020-07-221-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | commit a888db310195400f050b89c47673f0f8babfbb41 upstream. Commit 9df085f3c9a2 ("ovl: relax requirement for non null uuid of lower fs") relaxed the requirement for non null uuid with single lower layer to allow enabling index and nfs_export features with single lower squashfs. Fabian reported a regression in a setup when overlay re-uses an existing upper layer and re-formats the lower squashfs image. Because squashfs has no uuid, the origin xattr in upper layer are decoded from the new lower layer where they may resolve to a wrong origin file and user may get an ESTALE or EIO error on lookup. To avoid the reported regression while still allowing the new features with single lower squashfs, do not allow decoding origin with lower null uuid unless user opted-in to one of the new features that require following the lower inode of non-dir upper (index, xino, metacopy). Reported-by: Fabian <godi.beat@gmx.net> Link: https://lore.kernel.org/linux-unionfs/32532923.JtPX5UtSzP@fgdesktop/ Fixes: 9df085f3c9a2 ("ovl: relax requirement for non null uuid of lower fs") Cc: stable@vger.kernel.org # v4.20+ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* ovl: initialize error in ovl_copy_xattrYuxuan Shui2020-06-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | commit 520da69d265a91c6536c63851cbb8a53946974f0 upstream. In ovl_copy_xattr, if all the xattrs to be copied are overlayfs private xattrs, the copy loop will terminate without assigning anything to the error variable, thus returning an uninitialized value. If ovl_copy_xattr is called from ovl_clear_empty, this uninitialized error value is put into a pointer by ERR_PTR(), causing potential invalid memory accesses down the line. This commit initialize error with 0. This is the correct value because when there's no xattr to copy, because all xattrs are private, ovl_copy_xattr should succeed. This bug is discovered with the help of INIT_STACK_ALL and clang. Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com> Link: https://bugs.chromium.org/p/chromium/issues/detail?id=1050405 Fixes: 0956254a2d5b ("ovl: don't copy up opaqueness") Cc: stable@vger.kernel.org # v4.8 Signed-off-by: Alexander Potapenko <glider@google.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* ovl: fix value of i_ino for lower hardlink corner caseAmir Goldstein2020-04-211-1/+3
| | | | | | | | | | | | | | | | | | | | | | | commit 300b124fcf6ad2cd99a7b721e0f096785e0a3134 upstream. Commit 6dde1e42f497 ("ovl: make i_ino consistent with st_ino in more cases"), relaxed the condition nfs_export=on in order to set the value of i_ino to xino map of real ino. Specifically, it also relaxed the pre-condition that index=on for consistent i_ino. This opened the corner case of lower hardlink in ovl_get_inode(), which calls ovl_fill_inode() with ino=0 and then ovl_init_inode() is called to set i_ino to lower real ino without the xino mapping. Pass the correct values of ino;fsid in this case to ovl_fill_inode(), so it can initialize i_ino correctly. Fixes: 6dde1e42f497 ("ovl: make i_ino consistent with st_ino in more ...") Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* ovl: fix lseek overflow on 32bitMiklos Szeredi2020-02-111-1/+1
| | | | | | | | | | | | | | | | commit a4ac9d45c0cd14a2adc872186431c79804b77dbf upstream. ovl_lseek() is using ssize_t to return the value from vfs_llseek(). On a 32-bit kernel ssize_t is a 32-bit signed int, which overflows above 2 GB. Assign the return value of vfs_llseek() to loff_t to fix this. Reported-by: Boris Gjenero <boris.gjenero@gmail.com> Fixes: 9e46b840c705 ("ovl: support stacked SEEK_HOLE/SEEK_DATA") Cc: <stable@vger.kernel.org> # v4.19 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* ovl: fix wrong WARN_ON() in ovl_cache_update_ino()Amir Goldstein2020-02-111-1/+7
| | | | | | | | | | | | | | | | | | commit 4c37e71b713ecffe81f8e6273c6835e54306d412 upstream. The WARN_ON() that child entry is always on overlay st_dev became wrong when we allowed this function to update d_ino in non-samefs setup with xino enabled. It is not true in case of xino bits overflow on a non-dir inode. Leave the WARN_ON() only for directories, where assertion is still true. Fixes: adbf4f7ea834 ("ovl: consistent d_ino for non-samefs with xino") Cc: <stable@vger.kernel.org> # v4.17+ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* ovl: relax WARN_ON() on rename to selfAmir Goldstein2019-12-171-1/+1
| | | | | | | | | | | | | | | | | | | | commit 6889ee5a53b8d969aa542047f5ac8acdc0e79a91 upstream. In ovl_rename(), if new upper is hardlinked to old upper underneath overlayfs before upper dirs are locked, user will get an ESTALE error and a WARN_ON will be printed. Changes to underlying layers while overlayfs is mounted may result in unexpected behavior, but it shouldn't crash the kernel and it shouldn't trigger WARN_ON() either, so relax this WARN_ON(). Reported-by: syzbot+bb1836a212e69f8e201a@syzkaller.appspotmail.com Fixes: 804032fabb3b ("ovl: don't check rename to self") Cc: <stable@vger.kernel.org> # v4.9+ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* ovl: fix corner case of non-unique st_dev;st_inoAmir Goldstein2019-12-171-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 9c6d8f13e9da10a26ad7f0a020ef86e8ef142835 upstream. On non-samefs overlay without xino, non pure upper inodes should use a pseudo_dev assigned to each unique lower fs and pure upper inodes use the real upper st_dev. It is fine for an overlay pure upper inode to use the same st_dev;st_ino values as the real upper inode, because the content of those two different filesystem objects is always the same. In this case, however: - two filesystems, A and B - upper layer is on A - lower layer 1 is also on A - lower layer 2 is on B Non pure upper overlay inode, whose origin is in layer 1 will have the same st_dev;st_ino values as the real lower inode. This may result with a false positive results of 'diff' between the real lower and copied up overlay inode. Fix this by using the upper st_dev;st_ino values in this case. This breaks the property of constant st_dev;st_ino across copy up of this case. This breakage will be fixed by a later patch. Fixes: 5148626b806a ("ovl: allocate anon bdev per unique lower fs") Cc: stable@vger.kernel.org # v4.17+ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* ovl: fix lookup failure on multi lower squashfsAmir Goldstein2019-12-173-7/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 7e63c87fc2dcf3be9d3aab82d4a0ea085880bdca upstream. In the past, overlayfs required that lower fs have non null uuid in order to support nfs export and decode copy up origin file handles. Commit 9df085f3c9a2 ("ovl: relax requirement for non null uuid of lower fs") relaxed this requirement for nfs export support, as long as uuid (even if null) is unique among all lower fs. However, said commit unintentionally also relaxed the non null uuid requirement for decoding copy up origin file handles, regardless of the unique uuid requirement. Amend this mistake by disabling decoding of copy up origin file handle from lower fs with a conflicting uuid. We still encode copy up origin file handles from those fs, because file handles like those already exist in the wild and because they might provide useful information in the future. There is an unhandled corner case described by Miklos this way: - two filesystems, A and B, both have null uuid - upper layer is on A - lower layer 1 is also on A - lower layer 2 is on B In this case bad_uuid won't be set for B, because the check only involves the list of lower fs. Hence we'll try to decode a layer 2 origin on layer 1 and fail. We will deal with this corner case later. Reported-by: Colin Ian King <colin.king@canonical.com> Tested-by: Colin Ian King <colin.king@canonical.com> Link: https://lore.kernel.org/lkml/20191106234301.283006-1-colin.king@canonical.com/ Fixes: 9df085f3c9a2 ("ovl: relax requirement for non null uuid ...") Cc: stable@vger.kernel.org # v4.20+ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* ovl: filter of trusted xattr results in auditMark Salyzyn2019-09-111-1/+2
| | | | | | | | | | | | | | | | | When filtering xattr list for reading, presence of trusted xattr results in a security audit log. However, if there is other content no errno will be set, and if there isn't, the errno will be -ENODATA and not -EPERM as is usually associated with a lack of capability. The check does not block the request to list the xattrs present. Switch to ns_capable_noaudit to reflect a more appropriate check. Signed-off-by: Mark Salyzyn <salyzyn@android.com> Cc: linux-security-module@vger.kernel.org Cc: kernel-team@android.com Cc: stable@vger.kernel.org # v3.18+ Fixes: a082c6f680da ("ovl: filter trusted xattr for non-admin") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* ovl: Fix dereferencing possible ERR_PTR()Ding Xiang2019-09-111-2/+1
| | | | | | | | | | if ovl_encode_real_fh() fails, no memory was allocated and the error in the error-valued pointer should be returned. Fixes: 9b6faee07470 ("ovl: check ERR_PTR() return value from ovl_encode_fh()") Signed-off-by: Ding Xiang <dingxiang@cmss.chinamobile.com> Cc: <stable@vger.kernel.org> # v4.16+ Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* ovl: fix regression caused by overlapping layers detectionAmir Goldstein2019-07-162-26/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Once upon a time, commit 2cac0c00a6cd ("ovl: get exclusive ownership on upper/work dirs") in v4.13 added some sanity checks on overlayfs layers. This change caused a docker regression. The root cause was mount leaks by docker, which as far as I know, still exist. To mitigate the regression, commit 85fdee1eef1a ("ovl: fix regression caused by exclusive upper/work dir protection") in v4.14 turned the mount errors into warnings for the default index=off configuration. Recently, commit 146d62e5a586 ("ovl: detect overlapping layers") in v5.2, re-introduced exclusive upper/work dir checks regardless of index=off configuration. This changes the status quo and mount leak related bug reports have started to re-surface. Restore the status quo to fix the regressions. To clarify, index=off does NOT relax overlapping layers check for this ovelayfs mount. index=off only relaxes exclusive upper/work dir checks with another overlayfs mount. To cover the part of overlapping layers detection that used the exclusive upper/work dir checks to detect overlap with self upper/work dir, add a trap also on the work base dir. Link: https://github.com/moby/moby/issues/34672 Link: https://lore.kernel.org/linux-fsdevel/20171006121405.GA32700@veci.piliscsaba.szeredi.hu/ Link: https://github.com/containers/libpod/issues/3540 Fixes: 146d62e5a586 ("ovl: detect overlapping layers") Cc: <stable@vger.kernel.org> # v4.19+ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Tested-by: Colin Walters <walters@verbum.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* Merge tag 'spdx-5.2-rc6' of ↵Linus Torvalds2019-06-2111-44/+11
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx Pull still more SPDX updates from Greg KH: "Another round of SPDX updates for 5.2-rc6 Here is what I am guessing is going to be the last "big" SPDX update for 5.2. It contains all of the remaining GPLv2 and GPLv2+ updates that were "easy" to determine by pattern matching. The ones after this are going to be a bit more difficult and the people on the spdx list will be discussing them on a case-by-case basis now. Another 5000+ files are fixed up, so our overall totals are: Files checked: 64545 Files with SPDX: 45529 Compared to the 5.1 kernel which was: Files checked: 63848 Files with SPDX: 22576 This is a huge improvement. Also, we deleted another 20000 lines of boilerplate license crud, always nice to see in a diffstat" * tag 'spdx-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx: (65 commits) treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 507 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 506 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 505 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 504 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 503 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 502 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 501 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 499 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 498 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 497 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 496 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 495 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 491 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 490 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 489 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 488 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 487 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 486 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 485 ...
| * treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500Thomas Gleixner2019-06-1911-44/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Based on 2 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation # extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 4122 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Enrico Weigelt <info@metux.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | ovl: make i_ino consistent with st_ino in more casesAmir Goldstein2019-06-191-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Relax the condition that overlayfs supports nfs export, to require that i_ino is consistent with st_ino/d_ino. It is enough to require that st_ino and d_ino are consistent. This fixes the failure of xfstest generic/504, due to mismatch of st_ino to inode number in the output of /proc/locks. Fixes: 12574a9f4c9c ("ovl: consistent i_ino for non-samefs with xino") Cc: <stable@vger.kernel.org> # v4.19 Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* | ovl: fix typo in MODULE_PARM_DESCNicolas Schier2019-06-183-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change first argument to MODULE_PARM_DESC() calls, that each of them matched the actual module parameter name. The matching results in changing (the 'parm' section from) the output of `modinfo overlay` from: parm: ovl_check_copy_up:Obsolete; does nothing parm: redirect_max:ushort parm: ovl_redirect_max:Maximum length of absolute redirect xattr value parm: redirect_dir:bool parm: ovl_redirect_dir_def:Default to on or off for the redirect_dir feature parm: redirect_always_follow:bool parm: ovl_redirect_always_follow:Follow redirects even if redirect_dir feature is turned off parm: index:bool parm: ovl_index_def:Default to on or off for the inodes index feature parm: nfs_export:bool parm: ovl_nfs_export_def:Default to on or off for the NFS export feature parm: xino_auto:bool parm: ovl_xino_auto_def:Auto enable xino feature parm: metacopy:bool parm: ovl_metacopy_def:Default to on or off for the metadata only copy up feature into: parm: check_copy_up:Obsolete; does nothing parm: redirect_max:Maximum length of absolute redirect xattr value (ushort) parm: redirect_dir:Default to on or off for the redirect_dir feature (bool) parm: redirect_always_follow:Follow redirects even if redirect_dir feature is turned off (bool) parm: index:Default to on or off for the inodes index feature (bool) parm: nfs_export:Default to on or off for the NFS export feature (bool) parm: xino_auto:Auto enable xino feature (bool) parm: metacopy:Default to on or off for the metadata only copy up feature (bool) Signed-off-by: Nicolas Schier <n.schier@avm.de> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* | ovl: fix bogus -Wmaybe-unitialized warningArnd Bergmann2019-06-181-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | gcc gets a bit confused by the logic in ovl_setup_trap() and can't figure out whether the local 'trap' variable in the caller was initialized or not: fs/overlayfs/super.c: In function 'ovl_fill_super': fs/overlayfs/super.c:1333:4: error: 'trap' may be used uninitialized in this function [-Werror=maybe-uninitialized] iput(trap); ^~~~~~~~~~ fs/overlayfs/super.c:1312:17: note: 'trap' was declared here Reword slightly to make it easier for the compiler to understand. Fixes: 146d62e5a586 ("ovl: detect overlapping layers") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* | ovl: don't fail with disconnected lower NFSMiklos Szeredi2019-06-181-17/+9
| | | | | | | | | | | | | | | | | | | | | | | | NFS mounts can be disconnected from fs root. Don't fail the overlapping layer check because of this. The check is not authoritative anyway, since topology can change during or after the check. Reported-by: Antti Antinoja <antti@fennosys.fi> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: 146d62e5a586 ("ovl: detect overlapping layers")
* | ovl: fix wrong flags check in FS_IOC_FS[SG]ETXATTR ioctlsAmir Goldstein2019-06-111-26/+65
|/ | | | | | | | The ioctl argument was parsed as the wrong type. Fixes: b21d9c435f93 ("ovl: support the FS_IOC_FS[SG]ETXATTR ioctls") Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* ovl: detect overlapping layersAmir Goldstein2019-05-296-17/+229
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Overlapping overlay layers are not supported and can cause unexpected behavior, but overlayfs does not currently check or warn about these configurations. User is not supposed to specify the same directory for upper and lower dirs or for different lower layers and user is not supposed to specify directories that are descendants of each other for overlay layers, but that is exactly what this zysbot repro did: https://syzkaller.appspot.com/x/repro.syz?x=12c7a94f400000 Moving layer root directories into other layers while overlayfs is mounted could also result in unexpected behavior. This commit places "traps" in the overlay inode hash table. Those traps are dummy overlay inodes that are hashed by the layers root inodes. On mount, the hash table trap entries are used to verify that overlay layers are not overlapping. While at it, we also verify that overlay layers are not overlapping with directories "in-use" by other overlay instances as upperdir/workdir. On lookup, the trap entries are used to verify that overlay layers root inodes have not been moved into other layers after mount. Some examples: $ ./run --ov --samefs -s ... ( mkdir -p base/upper/0/u base/upper/0/w base/lower lower upper mnt mount -o bind base/lower lower mount -o bind base/upper upper mount -t overlay none mnt ... -o lowerdir=lower,upperdir=upper/0/u,workdir=upper/0/w) $ umount mnt $ mount -t overlay none mnt ... -o lowerdir=base,upperdir=upper/0/u,workdir=upper/0/w [ 94.434900] overlayfs: overlapping upperdir path mount: mount overlay on mnt failed: Too many levels of symbolic links $ mount -t overlay none mnt ... -o lowerdir=upper/0/u,upperdir=upper/0/u,workdir=upper/0/w [ 151.350132] overlayfs: conflicting lowerdir path mount: none is already mounted or mnt busy $ mount -t overlay none mnt ... -o lowerdir=lower:lower/a,upperdir=upper/0/u,workdir=upper/0/w [ 201.205045] overlayfs: overlapping lowerdir path mount: mount overlay on mnt failed: Too many levels of symbolic links $ mount -t overlay none mnt ... -o lowerdir=lower,upperdir=upper/0/u,workdir=upper/0/w $ mv base/upper/0/ base/lower/ $ find mnt/0 mnt/0 mnt/0/w find: 'mnt/0/w/work': Too many levels of symbolic links find: 'mnt/0/u': Too many levels of symbolic links Reported-by: syzbot+9c69c282adc4edd2b540@syzkaller.appspotmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* ovl: support the FS_IOC_FS[SG]ETXATTR ioctlsAmir Goldstein2019-05-271-3/+6
| | | | | | | | | | | | | | They are the extended version of FS_IOC_FS[SG]ETFLAGS ioctls. xfs_io -c "chattr <flags>" uses the new ioctls for setting flags. This used to work in kernel pre v4.19, before stacked file ops introduced the ovl_ioctl whitelist. Reported-by: Dave Chinner <david@fromorbit.com> Fixes: d1d04ef8572b ("ovl: stack file ops") Cc: <stable@vger.kernel.org> # v4.19 Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* treewide: Add SPDX license identifier - Makefile/KconfigThomas Gleixner2019-05-212-0/+2
| | | | | | | | | | | | | | Add SPDX license identifiers to all Make/Kconfig files which: - Have no license information of any form These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is: GPL-2.0-only Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* Merge tag 'ovl-update-5.2' of ↵Linus Torvalds2019-05-145-33/+113
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs Pull overlayfs update from Miklos Szeredi: "Just bug fixes in this small update" * tag 'ovl-update-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: ovl: relax WARN_ON() for overlapping layers use case ovl: check the capability before cred overridden ovl: do not generate duplicate fsnotify events for "fake" path ovl: support stacked SEEK_HOLE/SEEK_DATA ovl: fix missing upper fs freeze protection on copy up for ioctl
| * ovl: relax WARN_ON() for overlapping layers use caseAmir Goldstein2019-05-082-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This nasty little syzbot repro: https://syzkaller.appspot.com/x/repro.syz?x=12c7a94f400000 Creates overlay mounts where the same directory is both in upper and lower layers. Simplified example: mkdir foo work mount -t overlay none foo -o"lowerdir=.,upperdir=foo,workdir=work" The repro runs several threads in parallel that attempt to chdir into foo and attempt to symlink/rename/exec/mkdir the file bar. The repro hits a WARN_ON() I placed in ovl_instantiate(), which suggests that an overlay inode already exists in cache and is hashed by the pointer of the real upper dentry that ovl_create_real() has just created. At the point of the WARN_ON(), for overlay dir inode lock is held and upper dir inode lock, so at first, I did not see how this was possible. On a closer look, I see that after ovl_create_real(), because of the overlapping upper and lower layers, a lookup by another thread can find the file foo/bar that was just created in upper layer, at overlay path foo/foo/bar and hash the an overlay inode with the new real dentry as lower dentry. This is possible because the overlay directory foo/foo is not locked and the upper dentry foo/bar is in dcache, so ovl_lookup() can find it without taking upper dir inode shared lock. Overlapping layers is considered a wrong setup which would result in unexpected behavior, but it shouldn't crash the kernel and it shouldn't trigger WARN_ON() either, so relax this WARN_ON() and leave a pr_warn() instead to cover all cases of failure to get an overlay inode. The error returned from failure to insert new inode to cache with inode_insert5() was changed to -EEXIST, to distinguish from the error -ENOMEM returned on failure to get/allocate inode with iget5_locked(). Reported-by: syzbot+9c69c282adc4edd2b540@syzkaller.appspotmail.com Fixes: 01b39dcc9568 ("ovl: use inode_insert5() to hash a newly...") Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
| * ovl: check the capability before cred overriddenJiufei Xue2019-05-061-18/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We found that it return success when we set IMMUTABLE_FL flag to a file in docker even though the docker didn't have the capability CAP_LINUX_IMMUTABLE. The commit d1d04ef8572b ("ovl: stack file ops") and dab5ca8fd9dd ("ovl: add lsattr/chattr support") implemented chattr operations on a regular overlay file. ovl_real_ioctl() overridden the current process's subjective credentials with ofs->creator_cred which have the capability CAP_LINUX_IMMUTABLE so that it will return success in vfs_ioctl()->cap_capable(). Fix this by checking the capability before cred overridden. And here we only care about APPEND_FL and IMMUTABLE_FL, so get these information from inode. [SzM: move check and call to underlying fs inside inode locked region to prevent two such calls from racing with each other] Signed-off-by: Jiufei Xue <jiufei.xue@linux.alibaba.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
| * ovl: do not generate duplicate fsnotify events for "fake" pathAmir Goldstein2019-05-061-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Overlayfs "fake" path is used for stacked file operations on underlying files. Operations on files with "fake" path must not generate fsnotify events with path data, because those events have already been generated at overlayfs layer and because the reported event->fd for fanotify marks on underlying inode/filesystem will have the wrong path (the overlayfs path). Link: https://lore.kernel.org/linux-fsdevel/20190423065024.12695-1-jencce.kernel@gmail.com/ Reported-by: Murphy Zhou <jencce.kernel@gmail.com> Fixes: d1d04ef8572b ("ovl: stack file ops") Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
| * ovl: support stacked SEEK_HOLE/SEEK_DATAAmir Goldstein2019-05-061-4/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Overlay file f_pos is the master copy that is preserved through copy up and modified on read/write, but only real fs knows how to SEEK_HOLE/SEEK_DATA and real fs may impose limitations that are more strict than ->s_maxbytes for specific files, so we use the real file to perform seeks. We do not call real fs for SEEK_CUR:0 query and for SEEK_SET:0 requests. Fixes: d1d04ef8572b ("ovl: stack file ops") Reported-by: Eddie Horng <eddiehorng.tw@gmail.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
| * ovl: fix missing upper fs freeze protection on copy up for ioctlAmir Goldstein2019-05-063-7/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Generalize the helper ovl_open_maybe_copy_up() and use it to copy up file with data before FS_IOC_SETFLAGS ioctl. The FS_IOC_SETFLAGS ioctl is a bit of an odd ball in vfs, which probably caused the confusion. File may be open O_RDONLY, but ioctl modifies the file. VFS does not call mnt_want_write_file() nor lock inode mutex, but fs-specific code for FS_IOC_SETFLAGS does. So ovl_ioctl() calls mnt_want_write_file() for the overlay file, and fs-specific code calls mnt_want_write_file() for upper fs file, but there was no call for ovl_want_write() for copy up duration which prevents overlayfs from copying up on a frozen upper fs. Fixes: dab5ca8fd9dd ("ovl: add lsattr/chattr support") Cc: <stable@vger.kernel.org> # v4.19 Signed-off-by: Amir Goldstein <amir73il@gmail.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* | Merge branch 'work.dcache' of ↵Linus Torvalds2019-05-071-1/+1
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull misc dcache updates from Al Viro: "Most of this pile is putting name length into struct name_snapshot and making use of it. The beginning of this series ("ovl_lookup_real_one(): don't bother with strlen()") ought to have been split in two (separate switch of name_snapshot to struct qstr from overlayfs reaping the trivial benefits of that), but I wanted to avoid a rebase - by the time I'd spotted that it was (a) in -next and (b) close to 5.1-final ;-/" * 'work.dcache' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: audit_compare_dname_path(): switch to const struct qstr * audit_update_watch(): switch to const struct qstr * inotify_handle_event(): don't bother with strlen() fsnotify: switch send_to_group() and ->handle_event to const struct qstr * fsnotify(): switch to passing const struct qstr * for file_name switch fsnotify_move() to passing const struct qstr * for old_name ovl_lookup_real_one(): don't bother with strlen() sysv: bury the broken "quietly truncate the long filenames" logics nsfs: unobfuscate unexport d_alloc_pseudo()
| * | ovl_lookup_real_one(): don't bother with strlen()Al Viro2019-04-261-1/+1
| |/ | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* / overlayfs: make use of ->free_inode()Al Viro2019-05-011-7/+6
|/ | | | | | synchronous parts are left in ->destroy_inode() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* ovl: Do not lose security.capability xattr over metadata file copy-upVivek Goyal2019-02-133-22/+63
| | | | | | | | | | | | | | | | | If a file has been copied up metadata only, and later data is copied up, upper loses any security.capability xattr it has (underlying filesystem clears it as upon file write). From a user's point of view, this is just a file copy-up and that should not result in losing security.capability xattr. Hence, before data copy up, save security.capability xattr (if any) and restore it on upper after data copy up is complete. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Fixes: 0c2888749363 ("ovl: A new xattr OVL_XATTR_METACOPY for file on upper") Cc: <stable@vger.kernel.org> # v4.19+ Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* ovl: During copy up, first copy up data and then xattrsVivek Goyal2019-02-041-13/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a file with capability set (and hence security.capability xattr) is written kernel clears security.capability xattr. For overlay, during file copy up if xattrs are copied up first and then data is, copied up. This means data copy up will result in clearing of security.capability xattr file on lower has. And this can result into surprises. If a lower file has CAP_SETUID, then it should not be cleared over copy up (if nothing was actually written to file). This also creates problems with chown logic where it first copies up file and then tries to clear setuid bit. But by that time security.capability xattr is already gone (due to data copy up), and caller gets -ENODATA. This has been reported by Giuseppe here. https://github.com/containers/libpod/issues/2015#issuecomment-447824842 Fix this by copying up data first and then metadta. This is a regression which has been introduced by my commit as part of metadata only copy up patches. TODO: There will be some corner cases where a file is copied up metadata only and later data copy up happens and that will clear security.capability xattr. Something needs to be done about that too. Fixes: bd64e57586d3 ("ovl: During copy up, first copy up metadata and then data") Cc: <stable@vger.kernel.org> # v4.19+ Reported-by: Giuseppe Scrivano <gscrivan@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* Revert "ovl: relax permission checking on underlying layers"Miklos Szeredi2018-12-041-13/+4
| | | | | | | | | | | | This reverts commit 007ea44892e6fa963a0876a979e34890325c64eb. The commit broke some selinux-testsuite cases, and it looks like there's no straightforward fix keeping the direction of this patch, so revert for now. The original patch was trying to fix the consistency of permission checks, and not an observed bug. So reverting should be safe. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* ovl: fix decode of dir file handle with multi lower layersAmir Goldstein2018-11-211-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | When decoding a lower file handle, we first call ovl_check_origin_fh() with connected=false to get any real lower dentry for overlay inode cache lookup. If the real dentry is a disconnected dir dentry, ovl_check_origin_fh() is called again with connected=true to get a connected real dentry and find the lower layer the real dentry belongs to. If the first call returned a connected real dentry, we use it to lookup an overlay connected dentry, but the first ovl_check_origin_fh() call with connected=false did not check that the found dentry is under the root of the layer (see ovl_acceptable()), it only checked that the found dentry super block matches the uuid of the lower file handle. In case there are multiple lower layers on the same fs and the found dentry is not from the top most lower layer, using the layer index returned from the first ovl_check_origin_fh() is wrong and we end up failing to decode the file handle. Fix this by always calling ovl_check_origin_fh() with connected=true if we got a directory dentry in the first call. Fixes: 8b58924ad55c ("ovl: lookup in inode cache first when decoding...") Cc: <stable@vger.kernel.org> # v4.17 Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* ovl: fix missing override creds in link of a metacopy upperAmir Goldstein2018-11-191-1/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Theodore Ts'o reported a v4.19 regression with docker-dropbox: https://marc.info/?l=linux-fsdevel&m=154070089431116&w=2 "I was rebuilding my dropbox Docker container, and it failed in 4.19 with the following error: ... dpkg: error: error creating new backup file \ '/var/lib/dpkg/status-old': Invalid cross-device link" The problem did not reproduce with metacopy feature disabled. The error was caused by insufficient credentials to set "trusted.overlay.redirect" xattr on link of a metacopy file. Reproducer: echo Y > /sys/module/overlay/parameters/redirect_dir echo Y > /sys/module/overlay/parameters/metacopy cd /tmp mkdir l u w m chmod 777 l u touch l/foo ln l/foo l/link chmod 666 l/foo mount -t overlay none -olowerdir=l,upperdir=u,workdir=w m su fsgqa ln m/foo m/bar [ 21.455823] overlayfs: failed to set redirect (-1) ln: failed to create hard link 'm/bar' => 'm/foo':\ Invalid cross-device link Reported-by: Theodore Y. Ts'o <tytso@mit.edu> Reported-by: Maciej Zięba <maciekz82@gmail.com> Fixes: 4120fe64dce4 ("ovl: Set redirect on upper inode when it is linked") Cc: <stable@vger.kernel.org> # v4.19 Signed-off-by: Amir Goldstein <amir73il@gmail.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* Merge tag 'xfs-4.20-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds2018-11-022-22/+27
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull vfs dedup fixes from Dave Chinner: "This reworks the vfs data cloning infrastructure. We discovered many issues with these interfaces late in the 4.19 cycle - the worst of them (data corruption, setuid stripping) were fixed for XFS in 4.19-rc8, but a larger rework of the infrastructure fixing all the problems was needed. That rework is the contents of this pull request. Rework the vfs_clone_file_range and vfs_dedupe_file_range infrastructure to use a common .remap_file_range method and supply generic bounds and sanity checking functions that are shared with the data write path. The current VFS infrastructure has problems with rlimit, LFS file sizes, file time stamps, maximum filesystem file sizes, stripping setuid bits, etc and so they are addressed in these commits. We also introduce the ability for the ->remap_file_range methods to return short clones so that clones for vfs_copy_file_range() don't get rejected if the entire range can't be cloned. It also allows filesystems to sliently skip deduplication of partial EOF blocks if they are not capable of doing so without requiring errors to be thrown to userspace. Existing filesystems are converted to user the new remap_file_range method, and both XFS and ocfs2 are modified to make use of the new generic checking infrastructure" * tag 'xfs-4.20-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (28 commits) xfs: remove [cm]time update from reflink calls xfs: remove xfs_reflink_remap_range xfs: remove redundant remap partial EOF block checks xfs: support returning partial reflink results xfs: clean up xfs_reflink_remap_blocks call site xfs: fix pagecache truncation prior to reflink ocfs2: remove ocfs2_reflink_remap_range ocfs2: support partial clone range and dedupe range ocfs2: fix pagecache truncation prior to reflink ocfs2: truncate page cache for clone destination file before remapping vfs: clean up generic_remap_file_range_prep return value vfs: hide file range comparison function vfs: enable remap callers that can handle short operations vfs: plumb remap flags through the vfs dedupe functions vfs: plumb remap flags through the vfs clone functions vfs: make remap_file_range functions take and return bytes completed vfs: remap helper should update destination inode metadata vfs: pass remap flags to generic_remap_checks vfs: pass remap flags to generic_remap_file_range_prep vfs: combine the clone and dedupe into a single remap_file_range ...