summaryrefslogtreecommitdiffstats
path: root/fs/9p
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag '9p-for-6.2-rc1' of https://github.com/martinetd/linuxLinus Torvalds2022-12-239-9/+0
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull 9p updates from Dominique Martinet: - improve p9_check_errors to check buffer size instead of msize when possible (e.g. not zero-copy) - some more syzbot and KCSAN fixes - minor headers include cleanup * tag '9p-for-6.2-rc1' of https://github.com/martinetd/linux: 9p/client: fix data race on req->status net/9p: fix response size check in p9_check_errors() net/9p: distinguish zero-copy requests 9p/xen: do not memcpy header into req->rc 9p: set req refcount to zero to avoid uninitialized usage 9p/net: Remove unneeded idr.h #include 9p/fs: Remove unneeded idr.h #include
| * 9p/fs: Remove unneeded idr.h #includeChristophe JAILLET2022-12-029-9/+0
| | | | | | | | | | | | | | | | | | | | | | | | The 9p fs does not use IDR or IDA functionalities. So there is no point in including <linux/idr.h>. Remove it. Link: https://lkml.kernel.org/r/3d1e0ed9714eaee7e18d9f5b0b4bfa49b00b286d.1669553950.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com> [Dominique: reword subject] Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
* | Merge tag 'fs.acl.rework.v6.2' of ↵Linus Torvalds2022-12-125-146/+170
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping Pull VFS acl updates from Christian Brauner: "This contains the work that builds a dedicated vfs posix acl api. The origins of this work trace back to v5.19 but it took quite a while to understand the various filesystem specific implementations in sufficient detail and also come up with an acceptable solution. As we discussed and seen multiple times the current state of how posix acls are handled isn't nice and comes with a lot of problems: The current way of handling posix acls via the generic xattr api is error prone, hard to maintain, and type unsafe for the vfs until we call into the filesystem's dedicated get and set inode operations. It is already the case that posix acls are special-cased to death all the way through the vfs. There are an uncounted number of hacks that operate on the uapi posix acl struct instead of the dedicated vfs struct posix_acl. And the vfs must be involved in order to interpret and fixup posix acls before storing them to the backing store, caching them, reporting them to userspace, or for permission checking. Currently a range of hacks and duct tape exist to make this work. As with most things this is really no ones fault it's just something that happened over time. But the code is hard to understand and difficult to maintain and one is constantly at risk of introducing bugs and regressions when having to touch it. Instead of continuing to hack posix acls through the xattr handlers this series builds a dedicated posix acl api solely around the get and set inode operations. Going forward, the vfs_get_acl(), vfs_remove_acl(), and vfs_set_acl() helpers must be used in order to interact with posix acls. They operate directly on the vfs internal struct posix_acl instead of abusing the uapi posix acl struct as we currently do. In the end this removes all of the hackiness, makes the codepaths easier to maintain, and gets us type safety. This series passes the LTP and xfstests suites without any regressions. For xfstests the following combinations were tested: - xfs - ext4 - btrfs - overlayfs - overlayfs on top of idmapped mounts - orangefs - (limited) cifs There's more simplifications for posix acls that we can make in the future if the basic api has made it. A few implementation details: - The series makes sure to retain exactly the same security and integrity module permission checks. Especially for the integrity modules this api is a win because right now they convert the uapi posix acl struct passed to them via a void pointer into the vfs struct posix_acl format to perform permission checking on the mode. There's a new dedicated security hook for setting posix acls which passes the vfs struct posix_acl not a void pointer. Basing checking on the posix acl stored in the uapi format is really unreliable. The vfs currently hacks around directly in the uapi struct storing values that frankly the security and integrity modules can't correctly interpret as evidenced by bugs we reported and fixed in this area. It's not necessarily even their fault it's just that the format we provide to them is sub optimal. - Some filesystems like 9p and cifs need access to the dentry in order to get and set posix acls which is why they either only partially or not even at all implement get and set inode operations. For example, cifs allows setxattr() and getxattr() operations but doesn't allow permission checking based on posix acls because it can't implement a get acl inode operation. Thus, this patch series updates the set acl inode operation to take a dentry instead of an inode argument. However, for the get acl inode operation we can't do this as the old get acl method is called in e.g., generic_permission() and inode_permission(). These helpers in turn are called in various filesystem's permission inode operation. So passing a dentry argument to the old get acl inode operation would amount to passing a dentry to the permission inode operation which we shouldn't and probably can't do. So instead of extending the existing inode operation Christoph suggested to add a new one. He also requested to ensure that the get and set acl inode operation taking a dentry are consistently named. So for this version the old get acl operation is renamed to ->get_inode_acl() and a new ->get_acl() inode operation taking a dentry is added. With this we can give both 9p and cifs get and set acl inode operations and in turn remove their complex custom posix xattr handlers. In the future I hope to get rid of the inode method duplication but it isn't like we have never had this situation. Readdir is just one example. And frankly, the overall gain in type safety and the more pleasant api wise are simply too big of a benefit to not accept this duplication for a while. - We've done a full audit of every codepaths using variant of the current generic xattr api to get and set posix acls and surprisingly it isn't that many places. There's of course always a chance that we might have missed some and if so I'm sure we'll find them soon enough. The crucial codepaths to be converted are obviously stacking filesystems such as ecryptfs and overlayfs. For a list of all callers currently using generic xattr api helpers see [2] including comments whether they support posix acls or not. - The old vfs generic posix acl infrastructure doesn't obey the create and replace semantics promised on the setxattr(2) manpage. This patch series doesn't address this. It really is something we should revisit later though. The patches are roughly organized as follows: (1) Change existing set acl inode operation to take a dentry argument (Intended to be a non-functional change) (2) Rename existing get acl method (Intended to be a non-functional change) (3) Implement get and set acl inode operations for filesystems that couldn't implement one before because of the missing dentry. That's mostly 9p and cifs (Intended to be a non-functional change) (4) Build posix acl api, i.e., add vfs_get_acl(), vfs_remove_acl(), and vfs_set_acl() including security and integrity hooks (Intended to be a non-functional change) (5) Implement get and set acl inode operations for stacking filesystems (Intended to be a non-functional change) (6) Switch posix acl handling in stacking filesystems to new posix acl api now that all filesystems it can stack upon support it. (7) Switch vfs to new posix acl api (semantical change) (8) Remove all now unused helpers (9) Additional regression fixes reported after we merged this into linux-next Thanks to Seth for a lot of good discussion around this and encouragement and input from Christoph" * tag 'fs.acl.rework.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping: (36 commits) posix_acl: Fix the type of sentinel in get_acl orangefs: fix mode handling ovl: call posix_acl_release() after error checking evm: remove dead code in evm_inode_set_acl() cifs: check whether acl is valid early acl: make vfs_posix_acl_to_xattr() static acl: remove a slew of now unused helpers 9p: use stub posix acl handlers cifs: use stub posix acl handlers ovl: use stub posix acl handlers ecryptfs: use stub posix acl handlers evm: remove evm_xattr_acl_change() xattr: use posix acl api ovl: use posix acl api ovl: implement set acl method ovl: implement get acl method ecryptfs: implement set acl method ecryptfs: implement get acl method ksmbd: use vfs_remove_acl() acl: add vfs_remove_acl() ...
| * | 9p: use stub posix acl handlersChristian Brauner2022-10-203-126/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | Now that 9p supports the get and set acl inode operations and the vfs has been switched to the new posi api, 9p can simply rely on the stub posix acl handlers. The custom xattr handlers and associated unused helpers can be removed. Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
| * | 9p: implement set acl methodChristian Brauner2022-10-203-0/+99
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current way of setting and getting posix acls through the generic xattr interface is error prone and type unsafe. The vfs needs to interpret and fixup posix acls before storing or reporting it to userspace. Various hacks exist to make this work. The code is hard to understand and difficult to maintain in it's current form. Instead of making this work by hacking posix acls through xattr handlers we are building a dedicated posix acl api around the get and set inode operations. This removes a lot of hackiness and makes the codepaths easier to maintain. A lot of background can be found in [1]. In order to build a type safe posix api around get and set acl we need all filesystem to implement get and set acl. So far 9p implemented a ->get_inode_acl() operation that didn't require access to the dentry in order to allow (limited) permission checking via posix acls in the vfs. Now that we have get and set acl inode operations that take a dentry argument we can give 9p get and set acl inode operations. This is mostly a light refactoring of the codepaths currently used in 9p posix acl xattr handler. After we have fully implemented the posix acl api and switched the vfs over to it, the 9p specific posix acl xattr handler and associated code will be removed. Note, until the vfs has been switched to the new posix acl api this patch is a non-functional change. Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1] Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
| * | 9p: implement get acl methodChristian Brauner2022-10-203-22/+69
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current way of setting and getting posix acls through the generic xattr interface is error prone and type unsafe. The vfs needs to interpret and fixup posix acls before storing or reporting it to userspace. Various hacks exist to make this work. The code is hard to understand and difficult to maintain in it's current form. Instead of making this work by hacking posix acls through xattr handlers we are building a dedicated posix acl api around the get and set inode operations. This removes a lot of hackiness and makes the codepaths easier to maintain. A lot of background can be found in [1]. In order to build a type safe posix api around get and set acl we need all filesystem to implement get and set acl. So far 9p implemented a ->get_inode_acl() operation that didn't require access to the dentry in order to allow (limited) permission checking via posix acls in the vfs. Now that we have get and set acl inode operations that take a dentry argument we can give 9p get and set acl inode operations. This is mostly a refactoring of the codepaths currently used in 9p posix acl xattr handler. After we have fully implemented the posix acl api and switched the vfs over to it, the 9p specific posix acl xattr handler and associated code will be removed. Note, until the vfs has been switched to the new posix acl api this patch is a non-functional change. Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1] Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
| * | fs: rename current get acl methodChristian Brauner2022-10-201-2/+2
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current way of setting and getting posix acls through the generic xattr interface is error prone and type unsafe. The vfs needs to interpret and fixup posix acls before storing or reporting it to userspace. Various hacks exist to make this work. The code is hard to understand and difficult to maintain in it's current form. Instead of making this work by hacking posix acls through xattr handlers we are building a dedicated posix acl api around the get and set inode operations. This removes a lot of hackiness and makes the codepaths easier to maintain. A lot of background can be found in [1]. The current inode operation for getting posix acls takes an inode argument but various filesystems (e.g., 9p, cifs, overlayfs) need access to the dentry. In contrast to the ->set_acl() inode operation we cannot simply extend ->get_acl() to take a dentry argument. The ->get_acl() inode operation is called from: acl_permission_check() -> check_acl() -> get_acl() which is part of generic_permission() which in turn is part of inode_permission(). Both generic_permission() and inode_permission() are called in the ->permission() handler of various filesystems (e.g., overlayfs). So simply passing a dentry argument to ->get_acl() would amount to also having to pass a dentry argument to ->permission(). We should avoid this unnecessary change. So instead of extending the existing inode operation rename it from ->get_acl() to ->get_inode_acl() and add a ->get_acl() method later that passes a dentry argument and which filesystems that need access to the dentry can implement instead of ->get_inode_acl(). Filesystems like cifs which allow setting and getting posix acls but not using them for permission checking during lookup can simply not implement ->get_inode_acl(). This is intended to be a non-functional change. Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1] Suggested-by/Inspired-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
* / use less confusing names for iov_iter direction initializersAl Viro2022-11-253-5/+5
|/ | | | | | | | | | | | | READ/WRITE proved to be actively confusing - the meanings are "data destination, as used with read(2)" and "data source, as used with write(2)", but people keep interpreting those as "we read data from it" and "we write data to it", i.e. exactly the wrong way. Call them ITER_DEST and ITER_SOURCE - at least that is harder to misinterpret... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* 9p: Fix some kernel-doc commentsYang Li2022-07-021-2/+2
| | | | | | | | | | | | | | | Remove warnings found by running scripts/kernel-doc, which is caused by using 'make W=1'. fs/9p/fid.c:35: warning: Function parameter or member 'pfid' not described in 'v9fs_fid_add' fs/9p/fid.c:35: warning: Excess function parameter 'fid' description in 'v9fs_fid_add' fs/9p/fid.c:80: warning: Function parameter or member 'pfid' not described in 'v9fs_open_fid_add' fs/9p/fid.c:80: warning: Excess function parameter 'fid' description in 'v9fs_open_fid_add' Link: https://lkml.kernel.org/r/20220615012039.43479-1-yang.lee@linux.alibaba.com Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> [Dominique: further adjust comment] Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
* 9p fid refcount: cleanup p9_fid_put callsDominique Martinet2022-07-026-90/+64
| | | | | | | | | | | | | | Simplify p9_fid_put cleanup path in many 9p functions since the function is noop on null or error fids. Also make the *_add_fid() helpers "steal" the fid by nulling its pointer, so put after them will be noop. This should lead to no change of behaviour Link: https://lkml.kernel.org/r/20220612085330.1451496-7-asmadeus@codewreck.org Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
* 9p fid refcount: add p9_fid_get/put wrappersDominique Martinet2022-07-0210-69/+69
| | | | | | | | | | | | | | I was recently reminded that it is not clear that p9_client_clunk() was actually just decrementing refcount and clunking only when that reaches zero: make it clear through a set of helpers. This will also allow instrumenting refcounting better for debugging next patch Link: https://lkml.kernel.org/r/20220612085330.1451496-5-asmadeus@codewreck.org Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com> Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
* 9p: Fix minor typo in code commentTyler Hicks2022-07-021-1/+1
| | | | | | | | | Fix s/patch/path/ typo and make it clear that we're talking about multiple path components. Link: https://lkml.kernel.org/r/20220527000003.355812-6-tyhicks@linux.microsoft.com Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
* 9p: Remove unnecessary variable for old fids while walking from d_parentTyler Hicks2022-07-021-3/+3
| | | | | | | | | Remove the ofid variable that's local to the conditional block in favor of the old_fid variable that's local to the entire function. Link: https://lkml.kernel.org/r/20220527000003.355812-5-tyhicks@linux.microsoft.com Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
* 9p: Make the path walk logic more clear about when cloning is requiredTyler Hicks2022-07-021-4/+3
| | | | | | | | | | Cloning during a path component walk is only needed when the old fid used for the walk operation is the root fid. Make that clear by comparing the two fids rather than using an additional variable. Link: https://lkml.kernel.org/r/20220527000003.355812-4-tyhicks@linux.microsoft.com Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
* 9p: Track the root fid with its own variable during lookupsTyler Hicks2022-07-021-12/+14
| | | | | | | | | | Improve readability by using a new variable when dealing with the root fid. The root fid requires special handling in comparison to other fids used during fid lookup. Link: https://lkml.kernel.org/r/20220527000003.355812-3-tyhicks@linux.microsoft.com Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
* 9p: fix EBADF errors in cached modeDominique Martinet2022-06-171-0/+13
| | | | | | | | | | | | | | | | | | | | cached operations sometimes need to do invalid operations (e.g. read on a write only file) Historic fscache had added a "writeback fid", a special handle opened RW as root, for this. The conversion to new fscache missed that bit. This commit reinstates a slightly lesser variant of the original code that uses the writeback fid for partial pages backfills if the regular user fid had been open as WRONLY, and thus would lack read permissions. Link: https://lkml.kernel.org/r/20220614033802.1606738-1-asmadeus@codewreck.org Fixes: eb497943fa21 ("9p: Convert to using the netfs helper lib to do reads and caching") Cc: stable@vger.kernel.org Cc: David Howells <dhowells@redhat.com> Reported-By: Christian Schoenebeck <linux_oss@crudebyte.com> Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com> Tested-by: Christian Schoenebeck <linux_oss@crudebyte.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
* 9p: Fix refcounting during full path walks for fid lookupsTyler Hicks2022-06-151-13/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Decrement the refcount of the parent dentry's fid after walking each path component during a full path walk for a lookup. Failure to do so can lead to fids that are not clunked until the filesystem is unmounted, as indicated by this warning: 9pnet: found fid 3 not clunked The improper refcounting after walking resulted in open(2) returning -EIO on any directories underneath the mount point when using the virtio transport. When using the fd transport, there's no apparent issue until the filesytem is unmounted and the warning above is emitted to the logs. In some cases, the user may not yet be attached to the filesystem and a new root fid, associated with the user, is created and attached to the root dentry before the full path walk is performed. Increment the new root fid's refcount to two in that situation so that it can be safely decremented to one after it is used for the walk operation. The new fid will still be attached to the root dentry when v9fs_fid_lookup_with_uid() returns so a final refcount of one is correct/expected. Link: https://lkml.kernel.org/r/20220527000003.355812-2-tyhicks@linux.microsoft.com Link: https://lkml.kernel.org/r/20220612085330.1451496-4-asmadeus@codewreck.org Fixes: 6636b6dcc3db ("9p: add refcount to p9_fid struct") Cc: stable@vger.kernel.org Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com> Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com> [Dominique: fix clunking fid multiple times discussed in second link] Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
* 9p: fix fid refcount leak in v9fs_vfs_get_linkDominique Martinet2022-06-151-4/+4
| | | | | | | | | | | | we check for protocol version later than required, after a fid has been obtained. Just move the version check earlier. Link: https://lkml.kernel.org/r/20220612085330.1451496-3-asmadeus@codewreck.org Fixes: 6636b6dcc3db ("9p: add refcount to p9_fid struct") Cc: stable@vger.kernel.org Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com> Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
* 9p: fix fid refcount leak in v9fs_vfs_atomic_open_dotlDominique Martinet2022-06-151-0/+3
| | | | | | | | | | | | | | We need to release directory fid if we fail halfway through open This fixes fid leaking with xfstests generic 531 Link: https://lkml.kernel.org/r/20220612085330.1451496-2-asmadeus@codewreck.org Fixes: 6636b6dcc3db ("9p: add refcount to p9_fid struct") Cc: stable@vger.kernel.org Reported-by: Tyler Hicks <tyhicks@linux.microsoft.com> Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com> Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
* netfs: Rename the netfs_io_request cleanup op and give it an op pointerDavid Howells2022-06-101-6/+5
| | | | | | | | | | | | | | The netfs_io_request cleanup op is now always in a position to be given a pointer to a netfs_io_request struct, so this can be passed in instead of the mapping and private data arguments (both of which are included in the struct). So rename the ->cleanup op to ->free_request (to match ->init_request) and pass in the I/O pointer. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com
* netfs: Further cleanups after struct netfs_inode wrapper introducedLinus Torvalds2022-06-103-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | Change the signature of netfs helper functions to take a struct netfs_inode pointer rather than a struct inode pointer where appropriate, thereby relieving the need for the network filesystem to convert its internal inode format down to the VFS inode only for netfslib to bounce it back up. For type safety, it's better not to do that (and it's less typing too). Give netfs_write_begin() an extra argument to pass in a pointer to the netfs_inode struct rather than deriving it internally from the file pointer. Note that the ->write_begin() and ->write_end() ops are intended to be replaced in the future by netfslib code that manages this without the need to call in twice for each page. netfs_readpage() and similar are intended to be pointed at directly by the address_space_operations table, so must stick to the signature dictated by the function pointers there. Changes ======= - Updated the kerneldoc comments and documentation [DH]. Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/CAHk-=wgkwKyNmNdKpQkqZ6DnmUL-x9hp0YBnUGjaPFEAdxDTbw@mail.gmail.com/
* netfs: Fix gcc-12 warning by embedding vfs inode in netfs_i_contextDavid Howells2022-06-095-13/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While randstruct was satisfied with using an open-coded "void *" offset cast for the netfs_i_context <-> inode casting, __builtin_object_size() as used by FORTIFY_SOURCE was not as easily fooled. This was causing the following complaint[1] from gcc v12: In file included from include/linux/string.h:253, from include/linux/ceph/ceph_debug.h:7, from fs/ceph/inode.c:2: In function 'fortify_memset_chk', inlined from 'netfs_i_context_init' at include/linux/netfs.h:326:2, inlined from 'ceph_alloc_inode' at fs/ceph/inode.c:463:2: include/linux/fortify-string.h:242:25: warning: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Wattribute-warning] 242 | __write_overflow_field(p_size_field, size); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Fix this by embedding a struct inode into struct netfs_i_context (which should perhaps be renamed to struct netfs_inode). The struct inode vfs_inode fields are then removed from the 9p, afs, ceph and cifs inode structs and vfs_inode is then simply changed to "netfs.inode" in those filesystems. Further, rename netfs_i_context to netfs_inode, get rid of the netfs_inode() function that converted a netfs_i_context pointer to an inode pointer (that can now be done with &ctx->inode) and rename the netfs_i_context() function to netfs_inode() (which is now a wrapper around container_of()). Most of the changes were done with: perl -p -i -e 's/vfs_inode/netfs.inode/'g \ `git grep -l 'vfs_inode' -- fs/{9p,afs,ceph,cifs}/*.[ch]` Kees suggested doing it with a pair structure[2] and a special declarator to insert that into the network filesystem's inode wrapper[3], but I think it's cleaner to embed it - and then it doesn't matter if struct randomisation reorders things. Dave Chinner suggested using a filesystem-specific VFS_I() function in each filesystem to convert that filesystem's own inode wrapper struct into the VFS inode struct[4]. Version #2: - Fix a couple of missed name changes due to a disabled cifs option. - Rename nfs_i_context to nfs_inode - Use "netfs" instead of "nic" as the member name in per-fs inode wrapper structs. [ This also undoes commit 507160f46c55 ("netfs: gcc-12: temporarily disable '-Wattribute-warning' for now") that is no longer needed ] Fixes: bc899ee1c898 ("netfs: Add a netfs inode context") Reported-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Xiubo Li <xiubli@redhat.com> cc: Jonathan Corbet <corbet@lwn.net> cc: Eric Van Hensbergen <ericvh@gmail.com> cc: Latchesar Ionkov <lucho@ionkov.net> cc: Dominique Martinet <asmadeus@codewreck.org> cc: Christian Schoenebeck <linux_oss@crudebyte.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Ilya Dryomov <idryomov@gmail.com> cc: Steve French <smfrench@gmail.com> cc: William Kucharski <william.kucharski@oracle.com> cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> cc: Dave Chinner <david@fromorbit.com> cc: linux-doc@vger.kernel.org cc: v9fs-developer@lists.sourceforge.net cc: linux-afs@lists.infradead.org cc: ceph-devel@vger.kernel.org cc: linux-cifs@vger.kernel.org cc: samba-technical@lists.samba.org cc: linux-fsdevel@vger.kernel.org cc: linux-hardening@vger.kernel.org Link: https://lore.kernel.org/r/d2ad3a3d7bdd794c6efb562d2f2b655fb67756b9.camel@kernel.org/ [1] Link: https://lore.kernel.org/r/20220517210230.864239-1-keescook@chromium.org/ [2] Link: https://lore.kernel.org/r/20220518202212.2322058-1-keescook@chromium.org/ [3] Link: https://lore.kernel.org/r/20220524101205.GI2306852@dread.disaster.area/ [4] Link: https://lore.kernel.org/r/165296786831.3591209.12111293034669289733.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/165305805651.4094995.7763502506786714216.stgit@warthog.procyon.org.uk # v2 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 9p: Convert to release_folioMatthew Wilcox (Oracle)2022-05-091-9/+8
| | | | | | | A straightforward conversion as it already works in terms of folios. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Jeff Layton <jlayton@kernel.org>
* fs: Convert netfs_readpage to netfs_read_folioMatthew Wilcox (Oracle)2022-05-091-1/+1
| | | | | | This is straightforward because netfs already worked in terms of folios. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
* fs: Remove flags parameter from aops->write_beginMatthew Wilcox (Oracle)2022-05-081-1/+1
| | | | | | | There are no more aop flags left, so remove the parameter. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
* fs: Remove aop_flags parameter from netfs_write_begin()Matthew Wilcox (Oracle)2022-05-081-1/+1
| | | | | | | There are no more aop flags left, so remove the parameter. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
* Merge tag 'netfs-prep-20220318' of ↵Linus Torvalds2022-03-315-65/+37
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull netfs updates from David Howells: "Netfs prep for write helpers. Having had a go at implementing write helpers and content encryption support in netfslib, it seems that the netfs_read_{,sub}request structs and the equivalent write request structs were almost the same and so should be merged, thereby requiring only one set of alloc/get/put functions and a common set of tracepoints. Merging the structs also has the advantage that if a bounce buffer is added to the request struct, a read operation can be performed to fill the bounce buffer, the contents of the buffer can be modified and then a write operation can be performed on it to send the data wherever it needs to go using the same request structure all the way through. The I/O handlers would then transparently perform any required crypto. This should make it easier to perform RMW cycles if needed. The potentially common functions and structs, however, by their names all proclaim themselves to be associated with the read side of things. The bulk of these changes alter this in the following ways: - Rename struct netfs_read_{,sub}request to netfs_io_{,sub}request. - Rename some enums, members and flags to make them more appropriate. - Adjust some comments to match. - Drop "read"/"rreq" from the names of common functions. For instance, netfs_get_read_request() becomes netfs_get_request(). - The ->init_rreq() and ->issue_op() methods become ->init_request() and ->issue_read(). I've kept the latter as a read-specific function and in another branch added an ->issue_write() method. The driver source is then reorganised into a number of files: fs/netfs/buffered_read.c Create read reqs to the pagecache fs/netfs/io.c Dispatchers for read and write reqs fs/netfs/main.c Some general miscellaneous bits fs/netfs/objects.c Alloc, get and put functions fs/netfs/stats.c Optional procfs statistics. and future development can be fitted into this scheme, e.g.: fs/netfs/buffered_write.c Modify the pagecache fs/netfs/buffered_flush.c Writeback from the pagecache fs/netfs/direct_read.c DIO read support fs/netfs/direct_write.c DIO write support fs/netfs/unbuffered_write.c Write modifications directly back Beyond the above changes, there are also some changes that affect how things work: - Make fscache_end_operation() generally available. - In the netfs tracing header, generate enums from the symbol -> string mapping tables rather than manually coding them. - Add a struct for filesystems that uses netfslib to put into their inode wrapper structs to hold extra state that netfslib is interested in, such as the fscache cookie. This allows netfslib functions to be set in filesystem operation tables and jumped to directly without having to have a filesystem wrapper. - Add a member to the struct added above to track the remote inode length as that may differ if local modifications are buffered. We may need to supply an appropriate EOF pointer when storing data (in AFS for example). - Pass extra information to netfs_alloc_request() so that the ->init_request() hook can access it and retain information to indicate the origin of the operation. - Make the ->init_request() hook return an error, thereby allowing a filesystem that isn't allowed to cache an inode (ceph or cifs, for example) to skip readahead. - Switch to using refcount_t for subrequests and add tracepoints to log refcount changes for the request and subrequest structs. - Add a function to consolidate dispatching a read request. Similar code is used in three places and another couple are likely to be added in the future" Link: https://lore.kernel.org/all/2639515.1648483225@warthog.procyon.org.uk/ * tag 'netfs-prep-20220318' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: afs: Maintain netfs_i_context::remote_i_size netfs: Keep track of the actual remote file size netfs: Split some core bits out into their own file netfs: Split fs/netfs/read_helper.c netfs: Rename read_helper.c to io.c netfs: Prepare to split read_helper.c netfs: Add a function to consolidate beginning a read netfs: Add a netfs inode context ceph: Make ceph_init_request() check caps on readahead netfs: Change ->init_request() to return an error code netfs: Refactor arguments for netfs_alloc_read_request netfs: Adjust the netfs_failure tracepoint to indicate non-subreq lines netfs: Trace refcounting on the netfs_io_subrequest struct netfs: Trace refcounting on the netfs_io_request struct netfs: Adjust the netfs_rreq tracepoint slightly netfs: Split netfs_io_* object handling out netfs: Finish off rename of netfs_read_request to netfs_io_request netfs: Rename netfs_read_*request to netfs_io_*request netfs: Generate enums from trace symbol mapping lists fscache: export fscache_end_operation()
| * netfs: Add a netfs inode contextDavid Howells2022-03-185-56/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a netfs_i_context struct that should be included in the network filesystem's own inode struct wrapper, directly after the VFS's inode struct, e.g.: struct my_inode { struct { /* These must be contiguous */ struct inode vfs_inode; struct netfs_i_context netfs_ctx; }; }; The netfs_i_context struct so far contains a single field for the network filesystem to use - the cache cookie: struct netfs_i_context { ... struct fscache_cookie *cache; }; Three functions are provided to help with this: (1) void netfs_i_context_init(struct inode *inode, const struct netfs_request_ops *ops); Initialise the netfs context and set the operations. (2) struct netfs_i_context *netfs_i_context(struct inode *inode); Find the netfs context from the VFS inode. (3) struct inode *netfs_inode(struct netfs_i_context *ctx); Find the VFS inode from the netfs context. Changes ======= ver #4) - Fix netfs_is_cache_enabled() to check cookie->cache_priv to see if a cache is present[3]. - Fix netfs_skip_folio_read() to zero out all of the page, not just some of it[3]. ver #3) - Split out the bit to move ceph cap-getting on readahead into ceph_init_request()[1]. - Stick in a comment to the netfs inode structs indicating the contiguity requirements[2]. ver #2) - Adjust documentation to match. - Use "#if IS_ENABLED()" in netfs_i_cookie(), not "#ifdef". - Move the cap check from ceph_readahead() to ceph_init_request() to be called from netfslib. - Remove ceph_readahead() and use netfs_readahead() directly instead. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/8af0d47f17d89c06bbf602496dd845f2b0bf25b3.camel@kernel.org/ [1] Link: https://lore.kernel.org/r/beaf4f6a6c2575ed489adb14b257253c868f9a5c.camel@kernel.org/ [2] Link: https://lore.kernel.org/r/3536452.1647421585@warthog.procyon.org.uk/ [3] Link: https://lore.kernel.org/r/164622984545.3564931.15691742939278418580.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/164678213320.1200972.16807551936267647470.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/164692909854.2099075.9535537286264248057.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/306388.1647595110@warthog.procyon.org.uk/ # v4
| * netfs: Change ->init_request() to return an error codeDavid Howells2022-03-181-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change the request initialisation function to return an error code so that the network filesystem can return a failure (ENOMEM, for example). This will also allow ceph to abort a ->readahead() op if the server refuses to give it a cap allowing local caching from within the netfslib framework (errors aren't passed back through ->readahead(), so returning, say, -ENOBUFS will cause the op to be aborted). Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/164678212401.1200972.16537041523832944934.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/164692905398.2099075.5238033621684646524.stgit@warthog.procyon.org.uk/ # v3
| * netfs: Finish off rename of netfs_read_request to netfs_io_requestDavid Howells2022-03-181-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adjust helper function names and comments after mass rename of struct netfs_read_*request to struct netfs_io_*request. Changes ======= ver #2) - Make the changes in the docs also. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/164622992433.3564931.6684311087845150271.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/164678196111.1200972.5001114956865989528.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/164692892567.2099075.13895804222087028813.stgit@warthog.procyon.org.uk/ # v3
| * netfs: Rename netfs_read_*request to netfs_io_*requestDavid Howells2022-03-181-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename netfs_read_*request to netfs_io_*request so that the same structures can be used for the write helpers too. perl -p -i -e 's/netfs_read_(request|subrequest)/netfs_io_$1/g' \ `git grep -l 'netfs_read_\(sub\|\)request'` perl -p -i -e 's/nr_rd_ops/nr_outstanding/g' \ `git grep -l nr_rd_ops` perl -p -i -e 's/nr_wr_ops/nr_copy_ops/g' \ `git grep -l nr_wr_ops` perl -p -i -e 's/netfs_read_source/netfs_io_source/g' \ `git grep -l 'netfs_read_source'` perl -p -i -e 's/netfs_io_request_ops/netfs_request_ops/g' \ `git grep -l 'netfs_io_request_ops'` perl -p -i -e 's/init_rreq/init_request/g' \ `git grep -l 'init_rreq'` Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/164622988070.3564931.7089670190434315183.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/164678195157.1200972.366609966927368090.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/164692891535.2099075.18435198075367420588.stgit@warthog.procyon.org.uk/ # v3
* | Merge tag 'folio-5.18b' of git://git.infradead.org/users/willy/pagecacheLinus Torvalds2022-03-221-27/+10
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull filesystem folio updates from Matthew Wilcox: "Primarily this series converts some of the address_space operations to take a folio instead of a page. Notably: - a_ops->is_partially_uptodate() takes a folio instead of a page and changes the type of the 'from' and 'count' arguments to make it obvious they're bytes. - a_ops->invalidatepage() becomes ->invalidate_folio() and has a similar type change. - a_ops->launder_page() becomes ->launder_folio() - a_ops->set_page_dirty() becomes ->dirty_folio() and adds the address_space as an argument. There are a couple of other misc changes up front that weren't worth separating into their own pull request" * tag 'folio-5.18b' of git://git.infradead.org/users/willy/pagecache: (53 commits) fs: Remove aops ->set_page_dirty fb_defio: Use noop_dirty_folio() fs: Convert __set_page_dirty_no_writeback to noop_dirty_folio fs: Convert __set_page_dirty_buffers to block_dirty_folio nilfs: Convert nilfs_set_page_dirty() to nilfs_dirty_folio() mm: Convert swap_set_page_dirty() to swap_dirty_folio() ubifs: Convert ubifs_set_page_dirty to ubifs_dirty_folio f2fs: Convert f2fs_set_node_page_dirty to f2fs_dirty_node_folio f2fs: Convert f2fs_set_data_page_dirty to f2fs_dirty_data_folio f2fs: Convert f2fs_set_meta_page_dirty to f2fs_dirty_meta_folio afs: Convert afs_dir_set_page_dirty() to afs_dir_dirty_folio() btrfs: Convert extent_range_redirty_for_io() to use folios fs: Convert trivial uses of __set_page_dirty_nobuffers to filemap_dirty_folio btrfs: Convert from set_page_dirty to dirty_folio fscache: Convert fscache_set_page_dirty() to fscache_dirty_folio() fs: Add aops->dirty_folio fs: Remove aops->launder_page orangefs: Convert launder_page to launder_folio nfs: Convert from launder_page to launder_folio fuse: Convert from launder_page to launder_folio ...
| * | fscache: Convert fscache_set_page_dirty() to fscache_dirty_folio()Matthew Wilcox (Oracle)2022-03-151-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert all users of fscache_set_page_dirty to use fscache_dirty_folio. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Tested-by: Mike Marshall <hubcap@omnibond.com> # orangefs Tested-by: David Howells <dhowells@redhat.com> # afs
| * | 9p: Convert from launder_page to launder_folioMatthew Wilcox (Oracle)2022-03-151-10/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Trivial conversion. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Tested-by: Mike Marshall <hubcap@omnibond.com> # orangefs Tested-by: David Howells <dhowells@redhat.com> # afs
| * | 9p: Convert to invalidate_folioMatthew Wilcox (Oracle)2022-03-151-12/+3
| |/ | | | | | | | | | | | | | | | | | | This is a trivial conversion. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Tested-by: Mike Marshall <hubcap@omnibond.com> # orangefs Tested-by: David Howells <dhowells@redhat.com> # afs
* / fs: allocate inode by using alloc_inode_sb()Muchun Song2022-03-221-1/+1
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The inode allocation is supposed to use alloc_inode_sb(), so convert kmem_cache_alloc() of all filesystems to alloc_inode_sb(). Link: https://lkml.kernel.org/r/20220228122126.37293-5-songmuchun@bytedance.com Signed-off-by: Muchun Song <songmuchun@bytedance.com> Acked-by: Theodore Ts'o <tytso@mit.edu> [ext4] Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Cc: Alex Shi <alexs@kernel.org> Cc: Anna Schumaker <Anna.Schumaker@Netapp.com> Cc: Chao Yu <chao@kernel.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Fam Zheng <fam.zheng@bytedance.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kari Argillander <kari.argillander@gmail.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Xiongchun Duan <duanxiongchun@bytedance.com> Cc: Yang Shi <shy828301@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Revert "fs/9p: search open fids first"Dominique Martinet2022-01-301-5/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 478ba09edc1f2f2ee27180a06150cb2d1a686f9c. That commit was meant as a fix for setattrs with by fd (e.g. ftruncate) to use an open fid instead of the first fid it found on lookup. The proper fix for that is to use the fid associated with the open file struct, available in iattr->ia_file for such operations, and was actually done just before in 66246641609b ("9p: retrieve fid from file when file instance exist.") As such, this commit is no longer required. Furthermore, changing lookup to return open fids first had unwanted side effects, as it turns out the protocol forbids the use of open fids for further walks (e.g. clone_fid) and we broke mounts for some servers enforcing this rule. Note this only reverts to the old working behaviour, but it's still possible for lookup to return open fids if dentry->d_fsdata is not set, so more work is needed to make sure we respect this rule in the future, for example by adding a flag to the lookup functions to only match certain fid open modes depending on caller requirements. Link: https://lkml.kernel.org/r/20220130130651.712293-1-asmadeus@codewreck.org Fixes: 478ba09edc1f ("fs/9p: search open fids first") Cc: stable@vger.kernel.org # v5.11+ Reported-by: ron minnich <rminnich@gmail.com> Reported-by: ng@0x80.stream Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
* Merge tag '9p-for-5.17-rc1' of git://github.com/martinetd/linuxLinus Torvalds2022-01-163-13/+27
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull 9p updates from Dominique Martinet: "Fixes, split 9p_net_fd, and new reviewer: - fix possible uninitialized memory usage for setattr - fix fscache reading hole in a file just after it's been grown - split net/9p/trans_fd.c in its own module like other transports. The new transport module defaults to 9P_NET and is autoloaded if required so users should not be impacted - add Christian Schoenebeck to 9p reviewers - some more trivial cleanup" * tag '9p-for-5.17-rc1' of git://github.com/martinetd/linux: 9p: fix enodata when reading growing file net/9p: show error message if user 'msize' cannot be satisfied MAINTAINERS: 9p: add Christian Schoenebeck as reviewer 9p: only copy valid iattrs in 9P2000.L setattr implementation 9p: Use BUG_ON instead of if condition followed by BUG. net/p9: load default transports 9p/xen: autoload when xenbus service is available 9p/trans_fd: split into dedicated module fs: 9p: remove unneeded variable 9p/trans_virtio: Fix typo in the comment for p9_virtio_create()
| * 9p: fix enodata when reading growing fileDominique Martinet2022-01-111-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reading from a file that was just extended by a write, but the write had not yet reached the server would return ENODATA as illustrated by this command: $ xfs_io -c 'open -ft test' -c 'w 4096 1000' -c 'r 0 1000' wrote 1000/1000 bytes at offset 4096 1000.000000 bytes, 1 ops; 0.0001 sec (5.610 MiB/sec and 5882.3529 ops/sec) pread: No data available Fix this case by having netfs assume zeroes when reads from server come short like AFS and CEPH do Link: https://lkml.kernel.org/r/20220110111444.926753-1-asmadeus@codewreck.org Cc: stable@vger.kernel.org Fixes: eb497943fa21 ("9p: Convert to using the netfs helper lib to do reads and caching") Co-authored-by: David Howells <dhowells@redhat.com> Reviewed-by: David Howells <dhowells@redhat.com> Tested-by: David Howells <dhowells@redhat.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
| * 9p: only copy valid iattrs in 9P2000.L setattr implementationChristian Brauner2022-01-101-9/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The 9P2000.L setattr method v9fs_vfs_setattr_dotl() copies struct iattr values without checking whether they are valid causing unitialized values to be copied. The 9P2000 setattr method v9fs_vfs_setattr() method gets this right. Check whether struct iattr fields are valid first before copying in v9fs_vfs_setattr_dotl() too and make sure that all other fields are set to 0 apart from {g,u}id which should be set to INVALID_{G,U}ID. This ensure that they can be safely sent over the wire or printed for debugging later on. Link: https://lkml.kernel.org/r/20211129114434.3637938-1-brauner@kernel.org Link: https://lkml.kernel.org/r/000000000000a0d53f05d1c72a4c%40google.com Cc: Eric Van Hensbergen <ericvh@gmail.com> Cc: Latchesar Ionkov <lucho@ionkov.net> Cc: Dominique Martinet <asmadeus@codewreck.org> Cc: stable@kernel.org Cc: v9fs-developer@lists.sourceforge.net Reported-by: syzbot+dfac92a50024b54acaa4@syzkaller.appspotmail.com Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> [Dominique: do not set a/mtime with just ATTR_A/MTIME as discussed] Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
| * 9p: Use BUG_ON instead of if condition followed by BUG.Zhang Mingyu2022-01-101-2/+1
| | | | | | | | | | | | | | | | | | This issue was detected with the help of Coccinelle. Link: https://lkml.kernel.org/r/20211112092547.9153-1-zhang.mingyu@zte.com.cn Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Zhang Mingyu <zhang.mingyu@zte.com.cn> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
| * fs: 9p: remove unneeded variableChangcheng Deng2021-12-181-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | Fix the following coccicheck review: ./fs/9p/vfs_file.c: 117: 5-8: Unneeded variable Remove unneeded variable used to store return value. Link: https://lkml.kernel.org/r/20211109114343.132844-1-deng.changcheng@zte.com.cn Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Changcheng Deng <deng.changcheng@zte.com.cn> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
* | 9p, afs, ceph, nfs: Use current_is_kswapd() rather than ↵David Howells2022-01-111-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | gfpflags_allow_blocking() In 9p, afs ceph, and nfs, gfpflags_allow_blocking() (which wraps a test for __GFP_DIRECT_RECLAIM being set) is used to determine if ->releasepage() should wait for the completion of a DIO write to fscache with something like: if (folio_test_fscache(folio)) { if (!gfpflags_allow_blocking(gfp) || !(gfp & __GFP_FS)) return false; folio_wait_fscache(folio); } Instead, current_is_kswapd() should be used instead. Note that this is based on a patch originally by Zhaoyang Huang[1]. In addition to extending it to the other network filesystems and putting it on top of my fscache rewrite, it also needs to include linux/swap.h in a bunch of places. Can current_is_kswapd() be moved to linux/mm.h? Changes ======= ver #5: - Dropping the changes for cifs. Originally-signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com> Co-developed-by: David Howells <dhowells@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: Zhaoyang Huang <zhaoyang.huang@unisoc.com> cc: Dominique Martinet <asmadeus@codewreck.org> cc: Marc Dionne <marc.dionne@auristor.com> cc: Steve French <smfrench@gmail.com> cc: Trond Myklebust <trond.myklebust@hammerspace.com> cc: linux-cachefs@redhat.com cc: v9fs-developer@lists.sourceforge.net cc: linux-afs@lists.infradead.org cc: ceph-devel@vger.kernel.org cc: linux-cifs@vger.kernel.org cc: linux-nfs@vger.kernel.org cc: linux-mm@kvack.org Link: https://lore.kernel.org/r/1638952658-20285-1-git-send-email-huangzhaoyang@gmail.com/ [1] Link: https://lore.kernel.org/r/164021590773.640689.16777975200823659231.stgit@warthog.procyon.org.uk/ # v4
* | 9p: Copy local writes to the cache when writing to the serverDavid Howells2022-01-103-1/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When writing to the server from v9fs_vfs_writepage(), copy the data to the cache object too. To make this possible, the cookie must have its active users count incremented when the page is dirtied and kept incremented until we manage to clean up all the pages. This allows the writeback to take place after the last file struct is released. This is done by taking a use on the cookie in v9fs_set_page_dirty() if we haven't already done so (controlled by the I_PINNING_FSCACHE_WB flag) and dropping the pin in v9fs_write_inode() if __writeback_single_inode() clears all the outstanding dirty pages (conveyed by the unpinned_fscache_wb flag in the writeback_control struct). Inode eviction must also clear the flag after truncating away all the outstanding pages. In the future this will be handled more gracefully by netfslib. Changes ======= ver #3: - Canonicalise the coherency data to make it endianness-independent. ver #2: - Fix an unused-var warning due to CONFIG_9P_FSCACHE=n[1]. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Jeff Layton <jlayton@kernel.org> Tested-by: Dominique Martinet <asmadeus@codewreck.org> cc: Eric Van Hensbergen <ericvh@gmail.com> cc: Latchesar Ionkov <lucho@ionkov.net> cc: v9fs-developer@lists.sourceforge.net cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/163819667027.215744.13815687931204222995.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906978015.143852.10646669694345706328.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967180760.1823006.5831751873616248910.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021574522.640689.13849966660182529125.stgit@warthog.procyon.org.uk/ # v4
* | 9p: Use fscache indexing rewrite and reenable cachingDavid Howells2022-01-1010-210/+91
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change the 9p filesystem to take account of the changes to fscache's indexing rewrite and reenable caching in 9p. The following changes have been made: (1) The fscache_netfs struct is no more, and there's no need to register the filesystem as a whole. (2) The session cookie is now an fscache_volume cookie, allocated with fscache_acquire_volume(). That takes three parameters: a string representing the "volume" in the index, a string naming the cache to use (or NULL) and a u64 that conveys coherency metadata for the volume. For 9p, I've made it render the volume name string as: "9p,<devname>,<cachetag>" where the cachetag is replaced by the aname if it wasn't supplied. This probably needs rethinking a bit as the aname can have slashes in it. It might be better to hash the cachetag and use the hash or I could substitute commas for the slashes or something. (3) The fscache_cookie_def is no more and needed information is passed directly to fscache_acquire_cookie(). The cache no longer calls back into the filesystem, but rather metadata changes are indicated at other times. fscache_acquire_cookie() is passed the same keying and coherency information as before. (4) The functions to set/reset/flush cookies are removed and fscache_use_cookie() and fscache_unuse_cookie() are used instead. fscache_use_cookie() is passed a flag to indicate if the cookie is opened for writing. fscache_unuse_cookie() is passed updates for the metadata if we changed it (ie. if the file was opened for writing). These are called when the file is opened or closed. (5) wait_on_page_bit[_killable]() is replaced with the specific wait functions for the bits waited upon. (6) I've got rid of some of the 9p-specific cache helper functions and called things like fscache_relinquish_cookie() directly as they'll optimise away if v9fs_inode_cookie() returns an unconditional NULL (which will be the case if CONFIG_9P_FSCACHE=n). (7) v9fs_vfs_setattr() is made to call fscache_resize() to change the size of the cache object. Notes: (A) We should call fscache_invalidate() if we detect that the server's copy of a file got changed by a third party, but I don't know where to do that. We don't need to do that when allocating the cookie as we get a check-and-invalidate when we initially bind to the cache object. (B) The copy-to-cache-on-writeback side of things will be handled in separate patch. Changes ======= ver #3: - Canonicalise the cookie key and coherency data to make them endianness-independent. ver #2: - Use gfpflags_allow_blocking() rather than using flag directly. - fscache_acquire_volume() now returns errors. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Jeff Layton <jlayton@kernel.org> Tested-by: Dominique Martinet <asmadeus@codewreck.org> cc: Eric Van Hensbergen <ericvh@gmail.com> cc: Latchesar Ionkov <lucho@ionkov.net> cc: v9fs-developer@lists.sourceforge.net cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/163819664645.215744.1555314582005286846.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906975017.143852.3459573173204394039.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967178512.1823006.17377493641569138183.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021573143.640689.3977487095697717967.stgit@warthog.procyon.org.uk/ # v4
* | fscache: Remove the contents of the fscache driver, pending rewriteDavid Howells2022-01-071-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove the code that comprises the fscache driver as it's going to be substantially rewritten, with the majority of the code being erased in the rewrite. A small piece of linux/fscache.h is left as that is #included by a bunch of network filesystems. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/163819578724.215744.18210619052245724238.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906884814.143852.6727245089843862889.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967077097.1823006.1377665951499979089.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021485548.640689.13876080567388696162.stgit@warthog.procyon.org.uk/ # v4
* | fscache, cachefiles: Disable configurationDavid Howells2022-01-071-1/+1
|/ | | | | | | | | | | | Disable fscache and cachefiles in Kconfig whilst it is rewritten. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/163819576672.215744.12444272479560406780.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906882835.143852.11073015983885872901.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967075113.1823006.277316290062782998.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021481179.640689.2004199594774033658.stgit@warthog.procyon.org.uk/ # v4
* netfs, 9p, afs, ceph: Use foliosDavid Howells2021-11-102-47/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert the netfs helper library to use folios throughout, convert the 9p and afs filesystems to use folios in their file I/O paths and convert the ceph filesystem to use just enough folios to compile. With these changes, afs passes -g quick xfstests. Changes ======= ver #5: - Got rid of folio_end{io,_read,_write}() and inlined the stuff it does instead (Willy decided he didn't want this after all). ver #4: - Fixed a bug in afs_redirty_page() whereby it didn't set the next page index in the loop and returned too early. - Simplified a check in v9fs_vfs_write_folio_locked()[1]. - Undid a change to afs_symlink_readpage()[1]. - Used offset_in_folio() in afs_write_end()[1]. - Changed from using page_endio() to folio_end{io,_read,_write}()[1]. ver #2: - Add 9p foliation. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Tested-by: Jeff Layton <jlayton@kernel.org> Tested-by: Dominique Martinet <asmadeus@codewreck.org> Tested-by: kafs-testing@auristor.com cc: Matthew Wilcox (Oracle) <willy@infradead.org> cc: Marc Dionne <marc.dionne@auristor.com> cc: Ilya Dryomov <idryomov@gmail.com> cc: Dominique Martinet <asmadeus@codewreck.org> cc: v9fs-developer@lists.sourceforge.net cc: linux-afs@lists.infradead.org cc: ceph-devel@vger.kernel.org cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/YYKa3bfQZxK5/wDN@casper.infradead.org/ [1] Link: https://lore.kernel.org/r/2408234.1628687271@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/162877311459.3085614.10601478228012245108.stgit@warthog.procyon.org.uk/ Link: https://lore.kernel.org/r/162981153551.1901565.3124454657133703341.stgit@warthog.procyon.org.uk/ Link: https://lore.kernel.org/r/163005745264.2472992.9852048135392188995.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163584187452.4023316.500389675405550116.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/163649328026.309189.1124218109373941936.stgit@warthog.procyon.org.uk/ # v4 Link: https://lore.kernel.org/r/163657852454.834781.9265101983152100556.stgit@warthog.procyon.org.uk/ # v5
* 9p: fix a bunch of checkpatch warningsDominique Martinet2021-11-0412-37/+58
| | | | | | | | | | Sohaib Mohamed started a serie of tiny and incomplete checkpatch fixes but seemingly stopped halfway -- take over and do most of it. This is still missing net/9p/trans* and net/9p/protocol.c for a later time... Link: http://lkml.kernel.org/r/20211102134608.1588018-3-dominique.martinet@atmark-techno.com Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
* 9p: set readahead and io size according to maxsizeDominique Martinet2021-11-041-0/+3
| | | | | | | | | | | having a readahead of 128k with a msize of 128k (with overhead) lead to reading 124+4k everytime, making two roundtrips needlessly. tune readahead according to msize when cache is enabled for better performance Link: http://lkml.kernel.org/r/20211104120323.2189376-1-asmadeus@codewreck.org Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>