summaryrefslogtreecommitdiffstats
path: root/fs
Commit message (Collapse)AuthorAgeFilesLines
* GFS2: Move log buffer lists into transactionSteven Whitehouse2014-02-247-29/+53
| | | | | | | | | | | | | | | Over time, we hope to be able to improve the concurrency available in the log code. This is one small step towards that, by moving the buffer lists from the super block, and into the transaction structure, so that each transaction builds its own buffer lists. At transaction commit time, the buffer lists are merged into the currently accumulating transaction. That transaction then is passed into the before and after commit functions at journal flush time. Thus there should be no change in overall behaviour yet. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Reduce struct gfs2_trans in sizeSteven Whitehouse2014-02-212-3/+3
| | | | | | | | | A couple of "int" fields were being used as boolean values so we can make them bitfields of one bit, and put them in what might otherwise be a hole in the structure with 64 bit alignment. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: add missing newlineDavid Teigland2014-02-171-1/+1
| | | | | | | | Log message is missing newline. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Mark functions as static in gfs2/rgrp.cRashika Kheria2014-02-101-2/+2
| | | | | | | | | | | | | Mark functions as static in gfs2/rgrp.c because they are not used outside this file. This eliminates the following warning in gfs2/rgrp.c: fs/gfs2/rgrp.c:1092:5: warning: no previous prototype for ‘gfs2_rgrp_bh_get’ [-Wmissing-prototypes] fs/gfs2/rgrp.c:1157:5: warning: no previous prototype for ‘update_rgrp_lvb’ [-Wmissing-prototypes] Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Add meta readahead field in directory entriesSteven Whitehouse2014-02-071-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The intent of this new field in the directory entry is to allow a subsequent lookup to know how many blocks, which are contiguous with the inode, contain metadata which relates to the inode. This will then allow the issuing of a single read to read these blocks, rather than reading the inode first, and then issuing a second read for the metadata. This only works under some fairly strict conditions, since we do not have back pointers from inodes to directory entries we must ensure that the blocks referenced in this way will always belong to the inode. This rules out being able to use this system for indirect blocks, as these can change as a result of truncate/rewrite. So the idea here is to restrict this to xattr blocks only for the time being. For most inodes, that means only a single block. Also, when using ACLs and/or SELinux or other LSMs, these will be added at inode creation time so that they will be contiguous with the inode on disk and also will almost always be needed when we read the inode in for permissions checks. Once an xattr block for an inode is allocated, it will never change until the inode is deallocated. This patch adds the new field, a further patch will add the readahead in due course. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Lock i_mutex and use a local gfs2_holder for fallocateBob Peterson2014-02-061-4/+9
| | | | | | | | | | This patch causes GFS2 to lock the i_mutex during fallocate. It also switches from using a dinode's inode glock to using a local holder like the other GFS2 i_operations. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: journal data writepages updateSteven Whitehouse2014-02-062-36/+98
| | | | | | | | | | | | | | | | | | | GFS2 has carried what is more or less a copy of the write_cache_pages() for some time. It seems that this copy has slipped behind the core code over time. This patch brings it back uptodate, and in addition adds the tracepoint which would otherwise be missing. We could go further, and eliminate some or all of the code duplication here. The issue is that if we do that, then the function we need to split out from the existing write_cache_pages(), which will look a lot like gfs2_jdata_write_pagevec(), would land up putting quite a lot of extra variables on the stack. I know that has been a problem in the past in the writeback code path, which is why I've hesitated to do it here. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Allocate block for xattr at inode alloc time, if requiredSteven Whitehouse2014-02-042-8/+44
| | | | | | | | | | | | | | | | | | | | | | | This is another step towards improving the allocation of xattr blocks at inode allocation time. Here we take advantage of Christoph's recent work on ACLs to allocate a block for the xattrs early if we know that we will be adding ACLs to the inode later on. The advantage of that is that it is much more likely that we'll get a contiguous run of two blocks where the first is the inode and the second is the xattr block. We still have to fall back to the original system in case we don't get the requested two contiguous blocks, or in case the ACLs are too large to fit into the block. Future patches will move more of the ACL setting code further up the gfs2_inode_create() function. Also, I'd like to be able to do the same thing with the xattrs from LSMs in due course, too. That way we should be able to slowly reduce the number of independent transactions, at least in the most common cases. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* GFS2: Plug on AIL flushSteven Whitehouse2014-02-031-0/+4
| | | | | | | | | | | | | | | | When we do a flush of the AIL list, we are writing out what is likely to be a lot of small I/Os, which are possibly in an order which is not ideal performance-wise. Since this is done by calling filemap_fdatatwrite for each individual inode's address space there is no overall plugging going on. In addition to that, we do not always wait for AIL i/o when we flush it, so that it is possible for things to get left behind on the queue. By adding explicit plugging here, we reduce the chances of this being an issues. A quick test using the AIL flush tracepoint shows a small, but measurable improvement. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* hpfs: optimize quad buffer loadingMikulas Patocka2014-02-021-46/+50
| | | | | | | | | | | | | | | | | | | | | | HPFS needs to load 4 consecutive 512-byte sectors when accessing the directory nodes or bitmaps. We can't switch to 2048-byte block size because files are allocated in the units of 512-byte sectors. Previously, the driver would allocate a 2048-byte area using kmalloc, copy the data from four buffers to this area and eventually copy them back if they were modified. In the current implementation of the buffer cache, buffers are allocated in the pagecache. That means that 4 consecutive 512-byte buffers are stored in consecutive areas in the kernel address space. So, we don't need to allocate extra memory and copy the content of the buffers there. This patch optimizes the code to avoid copying the buffers. It checks if the four buffers are stored in contiguous memory - if they are not, it falls back to allocating a 2048-byte area and copying data there. Signed-off-by: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* hpfs: remember free spaceMikulas Patocka2014-02-023-10/+87
| | | | | | | | | | | | | | | | | | Previously, hpfs scanned all bitmaps each time the user asked for free space using statfs. This patch changes it so that hpfs scans the bitmaps only once, remembes the free space and on next invocation of statfs it returns the value instantly. New versions of wine are hammering on the statfs syscall very heavily, making some games unplayable when they're stored on hpfs, with load times in minutes. This should be backported to the stable kernels because it fixes user-visible problem (excessive level load times in wine). Signed-off-by: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* afs: proc cells and rootcell are writeablePali Rohár2014-02-011-2/+2
| | | | | | | | | | | | | Both proc files are writeable and used for configuring cells. But there is missing correct mode flag for writeable files. Without this patch both proc files are read only. [ It turns out they aren't really read-only, since root can write to them even if the write bit isn't set due to CAP_DAC_OVERRIDE ] Signed-off-by: Pali Rohár <pali.rohar@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'for-linus' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds2014-02-0111-420/+553
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull cifs fixes from Steve French: "A set of cifs fixes (mostly for symlinks, and SMB2 xattrs) and cleanups" * 'for-linus' of git://git.samba.org/sfrench/cifs-2.6: cifs: Fix check for regular file in couldbe_mf_symlink() [CIFS] Fix SMB2 mounts so they don't try to set or get xattrs via cifs CIFS: Cleanup cifs open codepath CIFS: Remove extra indentation in cifs_sfu_type CIFS: Cleanup cifs_mknod CIFS: Cleanup CIFSSMBOpen cifs: Add support for follow_link on dfs shares under posix extensions cifs: move unix extension call to cifs_query_symlink() cifs: Re-order M-F Symlink code cifs: Add create MFSymlinks to protocol ops struct cifs: use protocol specific call for query_mf_symlink() cifs: Rename MF symlink function names cifs: Rename and cleanup open_query_close_cifs_symlink() cifs: Fix memory leak in cifs_hardlink()
| * cifs: Fix check for regular file in couldbe_mf_symlink()Sachin Prabhu2014-01-311-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | MF Symlinks are regular files containing content in a specified format. The function couldbe_mf_symlink() checks the mode for a set S_IFREG bit as a test to confirm that it is a regular file. This bit is also set for other filetypes and simply checking for this bit being set may return false positives. We ensure that we are actually checking for a regular file by using the S_ISREG macro to test instead. Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Reported-by: Neil Brown <neilb@suse.de> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steve French <smfrench@gmail.com>
| * [CIFS] Fix SMB2 mounts so they don't try to set or get xattrs via cifsSteve French2014-01-262-19/+36
| | | | | | | | | | | | | | | | | | | | | | When mounting with smb2 (or smb2.1 or smb3) we need to check to make sure that attempts to query or set extended attributes do not attempt to send the request with the older cifs protocol instead (eventually we also need to add the support in SMB2 to query/set extended attributes but this patch prevents us from using the wrong protocol for extended attribute operations). Signed-off-by: Steve French <smfrench@gmail.com>
| * CIFS: Cleanup cifs open codepathPavel Shilovsky2014-01-208-100/+174
| | | | | | | | | | | | | | | | Rename CIFSSMBOpen to CIFS_open and make it take cifs_open_parms structure as a parm. Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru> Signed-off-by: Steve French <smfrench@gmail.com>
| * CIFS: Remove extra indentation in cifs_sfu_typePavel Shilovsky2014-01-201-47/+50
| | | | | | | | | | Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru> Signed-off-by: Steve French <smfrench@gmail.com>
| * CIFS: Cleanup cifs_mknodPavel Shilovsky2014-01-201-26/+22
| | | | | | | | | | | | | | Rename camel case variable and fix comment style. Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru> Signed-off-by: Steve French <smfrench@gmail.com>
| * CIFS: Cleanup CIFSSMBOpenPavel Shilovsky2014-01-202-72/+86
| | | | | | | | | | | | | | | | | | Remove indentation, fix comment style, rename camel case variables in preparation to make it work with cifs_open_parms structure as a parm. Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru> Signed-off-by: Steve French <smfrench@gmail.com>
| * cifs: Add support for follow_link on dfs shares under posix extensionsSachin Prabhu2014-01-201-0/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When using posix extensions, dfs shares in the dfs root show up as symlinks resulting in userland tools such as 'ls' calling readlink() on these shares. Since these are dfs shares, we end up returning -EREMOTE. $ ls -l /mnt ls: cannot read symbolic link /mnt/test: Object is remote total 0 lrwxrwxrwx. 1 root root 19 Nov 6 09:47 test With added follow_link() support for dfs shares, when using unix extensions, we call GET_DFS_REFERRAL to obtain the DFS referral and return the first node returned. The dfs share in the dfs root is now displayed in the following manner. $ ls -l /mnt total 0 lrwxrwxrwx. 1 root root 19 Nov 6 09:47 test -> \vm140-31\test Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>
| * cifs: move unix extension call to cifs_query_symlink()Sachin Prabhu2014-01-202-10/+15
| | | | | | | | | | | | | | | | | | | | Unix extensions rigth now are only applicable to smb1 operations. Move the check and subsequent unix extension call to the smb1 specific call to query_symlink() ie. cifs_query_symlink(). Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>
| * cifs: Re-order M-F Symlink codeSachin Prabhu2014-01-201-56/+68
| | | | | | | | | | | | | | | | | | This patch makes cosmetic changes. We group similar functions together and separate out the protocol specific functions. Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>
| * cifs: Add create MFSymlinks to protocol ops structSachin Prabhu2014-01-204-42/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | Add a new protocol ops function create_mf_symlink and have create_mf_symlink() use it. This patchset moves the MFSymlink operations completely to the ops structure so that we only use the right protocol versions when querying or creating MFSymlinks. Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>
| * cifs: use protocol specific call for query_mf_symlink()Sachin Prabhu2014-01-201-41/+20
| | | | | | | | | | | | | | | | | | We have an existing protocol specific call query_mf_symlink() created for check_mf_symlink which can also be used for query_mf_symlink(). Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>
| * cifs: Rename MF symlink function namesSachin Prabhu2014-01-204-26/+24
| | | | | | | | | | | | | | | | Clean up camel case in functionnames. Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>
| * cifs: Rename and cleanup open_query_close_cifs_symlink()Sachin Prabhu2014-01-204-31/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | Rename open_query_close_cifs_symlink to cifs_query_mf_symlink() to make the name more consistent with other protocol version specific functions. We also pass tcon as an argument to the function. This is already available in the calling functions and we can avoid having to make an unnecessary lookup. Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>
| * cifs: Fix memory leak in cifs_hardlink()Christian Engelmayer2014-01-191-2/+4
| | | | | | | | | | | | | | | | Fix a potential memory leak in the cifs_hardlink() error handling path. Detected by Coverity: CID 728510, CID 728511. Signed-off-by: Christian Engelmayer <cengelma@gmx.at> Signed-off-by: Steve French <smfrench@gmail.com>
* | Merge branch 'for-linus' of ↵Linus Torvalds2014-02-016-50/+25
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro: "Several obvious fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: Fix mountpoint reference leakage in linkat hfsplus: use xattr handlers for removexattr Typo in compat_sys_lseek() declaration fs/super.c: sync ro remount after blocking writers vfs: unexport the getname() symbol
| * | Fix mountpoint reference leakage in linkatOleg Drokin2014-01-311-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Recent changes to retry on ESTALE in linkat (commit 442e31ca5a49e398351b2954b51f578353fdf210) introduced a mountpoint reference leak and a small memory leak in case a filesystem link operation returns ESTALE which is pretty normal for distributed filesystems like lustre, nfs and so on. Free old_path in such a case. [AV: there was another missing path_put() nearby - on the previous goto retry] Signed-off-by: Oleg Drokin: <green@linuxhacker.ru> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | hfsplus: use xattr handlers for removexattrChristoph Hellwig2014-01-314-47/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | hfsplus was already using the handlers for get and set operations, and with the removal of can_set_xattr we've now allow operations that wouldn't otherwise be allowed. With this we can also centralize the special-casing of the osx. attrs that don't have prefixes on disk in the osx xattr handlers. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | fs/super.c: sync ro remount after blocking writersAndrew Ruder2014-01-311-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move sync_filesystem() after sb_prepare_remount_readonly(). If writers sneak in anywhere from sync_filesystem() to sb_prepare_remount_readonly() it can cause inodes to be dirtied and writeback to occur well after sys_mount() has completely successfully. This was spotted by corrupted ubifs filesystems on reboot, but appears that it can cause issues with any filesystem using writeback. Cc: Artem Bityutskiy <dedekind1@gmail.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> CC: Richard Weinberger <richard@nod.at> Co-authored-by: Richard Weinberger <richard@nod.at> Signed-off-by: Andrew Ruder <andrew.ruder@elecsyscorp.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | vfs: unexport the getname() symbolJeff Layton2014-01-311-1/+0
| | | | | | | | | | | | | | | | | | | | | Leaving getname() exported when putname() isn't is a bad idea. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | | Merge tag 'nfs-for-3.14-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds2014-01-319-35/+84
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull NFS client bugfixes from Trond Myklebust: "Highlights: - Fix several races in nfs_revalidate_mapping - NFSv4.1 slot leakage in the pNFS files driver - Stable fix for a slot leak in nfs40_sequence_done - Don't reject NFSv4 servers that support ACLs with only ALLOW aces" * tag 'nfs-for-3.14-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: nfs: initialize the ACL support bits to zero. NFSv4.1: Cleanup NFSv4.1: Clean up nfs41_sequence_done NFSv4: Fix a slot leak in nfs40_sequence_done NFSv4.1 free slot before resending I/O to MDS nfs: add memory barriers around NFS_INO_INVALID_DATA and NFS_INO_INVALIDATING NFS: Fix races in nfs_revalidate_mapping sunrpc: turn warn_gssd() log message into a dprintk() NFS: fix the handling of NFS_INO_INVALID_DATA flag in nfs_revalidate_mapping nfs: handle servers that support only ALLOW ACE type.
| * | | nfs: initialize the ACL support bits to zero.Malahal Naineni2014-01-311-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Avoid returning incorrect acl mask attributes when the server doesn't support ACLs. Signed-off-by: Malahal Naineni <malahal@us.ibm.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | NFSv4.1: CleanupTrond Myklebust2014-01-291-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | It is now completely safe to call nfs41_sequence_free_slot with a NULL slot. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | NFSv4.1: Clean up nfs41_sequence_doneTrond Myklebust2014-01-291-11/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the test for res->sr_slot == NULL out of the nfs41_sequence_free_slot helper and into the main function for efficiency. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | NFSv4: Fix a slot leak in nfs40_sequence_doneTrond Myklebust2014-01-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The check for whether or not we sent an RPC call in nfs40_sequence_done is insufficient to decide whether or not we are holding a session slot, and thus should not be used to decide when to free that slot. This patch replaces the RPC_WAS_SENT() test with the correct test for whether or not slot == NULL. Cc: Chuck Lever <chuck.lever@oracle.com> Cc: stable@vger.kernel.org # 3.12+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | NFSv4.1 free slot before resending I/O to MDSAndy Adamson2014-01-293-3/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix a dynamic session slot leak where a slot is preallocated and I/O is resent through the MDS. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | nfs: add memory barriers around NFS_INO_INVALID_DATA and NFS_INO_INVALIDATINGJeff Layton2014-01-283-3/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the setting of NFS_INO_INVALIDATING gets reordered to before the clearing of NFS_INO_INVALID_DATA, then another task may hit a race window where both appear to be clear, even though the inode's pages are still in need of invalidation. Fix this by adding the appropriate memory barriers. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | NFS: Fix races in nfs_revalidate_mappingTrond Myklebust2014-01-281-14/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit d529ef83c355f97027ff85298a9709fe06216a66 (NFS: fix the handling of NFS_INO_INVALID_DATA flag in nfs_revalidate_mapping) introduces a potential race, since it doesn't test the value of nfsi->cache_validity and set the bitlock in nfsi->flags atomically. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Cc: Jeff Layton <jlayton@redhat.com>
| * | | sunrpc: turn warn_gssd() log message into a dprintk()Jeff Layton2014-01-271-4/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The original printk() made sense when the GSSAPI codepaths were called only when sec=krb5* was explicitly requested. Now however, in many cases the nfs client will try to acquire GSSAPI credentials by default, even when it's not requested. Since we don't have a great mechanism to distinguish between the two cases, just turn the pr_warn into a dprintk instead. With this change we can also get rid of the ratelimiting. We do need to keep the EXPORT_SYMBOL(gssd_running) in place since auth_gss.ko needs it and sunrpc.ko provides it. We can however, eliminate the gssd_running call in the nfs code since that's a bit of a layering violation. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | NFS: fix the handling of NFS_INO_INVALID_DATA flag in nfs_revalidate_mappingJeff Layton2014-01-274-6/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a possible race in how the nfs_invalidate_mapping function is handled. Currently, we go and invalidate the pages in the file and then clear NFS_INO_INVALID_DATA. The problem is that it's possible for a stale page to creep into the mapping after the page was invalidated (i.e., via readahead). If another writer comes along and sets the flag after that happens but before invalidate_inode_pages2 returns then we could clear the flag without the cache having been properly invalidated. So, we must clear the flag first and then invalidate the pages. Doing this however, opens another race: It's possible to have two concurrent read() calls that end up in nfs_revalidate_mapping at the same time. The first one clears the NFS_INO_INVALID_DATA flag and then goes to call nfs_invalidate_mapping. Just before calling that though, the other task races in, checks the flag and finds it cleared. At that point, it trusts that the mapping is good and gets the lock on the page, allowing the read() to be satisfied from the cache even though the data is no longer valid. These effects are easily manifested by running diotest3 from the LTP test suite on NFS. That program does a series of DIO writes and buffered reads. The operations are serialized and page-aligned but the existing code fails the test since it occasionally allows a read to come out of the cache incorrectly. While mixing direct and buffered I/O isn't recommended, I believe it's possible to hit this in other ways that just use buffered I/O, though that situation is much harder to reproduce. The problem is that the checking/clearing of that flag and the invalidation of the mapping really need to be atomic. Fix this by serializing concurrent invalidations with a bitlock. At the same time, we also need to allow other places that check NFS_INO_INVALID_DATA to check whether we might be in the middle of invalidating the file, so fix up a couple of places that do that to look for the new NFS_INO_INVALIDATING flag. Doing this requires us to be careful not to set the bitlock unnecessarily, so this code only does that if it believes it will be doing an invalidation. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | nfs: handle servers that support only ALLOW ACE type.Malahal Naineni2014-01-271-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently we support ACLs if the NFS server file system supports both ALLOW and DENY ACE types. This patch makes the Linux client work with ACLs even if the server supports only 'ALLOW' ACE type. Signed-off-by: Malahal Naineni <malahal@us.ibm.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* | | | Merge tag 'jfs-3.14' of git://github.com/kleikamp/linux-shaggyLinus Torvalds2014-01-311-1/+14
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull jfs fix from David Kleikamp: "Minor bug fix for linux-3.14" * tag 'jfs-3.14' of git://github.com/kleikamp/linux-shaggy: jfs: fix xattr value size overflow in __jfs_setxattr
| * | | | jfs: fix xattr value size overflow in __jfs_setxattrJie Liu2014-01-021-1/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a potential overflow if the specified EA value size is greater than USHRT_MAX because the size of value is limited by the on-disk format (i.e, __le16), this issue could be reflected via the tests below: # touch /jfs/testfile # setfattr -n user.comment -v `perl -e 'print "A"x65536'` /jfs/testfile setfattr: /jfs/testfile: Invalid argument Syslog: ... jfs_xsetattr: xattr_size = 21, new_size = 65557 This patch add pre-checkups of EA value size against USHRT_MAX to avoid this problem, and return -E2BIG which is consistent with the VFS setxattr interface. Moreover, fix the debug code to print the correct function name. With this fix: setfattr: /jfs/testfile: Argument list too long Signed-off-by: Jie Liu <jeff.liu@oracle.com> Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
* | | | | ceph: fix missing dput in ceph_set_aclSage Weil2014-01-311-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add matching dput() for d_find_alias(). Move d_find_alias() down a bit at Julia's suggestion. [ Introduced by commit 72466d0b92e0: "ceph: fix posix ACL hooks" ] Reported-by: Fengguang Wu <fengguang.wu@intel.com> Reported-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Ilya Dryomov <ilya.dryomov@inktank.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | Merge branch 'for-linus' of ↵Linus Torvalds2014-01-3052-2029/+5103
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs updates from Chris Mason: "This is a pretty big pull, and most of these changes have been floating in btrfs-next for a long time. Filipe's properties work is a cool building block for inheriting attributes like compression down on a per inode basis. Jeff Mahoney kicked in code to export filesystem info into sysfs. Otherwise, lots of performance improvements, cleanups and bug fixes. Looks like there are still a few other small pending incrementals, but I wanted to get the bulk of this in first" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (149 commits) Btrfs: fix spin_unlock in check_ref_cleanup Btrfs: setup inode location during btrfs_init_inode_locked Btrfs: don't use ram_bytes for uncompressed inline items Btrfs: fix btrfs_search_slot_for_read backwards iteration Btrfs: do not export ulist functions Btrfs: rework ulist with list+rb_tree Btrfs: fix memory leaks on walking backrefs failure Btrfs: fix send file hole detection leading to data corruption Btrfs: add a reschedule point in btrfs_find_all_roots() Btrfs: make send's file extent item search more efficient Btrfs: fix to catch all errors when resolving indirect ref Btrfs: fix protection between walking backrefs and root deletion btrfs: fix warning while merging two adjacent extents Btrfs: fix infinite path build loops in incremental send btrfs: undo sysfs when open_ctree() fails Btrfs: fix snprintf usage by send's gen_unique_name btrfs: fix defrag 32-bit integer overflow btrfs: sysfs: list the NO_HOLES feature btrfs: sysfs: don't show reserved incompat feature btrfs: call permission checks earlier in ioctls and return EPERM ...
| * | | | | Btrfs: fix spin_unlock in check_ref_cleanupChris Mason2014-01-291-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Our goto out should have gone a little farther. Signed-off-by: Chris Mason <clm@fb.com>
| * | | | | Btrfs: setup inode location during btrfs_init_inode_lockedChris Mason2014-01-291-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have a race during inode init because the BTRFS_I(inode)->location is setup after the inode hash table lock is dropped. btrfs_find_actor uses the location field, so our search might not find an existing inode in the hash table if we race with the inode init code. This commit changes things to setup the location field sooner. Also the find actor now uses only the location objectid to match inodes. For inode hashing, we just need a unique and stable test, it doesn't have to reflect the inode numbers we show to userland. Signed-off-by: Chris Mason <clm@fb.com> CC: stable@vger.kernel.org
| * | | | | Btrfs: don't use ram_bytes for uncompressed inline itemsChris Mason2014-01-296-22/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we truncate an uncompressed inline item, ram_bytes isn't updated to reflect the new size. The fixe uses the size directly from the item header when reading uncompressed inlines, and also fixes truncate to update the size as it goes. Reported-by: Jens Axboe <axboe@fb.com> Signed-off-by: Chris Mason <clm@fb.com> CC: stable@vger.kernel.org