summaryrefslogtreecommitdiffstats
path: root/fs/gfs2
Commit message (Collapse)AuthorAgeFilesLines
* Rename superblock flags (MS_xyz -> SB_xyz)Linus Torvalds2017-11-273-14/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a pure automated search-and-replace of the internal kernel superblock flags. The s_flags are now called SB_*, with the names and the values for the moment mirroring the MS_* flags that they're equivalent to. Note how the MS_xyz flags are the ones passed to the mount system call, while the SB_xyz flags are what we then use in sb->s_flags. The script to do this was: # places to look in; re security/*: it generally should *not* be # touched (that stuff parses mount(2) arguments directly), but # there are two places where we really deal with superblock flags. FILES="drivers/mtd drivers/staging/lustre fs ipc mm \ include/linux/fs.h include/uapi/linux/bfs_fs.h \ security/apparmor/apparmorfs.c security/apparmor/include/lib.h" # the list of MS_... constants SYMS="RDONLY NOSUID NODEV NOEXEC SYNCHRONOUS REMOUNT MANDLOCK \ DIRSYNC NOATIME NODIRATIME BIND MOVE REC VERBOSE SILENT \ POSIXACL UNBINDABLE PRIVATE SLAVE SHARED RELATIME KERNMOUNT \ I_VERSION STRICTATIME LAZYTIME SUBMOUNT NOREMOTELOCK NOSEC BORN \ ACTIVE NOUSER" SED_PROG= for i in $SYMS; do SED_PROG="$SED_PROG -e s/MS_$i/SB_$i/g"; done # we want files that contain at least one of MS_..., # with fs/namespace.c and fs/pnode.c excluded. L=$(for i in $SYMS; do git grep -w -l MS_$i $FILES; done| sort|uniq|grep -v '^fs/namespace.c'|grep -v '^fs/pnode.c') for f in $L; do sed -i $f $SED_PROG; done Requested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm, pagevec: remove cold parameter for pagevecsMel Gorman2017-11-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | Every pagevec_init user claims the pages being released are hot even in cases where it is unlikely the pages are hot. As no one cares about the hotness of pages being released to the allocator, just ditch the parameter. No performance impact is expected as the overhead is marginal. The parameter is removed simply because it is a bit stupid to have a useless parameter copied everywhere. Link: http://lkml.kernel.org/r/20171018075952.10627-6-mgorman@techsingularity.net Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Jan Kara <jack@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm: remove nr_pages argument from pagevec_lookup_{,range}_tag()Jan Kara2017-11-151-1/+1
| | | | | | | | | | | All users of pagevec_lookup() and pagevec_lookup_range() now pass PAGEVEC_SIZE as a desired number of pages. Just drop the argument. Link: http://lkml.kernel.org/r/20171009151359.31984-15-jack@suse.cz Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* gfs2: use pagevec_lookup_range_tag()Jan Kara2017-11-151-18/+2
| | | | | | | | | | | | | We want only pages from given range in gfs2_write_cache_jdata(). Use pagevec_lookup_range_tag() instead of pagevec_lookup_tag() and remove unnecessary code. Link: http://lkml.kernel.org/r/20171009151359.31984-9-jack@suse.cz Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge tag 'gfs2-4.15.fixes' of ↵Linus Torvalds2017-11-1411-209/+469
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 updates from Bob Peterson: "We've got a total of 17 GFS2 patches for this merge window. The patches are basically in three categories: (1) patches related to broken xfstest cases, (2) patches related to improving iomap and start using it in GFS2, and (3) general typos and clarifications. Please note that one of the iomap patches extends beyond GFS2 and affects other file systems, but it was publically reviewed by a variety of file system people in the community. From Andreas Gruenbacher: - rename variable 'bsize' to 'factor' to clarify the logic related to gfs2_block_map. - correctly set ctime in the setflags ioctl, which fixes broken xfstests test 277. - fix broken xfstest 258, due to an atime initialization problem. - fix broken xfstest 307, in which GFS2 was not setting ctime when setting acls. - switch general iomap code from blkno to disk offset for a variety of file systems. - add a new IOMAP_F_DATA_INLINE flag for iomap to indicate blocks that have data mixed with metadata. - implement SEEK_HOLE and SEEK_DATA via iomap in GFS2. - fix failing xfstest case 066, which was due to not properly syncing dirty inodes when changing extended attributes. - fix a minor typo in a comment. - partially fix xfstest 424, which involved GET_FLAGS and SET_FLAGS ioctl. This is also a cleanup and simplification of the translation of flags from fs flags to gfs2 flags. - add support for STATX_ATTR_ in statx, which fixed broken xfstest 424. - fix for failing xfstest 093 which fixes a recursive glock problem with gfs2_xattr_get and _set From me: - make inode height info part of the 'metapath' data structure to facilitate using iomap in GFS2. - start using iomap inside GFS2 and switch GFS2's block_map functions to use iomap under the covers. - switch GFS2's fiemap implementation from using block_map to using iomap under the covers. - fix journaled data pages not being properly synced to media when writing inodes. This was caught with xfstests. - fix another failing xfstest case in which switching a file from ordered_write to journaled data via set_flags caused a deadlock" * tag 'gfs2-4.15.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: gfs2: Allow gfs2_xattr_set to be called with the glock held gfs2: Add support for statx inode flags gfs2: Fix and clean up {GET,SET}FLAGS ioctl gfs2: Fix a harmless typo gfs2: Fix xattr fsync GFS2: Take inode off order_write list when setting jdata flag GFS2: flush the log and all pages for jdata as we do for WB_SYNC_ALL gfs2: Implement SEEK_HOLE / SEEK_DATA via iomap GFS2: Switch fiemap implementation to use iomap GFS2: Implement iomap for block_map GFS2: Make height info part of metapath gfs2: Always update inode ctime in set_acl gfs2: Support negative atimes gfs2: Update ctime in setflags ioctl gfs2: Clarify gfs2_block_map
| * gfs2: Allow gfs2_xattr_set to be called with the glock heldAndreas Gruenbacher2017-10-311-7/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On the following call path: gfs2_setattr -> setattr_prepare -> ... -> cap_inode_killpriv -> ... -> gfs2_xattr_set the glock is locked in gfs2_setattr, so check for recursive locking in gfs2_xattr_set as gfs2_xattr_get already does. While at it, get rid of need_unlock in gfs2_xattr_get. Fixes xfstest generic/093. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Acked-by: Abhijith Das <adas@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * gfs2: Add support for statx inode flagsAndreas Gruenbacher2017-10-311-0/+14
| | | | | | | | | | | | | | | | | | | | | | Add support for the STATX_ATTR_ flags in statx. (Compression, encryption, and the nodump flag are not supported by gfs2.) Partially fixes xfstest generic/424. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Andrew Price <anprice@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * gfs2: Fix and clean up {GET,SET}FLAGS ioctlAndreas Gruenbacher2017-10-311-55/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Switch to a simple array for mapping between the FS_*_FL and GFS_DIF_* flags. Clarify how the mapping between FS_JOURNAL_DATA_FL and the filesystem flags works. The GFS2_DIF_SYSTEM flag cannot be set from user space, so remove it from GFS2_FLAGS_USER_SET. Fail with -EINVAL when trying to set flags that are not supported instead of silently ignoring those flags. Partially fixes xfstest generic/424. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Andrew Price <anprice@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * gfs2: Fix a harmless typoAndreas Gruenbacher2017-10-311-1/+1
| | | | | | | | | | Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * gfs2: Fix xattr fsyncAndreas Gruenbacher2017-10-311-32/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make sure that changing xattrs marks the corresponding inode dirty so that a subsequent fsync will sync those changes to disk. We set I_DIRTY_SYNC as well as I_DIRTY_DATASYNC so that both fsync and fdatasync will sync xattr changes: xattrs can contain information critical to how the data can be accessed, so we don't want fdatasync to skip them. Fixes xfstest generic/066. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Andrew Price <anprice@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * GFS2: Take inode off order_write list when setting jdata flagBob Peterson2017-10-311-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a deadlock caused when the jdata flag is set for inodes that are already on the ordered write list. Since it is on the ordered write list, log_flush calls gfs2_ordered_write which calls filemap_fdatawrite. But since the inode had the jdata flag set, that calls gfs2_jdata_writepages, which tries to start a new transaction. A new transaction cannot be started because it tries to acquire the log_flush rwsem which is already locked by the log flush operation. The bottom line is: We cannot switch an inode from ordered to jdata until we eliminate any ordered data pages (via log flush) or any log_flush operation afterward will create the circular dependency above. So we need to flush the log before setting the diskflags to switch the file mode, then we need to remove the inode from the ordered writes list. Before this patch, the log flush was done for jdata->ordered, but that's wrong. If we're going from jdata to ordered, we don't need to call gfs2_log_flush because the call to filemap_fdatawrite will do it for us: filemap_fdatawrite() -> __filemap_fdatawrite_range() __filemap_fdatawrite_range() -> do_writepages() do_writepages() -> gfs2_jdata_writepages() gfs2_jdata_writepages() -> gfs2_log_flush() This patch modifies function do_gfs2_set_flags so that if a file has its jdata flag set, and it's already on the ordered write list, the log will be flushed and it will be removed from the list before setting the flag. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Acked-by: Abhijith Das <adas@redhat.com>
| * GFS2: flush the log and all pages for jdata as we do for WB_SYNC_ALLBob Peterson2017-10-311-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In function gfs2_write_inode, starting with patch a9185b41a4f84, we only flush the log and call filemap_fdatawait if we're passed in a wbc sync_mode of WB_SYNC_ALL. We also need to do these things if we're evicting a jdata inode, because we might have jdata pages still attached to bufdata descriptors that need to be revoked, but by the time it gets to evict() it's too late to start a new transaction. This patch changes it to treat jdata inodes as if WB_SYNC_ALL had been specified. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Acked-by: Abhijith Das <adas@redhat.com>
| * gfs2: Implement SEEK_HOLE / SEEK_DATA via iomapAndreas Gruenbacher2017-10-313-3/+54
| | | | | | | | | | | | | | So far, lseek on gfs2 did not report holes. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * GFS2: Switch fiemap implementation to use iomapBob Peterson2017-10-312-25/+10
| | | | | | | | | | | | | | | | This patch switches GFS2's implementation of fiemap from the old block_map code to the new iomap interface. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| * GFS2: Implement iomap for block_mapBob Peterson2017-10-313-68/+274
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements iomap for block mapping, and switches the block_map function to use it under the covers. The additional IOMAP_F_BOUNDARY iomap flag indicates when iomap has reached a "metadata boundary" and fetching the next mapping is likely to incur an additional I/O. This flag is used for setting the bh buffer boundary flag. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| * GFS2: Make height info part of metapathBob Peterson2017-10-311-47/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch eliminates height parameters from function gfs2_bmap_alloc. Function find_metapath determines the metapath's "find height", also known as the desired height. Function lookup_metapath determines the metapath's "actual height", previously known as starting height or sheight. Function gfs2_bmap_alloc now gets both height values from the metapath. This simplification was done as a step toward switching the block_map functions to using iomap. The bh_map responsibilities are also removed from function gfs2_bmap_alloc for the same reason. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| * Merge branch 'master' of ↵Andreas Gruenbacher2017-10-166-8/+8
| |\ | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
| * | gfs2: Always update inode ctime in set_aclAndreas Gruenbacher2017-09-251-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Three-entry POSIX ACLs can be stored in the file mode permission bits, with no need to store them in extended attributes. When a process sets such a minimal ACL, the kernel updates the file mode like chmod does, and removes any existing extended attributes for that ACL. Make sure the ctime is always updated in that case. Fixes xfstest generic/307. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * | gfs2: Support negative atimesAndreas Gruenbacher2017-09-251-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When inodes are read from disk, GFS2 will only update in-memory atimes older than the on-disk atimes; this prevents atimes from going backwards. The atimes of newly allocated inodes are initialized to 0. This means that when an atime is explicitly set to a negative value, this value will not persist. Fix by setting the atime of newly allocated inodes to the lowest possible value instead of 0. Fixes xfstest generic/258. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * | gfs2: Update ctime in setflags ioctlAndreas Gruenbacher2017-09-251-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | The FS_IOC_SETFLAGS ioctl is supposed to update the inode ctime. Fixes xfstests generic/277. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * | gfs2: Clarify gfs2_block_mapAndreas Gruenbacher2017-09-251-4/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | Add a comment about the logical block size for directories. Rename "bsize" in gfs2_block_map to "factor". Fix a typo in the description of metaptr1. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
* | | License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman2017-11-022-0/+2
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | Merge tag 'gfs2-for-linus-4.14-rc3' of ↵Linus Torvalds2017-09-251-9/+5
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 fix from Bob Peterson: "GFS2: Fix an old regression in GFS2's debugfs interface This fixes a regression introduced by commit 88ffbf3e037e ("GFS2: Use resizable hash table for glocks"). The regression caused the glock dump in debugfs to not report all the glocks, which makes debugging extremely difficult" * tag 'gfs2-for-linus-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: gfs2: Fix debugfs glocks dump
| * gfs2: Fix debugfs glocks dumpAndreas Gruenbacher2017-09-251-9/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | The switch to rhashtables (commit 88ffbf3e03) broke the debugfs glock dump (/sys/kernel/debug/gfs2/<device>/glocks) for dumps bigger than a single buffer: the right function for restarting an rhashtable iteration from the beginning of the hash table is rhashtable_walk_enter; rhashtable_walk_stop + rhashtable_walk_start will just resume from the current position. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com> Cc: stable@vger.kernel.org # v4.3+
* | Merge branch 'work.mount' of ↵Linus Torvalds2017-09-146-8/+8
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull mount flag updates from Al Viro: "Another chunk of fmount preparations from dhowells; only trivial conflicts for that part. It separates MS_... bits (very grotty mount(2) ABI) from the struct super_block ->s_flags (kernel-internal, only a small subset of MS_... stuff). This does *not* convert the filesystems to new constants; only the infrastructure is done here. The next step in that series is where the conflicts would be; that's the conversion of filesystems. It's purely mechanical and it's better done after the merge, so if you could run something like list=$(for i in MS_RDONLY MS_NOSUID MS_NODEV MS_NOEXEC MS_SYNCHRONOUS MS_MANDLOCK MS_DIRSYNC MS_NOATIME MS_NODIRATIME MS_SILENT MS_POSIXACL MS_KERNMOUNT MS_I_VERSION MS_LAZYTIME; do git grep -l $i fs drivers/staging/lustre drivers/mtd ipc mm include/linux; done|sort|uniq|grep -v '^fs/namespace.c$') sed -i -e 's/\<MS_RDONLY\>/SB_RDONLY/g' \ -e 's/\<MS_NOSUID\>/SB_NOSUID/g' \ -e 's/\<MS_NODEV\>/SB_NODEV/g' \ -e 's/\<MS_NOEXEC\>/SB_NOEXEC/g' \ -e 's/\<MS_SYNCHRONOUS\>/SB_SYNCHRONOUS/g' \ -e 's/\<MS_MANDLOCK\>/SB_MANDLOCK/g' \ -e 's/\<MS_DIRSYNC\>/SB_DIRSYNC/g' \ -e 's/\<MS_NOATIME\>/SB_NOATIME/g' \ -e 's/\<MS_NODIRATIME\>/SB_NODIRATIME/g' \ -e 's/\<MS_SILENT\>/SB_SILENT/g' \ -e 's/\<MS_POSIXACL\>/SB_POSIXACL/g' \ -e 's/\<MS_KERNMOUNT\>/SB_KERNMOUNT/g' \ -e 's/\<MS_I_VERSION\>/SB_I_VERSION/g' \ -e 's/\<MS_LAZYTIME\>/SB_LAZYTIME/g' \ $list and commit it with something along the lines of 'convert filesystems away from use of MS_... constants' as commit message, it would save a quite a bit of headache next cycle" * 'work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: VFS: Differentiate mount flags (MS_*) from internal superblock flags VFS: Convert sb->s_flags & MS_RDONLY to sb_rdonly(sb) vfs: Add sb_rdonly(sb) to query the MS_RDONLY flag on s_flags
| * VFS: Convert sb->s_flags & MS_RDONLY to sb_rdonly(sb)David Howells2017-07-176-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Firstly by applying the following with coccinelle's spatch: @@ expression SB; @@ -SB->s_flags & MS_RDONLY +sb_rdonly(SB) to effect the conversion to sb_rdonly(sb), then by applying: @@ expression A, SB; @@ ( -(!sb_rdonly(SB)) && A +!sb_rdonly(SB) && A | -A != (sb_rdonly(SB)) +A != sb_rdonly(SB) | -A == (sb_rdonly(SB)) +A == sb_rdonly(SB) | -!(sb_rdonly(SB)) +!sb_rdonly(SB) | -A && (sb_rdonly(SB)) +A && sb_rdonly(SB) | -A || (sb_rdonly(SB)) +A || sb_rdonly(SB) | -(sb_rdonly(SB)) != A +sb_rdonly(SB) != A | -(sb_rdonly(SB)) == A +sb_rdonly(SB) == A | -(sb_rdonly(SB)) && A +sb_rdonly(SB) && A | -(sb_rdonly(SB)) || A +sb_rdonly(SB) || A ) @@ expression A, B, SB; @@ ( -(sb_rdonly(SB)) ? 1 : 0 +sb_rdonly(SB) | -(sb_rdonly(SB)) ? A : B +sb_rdonly(SB) ? A : B ) to remove left over excess bracketage and finally by applying: @@ expression A, SB; @@ ( -(A & MS_RDONLY) != sb_rdonly(SB) +(bool)(A & MS_RDONLY) != sb_rdonly(SB) | -(A & MS_RDONLY) == sb_rdonly(SB) +(bool)(A & MS_RDONLY) == sb_rdonly(SB) ) to make comparisons against the result of sb_rdonly() (which is a bool) work correctly. Signed-off-by: David Howells <dhowells@redhat.com>
* | Merge branch 'for-4.14/block' of git://git.kernel.dk/linux-blockLinus Torvalds2017-09-073-3/+3
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block layer updates from Jens Axboe: "This is the first pull request for 4.14, containing most of the code changes. It's a quiet series this round, which I think we needed after the churn of the last few series. This contains: - Fix for a registration race in loop, from Anton Volkov. - Overflow complaint fix from Arnd for DAC960. - Series of drbd changes from the usual suspects. - Conversion of the stec/skd driver to blk-mq. From Bart. - A few BFQ improvements/fixes from Paolo. - CFQ improvement from Ritesh, allowing idling for group idle. - A few fixes found by Dan's smatch, courtesy of Dan. - A warning fixup for a race between changing the IO scheduler and device remova. From David Jeffery. - A few nbd fixes from Josef. - Support for cgroup info in blktrace, from Shaohua. - Also from Shaohua, new features in the null_blk driver to allow it to actually hold data, among other things. - Various corner cases and error handling fixes from Weiping Zhang. - Improvements to the IO stats tracking for blk-mq from me. Can drastically improve performance for fast devices and/or big machines. - Series from Christoph removing bi_bdev as being needed for IO submission, in preparation for nvme multipathing code. - Series from Bart, including various cleanups and fixes for switch fall through case complaints" * 'for-4.14/block' of git://git.kernel.dk/linux-block: (162 commits) kernfs: checking for IS_ERR() instead of NULL drbd: remove BIOSET_NEED_RESCUER flag from drbd_{md_,}io_bio_set drbd: Fix allyesconfig build, fix recent commit drbd: switch from kmalloc() to kmalloc_array() drbd: abort drbd_start_resync if there is no connection drbd: move global variables to drbd namespace and make some static drbd: rename "usermode_helper" to "drbd_usermode_helper" drbd: fix race between handshake and admin disconnect/down drbd: fix potential deadlock when trying to detach during handshake drbd: A single dot should be put into a sequence. drbd: fix rmmod cleanup, remove _all_ debugfs entries drbd: Use setup_timer() instead of init_timer() to simplify the code. drbd: fix potential get_ldev/put_ldev refcount imbalance during attach drbd: new disk-option disable-write-same drbd: Fix resource role for newly created resources in events2 drbd: mark symbols static where possible drbd: Send P_NEG_ACK upon write error in protocol != C drbd: add explicit plugging when submitting batches drbd: change list_for_each_safe to while(list_first_entry_or_null) drbd: introduce drbd_recv_header_maybe_unplug ...
| * | block: replace bi_bdev with a gendisk pointer and partitions indexChristoph Hellwig2017-08-233-3/+3
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This way we don't need a block_device structure to submit I/O. The block_device has different life time rules from the gendisk and request_queue and is usually only available when the block device node is open. Other callers need to explicitly create one (e.g. the lightnvm passthrough code, or the new nvme multipathing code). For the actual I/O path all that we need is the gendisk, which exists once per block device. But given that the block layer also does partition remapping we additionally need a partition index, which is used for said remapping in generic_make_request. Note that all the block drivers generally want request_queue or sometimes the gendisk, so this removes a layer of indirection all over the stack. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* | Merge tag 'wberr-v4.14-1' of ↵Linus Torvalds2017-09-061-2/+4
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux Pull writeback error handling updates from Jeff Layton: "This pile continues the work from last cycle on better tracking writeback errors. In v4.13 we added some basic errseq_t infrastructure and converted a few filesystems to use it. This set continues refining that infrastructure, adds documentation, and converts most of the other filesystems to use it. The main exception at this point is the NFS client" * tag 'wberr-v4.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux: ecryptfs: convert to file_write_and_wait in ->fsync mm: remove optimizations based on i_size in mapping writeback waits fs: convert a pile of fsync routines to errseq_t based reporting gfs2: convert to errseq_t based writeback error reporting for fsync fs: convert sync_file_range to use errseq_t based error-tracking mm: add file_fdatawait_range and file_write_and_wait fuse: convert to errseq_t based error tracking for fsync mm: consolidate dax / non-dax checks for writeback Documentation: add some docs for errseq_t errseq: rename __errseq_set to errseq_set
| * | gfs2: convert to errseq_t based writeback error reporting for fsyncJeff Layton2017-08-011-2/+4
| |/ | | | | | | | | | | | | Also, fix a place where a writeback error might get dropped in the gfs2_is_jdata case. Signed-off-by: Jeff Layton <jlayton@redhat.com>
* | Merge tag 'gfs2-4.14.fixes' of ↵Linus Torvalds2017-09-0620-110/+321
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull GFS2 updates from Bob Peterson: "We've got a whopping 29 GFS2 patches for this merge window, mainly because we held some back from the previous merge window until we could get them perfected and well tested. We have a couple patch sets, including my patch set for protecting glock gl_object and Andreas Gruenbacher's patch set to fix the long-standing shrink- slab hang, plus a bunch of assorted bugs and cleanups. Summary: - I fixed a bug whereby an IO error would lead to a double-brelse. - Andreas Gruenbacher made a minor cleanup to call his relatively new function, gfs2_holder_initialized, rather than doing it manually. This was just missed by a previous patch set. - Jan Kara fixed a bug whereby the SGID was being cleared when inheriting ACLs. - Andreas found a bug and fixed it in his previous patch, "Get rid of flush_delayed_work in gfs2_evict_inode". A call to flush_delayed_work was deleted from *gfs2_inode_lookup and added to gfs2_create_inode. - Wang Xibo found and fixed a list_add call in inode_go_lock that specified the parameters in the wrong order. - Coly Li submitted a patch to add the REQ_PRIO to some of GFS2's metadata reads that were accidentally missing them. - I submitted a 4-patch set to protect the glock gl_object field. GFS2 was setting and checking gl_object with no locking mechanism, so the value was occasionally stomped on, which caused file system corruption. - I submitted a small cleanup to function gfs2_clear_rgrpd. It was needlessly adding rgrp glocks to the lru list, then pulling them back off immediately. The rgrp glocks don't use the lru list anyway, so doing so was just a waste of time. - I submitted a patch that checks the GLOF_LRU flag on a glock before trying to remove it from the lru_list. This avoids a lot of unnecessary spin_lock contention. - I submitted a patch to delete GFS2's debugfs files only after we evict all the glocks. Before this patch, GFS2 would delete the debugfs files, and if unmount hung waiting for a glock, there was no way to debug the problem. Now, if a hang occurs during umount, we can examine the debugfs files to figure out why it's hung. - Andreas Gruenbacher submitted a patch to fix some trivial typos. - Andreas also submitted a five-part patch set to fix the longstanding hang involving the slab shrinker: dlm requires memory, calls the inode shrinker, which calls gfs2's evict, which calls back into DLM before it can evict an inode. - Abhi Das submitted a patch to forcibly flush the active items list to relieve memory pressure. This fixes a long-standing bug whereby GFS2 was getting hung permanently in balance_dirty_pages. - Thomas Tai submitted a patch to fix a slab corruption problem due to a residual pointer left in the lock_dlm lockstruct. - I submitted a patch to withdraw the file system if IO errors are encountered while writing to the journals or statfs system file which were previously not being sent back up. Before, some IO errors were sometimes not be detected for several hours, and at recovery time, the journal errors made journal replay impossible. - Andreas has a patch to fix an annoying format-truncation compiler warning so GFS2 compiles cleanly. - I have a patch that fixes a handful of sparse compiler warnings. - Andreas fixed up an useless gl_object warning caused by an earlier patch. - Arvind Yadav added a patch to properly constify our rhashtable params declare. - I added a patch to fix a regression caused by the non-recursive delete and truncate patch that caused file system blocks to not be properly freed. - Ernesto A. Fernández added a patch to fix a place where GFS2 would send back the wrong return code setting extended attributes. - Ernesto also added a patch to fix a case in which GFS2 was improperly setting an inode's i_mode, potentially granting access to the wrong users" * tag 'gfs2-4.14.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: (29 commits) gfs2: preserve i_mode if __gfs2_set_acl() fails gfs2: don't return ENODATA in __gfs2_xattr_set unless replacing GFS2: Fix non-recursive truncate bug gfs2: constify rhashtable_params GFS2: Fix gl_object warnings GFS2: Fix up some sparse warnings gfs2: Silence gcc format-truncation warning GFS2: Withdraw for IO errors writing to the journal or statfs gfs2: fix slab corruption during mounting and umounting gfs file system gfs2: forcibly flush ail to relieve memory pressure gfs2: Clean up waiting on glocks gfs2: Defer deleting inodes under memory pressure gfs2: gfs2_evict_inode: Put glocks asynchronously gfs2: Get rid of gfs2_set_nlink gfs2: gfs2_glock_get: Wait on freeing glocks gfs2: Fix trivial typos GFS2: Delete debugfs files only after we evict the glocks GFS2: Don't waste time locking lru_lock for non-lru glocks GFS2: Don't bother trying to add rgrps to the lru list GFS2: Clear gl_object when deleting an inode in gfs2_delete_inode ...
| * gfs2: preserve i_mode if __gfs2_set_acl() failsErnesto A. Fernández2017-08-311-5/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When changing a file's acl mask, __gfs2_set_acl() will first set the group bits of i_mode to the value of the mask, and only then set the actual extended attribute representing the new acl. If the second part fails (due to lack of space, for example) and the file had no acl attribute to begin with, the system will from now on assume that the mask permission bits are actual group permission bits, potentially granting access to the wrong users. Prevent this by only changing the inode mode after the acl has been set. Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * gfs2: don't return ENODATA in __gfs2_xattr_set unless replacingErnesto A. Fernández2017-08-311-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function __gfs2_xattr_set() will return -ENODATA when called to remove a xattr that does not exist. The result is that setfacl will show an exit status of 1 when called to set only a file's mode bits (on a file with no ACLs), despite succeeding. A "No data available" error will be printed as well. To fix this return 0 instead, except when the XATTR_REPLACE flag is set, in which case -ENODATA is appropriate. This is consistent with how most other xattr setting functions work, in other filesystems. Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * GFS2: Fix non-recursive truncate bugBob Peterson2017-08-301-3/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before this patch if you truncated a file to a smaller size it wasn't freeing all the blocks properly. There are two reasons. First, the metapath comparison was not comparing previous heights. I added a function, mp_eq_to_hgt, which checks the metapath at all heights prior to the target height. Second, in function find_nonnull_ptr, it needed to zero out all pointers for heights following the target height. Translated into decimal integer terms, this way a number like 299, when incremented, becomes 300, not 399. The 2 gets incremented to 3, and the following digits need to be reset. These two things allow the truncate state machine to properly find the blocks it needs to delete. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * gfs2: constify rhashtable_paramsArvind Yadav2017-08-301-1/+1
| | | | | | | | | | | | | | | | | | | | rhashtable_params are not supposed to change at runtime. All Functions rhashtable_* working with const rhashtable_params provided by <linux/rhashtable.h>. So mark the non-const structs as const. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * GFS2: Fix gl_object warningsAndreas Gruenbacher2017-08-301-1/+1
| | | | | | | | | | | | | | | | The following cleanup is needed to avoid spilling the syslog with false warnings. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * GFS2: Fix up some sparse warningsBob Peterson2017-08-254-6/+9
| | | | | | | | | | | | | | | | | | This patch cleans up various pieces of GFS2 to avoid sparse errors. This doesn't fix them all, but it fixes several. The first error, in function glock_hash_walk was a genuine bug where the rhashtable could be started and not stopped. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * gfs2: Silence gcc format-truncation warningAndreas Gruenbacher2017-08-252-4/+4
| | | | | | | | | | | | | | | | | | Enlarge sd_fsname to be big enough for the longest long lock table name and an arbitrary journal number. This silences two -Wformat-truncation warnings with gcc 7.1.1. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * GFS2: Withdraw for IO errors writing to the journal or statfsBob Peterson2017-08-255-5/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before this patch, if GFS2 encountered IO errors while writing to the journal, it would not report the problem, so they would go unnoticed, sometimes for many hours. Sometimes this would only be noticed later, when recovery tried to do journal replay and failed due to invalid metadata at the blocks that resulted in IO errors. This patch makes GFS2's log daemon check for IO errors. If it encounters one, it withdraws from the file system and reports why in dmesg. A similar action is taken when IO errors occur when writing to the system statfs file. These errors are also reported back to any callers of fsync, since that requires the journal to be flushed. Therefore, any IO errors that would previously go unnoticed are now noticed and the file system is withdrawn as early as possible, thus preventing further file system damage. Also note that this reintroduces superblock variable sd_log_error, which Christoph removed with commit f729b66fca. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * gfs2: fix slab corruption during mounting and umounting gfs file systemThomas Tai2017-08-151-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When using cman-3.0.12.1 and gfs2-utils-3.0.12.1, mounting and unmounting GFS2 file system would cause kernel to hang. The slab allocator suggests that it is likely a double free memory corruption. The issue is traced back to v3.9-rc6 where a patch is submitted to use kzalloc() for storing a bitmap instead of using a local variable. The intention is to allocate memory during mount and to free memory during unmount. The original patch misses a code path which has already freed the memory and caused memory corruption. This patch sets the memory pointer to NULL after the memory is freed, so that double free memory corruption will not happen. gdlm_mount() '-- set_recover_size() which use kzalloc() '-- if dlm does not support ops callbacks then '--- free_recover_size() which use kfree() gldm_unmount() '-- free_recover_size() which use kfree() Previous patch which introduced the double free issue is commit 57c7310b8eb9 ("GFS2: use kmalloc for lvb bitmap") Signed-off-by: Thomas Tai <thomas.tai@oracle.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
| * gfs2: forcibly flush ail to relieve memory pressureAbhi Das2017-08-103-1/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On systems with low memory, it is possible for gfs2 to infinitely loop in balance_dirty_pages() under heavy IO (creating sparse files). balance_dirty_pages() attempts to write out the dirty pages via gfs2_writepages() but none are found because these dirty pages are being used by the journaling code in the ail. Normally, the journal has an upper threshold which when hit triggers an automatic flush of the ail. But this threshold can be higher than the number of allowable dirty pages and result in the ail never being flushed. This patch forces an ail flush when gfs2_writepages() fails to write anything. This is a good indication that the ail might be holding some dirty pages. Signed-off-by: Abhi Das <adas@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * gfs2: Clean up waiting on glocksAndreas Gruenbacher2017-08-101-20/+7
| | | | | | | | | | | | | | | | | | The prepare_to_wait_on_glock and finish_wait_on_glock functions introduced in commit 56a365be "gfs2: gfs2_glock_get: Wait on freeing glocks" are better removed, resulting in cleaner code. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * gfs2: Defer deleting inodes under memory pressureAndreas Gruenbacher2017-08-101-0/+21
| | | | | | | | | | | | | | | | | | When under memory pressure and an inode's link count has dropped to zero, defer deleting the inode to the delete workqueue. This avoids calling into DLM under memory pressure, which can deadlock. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * gfs2: gfs2_evict_inode: Put glocks asynchronouslyAndreas Gruenbacher2017-08-103-3/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | gfs2_evict_inode is called to free inodes under memory pressure. The function calls into DLM when an inode's last cluster-wide reference goes away (remote unlink) and to release the glock and associated DLM lock before finally destroying the inode. However, if DLM is blocked on memory to become available, calling into DLM again will deadlock. Avoid that by decoupling releasing glocks from destroying inodes in that case: with gfs2_glock_queue_put, glocks will be dequeued asynchronously in work queue context, when the associated inodes have likely already been destroyed. With this change, inodes can end up being unlinked, remote-unlink can be triggered, and then the inode can be reallocated before all remote-unlink callbacks are processed. To detect that, revalidate the link count in gfs2_evict_inode to make sure we're not deleting an allocated, referenced inode. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * gfs2: Get rid of gfs2_set_nlinkAndreas Gruenbacher2017-08-101-27/+1
| | | | | | | | | | | | | | | | | | | | | | | | Remove gfs2_set_nlink which prevents the link count of an inode from becoming non-zero once it has reached zero. The next commit reduces the amount of waiting on glocks when an inode is evicted from memory. With that, an inode can become reallocated before all the remote-unlink callbacks from a previous delete are processed, which causes the link count to change from zero to non-zero. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * gfs2: gfs2_glock_get: Wait on freeing glocksAndreas Gruenbacher2017-08-101-22/+104
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Keep glocks in their hash table until they are freed instead of removing them when their last reference is dropped. This allows to wait for any previous instances of a glock to go away in gfs2_glock_get before creating a new glocks. Special thanks to Andy Price for finding and fixing a problem which also required us to delete the rcu_read_unlock from the error case in function gfs2_glock_get. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * gfs2: Fix trivial typosAndreas Gruenbacher2017-08-092-2/+2
| | | | | | | | | | Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * GFS2: Delete debugfs files only after we evict the glocksBob Peterson2017-08-092-1/+1
| | | | | | | | | | | | | | | | This patch moves the call to gfs2_delete_debugfs_file so that it comes after the glock hash table has been cleared. This way we can query the debugfs files if umount hangs. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * GFS2: Don't waste time locking lru_lock for non-lru glocksBob Peterson2017-08-091-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | Before this patch, glock_dq would call gfs2_glock_remove_from_lru. For glocks that are never put on the LRU, such as the transaction glock, this just takes the spin_lock, determines there's nothing to be done because the list is empty, then unlocks again. This was causing unnecessary lock contention on the lru_lock spin_lock. This patch adds a check for GLOF_LRU in the glops before taking the spin_lock. Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| * GFS2: Don't bother trying to add rgrps to the lru listBob Peterson2017-08-091-1/+0
| | | | | | | | | | | | | | | | | | This patch removes a call to gfs2_glock_add_to_lru from function gfs2_clear_rgrpd. The call is just a waste of time because as soon as it adds it to the lru_list, the call to gfs2_glock_put takes it back off again. Signed-off-by: Bob Peterson <rpeterso@redhat.com>