linux.git - Linux kernel mainline tree

	Commit message (Collapse)	Author	Age	Files	Lines
*	kunit: Rename 'kunitconfig' to '.kunitconfig'	SeongJae Park	2019-12-23	3	-10/+8
\| \| \| \| \| \| \| \| \| \| \|	This commit renames 'kunitconfig' to '.kunitconfig' so that it can be automatically ignored by git and do not disturb people who want to type 'kernel/' by pressing only the 'k' and then 'tab' key. Signed-off-by: SeongJae Park <sjpark@amazon.de> Reviewed-by: Brendan Higgins <brendanhiggins@google.com> Tested-by: Brendan Higgins <brendanhiggins@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
*	kunit: Place 'test.log' under the 'build_dir'	SeongJae Park	2019-12-23	3	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \|	'kunit' writes the 'test.log' under the kernel source directory even though a 'build_dir' option is given. As users who use the option might expect the outputs to be placed under the specified directory, this commit modifies the logic to write the log file under the 'build_dir'. Signed-off-by: SeongJae Park <sjpark@amazon.de> Reviewed-by: Brendan Higgins <brendanhiggins@google.com> Tested-by: Brendan Higgins <brendanhiggins@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
*	kunit: Create default config in '--build_dir'	SeongJae Park	2019-12-23	2	-4/+11
\| \| \| \| \| \| \| \| \| \| \| \| \|	If both '--build_dir' and '--defconfig' are given, the handling of '--defconfig' ignores '--build_dir' option. This commit modifies the behavior to respect '--build_dir' option. Reported-by: Brendan Higgins <brendanhiggins@google.com> Suggested-by: Brendan Higgins <brendanhiggins@google.com> Signed-off-by: SeongJae Park <sjpark@amazon.de> Reviewed-by: Brendan Higgins <brendanhiggins@google.com> Tested-by: Brendan Higgins <brendanhiggins@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
*	kunit: Remove duplicated defconfig creation	SeongJae Park	2019-12-23	1	-3/+0
\| \| \| \| \| \| \| \| \| \| \|	'--defconfig' option is handled by the 'main() of the 'kunit.py' but again handled in following 'run_tests()'. This commit removes this duplicated handling of the option in the 'run_tests()'. Signed-off-by: SeongJae Park <sjpark@amazon.de> Reviewed-by: Brendan Higgins <brendanhiggins@google.com> Tested-by: Brendan Higgins <brendanhiggins@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
*	docs/kunit/start: Use in-tree 'kunit_defconfig'	SeongJae Park	2019-12-23	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \|	The kunit doc suggests users to get the default `kunitconfig` from an external git tree. However, the file is already located under the `arch/um/configs/` of the kernel tree. Because the local file is easier to access and maintain, this commit updates the doc to use it. Signed-off-by: SeongJae Park <sjpark@amazon.de> Reviewed-by: Brendan Higgins <brendanhiggins@google.com> Tested-by: Brendan Higgins <brendanhiggins@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
*	selftests: livepatch: Fix it to do root uid check and skip	Shuah Khan	2019-12-23	2	-3/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	livepatch test configures the system and debug environment to run tests. Some of these actions fail without root access and test dumps several permission denied messages before it exits. Fix test-state.sh to call setup_config instead of set_dynamic_debug as suggested by Petr Mladek <pmladek@suse.com> Fix it to check root uid and exit with skip code instead. Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Acked-by: Joe Lawrence <joe.lawrence@redhat.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
*	selftests: firmware: Fix it to do root uid check and skip	Shuah Khan	2019-12-23	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	firmware attempts to load test modules that require root access and fail. Fix it to check for root uid and exit with skip code instead. Before this fix: selftests: firmware: fw_run_tests.sh modprobe: ERROR: could not insert 'test_firmware': Operation not permitted You must have the following enabled in your kernel: CONFIG_TEST_FIRMWARE=y CONFIG_FW_LOADER=y CONFIG_FW_LOADER_USER_HELPER=y CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y not ok 1 selftests: firmware: fw_run_tests.sh # SKIP With this fix: selftests: firmware: fw_run_tests.sh skip all tests: must be run as root not ok 1 selftests: firmware: fw_run_tests.sh # SKIP Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Reviwed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
*	selftests: filesystems/epoll: fix build error	Shuah Khan	2019-12-23	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	epoll build fails to find pthread lib. Fix Makefile to use LDLIBS instead of LDFLAGS. LDLIBS is the right flag to use here with -l option when invoking ld. gcc -I../../../../../usr/include/ -lpthread epoll_wakeup_test.c -o .../tools/testing/selftests/filesystems/epoll/epoll_wakeup_test /usr/bin/ld: /tmp/ccaZvJUl.o: in function `kill_timeout': epoll_wakeup_test.c:(.text+0x4dd): undefined reference to `pthread_kill' /usr/bin/ld: epoll_wakeup_test.c:(.text+0x4f2): undefined reference to `pthread_kill' /usr/bin/ld: /tmp/ccaZvJUl.o: in function `epoll9': epoll_wakeup_test.c:(.text+0x6382): undefined reference to `pthread_create' /usr/bin/ld: epoll_wakeup_test.c:(.text+0x64d2): undefined reference to `pthread_create' /usr/bin/ld: epoll_wakeup_test.c:(.text+0x6626): undefined reference to `pthread_join' /usr/bin/ld: epoll_wakeup_test.c:(.text+0x684c): undefined reference to `pthread_tryjoin_np' /usr/bin/ld: epoll_wakeup_test.c:(.text+0x6864): undefined reference to `pthread_kill' /usr/bin/ld: epoll_wakeup_test.c:(.text+0x6878): undefined reference to `pthread_join' Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
*	Linux 5.5-rc3v5.5-rc3	Linus Torvalds	2019-12-22	1	-1/+1
\|
*	Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	Linus Torvalds	2019-12-22	6	-3/+18
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pull vfs fixes from Al Viro: "Eric's s_inodes softlockup fixes + Jan's fix for recent regression from pipe rework" * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs: call fsnotify_sb_delete after evict_inodes fs: avoid softlockups in s_inodes iterators pipe: Fix bogus dereference in iov_iter_alignment()
\| *	fs: call fsnotify_sb_delete after evict_inodes	Eric Sandeen	2019-12-18	2	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When a filesystem is unmounted, we currently call fsnotify_sb_delete() before evict_inodes(), which means that fsnotify_unmount_inodes() must iterate over all inodes on the superblock looking for any inodes with watches. This is inefficient and can lead to livelocks as it iterates over many unwatched inodes. At this point, SB_ACTIVE is gone and dropping refcount to zero kicks the inode out out immediately, so anything processed by fsnotify_sb_delete / fsnotify_unmount_inodes gets evicted in that loop. After that, the call to evict_inodes will evict everything else with a zero refcount. This should speed things up overall, and avoid livelocks in fsnotify_unmount_inodes(). Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| *	fs: avoid softlockups in s_inodes iterators	Eric Sandeen	2019-12-18	4	-1/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Anything that walks all inodes on sb->s_inodes list without rescheduling risks softlockups. Previous efforts were made in 2 functions, see: c27d82f fs/drop_caches.c: avoid softlockups in drop_pagecache_sb() ac05fbb inode: don't softlockup when evicting inodes but there hasn't been an audit of all walkers, so do that now. This also consistently moves the cond_resched() calls to the bottom of each loop in cases where it already exists. One loop remains: remove_dquot_ref(), because I'm not quite sure how to deal with that one w/o taking the i_lock. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
\| *	pipe: Fix bogus dereference in iov_iter_alignment()	Jan Kara	2019-12-16	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We cannot look at 'i->pipe' unless we know the iter is a pipe. Move the ring_size load to a branch in iov_iter_alignment() where we've already checked the iter is a pipe to avoid bogus dereference. Reported-by: syzbot+bea68382bae9490e7dd6@syzkaller.appspotmail.com Fixes: 8cefc107ca54 ("pipe: Use head and tail pointers for the ring, not cursor and length") Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* \|	Merge tag 'xfs-5.5-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux	Linus Torvalds	2019-12-22	13	-104/+341
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pull xfs fixes from Darrick Wong: "Fix a few bugs that could lead to corrupt files, fsck complaints, and filesystem crashes: - Minor documentation fixes - Fix a file corruption due to read racing with an insert range operation. - Fix log reservation overflows when allocating large rt extents - Fix a buffer log item flags check - Don't allow administrators to mount with sunit= options that will cause later xfs_repair complaints about the root directory being suspicious because the fs geometry appeared inconsistent - Fix a non-static helper that should have been static" * tag 'xfs-5.5-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: Make the symbol 'xfs_rtalloc_log_count' static xfs: don't commit sunit/swidth updates to disk if that would cause repair failures xfs: split the sunit parameter update into two parts xfs: refactor agfl length computation function libxfs: resync with the userspace libxfs xfs: use bitops interface for buf log item AIL flag check xfs: fix log reservation overflows when allocating large rt extents xfs: stabilize insert range start boundary to avoid COW writeback race xfs: fix Sphinx documentation warning
\| * \|	xfs: Make the symbol 'xfs_rtalloc_log_count' static	Chen Wandun	2019-12-20	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix the following sparse warning: fs/xfs/libxfs/xfs_trans_resv.c:206:1: warning: symbol 'xfs_rtalloc_log_count' was not declared. Should it be static? Fixes: b1de6fc7520f ("xfs: fix log reservation overflows when allocating large rt extents") Signed-off-by: Chen Wandun <chenwandun@huawei.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
\| * \|	xfs: don't commit sunit/swidth updates to disk if that would cause repair ↵	Darrick J. Wong	2019-12-19	4	-1/+130
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	failures Alex Lyakas reported[1] that mounting an xfs filesystem with new sunit and swidth values could cause xfs_repair to fail loudly. The problem here is that repair calculates the where mkfs should have allocated the root inode, based on the superblock geometry. The allocation decisions depend on sunit, which means that we really can't go updating sunit if it would lead to a subsequent repair failure on an otherwise correct filesystem. Port from xfs_repair some code that computes the location of the root inode and teach mount to skip the ondisk update if it would cause problems for repair. Along the way we'll update the documentation, provide a function for computing the minimum AGFL size instead of open-coding it, and cut down some indenting in the mount code. Note that we allow the mount to proceed (and new allocations will reflect this new geometry) because we've never screened this kind of thing before. We'll have to wait for a new future incompat feature to enforce correct behavior, alas. Note that the geometry reporting always uses the superblock values, not the incore ones, so that is what xfs_info and xfs_growfs will report. [1] https://lore.kernel.org/linux-xfs/20191125130744.GA44777@bfoster/T/#m00f9594b511e076e2fcdd489d78bc30216d72a7d Reported-by: Alex Lyakas <alex@zadara.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
\| * \|	xfs: split the sunit parameter update into two parts	Darrick J. Wong	2019-12-19	1	-51/+72
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the administrator provided a sunit= mount option, we need to validate the raw parameter, convert the mount option units (512b blocks) into the internal unit (fs blocks), and then validate that the (now cooked) parameter doesn't screw anything up on disk. The incore inode geometry computation can depend on the new sunit option, but a subsequent patch will make validating the cooked value depends on the computed inode geometry, so break the sunit update into two steps. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
\| * \|	xfs: refactor agfl length computation function	Darrick J. Wong	2019-12-19	1	-5/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Refactor xfs_alloc_min_freelist to accept a NULL @pag argument, in which case it returns the largest possible minimum length. This will be used in an upcoming patch to compute the length of the AGFL at mkfs time. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
\| * \|	libxfs: resync with the userspace libxfs	Darrick J. Wong	2019-12-19	4	-24/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Prepare to resync the userspace libxfs with the kernel libxfs. There were a few things I missed -- a couple of static inline directory functions that have to be exported for xfs_repair; a couple of directory naming functions that make porting much easier if they're /not/ static inline; and a u16 usage that should have been uint16_t. None of these things are bugs in their own right; this just makes porting xfsprogs easier. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com>
\| * \|	xfs: use bitops interface for buf log item AIL flag check	Brian Foster	2019-12-19	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The xfs_log_item flags were converted to atomic bitops as of commit 22525c17ed ("xfs: log item flags are racy"). The assert check for AIL presence in xfs_buf_item_relse() still uses the old value based check. This likely went unnoticed as XFS_LI_IN_AIL evaluates to 0 and causes the assert to unconditionally pass. Fix up the check. Signed-off-by: Brian Foster <bfoster@redhat.com> Fixes: 22525c17ed ("xfs: log item flags are racy") Reviewed-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
\| * \|	xfs: fix log reservation overflows when allocating large rt extents	Darrick J. Wong	2019-12-17	1	-19/+77
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Omar Sandoval reported that a 4G fallocate on the realtime device causes filesystem shutdowns due to a log reservation overflow that happens when we log the rtbitmap updates. Factor rtbitmap/rtsummary updates into the the tr_write and tr_itruncate log reservation calculation. "The following reproducer results in a transaction log overrun warning for me: mkfs.xfs -f -r rtdev=/dev/vdc -d rtinherit=1 -m reflink=0 /dev/vdb mount -o rtdev=/dev/vdc /dev/vdb /mnt fallocate -l 4G /mnt/foo Reported-by: Omar Sandoval <osandov@osandov.com> Tested-by: Omar Sandoval <osandov@osandov.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
\| * \|	xfs: stabilize insert range start boundary to avoid COW writeback race	Brian Foster	2019-12-11	2	-2/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	generic/522 (fsx) occasionally fails with a file corruption due to an insert range operation. The primary characteristic of the corruption is a misplaced insert range operation that differs from the requested target offset. The reason for this behavior is a race between the extent shift sequence of an insert range and a COW writeback completion that causes a front merge with the first extent in the shift. The shift preparation function flushes and unmaps from the target offset of the operation to the end of the file to ensure no modifications can be made and page cache is invalidated before file data is shifted. An insert range operation then splits the extent at the target offset, if necessary, and begins to shift the start offset of each extent starting from the end of the file to the start offset. The shift sequence operates at extent level and so depends on the preparation sequence to guarantee no changes can be made to the target range during the shift. If the block immediately prior to the target offset was dirty and shared, however, it can undergo writeback and move from the COW fork to the data fork at any point during the shift. If the block is contiguous with the block at the start offset of the insert range, it can front merge and alter the start offset of the extent. Once the shift sequence reaches the target offset, it shifts based on the latest start offset and silently changes the target offset of the operation and corrupts the file. To address this problem, update the shift preparation code to stabilize the start boundary along with the full range of the insert. Also update the existing corruption check to fail if any extent is shifted with a start offset behind the target offset of the insert range. This prevents insert from racing with COW writeback completion and fails loudly in the event of an unexpected extent shift. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
\| * \|	xfs: fix Sphinx documentation warning	Randy Dunlap	2019-12-11	1	-1/+1
\| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix Sphinx documentation format warning by not indenting so much. Documentation/admin-guide/xfs.rst:257: WARNING: Block quote ends without a blank line; unexpected unindent. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: linux-xfs@vger.kernel.org Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
* \|	Merge tag 'ext4_for_linus_stable' of ↵	Linus Torvalds	2019-12-22	8	-104/+116
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 bug fixes from Ted Ts'o: "Ext4 bug fixes, including a regression fix" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: clarify impact of 'commit' mount option ext4: fix unused-but-set-variable warning in ext4_add_entry() jbd2: fix kernel-doc notation warning ext4: use RCU API in debug_print_tree ext4: validate the debug_want_extra_isize mount option at parse time ext4: reserve revoke credits in __ext4_new_inode ext4: unlock on error in ext4_expand_extra_isize() ext4: optimize __ext4_check_dir_entry() ext4: check for directory entries too close to block end ext4: fix ext4_empty_dir() for directories with holes
\| * \|	ext4: clarify impact of 'commit' mount option	Jan Kara	2019-12-21	1	-8/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The description of 'commit' mount option dates back to ext3 times. Update the description to match current meaning for ext4. Reported-by: Paul Richards <paul.richards@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20191218111210.14161-1-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu>
\| * \|	ext4: fix unused-but-set-variable warning in ext4_add_entry()	Yunfeng Ye	2019-12-21	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Warning is found when compile with "-Wunused-but-set-variable": fs/ext4/namei.c: In function ‘ext4_add_entry’: fs/ext4/namei.c:2167:23: warning: variable ‘sbi’ set but not used [-Wunused-but-set-variable] struct ext4_sb_info *sbi; ^~~ Fix this by moving the variable @sbi under CONFIG_UNICODE. Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com> Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com> Link: https://lore.kernel.org/r/cb5eb904-224a-9701-c38f-cb23514b1fff@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
\| * \|	jbd2: fix kernel-doc notation warning	Randy Dunlap	2019-12-15	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix kernel-doc warning by inserting a beginning '*' character for the kernel-doc line. ../include/linux/jbd2.h:461: warning: bad line: journal. These are dirty buffers and revoke descriptor blocks. Link: https://lore.kernel.org/r/53e3ce27-ceae-560d-0fd4-f95728a33e12@infradead.org Cc: stable@kernel.org Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
\| * \|	ext4: use RCU API in debug_print_tree	Phong Tran	2019-12-15	1	-1/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	struct ext4_sb_info.system_blks was marked __rcu. But access the pointer without using RCU lock and dereference. Sparse warning with __rcu notation: block_validity.c:139:29: warning: incorrect type in argument 1 (different address spaces) block_validity.c:139:29: expected struct rb_root const * block_validity.c:139:29: got struct rb_root [noderef] <asn:4> * Link: https://lore.kernel.org/r/20191213153306.30744-1-tranmanphong@gmail.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Phong Tran <tranmanphong@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
\| * \|	ext4: validate the debug_want_extra_isize mount option at parse time	Theodore Ts'o	2019-12-15	1	-74/+69
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instead of setting s_want_extra_size and then making sure that it is a valid value afterwards, validate the field before we set it. This avoids races and other problems when remounting the file system. Link: https://lore.kernel.org/r/20191215063020.GA11512@mit.edu Cc: stable@kernel.org Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reported-and-tested-by: syzbot+4a39a025912b265cacef@syzkaller.appspotmail.com
\| * \|	ext4: reserve revoke credits in __ext4_new_inode	yangerkun	2019-12-14	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It's possible that __ext4_new_inode will release the xattr block, so it will trigger a warning since there is revoke credits will be 0 if the handle == NULL. The below scripts can reproduce it easily. ------------[ cut here ]------------ WARNING: CPU: 0 PID: 3861 at fs/jbd2/revoke.c:374 jbd2_journal_revoke+0x30e/0x540 fs/jbd2/revoke.c:374 ... __ext4_forget+0x1d7/0x800 fs/ext4/ext4_jbd2.c:248 ext4_free_blocks+0x213/0x1d60 fs/ext4/mballoc.c:4743 ext4_xattr_release_block+0x55b/0x780 fs/ext4/xattr.c:1254 ext4_xattr_block_set+0x1c2c/0x2c40 fs/ext4/xattr.c:2112 ext4_xattr_set_handle+0xa7e/0x1090 fs/ext4/xattr.c:2384 __ext4_set_acl+0x54d/0x6c0 fs/ext4/acl.c:214 ext4_init_acl+0x218/0x2e0 fs/ext4/acl.c:293 __ext4_new_inode+0x352a/0x42b0 fs/ext4/ialloc.c:1151 ext4_mkdir+0x2e9/0xbd0 fs/ext4/namei.c:2774 vfs_mkdir+0x386/0x5f0 fs/namei.c:3811 do_mkdirat+0x11c/0x210 fs/namei.c:3834 do_syscall_64+0xa1/0x530 arch/x86/entry/common.c:294 ... ------------------------------------- scripts: mkfs.ext4 /dev/vdb mount /dev/vdb /mnt cd /mnt && mkdir dir && for i in {1..8}; do setfacl -dm "u:user_"$i":rx" dir; done mkdir dir/dir1 && mv dir/dir1 ./ sh repro.sh && add some user [root@localhost ~]# cat repro.sh while [ 1 -eq 1 ]; do rm -rf dir rm -rf dir1/dir1 mkdir dir for i in {1..8}; do setfacl -dm "u:test"$i":rx" dir; done setfacl -m "u:user_9:rx" dir & mkdir dir1/dir1 & done Before exec repro.sh, dir1 has inherit the default acl from dir, and xattr block of dir1 dir is not the same, so the h_refcount of these two dir's xattr block will be 1. Then repro.sh can trigger the warning with the situation show as below. The last h_refcount can be clear with mkdir, and __ext4_new_inode has not reserved revoke credits, so the warning will happened, fix it by reserve revoke credits in __ext4_new_inode. Thread 1 Thread 2 mkdir dir set default acl(will create a xattr block blk1 and the refcount of ext4_xattr_header will be 1) ... mkdir dir1/dir1 ->....->ext4_init_acl ->__ext4_set_acl(set default acl, will reuse blk1, and h_refcount will be 2) setfacl->ext4_set_acl->... ->ext4_xattr_block_set(will create new block blk2 to store xattr) ->__ext4_set_acl(set access acl, since h_refcount of blk1 is 2, will create blk3 to store xattr) ->ext4_xattr_release_block(dec h_refcount of blk1 to 1) ->ext4_xattr_release_block(dec h_refcount and since it is 0, will release the block and trigger the warning) Link: https://lore.kernel.org/r/20191213014900.47228-1-yangerkun@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: yangerkun <yangerkun@huawei.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
\| * \|	ext4: unlock on error in ext4_expand_extra_isize()	Dan Carpenter	2019-12-14	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We need to unlock the xattr before returning on this error path. Cc: stable@kernel.org # 4.13 Fixes: c03b45b853f5 ("ext4, project: expand inode extra size if possible") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/20191213185010.6k7yl2tck3wlsdkt@kili.mountain Signed-off-by: Theodore Ts'o <tytso@mit.edu>
\| * \|	ext4: optimize __ext4_check_dir_entry()	Theodore Ts'o	2019-12-14	1	-5/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make __ext4_check_dir_entry() a bit easier to understand, and reduce the object size of the function by over 11%. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/20191209004346.38526-1-tytso@mit.edu Signed-off-by: Theodore Ts'o <tytso@mit.edu>
\| * \|	ext4: check for directory entries too close to block end	Jan Kara	2019-12-14	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ext4_check_dir_entry() currently does not catch a case when a directory entry ends so close to the block end that the header of the next directory entry would not fit in the remaining space. This can lead to directory iteration code trying to access address beyond end of current buffer head leading to oops. CC: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20191202170213.4761-3-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu>
\| * \|	ext4: fix ext4_empty_dir() for directories with holes	Jan Kara	2019-12-14	1	-14/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Function ext4_empty_dir() doesn't correctly handle directories with holes and crashes on bh->b_data dereference when bh is NULL. Reorganize the loop to use 'offset' variable all the times instead of comparing pointers to current direntry with bh->b_data pointer. Also add more strict checking of '.' and '..' directory entries to avoid entering loop in possibly invalid state on corrupted filesystems. References: CVE-2019-19037 CC: stable@vger.kernel.org Fixes: 4e19d6b65fb4 ("ext4: allow directory holes") Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20191202170213.4761-2-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* \| \|	Merge tag 'block-5.5-20191221' of git://git.kernel.dk/linux-block	Linus Torvalds	2019-12-22	12	-39/+37
\|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pull block fixes from Jens Axboe: "Let's try this one again, this time without the compat_ioctl changes. We've got those fixed up, but that can go out next week. This contains: - block queue flush lockdep annotation (Bart) - Type fix for bsg_queue_rq() (Bart) - Three dasd fixes (Stefan, Jan) - nbd deadlock fix (Mike) - Error handling bio user map fix (Yang) - iocost fix (Tejun) - sbitmap waitqueue addition fix that affects the kyber IO scheduler (David)" * tag 'block-5.5-20191221' of git://git.kernel.dk/linux-block: sbitmap: only queue kyber's wait callback if not already active block: fix memleak when __blk_rq_map_user_iov() is failed s390/dasd: fix typo in copyright statement s390/dasd: fix memleak in path handling error case s390/dasd/cio: Interpret ccw_device_get_mdc return value correctly block: Fix a lockdep complaint triggered by request queue flushing block: Fix the type of 'sts' in bsg_queue_rq() block: end bio with BLK_STS_AGAIN in case of non-mq devs and REQ_NOWAIT nbd: fix shutdown and recv work deadlock v2 iocost: over-budget forced IOs should schedule async delay
\| * \| \|	sbitmap: only queue kyber's wait callback if not already active	David Jeffery	2019-12-20	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Under heavy loads where the kyber I/O scheduler hits the token limits for its scheduling domains, kyber can become stuck. When active requests complete, kyber may not be woken up leaving the I/O requests in kyber stuck. This stuck state is due to a race condition with kyber and the sbitmap functions it uses to run a callback when enough requests have completed. The running of a sbt_wait callback can race with the attempt to insert the sbt_wait. Since sbitmap_del_wait_queue removes the sbt_wait from the list first then sets the sbq field to NULL, kyber can see the item as not on a list but the call to sbitmap_add_wait_queue will see sbq as non-NULL. This results in the sbt_wait being inserted onto the wait list but ws_active doesn't get incremented. So the sbitmap queue does not know there is a waiter on a wait list. Since sbitmap doesn't think there is a waiter, kyber may never be informed that there are domain tokens available and the I/O never advances. With the sbt_wait on a wait list, kyber believes it has an active waiter so cannot insert a new waiter when reaching the domain's full state. This race can be fixed by only adding the sbt_wait to the queue if the sbq field is NULL. If sbq is not NULL, there is already an action active which will trigger the re-running of kyber. Let it run and add the sbt_wait to the wait list if still needing to wait. Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: David Jeffery <djeffery@redhat.com> Reported-by: John Pittman <jpittman@redhat.com> Tested-by: John Pittman <jpittman@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
\| * \| \|	block: fix memleak when __blk_rq_map_user_iov() is failed	Yang Yingliang	2019-12-20	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When I doing fuzzy test, get the memleak report: BUG: memory leak unreferenced object 0xffff88837af80000 (size 4096): comm "memleak", pid 3557, jiffies 4294817681 (age 112.499s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 20 00 00 00 10 01 00 00 00 00 00 00 01 00 00 00 ............... backtrace: [<000000001c894df8>] bio_alloc_bioset+0x393/0x590 [<000000008b139a3c>] bio_copy_user_iov+0x300/0xcd0 [<00000000a998bd8c>] blk_rq_map_user_iov+0x2f1/0x5f0 [<000000005ceb7f05>] blk_rq_map_user+0xf2/0x160 [<000000006454da92>] sg_common_write.isra.21+0x1094/0x1870 [<00000000064bb208>] sg_write.part.25+0x5d9/0x950 [<000000004fc670f6>] sg_write+0x5f/0x8c [<00000000b0d05c7b>] __vfs_write+0x7c/0x100 [<000000008e177714>] vfs_write+0x1c3/0x500 [<0000000087d23f34>] ksys_write+0xf9/0x200 [<000000002c8dbc9d>] do_syscall_64+0x9f/0x4f0 [<00000000678d8e9a>] entry_SYSCALL_64_after_hwframe+0x49/0xbe If __blk_rq_map_user_iov() is failed in blk_rq_map_user_iov(), the bio(s) which is allocated before this failing will leak. The refcount of the bio(s) is init to 1 and increased to 2 by calling bio_get(), but __blk_rq_unmap_user() only decrease it to 1, so the bio cannot be freed. Fix it by calling blk_rq_unmap_user(). Reviewed-by: Bob Liu <bob.liu@oracle.com> Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
\| * \| \|	s390/dasd: fix typo in copyright statement	Stefan Haberland	2019-12-20	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	coypright -> copyright Reported-by: Kate Stewart <kstewart@linuxfoundation.org> Signed-off-by: Stefan Haberland <sth@linux.ibm.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
\| * \| \|	s390/dasd: fix memleak in path handling error case	Stefan Haberland	2019-12-20	1	-17/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If for whatever reason the dasd_eckd_check_characteristics() function exits after at least some paths have their configuration data allocated those data is never freed again. In the error case the device->private pointer is set to NULL and dasd_eckd_uncheck_device() will exit without freeing the path data because of this NULL pointer. Fix by calling dasd_eckd_clear_conf_data() for error cases. Also use dasd_eckd_clear_conf_data() in dasd_eckd_uncheck_device() to avoid code duplication. Reported-by: Qian Cai <cai@lca.pw> Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com> Signed-off-by: Stefan Haberland <sth@linux.ibm.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
\| * \| \|	s390/dasd/cio: Interpret ccw_device_get_mdc return value correctly	Jan Höppner	2019-12-20	2	-5/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The max data count (mdc) is an unsigned 16-bit integer value as per AR documentation and is received via ccw_device_get_mdc() for a specific path mask from the CIO layer. The function itself also always returns a positive mdc value or 0 in case mdc isn't supported or couldn't be determined. Though, the comment for this function describes a negative return value to indicate failures. As a result, the DASD device driver interprets the return value of ccw_device_get_mdc() incorrectly. The error case is essentially a dead code path. To fix this behaviour, check explicitly for a return value of 0 and change the comment for ccw_device_get_mdc() accordingly. This fix merely enables the error code path in the DASD functions get_fcx_max_data() and verify_fcx_max_data(). The actual functionality stays the same and is still correct. Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Jan Höppner <hoeppner@linux.ibm.com> Acked-by: Peter Oberparleiter <oberpar@linux.ibm.com> Reviewed-by: Stefan Haberland <sth@linux.ibm.com> Signed-off-by: Stefan Haberland <sth@linux.ibm.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
\| * \| \|	block: Fix a lockdep complaint triggered by request queue flushing	Bart Van Assche	2019-12-20	2	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Avoid that running test nvme/012 from the blktests suite triggers the following false positive lockdep complaint: ============================================ WARNING: possible recursive locking detected 5.0.0-rc3-xfstests-00015-g1236f7d60242 #841 Not tainted -------------------------------------------- ksoftirqd/1/16 is trying to acquire lock: 000000000282032e (&(&fq->mq_flush_lock)->rlock){..-.}, at: flush_end_io+0x4e/0x1d0 but task is already holding lock: 00000000cbadcbc2 (&(&fq->mq_flush_lock)->rlock){..-.}, at: flush_end_io+0x4e/0x1d0 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&(&fq->mq_flush_lock)->rlock); lock(&(&fq->mq_flush_lock)->rlock); * DEADLOCK * May be due to missing lock nesting notation 1 lock held by ksoftirqd/1/16: #0: 00000000cbadcbc2 (&(&fq->mq_flush_lock)->rlock){..-.}, at: flush_end_io+0x4e/0x1d0 stack backtrace: CPU: 1 PID: 16 Comm: ksoftirqd/1 Not tainted 5.0.0-rc3-xfstests-00015-g1236f7d60242 #841 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: dump_stack+0x67/0x90 __lock_acquire.cold.45+0x2b4/0x313 lock_acquire+0x98/0x160 _raw_spin_lock_irqsave+0x3b/0x80 flush_end_io+0x4e/0x1d0 blk_mq_complete_request+0x76/0x110 nvmet_req_complete+0x15/0x110 [nvmet] nvmet_bio_done+0x27/0x50 [nvmet] blk_update_request+0xd7/0x2d0 blk_mq_end_request+0x1a/0x100 blk_flush_complete_seq+0xe5/0x350 flush_end_io+0x12f/0x1d0 blk_done_softirq+0x9f/0xd0 __do_softirq+0xca/0x440 run_ksoftirqd+0x24/0x50 smpboot_thread_fn+0x113/0x1e0 kthread+0x121/0x140 ret_from_fork+0x3a/0x50 Cc: Christoph Hellwig <hch@infradead.org> Cc: Ming Lei <ming.lei@redhat.com> Cc: Hannes Reinecke <hare@suse.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
\| * \| \|	block: Fix the type of 'sts' in bsg_queue_rq()	Bart Van Assche	2019-12-20	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch fixes the following sparse warnings: block/bsg-lib.c:269:19: warning: incorrect type in initializer (different base types) block/bsg-lib.c:269:19: expected int sts block/bsg-lib.c:269:19: got restricted blk_status_t [usertype] block/bsg-lib.c:286:16: warning: incorrect type in return expression (different base types) block/bsg-lib.c:286:16: expected restricted blk_status_t block/bsg-lib.c:286:16: got int [assigned] sts Cc: Martin Wilck <mwilck@suse.com> Fixes: d46fe2cb2dce ("block: drop device references in bsg_queue_rq()") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
\| * \| \|	block: end bio with BLK_STS_AGAIN in case of non-mq devs and REQ_NOWAIT	Roman Penyaev	2019-12-17	1	-4/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Non-mq devs do not honor REQ_NOWAIT so give a chance to the caller to repeat request gracefully on -EAGAIN error. The problem is well reproduced using io_uring: mkfs.ext4 /dev/ram0 mount /dev/ram0 /mnt # Preallocate a file dd if=/dev/zero of=/mnt/file bs=1M count=1 # Start fio with io_uring and get -EIO fio --rw=write --ioengine=io_uring --size=1M --direct=1 --name=job --filename=/mnt/file Signed-off-by: Roman Penyaev <rpenyaev@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
\| * \| \|	nbd: fix shutdown and recv work deadlock v2	Mike Christie	2019-12-16	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This fixes a regression added with: commit e9e006f5fcf2bab59149cb38a48a4817c1b538b4 Author: Mike Christie <mchristi@redhat.com> Date: Sun Aug 4 14:10:06 2019 -0500 nbd: fix max number of supported devs where we can deadlock during device shutdown. The problem occurs if the recv_work's nbd_config_put occurs after nbd_start_device_ioctl has returned and the userspace app has droppped its reference via closing the device and running nbd_release. The recv_work nbd_config_put call would then drop the refcount to zero and try to destroy the config which would try to do destroy_workqueue from the recv work. This patch just has nbd_start_device_ioctl do a flush_workqueue when it wakes so we know after the ioctl returns running works have exited. This also fixes a possible race where we could try to reuse the device while old recv_works are still running. Cc: stable@vger.kernel.org Fixes: e9e006f5fcf2 ("nbd: fix max number of supported devs") Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
\| * \| \|	iocost: over-budget forced IOs should schedule async delay	Tejun Heo	2019-12-16	1	-5/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When over-budget IOs are force-issued through root cgroup, iocg_kick_delay() adjusts the async delay accordingly but doesn't actually schedule async throttle for the issuing task. This bug is pretty well masked because sooner or later the offending threads are gonna get directly throttled on regular IOs or have async delay scheduled by mem_cgroup_throttle_swaprate(). However, it can affect control quality on filesystem metadata heavy operations. Let's fix it by invoking blkcg_schedule_throttle() when iocg_kick_delay() says async delay is needed. Signed-off-by: Tejun Heo <tj@kernel.org> Fixes: 7caa47151ab2 ("blkcg: implement blk-iocost") Cc: stable@vger.kernel.org Reported-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* \| \| \|	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm	Linus Torvalds	2019-12-22	8	-46/+65
\|\ \ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pull KVM fixes from Paolo Bonzini: "PPC: - Fix a bug where we try to do an ultracall on a system without an ultravisor KVM: - Fix uninitialised sysreg accessor - Fix handling of demand-paged device mappings - Stop spamming the console on IMPDEF sysregs - Relax mappings of writable memslots - Assorted cleanups MIPS: - Now orphan, James Hogan is stepping down x86: - MAINTAINERS change, so long Radim and thanks for all the fish - supported CPUID fixes for AMD machines without SPEC_CTRL" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: MAINTAINERS: remove Radim from KVM maintainers MAINTAINERS: Orphan KVM for MIPS kvm: x86: Host feature SSBD doesn't imply guest feature AMD_SSBD kvm: x86: Host feature SSBD doesn't imply guest feature SPEC_CTRL_SSBD KVM: PPC: Book3S HV: Don't do ultravisor calls on systems without ultravisor KVM: arm/arm64: Properly handle faulting of device mappings KVM: arm64: Ensure 'params' is initialised when looking up sys register KVM: arm/arm64: Remove excessive permission check in kvm_arch_prepare_memory_region KVM: arm64: Don't log IMP DEF sysreg traps KVM: arm64: Sanely ratelimit sysreg messages KVM: arm/arm64: vgic: Use wrapper function to lock/unlock all vcpus in kvm_vgic_create() KVM: arm/arm64: vgic: Fix potential double free dist->spis in __kvm_vgic_destroy() KVM: arm/arm64: Get rid of unused arg in cpu_init_hyp_mode()
\| * \ \ \	Merge tag 'kvm-ppc-fixes-5.5-1' of ↵	Paolo Bonzini	2019-12-22	73	-696/+2712
\| \|\ \ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into kvm-master PPC KVM fix for 5.5 - Fix a bug where we try to do an ultracall on a system without an ultravisor.
\| \| * \| \| \|	KVM: PPC: Book3S HV: Don't do ultravisor calls on systems without ultravisor	Paul Mackerras	2019-12-18	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 22945688acd4 ("KVM: PPC: Book3S HV: Support reset of secure guest") added a call to uv_svm_terminate, which is an ultravisor call, without any check that the guest is a secure guest or even that the system has an ultravisor. On a system without an ultravisor, the ultracall will degenerate to a hypercall, but since we are not in KVM guest context, the hypercall will get treated as a system call, which could have random effects depending on what happens to be in r0, and could also corrupt the current task's kernel stack. Hence this adds a test for the guest being a secure guest before doing uv_svm_terminate(). Fixes: 22945688acd4 ("KVM: PPC: Book3S HV: Support reset of secure guest") Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
\| * \| \| \| \|	MAINTAINERS: remove Radim from KVM maintainers	Paolo Bonzini	2019-12-22	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Radim's kernel.org email is bouncing, which I take as a signal that he is not really able to deal with KVM at this time. Make MAINTAINERS match the effective value of KVM's bus factor. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
\| * \| \| \| \|	MAINTAINERS: Orphan KVM for MIPS	James Hogan	2019-12-22	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I haven't been active for 18 months, and don't have the hardware set up to test KVM for MIPS, so mark it as orphaned and remove myself as maintainer. Hopefully somebody from MIPS can pick this up. Signed-off-by: James Hogan <jhogan@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Paul Burton <paulburton@kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: kvm@vger.kernel.org Cc: linux-mips@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>