summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
...
* cifs: force interface update before a fresh session setupShyam Prasad N2023-11-281-1/+5
| | | | | | | | | | | | | | | | commit d9a6d78096056a3cb5c5f07a730ab92f2f9ac4e6 upstream. During a session reconnect, it is possible that the server moved to another physical server (happens in case of Azure files). So at this time, force a query of server interfaces again (in case of multichannel session), such that the secondary channels connect to the right IP addresses (possibly updated now). Cc: stable@vger.kernel.org Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* cifs: reconnect helper should set reconnect for the right channelShyam Prasad N2023-11-281-4/+5
| | | | | | | | | | | | | | | | | | commit c3326a61cdbf3ce1273d9198b6cbf90965d7e029 upstream. We introduced a helper function to be used by non-cifsd threads to mark the connection for reconnect. For multichannel, when only a particular channel needs to be reconnected, this had a bug. This change fixes that by marking that particular channel for reconnect. Fixes: dca65818c80c ("cifs: use a different reconnect helper for non-cifsd threads") Cc: stable@vger.kernel.org Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* smb: client: fix potential deadlock when releasing midsPaulo Alcantara2023-11-283-12/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | commit e6322fd177c6885a21dd4609dc5e5c973d1a2eb7 upstream. All release_mid() callers seem to hold a reference of @mid so there is no need to call kref_put(&mid->refcount, __release_mid) under @server->mid_lock spinlock. If they don't, then an use-after-free bug would have occurred anyways. By getting rid of such spinlock also fixes a potential deadlock as shown below CPU 0 CPU 1 ------------------------------------------------------------------ cifs_demultiplex_thread() cifs_debug_data_proc_show() release_mid() spin_lock(&server->mid_lock); spin_lock(&cifs_tcp_ses_lock) spin_lock(&server->mid_lock) __release_mid() smb2_find_smb_tcon() spin_lock(&cifs_tcp_ses_lock) *deadlock* Cc: stable@vger.kernel.org Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* smb: client: fix use-after-free in smb2_query_info_compound()Paulo Alcantara2023-11-281-35/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 5c86919455c1edec99ebd3338ad213b59271a71b upstream. The following UAF was triggered when running fstests generic/072 with KASAN enabled against Windows Server 2022 and mount options 'multichannel,max_channels=2,vers=3.1.1,mfsymlinks,noperm' BUG: KASAN: slab-use-after-free in smb2_query_info_compound+0x423/0x6d0 [cifs] Read of size 8 at addr ffff888014941048 by task xfs_io/27534 CPU: 0 PID: 27534 Comm: xfs_io Not tainted 6.6.0-rc7 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014 Call Trace: dump_stack_lvl+0x4a/0x80 print_report+0xcf/0x650 ? srso_alias_return_thunk+0x5/0x7f ? srso_alias_return_thunk+0x5/0x7f ? __phys_addr+0x46/0x90 kasan_report+0xda/0x110 ? smb2_query_info_compound+0x423/0x6d0 [cifs] ? smb2_query_info_compound+0x423/0x6d0 [cifs] smb2_query_info_compound+0x423/0x6d0 [cifs] ? __pfx_smb2_query_info_compound+0x10/0x10 [cifs] ? srso_alias_return_thunk+0x5/0x7f ? __stack_depot_save+0x39/0x480 ? kasan_save_stack+0x33/0x60 ? kasan_set_track+0x25/0x30 ? ____kasan_slab_free+0x126/0x170 smb2_queryfs+0xc2/0x2c0 [cifs] ? __pfx_smb2_queryfs+0x10/0x10 [cifs] ? __pfx___lock_acquire+0x10/0x10 smb311_queryfs+0x210/0x220 [cifs] ? __pfx_smb311_queryfs+0x10/0x10 [cifs] ? srso_alias_return_thunk+0x5/0x7f ? __lock_acquire+0x480/0x26c0 ? lock_release+0x1ed/0x640 ? srso_alias_return_thunk+0x5/0x7f ? do_raw_spin_unlock+0x9b/0x100 cifs_statfs+0x18c/0x4b0 [cifs] statfs_by_dentry+0x9b/0xf0 fd_statfs+0x4e/0xb0 __do_sys_fstatfs+0x7f/0xe0 ? __pfx___do_sys_fstatfs+0x10/0x10 ? srso_alias_return_thunk+0x5/0x7f ? lockdep_hardirqs_on_prepare+0x136/0x200 ? srso_alias_return_thunk+0x5/0x7f do_syscall_64+0x3f/0x90 entry_SYSCALL_64_after_hwframe+0x6e/0xd8 Allocated by task 27534: kasan_save_stack+0x33/0x60 kasan_set_track+0x25/0x30 __kasan_kmalloc+0x8f/0xa0 open_cached_dir+0x71b/0x1240 [cifs] smb2_query_info_compound+0x5c3/0x6d0 [cifs] smb2_queryfs+0xc2/0x2c0 [cifs] smb311_queryfs+0x210/0x220 [cifs] cifs_statfs+0x18c/0x4b0 [cifs] statfs_by_dentry+0x9b/0xf0 fd_statfs+0x4e/0xb0 __do_sys_fstatfs+0x7f/0xe0 do_syscall_64+0x3f/0x90 entry_SYSCALL_64_after_hwframe+0x6e/0xd8 Freed by task 27534: kasan_save_stack+0x33/0x60 kasan_set_track+0x25/0x30 kasan_save_free_info+0x2b/0x50 ____kasan_slab_free+0x126/0x170 slab_free_freelist_hook+0xd0/0x1e0 __kmem_cache_free+0x9d/0x1b0 open_cached_dir+0xff5/0x1240 [cifs] smb2_query_info_compound+0x5c3/0x6d0 [cifs] smb2_queryfs+0xc2/0x2c0 [cifs] This is a race between open_cached_dir() and cached_dir_lease_break() where the cache entry for the open directory handle receives a lease break while creating it. And before returning from open_cached_dir(), we put the last reference of the new @cfid because of !@cfid->has_lease. Besides the UAF, while running xfstests a lot of missed lease breaks have been noticed in tests that run several concurrent statfs(2) calls on those cached fids CIFS: VFS: \\w22-root1.gandalf.test No task to wake, unknown frame... CIFS: VFS: \\w22-root1.gandalf.test Cmd: 18 Err: 0x0 Flags: 0x1... CIFS: VFS: \\w22-root1.gandalf.test smb buf 00000000715bfe83 len 108 CIFS: VFS: Dump pending requests: CIFS: VFS: \\w22-root1.gandalf.test No task to wake, unknown frame... CIFS: VFS: \\w22-root1.gandalf.test Cmd: 18 Err: 0x0 Flags: 0x1... CIFS: VFS: \\w22-root1.gandalf.test smb buf 000000005aa7316e len 108 ... To fix both, in open_cached_dir() ensure that @cfid->has_lease is set right before sending out compounded request so that any potential lease break will be get processed by demultiplex thread while we're still caching @cfid. And, if open failed for some reason, re-check @cfid->has_lease to decide whether or not put lease reference. Cc: stable@vger.kernel.org Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* smb: client: fix use-after-free bug in cifs_debug_data_proc_show()Paulo Alcantara2023-11-281-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit d328c09ee9f15ee5a26431f5aad7c9239fa85e62 upstream. Skip SMB sessions that are being teared down (e.g. @ses->ses_status == SES_EXITING) in cifs_debug_data_proc_show() to avoid use-after-free in @ses. This fixes the following GPF when reading from /proc/fs/cifs/DebugData while mounting and umounting [ 816.251274] general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6d81: 0000 [#1] PREEMPT SMP NOPTI ... [ 816.260138] Call Trace: [ 816.260329] <TASK> [ 816.260499] ? die_addr+0x36/0x90 [ 816.260762] ? exc_general_protection+0x1b3/0x410 [ 816.261126] ? asm_exc_general_protection+0x26/0x30 [ 816.261502] ? cifs_debug_tcon+0xbd/0x240 [cifs] [ 816.261878] ? cifs_debug_tcon+0xab/0x240 [cifs] [ 816.262249] cifs_debug_data_proc_show+0x516/0xdb0 [cifs] [ 816.262689] ? seq_read_iter+0x379/0x470 [ 816.262995] seq_read_iter+0x118/0x470 [ 816.263291] proc_reg_read_iter+0x53/0x90 [ 816.263596] ? srso_alias_return_thunk+0x5/0x7f [ 816.263945] vfs_read+0x201/0x350 [ 816.264211] ksys_read+0x75/0x100 [ 816.264472] do_syscall_64+0x3f/0x90 [ 816.264750] entry_SYSCALL_64_after_hwframe+0x6e/0xd8 [ 816.265135] RIP: 0033:0x7fd5e669d381 Cc: stable@vger.kernel.org Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* smb3: fix caching of ctime on setxattrSteve French2023-11-281-1/+4
| | | | | | | | | | | | | commit 5923d6686a100c2b4cabd4c2ca9d5a12579c7614 upstream. Fixes xfstest generic/728 which had been failing due to incorrect ctime after setxattr and removexattr Update ctime on successful set of xattr Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* smb3: allow dumping session and tcon id to improve stats analysis and debuggingSteve French2023-11-282-0/+31
| | | | | | | | | | | | | | | | | | | | | | | commit de4eceab578ead12a71e5b5588a57e142bbe8ceb upstream. When multiple mounts are to the same share from the same client it was not possible to determine which section of /proc/fs/cifs/Stats (and DebugData) correspond to that mount. In some recent examples this turned out to be a significant problem when trying to analyze performance data - since there are many cases where unless we know the tree id and session id we can't figure out which stats (e.g. number of SMB3.1.1 requests by type, the total time they take, which is slowest, how many fail etc.) apply to which mount. The only existing loosely related ioctl CIFS_IOC_GET_MNT_INFO does not return the information needed to uniquely identify which tcon is which mount although it does return various flags and device info. Add a cifs.ko ioctl CIFS_IOC_GET_TCON_INFO (0x800ccf0c) to return tid, session id, tree connect count. Cc: stable@vger.kernel.org Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* smb3: fix touch -h of symlinkSteve French2023-11-281-0/+1
| | | | | | | | | | | | | | | | | commit 475efd9808a3094944a56240b2711349e433fb66 upstream. For example: touch -h -t 02011200 testfile where testfile is a symlink would not change the timestamp, but touch -t 02011200 testfile does work to change the timestamp of the target Suggested-by: David Howells <dhowells@redhat.com> Reported-by: Micah Veilleux <micah.veilleux@iba-group.com> Closes: https://bugzilla.samba.org/show_bug.cgi?id=14476 Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* smb3: fix creating FIFOs when mounting with "sfu" mount optionSteve French2023-11-283-2/+12
| | | | | | | | | | | | | | | | | | | | | | | commit 72bc63f5e23a38b65ff2a201bdc11401d4223fa9 upstream. Fixes some xfstests including generic/564 and generic/157 The "sfu" mount option can be useful for creating special files (character and block devices in particular) but could not create FIFOs. It did recognize existing empty files with the "system" attribute flag as FIFOs but this is too general, so to support creating FIFOs more safely use a new tag (but the same length as those for char and block devices ie "IntxLNK" and "IntxBLK") "LnxFIFO" to indicate that the file should be treated as a FIFO (when mounted with the "sfu"). For some additional context note that "sfu" followed the way that "Services for Unix" on Windows handled these special files (at least for character and block devices and symlinks), which is different than newer Windows which can handle special files as reparse points (which isn't an option to many servers). Cc: stable@vger.kernel.org Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* fs: add ctime accessors infrastructureJeff Layton2023-11-282-1/+60
| | | | | | | | | | | | | | | | | | | | commit 9b6304c1d53745c300b86f202d0dcff395e2d2db upstream. struct timespec64 has unused bits in the tv_nsec field that can be used for other purposes. In future patches, we're going to change how the inode->i_ctime is accessed in certain inodes in order to make use of them. In order to do that safely though, we'll need to eradicate raw accesses of the inode->i_ctime field from the kernel. Add new accessor functions for the ctime that we use to replace them. Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Message-Id: <20230705185812.579118-2-jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* xhci: Enable RPM on controllers that support low-power statesBasavaraj Natikar2023-11-281-1/+3
| | | | | | | | | | | | | | | | commit a5d6264b638efeca35eff72177fd28d149e0764b upstream. Use the low-power states of the underlying platform to enable runtime PM. If the platform doesn't support runtime D3, then enabling default RPM will result in the controller malfunctioning, as in the case of hotplug devices not being detected because of a failed interrupt generation. Cc: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20231019102924.2797346-16-mathias.nyman@linux.intel.com Cc: Oleksandr Natalenko <oleksandr@natalenko.name> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* parisc/power: Fix power soft-off when running on qemuHelge Deller2023-11-281-1/+1
| | | | | | | | | | | | commit 6ad6e15a9c46b8f0932cd99724f26f3db4db1cdf upstream. Firmware returns the physical address of the power switch, so need to use gsc_writel() instead of direct memory access. Fixes: d0c219472980 ("parisc/power: Add power soft-off when running on qemu") Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org # v6.0+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* parisc/pgtable: Do not drop upper 5 address bits of physical addressHelge Deller2023-11-281-4/+3
| | | | | | | | | | | | | commit 166b0110d1ee53290bd11618df6e3991c117495a upstream. When calculating the pfn for the iitlbt/idtlbt instruction, do not drop the upper 5 address bits. This doesn't seem to have an effect on physical hardware which uses less physical address bits, but in qemu the missing bits are visible. Signed-off-by: Helge Deller <deller@gmx.de> Cc: <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* parisc: Prevent booting 64-bit kernels on PA1.x machinesHelge Deller2023-11-281-3/+2
| | | | | | | | | | | | | commit a406b8b424fa01f244c1aab02ba186258448c36b upstream. Bail out early with error message when trying to boot a 64-bit kernel on 32-bit machines. This fixes the previous commit to include the check for true 64-bit kernels as well. Signed-off-by: Helge Deller <deller@gmx.de> Fixes: 591d2108f3abc ("parisc: Add runtime check to prevent PA2.0 kernels on PA1.x machines") Cc: <stable@vger.kernel.org> # v6.0+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* mm/hugetlb: use nth_page() in place of direct struct page manipulationZi Yan2023-11-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 426056efe835cf4864ccf4c328fe3af9146fc539 ] When dealing with hugetlb pages, manipulating struct page pointers directly can get to wrong struct page, since struct page is not guaranteed to be contiguous on SPARSEMEM without VMEMMAP. Use nth_page() to handle it properly. A wrong or non-existing page might be tried to be grabbed, either leading to a non freeable page or kernel memory access errors. No bug is reported. It comes from code inspection. Link: https://lkml.kernel.org/r/20230913201248.452081-3-zi.yan@sent.com Fixes: 57a196a58421 ("hugetlb: simplify hugetlb handling in follow_page_mask") Signed-off-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Cc: David Hildenbrand <david@redhat.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
* mm/hugetlb: prepare hugetlb_follow_page_mask() for FOLL_PINPeter Xu2023-11-281-11/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 458568c92953dee3716234711f1a2830a35261f3 ] follow_page() doesn't use FOLL_PIN, meanwhile hugetlb seems to not be the target of FOLL_WRITE either. However add the checks. Namely, either the need to CoW due to missing write bit, or proper unsharing on !AnonExclusive pages over R/O pins to reject the follow page. That brings this function closer to follow_hugetlb_page(). So we don't care before, and also for now. But we'll care if we switch over slow-gup to use hugetlb_follow_page_mask(). We'll also care when to return -EMLINK properly, as that's the gup internal api to mean "we should unshare". Not really needed for follow page path, though. When at it, switching the try_grab_page() to use WARN_ON_ONCE(), to be clear that it just should never fail. When error happens, instead of setting page==NULL, capture the errno instead. Link: https://lkml.kernel.org/r/20230628215310.73782-3-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: James Houghton <jthoughton@google.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Kirill A . Shutemov <kirill@shutemov.name> Cc: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Yang Shi <shy828301@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Stable-dep-of: 426056efe835 ("mm/hugetlb: use nth_page() in place of direct struct page manipulation") Signed-off-by: Sasha Levin <sashal@kernel.org>
* rcutorture: Fix stuttering races and other issuesJoel Fernandes (Google)2023-11-281-33/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit cca42bd8eb1b54a4c9bbf48c79d120e66619a3e4 ] The stuttering code isn't functioning as expected. Ideally, it should pause the torture threads for a designated period before resuming. Yet, it fails to halt the test for the correct duration. Additionally, a race condition exists, potentially causing the stuttering code to pause for an extended period if the 'spt' variable is non-zero due to the stutter orchestration thread's inadequate CPU time. Moreover, over-stuttering can hinder RCU's progress on TREE07 kernels. This happens as the stuttering code may run within a softirq due to RCU callbacks. Consequently, ksoftirqd keeps a CPU busy for several seconds, thus obstructing RCU's progress. This situation triggers a warning message in the logs: [ 2169.481783] rcu_torture_writer: rtort_pipe_count: 9 This warning suggests that an RCU torture object, although invisible to RCU readers, couldn't make it past the pipe array and be freed -- a strong indication that there weren't enough grace periods during the stutter interval. To address these issues, this patch sets the "stutter end" time to an absolute point in the future set by the main stutter thread. This is then used for waiting in stutter_wait(). While the stutter thread still defines this absolute time, the waiters' waiting logic doesn't rely on the stutter thread receiving sufficient CPU time to halt the stuttering as the halting is now self-controlled. Cc: stable@vger.kernel.org Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
* torture: Make torture_hrtimeout_ns() take an hrtimer mode parameterPaul E. McKenney2023-11-282-7/+9
| | | | | | | | | | | | | | | | [ Upstream commit a741deac787f0d2d7068638c067db20af9e63752 ] The current torture-test sleeps are waiting for a duration, but there are situations where it is better to wait for an absolute time, for example, when ending a stutter interval. This commit therefore adds an hrtimer mode parameter to torture_hrtimeout_ns(). Why not also the other torture_hrtimeout_*() functions? The theory is that most absolute times will be in nanoseconds, especially not (say) jiffies. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Stable-dep-of: cca42bd8eb1b ("rcutorture: Fix stuttering races and other issues") Signed-off-by: Sasha Levin <sashal@kernel.org>
* torture: Move stutter_wait() timeouts to hrtimersPaul E. McKenney2023-11-281-2/+2
| | | | | | | | | | | [ Upstream commit 10af43671e8bf4ac153c4991a17cdf57bc6d2cfe ] In order to gain better race coverage, move the test start/stop waits in stutter_wait() to torture_hrtimeout_jiffies(). Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Stable-dep-of: cca42bd8eb1b ("rcutorture: Fix stuttering races and other issues") Signed-off-by: Sasha Levin <sashal@kernel.org>
* torture: Make torture_hrtimeout_*() use TASK_IDLEPaul E. McKenney2023-11-281-1/+1
| | | | | | | | | | | | [ Upstream commit 872948c665f50a1446e8a34b1ed57bb0b3a9ca4a ] Given that it is expected that more code will use torture_hrtimeout_*(), including for longer timeouts, make it use TASK_IDLE instead of TASK_UNINTERRUPTIBLE. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Stable-dep-of: cca42bd8eb1b ("rcutorture: Fix stuttering races and other issues") Signed-off-by: Sasha Levin <sashal@kernel.org>
* torture: Add lock_torture writer_fifo module parameterDietmar Eggemann2023-11-283-6/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 5d248bb39fe1388943acb6510f8f48fa5570e0ec ] This commit adds a module parameter that causes the locktorture writer to run at real-time priority. To use it: insmod /lib/modules/torture.ko random_shuffle=1 insmod /lib/modules/locktorture.ko torture_type=mutex_lock rt_boost=1 rt_boost_factor=50 nested_locks=3 writer_fifo=1 ^^^^^^^^^^^^^ A predecessor to this patch has been helpful to uncover issues with the proxy-execution series. [ paulmck: Remove locktorture-specific code from kernel/torture.c. ] Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Valentin Schneider <vschneid@redhat.com> Cc: kernel-team@android.com Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com> [jstultz: Include header change to build, reword commit message] Signed-off-by: John Stultz <jstultz@google.com> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Stable-dep-of: cca42bd8eb1b ("rcutorture: Fix stuttering races and other issues") Signed-off-by: Sasha Levin <sashal@kernel.org>
* torture: Add a kthread-creation callback to _torture_create_kthread()Paul E. McKenney2023-11-282-3/+10
| | | | | | | | | | | | | | | | | | | | | [ Upstream commit 67d5404d274376890d6d095a10e6565854918f8e ] This commit adds a kthread-creation callback to the _torture_create_kthread() function, which allows callers of a new torture_create_kthread_cb() macro to specify a function to be invoked after the kthread is created but before it is awakened for the first time. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: kernel-team@android.com Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Acked-by: John Stultz <jstultz@google.com> Stable-dep-of: cca42bd8eb1b ("rcutorture: Fix stuttering races and other issues") Signed-off-by: Sasha Levin <sashal@kernel.org>
* PCI: Lengthen reset delay for VideoPropulsion Torrent QN16e cardLukas Wunner2023-11-281-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit c9260693aa0c1e029ed23693cfd4d7814eee6624 ] Commit ac91e6980563 ("PCI: Unify delay handling for reset and resume") shortened an unconditional 1 sec delay after a Secondary Bus Reset to 100 msec for PCIe (per PCIe r6.1 sec 6.6.1). The 1 sec delay is only required for Conventional PCI. But it turns out that there are PCIe devices which require a longer delay than prescribed before first config space access after reset recovery or resume from D3cold: Chad reports that a "VideoPropulsion Torrent QN16e" MPEG QAM Modulator "raises a PCI system error (PERR), as reported by the IPMI event log, and the hardware itself would suffer a catastrophic event, cycling the server" unless the longer delay is observed. The card is specified to conform to PCIe r1.0 and indeed only supports Gen1 speed (2.5 GT/s) according to lspci. PCIe r1.0 sec 7.6 prescribes the same 100 msec delay as PCIe r6.1 sec 6.6.1: To allow components to perform internal initialization, system software must wait for at least 100 ms from the end of a reset (cold/warm/hot) before it is permitted to issue Configuration Requests The behavior of the Torrent QN16e card thus appears to be a quirk. Treat it as such and lengthen the reset delay for this specific device. Fixes: ac91e6980563 ("PCI: Unify delay handling for reset and resume") Link: https://lore.kernel.org/r/47727e792c7f0282dc144e3ec8ce8eb6e713394e.1695304512.git.lukas@wunner.de Reported-by: Chad Schroeder <CSchroeder@sonifi.com> Closes: https://lore.kernel.org/linux-pci/DM6PR16MB2844903E34CAB910082DF019B1FAA@DM6PR16MB2844.namprd16.prod.outlook.com/ Tested-by: Chad Schroeder <CSchroeder@sonifi.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # v5.4+ Signed-off-by: Sasha Levin <sashal@kernel.org>
* PCI: qcom-ep: Add dedicated callback for writing to DBI2 registersManivannan Sadhasivam2023-11-281-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit a07d2497ed657eb2efeb967af47e22f573dcd1d6 ] The DWC core driver exposes the write_dbi2() callback for writing to the DBI2 registers in a vendor-specific way. On the Qcom EP platforms, the DBI_CS2 bit in the ELBI region needs to be asserted before writing to any DBI2 registers and deasserted once done. So, let's implement the callback for the Qcom PCIe EP driver so that the DBI2 writes are correctly handled in the hardware. Without this callback, the DBI2 register writes like BAR size won't go through and as a result, the default BAR size is set for all BARs. [kwilczynski: commit log, renamed function to match the DWC convention] Fixes: f55fee56a631 ("PCI: qcom-ep: Add Qualcomm PCIe Endpoint controller driver") Suggested-by: Serge Semin <fancer.lancer@gmail.com> Link: https://lore.kernel.org/linux-pci/20231025130029.74693-2-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Serge Semin <fancer.lancer@gmail.com> Cc: stable@vger.kernel.org # 5.16+ Signed-off-by: Sasha Levin <sashal@kernel.org>
* pmdomain: imx: Make imx pgc power domain also set the fwnodePengfei Li2023-11-281-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 374de39d38f97b0e58cfee88da590b2d056ccf7f ] Currently, The imx pgc power domain doesn't set the fwnode pointer, which results in supply regulator device can't get consumer imx pgc power domain device from fwnode when creating a link. This causes the driver core to instead try to create a link between the parent gpc device of imx pgc power domain device and supply regulator device. However, at this point, the gpc device has already been bound, and the link creation will fail. So adding the fwnode pointer to the imx pgc power domain device will fix this issue. Signed-off-by: Pengfei Li <pengfei.li_1@nxp.com> Tested-by: Emil Kronborg <emil.kronborg@protonmail.com> Fixes: 3fb16866b51d ("driver core: fw_devlink: Make cycle detection more robust") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20231020185949.537083-1-pengfei.li_1@nxp.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
* pmdomain: amlogic: Fix mask for the second NNA mem PD domainTomeu Vizoso2023-11-281-1/+1
| | | | | | | | | | | | | | | | [ Upstream commit b131329b9bfbd1b4c0c5e088cb0c6ec03a12930f ] Without this change, the NPU hangs when the 8th NN core is used. It matches what the out-of-tree driver does. Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net> Fixes: 9a217b7e8953 ("soc: amlogic: meson-pwrc: Add NNA power domain for A311D") Acked-by: Neil Armstrong <neil.armstrong@linaro.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20231016080205.41982-2-tomeu@tomeuvizoso.net Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
* pmdomain: bcm: bcm2835-power: check if the ASB register is equal to enableMaíra Canal2023-11-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 2e75396f1df61e1f1d26d0d703fc7292c4ae4371 ] The commit c494a447c14e ("soc: bcm: bcm2835-power: Refactor ASB control") refactored the ASB control by using a general function to handle both the enable and disable. But this patch introduced a subtle regression: we need to check if !!(readl(base + reg) & ASB_ACK) == enable, not just check if (readl(base + reg) & ASB_ACK) == true. Currently, this is causing an invalid register state in V3D when unloading and loading the driver, because `bcm2835_asb_disable()` will return -ETIMEDOUT and `bcm2835_asb_power_off()` will fail to disable the ASB slave for V3D. Fixes: c494a447c14e ("soc: bcm: bcm2835-power: Refactor ASB control") Signed-off-by: Maíra Canal <mcanal@igalia.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Stefan Wahren <stefan.wahren@i2se.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20231024101251.6357-2-mcanal@igalia.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
* cxl/port: Fix delete_endpoint() vs parent unregistration raceDan Williams2023-11-281-15/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 8d2ad999ca3c64cb08cf6a58d227b9d9e746d708 upstream. The CXL subsystem, at cxl_mem ->probe() time, establishes a lineage of ports (struct cxl_port objects) between an endpoint and the root of a CXL topology. Each port including the endpoint port is attached to the cxl_port driver. Given that setup, it follows that when either any port in that lineage goes through a cxl_port ->remove() event, or the memdev goes through a cxl_mem ->remove() event. The hierarchy below the removed port, or the entire hierarchy if the memdev is removed needs to come down. The delete_endpoint() callback is careful to check whether it is being called to tear down the hierarchy, or if it is only being called to teardown the memdev because an ancestor port is going through ->remove(). That care needs to take the device_lock() of the endpoint's parent. Which requires 2 bugs to be fixed: 1/ A reference on the parent is needed to prevent use-after-free scenarios like this signature: BUG: spinlock bad magic on CPU#0, kworker/u56:0/11 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS edk2-20230524-3.fc38 05/24/2023 Workqueue: cxl_port detach_memdev [cxl_core] RIP: 0010:spin_bug+0x65/0xa0 Call Trace: do_raw_spin_lock+0x69/0xa0 __mutex_lock+0x695/0xb80 delete_endpoint+0xad/0x150 [cxl_core] devres_release_all+0xb8/0x110 device_unbind_cleanup+0xe/0x70 device_release_driver_internal+0x1d2/0x210 detach_memdev+0x15/0x20 [cxl_core] process_one_work+0x1e3/0x4c0 worker_thread+0x1dd/0x3d0 2/ In the case of RCH topologies, the parent device that needs to be locked is not always @port->dev as returned by cxl_mem_find_port(), use endpoint->dev.parent instead. Fixes: 8dd2bc0f8e02 ("cxl/mem: Add the cxl_mem driver") Cc: <stable@vger.kernel.org> Reported-by: Robert Richter <rrichter@amd.com> Closes: http://lore.kernel.org/r/20231018171713.1883517-2-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* cxl/region: Fix x1 root-decoder granularity calculationsJim Harris2023-11-282-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 98a04c7aced2b43b3ac4befe216c4eecc7257d4b upstream. Root decoder granularity must match value from CFWMS, which may not be the region's granularity for non-interleaved root decoders. So when calculating granularities for host bridge decoders, use the region's granularity instead of the root decoder's granularity to ensure the correct granularities are set for the host bridge decoders and any downstream switch decoders. Test configuration is 1 host bridge * 2 switches * 2 endpoints per switch. Region created with 2048 granularity using following command line: cxl create-region -m -d decoder0.0 -w 4 mem0 mem2 mem1 mem3 \ -g 2048 -s 2048M Use "cxl list -PDE | grep granularity" to get a view of the granularity set at each level of the topology. Before this patch: "interleave_granularity":2048, "interleave_granularity":2048, "interleave_granularity":512, "interleave_granularity":2048, "interleave_granularity":2048, "interleave_granularity":512, "interleave_granularity":256, After: "interleave_granularity":2048, "interleave_granularity":2048, "interleave_granularity":4096, "interleave_granularity":2048, "interleave_granularity":2048, "interleave_granularity":4096, "interleave_granularity":2048, Fixes: 27b3f8d13830 ("cxl/region: Program target lists") Cc: <stable@vger.kernel.org> Signed-off-by: Jim Harris <jim.harris@samsung.com> Link: https://lore.kernel.org/r/169824893473.1403938.16110924262989774582.stgit@bgt-140510-bm03.eng.stellus.in [djbw: fixup the prebuilt cxl_test region] Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* i3c: master: svc: fix random hot join failure since timeout errorFrank Li2023-11-281-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | commit 9aaeef113c55248ecf3ab941c2e4460aaa8b8b9a upstream. master side report: silvaco-i3c-master 44330000.i3c-master: Error condition: MSTATUS 0x020090c7, MERRWARN 0x00100000 BIT 20: TIMEOUT error The module has stalled too long in a frame. This happens when: - The TX FIFO or RX FIFO is not handled and the bus is stuck in the middle of a message, - No STOP was issued and between messages, - IBI manual is used and no decision was made. The maximum stall period is 100 μs. This can be considered as being just a warning as the system IRQ latency can easily be greater than 100us. Fixes: dd3c52846d59 ("i3c: master: svc: Add Silvaco I3C master driver") Cc: <stable@vger.kernel.org> Signed-off-by: Frank Li <Frank.Li@nxp.com> Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/r/20231023161658.3890811-7-Frank.Li@nxp.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* i3c: master: svc: fix SDA keep low when polling IBIWON timeout happenFrank Li2023-11-281-0/+1
| | | | | | | | | | | | | | | | commit dfd7cd6aafdb1f5ba93828e97e56b38304b23a05 upstream. Upon IBIWON timeout, the SDA line will always be kept low if we don't emit a stop. Calling svc_i3c_master_emit_stop() there will let the bus return to idle state. Fixes: dd3c52846d59 ("i3c: master: svc: Add Silvaco I3C master driver") Cc: <stable@vger.kernel.org> Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Frank Li <Frank.Li@nxp.com> Link: https://lore.kernel.org/r/20231023161658.3890811-6-Frank.Li@nxp.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* i3c: master: svc: fix check wrong status register in irq handlerFrank Li2023-11-281-1/+1
| | | | | | | | | | | | | | | commit 225d5ef048c4ed01a475c95d94833bd7dd61072d upstream. svc_i3c_master_irq_handler() wrongly checks register SVC_I3C_MINTMASKED. It should be SVC_I3C_MSTATUS. Fixes: dd3c52846d59 ("i3c: master: svc: Add Silvaco I3C master driver") Cc: <stable@vger.kernel.org> Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Frank Li <Frank.Li@nxp.com> Link: https://lore.kernel.org/r/20231023161658.3890811-5-Frank.Li@nxp.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* i3c: master: svc: fix ibi may not return mandatory data byteFrank Li2023-11-281-0/+8
| | | | | | | | | | | | | | | | | | | | commit c85e209b799f12d18a90ae6353b997b1bb1274a5 upstream. MSTATUS[RXPEND] is only updated after the data transfer cycle started. This creates an issue when the I3C clock is slow, and the CPU is running fast enough that MSTATUS[RXPEND] may not be updated when the code reaches checking point. As a result, mandatory data can be missed. Add a wait for MSTATUS[COMPLETE] to ensure that all mandatory data is already in FIFO. It also works without mandatory data. Fixes: dd3c52846d59 ("i3c: master: svc: Add Silvaco I3C master driver") Cc: <stable@vger.kernel.org> Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Frank Li <Frank.Li@nxp.com> Link: https://lore.kernel.org/r/20231023161658.3890811-4-Frank.Li@nxp.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* i3c: master: svc: fix wrong data return when IBI happen during start frameFrank Li2023-11-281-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 5e5e3c92e748a6d859190e123b9193cf4911fcca upstream. ┌─────┐ ┏──┐ ┏──┐ ┏──┐ ┏──┐ ┏──┐ ┏──┐ ┏──┐ ┏──┐ ┌───── SCL: ┘ └─────┛ └──┛ └──┛ └──┛ └──┛ └──┛ └──┛ └──┛ └──┘ ───┐ ┌─────┐ ┌─────┐ ┌───────────┐ SDA: └───────────────────────┘ └─────┘ └─────┘ └───── xxx╱ ╲╱ ╲╱ ╲╱ ╲╱ ╲ : xxx╲IBI ╱╲ Addr(0x0a) ╱╲ RW ╱╲NACK╱╲ S ╱ If an In-Band Interrupt (IBI) occurs and IBI work thread is not immediately scheduled, when svc_i3c_master_priv_xfers() initiates the I3C transfer and attempts to send address 0x7e, the target interprets it as an IBI handler and returns the target address 0x0a. However, svc_i3c_master_priv_xfers() does not handle this case and proceeds with other transfers, resulting in incorrect data being returned. Add IBIWON check in svc_i3c_master_xfer(). In case this situation occurs, return a failure to the driver. Fixes: dd3c52846d59 ("i3c: master: svc: Add Silvaco I3C master driver") Cc: <stable@vger.kernel.org> Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Frank Li <Frank.Li@nxp.com> Link: https://lore.kernel.org/r/20231023161658.3890811-3-Frank.Li@nxp.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* i3c: master: svc: fix race condition in ibi work threadFrank Li2023-11-281-0/+14
| | | | | | | | | | | | | | | | commit 6bf3fc268183816856c96b8794cd66146bc27b35 upstream. The ibi work thread operates asynchronously with other transfers, such as svc_i3c_master_priv_xfers(). Introduce mutex protection to ensure the completion of the entire i3c/i2c transaction. Fixes: dd3c52846d59 ("i3c: master: svc: Add Silvaco I3C master driver") Cc: <stable@vger.kernel.org> Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Frank Li <Frank.Li@nxp.com> Link: https://lore.kernel.org/r/20231023161658.3890811-2-Frank.Li@nxp.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* i3c: master: cdns: Fix reading status registerJoshua Yeong2023-11-281-3/+3
| | | | | | | | | | | | | | commit 4bd8405257da717cd556f99e5fb68693d12c9766 upstream. IBIR_DEPTH and CMDR_DEPTH should read from status0 instead of status1. Cc: stable@vger.kernel.org Fixes: 603f2bee2c54 ("i3c: master: Add driver for Cadence IP") Signed-off-by: Joshua Yeong <joshua.yeong@starfivetech.com> Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/r/20230913031743.11439-2-joshua.yeong@starfivetech.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* cxl/region: Do not try to cleanup after cxl_region_setup_targets() failsJim Harris2023-11-281-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 0718588c7aaa7a1510b4de972370535b61dddd0d upstream. Commit 5e42bcbc3fef ("cxl/region: decrement ->nr_targets on error in cxl_region_attach()") tried to avoid 'eiw' initialization errors when ->nr_targets exceeded 16, by just decrementing ->nr_targets when cxl_region_setup_targets() failed. Commit 86987c766276 ("cxl/region: Cleanup target list on attach error") extended that cleanup to also clear cxled->pos and p->targets[pos]. The initialization error was incidentally fixed separately by: Commit 8d4285425714 ("cxl/region: Fix port setup uninitialized variable warnings") which was merged a few days after 5e42bcbc3fef. But now the original cleanup when cxl_region_setup_targets() fails prevents endpoint and switch decoder resources from being reused: 1) the cleanup does not set the decoder's region to NULL, which results in future dpa_size_store() calls returning -EBUSY 2) the decoder is not properly freed, which results in future commit errors associated with the upstream switch Now that the initialization errors were fixed separately, the proper cleanup for this case is to just return immediately. Then the resources associated with this target get cleanup up as normal when the failed region is deleted. The ->nr_targets decrement in the error case also helped prevent a p->targets[] array overflow, so add a new check to prevent against that overflow. Tested by trying to create an invalid region for a 2 switch * 2 endpoint topology, and then following up with creating a valid region. Fixes: 5e42bcbc3fef ("cxl/region: decrement ->nr_targets on error in cxl_region_attach()") Cc: <stable@vger.kernel.org> Signed-off-by: Jim Harris <jim.harris@samsung.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/169703589120.1202031.14696100866518083806.stgit@bgt-140510-bm03.eng.stellus.in Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* mtd: cfi_cmdset_0001: Byte swap OTP infoLinus Walleij2023-11-281-2/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 565fe150624ee77dc63a735cc1b3bff5101f38a3 upstream. Currently the offset into the device when looking for OTP bits can go outside of the address of the MTD NOR devices, and if that memory isn't readable, bad things happen on the IXP4xx (added prints that illustrate the problem before the crash): cfi_intelext_otp_walk walk OTP on chip 0 start at reg_prot_offset 0x00000100 ixp4xx_copy_from copy from 0x00000100 to 0xc880dd78 cfi_intelext_otp_walk walk OTP on chip 0 start at reg_prot_offset 0x12000000 ixp4xx_copy_from copy from 0x12000000 to 0xc880dd78 8<--- cut here --- Unable to handle kernel paging request at virtual address db000000 [db000000] *pgd=00000000 (...) This happens in this case because the IXP4xx is big endian and the 32- and 16-bit fields in the struct cfi_intelext_otpinfo are not properly byteswapped. Compare to how the code in read_pri_intelext() byteswaps the fields in struct cfi_pri_intelext. Adding a small byte swapping loop for the OTP in read_pri_intelext() and the crash goes away. The problem went unnoticed for many years until I enabled CONFIG_MTD_OTP on the IXP4xx as well, triggering the bug. Cc: stable@vger.kernel.org Reviewed-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20231020-mtd-otp-byteswap-v4-1-0d132c06aa9d@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* mm: make PR_MDWE_REFUSE_EXEC_GAIN an unsigned longFlorent Revest2023-11-282-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 0da668333fb07805c2836d5d50e26eda915b24a1 upstream. Defining a prctl flag as an int is a footgun because on a 64 bit machine and with a variadic implementation of prctl (like in musl and glibc), when used directly as a prctl argument, it can get casted to long with garbage upper bits which would result in unexpected behaviors. This patch changes the constant to an unsigned long to eliminate that possibilities. This does not break UAPI. I think that a stable backport would be "nice to have": to reduce the chances that users build binaries that could end up with garbage bits in their MDWE prctl arguments. We are not aware of anyone having yet encountered this corner case with MDWE prctls but a backport would reduce the likelihood it happens, since this sort of issues has happened with other prctls. But If this is perceived as a backporting burden, I suppose we could also live without a stable backport. Link: https://lkml.kernel.org/r/20230828150858.393570-5-revest@chromium.org Fixes: b507808ebce2 ("mm: implement memory-deny-write-execute as a prctl") Signed-off-by: Florent Revest <revest@chromium.org> Suggested-by: Alexey Izbyshev <izbyshev@ispras.ru> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Ayush Jain <ayush.jain3@amd.com> Cc: Greg Thelen <gthelen@google.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Peter Xu <peterx@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Szabolcs Nagy <Szabolcs.Nagy@arm.com> Cc: Topi Miettinen <toiwoton@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* mm/memory_hotplug: use pfn math in place of direct struct page manipulationZi Yan2023-11-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | commit 1640a0ef80f6d572725f5b0330038c18e98ea168 upstream. When dealing with hugetlb pages, manipulating struct page pointers directly can get to wrong struct page, since struct page is not guaranteed to be contiguous on SPARSEMEM without VMEMMAP. Use pfn calculation to handle it properly. Without the fix, a wrong number of page might be skipped. Since skip cannot be negative, scan_movable_page() will end early and might miss a movable page with -ENOENT. This might fail offline_pages(). No bug is reported. The fix comes from code inspection. Link: https://lkml.kernel.org/r/20230913201248.452081-4-zi.yan@sent.com Fixes: eeb0efd071d8 ("mm,memory_hotplug: fix scan_movable_pages() for gigantic hugepages") Signed-off-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* mm/cma: use nth_page() in place of direct struct page manipulationZi Yan2023-11-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 2e7cfe5cd5b6b0b98abf57a3074885979e187c1c upstream. Patch series "Use nth_page() in place of direct struct page manipulation", v3. On SPARSEMEM without VMEMMAP, struct page is not guaranteed to be contiguous, since each memory section's memmap might be allocated independently. hugetlb pages can go beyond a memory section size, thus direct struct page manipulation on hugetlb pages/subpages might give wrong struct page. Kernel provides nth_page() to do the manipulation properly. Use that whenever code can see hugetlb pages. This patch (of 5): When dealing with hugetlb pages, manipulating struct page pointers directly can get to wrong struct page, since struct page is not guaranteed to be contiguous on SPARSEMEM without VMEMMAP. Use nth_page() to handle it properly. Without the fix, page_kasan_tag_reset() could reset wrong page tags, causing a wrong kasan result. No related bug is reported. The fix comes from code inspection. Link: https://lkml.kernel.org/r/20230913201248.452081-1-zi.yan@sent.com Link: https://lkml.kernel.org/r/20230913201248.452081-2-zi.yan@sent.com Fixes: 2813b9c02962 ("kasan, mm, arm64: tag non slab memory allocated via pagealloc") Signed-off-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Cc: David Hildenbrand <david@redhat.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* s390/cmma: fix detection of DAT pagesHeiko Carstens2023-11-281-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | commit 44d93045247661acbd50b1629e62f415f2747577 upstream. If the cmma no-dat feature is available the kernel page tables are walked to identify and mark all pages which are used for address translation (all region, segment, and page tables). In a subsequent loop all other pages are marked as "no-dat" pages with the ESSA instruction. This information is visible to the hypervisor, so that the hypervisor can optimize purging of guest TLB entries. The initial loop however is incorrect: only the first three of the four pages which belong to segment and region tables will be marked as being used for DAT. The last page is incorrectly marked as no-dat. This can result in incorrect guest TLB flushes. Fix this by simply marking all four pages. Cc: <stable@vger.kernel.org> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* s390/mm: add missing arch_set_page_dat() call to vmem_crst_alloc()Heiko Carstens2023-11-281-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | commit 09cda0a400519b1541591c506e54c9c48e3101bf upstream. If the cmma no-dat feature is available all pages that are not used for dynamic address translation are marked as "no-dat" with the ESSA instruction. This information is visible to the hypervisor, so that the hypervisor can optimize purging of guest TLB entries. This also means that pages which are used for dynamic address translation must not be marked as "no-dat", since the hypervisor may then incorrectly not purge guest TLB entries. Region and segment tables allocated via vmem_crst_alloc() are incorrectly marked as "no-dat", as soon as slab_is_available() returns true. Such tables are allocated e.g. when kernel page tables are split, memory is hotplugged, or a DCSS segment is loaded. Fix this by adding the missing arch_set_page_dat() call. Cc: <stable@vger.kernel.org> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* dmaengine: stm32-mdma: correct desc prep when channel runningAlain Volmat2023-11-281-2/+2
| | | | | | | | | | | | | | | | | | | | commit 03f25d53b145bc2f7ccc82fc04e4482ed734f524 upstream. In case of the prep descriptor while the channel is already running, the CCR register value stored into the channel could already have its EN bit set. This would lead to a bad transfer since, at start transfer time, enabling the channel while other registers aren't yet properly set. To avoid this, ensure to mask the CCR_EN bit when storing the ccr value into the mdma channel structure. Fixes: a4ffb13c8946 ("dmaengine: Add STM32 MDMA driver") Signed-off-by: Alain Volmat <alain.volmat@foss.st.com> Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com> Cc: stable@vger.kernel.org Tested-by: Alain Volmat <alain.volmat@foss.st.com> Link: https://lore.kernel.org/r/20231009082450.452877-1-amelie.delaunay@foss.st.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* mcb: fix error handling for different scenarios when parsingSanjuán García, Jorge2023-11-282-1/+2
| | | | | | | | | | | | | | | | | | commit 63ba2d07b4be72b94216d20561f43e1150b25d98 upstream. chameleon_parse_gdd() may fail for different reasons and end up in the err tag. Make sure we at least always free the mcb_device allocated with mcb_alloc_dev(). If mcb_device_register() fails, make sure to give up the reference in the same place the device was added. Fixes: 728ac3389296 ("mcb: mcb-parse: fix error handing in chameleon_parse_gdd()") Cc: stable <stable@kernel.org> Reviewed-by: Jose Javier Rodriguez Barbarin <JoseJavier.Rodriguez@duagon.com> Signed-off-by: Jorge Sanjuan Garcia <jorge.sanjuangarcia@duagon.com> Link: https://lore.kernel.org/r/20231019141434.57971-2-jorge.sanjuangarcia@duagon.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* driver core: Release all resources during unbind before updating device linksSaravana Kannan2023-11-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | commit 2e84dc37920012b458e9458b19fc4ed33f81bc74 upstream. This commit fixes a bug in commit 9ed9895370ae ("driver core: Functional dependencies tracking support") where the device link status was incorrectly updated in the driver unbind path before all the device's resources were released. Fixes: 9ed9895370ae ("driver core: Functional dependencies tracking support") Cc: stable <stable@kernel.org> Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Closes: https://lore.kernel.org/all/20231014161721.f4iqyroddkcyoefo@pengutronix.de/ Signed-off-by: Saravana Kannan <saravanak@google.com> Cc: Thierry Reding <thierry.reding@gmail.com> Cc: Yang Yingliang <yangyingliang@huawei.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Mark Brown <broonie@kernel.org> Cc: Matti Vaittinen <mazziesaccount@gmail.com> Cc: James Clark <james.clark@arm.com> Acked-by: "Rafael J. Wysocki" <rafael@kernel.org> Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20231018013851.3303928-1-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* tracing: Have the user copy of synthetic event address use correct contextSteven Rostedt (Google)2023-11-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 4f7969bcd6d33042d62e249b41b5578161e4c868 upstream. A synthetic event is created by the synthetic event interface that can read both user or kernel address memory. In reality, it reads any arbitrary memory location from within the kernel. If the address space is in USER (where CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE is set) then it uses strncpy_from_user_nofault() to copy strings otherwise it uses strncpy_from_kernel_nofault(). But since both functions use the same variable there's no annotation to what that variable is (ie. __user). This makes sparse complain. Quiet sparse by typecasting the strncpy_from_user_nofault() variable to a __user pointer. Link: https://lore.kernel.org/linux-trace-kernel/20231031151033.73c42e23@gandalf.local.home Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Fixes: 0934ae9977c2 ("tracing: Fix reading strings from synthetic events"); Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202311010013.fm8WTxa5-lkp@intel.com/ Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* selftests/clone3: Fix broken test under !CONFIG_TIME_NSTiezhu Yang2023-11-281-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit fc7f04dc23db50206bee7891516ed4726c3f64cf upstream. When execute the following command to test clone3 under !CONFIG_TIME_NS: # make headers && cd tools/testing/selftests/clone3 && make && ./clone3 we can see the following error info: # [7538] Trying clone3() with flags 0x80 (size 0) # Invalid argument - Failed to create new process # [7538] clone3() with flags says: -22 expected 0 not ok 18 [7538] Result (-22) is different than expected (0) ... # Totals: pass:18 fail:1 xfail:0 xpass:0 skip:0 error:0 This is because if CONFIG_TIME_NS is not set, but the flag CLONE_NEWTIME (0x80) is used to clone a time namespace, it will return -EINVAL in copy_time_ns(). If kernel does not support CONFIG_TIME_NS, /proc/self/ns/time will be not exist, and then we should skip clone3() test with CLONE_NEWTIME. With this patch under !CONFIG_TIME_NS: # make headers && cd tools/testing/selftests/clone3 && make && ./clone3 ... # Time namespaces are not supported ok 18 # SKIP Skipping clone3() with CLONE_NEWTIME ... # Totals: pass:18 fail:0 xfail:0 xpass:0 skip:1 error:0 Link: https://lkml.kernel.org/r/1689066814-13295-1-git-send-email-yangtiezhu@loongson.cn Fixes: 515bddf0ec41 ("selftests/clone3: test clone3 with CLONE_NEWTIME") Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Suggested-by: Thomas Gleixner <tglx@linutronix.de> Cc: Christian Brauner <brauner@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* i2c: core: Run atomic i2c xfer when !preemptibleBenjamin Bara2023-11-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit aa49c90894d06e18a1ee7c095edbd2f37c232d02 upstream. Since bae1d3a05a8b, i2c transfers are non-atomic if preemption is disabled. However, non-atomic i2c transfers require preemption (e.g. in wait_for_completion() while waiting for the DMA). panic() calls preempt_disable_notrace() before calling emergency_restart(). Therefore, if an i2c device is used for the restart, the xfer should be atomic. This avoids warnings like: [ 12.667612] WARNING: CPU: 1 PID: 1 at kernel/rcu/tree_plugin.h:318 rcu_note_context_switch+0x33c/0x6b0 [ 12.676926] Voluntary context switch within RCU read-side critical section! ... [ 12.742376] schedule_timeout from wait_for_completion_timeout+0x90/0x114 [ 12.749179] wait_for_completion_timeout from tegra_i2c_wait_completion+0x40/0x70 ... [ 12.994527] atomic_notifier_call_chain from machine_restart+0x34/0x58 [ 13.001050] machine_restart from panic+0x2a8/0x32c Use !preemptible() instead, which is basically the same check as pre-v5.2. Fixes: bae1d3a05a8b ("i2c: core: remove use of in_atomic()") Cc: stable@vger.kernel.org # v5.2+ Suggested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Acked-by: Wolfram Sang <wsa@kernel.org> Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Tested-by: Nishanth Menon <nm@ti.com> Signed-off-by: Benjamin Bara <benjamin.bara@skidata.com> Link: https://lore.kernel.org/r/20230327-tegra-pmic-reboot-v7-2-18699d5dcd76@skidata.com Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* kernel/reboot: emergency_restart: Set correct system_stateBenjamin Bara2023-11-281-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 60466c067927abbcaff299845abd4b7069963139 upstream. As the emergency restart does not call kernel_restart_prepare(), the system_state stays in SYSTEM_RUNNING. Since bae1d3a05a8b, this hinders i2c_in_atomic_xfer_mode() from becoming active, and therefore might lead to avoidable warnings in the restart handlers, e.g.: [ 12.667612] WARNING: CPU: 1 PID: 1 at kernel/rcu/tree_plugin.h:318 rcu_note_context_switch+0x33c/0x6b0 [ 12.676926] Voluntary context switch within RCU read-side critical section! ... [ 12.742376] schedule_timeout from wait_for_completion_timeout+0x90/0x114 [ 12.749179] wait_for_completion_timeout from tegra_i2c_wait_completion+0x40/0x70 ... [ 12.994527] atomic_notifier_call_chain from machine_restart+0x34/0x58 [ 13.001050] machine_restart from panic+0x2a8/0x32c Avoid these by setting the correct system_state. Fixes: bae1d3a05a8b ("i2c: core: remove use of in_atomic()") Cc: stable@vger.kernel.org # v5.2+ Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Tested-by: Nishanth Menon <nm@ti.com> Signed-off-by: Benjamin Bara <benjamin.bara@skidata.com> Link: https://lore.kernel.org/r/20230327-tegra-pmic-reboot-v7-1-18699d5dcd76@skidata.com Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>