summaryrefslogtreecommitdiffstats
path: root/include
Commit message (Collapse)AuthorAgeFilesLines
* net/mlx5: DR, Check force-loopback RC QP capability independently from RoCEYevgeny Kliteynik2023-05-301-1/+3
| | | | | | | | | | | | | | | | | commit c7dd225bc224726c22db08e680bf787f60ebdee3 upstream. SW Steering uses RC QP for writing STEs to ICM. This writingis done in LB (loopback), and FL (force-loopback) QP is preferred for performance. FL is available when RoCE is enabled or disabled based on RoCE caps. This patch adds reading of FL capability from HCA caps in addition to the existing reading from RoCE caps, thus fixing the case where we didn't have loopback enabled when RoCE was disabled. Fixes: 7304d603a57a ("net/mlx5: DR, Add support for force-loopback QP") Signed-off-by: Itamar Gozlan <igozlan@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* ASoC: Intel: Skylake: Fix declaration of enum skl_ch_cfgCezary Rojewski2023-05-301-1/+2
| | | | | | | | | | | | | | commit 95109657471311601b98e71f03d0244f48dc61bb upstream. Constant 'C4_CHANNEL' does not exist on the firmware side. Value 0xC is reserved for 'C7_1' instead. Fixes: 04afbbbb1cba ("ASoC: Intel: Skylake: Update the topology interface structure") Signed-off-by: Cezary Rojewski <cezary.rojewski@intel.com> Signed-off-by: Amadeusz Sławiński <amadeuszx.slawinski@linux.intel.com> Link: https://lore.kernel.org/r/20230519201711.4073845-4-amadeuszx.slawinski@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* x86/pci/xen: populate MSI sysfs entriesMaximilian Heyne2023-05-301-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | commit 335b4223466dd75f9f3ea4918187afbadd22e5c8 upstream. Commit bf5e758f02fc ("genirq/msi: Simplify sysfs handling") reworked the creation of sysfs entries for MSI IRQs. The creation used to be in msi_domain_alloc_irqs_descs_locked after calling ops->domain_alloc_irqs. Then it moved into __msi_domain_alloc_irqs which is an implementation of domain_alloc_irqs. However, Xen comes with the only other implementation of domain_alloc_irqs and hence doesn't run the sysfs population code anymore. Commit 6c796996ee70 ("x86/pci/xen: Fixup fallout from the PCI/MSI overhaul") set the flag MSI_FLAG_DEV_SYSFS for the xen msi_domain_info but that doesn't actually have an effect because Xen uses it's own domain_alloc_irqs implementation. Fix this by making use of the fallback functions for sysfs population. Fixes: bf5e758f02fc ("genirq/msi: Simplify sysfs handling") Signed-off-by: Maximilian Heyne <mheyne@amazon.de> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20230503131656.15928-1-mheyne@amazon.de Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* fs: fix undefined behavior in bit shift for SB_NOUSERHao Ge2023-05-301-21/+21
| | | | | | | | | | | | | | | | commit f15afbd34d8fadbd375f1212e97837e32bc170cc upstream. Shifting signed 32-bit value by 31 bits is undefined, so changing significant bit to unsigned. It was spotted by UBSAN. So let's just fix this by using the BIT() helper for all SB_* flags. Fixes: e462ec50cb5f ("VFS: Differentiate mount flags (MS_*) from internal superblock flags") Signed-off-by: Hao Ge <gehao@kylinos.cn> Message-Id: <20230424051835.374204-1-gehao@kylinos.cn> [brauner@kernel.org: use BIT() for all SB_* flags] Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* firmware: arm_ffa: Fix FFA device names for logical partitionsSudeep Holla2023-05-301-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 19b8766459c41c6f318f8a548cc1c66dffd18363 upstream. Each physical partition can provide multiple services each with UUID. Each such service can be presented as logical partition with a unique combination of VM ID and UUID. The number of distinct UUID in a system will be less than or equal to the number of logical partitions. However, currently it fails to register more than one logical partition or service within a physical partition as the device name contains only VM ID while both VM ID and UUID are maintained in the partition information. The kernel complains with the below message: | sysfs: cannot create duplicate filename '/devices/arm-ffa-8001' | CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.3.0-rc7 #8 | Hardware name: FVP Base RevC (DT) | Call trace: | dump_backtrace+0xf8/0x118 | show_stack+0x18/0x24 | dump_stack_lvl+0x50/0x68 | dump_stack+0x18/0x24 | sysfs_create_dir_ns+0xe0/0x13c | kobject_add_internal+0x220/0x3d4 | kobject_add+0x94/0x100 | device_add+0x144/0x5d8 | device_register+0x20/0x30 | ffa_device_register+0x88/0xd8 | ffa_setup_partitions+0x108/0x1b8 | ffa_init+0x2ec/0x3a4 | do_one_initcall+0xcc/0x240 | do_initcall_level+0x8c/0xac | do_initcalls+0x54/0x94 | do_basic_setup+0x1c/0x28 | kernel_init_freeable+0x100/0x16c | kernel_init+0x20/0x1a0 | ret_from_fork+0x10/0x20 | kobject_add_internal failed for arm-ffa-8001 with -EEXIST, don't try to | register things with the same name in the same directory. | arm_ffa arm-ffa: unable to register device arm-ffa-8001 err=-17 | ARM FF-A: ffa_setup_partitions: failed to register partition ID 0x8001 By virtue of being random enough to avoid collisions when generated in a distributed system, there is no way to compress UUID keys to the number of bits required to identify each. We can eliminate '-' in the name but it is not worth eliminating 4 bytes and add unnecessary logic for doing that. Also v1.0 doesn't provide the UUID of the partitions which makes it hard to use the same for the device name. So to keep it simple, let us alloc an ID using ida_alloc() and append the same to "arm-ffa" to make up a unique device name. Also stash the id value in ffa_dev to help freeing the ID later when the device is destroyed. Fixes: e781858488b9 ("firmware: arm_ffa: Add initial FFA bus support for device enumeration") Reported-by: Lucian Paul-Trifu <lucian.paul-trifu@arm.com> Link: https://lore.kernel.org/r/20230419-ffa_fixes_6-4-v2-3-d9108e43a176@arm.com Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* power: supply: bq27xxx: Ensure power_supply_changed() is called on current ↵Hans de Goede2023-05-301-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | sign changes commit 939a116142012926e25de0ea6b7e2f8d86a5f1b6 upstream. On gauges where the current register is signed, there is no charging flag in the flags register. So only checking flags will not result in power_supply_changed() getting called when e.g. a charger is plugged in and the current sign changes from negative (discharging) to positive (charging). This causes userspace's notion of the status to lag until userspace does a poll. And when a power_supply_leds.c LED trigger is used to indicate charging status with a LED, this LED will lag until the capacity percentage changes, which may take many minutes (because the LED trigger only is updated on power_supply_changed() calls). Fix this by calling bq27xxx_battery_current_and_status() on gauges with a signed current register and checking if the status has changed. Fixes: 297a533b3e62 ("bq27x00: Cache battery registers") Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* power: supply: bq27xxx: Fix poll_interval handling and races on removeHans de Goede2023-05-301-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit c00bc80462afc7963f449d7f21d896d2f629cacc upstream. Before this patch bq27xxx_battery_teardown() was setting poll_interval = 0 to avoid bq27xxx_battery_update() requeuing the delayed_work item. There are 2 problems with this: 1. If the driver is unbound through sysfs, rather then the module being rmmod-ed, this changes poll_interval unexpectedly 2. This is racy, after it being set poll_interval could be changed before bq27xxx_battery_update() checks it through /sys/module/bq27xxx_battery/parameters/poll_interval Fix this by added a removed attribute to struct bq27xxx_device_info and using that instead of setting poll_interval to 0. There also is another poll_interval related race on remove(), writing /sys/module/bq27xxx_battery/parameters/poll_interval will requeue the delayed_work item for all devices on the bq27xxx_battery_devices list and the device being removed was only removed from that list after cancelling the delayed_work item. Fix this by moving the removal from the bq27xxx_battery_devices list to before cancelling the delayed_work item. Fixes: 8cfaaa811894 ("bq27x00_battery: Fix OOPS caused by unregistring bq27x00 driver") Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* drm: fix drmm_mutex_init()Matthew Auld2023-05-301-1/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit c21f11d182c2180d8b90eaff84f574cfa845b250 upstream. In mutex_init() lockdep identifies a lock by defining a special static key for each lock class. However if we wrap the macro in a function, like in drmm_mutex_init(), we end up generating: int drmm_mutex_init(struct drm_device *dev, struct mutex *lock) { static struct lock_class_key __key; __mutex_init((lock), "lock", &__key); .... } The static __key here is what lockdep uses to identify the lock class, however since this is just a normal function the key here will be created once, where all callers then use the same key. In effect the mutex->depmap.key will be the same pointer for different drmm_mutex_init() callers. This then results in impossible lockdep splats since lockdep thinks completely unrelated locks are the same lock class. To fix this turn drmm_mutex_init() into a macro such that it generates a different "static struct lock_class_key __key" for each invocation, which looks to be inline with what mutex_init() wants. v2: - Revamp the commit message with clearer explanation of the issue. - Rather export __drmm_mutex_release() than static inline. Reported-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reported-by: Sarah Walker <sarah.walker@imgtec.com> Fixes: e13f13e039dc ("drm: Add DRM-managed mutex_init()") Cc: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Cc: Boris Brezillon <boris.brezillon@collabora.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Jocelyn Falempe <jfalempe@redhat.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: dri-devel@lists.freedesktop.org Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20230519090733.489019-1-matthew.auld@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* USB: core: Add routines for endpoint checks in old driversAlan Stern2023-05-301-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 13890626501ffda22b18213ddaf7930473da5792 upstream. Many of the older USB drivers in the Linux USB stack were written based simply on a vendor's device specification. They use the endpoint information in the spec and assume these endpoints will always be present, with the properties listed, in any device matching the given vendor and product IDs. While that may have been true back then, with spoofing and fuzzing it is not true any more. More and more we are finding that those old drivers need to perform at least a minimum of checking before they try to use any endpoint other than ep0. To make this checking as simple as possible, we now add a couple of utility routines to the USB core. usb_check_bulk_endpoints() and usb_check_int_endpoints() take an interface pointer together with a list of endpoint addresses (numbers and directions). They check that the interface's current alternate setting includes endpoints with those addresses and that each of these endpoints has the right type: bulk or interrupt, respectively. Although we already have usb_find_common_endpoints() and related routines meant for a similar purpose, they are not well suited for this kind of checking. Those routines find endpoints of various kinds, but only one (either the first or the last) of each kind, and they don't verify that the endpoints' addresses agree with what the caller expects. In theory the new routines could be more general: They could take a particular altsetting as their argument instead of always using the interface's current altsetting. In practice I think this won't matter too much; multiple altsettings tend to be used for transferring media (audio or visual) over isochronous endpoints, not bulk or interrupt. Drivers for such devices will generally require more sophisticated checking than these simplistic routines provide. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Link: https://lore.kernel.org/r/dd2c8e8c-2c87-44ea-ba17-c64b97e201c9@rowland.harvard.edu Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* net: fix stack overflow when LRO is disabled for virtual interfacesTaehee Yoo2023-05-302-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit ae9b15fbe63447bc1d3bba3769f409d17ca6fdf6 upstream. When the virtual interface's feature is updated, it synchronizes the updated feature for its own lower interface. This propagation logic should be worked as the iteration, not recursively. But it works recursively due to the netdev notification unexpectedly. This problem occurs when it disables LRO only for the team and bonding interface type. team0 | +------+------+-----+-----+ | | | | | team1 team2 team3 ... team200 If team0's LRO feature is updated, it generates the NETDEV_FEAT_CHANGE event to its own lower interfaces(team1 ~ team200). It is worked by netdev_sync_lower_features(). So, the NETDEV_FEAT_CHANGE notification logic of each lower interface work iteratively. But generated NETDEV_FEAT_CHANGE event is also sent to the upper interface too. upper interface(team0) generates the NETDEV_FEAT_CHANGE event for its own lower interfaces again. lower and upper interfaces receive this event and generate this event again and again. So, the stack overflow occurs. But it is not the infinite loop issue. Because the netdev_sync_lower_features() updates features before generating the NETDEV_FEAT_CHANGE event. Already synchronized lower interfaces skip notification logic. So, it is just the problem that iteration logic is changed to the recursive unexpectedly due to the notification mechanism. Reproducer: ip link add team0 type team ethtool -K team0 lro on for i in {1..200} do ip link add team$i master team0 type team ethtool -K team$i lro on done ethtool -K team0 lro off In order to fix it, the notifier_ctx member of bonding/team is introduced. Reported-by: syzbot+60748c96cf5c6df8e581@syzkaller.appspotmail.com Fixes: fd867d51f889 ("net/core: generic support for disabling netdev features down stack") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20230517143010.3596250-1-ap420073@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* tpm: Prevent hwrng from activating during resumeJarkko Sakkinen2023-05-301-0/+1
| | | | | | | | | | | | | | [ Upstream commit 99d46450625590d410f86fe4660a5eff7d3b8343 ] Set TPM_CHIP_FLAG_SUSPENDED in tpm_pm_suspend() and reset in tpm_pm_resume(). While the flag is set, tpm_hwrng() gives back zero bytes. This prevents hwrng from racing during resume. Cc: stable@vger.kernel.org Fixes: 6e592a065d51 ("tpm: Move Linux RNG connection to hwrng") Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
* tpm: Re-enable TPM chip boostrapping non-tpm_tis TPM driversJarkko Sakkinen2023-05-301-6/+7
| | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 0c8862de05c1a087795ee0a87bf61a6394306cc0 ] TPM chip bootstrapping was removed from tpm_chip_register(), and it was relocated to tpm_tis_core. This breaks all drivers which are not based on tpm_tis because the chip will not get properly initialized. Take the corrective steps: 1. Rename tpm_chip_startup() as tpm_chip_bootstrap() and make it one-shot. 2. Call tpm_chip_bootstrap() in tpm_chip_register(), which reverts the things as tehy used to be. Cc: Lino Sanfilippo <l.sanfilippo@kunbus.com> Fixes: 548eb516ec0f ("tpm, tpm_tis: startup chip before testing for interrupts") Reported-by: Pengfei Xu <pengfei.xu@intel.com> Link: https://lore.kernel.org/all/ZEjqhwHWBnxcaRV5@xpf.sh.intel.com/ Tested-by: Pengfei Xu <pengfei.xu@intel.com> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Stable-dep-of: 99d464506255 ("tpm: Prevent hwrng from activating during resume") Signed-off-by: Sasha Levin <sashal@kernel.org>
* SUNRPC: always free ctxt when freeing deferred requestNeilBrown2023-05-242-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 948f072ada23e0a504c5e4d7d71d4c83bd0785ec ] Since the ->xprt_ctxt pointer was added to svc_deferred_req, it has not been sufficient to use kfree() to free a deferred request. We may need to free the ctxt as well. As freeing the ctxt is all that ->xpo_release_rqst() does, we repurpose it to explicit do that even when the ctxt is not stored in an rqst. So we now have ->xpo_release_ctxt() which is given an xprt and a ctxt, which may have been taken either from an rqst or from a dreq. The caller is now responsible for clearing that pointer after the call to ->xpo_release_ctxt. We also clear dr->xprt_ctxt when the ctxt is moved into a new rqst when revisiting a deferred request. This ensures there is only one pointer to the ctxt, so the risk of double freeing in future is reduced. The new code in svc_xprt_release which releases both the ctxt and any rq_deferred depends on this. Fixes: 773f91b2cf3f ("SUNRPC: Fix NFSD's request deferral on RDMA transports") Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* platform: Provide a remove callback that returns no valueUwe Kleine-König2023-05-241-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 5c5a7680e67ba6fbbb5f4d79fa41485450c1985c ] struct platform_driver::remove returning an integer made driver authors expect that returning an error code was proper error handling. However the driver core ignores the error and continues to remove the device because there is nothing the core could do anyhow and reentering the remove callback again is only calling for trouble. So this is an source for errors typically yielding resource leaks in the error path. As there are too many platform drivers to neatly convert them all to return void in a single go, do it in several steps after this patch: a) Convert all drivers to implement .remove_new() returning void instead of .remove() returning int; b) Change struct platform_driver::remove() to return void and so make it identical to .remove_new(); c) Change all drivers back to .remove() now with the better prototype; d) drop struct platform_driver::remove_new(). While this touches all drivers eventually twice, steps a) and c) can be done one driver after another and so reduces coordination efforts immensely and simplifies review. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20221209150914.3557650-1-u.kleine-koenig@pengutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Stable-dep-of: 17955aba7877 ("ASoC: fsl_micfil: Fix error handler with pm_runtime_enable") Signed-off-by: Sasha Levin <sashal@kernel.org>
* sched: Fix KCSAN noinstr violationJosh Poimboeuf2023-05-241-1/+1
| | | | | | | | | | | | | | | | [ Upstream commit e0b081d17a9f4e5c0cbb0e5fbeb1abe3de0f7e4e ] With KCSAN enabled, end_of_stack() can get out-of-lined. Force it inline. Fixes the following warnings: vmlinux.o: warning: objtool: check_stackleak_irqoff+0x2b: call to end_of_stack() leaves .noinstr.text section Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/cc1b4d73d3a428a00d206242a68fdf99a934ca7b.1681320026.git.jpoimboe@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
* Bluetooth: Add new quirk for broken set random RPA timeout for ATS2851Raul Cheleguini2023-05-241-0/+8
| | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 91b6d02ddcd113352bdd895990b252065c596de7 ] The ATS2851 based controller advertises support for command "LE Set Random Private Address Timeout" but does not actually implement it, impeding the controller initialization. Add the quirk HCI_QUIRK_BROKEN_SET_RPA_TIMEOUT to unblock the controller initialization. < HCI Command: LE Set Resolvable Private... (0x08|0x002e) plen 2 Timeout: 900 seconds > HCI Event: Command Status (0x0f) plen 4 LE Set Resolvable Private Address Timeout (0x08|0x002e) ncmd 1 Status: Unknown HCI Command (0x01) Co-developed-by: imoc <wzj9912@gmail.com> Signed-off-by: imoc <wzj9912@gmail.com> Signed-off-by: Raul Cheleguini <raul.cheleguini@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* Bluetooth: Add new quirk for broken local ext features page 2Vasily Khoruzhick2023-05-241-0/+7
| | | | | | | | | | | | | | | [ Upstream commit 8194f1ef5a815aea815a91daf2c721eab2674f1f ] Some adapters (e.g. RTL8723CS) advertise that they have more than 2 pages for local ext features, but they don't support any features declared in these pages. RTL8723CS reports max_page = 2 and declares support for sync train and secure connection, but it responds with either garbage or with error in status on corresponding commands. Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com> Signed-off-by: Bastian Germann <bage@debian.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* ipvs: Update width of source for ip_vs_sync_conn_optionsSimon Horman2023-05-241-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit e3478c68f6704638d08f437cbc552ca5970c151a ] In ip_vs_sync_conn_v0() copy is made to struct ip_vs_sync_conn_options. That structure looks like this: struct ip_vs_sync_conn_options { struct ip_vs_seq in_seq; struct ip_vs_seq out_seq; }; The source of the copy is the in_seq field of struct ip_vs_conn. Whose type is struct ip_vs_seq. Thus we can see that the source - is not as wide as the amount of data copied, which is the width of struct ip_vs_sync_conn_option. The copy is safe because the next field in is another struct ip_vs_seq. Make use of struct_group() to annotate this. Flagged by gcc-13 as: In file included from ./include/linux/string.h:254, from ./include/linux/bitmap.h:11, from ./include/linux/cpumask.h:12, from ./arch/x86/include/asm/paravirt.h:17, from ./arch/x86/include/asm/cpuid.h:62, from ./arch/x86/include/asm/processor.h:19, from ./arch/x86/include/asm/timex.h:5, from ./include/linux/timex.h:67, from ./include/linux/time32.h:13, from ./include/linux/time.h:60, from ./include/linux/stat.h:19, from ./include/linux/module.h:13, from net/netfilter/ipvs/ip_vs_sync.c:38: In function 'fortify_memcpy_chk', inlined from 'ip_vs_sync_conn_v0' at net/netfilter/ipvs/ip_vs_sync.c:606:3: ./include/linux/fortify-string.h:529:25: error: call to '__read_overflow2_field' declared with attribute warning: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Werror=attribute-warning] 529 | __read_overflow2_field(q_size_field, size); | Compile tested only. Signed-off-by: Simon Horman <horms@kernel.org> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
* netdev: Enforce index cap in netdev_get_tx_queueNick Child2023-05-241-0/+1
| | | | | | | | | | | | | | | | | [ Upstream commit 1cc6571f562774f1d928dc8b3cff50829b86e970 ] When requesting a TX queue at a given index, warn on out-of-bounds referencing if the index is greater than the allocated number of queues. Specifically, since this function is used heavily in the networking stack use DEBUG_NET_WARN_ON_ONCE to avoid executing a new branch on every packet. Signed-off-by: Nick Child <nnac123@linux.ibm.com> Link: https://lore.kernel.org/r/20230321150725.127229-2-nnac123@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
* irqchip/gicv3: Workaround for NVIDIA erratum T241-FABRIC-4Shanker Donthineni2023-05-241-0/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 35727af2b15d98a2dd2811d631d3a3886111312e ] The T241 platform suffers from the T241-FABRIC-4 erratum which causes unexpected behavior in the GIC when multiple transactions are received simultaneously from different sources. This hardware issue impacts NVIDIA server platforms that use more than two T241 chips interconnected. Each chip has support for 320 {E}SPIs. This issue occurs when multiple packets from different GICs are incorrectly interleaved at the target chip. The erratum text below specifies exactly what can cause multiple transfer packets susceptible to interleaving and GIC state corruption. GIC state corruption can lead to a range of problems, including kernel panics, and unexpected behavior. >From the erratum text: "In some cases, inter-socket AXI4 Stream packets with multiple transfers, may be interleaved by the fabric when presented to ARM Generic Interrupt Controller. GIC expects all transfers of a packet to be delivered without any interleaving. The following GICv3 commands may result in multiple transfer packets over inter-socket AXI4 Stream interface: - Register reads from GICD_I* and GICD_N* - Register writes to 64-bit GICD registers other than GICD_IROUTERn* - ITS command MOVALL Multiple commands in GICv4+ utilize multiple transfer packets, including VMOVP, VMOVI, VMAPP, and 64-bit register accesses." This issue impacts system configurations with more than 2 sockets, that require multi-transfer packets to be sent over inter-socket AXI4 Stream interface between GIC instances on different sockets. GICv4 cannot be supported. GICv3 SW model can only be supported with the workaround. Single and Dual socket configurations are not impacted by this issue and support GICv3 and GICv4." Link: https://developer.nvidia.com/docs/t241-fabric-4/nvidia-t241-fabric-4-errata.pdf Writing to the chip alias region of the GICD_In{E} registers except GICD_ICENABLERn has an equivalent effect as writing to the global distributor. The SPI interrupt deactivate path is not impacted by the erratum. To fix this problem, implement a workaround that ensures read accesses to the GICD_In{E} registers are directed to the chip that owns the SPI, and disable GICv4.x features. To simplify code changes, the gic_configure_irq() function uses the same alias region for both read and write operations to GICD_ICFGR. Co-developed-by: Vikram Sethi <vsethi@nvidia.com> Signed-off-by: Vikram Sethi <vsethi@nvidia.com> Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com> Acked-by: Sudeep Holla <sudeep.holla@arm.com> (for SMCCC/SOC ID bits) Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230319024314.3540573-2-sdonthineni@nvidia.com Signed-off-by: Sasha Levin <sashal@kernel.org>
* firmware: arm_sdei: Fix sleep from invalid context BUGPierre Gondois2023-05-241-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit d2c48b2387eb89e0bf2a2e06e30987cf410acad4 ] Running a preempt-rt (v6.2-rc3-rt1) based kernel on an Ampere Altra triggers: BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46 in_atomic(): 0, irqs_disabled(): 128, non_block: 0, pid: 24, name: cpuhp/0 preempt_count: 0, expected: 0 RCU nest depth: 0, expected: 0 3 locks held by cpuhp/0/24: #0: ffffda30217c70d0 (cpu_hotplug_lock){++++}-{0:0}, at: cpuhp_thread_fun+0x5c/0x248 #1: ffffda30217c7120 (cpuhp_state-up){+.+.}-{0:0}, at: cpuhp_thread_fun+0x5c/0x248 #2: ffffda3021c711f0 (sdei_list_lock){....}-{3:3}, at: sdei_cpuhp_up+0x3c/0x130 irq event stamp: 36 hardirqs last enabled at (35): [<ffffda301e85b7bc>] finish_task_switch+0xb4/0x2b0 hardirqs last disabled at (36): [<ffffda301e812fec>] cpuhp_thread_fun+0x21c/0x248 softirqs last enabled at (0): [<ffffda301e80b184>] copy_process+0x63c/0x1ac0 softirqs last disabled at (0): [<0000000000000000>] 0x0 CPU: 0 PID: 24 Comm: cpuhp/0 Not tainted 5.19.0-rc3-rt5-[...] Hardware name: WIWYNN Mt.Jade Server [...] Call trace: dump_backtrace+0x114/0x120 show_stack+0x20/0x70 dump_stack_lvl+0x9c/0xd8 dump_stack+0x18/0x34 __might_resched+0x188/0x228 rt_spin_lock+0x70/0x120 sdei_cpuhp_up+0x3c/0x130 cpuhp_invoke_callback+0x250/0xf08 cpuhp_thread_fun+0x120/0x248 smpboot_thread_fn+0x280/0x320 kthread+0x130/0x140 ret_from_fork+0x10/0x20 sdei_cpuhp_up() is called in the STARTING hotplug section, which runs with interrupts disabled. Use a CPUHP_AP_ONLINE_DYN entry instead to execute the cpuhp cb later, with preemption enabled. SDEI originally got its own cpuhp slot to allow interacting with perf. It got superseded by pNMI and this early slot is not relevant anymore. [1] Some SDEI calls (e.g. SDEI_1_0_FN_SDEI_PE_MASK) take actions on the calling CPU. It is checked that preemption is disabled for them. _ONLINE cpuhp cb are executed in the 'per CPU hotplug thread'. Preemption is enabled in those threads, but their cpumask is limited to 1 CPU. Move 'WARN_ON_ONCE(preemptible())' statements so that SDEI cpuhp cb don't trigger them. Also add a check for the SDEI_1_0_FN_SDEI_PRIVATE_RESET SDEI call which acts on the calling CPU. [1]: https://lore.kernel.org/all/5813b8c5-ae3e-87fd-fccc-94c9cd08816d@arm.com/ Suggested-by: James Morse <james.morse@arm.com> Signed-off-by: Pierre Gondois <pierre.gondois@arm.com> Reviewed-by: James Morse <james.morse@arm.com> Link: https://lore.kernel.org/r/20230216084920.144064-1-pierre.gondois@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
* open: return EINVAL for O_DIRECTORY | O_CREATChristian Brauner2023-05-241-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 43b450632676fb60e9faeddff285d9fac94a4f58 ] After a couple of years and multiple LTS releases we received a report that the behavior of O_DIRECTORY | O_CREAT changed starting with v5.7. On kernels prior to v5.7 combinations of O_DIRECTORY, O_CREAT, O_EXCL had the following semantics: (1) open("/tmp/d", O_DIRECTORY | O_CREAT) * d doesn't exist: create regular file * d exists and is a regular file: ENOTDIR * d exists and is a directory: EISDIR (2) open("/tmp/d", O_DIRECTORY | O_CREAT | O_EXCL) * d doesn't exist: create regular file * d exists and is a regular file: EEXIST * d exists and is a directory: EEXIST (3) open("/tmp/d", O_DIRECTORY | O_EXCL) * d doesn't exist: ENOENT * d exists and is a regular file: ENOTDIR * d exists and is a directory: open directory On kernels since to v5.7 combinations of O_DIRECTORY, O_CREAT, O_EXCL have the following semantics: (1) open("/tmp/d", O_DIRECTORY | O_CREAT) * d doesn't exist: ENOTDIR (create regular file) * d exists and is a regular file: ENOTDIR * d exists and is a directory: EISDIR (2) open("/tmp/d", O_DIRECTORY | O_CREAT | O_EXCL) * d doesn't exist: ENOTDIR (create regular file) * d exists and is a regular file: EEXIST * d exists and is a directory: EEXIST (3) open("/tmp/d", O_DIRECTORY | O_EXCL) * d doesn't exist: ENOENT * d exists and is a regular file: ENOTDIR * d exists and is a directory: open directory This is a fairly substantial semantic change that userspace didn't notice until Pedro took the time to deliberately figure out corner cases. Since no one noticed this breakage we can somewhat safely assume that O_DIRECTORY | O_CREAT combinations are likely unused. The v5.7 breakage is especially weird because while ENOTDIR is returned indicating failure a regular file is actually created. This doesn't make a lot of sense. Time was spent finding potential users of this combination. Searching on codesearch.debian.net showed that codebases often express semantical expectations about O_DIRECTORY | O_CREAT which are completely contrary to what our code has done and currently does. The expectation often is that this particular combination would create and open a directory. This suggests users who tried to use that combination would stumble upon the counterintuitive behavior no matter if pre-v5.7 or post v5.7 and quickly realize neither semantics give them what they want. For some examples see the code examples in [1] to [3] and the discussion in [4]. There are various ways to address this issue. The lazy/simple option would be to restore the pre-v5.7 behavior and to just live with that bug forever. But since there's a real chance that the O_DIRECTORY | O_CREAT quirk isn't relied upon we should try to get away with murder(ing bad semantics) first. If we need to Frankenstein pre-v5.7 behavior later so be it. So let's simply return EINVAL categorically for O_DIRECTORY | O_CREAT combinations. In addition to cleaning up the old bug this also opens up the possiblity to make that flag combination do something more intuitive in the future. Starting with this commit the following semantics apply: (1) open("/tmp/d", O_DIRECTORY | O_CREAT) * d doesn't exist: EINVAL * d exists and is a regular file: EINVAL * d exists and is a directory: EINVAL (2) open("/tmp/d", O_DIRECTORY | O_CREAT | O_EXCL) * d doesn't exist: EINVAL * d exists and is a regular file: EINVAL * d exists and is a directory: EINVAL (3) open("/tmp/d", O_DIRECTORY | O_EXCL) * d doesn't exist: ENOENT * d exists and is a regular file: ENOTDIR * d exists and is a directory: open directory One additional note, O_TMPFILE is implemented as: #define __O_TMPFILE 020000000 #define O_TMPFILE (__O_TMPFILE | O_DIRECTORY) #define O_TMPFILE_MASK (__O_TMPFILE | O_DIRECTORY | O_CREAT) For older kernels it was important to return an explicit error when O_TMPFILE wasn't supported. So O_TMPFILE requires that O_DIRECTORY is raised alongside __O_TMPFILE. It also enforced that O_CREAT wasn't specified. Since O_DIRECTORY | O_CREAT could be used to create a regular allowing that combination together with __O_TMPFILE would've meant that false positives were possible, i.e., that a regular file was created instead of a O_TMPFILE. This could've been used to trick userspace into thinking it operated on a O_TMPFILE when it wasn't. Now that we block O_DIRECTORY | O_CREAT completely the check for O_CREAT in the __O_TMPFILE branch via if ((flags & O_TMPFILE_MASK) != O_TMPFILE) can be dropped. Instead we can simply check verify that O_DIRECTORY is raised via if (!(flags & O_DIRECTORY)) and explain this in two comments. As Aleksa pointed out O_PATH is unaffected by this change since it always returned EINVAL if O_CREAT was specified - with or without O_DIRECTORY. Link: https://lore.kernel.org/lkml/20230320071442.172228-1-pedro.falcato@gmail.com Link: https://sources.debian.org/src/flatpak/1.14.4-1/subprojects/libglnx/glnx-dirfd.c/?hl=324#L324 [1] Link: https://sources.debian.org/src/flatpak-builder/1.2.3-1/subprojects/libglnx/glnx-shutil.c/?hl=251#L251 [2] Link: https://sources.debian.org/src/ostree/2022.7-2/libglnx/glnx-dirfd.c/?hl=324#L324 [3] Link: https://www.openwall.com/lists/oss-security/2014/11/26/14 [4] Reported-by: Pedro Falcato <pedro.falcato@gmail.com> Cc: Aleksa Sarai <cyphar@cyphar.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
* net: add vlan_get_protocol_and_depth() helperEric Dumazet2023-05-241-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 4063384ef762cc5946fc7a3f89879e76c6ec51e2 ] Before blamed commit, pskb_may_pull() was used instead of skb_header_pointer() in __vlan_get_protocol() and friends. Few callers depended on skb->head being populated with MAC header, syzbot caught one of them (skb_mac_gso_segment()) Add vlan_get_protocol_and_depth() to make the intent clearer and use it where sensible. This is a more generic fix than commit e9d3f80935b6 ("net/af_packet: make sure to pull mac header") which was dealing with a similar issue. kernel BUG at include/linux/skbuff.h:2655 ! invalid opcode: 0000 [#1] SMP KASAN CPU: 0 PID: 1441 Comm: syz-executor199 Not tainted 6.1.24-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/14/2023 RIP: 0010:__skb_pull include/linux/skbuff.h:2655 [inline] RIP: 0010:skb_mac_gso_segment+0x68f/0x6a0 net/core/gro.c:136 Code: fd 48 8b 5c 24 10 44 89 6b 70 48 c7 c7 c0 ae 0d 86 44 89 e6 e8 a1 91 d0 00 48 c7 c7 00 af 0d 86 48 89 de 31 d2 e8 d1 4a e9 ff <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 RSP: 0018:ffffc90001bd7520 EFLAGS: 00010286 RAX: ffffffff8469736a RBX: ffff88810f31dac0 RCX: ffff888115a18b00 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffffc90001bd75e8 R08: ffffffff84697183 R09: fffff5200037adf9 R10: 0000000000000000 R11: dffffc0000000001 R12: 0000000000000012 R13: 000000000000fee5 R14: 0000000000005865 R15: 000000000000fed7 FS: 000055555633f300(0000) GS:ffff8881f6a00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020000000 CR3: 0000000116fea000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> [<ffffffff847018dd>] __skb_gso_segment+0x32d/0x4c0 net/core/dev.c:3419 [<ffffffff8470398a>] skb_gso_segment include/linux/netdevice.h:4819 [inline] [<ffffffff8470398a>] validate_xmit_skb+0x3aa/0xee0 net/core/dev.c:3725 [<ffffffff84707042>] __dev_queue_xmit+0x1332/0x3300 net/core/dev.c:4313 [<ffffffff851a9ec7>] dev_queue_xmit+0x17/0x20 include/linux/netdevice.h:3029 [<ffffffff851b4a82>] packet_snd net/packet/af_packet.c:3111 [inline] [<ffffffff851b4a82>] packet_sendmsg+0x49d2/0x6470 net/packet/af_packet.c:3142 [<ffffffff84669a12>] sock_sendmsg_nosec net/socket.c:716 [inline] [<ffffffff84669a12>] sock_sendmsg net/socket.c:736 [inline] [<ffffffff84669a12>] __sys_sendto+0x472/0x5f0 net/socket.c:2139 [<ffffffff84669c75>] __do_sys_sendto net/socket.c:2151 [inline] [<ffffffff84669c75>] __se_sys_sendto net/socket.c:2147 [inline] [<ffffffff84669c75>] __x64_sys_sendto+0xe5/0x100 net/socket.c:2147 [<ffffffff8551d40f>] do_syscall_x64 arch/x86/entry/common.c:50 [inline] [<ffffffff8551d40f>] do_syscall_64+0x2f/0x50 arch/x86/entry/common.c:80 [<ffffffff85600087>] entry_SYSCALL_64_after_hwframe+0x63/0xcd Fixes: 469aceddfa3e ("vlan: consolidate VLAN parsing code and limit max parsing depth") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Toke Høiland-Jørgensen <toke@redhat.com> Cc: Willem de Bruijn <willemb@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
* bonding: fix send_peer_notif overflowHangbin Liu2023-05-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 9949e2efb54eb3001cb2f6512ff3166dddbfb75d ] Bonding send_peer_notif was defined as u8. Since commit 07a4ddec3ce9 ("bonding: add an option to specify a delay between peer notifications"). the bond->send_peer_notif will be num_peer_notif multiplied by peer_notif_delay, which is u8 * u32. This would cause the send_peer_notif overflow easily. e.g. ip link add bond0 type bond mode 1 miimon 100 num_grat_arp 30 peer_notify_delay 1000 To fix the overflow, let's set the send_peer_notif to u32 and limit peer_notif_delay to 300s. Reported-by: Liang Li <liali@redhat.com> Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2090053 Fixes: 07a4ddec3ce9 ("bonding: add an option to specify a delay between peer notifications") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
* net: Fix load-tearing on sk->sk_stamp in sock_recv_cmsgs().Kuniyuki Iwashima2023-05-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit dfd9248c071a3710c24365897459538551cb7167 ] KCSAN found a data race in sock_recv_cmsgs() where the read access to sk->sk_stamp needs READ_ONCE(). BUG: KCSAN: data-race in packet_recvmsg / packet_recvmsg write (marked) to 0xffff88803c81f258 of 8 bytes by task 19171 on cpu 0: sock_write_timestamp include/net/sock.h:2670 [inline] sock_recv_cmsgs include/net/sock.h:2722 [inline] packet_recvmsg+0xb97/0xd00 net/packet/af_packet.c:3489 sock_recvmsg_nosec net/socket.c:1019 [inline] sock_recvmsg+0x11a/0x130 net/socket.c:1040 sock_read_iter+0x176/0x220 net/socket.c:1118 call_read_iter include/linux/fs.h:1845 [inline] new_sync_read fs/read_write.c:389 [inline] vfs_read+0x5e0/0x630 fs/read_write.c:470 ksys_read+0x163/0x1a0 fs/read_write.c:613 __do_sys_read fs/read_write.c:623 [inline] __se_sys_read fs/read_write.c:621 [inline] __x64_sys_read+0x41/0x50 fs/read_write.c:621 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x72/0xdc read to 0xffff88803c81f258 of 8 bytes by task 19183 on cpu 1: sock_recv_cmsgs include/net/sock.h:2721 [inline] packet_recvmsg+0xb64/0xd00 net/packet/af_packet.c:3489 sock_recvmsg_nosec net/socket.c:1019 [inline] sock_recvmsg+0x11a/0x130 net/socket.c:1040 sock_read_iter+0x176/0x220 net/socket.c:1118 call_read_iter include/linux/fs.h:1845 [inline] new_sync_read fs/read_write.c:389 [inline] vfs_read+0x5e0/0x630 fs/read_write.c:470 ksys_read+0x163/0x1a0 fs/read_write.c:613 __do_sys_read fs/read_write.c:623 [inline] __se_sys_read fs/read_write.c:621 [inline] __x64_sys_read+0x41/0x50 fs/read_write.c:621 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x72/0xdc value changed: 0xffffffffc4653600 -> 0x0000000000000000 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 19183 Comm: syz-executor.5 Not tainted 6.3.0-rc7-02330-gca6270c12e20 #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 Fixes: 6c7c98bad488 ("sock: avoid dirtying sk_stamp, if possible") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230508175543.55756-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
* linux/dim: Do nothing if no time delta between samplesRoy Novich2023-05-241-1/+2
| | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 162bd18eb55adf464a0fa2b4144b8d61c75ff7c2 ] Add return value for dim_calc_stats. This is an indication for the caller if curr_stats was assigned by the function. Avoid using curr_stats uninitialized over {rdma/net}_dim, when no time delta between samples. Coverity reported this potential use of an uninitialized variable. Fixes: 4c4dbb4a7363 ("net/mlx5e: Move dynamic interrupt coalescing code to include/linux") Fixes: cb3c7fd4f839 ("net/mlx5e: Support adaptive RX coalescing") Signed-off-by: Roy Novich <royno@nvidia.com> Reviewed-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Link: https://lore.kernel.org/r/20230507135743.138993-1-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* x86/amd_nb: Add PCI ID for family 19h model 78hMario Limonciello2023-05-171-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | commit 23a5b8bb022c1e071ca91b1a9c10f0ad6a0966e9 upstream. Commit 310e782a99c7 ("platform/x86/amd: pmc: Utilize SMN index 0 for driver probe") switched to using amd_smn_read() which relies upon the misc PCI ID used by DF function 3 being included in a table. The ID for model 78h is missing in that table, so amd_smn_read() doesn't work. Add the missing ID into amd_nb, restoring s2idle on this system. [ bp: Simplify commit message. ] Fixes: 310e782a99c7 ("platform/x86/amd: pmc: Utilize SMN index 0 for driver probe") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Bjorn Helgaas <bhelgaas@google.com> # pci_ids.h Acked-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20230427053338.16653-2-mario.limonciello@amd.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* drm/dsc: fix DP_DSC_MAX_BPP_DELTA_* macro valuesJani Nikula2023-05-171-2/+2
| | | | | | | | | | | | | | commit 0d68683838f2850dd8ff31f1121e05bfb7a2def0 upstream. The macro values just don't match the specs. Fix them. Fixes: 1482ec00be4a ("drm: Add missing DP DSC extended capability definitions.") Cc: Vinod Govindapillai <vinod.govindapillai@intel.com> Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230406134615.1422509-2-jani.nikula@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* drm/dsc: fix drm_edp_dsc_sink_output_bpp() DPCD high byte usageJani Nikula2023-05-172-4/+2
| | | | | | | | | | | | | | | | | | | [ Upstream commit 13525645e2246ebc8a21bd656248d86022a6ee8f ] The operator precedence between << and & is wrong, leading to the high byte being completely ignored. For example, with the 6.4 format, 32 becomes 0 and 24 becomes 8. Fix it, and remove the slightly confusing and unnecessary DP_DSC_MAX_BITS_PER_PIXEL_HI_SHIFT macro while at it. Fixes: 0575650077ea ("drm/dp: DRM DP helper/macros to get DP sink DSC parameters") Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Cc: Manasi Navare <navaremanasi@google.com> Cc: Anusha Srivatsa <anusha.srivatsa@intel.com> Cc: <stable@vger.kernel.org> # v5.0+ Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230406134615.1422509-1-jani.nikula@intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
* drm: Add missing DP DSC extended capability definitions.Stanislav Lisovskiy2023-05-171-1/+8
| | | | | | | | | | | | | | | | | [ Upstream commit 1482ec00be4a3634aeffbcc799791a723df69339 ] Adding DP DSC register definitions, we might need for further DSC implementation, supporting MST and DP branch pass-through mode. v2: - Fixed checkpatch comment warning v3: - Removed function which is not yet used(Jani Nikula) Reviewed-by: Vinod Govindapillai <vinod.govindapillai@intel.com> Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221101094222.22091-2-stanislav.lisovskiy@intel.com Stable-dep-of: 13525645e224 ("drm/dsc: fix drm_edp_dsc_sink_output_bpp() DPCD high byte usage") Signed-off-by: Sasha Levin <sashal@kernel.org>
* f2fs: refactor extent_cache to support for read and moreJaegeuk Kim2023-05-171-26/+36
| | | | | | | | | | | [ Upstream commit e7547daccd6a37522f0af74ec4b5a3036f3dd328 ] This patch prepares extent_cache to be ready for addition. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Stable-dep-of: 043d2d00b443 ("f2fs: factor out victim_entry usage from general rb_tree use") Signed-off-by: Sasha Levin <sashal@kernel.org>
* crypto: api - Add scaffolding to change completion function signatureHerbert Xu2023-05-172-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit c35e03eaece71101ff6cbf776b86403860ac8cc3 ] The crypto completion function currently takes a pointer to a struct crypto_async_request object. However, in reality the API does not allow the use of any part of the object apart from the data field. For example, ahash/shash will create a fake object on the stack to pass along a different data field. This leads to potential bugs where the user may try to dereference or otherwise use the crypto_async_request object. This patch adds some temporary scaffolding so that the completion function can take a void * instead. Once affected users have been converted this can be removed. The helper crypto_request_complete will remain even after the conversion is complete. It should be used instead of calling the completion function directly. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Stable-dep-of: 4140aafcff16 ("crypto: engine - fix crypto_queue backlog handling") Signed-off-by: Sasha Levin <sashal@kernel.org>
* netfilter: nf_tables: deactivate anonymous set from preparation phasePablo Neira Ayuso2023-05-111-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit c1592a89942e9678f7d9c8030efa777c0d57edab upstream. Toggle deleted anonymous sets as inactive in the next generation, so users cannot perform any update on it. Clear the generation bitmask in case the transaction is aborted. The following KASAN splat shows a set element deletion for a bound anonymous set that has been already removed in the same transaction. [ 64.921510] ================================================================== [ 64.923123] BUG: KASAN: wild-memory-access in nf_tables_commit+0xa24/0x1490 [nf_tables] [ 64.924745] Write of size 8 at addr dead000000000122 by task test/890 [ 64.927903] CPU: 3 PID: 890 Comm: test Not tainted 6.3.0+ #253 [ 64.931120] Call Trace: [ 64.932699] <TASK> [ 64.934292] dump_stack_lvl+0x33/0x50 [ 64.935908] ? nf_tables_commit+0xa24/0x1490 [nf_tables] [ 64.937551] kasan_report+0xda/0x120 [ 64.939186] ? nf_tables_commit+0xa24/0x1490 [nf_tables] [ 64.940814] nf_tables_commit+0xa24/0x1490 [nf_tables] [ 64.942452] ? __kasan_slab_alloc+0x2d/0x60 [ 64.944070] ? nf_tables_setelem_notify+0x190/0x190 [nf_tables] [ 64.945710] ? kasan_set_track+0x21/0x30 [ 64.947323] nfnetlink_rcv_batch+0x709/0xd90 [nfnetlink] [ 64.948898] ? nfnetlink_rcv_msg+0x480/0x480 [nfnetlink] Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* bonding (gcc13): synchronize bond_{a,t}lb_xmit() typesJiri Slaby (SUSE)2023-05-111-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | commit 777fa87c7682228e155cf0892ba61cb2ab1fe3ae upstream. Both bond_alb_xmit() and bond_tlb_xmit() produce a valid warning with gcc-13: drivers/net/bonding/bond_alb.c:1409:13: error: conflicting types for 'bond_tlb_xmit' due to enum/integer mismatch; have 'netdev_tx_t(struct sk_buff *, struct net_device *)' ... include/net/bond_alb.h:160:5: note: previous declaration of 'bond_tlb_xmit' with type 'int(struct sk_buff *, struct net_device *)' drivers/net/bonding/bond_alb.c:1523:13: error: conflicting types for 'bond_alb_xmit' due to enum/integer mismatch; have 'netdev_tx_t(struct sk_buff *, struct net_device *)' ... include/net/bond_alb.h:159:5: note: previous declaration of 'bond_alb_xmit' with type 'int(struct sk_buff *, struct net_device *)' I.e. the return type of the declaration is int, while the definitions spell netdev_tx_t. Synchronize both of them to the latter. Cc: Martin Liska <mliska@suse.cz> Cc: Jay Vosburgh <j.vosburgh@gmail.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org> Link: https://lore.kernel.org/r/20221031114409.10417-1-jirislaby@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* btrfs: scrub: reject unsupported scrub flagsQu Wenruo2023-05-111-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | commit 604e6681e114d05a2e384c4d1e8ef81918037ef5 upstream. Since the introduction of scrub interface, the only flag that we support is BTRFS_SCRUB_READONLY. Thus there is no sanity checks, if there are some undefined flags passed in, we just ignore them. This is problematic if we want to introduce new scrub flags, as we have no way to determine if such flags are supported. Address the problem by introducing a check for the flags, and if unsupported flags are set, return -EOPNOTSUPP to inform the user space. This check should be backported for all supported kernels before any new scrub flags are introduced. CC: stable@vger.kernel.org # 4.14+ Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* mailbox: zynqmp: Fix typo in IPI documentationTanmay Shah2023-05-111-1/+1
| | | | | | | | | | | | | | | commit 79963fbfc233759bd8a43462f120d15a1bd4f4fa upstream. Xilinx IPI message buffers allows 32-byte data transfer. Fix documentation that says 12 bytes Fixes: 4981b82ba2ff ("mailbox: ZynqMP IPI mailbox controller") Signed-off-by: Tanmay Shah <tanmay.shah@amd.com> Acked-by: Michal Simek <michal.simek@amd.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230311012407.1292118-4-tanmay.shah@amd.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* SUNRPC: remove the maximum number of retries in call_bind_statusDai Ngo2023-05-111-2/+1
| | | | | | | | | | | | | | | | | | | | | [ Upstream commit 691d0b782066a6eeeecbfceb7910a8f6184e6105 ] Currently call_bind_status places a hard limit of 3 to the number of retries on EACCES error. This limit was done to prevent NLM unlock requests from being hang forever when the server keeps returning garbage. However this change causes problem for cases when NLM service takes longer than 9 seconds to register with the port mapper after a restart. This patch removes this hard coded limit and let the RPC handles the retry based on the standard hard/soft task semantics. Fixes: 0b760113a3a1 ("NLM: Don't hang forever on NLM unlock requests") Reported-by: Helen Chao <helen.chao@oracle.com> Tested-by: Helen Chao <helen.chao@oracle.com> Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* RDMA/mlx5: Fix flow counter query via DEVXMark Bloch2023-05-111-1/+2
| | | | | | | | | | | | | | | | | | [ Upstream commit 3e358ea8614ddfbc59ca7a3f5dff5dde2b350b2c ] Commit cited in "fixes" tag added bulk support for flow counters but it didn't account that's also possible to query a counter using a non-base id if the counter was allocated as bulk. When a user performs a query, validate the flow counter id given in the mailbox is inside the valid range taking bulk value into account. Fixes: 208d70f562e5 ("IB/mlx5: Support flow counters offset for bulk counters") Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Link: https://lore.kernel.org/r/79d7fbe291690128e44672418934256254d93115.1681377114.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
* uapi/linux/const.h: prefer ISO-friendly __typeof__Kevin Brodsky2023-05-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 31088f6f7906253ef4577f6a9b84e2d42447dba0 ] typeof is (still) a GNU extension, which means that it cannot be used when building ISO C (e.g. -std=c99). It should therefore be avoided in uapi headers in favour of the ISO-friendly __typeof__. Unfortunately this issue could not be detected by CONFIG_UAPI_HEADER_TEST=y as the __ALIGN_KERNEL() macro is not expanded in any uapi header. This matters from a userspace perspective, not a kernel one. uapi headers and their contents are expected to be usable in a variety of situations, and in particular when building ISO C applications (with -std=c99 or similar). This particular problem can be reproduced by trying to use the __ALIGN_KERNEL macro directly in application code, say: #include <linux/const.h> int align(int x, int a) { return __KERNEL_ALIGN(x, a); } and trying to build that with -std=c99. Link: https://lkml.kernel.org/r/20230411092747.3759032-1-kevin.brodsky@arm.com Fixes: a79ff731a1b2 ("netfilter: xtables: make XT_ALIGN() usable in exported headers by exporting __ALIGN_KERNEL()") Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Reported-by: Ruben Ayrapetyan <ruben.ayrapetyan@arm.com> Tested-by: Ruben Ayrapetyan <ruben.ayrapetyan@arm.com> Reviewed-by: Petr Vorel <pvorel@suse.cz> Tested-by: Petr Vorel <pvorel@suse.cz> Reviewed-by: Masahiro Yamada <masahiroy@kernel.org> Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
* linux/vt_buffer.h: allow either builtin or modular for macrosRandy Dunlap2023-05-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 2b76ffe81e32afd6d318dc4547e2ba8c46207b77 ] Fix build errors on ARCH=alpha when CONFIG_MDA_CONSOLE=m. This allows the ARCH macros to be the only ones defined. In file included from ../drivers/video/console/mdacon.c:37: ../arch/alpha/include/asm/vga.h:17:40: error: expected identifier or '(' before 'volatile' 17 | static inline void scr_writew(u16 val, volatile u16 *addr) | ^~~~~~~~ ../include/linux/vt_buffer.h:24:34: note: in definition of macro 'scr_writew' 24 | #define scr_writew(val, addr) (*(addr) = (val)) | ^~~~ ../include/linux/vt_buffer.h:24:40: error: expected ')' before '=' token 24 | #define scr_writew(val, addr) (*(addr) = (val)) | ^ ../arch/alpha/include/asm/vga.h:17:20: note: in expansion of macro 'scr_writew' 17 | static inline void scr_writew(u16 val, volatile u16 *addr) | ^~~~~~~~~~ ../arch/alpha/include/asm/vga.h:25:29: error: expected identifier or '(' before 'volatile' 25 | static inline u16 scr_readw(volatile const u16 *addr) | ^~~~~~~~ Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jiri Slaby <jirislaby@kernel.org> Cc: dri-devel@lists.freedesktop.org Cc: linux-fbdev@vger.kernel.org Link: https://lore.kernel.org/r/20230329021529.16188-1-rdunlap@infradead.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
* netfilter: nf_tables: don't write table validation state without mutexFlorian Westphal2023-05-111-1/+0
| | | | | | | | | | | | | | | [ Upstream commit 9a32e9850686599ed194ccdceb6cd3dd56b2d9b9 ] The ->cleanup callback needs to be removed, this doesn't work anymore as the transaction mutex is already released in the ->abort function. Just do it after a successful validation pass, this either happens from commit or abort phases where transaction mutex is held. Fixes: f102d66b335a ("netfilter: nf_tables: use dedicated mutex to guard transactions") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
* netfilter: conntrack: fix wrong ct->timeout valueTzung-Bi Shih2023-05-111-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 73db1b8f2bb6725b7391e85aab41fdf592b3c0c1 ] (struct nf_conn)->timeout is an interval before the conntrack confirmed. After confirmed, it becomes a timestamp. It is observed that timeout of an unconfirmed conntrack: - Set by calling ctnetlink_change_timeout(). As a result, `nfct_time_stamp` was wrongly added to `ct->timeout` twice. - Get by calling ctnetlink_dump_timeout(). As a result, `nfct_time_stamp` was wrongly subtracted. Call Trace: <TASK> dump_stack_lvl ctnetlink_dump_timeout __ctnetlink_glue_build ctnetlink_glue_build __nfqnl_enqueue_packet nf_queue nf_hook_slow ip_mc_output ? __pfx_ip_finish_output ip_send_skb ? __pfx_dst_output udp_send_skb udp_sendmsg ? __pfx_ip_generic_getfrag sock_sendmsg Separate the 2 cases in: - Setting `ct->timeout` in __nf_ct_set_timeout(). - Getting `ct->timeout` in ctnetlink_dump_timeout(). Pablo appends: Update ctnetlink to set up the timeout _after_ the IPS_CONFIRMED flag is set on, otherwise conntrack creation via ctnetlink breaks. Note that the problem described in this patch occurs since the introduction of the nfnetlink_queue conntrack support, select a sufficiently old Fixes: tag for -stable kernel to pick up this fix. Fixes: a4b4766c3ceb ("netfilter: nfnetlink_queue: rename related to nfqueue attaching conntrack info") Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
* xsk: Fix unaligned descriptor validationKal Conley2023-05-111-7/+2
| | | | | | | | | | | | | | | | | | | | | [ Upstream commit d769ccaf957fe7391f357c0a923de71f594b8a2b ] Make sure unaligned descriptors that straddle the end of the UMEM are considered invalid. Currently, descriptor validation is broken for zero-copy mode which only checks descriptors at page granularity. For example, descriptors in zero-copy mode that overrun the end of the UMEM but not a page boundary are (incorrectly) considered valid. The UMEM boundary check needs to happen before the page boundary and contiguity checks in xp_desc_crosses_non_contig_pg(). Do this check in xp_unaligned_validate_desc() instead like xp_check_unaligned() already does. Fixes: 2b43470add8c ("xsk: Introduce AF_XDP buffer allocation API") Signed-off-by: Kal Conley <kal.conley@dectris.com> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20230405235920.7305-2-kal.conley@dectris.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
* net: qrtr: correct types of trace event parametersSimon Horman2023-05-111-15/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 054fbf7ff8143d35ca7d3bb5414bb44ee1574194 ] The arguments passed to the trace events are of type unsigned int, however the signature of the events used __le32 parameters. I may be missing the point here, but sparse flagged this and it does seem incorrect to me. net/qrtr/ns.c: note: in included file (through include/trace/trace_events.h, include/trace/define_trace.h, include/trace/events/qrtr.h): ./include/trace/events/qrtr.h:11:1: warning: cast to restricted __le32 ./include/trace/events/qrtr.h:11:1: warning: restricted __le32 degrades to integer ./include/trace/events/qrtr.h:11:1: warning: restricted __le32 degrades to integer ... (a lot more similar warnings) net/qrtr/ns.c:115:47: expected restricted __le32 [usertype] service net/qrtr/ns.c:115:47: got unsigned int service net/qrtr/ns.c:115:61: warning: incorrect type in argument 2 (different base types) ... (a lot more similar warnings) Fixes: dfddb54043f0 ("net: qrtr: Add tracepoint support") Reviewed-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20230402-qrtr-trace-types-v1-1-92ad55008dd3@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
* scsi: libsas: Add sas_ata_device_link_abort()John Garry2023-05-111-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 44112922674b94a7d699dfff6307fc830018df7c ] Similar to how AHCI handles NCQ errors in ahci_error_intr() -> ata_port_abort() -> ata_do_link_abort(), add an NCQ error handler for LLDDs to call to initiate a link abort. This will mark all outstanding QCs as failed and kick-off EH. Note: A "force reset" argument is added for drivers which require the ATA error handling to always reset the device. A driver may require this feature for when SATA device per-SCSI cmnd resources are only released during reset for ATA EH. As such, we need an option to force reset to be done, regardless of what any EH autopsy decides. The SATA device FIS fields are set to indicate a device error from ata_eh_analyze_tf(). Suggested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Suggested-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1665998435-199946-2-git-send-email-john.garry@huawei.com Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Tested-by: Niklas Cassel <niklas.cassel@wdc.com> # pm80xx Reviewed-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Stable-dep-of: bb544224da77 ("scsi: hisi_sas: Handle NCQ error when IPTT is valid") Signed-off-by: Sasha Levin <sashal@kernel.org>
* scsi: target: Fix multiple LUN_RESET handlingMike Christie2023-05-111-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 673db054d7a2b5a470d7a25baf65956d005ad729 ] This fixes a bug where an initiator thinks a LUN_RESET has cleaned up running commands when it hasn't. The bug was added in commit 51ec502a3266 ("target: Delete tmr from list before processing"). The problem occurs when: 1. We have N I/O cmds running in the target layer spread over 2 sessions. 2. The initiator sends a LUN_RESET for each session. 3. session1's LUN_RESET loops over all the running commands from both sessions and moves them to its local drain_task_list. 4. session2's LUN_RESET does not see the LUN_RESET from session1 because the commit above has it remove itself. session2 also does not see any commands since the other reset moved them off the state lists. 5. sessions2's LUN_RESET will then complete with a successful response. 6. sessions2's inititor believes the running commands on its session are now cleaned up due to the successful response and cleans up the running commands from its side. It then restarts them. 7. The commands do eventually complete on the backend and the target starts to return aborted task statuses for them. The initiator will either throw a invalid ITT error or might accidentally lookup a new task if the ITT has been reallocated already. Fix the bug by reverting the patch, and serialize the execution of LUN_RESETs and Preempt and Aborts. Also prevent us from waiting on LUN_RESETs in core_tmr_drain_tmr_list, because it turns out the original patch fixed a bug that was not mentioned. For LUN_RESET1 core_tmr_drain_tmr_list can see a second LUN_RESET and wait on it. Then the second reset will run core_tmr_drain_tmr_list and see the first reset and wait on it resulting in a deadlock. Fixes: 51ec502a3266 ("target: Delete tmr from list before processing") Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20230319015620.96006-8-michael.christie@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* scsi: target: iscsit: isert: Alloc per conn cmd counterMike Christie2023-05-111-0/+3
| | | | | | | | | | | | | [ Upstream commit 6d256bee602b131bd4fbc92863b6a1210bcf6325 ] This has iscsit allocate a per conn cmd counter and converts iscsit/isert to use it instead of the per session one. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20230319015620.96006-5-michael.christie@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Stable-dep-of: 395cee83d02d ("scsi: target: iscsit: Stop/wait on cmds during conn close") Signed-off-by: Sasha Levin <sashal@kernel.org>
* scsi: target: Pass in cmd counter to use during cmd setupMike Christie2023-05-111-3/+5
| | | | | | | | | | | | | | [ Upstream commit 8e288be8606ad87c1726618eacfb8fbd3ab4b806 ] Allow target_get_sess_cmd() users to pass in the cmd counter they want to use. Right now we pass in the session's cmd counter but in a subsequent commit iSCSI will switch from per session to per conn. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20230319015620.96006-4-michael.christie@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Stable-dep-of: 395cee83d02d ("scsi: target: iscsit: Stop/wait on cmds during conn close") Signed-off-by: Sasha Levin <sashal@kernel.org>
* scsi: target: Move cmd counter allocationMike Christie2023-05-111-1/+3
| | | | | | | | | | | | | | | | | | | [ Upstream commit 4edba7e4a8f39112398d3cda94128a8e13a7d527 ] iSCSI needs to allocate its cmd counter per connection for MCS support where we need to stop and wait on commands running on a connection instead of per session. This moves the cmd counter allocation to target_setup_session() which is used by drivers that need the stop+wait behavior per session. xcopy doesn't need stop+wait at all, so we will be OK moving the cmd counter allocation outside of transport_init_session(). Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20230319015620.96006-3-michael.christie@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Stable-dep-of: 395cee83d02d ("scsi: target: iscsit: Stop/wait on cmds during conn close") Signed-off-by: Sasha Levin <sashal@kernel.org>
* scsi: target: Move sess cmd counter to new structMike Christie2023-05-112-4/+10
| | | | | | | | | | | | | | | | | | | | [ Upstream commit becd9be6069e7b183c084f460f0eb363e43cc487 ] iSCSI needs to wait on outstanding commands like how SRP and the FC/FCoE drivers do. It can't use target_stop_session() because for MCS support we can't stop the entire session during recovery because if other connections are OK then we want to be able to continue to execute I/O on them. Move the per session cmd counters to a new struct so iSCSI can allocate them per connection. The xcopy code can also just not allocate in the future since it doesn't need to track commands. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20230319015620.96006-2-michael.christie@oracle.com Reviewed-by: Maurizio Lombardi <mlombard@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Stable-dep-of: 395cee83d02d ("scsi: target: iscsit: Stop/wait on cmds during conn close") Signed-off-by: Sasha Levin <sashal@kernel.org>