summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* can: isotp: padlen(): make const array static, makes object smallerColin Ian King2020-11-031-8/+10
| | | | | | | | | | | | | | | | | | | Don't populate the const array plen on the stack but instead it static. Makes the object code smaller by 926 bytes. Before: text data bss dec hex filename 26531 1943 64 28538 6f7a net/can/isotp.o After: text data bss dec hex filename 25509 2039 64 27612 6bdc net/can/isotp.o (gcc version 10.2.0) Signed-off-by: Colin Ian King <colin.king@canonical.com> Link: https://lore.kernel.org/r/20201020154203.54711-1-colin.king@canonical.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
* can: isotp: isotp_rcv_cf(): enable RX timeout handling in listen-only modeOliver Hartkopp2020-11-031-4/+4
| | | | | | | | | | | | | | | | | | | | | | As reported by Thomas Wagner: https://github.com/hartkopp/can-isotp/issues/34 the timeout handling for data frames is not enabled when the isotp socket is used in listen-only mode (sockopt CAN_ISOTP_LISTEN_MODE). This mode is enabled by the isotpsniffer application which therefore became inconsistend with the strict rx timeout rules when running the isotp protocol in the operational mode. This patch fixes this inconsistency by moving the return condition for the listen-only mode behind the timeout handling code. Reported-by: Thomas Wagner <thwa1@web.de> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Fixes: e057dd3fc20f ("can: add ISO 15765-2:2016 transport protocol") Link: https://github.com/hartkopp/can-isotp/issues/34 Link: https://lore.kernel.org/r/20201019120229.89326-1-socketcan@hartkopp.net Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
* can: isotp: Explain PDU in CAN_ISOTP help textGeert Uytterhoeven2020-11-031-2/+3
| | | | | | | | | | | | | The help text for the CAN_ISOTP config symbol uses the acronym "PDU". However, this acronym is not explained here, nor in Documentation/networking/can.rst. Expand the acronym to make it easier for users to decide if they need to enable the CAN_ISOTP option or not. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20201013141341.28487-1-geert+renesas@glider.be Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
* can: j1939: j1939_sk_bind(): return failure if netdev is downZhang Changzhong2020-11-031-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | When a netdev down event occurs after a successful call to j1939_sk_bind(), j1939_netdev_notify() can handle it correctly. But if the netdev already in down state before calling j1939_sk_bind(), j1939_sk_release() will stay in wait_event_interruptible() blocked forever. Because in this case, j1939_netdev_notify() won't be called and j1939_tp_txtimer() won't call j1939_session_cancel() or other function to clear session for ENETDOWN error, this lead to mismatch of j1939_session_get/put() and jsk->skb_pending will never decrease to zero. To reproduce it use following commands: 1. ip link add dev vcan0 type vcan 2. j1939acd -r 100,80-120 1122334455667788 vcan0 3. presses ctrl-c and thread will be blocked forever This patch adds check for ndev->flags in j1939_sk_bind() to avoid this kind of situation and return with -ENETDOWN. Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol") Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> Link: https://lore.kernel.org/r/1599460308-18770-1-git-send-email-zhangchangzhong@huawei.com Acked-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
* can: j1939: use backquotes for code samplesYegor Yefremov2020-11-031-44/+44
| | | | | | | | This patch adds backquotes for code samples. Signed-off-by: Yegor Yefremov <yegorslists@googlemail.com> Link: https://lore.kernel.org/r/20201026094442.16587-1-yegorslists@googlemail.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
* can: j1939: swap addr and pgn in the send exampleYegor Yefremov2020-11-031-2/+2
| | | | | | | | | The address was wrongly assigned to the PGN field and vice versa. Signed-off-by: Yegor Yefremov <yegorslists@googlemail.com> Link: https://lore.kernel.org/r/20201022083708.8755-1-yegorslists@googlemail.com Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol") Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
* can: j1939: fix syntax and spellingYegor Yefremov2020-11-031-14/+14
| | | | | | | | | This patches fixes the syntax an spelling of the j1939 documentation. Signed-off-by: Yegor Yefremov <yegorslists@googlemail.com> Link: https://lore.kernel.org/r/20201020101043.6369-1-yegorslists@googlemail.com Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol") Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
* can: j1939: rename jacd toolYegor Yefremov2020-11-031-1/+1
| | | | | | | | | | | Due to naming conflicts, jacd was renamed to j1939acd in: https://github.com/linux-can/can-utils/pull/199 Signed-off-by: Yegor Yefremov <yegorslists@googlemail.com> Link: https://lore.kernel.org/r/20201020081134.3597-1-yegorslists@googlemail.com Link: https://github.com/linux-can/can-utils/pull/199 Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
* can: can_create_echo_skb(): fix echo skb generation: always use skb_clone()Oleksij Rempel2020-11-031-12/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All user space generated SKBs are owned by a socket (unless injected into the key via AF_PACKET). If a socket is closed, all associated skbs will be cleaned up. This leads to a problem when a CAN driver calls can_put_echo_skb() on a unshared SKB. If the socket is closed prior to the TX complete handler, can_get_echo_skb() and the subsequent delivering of the echo SKB to all registered callbacks, a SKB with a refcount of 0 is delivered. To avoid the problem, in can_get_echo_skb() the original SKB is now always cloned, regardless of shared SKB or not. If the process exists it can now safely discard its SKBs, without disturbing the delivery of the echo SKB. The problem shows up in the j1939 stack, when it clones the incoming skb, which detects the already 0 refcount. We can easily reproduce this with following example: testj1939 -B -r can0: & cansend can0 1823ff40#0123 WARNING: CPU: 0 PID: 293 at lib/refcount.c:25 refcount_warn_saturate+0x108/0x174 refcount_t: addition on 0; use-after-free. Modules linked in: coda_vpu imx_vdoa videobuf2_vmalloc dw_hdmi_ahb_audio vcan CPU: 0 PID: 293 Comm: cansend Not tainted 5.5.0-rc6-00376-g9e20dcb7040d #1 Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) Backtrace: [<c010f570>] (dump_backtrace) from [<c010f90c>] (show_stack+0x20/0x24) [<c010f8ec>] (show_stack) from [<c0c3e1a4>] (dump_stack+0x8c/0xa0) [<c0c3e118>] (dump_stack) from [<c0127fec>] (__warn+0xe0/0x108) [<c0127f0c>] (__warn) from [<c01283c8>] (warn_slowpath_fmt+0xa8/0xcc) [<c0128324>] (warn_slowpath_fmt) from [<c0539c0c>] (refcount_warn_saturate+0x108/0x174) [<c0539b04>] (refcount_warn_saturate) from [<c0ad2cac>] (j1939_can_recv+0x20c/0x210) [<c0ad2aa0>] (j1939_can_recv) from [<c0ac9dc8>] (can_rcv_filter+0xb4/0x268) [<c0ac9d14>] (can_rcv_filter) from [<c0aca2cc>] (can_receive+0xb0/0xe4) [<c0aca21c>] (can_receive) from [<c0aca348>] (can_rcv+0x48/0x98) [<c0aca300>] (can_rcv) from [<c09b1fdc>] (__netif_receive_skb_one_core+0x64/0x88) [<c09b1f78>] (__netif_receive_skb_one_core) from [<c09b2070>] (__netif_receive_skb+0x38/0x94) [<c09b2038>] (__netif_receive_skb) from [<c09b2130>] (netif_receive_skb_internal+0x64/0xf8) [<c09b20cc>] (netif_receive_skb_internal) from [<c09b21f8>] (netif_receive_skb+0x34/0x19c) [<c09b21c4>] (netif_receive_skb) from [<c0791278>] (can_rx_offload_napi_poll+0x58/0xb4) Fixes: 0ae89beb283a ("can: add destructor for self generated skbs") Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: http://lore.kernel.org/r/20200124132656.22156-1-o.rempel@pengutronix.de Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
* can: dev: __can_get_echo_skb(): fix real payload length return value for RTR ↵Oliver Hartkopp2020-11-031-2/+6
| | | | | | | | | | | | | | | | frames The can_get_echo_skb() function returns the number of received bytes to be used for netdev statistics. In the case of RTR frames we get a valid (potential non-zero) data length value which has to be passed for further operations. But on the wire RTR frames have no payload length. Therefore the value to be used in the statistics has to be zero for RTR frames. Reported-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Link: https://lore.kernel.org/r/20201020064443.80164-1-socketcan@hartkopp.net Fixes: cf5046b309b3 ("can: dev: let can_get_echo_skb() return dlc of CAN frame") Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
* can: dev: can_get_echo_skb(): prevent call to kfree_skb() in hard IRQ contextVincent Mailhol2020-11-031-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a driver calls can_get_echo_skb() during a hardware IRQ (which is often, but not always, the case), the 'WARN_ON(in_irq)' in net/core/skbuff.c#skb_release_head_state() might be triggered, under network congestion circumstances, together with the potential risk of a NULL pointer dereference. The root cause of this issue is the call to kfree_skb() instead of dev_kfree_skb_irq() in net/core/dev.c#enqueue_to_backlog(). This patch prevents the skb to be freed within the call to netif_rx() by incrementing its reference count with skb_get(). The skb is finally freed by one of the in-irq-context safe functions: dev_consume_skb_any() or dev_kfree_skb_any(). The "any" version is used because some drivers might call can_get_echo_skb() in a normal context. The reason for this issue to occur is that initially, in the core network stack, loopback skb were not supposed to be received in hardware IRQ context. The CAN stack is an exeption. This bug was previously reported back in 2017 in [1] but the proposed patch never got accepted. While [1] directly modifies net/core/dev.c, we try to propose here a smoother modification local to CAN network stack (the assumption behind is that only CAN devices are affected by this issue). [1] http://lore.kernel.org/r/57a3ffb6-3309-3ad5-5a34-e93c3fe3614d@cetitec.com Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Link: https://lore.kernel.org/r/20201002154219.4887-2-mailhol.vincent@wanadoo.fr Fixes: 39549eef3587 ("can: CAN Network device driver and Netlink interface") Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
* can: rx-offload: don't call kfree_skb() from IRQ contextMarc Kleine-Budde2020-11-031-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A CAN driver, using the rx-offload infrastructure, is reading CAN frames (usually in IRQ context) from the hardware and placing it into the rx-offload queue to be delivered to the networking stack via NAPI. In case the rx-offload queue is full, trying to add more skbs results in the skbs being dropped using kfree_skb(). If done from hard-IRQ context this results in the following warning: [ 682.552693] ------------[ cut here ]------------ [ 682.557360] WARNING: CPU: 0 PID: 3057 at net/core/skbuff.c:650 skb_release_head_state+0x74/0x84 [ 682.566075] Modules linked in: can_raw can coda_vpu flexcan dw_hdmi_ahb_audio v4l2_jpeg imx_vdoa can_dev [ 682.575597] CPU: 0 PID: 3057 Comm: cansend Tainted: G W 5.7.0+ #18 [ 682.583098] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) [ 682.589657] [<c0112628>] (unwind_backtrace) from [<c010c1c4>] (show_stack+0x10/0x14) [ 682.597423] [<c010c1c4>] (show_stack) from [<c06c481c>] (dump_stack+0xe0/0x114) [ 682.604759] [<c06c481c>] (dump_stack) from [<c0128f10>] (__warn+0xc0/0x10c) [ 682.611742] [<c0128f10>] (__warn) from [<c0129314>] (warn_slowpath_fmt+0x5c/0xc0) [ 682.619248] [<c0129314>] (warn_slowpath_fmt) from [<c0b95dec>] (skb_release_head_state+0x74/0x84) [ 682.628143] [<c0b95dec>] (skb_release_head_state) from [<c0b95e08>] (skb_release_all+0xc/0x24) [ 682.636774] [<c0b95e08>] (skb_release_all) from [<c0b95eac>] (kfree_skb+0x74/0x1c8) [ 682.644479] [<c0b95eac>] (kfree_skb) from [<bf001d1c>] (can_rx_offload_queue_sorted+0xe0/0xe8 [can_dev]) [ 682.654051] [<bf001d1c>] (can_rx_offload_queue_sorted [can_dev]) from [<bf001d6c>] (can_rx_offload_get_echo_skb+0x48/0x94 [can_dev]) [ 682.666007] [<bf001d6c>] (can_rx_offload_get_echo_skb [can_dev]) from [<bf01efe4>] (flexcan_irq+0x194/0x5dc [flexcan]) [ 682.676734] [<bf01efe4>] (flexcan_irq [flexcan]) from [<c019c1ec>] (__handle_irq_event_percpu+0x4c/0x3ec) [ 682.686322] [<c019c1ec>] (__handle_irq_event_percpu) from [<c019c5b8>] (handle_irq_event_percpu+0x2c/0x88) [ 682.695993] [<c019c5b8>] (handle_irq_event_percpu) from [<c019c64c>] (handle_irq_event+0x38/0x5c) [ 682.704887] [<c019c64c>] (handle_irq_event) from [<c01a1058>] (handle_fasteoi_irq+0xc8/0x180) [ 682.713432] [<c01a1058>] (handle_fasteoi_irq) from [<c019b2c0>] (generic_handle_irq+0x30/0x44) [ 682.722063] [<c019b2c0>] (generic_handle_irq) from [<c019b8f8>] (__handle_domain_irq+0x64/0xdc) [ 682.730783] [<c019b8f8>] (__handle_domain_irq) from [<c06df4a4>] (gic_handle_irq+0x48/0x9c) [ 682.739158] [<c06df4a4>] (gic_handle_irq) from [<c0100b30>] (__irq_svc+0x70/0x98) [ 682.746656] Exception stack(0xe80e9dd8 to 0xe80e9e20) [ 682.751725] 9dc0: 00000001 e80e8000 [ 682.759922] 9de0: e820cf80 00000000 ffffe000 00000000 eaf08fe4 00000000 600d0013 00000000 [ 682.768117] 9e00: c1732e3c c16093a8 e820d4c0 e80e9e28 c018a57c c018b870 600d0013 ffffffff [ 682.776315] [<c0100b30>] (__irq_svc) from [<c018b870>] (lock_acquire+0x108/0x4e8) [ 682.783821] [<c018b870>] (lock_acquire) from [<c0e938e4>] (down_write+0x48/0xa8) [ 682.791242] [<c0e938e4>] (down_write) from [<c02818dc>] (unlink_file_vma+0x24/0x40) [ 682.798922] [<c02818dc>] (unlink_file_vma) from [<c027a258>] (free_pgtables+0x34/0xb8) [ 682.806858] [<c027a258>] (free_pgtables) from [<c02835a4>] (exit_mmap+0xe4/0x170) [ 682.814361] [<c02835a4>] (exit_mmap) from [<c01248e0>] (mmput+0x5c/0x110) [ 682.821171] [<c01248e0>] (mmput) from [<c012e910>] (do_exit+0x374/0xbe4) [ 682.827892] [<c012e910>] (do_exit) from [<c0130888>] (do_group_exit+0x38/0xb4) [ 682.835132] [<c0130888>] (do_group_exit) from [<c0130914>] (__wake_up_parent+0x0/0x14) [ 682.843063] irq event stamp: 1936 [ 682.846399] hardirqs last enabled at (1935): [<c02938b0>] rmqueue+0xf4/0xc64 [ 682.853553] hardirqs last disabled at (1936): [<c0100b20>] __irq_svc+0x60/0x98 [ 682.860799] softirqs last enabled at (1878): [<bf04cdcc>] raw_release+0x108/0x1f0 [can_raw] [ 682.869256] softirqs last disabled at (1876): [<c0b8f478>] release_sock+0x18/0x98 [ 682.876753] ---[ end trace 7bca4751ce44c444 ]--- This patch fixes the problem by replacing the kfree_skb() by dev_kfree_skb_any(), as rx-offload might be called from threaded IRQ handlers as well. Fixes: ca913f1ac024 ("can: rx-offload: can_rx_offload_queue_sorted(): fix error handling, avoid skb mem leak") Fixes: 6caf8a6d6586 ("can: rx-offload: can_rx_offload_queue_tail(): fix error handling, avoid skb mem leak") Link: http://lore.kernel.org/r/20201019190524.1285319-3-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
* can: proc: can_remove_proc(): silence remove_proc_entry warningZhang Changzhong2020-11-031-2/+4
| | | | | | | | | | | | | | | | If can_init_proc() fail to create /proc/net/can directory, can_remove_proc() will trigger a warning: WARNING: CPU: 6 PID: 7133 at fs/proc/generic.c:672 remove_proc_entry+0x17b0 Kernel panic - not syncing: panic_on_warn set ... Fix to return early from can_remove_proc() if can proc_dir does not exists. Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> Link: https://lore.kernel.org/r/1594709090-3203-1-git-send-email-zhangchangzhong@huawei.com Fixes: 8e8cda6d737d ("can: initial support for network namespaces") Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
* dt-bindings: can: flexcan: convert fsl,*flexcan bindings to yamlOleksij Rempel2020-11-032-57/+135
| | | | | | | | | | In order to automate the verification of DT nodes convert fsl-flexcan.txt to fsl,flexcan.yaml Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20201022075218.11880-3-o.rempel@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
* dt-bindings: can: add can-controller.yamlOleksij Rempel2020-11-031-0/+18
| | | | | | | | | For now we have only node name as common rule for all CAN controllers Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20201022075218.11880-2-o.rempel@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
* sfp: Fix error handing in sfp_probe()YueHaibing2020-11-021-1/+2
| | | | | | | | | | | gpiod_to_irq() never return 0, but returns negative in case of error, check it and set gpio_irq to 0. Fixes: 73970055450e ("sfp: add SFP module support") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20201031031053.25264-1-yuehaibing@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* powerpc/vnic: Extend "failover pending" windowSukadev Bhattiprolu2020-11-021-4/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 5a18e1e0c193b introduced the 'failover_pending' state to track the "failover pending window" - where we wait for the partner to become ready (after a transport event) before actually attempting to failover. i.e window is between following two events: a. we get a transport event due to a FAILOVER b. later, we get CRQ_INITIALIZED indicating the partner is ready at which point we schedule a FAILOVER reset. and ->failover_pending is true during this window. If during this window, we attempt to open (or close) a device, we pretend that the operation succeded and let the FAILOVER reset path complete the operation. This is fine, except if the transport event ("a" above) occurs during the open and after open has already checked whether a failover is pending. If that happens, we fail the open, which can cause the boot scripts to leave the interface down requiring administrator to manually bring up the device. This fix "extends" the failover pending window till we are _actually_ ready to perform the failover reset (i.e until after we get the RTNL lock). Since open() holds the RTNL lock, we can be sure that we either finish the open or if the open() fails due to the failover pending window, we can again pretend that open is done and let the failover complete it. We could try and block the open until failover is completed but a) that could still timeout the application and b) Existing code "pretends" that failover occurred "just after" open succeeded, so marks the open successful and lets the failover complete the open. So, mark the open successful even if the transport event occurs before we actually start the open. Fixes: 5a18e1e0c193 ("ibmvnic: Fix failover case for non-redundant configuration") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Acked-by: Dany Madden <drt@linux.ibm.com> Link: https://lore.kernel.org/r/20201030170711.1562994-1-sukadev@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* net: dsa: qca8k: Fix port MTU settingJonathan McDowell2020-11-021-2/+2
| | | | | | | | | | | | | The qca8k only supports a switch-wide MTU setting, and the code to take the max of all ports was only looking at the port currently being set. Fix to examine all ports. Reported-by: DENG Qingfang <dqfext@gmail.com> Fixes: f58d2598cf70 ("net: dsa: qca8k: implement the port MTU callbacks") Signed-off-by: Jonathan McDowell <noodles@earth.li> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20201030183315.GA6736@earth.li Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* sctp: Fix COMM_LOST/CANT_STR_ASSOC err reporting on big-endian platformsPetr Malat2020-11-021-2/+2
| | | | | | | | | | | | | | | | Commit 978aa0474115 ("sctp: fix some type cast warnings introduced since very beginning")' broke err reading from sctp_arg, because it reads the value as 32-bit integer, although the value is stored as 16-bit integer. Later this value is passed to the userspace in 16-bit variable, thus the user always gets 0 on big-endian platforms. Fix it by reading the __u16 field of sctp_arg union, as reading err field would produce a sparse warning. Fixes: 978aa0474115 ("sctp: fix some type cast warnings introduced since very beginning") Signed-off-by: Petr Malat <oss@malat.biz> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Link: https://lore.kernel.org/r/20201030132633.7045-1-oss@malat.biz Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* net: ethernet: ti: cpsw: disable PTPv1 hw timestamping advertisementGrygorii Strashko2020-11-022-5/+1
| | | | | | | | | | | | | | | | | | | | | | | | The TI CPTS does not natively support PTPv1, only PTPv2. But, as it happens, the CPTS can provide HW timestamp for PTPv1 Sync messages, because CPTS HW parser looks for PTP messageType id in PTP message octet 0 which value is 0 for PTPv1. As result, CPTS HW can detect Sync messages for PTPv1 and PTPv2 (Sync messageType = 0 for both), but it fails for any other PTPv1 messages (Delay_req/resp) and will return PTP messageType id 0 for them. The commit e9523a5a32a1 ("net: ethernet: ti: cpsw: enable HWTSTAMP_FILTER_PTP_V1_L4_EVENT filter") added PTPv1 hw timestamping advertisement by mistake, only to make Linux Kernel "timestamping" utility work, and this causes issues with only PTPv1 compatible HW/SW - Sync HW timestamped, but Delay_req/resp are not. Hence, fix it disabling PTPv1 hw timestamping advertisement, so only PTPv1 compatible HW/SW can properly roll back to SW timestamping. Fixes: e9523a5a32a1 ("net: ethernet: ti: cpsw: enable HWTSTAMP_FILTER_PTP_V1_L4_EVENT filter") Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Link: https://lore.kernel.org/r/20201029190910.30789-1-grygorii.strashko@ti.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* Merge branch 'dpaa_eth-buffer-layout-fixes'Jakub Kicinski2020-11-021-10/+18
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Camelia Groza says: ==================== dpaa_eth: buffer layout fixes The patches are related to the software workaround for the A050385 erratum. The first patch ensures optimal buffer usage for non-erratum scenarios. The second patch fixes a currently inconsequential discrepancy between the FMan and Ethernet drivers. Changes in v3: - refactor defines for clarity in 1/2 - add more details on the user impact in 1/2 - remove unnecessary inline identifier in 2/2 Changes in v2: - make the returned value for TX ports explicit in 2/2 - simplify the buf_layout reference in 2/2 ==================== Link: https://lore.kernel.org/r/cover.1604339942.git.camelia.groza@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * dpaa_eth: fix the RX headroom size alignmentCamelia Groza2020-11-021-5/+9
| | | | | | | | | | | | | | | | | | | | | | | | The headroom reserved for received frames needs to be aligned to an RX specific value. There is currently a discrepancy between the values used in the Ethernet driver and the values passed to the FMan. Coincidentally, the resulting aligned values are identical. Fixes: 3c68b8fffb48 ("dpaa_eth: FMan erratum A050385 workaround") Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Camelia Groza <camelia.groza@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * dpaa_eth: update the buffer layout for non-A050385 erratum scenariosCamelia Groza2020-11-021-6/+10
|/ | | | | | | | | | | | | | | Impose a larger RX private data area only when the A050385 erratum is present on the hardware. A smaller buffer size is sufficient in all other scenarios. This enables a wider range of linear Jumbo frame sizes in non-erratum scenarios, instead of turning to multi buffer Scatter/Gather frames. The maximum linear frame size is increased by 128 bytes for non-erratum arm64 platforms. Cleanup the hardware annotations header defines in the process. Fixes: 3c68b8fffb48 ("dpaa_eth: FMan erratum A050385 workaround") Signed-off-by: Camelia Groza <camelia.groza@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* Merge tag 'mac80211-for-net-2020-10-30' of ↵Jakub Kicinski2020-11-0210-55/+102
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== A couple of fixes, for * HE on 2.4 GHz * a few issues syzbot found, but we have many more reports :-( * a regression in nl80211-transported EAPOL frames which had affected a number of users, from Mathy * kernel-doc markings in mac80211, from Mauro * a format argument in reg.c, from Ye Bin ==================== Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * mac80211: don't require VHT elements for HE on 2.4 GHzJohannes Berg2020-10-301-1/+2
| | | | | | | | | | | | | | | | | | | | | | After the previous similar bugfix there was another bug here, if no VHT elements were found we also disabled HE. Fix this to disable HE only on the 5 GHz band; on 6 GHz it was already not disabled, and on 2.4 GHz there need (should) not be any VHT. Fixes: 57fa5e85d53c ("mac80211: determine chandef from HE 6 GHz operation") Link: https://lore.kernel.org/r/20201013140156.535a2fc6192f.Id6e5e525a60ac18d245d86f4015f1b271fce6ee6@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| * cfg80211: regulatory: Fix inconsistent format argumentYe Bin2020-10-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | Fix follow warning: [net/wireless/reg.c:3619]: (warning) %d in format string (no. 2) requires 'int' but the argument type is 'unsigned int'. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Ye Bin <yebin10@huawei.com> Link: https://lore.kernel.org/r/20201009070215.63695-1-yebin10@huawei.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| * mac80211: fix kernel-doc markupsMauro Carvalho Chehab2020-10-303-8/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some identifiers have different names between their prototypes and the kernel-doc markup. Others need to be fixed, as kernel-doc markups should use this format: identifier - description In the specific case of __sta_info_flush(), add a documentation for sta_info_flush(), as this one is the one used outside sta_info.c. Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Link: https://lore.kernel.org/r/978d35eef2dc76e21c81931804e4eaefbd6d635e.1603469755.git.mchehab+huawei@kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| * mac80211: always wind down STA stateJohannes Berg2020-10-301-0/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When (for example) an IBSS station is pre-moved to AUTHORIZED before it's inserted, and then the insertion fails, we don't clean up the fast RX/TX states that might already have been created, since we don't go through all the state transitions again on the way down. Do that, if it hasn't been done already, when the station is freed. I considered only freeing the fast TX/RX state there, but we might add more state so it's more robust to wind down the state properly. Note that we warn if the station was ever inserted, it should have been properly cleaned up in that case, and the driver will probably not like things happening out of order. Reported-by: syzbot+2e293dbd67de2836ba42@syzkaller.appspotmail.com Link: https://lore.kernel.org/r/20201009141710.7223b322a955.I95bd08b9ad0e039c034927cce0b75beea38e059b@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| * cfg80211: initialize wdev data earlierJohannes Berg2020-10-303-29/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There's a race condition in the netdev registration in that NETDEV_REGISTER actually happens after the netdev is available, and so if we initialize things only there, we might get called with an uninitialized wdev through nl80211 - not using a wdev but using a netdev interface index. I found this while looking into a syzbot report, but it doesn't really seem to be related, and unfortunately there's no repro for it (yet). I can't (yet) explain how it managed to get into cfg80211_release_pmsr() from nl80211_netlink_notify() without the wdev having been initialized, as the latter only iterates the wdevs that are linked into the rdev, which even without the change here happened after init. However, looking at this, it seems fairly clear that the init needs to be done earlier, otherwise we might even re-init on a netns move, when data might still be pending. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://lore.kernel.org/r/20201009135821.fdcbba3aad65.Ie9201d91dbcb7da32318812effdc1561aeaf4cdc@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| * mac80211: fix use of skb payload instead of headerJohannes Berg2020-10-301-13/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When ieee80211_skb_resize() is called from ieee80211_build_hdr() the skb has no 802.11 header yet, in fact it consist only of the payload as the ethernet frame is removed. As such, we're using the payload data for ieee80211_is_mgmt(), which is of course completely wrong. This didn't really hurt us because these are always data frames, so we could only have added more tailroom than we needed if we determined it was a management frame and sdata->crypto_tx_tailroom_needed_cnt was false. However, syzbot found that of course there need not be any payload, so we're using at best uninitialized memory for the check. Fix this to pass explicitly the kind of frame that we have instead of checking there, by replacing the "bool may_encrypt" argument with an argument that can carry the three possible states - it's not going to be encrypted, it's a management frame, or it's a data frame (and then we check sdata->crypto_tx_tailroom_needed_cnt). Reported-by: syzbot+32fd1a1bfe355e93f1e2@syzkaller.appspotmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://lore.kernel.org/r/20201009132538.e1fd7f802947.I799b288466ea2815f9d4c84349fae697dca2f189@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| * mac80211: fix regression where EAPOL frames were sent in plaintextMathy Vanhoef2020-10-301-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When sending EAPOL frames via NL80211 they are treated as injected frames in mac80211. Due to commit 1df2bdba528b ("mac80211: never drop injected frames even if normally not allowed") these injected frames were not assigned a sta context in the function ieee80211_tx_dequeue, causing certain wireless network cards to always send EAPOL frames in plaintext. This may cause compatibility issues with some clients or APs, which for instance can cause the group key handshake to fail and in turn would cause the station to get disconnected. This commit fixes this regression by assigning a sta context in ieee80211_tx_dequeue to injected frames as well. Note that sending EAPOL frames in plaintext is not a security issue since they contain their own encryption and authentication protection. Cc: stable@vger.kernel.org Fixes: 1df2bdba528b ("mac80211: never drop injected frames even if normally not allowed") Reported-by: Thomas Deutschmann <whissi@gentoo.org> Tested-by: Christian Hesse <list@eworm.de> Tested-by: Thomas Deutschmann <whissi@gentoo.org> Signed-off-by: Mathy Vanhoef <Mathy.Vanhoef@kuleuven.be> Link: https://lore.kernel.org/r/20201019160113.350912-1-Mathy.Vanhoef@kuleuven.be Signed-off-by: Johannes Berg <johannes.berg@intel.com>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfJakub Kicinski2020-10-3118-37/+76
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Incorrect netlink report logic in flowtable and genID. 2) Add a selftest to check that wireguard passes the right sk to ip_route_me_harder, from Jason A. Donenfeld. 3) Pass the actual sk to ip_route_me_harder(), also from Jason. 4) Missing expression validation of updates via nft --check. 5) Update byte and packet counters regardless of whether they match, from Stefano Brivio. ==================== Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | netfilter: ipset: Update byte and packet counters regardless of whether they ↵Stefano Brivio2020-10-311-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | match In ip_set_match_extensions(), for sets with counters, we take care of updating counters themselves by calling ip_set_update_counter(), and of checking if the given comparison and values match, by calling ip_set_match_counter() if needed. However, if a given comparison on counters doesn't match the configured values, that doesn't mean the set entry itself isn't matching. This fix restores the behaviour we had before commit 4750005a85f7 ("netfilter: ipset: Fix "don't update counters" mode when counters used at the matching"), without reintroducing the issue fixed there: back then, mtype_data_match() first updated counters in any case, and then took care of matching on counters. Now, if the IPSET_FLAG_SKIP_COUNTER_UPDATE flag is set, ip_set_update_counter() will anyway skip counter updates if desired. The issue observed is illustrated by this reproducer: ipset create c hash:ip counters ipset add c 192.0.2.1 iptables -I INPUT -m set --match-set c src --bytes-gt 800 -j DROP if we now send packets from 192.0.2.1, bytes and packets counters for the entry as shown by 'ipset list' are always zero, and, no matter how many bytes we send, the rule will never match, because counters themselves are not updated. Reported-by: Mithil Mhatre <mmhatre@redhat.com> Fixes: 4750005a85f7 ("netfilter: ipset: Fix "don't update counters" mode when counters used at the matching") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | netfilter: nf_tables: missing validation from the abort pathPablo Neira Ayuso2020-10-303-10/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If userspace does not include the trailing end of batch message, then nfnetlink aborts the transaction. This allows to check that ruleset updates trigger no errors. After this patch, invoking this command from the prerouting chain: # nft -c add rule x y fib saddr . oif type local fails since oif is not supported there. This patch fixes the lack of rule validation from the abort/check path to catch configuration errors such as the one above. Fixes: a654de8fdc18 ("netfilter: nf_tables: fix chain dependency validation") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | netfilter: use actual socket sk rather than skb sk when routing harderJason A. Donenfeld2020-10-3012-24/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If netfilter changes the packet mark when mangling, the packet is rerouted using the route_me_harder set of functions. Prior to this commit, there's one big difference between route_me_harder and the ordinary initial routing functions, described in the comment above __ip_queue_xmit(): /* Note: skb->sk can be different from sk, in case of tunnels */ int __ip_queue_xmit(struct sock *sk, struct sk_buff *skb, struct flowi *fl, That function goes on to correctly make use of sk->sk_bound_dev_if, rather than skb->sk->sk_bound_dev_if. And indeed the comment is true: a tunnel will receive a packet in ndo_start_xmit with an initial skb->sk. It will make some transformations to that packet, and then it will send the encapsulated packet out of a *new* socket. That new socket will basically always have a different sk_bound_dev_if (otherwise there'd be a routing loop). So for the purposes of routing the encapsulated packet, the routing information as it pertains to the socket should come from that socket's sk, rather than the packet's original skb->sk. For that reason __ip_queue_xmit() and related functions all do the right thing. One might argue that all tunnels should just call skb_orphan(skb) before transmitting the encapsulated packet into the new socket. But tunnels do *not* do this -- and this is wisely avoided in skb_scrub_packet() too -- because features like TSQ rely on skb->destructor() being called when that buffer space is truely available again. Calling skb_orphan(skb) too early would result in buffers filling up unnecessarily and accounting info being all wrong. Instead, additional routing must take into account the new sk, just as __ip_queue_xmit() notes. So, this commit addresses the problem by fishing the correct sk out of state->sk -- it's already set properly in the call to nf_hook() in __ip_local_out(), which receives the sk as part of its normal functionality. So we make sure to plumb state->sk through the various route_me_harder functions, and then make correct use of it following the example of __ip_queue_xmit(). Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | wireguard: selftests: check that route_me_harder packets use the right skJason A. Donenfeld2020-10-302-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If netfilter changes the packet mark, the packet is rerouted. The ip_route_me_harder family of functions fails to use the right sk, opting to instead use skb->sk, resulting in a routing loop when used with tunnels. With the next change fixing this issue in netfilter, test for the relevant condition inside our test suite, since wireguard was where the bug was discovered. Reported-by: Chen Minqiang <ptpt52@gmail.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | netfilter: nftables: fix netlink report logic in flowtable and genidPablo Neira Ayuso2020-10-301-2/+2
| |/ | | | | | | | | | | | | | | The netlink report should be sent regardless the available listeners. Fixes: 84d7fce69388 ("netfilter: nf_tables: export rule-set generation ID") Fixes: 3b49e2e94e6e ("netfilter: nf_tables: add flow table netlink frontend") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* | ip_tunnel: fix over-mtu packet send fail without TUNNEL_DONT_FRAGMENT flagswenxu2020-10-311-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The tunnel device such as vxlan, bareudp and geneve in the lwt mode set the outer df only based TUNNEL_DONT_FRAGMENT. And this was also the behavior for gre device before switching to use ip_md_tunnel_xmit in commit 962924fa2b7a ("ip_gre: Refactor collect metatdata mode tunnel xmit to ip_md_tunnel_xmit") When the ip_gre in lwt mode xmit with ip_md_tunnel_xmi changed the rule and make the discrepancy between handling of DF by different tunnels. So in the ip_md_tunnel_xmit should follow the same rule like other tunnels. Fixes: cfc7381b3002 ("ip_tunnel: add collect_md mode to IPIP tunnel") Signed-off-by: wenxu <wenxu@ucloud.cn> Link: https://lore.kernel.org/r/1604028728-31100-1-git-send-email-wenxu@ucloud.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | cadence: force nonlinear buffers to be clonedMark Deneen2020-10-311-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In my test setup, I had a SAMA5D27 device configured with ip forwarding, and second device with usb ethernet (r8152) sending ICMP packets.  If the packet was larger than about 220 bytes, the SAMA5 device would "oops" with the following trace: kernel BUG at net/core/skbuff.c:1863! Internal error: Oops - BUG: 0 [#1] ARM Modules linked in: xt_MASQUERADE ppp_async ppp_generic slhc iptable_nat xt_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 can_raw can bridge stp llc ipt_REJECT nf_reject_ipv4 sd_mod cdc_ether usbnet usb_storage r8152 scsi_mod mii o ption usb_wwan usbserial micrel macb at91_sama5d2_adc phylink gpio_sama5d2_piobu m_can_platform m_can industrialio_triggered_buffer kfifo_buf of_mdio can_dev fixed_phy sdhci_of_at91 sdhci_pltfm libphy sdhci mmc_core ohci_at91 ehci_atmel o hci_hcd iio_rescale industrialio sch_fq_codel spidev prox2_hal(O) CPU: 0 PID: 0 Comm: swapper Tainted: G           O      5.9.1-prox2+ #1 Hardware name: Atmel SAMA5 PC is at skb_put+0x3c/0x50 LR is at macb_start_xmit+0x134/0xad0 [macb] pc : [<c05258cc>]    lr : [<bf0ea5b8>]    psr: 20070113 sp : c0d01a60  ip : c07232c0  fp : c4250000 r10: c0d03cc8  r9 : 00000000  r8 : c0d038c0 r7 : 00000000  r6 : 00000008  r5 : c59b66c0  r4 : 0000002a r3 : 8f659eff  r2 : c59e9eea  r1 : 00000001  r0 : c59b66c0 Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none Control: 10c53c7d  Table: 2640c059  DAC: 00000051 Process swapper (pid: 0, stack limit = 0x75002d81) <snipped stack> [<c05258cc>] (skb_put) from [<bf0ea5b8>] (macb_start_xmit+0x134/0xad0 [macb]) [<bf0ea5b8>] (macb_start_xmit [macb]) from [<c053e504>] (dev_hard_start_xmit+0x90/0x11c) [<c053e504>] (dev_hard_start_xmit) from [<c0571180>] (sch_direct_xmit+0x124/0x260) [<c0571180>] (sch_direct_xmit) from [<c053eae4>] (__dev_queue_xmit+0x4b0/0x6d0) [<c053eae4>] (__dev_queue_xmit) from [<c05a5650>] (ip_finish_output2+0x350/0x580) [<c05a5650>] (ip_finish_output2) from [<c05a7e24>] (ip_output+0xb4/0x13c) [<c05a7e24>] (ip_output) from [<c05a39d0>] (ip_forward+0x474/0x500) [<c05a39d0>] (ip_forward) from [<c05a13d8>] (ip_sublist_rcv_finish+0x3c/0x50) [<c05a13d8>] (ip_sublist_rcv_finish) from [<c05a19b8>] (ip_sublist_rcv+0x11c/0x188) [<c05a19b8>] (ip_sublist_rcv) from [<c05a2494>] (ip_list_rcv+0xf8/0x124) [<c05a2494>] (ip_list_rcv) from [<c05403c4>] (__netif_receive_skb_list_core+0x1a0/0x20c) [<c05403c4>] (__netif_receive_skb_list_core) from [<c05405c4>] (netif_receive_skb_list_internal+0x194/0x230) [<c05405c4>] (netif_receive_skb_list_internal) from [<c0540684>] (gro_normal_list.part.0+0x14/0x28) [<c0540684>] (gro_normal_list.part.0) from [<c0541280>] (napi_complete_done+0x16c/0x210) [<c0541280>] (napi_complete_done) from [<bf14c1c0>] (r8152_poll+0x684/0x708 [r8152]) [<bf14c1c0>] (r8152_poll [r8152]) from [<c0541424>] (net_rx_action+0x100/0x328) [<c0541424>] (net_rx_action) from [<c01012ec>] (__do_softirq+0xec/0x274) [<c01012ec>] (__do_softirq) from [<c012d6d4>] (irq_exit+0xcc/0xd0) [<c012d6d4>] (irq_exit) from [<c0160960>] (__handle_domain_irq+0x58/0xa4) [<c0160960>] (__handle_domain_irq) from [<c0100b0c>] (__irq_svc+0x6c/0x90) Exception stack(0xc0d01ef0 to 0xc0d01f38) 1ee0:                                     00000000 0000003d 0c31f383 c0d0fa00 1f00: c0d2eb80 00000000 c0d2e630 4dad8c49 4da967b0 0000003d 0000003d 00000000 1f20: fffffff5 c0d01f40 c04e0f88 c04e0f8c 30070013 ffffffff [<c0100b0c>] (__irq_svc) from [<c04e0f8c>] (cpuidle_enter_state+0x7c/0x378) [<c04e0f8c>] (cpuidle_enter_state) from [<c04e12c4>] (cpuidle_enter+0x28/0x38) [<c04e12c4>] (cpuidle_enter) from [<c014f710>] (do_idle+0x194/0x214) [<c014f710>] (do_idle) from [<c014fa50>] (cpu_startup_entry+0xc/0x14) [<c014fa50>] (cpu_startup_entry) from [<c0a00dc8>] (start_kernel+0x46c/0x4a0) Code: e580c054 8a000002 e1a00002 e8bd8070 (e7f001f2) ---[ end trace 146c8a334115490c ]--- The solution was to force nonlinear buffers to be cloned.  This was previously reported by Klaus Doth (https://www.spinics.net/lists/netdev/msg556937.html) but never formally submitted as a patch. This is the third revision, hopefully the formatting is correct this time! Suggested-by: Klaus Doth <krnl@doth.eu> Fixes: 653e92a9175e ("net: macb: add support for padding and fcs computation") Signed-off-by: Mark Deneen <mdeneen@saucontech.com> Link: https://lore.kernel.org/r/20201030155814.622831-1-mdeneen@saucontech.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | Merge branch 'ipv6-reply-icmp-error-if-fragment-doesn-t-contain-all-headers'Jakub Kicinski2020-10-313-2/+40
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Hangbin Liu says: ==================== IPv6: reply ICMP error if fragment doesn't contain all headers When our Engineer run latest IPv6 Core Conformance test, test v6LC.1.3.6: First Fragment Doesn’t Contain All Headers[1] failed. The test purpose is to verify that the node (Linux for example) should properly process IPv6 packets that don’t include all the headers through the Upper-Layer header. Based on RFC 8200, Section 4.5 Fragment Header - If the first fragment does not include all headers through an Upper-Layer header, then that fragment should be discarded and an ICMP Parameter Problem, Code 3, message should be sent to the source of the fragment, with the Pointer field set to zero. The first patch add a definition for ICMPv6 Parameter Problem, code 3. The second patch add a check for the 1st fragment packet to make sure Upper-Layer header exist. [1] Page 68, v6LC.1.3.6: First Fragment Doesn’t Contain All Headers part A, B, C and D at https://ipv6ready.org/docs/Core_Conformance_5_0_0.pdf [2] My reproducer: import sys, os from scapy.all import * def send_frag_dst_opt(src_ip6, dst_ip6): ip6 = IPv6(src = src_ip6, dst = dst_ip6, nh = 44) frag_1 = IPv6ExtHdrFragment(nh = 60, m = 1) dst_opt = IPv6ExtHdrDestOpt(nh = 58) frag_2 = IPv6ExtHdrFragment(nh = 58, offset = 4, m = 1) icmp_echo = ICMPv6EchoRequest(seq = 1) pkt_1 = ip6/frag_1/dst_opt pkt_2 = ip6/frag_2/icmp_echo send(pkt_1) send(pkt_2) def send_frag_route_opt(src_ip6, dst_ip6): ip6 = IPv6(src = src_ip6, dst = dst_ip6, nh = 44) frag_1 = IPv6ExtHdrFragment(nh = 43, m = 1) route_opt = IPv6ExtHdrRouting(nh = 58) frag_2 = IPv6ExtHdrFragment(nh = 58, offset = 4, m = 1) icmp_echo = ICMPv6EchoRequest(seq = 2) pkt_1 = ip6/frag_1/route_opt pkt_2 = ip6/frag_2/icmp_echo send(pkt_1) send(pkt_2) if __name__ == '__main__': src = sys.argv[1] dst = sys.argv[2] conf.iface = sys.argv[3] send_frag_dst_opt(src, dst) send_frag_route_opt(src, dst) ==================== Link: https://lore.kernel.org/r/20201027123313.3717941-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | IPv6: reply ICMP error if the first fragment don't include all headersHangbin Liu2020-10-312-2/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Based on RFC 8200, Section 4.5 Fragment Header: - If the first fragment does not include all headers through an Upper-Layer header, then that fragment should be discarded and an ICMP Parameter Problem, Code 3, message should be sent to the source of the fragment, with the Pointer field set to zero. Checking each packet header in IPv6 fast path will have performance impact, so I put the checking in ipv6_frag_rcv(). As the packet may be any kind of L4 protocol, I only checked some common protocols' header length and handle others by (offset + 1) > skb->len. Also use !(frag_off & htons(IP6_OFFSET)) to catch atomic fragments (fragmented packet with only one fragment). When send ICMP error message, if the 1st truncated fragment is ICMP message, icmp6_send() will break as is_ineligible() return true. So I added a check in is_ineligible() to let fragment packet with nexthdr ICMP but no ICMP header return false. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | ICMPv6: Add ICMPv6 Parameter Problem, code 3 definitionHangbin Liu2020-10-311-0/+1
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | Based on RFC7112, Section 6: IANA has added the following "Type 4 - Parameter Problem" message to the "Internet Control Message Protocol version 6 (ICMPv6) Parameters" registry: CODE NAME/DESCRIPTION 3 IPv6 First Fragment has incomplete IPv6 Header Chain Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | net: atm: fix update of position index in lec_seq_nextColin Ian King2020-10-311-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The position index in leq_seq_next is not updated when the next entry is fetched an no more entries are available. This causes seq_file to report the following error: "seq_file: buggy .next function lec_seq_next [lec] did not update position index" Fix this by always updating the position index. [ Note: this is an ancient 2002 bug, the sha is from the tglx/history repo ] Fixes 4aea2cbff417 ("[ATM]: Move lan seq_file ops to lec.c [1/3]") Signed-off-by: Colin Ian King <colin.king@canonical.com> Link: https://lore.kernel.org/r/20201027114925.21843-1-colin.king@canonical.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | net: stmmac: Fix channel lock initializationMarek Szyprowski2020-10-301-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 0366f7e06a6b ("net: stmmac: add ethtool support for get/set channels") refactored channel initialization, but during that operation, the spinlock initialization got lost. Fix this. This fixes the following lockdep warning: meson8b-dwmac ff3f0000.ethernet eth0: Link is Up - 1Gbps/Full - flow control off INFO: trying to register non-static key. the code is fine but needs lockdep annotation. turning off the locking correctness validator. CPU: 1 PID: 331 Comm: kworker/1:2H Not tainted 5.9.0-rc3+ #1858 Hardware name: Hardkernel ODROID-N2 (DT) Workqueue: kblockd blk_mq_run_work_fn Call trace: dump_backtrace+0x0/0x1d0 show_stack+0x14/0x20 dump_stack+0xe8/0x154 register_lock_class+0x58c/0x590 __lock_acquire+0x7c/0x1790 lock_acquire+0xf4/0x440 _raw_spin_lock_irqsave+0x80/0xb0 stmmac_tx_timer+0x4c/0xb0 [stmmac] call_timer_fn+0xc4/0x3e8 run_timer_softirq+0x2b8/0x6c0 efi_header_end+0x114/0x5f8 irq_exit+0x104/0x110 __handle_domain_irq+0x60/0xb8 gic_handle_irq+0x58/0xb0 el1_irq+0xbc/0x180 _raw_spin_unlock_irqrestore+0x48/0x90 mmc_blk_rw_wait+0x70/0x160 mmc_blk_mq_issue_rq+0x510/0x830 mmc_mq_queue_rq+0x13c/0x278 blk_mq_dispatch_rq_list+0x2a0/0x698 __blk_mq_do_dispatch_sched+0x254/0x288 __blk_mq_sched_dispatch_requests+0x190/0x1d8 blk_mq_sched_dispatch_requests+0x34/0x70 __blk_mq_run_hw_queue+0xcc/0x148 blk_mq_run_work_fn+0x20/0x28 process_one_work+0x2a8/0x718 worker_thread+0x48/0x460 kthread+0x134/0x160 ret_from_fork+0x10/0x1c Fixes: 0366f7e06a6b ("net: stmmac: add ethtool support for get/set channels") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/20201029185011.4749-1-m.szyprowski@samsung.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | stmmac: intel: Fix kernel panic on pci probeWong Vee Khee2020-10-301-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The commit "stmmac: intel: Adding ref clock 1us tic for LPI cntr" introduced a regression which leads to the kernel panic duing loading of the dwmac_intel module. Move the code block after pci resources is obtained. Fixes: b4c5f83ae3f3 ("stmmac: intel: Adding ref clock 1us tic for LPI cntr") Cc: Voon Weifeng <weifeng.voon@intel.com> Signed-off-by: Wong Vee Khee <vee.khee.wong@intel.com> Link: https://lore.kernel.org/r/20201029093228.1741-1-vee.khee.wong@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | gianfar: Account for Tx PTP timestamp in the skb headroomClaudiu Manoil2020-10-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When PTP timestamping is enabled on Tx, the controller inserts the Tx timestamp at the beginning of the frame buffer, between SFD and the L2 frame header. This means that the skb provided by the stack is required to have enough headroom otherwise a new skb needs to be created by the driver to accommodate the timestamp inserted by h/w. Up until now the driver was relying on the second option, using skb_realloc_headroom() to create a new skb to accommodate PTP frames. Turns out that this method is not reliable, as reallocation of skbs for PTP frames along with the required overhead (skb_set_owner_w, consume_skb) is causing random crashes in subsequent skb_*() calls, when multiple concurrent TCP streams are run at the same time on the same device (as seen in James' report). Note that these crashes don't occur with a single TCP stream, nor with multiple concurrent UDP streams, but only when multiple TCP streams are run concurrently with the PTP packet flow (doing skb reallocation). This patch enforces the first method, by requesting enough headroom from the stack to accommodate PTP frames, and so avoiding skb_realloc_headroom() & co, and the crashes no longer occur. There's no reason not to set needed_headroom to a large enough value to accommodate PTP frames, so in this regard this patch is a fix. Reported-by: James Jurack <james.jurack@ametek.com> Fixes: bee9e58c9e98 ("gianfar:don't add FCB length to hard_header_len") Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com> Link: https://lore.kernel.org/r/20201020173605.1173-1-claudiu.manoil@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | gianfar: Replace skb_realloc_headroom with skb_cow_head for PTPClaudiu Manoil2020-10-301-10/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When PTP timestamping is enabled on Tx, the controller inserts the Tx timestamp at the beginning of the frame buffer, between SFD and the L2 frame header. This means that the skb provided by the stack is required to have enough headroom otherwise a new skb needs to be created by the driver to accommodate the timestamp inserted by h/w. Up until now the driver was relying on skb_realloc_headroom() to create new skbs to accommodate PTP frames. Turns out that this method is not reliable in this context at least, as skb_realloc_headroom() for PTP frames can cause random crashes, mostly in subsequent skb_*() calls, when multiple concurrent TCP streams are run at the same time with the PTP flow on the same device (as seen in James' report). I also noticed that when the system is loaded by sending multiple TCP streams, the driver receives cloned skbs in large numbers. skb_cow_head() instead proves to be stable in this scenario, and not only handles cloned skbs too but it's also more efficient and widely used in other drivers. The commit introducing skb_realloc_headroom in the driver goes back to 2009, commit 93c1285c5d92 ("gianfar: reallocate skb when headroom is not enough for fcb"). For practical purposes I'm referencing a newer commit (from 2012) that brings the code to its current structure (and fixes the PTP case). Fixes: 9c4886e5e63b ("gianfar: Fix invalid TX frames returned on error queue when time stamping") Reported-by: James Jurack <james.jurack@ametek.com> Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com> Link: https://lore.kernel.org/r/20201029081057.8506-1-claudiu.manoil@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | net: fec: fix MDIO probing for some FEC hardware blocksGreg Ungerer2020-10-302-13/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some (apparently older) versions of the FEC hardware block do not like the MMFR register being cleared to avoid generation of MII events at initialization time. The action of clearing this register results in no future MII events being generated at all on the problem block. This means the probing of the MDIO bus will find no PHYs. Create a quirk that can be checked at the FECs MII init time so that the right thing is done. The quirk is set as appropriate for the FEC hardware blocks that are known to need this. Fixes: f166f890c8f0 ("net: ethernet: fec: Replace interrupt driven MDIO with polled IO") Signed-off-by: Greg Ungerer <gerg@linux-m68k.org> Acked-by: Fugang Duan <fugand.duan@nxp.com> Tested-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Clemens Gruber <clemens.gruber@pqgruber.com> Link: https://lore.kernel.org/r/20201028052232.1315167-1-gerg@linux-m68k.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | ip6_tunnel: set inner ipproto before ip6_tnl_encapAlexander Ovechkin2020-10-301-2/+2
|/ | | | | | | | | | | | | | | | | | | ip6_tnl_encap assigns to proto transport protocol which encapsulates inner packet, but we must pass to set_inner_ipproto protocol of that inner packet. Calling set_inner_ipproto after ip6_tnl_encap might break gso. For example, in case of encapsulating ipv6 packet in fou6 packet, inner_ipproto would be set to IPPROTO_UDP instead of IPPROTO_IPV6. This would lead to incorrect calling sequence of gso functions: ipv6_gso_segment -> udp6_ufo_fragment -> skb_udp_tunnel_segment -> udp6_ufo_fragment instead of: ipv6_gso_segment -> udp6_ufo_fragment -> skb_udp_tunnel_segment -> ip6ip6_gso_segment Fixes: 6c11fbf97e69 ("ip6_tunnel: add MPLS transmit support") Signed-off-by: Alexander Ovechkin <ovov@yandex-team.ru> Link: https://lore.kernel.org/r/20201029171012.20904-1-ovov@yandex-team.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* Merge tag 'fallthrough-fixes-clang-5.10-rc2' of ↵Linus Torvalds2020-10-292-0/+4
|\ | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux Pull fallthrough fix from Gustavo A. R. Silva: "This fixes a ton of fall-through warnings when building with Clang 12.0.0 and -Wimplicit-fallthrough" * tag 'fallthrough-fixes-clang-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux: include: jhash/signal: Fix fall-through warnings for Clang