summaryrefslogtreecommitdiffstats
path: root/Documentation
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2022-03-24 13:13:26 -0700
committerLinus Torvalds <torvalds@linux-foundation.org>2022-03-24 13:13:26 -0700
commit169e77764adc041b1dacba84ea90516a895d43b2 (patch)
treeaf7124681fa65d40fccee902af5194ab9f9c95f4 /Documentation
parent7403e6d8263937dea206dd201fed1ceed190ca18 (diff)
parent89695196f0ba78a17453f9616355f2ca6b293402 (diff)
downloadlinux-stable-169e77764adc041b1dacba84ea90516a895d43b2.tar.gz
linux-stable-169e77764adc041b1dacba84ea90516a895d43b2.tar.bz2
linux-stable-169e77764adc041b1dacba84ea90516a895d43b2.zip
Merge tag 'net-next-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski: "The sprinkling of SPI drivers is because we added a new one and Mark sent us a SPI driver interface conversion pull request. Core ---- - Introduce XDP multi-buffer support, allowing the use of XDP with jumbo frame MTUs and combination with Rx coalescing offloads (LRO). - Speed up netns dismantling (5x) and lower the memory cost a little. Remove unnecessary per-netns sockets. Scope some lists to a netns. Cut down RCU syncing. Use batch methods. Allow netdev registration to complete out of order. - Support distinguishing timestamp types (ingress vs egress) and maintaining them across packet scrubbing points (e.g. redirect). - Continue the work of annotating packet drop reasons throughout the stack. - Switch netdev error counters from an atomic to dynamically allocated per-CPU counters. - Rework a few preempt_disable(), local_irq_save() and busy waiting sections problematic on PREEMPT_RT. - Extend the ref_tracker to allow catching use-after-free bugs. BPF --- - Introduce "packing allocator" for BPF JIT images. JITed code is marked read only, and used to be allocated at page granularity. Custom allocator allows for more efficient memory use, lower iTLB pressure and prevents identity mapping huge pages from getting split. - Make use of BTF type annotations (e.g. __user, __percpu) to enforce the correct probe read access method, add appropriate helpers. - Convert the BPF preload to use light skeleton and drop the user-mode-driver dependency. - Allow XDP BPF_PROG_RUN test infra to send real packets, enabling its use as a packet generator. - Allow local storage memory to be allocated with GFP_KERNEL if called from a hook allowed to sleep. - Introduce fprobe (multi kprobe) to speed up mass attachment (arch bits to come later). - Add unstable conntrack lookup helpers for BPF by using the BPF kfunc infra. - Allow cgroup BPF progs to return custom errors to user space. - Add support for AF_UNIX iterator batching. - Allow iterator programs to use sleepable helpers. - Support JIT of add, and, or, xor and xchg atomic ops on arm64. - Add BTFGen support to bpftool which allows to use CO-RE in kernels without BTF info. - Large number of libbpf API improvements, cleanups and deprecations. Protocols --------- - Micro-optimize UDPv6 Tx, gaining up to 5% in test on dummy netdev. - Adjust TSO packet sizes based on min_rtt, allowing very low latency links (data centers) to always send full-sized TSO super-frames. - Make IPv6 flow label changes (AKA hash rethink) more configurable, via sysctl and setsockopt. Distinguish between server and client behavior. - VxLAN support to "collect metadata" devices to terminate only configured VNIs. This is similar to VLAN filtering in the bridge. - Support inserting IPv6 IOAM information to a fraction of frames. - Add protocol attribute to IP addresses to allow identifying where given address comes from (kernel-generated, DHCP etc.) - Support setting socket and IPv6 options via cmsg on ping6 sockets. - Reject mis-use of ECN bits in IP headers as part of DSCP/TOS. Define dscp_t and stop taking ECN bits into account in fib-rules. - Add support for locked bridge ports (for 802.1X). - tun: support NAPI for packets received from batched XDP buffs, doubling the performance in some scenarios. - IPv6 extension header handling in Open vSwitch. - Support IPv6 control message load balancing in bonding, prevent neighbor solicitation and advertisement from using the wrong port. Support NS/NA monitor selection similar to existing ARP monitor. - SMC - improve performance with TCP_CORK and sendfile() - support auto-corking - support TCP_NODELAY - MCTP (Management Component Transport Protocol) - add user space tag control interface - I2C binding driver (as specified by DMTF DSP0237) - Multi-BSSID beacon handling in AP mode for WiFi. - Bluetooth: - handle MSFT Monitor Device Event - add MGMT Adv Monitor Device Found/Lost events - Multi-Path TCP: - add support for the SO_SNDTIMEO socket option - lots of selftest cleanups and improvements - Increase the max PDU size in CAN ISOTP to 64 kB. Driver API ---------- - Add HW counters for SW netdevs, a mechanism for devices which offload packet forwarding to report packet statistics back to software interfaces such as tunnels. - Select the default NIC queue count as a fraction of number of physical CPU cores, instead of hard-coding to 8. - Expose devlink instance locks to drivers. Allow device layer of drivers to use that lock directly instead of creating their own which always runs into ordering issues in devlink callbacks. - Add header/data split indication to guide user space enabling of TCP zero-copy Rx. - Allow configuring completion queue event size. - Refactor page_pool to enable fragmenting after allocation. - Add allocation and page reuse statistics to page_pool. - Improve Multiple Spanning Trees support in the bridge to allow reuse of topologies across VLANs, saving HW resources in switches. - DSA (Distributed Switch Architecture): - replay and offload of host VLAN entries - offload of static and local FDB entries on LAG interfaces - FDB isolation and unicast filtering New hardware / drivers ---------------------- - Ethernet: - LAN937x T1 PHYs - Davicom DM9051 SPI NIC driver - Realtek RTL8367S, RTL8367RB-VB switch and MDIO - Microchip ksz8563 switches - Netronome NFP3800 SmartNICs - Fungible SmartNICs - MediaTek MT8195 switches - WiFi: - mt76: MediaTek mt7916 - mt76: MediaTek mt7921u USB adapters - brcmfmac: Broadcom BCM43454/6 - Mobile: - iosm: Intel M.2 7360 WWAN card Drivers ------- - Convert many drivers to the new phylink API built for split PCS designs but also simplifying other cases. - Intel Ethernet NICs: - add TTY for GNSS module for E810T device - improve AF_XDP performance - GTP-C and GTP-U filter offload - QinQ VLAN support - Mellanox Ethernet NICs (mlx5): - support xdp->data_meta - multi-buffer XDP - offload tc push_eth and pop_eth actions - Netronome Ethernet NICs (nfp): - flow-independent tc action hardware offload (police / meter) - AF_XDP - Other Ethernet NICs: - at803x: fiber and SFP support - xgmac: mdio: preamble suppression and custom MDC frequencies - r8169: enable ASPM L1.2 if system vendor flags it as safe - macb/gem: ZynqMP SGMII - hns3: add TX push mode - dpaa2-eth: software TSO - lan743x: multi-queue, mdio, SGMII, PTP - axienet: NAPI and GRO support - Mellanox Ethernet switches (mlxsw): - source and dest IP address rewrites - RJ45 ports - Marvell Ethernet switches (prestera): - basic routing offload - multi-chain TC ACL offload - NXP embedded Ethernet switches (ocelot & felix): - PTP over UDP with the ocelot-8021q DSA tagging protocol - basic QoS classification on Felix DSA switch using dcbnl - port mirroring for ocelot switches - Microchip high-speed industrial Ethernet (sparx5): - offloading of bridge port flooding flags - PTP Hardware Clock - Other embedded switches: - lan966x: PTP Hardward Clock - qca8k: mdio read/write operations via crafted Ethernet packets - Qualcomm 802.11ax WiFi (ath11k): - add LDPC FEC type and 802.11ax High Efficiency data in radiotap - enable RX PPDU stats in monitor co-exist mode - Intel WiFi (iwlwifi): - UHB TAS enablement via BIOS - band disablement via BIOS - channel switch offload - 32 Rx AMPDU sessions in newer devices - MediaTek WiFi (mt76): - background radar detection - thermal management improvements on mt7915 - SAR support for more mt76 platforms - MBSSID and 6 GHz band on mt7915 - RealTek WiFi: - rtw89: AP mode - rtw89: 160 MHz channels and 6 GHz band - rtw89: hardware scan - Bluetooth: - mt7921s: wake on Bluetooth, SCO over I2S, wide-band-speed (WBS) - Microchip CAN (mcp251xfd): - multiple RX-FIFOs and runtime configurable RX/TX rings - internal PLL, runtime PM handling simplification - improve chip detection and error handling after wakeup" * tag 'net-next-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2521 commits) llc: fix netdevice reference leaks in llc_ui_bind() drivers: ethernet: cpsw: fix panic when interrupt coaleceing is set via ethtool ice: don't allow to run ice_send_event_to_aux() in atomic ctx ice: fix 'scheduling while atomic' on aux critical err interrupt net/sched: fix incorrect vlan_push_eth dest field net: bridge: mst: Restrict info size queries to bridge ports net: marvell: prestera: add missing destroy_workqueue() in prestera_module_init() drivers: net: xgene: Fix regression in CRC stripping net: geneve: add missing netlink policy and size for IFLA_GENEVE_INNER_PROTO_INHERIT net: dsa: fix missing host-filtered multicast addresses net/mlx5e: Fix build warning, detected write beyond size of field iwlwifi: mvm: Don't fail if PPAG isn't supported selftests/bpf: Fix kprobe_multi test. Revert "rethook: x86: Add rethook x86 implementation" Revert "arm64: rethook: Add arm64 rethook implementation" Revert "powerpc: Add rethook support" Revert "ARM: rethook: Add rethook arm implementation" netdevice: add missing dm_private kdoc net: bridge: mst: prevent NULL deref in br_mst_info_size() selftests: forwarding: Use same VRF for port and VLAN upper ...
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/ABI/testing/sysfs-timecard116
-rw-r--r--Documentation/admin-guide/sysctl/net.rst9
-rw-r--r--Documentation/bpf/bpf_prog_run.rst117
-rw-r--r--Documentation/bpf/btf.rst45
-rw-r--r--Documentation/bpf/index.rst1
-rw-r--r--Documentation/bpf/instruction-set.rst215
-rw-r--r--Documentation/bpf/verifier.rst2
-rw-r--r--Documentation/devicetree/bindings/i2c/i2c.txt4
-rw-r--r--Documentation/devicetree/bindings/net/can/allwinner,sun4i-a10-can.yaml3
-rw-r--r--Documentation/devicetree/bindings/net/can/bosch,m_can.yaml9
-rw-r--r--Documentation/devicetree/bindings/net/can/microchip,mcp251xfd.yaml3
-rw-r--r--Documentation/devicetree/bindings/net/can/renesas,rcar-canfd.yaml2
-rw-r--r--Documentation/devicetree/bindings/net/can/xilinx,can.yaml161
-rw-r--r--Documentation/devicetree/bindings/net/can/xilinx_can.txt61
-rw-r--r--Documentation/devicetree/bindings/net/cdns,macb.yaml56
-rw-r--r--Documentation/devicetree/bindings/net/davicom,dm9051.yaml62
-rw-r--r--Documentation/devicetree/bindings/net/dsa/dsa-port.yaml2
-rw-r--r--Documentation/devicetree/bindings/net/dsa/microchip,ksz.yaml6
-rw-r--r--Documentation/devicetree/bindings/net/dsa/realtek-smi.txt240
-rw-r--r--Documentation/devicetree/bindings/net/dsa/realtek.yaml394
-rw-r--r--Documentation/devicetree/bindings/net/fsl-fman.txt22
-rw-r--r--Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt1
-rw-r--r--Documentation/devicetree/bindings/net/mctp-i2c-controller.yaml92
-rw-r--r--Documentation/devicetree/bindings/net/mediatek-dwmac.txt91
-rw-r--r--Documentation/devicetree/bindings/net/mediatek-dwmac.yaml175
-rw-r--r--Documentation/devicetree/bindings/net/micrel.txt17
-rw-r--r--Documentation/devicetree/bindings/net/microchip,lan966x-switch.yaml2
-rw-r--r--Documentation/devicetree/bindings/net/microchip,sparx5-switch.yaml2
-rw-r--r--Documentation/devicetree/bindings/net/mscc-miim.txt2
-rw-r--r--Documentation/devicetree/bindings/net/renesas,etheravb.yaml4
-rw-r--r--Documentation/devicetree/bindings/net/wireless/mediatek,mt76.yaml42
-rw-r--r--Documentation/devicetree/bindings/phy/fsl,lynx-28g.yaml40
-rw-r--r--Documentation/devicetree/bindings/phy/transmit-amplitude.yaml103
-rw-r--r--Documentation/networking/bonding.rst11
-rw-r--r--Documentation/networking/devlink/index.rst16
-rw-r--r--Documentation/networking/dsa/sja1105.rst27
-rw-r--r--Documentation/networking/ethtool-netlink.rst19
-rw-r--r--Documentation/networking/index.rst1
-rw-r--r--Documentation/networking/ip-sysctl.rst23
-rw-r--r--Documentation/networking/mctp.rst48
-rw-r--r--Documentation/networking/page_pool.rst56
-rw-r--r--Documentation/networking/smc-sysctl.rst23
-rw-r--r--Documentation/networking/timestamping.rst2
-rw-r--r--Documentation/trace/fprobe.rst174
-rw-r--r--Documentation/trace/index.rst1
45 files changed, 2011 insertions, 491 deletions
diff --git a/Documentation/ABI/testing/sysfs-timecard b/Documentation/ABI/testing/sysfs-timecard
index 97f6773794a5..220478156297 100644
--- a/Documentation/ABI/testing/sysfs-timecard
+++ b/Documentation/ABI/testing/sysfs-timecard
@@ -37,8 +37,15 @@ Description: (RO) Set of available destinations (sinks) for a SMA
PPS2 signal is sent to the PPS2 selector
TS1 signal is sent to timestamper 1
TS2 signal is sent to timestamper 2
+ TS3 signal is sent to timestamper 3
+ TS4 signal is sent to timestamper 4
IRIG signal is sent to the IRIG-B module
DCF signal is sent to the DCF module
+ FREQ1 signal is sent to frequency counter 1
+ FREQ2 signal is sent to frequency counter 2
+ FREQ3 signal is sent to frequency counter 3
+ FREQ4 signal is sent to frequency counter 4
+ None signal input is disabled
===== ================================================
What: /sys/class/timecard/ocpN/available_sma_outputs
@@ -50,10 +57,16 @@ Description: (RO) Set of available sources for a SMA output signal.
10Mhz output is from the 10Mhz reference clock
PHC output PPS is from the PHC clock
MAC output PPS is from the Miniature Atomic Clock
- GNSS output PPS is from the GNSS module
+ GNSS1 output PPS is from the first GNSS module
GNSS2 output PPS is from the second GNSS module
IRIG output is from the PHC, in IRIG-B format
DCF output is from the PHC, in DCF format
+ GEN1 output is from frequency generator 1
+ GEN2 output is from frequency generator 2
+ GEN3 output is from frequency generator 3
+ GEN4 output is from frequency generator 4
+ GND output is GND
+ VCC output is VCC
===== ================================================
What: /sys/class/timecard/ocpN/clock_source
@@ -63,6 +76,97 @@ Description: (RW) Contains the current synchronization source used by
the PHC. May be changed by writing one of the listed
values from the available_clock_sources attribute set.
+What: /sys/class/timecard/ocpN/clock_status_drift
+Date: March 2022
+Contact: Jonathan Lemon <jonathan.lemon@gmail.com>
+Description: (RO) Contains the current drift value used by the firmware
+ for internal disciplining of the atomic clock.
+
+What: /sys/class/timecard/ocpN/clock_status_offset
+Date: March 2022
+Contact: Jonathan Lemon <jonathan.lemon@gmail.com>
+Description: (RO) Contains the current offset value used by the firmware
+ for internal disciplining of the atomic clock.
+
+What: /sys/class/timecard/ocpN/freqX
+Date: March 2022
+Contact: Jonathan Lemon <jonathan.lemon@gmail.com>
+Description: (RO) Optional directory containing the sysfs nodes for
+ frequency counter <X>.
+
+What: /sys/class/timecard/ocpN/freqX/frequency
+Date: March 2022
+Contact: Jonathan Lemon <jonathan.lemon@gmail.com>
+Description: (RO) Contains the measured frequency over the specified
+ measurement period.
+
+What: /sys/class/timecard/ocpN/freqX/seconds
+Date: March 2022
+Contact: Jonathan Lemon <jonathan.lemon@gmail.com>
+Description: (RW) Specifies the number of seconds from 0-255 that the
+ frequency should be measured over. Write 0 to disable.
+
+What: /sys/class/timecard/ocpN/genX
+Date: March 2022
+Contact: Jonathan Lemon <jonathan.lemon@gmail.com>
+Description: (RO) Optional directory containing the sysfs nodes for
+ frequency generator <X>.
+
+What: /sys/class/timecard/ocpN/genX/duty
+Date: March 2022
+Contact: Jonathan Lemon <jonathan.lemon@gmail.com>
+Description: (RO) Specifies the signal duty cycle as a percentage from 1-99.
+
+What: /sys/class/timecard/ocpN/genX/period
+Date: March 2022
+Contact: Jonathan Lemon <jonathan.lemon@gmail.com>
+Description: (RO) Specifies the signal period in nanoseconds.
+
+What: /sys/class/timecard/ocpN/genX/phase
+Date: March 2022
+Contact: Jonathan Lemon <jonathan.lemon@gmail.com>
+Description: (RO) Specifies the signal phase offset in nanoseconds.
+
+What: /sys/class/timecard/ocpN/genX/polarity
+Date: March 2022
+Contact: Jonathan Lemon <jonathan.lemon@gmail.com>
+Description: (RO) Specifies the signal polarity, either 1 or 0.
+
+What: /sys/class/timecard/ocpN/genX/running
+Date: March 2022
+Contact: Jonathan Lemon <jonathan.lemon@gmail.com>
+Description: (RO) Either 0 or 1, showing if the signal generator is running.
+
+What: /sys/class/timecard/ocpN/genX/start
+Date: March 2022
+Contact: Jonathan Lemon <jonathan.lemon@gmail.com>
+Description: (RO) Shows the time in <sec>.<nsec> that the signal generator
+ started running.
+
+What: /sys/class/timecard/ocpN/genX/signal
+Date: March 2022
+Contact: Jonathan Lemon <jonathan.lemon@gmail.com>
+Description: (RW) Used to start the signal generator, and summarize
+ the current status.
+
+ The signal generator may be started by writing the signal
+ period, followed by the optional signal values. If the
+ optional values are not provided, they default to the current
+ settings, which may be obtained from the other sysfs nodes.
+
+ period [duty [phase [polarity]]]
+
+ echo 500000000 > signal # 1/2 second period
+ echo 1000000 40 100 > signal
+ echo 0 > signal # turn off generator
+
+ Period and phase are specified in nanoseconds. Duty cycle is
+ a percentage from 1-99. Polarity is 1 or 0.
+
+ Reading this node will return:
+
+ period duty phase polarity start_time
+
What: /sys/class/timecard/ocpN/gnss_sync
Date: September 2021
Contact: Jonathan Lemon <jonathan.lemon@gmail.com>
@@ -126,6 +230,16 @@ Description: (RW) These attributes specify the direction of the signal
The 10Mhz reference clock input is currently only valid
on SMA1 and may not be combined with other destination sinks.
+What: /sys/class/timecard/ocpN/tod_correction
+Date: March 2022
+Contact: Jonathan Lemon <jonathan.lemon@gmail.com>
+Description: (RW) The incoming GNSS signal is in UTC time, and the NMEA
+ format messages do not provide a TAI offset. This sets the
+ correction value for the incoming time.
+
+ If UBX_LS is enabled, this should be 0, and the offset is
+ taken from the UBX-NAV-TIMELS message.
+
What: /sys/class/timecard/ocpN/ts_window_adjust
Date: September 2021
Contact: Jonathan Lemon <jonathan.lemon@gmail.com>
diff --git a/Documentation/admin-guide/sysctl/net.rst b/Documentation/admin-guide/sysctl/net.rst
index 4150f74c521a..f86b5e1623c6 100644
--- a/Documentation/admin-guide/sysctl/net.rst
+++ b/Documentation/admin-guide/sysctl/net.rst
@@ -365,6 +365,15 @@ new netns has been created.
Default : 0 (for compatibility reasons)
+txrehash
+--------
+
+Controls default hash rethink behaviour on listening socket when SO_TXREHASH
+option is set to SOCK_TXREHASH_DEFAULT (i. e. not overridden by setsockopt).
+
+If set to 1 (default), hash rethink is performed on listening socket.
+If set to 0, hash rethink is not performed.
+
2. /proc/sys/net/unix - Parameters for Unix domain sockets
----------------------------------------------------------
diff --git a/Documentation/bpf/bpf_prog_run.rst b/Documentation/bpf/bpf_prog_run.rst
new file mode 100644
index 000000000000..4868c909df5c
--- /dev/null
+++ b/Documentation/bpf/bpf_prog_run.rst
@@ -0,0 +1,117 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===================================
+Running BPF programs from userspace
+===================================
+
+This document describes the ``BPF_PROG_RUN`` facility for running BPF programs
+from userspace.
+
+.. contents::
+ :local:
+ :depth: 2
+
+
+Overview
+--------
+
+The ``BPF_PROG_RUN`` command can be used through the ``bpf()`` syscall to
+execute a BPF program in the kernel and return the results to userspace. This
+can be used to unit test BPF programs against user-supplied context objects, and
+as way to explicitly execute programs in the kernel for their side effects. The
+command was previously named ``BPF_PROG_TEST_RUN``, and both constants continue
+to be defined in the UAPI header, aliased to the same value.
+
+The ``BPF_PROG_RUN`` command can be used to execute BPF programs of the
+following types:
+
+- ``BPF_PROG_TYPE_SOCKET_FILTER``
+- ``BPF_PROG_TYPE_SCHED_CLS``
+- ``BPF_PROG_TYPE_SCHED_ACT``
+- ``BPF_PROG_TYPE_XDP``
+- ``BPF_PROG_TYPE_SK_LOOKUP``
+- ``BPF_PROG_TYPE_CGROUP_SKB``
+- ``BPF_PROG_TYPE_LWT_IN``
+- ``BPF_PROG_TYPE_LWT_OUT``
+- ``BPF_PROG_TYPE_LWT_XMIT``
+- ``BPF_PROG_TYPE_LWT_SEG6LOCAL``
+- ``BPF_PROG_TYPE_FLOW_DISSECTOR``
+- ``BPF_PROG_TYPE_STRUCT_OPS``
+- ``BPF_PROG_TYPE_RAW_TRACEPOINT``
+- ``BPF_PROG_TYPE_SYSCALL``
+
+When using the ``BPF_PROG_RUN`` command, userspace supplies an input context
+object and (for program types operating on network packets) a buffer containing
+the packet data that the BPF program will operate on. The kernel will then
+execute the program and return the results to userspace. Note that programs will
+not have any side effects while being run in this mode; in particular, packets
+will not actually be redirected or dropped, the program return code will just be
+returned to userspace. A separate mode for live execution of XDP programs is
+provided, documented separately below.
+
+Running XDP programs in "live frame mode"
+-----------------------------------------
+
+The ``BPF_PROG_RUN`` command has a separate mode for running live XDP programs,
+which can be used to execute XDP programs in a way where packets will actually
+be processed by the kernel after the execution of the XDP program as if they
+arrived on a physical interface. This mode is activated by setting the
+``BPF_F_TEST_XDP_LIVE_FRAMES`` flag when supplying an XDP program to
+``BPF_PROG_RUN``.
+
+The live packet mode is optimised for high performance execution of the supplied
+XDP program many times (suitable for, e.g., running as a traffic generator),
+which means the semantics are not quite as straight-forward as the regular test
+run mode. Specifically:
+
+- When executing an XDP program in live frame mode, the result of the execution
+ will not be returned to userspace; instead, the kernel will perform the
+ operation indicated by the program's return code (drop the packet, redirect
+ it, etc). For this reason, setting the ``data_out`` or ``ctx_out`` attributes
+ in the syscall parameters when running in this mode will be rejected. In
+ addition, not all failures will be reported back to userspace directly;
+ specifically, only fatal errors in setup or during execution (like memory
+ allocation errors) will halt execution and return an error. If an error occurs
+ in packet processing, like a failure to redirect to a given interface,
+ execution will continue with the next repetition; these errors can be detected
+ via the same trace points as for regular XDP programs.
+
+- Userspace can supply an ifindex as part of the context object, just like in
+ the regular (non-live) mode. The XDP program will be executed as though the
+ packet arrived on this interface; i.e., the ``ingress_ifindex`` of the context
+ object will point to that interface. Furthermore, if the XDP program returns
+ ``XDP_PASS``, the packet will be injected into the kernel networking stack as
+ though it arrived on that ifindex, and if it returns ``XDP_TX``, the packet
+ will be transmitted *out* of that same interface. Do note, though, that
+ because the program execution is not happening in driver context, an
+ ``XDP_TX`` is actually turned into the same action as an ``XDP_REDIRECT`` to
+ that same interface (i.e., it will only work if the driver has support for the
+ ``ndo_xdp_xmit`` driver op).
+
+- When running the program with multiple repetitions, the execution will happen
+ in batches. The batch size defaults to 64 packets (which is same as the
+ maximum NAPI receive batch size), but can be specified by userspace through
+ the ``batch_size`` parameter, up to a maximum of 256 packets. For each batch,
+ the kernel executes the XDP program repeatedly, each invocation getting a
+ separate copy of the packet data. For each repetition, if the program drops
+ the packet, the data page is immediately recycled (see below). Otherwise, the
+ packet is buffered until the end of the batch, at which point all packets
+ buffered this way during the batch are transmitted at once.
+
+- When setting up the test run, the kernel will initialise a pool of memory
+ pages of the same size as the batch size. Each memory page will be initialised
+ with the initial packet data supplied by userspace at ``BPF_PROG_RUN``
+ invocation. When possible, the pages will be recycled on future program
+ invocations, to improve performance. Pages will generally be recycled a full
+ batch at a time, except when a packet is dropped (by return code or because
+ of, say, a redirection error), in which case that page will be recycled
+ immediately. If a packet ends up being passed to the regular networking stack
+ (because the XDP program returns ``XDP_PASS``, or because it ends up being
+ redirected to an interface that injects it into the stack), the page will be
+ released and a new one will be allocated when the pool is empty.
+
+ When recycling, the page content is not rewritten; only the packet boundary
+ pointers (``data``, ``data_end`` and ``data_meta``) in the context object will
+ be reset to the original values. This means that if a program rewrites the
+ packet contents, it has to be prepared to see either the original content or
+ the modified version on subsequent invocations.
diff --git a/Documentation/bpf/btf.rst b/Documentation/bpf/btf.rst
index 1ebf4c5c7ddc..7940da9bc6c1 100644
--- a/Documentation/bpf/btf.rst
+++ b/Documentation/bpf/btf.rst
@@ -503,6 +503,19 @@ valid index (starting from 0) pointing to a member or an argument.
* ``info.vlen``: 0
* ``type``: the type with ``btf_type_tag`` attribute
+Currently, ``BTF_KIND_TYPE_TAG`` is only emitted for pointer types.
+It has the following btf type chain:
+::
+
+ ptr -> [type_tag]*
+ -> [const | volatile | restrict | typedef]*
+ -> base_type
+
+Basically, a pointer type points to zero or more
+type_tag, then zero or more const/volatile/restrict/typedef
+and finally the base type. The base type is one of
+int, ptr, array, struct, union, enum, func_proto and float types.
+
3. BTF Kernel API
=================
@@ -565,18 +578,15 @@ A map can be created with ``btf_fd`` and specified key/value type id.::
In libbpf, the map can be defined with extra annotation like below:
::
- struct bpf_map_def SEC("maps") btf_map = {
- .type = BPF_MAP_TYPE_ARRAY,
- .key_size = sizeof(int),
- .value_size = sizeof(struct ipv_counts),
- .max_entries = 4,
- };
- BPF_ANNOTATE_KV_PAIR(btf_map, int, struct ipv_counts);
+ struct {
+ __uint(type, BPF_MAP_TYPE_ARRAY);
+ __type(key, int);
+ __type(value, struct ipv_counts);
+ __uint(max_entries, 4);
+ } btf_map SEC(".maps");
-Here, the parameters for macro BPF_ANNOTATE_KV_PAIR are map name, key and
-value types for the map. During ELF parsing, libbpf is able to extract
-key/value type_id's and assign them to BPF_MAP_CREATE attributes
-automatically.
+During ELF parsing, libbpf is able to extract key/value type_id's and assign
+them to BPF_MAP_CREATE attributes automatically.
.. _BPF_Prog_Load:
@@ -824,13 +834,12 @@ structure has bitfields. For example, for the following map,::
___A b1:4;
enum A b2:4;
};
- struct bpf_map_def SEC("maps") tmpmap = {
- .type = BPF_MAP_TYPE_ARRAY,
- .key_size = sizeof(__u32),
- .value_size = sizeof(struct tmp_t),
- .max_entries = 1,
- };
- BPF_ANNOTATE_KV_PAIR(tmpmap, int, struct tmp_t);
+ struct {
+ __uint(type, BPF_MAP_TYPE_ARRAY);
+ __type(key, int);
+ __type(value, struct tmp_t);
+ __uint(max_entries, 1);
+ } tmpmap SEC(".maps");
bpftool is able to pretty print like below:
::
diff --git a/Documentation/bpf/index.rst b/Documentation/bpf/index.rst
index ef5c996547ec..96056a7447c7 100644
--- a/Documentation/bpf/index.rst
+++ b/Documentation/bpf/index.rst
@@ -21,6 +21,7 @@ that goes into great technical depth about the BPF Architecture.
helpers
programs
maps
+ bpf_prog_run
classic_vs_extended.rst
bpf_licensing
test_debug
diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst
index 3704836fe6df..5300837ac2c9 100644
--- a/Documentation/bpf/instruction-set.rst
+++ b/Documentation/bpf/instruction-set.rst
@@ -22,7 +22,13 @@ necessary across calls.
Instruction encoding
====================
-eBPF uses 64-bit instructions with the following encoding:
+eBPF has two instruction encodings:
+
+ * the basic instruction encoding, which uses 64 bits to encode an instruction
+ * the wide instruction encoding, which appends a second 64-bit immediate value
+ (imm64) after the basic instruction for a total of 128 bits.
+
+The basic instruction encoding looks as follows:
============= ======= =============== ==================== ============
32 bits (MSB) 16 bits 4 bits 4 bits 8 bits (LSB)
@@ -82,9 +88,9 @@ BPF_ALU uses 32-bit wide operands while BPF_ALU64 uses 64-bit wide operands for
otherwise identical operations.
The code field encodes the operation as below:
- ======== ===== ==========================
+ ======== ===== =================================================
code value description
- ======== ===== ==========================
+ ======== ===== =================================================
BPF_ADD 0x00 dst += src
BPF_SUB 0x10 dst -= src
BPF_MUL 0x20 dst \*= src
@@ -98,8 +104,8 @@ The code field encodes the operation as below:
BPF_XOR 0xa0 dst ^= src
BPF_MOV 0xb0 dst = src
BPF_ARSH 0xc0 sign extending shift right
- BPF_END 0xd0 endianness conversion
- ======== ===== ==========================
+ BPF_END 0xd0 byte swap operations (see separate section below)
+ ======== ===== =================================================
BPF_ADD | BPF_X | BPF_ALU means::
@@ -118,6 +124,42 @@ BPF_XOR | BPF_K | BPF_ALU64 means::
src_reg = src_reg ^ imm32
+Byte swap instructions
+----------------------
+
+The byte swap instructions use an instruction class of ``BFP_ALU`` and a 4-bit
+code field of ``BPF_END``.
+
+The byte swap instructions instructions operate on the destination register
+only and do not use a separate source register or immediate value.
+
+The 1-bit source operand field in the opcode is used to to select what byte
+order the operation convert from or to:
+
+ ========= ===== =================================================
+ source value description
+ ========= ===== =================================================
+ BPF_TO_LE 0x00 convert between host byte order and little endian
+ BPF_TO_BE 0x08 convert between host byte order and big endian
+ ========= ===== =================================================
+
+The imm field encodes the width of the swap operations. The following widths
+are supported: 16, 32 and 64.
+
+Examples:
+
+``BPF_ALU | BPF_TO_LE | BPF_END`` with imm = 16 means::
+
+ dst_reg = htole16(dst_reg)
+
+``BPF_ALU | BPF_TO_BE | BPF_END`` with imm = 64 means::
+
+ dst_reg = htobe64(dst_reg)
+
+``BPF_FROM_LE`` and ``BPF_FROM_BE`` exist as aliases for ``BPF_TO_LE`` and
+``BPF_TO_LE`` respetively.
+
+
Jump instructions
-----------------
@@ -176,63 +218,96 @@ The mode modifier is one of:
============= ===== ====================================
mode modifier value description
============= ===== ====================================
- BPF_IMM 0x00 used for 64-bit mov
- BPF_ABS 0x20 legacy BPF packet access
- BPF_IND 0x40 legacy BPF packet access
- BPF_MEM 0x60 all normal load and store operations
+ BPF_IMM 0x00 64-bit immediate instructions
+ BPF_ABS 0x20 legacy BPF packet access (absolute)
+ BPF_IND 0x40 legacy BPF packet access (indirect)
+ BPF_MEM 0x60 regular load and store operations
BPF_ATOMIC 0xc0 atomic operations
============= ===== ====================================
-BPF_MEM | <size> | BPF_STX means::
+
+Regular load and store operations
+---------------------------------
+
+The ``BPF_MEM`` mode modifier is used to encode regular load and store
+instructions that transfer data between a register and memory.
+
+``BPF_MEM | <size> | BPF_STX`` means::
*(size *) (dst_reg + off) = src_reg
-BPF_MEM | <size> | BPF_ST means::
+``BPF_MEM | <size> | BPF_ST`` means::
*(size *) (dst_reg + off) = imm32
-BPF_MEM | <size> | BPF_LDX means::
+``BPF_MEM | <size> | BPF_LDX`` means::
dst_reg = *(size *) (src_reg + off)
-Where size is one of: BPF_B or BPF_H or BPF_W or BPF_DW.
+Where size is one of: ``BPF_B``, ``BPF_H``, ``BPF_W``, or ``BPF_DW``.
Atomic operations
-----------------
-eBPF includes atomic operations, which use the immediate field for extra
-encoding::
+Atomic operations are operations that operate on memory and can not be
+interrupted or corrupted by other access to the same memory region
+by other eBPF programs or means outside of this specification.
+
+All atomic operations supported by eBPF are encoded as store operations
+that use the ``BPF_ATOMIC`` mode modifier as follows:
+
+ * ``BPF_ATOMIC | BPF_W | BPF_STX`` for 32-bit operations
+ * ``BPF_ATOMIC | BPF_DW | BPF_STX`` for 64-bit operations
+ * 8-bit and 16-bit wide atomic operations are not supported.
- .imm = BPF_ADD, .code = BPF_ATOMIC | BPF_W | BPF_STX: lock xadd *(u32 *)(dst_reg + off16) += src_reg
- .imm = BPF_ADD, .code = BPF_ATOMIC | BPF_DW | BPF_STX: lock xadd *(u64 *)(dst_reg + off16) += src_reg
+The imm field is used to encode the actual atomic operation.
+Simple atomic operation use a subset of the values defined to encode
+arithmetic operations in the imm field to encode the atomic operation:
-The basic atomic operations supported are::
+ ======== ===== ===========
+ imm value description
+ ======== ===== ===========
+ BPF_ADD 0x00 atomic add
+ BPF_OR 0x40 atomic or
+ BPF_AND 0x50 atomic and
+ BPF_XOR 0xa0 atomic xor
+ ======== ===== ===========
- BPF_ADD
- BPF_AND
- BPF_OR
- BPF_XOR
-Each having equivalent semantics with the ``BPF_ADD`` example, that is: the
-memory location addresed by ``dst_reg + off`` is atomically modified, with
-``src_reg`` as the other operand. If the ``BPF_FETCH`` flag is set in the
-immediate, then these operations also overwrite ``src_reg`` with the
-value that was in memory before it was modified.
+``BPF_ATOMIC | BPF_W | BPF_STX`` with imm = BPF_ADD means::
-The more special operations are::
+ *(u32 *)(dst_reg + off16) += src_reg
- BPF_XCHG
+``BPF_ATOMIC | BPF_DW | BPF_STX`` with imm = BPF ADD means::
-This atomically exchanges ``src_reg`` with the value addressed by ``dst_reg +
-off``. ::
+ *(u64 *)(dst_reg + off16) += src_reg
- BPF_CMPXCHG
+``BPF_XADD`` is a deprecated name for ``BPF_ATOMIC | BPF_ADD``.
-This atomically compares the value addressed by ``dst_reg + off`` with
-``R0``. If they match it is replaced with ``src_reg``. In either case, the
-value that was there before is zero-extended and loaded back to ``R0``.
+In addition to the simple atomic operations, there also is a modifier and
+two complex atomic operations:
-Note that 1 and 2 byte atomic operations are not supported.
+ =========== ================ ===========================
+ imm value description
+ =========== ================ ===========================
+ BPF_FETCH 0x01 modifier: return old value
+ BPF_XCHG 0xe0 | BPF_FETCH atomic exchange
+ BPF_CMPXCHG 0xf0 | BPF_FETCH atomic compare and exchange
+ =========== ================ ===========================
+
+The ``BPF_FETCH`` modifier is optional for simple atomic operations, and
+always set for the complex atomic operations. If the ``BPF_FETCH`` flag
+is set, then the operation also overwrites ``src_reg`` with the value that
+was in memory before it was modified.
+
+The ``BPF_XCHG`` operation atomically exchanges ``src_reg`` with the value
+addressed by ``dst_reg + off``.
+
+The ``BPF_CMPXCHG`` operation atomically compares the value addressed by
+``dst_reg + off`` with ``R0``. If they match, the value addressed by
+``dst_reg + off`` is replaced with ``src_reg``. In either case, the
+value that was at ``dst_reg + off`` before the operation is zero-extended
+and loaded back to ``R0``.
Clang can generate atomic instructions by default when ``-mcpu=v3`` is
enabled. If a lower version for ``-mcpu`` is set, the only atomic instruction
@@ -240,40 +315,52 @@ Clang can generate is ``BPF_ADD`` *without* ``BPF_FETCH``. If you need to enable
the atomics features, while keeping a lower ``-mcpu`` version, you can use
``-Xclang -target-feature -Xclang +alu32``.
-You may encounter ``BPF_XADD`` - this is a legacy name for ``BPF_ATOMIC``,
-referring to the exclusive-add operation encoded when the immediate field is
-zero.
+64-bit immediate instructions
+-----------------------------
-16-byte instructions
---------------------
+Instructions with the ``BPF_IMM`` mode modifier use the wide instruction
+encoding for an extra imm64 value.
-eBPF has one 16-byte instruction: ``BPF_LD | BPF_DW | BPF_IMM`` which consists
-of two consecutive ``struct bpf_insn`` 8-byte blocks and interpreted as single
-instruction that loads 64-bit immediate value into a dst_reg.
+There is currently only one such instruction.
-Packet access instructions
---------------------------
+``BPF_LD | BPF_DW | BPF_IMM`` means::
-eBPF has two non-generic instructions: (BPF_ABS | <size> | BPF_LD) and
-(BPF_IND | <size> | BPF_LD) which are used to access packet data.
+ dst_reg = imm64
-They had to be carried over from classic BPF to have strong performance of
-socket filters running in eBPF interpreter. These instructions can only
-be used when interpreter context is a pointer to ``struct sk_buff`` and
-have seven implicit operands. Register R6 is an implicit input that must
-contain pointer to sk_buff. Register R0 is an implicit output which contains
-the data fetched from the packet. Registers R1-R5 are scratch registers
-and must not be used to store the data across BPF_ABS | BPF_LD or
-BPF_IND | BPF_LD instructions.
-These instructions have implicit program exit condition as well. When
-eBPF program is trying to access the data beyond the packet boundary,
-the interpreter will abort the execution of the program. JIT compilers
-therefore must preserve this property. src_reg and imm32 fields are
-explicit inputs to these instructions.
+Legacy BPF Packet access instructions
+-------------------------------------
-For example, BPF_IND | BPF_W | BPF_LD means::
+eBPF has special instructions for access to packet data that have been
+carried over from classic BPF to retain the performance of legacy socket
+filters running in the eBPF interpreter.
- R0 = ntohl(*(u32 *) (((struct sk_buff *) R6)->data + src_reg + imm32))
+The instructions come in two forms: ``BPF_ABS | <size> | BPF_LD`` and
+``BPF_IND | <size> | BPF_LD``.
-and R1 - R5 are clobbered.
+These instructions are used to access packet data and can only be used when
+the program context is a pointer to networking packet. ``BPF_ABS``
+accesses packet data at an absolute offset specified by the immediate data
+and ``BPF_IND`` access packet data at an offset that includes the value of
+a register in addition to the immediate data.
+
+These instructions have seven implicit operands:
+
+ * Register R6 is an implicit input that must contain pointer to a
+ struct sk_buff.
+ * Register R0 is an implicit output which contains the data fetched from
+ the packet.
+ * Registers R1-R5 are scratch registers that are clobbered after a call to
+ ``BPF_ABS | BPF_LD`` or ``BPF_IND`` | BPF_LD instructions.
+
+These instructions have an implicit program exit condition as well. When an
+eBPF program is trying to access the data beyond the packet boundary, the
+program execution will be aborted.
+
+``BPF_ABS | BPF_W | BPF_LD`` means::
+
+ R0 = ntohl(*(u32 *) (((struct sk_buff *) R6)->data + imm32))
+
+``BPF_IND | BPF_W | BPF_LD`` means::
+
+ R0 = ntohl(*(u32 *) (((struct sk_buff *) R6)->data + src_reg + imm32))
diff --git a/Documentation/bpf/verifier.rst b/Documentation/bpf/verifier.rst
index fae5f6273bac..d4326caf01f9 100644
--- a/Documentation/bpf/verifier.rst
+++ b/Documentation/bpf/verifier.rst
@@ -329,7 +329,7 @@ Program with unreachable instructions::
BPF_EXIT_INSN(),
};
-Error:
+Error::
unreachable insn 1
diff --git a/Documentation/devicetree/bindings/i2c/i2c.txt b/Documentation/devicetree/bindings/i2c/i2c.txt
index b864916e087f..fc3dd7ec0445 100644
--- a/Documentation/devicetree/bindings/i2c/i2c.txt
+++ b/Documentation/devicetree/bindings/i2c/i2c.txt
@@ -95,6 +95,10 @@ wants to support one of the below features, it should adapt these bindings.
- smbus-alert
states that the optional SMBus-Alert feature apply to this bus.
+- mctp-controller
+ indicates that the system is accessible via this bus as an endpoint for
+ MCTP over I2C transport.
+
Required properties (per child device)
--------------------------------------
diff --git a/Documentation/devicetree/bindings/net/can/allwinner,sun4i-a10-can.yaml b/Documentation/devicetree/bindings/net/can/allwinner,sun4i-a10-can.yaml
index c93fe9d3ea82..3c51b2d02957 100644
--- a/Documentation/devicetree/bindings/net/can/allwinner,sun4i-a10-can.yaml
+++ b/Documentation/devicetree/bindings/net/can/allwinner,sun4i-a10-can.yaml
@@ -10,6 +10,9 @@ maintainers:
- Chen-Yu Tsai <wens@csie.org>
- Maxime Ripard <mripard@kernel.org>
+allOf:
+ - $ref: can-controller.yaml#
+
properties:
compatible:
oneOf:
diff --git a/Documentation/devicetree/bindings/net/can/bosch,m_can.yaml b/Documentation/devicetree/bindings/net/can/bosch,m_can.yaml
index 401ab7cdb379..b7f9803c1c6d 100644
--- a/Documentation/devicetree/bindings/net/can/bosch,m_can.yaml
+++ b/Documentation/devicetree/bindings/net/can/bosch,m_can.yaml
@@ -9,7 +9,10 @@ title: Bosch MCAN controller Bindings
description: Bosch MCAN controller for CAN bus
maintainers:
- - Sriram Dash <sriram.dash@samsung.com>
+ - Chandrasekar Ramakrishnan <rcsekar@samsung.com>
+
+allOf:
+ - $ref: can-controller.yaml#
properties:
compatible:
@@ -66,8 +69,8 @@ properties:
M_CAN includes the following elements according to user manual:
11-bit Filter 0-128 elements / 0-128 words
29-bit Filter 0-64 elements / 0-128 words
- Rx FIFO 0 0-64 elements / 0-1152 words
- Rx FIFO 1 0-64 elements / 0-1152 words
+ Rx FIFO 0 0-64 elements / 0-1152 words
+ Rx FIFO 1 0-64 elements / 0-1152 words
Rx Buffers 0-64 elements / 0-1152 words
Tx Event FIFO 0-32 elements / 0-64 words
Tx Buffers 0-32 elements / 0-576 words
diff --git a/Documentation/devicetree/bindings/net/can/microchip,mcp251xfd.yaml b/Documentation/devicetree/bindings/net/can/microchip,mcp251xfd.yaml
index 2a884c1fe0e0..b3826af6bd6e 100644
--- a/Documentation/devicetree/bindings/net/can/microchip,mcp251xfd.yaml
+++ b/Documentation/devicetree/bindings/net/can/microchip,mcp251xfd.yaml
@@ -11,6 +11,9 @@ title:
maintainers:
- Marc Kleine-Budde <mkl@pengutronix.de>
+allOf:
+ - $ref: can-controller.yaml#
+
properties:
compatible:
oneOf:
diff --git a/Documentation/devicetree/bindings/net/can/renesas,rcar-canfd.yaml b/Documentation/devicetree/bindings/net/can/renesas,rcar-canfd.yaml
index 546c6e6d2fb0..91a3554ca950 100644
--- a/Documentation/devicetree/bindings/net/can/renesas,rcar-canfd.yaml
+++ b/Documentation/devicetree/bindings/net/can/renesas,rcar-canfd.yaml
@@ -35,6 +35,8 @@ properties:
- renesas,r9a07g044-canfd # RZ/G2{L,LC}
- const: renesas,rzg2l-canfd # RZ/G2L family
+ - const: renesas,r8a779a0-canfd # R-Car V3U
+
reg:
maxItems: 1
diff --git a/Documentation/devicetree/bindings/net/can/xilinx,can.yaml b/Documentation/devicetree/bindings/net/can/xilinx,can.yaml
new file mode 100644
index 000000000000..65af8183cb9c
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/can/xilinx,can.yaml
@@ -0,0 +1,161 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/net/can/xilinx,can.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title:
+ Xilinx Axi CAN/Zynq CANPS controller
+
+maintainers:
+ - Appana Durga Kedareswara rao <appana.durga.rao@xilinx.com>
+
+properties:
+ compatible:
+ enum:
+ - xlnx,zynq-can-1.0
+ - xlnx,axi-can-1.00.a
+ - xlnx,canfd-1.0
+ - xlnx,canfd-2.0
+
+ reg:
+ maxItems: 1
+
+ interrupts:
+ maxItems: 1
+
+ clocks:
+ minItems: 1
+ maxItems: 2
+
+ clock-names:
+ maxItems: 2
+
+ power-domains:
+ maxItems: 1
+
+ tx-fifo-depth:
+ $ref: "/schemas/types.yaml#/definitions/uint32"
+ description: CAN Tx fifo depth (Zynq, Axi CAN).
+
+ rx-fifo-depth:
+ $ref: "/schemas/types.yaml#/definitions/uint32"
+ description: CAN Rx fifo depth (Zynq, Axi CAN, CAN FD in sequential Rx mode)
+
+ tx-mailbox-count:
+ $ref: "/schemas/types.yaml#/definitions/uint32"
+ description: CAN Tx mailbox buffer count (CAN FD)
+
+required:
+ - compatible
+ - reg
+ - interrupts
+ - clocks
+ - clock-names
+
+unevaluatedProperties: false
+
+allOf:
+ - $ref: can-controller.yaml#
+ - if:
+ properties:
+ compatible:
+ contains:
+ enum:
+ - xlnx,zynq-can-1.0
+
+ then:
+ properties:
+ clock-names:
+ items:
+ - const: can_clk
+ - const: pclk
+ required:
+ - tx-fifo-depth
+ - rx-fifo-depth
+
+ - if:
+ properties:
+ compatible:
+ contains:
+ enum:
+ - xlnx,axi-can-1.00.a
+
+ then:
+ properties:
+ clock-names:
+ items:
+ - const: can_clk
+ - const: s_axi_aclk
+ required:
+ - tx-fifo-depth
+ - rx-fifo-depth
+
+ - if:
+ properties:
+ compatible:
+ contains:
+ enum:
+ - xlnx,canfd-1.0
+ - xlnx,canfd-2.0
+
+ then:
+ properties:
+ clock-names:
+ items:
+ - const: can_clk
+ - const: s_axi_aclk
+ required:
+ - tx-mailbox-count
+ - rx-fifo-depth
+
+examples:
+ - |
+ #include <dt-bindings/interrupt-controller/arm-gic.h>
+
+ can@e0008000 {
+ compatible = "xlnx,zynq-can-1.0";
+ reg = <0xe0008000 0x1000>;
+ clocks = <&clkc 19>, <&clkc 36>;
+ clock-names = "can_clk", "pclk";
+ interrupts = <GIC_SPI 28 IRQ_TYPE_LEVEL_HIGH>;
+ interrupt-parent = <&intc>;
+ tx-fifo-depth = <0x40>;
+ rx-fifo-depth = <0x40>;
+ };
+
+ - |
+ can@40000000 {
+ compatible = "xlnx,axi-can-1.00.a";
+ reg = <0x40000000 0x10000>;
+ clocks = <&clkc 0>, <&clkc 1>;
+ clock-names = "can_clk", "s_axi_aclk";
+ interrupt-parent = <&intc>;
+ interrupts = <GIC_SPI 59 IRQ_TYPE_EDGE_RISING>;
+ tx-fifo-depth = <0x40>;
+ rx-fifo-depth = <0x40>;
+ };
+
+ - |
+ can@40000000 {
+ compatible = "xlnx,canfd-1.0";
+ reg = <0x40000000 0x2000>;
+ clocks = <&clkc 0>, <&clkc 1>;
+ clock-names = "can_clk", "s_axi_aclk";
+ interrupt-parent = <&intc>;
+ interrupts = <GIC_SPI 59 IRQ_TYPE_EDGE_RISING>;
+ tx-mailbox-count = <0x20>;
+ rx-fifo-depth = <0x20>;
+ };
+
+ - |
+ can@ff060000 {
+ compatible = "xlnx,canfd-2.0";
+ reg = <0xff060000 0x6000>;
+ clocks = <&clkc 0>, <&clkc 1>;
+ clock-names = "can_clk", "s_axi_aclk";
+ interrupt-parent = <&intc>;
+ interrupts = <GIC_SPI 59 IRQ_TYPE_EDGE_RISING>;
+ tx-mailbox-count = <0x20>;
+ rx-fifo-depth = <0x40>;
+ };
diff --git a/Documentation/devicetree/bindings/net/can/xilinx_can.txt b/Documentation/devicetree/bindings/net/can/xilinx_can.txt
deleted file mode 100644
index 100cc40b8510..000000000000
--- a/Documentation/devicetree/bindings/net/can/xilinx_can.txt
+++ /dev/null
@@ -1,61 +0,0 @@
-Xilinx Axi CAN/Zynq CANPS controller Device Tree Bindings
----------------------------------------------------------
-
-Required properties:
-- compatible : Should be:
- - "xlnx,zynq-can-1.0" for Zynq CAN controllers
- - "xlnx,axi-can-1.00.a" for Axi CAN controllers
- - "xlnx,canfd-1.0" for CAN FD controllers
- - "xlnx,canfd-2.0" for CAN FD 2.0 controllers
-- reg : Physical base address and size of the controller
- registers map.
-- interrupts : Property with a value describing the interrupt
- number.
-- clock-names : List of input clock names
- - "can_clk", "pclk" (For CANPS),
- - "can_clk", "s_axi_aclk" (For AXI CAN and CAN FD).
- (See clock bindings for details).
-- clocks : Clock phandles (see clock bindings for details).
-- tx-fifo-depth : Can Tx fifo depth (Zynq, Axi CAN).
-- rx-fifo-depth : Can Rx fifo depth (Zynq, Axi CAN, CAN FD in
- sequential Rx mode).
-- tx-mailbox-count : Can Tx mailbox buffer count (CAN FD).
-- rx-mailbox-count : Can Rx mailbox buffer count (CAN FD in mailbox Rx
- mode).
-
-
-Example:
-
-For Zynq CANPS Dts file:
- zynq_can_0: can@e0008000 {
- compatible = "xlnx,zynq-can-1.0";
- clocks = <&clkc 19>, <&clkc 36>;
- clock-names = "can_clk", "pclk";
- reg = <0xe0008000 0x1000>;
- interrupts = <0 28 4>;
- interrupt-parent = <&intc>;
- tx-fifo-depth = <0x40>;
- rx-fifo-depth = <0x40>;
- };
-For Axi CAN Dts file:
- axi_can_0: axi-can@40000000 {
- compatible = "xlnx,axi-can-1.00.a";
- clocks = <&clkc 0>, <&clkc 1>;
- clock-names = "can_clk","s_axi_aclk" ;
- reg = <0x40000000 0x10000>;
- interrupt-parent = <&intc>;
- interrupts = <0 59 1>;
- tx-fifo-depth = <0x40>;
- rx-fifo-depth = <0x40>;
- };
-For CAN FD Dts file:
- canfd_0: canfd@40000000 {
- compatible = "xlnx,canfd-1.0";
- clocks = <&clkc 0>, <&clkc 1>;
- clock-names = "can_clk", "s_axi_aclk";
- reg = <0x40000000 0x2000>;
- interrupt-parent = <&intc>;
- interrupts = <0 59 1>;
- tx-mailbox-count = <0x20>;
- rx-fifo-depth = <0x20>;
- };
diff --git a/Documentation/devicetree/bindings/net/cdns,macb.yaml b/Documentation/devicetree/bindings/net/cdns,macb.yaml
index 8dd06db34169..6cd3d853dcba 100644
--- a/Documentation/devicetree/bindings/net/cdns,macb.yaml
+++ b/Documentation/devicetree/bindings/net/cdns,macb.yaml
@@ -81,6 +81,25 @@ properties:
phy-handle: true
+ phys:
+ maxItems: 1
+
+ phy-names:
+ const: sgmii-phy
+ description:
+ Required with ZynqMP SoC when in SGMII mode.
+ Should reference PS-GTR generic PHY device for this controller
+ instance. See ZynqMP example.
+
+ resets:
+ maxItems: 1
+ description:
+ Recommended with ZynqMP, specify reset control for this
+ controller instance with zynqmp-reset driver.
+
+ reset-names:
+ maxItems: 1
+
fixed-link: true
iommus:
@@ -157,3 +176,40 @@ examples:
reset-gpios = <&pioE 6 1>;
};
};
+
+ - |
+ #include <dt-bindings/clock/xlnx-zynqmp-clk.h>
+ #include <dt-bindings/power/xlnx-zynqmp-power.h>
+ #include <dt-bindings/reset/xlnx-zynqmp-resets.h>
+ #include <dt-bindings/phy/phy.h>
+
+ bus {
+ #address-cells = <2>;
+ #size-cells = <2>;
+ gem1: ethernet@ff0c0000 {
+ compatible = "cdns,zynqmp-gem", "cdns,gem";
+ interrupt-parent = <&gic>;
+ interrupts = <0 59 4>, <0 59 4>;
+ reg = <0x0 0xff0c0000 0x0 0x1000>;
+ clocks = <&zynqmp_clk LPD_LSBUS>, <&zynqmp_clk GEM1_REF>,
+ <&zynqmp_clk GEM1_TX>, <&zynqmp_clk GEM1_RX>,
+ <&zynqmp_clk GEM_TSU>;
+ clock-names = "pclk", "hclk", "tx_clk", "rx_clk", "tsu_clk";
+ #address-cells = <1>;
+ #size-cells = <0>;
+ #stream-id-cells = <1>;
+ iommus = <&smmu 0x875>;
+ power-domains = <&zynqmp_firmware PD_ETH_1>;
+ resets = <&zynqmp_reset ZYNQMP_RESET_GEM1>;
+ reset-names = "gem1_rst";
+ status = "okay";
+ phy-mode = "sgmii";
+ phy-names = "sgmii-phy";
+ phys = <&psgtr 1 PHY_TYPE_SGMII 1 1>;
+ fixed-link {
+ speed = <1000>;
+ full-duplex;
+ pause;
+ };
+ };
+ };
diff --git a/Documentation/devicetree/bindings/net/davicom,dm9051.yaml b/Documentation/devicetree/bindings/net/davicom,dm9051.yaml
new file mode 100644
index 000000000000..52e852fef753
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/davicom,dm9051.yaml
@@ -0,0 +1,62 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/net/davicom,dm9051.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Davicom DM9051 SPI Ethernet Controller
+
+maintainers:
+ - Joseph CHANG <josright123@gmail.com>
+
+description: |
+ The DM9051 is a fully integrated and cost-effective low pin count single
+ chip Fast Ethernet controller with a Serial Peripheral Interface (SPI).
+
+allOf:
+ - $ref: ethernet-controller.yaml#
+
+properties:
+ compatible:
+ const: davicom,dm9051
+
+ reg:
+ maxItems: 1
+
+ spi-max-frequency:
+ maximum: 45000000
+
+ interrupts:
+ maxItems: 1
+
+ local-mac-address: true
+
+ mac-address: true
+
+required:
+ - compatible
+ - reg
+ - spi-max-frequency
+ - interrupts
+
+additionalProperties: false
+
+examples:
+ # Raspberry Pi platform
+ - |
+ /* for Raspberry Pi with pin control stuff for GPIO irq */
+ #include <dt-bindings/interrupt-controller/irq.h>
+ #include <dt-bindings/gpio/gpio.h>
+ spi {
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ ethernet@0 {
+ compatible = "davicom,dm9051";
+ reg = <0>; /* spi chip select */
+ local-mac-address = [00 00 00 00 00 00];
+ interrupt-parent = <&gpio>;
+ interrupts = <26 IRQ_TYPE_LEVEL_LOW>;
+ spi-max-frequency = <31200000>;
+ };
+ };
diff --git a/Documentation/devicetree/bindings/net/dsa/dsa-port.yaml b/Documentation/devicetree/bindings/net/dsa/dsa-port.yaml
index 702df848a71d..e60867c7c571 100644
--- a/Documentation/devicetree/bindings/net/dsa/dsa-port.yaml
+++ b/Documentation/devicetree/bindings/net/dsa/dsa-port.yaml
@@ -51,6 +51,8 @@ properties:
- edsa
- ocelot
- ocelot-8021q
+ - rtl8_4
+ - rtl8_4t
- seville
phy-handle: true
diff --git a/Documentation/devicetree/bindings/net/dsa/microchip,ksz.yaml b/Documentation/devicetree/bindings/net/dsa/microchip,ksz.yaml
index 84985f53bffd..184152087b60 100644
--- a/Documentation/devicetree/bindings/net/dsa/microchip,ksz.yaml
+++ b/Documentation/devicetree/bindings/net/dsa/microchip,ksz.yaml
@@ -42,6 +42,12 @@ properties:
description:
Set if the output SYNCLKO frequency should be set to 125MHz instead of 25MHz.
+ microchip,synclko-disable:
+ $ref: /schemas/types.yaml#/definitions/flag
+ description:
+ Set if the output SYNCLKO clock should be disabled. Do not mix with
+ microchip,synclko-125.
+
required:
- compatible
- reg
diff --git a/Documentation/devicetree/bindings/net/dsa/realtek-smi.txt b/Documentation/devicetree/bindings/net/dsa/realtek-smi.txt
deleted file mode 100644
index 7959ec237983..000000000000
--- a/Documentation/devicetree/bindings/net/dsa/realtek-smi.txt
+++ /dev/null
@@ -1,240 +0,0 @@
-Realtek SMI-based Switches
-==========================
-
-The SMI "Simple Management Interface" is a two-wire protocol using
-bit-banged GPIO that while it reuses the MDIO lines MCK and MDIO does
-not use the MDIO protocol. This binding defines how to specify the
-SMI-based Realtek devices.
-
-Required properties:
-
-- compatible: must be exactly one of:
- "realtek,rtl8365mb" (4+1 ports)
- "realtek,rtl8366"
- "realtek,rtl8366rb" (4+1 ports)
- "realtek,rtl8366s" (4+1 ports)
- "realtek,rtl8367"
- "realtek,rtl8367b"
- "realtek,rtl8368s" (8 port)
- "realtek,rtl8369"
- "realtek,rtl8370" (8 port)
-
-Required properties:
-- mdc-gpios: GPIO line for the MDC clock line.
-- mdio-gpios: GPIO line for the MDIO data line.
-- reset-gpios: GPIO line for the reset signal.
-
-Optional properties:
-- realtek,disable-leds: if the LED drivers are not used in the
- hardware design this will disable them so they are not turned on
- and wasting power.
-
-Required subnodes:
-
-- interrupt-controller
-
- This defines an interrupt controller with an IRQ line (typically
- a GPIO) that will demultiplex and handle the interrupt from the single
- interrupt line coming out of one of the SMI-based chips. It most
- importantly provides link up/down interrupts to the PHY blocks inside
- the ASIC.
-
-Required properties of interrupt-controller:
-
-- interrupt: parent interrupt, see interrupt-controller/interrupts.txt
-- interrupt-controller: see interrupt-controller/interrupts.txt
-- #address-cells: should be <0>
-- #interrupt-cells: should be <1>
-
-- mdio
-
- This defines the internal MDIO bus of the SMI device, mostly for the
- purpose of being able to hook the interrupts to the right PHY and
- the right PHY to the corresponding port.
-
-Required properties of mdio:
-
-- compatible: should be set to "realtek,smi-mdio" for all SMI devices
-
-See net/mdio.txt for additional MDIO bus properties.
-
-See net/dsa/dsa.txt for a list of additional required and optional properties
-and subnodes of DSA switches.
-
-Examples:
-
-An example for the RTL8366RB:
-
-switch {
- compatible = "realtek,rtl8366rb";
- /* 22 = MDIO (has input reads), 21 = MDC (clock, output only) */
- mdc-gpios = <&gpio0 21 GPIO_ACTIVE_HIGH>;
- mdio-gpios = <&gpio0 22 GPIO_ACTIVE_HIGH>;
- reset-gpios = <&gpio0 14 GPIO_ACTIVE_LOW>;
-
- switch_intc: interrupt-controller {
- /* GPIO 15 provides the interrupt */
- interrupt-parent = <&gpio0>;
- interrupts = <15 IRQ_TYPE_LEVEL_LOW>;
- interrupt-controller;
- #address-cells = <0>;
- #interrupt-cells = <1>;
- };
-
- ports {
- #address-cells = <1>;
- #size-cells = <0>;
- reg = <0>;
- port@0 {
- reg = <0>;
- label = "lan0";
- phy-handle = <&phy0>;
- };
- port@1 {
- reg = <1>;
- label = "lan1";
- phy-handle = <&phy1>;
- };
- port@2 {
- reg = <2>;
- label = "lan2";
- phy-handle = <&phy2>;
- };
- port@3 {
- reg = <3>;
- label = "lan3";
- phy-handle = <&phy3>;
- };
- port@4 {
- reg = <4>;
- label = "wan";
- phy-handle = <&phy4>;
- };
- port@5 {
- reg = <5>;
- label = "cpu";
- ethernet = <&gmac0>;
- phy-mode = "rgmii";
- fixed-link {
- speed = <1000>;
- full-duplex;
- };
- };
- };
-
- mdio {
- compatible = "realtek,smi-mdio", "dsa-mdio";
- #address-cells = <1>;
- #size-cells = <0>;
-
- phy0: phy@0 {
- reg = <0>;
- interrupt-parent = <&switch_intc>;
- interrupts = <0>;
- };
- phy1: phy@1 {
- reg = <1>;
- interrupt-parent = <&switch_intc>;
- interrupts = <1>;
- };
- phy2: phy@2 {
- reg = <2>;
- interrupt-parent = <&switch_intc>;
- interrupts = <2>;
- };
- phy3: phy@3 {
- reg = <3>;
- interrupt-parent = <&switch_intc>;
- interrupts = <3>;
- };
- phy4: phy@4 {
- reg = <4>;
- interrupt-parent = <&switch_intc>;
- interrupts = <12>;
- };
- };
-};
-
-An example for the RTL8365MB-VC:
-
-switch {
- compatible = "realtek,rtl8365mb";
- mdc-gpios = <&gpio1 16 GPIO_ACTIVE_HIGH>;
- mdio-gpios = <&gpio1 17 GPIO_ACTIVE_HIGH>;
- reset-gpios = <&gpio5 0 GPIO_ACTIVE_LOW>;
-
- switch_intc: interrupt-controller {
- interrupt-parent = <&gpio5>;
- interrupts = <1 IRQ_TYPE_LEVEL_LOW>;
- interrupt-controller;
- #address-cells = <0>;
- #interrupt-cells = <1>;
- };
-
- ports {
- #address-cells = <1>;
- #size-cells = <0>;
- reg = <0>;
- port@0 {
- reg = <0>;
- label = "swp0";
- phy-handle = <&ethphy0>;
- };
- port@1 {
- reg = <1>;
- label = "swp1";
- phy-handle = <&ethphy1>;
- };
- port@2 {
- reg = <2>;
- label = "swp2";
- phy-handle = <&ethphy2>;
- };
- port@3 {
- reg = <3>;
- label = "swp3";
- phy-handle = <&ethphy3>;
- };
- port@6 {
- reg = <6>;
- label = "cpu";
- ethernet = <&fec1>;
- phy-mode = "rgmii";
- tx-internal-delay-ps = <2000>;
- rx-internal-delay-ps = <2000>;
-
- fixed-link {
- speed = <1000>;
- full-duplex;
- pause;
- };
- };
- };
-
- mdio {
- compatible = "realtek,smi-mdio";
- #address-cells = <1>;
- #size-cells = <0>;
-
- ethphy0: phy@0 {
- reg = <0>;
- interrupt-parent = <&switch_intc>;
- interrupts = <0>;
- };
- ethphy1: phy@1 {
- reg = <1>;
- interrupt-parent = <&switch_intc>;
- interrupts = <1>;
- };
- ethphy2: phy@2 {
- reg = <2>;
- interrupt-parent = <&switch_intc>;
- interrupts = <2>;
- };
- ethphy3: phy@3 {
- reg = <3>;
- interrupt-parent = <&switch_intc>;
- interrupts = <3>;
- };
- };
-};
diff --git a/Documentation/devicetree/bindings/net/dsa/realtek.yaml b/Documentation/devicetree/bindings/net/dsa/realtek.yaml
new file mode 100644
index 000000000000..8756060895a8
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/dsa/realtek.yaml
@@ -0,0 +1,394 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/net/dsa/realtek.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Realtek switches for unmanaged switches
+
+allOf:
+ - $ref: dsa.yaml#
+
+maintainers:
+ - Linus Walleij <linus.walleij@linaro.org>
+
+description:
+ Realtek advertises these chips as fast/gigabit switches or unmanaged
+ switches. They can be controlled using different interfaces, like SMI,
+ MDIO or SPI.
+
+ The SMI "Simple Management Interface" is a two-wire protocol using
+ bit-banged GPIO that while it reuses the MDIO lines MCK and MDIO does
+ not use the MDIO protocol. This binding defines how to specify the
+ SMI-based Realtek devices. The realtek-smi driver is a platform driver
+ and it must be inserted inside a platform node.
+
+ The MDIO-connected switches use MDIO protocol to access their registers.
+ The realtek-mdio driver is an MDIO driver and it must be inserted inside
+ an MDIO node.
+
+properties:
+ compatible:
+ enum:
+ - realtek,rtl8365mb
+ - realtek,rtl8366
+ - realtek,rtl8366rb
+ - realtek,rtl8366s
+ - realtek,rtl8367
+ - realtek,rtl8367b
+ - realtek,rtl8367rb
+ - realtek,rtl8367s
+ - realtek,rtl8368s
+ - realtek,rtl8369
+ - realtek,rtl8370
+ description: |
+ realtek,rtl8365mb: 4+1 ports
+ realtek,rtl8366: 5+1 ports
+ realtek,rtl8366rb: 5+1 ports
+ realtek,rtl8366s: 5+1 ports
+ realtek,rtl8367:
+ realtek,rtl8367b:
+ realtek,rtl8367rb: 5+2 ports
+ realtek,rtl8367s: 5+2 ports
+ realtek,rtl8368s: 8 ports
+ realtek,rtl8369: 8+1 ports
+ realtek,rtl8370: 8+2 ports
+
+ mdc-gpios:
+ description: GPIO line for the MDC clock line.
+ maxItems: 1
+
+ mdio-gpios:
+ description: GPIO line for the MDIO data line.
+ maxItems: 1
+
+ reset-gpios:
+ description: GPIO to be used to reset the whole device
+ maxItems: 1
+
+ realtek,disable-leds:
+ type: boolean
+ description: |
+ if the LED drivers are not used in the hardware design,
+ this will disable them so they are not turned on
+ and wasting power.
+
+ interrupt-controller:
+ type: object
+ description: |
+ This defines an interrupt controller with an IRQ line (typically
+ a GPIO) that will demultiplex and handle the interrupt from the single
+ interrupt line coming out of one of the Realtek switch chips. It most
+ importantly provides link up/down interrupts to the PHY blocks inside
+ the ASIC.
+
+ properties:
+
+ interrupt-controller: true
+
+ interrupts:
+ maxItems: 1
+ description:
+ A single IRQ line from the switch, either active LOW or HIGH
+
+ '#address-cells':
+ const: 0
+
+ '#interrupt-cells':
+ const: 1
+
+ required:
+ - interrupt-controller
+ - '#address-cells'
+ - '#interrupt-cells'
+
+ mdio:
+ $ref: /schemas/net/mdio.yaml#
+ unevaluatedProperties: false
+
+ properties:
+ compatible:
+ const: realtek,smi-mdio
+
+if:
+ required:
+ - reg
+
+then:
+ not:
+ required:
+ - mdc-gpios
+ - mdio-gpios
+ - mdio
+
+ properties:
+ mdc-gpios: false
+ mdio-gpios: false
+ mdio: false
+
+else:
+ required:
+ - mdc-gpios
+ - mdio-gpios
+ - mdio
+ - reset-gpios
+
+required:
+ - compatible
+
+ # - mdc-gpios
+ # - mdio-gpios
+ # - reset-gpios
+ # - mdio
+
+unevaluatedProperties: false
+
+examples:
+ - |
+ #include <dt-bindings/gpio/gpio.h>
+ #include <dt-bindings/interrupt-controller/irq.h>
+
+ platform {
+ switch {
+ compatible = "realtek,rtl8366rb";
+ /* 22 = MDIO (has input reads), 21 = MDC (clock, output only) */
+ mdc-gpios = <&gpio0 21 GPIO_ACTIVE_HIGH>;
+ mdio-gpios = <&gpio0 22 GPIO_ACTIVE_HIGH>;
+ reset-gpios = <&gpio0 14 GPIO_ACTIVE_LOW>;
+
+ switch_intc1: interrupt-controller {
+ /* GPIO 15 provides the interrupt */
+ interrupt-parent = <&gpio0>;
+ interrupts = <15 IRQ_TYPE_LEVEL_LOW>;
+ interrupt-controller;
+ #address-cells = <0>;
+ #interrupt-cells = <1>;
+ };
+
+ ports {
+ #address-cells = <1>;
+ #size-cells = <0>;
+ port@0 {
+ reg = <0>;
+ label = "lan0";
+ phy-handle = <&phy0>;
+ };
+ port@1 {
+ reg = <1>;
+ label = "lan1";
+ phy-handle = <&phy1>;
+ };
+ port@2 {
+ reg = <2>;
+ label = "lan2";
+ phy-handle = <&phy2>;
+ };
+ port@3 {
+ reg = <3>;
+ label = "lan3";
+ phy-handle = <&phy3>;
+ };
+ port@4 {
+ reg = <4>;
+ label = "wan";
+ phy-handle = <&phy4>;
+ };
+ port@5 {
+ reg = <5>;
+ label = "cpu";
+ ethernet = <&gmac0>;
+ phy-mode = "rgmii";
+ fixed-link {
+ speed = <1000>;
+ full-duplex;
+ };
+ };
+ };
+
+ mdio {
+ compatible = "realtek,smi-mdio";
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ phy0: ethernet-phy@0 {
+ reg = <0>;
+ interrupt-parent = <&switch_intc1>;
+ interrupts = <0>;
+ };
+ phy1: ethernet-phy@1 {
+ reg = <1>;
+ interrupt-parent = <&switch_intc1>;
+ interrupts = <1>;
+ };
+ phy2: ethernet-phy@2 {
+ reg = <2>;
+ interrupt-parent = <&switch_intc1>;
+ interrupts = <2>;
+ };
+ phy3: ethernet-phy@3 {
+ reg = <3>;
+ interrupt-parent = <&switch_intc1>;
+ interrupts = <3>;
+ };
+ phy4: ethernet-phy@4 {
+ reg = <4>;
+ interrupt-parent = <&switch_intc1>;
+ interrupts = <12>;
+ };
+ };
+ };
+ };
+
+ - |
+ #include <dt-bindings/gpio/gpio.h>
+ #include <dt-bindings/interrupt-controller/irq.h>
+
+ platform {
+ switch {
+ compatible = "realtek,rtl8365mb";
+ mdc-gpios = <&gpio1 16 GPIO_ACTIVE_HIGH>;
+ mdio-gpios = <&gpio1 17 GPIO_ACTIVE_HIGH>;
+ reset-gpios = <&gpio5 0 GPIO_ACTIVE_LOW>;
+
+ switch_intc2: interrupt-controller {
+ interrupt-parent = <&gpio5>;
+ interrupts = <1 IRQ_TYPE_LEVEL_LOW>;
+ interrupt-controller;
+ #address-cells = <0>;
+ #interrupt-cells = <1>;
+ };
+
+ ports {
+ #address-cells = <1>;
+ #size-cells = <0>;
+ port@0 {
+ reg = <0>;
+ label = "swp0";
+ phy-handle = <&ethphy0>;
+ };
+ port@1 {
+ reg = <1>;
+ label = "swp1";
+ phy-handle = <&ethphy1>;
+ };
+ port@2 {
+ reg = <2>;
+ label = "swp2";
+ phy-handle = <&ethphy2>;
+ };
+ port@3 {
+ reg = <3>;
+ label = "swp3";
+ phy-handle = <&ethphy3>;
+ };
+ port@6 {
+ reg = <6>;
+ label = "cpu";
+ ethernet = <&fec1>;
+ phy-mode = "rgmii";
+ tx-internal-delay-ps = <2000>;
+ rx-internal-delay-ps = <2000>;
+
+ fixed-link {
+ speed = <1000>;
+ full-duplex;
+ pause;
+ };
+ };
+ };
+
+ mdio {
+ compatible = "realtek,smi-mdio";
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ ethphy0: ethernet-phy@0 {
+ reg = <0>;
+ interrupt-parent = <&switch_intc2>;
+ interrupts = <0>;
+ };
+ ethphy1: ethernet-phy@1 {
+ reg = <1>;
+ interrupt-parent = <&switch_intc2>;
+ interrupts = <1>;
+ };
+ ethphy2: ethernet-phy@2 {
+ reg = <2>;
+ interrupt-parent = <&switch_intc2>;
+ interrupts = <2>;
+ };
+ ethphy3: ethernet-phy@3 {
+ reg = <3>;
+ interrupt-parent = <&switch_intc2>;
+ interrupts = <3>;
+ };
+ };
+ };
+ };
+
+ - |
+ #include <dt-bindings/gpio/gpio.h>
+ #include <dt-bindings/interrupt-controller/irq.h>
+
+ mdio {
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ switch@29 {
+ compatible = "realtek,rtl8367s";
+ reg = <29>;
+
+ reset-gpios = <&gpio2 20 GPIO_ACTIVE_LOW>;
+
+ switch_intc3: interrupt-controller {
+ interrupt-parent = <&gpio0>;
+ interrupts = <11 IRQ_TYPE_EDGE_FALLING>;
+ interrupt-controller;
+ #address-cells = <0>;
+ #interrupt-cells = <1>;
+ };
+
+ ports {
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ port@0 {
+ reg = <0>;
+ label = "lan4";
+ };
+
+ port@1 {
+ reg = <1>;
+ label = "lan3";
+ };
+
+ port@2 {
+ reg = <2>;
+ label = "lan2";
+ };
+
+ port@3 {
+ reg = <3>;
+ label = "lan1";
+ };
+
+ port@4 {
+ reg = <4>;
+ label = "wan";
+ };
+
+ port@7 {
+ reg = <7>;
+ ethernet = <&ethernet>;
+ phy-mode = "rgmii";
+ tx-internal-delay-ps = <2000>;
+ rx-internal-delay-ps = <0>;
+
+ fixed-link {
+ speed = <1000>;
+ full-duplex;
+ };
+ };
+ };
+ };
+ };
diff --git a/Documentation/devicetree/bindings/net/fsl-fman.txt b/Documentation/devicetree/bindings/net/fsl-fman.txt
index 020337f3c05f..801efc7d6818 100644
--- a/Documentation/devicetree/bindings/net/fsl-fman.txt
+++ b/Documentation/devicetree/bindings/net/fsl-fman.txt
@@ -388,14 +388,24 @@ PROPERTIES
Value type: <prop-encoded-array>
Definition: A standard property.
-- bus-frequency
+- clocks
+ Usage: optional
+ Value type: <phandle>
+ Definition: A reference to the input clock of the controller
+ from which the MDC frequency is derived.
+
+- clock-frequency
Usage: optional
Value type: <u32>
- Definition: Specifies the external MDIO bus clock speed to
- be used, if different from the standard 2.5 MHz.
- This may be due to the standard speed being unsupported (e.g.
- due to a hardware problem), or to advertise that all relevant
- components in the system support a faster speed.
+ Definition: Specifies the external MDC frequency, in Hertz, to
+ be used. Requires that the input clock is specified in the
+ "clocks" property. See also: mdio.yaml.
+
+- suppress-preamble
+ Usage: optional
+ Value type: <boolean>
+ Definition: Disable generation of preamble bits. See also:
+ mdio.yaml.
- interrupts
Usage: required for external MDIO
diff --git a/Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt b/Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt
index 691f886cfc4a..2bf31572b08d 100644
--- a/Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt
+++ b/Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt
@@ -5,6 +5,7 @@ Required properties:
"marvell,armada-370-neta"
"marvell,armada-xp-neta"
"marvell,armada-3700-neta"
+ "marvell,armada-ac5-neta"
- reg: address and length of the register set for the device.
- interrupts: interrupt for the device
- phy: See ethernet.txt file in the same directory.
diff --git a/Documentation/devicetree/bindings/net/mctp-i2c-controller.yaml b/Documentation/devicetree/bindings/net/mctp-i2c-controller.yaml
new file mode 100644
index 000000000000..afd11c9422fa
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/mctp-i2c-controller.yaml
@@ -0,0 +1,92 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/net/mctp-i2c-controller.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: MCTP I2C transport binding
+
+maintainers:
+ - Matt Johnston <matt@codeconstruct.com.au>
+
+description: |
+ An mctp-i2c-controller defines a local MCTP endpoint on an I2C controller.
+ MCTP I2C is specified by DMTF DSP0237.
+
+ An mctp-i2c-controller must be attached to an I2C adapter which supports
+ slave functionality. I2C busses (either directly or as subordinate mux
+ busses) are attached to the mctp-i2c-controller with a 'mctp-controller'
+ property on each used bus. Each mctp-controller I2C bus will be presented
+ to the host system as a separate MCTP I2C instance.
+
+properties:
+ compatible:
+ const: mctp-i2c-controller
+
+ reg:
+ minimum: 0x40000000
+ maximum: 0x4000007f
+ description: |
+ 7 bit I2C address of the local endpoint.
+ I2C_OWN_SLAVE_ADDRESS (1<<30) flag must be set.
+
+additionalProperties: false
+
+required:
+ - compatible
+ - reg
+
+examples:
+ - |
+ // Basic case of a single I2C bus
+ #include <dt-bindings/i2c/i2c.h>
+
+ i2c {
+ #address-cells = <1>;
+ #size-cells = <0>;
+ mctp-controller;
+
+ mctp@30 {
+ compatible = "mctp-i2c-controller";
+ reg = <(0x30 | I2C_OWN_SLAVE_ADDRESS)>;
+ };
+ };
+
+ - |
+ // Mux topology with multiple MCTP-handling busses under
+ // a single mctp-i2c-controller.
+ // i2c1 and i2c6 can have MCTP devices, i2c5 does not.
+ #include <dt-bindings/i2c/i2c.h>
+
+ i2c1: i2c {
+ #address-cells = <1>;
+ #size-cells = <0>;
+ mctp-controller;
+
+ mctp@50 {
+ compatible = "mctp-i2c-controller";
+ reg = <(0x50 | I2C_OWN_SLAVE_ADDRESS)>;
+ };
+ };
+
+ i2c-mux {
+ #address-cells = <1>;
+ #size-cells = <0>;
+ i2c-parent = <&i2c1>;
+
+ i2c5: i2c@0 {
+ #address-cells = <1>;
+ #size-cells = <0>;
+ reg = <0>;
+ eeprom@33 {
+ reg = <0x33>;
+ };
+ };
+
+ i2c6: i2c@1 {
+ #address-cells = <1>;
+ #size-cells = <0>;
+ reg = <1>;
+ mctp-controller;
+ };
+ };
diff --git a/Documentation/devicetree/bindings/net/mediatek-dwmac.txt b/Documentation/devicetree/bindings/net/mediatek-dwmac.txt
deleted file mode 100644
index afbcaebf062e..000000000000
--- a/Documentation/devicetree/bindings/net/mediatek-dwmac.txt
+++ /dev/null
@@ -1,91 +0,0 @@
-MediaTek DWMAC glue layer controller
-
-This file documents platform glue layer for stmmac.
-Please see stmmac.txt for the other unchanged properties.
-
-The device node has following properties.
-
-Required properties:
-- compatible: Should be "mediatek,mt2712-gmac" for MT2712 SoC
-- reg: Address and length of the register set for the device
-- interrupts: Should contain the MAC interrupts
-- interrupt-names: Should contain a list of interrupt names corresponding to
- the interrupts in the interrupts property, if available.
- Should be "macirq" for the main MAC IRQ
-- clocks: Must contain a phandle for each entry in clock-names.
-- clock-names: The name of the clock listed in the clocks property. These are
- "axi", "apb", "mac_main", "ptp_ref", "rmii_internal" for MT2712 SoC.
-- mac-address: See ethernet.txt in the same directory
-- phy-mode: See ethernet.txt in the same directory
-- mediatek,pericfg: A phandle to the syscon node that control ethernet
- interface and timing delay.
-
-Optional properties:
-- mediatek,tx-delay-ps: TX clock delay macro value. Default is 0.
- It should be defined for RGMII/MII interface.
- It should be defined for RMII interface when the reference clock is from MT2712 SoC.
-- mediatek,rx-delay-ps: RX clock delay macro value. Default is 0.
- It should be defined for RGMII/MII interface.
- It should be defined for RMII interface.
-Both delay properties need to be a multiple of 170 for RGMII interface,
-or will round down. Range 0~31*170.
-Both delay properties need to be a multiple of 550 for MII/RMII interface,
-or will round down. Range 0~31*550.
-
-- mediatek,rmii-rxc: boolean property, if present indicates that the RMII
- reference clock, which is from external PHYs, is connected to RXC pin
- on MT2712 SoC.
- Otherwise, is connected to TXC pin.
-- mediatek,rmii-clk-from-mac: boolean property, if present indicates that
- MT2712 SoC provides the RMII reference clock, which outputs to TXC pin only.
-- mediatek,txc-inverse: boolean property, if present indicates that
- 1. tx clock will be inversed in MII/RGMII case,
- 2. tx clock inside MAC will be inversed relative to reference clock
- which is from external PHYs in RMII case, and it rarely happen.
- 3. the reference clock, which outputs to TXC pin will be inversed in RMII case
- when the reference clock is from MT2712 SoC.
-- mediatek,rxc-inverse: boolean property, if present indicates that
- 1. rx clock will be inversed in MII/RGMII case.
- 2. reference clock will be inversed when arrived at MAC in RMII case, when
- the reference clock is from external PHYs.
- 3. the inside clock, which be sent to MAC, will be inversed in RMII case when
- the reference clock is from MT2712 SoC.
-- assigned-clocks: mac_main and ptp_ref clocks
-- assigned-clock-parents: parent clocks of the assigned clocks
-
-Example:
- eth: ethernet@1101c000 {
- compatible = "mediatek,mt2712-gmac";
- reg = <0 0x1101c000 0 0x1300>;
- interrupts = <GIC_SPI 237 IRQ_TYPE_LEVEL_LOW>;
- interrupt-names = "macirq";
- phy-mode ="rgmii-rxid";
- mac-address = [00 55 7b b5 7d f7];
- clock-names = "axi",
- "apb",
- "mac_main",
- "ptp_ref",
- "rmii_internal";
- clocks = <&pericfg CLK_PERI_GMAC>,
- <&pericfg CLK_PERI_GMAC_PCLK>,
- <&topckgen CLK_TOP_ETHER_125M_SEL>,
- <&topckgen CLK_TOP_ETHER_50M_SEL>,
- <&topckgen CLK_TOP_ETHER_50M_RMII_SEL>;
- assigned-clocks = <&topckgen CLK_TOP_ETHER_125M_SEL>,
- <&topckgen CLK_TOP_ETHER_50M_SEL>,
- <&topckgen CLK_TOP_ETHER_50M_RMII_SEL>;
- assigned-clock-parents = <&topckgen CLK_TOP_ETHERPLL_125M>,
- <&topckgen CLK_TOP_APLL1_D3>,
- <&topckgen CLK_TOP_ETHERPLL_50M>;
- power-domains = <&scpsys MT2712_POWER_DOMAIN_AUDIO>;
- mediatek,pericfg = <&pericfg>;
- mediatek,tx-delay-ps = <1530>;
- mediatek,rx-delay-ps = <1530>;
- mediatek,rmii-rxc;
- mediatek,txc-inverse;
- mediatek,rxc-inverse;
- snps,txpbl = <1>;
- snps,rxpbl = <1>;
- snps,reset-gpio = <&pio 87 GPIO_ACTIVE_LOW>;
- snps,reset-active-low;
- };
diff --git a/Documentation/devicetree/bindings/net/mediatek-dwmac.yaml b/Documentation/devicetree/bindings/net/mediatek-dwmac.yaml
new file mode 100644
index 000000000000..901944683322
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/mediatek-dwmac.yaml
@@ -0,0 +1,175 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/net/mediatek-dwmac.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: MediaTek DWMAC glue layer controller
+
+maintainers:
+ - Biao Huang <biao.huang@mediatek.com>
+
+description:
+ This file documents platform glue layer for stmmac.
+
+# We need a select here so we don't match all nodes with 'snps,dwmac'
+select:
+ properties:
+ compatible:
+ contains:
+ enum:
+ - mediatek,mt2712-gmac
+ - mediatek,mt8195-gmac
+ required:
+ - compatible
+
+allOf:
+ - $ref: "snps,dwmac.yaml#"
+
+properties:
+ compatible:
+ oneOf:
+ - items:
+ - enum:
+ - mediatek,mt2712-gmac
+ - const: snps,dwmac-4.20a
+ - items:
+ - enum:
+ - mediatek,mt8195-gmac
+ - const: snps,dwmac-5.10a
+
+ clocks:
+ minItems: 5
+ items:
+ - description: AXI clock
+ - description: APB clock
+ - description: MAC Main clock
+ - description: PTP clock
+ - description: RMII reference clock provided by MAC
+ - description: MAC clock gate
+
+ clock-names:
+ minItems: 5
+ items:
+ - const: axi
+ - const: apb
+ - const: mac_main
+ - const: ptp_ref
+ - const: rmii_internal
+ - const: mac_cg
+
+ mediatek,pericfg:
+ $ref: /schemas/types.yaml#/definitions/phandle
+ description:
+ The phandle to the syscon node that control ethernet
+ interface and timing delay.
+
+ mediatek,tx-delay-ps:
+ description:
+ The internal TX clock delay (provided by this driver) in nanoseconds.
+ For MT2712 RGMII interface, Allowed value need to be a multiple of 170,
+ or will round down. Range 0~31*170.
+ For MT2712 RMII/MII interface, Allowed value need to be a multiple of 550,
+ or will round down. Range 0~31*550.
+ For MT8195 RGMII/RMII/MII interface, Allowed value need to be a multiple of 290,
+ or will round down. Range 0~31*290.
+
+ mediatek,rx-delay-ps:
+ description:
+ The internal RX clock delay (provided by this driver) in nanoseconds.
+ For MT2712 RGMII interface, Allowed value need to be a multiple of 170,
+ or will round down. Range 0~31*170.
+ For MT2712 RMII/MII interface, Allowed value need to be a multiple of 550,
+ or will round down. Range 0~31*550.
+ For MT8195 RGMII/RMII/MII interface, Allowed value need to be a multiple
+ of 290, or will round down. Range 0~31*290.
+
+ mediatek,rmii-rxc:
+ type: boolean
+ description:
+ If present, indicates that the RMII reference clock, which is from external
+ PHYs, is connected to RXC pin. Otherwise, is connected to TXC pin.
+
+ mediatek,rmii-clk-from-mac:
+ type: boolean
+ description:
+ If present, indicates that MAC provides the RMII reference clock, which
+ outputs to TXC pin only.
+
+ mediatek,txc-inverse:
+ type: boolean
+ description:
+ If present, indicates that
+ 1. tx clock will be inversed in MII/RGMII case,
+ 2. tx clock inside MAC will be inversed relative to reference clock
+ which is from external PHYs in RMII case, and it rarely happen.
+ 3. the reference clock, which outputs to TXC pin will be inversed in RMII case
+ when the reference clock is from MAC.
+
+ mediatek,rxc-inverse:
+ type: boolean
+ description:
+ If present, indicates that
+ 1. rx clock will be inversed in MII/RGMII case.
+ 2. reference clock will be inversed when arrived at MAC in RMII case, when
+ the reference clock is from external PHYs.
+ 3. the inside clock, which be sent to MAC, will be inversed in RMII case when
+ the reference clock is from MAC.
+
+ mediatek,mac-wol:
+ type: boolean
+ description:
+ If present, indicates that MAC supports WOL(Wake-On-LAN), and MAC WOL will be enabled.
+ Otherwise, PHY WOL is perferred.
+
+required:
+ - compatible
+ - reg
+ - interrupts
+ - interrupt-names
+ - clocks
+ - clock-names
+ - phy-mode
+ - mediatek,pericfg
+
+unevaluatedProperties: false
+
+examples:
+ - |
+ #include <dt-bindings/clock/mt2712-clk.h>
+ #include <dt-bindings/gpio/gpio.h>
+ #include <dt-bindings/interrupt-controller/arm-gic.h>
+ #include <dt-bindings/interrupt-controller/irq.h>
+ #include <dt-bindings/power/mt2712-power.h>
+
+ eth: ethernet@1101c000 {
+ compatible = "mediatek,mt2712-gmac", "snps,dwmac-4.20a";
+ reg = <0x1101c000 0x1300>;
+ interrupts = <GIC_SPI 237 IRQ_TYPE_LEVEL_LOW>;
+ interrupt-names = "macirq";
+ phy-mode ="rgmii-rxid";
+ mac-address = [00 55 7b b5 7d f7];
+ clock-names = "axi",
+ "apb",
+ "mac_main",
+ "ptp_ref",
+ "rmii_internal";
+ clocks = <&pericfg CLK_PERI_GMAC>,
+ <&pericfg CLK_PERI_GMAC_PCLK>,
+ <&topckgen CLK_TOP_ETHER_125M_SEL>,
+ <&topckgen CLK_TOP_ETHER_50M_SEL>,
+ <&topckgen CLK_TOP_ETHER_50M_RMII_SEL>;
+ assigned-clocks = <&topckgen CLK_TOP_ETHER_125M_SEL>,
+ <&topckgen CLK_TOP_ETHER_50M_SEL>,
+ <&topckgen CLK_TOP_ETHER_50M_RMII_SEL>;
+ assigned-clock-parents = <&topckgen CLK_TOP_ETHERPLL_125M>,
+ <&topckgen CLK_TOP_APLL1_D3>,
+ <&topckgen CLK_TOP_ETHERPLL_50M>;
+ power-domains = <&scpsys MT2712_POWER_DOMAIN_AUDIO>;
+ mediatek,pericfg = <&pericfg>;
+ mediatek,tx-delay-ps = <1530>;
+ snps,txpbl = <1>;
+ snps,rxpbl = <1>;
+ snps,reset-gpio = <&pio 87 GPIO_ACTIVE_LOW>;
+ snps,reset-delays-us = <0 10000 10000>;
+ };
diff --git a/Documentation/devicetree/bindings/net/micrel.txt b/Documentation/devicetree/bindings/net/micrel.txt
index 8d157f0295a5..c5ab62c39133 100644
--- a/Documentation/devicetree/bindings/net/micrel.txt
+++ b/Documentation/devicetree/bindings/net/micrel.txt
@@ -45,3 +45,20 @@ Optional properties:
In fiber mode, auto-negotiation is disabled and the PHY can only work in
100base-fx (full and half duplex) modes.
+
+ - lan8814,ignore-ts: If present the PHY will not support timestamping.
+
+ This option acts as check whether Timestamping is supported by
+ hardware or not. LAN8814 phy support hardware tmestamping.
+
+ - lan8814,latency_rx_10: Configures Latency value of phy in ingress at 10 Mbps.
+
+ - lan8814,latency_tx_10: Configures Latency value of phy in egress at 10 Mbps.
+
+ - lan8814,latency_rx_100: Configures Latency value of phy in ingress at 100 Mbps.
+
+ - lan8814,latency_tx_100: Configures Latency value of phy in egress at 100 Mbps.
+
+ - lan8814,latency_rx_1000: Configures Latency value of phy in ingress at 1000 Mbps.
+
+ - lan8814,latency_tx_1000: Configures Latency value of phy in egress at 1000 Mbps.
diff --git a/Documentation/devicetree/bindings/net/microchip,lan966x-switch.yaml b/Documentation/devicetree/bindings/net/microchip,lan966x-switch.yaml
index e79e4e166ad8..13812768b923 100644
--- a/Documentation/devicetree/bindings/net/microchip,lan966x-switch.yaml
+++ b/Documentation/devicetree/bindings/net/microchip,lan966x-switch.yaml
@@ -38,6 +38,7 @@ properties:
- description: register based extraction
- description: frame dma based extraction
- description: analyzer interrupt
+ - description: ptp interrupt
interrupt-names:
minItems: 1
@@ -45,6 +46,7 @@ properties:
- const: xtr
- const: fdma
- const: ana
+ - const: ptp
resets:
items:
diff --git a/Documentation/devicetree/bindings/net/microchip,sparx5-switch.yaml b/Documentation/devicetree/bindings/net/microchip,sparx5-switch.yaml
index 347b912a46bb..6c86d3d85e99 100644
--- a/Documentation/devicetree/bindings/net/microchip,sparx5-switch.yaml
+++ b/Documentation/devicetree/bindings/net/microchip,sparx5-switch.yaml
@@ -53,12 +53,14 @@ properties:
items:
- description: register based extraction
- description: frame dma based extraction
+ - description: ptp interrupt
interrupt-names:
minItems: 1
items:
- const: xtr
- const: fdma
+ - const: ptp
resets:
items:
diff --git a/Documentation/devicetree/bindings/net/mscc-miim.txt b/Documentation/devicetree/bindings/net/mscc-miim.txt
index 7104679cf59d..70e0cb1ee485 100644
--- a/Documentation/devicetree/bindings/net/mscc-miim.txt
+++ b/Documentation/devicetree/bindings/net/mscc-miim.txt
@@ -2,7 +2,7 @@ Microsemi MII Management Controller (MIIM) / MDIO
=================================================
Properties:
-- compatible: must be "mscc,ocelot-miim"
+- compatible: must be "mscc,ocelot-miim" or "microchip,lan966x-miim"
- reg: The base address of the MDIO bus controller register bank. Optionally, a
second register bank can be defined if there is an associated reset register
for internal PHYs
diff --git a/Documentation/devicetree/bindings/net/renesas,etheravb.yaml b/Documentation/devicetree/bindings/net/renesas,etheravb.yaml
index bda821065a2b..ee2ccacc39ff 100644
--- a/Documentation/devicetree/bindings/net/renesas,etheravb.yaml
+++ b/Documentation/devicetree/bindings/net/renesas,etheravb.yaml
@@ -45,8 +45,10 @@ properties:
- items:
- enum:
+ - renesas,r9a07g043-gbeth # RZ/G2UL
- renesas,r9a07g044-gbeth # RZ/G2{L,LC}
- - const: renesas,rzg2l-gbeth # RZ/G2L
+ - renesas,r9a07g054-gbeth # RZ/V2L
+ - const: renesas,rzg2l-gbeth # RZ/{G2L,G2UL,V2L} family
reg: true
diff --git a/Documentation/devicetree/bindings/net/wireless/mediatek,mt76.yaml b/Documentation/devicetree/bindings/net/wireless/mediatek,mt76.yaml
index 269cd63fb544..3216c999f0e1 100644
--- a/Documentation/devicetree/bindings/net/wireless/mediatek,mt76.yaml
+++ b/Documentation/devicetree/bindings/net/wireless/mediatek,mt76.yaml
@@ -18,7 +18,7 @@ description: |
wireless device. The node is expected to be specified as a child
node of the PCI controller to which the wireless chip is connected.
Alternatively, it can specify the wireless part of the MT7628/MT7688
- or MT7622 SoC.
+ or MT7622/MT7986 SoC.
allOf:
- $ref: ieee80211.yaml#
@@ -29,9 +29,13 @@ properties:
- mediatek,mt76
- mediatek,mt7628-wmac
- mediatek,mt7622-wmac
+ - mediatek,mt7986-wmac
reg:
- maxItems: 1
+ minItems: 1
+ maxItems: 3
+ description:
+ MT7986 should contain 3 regions consys, dcm, and sku, in this order.
interrupts:
maxItems: 1
@@ -39,6 +43,17 @@ properties:
power-domains:
maxItems: 1
+ memory-region:
+ maxItems: 1
+
+ resets:
+ maxItems: 1
+ description:
+ Specify the consys reset for mt7986.
+
+ reset-name:
+ const: consys
+
mediatek,infracfg:
$ref: /schemas/types.yaml#/definitions/phandle
description:
@@ -69,6 +84,15 @@ properties:
calibration data is generic and specific calibration data should be
pulled from the OTP ROM
+ mediatek,disable-radar-background:
+ type: boolean
+ description:
+ Disable/enable radar/CAC detection running on a dedicated offchannel
+ chain available on some hw.
+ Background radar/CAC detection allows to avoid the CAC downtime
+ switching on a different channel during CAC detection on the selected
+ radar channel.
+
led:
type: object
$ref: /schemas/leds/common.yaml#
@@ -165,7 +189,7 @@ required:
- compatible
- reg
-additionalProperties: false
+unevaluatedProperties: false
examples:
- |
@@ -231,3 +255,15 @@ examples:
power-domains = <&scpsys 3>;
};
+
+ - |
+ wifi@18000000 {
+ compatible = "mediatek,mt7986-wmac";
+ resets = <&watchdog 23>;
+ reset-names = "consys";
+ reg = <0x18000000 0x1000000>,
+ <0x10003000 0x1000>,
+ <0x11d10000 0x1000>;
+ interrupts = <GIC_SPI 213 IRQ_TYPE_LEVEL_HIGH>;
+ memory-region = <&wmcpu_emi>;
+ };
diff --git a/Documentation/devicetree/bindings/phy/fsl,lynx-28g.yaml b/Documentation/devicetree/bindings/phy/fsl,lynx-28g.yaml
new file mode 100644
index 000000000000..4d91e2f4f247
--- /dev/null
+++ b/Documentation/devicetree/bindings/phy/fsl,lynx-28g.yaml
@@ -0,0 +1,40 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/phy/fsl,lynx-28g.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Freescale Lynx 28G SerDes PHY binding
+
+maintainers:
+ - Ioana Ciornei <ioana.ciornei@nxp.com>
+
+properties:
+ compatible:
+ enum:
+ - fsl,lynx-28g
+
+ reg:
+ maxItems: 1
+
+ "#phy-cells":
+ const: 1
+
+required:
+ - compatible
+ - reg
+ - "#phy-cells"
+
+additionalProperties: false
+
+examples:
+ - |
+ soc {
+ #address-cells = <2>;
+ #size-cells = <2>;
+ serdes_1: phy@1ea0000 {
+ compatible = "fsl,lynx-28g";
+ reg = <0x0 0x1ea0000 0x0 0x1e30>;
+ #phy-cells = <1>;
+ };
+ };
diff --git a/Documentation/devicetree/bindings/phy/transmit-amplitude.yaml b/Documentation/devicetree/bindings/phy/transmit-amplitude.yaml
new file mode 100644
index 000000000000..51492fe738ec
--- /dev/null
+++ b/Documentation/devicetree/bindings/phy/transmit-amplitude.yaml
@@ -0,0 +1,103 @@
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/phy/transmit-amplitude.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Common PHY and network PCS transmit amplitude property binding
+
+description:
+ Binding describing the peak-to-peak transmit amplitude for common PHYs
+ and network PCSes.
+
+maintainers:
+ - Marek BehĂșn <kabel@kernel.org>
+
+properties:
+ tx-p2p-microvolt:
+ description:
+ Transmit amplitude voltages in microvolts, peak-to-peak. If this property
+ contains multiple values for various PHY modes, the
+ 'tx-p2p-microvolt-names' property must be provided and contain
+ corresponding mode names.
+
+ tx-p2p-microvolt-names:
+ description: |
+ Names of the modes corresponding to voltages in the 'tx-p2p-microvolt'
+ property. Required only if multiple voltages are provided.
+
+ If a value of 'default' is provided, the system should use it for any PHY
+ mode that is otherwise not defined here. If 'default' is not provided, the
+ system should use manufacturer default value.
+ minItems: 1
+ maxItems: 16
+ items:
+ enum:
+ - default
+
+ # ethernet modes
+ - sgmii
+ - qsgmii
+ - xgmii
+ - 1000base-x
+ - 2500base-x
+ - 5gbase-r
+ - rxaui
+ - xaui
+ - 10gbase-kr
+ - usxgmii
+ - 10gbase-r
+ - 25gbase-r
+
+ # PCIe modes
+ - pcie
+ - pcie1
+ - pcie2
+ - pcie3
+ - pcie4
+ - pcie5
+ - pcie6
+
+ # USB modes
+ - usb
+ - usb-ls
+ - usb-fs
+ - usb-hs
+ - usb-ss
+ - usb-ss+
+ - usb-4
+
+ # storage modes
+ - sata
+ - ufs-hs
+ - ufs-hs-a
+ - ufs-hs-b
+
+ # display modes
+ - lvds
+ - dp
+ - dp-rbr
+ - dp-hbr
+ - dp-hbr2
+ - dp-hbr3
+ - dp-uhbr-10
+ - dp-uhbr-13.5
+ - dp-uhbr-20
+
+ # camera modes
+ - mipi-dphy
+ - mipi-dphy-univ
+ - mipi-dphy-v2.5-univ
+
+dependencies:
+ tx-p2p-microvolt-names: [ tx-p2p-microvolt ]
+
+additionalProperties: true
+
+examples:
+ - |
+ phy: phy {
+ #phy-cells = <1>;
+ tx-p2p-microvolt = <915000>, <1100000>, <1200000>;
+ tx-p2p-microvolt-names = "2500base-x", "usb-hs", "usb-ss";
+ };
diff --git a/Documentation/networking/bonding.rst b/Documentation/networking/bonding.rst
index ab98373535ea..525e6842dd33 100644
--- a/Documentation/networking/bonding.rst
+++ b/Documentation/networking/bonding.rst
@@ -313,6 +313,17 @@ arp_ip_target
maximum number of targets that can be specified is 16. The
default value is no IP addresses.
+ns_ip6_target
+
+ Specifies the IPv6 addresses to use as IPv6 monitoring peers when
+ arp_interval is > 0. These are the targets of the NS request
+ sent to determine the health of the link to the targets.
+ Specify these values in ffff:ffff::ffff:ffff format. Multiple IPv6
+ addresses must be separated by a comma. At least one IPv6
+ address must be given for NS/NA monitoring to function. The
+ maximum number of targets that can be specified is 16. The
+ default value is no IPv6 addresses.
+
arp_validate
Specifies whether or not ARP probes and replies should be
diff --git a/Documentation/networking/devlink/index.rst b/Documentation/networking/devlink/index.rst
index 443123772f44..c17cdb079611 100644
--- a/Documentation/networking/devlink/index.rst
+++ b/Documentation/networking/devlink/index.rst
@@ -4,6 +4,22 @@ Linux Devlink Documentation
devlink is an API to expose device information and resources not directly
related to any device class, such as chip-wide/switch-ASIC-wide configuration.
+Locking
+-------
+
+Driver facing APIs are currently transitioning to allow more explicit
+locking. Drivers can use the existing ``devlink_*`` set of APIs, or
+new APIs prefixed by ``devl_*``. The older APIs handle all the locking
+in devlink core, but don't allow registration of most sub-objects once
+the main devlink object is itself registered. The newer ``devl_*`` APIs assume
+the devlink instance lock is already held. Drivers can take the instance
+lock by calling ``devl_lock()``. It is also held in most of the callbacks.
+Eventually all callbacks will be invoked under the devlink instance lock,
+refer to the use of the ``DEVLINK_NL_FLAG_NO_LOCK`` flag in devlink core
+to find out which callbacks are not converted, yet.
+
+Drivers are encouraged to use the devlink instance lock for their own needs.
+
Interface documentation
-----------------------
diff --git a/Documentation/networking/dsa/sja1105.rst b/Documentation/networking/dsa/sja1105.rst
index 29b1bae0cf00..e0219c1452ab 100644
--- a/Documentation/networking/dsa/sja1105.rst
+++ b/Documentation/networking/dsa/sja1105.rst
@@ -293,6 +293,33 @@ of dropped frames, which is a sum of frames dropped due to timing violations,
lack of destination ports and MTU enforcement checks). Byte-level counters are
not available.
+Limitations
+===========
+
+The SJA1105 switch family always performs VLAN processing. When configured as
+VLAN-unaware, frames carry a different VLAN tag internally, depending on
+whether the port is standalone or under a VLAN-unaware bridge.
+
+The virtual link keys are always fixed at {MAC DA, VLAN ID, VLAN PCP}, but the
+driver asks for the VLAN ID and VLAN PCP when the port is under a VLAN-aware
+bridge. Otherwise, it fills in the VLAN ID and PCP automatically, based on
+whether the port is standalone or in a VLAN-unaware bridge, and accepts only
+"VLAN-unaware" tc-flower keys (MAC DA).
+
+The existing tc-flower keys that are offloaded using virtual links are no
+longer operational after one of the following happens:
+
+- port was standalone and joins a bridge (VLAN-aware or VLAN-unaware)
+- port is part of a bridge whose VLAN awareness state changes
+- port was part of a bridge and becomes standalone
+- port was standalone, but another port joins a VLAN-aware bridge and this
+ changes the global VLAN awareness state of the bridge
+
+The driver cannot veto all these operations, and it cannot update/remove the
+existing tc-flower filters either. So for proper operation, the tc-flower
+filters should be installed only after the forwarding configuration of the port
+has been made, and removed by user space before making any changes to it.
+
Device Tree bindings and board design
=====================================
diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst
index 9d98e0511249..24d9be69065d 100644
--- a/Documentation/networking/ethtool-netlink.rst
+++ b/Documentation/networking/ethtool-netlink.rst
@@ -860,8 +860,17 @@ Kernel response contents:
``ETHTOOL_A_RINGS_RX_JUMBO`` u32 size of RX jumbo ring
``ETHTOOL_A_RINGS_TX`` u32 size of TX ring
``ETHTOOL_A_RINGS_RX_BUF_LEN`` u32 size of buffers on the ring
+ ``ETHTOOL_A_RINGS_TCP_DATA_SPLIT`` u8 TCP header / data split
+ ``ETHTOOL_A_RINGS_CQE_SIZE`` u32 Size of TX/RX CQE
==================================== ====== ===========================
+``ETHTOOL_A_RINGS_TCP_DATA_SPLIT`` indicates whether the device is usable with
+page-flipping TCP zero-copy receive (``getsockopt(TCP_ZEROCOPY_RECEIVE)``).
+If enabled the device is configured to place frame headers and data into
+separate buffers. The device configuration must make it possible to receive
+full memory pages of data, for example because MTU is high enough or through
+HW-GRO.
+
RINGS_SET
=========
@@ -877,6 +886,7 @@ Request contents:
``ETHTOOL_A_RINGS_RX_JUMBO`` u32 size of RX jumbo ring
``ETHTOOL_A_RINGS_TX`` u32 size of TX ring
``ETHTOOL_A_RINGS_RX_BUF_LEN`` u32 size of buffers on the ring
+ ``ETHTOOL_A_RINGS_CQE_SIZE`` u32 Size of TX/RX CQE
==================================== ====== ===========================
Kernel checks that requested ring sizes do not exceed limits reported by
@@ -884,6 +894,15 @@ driver. Driver may impose additional constraints and may not suspport all
attributes.
+``ETHTOOL_A_RINGS_CQE_SIZE`` specifies the completion queue event size.
+Completion queue events(CQE) are the events posted by NIC to indicate the
+completion status of a packet when the packet is sent(like send success or
+error) or received(like pointers to packet fragments). The CQE size parameter
+enables to modify the CQE size other than default size if NIC supports it.
+A bigger CQE can have more receive buffer pointers inturn NIC can transfer
+a bigger frame from wire. Based on the NIC hardware, the overall completion
+queue size can be adjusted in the driver if CQE size is modified.
+
CHANNELS_GET
============
diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index 58bc8cd367c6..ce017136ab05 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -96,6 +96,7 @@ Contents:
sctp
secid
seg6-sysctl
+ smc-sysctl
statistics
strparser
switchdev
diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
index 2572eecc3e86..b0024aa7b051 100644
--- a/Documentation/networking/ip-sysctl.rst
+++ b/Documentation/networking/ip-sysctl.rst
@@ -878,6 +878,29 @@ tcp_min_tso_segs - INTEGER
Default: 2
+tcp_tso_rtt_log - INTEGER
+ Adjustment of TSO packet sizes based on min_rtt
+
+ Starting from linux-5.18, TCP autosizing can be tweaked
+ for flows having small RTT.
+
+ Old autosizing was splitting the pacing budget to send 1024 TSO
+ per second.
+
+ tso_packet_size = sk->sk_pacing_rate / 1024;
+
+ With the new mechanism, we increase this TSO sizing using:
+
+ distance = min_rtt_usec / (2^tcp_tso_rtt_log)
+ tso_packet_size += gso_max_size >> distance;
+
+ This means that flows between very close hosts can use bigger
+ TSO packets, reducing their cpu costs.
+
+ If you want to use the old autosizing, set this sysctl to 0.
+
+ Default: 9 (2^9 = 512 usec)
+
tcp_pacing_ss_ratio - INTEGER
sk->sk_pacing_rate is set by TCP stack using a ratio applied
to current rate. (current_rate = cwnd * mss / srtt)
diff --git a/Documentation/networking/mctp.rst b/Documentation/networking/mctp.rst
index 46f74bffce0f..c628cb5406d2 100644
--- a/Documentation/networking/mctp.rst
+++ b/Documentation/networking/mctp.rst
@@ -212,6 +212,54 @@ remote address is already known, or the message does not require a reply.
Like the send calls, sockets will only receive responses to requests they have
sent (TO=1) and may only respond (TO=0) to requests they have received.
+``ioctl(SIOCMCTPALLOCTAG)`` and ``ioctl(SIOCMCTPDROPTAG)``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+These tags give applications more control over MCTP message tags, by allocating
+(and dropping) tag values explicitly, rather than the kernel automatically
+allocating a per-message tag at ``sendmsg()`` time.
+
+In general, you will only need to use these ioctls if your MCTP protocol does
+not fit the usual request/response model. For example, if you need to persist
+tags across multiple requests, or a request may generate more than one response.
+In these cases, the ioctls allow you to decouple the tag allocation (and
+release) from individual message send and receive operations.
+
+Both ioctls are passed a pointer to a ``struct mctp_ioc_tag_ctl``:
+
+.. code-block:: C
+
+ struct mctp_ioc_tag_ctl {
+ mctp_eid_t peer_addr;
+ __u8 tag;
+ __u16 flags;
+ };
+
+``SIOCMCTPALLOCTAG`` allocates a tag for a specific peer, which an application
+can use in future ``sendmsg()`` calls. The application populates the
+``peer_addr`` member with the remote EID. Other fields must be zero.
+
+On return, the ``tag`` member will be populated with the allocated tag value.
+The allocated tag will have the following tag bits set:
+
+ - ``MCTP_TAG_OWNER``: it only makes sense to allocate tags if you're the tag
+ owner
+
+ - ``MCTP_TAG_PREALLOC``: to indicate to ``sendmsg()`` that this is a
+ preallocated tag.
+
+ - ... and the actual tag value, within the least-significant three bits
+ (``MCTP_TAG_MASK``). Note that zero is a valid tag value.
+
+The tag value should be used as-is for the ``smctp_tag`` member of ``struct
+sockaddr_mctp``.
+
+``SIOCMCTPDROPTAG`` releases a tag that has been previously allocated by a
+``SIOCMCTPALLOCTAG`` ioctl. The ``peer_addr`` must be the same as used for the
+allocation, and the ``tag`` value must match exactly the tag returned from the
+allocation (including the ``MCTP_TAG_OWNER`` and ``MCTP_TAG_PREALLOC`` bits).
+The ``flags`` field must be zero.
+
Kernel internals
================
diff --git a/Documentation/networking/page_pool.rst b/Documentation/networking/page_pool.rst
index a147591ce203..5db8c263b0c6 100644
--- a/Documentation/networking/page_pool.rst
+++ b/Documentation/networking/page_pool.rst
@@ -105,6 +105,47 @@ a page will cause no race conditions is enough.
Please note the caller must not use data area after running
page_pool_put_page_bulk(), as this function overwrites it.
+* page_pool_get_stats(): Retrieve statistics about the page_pool. This API
+ is only available if the kernel has been configured with
+ ``CONFIG_PAGE_POOL_STATS=y``. A pointer to a caller allocated ``struct
+ page_pool_stats`` structure is passed to this API which is filled in. The
+ caller can then report those stats to the user (perhaps via ethtool,
+ debugfs, etc.). See below for an example usage of this API.
+
+Stats API and structures
+------------------------
+If the kernel is configured with ``CONFIG_PAGE_POOL_STATS=y``, the API
+``page_pool_get_stats()`` and structures described below are available. It
+takes a pointer to a ``struct page_pool`` and a pointer to a ``struct
+page_pool_stats`` allocated by the caller.
+
+The API will fill in the provided ``struct page_pool_stats`` with
+statistics about the page_pool.
+
+The stats structure has the following fields::
+
+ struct page_pool_stats {
+ struct page_pool_alloc_stats alloc_stats;
+ struct page_pool_recycle_stats recycle_stats;
+ };
+
+
+The ``struct page_pool_alloc_stats`` has the following fields:
+ * ``fast``: successful fast path allocations
+ * ``slow``: slow path order-0 allocations
+ * ``slow_high_order``: slow path high order allocations
+ * ``empty``: ptr ring is empty, so a slow path allocation was forced.
+ * ``refill``: an allocation which triggered a refill of the cache
+ * ``waive``: pages obtained from the ptr ring that cannot be added to
+ the cache due to a NUMA mismatch.
+
+The ``struct page_pool_recycle_stats`` has the following fields:
+ * ``cached``: recycling placed page in the page pool cache
+ * ``cache_full``: page pool cache was full
+ * ``ring``: page placed into the ptr ring
+ * ``ring_full``: page released from page pool because the ptr ring was full
+ * ``released_refcnt``: page released (and not recycled) because refcnt > 1
+
Coding examples
===============
@@ -157,6 +198,21 @@ NAPI poller
}
}
+Stats
+-----
+
+.. code-block:: c
+
+ #ifdef CONFIG_PAGE_POOL_STATS
+ /* retrieve stats */
+ struct page_pool_stats stats = { 0 };
+ if (page_pool_get_stats(page_pool, &stats)) {
+ /* perhaps the driver reports statistics with ethool */
+ ethtool_print_allocation_stats(&stats.alloc_stats);
+ ethtool_print_recycle_stats(&stats.recycle_stats);
+ }
+ #endif
+
Driver unload
-------------
diff --git a/Documentation/networking/smc-sysctl.rst b/Documentation/networking/smc-sysctl.rst
new file mode 100644
index 000000000000..0987fd1bc220
--- /dev/null
+++ b/Documentation/networking/smc-sysctl.rst
@@ -0,0 +1,23 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==========
+SMC Sysctl
+==========
+
+/proc/sys/net/smc/* Variables
+=============================
+
+autocorking_size - INTEGER
+ Setting SMC auto corking size:
+ SMC auto corking is like TCP auto corking from the application's
+ perspective of view. When applications do consecutive small
+ write()/sendmsg() system calls, we try to coalesce these small writes
+ as much as possible, to lower total amount of CDC and RDMA Write been
+ sent.
+ autocorking_size limits the maximum corked bytes that can be sent to
+ the under device in 1 single sending. If set to 0, the SMC auto corking
+ is disabled.
+ Applications can still use TCP_CORK for optimal behavior when they
+ know how/when to uncork their sockets.
+
+ Default: 64K
diff --git a/Documentation/networking/timestamping.rst b/Documentation/networking/timestamping.rst
index f5809206eb93..be4eb1242057 100644
--- a/Documentation/networking/timestamping.rst
+++ b/Documentation/networking/timestamping.rst
@@ -668,7 +668,7 @@ timestamping:
(through another RX timestamping FIFO). Deferral on RX is typically
necessary when retrieving the timestamp needs a sleepable context. In
that case, it is the responsibility of the DSA driver to call
- ``netif_rx_ni()`` on the freshly timestamped skb.
+ ``netif_rx()`` on the freshly timestamped skb.
3.2.2 Ethernet PHYs
^^^^^^^^^^^^^^^^^^^
diff --git a/Documentation/trace/fprobe.rst b/Documentation/trace/fprobe.rst
new file mode 100644
index 000000000000..b64bec1ce144
--- /dev/null
+++ b/Documentation/trace/fprobe.rst
@@ -0,0 +1,174 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==================================
+Fprobe - Function entry/exit probe
+==================================
+
+.. Author: Masami Hiramatsu <mhiramat@kernel.org>
+
+Introduction
+============
+
+Fprobe is a function entry/exit probe mechanism based on ftrace.
+Instead of using ftrace full feature, if you only want to attach callbacks
+on function entry and exit, similar to the kprobes and kretprobes, you can
+use fprobe. Compared with kprobes and kretprobes, fprobe gives faster
+instrumentation for multiple functions with single handler. This document
+describes how to use fprobe.
+
+The usage of fprobe
+===================
+
+The fprobe is a wrapper of ftrace (+ kretprobe-like return callback) to
+attach callbacks to multiple function entry and exit. User needs to set up
+the `struct fprobe` and pass it to `register_fprobe()`.
+
+Typically, `fprobe` data structure is initialized with the `entry_handler`
+and/or `exit_handler` as below.
+
+.. code-block:: c
+
+ struct fprobe fp = {
+ .entry_handler = my_entry_callback,
+ .exit_handler = my_exit_callback,
+ };
+
+To enable the fprobe, call one of register_fprobe(), register_fprobe_ips(), and
+register_fprobe_syms(). These functions register the fprobe with different types
+of parameters.
+
+The register_fprobe() enables a fprobe by function-name filters.
+E.g. this enables @fp on "func*()" function except "func2()".::
+
+ register_fprobe(&fp, "func*", "func2");
+
+The register_fprobe_ips() enables a fprobe by ftrace-location addresses.
+E.g.
+
+.. code-block:: c
+
+ unsigned long ips[] = { 0x.... };
+
+ register_fprobe_ips(&fp, ips, ARRAY_SIZE(ips));
+
+And the register_fprobe_syms() enables a fprobe by symbol names.
+E.g.
+
+.. code-block:: c
+
+ char syms[] = {"func1", "func2", "func3"};
+
+ register_fprobe_syms(&fp, syms, ARRAY_SIZE(syms));
+
+To disable (remove from functions) this fprobe, call::
+
+ unregister_fprobe(&fp);
+
+You can temporally (soft) disable the fprobe by::
+
+ disable_fprobe(&fp);
+
+and resume by::
+
+ enable_fprobe(&fp);
+
+The above is defined by including the header::
+
+ #include <linux/fprobe.h>
+
+Same as ftrace, the registered callbacks will start being called some time
+after the register_fprobe() is called and before it returns. See
+:file:`Documentation/trace/ftrace.rst`.
+
+Also, the unregister_fprobe() will guarantee that the both enter and exit
+handlers are no longer being called by functions after unregister_fprobe()
+returns as same as unregister_ftrace_function().
+
+The fprobe entry/exit handler
+=============================
+
+The prototype of the entry/exit callback function is as follows:
+
+.. code-block:: c
+
+ void callback_func(struct fprobe *fp, unsigned long entry_ip, struct pt_regs *regs);
+
+Note that both entry and exit callbacks have same ptototype. The @entry_ip is
+saved at function entry and passed to exit handler.
+
+@fp
+ This is the address of `fprobe` data structure related to this handler.
+ You can embed the `fprobe` to your data structure and get it by
+ container_of() macro from @fp. The @fp must not be NULL.
+
+@entry_ip
+ This is the ftrace address of the traced function (both entry and exit).
+ Note that this may not be the actual entry address of the function but
+ the address where the ftrace is instrumented.
+
+@regs
+ This is the `pt_regs` data structure at the entry and exit. Note that
+ the instruction pointer of @regs may be different from the @entry_ip
+ in the entry_handler. If you need traced instruction pointer, you need
+ to use @entry_ip. On the other hand, in the exit_handler, the instruction
+ pointer of @regs is set to the currect return address.
+
+Share the callbacks with kprobes
+================================
+
+Since the recursion safeness of the fprobe (and ftrace) is a bit different
+from the kprobes, this may cause an issue if user wants to run the same
+code from the fprobe and the kprobes.
+
+Kprobes has per-cpu 'current_kprobe' variable which protects the kprobe
+handler from recursion in all cases. On the other hand, fprobe uses
+only ftrace_test_recursion_trylock(). This allows interrupt context to
+call another (or same) fprobe while the fprobe user handler is running.
+
+This is not a matter if the common callback code has its own recursion
+detection, or it can handle the recursion in the different contexts
+(normal/interrupt/NMI.)
+But if it relies on the 'current_kprobe' recursion lock, it has to check
+kprobe_running() and use kprobe_busy_*() APIs.
+
+Fprobe has FPROBE_FL_KPROBE_SHARED flag to do this. If your common callback
+code will be shared with kprobes, please set FPROBE_FL_KPROBE_SHARED
+*before* registering the fprobe, like:
+
+.. code-block:: c
+
+ fprobe.flags = FPROBE_FL_KPROBE_SHARED;
+
+ register_fprobe(&fprobe, "func*", NULL);
+
+This will protect your common callback from the nested call.
+
+The missed counter
+==================
+
+The `fprobe` data structure has `fprobe::nmissed` counter field as same as
+kprobes.
+This counter counts up when;
+
+ - fprobe fails to take ftrace_recursion lock. This usually means that a function
+ which is traced by other ftrace users is called from the entry_handler.
+
+ - fprobe fails to setup the function exit because of the shortage of rethook
+ (the shadow stack for hooking the function return.)
+
+The `fprobe::nmissed` field counts up in both cases. Therefore, the former
+skips both of entry and exit callback and the latter skips the exit
+callback, but in both case the counter will increase by 1.
+
+Note that if you set the FTRACE_OPS_FL_RECURSION and/or FTRACE_OPS_FL_RCU to
+`fprobe::ops::flags` (ftrace_ops::flags) when registering the fprobe, this
+counter may not work correctly, because ftrace skips the fprobe function which
+increase the counter.
+
+
+Functions and structures
+========================
+
+.. kernel-doc:: include/linux/fprobe.h
+.. kernel-doc:: kernel/trace/fprobe.c
+
diff --git a/Documentation/trace/index.rst b/Documentation/trace/index.rst
index 3a47aa8341c6..f9b7bcb5a630 100644
--- a/Documentation/trace/index.rst
+++ b/Documentation/trace/index.rst
@@ -9,6 +9,7 @@ Linux Tracing Technologies
tracepoint-analysis
ftrace
ftrace-uses
+ fprobe
kprobes
kprobetrace
uprobetracer