diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2023-04-26 16:07:23 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2023-04-26 16:07:23 -0700 |
commit | 6e98b09da931a00bf4e0477d0fa52748bf28fcce (patch) | |
tree | 9c658ed95add5693f42f29f63df80a2ede3f6ec2 /Documentation | |
parent | b68ee1c6131c540a62ecd443be89c406401df091 (diff) | |
parent | 9b78d919632b7149d311aaad5a977e4b48b10321 (diff) | |
download | linux-6e98b09da931a00bf4e0477d0fa52748bf28fcce.tar.gz linux-6e98b09da931a00bf4e0477d0fa52748bf28fcce.tar.bz2 linux-6e98b09da931a00bf4e0477d0fa52748bf28fcce.zip |
Merge tag 'net-next-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Paolo Abeni:
"Core:
- Introduce a config option to tweak MAX_SKB_FRAGS. Increasing the
default value allows for better BIG TCP performances
- Reduce compound page head access for zero-copy data transfers
- RPS/RFS improvements, avoiding unneeded NET_RX_SOFTIRQ when
possible
- Threaded NAPI improvements, adding defer skb free support and
unneeded softirq avoidance
- Address dst_entry reference count scalability issues, via false
sharing avoidance and optimize refcount tracking
- Add lockless accesses annotation to sk_err[_soft]
- Optimize again the skb struct layout
- Extends the skb drop reasons to make it usable by multiple
subsystems
- Better const qualifier awareness for socket casts
BPF:
- Add skb and XDP typed dynptrs which allow BPF programs for more
ergonomic and less brittle iteration through data and
variable-sized accesses
- Add a new BPF netfilter program type and minimal support to hook
BPF programs to netfilter hooks such as prerouting or forward
- Add more precise memory usage reporting for all BPF map types
- Adds support for using {FOU,GUE} encap with an ipip device
operating in collect_md mode and add a set of BPF kfuncs for
controlling encap params
- Allow BPF programs to detect at load time whether a particular
kfunc exists or not, and also add support for this in light
skeleton
- Bigger batch of BPF verifier improvements to prepare for upcoming
BPF open-coded iterators allowing for less restrictive looping
capabilities
- Rework RCU enforcement in the verifier, add kptr_rcu and enforce
BPF programs to NULL-check before passing such pointers into kfunc
- Add support for kptrs in percpu hashmaps, percpu LRU hashmaps and
in local storage maps
- Enable RCU semantics for task BPF kptrs and allow referenced kptr
tasks to be stored in BPF maps
- Add support for refcounted local kptrs to the verifier for allowing
shared ownership, useful for adding a node to both the BPF list and
rbtree
- Add BPF verifier support for ST instructions in
convert_ctx_access() which will help new -mcpu=v4 clang flag to
start emitting them
- Add ARM32 USDT support to libbpf
- Improve bpftool's visual program dump which produces the control
flow graph in a DOT format by adding C source inline annotations
Protocols:
- IPv4: Allow adding to IPv4 address a 'protocol' tag. Such value
indicates the provenance of the IP address
- IPv6: optimize route lookup, dropping unneeded R/W lock acquisition
- Add the handshake upcall mechanism, allowing the user-space to
implement generic TLS handshake on kernel's behalf
- Bridge: support per-{Port, VLAN} neighbor suppression, increasing
resilience to nodes failures
- SCTP: add support for Fair Capacity and Weighted Fair Queueing
schedulers
- MPTCP: delay first subflow allocation up to its first usage. This
will allow for later better LSM interaction
- xfrm: Remove inner/outer modes from input/output path. These are
not needed anymore
- WiFi:
- reduced neighbor report (RNR) handling for AP mode
- HW timestamping support
- support for randomized auth/deauth TA for PASN privacy
- per-link debugfs for multi-link
- TC offload support for mac80211 drivers
- mac80211 mesh fast-xmit and fast-rx support
- enable Wi-Fi 7 (EHT) mesh support
Netfilter:
- Add nf_tables 'brouting' support, to force a packet to be routed
instead of being bridged
- Update bridge netfilter and ovs conntrack helpers to handle IPv6
Jumbo packets properly, i.e. fetch the packet length from
hop-by-hop extension header. This is needed for BIT TCP support
- The iptables 32bit compat interface isn't compiled in by default
anymore
- Move ip(6)tables builtin icmp matches to the udptcp one. This has
the advantage that icmp/icmpv6 match doesn't load the
iptables/ip6tables modules anymore when iptables-nft is used
- Extended netlink error report for netdevice in flowtables and
netdev/chains. Allow for incrementally add/delete devices to netdev
basechain. Allow to create netdev chain without device
Driver API:
- Remove redundant Device Control Error Reporting Enable, as PCI core
has already error reporting enabled at enumeration time
- Move Multicast DB netlink handlers to core, allowing devices other
then bridge to use them
- Allow the page_pool to directly recycle the pages from safely
localized NAPI
- Implement lockless TX queue stop/wake combo macros, allowing for
further code de-duplication and sanitization
- Add YNL support for user headers and struct attrs
- Add partial YNL specification for devlink
- Add partial YNL specification for ethtool
- Add tc-mqprio and tc-taprio support for preemptible traffic classes
- Add tx push buf len param to ethtool, specifies the maximum number
of bytes of a transmitted packet a driver can push directly to the
underlying device
- Add basic LED support for switch/phy
- Add NAPI documentation, stop relaying on external links
- Convert dsa_master_ioctl() to netdev notifier. This is a
preparatory work to make the hardware timestamping layer selectable
by user space
- Add transceiver support and improve the error messages for CAN-FD
controllers
New hardware / drivers:
- Ethernet:
- AMD/Pensando core device support
- MediaTek MT7981 SoC
- MediaTek MT7988 SoC
- Broadcom BCM53134 embedded switch
- Texas Instruments CPSW9G ethernet switch
- Qualcomm EMAC3 DWMAC ethernet
- StarFive JH7110 SoC
- NXP CBTX ethernet PHY
- WiFi:
- Apple M1 Pro/Max devices
- RealTek rtl8710bu/rtl8188gu
- RealTek rtl8822bs, rtl8822cs and rtl8821cs SDIO chipset
- Bluetooth:
- Realtek RTL8821CS, RTL8851B, RTL8852BS
- Mediatek MT7663, MT7922
- NXP w8997
- Actions Semi ATS2851
- QTI WCN6855
- Marvell 88W8997
- Can:
- STMicroelectronics bxcan stm32f429
Drivers:
- Ethernet NICs:
- Intel (1G, icg):
- add tracking and reporting of QBV config errors
- add support for configuring max SDU for each Tx queue
- Intel (100G, ice):
- refactor mailbox overflow detection to support Scalable IOV
- GNSS interface optimization
- Intel (i40e):
- support XDP multi-buffer
- nVidia/Mellanox:
- add the support for linux bridge multicast offload
- enable TC offload for egress and engress MACVLAN over bond
- add support for VxLAN GBP encap/decap flows offload
- extend packet offload to fully support libreswan
- support tunnel mode in mlx5 IPsec packet offload
- extend XDP multi-buffer support
- support MACsec VLAN offload
- add support for dynamic msix vectors allocation
- drop RX page_cache and fully use page_pool
- implement thermal zone to report NIC temperature
- Netronome/Corigine:
- add support for multi-zone conntrack offload
- Solarflare/Xilinx:
- support offloading TC VLAN push/pop actions to the MAE
- support TC decap rules
- support unicast PTP
- Other NICs:
- Broadcom (bnxt): enforce software based freq adjustments only on
shared PHC NIC
- RealTek (r8169): refactor to addess ASPM issues during NAPI poll
- Micrel (lan8841): add support for PTP_PF_PEROUT
- Cadence (macb): enable PTP unicast
- Engleder (tsnep): add XDP socket zero-copy support
- virtio-net: implement exact header length guest feature
- veth: add page_pool support for page recycling
- vxlan: add MDB data path support
- gve: add XDP support for GQI-QPL format
- geneve: accept every ethertype
- macvlan: allow some packets to bypass broadcast queue
- mana: add support for jumbo frame
- Ethernet high-speed switches:
- Microchip (sparx5): Add support for TC flower templates
- Ethernet embedded switches:
- Broadcom (b54):
- configure 6318 and 63268 RGMII ports
- Marvell (mv88e6xxx):
- faster C45 bus scan
- Microchip:
- lan966x:
- add support for IS1 VCAP
- better TX/RX from/to CPU performances
- ksz9477: add ETS Qdisc support
- ksz8: enhance static MAC table operations and error handling
- sama7g5: add PTP capability
- NXP (ocelot):
- add support for external ports
- add support for preemptible traffic classes
- Texas Instruments:
- add CPSWxG SGMII support for J7200 and J721E
- Intel WiFi (iwlwifi):
- preparation for Wi-Fi 7 EHT and multi-link support
- EHT (Wi-Fi 7) sniffer support
- hardware timestamping support for some devices/firwmares
- TX beacon protection on newer hardware
- Qualcomm 802.11ax WiFi (ath11k):
- MU-MIMO parameters support
- ack signal support for management packets
- RealTek WiFi (rtw88):
- SDIO bus support
- better support for some SDIO devices (e.g. MAC address from
efuse)
- RealTek WiFi (rtw89):
- HW scan support for 8852b
- better support for 6 GHz scanning
- support for various newer firmware APIs
- framework firmware backwards compatibility
- MediaTek WiFi (mt76):
- P2P support
- mesh A-MSDU support
- EHT (Wi-Fi 7) support
- coredump support"
* tag 'net-next-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2078 commits)
net: phy: hide the PHYLIB_LEDS knob
net: phy: marvell-88x2222: remove unnecessary (void*) conversions
tcp/udp: Fix memleaks of sk and zerocopy skbs with TX timestamp.
net: amd: Fix link leak when verifying config failed
net: phy: marvell: Fix inconsistent indenting in led_blink_set
lan966x: Don't use xdp_frame when action is XDP_TX
tsnep: Add XDP socket zero-copy TX support
tsnep: Add XDP socket zero-copy RX support
tsnep: Move skb receive action to separate function
tsnep: Add functions for queue enable/disable
tsnep: Rework TX/RX queue initialization
tsnep: Replace modulo operation with mask
net: phy: dp83867: Add led_brightness_set support
net: phy: Fix reading LED reg property
drivers: nfc: nfcsim: remove return value check of `dev_dir`
net: phy: dp83867: Remove unnecessary (void*) conversions
net: ethtool: coalesce: try to make user settings stick twice
net: mana: Check if netdev/napi_alloc_frag returns single page
net: mana: Rename mana_refill_rxoob and remove some empty lines
net: veth: add page_pool stats
...
Diffstat (limited to 'Documentation')
115 files changed, 4690 insertions, 1290 deletions
diff --git a/Documentation/PCI/pci-error-recovery.rst b/Documentation/PCI/pci-error-recovery.rst index bdafeb4b66dc..9981d330da8f 100644 --- a/Documentation/PCI/pci-error-recovery.rst +++ b/Documentation/PCI/pci-error-recovery.rst @@ -418,7 +418,6 @@ That is, the recovery API only requires that: - drivers/next/e100.c - drivers/net/e1000 - drivers/net/e1000e - - drivers/net/ixgb - drivers/net/ixgbe - drivers/net/cxgb3 - drivers/net/s2io.c diff --git a/Documentation/bpf/bpf_design_QA.rst b/Documentation/bpf/bpf_design_QA.rst index bfff0e7e37c2..38372a956d65 100644 --- a/Documentation/bpf/bpf_design_QA.rst +++ b/Documentation/bpf/bpf_design_QA.rst @@ -314,7 +314,7 @@ Q: What is the compatibility story for special BPF types in map values? Q: Users are allowed to embed bpf_spin_lock, bpf_timer fields in their BPF map values (when using BTF support for BPF maps). This allows to use helpers for such objects on these fields inside map values. Users are also allowed to embed -pointers to some kernel types (with __kptr and __kptr_ref BTF tags). Will the +pointers to some kernel types (with __kptr_untrusted and __kptr BTF tags). Will the kernel preserve backwards compatibility for these features? A: It depends. For bpf_spin_lock, bpf_timer: YES, for kptr and everything else: @@ -324,7 +324,7 @@ For struct types that have been added already, like bpf_spin_lock and bpf_timer, the kernel will preserve backwards compatibility, as they are part of UAPI. For kptrs, they are also part of UAPI, but only with respect to the kptr -mechanism. The types that you can use with a __kptr and __kptr_ref tagged +mechanism. The types that you can use with a __kptr_untrusted and __kptr tagged pointer in your struct are NOT part of the UAPI contract. The supported types can and will change across kernel releases. However, operations like accessing kptr fields and bpf_kptr_xchg() helper will continue to be supported across kernel diff --git a/Documentation/bpf/bpf_devel_QA.rst b/Documentation/bpf/bpf_devel_QA.rst index b421d94dc9f2..609b71f5747d 100644 --- a/Documentation/bpf/bpf_devel_QA.rst +++ b/Documentation/bpf/bpf_devel_QA.rst @@ -128,7 +128,8 @@ into the bpf-next tree will make their way into net-next tree. net and net-next are both run by David S. Miller. From there, they will go into the kernel mainline tree run by Linus Torvalds. To read up on the process of net and net-next being merged into the mainline tree, see -the :ref:`netdev-FAQ` +the documentation on netdev subsystem at +Documentation/process/maintainer-netdev.rst. @@ -147,7 +148,8 @@ request):: Q: How do I indicate which tree (bpf vs. bpf-next) my patch should be applied to? --------------------------------------------------------------------------------- -A: The process is the very same as described in the :ref:`netdev-FAQ`, +A: The process is the very same as described in the netdev subsystem +documentation at Documentation/process/maintainer-netdev.rst, so please read up on it. The subject line must indicate whether the patch is a fix or rather "next-like" content in order to let the maintainers know whether it is targeted at bpf or bpf-next. @@ -206,8 +208,9 @@ ii) run extensive BPF test suite and Once the BPF pull request was accepted by David S. Miller, then the patches end up in net or net-next tree, respectively, and make their way from there further into mainline. Again, see the -:ref:`netdev-FAQ` for additional information e.g. on how often they are -merged to mainline. +documentation for netdev subsystem at +Documentation/process/maintainer-netdev.rst for additional information +e.g. on how often they are merged to mainline. Q: How long do I need to wait for feedback on my BPF patches? ------------------------------------------------------------- @@ -230,7 +233,8 @@ Q: Are patches applied to bpf-next when the merge window is open? ----------------------------------------------------------------- A: For the time when the merge window is open, bpf-next will not be processed. This is roughly analogous to net-next patch processing, -so feel free to read up on the :ref:`netdev-FAQ` about further details. +so feel free to read up on the netdev docs at +Documentation/process/maintainer-netdev.rst about further details. During those two weeks of merge window, we might ask you to resend your patch series once bpf-next is open again. Once Linus released @@ -394,7 +398,8 @@ netdev kernel mailing list in Cc and ask for the fix to be queued up: netdev@vger.kernel.org The process in general is the same as on netdev itself, see also the -:ref:`netdev-FAQ`. +the documentation on networking subsystem at +Documentation/process/maintainer-netdev.rst. Q: Do you also backport to kernels not currently maintained as stable? ---------------------------------------------------------------------- @@ -410,7 +415,7 @@ Q: The BPF patch I am about to submit needs to go to stable as well What should I do? A: The same rules apply as with netdev patch submissions in general, see -the :ref:`netdev-FAQ`. +the netdev docs at Documentation/process/maintainer-netdev.rst. Never add "``Cc: stable@vger.kernel.org``" to the patch description, but ask the BPF maintainers to queue the patches instead. This can be done @@ -684,7 +689,6 @@ when: .. Links -.. _netdev-FAQ: Documentation/process/maintainer-netdev.rst .. _selftests: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/bpf/ diff --git a/Documentation/bpf/clang-notes.rst b/Documentation/bpf/clang-notes.rst index 528feddf2db9..2c872a1ee08e 100644 --- a/Documentation/bpf/clang-notes.rst +++ b/Documentation/bpf/clang-notes.rst @@ -20,6 +20,12 @@ Arithmetic instructions For CPU versions prior to 3, Clang v7.0 and later can enable ``BPF_ALU`` support with ``-Xclang -target-feature -Xclang +alu32``. In CPU version 3, support is automatically included. +Jump instructions +================= + +If ``-O0`` is used, Clang will generate the ``BPF_CALL | BPF_X | BPF_JMP`` (0x8d) +instruction, which is not supported by the Linux kernel verifier. + Atomic operations ================= diff --git a/Documentation/bpf/cpumasks.rst b/Documentation/bpf/cpumasks.rst index 24bef9cbbeee..41efd8874eeb 100644 --- a/Documentation/bpf/cpumasks.rst +++ b/Documentation/bpf/cpumasks.rst @@ -51,7 +51,7 @@ For example: .. code-block:: c struct cpumask_map_value { - struct bpf_cpumask __kptr_ref * cpumask; + struct bpf_cpumask __kptr * cpumask; }; struct array_map { @@ -117,18 +117,13 @@ For example: As mentioned and illustrated above, these ``struct bpf_cpumask *`` objects can also be stored in a map and used as kptrs. If a ``struct bpf_cpumask *`` is in a map, the reference can be removed from the map with bpf_kptr_xchg(), or -opportunistically acquired with bpf_cpumask_kptr_get(): - -.. kernel-doc:: kernel/bpf/cpumask.c - :identifiers: bpf_cpumask_kptr_get - -Here is an example of a ``struct bpf_cpumask *`` being retrieved from a map: +opportunistically acquired using RCU: .. code-block:: c /* struct containing the struct bpf_cpumask kptr which is stored in the map. */ struct cpumasks_kfunc_map_value { - struct bpf_cpumask __kptr_ref * bpf_cpumask; + struct bpf_cpumask __kptr * bpf_cpumask; }; /* The map containing struct cpumasks_kfunc_map_value entries. */ @@ -144,7 +139,7 @@ Here is an example of a ``struct bpf_cpumask *`` being retrieved from a map: /** * A simple example tracepoint program showing how a * struct bpf_cpumask * kptr that is stored in a map can - * be acquired using the bpf_cpumask_kptr_get() kfunc. + * be passed to kfuncs using RCU protection. */ SEC("tp_btf/cgroup_mkdir") int BPF_PROG(cgrp_ancestor_example, struct cgroup *cgrp, const char *path) @@ -158,26 +153,21 @@ Here is an example of a ``struct bpf_cpumask *`` being retrieved from a map: if (!v) return -ENOENT; + bpf_rcu_read_lock(); /* Acquire a reference to the bpf_cpumask * kptr that's already stored in the map. */ - kptr = bpf_cpumask_kptr_get(&v->cpumask); - if (!kptr) + kptr = v->cpumask; + if (!kptr) { /* If no bpf_cpumask was present in the map, it's because * we're racing with another CPU that removed it with * bpf_kptr_xchg() between the bpf_map_lookup_elem() - * above, and our call to bpf_cpumask_kptr_get(). - * bpf_cpumask_kptr_get() internally safely handles this - * race, and will return NULL if the cpumask is no longer - * present in the map by the time we invoke the kfunc. + * above, and our load of the pointer from the map. */ + bpf_rcu_read_unlock(); return -EBUSY; + } - /* Free the reference we just took above. Note that the - * original struct bpf_cpumask * kptr is still in the map. It will - * be freed either at a later time if another context deletes - * it from the map, or automatically by the BPF subsystem if - * it's still present when the map is destroyed. - */ - bpf_cpumask_release(kptr); + bpf_cpumask_setall(kptr); + bpf_rcu_read_unlock(); return 0; } diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst index af515de5fc38..492980ece1ab 100644 --- a/Documentation/bpf/instruction-set.rst +++ b/Documentation/bpf/instruction-set.rst @@ -11,7 +11,8 @@ Documentation conventions ========================= For brevity, this document uses the type notion "u64", "u32", etc. -to mean an unsigned integer whose width is the specified number of bits. +to mean an unsigned integer whose width is the specified number of bits, +and "s32", etc. to mean a signed integer of the specified number of bits. Registers and calling convention ================================ @@ -38,14 +39,11 @@ eBPF has two instruction encodings: * the wide instruction encoding, which appends a second 64-bit immediate (i.e., constant) value after the basic instruction for a total of 128 bits. -The basic instruction encoding is as follows, where MSB and LSB mean the most significant -bits and least significant bits, respectively: +The fields conforming an encoded basic instruction are stored in the +following order:: -============= ======= ======= ======= ============ -32 bits (MSB) 16 bits 4 bits 4 bits 8 bits (LSB) -============= ======= ======= ======= ============ -imm offset src_reg dst_reg opcode -============= ======= ======= ======= ============ + opcode:8 src_reg:4 dst_reg:4 offset:16 imm:32 // In little-endian BPF. + opcode:8 dst_reg:4 src_reg:4 offset:16 imm:32 // In big-endian BPF. **imm** signed integer immediate value @@ -63,6 +61,18 @@ imm offset src_reg dst_reg opcode **opcode** operation to perform +Note that the contents of multi-byte fields ('imm' and 'offset') are +stored using big-endian byte ordering in big-endian BPF and +little-endian byte ordering in little-endian BPF. + +For example:: + + opcode offset imm assembly + src_reg dst_reg + 07 0 1 00 00 44 33 22 11 r1 += 0x11223344 // little + dst_reg src_reg + 07 1 0 00 00 11 22 33 44 r1 += 0x11223344 // big + Note that most instructions do not use all of the fields. Unused fields shall be cleared to zero. @@ -72,18 +82,23 @@ The 64 bits following the basic instruction contain a pseudo instruction using the same format but with opcode, dst_reg, src_reg, and offset all set to zero, and imm containing the high 32 bits of the immediate value. -================= ================== -64 bits (MSB) 64 bits (LSB) -================= ================== -basic instruction pseudo instruction -================= ================== +This is depicted in the following figure:: + + basic_instruction + .-----------------------------. + | | + code:8 regs:8 offset:16 imm:32 unused:32 imm:32 + | | + '--------------' + pseudo instruction Thus the 64-bit immediate value is constructed as follows: imm64 = (next_imm << 32) | imm where 'next_imm' refers to the imm value of the pseudo instruction -following the basic instruction. +following the basic instruction. The unused bytes in the pseudo +instruction are reserved and shall be cleared to zero. Instruction classes ------------------- @@ -228,28 +243,58 @@ Jump instructions otherwise identical operations. The 'code' field encodes the operation as below: -======== ===== ========================= ============ -code value description notes -======== ===== ========================= ============ -BPF_JA 0x00 PC += off BPF_JMP only -BPF_JEQ 0x10 PC += off if dst == src -BPF_JGT 0x20 PC += off if dst > src unsigned -BPF_JGE 0x30 PC += off if dst >= src unsigned -BPF_JSET 0x40 PC += off if dst & src -BPF_JNE 0x50 PC += off if dst != src -BPF_JSGT 0x60 PC += off if dst > src signed -BPF_JSGE 0x70 PC += off if dst >= src signed -BPF_CALL 0x80 function call -BPF_EXIT 0x90 function / program return BPF_JMP only -BPF_JLT 0xa0 PC += off if dst < src unsigned -BPF_JLE 0xb0 PC += off if dst <= src unsigned -BPF_JSLT 0xc0 PC += off if dst < src signed -BPF_JSLE 0xd0 PC += off if dst <= src signed -======== ===== ========================= ============ +======== ===== === =========================================== ========================================= +code value src description notes +======== ===== === =========================================== ========================================= +BPF_JA 0x0 0x0 PC += offset BPF_JMP only +BPF_JEQ 0x1 any PC += offset if dst == src +BPF_JGT 0x2 any PC += offset if dst > src unsigned +BPF_JGE 0x3 any PC += offset if dst >= src unsigned +BPF_JSET 0x4 any PC += offset if dst & src +BPF_JNE 0x5 any PC += offset if dst != src +BPF_JSGT 0x6 any PC += offset if dst > src signed +BPF_JSGE 0x7 any PC += offset if dst >= src signed +BPF_CALL 0x8 0x0 call helper function by address see `Helper functions`_ +BPF_CALL 0x8 0x1 call PC += offset see `Program-local functions`_ +BPF_CALL 0x8 0x2 call helper function by BTF ID see `Helper functions`_ +BPF_EXIT 0x9 0x0 return BPF_JMP only +BPF_JLT 0xa any PC += offset if dst < src unsigned +BPF_JLE 0xb any PC += offset if dst <= src unsigned +BPF_JSLT 0xc any PC += offset if dst < src signed +BPF_JSLE 0xd any PC += offset if dst <= src signed +======== ===== === =========================================== ========================================= The eBPF program needs to store the return value into register R0 before doing a -BPF_EXIT. +``BPF_EXIT``. + +Example: + +``BPF_JSGE | BPF_X | BPF_JMP32`` (0x7e) means:: + + if (s32)dst s>= (s32)src goto +offset + +where 's>=' indicates a signed '>=' comparison. +Helper functions +~~~~~~~~~~~~~~~~ + +Helper functions are a concept whereby BPF programs can call into a +set of function calls exposed by the underlying platform. + +Historically, each helper function was identified by an address +encoded in the imm field. The available helper functions may differ +for each program type, but address values are unique across all program types. + +Platforms that support the BPF Type Format (BTF) support identifying +a helper function by a BTF ID encoded in the imm field, where the BTF ID +identifies the helper name and type. + +Program-local functions +~~~~~~~~~~~~~~~~~~~~~~~ +Program-local functions are functions exposed by the same BPF program as the +caller, and are referenced by offset from the call instruction, similar to +``BPF_JA``. A ``BPF_EXIT`` within the program-local function will return to +the caller. Load and store instructions =========================== @@ -371,14 +416,56 @@ and loaded back to ``R0``. ----------------------------- Instructions with the ``BPF_IMM`` 'mode' modifier use the wide instruction -encoding for an extra imm64 value. - -There is currently only one such instruction. - -``BPF_LD | BPF_DW | BPF_IMM`` means:: - - dst = imm64 - +encoding defined in `Instruction encoding`_, and use the 'src' field of the +basic instruction to hold an opcode subtype. + +The following table defines a set of ``BPF_IMM | BPF_DW | BPF_LD`` instructions +with opcode subtypes in the 'src' field, using new terms such as "map" +defined further below: + +========================= ====== === ========================================= =========== ============== +opcode construction opcode src pseudocode imm type dst type +========================= ====== === ========================================= =========== ============== +BPF_IMM | BPF_DW | BPF_LD 0x18 0x0 dst = imm64 integer integer +BPF_IMM | BPF_DW | BPF_LD 0x18 0x1 dst = map_by_fd(imm) map fd map +BPF_IMM | BPF_DW | BPF_LD 0x18 0x2 dst = map_val(map_by_fd(imm)) + next_imm map fd data pointer +BPF_IMM | BPF_DW | BPF_LD 0x18 0x3 dst = var_addr(imm) variable id data pointer +BPF_IMM | BPF_DW | BPF_LD 0x18 0x4 dst = code_addr(imm) integer code pointer +BPF_IMM | BPF_DW | BPF_LD 0x18 0x5 dst = map_by_idx(imm) map index map +BPF_IMM | BPF_DW | BPF_LD 0x18 0x6 dst = map_val(map_by_idx(imm)) + next_imm map index data pointer +========================= ====== === ========================================= =========== ============== + +where + +* map_by_fd(imm) means to convert a 32-bit file descriptor into an address of a map (see `Maps`_) +* map_by_idx(imm) means to convert a 32-bit index into an address of a map +* map_val(map) gets the address of the first value in a given map +* var_addr(imm) gets the address of a platform variable (see `Platform Variables`_) with a given id +* code_addr(imm) gets the address of the instruction at a specified relative offset in number of (64-bit) instructions +* the 'imm type' can be used by disassemblers for display +* the 'dst type' can be used for verification and JIT compilation purposes + +Maps +~~~~ + +Maps are shared memory regions accessible by eBPF programs on some platforms. +A map can have various semantics as defined in a separate document, and may or +may not have a single contiguous memory region, but the 'map_val(map)' is +currently only defined for maps that do have a single contiguous memory region. + +Each map can have a file descriptor (fd) if supported by the platform, where +'map_by_fd(imm)' means to get the map with the specified file descriptor. Each +BPF program can also be defined to use a set of maps associated with the +program at load time, and 'map_by_idx(imm)' means to get the map with the given +index in the set associated with the BPF program containing the instruction. + +Platform Variables +~~~~~~~~~~~~~~~~~~ + +Platform variables are memory regions, identified by integer ids, exposed by +the runtime and accessible by BPF programs on some platforms. The +'var_addr(imm)' operation means to get the address of the memory region +identified by the given id. Legacy BPF Packet access instructions ------------------------------------- diff --git a/Documentation/bpf/kfuncs.rst b/Documentation/bpf/kfuncs.rst index ca96ef3f6896..ea2516374d92 100644 --- a/Documentation/bpf/kfuncs.rst +++ b/Documentation/bpf/kfuncs.rst @@ -100,6 +100,23 @@ Hence, whenever a constant scalar argument is accepted by a kfunc which is not a size parameter, and the value of the constant matters for program safety, __k suffix should be used. +2.2.2 __uninit Annotation +------------------------- + +This annotation is used to indicate that the argument will be treated as +uninitialized. + +An example is given below:: + + __bpf_kfunc int bpf_dynptr_from_skb(..., struct bpf_dynptr_kern *ptr__uninit) + { + ... + } + +Here, the dynptr will be treated as an uninitialized dynptr. Without this +annotation, the verifier will reject the program if the dynptr passed in is +not initialized. + .. _BPF_kfunc_nodef: 2.3 Using an existing kernel function @@ -162,20 +179,12 @@ both are orthogonal to each other. --------------------- The KF_RELEASE flag is used to indicate that the kfunc releases the pointer -passed in to it. There can be only one referenced pointer that can be passed in. -All copies of the pointer being released are invalidated as a result of invoking -kfunc with this flag. - -2.4.4 KF_KPTR_GET flag ----------------------- - -The KF_KPTR_GET flag is used to indicate that the kfunc takes the first argument -as a pointer to kptr, safely increments the refcount of the object it points to, -and returns a reference to the user. The rest of the arguments may be normal -arguments of a kfunc. The KF_KPTR_GET flag should be used in conjunction with -KF_ACQUIRE and KF_RET_NULL flags. +passed in to it. There can be only one referenced pointer that can be passed +in. All copies of the pointer being released are invalidated as a result of +invoking kfunc with this flag. KF_RELEASE kfuncs automatically receive the +protection afforded by the KF_TRUSTED_ARGS flag described below. -2.4.5 KF_TRUSTED_ARGS flag +2.4.4 KF_TRUSTED_ARGS flag -------------------------- The KF_TRUSTED_ARGS flag is used for kfuncs taking pointer arguments. It @@ -187,7 +196,7 @@ exception described below). There are two types of pointers to kernel objects which are considered "valid": 1. Pointers which are passed as tracepoint or struct_ops callback arguments. -2. Pointers which were returned from a KF_ACQUIRE or KF_KPTR_GET kfunc. +2. Pointers which were returned from a KF_ACQUIRE kfunc. Pointers to non-BTF objects (e.g. scalar pointers) may also be passed to KF_TRUSTED_ARGS kfuncs, and may have a non-zero offset. @@ -214,13 +223,13 @@ In other words, you must: 2. Specify the type and name of the trusted nested field. This field must match the field in the original type definition exactly. -2.4.6 KF_SLEEPABLE flag +2.4.5 KF_SLEEPABLE flag ----------------------- The KF_SLEEPABLE flag is used for kfuncs that may sleep. Such kfuncs can only be called by sleepable BPF programs (BPF_F_SLEEPABLE). -2.4.7 KF_DESTRUCTIVE flag +2.4.6 KF_DESTRUCTIVE flag -------------------------- The KF_DESTRUCTIVE flag is used to indicate functions calling which is @@ -229,18 +238,20 @@ rebooting or panicking. Due to this additional restrictions apply to these calls. At the moment they only require CAP_SYS_BOOT capability, but more can be added later. -2.4.8 KF_RCU flag +2.4.7 KF_RCU flag ----------------- -The KF_RCU flag is used for kfuncs which have a rcu ptr as its argument. -When used together with KF_ACQUIRE, it indicates the kfunc should have a -single argument which must be a trusted argument or a MEM_RCU pointer. -The argument may have reference count of 0 and the kfunc must take this -into consideration. +The KF_RCU flag is a weaker version of KF_TRUSTED_ARGS. The kfuncs marked with +KF_RCU expect either PTR_TRUSTED or MEM_RCU arguments. The verifier guarantees +that the objects are valid and there is no use-after-free. The pointers are not +NULL, but the object's refcount could have reached zero. The kfuncs need to +consider doing refcnt != 0 check, especially when returning a KF_ACQUIRE +pointer. Note as well that a KF_ACQUIRE kfunc that is KF_RCU should very likely +also be KF_RET_NULL. .. _KF_deprecated_flag: -2.4.9 KF_DEPRECATED flag +2.4.8 KF_DEPRECATED flag ------------------------ The KF_DEPRECATED flag is used for kfuncs which are scheduled to be @@ -451,13 +462,50 @@ struct_ops callback arg. For example: struct task_struct *acquired; acquired = bpf_task_acquire(task); + if (acquired) + /* + * In a typical program you'd do something like store + * the task in a map, and the map will automatically + * release it later. Here, we release it manually. + */ + bpf_task_release(acquired); + return 0; + } + + +References acquired on ``struct task_struct *`` objects are RCU protected. +Therefore, when in an RCU read region, you can obtain a pointer to a task +embedded in a map value without having to acquire a reference: + +.. code-block:: c + + #define private(name) SEC(".data." #name) __hidden __attribute__((aligned(8))) + private(TASK) static struct task_struct *global; + + /** + * A trivial example showing how to access a task stored + * in a map using RCU. + */ + SEC("tp_btf/task_newtask") + int BPF_PROG(task_rcu_read_example, struct task_struct *task, u64 clone_flags) + { + struct task_struct *local_copy; + + bpf_rcu_read_lock(); + local_copy = global; + if (local_copy) + /* + * We could also pass local_copy to kfuncs or helper functions here, + * as we're guaranteed that local_copy will be valid until we exit + * the RCU read region below. + */ + bpf_printk("Global task %s is valid", local_copy->comm); + else + bpf_printk("No global task found"); + bpf_rcu_read_unlock(); + + /* At this point we can no longer reference local_copy. */ - /* - * In a typical program you'd do something like store - * the task in a map, and the map will automatically - * release it later. Here, we release it manually. - */ - bpf_task_release(acquired); return 0; } @@ -515,80 +563,16 @@ bpf_task_release() respectively, so we won't provide examples for them. ---- -You may also acquire a reference to a ``struct cgroup`` kptr that's already -stored in a map using bpf_cgroup_kptr_get(): +Other kfuncs available for interacting with ``struct cgroup *`` objects are +bpf_cgroup_ancestor() and bpf_cgroup_from_id(), allowing callers to access +the ancestor of a cgroup and find a cgroup by its ID, respectively. Both +return a cgroup kptr. .. kernel-doc:: kernel/bpf/helpers.c - :identifiers: bpf_cgroup_kptr_get - -Here's an example of how it can be used: - -.. code-block:: c - - /* struct containing the struct task_struct kptr which is actually stored in the map. */ - struct __cgroups_kfunc_map_value { - struct cgroup __kptr_ref * cgroup; - }; - - /* The map containing struct __cgroups_kfunc_map_value entries. */ - struct { - __uint(type, BPF_MAP_TYPE_HASH); - __type(key, int); - __type(value, struct __cgroups_kfunc_map_value); - __uint(max_entries, 1); - } __cgroups_kfunc_map SEC(".maps"); - - /* ... */ - - /** - * A simple example tracepoint program showing how a - * struct cgroup kptr that is stored in a map can - * be acquired using the bpf_cgroup_kptr_get() kfunc. - */ - SEC("tp_btf/cgroup_mkdir") - int BPF_PROG(cgroup_kptr_get_example, struct cgroup *cgrp, const char *path) - { - struct cgroup *kptr; - struct __cgroups_kfunc_map_value *v; - s32 id = cgrp->self.id; - - /* Assume a cgroup kptr was previously stored in the map. */ - v = bpf_map_lookup_elem(&__cgroups_kfunc_map, &id); - if (!v) - return -ENOENT; - - /* Acquire a reference to the cgroup kptr that's already stored in the map. */ - kptr = bpf_cgroup_kptr_get(&v->cgroup); - if (!kptr) - /* If no cgroup was present in the map, it's because - * we're racing with another CPU that removed it with - * bpf_kptr_xchg() between the bpf_map_lookup_elem() - * above, and our call to bpf_cgroup_kptr_get(). - * bpf_cgroup_kptr_get() internally safely handles this - * race, and will return NULL if the task is no longer - * present in the map by the time we invoke the kfunc. - */ - return -EBUSY; - - /* Free the reference we just took above. Note that the - * original struct cgroup kptr is still in the map. It will - * be freed either at a later time if another context deletes - * it from the map, or automatically by the BPF subsystem if - * it's still present when the map is destroyed. - */ - bpf_cgroup_release(kptr); - - return 0; - } - ----- - -Another kfunc available for interacting with ``struct cgroup *`` objects is -bpf_cgroup_ancestor(). This allows callers to access the ancestor of a cgroup, -and return it as a cgroup kptr. + :identifiers: bpf_cgroup_ancestor .. kernel-doc:: kernel/bpf/helpers.c - :identifiers: bpf_cgroup_ancestor + :identifiers: bpf_cgroup_from_id Eventually, BPF should be updated to allow this to happen with a normal memory load in the program itself. This is currently not possible without more work in diff --git a/Documentation/bpf/libbpf/index.rst b/Documentation/bpf/libbpf/index.rst index f9b3b252e28f..7545a2049692 100644 --- a/Documentation/bpf/libbpf/index.rst +++ b/Documentation/bpf/libbpf/index.rst @@ -2,23 +2,32 @@ .. _libbpf: +====== libbpf ====== +If you are looking to develop BPF applications using the libbpf library, this +directory contains important documentation that you should read. + +To get started, it is recommended to begin with the :doc:`libbpf Overview +<libbpf_overview>` document, which provides a high-level understanding of the +libbpf APIs and their usage. This will give you a solid foundation to start +exploring and utilizing the various features of libbpf to develop your BPF +applications. + .. toctree:: :maxdepth: 1 + libbpf_overview API Documentation <https://libbpf.readthedocs.io/en/latest/api.html> program_types libbpf_naming_convention libbpf_build -This is documentation for libbpf, a userspace library for loading and -interacting with bpf programs. -All general BPF questions, including kernel functionality, libbpf APIs and -their application, should be sent to bpf@vger.kernel.org mailing list. -You can `subscribe <http://vger.kernel.org/vger-lists.html#bpf>`_ to the -mailing list search its `archive <https://lore.kernel.org/bpf/>`_. -Please search the archive before asking new questions. It very well might -be that this was already addressed or answered before. +All general BPF questions, including kernel functionality, libbpf APIs and their +application, should be sent to bpf@vger.kernel.org mailing list. You can +`subscribe <http://vger.kernel.org/vger-lists.html#bpf>`_ to the mailing list +search its `archive <https://lore.kernel.org/bpf/>`_. Please search the archive +before asking new questions. It may be that this was already addressed or +answered before. diff --git a/Documentation/bpf/libbpf/libbpf_overview.rst b/Documentation/bpf/libbpf/libbpf_overview.rst new file mode 100644 index 000000000000..f36a2d4ffea2 --- /dev/null +++ b/Documentation/bpf/libbpf/libbpf_overview.rst @@ -0,0 +1,228 @@ +.. SPDX-License-Identifier: GPL-2.0 + +=============== +libbpf Overview +=============== + +libbpf is a C-based library containing a BPF loader that takes compiled BPF +object files and prepares and loads them into the Linux kernel. libbpf takes the +heavy lifting of loading, verifying, and attaching BPF programs to various +kernel hooks, allowing BPF application developers to focus only on BPF program +correctness and performance. + +The following are the high-level features supported by libbpf: + +* Provides high-level and low-level APIs for user space programs to interact + with BPF programs. The low-level APIs wrap all the bpf system call + functionality, which is useful when users need more fine-grained control + over the interactions between user space and BPF programs. +* Provides overall support for the BPF object skeleton generated by bpftool. + The skeleton file simplifies the process for the user space programs to access + global variables and work with BPF programs. +* Provides BPF-side APIS, including BPF helper definitions, BPF maps support, + and tracing helpers, allowing developers to simplify BPF code writing. +* Supports BPF CO-RE mechanism, enabling BPF developers to write portable + BPF programs that can be compiled once and run across different kernel + versions. + +This document will delve into the above concepts in detail, providing a deeper +understanding of the capabilities and advantages of libbpf and how it can help +you develop BPF applications efficiently. + +BPF App Lifecycle and libbpf APIs +================================== + +A BPF application consists of one or more BPF programs (either cooperating or +completely independent), BPF maps, and global variables. The global +variables are shared between all BPF programs, which allows them to cooperate on +a common set of data. libbpf provides APIs that user space programs can use to +manipulate the BPF programs by triggering different phases of a BPF application +lifecycle. + +The following section provides a brief overview of each phase in the BPF life +cycle: + +* **Open phase**: In this phase, libbpf parses the BPF + object file and discovers BPF maps, BPF programs, and global variables. After + a BPF app is opened, user space apps can make additional adjustments + (setting BPF program types, if necessary; pre-setting initial values for + global variables, etc.) before all the entities are created and loaded. + +* **Load phase**: In the load phase, libbpf creates BPF + maps, resolves various relocations, and verifies and loads BPF programs into + the kernel. At this point, libbpf validates all the parts of a BPF application + and loads the BPF program into the kernel, but no BPF program has yet been + executed. After the load phase, it’s possible to set up the initial BPF map + state without racing with the BPF program code execution. + +* **Attachment phase**: In this phase, libbpf + attaches BPF programs to various BPF hook points (e.g., tracepoints, kprobes, + cgroup hooks, network packet processing pipeline, etc.). During this + phase, BPF programs perform useful work such as processing + packets, or updating BPF maps and global variables that can be read from user + space. + +* **Tear down phase**: In the tear down phase, + libbpf detaches BPF programs and unloads them from the kernel. BPF maps are + destroyed, and all the resources used by the BPF app are freed. + +BPF Object Skeleton File +======================== + +BPF skeleton is an alternative interface to libbpf APIs for working with BPF +objects. Skeleton code abstract away generic libbpf APIs to significantly +simplify code for manipulating BPF programs from user space. Skeleton code +includes a bytecode representation of the BPF object file, simplifying the +process of distributing your BPF code. With BPF bytecode embedded, there are no +extra files to deploy along with your application binary. + +You can generate the skeleton header file ``(.skel.h)`` for a specific object +file by passing the BPF object to the bpftool. The generated BPF skeleton +provides the following custom functions that correspond to the BPF lifecycle, +each of them prefixed with the specific object name: + +* ``<name>__open()`` – creates and opens BPF application (``<name>`` stands for + the specific bpf object name) +* ``<name>__load()`` – instantiates, loads,and verifies BPF application parts +* ``<name>__attach()`` – attaches all auto-attachable BPF programs (it’s + optional, you can have more control by using libbpf APIs directly) +* ``<name>__destroy()`` – detaches all BPF programs and + frees up all used resources + +Using the skeleton code is the recommended way to work with bpf programs. Keep +in mind, BPF skeleton provides access to the underlying BPF object, so whatever +was possible to do with generic libbpf APIs is still possible even when the BPF +skeleton is used. It's an additive convenience feature, with no syscalls, and no +cumbersome code. + +Other Advantages of Using Skeleton File +--------------------------------------- + +* BPF skeleton provides an interface for user space programs to work with BPF + global variables. The skeleton code memory maps global variables as a struct + into user space. The struct interface allows user space programs to initialize + BPF programs before the BPF load phase and fetch and update data from user + space afterward. + +* The ``skel.h`` file reflects the object file structure by listing out the + available maps, programs, etc. BPF skeleton provides direct access to all the + BPF maps and BPF programs as struct fields. This eliminates the need for + string-based lookups with ``bpf_object_find_map_by_name()`` and + ``bpf_object_find_program_by_name()`` APIs, reducing errors due to BPF source + code and user-space code getting out of sync. + +* The embedded bytecode representation of the object file ensures that the + skeleton and the BPF object file are always in sync. + +BPF Helpers +=========== + +libbpf provides BPF-side APIs that BPF programs can use to interact with the +system. The BPF helpers definition allows developers to use them in BPF code as +any other plain C function. For example, there are helper functions to print +debugging messages, get the time since the system was booted, interact with BPF +maps, manipulate network packets, etc. + +For a complete description of what the helpers do, the arguments they take, and +the return value, see the `bpf-helpers +<https://man7.org/linux/man-pages/man7/bpf-helpers.7.html>`_ man page. + +BPF CO-RE (Compile Once – Run Everywhere) +========================================= + +BPF programs work in the kernel space and have access to kernel memory and data +structures. One limitation that BPF applications come across is the lack of +portability across different kernel versions and configurations. `BCC +<https://github.com/iovisor/bcc/>`_ is one of the solutions for BPF +portability. However, it comes with runtime overhead and a large binary size +from embedding the compiler with the application. + +libbpf steps up the BPF program portability by supporting the BPF CO-RE concept. +BPF CO-RE brings together BTF type information, libbpf, and the compiler to +produce a single executable binary that you can run on multiple kernel versions +and configurations. + +To make BPF programs portable libbpf relies on the BTF type information of the +running kernel. Kernel also exposes this self-describing authoritative BTF +information through ``sysfs`` at ``/sys/kernel/btf/vmlinux``. + +You can generate the BTF information for the running kernel with the following +command: + +:: + + $ bpftool btf dump file /sys/kernel/btf/vmlinux format c > vmlinux.h + +The command generates a ``vmlinux.h`` header file with all kernel types +(:doc:`BTF types <../btf>`) that the running kernel uses. Including +``vmlinux.h`` in your BPF program eliminates dependency on system-wide kernel +headers. + +libbpf enables portability of BPF programs by looking at the BPF program’s +recorded BTF type and relocation information and matching them to BTF +information (vmlinux) provided by the running kernel. libbpf then resolves and +matches all the types and fields, and updates necessary offsets and other +relocatable data to ensure that BPF program’s logic functions correctly for a +specific kernel on the host. BPF CO-RE concept thus eliminates overhead +associated with BPF development and allows developers to write portable BPF +applications without modifications and runtime source code compilation on the +target machine. + +The following code snippet shows how to read the parent field of a kernel +``task_struct`` using BPF CO-RE and libbf. The basic helper to read a field in a +CO-RE relocatable manner is ``bpf_core_read(dst, sz, src)``, which will read +``sz`` bytes from the field referenced by ``src`` into the memory pointed to by +``dst``. + +.. code-block:: C + :emphasize-lines: 6 + + //... + struct task_struct *task = (void *)bpf_get_current_task(); + struct task_struct *parent_task; + int err; + + err = bpf_core_read(&parent_task, sizeof(void *), &task->parent); + if (err) { + /* handle error */ + } + + /* parent_task contains the value of task->parent pointer */ + +In the code snippet, we first get a pointer to the current ``task_struct`` using +``bpf_get_current_task()``. We then use ``bpf_core_read()`` to read the parent +field of task struct into the ``parent_task`` variable. ``bpf_core_read()`` is +just like ``bpf_probe_read_kernel()`` BPF helper, except it records information +about the field that should be relocated on the target kernel. i.e, if the +``parent`` field gets shifted to a different offset within +``struct task_struct`` due to some new field added in front of it, libbpf will +automatically adjust the actual offset to the proper value. + +Getting Started with libbpf +=========================== + +Check out the `libbpf-bootstrap <https://github.com/libbpf/libbpf-bootstrap>`_ +repository with simple examples of using libbpf to build various BPF +applications. + +See also `libbpf API documentation +<https://libbpf.readthedocs.io/en/latest/api.html>`_. + +libbpf and Rust +=============== + +If you are building BPF applications in Rust, it is recommended to use the +`Libbpf-rs <https://github.com/libbpf/libbpf-rs>`_ library instead of bindgen +bindings directly to libbpf. Libbpf-rs wraps libbpf functionality in +Rust-idiomatic interfaces and provides libbpf-cargo plugin to handle BPF code +compilation and skeleton generation. Using Libbpf-rs will make building user +space part of the BPF application easier. Note that the BPF program themselves +must still be written in plain C. + +Additional Documentation +======================== + +* `Program types and ELF Sections <https://libbpf.readthedocs.io/en/latest/program_types.html>`_ +* `API naming convention <https://libbpf.readthedocs.io/en/latest/libbpf_naming_convention.html>`_ +* `Building libbpf <https://libbpf.readthedocs.io/en/latest/libbpf_build.html>`_ +* `API documentation Convention <https://libbpf.readthedocs.io/en/latest/libbpf_naming_convention.html#api-documentation-convention>`_ diff --git a/Documentation/bpf/linux-notes.rst b/Documentation/bpf/linux-notes.rst index 956b0c86699d..508d009d3bed 100644 --- a/Documentation/bpf/linux-notes.rst +++ b/Documentation/bpf/linux-notes.rst @@ -12,6 +12,36 @@ Byte swap instructions ``BPF_FROM_LE`` and ``BPF_FROM_BE`` exist as aliases for ``BPF_TO_LE`` and ``BPF_TO_BE`` respectively. +Jump instructions +================= + +``BPF_CALL | BPF_X | BPF_JMP`` (0x8d), where the helper function +integer would be read from a specified register, is not currently supported +by the verifier. Any programs with this instruction will fail to load +until such support is added. + +Maps +==== + +Linux only supports the 'map_val(map)' operation on array maps with a single element. + +Linux uses an fd_array to store maps associated with a BPF program. Thus, +map_by_idx(imm) uses the fd at that index in the array. + +Variables +========= + +The following 64-bit immediate instruction specifies that a variable address, +which corresponds to some integer stored in the 'imm' field, should be loaded: + +========================= ====== === ========================================= =========== ============== +opcode construction opcode src pseudocode imm type dst type +========================= ====== === ========================================= =========== ============== +BPF_IMM | BPF_DW | BPF_LD 0x18 0x3 dst = var_addr(imm) variable id data pointer +========================= ====== === ========================================= =========== ============== + +On Linux, this integer is a BTF ID. + Legacy BPF Packet access instructions ===================================== diff --git a/Documentation/bpf/maps.rst b/Documentation/bpf/maps.rst index 4906ff0f8382..6f069f3d6f4b 100644 --- a/Documentation/bpf/maps.rst +++ b/Documentation/bpf/maps.rst @@ -11,9 +11,9 @@ maps are accessed from BPF programs via BPF helpers which are documented in the `man-pages`_ for `bpf-helpers(7)`_. BPF maps are accessed from user space via the ``bpf`` syscall, which provides -commands to create maps, lookup elements, update elements and delete -elements. More details of the BPF syscall are available in -:doc:`/userspace-api/ebpf/syscall` and in the `man-pages`_ for `bpf(2)`_. +commands to create maps, lookup elements, update elements and delete elements. +More details of the BPF syscall are available in `ebpf-syscall`_ and in the +`man-pages`_ for `bpf(2)`_. Map Types ========= @@ -79,3 +79,4 @@ Find and delete element by key in a given map using ``attr->map_fd``, .. _man-pages: https://www.kernel.org/doc/man-pages/ .. _bpf(2): https://man7.org/linux/man-pages/man2/bpf.2.html .. _bpf-helpers(7): https://man7.org/linux/man-pages/man7/bpf-helpers.7.html +.. _ebpf-syscall: https://docs.kernel.org/userspace-api/ebpf/syscall.html diff --git a/Documentation/devicetree/bindings/arm/mediatek/mediatek,mt7622-wed.yaml b/Documentation/devicetree/bindings/arm/mediatek/mediatek,mt7622-wed.yaml index 5c223cb063d4..f7d578a171a4 100644 --- a/Documentation/devicetree/bindings/arm/mediatek/mediatek,mt7622-wed.yaml +++ b/Documentation/devicetree/bindings/arm/mediatek/mediatek,mt7622-wed.yaml @@ -20,6 +20,7 @@ properties: items: - enum: - mediatek,mt7622-wed + - mediatek,mt7981-wed - mediatek,mt7986-wed - const: syscon diff --git a/Documentation/devicetree/bindings/arm/mediatek/mediatek,sgmiisys.txt b/Documentation/devicetree/bindings/arm/mediatek/mediatek,sgmiisys.txt deleted file mode 100644 index d2c24c277514..000000000000 --- a/Documentation/devicetree/bindings/arm/mediatek/mediatek,sgmiisys.txt +++ /dev/null @@ -1,27 +0,0 @@ -MediaTek SGMIISYS controller -============================ - -The MediaTek SGMIISYS controller provides various clocks to the system. - -Required Properties: - -- compatible: Should be: - - "mediatek,mt7622-sgmiisys", "syscon" - - "mediatek,mt7629-sgmiisys", "syscon" - - "mediatek,mt7981-sgmiisys_0", "syscon" - - "mediatek,mt7981-sgmiisys_1", "syscon" - - "mediatek,mt7986-sgmiisys_0", "syscon" - - "mediatek,mt7986-sgmiisys_1", "syscon" -- #clock-cells: Must be 1 - -The SGMIISYS controller uses the common clk binding from -Documentation/devicetree/bindings/clock/clock-bindings.txt -The available clocks are defined in dt-bindings/clock/mt*-clk.h. - -Example: - -sgmiisys: sgmiisys@1b128000 { - compatible = "mediatek,mt7622-sgmiisys", "syscon"; - reg = <0 0x1b128000 0 0x1000>; - #clock-cells = <1>; -}; diff --git a/Documentation/devicetree/bindings/arm/stm32/st,stm32-syscon.yaml b/Documentation/devicetree/bindings/arm/stm32/st,stm32-syscon.yaml index b2b156cc160a..ad8e51aa01b0 100644 --- a/Documentation/devicetree/bindings/arm/stm32/st,stm32-syscon.yaml +++ b/Documentation/devicetree/bindings/arm/stm32/st,stm32-syscon.yaml @@ -20,6 +20,7 @@ properties: - st,stm32-syscfg - st,stm32-power-config - st,stm32-tamp + - st,stm32f4-gcan - const: syscon - items: - const: st,stm32-tamp @@ -42,6 +43,7 @@ if: contains: enum: - st,stm32mp157-syscfg + - st,stm32f4-gcan then: required: - clocks diff --git a/Documentation/devicetree/bindings/net/actions,owl-emac.yaml b/Documentation/devicetree/bindings/net/actions,owl-emac.yaml index d30fada2ac39..5718ab4654b2 100644 --- a/Documentation/devicetree/bindings/net/actions,owl-emac.yaml +++ b/Documentation/devicetree/bindings/net/actions,owl-emac.yaml @@ -16,7 +16,7 @@ description: | operation modes at 10/100 Mb/s data transfer rates. allOf: - - $ref: "ethernet-controller.yaml#" + - $ref: ethernet-controller.yaml# properties: compatible: diff --git a/Documentation/devicetree/bindings/net/allwinner,sun4i-a10-emac.yaml b/Documentation/devicetree/bindings/net/allwinner,sun4i-a10-emac.yaml index 987b91b9afe9..eb26623dab51 100644 --- a/Documentation/devicetree/bindings/net/allwinner,sun4i-a10-emac.yaml +++ b/Documentation/devicetree/bindings/net/allwinner,sun4i-a10-emac.yaml @@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# title: Allwinner A10 EMAC Ethernet Controller allOf: - - $ref: "ethernet-controller.yaml#" + - $ref: ethernet-controller.yaml# maintainers: - Chen-Yu Tsai <wens@csie.org> diff --git a/Documentation/devicetree/bindings/net/allwinner,sun4i-a10-mdio.yaml b/Documentation/devicetree/bindings/net/allwinner,sun4i-a10-mdio.yaml index ede977cdfb8d..85f552b907f3 100644 --- a/Documentation/devicetree/bindings/net/allwinner,sun4i-a10-mdio.yaml +++ b/Documentation/devicetree/bindings/net/allwinner,sun4i-a10-mdio.yaml @@ -11,7 +11,7 @@ maintainers: - Maxime Ripard <mripard@kernel.org> allOf: - - $ref: "mdio.yaml#" + - $ref: mdio.yaml# # Select every compatible, including the deprecated ones. This way, we # will be able to report a warning when we have that compatible, since diff --git a/Documentation/devicetree/bindings/net/altr,tse.yaml b/Documentation/devicetree/bindings/net/altr,tse.yaml index 8d1d94494349..9d02af468906 100644 --- a/Documentation/devicetree/bindings/net/altr,tse.yaml +++ b/Documentation/devicetree/bindings/net/altr,tse.yaml @@ -66,7 +66,7 @@ required: - tx-fifo-depth allOf: - - $ref: "ethernet-controller.yaml#" + - $ref: ethernet-controller.yaml# - if: properties: compatible: diff --git a/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml b/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml index ddd5a073c3a8..a2c51a84efa5 100644 --- a/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml +++ b/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml @@ -2,8 +2,8 @@ # Copyright 2019 BayLibre, SAS %YAML 1.2 --- -$id: "http://devicetree.org/schemas/net/amlogic,meson-dwmac.yaml#" -$schema: "http://devicetree.org/meta-schemas/core.yaml#" +$id: http://devicetree.org/schemas/net/amlogic,meson-dwmac.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# title: Amlogic Meson DWMAC Ethernet controller diff --git a/Documentation/devicetree/bindings/net/aspeed,ast2600-mdio.yaml b/Documentation/devicetree/bindings/net/aspeed,ast2600-mdio.yaml index f81eda8cb0a5..d6ef468495c5 100644 --- a/Documentation/devicetree/bindings/net/aspeed,ast2600-mdio.yaml +++ b/Documentation/devicetree/bindings/net/aspeed,ast2600-mdio.yaml @@ -15,7 +15,7 @@ description: |+ MAC. allOf: - - $ref: "mdio.yaml#" + - $ref: mdio.yaml# properties: compatible: diff --git a/Documentation/devicetree/bindings/net/bluetooth/nxp,88w8987-bt.yaml b/Documentation/devicetree/bindings/net/bluetooth/nxp,88w8987-bt.yaml new file mode 100644 index 000000000000..57e4c87cb00b --- /dev/null +++ b/Documentation/devicetree/bindings/net/bluetooth/nxp,88w8987-bt.yaml @@ -0,0 +1,45 @@ +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/net/bluetooth/nxp,88w8987-bt.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: NXP Bluetooth chips + +description: + This binding describes UART-attached NXP bluetooth chips. These chips + are dual-radio chips supporting WiFi and Bluetooth. The bluetooth + works on standard H4 protocol over 4-wire UART. The RTS and CTS lines + are used during FW download. To enable power save mode, the host + asserts break signal over UART-TX line to put the chip into power save + state. De-asserting break wakes up the BT chip. + +maintainers: + - Neeraj Sanjay Kale <neeraj.sanjaykale@nxp.com> + +properties: + compatible: + enum: + - nxp,88w8987-bt + - nxp,88w8997-bt + + fw-init-baudrate: + description: + Chip baudrate after FW is downloaded and initialized. + This property depends on the module vendor's + configuration. If this property is not specified, + 115200 is set as default. + +required: + - compatible + +additionalProperties: false + +examples: + - | + serial { + bluetooth { + compatible = "nxp,88w8987-bt"; + fw-init-baudrate = <3000000>; + }; + }; diff --git a/Documentation/devicetree/bindings/net/bluetooth/qualcomm-bluetooth.yaml b/Documentation/devicetree/bindings/net/bluetooth/qualcomm-bluetooth.yaml index a6a6b0e4df7a..68f78b90d23a 100644 --- a/Documentation/devicetree/bindings/net/bluetooth/qualcomm-bluetooth.yaml +++ b/Documentation/devicetree/bindings/net/bluetooth/qualcomm-bluetooth.yaml @@ -23,6 +23,7 @@ properties: - qcom,wcn3998-bt - qcom,qca6390-bt - qcom,wcn6750-bt + - qcom,wcn6855-bt enable-gpios: maxItems: 1 @@ -133,6 +134,22 @@ allOf: - vddrfa1p7-supply - vddrfa1p2-supply - vddasd-supply + - if: + properties: + compatible: + contains: + enum: + - qcom,wcn6855-bt + then: + required: + - enable-gpios + - swctrl-gpios + - vddio-supply + - vddbtcxmx-supply + - vddrfacmn-supply + - vddrfa0p8-supply + - vddrfa1p2-supply + - vddrfa1p7-supply examples: - | diff --git a/Documentation/devicetree/bindings/net/brcm,amac.yaml b/Documentation/devicetree/bindings/net/brcm,amac.yaml index ee2eac8f5710..210fb29c4e7b 100644 --- a/Documentation/devicetree/bindings/net/brcm,amac.yaml +++ b/Documentation/devicetree/bindings/net/brcm,amac.yaml @@ -10,7 +10,7 @@ maintainers: - Florian Fainelli <f.fainelli@gmail.com> allOf: - - $ref: "ethernet-controller.yaml#" + - $ref: ethernet-controller.yaml# - if: properties: compatible: diff --git a/Documentation/devicetree/bindings/net/brcm,systemport.yaml b/Documentation/devicetree/bindings/net/brcm,systemport.yaml index 5fc9c9fafd85..b40006d44791 100644 --- a/Documentation/devicetree/bindings/net/brcm,systemport.yaml +++ b/Documentation/devicetree/bindings/net/brcm,systemport.yaml @@ -66,7 +66,7 @@ required: - phy-mode allOf: - - $ref: "ethernet-controller.yaml#" + - $ref: ethernet-controller.yaml# unevaluatedProperties: false diff --git a/Documentation/devicetree/bindings/net/broadcom-bluetooth.yaml b/Documentation/devicetree/bindings/net/broadcom-bluetooth.yaml index b964c7dcec15..cc70b00c6ce5 100644 --- a/Documentation/devicetree/bindings/net/broadcom-bluetooth.yaml +++ b/Documentation/devicetree/bindings/net/broadcom-bluetooth.yaml @@ -121,7 +121,7 @@ required: - compatible dependencies: - brcm,requires-autobaud-mode: [ 'shutdown-gpios' ] + brcm,requires-autobaud-mode: [ shutdown-gpios ] if: not: diff --git a/Documentation/devicetree/bindings/net/can/fsl,flexcan.yaml b/Documentation/devicetree/bindings/net/can/fsl,flexcan.yaml index 6e59bd2a6094..4162469c3c08 100644 --- a/Documentation/devicetree/bindings/net/can/fsl,flexcan.yaml +++ b/Documentation/devicetree/bindings/net/can/fsl,flexcan.yaml @@ -63,6 +63,9 @@ properties: boot loader. This property should only be used the used operating system doesn't support the clocks and clock-names property. + power-domains: + maxItems: 1 + xceiver-supply: description: Regulator that powers the CAN transceiver. diff --git a/Documentation/devicetree/bindings/net/can/st,stm32-bxcan.yaml b/Documentation/devicetree/bindings/net/can/st,stm32-bxcan.yaml new file mode 100644 index 000000000000..769fa5c27b76 --- /dev/null +++ b/Documentation/devicetree/bindings/net/can/st,stm32-bxcan.yaml @@ -0,0 +1,85 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/net/can/st,stm32-bxcan.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: STMicroelectronics bxCAN controller + +description: STMicroelectronics BxCAN controller for CAN bus + +maintainers: + - Dario Binacchi <dario.binacchi@amarulasolutions.com> + +allOf: + - $ref: can-controller.yaml# + +properties: + compatible: + enum: + - st,stm32f4-bxcan + + st,can-primary: + description: + Primary and secondary mode of the bxCAN peripheral is only relevant + if the chip has two CAN peripherals. In that case they share some + of the required logic. + To avoid misunderstandings, it should be noted that ST documentation + uses the terms master/slave instead of primary/secondary. + type: boolean + + reg: + maxItems: 1 + + interrupts: + items: + - description: transmit interrupt + - description: FIFO 0 receive interrupt + - description: FIFO 1 receive interrupt + - description: status change error interrupt + + interrupt-names: + items: + - const: tx + - const: rx0 + - const: rx1 + - const: sce + + resets: + maxItems: 1 + + clocks: + maxItems: 1 + + st,gcan: + $ref: /schemas/types.yaml#/definitions/phandle-array + description: + The phandle to the gcan node which allows to access the 512-bytes + SRAM memory shared by the two bxCAN cells (CAN1 primary and CAN2 + secondary) in dual CAN peripheral configuration. + +required: + - compatible + - reg + - interrupts + - resets + - clocks + - st,gcan + +additionalProperties: false + +examples: + - | + #include <dt-bindings/clock/stm32fx-clock.h> + #include <dt-bindings/mfd/stm32f4-rcc.h> + + can1: can@40006400 { + compatible = "st,stm32f4-bxcan"; + reg = <0x40006400 0x200>; + interrupts = <19>, <20>, <21>, <22>; + interrupt-names = "tx", "rx0", "rx1", "sce"; + resets = <&rcc STM32F4_APB1_RESET(CAN1)>; + clocks = <&rcc 0 STM32F4_APB1_CLOCK(CAN1)>; + st,can-primary; + st,gcan = <&gcan>; + }; diff --git a/Documentation/devicetree/bindings/net/can/xilinx,can.yaml b/Documentation/devicetree/bindings/net/can/xilinx,can.yaml index 65af8183cb9c..897d2cbda45b 100644 --- a/Documentation/devicetree/bindings/net/can/xilinx,can.yaml +++ b/Documentation/devicetree/bindings/net/can/xilinx,can.yaml @@ -35,15 +35,15 @@ properties: maxItems: 1 tx-fifo-depth: - $ref: "/schemas/types.yaml#/definitions/uint32" + $ref: /schemas/types.yaml#/definitions/uint32 description: CAN Tx fifo depth (Zynq, Axi CAN). rx-fifo-depth: - $ref: "/schemas/types.yaml#/definitions/uint32" + $ref: /schemas/types.yaml#/definitions/uint32 description: CAN Rx fifo depth (Zynq, Axi CAN, CAN FD in sequential Rx mode) tx-mailbox-count: - $ref: "/schemas/types.yaml#/definitions/uint32" + $ref: /schemas/types.yaml#/definitions/uint32 description: CAN Tx mailbox buffer count (CAN FD) required: diff --git a/Documentation/devicetree/bindings/net/dsa/brcm,b53.yaml b/Documentation/devicetree/bindings/net/dsa/brcm,b53.yaml index 5bef4128d175..4c78c546343f 100644 --- a/Documentation/devicetree/bindings/net/dsa/brcm,b53.yaml +++ b/Documentation/devicetree/bindings/net/dsa/brcm,b53.yaml @@ -19,6 +19,7 @@ properties: - const: brcm,bcm53115 - const: brcm,bcm53125 - const: brcm,bcm53128 + - const: brcm,bcm53134 - const: brcm,bcm5365 - const: brcm,bcm5395 - const: brcm,bcm5389 @@ -57,8 +58,11 @@ properties: - items: - enum: - brcm,bcm3384-switch + - brcm,bcm6318-switch - brcm,bcm6328-switch + - brcm,bcm6362-switch - brcm,bcm6368-switch + - brcm,bcm63268-switch - const: brcm,bcm63xx-switch required: diff --git a/Documentation/devicetree/bindings/net/dsa/brcm,sf2.yaml b/Documentation/devicetree/bindings/net/dsa/brcm,sf2.yaml index eed16e216fb6..c745407f2f68 100644 --- a/Documentation/devicetree/bindings/net/dsa/brcm,sf2.yaml +++ b/Documentation/devicetree/bindings/net/dsa/brcm,sf2.yaml @@ -76,12 +76,6 @@ properties: supports reporting the number of packets in-flight in a switch queue type: boolean - "#address-cells": - const: 1 - - "#size-cells": - const: 0 - ports: type: object @@ -99,11 +93,9 @@ properties: required: - reg - interrupts - - "#address-cells" - - "#size-cells" allOf: - - $ref: "dsa.yaml#" + - $ref: dsa.yaml# - if: properties: compatible: @@ -145,8 +137,6 @@ examples: - | switch@f0b00000 { compatible = "brcm,bcm7445-switch-v4.0"; - #address-cells = <1>; - #size-cells = <0>; reg = <0xf0b00000 0x40000>, <0xf0b40000 0x110>, <0xf0b40340 0x30>, diff --git a/Documentation/devicetree/bindings/net/dsa/mediatek,mt7530.yaml b/Documentation/devicetree/bindings/net/dsa/mediatek,mt7530.yaml index 449ee0735012..e532c6b795f4 100644 --- a/Documentation/devicetree/bindings/net/dsa/mediatek,mt7530.yaml +++ b/Documentation/devicetree/bindings/net/dsa/mediatek,mt7530.yaml @@ -11,16 +11,23 @@ maintainers: - Landen Chao <Landen.Chao@mediatek.com> - DENG Qingfang <dqfext@gmail.com> - Sean Wang <sean.wang@mediatek.com> + - Daniel Golle <daniel@makrotopia.org> description: | - There are two versions of MT7530, standalone and in a multi-chip module. + There are three versions of MT7530, standalone, in a multi-chip module and + built-into a SoC. MT7530 is a part of the multi-chip module in MT7620AN, MT7620DA, MT7620DAN, MT7620NN, MT7621AT, MT7621DAT, MT7621ST and MT7623AI SoCs. + The MT7988 SoC comes with a built-in switch similar to MT7531 as well as four + Gigabit Ethernet PHYs. The switch registers are directly mapped into the SoC's + memory map rather than using MDIO. The switch got an internally connected 10G + CPU port and 4 user ports connected to the built-in Gigabit Ethernet PHYs. + MT7530 in MT7620AN, MT7620DA, MT7620DAN and MT7620NN SoCs has got 10/100 PHYs and the switch registers are directly mapped into SoC's memory map rather than - using MDIO. The DSA driver currently doesn't support this. + using MDIO. The DSA driver currently doesn't support MT7620 variants. There is only the standalone version of MT7531. @@ -81,6 +88,10 @@ properties: Multi-chip module MT7530 in MT7621AT, MT7621DAT and MT7621ST SoCs const: mediatek,mt7621 + - description: + Built-in switch of the MT7988 SoC + const: mediatek,mt7988-switch + reg: maxItems: 1 @@ -93,7 +104,7 @@ properties: gpio-controller: type: boolean - description: + description: | If defined, LED controller of the MT7530 switch will run on GPIO mode. There are 15 controllable pins. @@ -112,7 +123,7 @@ properties: maxItems: 1 io-supply: - description: + description: | Phandle to the regulator node necessary for the I/O power. See Documentation/devicetree/bindings/regulator/mt6323-regulator.txt for details for the regulator setup on these boards. @@ -124,7 +135,7 @@ properties: switch is a part of the multi-chip module. reset-gpios: - description: + description: | GPIO to reset the switch. Use this if mediatek,mcm is not used. This property is optional because some boards share the reset line with other components which makes it impossible to probe the switch if the @@ -268,6 +279,17 @@ allOf: required: - mediatek,mcm + - if: + properties: + compatible: + const: mediatek,mt7988-switch + then: + $ref: "#/$defs/mt7530-dsa-port" + properties: + gpio-controller: false + mediatek,mcm: false + reset-names: false + unevaluatedProperties: false examples: diff --git a/Documentation/devicetree/bindings/net/dsa/qca8k.yaml b/Documentation/devicetree/bindings/net/dsa/qca8k.yaml index 389892592aac..df64eebebe18 100644 --- a/Documentation/devicetree/bindings/net/dsa/qca8k.yaml +++ b/Documentation/devicetree/bindings/net/dsa/qca8k.yaml @@ -18,6 +18,8 @@ description: PHY it is connected to. In this config, an internal mdio-bus is registered and the MDIO master is used for communication. Mixed external and internal mdio-bus configurations are not supported by the hardware. + Each phy has at most 3 LEDs connected and can be declared + using the standard LEDs structure. properties: compatible: @@ -66,7 +68,7 @@ properties: With the legacy mapping the reg corresponding to the internal mdio is the switch reg with an offset of -1. -$ref: "dsa.yaml#" +$ref: dsa.yaml# patternProperties: "^(ethernet-)?ports$": @@ -117,6 +119,7 @@ unevaluatedProperties: false examples: - | #include <dt-bindings/gpio/gpio.h> + #include <dt-bindings/leds/common.h> mdio { #address-cells = <1>; @@ -226,6 +229,25 @@ examples: label = "lan1"; phy-mode = "internal"; phy-handle = <&internal_phy_port1>; + + leds { + #address-cells = <1>; + #size-cells = <0>; + + led@0 { + reg = <0>; + color = <LED_COLOR_ID_WHITE>; + function = LED_FUNCTION_LAN; + default-state = "keep"; + }; + + led@1 { + reg = <1>; + color = <LED_COLOR_ID_AMBER>; + function = LED_FUNCTION_LAN; + default-state = "keep"; + }; + }; }; port@2 { diff --git a/Documentation/devicetree/bindings/net/engleder,tsnep.yaml b/Documentation/devicetree/bindings/net/engleder,tsnep.yaml index 4116667133ce..82a5d7927ca4 100644 --- a/Documentation/devicetree/bindings/net/engleder,tsnep.yaml +++ b/Documentation/devicetree/bindings/net/engleder,tsnep.yaml @@ -62,7 +62,7 @@ properties: mdio: type: object - $ref: "mdio.yaml#" + $ref: mdio.yaml# description: optional node for embedded MDIO controller required: diff --git a/Documentation/devicetree/bindings/net/ethernet-controller.yaml b/Documentation/devicetree/bindings/net/ethernet-controller.yaml index 00be387984ac..6b0d359367da 100644 --- a/Documentation/devicetree/bindings/net/ethernet-controller.yaml +++ b/Documentation/devicetree/bindings/net/ethernet-controller.yaml @@ -205,7 +205,7 @@ properties: duplex is assumed. pause: - $ref: /schemas/types.yaml#definitions/flag + $ref: /schemas/types.yaml#/definitions/flag description: Indicates that pause should be enabled. @@ -222,6 +222,41 @@ properties: required: - speed + leds: + description: + Describes the LEDs associated by Ethernet Controller. + These LEDs are not integrated in the PHY and PHY doesn't have any + control on them. Ethernet Controller regs are used to control + these defined LEDs. + + type: object + + properties: + '#address-cells': + const: 1 + + '#size-cells': + const: 0 + + patternProperties: + '^led@[a-f0-9]+$': + $ref: /schemas/leds/common.yaml# + + properties: + reg: + maxItems: 1 + description: + This define the LED index in the PHY or the MAC. It's really + driver dependent and required for ports that define multiple + LED for the same port. + + required: + - reg + + unevaluatedProperties: false + + additionalProperties: false + dependencies: pcs-handle-names: [pcs-handle] diff --git a/Documentation/devicetree/bindings/net/ethernet-phy.yaml b/Documentation/devicetree/bindings/net/ethernet-phy.yaml index 1327b81f15a2..4f574532ee13 100644 --- a/Documentation/devicetree/bindings/net/ethernet-phy.yaml +++ b/Documentation/devicetree/bindings/net/ethernet-phy.yaml @@ -83,7 +83,7 @@ properties: 0: Disable 2.4 Vpp operating mode. 1: Request 2.4 Vpp operating mode from link partner. Absence of this property will leave configuration to default values. - $ref: "/schemas/types.yaml#/definitions/uint32" + $ref: /schemas/types.yaml#/definitions/uint32 enum: [0, 1] broken-turn-around: @@ -197,6 +197,35 @@ properties: PHY's that have configurable TX internal delays. If this property is present then the PHY applies the TX delay. + leds: + type: object + + properties: + '#address-cells': + const: 1 + + '#size-cells': + const: 0 + + patternProperties: + '^led@[a-f0-9]+$': + $ref: /schemas/leds/common.yaml# + + properties: + reg: + maxItems: 1 + description: + This define the LED index in the PHY or the MAC. It's really + driver dependent and required for ports that define multiple + LED for the same port. + + required: + - reg + + unevaluatedProperties: false + + additionalProperties: false + required: - reg @@ -204,6 +233,8 @@ additionalProperties: true examples: - | + #include <dt-bindings/leds/common.h> + ethernet { #address-cells = <1>; #size-cells = <0>; @@ -219,5 +250,17 @@ examples: reset-gpios = <&gpio1 4 1>; reset-assert-us = <1000>; reset-deassert-us = <2000>; + + leds { + #address-cells = <1>; + #size-cells = <0>; + + led@0 { + reg = <0>; + color = <LED_COLOR_ID_WHITE>; + function = LED_FUNCTION_LAN; + default-state = "keep"; + }; + }; }; }; diff --git a/Documentation/devicetree/bindings/net/ethernet-switch.yaml b/Documentation/devicetree/bindings/net/ethernet-switch.yaml index a04f8ef744aa..f1b9075dc7fb 100644 --- a/Documentation/devicetree/bindings/net/ethernet-switch.yaml +++ b/Documentation/devicetree/bindings/net/ethernet-switch.yaml @@ -40,6 +40,10 @@ patternProperties: type: object description: Ethernet switch ports + required: + - "#address-cells" + - "#size-cells" + oneOf: - required: - ports @@ -51,7 +55,7 @@ additionalProperties: true $defs: base: description: An ethernet switch without any extra port properties - $ref: '#/' + $ref: '#' patternProperties: "^(ethernet-)?port@[0-9]+$": diff --git a/Documentation/devicetree/bindings/net/fsl,fec.yaml b/Documentation/devicetree/bindings/net/fsl,fec.yaml index e6f2045f05de..b494e009326e 100644 --- a/Documentation/devicetree/bindings/net/fsl,fec.yaml +++ b/Documentation/devicetree/bindings/net/fsl,fec.yaml @@ -144,6 +144,9 @@ properties: description: Regulator that powers the Ethernet PHY. + power-domains: + maxItems: 1 + fsl,num-tx-queues: $ref: /schemas/types.yaml#/definitions/uint32 description: diff --git a/Documentation/devicetree/bindings/net/fsl,qoriq-mc-dpmac.yaml b/Documentation/devicetree/bindings/net/fsl,qoriq-mc-dpmac.yaml index 6e0763898d3a..a1b71b35319e 100644 --- a/Documentation/devicetree/bindings/net/fsl,qoriq-mc-dpmac.yaml +++ b/Documentation/devicetree/bindings/net/fsl,qoriq-mc-dpmac.yaml @@ -14,7 +14,7 @@ description: located under the 'dpmacs' node for the fsl-mc bus DTS node. allOf: - - $ref: "ethernet-controller.yaml#" + - $ref: ethernet-controller.yaml# properties: compatible: diff --git a/Documentation/devicetree/bindings/net/intel,ixp46x-ptp-timer.yaml b/Documentation/devicetree/bindings/net/intel,ixp46x-ptp-timer.yaml index 8b9b3f915d92..f92730b1d2fa 100644 --- a/Documentation/devicetree/bindings/net/intel,ixp46x-ptp-timer.yaml +++ b/Documentation/devicetree/bindings/net/intel,ixp46x-ptp-timer.yaml @@ -2,8 +2,8 @@ # Copyright 2018 Linaro Ltd. %YAML 1.2 --- -$id: "http://devicetree.org/schemas/net/intel,ixp46x-ptp-timer.yaml#" -$schema: "http://devicetree.org/meta-schemas/core.yaml#" +$id: http://devicetree.org/schemas/net/intel,ixp46x-ptp-timer.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# title: Intel IXP46x PTP Timer (TSYNC) diff --git a/Documentation/devicetree/bindings/net/intel,ixp4xx-ethernet.yaml b/Documentation/devicetree/bindings/net/intel,ixp4xx-ethernet.yaml index 4e1b79818aff..4fdc5328826c 100644 --- a/Documentation/devicetree/bindings/net/intel,ixp4xx-ethernet.yaml +++ b/Documentation/devicetree/bindings/net/intel,ixp4xx-ethernet.yaml @@ -2,13 +2,13 @@ # Copyright 2018 Linaro Ltd. %YAML 1.2 --- -$id: "http://devicetree.org/schemas/net/intel,ixp4xx-ethernet.yaml#" -$schema: "http://devicetree.org/meta-schemas/core.yaml#" +$id: http://devicetree.org/schemas/net/intel,ixp4xx-ethernet.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# title: Intel IXP4xx ethernet allOf: - - $ref: "ethernet-controller.yaml#" + - $ref: ethernet-controller.yaml# maintainers: - Linus Walleij <linus.walleij@linaro.org> @@ -28,7 +28,7 @@ properties: description: Ethernet MMIO address range queue-rx: - $ref: '/schemas/types.yaml#/definitions/phandle-array' + $ref: /schemas/types.yaml#/definitions/phandle-array items: - items: - description: phandle to the RX queue node @@ -36,7 +36,7 @@ properties: description: phandle to the RX queue on the NPE queue-txready: - $ref: '/schemas/types.yaml#/definitions/phandle-array' + $ref: /schemas/types.yaml#/definitions/phandle-array items: - items: - description: phandle to the TX READY queue node @@ -48,7 +48,7 @@ properties: phy-handle: true intel,npe-handle: - $ref: '/schemas/types.yaml#/definitions/phandle-array' + $ref: /schemas/types.yaml#/definitions/phandle-array items: - items: - description: phandle to the NPE this ethernet instance is using diff --git a/Documentation/devicetree/bindings/net/intel,ixp4xx-hss.yaml b/Documentation/devicetree/bindings/net/intel,ixp4xx-hss.yaml index e6329febb60c..7a405e9b37b2 100644 --- a/Documentation/devicetree/bindings/net/intel,ixp4xx-hss.yaml +++ b/Documentation/devicetree/bindings/net/intel,ixp4xx-hss.yaml @@ -2,8 +2,8 @@ # Copyright 2021 Linaro Ltd. %YAML 1.2 --- -$id: "http://devicetree.org/schemas/net/intel,ixp4xx-hss.yaml#" -$schema: "http://devicetree.org/meta-schemas/core.yaml#" +$id: http://devicetree.org/schemas/net/intel,ixp4xx-hss.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# title: Intel IXP4xx V.35 WAN High Speed Serial Link (HSS) @@ -24,7 +24,7 @@ properties: description: The HSS instance intel,npe-handle: - $ref: '/schemas/types.yaml#/definitions/phandle-array' + $ref: /schemas/types.yaml#/definitions/phandle-array items: items: - description: phandle to the NPE this HSS instance is using @@ -33,7 +33,7 @@ properties: and the instance to use in the second cell intel,queue-chl-rxtrig: - $ref: '/schemas/types.yaml#/definitions/phandle-array' + $ref: /schemas/types.yaml#/definitions/phandle-array items: - items: - description: phandle to the RX trigger queue on the NPE @@ -41,7 +41,7 @@ properties: description: phandle to the RX trigger queue on the NPE intel,queue-chl-txready: - $ref: '/schemas/types.yaml#/definitions/phandle-array' + $ref: /schemas/types.yaml#/definitions/phandle-array items: - items: - description: phandle to the TX ready queue on the NPE @@ -49,7 +49,7 @@ properties: description: phandle to the TX ready queue on the NPE intel,queue-pkt-rx: - $ref: '/schemas/types.yaml#/definitions/phandle-array' + $ref: /schemas/types.yaml#/definitions/phandle-array items: - items: - description: phandle to the RX queue on the NPE @@ -57,7 +57,7 @@ properties: description: phandle to the packet RX queue on the NPE intel,queue-pkt-tx: - $ref: '/schemas/types.yaml#/definitions/phandle-array' + $ref: /schemas/types.yaml#/definitions/phandle-array maxItems: 4 items: items: @@ -66,7 +66,7 @@ properties: description: phandle to the packet TX0, TX1, TX2 and TX3 queues on the NPE intel,queue-pkt-rxfree: - $ref: '/schemas/types.yaml#/definitions/phandle-array' + $ref: /schemas/types.yaml#/definitions/phandle-array maxItems: 4 items: items: @@ -76,7 +76,7 @@ properties: RXFREE3 queues on the NPE intel,queue-pkt-txdone: - $ref: '/schemas/types.yaml#/definitions/phandle-array' + $ref: /schemas/types.yaml#/definitions/phandle-array items: - items: - description: phandle to the TXDONE queue on the NPE diff --git a/Documentation/devicetree/bindings/net/marvell,mvusb.yaml b/Documentation/devicetree/bindings/net/marvell,mvusb.yaml index 8e288ab38fd7..3a3325168048 100644 --- a/Documentation/devicetree/bindings/net/marvell,mvusb.yaml +++ b/Documentation/devicetree/bindings/net/marvell,mvusb.yaml @@ -20,7 +20,7 @@ description: |+ definition. allOf: - - $ref: "mdio.yaml#" + - $ref: mdio.yaml# properties: compatible: diff --git a/Documentation/devicetree/bindings/net/marvell-bluetooth.yaml b/Documentation/devicetree/bindings/net/marvell-bluetooth.yaml index 309ef21a1e37..188a42ca6ceb 100644 --- a/Documentation/devicetree/bindings/net/marvell-bluetooth.yaml +++ b/Documentation/devicetree/bindings/net/marvell-bluetooth.yaml @@ -1,8 +1,8 @@ # SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause) %YAML 1.2 --- -$id: "http://devicetree.org/schemas/net/marvell-bluetooth.yaml#" -$schema: "http://devicetree.org/meta-schemas/core.yaml#" +$id: http://devicetree.org/schemas/net/marvell-bluetooth.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# title: Marvell Bluetooth chips @@ -15,11 +15,29 @@ maintainers: properties: compatible: - const: mrvl,88w8897 + enum: + - mrvl,88w8897 + - mrvl,88w8997 + + max-speed: + description: see Documentation/devicetree/bindings/serial/serial.yaml required: - compatible +allOf: + - if: + properties: + compatible: + contains: + const: mrvl,88w8997 + then: + properties: + max-speed: true + else: + properties: + max-speed: false + additionalProperties: false examples: diff --git a/Documentation/devicetree/bindings/net/mdio-gpio.yaml b/Documentation/devicetree/bindings/net/mdio-gpio.yaml index 1d83b8dcce2c..dca1aec119e3 100644 --- a/Documentation/devicetree/bindings/net/mdio-gpio.yaml +++ b/Documentation/devicetree/bindings/net/mdio-gpio.yaml @@ -12,7 +12,7 @@ maintainers: - Russell King <linux@armlinux.org.uk> allOf: - - $ref: "mdio.yaml#" + - $ref: mdio.yaml# properties: compatible: diff --git a/Documentation/devicetree/bindings/net/mediatek,net.yaml b/Documentation/devicetree/bindings/net/mediatek,net.yaml index 7ef696204c5a..acb2b2ac4fe1 100644 --- a/Documentation/devicetree/bindings/net/mediatek,net.yaml +++ b/Documentation/devicetree/bindings/net/mediatek,net.yaml @@ -21,6 +21,7 @@ properties: - mediatek,mt7623-eth - mediatek,mt7622-eth - mediatek,mt7629-eth + - mediatek,mt7981-eth - mediatek,mt7986-eth - ralink,rt5350-eth @@ -78,6 +79,11 @@ properties: description: List of phandles to wireless ethernet dispatch nodes. + mediatek,wed-pcie: + $ref: /schemas/types.yaml#/definitions/phandle + description: + Phandle to the mediatek wed-pcie controller. + dma-coherent: true mdio-bus: @@ -91,7 +97,7 @@ properties: const: 0 allOf: - - $ref: "ethernet-controller.yaml#" + - $ref: ethernet-controller.yaml# - if: properties: compatible: @@ -123,6 +129,8 @@ allOf: mediatek,wed: false + mediatek,wed-pcie: false + - if: properties: compatible: @@ -160,6 +168,8 @@ allOf: description: Phandle to the mediatek pcie-mirror controller. + mediatek,wed-pcie: false + - if: properties: compatible: @@ -206,6 +216,44 @@ allOf: mediatek,wed: false + mediatek,wed-pcie: false + + - if: + properties: + compatible: + contains: + const: mediatek,mt7981-eth + then: + properties: + interrupts: + minItems: 4 + + clocks: + minItems: 15 + maxItems: 15 + + clock-names: + items: + - const: fe + - const: gp2 + - const: gp1 + - const: wocpu0 + - const: sgmii_ck + - const: sgmii_tx250m + - const: sgmii_rx250m + - const: sgmii_cdr_ref + - const: sgmii_cdr_fb + - const: sgmii2_tx250m + - const: sgmii2_rx250m + - const: sgmii2_cdr_ref + - const: sgmii2_cdr_fb + - const: netsys0 + - const: netsys1 + + mediatek,sgmiisys: + minItems: 2 + maxItems: 2 + - if: properties: compatible: @@ -242,11 +290,6 @@ allOf: minItems: 2 maxItems: 2 - mediatek,wed-pcie: - $ref: /schemas/types.yaml#/definitions/phandle - description: - Phandle to the mediatek wed-pcie controller. - patternProperties: "^mac@[0-1]$": type: object diff --git a/Documentation/devicetree/bindings/net/mediatek,star-emac.yaml b/Documentation/devicetree/bindings/net/mediatek,star-emac.yaml index 64c893c98d80..2e889f9a563e 100644 --- a/Documentation/devicetree/bindings/net/mediatek,star-emac.yaml +++ b/Documentation/devicetree/bindings/net/mediatek,star-emac.yaml @@ -15,7 +15,7 @@ description: modes with flow-control as well as CRC offloading and VLAN tags. allOf: - - $ref: "ethernet-controller.yaml#" + - $ref: ethernet-controller.yaml# properties: compatible: diff --git a/Documentation/devicetree/bindings/net/microchip,lan966x-switch.yaml b/Documentation/devicetree/bindings/net/microchip,lan966x-switch.yaml index dc116f14750e..306ef9ecf2b9 100644 --- a/Documentation/devicetree/bindings/net/microchip,lan966x-switch.yaml +++ b/Documentation/devicetree/bindings/net/microchip,lan966x-switch.yaml @@ -73,7 +73,7 @@ properties: "^port@[0-9a-f]+$": type: object - $ref: "/schemas/net/ethernet-controller.yaml#" + $ref: /schemas/net/ethernet-controller.yaml# unevaluatedProperties: false properties: diff --git a/Documentation/devicetree/bindings/net/microchip,sparx5-switch.yaml b/Documentation/devicetree/bindings/net/microchip,sparx5-switch.yaml index 57ffeb8fc876..fcafef8d5a33 100644 --- a/Documentation/devicetree/bindings/net/microchip,sparx5-switch.yaml +++ b/Documentation/devicetree/bindings/net/microchip,sparx5-switch.yaml @@ -99,7 +99,7 @@ properties: microchip,bandwidth: description: Specifies bandwidth in Mbit/s allocated to the port. - $ref: "/schemas/types.yaml#/definitions/uint32" + $ref: /schemas/types.yaml#/definitions/uint32 maximum: 25000 microchip,sd-sgpio: @@ -107,7 +107,7 @@ properties: Index of the ports Signal Detect SGPIO in the set of 384 SGPIOs This is optional, and only needed if the default used index is is not correct. - $ref: "/schemas/types.yaml#/definitions/uint32" + $ref: /schemas/types.yaml#/definitions/uint32 minimum: 0 maximum: 383 diff --git a/Documentation/devicetree/bindings/net/mscc,miim.yaml b/Documentation/devicetree/bindings/net/mscc,miim.yaml index 2c451cfa4e0b..5b292e7c9e46 100644 --- a/Documentation/devicetree/bindings/net/mscc,miim.yaml +++ b/Documentation/devicetree/bindings/net/mscc,miim.yaml @@ -10,7 +10,7 @@ maintainers: - Alexandre Belloni <alexandre.belloni@bootlin.com> allOf: - - $ref: "mdio.yaml#" + - $ref: mdio.yaml# properties: compatible: diff --git a/Documentation/devicetree/bindings/net/nfc/marvell,nci.yaml b/Documentation/devicetree/bindings/net/nfc/marvell,nci.yaml index 308485a8ee6c..8e9a95f24c80 100644 --- a/Documentation/devicetree/bindings/net/nfc/marvell,nci.yaml +++ b/Documentation/devicetree/bindings/net/nfc/marvell,nci.yaml @@ -28,7 +28,7 @@ properties: maxItems: 1 reset-n-io: - $ref: "/schemas/types.yaml#/definitions/phandle-array" + $ref: /schemas/types.yaml#/definitions/phandle-array maxItems: 1 description: | Output GPIO pin used to reset the chip (active low) diff --git a/Documentation/devicetree/bindings/net/nfc/nxp,pn532.yaml b/Documentation/devicetree/bindings/net/nfc/nxp,pn532.yaml index 0509e0166345..07c67c1e985f 100644 --- a/Documentation/devicetree/bindings/net/nfc/nxp,pn532.yaml +++ b/Documentation/devicetree/bindings/net/nfc/nxp,pn532.yaml @@ -31,7 +31,7 @@ required: - compatible dependencies: - interrupts: [ 'reg' ] + interrupts: [ reg ] additionalProperties: false diff --git a/Documentation/devicetree/bindings/net/pcs/mediatek,sgmiisys.yaml b/Documentation/devicetree/bindings/net/pcs/mediatek,sgmiisys.yaml new file mode 100644 index 000000000000..66a95191bd77 --- /dev/null +++ b/Documentation/devicetree/bindings/net/pcs/mediatek,sgmiisys.yaml @@ -0,0 +1,55 @@ +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/net/pcs/mediatek,sgmiisys.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: MediaTek SGMIISYS Controller + +maintainers: + - Matthias Brugger <matthias.bgg@gmail.com> + +description: + The MediaTek SGMIISYS controller provides a SGMII PCS and some clocks + to the ethernet subsystem to which it is attached. + +properties: + compatible: + items: + - enum: + - mediatek,mt7622-sgmiisys + - mediatek,mt7629-sgmiisys + - mediatek,mt7981-sgmiisys_0 + - mediatek,mt7981-sgmiisys_1 + - mediatek,mt7986-sgmiisys_0 + - mediatek,mt7986-sgmiisys_1 + - const: syscon + + reg: + maxItems: 1 + + '#clock-cells': + const: 1 + + mediatek,pnswap: + description: Invert polarity of the SGMII data lanes + type: boolean + +required: + - compatible + - reg + - '#clock-cells' + +additionalProperties: false + +examples: + - | + soc { + #address-cells = <2>; + #size-cells = <2>; + sgmiisys: syscon@1b128000 { + compatible = "mediatek,mt7622-sgmiisys", "syscon"; + reg = <0 0x1b128000 0 0x1000>; + #clock-cells = <1>; + }; + }; diff --git a/Documentation/devicetree/bindings/net/pse-pd/podl-pse-regulator.yaml b/Documentation/devicetree/bindings/net/pse-pd/podl-pse-regulator.yaml index c6b1c188abf7..94a527e6aa1b 100644 --- a/Documentation/devicetree/bindings/net/pse-pd/podl-pse-regulator.yaml +++ b/Documentation/devicetree/bindings/net/pse-pd/podl-pse-regulator.yaml @@ -13,7 +13,7 @@ description: Regulator based PoDL PSE controller. The device must be referenced by the PHY node to control power injection to the Ethernet cable. allOf: - - $ref: "pse-controller.yaml#" + - $ref: pse-controller.yaml# properties: compatible: diff --git a/Documentation/devicetree/bindings/net/qcom,ethqos.txt b/Documentation/devicetree/bindings/net/qcom,ethqos.txt deleted file mode 100644 index 1f5746849a71..000000000000 --- a/Documentation/devicetree/bindings/net/qcom,ethqos.txt +++ /dev/null @@ -1,66 +0,0 @@ -Qualcomm Ethernet ETHQOS device - -This documents dwmmac based ethernet device which supports Gigabit -ethernet for version v2.3.0 onwards. - -This device has following properties: - -Required properties: - -- compatible: Should be one of: - "qcom,qcs404-ethqos" - "qcom,sm8150-ethqos" - -- reg: Address and length of the register set for the device - -- reg-names: Should contain register names "stmmaceth", "rgmii" - -- clocks: Should contain phandle to clocks - -- clock-names: Should contain clock names "stmmaceth", "pclk", - "ptp_ref", "rgmii" - -- interrupts: Should contain phandle to interrupts - -- interrupt-names: Should contain interrupt names "macirq", "eth_lpi" - -Rest of the properties are defined in stmmac.txt file in same directory - - -Example: - -ethernet: ethernet@7a80000 { - compatible = "qcom,qcs404-ethqos"; - reg = <0x07a80000 0x10000>, - <0x07a96000 0x100>; - reg-names = "stmmaceth", "rgmii"; - clock-names = "stmmaceth", "pclk", "ptp_ref", "rgmii"; - clocks = <&gcc GCC_ETH_AXI_CLK>, - <&gcc GCC_ETH_SLAVE_AHB_CLK>, - <&gcc GCC_ETH_PTP_CLK>, - <&gcc GCC_ETH_RGMII_CLK>; - interrupts = <GIC_SPI 56 IRQ_TYPE_LEVEL_HIGH>, - <GIC_SPI 55 IRQ_TYPE_LEVEL_HIGH>; - interrupt-names = "macirq", "eth_lpi"; - snps,reset-gpio = <&tlmm 60 GPIO_ACTIVE_LOW>; - snps,reset-active-low; - - snps,txpbl = <8>; - snps,rxpbl = <2>; - snps,aal; - snps,tso; - - phy-handle = <&phy1>; - phy-mode = "rgmii"; - - mdio { - #address-cells = <0x1>; - #size-cells = <0x0>; - compatible = "snps,dwmac-mdio"; - phy1: phy@4 { - device_type = "ethernet-phy"; - reg = <0x4>; - }; - }; - -}; diff --git a/Documentation/devicetree/bindings/net/qcom,ethqos.yaml b/Documentation/devicetree/bindings/net/qcom,ethqos.yaml new file mode 100644 index 000000000000..60a38044fb19 --- /dev/null +++ b/Documentation/devicetree/bindings/net/qcom,ethqos.yaml @@ -0,0 +1,111 @@ +# SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/net/qcom,ethqos.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Qualcomm Ethernet ETHQOS device + +maintainers: + - Bhupesh Sharma <bhupesh.sharma@linaro.org> + +description: + dwmmac based Qualcomm ethernet devices which support Gigabit + ethernet (version v2.3.0 and onwards). + +allOf: + - $ref: snps,dwmac.yaml# + +properties: + compatible: + enum: + - qcom,qcs404-ethqos + - qcom,sc8280xp-ethqos + - qcom,sm8150-ethqos + + reg: + maxItems: 2 + + reg-names: + items: + - const: stmmaceth + - const: rgmii + + interrupts: + items: + - description: Combined signal for various interrupt events + - description: The interrupt that occurs when Rx exits the LPI state + + interrupt-names: + items: + - const: macirq + - const: eth_lpi + + clocks: + maxItems: 4 + + clock-names: + items: + - const: stmmaceth + - const: pclk + - const: ptp_ref + - const: rgmii + + iommus: + maxItems: 1 + +required: + - compatible + - clocks + - clock-names + - reg-names + +unevaluatedProperties: false + +examples: + - | + #include <dt-bindings/interrupt-controller/arm-gic.h> + #include <dt-bindings/clock/qcom,gcc-qcs404.h> + #include <dt-bindings/gpio/gpio.h> + + ethernet: ethernet@7a80000 { + compatible = "qcom,qcs404-ethqos"; + reg = <0x07a80000 0x10000>, + <0x07a96000 0x100>; + reg-names = "stmmaceth", "rgmii"; + clock-names = "stmmaceth", "pclk", "ptp_ref", "rgmii"; + clocks = <&gcc GCC_ETH_AXI_CLK>, + <&gcc GCC_ETH_SLAVE_AHB_CLK>, + <&gcc GCC_ETH_PTP_CLK>, + <&gcc GCC_ETH_RGMII_CLK>; + interrupts = <GIC_SPI 56 IRQ_TYPE_LEVEL_HIGH>, + <GIC_SPI 55 IRQ_TYPE_LEVEL_HIGH>; + interrupt-names = "macirq", "eth_lpi"; + + rx-fifo-depth = <4096>; + tx-fifo-depth = <4096>; + + snps,tso; + snps,reset-gpio = <&tlmm 60 GPIO_ACTIVE_LOW>; + snps,reset-active-low; + snps,reset-delays-us = <0 10000 10000>; + + pinctrl-names = "default"; + pinctrl-0 = <ðernet_defaults>; + + phy-handle = <&phy1>; + phy-mode = "rgmii"; + mdio { + #address-cells = <0x1>; + #size-cells = <0x0>; + + compatible = "snps,dwmac-mdio"; + phy1: phy@4 { + compatible = "ethernet-phy-ieee802.3-c22"; + device_type = "ethernet-phy"; + reg = <0x4>; + + #phy-cells = <0>; + }; + }; + }; diff --git a/Documentation/devicetree/bindings/net/qcom,ipa.yaml b/Documentation/devicetree/bindings/net/qcom,ipa.yaml index 4aeda379726f..2d5e4ffb2f9e 100644 --- a/Documentation/devicetree/bindings/net/qcom,ipa.yaml +++ b/Documentation/devicetree/bindings/net/qcom,ipa.yaml @@ -49,6 +49,7 @@ properties: - qcom,sc7280-ipa - qcom,sdm845-ipa - qcom,sdx55-ipa + - qcom,sdx65-ipa - qcom,sm6350-ipa - qcom,sm8350-ipa diff --git a/Documentation/devicetree/bindings/net/qcom,ipq4019-mdio.yaml b/Documentation/devicetree/bindings/net/qcom,ipq4019-mdio.yaml index 7631ecc8fd01..3407e909e8a7 100644 --- a/Documentation/devicetree/bindings/net/qcom,ipq4019-mdio.yaml +++ b/Documentation/devicetree/bindings/net/qcom,ipq4019-mdio.yaml @@ -51,7 +51,7 @@ required: - "#size-cells" allOf: - - $ref: "mdio.yaml#" + - $ref: mdio.yaml# - if: properties: diff --git a/Documentation/devicetree/bindings/net/qcom,ipq8064-mdio.yaml b/Documentation/devicetree/bindings/net/qcom,ipq8064-mdio.yaml index d7748dd33199..164704338ef0 100644 --- a/Documentation/devicetree/bindings/net/qcom,ipq8064-mdio.yaml +++ b/Documentation/devicetree/bindings/net/qcom,ipq8064-mdio.yaml @@ -14,7 +14,7 @@ description: used to communicate with the gmac phy connected. allOf: - - $ref: "mdio.yaml#" + - $ref: mdio.yaml# properties: compatible: @@ -53,7 +53,9 @@ examples: reg = <0x10>; ports { - /* ... */ + #address-cells = <1>; + #size-cells = <0>; + /* ... */ }; }; }; diff --git a/Documentation/devicetree/bindings/net/realtek-bluetooth.yaml b/Documentation/devicetree/bindings/net/realtek-bluetooth.yaml index 143b5667abad..8cc2b9924680 100644 --- a/Documentation/devicetree/bindings/net/realtek-bluetooth.yaml +++ b/Documentation/devicetree/bindings/net/realtek-bluetooth.yaml @@ -4,24 +4,30 @@ $id: http://devicetree.org/schemas/net/realtek-bluetooth.yaml# $schema: http://devicetree.org/meta-schemas/core.yaml# -title: RTL8723BS/RTL8723CS/RTL8822CS Bluetooth +title: RTL8723BS/RTL8723CS/RTL8821CS/RTL8822CS Bluetooth maintainers: - Vasily Khoruzhick <anarsoul@gmail.com> - Alistair Francis <alistair@alistair23.me> description: - RTL8723CS/RTL8723CS/RTL8822CS is WiFi + BT chip. WiFi part is connected over - SDIO, while BT is connected over serial. It speaks H5 protocol with few - extra commands to upload firmware and change module speed. + RTL8723CS/RTL8723CS/RTL8821CS/RTL8822CS is a WiFi + BT chip. WiFi part + is connected over SDIO, while BT is connected over serial. It speaks + H5 protocol with few extra commands to upload firmware and change + module speed. properties: compatible: - enum: - - realtek,rtl8723bs-bt - - realtek,rtl8723cs-bt - - realtek,rtl8723ds-bt - - realtek,rtl8822cs-bt + oneOf: + - enum: + - realtek,rtl8723bs-bt + - realtek,rtl8723cs-bt + - realtek,rtl8723ds-bt + - realtek,rtl8822cs-bt + - items: + - enum: + - realtek,rtl8821cs-bt + - const: realtek,rtl8822cs-bt device-wake-gpios: maxItems: 1 diff --git a/Documentation/devicetree/bindings/net/rockchip,emac.yaml b/Documentation/devicetree/bindings/net/rockchip,emac.yaml index a6d4f14df442..364028b3bba4 100644 --- a/Documentation/devicetree/bindings/net/rockchip,emac.yaml +++ b/Documentation/devicetree/bindings/net/rockchip,emac.yaml @@ -61,7 +61,7 @@ required: - mdio allOf: - - $ref: "ethernet-controller.yaml#" + - $ref: ethernet-controller.yaml# - if: properties: compatible: diff --git a/Documentation/devicetree/bindings/net/rockchip-dwmac.yaml b/Documentation/devicetree/bindings/net/rockchip-dwmac.yaml index 04936632fcbb..2a21bbe02892 100644 --- a/Documentation/devicetree/bindings/net/rockchip-dwmac.yaml +++ b/Documentation/devicetree/bindings/net/rockchip-dwmac.yaml @@ -1,8 +1,8 @@ # SPDX-License-Identifier: GPL-2.0 %YAML 1.2 --- -$id: "http://devicetree.org/schemas/net/rockchip-dwmac.yaml#" -$schema: "http://devicetree.org/meta-schemas/core.yaml#" +$id: http://devicetree.org/schemas/net/rockchip-dwmac.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# title: Rockchip 10/100/1000 Ethernet driver(GMAC) diff --git a/Documentation/devicetree/bindings/net/sff,sfp.yaml b/Documentation/devicetree/bindings/net/sff,sfp.yaml index 231c4d75e4b1..973e478a399d 100644 --- a/Documentation/devicetree/bindings/net/sff,sfp.yaml +++ b/Documentation/devicetree/bindings/net/sff,sfp.yaml @@ -1,8 +1,8 @@ # SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) %YAML 1.2 --- -$id: "http://devicetree.org/schemas/net/sff,sfp.yaml#" -$schema: "http://devicetree.org/meta-schemas/core.yaml#" +$id: http://devicetree.org/schemas/net/sff,sfp.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# title: Small Form Factor (SFF) Committee Small Form-factor Pluggable (SFP) Transceiver diff --git a/Documentation/devicetree/bindings/net/snps,dwmac.yaml b/Documentation/devicetree/bindings/net/snps,dwmac.yaml index 16b7d2904696..363b3e3ea3a6 100644 --- a/Documentation/devicetree/bindings/net/snps,dwmac.yaml +++ b/Documentation/devicetree/bindings/net/snps,dwmac.yaml @@ -30,6 +30,7 @@ select: - snps,dwmac-4.10a - snps,dwmac-4.20a - snps,dwmac-5.10a + - snps,dwmac-5.20 - snps,dwxgmac - snps,dwxgmac-2.10 @@ -65,6 +66,9 @@ properties: - ingenic,x2000-mac - loongson,ls2k-dwmac - loongson,ls7a-dwmac + - qcom,qcs404-ethqos + - qcom,sc8280xp-ethqos + - qcom,sm8150-ethqos - renesas,r9a06g032-gmac - renesas,rzn1-gmac - rockchip,px30-gmac @@ -87,8 +91,10 @@ properties: - snps,dwmac-4.10a - snps,dwmac-4.20a - snps,dwmac-5.10a + - snps,dwmac-5.20 - snps,dwxgmac - snps,dwxgmac-2.10 + - starfive,jh7110-dwmac reg: minItems: 1 @@ -105,7 +111,7 @@ properties: minItems: 1 items: - const: macirq - - const: eth_wake_irq + - enum: [eth_wake_irq, eth_lpi] - const: eth_lpi clocks: @@ -131,12 +137,16 @@ properties: - ptp_ref resets: - maxItems: 1 - description: - MAC Reset signal. + minItems: 1 + items: + - description: GMAC stmmaceth reset + - description: AHB reset reset-names: - const: stmmaceth + minItems: 1 + items: + - const: stmmaceth + - const: ahb power-domains: maxItems: 1 @@ -555,7 +565,7 @@ dependencies: snps,reset-delays-us: ["snps,reset-gpio"] allOf: - - $ref: "ethernet-controller.yaml#" + - $ref: ethernet-controller.yaml# - if: properties: compatible: @@ -572,9 +582,11 @@ allOf: - ingenic,x1600-mac - ingenic,x1830-mac - ingenic,x2000-mac + - qcom,sc8280xp-ethqos - snps,dwmac-3.50a - snps,dwmac-4.10a - snps,dwmac-4.20a + - snps,dwmac-5.20 - snps,dwxgmac - snps,dwxgmac-2.10 - st,spear600-gmac @@ -625,10 +637,14 @@ allOf: - ingenic,x1600-mac - ingenic,x1830-mac - ingenic,x2000-mac + - qcom,qcs404-ethqos + - qcom,sc8280xp-ethqos + - qcom,sm8150-ethqos - snps,dwmac-4.00 - snps,dwmac-4.10a - snps,dwmac-4.20a - snps,dwmac-5.10a + - snps,dwmac-5.20 - snps,dwxgmac - snps,dwxgmac-2.10 - st,spear600-gmac diff --git a/Documentation/devicetree/bindings/net/starfive,jh7110-dwmac.yaml b/Documentation/devicetree/bindings/net/starfive,jh7110-dwmac.yaml new file mode 100644 index 000000000000..5e7cfbbebce6 --- /dev/null +++ b/Documentation/devicetree/bindings/net/starfive,jh7110-dwmac.yaml @@ -0,0 +1,144 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +# Copyright (C) 2022 StarFive Technology Co., Ltd. +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/net/starfive,jh7110-dwmac.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: StarFive JH7110 DWMAC glue layer + +maintainers: + - Emil Renner Berthing <kernel@esmil.dk> + - Samin Guo <samin.guo@starfivetech.com> + +select: + properties: + compatible: + contains: + enum: + - starfive,jh7110-dwmac + required: + - compatible + +properties: + compatible: + items: + - enum: + - starfive,jh7110-dwmac + - const: snps,dwmac-5.20 + + reg: + maxItems: 1 + + clocks: + items: + - description: GMAC main clock + - description: GMAC AHB clock + - description: PTP clock + - description: TX clock + - description: GTX clock + + clock-names: + items: + - const: stmmaceth + - const: pclk + - const: ptp_ref + - const: tx + - const: gtx + + interrupts: + minItems: 3 + maxItems: 3 + + interrupt-names: + minItems: 3 + maxItems: 3 + + resets: + items: + - description: MAC Reset signal. + - description: AHB Reset signal. + + reset-names: + items: + - const: stmmaceth + - const: ahb + + starfive,tx-use-rgmii-clk: + description: + Tx clock is provided by external rgmii clock. + type: boolean + + starfive,syscon: + $ref: /schemas/types.yaml#/definitions/phandle-array + items: + - items: + - description: phandle to syscon that configures phy mode + - description: Offset of phy mode selection + - description: Shift of phy mode selection + description: + A phandle to syscon with two arguments that configure phy mode. + The argument one is the offset of phy mode selection, the + argument two is the shift of phy mode selection. + +required: + - compatible + - reg + - clocks + - clock-names + - interrupts + - interrupt-names + - resets + - reset-names + +allOf: + - $ref: snps,dwmac.yaml# + +unevaluatedProperties: false + +examples: + - | + ethernet@16030000 { + compatible = "starfive,jh7110-dwmac", "snps,dwmac-5.20"; + reg = <0x16030000 0x10000>; + clocks = <&clk 3>, <&clk 2>, <&clk 109>, + <&clk 6>, <&clk 111>; + clock-names = "stmmaceth", "pclk", "ptp_ref", + "tx", "gtx"; + resets = <&rst 1>, <&rst 2>; + reset-names = "stmmaceth", "ahb"; + interrupts = <7>, <6>, <5>; + interrupt-names = "macirq", "eth_wake_irq", "eth_lpi"; + phy-mode = "rgmii-id"; + snps,multicast-filter-bins = <64>; + snps,perfect-filter-entries = <8>; + rx-fifo-depth = <2048>; + tx-fifo-depth = <2048>; + snps,fixed-burst; + snps,no-pbl-x8; + snps,tso; + snps,force_thresh_dma_mode; + snps,axi-config = <&stmmac_axi_setup>; + snps,en-tx-lpi-clockgating; + snps,txpbl = <16>; + snps,rxpbl = <16>; + starfive,syscon = <&aon_syscon 0xc 0x12>; + phy-handle = <&phy0>; + + mdio { + #address-cells = <1>; + #size-cells = <0>; + compatible = "snps,dwmac-mdio"; + + phy0: ethernet-phy@0 { + reg = <0>; + }; + }; + + stmmac_axi_setup: stmmac-axi-config { + snps,lpi_en; + snps,wr_osr_lmt = <4>; + snps,rd_osr_lmt = <4>; + snps,blen = <256 128 64 32 0 0 0>; + }; + }; diff --git a/Documentation/devicetree/bindings/net/stm32-dwmac.yaml b/Documentation/devicetree/bindings/net/stm32-dwmac.yaml index 5c93167b3b41..fc8c96b08d7d 100644 --- a/Documentation/devicetree/bindings/net/stm32-dwmac.yaml +++ b/Documentation/devicetree/bindings/net/stm32-dwmac.yaml @@ -2,8 +2,8 @@ # Copyright 2019 BayLibre, SAS %YAML 1.2 --- -$id: "http://devicetree.org/schemas/net/stm32-dwmac.yaml#" -$schema: "http://devicetree.org/meta-schemas/core.yaml#" +$id: http://devicetree.org/schemas/net/stm32-dwmac.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# title: STMicroelectronics STM32 / MCU DWMAC glue layer controller @@ -26,7 +26,7 @@ select: - compatible allOf: - - $ref: "snps,dwmac.yaml#" + - $ref: snps,dwmac.yaml# properties: compatible: @@ -73,7 +73,7 @@ properties: - ptp_ref st,syscon: - $ref: "/schemas/types.yaml#/definitions/phandle-array" + $ref: /schemas/types.yaml#/definitions/phandle-array items: - items: - description: phandle to the syscon node which encompases the glue register diff --git a/Documentation/devicetree/bindings/net/ti,cpsw-switch.yaml b/Documentation/devicetree/bindings/net/ti,cpsw-switch.yaml index e36c7817be69..b04ac4966608 100644 --- a/Documentation/devicetree/bindings/net/ti,cpsw-switch.yaml +++ b/Documentation/devicetree/bindings/net/ti,cpsw-switch.yaml @@ -62,10 +62,10 @@ properties: interrupt-names: items: - - const: "rx_thresh" - - const: "rx" - - const: "tx" - - const: "misc" + - const: rx_thresh + - const: rx + - const: tx + - const: misc pinctrl-names: true @@ -154,7 +154,7 @@ patternProperties: type: object description: CPSW MDIO bus. - $ref: "ti,davinci-mdio.yaml#" + $ref: ti,davinci-mdio.yaml# required: diff --git a/Documentation/devicetree/bindings/net/ti,davinci-mdio.yaml b/Documentation/devicetree/bindings/net/ti,davinci-mdio.yaml index a339202c5e8e..53604fab0b73 100644 --- a/Documentation/devicetree/bindings/net/ti,davinci-mdio.yaml +++ b/Documentation/devicetree/bindings/net/ti,davinci-mdio.yaml @@ -13,7 +13,7 @@ description: TI SoC Davinci/Keystone2 MDIO Controller allOf: - - $ref: "mdio.yaml#" + - $ref: mdio.yaml# properties: compatible: diff --git a/Documentation/devicetree/bindings/net/ti,dp83822.yaml b/Documentation/devicetree/bindings/net/ti,dp83822.yaml index f2489a9c852f..db74474207ed 100644 --- a/Documentation/devicetree/bindings/net/ti,dp83822.yaml +++ b/Documentation/devicetree/bindings/net/ti,dp83822.yaml @@ -2,8 +2,8 @@ # Copyright (C) 2020 Texas Instruments Incorporated %YAML 1.2 --- -$id: "http://devicetree.org/schemas/net/ti,dp83822.yaml#" -$schema: "http://devicetree.org/meta-schemas/core.yaml#" +$id: http://devicetree.org/schemas/net/ti,dp83822.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# title: TI DP83822 ethernet PHY @@ -21,7 +21,7 @@ description: | http://www.ti.com/lit/ds/symlink/dp83822i.pdf allOf: - - $ref: "ethernet-phy.yaml#" + - $ref: ethernet-phy.yaml# properties: reg: diff --git a/Documentation/devicetree/bindings/net/ti,dp83867.yaml b/Documentation/devicetree/bindings/net/ti,dp83867.yaml index b8c0e4b5b494..4bc1f98fd9fe 100644 --- a/Documentation/devicetree/bindings/net/ti,dp83867.yaml +++ b/Documentation/devicetree/bindings/net/ti,dp83867.yaml @@ -2,13 +2,13 @@ # Copyright (C) 2019 Texas Instruments Incorporated %YAML 1.2 --- -$id: "http://devicetree.org/schemas/net/ti,dp83867.yaml#" -$schema: "http://devicetree.org/meta-schemas/core.yaml#" +$id: http://devicetree.org/schemas/net/ti,dp83867.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# title: TI DP83867 ethernet PHY allOf: - - $ref: "ethernet-controller.yaml#" + - $ref: ethernet-controller.yaml# maintainers: - Andrew Davis <afd@ti.com> diff --git a/Documentation/devicetree/bindings/net/ti,dp83869.yaml b/Documentation/devicetree/bindings/net/ti,dp83869.yaml index b04ff0014a59..fb6725df4668 100644 --- a/Documentation/devicetree/bindings/net/ti,dp83869.yaml +++ b/Documentation/devicetree/bindings/net/ti,dp83869.yaml @@ -2,13 +2,13 @@ # Copyright (C) 2019 Texas Instruments Incorporated %YAML 1.2 --- -$id: "http://devicetree.org/schemas/net/ti,dp83869.yaml#" -$schema: "http://devicetree.org/meta-schemas/core.yaml#" +$id: http://devicetree.org/schemas/net/ti,dp83869.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# title: TI DP83869 ethernet PHY allOf: - - $ref: "ethernet-phy.yaml#" + - $ref: ethernet-phy.yaml# maintainers: - Andrew Davis <afd@ti.com> diff --git a/Documentation/devicetree/bindings/net/ti,k3-am654-cpsw-nuss.yaml b/Documentation/devicetree/bindings/net/ti,k3-am654-cpsw-nuss.yaml index 837d6db299c4..395a4650e285 100644 --- a/Documentation/devicetree/bindings/net/ti,k3-am654-cpsw-nuss.yaml +++ b/Documentation/devicetree/bindings/net/ti,k3-am654-cpsw-nuss.yaml @@ -54,11 +54,12 @@ properties: compatible: enum: + - ti,am642-cpsw-nuss - ti,am654-cpsw-nuss - ti,j7200-cpswxg-nuss - ti,j721e-cpsw-nuss - ti,j721e-cpswxg-nuss - - ti,am642-cpsw-nuss + - ti,j784s4-cpswxg-nuss reg: maxItems: 1 @@ -126,8 +127,18 @@ properties: description: CPSW port number phys: - maxItems: 1 - description: phandle on phy-gmii-sel PHY + minItems: 1 + items: + - description: CPSW MAC's PHY. + - description: Serdes PHY. Serdes PHY is required only if + the Serdes has to be configured in the + Single-Link configuration. + + phy-names: + minItems: 1 + items: + - const: mac + - const: serdes label: description: label associated with this port @@ -187,7 +198,9 @@ allOf: properties: compatible: contains: - const: ti,j721e-cpswxg-nuss + enum: + - ti,j721e-cpswxg-nuss + - ti,j784s4-cpswxg-nuss then: properties: ethernet-ports: @@ -205,8 +218,9 @@ allOf: compatible: contains: enum: - - ti,j721e-cpswxg-nuss - ti,j7200-cpswxg-nuss + - ti,j721e-cpswxg-nuss + - ti,j784s4-cpswxg-nuss then: properties: ethernet-ports: diff --git a/Documentation/devicetree/bindings/net/toshiba,visconti-dwmac.yaml b/Documentation/devicetree/bindings/net/toshiba,visconti-dwmac.yaml index 0988ed8d1c12..474fa8bcf302 100644 --- a/Documentation/devicetree/bindings/net/toshiba,visconti-dwmac.yaml +++ b/Documentation/devicetree/bindings/net/toshiba,visconti-dwmac.yaml @@ -1,8 +1,8 @@ # SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause) %YAML 1.2 --- -$id: "http://devicetree.org/schemas/net/toshiba,visconti-dwmac.yaml#" -$schema: "http://devicetree.org/meta-schemas/core.yaml#" +$id: http://devicetree.org/schemas/net/toshiba,visconti-dwmac.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# title: Toshiba Visconti DWMAC Ethernet controller diff --git a/Documentation/devicetree/bindings/net/vertexcom-mse102x.yaml b/Documentation/devicetree/bindings/net/vertexcom-mse102x.yaml index 6a71f694cb55..4c4ced8cfa4b 100644 --- a/Documentation/devicetree/bindings/net/vertexcom-mse102x.yaml +++ b/Documentation/devicetree/bindings/net/vertexcom-mse102x.yaml @@ -1,8 +1,8 @@ # SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) %YAML 1.2 --- -$id: "http://devicetree.org/schemas/net/vertexcom-mse102x.yaml#" -$schema: "http://devicetree.org/meta-schemas/core.yaml#" +$id: http://devicetree.org/schemas/net/vertexcom-mse102x.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# title: The Vertexcom MSE102x (SPI) diff --git a/Documentation/devicetree/bindings/net/wireless/mediatek,mt76.yaml b/Documentation/devicetree/bindings/net/wireless/mediatek,mt76.yaml index 7d526ff53fb7..67b63f119f64 100644 --- a/Documentation/devicetree/bindings/net/wireless/mediatek,mt76.yaml +++ b/Documentation/devicetree/bindings/net/wireless/mediatek,mt76.yaml @@ -111,6 +111,11 @@ properties: $ref: /schemas/leds/common.yaml# additionalProperties: false properties: + led-active-low: + description: + LED is enabled with ground signal. + type: boolean + led-sources: maxItems: 1 diff --git a/Documentation/devicetree/bindings/net/wireless/qcom,ath10k.txt b/Documentation/devicetree/bindings/net/wireless/qcom,ath10k.txt deleted file mode 100644 index b61c2d5a0ff7..000000000000 --- a/Documentation/devicetree/bindings/net/wireless/qcom,ath10k.txt +++ /dev/null @@ -1,215 +0,0 @@ -* Qualcomm Atheros ath10k wireless devices - -Required properties: -- compatible: Should be one of the following: - * "qcom,ath10k" - * "qcom,ipq4019-wifi" - * "qcom,wcn3990-wifi" - -PCI based devices uses compatible string "qcom,ath10k" and takes calibration -data along with board specific data via "qcom,ath10k-calibration-data". -Rest of the properties are not applicable for PCI based devices. - -AHB based devices (i.e. ipq4019) uses compatible string "qcom,ipq4019-wifi" -and also uses most of the properties defined in this doc (except -"qcom,ath10k-calibration-data"). It uses "qcom,ath10k-pre-calibration-data" -to carry pre calibration data. - -In general, entry "qcom,ath10k-pre-calibration-data" and -"qcom,ath10k-calibration-data" conflict with each other and only one -can be provided per device. - -SNOC based devices (i.e. wcn3990) uses compatible string "qcom,wcn3990-wifi". - -- reg: Address and length of the register set for the device. -- reg-names: Must include the list of following reg names, - "membase" -- interrupts: reference to the list of 17 interrupt numbers for "qcom,ipq4019-wifi" - compatible target. - reference to the list of 12 interrupt numbers for "qcom,wcn3990-wifi" - compatible target. - Must contain interrupt-names property per entry for - "qcom,ath10k", "qcom,ipq4019-wifi" compatible targets. - -- interrupt-names: Must include the entries for MSI interrupt - names ("msi0" to "msi15") and legacy interrupt - name ("legacy") for "qcom,ath10k", "qcom,ipq4019-wifi" - compatible targets. - -Optional properties: -- resets: Must contain an entry for each entry in reset-names. - See ../reset/reseti.txt for details. -- reset-names: Must include the list of following reset names, - "wifi_cpu_init" - "wifi_radio_srif" - "wifi_radio_warm" - "wifi_radio_cold" - "wifi_core_warm" - "wifi_core_cold" -- clocks: List of clock specifiers, must contain an entry for each required - entry in clock-names. -- clock-names: Should contain the clock names "wifi_wcss_cmd", "wifi_wcss_ref", - "wifi_wcss_rtc" for "qcom,ipq4019-wifi" compatible target and - "cxo_ref_clk_pin" and optionally "qdss" for "qcom,wcn3990-wifi" - compatible target. -- qcom,msi_addr: MSI interrupt address. -- qcom,msi_base: Base value to add before writing MSI data into - MSI address register. -- qcom,ath10k-calibration-variant: string to search for in the board-2.bin - variant list with the same bus and device - specific ids -- qcom,ath10k-calibration-data : calibration data + board specific data - as an array, the length can vary between - hw versions. -- qcom,ath10k-pre-calibration-data : pre calibration data as an array, - the length can vary between hw versions. -- <supply-name>-supply: handle to the regulator device tree node - optional "supply-name" are "vdd-0.8-cx-mx", - "vdd-1.8-xo", "vdd-1.3-rfa", "vdd-3.3-ch0", - and "vdd-3.3-ch1". -- memory-region: - Usage: optional - Value type: <phandle> - Definition: reference to the reserved-memory for the msa region - used by the wifi firmware running in Q6. -- iommus: - Usage: optional - Value type: <prop-encoded-array> - Definition: A list of phandle and IOMMU specifier pairs. -- ext-fem-name: - Usage: Optional - Value type: string - Definition: Name of external front end module used. Some valid FEM names - for example: "microsemi-lx5586", "sky85703-11" - and "sky85803" etc. -- qcom,snoc-host-cap-8bit-quirk: - Usage: Optional - Value type: <empty> - Definition: Quirk specifying that the firmware expects the 8bit version - of the host capability QMI request -- qcom,xo-cal-data: xo cal offset to be configured in xo trim register. - -- qcom,msa-fixed-perm: Boolean context flag to disable SCM call for statically - mapped msa region. - -- qcom,coexist-support : should contain eithr "0" or "1" to indicate coex - support by the hardware. -- qcom,coexist-gpio-pin : gpio pin number information to support coex - which will be used by wifi firmware. - -* Subnodes -The ath10k wifi node can contain one optional firmware subnode. -Firmware subnode is needed when the platform does not have TustZone. -The firmware subnode must have: - -- iommus: - Usage: required - Value type: <prop-encoded-array> - Definition: A list of phandle and IOMMU specifier pairs. - - -Example (to supply PCI based wifi block details): - -In this example, the node is defined as child node of the PCI controller. - -pci { - pcie@0 { - reg = <0 0 0 0 0>; - #interrupt-cells = <1>; - #size-cells = <2>; - #address-cells = <3>; - device_type = "pci"; - - wifi@0,0 { - reg = <0 0 0 0 0>; - qcom,ath10k-calibration-data = [ 01 02 03 ... ]; - ext-fem-name = "microsemi-lx5586"; - }; - }; -}; - -Example (to supply ipq4019 SoC wifi block details): - -wifi0: wifi@a000000 { - compatible = "qcom,ipq4019-wifi"; - reg = <0xa000000 0x200000>; - resets = <&gcc WIFI0_CPU_INIT_RESET>, - <&gcc WIFI0_RADIO_SRIF_RESET>, - <&gcc WIFI0_RADIO_WARM_RESET>, - <&gcc WIFI0_RADIO_COLD_RESET>, - <&gcc WIFI0_CORE_WARM_RESET>, - <&gcc WIFI0_CORE_COLD_RESET>; - reset-names = "wifi_cpu_init", - "wifi_radio_srif", - "wifi_radio_warm", - "wifi_radio_cold", - "wifi_core_warm", - "wifi_core_cold"; - clocks = <&gcc GCC_WCSS2G_CLK>, - <&gcc GCC_WCSS2G_REF_CLK>, - <&gcc GCC_WCSS2G_RTC_CLK>; - clock-names = "wifi_wcss_cmd", - "wifi_wcss_ref", - "wifi_wcss_rtc"; - interrupts = <0 0x20 0x1>, - <0 0x21 0x1>, - <0 0x22 0x1>, - <0 0x23 0x1>, - <0 0x24 0x1>, - <0 0x25 0x1>, - <0 0x26 0x1>, - <0 0x27 0x1>, - <0 0x28 0x1>, - <0 0x29 0x1>, - <0 0x2a 0x1>, - <0 0x2b 0x1>, - <0 0x2c 0x1>, - <0 0x2d 0x1>, - <0 0x2e 0x1>, - <0 0x2f 0x1>, - <0 0xa8 0x0>; - interrupt-names = "msi0", "msi1", "msi2", "msi3", - "msi4", "msi5", "msi6", "msi7", - "msi8", "msi9", "msi10", "msi11", - "msi12", "msi13", "msi14", "msi15", - "legacy"; - qcom,msi_addr = <0x0b006040>; - qcom,msi_base = <0x40>; - qcom,ath10k-pre-calibration-data = [ 01 02 03 ... ]; - qcom,coexist-support = <1>; - qcom,coexist-gpio-pin = <0x33>; -}; - -Example (to supply wcn3990 SoC wifi block details): - -wifi@18000000 { - compatible = "qcom,wcn3990-wifi"; - reg = <0x18800000 0x800000>; - reg-names = "membase"; - clocks = <&clock_gcc clk_rf_clk2_pin>; - clock-names = "cxo_ref_clk_pin"; - interrupts = - <GIC_SPI 414 IRQ_TYPE_LEVEL_HIGH>, - <GIC_SPI 415 IRQ_TYPE_LEVEL_HIGH>, - <GIC_SPI 416 IRQ_TYPE_LEVEL_HIGH>, - <GIC_SPI 417 IRQ_TYPE_LEVEL_HIGH>, - <GIC_SPI 418 IRQ_TYPE_LEVEL_HIGH>, - <GIC_SPI 419 IRQ_TYPE_LEVEL_HIGH>, - <GIC_SPI 420 IRQ_TYPE_LEVEL_HIGH>, - <GIC_SPI 421 IRQ_TYPE_LEVEL_HIGH>, - <GIC_SPI 422 IRQ_TYPE_LEVEL_HIGH>, - <GIC_SPI 423 IRQ_TYPE_LEVEL_HIGH>, - <GIC_SPI 424 IRQ_TYPE_LEVEL_HIGH>, - <GIC_SPI 425 IRQ_TYPE_LEVEL_HIGH>; - vdd-0.8-cx-mx-supply = <&pm8998_l5>; - vdd-1.8-xo-supply = <&vreg_l7a_1p8>; - vdd-1.3-rfa-supply = <&vreg_l17a_1p3>; - vdd-3.3-ch0-supply = <&vreg_l25a_3p3>; - vdd-3.3-ch1-supply = <&vreg_l26a_3p3>; - memory-region = <&wifi_msa_mem>; - iommus = <&apps_smmu 0x0040 0x1>; - qcom,msa-fixed-perm; - wifi-firmware { - iommus = <&apps_iommu 0xc22 0x1>; - }; -}; diff --git a/Documentation/devicetree/bindings/net/wireless/qcom,ath10k.yaml b/Documentation/devicetree/bindings/net/wireless/qcom,ath10k.yaml new file mode 100644 index 000000000000..c85ed330426d --- /dev/null +++ b/Documentation/devicetree/bindings/net/wireless/qcom,ath10k.yaml @@ -0,0 +1,358 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/net/wireless/qcom,ath10k.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Qualcomm Technologies ath10k wireless devices + +maintainers: + - Kalle Valo <kvalo@kernel.org> + +description: + Qualcomm Technologies, Inc. IEEE 802.11ac devices. + +properties: + compatible: + enum: + - qcom,ath10k # SDIO-based devices + - qcom,ipq4019-wifi + - qcom,wcn3990-wifi # SNoC-based devices + + reg: + maxItems: 1 + + reg-names: + items: + - const: membase + + interrupts: + minItems: 12 + maxItems: 17 + + interrupt-names: + minItems: 12 + maxItems: 17 + + memory-region: + maxItems: 1 + description: + Reference to the MSA memory region used by the Wi-Fi firmware + running on the Q6 core. + + iommus: + minItems: 1 + maxItems: 2 + + clocks: + minItems: 1 + maxItems: 3 + + clock-names: + minItems: 1 + maxItems: 3 + + resets: + maxItems: 6 + + reset-names: + items: + - const: wifi_cpu_init + - const: wifi_radio_srif + - const: wifi_radio_warm + - const: wifi_radio_cold + - const: wifi_core_warm + - const: wifi_core_cold + + ext-fem-name: + $ref: /schemas/types.yaml#/definitions/string + description: Name of external front end module used. + enum: + - microsemi-lx5586 + - sky85703-11 + - sky85803 + + wifi-firmware: + type: object + additionalProperties: false + description: | + The ath10k Wi-Fi node can contain one optional firmware subnode. + Firmware subnode is needed when the platform does not have Trustzone. + properties: + iommus: + maxItems: 1 + required: + - iommus + + qcom,ath10k-calibration-data: + $ref: /schemas/types.yaml#/definitions/uint8-array + description: + Calibration data + board-specific data as a byte array. The length + can vary between hardware versions. + + qcom,ath10k-calibration-variant: + $ref: /schemas/types.yaml#/definitions/string + description: + Unique variant identifier of the calibration data in board-2.bin + for designs with colliding bus and device specific ids + + qcom,ath10k-pre-calibration-data: + $ref: /schemas/types.yaml#/definitions/uint8-array + description: + Pre-calibration data as a byte array. The length can vary between + hardware versions. + + qcom,coexist-support: + $ref: /schemas/types.yaml#/definitions/uint8 + enum: [0, 1] + description: + Indicate coex support by the hardware. + + qcom,coexist-gpio-pin: + $ref: /schemas/types.yaml#/definitions/uint32 + description: + COEX GPIO number provided to the Wi-Fi firmware. + + qcom,msa-fixed-perm: + type: boolean + description: + Whether to skip executing an SCM call that reassigns the memory + region ownership. + + qcom,smem-states: + $ref: /schemas/types.yaml#/definitions/phandle-array + description: State bits used by the AP to signal the WLAN Q6. + items: + - description: Signal bits used to enable/disable low power mode + on WCN in the case of WoW (Wake on Wireless). + + qcom,smem-state-names: + description: The names of the state bits used for SMP2P output. + items: + - const: wlan-smp2p-out + + qcom,snoc-host-cap-8bit-quirk: + type: boolean + description: + Quirk specifying that the firmware expects the 8bit version + of the host capability QMI request + + qcom,xo-cal-data: + $ref: /schemas/types.yaml#/definitions/uint32 + description: + XO cal offset to be configured in XO trim register. + + vdd-0.8-cx-mx-supply: + description: Main logic power rail + + vdd-1.8-xo-supply: + description: Crystal oscillator supply + + vdd-1.3-rfa-supply: + description: RFA supply + + vdd-3.3-ch0-supply: + description: Primary Wi-Fi antenna supply + + vdd-3.3-ch1-supply: + description: Secondary Wi-Fi antenna supply + +required: + - compatible + - reg + +additionalProperties: false + +allOf: + - if: + properties: + compatible: + contains: + enum: + - qcom,ipq4019-wifi + then: + properties: + interrupts: + minItems: 17 + maxItems: 17 + + interrupt-names: + items: + - const: msi0 + - const: msi1 + - const: msi2 + - const: msi3 + - const: msi4 + - const: msi5 + - const: msi6 + - const: msi7 + - const: msi8 + - const: msi9 + - const: msi10 + - const: msi11 + - const: msi12 + - const: msi13 + - const: msi14 + - const: msi15 + - const: legacy + + clocks: + items: + - description: Wi-Fi command clock + - description: Wi-Fi reference clock + - description: Wi-Fi RTC clock + + clock-names: + items: + - const: wifi_wcss_cmd + - const: wifi_wcss_ref + - const: wifi_wcss_rtc + + required: + - clocks + - clock-names + - interrupts + - interrupt-names + - resets + - reset-names + + - if: + properties: + compatible: + contains: + enum: + - qcom,wcn3990-wifi + + then: + properties: + clocks: + minItems: 1 + items: + - description: XO reference clock + - description: Qualcomm Debug Subsystem clock + + clock-names: + minItems: 1 + items: + - const: cxo_ref_clk_pin + - const: qdss + + interrupts: + items: + - description: CE0 + - description: CE1 + - description: CE2 + - description: CE3 + - description: CE4 + - description: CE5 + - description: CE6 + - description: CE7 + - description: CE8 + - description: CE9 + - description: CE10 + - description: CE11 + + interrupt-names: false + + required: + - interrupts + +examples: + # SNoC + - | + #include <dt-bindings/clock/qcom,rpmcc.h> + #include <dt-bindings/interrupt-controller/arm-gic.h> + + wifi@18800000 { + compatible = "qcom,wcn3990-wifi"; + reg = <0x18800000 0x800000>; + reg-names = "membase"; + memory-region = <&wlan_msa_mem>; + clocks = <&rpmcc RPM_SMD_RF_CLK2_PIN>; + clock-names = "cxo_ref_clk_pin"; + interrupts = <GIC_SPI 413 IRQ_TYPE_LEVEL_HIGH>, + <GIC_SPI 414 IRQ_TYPE_LEVEL_HIGH>, + <GIC_SPI 415 IRQ_TYPE_LEVEL_HIGH>, + <GIC_SPI 416 IRQ_TYPE_LEVEL_HIGH>, + <GIC_SPI 417 IRQ_TYPE_LEVEL_HIGH>, + <GIC_SPI 418 IRQ_TYPE_LEVEL_HIGH>, + <GIC_SPI 420 IRQ_TYPE_LEVEL_HIGH>, + <GIC_SPI 421 IRQ_TYPE_LEVEL_HIGH>, + <GIC_SPI 422 IRQ_TYPE_LEVEL_HIGH>, + <GIC_SPI 423 IRQ_TYPE_LEVEL_HIGH>, + <GIC_SPI 424 IRQ_TYPE_LEVEL_HIGH>, + <GIC_SPI 425 IRQ_TYPE_LEVEL_HIGH>; + iommus = <&anoc2_smmu 0x1900>, + <&anoc2_smmu 0x1901>; + qcom,snoc-host-cap-8bit-quirk; + vdd-0.8-cx-mx-supply = <&vreg_l5a_0p8>; + vdd-1.8-xo-supply = <&vreg_l7a_1p8>; + vdd-1.3-rfa-supply = <&vreg_l17a_1p3>; + vdd-3.3-ch0-supply = <&vreg_l25a_3p3>; + vdd-3.3-ch1-supply = <&vreg_l23a_3p3>; + + wifi-firmware { + iommus = <&apps_smmu 0x1c02 0x1>; + }; + }; + + # AHB + - | + #include <dt-bindings/clock/qcom,gcc-ipq4019.h> + + wifi@a000000 { + compatible = "qcom,ipq4019-wifi"; + reg = <0xa000000 0x200000>; + resets = <&gcc WIFI0_CPU_INIT_RESET>, + <&gcc WIFI0_RADIO_SRIF_RESET>, + <&gcc WIFI0_RADIO_WARM_RESET>, + <&gcc WIFI0_RADIO_COLD_RESET>, + <&gcc WIFI0_CORE_WARM_RESET>, + <&gcc WIFI0_CORE_COLD_RESET>; + reset-names = "wifi_cpu_init", + "wifi_radio_srif", + "wifi_radio_warm", + "wifi_radio_cold", + "wifi_core_warm", + "wifi_core_cold"; + clocks = <&gcc GCC_WCSS2G_CLK>, + <&gcc GCC_WCSS2G_REF_CLK>, + <&gcc GCC_WCSS2G_RTC_CLK>; + clock-names = "wifi_wcss_cmd", + "wifi_wcss_ref", + "wifi_wcss_rtc"; + interrupts = <GIC_SPI 32 IRQ_TYPE_EDGE_RISING>, + <GIC_SPI 33 IRQ_TYPE_EDGE_RISING>, + <GIC_SPI 34 IRQ_TYPE_EDGE_RISING>, + <GIC_SPI 35 IRQ_TYPE_EDGE_RISING>, + <GIC_SPI 36 IRQ_TYPE_EDGE_RISING>, + <GIC_SPI 37 IRQ_TYPE_EDGE_RISING>, + <GIC_SPI 38 IRQ_TYPE_EDGE_RISING>, + <GIC_SPI 39 IRQ_TYPE_EDGE_RISING>, + <GIC_SPI 40 IRQ_TYPE_EDGE_RISING>, + <GIC_SPI 41 IRQ_TYPE_EDGE_RISING>, + <GIC_SPI 42 IRQ_TYPE_EDGE_RISING>, + <GIC_SPI 43 IRQ_TYPE_EDGE_RISING>, + <GIC_SPI 44 IRQ_TYPE_EDGE_RISING>, + <GIC_SPI 45 IRQ_TYPE_EDGE_RISING>, + <GIC_SPI 46 IRQ_TYPE_EDGE_RISING>, + <GIC_SPI 47 IRQ_TYPE_EDGE_RISING>, + <GIC_SPI 168 IRQ_TYPE_LEVEL_HIGH>; + interrupt-names = "msi0", + "msi1", + "msi2", + "msi3", + "msi4", + "msi5", + "msi6", + "msi7", + "msi8", + "msi9", + "msi10", + "msi11", + "msi12", + "msi13", + "msi14", + "msi15", + "legacy"; + }; diff --git a/Documentation/devicetree/bindings/net/wireless/qcom,ath11k-pci.yaml b/Documentation/devicetree/bindings/net/wireless/qcom,ath11k-pci.yaml new file mode 100644 index 000000000000..817f02a8b481 --- /dev/null +++ b/Documentation/devicetree/bindings/net/wireless/qcom,ath11k-pci.yaml @@ -0,0 +1,58 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +# Copyright (c) 2023 Linaro Limited +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/net/wireless/qcom,ath11k-pci.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Qualcomm Technologies ath11k wireless devices (PCIe) + +maintainers: + - Kalle Valo <kvalo@kernel.org> + +description: | + Qualcomm Technologies IEEE 802.11ax PCIe devices + +properties: + compatible: + enum: + - pci17cb,1103 # WCN6855 + + reg: + maxItems: 1 + + qcom,ath11k-calibration-variant: + $ref: /schemas/types.yaml#/definitions/string + description: | + string to uniquely identify variant of the calibration data for designs + with colliding bus and device ids + +required: + - compatible + - reg + +additionalProperties: false + +examples: + - | + pcie { + #address-cells = <3>; + #size-cells = <2>; + + pcie@0 { + device_type = "pci"; + reg = <0x0 0x0 0x0 0x0 0x0>; + #address-cells = <3>; + #size-cells = <2>; + ranges; + + bus-range = <0x01 0xff>; + + wifi@0 { + compatible = "pci17cb,1103"; + reg = <0x10000 0x0 0x0 0x0 0x0>; + + qcom,ath11k-calibration-variant = "LE_X13S"; + }; + }; + }; diff --git a/Documentation/leds/well-known-leds.txt b/Documentation/leds/well-known-leds.txt index 2160382c86be..e9c30dc75884 100644 --- a/Documentation/leds/well-known-leds.txt +++ b/Documentation/leds/well-known-leds.txt @@ -70,3 +70,33 @@ Good: "platform:*:charging" (allwinner sun50i) * Screen Good: ":backlight" (Motorola Droid 4) + +* Ethernet LEDs + +Currently two types of Network LEDs are support, those controlled by +the PHY and those by the MAC. In theory both can be present at the +same time for one Linux netdev, hence the names need to differ between +MAC and PHY. + +Do not use the netdev name, such as eth0, enp1s0. These are not stable +and are not unique. They also don't differentiate between MAC and PHY. + +** MAC LEDs + +Good: f1070000.ethernet:white:WAN +Good: mdio_mux-0.1:00:green:left +Good: 0000:02:00.0:yellow:top + +The first part must uniquely name the MAC controller. Then follows the +colour. WAN/LAN should be used for a single LED. If there are +multiple LEDs, use left/right, or top/bottom to indicate their +position on the RJ45 socket. + +** PHY LEDs + +Good: f1072004.mdio-mii:00: white:WAN +Good: !mdio-mux!mdio@2!switch@0!mdio:01:green:right +Good: r8169-0-200:00:yellow:bottom + +The first part must uniquely name the PHY. This often means uniquely +identifying the MDIO bus controller, and the address on the bus. diff --git a/Documentation/netlink/genetlink-c.yaml b/Documentation/netlink/genetlink-c.yaml index 5c3642b3f802..8e8c17b0a6c6 100644 --- a/Documentation/netlink/genetlink-c.yaml +++ b/Documentation/netlink/genetlink-c.yaml @@ -33,10 +33,10 @@ properties: protocol: description: Schema compatibility level. Default is "genetlink". enum: [ genetlink, genetlink-c ] - # Start genetlink-c uapi-header: description: Path to the uAPI header, default is linux/${family-name}.h type: string + # Start genetlink-c c-family-name: description: Name of the define for the family name. type: string diff --git a/Documentation/netlink/genetlink-legacy.yaml b/Documentation/netlink/genetlink-legacy.yaml index 5e98c6d2b9aa..b33541a51d6b 100644 --- a/Documentation/netlink/genetlink-legacy.yaml +++ b/Documentation/netlink/genetlink-legacy.yaml @@ -33,10 +33,10 @@ properties: protocol: description: Schema compatibility level. Default is "genetlink". enum: [ genetlink, genetlink-c, genetlink-legacy ] # Trim - # Start genetlink-c uapi-header: description: Path to the uAPI header, default is linux/${family-name}.h type: string + # Start genetlink-c c-family-name: description: Name of the define for the family name. type: string @@ -218,6 +218,11 @@ properties: description: Max length for a string or a binary attribute. $ref: '#/$defs/len-or-define' sub-type: *attr-type + # Start genetlink-legacy + struct: + description: Name of the struct type used for the attribute. + type: string + # End genetlink-legacy # Make sure name-prefix does not appear in subsets (subsets inherit naming) dependencies: @@ -256,6 +261,14 @@ properties: async-enum: description: Name for the enum type with notifications/events. type: string + # Start genetlink-legacy + fixed-header: &fixed-header + description: | + Name of the structure defining the optional fixed-length protocol + header. This header is placed in a message after the netlink and + genetlink headers and before any attributes. + type: string + # End genetlink-legacy list: description: List of commands type: array @@ -288,6 +301,9 @@ properties: type: array items: enum: [ strict, dump ] + # Start genetlink-legacy + fixed-header: *fixed-header + # End genetlink-legacy do: &subop-type description: Main command handler. type: object diff --git a/Documentation/netlink/genetlink.yaml b/Documentation/netlink/genetlink.yaml index d35dcd6f8d82..d8b2cdeba058 100644 --- a/Documentation/netlink/genetlink.yaml +++ b/Documentation/netlink/genetlink.yaml @@ -33,6 +33,9 @@ properties: protocol: description: Schema compatibility level. Default is "genetlink". enum: [ genetlink ] + uapi-header: + description: Path to the uAPI header, default is linux/${family-name}.h + type: string definitions: description: List of type and constant definitions (enums, flags, defines). diff --git a/Documentation/netlink/specs/devlink.yaml b/Documentation/netlink/specs/devlink.yaml new file mode 100644 index 000000000000..90641668232e --- /dev/null +++ b/Documentation/netlink/specs/devlink.yaml @@ -0,0 +1,198 @@ +# SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) + +name: devlink + +protocol: genetlink-legacy + +doc: Partial family for Devlink. + +attribute-sets: + - + name: devlink + attributes: + - + name: bus-name + type: string + value: 1 + - + name: dev-name + type: string + - + name: port-index + type: u32 + + # TODO: fill in the attributes in between + + - + name: info-driver-name + type: string + value: 98 + - + name: info-serial-number + type: string + - + name: info-version-fixed + type: nest + multi-attr: true + nested-attributes: dl-info-version + - + name: info-version-running + type: nest + multi-attr: true + nested-attributes: dl-info-version + - + name: info-version-stored + type: nest + multi-attr: true + nested-attributes: dl-info-version + - + name: info-version-name + type: string + - + name: info-version-value + type: string + + # TODO: fill in the attributes in between + + - + name: reload-failed + type: u8 + value: 136 + + # TODO: fill in the attributes in between + + - + name: reload-action + type: u8 + value: 153 + + # TODO: fill in the attributes in between + + - + name: dev-stats + type: nest + value: 156 + nested-attributes: dl-dev-stats + - + name: reload-stats + type: nest + nested-attributes: dl-reload-stats + - + name: reload-stats-entry + type: nest + multi-attr: true + nested-attributes: dl-reload-stats-entry + - + name: reload-stats-limit + type: u8 + - + name: reload-stats-value + type: u32 + - + name: remote-reload-stats + type: nest + nested-attributes: dl-reload-stats + - + name: reload-action-info + type: nest + nested-attributes: dl-reload-act-info + - + name: reload-action-stats + type: nest + nested-attributes: dl-reload-act-stats + - + name: dl-dev-stats + subset-of: devlink + attributes: + - + name: reload-stats + type: nest + - + name: remote-reload-stats + type: nest + - + name: dl-reload-stats + subset-of: devlink + attributes: + - + name: reload-action-info + type: nest + - + name: dl-reload-act-info + subset-of: devlink + attributes: + - + name: reload-action + type: u8 + - + name: reload-action-stats + type: nest + - + name: dl-reload-act-stats + subset-of: devlink + attributes: + - + name: reload-stats-entry + type: nest + - + name: dl-reload-stats-entry + subset-of: devlink + attributes: + - + name: reload-stats-limit + type: u8 + - + name: reload-stats-value + type: u32 + - + name: dl-info-version + subset-of: devlink + attributes: + - + name: info-version-name + type: string + - + name: info-version-value + type: string + +operations: + enum-model: directional + list: + - + name: get + doc: Get devlink instances. + attribute-set: devlink + + do: + request: + value: 1 + attributes: &dev-id-attrs + - bus-name + - dev-name + reply: &get-reply + value: 3 + attributes: + - bus-name + - dev-name + - reload-failed + - reload-action + - dev-stats + dump: + reply: *get-reply + + # TODO: fill in the operations in between + + - + name: info-get + doc: Get device information, like driver name, hardware and firmware versions etc. + attribute-set: devlink + + do: + request: + value: 51 + attributes: *dev-id-attrs + reply: + value: 51 + attributes: + - bus-name + - dev-name diff --git a/Documentation/netlink/specs/ethtool.yaml b/Documentation/netlink/specs/ethtool.yaml index 4727c067e2ba..129f413ea349 100644 --- a/Documentation/netlink/specs/ethtool.yaml +++ b/Documentation/netlink/specs/ethtool.yaml @@ -6,6 +6,12 @@ protocol: genetlink-legacy doc: Partial family for Ethtool Netlink. +definitions: + - + name: udp-tunnel-type + type: enum + entries: [ vxlan, geneve, vxlan-gpe ] + attribute-sets: - name: header @@ -38,6 +44,7 @@ attribute-sets: - name: bit type: nest + multi-attr: true nested-attributes: bitset-bit - name: bitset @@ -54,6 +61,22 @@ attribute-sets: nested-attributes: bitset-bits - + name: u64-array + attributes: + - + name: u64 + type: nest + multi-attr: true + nested-attributes: u64 + - + name: s32-array + attributes: + - + name: s32 + type: nest + multi-attr: true + nested-attributes: s32 + - name: string attributes: - @@ -165,6 +188,12 @@ attribute-sets: - name: rx-push type: u8 + - + name: tx-push-buf-len + type: u32 + - + name: tx-push-buf-len-max + type: u32 - name: mm-stat @@ -228,6 +257,657 @@ attribute-sets: name: stats type: nest nested-attributes: mm-stat + - + name: linkinfo + attributes: + - + name: header + type: nest + nested-attributes: header + - + name: port + type: u8 + - + name: phyaddr + type: u8 + - + name: tp-mdix + type: u8 + - + name: tp-mdix-ctrl + type: u8 + - + name: transceiver + type: u8 + - + name: linkmodes + attributes: + - + name: header + type: nest + nested-attributes: header + - + name: autoneg + type: u8 + - + name: ours + type: nest + nested-attributes: bitset + - + name: peer + type: nest + nested-attributes: bitset + - + name: speed + type: u32 + - + name: duplex + type: u8 + - + name: master-slave-cfg + type: u8 + - + name: master-slave-state + type: u8 + - + name: master-slave-lanes + type: u32 + - + name: rate-matching + type: u8 + - + name: linkstate + attributes: + - + name: header + type: nest + nested-attributes: header + - + name: link + type: u8 + - + name: sqi + type: u32 + - + name: sqi-max + type: u32 + - + name: ext-state + type: u8 + - + name: ext-substate + type: u8 + - + name: down-cnt + type: u32 + - + name: debug + attributes: + - + name: header + type: nest + nested-attributes: header + - + name: msgmask + type: nest + nested-attributes: bitset + - + name: wol + attributes: + - + name: header + type: nest + nested-attributes: header + - + name: modes + type: nest + nested-attributes: bitset + - + name: sopass + type: binary + - + name: features + attributes: + - + name: header + type: nest + nested-attributes: header + - + name: hw + type: nest + nested-attributes: bitset + - + name: wanted + type: nest + nested-attributes: bitset + - + name: active + type: nest + nested-attributes: bitset + - + name: nochange + type: nest + nested-attributes: bitset + - + name: channels + attributes: + - + name: header + type: nest + nested-attributes: header + - + name: rx-max + type: u32 + - + name: tx-max + type: u32 + - + name: other-max + type: u32 + - + name: combined-max + type: u32 + - + name: rx-count + type: u32 + - + name: tx-count + type: u32 + - + name: other-count + type: u32 + - + name: combined-count + type: u32 + + - + name: coalesce + attributes: + - + name: header + type: nest + nested-attributes: header + - + name: rx-usecs + type: u32 + - + name: rx-max-frames + type: u32 + - + name: rx-usecs-irq + type: u32 + - + name: rx-max-frames-irq + type: u32 + - + name: tx-usecs + type: u32 + - + name: tx-max-frames + type: u32 + - + name: tx-usecs-irq + type: u32 + - + name: tx-max-frames-irq + type: u32 + - + name: stats-block-usecs + type: u32 + - + name: use-adaptive-rx + type: u8 + - + name: use-adaptive-tx + type: u8 + - + name: pkt-rate-low + type: u32 + - + name: rx-usecs-low + type: u32 + - + name: rx-max-frames-low + type: u32 + - + name: tx-usecs-low + type: u32 + - + name: tx-max-frames-low + type: u32 + - + name: pkt-rate-high + type: u32 + - + name: rx-usecs-high + type: u32 + - + name: rx-max-frames-high + type: u32 + - + name: tx-usecs-high + type: u32 + - + name: tx-max-frames-high + type: u32 + - + name: rate-sample-interval + type: u32 + - + name: use-cqe-mode-tx + type: u8 + - + name: use-cqe-mode-rx + type: u8 + - + name: tx-aggr-max-bytes + type: u32 + - + name: tx-aggr-max-frames + type: u32 + - + name: tx-aggr-time-usecs + type: u32 + - + name: pause-stat + attributes: + - + name: pad + type: u32 + - + name: tx-frames + type: u64 + - + name: rx-frames + type: u64 + - + name: pause + attributes: + - + name: header + type: nest + nested-attributes: header + - + name: autoneg + type: u8 + - + name: rx + type: u8 + - + name: tx + type: u8 + - + name: stats + type: nest + nested-attributes: pause-stat + - + name: stats-src + type: u32 + - + name: eee + attributes: + - + name: header + type: nest + nested-attributes: header + - + name: modes-ours + type: nest + nested-attributes: bitset + - + name: modes-peer + type: nest + nested-attributes: bitset + - + name: active + type: u8 + - + name: enabled + type: u8 + - + name: tx-lpi-enabled + type: u8 + - + name: tx-lpi-timer + type: u32 + - + name: tsinfo + attributes: + - + name: header + type: nest + nested-attributes: header + - + name: timestamping + type: nest + nested-attributes: bitset + - + name: tx-types + type: nest + nested-attributes: bitset + - + name: rx-filters + type: nest + nested-attributes: bitset + - + name: phc-index + type: u32 + - + name: cable-test-nft-nest-result + attributes: + - + name: pair + type: u8 + - + name: code + type: u8 + - + name: cable-test-nft-nest-fault-length + attributes: + - + name: pair + type: u8 + - + name: cm + type: u32 + - + name: cable-test-nft-nest + attributes: + - + name: result + type: nest + nested-attributes: cable-test-nft-nest-result + - + name: fault-length + type: nest + nested-attributes: cable-test-nft-nest-fault-length + - + name: cable-test + attributes: + - + name: header + type: nest + nested-attributes: header + - + name: status + type: u8 + - + name: nest + type: nest + nested-attributes: cable-test-nft-nest + - + name: cable-test-tdr-cfg + attributes: + - + name: first + type: u32 + - + name: last + type: u32 + - + name: step + type: u32 + - + name: pari + type: u8 + - + name: cable-test-tdr + attributes: + - + name: header + type: nest + nested-attributes: header + - + name: cfg + type: nest + nested-attributes: cable-test-tdr-cfg + - + name: tunnel-info-udp-entry + attributes: + - + name: port + type: u16 + byte-order: big-endian + - + name: type + type: u32 + enum: udp-tunnel-type + - + name: tunnel-info-udp-table + attributes: + - + name: size + type: u32 + - + name: types + type: nest + nested-attributes: bitset + - + name: udp-ports + type: nest + nested-attributes: tunnel-info-udp-entry + - + name: tunnel-info + attributes: + - + name: header + type: nest + nested-attributes: header + - + name: udp-ports + type: nest + nested-attributes: tunnel-info-udp-table + - + name: fec-stat + attributes: + - + name: pad + type: u8 + - + name: corrected + type: nest + nested-attributes: u64-array + - + name: uncorr + type: nest + nested-attributes: u64-array + - + name: corr-bits + type: nest + nested-attributes: u64-array + - + name: fec + attributes: + - + name: header + type: nest + nested-attributes: header + - + name: modes + type: nest + nested-attributes: bitset + - + name: auto + type: u8 + - + name: active + type: u32 + - + name: stats + type: nest + nested-attributes: fec-stat + - + name: module-eeprom + attributes: + - + name: header + type: nest + nested-attributes: header + - + name: offset + type: u32 + - + name: length + type: u32 + - + name: page + type: u8 + - + name: bank + type: u8 + - + name: i2c-address + type: u8 + - + name: data + type: binary + - + name: stats-grp + attributes: + - + name: pad + type: u32 + - + name: id + type: u32 + - + name: ss-id + type: u32 + - + name: stat + type: nest + nested-attributes: u64 + - + name: hist-rx + type: nest + nested-attributes: u64 + - + name: hist-tx + type: nest + nested-attributes: u64 + - + name: hist-bkt-low + type: u32 + - + name: hist-bkt-hi + type: u32 + - + name: hist-bkt-val + type: u64 + - + name: stats + attributes: + - + name: pad + type: u32 + - + name: header + type: nest + nested-attributes: header + - + name: groups + type: nest + nested-attributes: bitset + - + name: grp + type: nest + nested-attributes: stats-grp + - + name: src + type: u32 + - + name: phc-vclocks + attributes: + - + name: header + type: nest + nested-attributes: header + - + name: num + type: u32 + - + name: index + type: nest + nested-attributes: s32-array + - + name: module + attributes: + - + name: header + type: nest + nested-attributes: header + - + name: power-mode-policy + type: u8 + - + name: power-mode + type: u8 + - + name: pse + attributes: + - + name: header + type: nest + nested-attributes: header + - + name: admin-state + type: u32 + - + name: admin-control + type: u32 + - + name: pw-d-status + type: u32 + - + name: rss + attributes: + - + name: header + type: nest + nested-attributes: header + - + name: context + type: u32 + - + name: hfunc + type: u32 + - + name: indir + type: binary + - + name: hkey + type: binary + - + name: plca + attributes: + - + name: header + type: nest + nested-attributes: header + - + name: version + type: u16 + - + name: enabled + type: u8 + - + name: status + type: u8 + - + name: node-cnt + type: u32 + - + name: node-id + type: u32 + - + name: to-tmr + type: u32 + - + name: burst-cnt + type: u32 + - + name: burst-tmr + type: u32 operations: enum-model: directional @@ -249,9 +929,188 @@ operations: - header - stringsets dump: *strset-get-op + - + name: linkinfo-get + doc: Get link info. + + attribute-set: linkinfo + + do: &linkinfo-get-op + request: + attributes: + - header + reply: + attributes: &linkinfo + - header + - port + - phyaddr + - tp-mdix + - tp-mdix-ctrl + - transceiver + dump: *linkinfo-get-op + - + name: linkinfo-set + doc: Set link info. + + attribute-set: linkinfo + + do: + request: + attributes: *linkinfo + - + name: linkinfo-ntf + doc: Notification for change in link info. + notify: linkinfo-get + - + name: linkmodes-get + doc: Get link modes. + + attribute-set: linkmodes + + do: &linkmodes-get-op + request: + attributes: + - header + reply: + attributes: &linkmodes + - header + - autoneg + - ours + - peer + - speed + - duplex + - master-slave-cfg + - master-slave-state + - master-slave-lanes + - rate-matching + dump: *linkmodes-get-op + - + name: linkmodes-set + doc: Set link modes. + + attribute-set: linkmodes + + do: + request: + attributes: *linkmodes + - + name: linkmodes-ntf + doc: Notification for change in link modes. + notify: linkmodes-get + - + name: linkstate-get + doc: Get link state. + + attribute-set: linkstate + + do: &linkstate-get-op + request: + attributes: + - header + reply: + attributes: + - header + - link + - sqi + - sqi-max + - ext-state + - ext-substate + - down-cnt + dump: *linkstate-get-op + - + name: debug-get + doc: Get debug message mask. + + attribute-set: debug + + do: &debug-get-op + request: + attributes: + - header + reply: + attributes: &debug + - header + - msgmask + dump: *debug-get-op + - + name: debug-set + doc: Set debug message mask. + + attribute-set: debug - # TODO: fill in the requests in between + do: + request: + attributes: *debug + - + name: debug-ntf + doc: Notification for change in debug message mask. + notify: debug-get + - + name: wol-get + doc: Get WOL params. + + attribute-set: wol + + do: &wol-get-op + request: + attributes: + - header + reply: + attributes: &wol + - header + - modes + - sopass + dump: *wol-get-op + - + name: wol-set + doc: Set WOL params. + + attribute-set: wol + + do: + request: + attributes: *wol + - + name: wol-ntf + doc: Notification for change in WOL params. + notify: wol-get + - + name: features-get + doc: Get features. + + attribute-set: features + + do: &feature-get-op + request: + attributes: + - header + reply: + attributes: &feature + - header + # User-changeable features. + - hw + # User-requested features. + - wanted + # Currently active features. + - active + # Unchangeable features. + - nochange + dump: *feature-get-op + - + name: features-set + doc: Set features. + + attribute-set: features + do: &feature-set-op + request: + attributes: *feature + reply: + attributes: *feature + - + name: features-ntf + doc: Notification for change in features. + notify: features-get - name: privflags-get doc: Get device private flags. @@ -260,12 +1119,10 @@ operations: do: &privflag-get-op request: - value: 13 attributes: - header reply: - value: 14 - attributes: + attributes: &privflag - header - flags dump: *privflag-get-op @@ -277,9 +1134,7 @@ operations: do: request: - attributes: - - header - - flags + attributes: *privflag - name: privflags-ntf doc: Notification for change in device private flags. @@ -296,7 +1151,7 @@ operations: attributes: - header reply: - attributes: + attributes: &ring - header - rx-max - rx-mini-max @@ -311,6 +1166,8 @@ operations: - cqe-size - tx-push - rx-push + - tx-push-buf-len + - tx-push-buf-len-max dump: *ring-get-op - name: rings-set @@ -320,24 +1177,431 @@ operations: do: request: + attributes: *ring + - + name: rings-ntf + doc: Notification for change in ring params. + notify: rings-get + - + name: channels-get + doc: Get channel params. + + attribute-set: channels + + do: &channel-get-op + request: + attributes: + - header + reply: + attributes: &channel + - header + - rx-max + - tx-max + - other-max + - combined-max + - rx-count + - tx-count + - other-count + - combined-count + dump: *channel-get-op + - + name: channels-set + doc: Set channel params. + + attribute-set: channels + + do: + request: + attributes: *channel + - + name: channels-ntf + doc: Notification for change in channel params. + notify: channels-get + - + name: coalesce-get + doc: Get coalesce params. + + attribute-set: coalesce + + do: &coalesce-get-op + request: + attributes: + - header + reply: + attributes: &coalesce + - header + - rx-usecs + - rx-max-frames + - rx-usecs-irq + - rx-max-frames-irq + - tx-usecs + - tx-max-frames + - tx-usecs-irq + - tx-max-frames-irq + - stats-block-usecs + - use-adaptive-rx + - use-adaptive-tx + - pkt-rate-low + - rx-usecs-low + - rx-max-frames-low + - tx-usecs-low + - tx-max-frames-low + - pkt-rate-high + - rx-usecs-high + - rx-max-frames-high + - tx-usecs-high + - tx-max-frames-high + - rate-sample-interval + - use-cqe-mode-tx + - use-cqe-mode-rx + - tx-aggr-max-bytes + - tx-aggr-max-frames + - tx-aggr-time-usecs + dump: *coalesce-get-op + - + name: coalesce-set + doc: Set coalesce params. + + attribute-set: coalesce + + do: + request: + attributes: *coalesce + - + name: coalesce-ntf + doc: Notification for change in coalesce params. + notify: coalesce-get + - + name: pause-get + doc: Get pause params. + + attribute-set: pause + + do: &pause-get-op + request: attributes: - header + reply: + attributes: &pause + - header + - autoneg - rx - - rx-mini - - rx-jumbo - tx - - rx-buf-len - - tcp-data-split - - cqe-size - - tx-push - - rx-push + - stats + - stats-src + dump: *pause-get-op - - name: rings-ntf - doc: Notification for change in ring params. - notify: rings-get + name: pause-set + doc: Set pause params. + + attribute-set: pause + + do: + request: + attributes: *pause + - + name: pause-ntf + doc: Notification for change in pause params. + notify: pause-get + - + name: eee-get + doc: Get eee params. + + attribute-set: eee + + do: &eee-get-op + request: + attributes: + - header + reply: + attributes: &eee + - header + - modes-ours + - modes-peer + - active + - enabled + - tx-lpi-enabled + - tx-lpi-timer + dump: *eee-get-op + - + name: eee-set + doc: Set eee params. + + attribute-set: eee + + do: + request: + attributes: *eee + - + name: eee-ntf + doc: Notification for change in eee params. + notify: eee-get + - + name: tsinfo-get + doc: Get tsinfo params. + + attribute-set: tsinfo + + do: &tsinfo-get-op + request: + attributes: + - header + reply: + attributes: + - header + - timestamping + - tx-types + - rx-filters + - phc-index + dump: *tsinfo-get-op + - + name: cable-test-act + doc: Cable test. + + attribute-set: cable-test + + do: + request: + attributes: + - header + reply: + attributes: + - header + - cable-test-nft-nest + - + name: cable-test-tdr-act + doc: Cable test TDR. + + attribute-set: cable-test-tdr + + do: + request: + attributes: + - header + reply: + attributes: + - header + - cable-test-tdr-cfg + - + name: tunnel-info-get + doc: Get tsinfo params. + + attribute-set: tunnel-info + + do: &tunnel-info-get-op + request: + attributes: + - header + reply: + attributes: + - header + - udp-ports + dump: *tunnel-info-get-op + - + name: fec-get + doc: Get FEC params. - # TODO: fill in the requests in between + attribute-set: fec + do: &fec-get-op + request: + attributes: + - header + reply: + attributes: &fec + - header + - modes + - auto + - active + - stats + dump: *fec-get-op + - + name: fec-set + doc: Set FEC params. + + attribute-set: fec + + do: + request: + attributes: *fec + - + name: fec-ntf + doc: Notification for change in FEC params. + notify: fec-get + - + name: module-eeprom-get + doc: Get module EEPROM params. + + attribute-set: module-eeprom + + do: &module-eeprom-get-op + request: + attributes: + - header + reply: + attributes: + - header + - offset + - length + - page + - bank + - i2c-address + - data + dump: *module-eeprom-get-op + - + name: stats-get + doc: Get statistics. + + attribute-set: stats + + do: &stats-get-op + request: + attributes: + - header + - groups + reply: + attributes: + - header + - groups + - grp + - src + dump: *stats-get-op + - + name: phc-vclocks-get + doc: Get PHC VCLOCKs. + + attribute-set: phc-vclocks + + do: &phc-vclocks-get-op + request: + attributes: + - header + reply: + attributes: + - header + - num + dump: *phc-vclocks-get-op + - + name: module-get + doc: Get module params. + + attribute-set: module + + do: &module-get-op + request: + attributes: + - header + reply: + attributes: &module + - header + - power-mode-policy + - power-mode + dump: *module-get-op + - + name: module-set + doc: Set module params. + + attribute-set: module + + do: + request: + attributes: *module + - + name: module-ntf + doc: Notification for change in module params. + notify: module-get + - + name: pse-get + doc: Get Power Sourcing Equipment params. + + attribute-set: pse + + do: &pse-get-op + request: + attributes: + - header + reply: + attributes: &pse + - header + - admin-state + - admin-control + - pw-d-status + dump: *pse-get-op + - + name: pse-set + doc: Set Power Sourcing Equipment params. + + attribute-set: pse + + do: + request: + attributes: *pse + - + name: rss-get + doc: Get RSS params. + + attribute-set: rss + + do: &rss-get-op + request: + attributes: + - header + reply: + attributes: + - header + - context + - hfunc + - indir + - hkey + dump: *rss-get-op + - + name: plca-get + doc: Get PLCA params. + + attribute-set: plca + + do: &plca-get-op + request: + attributes: + - header + reply: + attributes: &plca + - header + - version + - enabled + - status + - node-cnt + - node-id + - to-tmr + - burst-cnt + - burst-tmr + dump: *plca-get-op + - + name: plca-set + doc: Set PLCA params. + + attribute-set: plca + + do: + request: + attributes: *plca + - + name: plca-get-status + doc: Get PLCA status params. + + attribute-set: plca + + do: &plca-get-status-op + request: + attributes: + - header + reply: + attributes: *plca + dump: *plca-get-status-op + - + name: plca-ntf + doc: Notification for change in PLCA params. + notify: plca-get - name: mm-get doc: Get MAC Merge configuration and state @@ -346,11 +1610,9 @@ operations: do: &mm-get-op request: - value: 42 attributes: - header reply: - value: 42 attributes: - header - pmac-enabled diff --git a/Documentation/netlink/specs/handshake.yaml b/Documentation/netlink/specs/handshake.yaml new file mode 100644 index 000000000000..614f1a585511 --- /dev/null +++ b/Documentation/netlink/specs/handshake.yaml @@ -0,0 +1,124 @@ +# SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) +# +# Author: Chuck Lever <chuck.lever@oracle.com> +# +# Copyright (c) 2023, Oracle and/or its affiliates. +# + +name: handshake + +protocol: genetlink + +doc: Netlink protocol to request a transport layer security handshake. + +definitions: + - + type: enum + name: handler-class + value-start: 0 + entries: [ none, tlshd, max ] + - + type: enum + name: msg-type + value-start: 0 + entries: [ unspec, clienthello, serverhello ] + - + type: enum + name: auth + value-start: 0 + entries: [ unspec, unauth, psk, x509 ] + +attribute-sets: + - + name: x509 + attributes: + - + name: cert + type: u32 + - + name: privkey + type: u32 + - + name: accept + attributes: + - + name: sockfd + type: u32 + - + name: handler-class + type: u32 + enum: handler-class + - + name: message-type + type: u32 + enum: msg-type + - + name: timeout + type: u32 + - + name: auth-mode + type: u32 + enum: auth + - + name: peer-identity + type: u32 + multi-attr: true + - + name: certificate + type: nest + nested-attributes: x509 + multi-attr: true + - + name: done + attributes: + - + name: status + type: u32 + - + name: sockfd + type: u32 + - + name: remote-auth + type: u32 + multi-attr: true + +operations: + list: + - + name: ready + doc: Notify handlers that a new handshake request is waiting + notify: accept + - + name: accept + doc: Handler retrieves next queued handshake request + attribute-set: accept + flags: [ admin-perm ] + do: + request: + attributes: + - handler-class + reply: + attributes: + - sockfd + - message-type + - timeout + - auth-mode + - peer-identity + - certificate + - + name: done + doc: Handler reports handshake completion + attribute-set: done + do: + request: + attributes: + - status + - sockfd + - remote-auth + +mcast-groups: + list: + - + name: none + - + name: tlshd diff --git a/Documentation/netlink/specs/ovs_datapath.yaml b/Documentation/netlink/specs/ovs_datapath.yaml new file mode 100644 index 000000000000..6d71db8c4416 --- /dev/null +++ b/Documentation/netlink/specs/ovs_datapath.yaml @@ -0,0 +1,153 @@ +# SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) + +name: ovs_datapath +version: 2 +protocol: genetlink-legacy + +doc: + OVS datapath configuration over generic netlink. + +definitions: + - + name: ovs-header + type: struct + members: + - + name: dp-ifindex + type: u32 + - + name: user-features + type: flags + entries: + - + name: unaligned + doc: Allow last Netlink attribute to be unaligned + - + name: vport-pids + doc: Allow datapath to associate multiple Netlink PIDs to each vport + - + name: tc-recirc-sharing + doc: Allow tc offload recirc sharing + - + name: dispatch-upcall-per-cpu + doc: Allow per-cpu dispatch of upcalls + - + name: datapath-stats + type: struct + members: + - + name: hit + type: u64 + - + name: missed + type: u64 + - + name: lost + type: u64 + - + name: flows + type: u64 + - + name: megaflow-stats + type: struct + members: + - + name: mask-hit + type: u64 + - + name: masks + type: u32 + - + name: padding + type: u32 + - + name: cache-hits + type: u64 + - + name: pad1 + type: u64 + +attribute-sets: + - + name: datapath + attributes: + - + name: name + type: string + - + name: upcall-pid + doc: upcall pid + type: u32 + - + name: stats + type: binary + struct: datapath-stats + - + name: megaflow-stats + type: binary + struct: megaflow-stats + - + name: user-features + type: u32 + enum: user-features + enum-as-flags: true + - + name: pad + type: unused + - + name: masks-cache-size + type: u32 + - + name: per-cpu-pids + type: binary + sub-type: u32 + +operations: + fixed-header: ovs-header + list: + - + name: dp-get + doc: Get / dump OVS data path configuration and state + value: 3 + attribute-set: datapath + do: &dp-get-op + request: + attributes: + - name + reply: + attributes: + - name + - upcall-pid + - stats + - megaflow-stats + - user-features + - masks-cache-size + - per-cpu-pids + dump: *dp-get-op + - + name: dp-new + doc: Create new OVS data path + value: 1 + attribute-set: datapath + do: + request: + attributes: + - dp-ifindex + - name + - upcall-pid + - user-features + - + name: dp-del + doc: Delete existing OVS data path + value: 2 + attribute-set: datapath + do: + request: + attributes: + - dp-ifindex + - name + +mcast-groups: + list: + - + name: ovs_datapath diff --git a/Documentation/netlink/specs/ovs_vport.yaml b/Documentation/netlink/specs/ovs_vport.yaml new file mode 100644 index 000000000000..8e55622ddf11 --- /dev/null +++ b/Documentation/netlink/specs/ovs_vport.yaml @@ -0,0 +1,139 @@ +# SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) + +name: ovs_vport +version: 2 +protocol: genetlink-legacy + +doc: + OVS vport configuration over generic netlink. + +definitions: + - + name: ovs-header + type: struct + members: + - + name: dp-ifindex + type: u32 + - + name: vport-type + type: enum + entries: [ unspec, netdev, internal, gre, vxlan, geneve ] + - + name: vport-stats + type: struct + members: + - + name: rx-packets + type: u64 + - + name: tx-packets + type: u64 + - + name: rx-bytes + type: u64 + - + name: tx-bytes + type: u64 + - + name: rx-errors + type: u64 + - + name: tx-errors + type: u64 + - + name: rx-dropped + type: u64 + - + name: tx-dropped + type: u64 + +attribute-sets: + - + name: vport-options + attributes: + - + name: dst-port + type: u32 + - + name: extension + type: u32 + - + name: upcall-stats + attributes: + - + name: success + type: u64 + value: 0 + - + name: fail + type: u64 + - + name: vport + attributes: + - + name: port-no + type: u32 + - + name: type + type: u32 + enum: vport-type + - + name: name + type: string + - + name: options + type: nest + nested-attributes: vport-options + - + name: upcall-pid + type: binary + sub-type: u32 + - + name: stats + type: binary + struct: vport-stats + - + name: pad + type: unused + - + name: ifindex + type: u32 + - + name: netnsid + type: u32 + - + name: upcall-stats + type: nest + nested-attributes: upcall-stats + +operations: + list: + - + name: vport-get + doc: Get / dump OVS vport configuration and state + value: 3 + attribute-set: vport + fixed-header: ovs-header + do: &vport-get-op + request: + attributes: + - dp-ifindex + - name + reply: &dev-all + attributes: + - dp-ifindex + - port-no + - type + - name + - upcall-pid + - stats + - ifindex + - netnsid + - upcall-stats + dump: *vport-get-op + +mcast-groups: + list: + - + name: ovs_vport diff --git a/Documentation/networking/device_drivers/can/ctu/ctucanfd-driver.rst b/Documentation/networking/device_drivers/can/ctu/ctucanfd-driver.rst index 1a4fc6607582..1661d13174d5 100644 --- a/Documentation/networking/device_drivers/can/ctu/ctucanfd-driver.rst +++ b/Documentation/networking/device_drivers/can/ctu/ctucanfd-driver.rst @@ -229,8 +229,7 @@ frames for a while. This has a potential to avoid the costly round of enabling interrupts, handling an incoming IRQ in ISR, re-enabling the softirq and switching context back to softirq. -More detailed documentation of NAPI may be found on the pages of Linux -Foundation `<https://wiki.linuxfoundation.org/networking/napi>`_. +See :ref:`Documentation/networking/napi.rst <napi>` for more information. Integrating the core to Xilinx Zynq ----------------------------------- diff --git a/Documentation/networking/device_drivers/ethernet/amd/pds_core.rst b/Documentation/networking/device_drivers/ethernet/amd/pds_core.rst new file mode 100644 index 000000000000..9e8a16c44102 --- /dev/null +++ b/Documentation/networking/device_drivers/ethernet/amd/pds_core.rst @@ -0,0 +1,139 @@ +.. SPDX-License-Identifier: GPL-2.0+ + +======================================================== +Linux Driver for the AMD/Pensando(R) DSC adapter family +======================================================== + +Copyright(c) 2023 Advanced Micro Devices, Inc + +Identifying the Adapter +======================= + +To find if one or more AMD/Pensando PCI Core devices are installed on the +host, check for the PCI devices:: + + # lspci -d 1dd8:100c + b5:00.0 Processing accelerators: Pensando Systems Device 100c + b6:00.0 Processing accelerators: Pensando Systems Device 100c + +If such devices are listed as above, then the pds_core.ko driver should find +and configure them for use. There should be log entries in the kernel +messages such as these:: + + $ dmesg | grep pds_core + pds_core 0000:b5:00.0: 252.048 Gb/s available PCIe bandwidth (16.0 GT/s PCIe x16 link) + pds_core 0000:b5:00.0: FW: 1.60.0-73 + pds_core 0000:b6:00.0: 252.048 Gb/s available PCIe bandwidth (16.0 GT/s PCIe x16 link) + pds_core 0000:b6:00.0: FW: 1.60.0-73 + +Driver and firmware version information can be gathered with devlink:: + + $ devlink dev info pci/0000:b5:00.0 + pci/0000:b5:00.0: + driver pds_core + serial_number FLM18420073 + versions: + fixed: + asic.id 0x0 + asic.rev 0x0 + running: + fw 1.51.0-73 + stored: + fw.goldfw 1.15.9-C-22 + fw.mainfwa 1.60.0-73 + fw.mainfwb 1.60.0-57 + +Info versions +============= + +The ``pds_core`` driver reports the following versions + +.. list-table:: devlink info versions implemented + :widths: 5 5 90 + + * - Name + - Type + - Description + * - ``fw`` + - running + - Version of firmware running on the device + * - ``fw.goldfw`` + - stored + - Version of firmware stored in the goldfw slot + * - ``fw.mainfwa`` + - stored + - Version of firmware stored in the mainfwa slot + * - ``fw.mainfwb`` + - stored + - Version of firmware stored in the mainfwb slot + * - ``asic.id`` + - fixed + - The ASIC type for this device + * - ``asic.rev`` + - fixed + - The revision of the ASIC for this device + +Parameters +========== + +The ``pds_core`` driver implements the following generic +parameters for controlling the functionality to be made available +as auxiliary_bus devices. + +.. list-table:: Generic parameters implemented + :widths: 5 5 8 82 + + * - Name + - Mode + - Type + - Description + * - ``enable_vnet`` + - runtime + - Boolean + - Enables vDPA functionality through an auxiliary_bus device + +Firmware Management +=================== + +The ``flash`` command can update a the DSC firmware. The downloaded firmware +will be saved into either of firmware bank 1 or bank 2, whichever is not +currently in use, and that bank will used for the next boot:: + + # devlink dev flash pci/0000:b5:00.0 \ + file pensando/dsc_fw_1.63.0-22.tar + +Health Reporters +================ + +The driver supports a devlink health reporter for FW status:: + + # devlink health show pci/0000:2b:00.0 reporter fw + pci/0000:2b:00.0: + reporter fw + state healthy error 0 recover 0 + # devlink health diagnose pci/0000:2b:00.0 reporter fw + Status: healthy State: 1 Generation: 0 Recoveries: 0 + +Enabling the driver +=================== + +The driver is enabled via the standard kernel configuration system, +using the make command:: + + make oldconfig/menuconfig/etc. + +The driver is located in the menu structure at: + + -> Device Drivers + -> Network device support (NETDEVICES [=y]) + -> Ethernet driver support + -> AMD devices + -> AMD/Pensando Ethernet PDS_CORE Support + +Support +======= + +For general Linux networking support, please use the netdev mailing +list, which is monitored by AMD/Pensando personnel:: + + netdev@vger.kernel.org diff --git a/Documentation/networking/device_drivers/ethernet/index.rst b/Documentation/networking/device_drivers/ethernet/index.rst index 392969ac88ad..417ca514a4d0 100644 --- a/Documentation/networking/device_drivers/ethernet/index.rst +++ b/Documentation/networking/device_drivers/ethernet/index.rst @@ -14,6 +14,7 @@ Contents: 3com/vortex amazon/ena altera/altera_tse + amd/pds_core aquantia/atlantic chelsio/cxgb cirrus/cs89x0 @@ -31,7 +32,6 @@ Contents: intel/fm10k intel/igb intel/igbvf - intel/ixgb intel/ixgbe intel/ixgbevf intel/i40e diff --git a/Documentation/networking/device_drivers/ethernet/intel/e100.rst b/Documentation/networking/device_drivers/ethernet/intel/e100.rst index 3d4a9ba21946..5dee1b53e977 100644 --- a/Documentation/networking/device_drivers/ethernet/intel/e100.rst +++ b/Documentation/networking/device_drivers/ethernet/intel/e100.rst @@ -151,8 +151,7 @@ NAPI NAPI (Rx polling mode) is supported in the e100 driver. -See https://wiki.linuxfoundation.org/networking/napi for more -information on NAPI. +See :ref:`Documentation/networking/napi.rst <napi>` for more information. Multiple Interfaces on Same Ethernet Broadcast Network ------------------------------------------------------ @@ -181,8 +180,6 @@ Support For general information, go to the Intel support website at: https://www.intel.com/support/ -or the Intel Wired Networking project hosted by Sourceforge at: -http://sourceforge.net/projects/e1000 If an issue is identified with the released source code on a supported kernel with a supported adapter, email the specific information related to the issue -to e1000-devel@lists.sf.net. +to intel-wired-lan@lists.osuosl.org. diff --git a/Documentation/networking/device_drivers/ethernet/intel/e1000.rst b/Documentation/networking/device_drivers/ethernet/intel/e1000.rst index 4aaae0f7d6ba..52a7fb9ce8d9 100644 --- a/Documentation/networking/device_drivers/ethernet/intel/e1000.rst +++ b/Documentation/networking/device_drivers/ethernet/intel/e1000.rst @@ -451,13 +451,8 @@ Support ======= For general information, go to the Intel support website at: - - http://support.intel.com - -or the Intel Wired Networking project hosted by Sourceforge at: - - http://sourceforge.net/projects/e1000 +http://support.intel.com If an issue is identified with the released source code on the supported kernel with a supported adapter, email the specific information related -to the issue to e1000-devel@lists.sf.net +to the issue to intel-wired-lan@lists.osuosl.org. diff --git a/Documentation/networking/device_drivers/ethernet/intel/e1000e.rst b/Documentation/networking/device_drivers/ethernet/intel/e1000e.rst index f49cd370e7bf..d8f810afdd49 100644 --- a/Documentation/networking/device_drivers/ethernet/intel/e1000e.rst +++ b/Documentation/networking/device_drivers/ethernet/intel/e1000e.rst @@ -371,13 +371,8 @@ NOTE: Wake on LAN is only supported on port A for the following devices: Support ======= For general information, go to the Intel support website at: - https://www.intel.com/support/ -or the Intel Wired Networking project hosted by Sourceforge at: - -https://sourceforge.net/projects/e1000 - If an issue is identified with the released source code on a supported kernel with a supported adapter, email the specific information related to the issue -to e1000-devel@lists.sf.net. +to intel-wired-lan@lists.osuosl.org. diff --git a/Documentation/networking/device_drivers/ethernet/intel/fm10k.rst b/Documentation/networking/device_drivers/ethernet/intel/fm10k.rst index 9258ef6f515c..396a2c8c3db1 100644 --- a/Documentation/networking/device_drivers/ethernet/intel/fm10k.rst +++ b/Documentation/networking/device_drivers/ethernet/intel/fm10k.rst @@ -130,13 +130,8 @@ the Intel Ethernet Controller XL710. Support ======= For general information, go to the Intel support website at: - https://www.intel.com/support/ -or the Intel Wired Networking project hosted by Sourceforge at: - -https://sourceforge.net/projects/e1000 - If an issue is identified with the released source code on a supported kernel with a supported adapter, email the specific information related to the issue -to e1000-devel@lists.sf.net. +to intel-wired-lan@lists.osuosl.org. diff --git a/Documentation/networking/device_drivers/ethernet/intel/i40e.rst b/Documentation/networking/device_drivers/ethernet/intel/i40e.rst index ac35bd472bdc..4fbaa1a2d674 100644 --- a/Documentation/networking/device_drivers/ethernet/intel/i40e.rst +++ b/Documentation/networking/device_drivers/ethernet/intel/i40e.rst @@ -399,8 +399,8 @@ operate only in full duplex and only at their native speed. NAPI ---- NAPI (Rx polling mode) is supported in the i40e driver. -For more information on NAPI, see -https://wiki.linuxfoundation.org/networking/napi + +See :ref:`Documentation/networking/napi.rst <napi>` for more information. Flow Control ------------ @@ -759,13 +759,8 @@ enabled when setting up DCB on your switch. Support ======= For general information, go to the Intel support website at: - https://www.intel.com/support/ -or the Intel Wired Networking project hosted by Sourceforge at: - -https://sourceforge.net/projects/e1000 - If an issue is identified with the released source code on a supported kernel with a supported adapter, email the specific information related to the issue -to e1000-devel@lists.sf.net. +to intel-wired-lan@lists.osuosl.org. diff --git a/Documentation/networking/device_drivers/ethernet/intel/iavf.rst b/Documentation/networking/device_drivers/ethernet/intel/iavf.rst index 151af0a8da9c..eb926c3bd4cd 100644 --- a/Documentation/networking/device_drivers/ethernet/intel/iavf.rst +++ b/Documentation/networking/device_drivers/ethernet/intel/iavf.rst @@ -319,13 +319,8 @@ This is caused by the way the Linux kernel reports this stressed condition. Support ======= For general information, go to the Intel support website at: - https://support.intel.com -or the Intel Wired Networking project hosted by Sourceforge at: - -https://sourceforge.net/projects/e1000 - If an issue is identified with the released source code on the supported kernel with a supported adapter, email the specific information related to the issue -to e1000-devel@lists.sf.net +to intel-wired-lan@lists.osuosl.org. diff --git a/Documentation/networking/device_drivers/ethernet/intel/ice.rst b/Documentation/networking/device_drivers/ethernet/intel/ice.rst index 5efea4dd1251..69695e5511f4 100644 --- a/Documentation/networking/device_drivers/ethernet/intel/ice.rst +++ b/Documentation/networking/device_drivers/ethernet/intel/ice.rst @@ -817,10 +817,10 @@ NOTE: NAPI ---- + This driver supports NAPI (Rx polling mode). -For more information on NAPI, see -https://wiki.linuxfoundation.org/networking/napi +See :ref:`Documentation/networking/napi.rst <napi>` for more information. MACVLAN ------- @@ -1026,12 +1026,9 @@ Support For general information, go to the Intel support website at: https://www.intel.com/support/ -or the Intel Wired Networking project hosted by Sourceforge at: -https://sourceforge.net/projects/e1000 - If an issue is identified with the released source code on a supported kernel with a supported adapter, email the specific information related to the issue -to e1000-devel@lists.sf.net. +to intel-wired-lan@lists.osuosl.org. Trademarks diff --git a/Documentation/networking/device_drivers/ethernet/intel/igb.rst b/Documentation/networking/device_drivers/ethernet/intel/igb.rst index d46289e182cf..fbd590b6a0d6 100644 --- a/Documentation/networking/device_drivers/ethernet/intel/igb.rst +++ b/Documentation/networking/device_drivers/ethernet/intel/igb.rst @@ -201,13 +201,8 @@ NOTE: This feature is exclusive to i210 models. Support ======= For general information, go to the Intel support website at: - https://www.intel.com/support/ -or the Intel Wired Networking project hosted by Sourceforge at: - -https://sourceforge.net/projects/e1000 - If an issue is identified with the released source code on a supported kernel with a supported adapter, email the specific information related to the issue -to e1000-devel@lists.sf.net. +to intel-wired-lan@lists.osuosl.org. diff --git a/Documentation/networking/device_drivers/ethernet/intel/igbvf.rst b/Documentation/networking/device_drivers/ethernet/intel/igbvf.rst index 40fa210c5e14..11a9017f3069 100644 --- a/Documentation/networking/device_drivers/ethernet/intel/igbvf.rst +++ b/Documentation/networking/device_drivers/ethernet/intel/igbvf.rst @@ -53,13 +53,8 @@ https://www.kernel.org/pub/software/network/ethtool/ Support ======= For general information, go to the Intel support website at: - https://www.intel.com/support/ -or the Intel Wired Networking project hosted by Sourceforge at: - -https://sourceforge.net/projects/e1000 - If an issue is identified with the released source code on a supported kernel with a supported adapter, email the specific information related to the issue -to e1000-devel@lists.sf.net. +to intel-wired-lan@lists.osuosl.org. diff --git a/Documentation/networking/device_drivers/ethernet/intel/ixgb.rst b/Documentation/networking/device_drivers/ethernet/intel/ixgb.rst deleted file mode 100644 index c6a233e68ad6..000000000000 --- a/Documentation/networking/device_drivers/ethernet/intel/ixgb.rst +++ /dev/null @@ -1,468 +0,0 @@ -.. SPDX-License-Identifier: GPL-2.0+ - -===================================================================== -Linux Base Driver for 10 Gigabit Intel(R) Ethernet Network Connection -===================================================================== - -October 1, 2018 - - -Contents -======== - -- In This Release -- Identifying Your Adapter -- Command Line Parameters -- Improving Performance -- Additional Configurations -- Known Issues/Troubleshooting -- Support - - - -In This Release -=============== - -This file describes the ixgb Linux Base Driver for the 10 Gigabit Intel(R) -Network Connection. This driver includes support for Itanium(R)2-based -systems. - -For questions related to hardware requirements, refer to the documentation -supplied with your 10 Gigabit adapter. All hardware requirements listed apply -to use with Linux. - -The following features are available in this kernel: - - Native VLANs - - Channel Bonding (teaming) - - SNMP - -Channel Bonding documentation can be found in the Linux kernel source: -/Documentation/networking/bonding.rst - -The driver information previously displayed in the /proc filesystem is not -supported in this release. Alternatively, you can use ethtool (version 1.6 -or later), lspci, and iproute2 to obtain the same information. - -Instructions on updating ethtool can be found in the section "Additional -Configurations" later in this document. - - -Identifying Your Adapter -======================== - -The following Intel network adapters are compatible with the drivers in this -release: - -+------------+------------------------------+----------------------------------+ -| Controller | Adapter Name | Physical Layer | -+============+==============================+==================================+ -| 82597EX | Intel(R) PRO/10GbE LR/SR/CX4 | - 10G Base-LR (fiber) | -| | Server Adapters | - 10G Base-SR (fiber) | -| | | - 10G Base-CX4 (copper) | -+------------+------------------------------+----------------------------------+ - -For more information on how to identify your adapter, go to the Adapter & -Driver ID Guide at: - - https://support.intel.com - - -Command Line Parameters -======================= - -If the driver is built as a module, the following optional parameters are -used by entering them on the command line with the modprobe command using -this syntax:: - - modprobe ixgb [<option>=<VAL1>,<VAL2>,...] - -For example, with two 10GbE PCI adapters, entering:: - - modprobe ixgb TxDescriptors=80,128 - -loads the ixgb driver with 80 TX resources for the first adapter and 128 TX -resources for the second adapter. - -The default value for each parameter is generally the recommended setting, -unless otherwise noted. - -Copybreak ---------- -:Valid Range: 0-XXXX -:Default Value: 256 - - This is the maximum size of packet that is copied to a new buffer on - receive. - -Debug ------ -:Valid Range: 0-16 (0=none,...,16=all) -:Default Value: 0 - - This parameter adjusts the level of debug messages displayed in the - system logs. - -FlowControl ------------ -:Valid Range: 0-3 (0=none, 1=Rx only, 2=Tx only, 3=Rx&Tx) -:Default Value: 1 if no EEPROM, otherwise read from EEPROM - - This parameter controls the automatic generation(Tx) and response(Rx) to - Ethernet PAUSE frames. There are hardware bugs associated with enabling - Tx flow control so beware. - -RxDescriptors -------------- -:Valid Range: 64-4096 -:Default Value: 1024 - - This value is the number of receive descriptors allocated by the driver. - Increasing this value allows the driver to buffer more incoming packets. - Each descriptor is 16 bytes. A receive buffer is also allocated for - each descriptor and can be either 2048, 4056, 8192, or 16384 bytes, - depending on the MTU setting. When the MTU size is 1500 or less, the - receive buffer size is 2048 bytes. When the MTU is greater than 1500 the - receive buffer size will be either 4056, 8192, or 16384 bytes. The - maximum MTU size is 16114. - -TxDescriptors -------------- -:Valid Range: 64-4096 -:Default Value: 256 - - This value is the number of transmit descriptors allocated by the driver. - Increasing this value allows the driver to queue more transmits. Each - descriptor is 16 bytes. - -RxIntDelay ----------- -:Valid Range: 0-65535 (0=off) -:Default Value: 72 - - This value delays the generation of receive interrupts in units of - 0.8192 microseconds. Receive interrupt reduction can improve CPU - efficiency if properly tuned for specific network traffic. Increasing - this value adds extra latency to frame reception and can end up - decreasing the throughput of TCP traffic. If the system is reporting - dropped receives, this value may be set too high, causing the driver to - run out of available receive descriptors. - -TxIntDelay ----------- -:Valid Range: 0-65535 (0=off) -:Default Value: 32 - - This value delays the generation of transmit interrupts in units of - 0.8192 microseconds. Transmit interrupt reduction can improve CPU - efficiency if properly tuned for specific network traffic. Increasing - this value adds extra latency to frame transmission and can end up - decreasing the throughput of TCP traffic. If this value is set too high, - it will cause the driver to run out of available transmit descriptors. - -XsumRX ------- -:Valid Range: 0-1 -:Default Value: 1 - - A value of '1' indicates that the driver should enable IP checksum - offload for received packets (both UDP and TCP) to the adapter hardware. - -RxFCHighThresh --------------- -:Valid Range: 1,536-262,136 (0x600 - 0x3FFF8, 8 byte granularity) -:Default Value: 196,608 (0x30000) - - Receive Flow control high threshold (when we send a pause frame) - -RxFCLowThresh -------------- -:Valid Range: 64-262,136 (0x40 - 0x3FFF8, 8 byte granularity) -:Default Value: 163,840 (0x28000) - - Receive Flow control low threshold (when we send a resume frame) - -FCReqTimeout ------------- -:Valid Range: 1-65535 -:Default Value: 65535 - - Flow control request timeout (how long to pause the link partner's tx) - -IntDelayEnable --------------- -:Value Range: 0,1 -:Default Value: 1 - - Interrupt Delay, 0 disables transmit interrupt delay and 1 enables it. - - -Improving Performance -===================== - -With the 10 Gigabit server adapters, the default Linux configuration will -very likely limit the total available throughput artificially. There is a set -of configuration changes that, when applied together, will increase the ability -of Linux to transmit and receive data. The following enhancements were -originally acquired from settings published at https://www.spec.org/web99/ for -various submitted results using Linux. - -NOTE: - These changes are only suggestions, and serve as a starting point for - tuning your network performance. - -The changes are made in three major ways, listed in order of greatest effect: - -- Use ip link to modify the mtu (maximum transmission unit) and the txqueuelen - parameter. -- Use sysctl to modify /proc parameters (essentially kernel tuning) -- Use setpci to modify the MMRBC field in PCI-X configuration space to increase - transmit burst lengths on the bus. - -NOTE: - setpci modifies the adapter's configuration registers to allow it to read - up to 4k bytes at a time (for transmits). However, for some systems the - behavior after modifying this register may be undefined (possibly errors of - some kind). A power-cycle, hard reset or explicitly setting the e6 register - back to 22 (setpci -d 8086:1a48 e6.b=22) may be required to get back to a - stable configuration. - -- COPY these lines and paste them into ixgb_perf.sh: - -:: - - #!/bin/bash - echo "configuring network performance , edit this file to change the interface - or device ID of 10GbE card" - # set mmrbc to 4k reads, modify only Intel 10GbE device IDs - # replace 1a48 with appropriate 10GbE device's ID installed on the system, - # if needed. - setpci -d 8086:1a48 e6.b=2e - # set the MTU (max transmission unit) - it requires your switch and clients - # to change as well. - # set the txqueuelen - # your ixgb adapter should be loaded as eth1 for this to work, change if needed - ip li set dev eth1 mtu 9000 txqueuelen 1000 up - # call the sysctl utility to modify /proc/sys entries - sysctl -p ./sysctl_ixgb.conf - -- COPY these lines and paste them into sysctl_ixgb.conf: - -:: - - # some of the defaults may be different for your kernel - # call this file with sysctl -p <this file> - # these are just suggested values that worked well to increase throughput in - # several network benchmark tests, your mileage may vary - - ### IPV4 specific settings - # turn TCP timestamp support off, default 1, reduces CPU use - net.ipv4.tcp_timestamps = 0 - # turn SACK support off, default on - # on systems with a VERY fast bus -> memory interface this is the big gainer - net.ipv4.tcp_sack = 0 - # set min/default/max TCP read buffer, default 4096 87380 174760 - net.ipv4.tcp_rmem = 10000000 10000000 10000000 - # set min/pressure/max TCP write buffer, default 4096 16384 131072 - net.ipv4.tcp_wmem = 10000000 10000000 10000000 - # set min/pressure/max TCP buffer space, default 31744 32256 32768 - net.ipv4.tcp_mem = 10000000 10000000 10000000 - - ### CORE settings (mostly for socket and UDP effect) - # set maximum receive socket buffer size, default 131071 - net.core.rmem_max = 524287 - # set maximum send socket buffer size, default 131071 - net.core.wmem_max = 524287 - # set default receive socket buffer size, default 65535 - net.core.rmem_default = 524287 - # set default send socket buffer size, default 65535 - net.core.wmem_default = 524287 - # set maximum amount of option memory buffers, default 10240 - net.core.optmem_max = 524287 - # set number of unprocessed input packets before kernel starts dropping them; default 300 - net.core.netdev_max_backlog = 300000 - -Edit the ixgb_perf.sh script if necessary to change eth1 to whatever interface -your ixgb driver is using and/or replace '1a48' with appropriate 10GbE device's -ID installed on the system. - -NOTE: - Unless these scripts are added to the boot process, these changes will - only last only until the next system reboot. - - -Resolving Slow UDP Traffic --------------------------- -If your server does not seem to be able to receive UDP traffic as fast as it -can receive TCP traffic, it could be because Linux, by default, does not set -the network stack buffers as large as they need to be to support high UDP -transfer rates. One way to alleviate this problem is to allow more memory to -be used by the IP stack to store incoming data. - -For instance, use the commands:: - - sysctl -w net.core.rmem_max=262143 - -and:: - - sysctl -w net.core.rmem_default=262143 - -to increase the read buffer memory max and default to 262143 (256k - 1) from -defaults of max=131071 (128k - 1) and default=65535 (64k - 1). These variables -will increase the amount of memory used by the network stack for receives, and -can be increased significantly more if necessary for your application. - - -Additional Configurations -========================= - -Configuring the Driver on Different Distributions -------------------------------------------------- -Configuring a network driver to load properly when the system is started is -distribution dependent. Typically, the configuration process involves adding -an alias line to /etc/modprobe.conf as well as editing other system startup -scripts and/or configuration files. Many popular Linux distributions ship -with tools to make these changes for you. To learn the proper way to -configure a network device for your system, refer to your distribution -documentation. If during this process you are asked for the driver or module -name, the name for the Linux Base Driver for the Intel 10GbE Family of -Adapters is ixgb. - -Viewing Link Messages ---------------------- -Link messages will not be displayed to the console if the distribution is -restricting system messages. In order to see network driver link messages on -your console, set dmesg to eight by entering the following:: - - dmesg -n 8 - -NOTE: This setting is not saved across reboots. - -Jumbo Frames ------------- -The driver supports Jumbo Frames for all adapters. Jumbo Frames support is -enabled by changing the MTU to a value larger than the default of 1500. -The maximum value for the MTU is 16114. Use the ip command to -increase the MTU size. For example:: - - ip li set dev ethx mtu 9000 - -The maximum MTU setting for Jumbo Frames is 16114. This value coincides -with the maximum Jumbo Frames size of 16128. - -Ethtool -------- -The driver utilizes the ethtool interface for driver configuration and -diagnostics, as well as displaying statistical information. The ethtool -version 1.6 or later is required for this functionality. - -The latest release of ethtool can be found from -https://www.kernel.org/pub/software/network/ethtool/ - -NOTE: - The ethtool version 1.6 only supports a limited set of ethtool options. - Support for a more complete ethtool feature set can be enabled by - upgrading to the latest version. - -NAPI ----- -NAPI (Rx polling mode) is supported in the ixgb driver. - -See https://wiki.linuxfoundation.org/networking/napi for more information on -NAPI. - - -Known Issues/Troubleshooting -============================ - -NOTE: - After installing the driver, if your Intel Network Connection is not - working, verify in the "In This Release" section of the readme that you have - installed the correct driver. - -Cable Interoperability Issue with Fujitsu XENPAK Module in SmartBits Chassis ----------------------------------------------------------------------------- -Excessive CRC errors may be observed if the Intel(R) PRO/10GbE CX4 -Server adapter is connected to a Fujitsu XENPAK CX4 module in a SmartBits -chassis using 15 m/24AWG cable assemblies manufactured by Fujitsu or Leoni. -The CRC errors may be received either by the Intel(R) PRO/10GbE CX4 -Server adapter or the SmartBits. If this situation occurs using a different -cable assembly may resolve the issue. - -Cable Interoperability Issues with HP Procurve 3400cl Switch Port ------------------------------------------------------------------ -Excessive CRC errors may be observed if the Intel(R) PRO/10GbE CX4 Server -adapter is connected to an HP Procurve 3400cl switch port using short cables -(1 m or shorter). If this situation occurs, using a longer cable may resolve -the issue. - -Excessive CRC errors may be observed using Fujitsu 24AWG cable assemblies that -Are 10 m or longer or where using a Leoni 15 m/24AWG cable assembly. The CRC -errors may be received either by the CX4 Server adapter or at the switch. If -this situation occurs, using a different cable assembly may resolve the issue. - -Jumbo Frames System Requirement -------------------------------- -Memory allocation failures have been observed on Linux systems with 64 MB -of RAM or less that are running Jumbo Frames. If you are using Jumbo -Frames, your system may require more than the advertised minimum -requirement of 64 MB of system memory. - -Performance Degradation with Jumbo Frames ------------------------------------------ -Degradation in throughput performance may be observed in some Jumbo frames -environments. If this is observed, increasing the application's socket buffer -size and/or increasing the /proc/sys/net/ipv4/tcp_*mem entry values may help. -See the specific application manual and /usr/src/linux*/Documentation/ -networking/ip-sysctl.txt for more details. - -Allocating Rx Buffers when Using Jumbo Frames ---------------------------------------------- -Allocating Rx buffers when using Jumbo Frames on 2.6.x kernels may fail if -the available memory is heavily fragmented. This issue may be seen with PCI-X -adapters or with packet split disabled. This can be reduced or eliminated -by changing the amount of available memory for receive buffer allocation, by -increasing /proc/sys/vm/min_free_kbytes. - -Multiple Interfaces on Same Ethernet Broadcast Network ------------------------------------------------------- -Due to the default ARP behavior on Linux, it is not possible to have -one system on two IP networks in the same Ethernet broadcast domain -(non-partitioned switch) behave as expected. All Ethernet interfaces -will respond to IP traffic for any IP address assigned to the system. -This results in unbalanced receive traffic. - -If you have multiple interfaces in a server, do either of the following: - - - Turn on ARP filtering by entering:: - - echo 1 > /proc/sys/net/ipv4/conf/all/arp_filter - - - Install the interfaces in separate broadcast domains - either in - different switches or in a switch partitioned to VLANs. - -UDP Stress Test Dropped Packet Issue --------------------------------------- -Under small packets UDP stress test with 10GbE driver, the Linux system -may drop UDP packets due to the fullness of socket buffers. You may want -to change the driver's Flow Control variables to the minimum value for -controlling packet reception. - -Tx Hangs Possible Under Stress ------------------------------- -Under stress conditions, if TX hangs occur, turning off TSO -"ethtool -K eth0 tso off" may resolve the problem. - - -Support -======= -For general information, go to the Intel support website at: - -https://www.intel.com/support/ - -or the Intel Wired Networking project hosted by Sourceforge at: - -https://sourceforge.net/projects/e1000 - -If an issue is identified with the released source code on a supported kernel -with a supported adapter, email the specific information related to the issue -to e1000-devel@lists.sf.net diff --git a/Documentation/networking/device_drivers/ethernet/intel/ixgbe.rst b/Documentation/networking/device_drivers/ethernet/intel/ixgbe.rst index 0a233b17c664..1e5f16993f69 100644 --- a/Documentation/networking/device_drivers/ethernet/intel/ixgbe.rst +++ b/Documentation/networking/device_drivers/ethernet/intel/ixgbe.rst @@ -545,13 +545,8 @@ on the Intel Ethernet Controller XL710. Support ======= For general information, go to the Intel support website at: - https://www.intel.com/support/ -or the Intel Wired Networking project hosted by Sourceforge at: - -https://sourceforge.net/projects/e1000 - If an issue is identified with the released source code on a supported kernel with a supported adapter, email the specific information related to the issue -to e1000-devel@lists.sf.net. +to intel-wired-lan@lists.osuosl.org. diff --git a/Documentation/networking/device_drivers/ethernet/intel/ixgbevf.rst b/Documentation/networking/device_drivers/ethernet/intel/ixgbevf.rst index 76bbde736f21..08dc0d368a48 100644 --- a/Documentation/networking/device_drivers/ethernet/intel/ixgbevf.rst +++ b/Documentation/networking/device_drivers/ethernet/intel/ixgbevf.rst @@ -55,13 +55,8 @@ VLANs: There is a limit of a total of 64 shared VLANs to 1 or more VFs. Support ======= For general information, go to the Intel support website at: - https://www.intel.com/support/ -or the Intel Wired Networking project hosted by Sourceforge at: - -https://sourceforge.net/projects/e1000 - If an issue is identified with the released source code on a supported kernel with a supported adapter, email the specific information related to the issue -to e1000-devel@lists.sf.net. +to intel-wired-lan@lists.osuosl.org. diff --git a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst index 4cd8e869762b..6b2d1fe74ecf 100644 --- a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst +++ b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst @@ -346,32 +346,6 @@ the software port. - The number of receive packets with CQE compression on ring i [#accel]_. - Acceleration - * - `rx[i]_cache_reuse` - - The number of events of successful reuse of a page from a driver's - internal page cache. - - Acceleration - - * - `rx[i]_cache_full` - - The number of events of full internal page cache where driver can't put a - page back to the cache for recycling (page will be freed). - - Acceleration - - * - `rx[i]_cache_empty` - - The number of events where cache was empty - no page to give. Driver - shall allocate new page. - - Acceleration - - * - `rx[i]_cache_busy` - - The number of events where cache head was busy and cannot be recycled. - Driver allocated new page. - - Acceleration - - * - `rx[i]_cache_waive` - - The number of cache evacuation. This can occur due to page move to - another NUMA node or page was pfmemalloc-ed and should be freed as soon - as possible. - - Acceleration - * - `rx[i]_arfs_err` - Number of flow rules that failed to be added to the flow table. - Error diff --git a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/devlink.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/devlink.rst index 9b5c40ba7f0d..3a7a714cc08f 100644 --- a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/devlink.rst +++ b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/devlink.rst @@ -122,6 +122,41 @@ users try to enable them. $ devlink dev eswitch set pci/0000:06:00.0 mode switchdev +hairpin_num_queues: Number of hairpin queues +-------------------------------------------- +We refer to a TC NIC rule that involves forwarding as "hairpin". + +Hairpin queues are mlx5 hardware specific implementation for hardware +forwarding of such packets. + +- Show the number of hairpin queues:: + + $ devlink dev param show pci/0000:06:00.0 name hairpin_num_queues + pci/0000:06:00.0: + name hairpin_num_queues type driver-specific + values: + cmode driverinit value 2 + +- Change the number of hairpin queues:: + + $ devlink dev param set pci/0000:06:00.0 name hairpin_num_queues value 4 cmode driverinit + +hairpin_queue_size: Size of the hairpin queues +---------------------------------------------- +Control the size of the hairpin queues. + +- Show the size of the hairpin queues:: + + $ devlink dev param show pci/0000:06:00.0 name hairpin_queue_size + pci/0000:06:00.0: + name hairpin_queue_size type driver-specific + values: + cmode driverinit value 1024 + +- Change the size (in packets) of the hairpin queues:: + + $ devlink dev param set pci/0000:06:00.0 name hairpin_queue_size value 512 cmode driverinit + Health reporters ================ @@ -222,3 +257,36 @@ User commands examples: $ devlink health dump show pci/0000:82:00.1 reporter fw_fatal NOTE: This command can run only on PF. + +vnic reporter +------------- +The vnic reporter implements only the `diagnose` callback. +It is responsible for querying the vnic diagnostic counters from fw and displaying +them in realtime. + +Description of the vnic counters: +total_q_under_processor_handle: number of queues in an error state due to +an async error or errored command. +send_queue_priority_update_flow: number of QP/SQ priority/SL update +events. +cq_overrun: number of times CQ entered an error state due to an +overflow. +async_eq_overrun: number of times an EQ mapped to async events was +overrun. +comp_eq_overrun: number of times an EQ mapped to completion events was +overrun. +quota_exceeded_command: number of commands issued and failed due to quota +exceeded. +invalid_command: number of commands issued and failed dues to any reason +other than quota exceeded. +nic_receive_steering_discard: number of packets that completed RX flow +steering but were discarded due to a mismatch in flow table. + +User commands examples: +- Diagnose PF/VF vnic counters + $ devlink health diagnose pci/0000:82:00.1 reporter vnic +- Diagnose representor vnic counters (performed by supplying devlink port of the + representor, which can be obtained via devlink port command) + $ devlink health diagnose pci/0000:82:00.1/65537 reporter vnic + +NOTE: This command can run over all interfaces such as PF/VF and representor ports. diff --git a/Documentation/networking/devlink/mlx5.rst b/Documentation/networking/devlink/mlx5.rst index 3321117cf605..202798d6501e 100644 --- a/Documentation/networking/devlink/mlx5.rst +++ b/Documentation/networking/devlink/mlx5.rst @@ -72,6 +72,18 @@ parameters. Default: disabled + * - ``hairpin_num_queues`` + - u32 + - driverinit + - We refer to a TC NIC rule that involves forwarding as "hairpin". + Hairpin queues are mlx5 hardware specific implementation for hardware + forwarding of such packets. + + Control the number of hairpin queues. + * - ``hairpin_queue_size`` + - u32 + - driverinit + - Control the size (in packets) of the hairpin queues. The ``mlx5`` driver supports reloading via ``DEVLINK_CMD_RELOAD`` diff --git a/Documentation/networking/driver.rst b/Documentation/networking/driver.rst index 64f7236ff10b..4f5dfa9c022e 100644 --- a/Documentation/networking/driver.rst +++ b/Documentation/networking/driver.rst @@ -4,94 +4,124 @@ Softnet Driver Issues ===================== -Transmit path guidelines: +Probing guidelines +================== -1) The ndo_start_xmit method must not return NETDEV_TX_BUSY under - any normal circumstances. It is considered a hard error unless - there is no way your device can tell ahead of time when its - transmit function will become busy. +Address validation +------------------ - Instead it must maintain the queue properly. For example, - for a driver implementing scatter-gather this means:: +Any hardware layer address you obtain for your device should +be verified. For example, for ethernet check it with +linux/etherdevice.h:is_valid_ether_addr() + +Close/stop guidelines +===================== + +Quiescence +---------- + +After the ndo_stop routine has been called, the hardware must +not receive or transmit any data. All in flight packets must +be aborted. If necessary, poll or wait for completion of +any reset commands. + +Auto-close +---------- + +The ndo_stop routine will be called by unregister_netdevice +if device is still UP. + +Transmit path guidelines +======================== + +Stop queues in advance +---------------------- + +The ndo_start_xmit method must not return NETDEV_TX_BUSY under +any normal circumstances. It is considered a hard error unless +there is no way your device can tell ahead of time when its +transmit function will become busy. + +Instead it must maintain the queue properly. For example, +for a driver implementing scatter-gather this means: + +.. code-block:: c + + static u32 drv_tx_avail(struct drv_ring *dr) + { + u32 used = READ_ONCE(dr->prod) - READ_ONCE(dr->cons); + + return dr->tx_ring_size - (used & bp->tx_ring_mask); + } static netdev_tx_t drv_hard_start_xmit(struct sk_buff *skb, struct net_device *dev) { struct drv *dp = netdev_priv(dev); + struct netdev_queue *txq; + struct drv_ring *dr; + int idx; - lock_tx(dp); - ... - /* This is a hard error log it. */ - if (TX_BUFFS_AVAIL(dp) <= (skb_shinfo(skb)->nr_frags + 1)) { + idx = skb_get_queue_mapping(skb); + dr = dp->tx_rings[idx]; + txq = netdev_get_tx_queue(dev, idx); + + //... + /* This should be a very rare race - log it. */ + if (drv_tx_avail(dr) <= skb_shinfo(skb)->nr_frags + 1) { netif_stop_queue(dev); - unlock_tx(dp); - printk(KERN_ERR PFX "%s: BUG! Tx Ring full when queue awake!\n", - dev->name); + netdev_warn(dev, "Tx Ring full when queue awake!\n"); return NETDEV_TX_BUSY; } - ... queue packet to card ... - ... update tx consumer index ... - - if (TX_BUFFS_AVAIL(dp) <= (MAX_SKB_FRAGS + 1)) - netif_stop_queue(dev); - - ... - unlock_tx(dp); - ... - return NETDEV_TX_OK; - } - - And then at the end of your TX reclamation event handling:: + //... queue packet to card ... - if (netif_queue_stopped(dp->dev) && - TX_BUFFS_AVAIL(dp) > (MAX_SKB_FRAGS + 1)) - netif_wake_queue(dp->dev); + netdev_tx_sent_queue(txq, skb->len); - For a non-scatter-gather supporting card, the three tests simply become:: + //... update tx producer index using WRITE_ONCE() ... - /* This is a hard error log it. */ - if (TX_BUFFS_AVAIL(dp) <= 0) + if (!netif_txq_maybe_stop(txq, drv_tx_avail(dr), + MAX_SKB_FRAGS + 1, 2 * MAX_SKB_FRAGS)) + dr->stats.stopped++; - and:: + //... + return NETDEV_TX_OK; + } - if (TX_BUFFS_AVAIL(dp) == 0) +And then at the end of your TX reclamation event handling: - and:: +.. code-block:: c - if (netif_queue_stopped(dp->dev) && - TX_BUFFS_AVAIL(dp) > 0) - netif_wake_queue(dp->dev); + //... update tx consumer index using WRITE_ONCE() ... -2) An ndo_start_xmit method must not modify the shared parts of a - cloned SKB. + netif_txq_completed_wake(txq, cmpl_pkts, cmpl_bytes, + drv_tx_avail(dr), 2 * MAX_SKB_FRAGS); -3) Do not forget that once you return NETDEV_TX_OK from your - ndo_start_xmit method, it is your driver's responsibility to free - up the SKB and in some finite amount of time. +Lockless queue stop / wake helper macros +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - For example, this means that it is not allowed for your TX - mitigation scheme to let TX packets "hang out" in the TX - ring unreclaimed forever if no new TX packets are sent. - This error can deadlock sockets waiting for send buffer room - to be freed up. +.. kernel-doc:: include/net/netdev_queues.h + :doc: Lockless queue stopping / waking helpers. - If you return NETDEV_TX_BUSY from the ndo_start_xmit method, you - must not keep any reference to that SKB and you must not attempt - to free it up. +No exclusive ownership +---------------------- -Probing guidelines: +An ndo_start_xmit method must not modify the shared parts of a +cloned SKB. -1) Any hardware layer address you obtain for your device should - be verified. For example, for ethernet check it with - linux/etherdevice.h:is_valid_ether_addr() +Timely completions +------------------ -Close/stop guidelines: +Do not forget that once you return NETDEV_TX_OK from your +ndo_start_xmit method, it is your driver's responsibility to free +up the SKB and in some finite amount of time. -1) After the ndo_stop routine has been called, the hardware must - not receive or transmit any data. All in flight packets must - be aborted. If necessary, poll or wait for completion of - any reset commands. +For example, this means that it is not allowed for your TX +mitigation scheme to let TX packets "hang out" in the TX +ring unreclaimed forever if no new TX packets are sent. +This error can deadlock sockets waiting for send buffer room +to be freed up. -2) The ndo_stop routine will be called by unregister_netdevice - if device is still UP. +If you return NETDEV_TX_BUSY from the ndo_start_xmit method, you +must not keep any reference to that SKB and you must not attempt +to free it up. diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst index e1bc6186d7ea..2540c70952ff 100644 --- a/Documentation/networking/ethtool-netlink.rst +++ b/Documentation/networking/ethtool-netlink.rst @@ -860,22 +860,24 @@ Request contents: Kernel response contents: - ==================================== ====== =========================== - ``ETHTOOL_A_RINGS_HEADER`` nested reply header - ``ETHTOOL_A_RINGS_RX_MAX`` u32 max size of RX ring - ``ETHTOOL_A_RINGS_RX_MINI_MAX`` u32 max size of RX mini ring - ``ETHTOOL_A_RINGS_RX_JUMBO_MAX`` u32 max size of RX jumbo ring - ``ETHTOOL_A_RINGS_TX_MAX`` u32 max size of TX ring - ``ETHTOOL_A_RINGS_RX`` u32 size of RX ring - ``ETHTOOL_A_RINGS_RX_MINI`` u32 size of RX mini ring - ``ETHTOOL_A_RINGS_RX_JUMBO`` u32 size of RX jumbo ring - ``ETHTOOL_A_RINGS_TX`` u32 size of TX ring - ``ETHTOOL_A_RINGS_RX_BUF_LEN`` u32 size of buffers on the ring - ``ETHTOOL_A_RINGS_TCP_DATA_SPLIT`` u8 TCP header / data split - ``ETHTOOL_A_RINGS_CQE_SIZE`` u32 Size of TX/RX CQE - ``ETHTOOL_A_RINGS_TX_PUSH`` u8 flag of TX Push mode - ``ETHTOOL_A_RINGS_RX_PUSH`` u8 flag of RX Push mode - ==================================== ====== =========================== + ======================================= ====== =========================== + ``ETHTOOL_A_RINGS_HEADER`` nested reply header + ``ETHTOOL_A_RINGS_RX_MAX`` u32 max size of RX ring + ``ETHTOOL_A_RINGS_RX_MINI_MAX`` u32 max size of RX mini ring + ``ETHTOOL_A_RINGS_RX_JUMBO_MAX`` u32 max size of RX jumbo ring + ``ETHTOOL_A_RINGS_TX_MAX`` u32 max size of TX ring + ``ETHTOOL_A_RINGS_RX`` u32 size of RX ring + ``ETHTOOL_A_RINGS_RX_MINI`` u32 size of RX mini ring + ``ETHTOOL_A_RINGS_RX_JUMBO`` u32 size of RX jumbo ring + ``ETHTOOL_A_RINGS_TX`` u32 size of TX ring + ``ETHTOOL_A_RINGS_RX_BUF_LEN`` u32 size of buffers on the ring + ``ETHTOOL_A_RINGS_TCP_DATA_SPLIT`` u8 TCP header / data split + ``ETHTOOL_A_RINGS_CQE_SIZE`` u32 Size of TX/RX CQE + ``ETHTOOL_A_RINGS_TX_PUSH`` u8 flag of TX Push mode + ``ETHTOOL_A_RINGS_RX_PUSH`` u8 flag of RX Push mode + ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN`` u32 size of TX push buffer + ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN_MAX`` u32 max size of TX push buffer + ======================================= ====== =========================== ``ETHTOOL_A_RINGS_TCP_DATA_SPLIT`` indicates whether the device is usable with page-flipping TCP zero-copy receive (``getsockopt(TCP_ZEROCOPY_RECEIVE)``). @@ -891,6 +893,18 @@ through MMIO writes, thus reducing the latency. However, enabling this feature may increase the CPU cost. Drivers may enforce additional per-packet eligibility checks (e.g. on packet size). +``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN`` specifies the maximum number of bytes of a +transmitted packet a driver can push directly to the underlying device +('push' mode). Pushing some of the payload bytes to the device has the +advantages of reducing latency for small packets by avoiding DMA mapping (same +as ``ETHTOOL_A_RINGS_TX_PUSH`` parameter) as well as allowing the underlying +device to process packet headers ahead of fetching its payload. +This can help the device to make fast actions based on the packet's headers. +This is similar to the "tx-copybreak" parameter, which copies the packet to a +preallocated DMA memory area instead of mapping new memory. However, +tx-push-buff parameter copies the packet directly to the device to allow the +device to take faster actions on the packet. + RINGS_SET ========= @@ -908,6 +922,7 @@ Request contents: ``ETHTOOL_A_RINGS_CQE_SIZE`` u32 Size of TX/RX CQE ``ETHTOOL_A_RINGS_TX_PUSH`` u8 flag of TX Push mode ``ETHTOOL_A_RINGS_RX_PUSH`` u8 flag of RX Push mode + ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN`` u32 size of TX push buffer ==================================== ====== =========================== Kernel checks that requested ring sizes do not exceed limits reported by @@ -1084,6 +1099,10 @@ such that the corresponding bit in ``ethtool_ops::supported_coalesce_params`` is not set), regardless of their values. Driver may impose additional constraints on coalescing parameters and their values. +Compared to requests issued via the ``ioctl()`` netlink version of this request +will try harder to make sure that values specified by the user have been applied +and may call the driver twice. + PAUSE_GET ========= diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst index 4ddcae33c336..a164ff074356 100644 --- a/Documentation/networking/index.rst +++ b/Documentation/networking/index.rst @@ -36,6 +36,7 @@ Contents: scaling tls tls-offload + tls-handshake nfc 6lowpan 6pack @@ -73,6 +74,7 @@ Contents: mpls-sysctl mptcp-sysctl multiqueue + napi netconsole netdev-features netdevices diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst index 58a78a316697..6ec06a33688a 100644 --- a/Documentation/networking/ip-sysctl.rst +++ b/Documentation/networking/ip-sysctl.rst @@ -2721,6 +2721,13 @@ echo_ignore_anycast - BOOLEAN Default: 0 +error_anycast_as_unicast - BOOLEAN + If set to 1, then the kernel will respond with ICMP Errors + resulting from requests sent to it over the IPv6 protocol destined + to anycast address essentially treating anycast as unicast. + + Default: 0 + xfrm6_gc_thresh - INTEGER (Obsolete since linux-4.14) The threshold at which we will start garbage collecting for IPv6 diff --git a/Documentation/networking/napi.rst b/Documentation/networking/napi.rst new file mode 100644 index 000000000000..a7a047742e93 --- /dev/null +++ b/Documentation/networking/napi.rst @@ -0,0 +1,254 @@ +.. SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) + +.. _napi: + +==== +NAPI +==== + +NAPI is the event handling mechanism used by the Linux networking stack. +The name NAPI no longer stands for anything in particular [#]_. + +In basic operation the device notifies the host about new events +via an interrupt. +The host then schedules a NAPI instance to process the events. +The device may also be polled for events via NAPI without receiving +interrupts first (:ref:`busy polling<poll>`). + +NAPI processing usually happens in the software interrupt context, +but there is an option to use :ref:`separate kernel threads<threaded>` +for NAPI processing. + +All in all NAPI abstracts away from the drivers the context and configuration +of event (packet Rx and Tx) processing. + +Driver API +========== + +The two most important elements of NAPI are the struct napi_struct +and the associated poll method. struct napi_struct holds the state +of the NAPI instance while the method is the driver-specific event +handler. The method will typically free Tx packets that have been +transmitted and process newly received packets. + +.. _drv_ctrl: + +Control API +----------- + +netif_napi_add() and netif_napi_del() add/remove a NAPI instance +from the system. The instances are attached to the netdevice passed +as argument (and will be deleted automatically when netdevice is +unregistered). Instances are added in a disabled state. + +napi_enable() and napi_disable() manage the disabled state. +A disabled NAPI can't be scheduled and its poll method is guaranteed +to not be invoked. napi_disable() waits for ownership of the NAPI +instance to be released. + +The control APIs are not idempotent. Control API calls are safe against +concurrent use of datapath APIs but an incorrect sequence of control API +calls may result in crashes, deadlocks, or race conditions. For example, +calling napi_disable() multiple times in a row will deadlock. + +Datapath API +------------ + +napi_schedule() is the basic method of scheduling a NAPI poll. +Drivers should call this function in their interrupt handler +(see :ref:`drv_sched` for more info). A successful call to napi_schedule() +will take ownership of the NAPI instance. + +Later, after NAPI is scheduled, the driver's poll method will be +called to process the events/packets. The method takes a ``budget`` +argument - drivers can process completions for any number of Tx +packets but should only process up to ``budget`` number of +Rx packets. Rx processing is usually much more expensive. + +In other words, it is recommended to ignore the budget argument when +performing TX buffer reclamation to ensure that the reclamation is not +arbitrarily bounded; however, it is required to honor the budget argument +for RX processing. + +.. warning:: + + The ``budget`` argument may be 0 if core tries to only process Tx completions + and no Rx packets. + +The poll method returns the amount of work done. If the driver still +has outstanding work to do (e.g. ``budget`` was exhausted) +the poll method should return exactly ``budget``. In that case, +the NAPI instance will be serviced/polled again (without the +need to be scheduled). + +If event processing has been completed (all outstanding packets +processed) the poll method should call napi_complete_done() +before returning. napi_complete_done() releases the ownership +of the instance. + +.. warning:: + + The case of finishing all events and using exactly ``budget`` + must be handled carefully. There is no way to report this + (rare) condition to the stack, so the driver must either + not call napi_complete_done() and wait to be called again, + or return ``budget - 1``. + + If the ``budget`` is 0 napi_complete_done() should never be called. + +Call sequence +------------- + +Drivers should not make assumptions about the exact sequencing +of calls. The poll method may be called without the driver scheduling +the instance (unless the instance is disabled). Similarly, +it's not guaranteed that the poll method will be called, even +if napi_schedule() succeeded (e.g. if the instance gets disabled). + +As mentioned in the :ref:`drv_ctrl` section - napi_disable() and subsequent +calls to the poll method only wait for the ownership of the instance +to be released, not for the poll method to exit. This means that +drivers should avoid accessing any data structures after calling +napi_complete_done(). + +.. _drv_sched: + +Scheduling and IRQ masking +-------------------------- + +Drivers should keep the interrupts masked after scheduling +the NAPI instance - until NAPI polling finishes any further +interrupts are unnecessary. + +Drivers which have to mask the interrupts explicitly (as opposed +to IRQ being auto-masked by the device) should use the napi_schedule_prep() +and __napi_schedule() calls: + +.. code-block:: c + + if (napi_schedule_prep(&v->napi)) { + mydrv_mask_rxtx_irq(v->idx); + /* schedule after masking to avoid races */ + __napi_schedule(&v->napi); + } + +IRQ should only be unmasked after a successful call to napi_complete_done(): + +.. code-block:: c + + if (budget && napi_complete_done(&v->napi, work_done)) { + mydrv_unmask_rxtx_irq(v->idx); + return min(work_done, budget - 1); + } + +napi_schedule_irqoff() is a variant of napi_schedule() which takes advantage +of guarantees given by being invoked in IRQ context (no need to +mask interrupts). Note that PREEMPT_RT forces all interrupts +to be threaded so the interrupt may need to be marked ``IRQF_NO_THREAD`` +to avoid issues on real-time kernel configurations. + +Instance to queue mapping +------------------------- + +Modern devices have multiple NAPI instances (struct napi_struct) per +interface. There is no strong requirement on how the instances are +mapped to queues and interrupts. NAPI is primarily a polling/processing +abstraction without specific user-facing semantics. That said, most networking +devices end up using NAPI in fairly similar ways. + +NAPI instances most often correspond 1:1:1 to interrupts and queue pairs +(queue pair is a set of a single Rx and single Tx queue). + +In less common cases a NAPI instance may be used for multiple queues +or Rx and Tx queues can be serviced by separate NAPI instances on a single +core. Regardless of the queue assignment, however, there is usually still +a 1:1 mapping between NAPI instances and interrupts. + +It's worth noting that the ethtool API uses a "channel" terminology where +each channel can be either ``rx``, ``tx`` or ``combined``. It's not clear +what constitutes a channel; the recommended interpretation is to understand +a channel as an IRQ/NAPI which services queues of a given type. For example, +a configuration of 1 ``rx``, 1 ``tx`` and 1 ``combined`` channel is expected +to utilize 3 interrupts, 2 Rx and 2 Tx queues. + +User API +======== + +User interactions with NAPI depend on NAPI instance ID. The instance IDs +are only visible to the user thru the ``SO_INCOMING_NAPI_ID`` socket option. +It's not currently possible to query IDs used by a given device. + +Software IRQ coalescing +----------------------- + +NAPI does not perform any explicit event coalescing by default. +In most scenarios batching happens due to IRQ coalescing which is done +by the device. There are cases where software coalescing is helpful. + +NAPI can be configured to arm a repoll timer instead of unmasking +the hardware interrupts as soon as all packets are processed. +The ``gro_flush_timeout`` sysfs configuration of the netdevice +is reused to control the delay of the timer, while +``napi_defer_hard_irqs`` controls the number of consecutive empty polls +before NAPI gives up and goes back to using hardware IRQs. + +.. _poll: + +Busy polling +------------ + +Busy polling allows a user process to check for incoming packets before +the device interrupt fires. As is the case with any busy polling it trades +off CPU cycles for lower latency (production uses of NAPI busy polling +are not well known). + +Busy polling is enabled by either setting ``SO_BUSY_POLL`` on +selected sockets or using the global ``net.core.busy_poll`` and +``net.core.busy_read`` sysctls. An io_uring API for NAPI busy polling +also exists. + +IRQ mitigation +--------------- + +While busy polling is supposed to be used by low latency applications, +a similar mechanism can be used for IRQ mitigation. + +Very high request-per-second applications (especially routing/forwarding +applications and especially applications using AF_XDP sockets) may not +want to be interrupted until they finish processing a request or a batch +of packets. + +Such applications can pledge to the kernel that they will perform a busy +polling operation periodically, and the driver should keep the device IRQs +permanently masked. This mode is enabled by using the ``SO_PREFER_BUSY_POLL`` +socket option. To avoid system misbehavior the pledge is revoked +if ``gro_flush_timeout`` passes without any busy poll call. + +The NAPI budget for busy polling is lower than the default (which makes +sense given the low latency intention of normal busy polling). This is +not the case with IRQ mitigation, however, so the budget can be adjusted +with the ``SO_BUSY_POLL_BUDGET`` socket option. + +.. _threaded: + +Threaded NAPI +------------- + +Threaded NAPI is an operating mode that uses dedicated kernel +threads rather than software IRQ context for NAPI processing. +The configuration is per netdevice and will affect all +NAPI instances of that device. Each NAPI instance will spawn a separate +thread (called ``napi/${ifc-name}-${napi-id}``). + +It is recommended to pin each kernel thread to a single CPU, the same +CPU as the CPU which services the interrupt. Note that the mapping +between IRQs and NAPI instances may not be trivial (and is driver +dependent). The NAPI instance IDs will be assigned in the opposite +order than the process IDs of the kernel threads. + +Threaded NAPI is controlled by writing 0/1 to the ``threaded`` file in +netdev's sysfs directory. + +.. rubric:: Footnotes + +.. [#] NAPI was originally referred to as New API in 2.4 Linux. diff --git a/Documentation/networking/page_pool.rst b/Documentation/networking/page_pool.rst index 30f1344e7cca..873efd97f822 100644 --- a/Documentation/networking/page_pool.rst +++ b/Documentation/networking/page_pool.rst @@ -165,6 +165,7 @@ Registration pp_params.pool_size = DESC_NUM; pp_params.nid = NUMA_NO_NODE; pp_params.dev = priv->dev; + pp_params.napi = napi; /* only if locking is tied to NAPI */ pp_params.dma_dir = xdp_prog ? DMA_BIDIRECTIONAL : DMA_FROM_DEVICE; page_pool = page_pool_create(&pp_params); diff --git a/Documentation/networking/rxrpc.rst b/Documentation/networking/rxrpc.rst index ec1323d92c96..e807e18ba32a 100644 --- a/Documentation/networking/rxrpc.rst +++ b/Documentation/networking/rxrpc.rst @@ -848,14 +848,21 @@ The kernel interface functions are as follows: returned. The caller now holds a reference on this and it must be properly ended. - (#) End a client call:: + (#) Shut down a client call:: - void rxrpc_kernel_end_call(struct socket *sock, + void rxrpc_kernel_shutdown_call(struct socket *sock, + struct rxrpc_call *call); + + This is used to shut down a previously begun call. The user_call_ID is + expunged from AF_RXRPC's knowledge and will not be seen again in + association with the specified call. + + (#) Release the ref on a client call:: + + void rxrpc_kernel_put_call(struct socket *sock, struct rxrpc_call *call); - This is used to end a previously begun call. The user_call_ID is expunged - from AF_RXRPC's knowledge and will not be seen again in association with - the specified call. + This is used to release the caller's ref on an rxrpc call. (#) Send data through a call:: diff --git a/Documentation/networking/tls-handshake.rst b/Documentation/networking/tls-handshake.rst new file mode 100644 index 000000000000..a2817a88e905 --- /dev/null +++ b/Documentation/networking/tls-handshake.rst @@ -0,0 +1,217 @@ +.. SPDX-License-Identifier: GPL-2.0 + +======================= +In-Kernel TLS Handshake +======================= + +Overview +======== + +Transport Layer Security (TLS) is a Upper Layer Protocol (ULP) that runs +over TCP. TLS provides end-to-end data integrity and confidentiality in +addition to peer authentication. + +The kernel's kTLS implementation handles the TLS record subprotocol, but +does not handle the TLS handshake subprotocol which is used to establish +a TLS session. Kernel consumers can use the API described here to +request TLS session establishment. + +There are several possible ways to provide a handshake service in the +kernel. The API described here is designed to hide the details of those +implementations so that in-kernel TLS consumers do not need to be +aware of how the handshake gets done. + + +User handshake agent +==================== + +As of this writing, there is no TLS handshake implementation in the +Linux kernel. To provide a handshake service, a handshake agent +(typically in user space) is started in each network namespace where a +kernel consumer might require a TLS handshake. Handshake agents listen +for events sent from the kernel that indicate a handshake request is +waiting. + +An open socket is passed to a handshake agent via a netlink operation, +which creates a socket descriptor in the agent's file descriptor table. +If the handshake completes successfully, the handshake agent promotes +the socket to use the TLS ULP and sets the session information using the +SOL_TLS socket options. The handshake agent returns the socket to the +kernel via a second netlink operation. + + +Kernel Handshake API +==================== + +A kernel TLS consumer initiates a client-side TLS handshake on an open +socket by invoking one of the tls_client_hello() functions. First, it +fills in a structure that contains the parameters of the request: + +.. code-block:: c + + struct tls_handshake_args { + struct socket *ta_sock; + tls_done_func_t ta_done; + void *ta_data; + unsigned int ta_timeout_ms; + key_serial_t ta_keyring; + key_serial_t ta_my_cert; + key_serial_t ta_my_privkey; + unsigned int ta_num_peerids; + key_serial_t ta_my_peerids[5]; + }; + +The @ta_sock field references an open and connected socket. The consumer +must hold a reference on the socket to prevent it from being destroyed +while the handshake is in progress. The consumer must also have +instantiated a struct file in sock->file. + + +@ta_done contains a callback function that is invoked when the handshake +has completed. Further explanation of this function is in the "Handshake +Completion" sesction below. + +The consumer can fill in the @ta_timeout_ms field to force the servicing +handshake agent to exit after a number of milliseconds. This enables the +socket to be fully closed once both the kernel and the handshake agent +have closed their endpoints. + +Authentication material such as x.509 certificates, private certificate +keys, and pre-shared keys are provided to the handshake agent in keys +that are instantiated by the consumer before making the handshake +request. The consumer can provide a private keyring that is linked into +the handshake agent's process keyring in the @ta_keyring field to prevent +access of those keys by other subsystems. + +To request an x.509-authenticated TLS session, the consumer fills in +the @ta_my_cert and @ta_my_privkey fields with the serial numbers of +keys containing an x.509 certificate and the private key for that +certificate. Then, it invokes this function: + +.. code-block:: c + + ret = tls_client_hello_x509(args, gfp_flags); + +The function returns zero when the handshake request is under way. A +zero return guarantees the callback function @ta_done will be invoked +for this socket. The function returns a negative errno if the handshake +could not be started. A negative errno guarantees the callback function +@ta_done will not be invoked on this socket. + + +To initiate a client-side TLS handshake with a pre-shared key, use: + +.. code-block:: c + + ret = tls_client_hello_psk(args, gfp_flags); + +However, in this case, the consumer fills in the @ta_my_peerids array +with serial numbers of keys containing the peer identities it wishes +to offer, and the @ta_num_peerids field with the number of array +entries it has filled in. The other fields are filled in as above. + + +To initiate an anonymous client-side TLS handshake use: + +.. code-block:: c + + ret = tls_client_hello_anon(args, gfp_flags); + +The handshake agent presents no peer identity information to the remote +during this type of handshake. Only server authentication (ie the client +verifies the server's identity) is performed during the handshake. Thus +the established session uses encryption only. + + +Consumers that are in-kernel servers use: + +.. code-block:: c + + ret = tls_server_hello_x509(args, gfp_flags); + +or + +.. code-block:: c + + ret = tls_server_hello_psk(args, gfp_flags); + +The argument structure is filled in as above. + + +If the consumer needs to cancel the handshake request, say, due to a ^C +or other exigent event, the consumer can invoke: + +.. code-block:: c + + bool tls_handshake_cancel(sock); + +This function returns true if the handshake request associated with +@sock has been canceled. The consumer's handshake completion callback +will not be invoked. If this function returns false, then the consumer's +completion callback has already been invoked. + + +Handshake Completion +==================== + +When the handshake agent has completed processing, it notifies the +kernel that the socket may be used by the consumer again. At this point, +the consumer's handshake completion callback, provided in the @ta_done +field in the tls_handshake_args structure, is invoked. + +The synopsis of this function is: + +.. code-block:: c + + typedef void (*tls_done_func_t)(void *data, int status, + key_serial_t peerid); + +The consumer provides a cookie in the @ta_data field of the +tls_handshake_args structure that is returned in the @data parameter of +this callback. The consumer uses the cookie to match the callback to the +thread waiting for the handshake to complete. + +The success status of the handshake is returned via the @status +parameter: + ++------------+----------------------------------------------+ +| status | meaning | ++============+==============================================+ +| 0 | TLS session established successfully | ++------------+----------------------------------------------+ +| -EACCESS | Remote peer rejected the handshake or | +| | authentication failed | ++------------+----------------------------------------------+ +| -ENOMEM | Temporary resource allocation failure | ++------------+----------------------------------------------+ +| -EINVAL | Consumer provided an invalid argument | ++------------+----------------------------------------------+ +| -ENOKEY | Missing authentication material | ++------------+----------------------------------------------+ +| -EIO | An unexpected fault occurred | ++------------+----------------------------------------------+ + +The @peerid parameter contains the serial number of a key containing the +remote peer's identity or the value TLS_NO_PEERID if the session is not +authenticated. + +A best practice is to close and destroy the socket immediately if the +handshake failed. + + +Other considerations +-------------------- + +While a handshake is under way, the kernel consumer must alter the +socket's sk_data_ready callback function to ignore all incoming data. +Once the handshake completion callback function has been invoked, normal +receive operation can be resumed. + +Once a TLS session is established, the consumer must provide a buffer +for and then examine the control message (CMSG) that is part of every +subsequent sock_recvmsg(). Each control message indicates whether the +received message data is TLS record data or session metadata. + +See tls.rst for details on how a kTLS consumer recognizes incoming +(decrypted) application data, alerts, and handshake packets once the +socket has been promoted to use the TLS ULP. diff --git a/Documentation/process/maintainer-netdev.rst b/Documentation/process/maintainer-netdev.rst index 4a75686d35ab..f73ac9e175a8 100644 --- a/Documentation/process/maintainer-netdev.rst +++ b/Documentation/process/maintainer-netdev.rst @@ -109,6 +109,8 @@ Finally, the vX.Y gets released, and the whole cycle starts over. netdev patch review ------------------- +.. _patch_status: + Patch status ~~~~~~~~~~~~ @@ -143,6 +145,33 @@ Asking the maintainer for status updates on your patch is a good way to ensure your patch is ignored or pushed to the bottom of the priority list. +Changes requested +~~~~~~~~~~~~~~~~~ + +Patches :ref:`marked<patch_status>` as ``Changes Requested`` need +to be revised. The new version should come with a change log, +preferably including links to previous postings, for example:: + + [PATCH net-next v3] net: make cows go moo + + Even users who don't drink milk appreciate hearing the cows go "moo". + + The amount of mooing will depend on packet rate so should match + the diurnal cycle quite well. + + Signed-of-by: Joe Defarmer <joe@barn.org> + --- + v3: + - add a note about time-of-day mooing fluctuation to the commit message + v2: https://lore.kernel.org/netdev/123themessageid@barn.org/ + - fix missing argument in kernel doc for netif_is_bovine() + - fix memory leak in netdev_register_cow() + v1: https://lore.kernel.org/netdev/456getstheclicks@barn.org/ + +The commit message should be revised to answer any questions reviewers +had to ask in previous discussions. Occasionally the update of +the commit message will be the only change in the new version. + Partial resends ~~~~~~~~~~~~~~~ @@ -155,11 +184,18 @@ Handling misapplied patches Occasionally a patch series gets applied before receiving critical feedback, or the wrong version of a series gets applied. -There is no revert possible, once it is pushed out, it stays like that. + +Making the patch disappear once it is pushed out is not possible, the commit +history in netdev trees is immutable. Please send incremental versions on top of what has been merged in order to fix the patches the way they would look like if your latest patch series was to be merged. +In cases where full revert is needed the revert has to be submitted +as a patch to the list with a commit message explaining the technical +problems with the reverted commit. Reverts should be used as a last resort, +when original change is completely wrong; incremental fixes are preferred. + Stable tree ~~~~~~~~~~~ diff --git a/Documentation/userspace-api/netlink/genetlink-legacy.rst b/Documentation/userspace-api/netlink/genetlink-legacy.rst index 3bf0bcdf21d8..802875a37a27 100644 --- a/Documentation/userspace-api/netlink/genetlink-legacy.rst +++ b/Documentation/userspace-api/netlink/genetlink-legacy.rst @@ -162,9 +162,91 @@ Other quirks (todo) Structures ---------- -Legacy families can define C structures both to be used as the contents -of an attribute and as a fixed message header. The plan is to define -the structs in ``definitions`` and link the appropriate attrs. +Legacy families can define C structures both to be used as the contents of +an attribute and as a fixed message header. Structures are defined in +``definitions`` and referenced in operations or attributes. Note that +structures defined in YAML are implicitly packed according to C +conventions. For example, the following struct is 4 bytes, not 6 bytes: + +.. code-block:: c + + struct { + u8 a; + u16 b; + u8 c; + } + +Any padding must be explicitly added and C-like languages should infer the +need for explicit padding from whether the members are naturally aligned. + +Here is the struct definition from above, declared in YAML: + +.. code-block:: yaml + + definitions: + - + name: message-header + type: struct + members: + - + name: a + type: u8 + - + name: b + type: u16 + - + name: c + type: u8 + +Fixed Headers +~~~~~~~~~~~~~ + +Fixed message headers can be added to operations using ``fixed-header``. +The default ``fixed-header`` can be set in ``operations`` and it can be set +or overridden for each operation. + +.. code-block:: yaml + + operations: + fixed-header: message-header + list: + - + name: get + fixed-header: custom-header + attribute-set: message-attrs + +Attributes +~~~~~~~~~~ + +A ``binary`` attribute can be interpreted as a C structure using a +``struct`` property with the name of the structure definition. The +``struct`` property implies ``sub-type: struct`` so it is not necessary to +specify a sub-type. + +.. code-block:: yaml + + attribute-sets: + - + name: stats-attrs + attributes: + - + name: stats + type: binary + struct: vport-stats + +C Arrays +-------- + +Legacy families also use ``binary`` attributes to encapsulate C arrays. The +``sub-type`` is used to identify the type of scalar to extract. + +.. code-block:: yaml + + attributes: + - + name: ports + type: binary + sub-type: u32 Multi-message DO ---------------- diff --git a/Documentation/userspace-api/netlink/specs.rst b/Documentation/userspace-api/netlink/specs.rst index a22442ba1d30..2e4acde890b7 100644 --- a/Documentation/userspace-api/netlink/specs.rst +++ b/Documentation/userspace-api/netlink/specs.rst @@ -254,6 +254,16 @@ rather than depend on what is specified in the spec file. The validation policy in the kernel is formed by combining the type definition (``type`` and ``nested-attributes``) and the ``checks``. +sub-type +~~~~~~~~ + +Legacy families have special ways of expressing arrays. ``sub-type`` can be +used to define the type of array members in case array members are not +fully defined as attributes (in a bona fide attribute space). For instance +a C array of u32 values can be specified with ``type: binary`` and +``sub-type: u32``. Binary types and legacy array formats are described in +more detail in :doc:`genetlink-legacy`. + operations ---------- |