summaryrefslogtreecommitdiffstats
path: root/Documentation/netlink
Commit message (Collapse)AuthorAgeFilesLines
* netdev: Add queue stats for TX stop and wakeDaniel Jurgens4 days1-0/+14
| | | | | | | | | | | TX queue stop and wake are counted by some drivers. Support reporting these via netdev-genl queue stats. Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Link: https://lore.kernel.org/r/20240510201927.1821109-2-danielj@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski8 days1-0/+6
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | Cross-merge networking fixes after downstream PR. No conflicts. Adjacent changes: drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c 35d92abfbad8 ("net: hns3: fix kernel crash when devlink reload during initialization") 2a1a1a7b5fd7 ("net: hns3: add command queue trace for hns3") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * netlink: specs: Add missing bridge linkinfo attrsDonald Hunter10 days1-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | Attributes for FDB learned entries were added to the if_link netlink api for bridge linkinfo but are missing from the rt_link.yaml spec. Add the missing attributes to the spec. Fixes: ddd1ad68826d ("net: bridge: Add netlink knobs for number / max learned FDB entries") Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240503164304.87427-1-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | netlink/specs: Add VF attributes to rt_link specDonald Hunter8 days1-2/+235
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for retrieving VFs as part of link info. For example: ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/rt_link.yaml \ --do getlink --json '{"ifi-index": 38, "ext-mask": ["vf", "skip-stats"]}' {'address': 'b6:75:91:f2:64:65', [snip] 'vfinfo-list': {'info': [{'broadcast': b'\xff\xff\xff\xff\xff\xff\x00\x00' b'\x00\x00\x00\x00\x00\x00\x00\x00' b'\x00\x00\x00\x00\x00\x00\x00\x00' b'\x00\x00\x00\x00\x00\x00\x00\x00', 'link-state': {'link-state': 'auto', 'vf': 0}, 'mac': {'mac': b'\x00\x00\x00\x00\x00\x00\x00\x00' b'\x00\x00\x00\x00\x00\x00\x00\x00' b'\x00\x00\x00\x00\x00\x00\x00\x00' b'\x00\x00\x00\x00\x00\x00\x00\x00', 'vf': 0}, 'rate': {'max-tx-rate': 0, 'min-tx-rate': 0, 'vf': 0}, 'rss-query-en': {'setting': 0, 'vf': 0}, 'spoofchk': {'setting': 0, 'vf': 0}, 'trust': {'setting': 0, 'vf': 0}, 'tx-rate': {'rate': 0, 'vf': 0}, 'vlan': {'qos': 0, 'vf': 0, 'vlan': 0}, 'vlan-list': {'info': [{'qos': 0, 'vf': 0, 'vlan': 0, 'vlan-proto': 0}]}}, {'broadcast': b'\xff\xff\xff\xff\xff\xff\x00\x00' b'\x00\x00\x00\x00\x00\x00\x00\x00' b'\x00\x00\x00\x00\x00\x00\x00\x00' b'\x00\x00\x00\x00\x00\x00\x00\x00', 'link-state': {'link-state': 'auto', 'vf': 1}, 'mac': {'mac': b'\x00\x00\x00\x00\x00\x00\x00\x00' b'\x00\x00\x00\x00\x00\x00\x00\x00' b'\x00\x00\x00\x00\x00\x00\x00\x00' b'\x00\x00\x00\x00\x00\x00\x00\x00', 'vf': 1}, 'rate': {'max-tx-rate': 0, 'min-tx-rate': 0, 'vf': 1}, 'rss-query-en': {'setting': 0, 'vf': 1}, 'spoofchk': {'setting': 0, 'vf': 1}, 'trust': {'setting': 0, 'vf': 1}, 'tx-rate': {'rate': 0, 'vf': 1}, 'vlan': {'qos': 0, 'vf': 1, 'vlan': 0}, 'vlan-list': {'info': [{'qos': 0, 'vf': 1, 'vlan': 0, 'vlan-proto': 0}]}}]}, 'xdp': {'attached': 0}} Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20240507103603.23017-1-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | netdev: add queue statsXuan Zhuo2024-04-301-0/+104
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These stats are commonly. Support reporting those via netdev-genl queue stats. name: rx-hw-drops name: rx-hw-drop-overruns name: rx-csum-unnecessary name: rx-csum-none name: rx-csum-bad name: rx-hw-gro-packets name: rx-hw-gro-bytes name: rx-hw-gro-wire-packets name: rx-hw-gro-wire-bytes name: rx-hw-drop-ratelimits name: tx-hw-drops name: tx-hw-drop-errors name: tx-csum-none name: tx-needs-csum name: tx-hw-gso-packets name: tx-hw-gso-bytes name: tx-hw-gso-wire-packets name: tx-hw-gso-wire-bytes name: tx-hw-drop-ratelimits Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
* | netdev: support dumping a single netdev in qstatsJakub Kicinski2024-04-231-0/+1
| | | | | | | | | | | | | | | | | | | | Having to filter the right ifindex in the tests is a bit tedious. Add support for dumping qstats for a single ifindex. Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240420023543.3300306-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | doc/netlink/specs: Add draft nftables specDonald Hunter2024-04-221-0/+1264
| | | | | | | | | | | | | | | | | | Add a spec for nftables that has nearly complete coverage of the ops, but limited coverage of rule types and subexpressions. Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20240418104737.77914-2-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | netlink: specs: Expand the pse netlink command with PoE interfaceKory Maincent (Dent Project)2024-04-181-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add the PoE pse attributes prefix to be able to use PoE interface. Example usage: ./ynl/cli.py --spec netlink/specs/ethtool.yaml --no-schema --do pse-get \ --json '{"header":{"dev-name":"eth0"}}' {'header': {'dev-index': 4, 'dev-name': 'eth0'}, 'c33-pse-admin-state': 3, 'c33-pse-pw-d-status': 4} ./ynl/cli.py --spec netlink/specs/ethtool.yaml --no-schema --do pse-set \ --json '{"header":{"dev-name":"eth0"}, "c33-pse-admin-control":3}' Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://lore.kernel.org/r/20240417-feature_poe-v9-5-242293fd1900@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | netlink: specs: Modify pse attribute prefixKory Maincent (Dent Project)2024-04-181-9/+9
| | | | | | | | | | | | | | | | | | | | Remove podl from the attribute prefix to prepare the support of PoE pse netlink spec. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://lore.kernel.org/r/20240417-feature_poe-v9-4-242293fd1900@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | doc/netlink/specs: Add bond support to rt_link.yamlHangbin Liu2024-04-101-0/+163
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add bond support to rt_link.yaml. Here is an example output: $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/rt_link.yaml \ --do getlink --json '{"ifname": "bond0"}' --output-json | jq '.linkinfo' { "kind": "bond", "data": { "mode": 4, "miimon": 100, ... "arp-interval": 0, "arp-ip-target": [ "192.168.1.1", "192.168.1.2" ], "arp-validate": 0, "arp-all-targets": 0, "ns-ip6-target": [ "2001::1", "2001::2" ], "primary-reselect": 0, ... "missed-max": 2, "ad-info": { "aggregator": 1, "num-ports": 1, "actor-key": 0, "partner-key": 1, "partner-mac": "00:00:00:00:00:00" } } } And here is the downlink info. $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/rt_link.yaml \ --do getlink --json '{"ifname": "dummy0"}' --output-json | jq '.linkinfo' { "kind": "dummy", "slave-kind": "bond", "slave-data": { "state": 0, "mii-status": 0, "link-failure-count": 0, "perm-hwaddr": "f2:82:f7:cc:47:13", "queue-id": 0, "prio": 0 } } Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://lore.kernel.org/r/20240409083504.3900877-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | ynl: rename array-nest to indexed-arrayHangbin Liu2024-04-057-14/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Some implementations, like bonding, has nest array with same attr type. To support all kinds of entries under one nest array. As discussed[1], let's rename array-nest to indexed-array, and assuming the value is a nest by passing the type via sub-type. [1] https://lore.kernel.org/netdev/20240312100105.16a59086@kernel.org/ Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://lore.kernel.org/r/20240404063114.1221532-2-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | netlink: specs: ethtool: define header-flags as an enumJakub Kicinski2024-04-051-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | Recent changes added header flags to the spec. Use an enum instead of defines for more seamless codegen. [Jakub: drop the already applied parts and rewrite message] Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Link: https://lore.kernel.org/r/20240403212931.128541-6-rrameshbabu@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | ethtool: add interface to read Tx hardware timestamping statisticsRahul Rameshbabu2024-04-051-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | Multiple network devices that support hardware timestamping appear to have common behavior with regards to timestamp handling. Implement common Tx hardware timestamping statistics in a tx_stats struct_group. Common Rx hardware timestamping statistics can subsequently be implemented in a rx_stats struct_group for ethtool_ts_stats. Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Link: https://lore.kernel.org/r/20240403212931.128541-2-rrameshbabu@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | netlink: specs: define ethtool header flagsJakub Kicinski2024-04-041-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When interfacing with the ethtool commands it's handy to be able to use the names of the flags. Example: ethnl.pause_get({"header": {"dev-index": cfg.ifindex, "flags": {'stats'}}}) Note that not all commands accept all the flags, but the meaning of the bits does not change command to command. Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://lore.kernel.org/r/20240403023426.1762996-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | Documentation: netlink: add a YAML spec for teamHangbin Liu2024-04-021-0/+204
| | | | | | | | | | | | | | | | | | Add a YAML specification for team. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20240401031004.1159713-2-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | doc: netlink: Update tc spec with missing definitionsDonald Hunter2024-04-011-0/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | The tc spec referenced tc-u32-mark and tc-act-police-attrs but did not define them. The missing definitions were discovered when building the docs with generated hyperlinks because the hyperlink target labels were missing. Add definitions for tc-u32-mark and tc-act-police-attrs. Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20240329135021.52534-4-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | doc/netlink/specs: Add vlan attr in rt_link specHangbin Liu2024-03-281-2/+78
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With command: # ./tools/net/ynl/cli.py \ --spec Documentation/netlink/specs/rt_link.yaml \ --do getlink --json '{"ifname": "eno1.2"}' --output-json | \ jq -C '.linkinfo' Before: Exception: No message format for 'vlan' in sub-message spec 'linkinfo-data-msg' After: { "kind": "vlan", "data": { "protocol": "8021q", "id": 2, "flag": { "flags": [ "reorder-hdr" ], "mask": "0xffffffff" }, "egress-qos": { "mapping": [ { "from": 1, "to": 2 }, { "from": 4, "to": 4 } ] } } } Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20240327123130.1322921-3-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2024-03-111-1/+1
|\ | | | | | | | | | | Merge in late fixes to prepare for the 6.9 net-next PR. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * devlink: Fix length of eswitch inline-modeWilliam Tu2024-03-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Set eswitch inline-mode to be u8, not u16. Otherwise, errors below $ devlink dev eswitch set pci/0000:08:00.0 mode switchdev \ inline-mode network Error: Attribute failed policy validation. kernel answers: Numerical result out of rang netlink: 'devlink': attribute type 26 has an invalid length. Fixes: f2f9dd164db0 ("netlink: specs: devlink: add the remaining command to generate complete split_ops") Signed-off-by: William Tu <witu@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240310164547.35219-1-witu@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | netlink: specs: support generating code for genl socket privJakub Kicinski2024-03-113-0/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The family struct is auto-generated for new families, support use of the sock_priv_* mechanism added in commit a731132424ad ("genetlink: introduce per-sock family private storage"). For example if the family wants to use struct sk_buff as its private struct (unrealistic but just for illustration), it would add to its spec: kernel-family: headers: [ "linux/skbuff.h" ] sock-priv: struct sk_buff ynl-gen-c will declare the appropriate priv size and hook in function prototypes to be implemented by the family. Link: https://lore.kernel.org/r/20240308190319.2523704-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | netlink: specs: support unterminated-okHangbin Liu2024-03-113-0/+15
| | | | | | | | | | | | | | | | | | ynl-gen-c.py supports check unterminated-ok, but the yaml schemas don't have this key. Add this to the yaml files. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://lore.kernel.org/r/20240308081239.3281710-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | tools: ynl-gen: support using pre-defined values in attr checksHangbin Liu2024-03-114-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Support using pre-defined values in checks so we don't need to use hard code number for the string, binary length. e.g. we have a definition like #define TEAM_STRING_MAX_LEN 32 Which defined in yaml like: definitions: - name: string-max-len type: const value: 32 It can be used in the attribute-sets like attribute-sets: - name: attr-option name-prefix: team-attr-option- attributes: - name: name type: string checks: len: string-max-len With this patch it will be converted to [TEAM_ATTR_OPTION_NAME] = { .type = NLA_STRING, .len = TEAM_STRING_MAX_LEN, } Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://lore.kernel.org/r/20240311140727.109562-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | netdev: add queue stat for alloc failuresJakub Kicinski2024-03-071-0/+7
| | | | | | | | | | | | | | | | | | | | | | Rx alloc failures are commonly counted by drivers. Support reporting those via netdev-genl queue stats. Acked-by: Stanislav Fomichev <sdf@google.com> Reviewed-by: Amritha Nambiar <amritha.nambiar@intel.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Link: https://lore.kernel.org/r/20240306195509.1502746-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | netdev: add per-queue statisticsJakub Kicinski2024-03-071-0/+84
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ethtool-nl family does a good job exposing various protocol related and IEEE/IETF statistics which used to get dumped under ethtool -S, with creative names. Queue stats don't have a netlink API, yet, and remain a lion's share of ethtool -S output for new drivers. Not only is that bad because the names differ driver to driver but it's also bug-prone. Intuitively drivers try to report only the stats for active queues, but querying ethtool stats involves multiple system calls, and the number of stats is read separately from the stats themselves. Worse still when user space asks for values of the stats, it doesn't inform the kernel how big the buffer is. If number of stats increases in the meantime kernel will overflow user buffer. Add a netlink API for dumping queue stats. Queue information is exposed via the netdev-genl family, so add the stats there. Support per-queue and sum-for-device dumps. Latter will be useful when subsequent patches add more interesting common stats than just bytes and packets. The API does not currently distinguish between HW and SW stats. The expectation is that the source of the stats will either not matter much (good packets) or be obvious (skb alloc errors). Acked-by: Stanislav Fomichev <sdf@google.com> Reviewed-by: Amritha Nambiar <amritha.nambiar@intel.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Link: https://lore.kernel.org/r/20240306195509.1502746-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | dpll: spec: use proper enum for pin capabilities attributeJiri Pirko2024-03-071-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The enum is defined, however the pin capabilities attribute does refer to it. Add this missing enum field. This fixes ynl cli output: Example current output: $ sudo ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/dpll.yaml --do pin-get --json '{"id": 0}' {'capabilities': 4, ... Example new output: $ sudo ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/dpll.yaml --do pin-get --json '{"id": 0}' {'capabilities': {'state-can-change'}, ... Fixes: 3badff3a25d8 ("dpll: spec: Add Netlink spec in YAML") Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20240306120739.1447621-1-jiri@resnulli.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | doc/netlink/specs: Add spec for nlctrl netlink familyDonald Hunter2024-03-071-0/+206
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a spec for the nlctrl family. Example usage: ./tools/net/ynl/cli.py \ --spec Documentation/netlink/specs/nlctrl.yaml \ --do getfamily --json '{"family-name": "nlctrl"}' ./tools/net/ynl/cli.py \ --spec Documentation/netlink/specs/nlctrl.yaml \ --dump getpolicy --json '{"family-name": "nlctrl"}' Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20240306231046.97158-7-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | doc/netlink: Allow empty enum-name in ynl specsDonald Hunter2024-03-073-18/+27
| | | | | | | | | | | | | | | | | | Update the ynl schemas to allow the specification of empty enum names for all enum code generation. Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20240306231046.97158-6-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | mptcp: add token for get-addr in yamlGeliang Tang2024-03-041-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds token parameter together with addr in get-addr section in mptcp_pm.yaml, then use the following commands to update mptcp_pm_gen.c and mptcp_pm_gen.h: ./tools/net/ynl/ynl-gen-c.py --mode kernel \ --spec Documentation/netlink/specs/mptcp_pm.yaml --source \ -o net/mptcp/mptcp_pm_gen.c ./tools/net/ynl/ynl-gen-c.py --mode kernel \ --spec Documentation/netlink/specs/mptcp_pm.yaml --header \ -o net/mptcp/mptcp_pm_gen.h Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2024-02-151-4/+0
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Cross-merge networking fixes after downstream PR. No conflicts. Adjacent changes: net/core/dev.c 9f30831390ed ("net: add rcu safety to rtnl_prop_list_size()") 723de3ebef03 ("net: free altname using an RCU callback") net/unix/garbage.c 11498715f266 ("af_unix: Remove io_uring code for GC.") 25236c91b5ab ("af_unix: Fix task hung while purging oob_skb in GC.") drivers/net/ethernet/renesas/ravb_main.c ed4adc07207d ("net: ravb: Count packets instead of descriptors in GbEth RX path" ) c2da9408579d ("ravb: Add Rx checksum offload support for GbEth") net/mptcp/protocol.c bdd70eb68913 ("mptcp: drop the push_pending field") 28e5c1380506 ("mptcp: annotate lockless accesses around read-mostly fields") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * dpll: fix possible deadlock during netlink dump operationJiri Pirko2024-02-081-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Recently, I've been hitting following deadlock warning during dpll pin dump: [52804.637962] ====================================================== [52804.638536] WARNING: possible circular locking dependency detected [52804.639111] 6.8.0-rc2jiri+ #1 Not tainted [52804.639529] ------------------------------------------------------ [52804.640104] python3/2984 is trying to acquire lock: [52804.640581] ffff88810e642678 (nlk_cb_mutex-GENERIC){+.+.}-{3:3}, at: netlink_dump+0xb3/0x780 [52804.641417] but task is already holding lock: [52804.642010] ffffffff83bde4c8 (dpll_lock){+.+.}-{3:3}, at: dpll_lock_dumpit+0x13/0x20 [52804.642747] which lock already depends on the new lock. [52804.643551] the existing dependency chain (in reverse order) is: [52804.644259] -> #1 (dpll_lock){+.+.}-{3:3}: [52804.644836] lock_acquire+0x174/0x3e0 [52804.645271] __mutex_lock+0x119/0x1150 [52804.645723] dpll_lock_dumpit+0x13/0x20 [52804.646169] genl_start+0x266/0x320 [52804.646578] __netlink_dump_start+0x321/0x450 [52804.647056] genl_family_rcv_msg_dumpit+0x155/0x1e0 [52804.647575] genl_rcv_msg+0x1ed/0x3b0 [52804.648001] netlink_rcv_skb+0xdc/0x210 [52804.648440] genl_rcv+0x24/0x40 [52804.648831] netlink_unicast+0x2f1/0x490 [52804.649290] netlink_sendmsg+0x36d/0x660 [52804.649742] __sock_sendmsg+0x73/0xc0 [52804.650165] __sys_sendto+0x184/0x210 [52804.650597] __x64_sys_sendto+0x72/0x80 [52804.651045] do_syscall_64+0x6f/0x140 [52804.651474] entry_SYSCALL_64_after_hwframe+0x46/0x4e [52804.652001] -> #0 (nlk_cb_mutex-GENERIC){+.+.}-{3:3}: [52804.652650] check_prev_add+0x1ae/0x1280 [52804.653107] __lock_acquire+0x1ed3/0x29a0 [52804.653559] lock_acquire+0x174/0x3e0 [52804.653984] __mutex_lock+0x119/0x1150 [52804.654423] netlink_dump+0xb3/0x780 [52804.654845] __netlink_dump_start+0x389/0x450 [52804.655321] genl_family_rcv_msg_dumpit+0x155/0x1e0 [52804.655842] genl_rcv_msg+0x1ed/0x3b0 [52804.656272] netlink_rcv_skb+0xdc/0x210 [52804.656721] genl_rcv+0x24/0x40 [52804.657119] netlink_unicast+0x2f1/0x490 [52804.657570] netlink_sendmsg+0x36d/0x660 [52804.658022] __sock_sendmsg+0x73/0xc0 [52804.658450] __sys_sendto+0x184/0x210 [52804.658877] __x64_sys_sendto+0x72/0x80 [52804.659322] do_syscall_64+0x6f/0x140 [52804.659752] entry_SYSCALL_64_after_hwframe+0x46/0x4e [52804.660281] other info that might help us debug this: [52804.661077] Possible unsafe locking scenario: [52804.661671] CPU0 CPU1 [52804.662129] ---- ---- [52804.662577] lock(dpll_lock); [52804.662924] lock(nlk_cb_mutex-GENERIC); [52804.663538] lock(dpll_lock); [52804.664073] lock(nlk_cb_mutex-GENERIC); [52804.664490] The issue as follows: __netlink_dump_start() calls control->start(cb) with nlk->cb_mutex held. In control->start(cb) the dpll_lock is taken. Then nlk->cb_mutex is released and taken again in netlink_dump(), while dpll_lock still being held. That leads to ABBA deadlock when another CPU races with the same operation. Fix this by moving dpll_lock taking into dumpit() callback which ensures correct lock taking order. Fixes: 9d71b54b65b1 ("dpll: netlink: Add DPLL framework base functions") Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Link: https://lore.kernel.org/r/20240207115902.371649-1-jiri@resnulli.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | doc: netlink: specs: tc: add multi-attr to tc-taprio-sched-entryAlessandro Marcolini2024-02-061-0/+1
| | | | | | | | | | | | | | | | | | | | | | Add multi-attr attribute to tc-taprio-sched-entry to specify multiple entries. Signed-off-by: Alessandro Marcolini <alessandromarcolini99@gmail.com> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/0ba5088ea715103a2bce83b12e2dcbdaa08da6ac.1706962013.git.alessandromarcolini99@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2024-02-011-0/+10
|\| | | | | | | | | | | | | | | Cross-merge networking fixes after downstream PR. No conflicts or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * doc/netlink/specs: Add missing attr in rt_link specDonald Hunter2024-02-011-0/+10
| | | | | | | | | | | | | | | | | | | | | | IFLA_DPLL_PIN was added to rt_link messages but not to the spec, which breaks ynl. Add the missing definitions to the rt_link ynl spec. Fixes: 5f1842692880 ("netdev: expose DPLL pin handle for netdevice") Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20240201113853.37432-1-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | dpll: extend uapi by lock status error attributeJiri Pirko2024-02-011-0/+39
| | | | | | | | | | | | | | | | | | | | | | If the dpll devices goes to state "unlocked" or "holdover", it may be caused by an error. In that case, allow user to see what the error was. Introduce a new attribute and values it can carry. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Acked-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
* | doc/netlink/specs: Update the tc specDonald Hunter2024-01-311-101/+2017
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Fill in many of the gaps in the tc netlink spec, including stats attrs, classes and actions. Many documentation strings have also been added. This is still a work in progress, albeit fairly complete: - there are still many attributes left as binary blobs. - actions have not had much testing Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20240129223458.52046-14-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | tools/net/ynl: Add support for nested structsDonald Hunter2024-01-311-3/+12
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make it possible for struct definitions to reference other struct definitions ofr binary members. For example, the tbf qdisc uses this struct definition for its parms attribute: - name: tc-tbf-qopt type: struct members: - name: rate type: binary struct: tc-ratespec - name: peakrate type: binary struct: tc-ratespec - name: limit type: u32 - name: buffer type: u32 - name: mtu type: u32 This adds the necessary schema changes and adds nested struct encoding and decoding to ynl. Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20240129223458.52046-11-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* dpll: expose fractional frequency offset value to userJiri Pirko2024-01-051-0/+11
| | | | | | | | | | | Add a new netlink attribute to expose fractional frequency offset value for a pin. Add an op to get the value from the driver. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Acked-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Acked-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Link: https://lore.kernel.org/r/20240103132838.1501801-2-jiri@resnulli.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* Revert "Introduce PHY listing and link_topology tracking"Jakub Kicinski2024-01-041-68/+0
| | | | | | | | | | | | | | | | | | | | | | This reverts commit 32bb4515e34469975abc936deb0a116c4a445817. This reverts commit d078d480639a4f3b5fc2d56247afa38e0956483a. This reverts commit fcc4b105caa4b844bf043375bf799c20a9c99db1. This reverts commit 345237dbc1bdbb274c9fb9ec38976261ff4a40b8. This reverts commit 7db69ec9cfb8b4ab50420262631fb2d1908b25bf. This reverts commit 95132a018f00f5dad38bdcfd4180d1af955d46f6. This reverts commit 63d5eaf35ac36cad00cfb3809d794ef0078c822b. This reverts commit c29451aefcb42359905d18678de38e52eccb3bb5. This reverts commit 2ab0edb505faa9ac90dee1732571390f074e8113. This reverts commit dedd702a35793ab462fce4c737eeba0badf9718e. This reverts commit 034fcc210349b873ece7356905be5c6ca11eef2a. This reverts commit 9c5625f559ad6fe9f6f733c11475bf470e637d34. This reverts commit 02018c544ef113e980a2349eba89003d6f399d22. Looks like we need more time for reviews, and incremental changes will be hard to make sense of. So revert. Link: https://lore.kernel.org/all/ZZP6FV5sXEf+xd58@shell.armlinux.org.uk/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* netlink: specs: add ethnl PHY_GET command setMaxime Chevallier2024-01-011-0/+65
| | | | | | | | | | | The PHY_GET command, supporting both DUMP and GET operations, is used to retrieve the list of PHYs connected to a netdevice, and get topology information to know where exactly it sits on the physical link. Add the netlink specs corresponding to that command. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netlink: specs: add phy-index as a header parameterMaxime Chevallier2024-01-011-0/+3
| | | | | | | | | Update the spec to take the newly introduced phy-index as a generic request parameter. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* devlink: extend multicast filtering by port indexJiri Pirko2023-12-191-0/+1
| | | | | | | | | Expose the previously introduced notification multicast messages filtering infrastructure and allow the user to select messages using port index. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
* devlink: add a command to set notification filter and use it for multicastsJiri Pirko2023-12-191-0/+10
| | | | | | | | | | | | | | | | | | Currently the user listening on a socket for devlink notifications gets always all messages for all existing instances, even if he is interested only in one of those. That may cause unnecessary overhead on setups with thousands of instances present. User is currently able to narrow down the devlink objects replies to dump commands by specifying select attributes. Allow similar approach for notifications. Introduce a new devlink NOTIFY_FILTER_SET which the user passes the select attributes. Store these per-socket and use them for filtering messages during multicast send. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
* Merge tag 'for-netdev' of ↵Jakub Kicinski2023-12-181-0/+4
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Alexei Starovoitov says: ==================== pull-request: bpf-next 2023-12-18 This PR is larger than usual and contains changes in various parts of the kernel. The main changes are: 1) Fix kCFI bugs in BPF, from Peter Zijlstra. End result: all forms of indirect calls from BPF into kernel and from kernel into BPF work with CFI enabled. This allows BPF to work with CONFIG_FINEIBT=y. 2) Introduce BPF token object, from Andrii Nakryiko. It adds an ability to delegate a subset of BPF features from privileged daemon (e.g., systemd) through special mount options for userns-bound BPF FS to a trusted unprivileged application. The design accommodates suggestions from Christian Brauner and Paul Moore. Example: $ sudo mkdir -p /sys/fs/bpf/token $ sudo mount -t bpf bpffs /sys/fs/bpf/token \ -o delegate_cmds=prog_load:MAP_CREATE \ -o delegate_progs=kprobe \ -o delegate_attachs=xdp 3) Various verifier improvements and fixes, from Andrii Nakryiko, Andrei Matei. - Complete precision tracking support for register spills - Fix verification of possibly-zero-sized stack accesses - Fix access to uninit stack slots - Track aligned STACK_ZERO cases as imprecise spilled registers. It improves the verifier "instructions processed" metric from single digit to 50-60% for some programs. - Fix verifier retval logic 4) Support for VLAN tag in XDP hints, from Larysa Zaremba. 5) Allocate BPF trampoline via bpf_prog_pack mechanism, from Song Liu. End result: better memory utilization and lower I$ miss for calls to BPF via BPF trampoline. 6) Fix race between BPF prog accessing inner map and parallel delete, from Hou Tao. 7) Add bpf_xdp_get_xfrm_state() kfunc, from Daniel Xu. It allows BPF interact with IPSEC infra. The intent is to support software RSS (via XDP) for the upcoming ipsec pcpu work. Experiments on AWS demonstrate single tunnel pcpu ipsec reaching line rate on 100G ENA nics. 8) Expand bpf_cgrp_storage to support cgroup1 non-attach, from Yafang Shao. 9) BPF file verification via fsverity, from Song Liu. It allows BPF progs get fsverity digest. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (164 commits) bpf: Ensure precise is reset to false in __mark_reg_const_zero() selftests/bpf: Add more uprobe multi fail tests bpf: Fail uprobe multi link with negative offset selftests/bpf: Test the release of map btf s390/bpf: Fix indirect trampoline generation selftests/bpf: Temporarily disable dummy_struct_ops test on s390 x86/cfi,bpf: Fix bpf_exception_cb() signature bpf: Fix dtor CFI cfi: Add CFI_NOSEAL() x86/cfi,bpf: Fix bpf_struct_ops CFI x86/cfi,bpf: Fix bpf_callback_t CFI x86/cfi,bpf: Fix BPF JIT call cfi: Flip headers selftests/bpf: Add test for abnormal cnt during multi-kprobe attachment selftests/bpf: Don't use libbpf_get_error() in kprobe_multi_test selftests/bpf: Add test for abnormal cnt during multi-uprobe attachment bpf: Limit the number of kprobes when attaching program to multiple kprobes bpf: Limit the number of uprobes when attaching program to multiple uprobes bpf: xdp: Register generic_kfunc_set with XDP programs selftests/bpf: utilize string values for delegate_xxx mount options ... ==================== Link: https://lore.kernel.org/r/20231219000520.34178-1-alexei.starovoitov@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * xdp: Add VLAN tag hintLarysa Zaremba2023-12-131-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement functionality that enables drivers to expose VLAN tag to XDP code. VLAN tag is represented by 2 variables: - protocol ID, which is passed to bpf code in BE - VLAN TCI, in host byte order Acked-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Link: https://lore.kernel.org/r/20231205210847.28460-10-larysa.zaremba@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
* | doc/netlink/specs: Add a spec for tcDonald Hunter2023-12-181-0/+2031
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a work-in-progress spec for tc that covers: - most of the qdiscs - the flower classifier - new, del, get for qdisc, chain, class and filter Notable omissions: - most of the stats attrs are left as binary blobs - notifications are not yet implemented Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20231215093720.18774-9-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | doc/netlink/specs: use pad in structs in rt_linkDonald Hunter2023-12-181-7/+6
| | | | | | | | | | | | | | | | | | | | The rt_link spec was using pad1, pad2 attributes in structs which appears in the ynl output. Replace this with the 'pad' type which doesn't pollute the output. Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20231215093720.18774-8-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | doc/netlink/specs: Add sub-message type to rt_link familyDonald Hunter2023-12-181-4/+432
| | | | | | | | | | | | | | | | | | Start using sub-message selectors in the rt_link spec for the link-specific 'data' and 'slave-data' attributes. Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20231215093720.18774-7-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | doc/netlink: Add sub-message support to netlink-rawDonald Hunter2023-12-181-3/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a 'sub-message' attribute type with a selector that supports polymorphic attribute formats for raw netlink families like tc. A sub-message attribute uses the value of another attribute as a selector key to choose the right sub-message format. For example if the following attribute has already been decoded: { "kind": "gre" } and we encounter the following attribute spec: - name: data type: sub-message sub-message: linkinfo-data-msg selector: kind Then we look for a sub-message definition called 'linkinfo-data-msg' and use the value of the 'kind' attribute i.e. 'gre' as the key to choose the correct format for the sub-message: sub-messages: name: linkinfo-data-msg formats: - value: bridge attribute-set: linkinfo-bridge-attrs - value: gre attribute-set: linkinfo-gre-attrs - value: geneve attribute-set: linkinfo-geneve-attrs This would decode the attribute value as a sub-message with the attribute-set called 'linkinfo-gre-attrs' as the attribute space. A sub-message can have an optional 'fixed-header' followed by zero or more attributes from an attribute-set. For example the following 'tc-options-msg' sub-message defines message formats that use a mixture of fixed-header, attribute-set or both together: sub-messages: - name: tc-options-msg formats: - value: bfifo fixed-header: tc-fifo-qopt - value: cake attribute-set: tc-cake-attrs - value: netem fixed-header: tc-netem-qopt attribute-set: tc-netem-attrs Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20231215093720.18774-3-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* | netlink: specs: mptcp: rename the MPTCP path management specJakub Kicinski2023-12-151-0/+0
| | | | | | | | | | | | | | | | | | | | | | | | We assume in handful of places that the name of the spec is the same as the name of the family. We could fix that but it seems like a fair assumption to make. Rename the MPTCP spec instead. Reviewed-by: Mat Martineau <martineau@kernel.org> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | netlink: specs: ovs: correct enum names in specsJakub Kicinski2023-12-152-0/+5
| | | | | | | | | | | | | | | | | | | | Align the enum-names of OVS with what's actually in the uAPI. Either correct the names, or mark the enum as empty because the values are in fact #defines. Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>