diff options
84 files changed, 1895 insertions, 1205 deletions
diff --git a/Documentation/networking/l2tp.rst b/Documentation/networking/l2tp.rst index a48238a2ec09..498b382d25a0 100644 --- a/Documentation/networking/l2tp.rst +++ b/Documentation/networking/l2tp.rst @@ -4,124 +4,364 @@ L2TP ==== -This document describes how to use the kernel's L2TP drivers to -provide L2TP functionality. L2TP is a protocol that tunnels one or -more sessions over an IP tunnel. It is commonly used for VPNs -(L2TP/IPSec) and by ISPs to tunnel subscriber PPP sessions over an IP -network infrastructure. With L2TPv3, it is also useful as a Layer-2 -tunneling infrastructure. - -Features +Layer 2 Tunneling Protocol (L2TP) allows L2 frames to be tunneled over +an IP network. + +This document covers the kernel's L2TP subsystem. It documents kernel +APIs for application developers who want to use the L2TP subsystem and +it provides some technical details about the internal implementation +which may be useful to kernel developers and maintainers. + +Overview ======== -L2TPv2 (PPP over L2TP (UDP tunnels)). -L2TPv3 ethernet pseudowires. -L2TPv3 PPP pseudowires. -L2TPv3 IP encapsulation. -Netlink sockets for L2TPv3 configuration management. - -History -======= - -The original pppol2tp driver was introduced in 2.6.23 and provided -L2TPv2 functionality (rfc2661). L2TPv2 is used to tunnel one or more PPP -sessions over a UDP tunnel. - -L2TPv3 (rfc3931) changes the protocol to allow different frame types -to be passed over an L2TP tunnel by moving the PPP-specific parts of -the protocol out of the core L2TP packet headers. Each frame type is -known as a pseudowire type. Ethernet, PPP, HDLC, Frame Relay and ATM -pseudowires for L2TP are defined in separate RFC standards. Another -change for L2TPv3 is that it can be carried directly over IP with no -UDP header (UDP is optional). It is also possible to create static -unmanaged L2TPv3 tunnels manually without a control protocol -(userspace daemon) to manage them. - -To support L2TPv3, the original pppol2tp driver was split up to -separate the L2TP and PPP functionality. Existing L2TPv2 userspace -apps should be unaffected as the original pppol2tp sockets API is -retained. L2TPv3, however, uses netlink to manage L2TPv3 tunnels and -sessions. - -Design -====== - -The L2TP protocol separates control and data frames. The L2TP kernel -drivers handle only L2TP data frames; control frames are always -handled by userspace. L2TP control frames carry messages between L2TP -clients/servers and are used to setup / teardown tunnels and -sessions. An L2TP client or server is implemented in userspace. - -Each L2TP tunnel is implemented using a UDP or L2TPIP socket; L2TPIP -provides L2TPv3 IP encapsulation (no UDP) and is implemented using a -new l2tpip socket family. The tunnel socket is typically created by -userspace, though for unmanaged L2TPv3 tunnels, the socket can also be -created by the kernel. Each L2TP session (pseudowire) gets a network -interface instance. In the case of PPP, these interfaces are created -indirectly by pppd using a pppol2tp socket. In the case of ethernet, -the netdevice is created upon a netlink request to create an L2TPv3 -ethernet pseudowire. - -For PPP, the PPPoL2TP driver, net/l2tp/l2tp_ppp.c, provides a -mechanism by which PPP frames carried through an L2TP session are -passed through the kernel's PPP subsystem. The standard PPP daemon, -pppd, handles all PPP interaction with the peer. PPP network -interfaces are created for each local PPP endpoint. The kernel's PPP -subsystem arranges for PPP control frames to be delivered to pppd, -while data frames are forwarded as usual. - -For ethernet, the L2TPETH driver, net/l2tp/l2tp_eth.c, implements a -netdevice driver, managing virtual ethernet devices, one per -pseudowire. These interfaces can be managed using standard Linux tools -such as "ip" and "ifconfig". If only IP frames are passed over the -tunnel, the interface can be given an IP addresses of itself and its -peer. If non-IP frames are to be passed over the tunnel, the interface -can be added to a bridge using brctl. All L2TP datapath protocol -functions are handled by the L2TP core driver. - -Each tunnel and session within a tunnel is assigned a unique tunnel_id -and session_id. These ids are carried in the L2TP header of every -control and data packet. (Actually, in L2TPv3, the tunnel_id isn't -present in data frames - it is inferred from the IP connection on -which the packet was received.) The L2TP driver uses the ids to lookup -internal tunnel and/or session contexts to determine how to handle the -packet. Zero tunnel / session ids are treated specially - zero ids are -never assigned to tunnels or sessions in the network. In the driver, -the tunnel context keeps a reference to the tunnel UDP or L2TPIP -socket. The session context holds data that lets the driver interface -to the kernel's network frame type subsystems, i.e. PPP, ethernet. - -Userspace Programming -===================== - -For L2TPv2, there are a number of requirements on the userspace L2TP -daemon in order to use the pppol2tp driver. - -1. Use a UDP socket per tunnel. - -2. Create a single PPPoL2TP socket per tunnel bound to a special null - session id. This is used only for communicating with the driver but - must remain open while the tunnel is active. Opening this tunnel - management socket causes the driver to mark the tunnel socket as an - L2TP UDP encapsulation socket and flags it for use by the - referenced tunnel id. This hooks up the UDP receive path via - udp_encap_rcv() in net/ipv4/udp.c. PPP data frames are never passed - in this special PPPoX socket. - -3. Create a PPPoL2TP socket per L2TP session. This is typically done - by starting pppd with the pppol2tp plugin and appropriate - arguments. A PPPoL2TP tunnel management socket (Step 2) must be - created before the first PPPoL2TP session socket is created. +The kernel's L2TP subsystem implements the datapath for L2TPv2 and +L2TPv3. L2TPv2 is carried over UDP. L2TPv3 is carried over UDP or +directly over IP (protocol 115). + +The L2TP RFCs define two basic kinds of L2TP packets: control packets +(the "control plane"), and data packets (the "data plane"). The kernel +deals only with data packets. The more complex control packets are +handled by user space. + +An L2TP tunnel carries one or more L2TP sessions. Each tunnel is +associated with a socket. Each session is associated with a virtual +netdevice, e.g. ``pppN``, ``l2tpethN``, through which data frames pass +to/from L2TP. Fields in the L2TP header identify the tunnel or session +and whether it is a control or data packet. When tunnels and sessions +are set up using the Linux kernel API, we're just setting up the L2TP +data path. All aspects of the control protocol are to be handled by +user space. + +This split in responsibilities leads to a natural sequence of +operations when establishing tunnels and sessions. The procedure looks +like this: + + 1) Create a tunnel socket. Exchange L2TP control protocol messages + with the peer over that socket in order to establish a tunnel. + + 2) Create a tunnel context in the kernel, using information + obtained from the peer using the control protocol messages. + + 3) Exchange L2TP control protocol messages with the peer over the + tunnel socket in order to establish a session. + + 4) Create a session context in the kernel using information + obtained from the peer using the control protocol messages. + +L2TP APIs +========= + +This section documents each userspace API of the L2TP subsystem. + +Tunnel Sockets +-------------- + +L2TPv2 always uses UDP. L2TPv3 may use UDP or IP encapsulation. + +To create a tunnel socket for use by L2TP, the standard POSIX +socket API is used. + +For example, for a tunnel using IPv4 addresses and UDP encapsulation:: + + int sockfd = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP); + +Or for a tunnel using IPv6 addresses and IP encapsulation:: + + int sockfd = socket(AF_INET6, SOCK_DGRAM, IPPROTO_L2TP); + +UDP socket programming doesn't need to be covered here. + +IPPROTO_L2TP is an IP protocol type implemented by the kernel's L2TP +subsystem. The L2TPIP socket address is defined in struct +sockaddr_l2tpip and struct sockaddr_l2tpip6 at +`include/uapi/linux/l2tp.h`_. The address includes the L2TP tunnel +(connection) id. To use L2TP IP encapsulation, an L2TPv3 application +should bind the L2TPIP socket using the locally assigned +tunnel id. When the peer's tunnel id and IP address is known, a +connect must be done. + +If the L2TP application needs to handle L2TPv3 tunnel setup requests +from peers using L2TPIP, it must open a dedicated L2TPIP +socket to listen for those requests and bind the socket using tunnel +id 0 since tunnel setup requests are addressed to tunnel id 0. + +An L2TP tunnel and all of its sessions are automatically closed when +its tunnel socket is closed. + +Netlink API +----------- + +L2TP applications use netlink to manage L2TP tunnel and session +instances in the kernel. The L2TP netlink API is defined in +`include/uapi/linux/l2tp.h`_. + +L2TP uses `Generic Netlink`_ (GENL). Several commands are defined: +Create, Delete, Modify and Get for tunnel and session +instances, e.g. ``L2TP_CMD_TUNNEL_CREATE``. The API header lists the +netlink attribute types that can be used with each command. + +Tunnel and session instances are identified by a locally unique +32-bit id. L2TP tunnel ids are given by ``L2TP_ATTR_CONN_ID`` and +``L2TP_ATTR_PEER_CONN_ID`` attributes and L2TP session ids are given +by ``L2TP_ATTR_SESSION_ID`` and ``L2TP_ATTR_PEER_SESSION_ID`` +attributes. If netlink is used to manage L2TPv2 tunnel and session +instances, the L2TPv2 16-bit tunnel/session id is cast to a 32-bit +value in these attributes. + +In the ``L2TP_CMD_TUNNEL_CREATE`` command, ``L2TP_ATTR_FD`` tells the +kernel the tunnel socket fd being used. If not specified, the kernel +creates a kernel socket for the tunnel, using IP parameters set in +``L2TP_ATTR_IP[6]_SADDR``, ``L2TP_ATTR_IP[6]_DADDR``, +``L2TP_ATTR_UDP_SPORT``, ``L2TP_ATTR_UDP_DPORT`` attributes. Kernel +sockets are used to implement unmanaged L2TPv3 tunnels (iproute2's "ip +l2tp" commands). If ``L2TP_ATTR_FD`` is given, it must be a socket fd +that is already bound and connected. There is more information about +unmanaged tunnels later in this document. + +``L2TP_CMD_TUNNEL_CREATE`` attributes:- + +================== ======== === +Attribute Required Use +================== ======== === +CONN_ID Y Sets the tunnel (connection) id. +PEER_CONN_ID Y Sets the peer tunnel (connection) id. +PROTO_VERSION Y Protocol version. 2 or 3. +ENCAP_TYPE Y Encapsulation type: UDP or IP. +FD N Tunnel socket file descriptor. +UDP_CSUM N Enable IPv4 UDP checksums. Used only if FD is + not set. +UDP_ZERO_CSUM6_TX N Zero IPv6 UDP checksum on transmit. Used only + if FD is not set. +UDP_ZERO_CSUM6_RX N Zero IPv6 UDP checksum on receive. Used only if + FD is not set. +IP_SADDR N IPv4 source address. Used only if FD is not + set. +IP_DADDR N IPv4 destination address. Used only if FD is + not set. +UDP_SPORT N UDP source port. Used only if FD is not set. +UDP_DPORT N UDP destination port. Used only if FD is not + set. +IP6_SADDR N IPv6 source address. Used only if FD is not + set. +IP6_DADDR N IPv6 destination address. Used only if FD is + not set. +DEBUG N Debug flags. +================== ======== === + +``L2TP_CMD_TUNNEL_DESTROY`` attributes:- + +================== ======== === +Attribute Required Use +================== ======== === +CONN_ID Y Identifies the tunnel id to be destroyed. +================== ======== === + +``L2TP_CMD_TUNNEL_MODIFY`` attributes:- + +================== ======== === +Attribute Required Use +================== ======== === +CONN_ID Y Identifies the tunnel id to be modified. +DEBUG N Debug flags. +================== ======== === + +``L2TP_CMD_TUNNEL_GET`` attributes:- + +================== ======== === +Attribute Required Use +================== ======== === +CONN_ID N Identifies the tunnel id to be queried. + Ignored in DUMP requests. +================== ======== === + +``L2TP_CMD_SESSION_CREATE`` attributes:- + +================== ======== === +Attribute Required Use +================== ======== === +CONN_ID Y The parent tunnel id. +SESSION_ID Y Sets the session id. +PEER_SESSION_ID Y Sets the parent session id. +PW_TYPE Y Sets the pseudowire type. +DEBUG N Debug flags. +RECV_SEQ N Enable rx data sequence numbers. +SEND_SEQ N Enable tx data sequence numbers. +LNS_MODE N Enable LNS mode (auto-enable data sequence + numbers). +RECV_TIMEOUT N Timeout to wait when reordering received + packets. +L2SPEC_TYPE N Sets layer2-specific-sublayer type (L2TPv3 + only). +COOKIE N Sets optional cookie (L2TPv3 only). +PEER_COOKIE N Sets optional peer cookie (L2TPv3 only). +IFNAME N Sets interface name (L2TPv3 only). +================== ======== === + +For Ethernet session types, this will create an l2tpeth virtual +interface which can then be configured as required. For PPP session +types, a PPPoL2TP socket must also be opened and connected, mapping it +onto the new session. This is covered in "PPPoL2TP Sockets" later. + +``L2TP_CMD_SESSION_DESTROY`` attributes:- + +================== ======== === +Attribute Required Use +================== ======== === +CONN_ID Y Identifies the parent tunnel id of the session + to be destroyed. +SESSION_ID Y Identifies the session id to be destroyed. +IFNAME N Identifies the session by interface name. If + set, this overrides any CONN_ID and SESSION_ID + attributes. Currently supported for L2TPv3 + Ethernet sessions only. +================== ======== === + +``L2TP_CMD_SESSION_MODIFY`` attributes:- + +================== ======== === +Attribute Required Use +================== ======== === +CONN_ID Y Identifies the parent tunnel id of the session + to be modified. +SESSION_ID Y Identifies the session id to be modified. +IFNAME N Identifies the session by interface name. If + set, this overrides any CONN_ID and SESSION_ID + attributes. Currently supported for L2TPv3 + Ethernet sessions only. +DEBUG N Debug flags. +RECV_SEQ N Enable rx data sequence numbers. +SEND_SEQ N Enable tx data sequence numbers. +LNS_MODE N Enable LNS mode (auto-enable data sequence + numbers). +RECV_TIMEOUT N Timeout to wait when reordering received + packets. +================== ======== === + +``L2TP_CMD_SESSION_GET`` attributes:- + +================== ======== === +Attribute Required Use +================== ======== === +CONN_ID N Identifies the tunnel id to be queried. + Ignored for DUMP requests. +SESSION_ID N Identifies the session id to be queried. + Ignored for DUMP requests. +IFNAME N Identifies the session by interface name. + If set, this overrides any CONN_ID and + SESSION_ID attributes. Ignored for DUMP + requests. Currently supported for L2TPv3 + Ethernet sessions only. +================== ======== === + +Application developers should refer to `include/uapi/linux/l2tp.h`_ for +netlink command and attribute definitions. + +Sample userspace code using libmnl_: + + - Open L2TP netlink socket:: + + struct nl_sock *nl_sock; + int l2tp_nl_family_id; + + nl_sock = nl_socket_alloc(); + genl_connect(nl_sock); + genl_id = genl_ctrl_resolve(nl_sock, L2TP_GENL_NAME); + + - Create a tunnel:: + + struct nlmsghdr *nlh; + struct genlmsghdr *gnlh; + + nlh = mnl_nlmsg_put_header(buf); + nlh->nlmsg_type = genl_id; /* assigned to genl socket */ + nlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK; + nlh->nlmsg_seq = seq; + + gnlh = mnl_nlmsg_put_extra_header(nlh, sizeof(*gnlh)); + gnlh->cmd = L2TP_CMD_TUNNEL_CREATE; + gnlh->version = L2TP_GENL_VERSION; + gnlh->reserved = 0; + + mnl_attr_put_u32(nlh, L2TP_ATTR_FD, tunl_sock_fd); + mnl_attr_put_u32(nlh, L2TP_ATTR_CONN_ID, tid); + mnl_attr_put_u32(nlh, L2TP_ATTR_PEER_CONN_ID, peer_tid); + mnl_attr_put_u8(nlh, L2TP_ATTR_PROTO_VERSION, protocol_version); + mnl_attr_put_u16(nlh, L2TP_ATTR_ENCAP_TYPE, encap); + + - Create a session:: + + struct nlmsghdr *nlh; + struct genlmsghdr *gnlh; + + nlh = mnl_nlmsg_put_header(buf); + nlh->nlmsg_type = genl_id; /* assigned to genl socket */ + nlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK; + nlh->nlmsg_seq = seq; + + gnlh = mnl_nlmsg_put_extra_header(nlh, sizeof(*gnlh)); + gnlh->cmd = L2TP_CMD_SESSION_CREATE; + gnlh->version = L2TP_GENL_VERSION; + gnlh->reserved = 0; + + mnl_attr_put_u32(nlh, L2TP_ATTR_CONN_ID, tid); + mnl_attr_put_u32(nlh, L2TP_ATTR_PEER_CONN_ID, peer_tid); + mnl_attr_put_u32(nlh, L2TP_ATTR_SESSION_ID, sid); + mnl_attr_put_u32(nlh, L2TP_ATTR_PEER_SESSION_ID, peer_sid); + mnl_attr_put_u16(nlh, L2TP_ATTR_PW_TYPE, pwtype); + /* there are other session options which can be set using netlink + * attributes during session creation -- see l2tp.h + */ + + - Delete a session:: + + struct nlmsghdr *nlh; + struct genlmsghdr *gnlh; + + nlh = mnl_nlmsg_put_header(buf); + nlh->nlmsg_type = genl_id; /* assigned to genl socket */ + nlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK; + nlh->nlmsg_seq = seq; + + gnlh = mnl_nlmsg_put_extra_header(nlh, sizeof(*gnlh)); + gnlh->cmd = L2TP_CMD_SESSION_DELETE; + gnlh->version = L2TP_GENL_VERSION; + gnlh->reserved = 0; + + mnl_attr_put_u32(nlh, L2TP_ATTR_CONN_ID, tid); + mnl_attr_put_u32(nlh, L2TP_ATTR_SESSION_ID, sid); + + - Delete a tunnel and all of its sessions (if any):: + + struct nlmsghdr *nlh; + struct genlmsghdr *gnlh; + + nlh = mnl_nlmsg_put_header(buf); + nlh->nlmsg_type = genl_id; /* assigned to genl socket */ + nlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK; + nlh->nlmsg_seq = seq; + + gnlh = mnl_nlmsg_put_extra_header(nlh, sizeof(*gnlh)); + gnlh->cmd = L2TP_CMD_TUNNEL_DELETE; + gnlh->version = L2TP_GENL_VERSION; + gnlh->reserved = 0; + + mnl_attr_put_u32(nlh, L2TP_ATTR_CONN_ID, tid); + +PPPoL2TP Session Socket API +--------------------------- + +For PPP session types, a PPPoL2TP socket must be opened and connected +to the L2TP session. When creating PPPoL2TP sockets, the application provides information -to the driver about the socket in a socket connect() call. Source and -destination tunnel and session ids are provided, as well as the file -descriptor of a UDP socket. See struct pppol2tp_addr in -include/linux/if_pppol2tp.h. Note that zero tunnel / session ids are -treated specially. When creating the per-tunnel PPPoL2TP management -socket in Step 2 above, zero source and destination session ids are -specified, which tells the driver to prepare the supplied UDP file -descriptor for use as an L2TP tunnel socket. +to the kernel about the tunnel and session in a socket connect() +call. Source and destination tunnel and session ids are provided, as +well as the file descriptor of a UDP or L2TPIP socket. See struct +pppol2tp_addr in `include/linux/if_pppol2tp.h`_. For historical reasons, +there are unfortunately slightly different address structures for +L2TPv2/L2TPv3 IPv4/IPv6 tunnels and userspace must use the appropriate +structure that matches the tunnel socket type. Userspace may control behavior of the tunnel or session using setsockopt and ioctl on the PPPoX socket. The following socket @@ -130,229 +370,308 @@ options are supported:- ========= =========================================================== DEBUG bitmask of debug message categories. See below. SENDSEQ - 0 => don't send packets with sequence numbers - - 1 => send packets with sequence numbers + - 1 => send packets with sequence numbers RECVSEQ - 0 => receive packet sequence numbers are optional - - 1 => drop receive packets without sequence numbers + - 1 => drop receive packets without sequence numbers LNSMODE - 0 => act as LAC. - - 1 => act as LNS. + - 1 => act as LNS. REORDERTO reorder timeout (in millisecs). If 0, don't try to reorder. ========= =========================================================== -Only the DEBUG option is supported by the special tunnel management -PPPoX socket. - In addition to the standard PPP ioctls, a PPPIOCGL2TPSTATS is provided to retrieve tunnel and session statistics from the kernel using the PPPoX socket of the appropriate tunnel or session. -For L2TPv3, userspace must use the netlink API defined in -include/linux/l2tp.h to manage tunnel and session contexts. The -general procedure to create a new L2TP tunnel with one session is:- - -1. Open a GENL socket using L2TP_GENL_NAME for configuring the kernel - using netlink. - -2. Create a UDP or L2TPIP socket for the tunnel. - -3. Create a new L2TP tunnel using a L2TP_CMD_TUNNEL_CREATE - request. Set attributes according to desired tunnel parameters, - referencing the UDP or L2TPIP socket created in the previous step. - -4. Create a new L2TP session in the tunnel using a - L2TP_CMD_SESSION_CREATE request. - -The tunnel and all of its sessions are closed when the tunnel socket -is closed. The netlink API may also be used to delete sessions and -tunnels. Configuration and status info may be set or read using netlink. - -The L2TP driver also supports static (unmanaged) L2TPv3 tunnels. These -are where there is no L2TP control message exchange with the peer to -setup the tunnel; the tunnel is configured manually at each end of the -tunnel. There is no need for an L2TP userspace application in this -case -- the tunnel socket is created by the kernel and configured -using parameters sent in the L2TP_CMD_TUNNEL_CREATE netlink -request. The "ip" utility of iproute2 has commands for managing static -L2TPv3 tunnels; do "ip l2tp help" for more information. +Sample userspace code: + + - Create session PPPoX data socket:: + + struct sockaddr_pppol2tp sax; + int fd; + + /* Note, the tunnel socket must be bound already, else it + * will not be ready + */ + sax.sa_family = AF_PPPOX; + sax.sa_protocol = PX_PROTO_OL2TP; + sax.pppol2tp.fd = tunnel_fd; + sax.pppol2tp.addr.sin_addr.s_addr = addr->sin_addr.s_addr; + sax.pppol2tp.addr.sin_port = addr->sin_port; + sax.pppol2tp.addr.sin_family = AF_INET; + sax.pppol2tp.s_tunnel = tunnel_id; + sax.pppol2tp.s_session = session_id; + sax.pppol2tp.d_tunnel = peer_tunnel_id; + sax.pppol2tp.d_session = peer_session_id; + + /* session_fd is the fd of the session's PPPoL2TP socket. + * tunnel_fd is the fd of the tunnel UDP / L2TPIP socket. + */ + fd = connect(session_fd, (struct sockaddr *)&sax, sizeof(sax)); + if (fd < 0 ) { + return -errno; + } + return 0; + +Old L2TPv2-only API +------------------- + +When L2TP was first added to the Linux kernel in 2.6.23, it +implemented only L2TPv2 and did not include a netlink API. Instead, +tunnel and session instances in the kernel were managed directly using +only PPPoL2TP sockets. The PPPoL2TP socket is used as described in +section "PPPoL2TP Session Socket API" but tunnel and session instances +are automatically created on a connect() of the socket instead of +being created by a separate netlink request: + + - Tunnels are managed using a tunnel management socket which is a + dedicated PPPoL2TP socket, connected to (invalid) session + id 0. The L2TP tunnel instance is created when the PPPoL2TP + tunnel management socket is connected and is destroyed when the + socket is closed. + + - Session instances are created in the kernel when a PPPoL2TP + socket is connected to a non-zero session id. Session parameters + are set using setsockopt. The L2TP session instance is destroyed + when the socket is closed. + +This API is still supported but its use is discouraged. Instead, new +L2TPv2 applications should use netlink to first create the tunnel and +session, then create a PPPoL2TP socket for the session. + +Unmanaged L2TPv3 tunnels +------------------------ + +The kernel L2TP subsystem also supports static (unmanaged) L2TPv3 +tunnels. Unmanaged tunnels have no userspace tunnel socket, and +exchange no control messages with the peer to set up the tunnel; the +tunnel is configured manually at each end of the tunnel. All +configuration is done using netlink. There is no need for an L2TP +userspace application in this case -- the tunnel socket is created by +the kernel and configured using parameters sent in the +``L2TP_CMD_TUNNEL_CREATE`` netlink request. The ``ip`` utility of +``iproute2`` has commands for managing static L2TPv3 tunnels; do ``ip +l2tp help`` for more information. Debugging -========= - -The driver supports a flexible debug scheme where kernel trace -messages may be optionally enabled per tunnel and per session. Care is -needed when debugging a live system since the messages are not -rate-limited and a busy system could be swamped. Userspace uses -setsockopt on the PPPoX socket to set a debug mask. +--------- -The following debug mask bits are available: +The L2TP subsystem offers a range of debugging interfaces through the +debugfs filesystem. -================ ============================== -L2TP_MSG_DEBUG verbose debug (if compiled in) -L2TP_MSG_CONTROL userspace - kernel interface -L2TP_MSG_SEQ sequence numbers handling -L2TP_MSG_DATA data packets -================ ============================== +To access these interfaces, the debugfs filesystem must first be mounted:: -If enabled, files under a l2tp debugfs directory can be used to dump -kernel state about L2TP tunnels and sessions. To access it, the -debugfs filesystem must first be mounted:: + # mount -t debugfs debugfs /debug - # mount -t debugfs debugfs /debug +Files under the l2tp directory can then be accessed, providing a summary +of the current population of tunnel and session contexts existing in the +kernel:: -Files under the l2tp directory can then be accessed:: - - # cat /debug/l2tp/tunnels + # cat /debug/l2tp/tunnels The debugfs files should not be used by applications to obtain L2TP state information because the file format is subject to change. It is implemented to provide extra debug information to help diagnose -problems.) Users should use the netlink API. +problems. Applications should instead use the netlink API. -/proc/net/pppol2tp is also provided for backwards compatibility with -the original pppol2tp driver. It lists information about L2TPv2 -tunnels and sessions only. Its use is discouraged. +In addition the L2TP subsystem implements tracepoints using the standard +kernel event tracing API. The available L2TP events can be reviewed as +follows:: -Unmanaged L2TPv3 Tunnels -======================== - -Some commercial L2TP products support unmanaged L2TPv3 ethernet -tunnels, where there is no L2TP control protocol; tunnels are -configured at each side manually. New commands are available in -iproute2's ip utility to support this. - -To create an L2TPv3 ethernet pseudowire between local host 192.168.1.1 -and peer 192.168.1.2, using IP addresses 10.5.1.1 and 10.5.1.2 for the -tunnel endpoints:: - - # ip l2tp add tunnel tunnel_id 1 peer_tunnel_id 1 udp_sport 5000 \ - udp_dport 5000 encap udp local 192.168.1.1 remote 192.168.1.2 - # ip l2tp add session tunnel_id 1 session_id 1 peer_session_id 1 - # ip -s -d show dev l2tpeth0 - # ip addr add 10.5.1.2/32 peer 10.5.1.1/32 dev l2tpeth0 - # ip li set dev l2tpeth0 up - -Choose IP addresses to be the address of a local IP interface and that -of the remote system. The IP addresses of the l2tpeth0 interface can be -anything suitable. - -Repeat the above at the peer, with ports, tunnel/session ids and IP -addresses reversed. The tunnel and session IDs can be any non-zero -32-bit number, but the values must be reversed at the peer. - -======================== =================== -Host 1 Host2 -======================== =================== -udp_sport=5000 udp_sport=5001 -udp_dport=5001 udp_dport=5000 -tunnel_id=42 tunnel_id=45 -peer_tunnel_id=45 peer_tunnel_id=42 -session_id=128 session_id=5196755 -peer_session_id=5196755 peer_session_id=128 -======================== =================== - -When done at both ends of the tunnel, it should be possible to send -data over the network. e.g.:: - - # ping 10.5.1.1 - - -Sample Userspace Code -===================== - -1. Create tunnel management PPPoX socket:: - - kernel_fd = socket(AF_PPPOX, SOCK_DGRAM, PX_PROTO_OL2TP); - if (kernel_fd >= 0) { - struct sockaddr_pppol2tp sax; - struct sockaddr_in const *peer_addr; - - peer_addr = l2tp_tunnel_get_peer_addr(tunnel); - memset(&sax, 0, sizeof(sax)); - sax.sa_family = AF_PPPOX; - sax.sa_protocol = PX_PROTO_OL2TP; - sax.pppol2tp.fd = udp_fd; /* fd of tunnel UDP socket */ - sax.pppol2tp.addr.sin_addr.s_addr = peer_addr->sin_addr.s_addr; - sax.pppol2tp.addr.sin_port = peer_addr->sin_port; - sax.pppol2tp.addr.sin_family = AF_INET; - sax.pppol2tp.s_tunnel = tunnel_id; - sax.pppol2tp.s_session = 0; /* special case: mgmt socket */ - sax.pppol2tp.d_tunnel = 0; - sax.pppol2tp.d_session = 0; /* special case: mgmt socket */ - - if(connect(kernel_fd, (struct sockaddr *)&sax, sizeof(sax) ) < 0 ) { - perror("connect failed"); - result = -errno; - goto err; - } - } - -2. Create session PPPoX data socket:: - - struct sockaddr_pppol2tp sax; - int fd; - - /* Note, the target socket must be bound already, else it will not be ready */ - sax.sa_family = AF_PPPOX; - sax.sa_protocol = PX_PROTO_OL2TP; - sax.pppol2tp.fd = tunnel_fd; - sax.pppol2tp.addr.sin_addr.s_addr = addr->sin_addr.s_addr; - sax.pppol2tp.addr.sin_port = addr->sin_port; - sax.pppol2tp.addr.sin_family = AF_INET; - sax.pppol2tp.s_tunnel = tunnel_id; - sax.pppol2tp.s_session = session_id; - sax.pppol2tp.d_tunnel = peer_tunnel_id; - sax.pppol2tp.d_session = peer_session_id; - - /* session_fd is the fd of the session's PPPoL2TP socket. - * tunnel_fd is the fd of the tunnel UDP socket. - */ - fd = connect(session_fd, (struct sockaddr *)&sax, sizeof(sax)); - if (fd < 0 ) { - return -errno; - } - return 0; + # find /debug/tracing/events/l2tp + +Finally, /proc/net/pppol2tp is also provided for backwards compatibility +with the original pppol2tp code. It lists information about L2TPv2 +tunnels and sessions only. Its use is discouraged. Internal Implementation ======================= -The driver keeps a struct l2tp_tunnel context per L2TP tunnel and a -struct l2tp_session context for each session. The l2tp_tunnel is -always associated with a UDP or L2TP/IP socket and keeps a list of -sessions in the tunnel. The l2tp_session context keeps kernel state -about the session. It has private data which is used for data specific -to the session type. With L2TPv2, the session always carried PPP -traffic. With L2TPv3, the session can also carry ethernet frames -(ethernet pseudowire) or other data types such as ATM, HDLC or Frame -Relay. - -When a tunnel is first opened, the reference count on the socket is -increased using sock_hold(). This ensures that the kernel socket -cannot be removed while L2TP's data structures reference it. - -Some L2TP sessions also have a socket (PPP pseudowires) while others -do not (ethernet pseudowires). We can't use the socket reference count -as the reference count for session contexts. The L2TP implementation -therefore has its own internal reference counts on the session -contexts. - -To Do -===== - -Add L2TP tunnel switching support. This would route tunneled traffic -from one L2TP tunnel into another. Specified in -http://tools.ietf.org/html/draft-ietf-l2tpext-tunnel-switching-08 - -Add L2TPv3 VLAN pseudowire support. - -Add L2TPv3 IP pseudowire support. - -Add L2TPv3 ATM pseudowire support. +This section is for kernel developers and maintainers. + +Sockets +------- + +UDP sockets are implemented by the networking core. When an L2TP +tunnel is created using a UDP socket, the socket is set up as an +encapsulated UDP socket by setting encap_rcv and encap_destroy +callbacks on the UDP socket. l2tp_udp_encap_recv is called when +packets are received on the socket. l2tp_udp_encap_destroy is called +when userspace closes the socket. + +L2TPIP sockets are implemented in `net/l2tp/l2tp_ip.c`_ and +`net/l2tp/l2tp_ip6.c`_. + +Tunnels +------- + +The kernel keeps a struct l2tp_tunnel context per L2TP tunnel. The +l2tp_tunnel is always associated with a UDP or L2TP/IP socket and +keeps a list of sessions in the tunnel. When a tunnel is first +registered with L2TP core, the reference count on the socket is +increased. This ensures that the socket cannot be removed while L2TP's +data structures reference it. + +Tunnels are identified by a unique tunnel id. The id is 16-bit for +L2TPv2 and 32-bit for L2TPv3. Internally, the id is stored as a 32-bit +value. + +Tunnels are kept in a per-net list, indexed by tunnel id. The tunnel +id namespace is shared by L2TPv2 and L2TPv3. The tunnel context can be +derived from the socket's sk_user_data. + +Handling tunnel socket close is perhaps the most tricky part of the +L2TP implementation. If userspace closes a tunnel socket, the L2TP +tunnel and all of its sessions must be closed and destroyed. Since the +tunnel context holds a ref on the tunnel socket, the socket's +sk_destruct won't be called until the tunnel sock_put's its +socket. For UDP sockets, when userspace closes the tunnel socket, the +socket's encap_destroy handler is invoked, which L2TP uses to initiate +its tunnel close actions. For L2TPIP sockets, the socket's close +handler initiates the same tunnel close actions. All sessions are +first closed. Each session drops its tunnel ref. When the tunnel ref +reaches zero, the tunnel puts its socket ref. When the socket is +eventually destroyed, it's sk_destruct finally frees the L2TP tunnel +context. + +Sessions +-------- + +The kernel keeps a struct l2tp_session context for each session. Each +session has private data which is used for data specific to the +session type. With L2TPv2, the session always carries PPP +traffic. With L2TPv3, the session can carry Ethernet frames (Ethernet +pseudowire) or other data types such as PPP, ATM, HDLC or Frame +Relay. Linux currently implements only Ethernet and PPP session types. + +Some L2TP session types also have a socket (PPP pseudowires) while +others do not (Ethernet pseudowires). We can't therefore use the +socket reference count as the reference count for session +contexts. The L2TP implementation therefore has its own internal +reference counts on the session contexts. + +Like tunnels, L2TP sessions are identified by a unique +session id. Just as with tunnel ids, the session id is 16-bit for +L2TPv2 and 32-bit for L2TPv3. Internally, the id is stored as a 32-bit +value. + +Sessions hold a ref on their parent tunnel to ensure that the tunnel +stays extant while one or more sessions references it. + +Sessions are kept in a per-tunnel list, indexed by session id. L2TPv3 +sessions are also kept in a per-net list indexed by session id, +because L2TPv3 session ids are unique across all tunnels and L2TPv3 +data packets do not contain a tunnel id in the header. This list is +therefore needed to find the session context associated with a +received data packet when the tunnel context cannot be derived from +the tunnel socket. + +Although the L2TPv3 RFC specifies that L2TPv3 session ids are not +scoped by the tunnel, the kernel does not police this for L2TPv3 UDP +tunnels and does not add sessions of L2TPv3 UDP tunnels into the +per-net session list. In the UDP receive code, we must trust that the +tunnel can be identified using the tunnel socket's sk_user_data and +lookup the session in the tunnel's session list instead of the per-net +session list. + +PPP +--- + +`net/l2tp/l2tp_ppp.c`_ implements the PPPoL2TP socket family. Each PPP +session has a PPPoL2TP socket. + +The PPPoL2TP socket's sk_user_data references the l2tp_session. + +Userspace sends and receives PPP packets over L2TP using a PPPoL2TP +socket. Only PPP control frames pass over this socket: PPP data +packets are handled entirely by the kernel, passing between the L2TP +session and its associated ``pppN`` netdev through the PPP channel +interface of the kernel PPP subsystem. + +The L2TP PPP implementation handles the closing of a PPPoL2TP socket +by closing its corresponding L2TP session. This is complicated because +it must consider racing with netlink session create/destroy requests +and pppol2tp_connect trying to reconnect with a session that is in the +process of being closed. Unlike tunnels, PPP sessions do not hold a +ref on their associated socket, so code must be careful to sock_hold +the socket where necessary. For all the details, see commit +3d609342cc04129ff7568e19316ce3d7451a27e8. + +Ethernet +-------- + +`net/l2tp/l2tp_eth.c`_ implements L2TPv3 Ethernet pseudowires. It +manages a netdev for each session. + +L2TP Ethernet sessions are created and destroyed by netlink request, +or are destroyed when the tunnel is destroyed. Unlike PPP sessions, +Ethernet sessions do not have an associated socket. Miscellaneous ============= -The L2TP drivers were developed as part of the OpenL2TP project by -Katalix Systems Ltd. OpenL2TP is a full-featured L2TP client / server, -designed from the ground up to have the L2TP datapath in the -kernel. The project also implemented the pppol2tp plugin for pppd -which allows pppd to use the kernel driver. Details can be found at -http://www.openl2tp.org. +RFCs +---- + +The kernel code implements the datapath features specified in the +following RFCs: + +======= =============== =================================== +RFC2661 L2TPv2 https://tools.ietf.org/html/rfc2661 +RFC3931 L2TPv3 https://tools.ietf.org/html/rfc3931 +RFC4719 L2TPv3 Ethernet https://tools.ietf.org/html/rfc4719 +======= =============== =================================== + +Implementations +--------------- + +A number of open source applications use the L2TP kernel subsystem: + +============ ============================================== +iproute2 https://github.com/shemminger/iproute2 +go-l2tp https://github.com/katalix/go-l2tp +tunneldigger https://github.com/wlanslovenija/tunneldigger +xl2tpd https://github.com/xelerance/xl2tpd +============ ============================================== + +Limitations +----------- + +The current implementation has a number of limitations: + + 1) Multiple UDP sockets with the same 5-tuple address cannot be + used. The kernel's tunnel context is identified using private + data associated with the socket so it is important that each + socket is uniquely identified by its address. + + 2) Interfacing with openvswitch is not yet implemented. It may be + useful to map OVS Ethernet and VLAN ports into L2TPv3 tunnels. + + 3) VLAN pseudowires are implemented using an ``l2tpethN`` interface + configured with a VLAN sub-interface. Since L2TPv3 VLAN + pseudowires carry one and only one VLAN, it may be better to use + a single netdevice rather than an ``l2tpethN`` and ``l2tpethN``:M + pair per VLAN session. The netlink attribute + ``L2TP_ATTR_VLAN_ID`` was added for this, but it was never + implemented. + +Testing +------- + +Unmanaged L2TPv3 Ethernet features are tested by the kernel's built-in +selftests. See `tools/testing/selftests/net/l2tp.sh`_. + +Another test suite, l2tp-ktest_, covers all +of the L2TP APIs and tunnel/session types. This may be integrated into +the kernel's built-in L2TP selftests in the future. + +.. Links +.. _Generic Netlink: generic_netlink.html +.. _libmnl: https://www.netfilter.org/projects/libmnl +.. _include/uapi/linux/l2tp.h: ../../../include/uapi/linux/l2tp.h +.. _include/linux/if_pppol2tp.h: ../../../include/linux/if_pppol2tp.h +.. _net/l2tp/l2tp_ip.c: ../../../net/l2tp/l2tp_ip.c +.. _net/l2tp/l2tp_ip6.c: ../../../net/l2tp/l2tp_ip6.c +.. _net/l2tp/l2tp_ppp.c: ../../../net/l2tp/l2tp_ppp.c +.. _net/l2tp/l2tp_eth.c: ../../../net/l2tp/l2tp_eth.c +.. _tools/testing/selftests/net/l2tp.sh: ../../../tools/testing/selftests/net/l2tp.sh +.. _l2tp-ktest: https://github.com/katalix/l2tp-ktest diff --git a/MAINTAINERS b/MAINTAINERS index f0068bceeb61..36ec0bd50a8f 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -4692,6 +4692,15 @@ S: Supported W: http://www.chelsio.com F: drivers/crypto/chelsio +CXGB4 INLINE CRYPTO DRIVER +M: Ayush Sawal <ayush.sawal@chelsio.com> +M: Vinay Kumar Yadav <vinay.yadav@chelsio.com> +M: Rohit Maheshwari <rohitm@chelsio.com> +L: netdev@vger.kernel.org +S: Supported +W: http://www.chelsio.com +F: drivers/net/ethernet/chelsio/inline_crypto/ + CXGB4 ETHERNET DRIVER (CXGB4) M: Vishal Kulkarni <vishal@chelsio.com> L: netdev@vger.kernel.org diff --git a/drivers/crypto/chelsio/Kconfig b/drivers/crypto/chelsio/Kconfig index 2984fdf51e85..af161dab49bd 100644 --- a/drivers/crypto/chelsio/Kconfig +++ b/drivers/crypto/chelsio/Kconfig @@ -22,27 +22,6 @@ config CRYPTO_DEV_CHELSIO To compile this driver as a module, choose M here: the module will be called chcr. -config CHELSIO_IPSEC_INLINE - bool "Chelsio IPSec XFRM Tx crypto offload" - depends on CHELSIO_T4 - depends on CRYPTO_DEV_CHELSIO - depends on XFRM_OFFLOAD - depends on INET_ESP_OFFLOAD || INET6_ESP_OFFLOAD - default n - help - Enable support for IPSec Tx Inline. - -config CRYPTO_DEV_CHELSIO_TLS - tristate "Chelsio Crypto Inline TLS Driver" - depends on CHELSIO_T4 - depends on TLS_TOE - select CRYPTO_DEV_CHELSIO - help - Support Chelsio Inline TLS with Chelsio crypto accelerator. - - To compile this driver as a module, choose M here: the module - will be called chtls. - config CHELSIO_TLS_DEVICE bool "Chelsio Inline KTLS Offload" depends on CHELSIO_T4 diff --git a/drivers/crypto/chelsio/Makefile b/drivers/crypto/chelsio/Makefile index 0e9d035927e9..f2e8e2fb4e60 100644 --- a/drivers/crypto/chelsio/Makefile +++ b/drivers/crypto/chelsio/Makefile @@ -6,5 +6,3 @@ chcr-objs := chcr_core.o chcr_algo.o #ifdef CONFIG_CHELSIO_TLS_DEVICE chcr-objs += chcr_ktls.o #endif -chcr-$(CONFIG_CHELSIO_IPSEC_INLINE) += chcr_ipsec.o -obj-$(CONFIG_CRYPTO_DEV_CHELSIO_TLS) += chtls/ diff --git a/drivers/crypto/chelsio/chcr_algo.h b/drivers/crypto/chelsio/chcr_algo.h index d4f6e010dc79..507aafe93f21 100644 --- a/drivers/crypto/chelsio/chcr_algo.h +++ b/drivers/crypto/chelsio/chcr_algo.h @@ -86,39 +86,6 @@ KEY_CONTEXT_OPAD_PRESENT_M) #define KEY_CONTEXT_OPAD_PRESENT_F KEY_CONTEXT_OPAD_PRESENT_V(1U) -#define TLS_KEYCTX_RXFLIT_CNT_S 24 -#define TLS_KEYCTX_RXFLIT_CNT_V(x) ((x) << TLS_KEYCTX_RXFLIT_CNT_S) - -#define TLS_KEYCTX_RXPROT_VER_S 20 -#define TLS_KEYCTX_RXPROT_VER_M 0xf -#define TLS_KEYCTX_RXPROT_VER_V(x) ((x) << TLS_KEYCTX_RXPROT_VER_S) - -#define TLS_KEYCTX_RXCIPH_MODE_S 16 -#define TLS_KEYCTX_RXCIPH_MODE_M 0xf -#define TLS_KEYCTX_RXCIPH_MODE_V(x) ((x) << TLS_KEYCTX_RXCIPH_MODE_S) - -#define TLS_KEYCTX_RXAUTH_MODE_S 12 -#define TLS_KEYCTX_RXAUTH_MODE_M 0xf -#define TLS_KEYCTX_RXAUTH_MODE_V(x) ((x) << TLS_KEYCTX_RXAUTH_MODE_S) - -#define TLS_KEYCTX_RXCIAU_CTRL_S 11 -#define TLS_KEYCTX_RXCIAU_CTRL_V(x) ((x) << TLS_KEYCTX_RXCIAU_CTRL_S) - -#define TLS_KEYCTX_RX_SEQCTR_S 9 -#define TLS_KEYCTX_RX_SEQCTR_M 0x3 -#define TLS_KEYCTX_RX_SEQCTR_V(x) ((x) << TLS_KEYCTX_RX_SEQCTR_S) - -#define TLS_KEYCTX_RX_VALID_S 8 -#define TLS_KEYCTX_RX_VALID_V(x) ((x) << TLS_KEYCTX_RX_VALID_S) - -#define TLS_KEYCTX_RXCK_SIZE_S 3 -#define TLS_KEYCTX_RXCK_SIZE_M 0x7 -#define TLS_KEYCTX_RXCK_SIZE_V(x) ((x) << TLS_KEYCTX_RXCK_SIZE_S) - -#define TLS_KEYCTX_RXMK_SIZE_S 0 -#define TLS_KEYCTX_RXMK_SIZE_M 0x7 -#define TLS_KEYCTX_RXMK_SIZE_V(x) ((x) << TLS_KEYCTX_RXMK_SIZE_S) - #define CHCR_HASH_MAX_DIGEST_SIZE 64 #define CHCR_MAX_SHA_DIGEST_SIZE 64 diff --git a/drivers/crypto/chelsio/chcr_core.c b/drivers/crypto/chelsio/chcr_core.c index bd8dac806e7a..b3570b41a737 100644 --- a/drivers/crypto/chelsio/chcr_core.c +++ b/drivers/crypto/chelsio/chcr_core.c @@ -40,10 +40,6 @@ static const struct tlsdev_ops chcr_ktls_ops = { }; #endif -#ifdef CONFIG_CHELSIO_IPSEC_INLINE -static void update_netdev_features(void); -#endif /* CONFIG_CHELSIO_IPSEC_INLINE */ - static chcr_handler_func work_handlers[NUM_CPL_CMDS] = { [CPL_FW6_PLD] = cpl_fw6_pld_handler, #ifdef CONFIG_CHELSIO_TLS_DEVICE @@ -60,10 +56,8 @@ static struct cxgb4_uld_info chcr_uld_info = { .add = chcr_uld_add, .state_change = chcr_uld_state_change, .rx_handler = chcr_uld_rx_handler, -#if defined(CONFIG_CHELSIO_IPSEC_INLINE) || defined(CONFIG_CHELSIO_TLS_DEVICE) - .tx_handler = chcr_uld_tx_handler, -#endif /* CONFIG_CHELSIO_IPSEC_INLINE || CONFIG_CHELSIO_TLS_DEVICE */ #if defined(CONFIG_CHELSIO_TLS_DEVICE) + .tx_handler = chcr_uld_tx_handler, .tlsdev_ops = &chcr_ktls_ops, #endif }; @@ -241,19 +235,11 @@ int chcr_uld_rx_handler(void *handle, const __be64 *rsp, return 0; } -#if defined(CONFIG_CHELSIO_IPSEC_INLINE) || defined(CONFIG_CHELSIO_TLS_DEVICE) +#if defined(CONFIG_CHELSIO_TLS_DEVICE) int chcr_uld_tx_handler(struct sk_buff *skb, struct net_device *dev) { - /* In case if skb's decrypted bit is set, it's nic tls packet, else it's - * ipsec packet. - */ -#ifdef CONFIG_CHELSIO_TLS_DEVICE if (skb->decrypted) return chcr_ktls_xmit(skb, dev); -#endif -#ifdef CONFIG_CHELSIO_IPSEC_INLINE - return chcr_ipsec_xmit(skb, dev); -#endif return 0; } #endif /* CONFIG_CHELSIO_IPSEC_INLINE || CONFIG_CHELSIO_TLS_DEVICE */ @@ -305,24 +291,6 @@ static int chcr_uld_state_change(void *handle, enum cxgb4_state state) return ret; } -#ifdef CONFIG_CHELSIO_IPSEC_INLINE -static void update_netdev_features(void) -{ - struct uld_ctx *u_ctx, *tmp; - - mutex_lock(&drv_data.drv_mutex); - list_for_each_entry_safe(u_ctx, tmp, &drv_data.inact_dev, entry) { - if (u_ctx->lldi.crypto & ULP_CRYPTO_IPSEC_INLINE) - chcr_add_xfrmops(&u_ctx->lldi); - } - list_for_each_entry_safe(u_ctx, tmp, &drv_data.act_dev, entry) { - if (u_ctx->lldi.crypto & ULP_CRYPTO_IPSEC_INLINE) - chcr_add_xfrmops(&u_ctx->lldi); - } - mutex_unlock(&drv_data.drv_mutex); -} -#endif /* CONFIG_CHELSIO_IPSEC_INLINE */ - static int __init chcr_crypto_init(void) { INIT_LIST_HEAD(&drv_data.act_dev); @@ -332,12 +300,6 @@ static int __init chcr_crypto_init(void) drv_data.last_dev = NULL; cxgb4_register_uld(CXGB4_ULD_CRYPTO, &chcr_uld_info); - #ifdef CONFIG_CHELSIO_IPSEC_INLINE - rtnl_lock(); - update_netdev_features(); - rtnl_unlock(); - #endif /* CONFIG_CHELSIO_IPSEC_INLINE */ - return 0; } diff --git a/drivers/crypto/chelsio/chcr_core.h b/drivers/crypto/chelsio/chcr_core.h index 67d77abd6775..81f6e61401e5 100644 --- a/drivers/crypto/chelsio/chcr_core.h +++ b/drivers/crypto/chelsio/chcr_core.h @@ -72,54 +72,6 @@ struct _key_ctx { unsigned char key[]; }; -#define KEYCTX_TX_WR_IV_S 55 -#define KEYCTX_TX_WR_IV_M 0x1ffULL -#define KEYCTX_TX_WR_IV_V(x) ((x) << KEYCTX_TX_WR_IV_S) -#define KEYCTX_TX_WR_IV_G(x) \ - (((x) >> KEYCTX_TX_WR_IV_S) & KEYCTX_TX_WR_IV_M) - -#define KEYCTX_TX_WR_AAD_S 47 -#define KEYCTX_TX_WR_AAD_M 0xffULL -#define KEYCTX_TX_WR_AAD_V(x) ((x) << KEYCTX_TX_WR_AAD_S) -#define KEYCTX_TX_WR_AAD_G(x) (((x) >> KEYCTX_TX_WR_AAD_S) & \ - KEYCTX_TX_WR_AAD_M) - -#define KEYCTX_TX_WR_AADST_S 39 -#define KEYCTX_TX_WR_AADST_M 0xffULL -#define KEYCTX_TX_WR_AADST_V(x) ((x) << KEYCTX_TX_WR_AADST_S) -#define KEYCTX_TX_WR_AADST_G(x) \ - (((x) >> KEYCTX_TX_WR_AADST_S) & KEYCTX_TX_WR_AADST_M) - -#define KEYCTX_TX_WR_CIPHER_S 30 -#define KEYCTX_TX_WR_CIPHER_M 0x1ffULL -#define KEYCTX_TX_WR_CIPHER_V(x) ((x) << KEYCTX_TX_WR_CIPHER_S) -#define KEYCTX_TX_WR_CIPHER_G(x) \ - (((x) >> KEYCTX_TX_WR_CIPHER_S) & KEYCTX_TX_WR_CIPHER_M) - -#define KEYCTX_TX_WR_CIPHERST_S 23 -#define KEYCTX_TX_WR_CIPHERST_M 0x7f -#define KEYCTX_TX_WR_CIPHERST_V(x) ((x) << KEYCTX_TX_WR_CIPHERST_S) -#define KEYCTX_TX_WR_CIPHERST_G(x) \ - (((x) >> KEYCTX_TX_WR_CIPHERST_S) & KEYCTX_TX_WR_CIPHERST_M) - -#define KEYCTX_TX_WR_AUTH_S 14 -#define KEYCTX_TX_WR_AUTH_M 0x1ff -#define KEYCTX_TX_WR_AUTH_V(x) ((x) << KEYCTX_TX_WR_AUTH_S) -#define KEYCTX_TX_WR_AUTH_G(x) \ - (((x) >> KEYCTX_TX_WR_AUTH_S) & KEYCTX_TX_WR_AUTH_M) - -#define KEYCTX_TX_WR_AUTHST_S 7 -#define KEYCTX_TX_WR_AUTHST_M 0x7f -#define KEYCTX_TX_WR_AUTHST_V(x) ((x) << KEYCTX_TX_WR_AUTHST_S) -#define KEYCTX_TX_WR_AUTHST_G(x) \ - (((x) >> KEYCTX_TX_WR_AUTHST_S) & KEYCTX_TX_WR_AUTHST_M) - -#define KEYCTX_TX_WR_AUTHIN_S 0 -#define KEYCTX_TX_WR_AUTHIN_M 0x7f -#define KEYCTX_TX_WR_AUTHIN_V(x) ((x) << KEYCTX_TX_WR_AUTHIN_S) -#define KEYCTX_TX_WR_AUTHIN_G(x) \ - (((x) >> KEYCTX_TX_WR_AUTHIN_S) & KEYCTX_TX_WR_AUTHIN_M) - #define WQ_RETRY 5 struct chcr_driver_data { struct list_head act_dev; @@ -157,42 +109,6 @@ struct uld_ctx { struct chcr_dev dev; }; -struct sge_opaque_hdr { - void *dev; - dma_addr_t addr[MAX_SKB_FRAGS + 1]; -}; - -struct chcr_ipsec_req { - struct ulp_txpkt ulptx; - struct ulptx_idata sc_imm; - struct cpl_tx_sec_pdu sec_cpl; - struct _key_ctx key_ctx; -}; - -struct chcr_ipsec_wr { - struct fw_ulptx_wr wreq; - struct chcr_ipsec_req req; -}; - -#define ESN_IV_INSERT_OFFSET 12 -struct chcr_ipsec_aadiv { - __be32 spi; - u8 seq_no[8]; - u8 iv[8]; -}; - -struct ipsec_sa_entry { - int hmac_ctrl; - u16 esn; - u16 resv; - unsigned int enckey_len; - unsigned int kctx_len; - unsigned int authsize; - __be32 key_ctx_hdr; - char salt[MAX_SALT]; - char key[2 * AES_MAX_KEY_SIZE]; -}; - /* * sgl_len - calculates the size of an SGL of the given capacity * @n: the number of SGL entries diff --git a/drivers/net/dsa/dsa_loop.c b/drivers/net/dsa/dsa_loop.c index eb600b3dbf26..b588614d1e5e 100644 --- a/drivers/net/dsa/dsa_loop.c +++ b/drivers/net/dsa/dsa_loop.c @@ -28,6 +28,53 @@ static struct dsa_loop_mib_entry dsa_loop_mibs[] = { static struct phy_device *phydevs[PHY_MAX_ADDR]; +enum dsa_loop_devlink_resource_id { + DSA_LOOP_DEVLINK_PARAM_ID_VTU, +}; + +static u64 dsa_loop_devlink_vtu_get(void *priv) +{ + struct dsa_loop_priv *ps = priv; + unsigned int i, count = 0; + struct dsa_loop_vlan *vl; + + for (i = 0; i < ARRAY_SIZE(ps->vlans); i++) { + vl = &ps->vlans[i]; + if (vl->members) + count++; + } + + return count; +} + +static int dsa_loop_setup_devlink_resources(struct dsa_switch *ds) +{ + struct devlink_resource_size_params size_params; + struct dsa_loop_priv *ps = ds->priv; + int err; + + devlink_resource_size_params_init(&size_params, ARRAY_SIZE(ps->vlans), + ARRAY_SIZE(ps->vlans), + 1, DEVLINK_RESOURCE_UNIT_ENTRY); + + err = dsa_devlink_resource_register(ds, "VTU", ARRAY_SIZE(ps->vlans), + DSA_LOOP_DEVLINK_PARAM_ID_VTU, + DEVLINK_RESOURCE_ID_PARENT_TOP, + &size_params); + if (err) + goto out; + + dsa_devlink_resource_occ_get_register(ds, + DSA_LOOP_DEVLINK_PARAM_ID_VTU, + dsa_loop_devlink_vtu_get, ps); + + return 0; + +out: + dsa_devlink_resources_unregister(ds); + return err; +} + static enum dsa_tag_protocol dsa_loop_get_protocol(struct dsa_switch *ds, int port, enum dsa_tag_protocol mp) @@ -48,7 +95,12 @@ static int dsa_loop_setup(struct dsa_switch *ds) dev_dbg(ds->dev, "%s\n", __func__); - return 0; + return dsa_loop_setup_devlink_resources(ds); +} + +static void dsa_loop_teardown(struct dsa_switch *ds) +{ + dsa_devlink_resources_unregister(ds); } static int dsa_loop_get_sset_count(struct dsa_switch *ds, int port, int sset) @@ -243,6 +295,7 @@ static int dsa_loop_port_max_mtu(struct dsa_switch *ds, int port) static const struct dsa_switch_ops dsa_loop_driver = { .get_tag_protocol = dsa_loop_get_protocol, .setup = dsa_loop_setup, + .teardown = dsa_loop_teardown, .get_strings = dsa_loop_get_strings, .get_ethtool_stats = dsa_loop_get_ethtool_stats, .get_sset_count = dsa_loop_get_sset_count, @@ -290,6 +343,7 @@ static int dsa_loop_drv_probe(struct mdio_device *mdiodev) ds->dev = &mdiodev->dev; ds->ops = &dsa_loop_driver; ds->priv = ps; + ds->configure_vlan_while_not_filtering = true; ps->bus = mdiodev->bus; dev_set_drvdata(&mdiodev->dev, ds); diff --git a/drivers/net/dsa/mv88e6xxx/hwtstamp.c b/drivers/net/dsa/mv88e6xxx/hwtstamp.c index a4c488b12e8f..094d17a1d037 100644 --- a/drivers/net/dsa/mv88e6xxx/hwtstamp.c +++ b/drivers/net/dsa/mv88e6xxx/hwtstamp.c @@ -211,49 +211,20 @@ int mv88e6xxx_port_hwtstamp_get(struct dsa_switch *ds, int port, -EFAULT : 0; } -/* Get the start of the PTP header in this skb */ -static u8 *parse_ptp_header(struct sk_buff *skb, unsigned int type) -{ - u8 *data = skb_mac_header(skb); - unsigned int offset = 0; - - if (type & PTP_CLASS_VLAN) - offset += VLAN_HLEN; - - switch (type & PTP_CLASS_PMASK) { - case PTP_CLASS_IPV4: - offset += ETH_HLEN + IPV4_HLEN(data + offset) + UDP_HLEN; - break; - case PTP_CLASS_IPV6: - offset += ETH_HLEN + IP6_HLEN + UDP_HLEN; - break; - case PTP_CLASS_L2: - offset += ETH_HLEN; - break; - default: - return NULL; - } - - /* Ensure that the entire header is present in this packet. */ - if (skb->len + ETH_HLEN < offset + 34) - return NULL; - - return data + offset; -} - /* Returns a pointer to the PTP header if the caller should time stamp, * or NULL if the caller should not. */ -static u8 *mv88e6xxx_should_tstamp(struct mv88e6xxx_chip *chip, int port, - struct sk_buff *skb, unsigned int type) +static struct ptp_header *mv88e6xxx_should_tstamp(struct mv88e6xxx_chip *chip, + int port, struct sk_buff *skb, + unsigned int type) { struct mv88e6xxx_port_hwtstamp *ps = &chip->port_hwtstamp[port]; - u8 *hdr; + struct ptp_header *hdr; if (!chip->info->ptp_support) return NULL; - hdr = parse_ptp_header(skb, type); + hdr = ptp_parse_header(skb, type); if (!hdr) return NULL; @@ -275,12 +246,11 @@ static int mv88e6xxx_ts_valid(u16 status) static int seq_match(struct sk_buff *skb, u16 ts_seqid) { unsigned int type = SKB_PTP_TYPE(skb); - u8 *hdr = parse_ptp_header(skb, type); - __be16 *seqid; + struct ptp_header *hdr; - seqid = (__be16 *)(hdr + OFF_PTP_SEQUENCE_ID); + hdr = ptp_parse_header(skb, type); - return ts_seqid == ntohs(*seqid); + return ts_seqid == ntohs(hdr->sequence_id); } static void mv88e6xxx_get_rxts(struct mv88e6xxx_chip *chip, @@ -357,9 +327,9 @@ static void mv88e6xxx_rxtstamp_work(struct mv88e6xxx_chip *chip, &ps->rx_queue2); } -static int is_pdelay_resp(u8 *msgtype) +static int is_pdelay_resp(const struct ptp_header *hdr) { - return (*msgtype & 0xf) == 3; + return (hdr->tsmt & 0xf) == 3; } bool mv88e6xxx_port_rxtstamp(struct dsa_switch *ds, int port, @@ -367,7 +337,7 @@ bool mv88e6xxx_port_rxtstamp(struct dsa_switch *ds, int port, { struct mv88e6xxx_port_hwtstamp *ps; struct mv88e6xxx_chip *chip; - u8 *hdr; + struct ptp_header *hdr; chip = ds->priv; ps = &chip->port_hwtstamp[port]; @@ -503,8 +473,7 @@ bool mv88e6xxx_port_txtstamp(struct dsa_switch *ds, int port, { struct mv88e6xxx_chip *chip = ds->priv; struct mv88e6xxx_port_hwtstamp *ps = &chip->port_hwtstamp[port]; - __be16 *seq_ptr; - u8 *hdr; + struct ptp_header *hdr; if (!(skb_shinfo(clone)->tx_flags & SKBTX_HW_TSTAMP)) return false; @@ -513,15 +482,13 @@ bool mv88e6xxx_port_txtstamp(struct dsa_switch *ds, int port, if (!hdr) return false; - seq_ptr = (__be16 *)(hdr + OFF_PTP_SEQUENCE_ID); - if (test_and_set_bit_lock(MV88E6XXX_HWTSTAMP_TX_IN_PROGRESS, &ps->state)) return false; ps->tx_skb = clone; ps->tx_tstamp_start = jiffies; - ps->tx_seq_id = be16_to_cpup(seq_ptr); + ps->tx_seq_id = be16_to_cpu(hdr->sequence_id); ptp_schedule_worker(chip->ptp_clock, 0); return true; diff --git a/drivers/net/ethernet/chelsio/Kconfig b/drivers/net/ethernet/chelsio/Kconfig index f6f3ef9a93cf..87cc0ef68b31 100644 --- a/drivers/net/ethernet/chelsio/Kconfig +++ b/drivers/net/ethernet/chelsio/Kconfig @@ -134,4 +134,6 @@ config CHELSIO_LIB help Common library for Chelsio drivers. +source "drivers/net/ethernet/chelsio/inline_crypto/Kconfig" + endif # NET_VENDOR_CHELSIO diff --git a/drivers/net/ethernet/chelsio/Makefile b/drivers/net/ethernet/chelsio/Makefile index c0f978d2e8a7..1a6fd8b2bb7d 100644 --- a/drivers/net/ethernet/chelsio/Makefile +++ b/drivers/net/ethernet/chelsio/Makefile @@ -8,3 +8,4 @@ obj-$(CONFIG_CHELSIO_T3) += cxgb3/ obj-$(CONFIG_CHELSIO_T4) += cxgb4/ obj-$(CONFIG_CHELSIO_T4VF) += cxgb4vf/ obj-$(CONFIG_CHELSIO_LIB) += libcxgb/ +obj-$(CONFIG_CHELSIO_INLINE_CRYPTO) += inline_crypto/ diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h index 9cb8b229c1b3..e5d5c0fb7f47 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h @@ -1196,6 +1196,9 @@ struct adapter { struct cxgb4_tc_u32_table *tc_u32; struct chcr_ktls chcr_ktls; struct chcr_stats_debug chcr_stats; +#if IS_ENABLED(CONFIG_CHELSIO_IPSEC_INLINE) + struct ch_ipsec_stats_debug ch_ipsec_stats; +#endif /* TC flower offload */ bool tc_flower_initialized; diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c index 05f33b7e3677..42112e8ad687 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c @@ -3542,14 +3542,17 @@ static int chcr_stats_show(struct seq_file *seq, void *v) atomic_read(&adap->chcr_stats.error)); seq_printf(seq, "Fallback: %10u \n", atomic_read(&adap->chcr_stats.fallback)); - seq_printf(seq, "IPSec PDU: %10u\n", - atomic_read(&adap->chcr_stats.ipsec_cnt)); seq_printf(seq, "TLS PDU Tx: %10u\n", atomic_read(&adap->chcr_stats.tls_pdu_tx)); seq_printf(seq, "TLS PDU Rx: %10u\n", atomic_read(&adap->chcr_stats.tls_pdu_rx)); seq_printf(seq, "TLS Keys (DDR) Count: %10u\n", atomic_read(&adap->chcr_stats.tls_key)); +#if IS_ENABLED(CONFIG_CHELSIO_IPSEC_INLINE) + seq_puts(seq, "\nChelsio Inline IPsec Crypto Accelerator Stats\n"); + seq_printf(seq, "IPSec PDU: %10u\n", + atomic_read(&adap->ch_ipsec_stats.ipsec_cnt)); +#endif #ifdef CONFIG_CHELSIO_TLS_DEVICE seq_puts(seq, "\nChelsio KTLS Crypto Accelerator Stats\n"); seq_printf(seq, "Tx TLS offload refcount: %20u\n", diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c index 650db92cb11c..f6c1ec140e09 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c @@ -604,17 +604,14 @@ int cxgb4_get_free_ftid(struct net_device *dev, u8 family, bool hash_en, /* If the new rule wants to get inserted into * HPFILTER region, but its prio is greater * than the rule with the highest prio in HASH - * region, then reject the rule. - */ - if (t->tc_hash_tids_max_prio && - tc_prio > t->tc_hash_tids_max_prio) - break; - - /* If there's not enough slots available - * in HPFILTER region, then move on to - * normal FILTER region immediately. + * region, or if there's not enough slots + * available in HPFILTER region, then skip + * trying to insert this rule into HPFILTER + * region and directly go to the next region. */ - if (ftid + n > t->nhpftids) { + if ((t->tc_hash_tids_max_prio && + tc_prio > t->tc_hash_tids_max_prio) || + (ftid + n) > t->nhpftids) { ftid = t->nhpftids; continue; } diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h index a963fd0b4540..83c8189e4088 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h @@ -302,6 +302,7 @@ enum cxgb4_uld { CXGB4_ULD_ISCSI, CXGB4_ULD_ISCSIT, CXGB4_ULD_CRYPTO, + CXGB4_ULD_IPSEC, CXGB4_ULD_TLS, CXGB4_ULD_MAX }; @@ -368,7 +369,6 @@ struct chcr_stats_debug { atomic_t complete; atomic_t error; atomic_t fallback; - atomic_t ipsec_cnt; atomic_t tls_pdu_tx; atomic_t tls_pdu_rx; atomic_t tls_key; @@ -394,6 +394,12 @@ struct chcr_stats_debug { #endif }; +#if IS_ENABLED(CONFIG_CHELSIO_IPSEC_INLINE) +struct ch_ipsec_stats_debug { + atomic_t ipsec_cnt; +}; +#endif + #define OCQ_WIN_OFFSET(pdev, vres) \ (pci_resource_len((pdev), 2) - roundup_pow_of_two((vres)->ocq.size)) diff --git a/drivers/net/ethernet/chelsio/cxgb4/sge.c b/drivers/net/ethernet/chelsio/cxgb4/sge.c index 869431a1eedd..fddd70ee6436 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/sge.c +++ b/drivers/net/ethernet/chelsio/cxgb4/sge.c @@ -1416,9 +1416,9 @@ static netdev_tx_t cxgb4_eth_xmit(struct sk_buff *skb, struct net_device *dev) pi = netdev_priv(dev); adap = pi->adapter; ssi = skb_shinfo(skb); -#ifdef CONFIG_CHELSIO_IPSEC_INLINE +#if IS_ENABLED(CONFIG_CHELSIO_IPSEC_INLINE) if (xfrm_offload(skb) && !ssi->gso_size) - return adap->uld[CXGB4_ULD_CRYPTO].tx_handler(skb, dev); + return adap->uld[CXGB4_ULD_IPSEC].tx_handler(skb, dev); #endif /* CHELSIO_IPSEC_INLINE */ #ifdef CONFIG_CHELSIO_TLS_DEVICE diff --git a/drivers/net/ethernet/chelsio/inline_crypto/Kconfig b/drivers/net/ethernet/chelsio/inline_crypto/Kconfig new file mode 100644 index 000000000000..a3ef057031e4 --- /dev/null +++ b/drivers/net/ethernet/chelsio/inline_crypto/Kconfig @@ -0,0 +1,38 @@ +# SPDX-License-Identifier: GPL-2.0-only +# +# Chelsio inline crypto configuration +# + +config CHELSIO_INLINE_CRYPTO + bool "Chelsio Inline Crypto support" + default y + help + Enable support for inline crypto. + Allows enable/disable from list of inline crypto drivers. + +if CHELSIO_INLINE_CRYPTO + +config CRYPTO_DEV_CHELSIO_TLS + tristate "Chelsio Crypto Inline TLS Driver" + depends on CHELSIO_T4 + depends on TLS_TOE + help + Support Chelsio Inline TLS with Chelsio crypto accelerator. + Enable inline TLS support for Tx and Rx. + + To compile this driver as a module, choose M here: the module + will be called chtls. + +config CHELSIO_IPSEC_INLINE + tristate "Chelsio IPSec XFRM Tx crypto offload" + depends on CHELSIO_T4 + depends on XFRM_OFFLOAD + depends on INET_ESP_OFFLOAD || INET6_ESP_OFFLOAD + help + Support Chelsio Inline IPsec with Chelsio crypto accelerator. + Enable inline IPsec support for Tx. + + To compile this driver as a module, choose M here: the module + will be called ch_ipsec. + +endif # CHELSIO_INLINE_CRYPTO diff --git a/drivers/net/ethernet/chelsio/inline_crypto/Makefile b/drivers/net/ethernet/chelsio/inline_crypto/Makefile new file mode 100644 index 000000000000..9a86ee8f1f38 --- /dev/null +++ b/drivers/net/ethernet/chelsio/inline_crypto/Makefile @@ -0,0 +1,3 @@ +# SPDX-License-Identifier: GPL-2.0-only +obj-$(CONFIG_CRYPTO_DEV_CHELSIO_TLS) += chtls/ +obj-$(CONFIG_CHELSIO_IPSEC_INLINE) += ch_ipsec/ diff --git a/drivers/net/ethernet/chelsio/inline_crypto/ch_ipsec/Makefile b/drivers/net/ethernet/chelsio/inline_crypto/ch_ipsec/Makefile new file mode 100644 index 000000000000..efdcaaebc455 --- /dev/null +++ b/drivers/net/ethernet/chelsio/inline_crypto/ch_ipsec/Makefile @@ -0,0 +1,8 @@ +# SPDX-License-Identifier: GPL-2.0-only +ccflags-y := -I $(srctree)/drivers/net/ethernet/chelsio/cxgb4 \ + -I $(srctree)/drivers/crypto/chelsio + +obj-$(CONFIG_CHELSIO_IPSEC_INLINE) += ch_ipsec.o +ch_ipsec-objs := chcr_ipsec.o + + diff --git a/drivers/crypto/chelsio/chcr_ipsec.c b/drivers/net/ethernet/chelsio/inline_crypto/ch_ipsec/chcr_ipsec.c index 967babd67a51..276f8841becc 100644 --- a/drivers/crypto/chelsio/chcr_ipsec.c +++ b/drivers/net/ethernet/chelsio/inline_crypto/ch_ipsec/chcr_ipsec.c @@ -60,9 +60,7 @@ #include <crypto/scatterwalk.h> #include <crypto/internal/hash.h> -#include "chcr_core.h" -#include "chcr_algo.h" -#include "chcr_crypto.h" +#include "chcr_ipsec.h" /* * Max Tx descriptor space we allow for an Ethernet packet to be inlined @@ -71,11 +69,17 @@ #define MAX_IMM_TX_PKT_LEN 256 #define GCM_ESP_IV_SIZE 8 +static LIST_HEAD(uld_ctx_list); +static DEFINE_MUTEX(dev_mutex); + static int chcr_xfrm_add_state(struct xfrm_state *x); static void chcr_xfrm_del_state(struct xfrm_state *x); static void chcr_xfrm_free_state(struct xfrm_state *x); static bool chcr_ipsec_offload_ok(struct sk_buff *skb, struct xfrm_state *x); static void chcr_advance_esn_state(struct xfrm_state *x); +static int ch_ipsec_uld_state_change(void *handle, enum cxgb4_state new_state); +static void *ch_ipsec_uld_add(const struct cxgb4_lld_info *infop); +static void update_netdev_features(void); static const struct xfrmdev_ops chcr_xfrmdev_ops = { .xdo_dev_state_add = chcr_xfrm_add_state, @@ -102,6 +106,57 @@ void chcr_add_xfrmops(const struct cxgb4_lld_info *lld) } } +static struct cxgb4_uld_info ch_ipsec_uld_info = { + .name = CHIPSEC_DRV_MODULE_NAME, + .nrxq = MAX_ULD_QSETS, + /* Max ntxq will be derived from fw config file*/ + .rxq_size = 1024, + .add = ch_ipsec_uld_add, + .state_change = ch_ipsec_uld_state_change, + .tx_handler = chcr_ipsec_xmit, +}; + +static void *ch_ipsec_uld_add(const struct cxgb4_lld_info *infop) +{ + struct ipsec_uld_ctx *u_ctx; + + pr_info_once("%s - version %s\n", CHIPSEC_DRV_DESC, + CHIPSEC_DRV_VERSION); + u_ctx = kzalloc(sizeof(*u_ctx), GFP_KERNEL); + if (!u_ctx) { + u_ctx = ERR_PTR(-ENOMEM); + goto out; + } + u_ctx->lldi = *infop; +out: + return u_ctx; +} + +static int ch_ipsec_uld_state_change(void *handle, enum cxgb4_state new_state) +{ + struct ipsec_uld_ctx *u_ctx = handle; + + pr_info("new_state %u\n", new_state); + switch (new_state) { + case CXGB4_STATE_UP: + pr_info("%s: Up\n", pci_name(u_ctx->lldi.pdev)); + mutex_lock(&dev_mutex); + list_add_tail(&u_ctx->entry, &uld_ctx_list); + mutex_unlock(&dev_mutex); + break; + case CXGB4_STATE_START_RECOVERY: + case CXGB4_STATE_DOWN: + case CXGB4_STATE_DETACH: + pr_info("%s: Down\n", pci_name(u_ctx->lldi.pdev)); + list_del(&u_ctx->entry); + break; + default: + break; + } + + return 0; +} + static inline int chcr_ipsec_setauthsize(struct xfrm_state *x, struct ipsec_sa_entry *sa_entry) { @@ -538,7 +593,7 @@ inline void *chcr_crypto_wreq(struct sk_buff *skb, unsigned int kctx_len = sa_entry->kctx_len; int qid = q->q.cntxt_id; - atomic_inc(&adap->chcr_stats.ipsec_cnt); + atomic_inc(&adap->ch_ipsec_stats.ipsec_cnt); flits = calc_tx_sec_flits(skb, sa_entry, &immediate); ndesc = DIV_ROUND_UP(flits, 2); @@ -752,3 +807,51 @@ out_free: dev_kfree_skb_any(skb); cxgb4_ring_tx_db(adap, &q->q, ndesc); return NETDEV_TX_OK; } + +static void update_netdev_features(void) +{ + struct ipsec_uld_ctx *u_ctx, *tmp; + + mutex_lock(&dev_mutex); + list_for_each_entry_safe(u_ctx, tmp, &uld_ctx_list, entry) { + if (u_ctx->lldi.crypto & ULP_CRYPTO_IPSEC_INLINE) + chcr_add_xfrmops(&u_ctx->lldi); + } + mutex_unlock(&dev_mutex); +} + +static int __init chcr_ipsec_init(void) +{ + cxgb4_register_uld(CXGB4_ULD_IPSEC, &ch_ipsec_uld_info); + + rtnl_lock(); + update_netdev_features(); + rtnl_unlock(); + + return 0; +} + +static void __exit chcr_ipsec_exit(void) +{ + struct ipsec_uld_ctx *u_ctx, *tmp; + struct adapter *adap; + + mutex_lock(&dev_mutex); + list_for_each_entry_safe(u_ctx, tmp, &uld_ctx_list, entry) { + adap = pci_get_drvdata(u_ctx->lldi.pdev); + atomic_set(&adap->ch_ipsec_stats.ipsec_cnt, 0); + list_del(&u_ctx->entry); + kfree(u_ctx); + } + mutex_unlock(&dev_mutex); + cxgb4_unregister_uld(CXGB4_ULD_IPSEC); +} + +module_init(chcr_ipsec_init); +module_exit(chcr_ipsec_exit); + +MODULE_DESCRIPTION("Crypto IPSEC for Chelsio Terminator cards."); +MODULE_LICENSE("GPL"); +MODULE_AUTHOR("Chelsio Communications"); +MODULE_VERSION(CHIPSEC_DRV_VERSION); + diff --git a/drivers/net/ethernet/chelsio/inline_crypto/ch_ipsec/chcr_ipsec.h b/drivers/net/ethernet/chelsio/inline_crypto/ch_ipsec/chcr_ipsec.h new file mode 100644 index 000000000000..1d110d2edd64 --- /dev/null +++ b/drivers/net/ethernet/chelsio/inline_crypto/ch_ipsec/chcr_ipsec.h @@ -0,0 +1,58 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* Copyright (c) 2018 Chelsio Communications, Inc. */ + +#ifndef __CHCR_IPSEC_H__ +#define __CHCR_IPSEC_H__ + +#include <crypto/algapi.h> +#include "t4_hw.h" +#include "cxgb4.h" +#include "t4_msg.h" +#include "cxgb4_uld.h" + +#include "chcr_core.h" +#include "chcr_algo.h" +#include "chcr_crypto.h" + +#define CHIPSEC_DRV_MODULE_NAME "ch_ipsec" +#define CHIPSEC_DRV_VERSION "1.0.0.0-ko" +#define CHIPSEC_DRV_DESC "Chelsio T6 Crypto Ipsec offload Driver" + +struct ipsec_uld_ctx { + struct list_head entry; + struct cxgb4_lld_info lldi; +}; + +struct chcr_ipsec_req { + struct ulp_txpkt ulptx; + struct ulptx_idata sc_imm; + struct cpl_tx_sec_pdu sec_cpl; + struct _key_ctx key_ctx; +}; + +struct chcr_ipsec_wr { + struct fw_ulptx_wr wreq; + struct chcr_ipsec_req req; +}; + +#define ESN_IV_INSERT_OFFSET 12 +struct chcr_ipsec_aadiv { + __be32 spi; + u8 seq_no[8]; + u8 iv[8]; +}; + +struct ipsec_sa_entry { + int hmac_ctrl; + u16 esn; + u16 resv; + unsigned int enckey_len; + unsigned int kctx_len; + unsigned int authsize; + __be32 key_ctx_hdr; + char salt[MAX_SALT]; + char key[2 * AES_MAX_KEY_SIZE]; +}; + +#endif /* __CHCR_IPSEC_H__ */ + diff --git a/drivers/crypto/chelsio/chtls/Makefile b/drivers/net/ethernet/chelsio/inline_crypto/chtls/Makefile index bc11495acdb3..bc11495acdb3 100644 --- a/drivers/crypto/chelsio/chtls/Makefile +++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/Makefile diff --git a/drivers/crypto/chelsio/chtls/chtls.h b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls.h index 459442704eb1..2d3dfdd2a716 100644 --- a/drivers/crypto/chelsio/chtls/chtls.h +++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls.h @@ -32,6 +32,94 @@ #include "chcr_core.h" #include "chcr_crypto.h" +#define CHTLS_DRV_VERSION "1.0.0.0-ko" + +#define TLS_KEYCTX_RXFLIT_CNT_S 24 +#define TLS_KEYCTX_RXFLIT_CNT_V(x) ((x) << TLS_KEYCTX_RXFLIT_CNT_S) + +#define TLS_KEYCTX_RXPROT_VER_S 20 +#define TLS_KEYCTX_RXPROT_VER_M 0xf +#define TLS_KEYCTX_RXPROT_VER_V(x) ((x) << TLS_KEYCTX_RXPROT_VER_S) + +#define TLS_KEYCTX_RXCIPH_MODE_S 16 +#define TLS_KEYCTX_RXCIPH_MODE_M 0xf +#define TLS_KEYCTX_RXCIPH_MODE_V(x) ((x) << TLS_KEYCTX_RXCIPH_MODE_S) + +#define TLS_KEYCTX_RXAUTH_MODE_S 12 +#define TLS_KEYCTX_RXAUTH_MODE_M 0xf +#define TLS_KEYCTX_RXAUTH_MODE_V(x) ((x) << TLS_KEYCTX_RXAUTH_MODE_S) + +#define TLS_KEYCTX_RXCIAU_CTRL_S 11 +#define TLS_KEYCTX_RXCIAU_CTRL_V(x) ((x) << TLS_KEYCTX_RXCIAU_CTRL_S) + +#define TLS_KEYCTX_RX_SEQCTR_S 9 +#define TLS_KEYCTX_RX_SEQCTR_M 0x3 +#define TLS_KEYCTX_RX_SEQCTR_V(x) ((x) << TLS_KEYCTX_RX_SEQCTR_S) + +#define TLS_KEYCTX_RX_VALID_S 8 +#define TLS_KEYCTX_RX_VALID_V(x) ((x) << TLS_KEYCTX_RX_VALID_S) + +#define TLS_KEYCTX_RXCK_SIZE_S 3 +#define TLS_KEYCTX_RXCK_SIZE_M 0x7 +#define TLS_KEYCTX_RXCK_SIZE_V(x) ((x) << TLS_KEYCTX_RXCK_SIZE_S) + +#define TLS_KEYCTX_RXMK_SIZE_S 0 +#define TLS_KEYCTX_RXMK_SIZE_M 0x7 +#define TLS_KEYCTX_RXMK_SIZE_V(x) ((x) << TLS_KEYCTX_RXMK_SIZE_S) + +#define KEYCTX_TX_WR_IV_S 55 +#define KEYCTX_TX_WR_IV_M 0x1ffULL +#define KEYCTX_TX_WR_IV_V(x) ((x) << KEYCTX_TX_WR_IV_S) +#define KEYCTX_TX_WR_IV_G(x) \ + (((x) >> KEYCTX_TX_WR_IV_S) & KEYCTX_TX_WR_IV_M) + +#define KEYCTX_TX_WR_AAD_S 47 +#define KEYCTX_TX_WR_AAD_M 0xffULL +#define KEYCTX_TX_WR_AAD_V(x) ((x) << KEYCTX_TX_WR_AAD_S) +#define KEYCTX_TX_WR_AAD_G(x) (((x) >> KEYCTX_TX_WR_AAD_S) & \ + KEYCTX_TX_WR_AAD_M) + +#define KEYCTX_TX_WR_AADST_S 39 +#define KEYCTX_TX_WR_AADST_M 0xffULL +#define KEYCTX_TX_WR_AADST_V(x) ((x) << KEYCTX_TX_WR_AADST_S) +#define KEYCTX_TX_WR_AADST_G(x) \ + (((x) >> KEYCTX_TX_WR_AADST_S) & KEYCTX_TX_WR_AADST_M) + +#define KEYCTX_TX_WR_CIPHER_S 30 +#define KEYCTX_TX_WR_CIPHER_M 0x1ffULL +#define KEYCTX_TX_WR_CIPHER_V(x) ((x) << KEYCTX_TX_WR_CIPHER_S) +#define KEYCTX_TX_WR_CIPHER_G(x) \ + (((x) >> KEYCTX_TX_WR_CIPHER_S) & KEYCTX_TX_WR_CIPHER_M) + +#define KEYCTX_TX_WR_CIPHERST_S 23 +#define KEYCTX_TX_WR_CIPHERST_M 0x7f +#define KEYCTX_TX_WR_CIPHERST_V(x) ((x) << KEYCTX_TX_WR_CIPHERST_S) +#define KEYCTX_TX_WR_CIPHERST_G(x) \ + (((x) >> KEYCTX_TX_WR_CIPHERST_S) & KEYCTX_TX_WR_CIPHERST_M) + +#define KEYCTX_TX_WR_AUTH_S 14 +#define KEYCTX_TX_WR_AUTH_M 0x1ff +#define KEYCTX_TX_WR_AUTH_V(x) ((x) << KEYCTX_TX_WR_AUTH_S) +#define KEYCTX_TX_WR_AUTH_G(x) \ + (((x) >> KEYCTX_TX_WR_AUTH_S) & KEYCTX_TX_WR_AUTH_M) + +#define KEYCTX_TX_WR_AUTHST_S 7 +#define KEYCTX_TX_WR_AUTHST_M 0x7f +#define KEYCTX_TX_WR_AUTHST_V(x) ((x) << KEYCTX_TX_WR_AUTHST_S) +#define KEYCTX_TX_WR_AUTHST_G(x) \ + (((x) >> KEYCTX_TX_WR_AUTHST_S) & KEYCTX_TX_WR_AUTHST_M) + +#define KEYCTX_TX_WR_AUTHIN_S 0 +#define KEYCTX_TX_WR_AUTHIN_M 0x7f +#define KEYCTX_TX_WR_AUTHIN_V(x) ((x) << KEYCTX_TX_WR_AUTHIN_S) +#define KEYCTX_TX_WR_AUTHIN_G(x) \ + (((x) >> KEYCTX_TX_WR_AUTHIN_S) & KEYCTX_TX_WR_AUTHIN_M) + +struct sge_opaque_hdr { + void *dev; + dma_addr_t addr[MAX_SKB_FRAGS + 1]; +}; + #define MAX_IVS_PAGE 256 #define TLS_KEY_CONTEXT_SZ 64 #define CIPHER_BLOCK_SIZE 16 diff --git a/drivers/crypto/chelsio/chtls/chtls_cm.c b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c index 05520dccd906..05520dccd906 100644 --- a/drivers/crypto/chelsio/chtls/chtls_cm.c +++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c diff --git a/drivers/crypto/chelsio/chtls/chtls_cm.h b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.h index 47ba81e42f5d..47ba81e42f5d 100644 --- a/drivers/crypto/chelsio/chtls/chtls_cm.h +++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.h diff --git a/drivers/crypto/chelsio/chtls/chtls_hw.c b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_hw.c index f1820aca0d33..f1820aca0d33 100644 --- a/drivers/crypto/chelsio/chtls/chtls_hw.c +++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_hw.c diff --git a/drivers/crypto/chelsio/chtls/chtls_io.c b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_io.c index 2e9acae1cba3..2e9acae1cba3 100644 --- a/drivers/crypto/chelsio/chtls/chtls_io.c +++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_io.c diff --git a/drivers/crypto/chelsio/chtls/chtls_main.c b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_main.c index 66d247efd561..9098b3eed4da 100644 --- a/drivers/crypto/chelsio/chtls/chtls_main.c +++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_main.c @@ -638,4 +638,4 @@ module_exit(chtls_unregister); MODULE_DESCRIPTION("Chelsio TLS Inline driver"); MODULE_LICENSE("GPL"); MODULE_AUTHOR("Chelsio Communications"); -MODULE_VERSION(DRV_VERSION); +MODULE_VERSION(CHTLS_DRV_VERSION); diff --git a/drivers/net/ethernet/cirrus/cs89x0.h b/drivers/net/ethernet/cirrus/cs89x0.h index 91423b70bb45..210f9ec9af4b 100644 --- a/drivers/net/ethernet/cirrus/cs89x0.h +++ b/drivers/net/ethernet/cirrus/cs89x0.h @@ -459,7 +459,3 @@ #define PNP_CNF_INT 0x70 #define PNP_CNF_DMA 0x74 #define PNP_CNF_MEM 0x48 - -#define BIT0 1 -#define BIT15 0x8000 - diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c index 5afb3c9c52d2..597801e7e8ba 100644 --- a/drivers/net/ethernet/ibm/ibmvnic.c +++ b/drivers/net/ethernet/ibm/ibmvnic.c @@ -306,6 +306,7 @@ static void replenish_rx_pool(struct ibmvnic_adapter *adapter, struct ibmvnic_rx_pool *pool) { int count = pool->size - atomic_read(&pool->available); + u64 handle = adapter->rx_scrq[pool->index]->handle; struct device *dev = &adapter->vdev->dev; int buffers_added = 0; unsigned long lpar_rc; @@ -314,7 +315,6 @@ static void replenish_rx_pool(struct ibmvnic_adapter *adapter, unsigned int offset; dma_addr_t dma_addr; unsigned char *dst; - u64 *handle_array; int shift = 0; int index; int i; @@ -322,10 +322,6 @@ static void replenish_rx_pool(struct ibmvnic_adapter *adapter, if (!pool->active) return; - handle_array = (u64 *)((u8 *)(adapter->login_rsp_buf) + - be32_to_cpu(adapter->login_rsp_buf-> - off_rxadd_subcrqs)); - for (i = 0; i < count; ++i) { skb = alloc_skb(pool->buff_size, GFP_ATOMIC); if (!skb) { @@ -369,8 +365,7 @@ static void replenish_rx_pool(struct ibmvnic_adapter *adapter, #endif sub_crq.rx_add.len = cpu_to_be32(pool->buff_size << shift); - lpar_rc = send_subcrq(adapter, handle_array[pool->index], - &sub_crq); + lpar_rc = send_subcrq(adapter, handle, &sub_crq); if (lpar_rc != H_SUCCESS) goto failure; @@ -1524,9 +1519,9 @@ static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev) unsigned int offset; int num_entries = 1; unsigned char *dst; - u64 *handle_array; int index = 0; u8 proto = 0; + u64 handle; netdev_tx_t ret = NETDEV_TX_OK; if (test_bit(0, &adapter->resetting)) { @@ -1553,8 +1548,7 @@ static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev) tx_scrq = adapter->tx_scrq[queue_num]; txq = netdev_get_tx_queue(netdev, skb_get_queue_mapping(skb)); - handle_array = (u64 *)((u8 *)(adapter->login_rsp_buf) + - be32_to_cpu(adapter->login_rsp_buf->off_txsubm_subcrqs)); + handle = tx_scrq->handle; index = tx_pool->free_map[tx_pool->consumer_index]; @@ -1666,14 +1660,14 @@ static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev) ret = NETDEV_TX_OK; goto tx_err_out; } - lpar_rc = send_subcrq_indirect(adapter, handle_array[queue_num], + lpar_rc = send_subcrq_indirect(adapter, handle, (u64)tx_buff->indir_dma, (u64)num_entries); dma_unmap_single(dev, tx_buff->indir_dma, sizeof(tx_buff->indir_arr), DMA_TO_DEVICE); } else { tx_buff->num_entries = num_entries; - lpar_rc = send_subcrq(adapter, handle_array[queue_num], + lpar_rc = send_subcrq(adapter, handle, &tx_crq); } if (lpar_rc != H_SUCCESS) { @@ -4292,6 +4286,10 @@ static int handle_login_rsp(union ibmvnic_crq *login_rsp_crq, struct net_device *netdev = adapter->netdev; struct ibmvnic_login_rsp_buffer *login_rsp = adapter->login_rsp_buf; struct ibmvnic_login_buffer *login = adapter->login_buf; + u64 *tx_handle_array; + u64 *rx_handle_array; + int num_tx_pools; + int num_rx_pools; int i; dma_unmap_single(dev, adapter->login_buf_token, adapter->login_buf_sz, @@ -4326,6 +4324,22 @@ static int handle_login_rsp(union ibmvnic_crq *login_rsp_crq, ibmvnic_remove(adapter->vdev); return -EIO; } + + num_tx_pools = be32_to_cpu(adapter->login_rsp_buf->num_txsubm_subcrqs); + num_rx_pools = be32_to_cpu(adapter->login_rsp_buf->num_rxadd_subcrqs); + + tx_handle_array = (u64 *)((u8 *)(adapter->login_rsp_buf) + + be32_to_cpu(adapter->login_rsp_buf->off_txsubm_subcrqs)); + rx_handle_array = (u64 *)((u8 *)(adapter->login_rsp_buf) + + be32_to_cpu(adapter->login_rsp_buf->off_rxadd_subcrqs)); + + for (i = 0; i < num_tx_pools; i++) + adapter->tx_scrq[i]->handle = tx_handle_array[i]; + + for (i = 0; i < num_rx_pools; i++) + adapter->rx_scrq[i]->handle = rx_handle_array[i]; + + release_login_rsp_buffer(adapter); release_login_buffer(adapter); complete(&adapter->init_done); diff --git a/drivers/net/ethernet/ibm/ibmvnic.h b/drivers/net/ethernet/ibm/ibmvnic.h index f8416e1d4cf0..d99820212edd 100644 --- a/drivers/net/ethernet/ibm/ibmvnic.h +++ b/drivers/net/ethernet/ibm/ibmvnic.h @@ -875,6 +875,7 @@ struct ibmvnic_sub_crq_queue { struct ibmvnic_adapter *adapter; atomic_t used; char name[32]; + u64 handle; }; struct ibmvnic_long_term_buff { diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c index 9650562fc0ef..ca8090a28dec 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c @@ -314,11 +314,9 @@ static int mlxsw_sp_ptp_parse(struct sk_buff *skb, u8 *p_message_type, u16 *p_sequence_id) { - unsigned int offset = 0; unsigned int ptp_class; - u8 *data; + struct ptp_header *hdr; - data = skb_mac_header(skb); ptp_class = ptp_classify_raw(skb); switch (ptp_class & PTP_CLASS_VMASK) { @@ -329,30 +327,14 @@ static int mlxsw_sp_ptp_parse(struct sk_buff *skb, return -ERANGE; } - if (ptp_class & PTP_CLASS_VLAN) - offset += VLAN_HLEN; - - switch (ptp_class & PTP_CLASS_PMASK) { - case PTP_CLASS_IPV4: - offset += ETH_HLEN + IPV4_HLEN(data + offset) + UDP_HLEN; - break; - case PTP_CLASS_IPV6: - offset += ETH_HLEN + IP6_HLEN + UDP_HLEN; - break; - case PTP_CLASS_L2: - offset += ETH_HLEN; - break; - default: - return -ERANGE; - } - - /* PTP header is 34 bytes. */ - if (skb->len < offset + 34) + hdr = ptp_parse_header(skb, ptp_class); + if (!hdr) return -EINVAL; - *p_message_type = data[offset] & 0x0f; - *p_domain_number = data[offset + 4]; - *p_sequence_id = (u16)(data[offset + 30]) << 8 | data[offset + 31]; + *p_message_type = ptp_get_msgtype(hdr, ptp_class); + *p_domain_number = hdr->domain_number; + *p_sequence_id = be16_to_cpu(hdr->sequence_id); + return 0; } diff --git a/drivers/net/ethernet/netronome/nfp/flower/cmsg.h b/drivers/net/ethernet/netronome/nfp/flower/cmsg.h index bf516285510f..a2926b1b3cff 100644 --- a/drivers/net/ethernet/netronome/nfp/flower/cmsg.h +++ b/drivers/net/ethernet/netronome/nfp/flower/cmsg.h @@ -24,6 +24,7 @@ #define NFP_FLOWER_LAYER_VXLAN BIT(7) #define NFP_FLOWER_LAYER2_GRE BIT(0) +#define NFP_FLOWER_LAYER2_QINQ BIT(4) #define NFP_FLOWER_LAYER2_GENEVE BIT(5) #define NFP_FLOWER_LAYER2_GENEVE_OP BIT(6) #define NFP_FLOWER_LAYER2_TUN_IPV6 BIT(7) @@ -319,6 +320,22 @@ struct nfp_flower_mac_mpls { __be32 mpls_lse; }; +/* VLAN details (2W/8B) + * 3 2 1 + * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 + * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + * | outer_tpid | outer_tci | + * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + * | inner_tpid | inner_tci | + * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + */ +struct nfp_flower_vlan { + __be16 outer_tpid; + __be16 outer_tci; + __be16 inner_tpid; + __be16 inner_tci; +}; + /* L4 ports (for UDP, TCP, SCTP) (1W/4B) * 3 2 1 * 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 diff --git a/drivers/net/ethernet/netronome/nfp/flower/main.h b/drivers/net/ethernet/netronome/nfp/flower/main.h index 3bf9c1afa45e..caf12eec9945 100644 --- a/drivers/net/ethernet/netronome/nfp/flower/main.h +++ b/drivers/net/ethernet/netronome/nfp/flower/main.h @@ -30,6 +30,8 @@ struct nfp_app; #define NFP_FLOWER_MASK_ELEMENT_RS 1 #define NFP_FLOWER_MASK_HASH_BITS 10 +#define NFP_FLOWER_KEY_MAX_LW 32 + #define NFP_FL_META_FLAG_MANAGE_MASK BIT(7) #define NFP_FL_MASK_REUSE_TIME_NS 40000 @@ -44,6 +46,7 @@ struct nfp_app; #define NFP_FL_FEATS_FLOW_MOD BIT(5) #define NFP_FL_FEATS_PRE_TUN_RULES BIT(6) #define NFP_FL_FEATS_IPV6_TUN BIT(7) +#define NFP_FL_FEATS_VLAN_QINQ BIT(8) #define NFP_FL_FEATS_HOST_ACK BIT(31) #define NFP_FL_ENABLE_FLOW_MERGE BIT(0) @@ -57,7 +60,8 @@ struct nfp_app; NFP_FL_FEATS_VF_RLIM | \ NFP_FL_FEATS_FLOW_MOD | \ NFP_FL_FEATS_PRE_TUN_RULES | \ - NFP_FL_FEATS_IPV6_TUN) + NFP_FL_FEATS_IPV6_TUN | \ + NFP_FL_FEATS_VLAN_QINQ) struct nfp_fl_mask_id { struct circ_buf mask_id_free_list; diff --git a/drivers/net/ethernet/netronome/nfp/flower/match.c b/drivers/net/ethernet/netronome/nfp/flower/match.c index f7f01e2e3dce..255a4dff6288 100644 --- a/drivers/net/ethernet/netronome/nfp/flower/match.c +++ b/drivers/net/ethernet/netronome/nfp/flower/match.c @@ -10,7 +10,7 @@ static void nfp_flower_compile_meta_tci(struct nfp_flower_meta_tci *ext, struct nfp_flower_meta_tci *msk, - struct flow_rule *rule, u8 key_type) + struct flow_rule *rule, u8 key_type, bool qinq_sup) { u16 tmp_tci; @@ -24,7 +24,7 @@ nfp_flower_compile_meta_tci(struct nfp_flower_meta_tci *ext, msk->nfp_flow_key_layer = key_type; msk->mask_id = ~0; - if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_VLAN)) { + if (!qinq_sup && flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_VLAN)) { struct flow_match_vlan match; flow_rule_match_vlan(rule, &match); @@ -231,6 +231,50 @@ nfp_flower_compile_ip_ext(struct nfp_flower_ip_ext *ext, } static void +nfp_flower_fill_vlan(struct flow_dissector_key_vlan *key, + struct nfp_flower_vlan *frame, + bool outer_vlan) +{ + u16 tci; + + tci = NFP_FLOWER_MASK_VLAN_PRESENT; + tci |= FIELD_PREP(NFP_FLOWER_MASK_VLAN_PRIO, + key->vlan_priority) | + FIELD_PREP(NFP_FLOWER_MASK_VLAN_VID, + key->vlan_id); + + if (outer_vlan) { + frame->outer_tci = cpu_to_be16(tci); + frame->outer_tpid = key->vlan_tpid; + } else { + frame->inner_tci = cpu_to_be16(tci); + frame->inner_tpid = key->vlan_tpid; + } +} + +static void +nfp_flower_compile_vlan(struct nfp_flower_vlan *ext, + struct nfp_flower_vlan *msk, + struct flow_rule *rule) +{ + struct flow_match_vlan match; + + memset(ext, 0, sizeof(struct nfp_flower_vlan)); + memset(msk, 0, sizeof(struct nfp_flower_vlan)); + + if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_VLAN)) { + flow_rule_match_vlan(rule, &match); + nfp_flower_fill_vlan(match.key, ext, true); + nfp_flower_fill_vlan(match.mask, msk, true); + } + if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_CVLAN)) { + flow_rule_match_cvlan(rule, &match); + nfp_flower_fill_vlan(match.key, ext, false); + nfp_flower_fill_vlan(match.mask, msk, false); + } +} + +static void nfp_flower_compile_ipv4(struct nfp_flower_ipv4 *ext, struct nfp_flower_ipv4 *msk, struct flow_rule *rule) { @@ -433,7 +477,10 @@ int nfp_flower_compile_flow_match(struct nfp_app *app, struct netlink_ext_ack *extack) { struct flow_rule *rule = flow_cls_offload_flow_rule(flow); + struct nfp_flower_priv *priv = app->priv; + bool qinq_sup; u32 port_id; + int ext_len; int err; u8 *ext; u8 *msk; @@ -446,9 +493,11 @@ int nfp_flower_compile_flow_match(struct nfp_app *app, ext = nfp_flow->unmasked_data; msk = nfp_flow->mask_data; + qinq_sup = !!(priv->flower_ext_feats & NFP_FL_FEATS_VLAN_QINQ); + nfp_flower_compile_meta_tci((struct nfp_flower_meta_tci *)ext, (struct nfp_flower_meta_tci *)msk, - rule, key_ls->key_layer); + rule, key_ls->key_layer, qinq_sup); ext += sizeof(struct nfp_flower_meta_tci); msk += sizeof(struct nfp_flower_meta_tci); @@ -547,6 +596,14 @@ int nfp_flower_compile_flow_match(struct nfp_app *app, } } + if (NFP_FLOWER_LAYER2_QINQ & key_ls->key_layer_two) { + nfp_flower_compile_vlan((struct nfp_flower_vlan *)ext, + (struct nfp_flower_vlan *)msk, + rule); + ext += sizeof(struct nfp_flower_vlan); + msk += sizeof(struct nfp_flower_vlan); + } + if (key_ls->key_layer & NFP_FLOWER_LAYER_VXLAN || key_ls->key_layer_two & NFP_FLOWER_LAYER2_GENEVE) { if (key_ls->key_layer_two & NFP_FLOWER_LAYER2_TUN_IPV6) { @@ -589,5 +646,15 @@ int nfp_flower_compile_flow_match(struct nfp_app *app, } } + /* Check that the flow key does not exceed the maximum limit. + * All structures in the key is multiples of 4 bytes, so use u32. + */ + ext_len = (u32 *)ext - (u32 *)nfp_flow->unmasked_data; + if (ext_len > NFP_FLOWER_KEY_MAX_LW) { + NL_SET_ERR_MSG_MOD(extack, + "unsupported offload: flow key too long"); + return -EOPNOTSUPP; + } + return 0; } diff --git a/drivers/net/ethernet/netronome/nfp/flower/offload.c b/drivers/net/ethernet/netronome/nfp/flower/offload.c index 4651fe417b7f..44cf738636ef 100644 --- a/drivers/net/ethernet/netronome/nfp/flower/offload.c +++ b/drivers/net/ethernet/netronome/nfp/flower/offload.c @@ -31,6 +31,7 @@ BIT(FLOW_DISSECTOR_KEY_PORTS) | \ BIT(FLOW_DISSECTOR_KEY_ETH_ADDRS) | \ BIT(FLOW_DISSECTOR_KEY_VLAN) | \ + BIT(FLOW_DISSECTOR_KEY_CVLAN) | \ BIT(FLOW_DISSECTOR_KEY_ENC_KEYID) | \ BIT(FLOW_DISSECTOR_KEY_ENC_IPV4_ADDRS) | \ BIT(FLOW_DISSECTOR_KEY_ENC_IPV6_ADDRS) | \ @@ -66,7 +67,8 @@ NFP_FLOWER_LAYER_IPV6) #define NFP_FLOWER_PRE_TUN_RULE_FIELDS \ - (NFP_FLOWER_LAYER_PORT | \ + (NFP_FLOWER_LAYER_EXT_META | \ + NFP_FLOWER_LAYER_PORT | \ NFP_FLOWER_LAYER_MAC | \ NFP_FLOWER_LAYER_IPV4 | \ NFP_FLOWER_LAYER_IPV6) @@ -285,6 +287,30 @@ nfp_flower_calculate_key_layers(struct nfp_app *app, NL_SET_ERR_MSG_MOD(extack, "unsupported offload: loaded firmware does not support VLAN PCP offload"); return -EOPNOTSUPP; } + if (priv->flower_ext_feats & NFP_FL_FEATS_VLAN_QINQ && + !(key_layer_two & NFP_FLOWER_LAYER2_QINQ)) { + key_layer |= NFP_FLOWER_LAYER_EXT_META; + key_size += sizeof(struct nfp_flower_ext_meta); + key_size += sizeof(struct nfp_flower_vlan); + key_layer_two |= NFP_FLOWER_LAYER2_QINQ; + } + } + + if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_CVLAN)) { + struct flow_match_vlan cvlan; + + if (!(priv->flower_ext_feats & NFP_FL_FEATS_VLAN_QINQ)) { + NL_SET_ERR_MSG_MOD(extack, "unsupported offload: loaded firmware does not support VLAN QinQ offload"); + return -EOPNOTSUPP; + } + + flow_rule_match_vlan(rule, &cvlan); + if (!(key_layer_two & NFP_FLOWER_LAYER2_QINQ)) { + key_layer |= NFP_FLOWER_LAYER_EXT_META; + key_size += sizeof(struct nfp_flower_ext_meta); + key_size += sizeof(struct nfp_flower_vlan); + key_layer_two |= NFP_FLOWER_LAYER2_QINQ; + } } if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_ENC_CONTROL)) { @@ -1066,6 +1092,7 @@ err_destroy_merge_flow: * nfp_flower_validate_pre_tun_rule() * @app: Pointer to the APP handle * @flow: Pointer to NFP flow representation of rule + * @key_ls: Pointer to NFP key layers structure * @extack: Netlink extended ACK report * * Verifies the flow as a pre-tunnel rule. @@ -1075,10 +1102,13 @@ err_destroy_merge_flow: static int nfp_flower_validate_pre_tun_rule(struct nfp_app *app, struct nfp_fl_payload *flow, + struct nfp_fl_key_ls *key_ls, struct netlink_ext_ack *extack) { + struct nfp_flower_priv *priv = app->priv; struct nfp_flower_meta_tci *meta_tci; struct nfp_flower_mac_mpls *mac; + u8 *ext = flow->unmasked_data; struct nfp_fl_act_head *act; u8 *mask = flow->mask_data; bool vlan = false; @@ -1086,20 +1116,25 @@ nfp_flower_validate_pre_tun_rule(struct nfp_app *app, u8 key_layer; meta_tci = (struct nfp_flower_meta_tci *)flow->unmasked_data; - if (meta_tci->tci & cpu_to_be16(NFP_FLOWER_MASK_VLAN_PRESENT)) { - u16 vlan_tci = be16_to_cpu(meta_tci->tci); - - vlan_tci &= ~NFP_FLOWER_MASK_VLAN_PRESENT; - flow->pre_tun_rule.vlan_tci = cpu_to_be16(vlan_tci); - vlan = true; - } else { - flow->pre_tun_rule.vlan_tci = cpu_to_be16(0xffff); + key_layer = key_ls->key_layer; + if (!(priv->flower_ext_feats & NFP_FL_FEATS_VLAN_QINQ)) { + if (meta_tci->tci & cpu_to_be16(NFP_FLOWER_MASK_VLAN_PRESENT)) { + u16 vlan_tci = be16_to_cpu(meta_tci->tci); + + vlan_tci &= ~NFP_FLOWER_MASK_VLAN_PRESENT; + flow->pre_tun_rule.vlan_tci = cpu_to_be16(vlan_tci); + vlan = true; + } else { + flow->pre_tun_rule.vlan_tci = cpu_to_be16(0xffff); + } } - key_layer = meta_tci->nfp_flow_key_layer; if (key_layer & ~NFP_FLOWER_PRE_TUN_RULE_FIELDS) { NL_SET_ERR_MSG_MOD(extack, "unsupported pre-tunnel rule: too many match fields"); return -EOPNOTSUPP; + } else if (key_ls->key_layer_two & ~NFP_FLOWER_LAYER2_QINQ) { + NL_SET_ERR_MSG_MOD(extack, "unsupported pre-tunnel rule: non-vlan in extended match fields"); + return -EOPNOTSUPP; } if (!(key_layer & NFP_FLOWER_LAYER_MAC)) { @@ -1109,7 +1144,13 @@ nfp_flower_validate_pre_tun_rule(struct nfp_app *app, /* Skip fields known to exist. */ mask += sizeof(struct nfp_flower_meta_tci); + ext += sizeof(struct nfp_flower_meta_tci); + if (key_ls->key_layer_two) { + mask += sizeof(struct nfp_flower_ext_meta); + ext += sizeof(struct nfp_flower_ext_meta); + } mask += sizeof(struct nfp_flower_in_port); + ext += sizeof(struct nfp_flower_in_port); /* Ensure destination MAC address is fully matched. */ mac = (struct nfp_flower_mac_mpls *)mask; @@ -1118,6 +1159,8 @@ nfp_flower_validate_pre_tun_rule(struct nfp_app *app, return -EOPNOTSUPP; } + mask += sizeof(struct nfp_flower_mac_mpls); + ext += sizeof(struct nfp_flower_mac_mpls); if (key_layer & NFP_FLOWER_LAYER_IPV4 || key_layer & NFP_FLOWER_LAYER_IPV6) { /* Flags and proto fields have same offset in IPv4 and IPv6. */ @@ -1130,7 +1173,6 @@ nfp_flower_validate_pre_tun_rule(struct nfp_app *app, sizeof(struct nfp_flower_ipv4) : sizeof(struct nfp_flower_ipv6); - mask += sizeof(struct nfp_flower_mac_mpls); /* Ensure proto and flags are the only IP layer fields. */ for (i = 0; i < size; i++) @@ -1138,6 +1180,25 @@ nfp_flower_validate_pre_tun_rule(struct nfp_app *app, NL_SET_ERR_MSG_MOD(extack, "unsupported pre-tunnel rule: only flags and proto can be matched in ip header"); return -EOPNOTSUPP; } + ext += size; + mask += size; + } + + if ((priv->flower_ext_feats & NFP_FL_FEATS_VLAN_QINQ)) { + if (key_ls->key_layer_two & NFP_FLOWER_LAYER2_QINQ) { + struct nfp_flower_vlan *vlan_tags; + u16 vlan_tci; + + vlan_tags = (struct nfp_flower_vlan *)ext; + + vlan_tci = be16_to_cpu(vlan_tags->outer_tci); + + vlan_tci &= ~NFP_FLOWER_MASK_VLAN_PRESENT; + flow->pre_tun_rule.vlan_tci = cpu_to_be16(vlan_tci); + vlan = true; + } else { + flow->pre_tun_rule.vlan_tci = cpu_to_be16(0xffff); + } } /* Action must be a single egress or pop_vlan and egress. */ @@ -1220,7 +1281,7 @@ nfp_flower_add_offload(struct nfp_app *app, struct net_device *netdev, goto err_destroy_flow; if (flow_pay->pre_tun_rule.dev) { - err = nfp_flower_validate_pre_tun_rule(app, flow_pay, extack); + err = nfp_flower_validate_pre_tun_rule(app, flow_pay, key_layer, extack); if (err) goto err_destroy_flow; } diff --git a/drivers/net/ethernet/qlogic/qed/qed_rdma.c b/drivers/net/ethernet/qlogic/qed/qed_rdma.c index a4bcde522cdf..4394a4d77224 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_rdma.c +++ b/drivers/net/ethernet/qlogic/qed/qed_rdma.c @@ -1151,7 +1151,6 @@ qed_rdma_destroy_cq(void *rdma_cxt, DP_VERBOSE(p_hwfn, QED_MSG_RDMA, "icid = %08x\n", in_params->icid); p_ramrod_res = - (struct rdma_destroy_cq_output_params *) dma_alloc_coherent(&p_hwfn->cdev->pdev->dev, sizeof(struct rdma_destroy_cq_output_params), &ramrod_res_phys, GFP_KERNEL); diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c index d1da92ac7fbe..c427865d51a4 100644 --- a/drivers/net/ethernet/realtek/r8169_main.c +++ b/drivers/net/ethernet/realtek/r8169_main.c @@ -617,7 +617,6 @@ struct rtl8169_private { struct work_struct work; } wk; - unsigned irq_enabled:1; unsigned supports_gmii:1; unsigned aspm_manageable:1; dma_addr_t counters_phys_addr; @@ -1280,12 +1279,10 @@ static void rtl_irq_disable(struct rtl8169_private *tp) RTL_W32(tp, IntrMask_8125, 0); else RTL_W16(tp, IntrMask, 0); - tp->irq_enabled = 0; } static void rtl_irq_enable(struct rtl8169_private *tp) { - tp->irq_enabled = 1; if (rtl_is_8125(tp)) RTL_W32(tp, IntrMask_8125, tp->irq_mask); else @@ -4541,8 +4538,7 @@ static irqreturn_t rtl8169_interrupt(int irq, void *dev_instance) struct rtl8169_private *tp = dev_instance; u32 status = rtl_get_events(tp); - if (!tp->irq_enabled || (status & 0xffff) == 0xffff || - !(status & tp->irq_mask)) + if ((status & 0xffff) == 0xffff || !(status & tp->irq_mask)) return IRQ_NONE; if (unlikely(status & SYSErr)) { @@ -4596,10 +4592,8 @@ static int rtl8169_poll(struct napi_struct *napi, int budget) rtl_tx(dev, tp, budget); - if (work_done < budget) { - napi_complete_done(napi, work_done); + if (work_done < budget && napi_complete_done(napi, work_done)) rtl_irq_enable(tp); - } return work_done; } diff --git a/drivers/net/ethernet/ti/am65-cpts.c b/drivers/net/ethernet/ti/am65-cpts.c index c59a289e428c..365b5b9c6897 100644 --- a/drivers/net/ethernet/ti/am65-cpts.c +++ b/drivers/net/ethernet/ti/am65-cpts.c @@ -748,42 +748,23 @@ EXPORT_SYMBOL_GPL(am65_cpts_rx_enable); static int am65_skb_get_mtype_seqid(struct sk_buff *skb, u32 *mtype_seqid) { unsigned int ptp_class = ptp_classify_raw(skb); - u8 *msgtype, *data = skb->data; - unsigned int offset = 0; - __be16 *seqid; + struct ptp_header *hdr; + u8 msgtype; + u16 seqid; if (ptp_class == PTP_CLASS_NONE) return 0; - if (ptp_class & PTP_CLASS_VLAN) - offset += VLAN_HLEN; - - switch (ptp_class & PTP_CLASS_PMASK) { - case PTP_CLASS_IPV4: - offset += ETH_HLEN + IPV4_HLEN(data + offset) + UDP_HLEN; - break; - case PTP_CLASS_IPV6: - offset += ETH_HLEN + IP6_HLEN + UDP_HLEN; - break; - case PTP_CLASS_L2: - offset += ETH_HLEN; - break; - default: - return 0; - } - - if (skb->len + ETH_HLEN < offset + OFF_PTP_SEQUENCE_ID + sizeof(*seqid)) + hdr = ptp_parse_header(skb, ptp_class); + if (!hdr) return 0; - if (unlikely(ptp_class & PTP_CLASS_V1)) - msgtype = data + offset + OFF_PTP_CONTROL; - else - msgtype = data + offset; + msgtype = ptp_get_msgtype(hdr, ptp_class); + seqid = ntohs(hdr->sequence_id); - seqid = (__be16 *)(data + offset + OFF_PTP_SEQUENCE_ID); - *mtype_seqid = (*msgtype << AM65_CPTS_EVENT_1_MESSAGE_TYPE_SHIFT) & + *mtype_seqid = (msgtype << AM65_CPTS_EVENT_1_MESSAGE_TYPE_SHIFT) & AM65_CPTS_EVENT_1_MESSAGE_TYPE_MASK; - *mtype_seqid |= (ntohs(*seqid) & AM65_CPTS_EVENT_1_SEQUENCE_ID_MASK); + *mtype_seqid |= (seqid & AM65_CPTS_EVENT_1_SEQUENCE_ID_MASK); return 1; } diff --git a/drivers/net/ethernet/ti/cpts.c b/drivers/net/ethernet/ti/cpts.c index 7c55d395de2c..d1fc7955d422 100644 --- a/drivers/net/ethernet/ti/cpts.c +++ b/drivers/net/ethernet/ti/cpts.c @@ -446,41 +446,22 @@ static const struct ptp_clock_info cpts_info = { static int cpts_skb_get_mtype_seqid(struct sk_buff *skb, u32 *mtype_seqid) { unsigned int ptp_class = ptp_classify_raw(skb); - u8 *msgtype, *data = skb->data; - unsigned int offset = 0; - u16 *seqid; + struct ptp_header *hdr; + u8 msgtype; + u16 seqid; if (ptp_class == PTP_CLASS_NONE) return 0; - if (ptp_class & PTP_CLASS_VLAN) - offset += VLAN_HLEN; - - switch (ptp_class & PTP_CLASS_PMASK) { - case PTP_CLASS_IPV4: - offset += ETH_HLEN + IPV4_HLEN(data + offset) + UDP_HLEN; - break; - case PTP_CLASS_IPV6: - offset += ETH_HLEN + IP6_HLEN + UDP_HLEN; - break; - case PTP_CLASS_L2: - offset += ETH_HLEN; - break; - default: - return 0; - } - - if (skb->len + ETH_HLEN < offset + OFF_PTP_SEQUENCE_ID + sizeof(*seqid)) + hdr = ptp_parse_header(skb, ptp_class); + if (!hdr) return 0; - if (unlikely(ptp_class & PTP_CLASS_V1)) - msgtype = data + offset + OFF_PTP_CONTROL; - else - msgtype = data + offset; + msgtype = ptp_get_msgtype(hdr, ptp_class); + seqid = ntohs(hdr->sequence_id); - seqid = (u16 *)(data + offset + OFF_PTP_SEQUENCE_ID); - *mtype_seqid = (*msgtype & MESSAGE_TYPE_MASK) << MESSAGE_TYPE_SHIFT; - *mtype_seqid |= (ntohs(*seqid) & SEQUENCE_ID_MASK) << SEQUENCE_ID_SHIFT; + *mtype_seqid = (msgtype & MESSAGE_TYPE_MASK) << MESSAGE_TYPE_SHIFT; + *mtype_seqid |= (seqid & SEQUENCE_ID_MASK) << SEQUENCE_ID_SHIFT; return 1; } @@ -528,6 +509,11 @@ void cpts_rx_timestamp(struct cpts *cpts, struct sk_buff *skb) int ret; u64 ns; + /* cpts_rx_timestamp() is called before eth_type_trans(), so + * skb MAC Hdr properties are not configured yet. Hence need to + * reset skb MAC header here + */ + skb_reset_mac_header(skb); ret = cpts_skb_get_mtype_seqid(skb, &skb_cb->skb_mtype_seqid); if (!ret) return; diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c index 9159846b8b93..124045cbcda3 100644 --- a/drivers/net/macsec.c +++ b/drivers/net/macsec.c @@ -1611,7 +1611,7 @@ static const struct nla_policy macsec_genl_rxsc_policy[NUM_MACSEC_RXSC_ATTR] = { static const struct nla_policy macsec_genl_sa_policy[NUM_MACSEC_SA_ATTR] = { [MACSEC_SA_ATTR_AN] = { .type = NLA_U8 }, [MACSEC_SA_ATTR_ACTIVE] = { .type = NLA_U8 }, - [MACSEC_SA_ATTR_PN] = { .type = NLA_MIN_LEN, .len = 4 }, + [MACSEC_SA_ATTR_PN] = NLA_POLICY_MIN_LEN(4), [MACSEC_SA_ATTR_KEYID] = { .type = NLA_BINARY, .len = MACSEC_KEYID_LEN, }, [MACSEC_SA_ATTR_KEY] = { .type = NLA_BINARY, diff --git a/drivers/net/phy/dp83640.c b/drivers/net/phy/dp83640.c index 50fb7d16b75a..fc3d747eba55 100644 --- a/drivers/net/phy/dp83640.c +++ b/drivers/net/phy/dp83640.c @@ -798,51 +798,32 @@ static int decode_evnt(struct dp83640_private *dp83640, return parsed; } -#define DP83640_PACKET_HASH_OFFSET 20 #define DP83640_PACKET_HASH_LEN 10 static int match(struct sk_buff *skb, unsigned int type, struct rxts *rxts) { - unsigned int offset = 0; - u8 *msgtype, *data = skb_mac_header(skb); - __be16 *seqid; + struct ptp_header *hdr; + u8 msgtype; + u16 seqid; u16 hash; /* check sequenceID, messageType, 12 bit hash of offset 20-29 */ - if (type & PTP_CLASS_VLAN) - offset += VLAN_HLEN; - - switch (type & PTP_CLASS_PMASK) { - case PTP_CLASS_IPV4: - offset += ETH_HLEN + IPV4_HLEN(data + offset) + UDP_HLEN; - break; - case PTP_CLASS_IPV6: - offset += ETH_HLEN + IP6_HLEN + UDP_HLEN; - break; - case PTP_CLASS_L2: - offset += ETH_HLEN; - break; - default: + hdr = ptp_parse_header(skb, type); + if (!hdr) return 0; - } - if (skb->len + ETH_HLEN < offset + OFF_PTP_SEQUENCE_ID + sizeof(*seqid)) - return 0; + msgtype = ptp_get_msgtype(hdr, type); - if (unlikely(type & PTP_CLASS_V1)) - msgtype = data + offset + OFF_PTP_CONTROL; - else - msgtype = data + offset; - if (rxts->msgtype != (*msgtype & 0xf)) + if (rxts->msgtype != (msgtype & 0xf)) return 0; - seqid = (__be16 *)(data + offset + OFF_PTP_SEQUENCE_ID); - if (rxts->seqid != ntohs(*seqid)) + seqid = be16_to_cpu(hdr->sequence_id); + if (rxts->seqid != seqid) return 0; hash = ether_crc(DP83640_PACKET_HASH_LEN, - data + offset + DP83640_PACKET_HASH_OFFSET) >> 20; + (unsigned char *)&hdr->source_port_identity) >> 20; if (rxts->hash != hash) return 0; @@ -982,35 +963,16 @@ static void decode_status_frame(struct dp83640_private *dp83640, static int is_sync(struct sk_buff *skb, int type) { - u8 *data = skb->data, *msgtype; - unsigned int offset = 0; - - if (type & PTP_CLASS_VLAN) - offset += VLAN_HLEN; - - switch (type & PTP_CLASS_PMASK) { - case PTP_CLASS_IPV4: - offset += ETH_HLEN + IPV4_HLEN(data + offset) + UDP_HLEN; - break; - case PTP_CLASS_IPV6: - offset += ETH_HLEN + IP6_HLEN + UDP_HLEN; - break; - case PTP_CLASS_L2: - offset += ETH_HLEN; - break; - default: - return 0; - } - - if (type & PTP_CLASS_V1) - offset += OFF_PTP_CONTROL; + struct ptp_header *hdr; + u8 msgtype; - if (skb->len < offset + 1) + hdr = ptp_parse_header(skb, type); + if (!hdr) return 0; - msgtype = data + offset; + msgtype = ptp_get_msgtype(hdr, type); - return (*msgtype & 0xf) == 0; + return (msgtype & 0xf) == 0; } static void dp83640_free_clocks(void) diff --git a/drivers/net/wireguard/netlink.c b/drivers/net/wireguard/netlink.c index 20a4f3c0a0a1..d0f3b6d7f408 100644 --- a/drivers/net/wireguard/netlink.c +++ b/drivers/net/wireguard/netlink.c @@ -22,8 +22,8 @@ static struct genl_family genl_family; static const struct nla_policy device_policy[WGDEVICE_A_MAX + 1] = { [WGDEVICE_A_IFINDEX] = { .type = NLA_U32 }, [WGDEVICE_A_IFNAME] = { .type = NLA_NUL_STRING, .len = IFNAMSIZ - 1 }, - [WGDEVICE_A_PRIVATE_KEY] = { .type = NLA_EXACT_LEN, .len = NOISE_PUBLIC_KEY_LEN }, - [WGDEVICE_A_PUBLIC_KEY] = { .type = NLA_EXACT_LEN, .len = NOISE_PUBLIC_KEY_LEN }, + [WGDEVICE_A_PRIVATE_KEY] = NLA_POLICY_EXACT_LEN(NOISE_PUBLIC_KEY_LEN), + [WGDEVICE_A_PUBLIC_KEY] = NLA_POLICY_EXACT_LEN(NOISE_PUBLIC_KEY_LEN), [WGDEVICE_A_FLAGS] = { .type = NLA_U32 }, [WGDEVICE_A_LISTEN_PORT] = { .type = NLA_U16 }, [WGDEVICE_A_FWMARK] = { .type = NLA_U32 }, @@ -31,12 +31,12 @@ static const struct nla_policy device_policy[WGDEVICE_A_MAX + 1] = { }; static const struct nla_policy peer_policy[WGPEER_A_MAX + 1] = { - [WGPEER_A_PUBLIC_KEY] = { .type = NLA_EXACT_LEN, .len = NOISE_PUBLIC_KEY_LEN }, - [WGPEER_A_PRESHARED_KEY] = { .type = NLA_EXACT_LEN, .len = NOISE_SYMMETRIC_KEY_LEN }, + [WGPEER_A_PUBLIC_KEY] = NLA_POLICY_EXACT_LEN(NOISE_PUBLIC_KEY_LEN), + [WGPEER_A_PRESHARED_KEY] = NLA_POLICY_EXACT_LEN(NOISE_SYMMETRIC_KEY_LEN), [WGPEER_A_FLAGS] = { .type = NLA_U32 }, - [WGPEER_A_ENDPOINT] = { .type = NLA_MIN_LEN, .len = sizeof(struct sockaddr) }, + [WGPEER_A_ENDPOINT] = NLA_POLICY_MIN_LEN(sizeof(struct sockaddr)), [WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL] = { .type = NLA_U16 }, - [WGPEER_A_LAST_HANDSHAKE_TIME] = { .type = NLA_EXACT_LEN, .len = sizeof(struct __kernel_timespec) }, + [WGPEER_A_LAST_HANDSHAKE_TIME] = NLA_POLICY_EXACT_LEN(sizeof(struct __kernel_timespec)), [WGPEER_A_RX_BYTES] = { .type = NLA_U64 }, [WGPEER_A_TX_BYTES] = { .type = NLA_U64 }, [WGPEER_A_ALLOWEDIPS] = { .type = NLA_NESTED }, @@ -45,7 +45,7 @@ static const struct nla_policy peer_policy[WGPEER_A_MAX + 1] = { static const struct nla_policy allowedip_policy[WGALLOWEDIP_A_MAX + 1] = { [WGALLOWEDIP_A_FAMILY] = { .type = NLA_U16 }, - [WGALLOWEDIP_A_IPADDR] = { .type = NLA_MIN_LEN, .len = sizeof(struct in_addr) }, + [WGALLOWEDIP_A_IPADDR] = NLA_POLICY_MIN_LEN(sizeof(struct in_addr)), [WGALLOWEDIP_A_CIDR_MASK] = { .type = NLA_U8 } }; diff --git a/drivers/nfc/st-nci/se.c b/drivers/nfc/st-nci/se.c index f25f1ec5f9e9..807eae04c1e3 100644 --- a/drivers/nfc/st-nci/se.c +++ b/drivers/nfc/st-nci/se.c @@ -331,8 +331,7 @@ static int st_nci_hci_connectivity_event_received(struct nci_dev *ndev, skb->data[0] != NFC_EVT_TRANSACTION_AID_TAG) return -EPROTO; - transaction = (struct nfc_evt_transaction *)devm_kzalloc(dev, - skb->len - 2, GFP_KERNEL); + transaction = devm_kzalloc(dev, skb->len - 2, GFP_KERNEL); if (!transaction) return -ENOMEM; diff --git a/drivers/nfc/st21nfca/se.c b/drivers/nfc/st21nfca/se.c index 6586378cacb0..c8bdf078d111 100644 --- a/drivers/nfc/st21nfca/se.c +++ b/drivers/nfc/st21nfca/se.c @@ -315,8 +315,7 @@ int st21nfca_connectivity_event_received(struct nfc_hci_dev *hdev, u8 host, skb->data[0] != NFC_EVT_TRANSACTION_AID_TAG) return -EPROTO; - transaction = (struct nfc_evt_transaction *)devm_kzalloc(dev, - skb->len - 2, GFP_KERNEL); + transaction = devm_kzalloc(dev, skb->len - 2, GFP_KERNEL); if (!transaction) return -ENOMEM; diff --git a/drivers/ptp/ptp_ines.c b/drivers/ptp/ptp_ines.c index 7711651ff19e..d726c589edda 100644 --- a/drivers/ptp/ptp_ines.c +++ b/drivers/ptp/ptp_ines.c @@ -93,9 +93,6 @@ MODULE_LICENSE("GPL"); #define TC_E2E_PTP_V2 2 #define TC_P2P_PTP_V2 3 -#define OFF_PTP_CLOCK_ID 20 -#define OFF_PTP_PORT_NUM 28 - #define PHY_SPEED_10 0 #define PHY_SPEED_100 1 #define PHY_SPEED_1000 2 @@ -443,57 +440,41 @@ static void ines_link_state(struct mii_timestamper *mii_ts, static bool ines_match(struct sk_buff *skb, unsigned int ptp_class, struct ines_timestamp *ts, struct device *dev) { - u8 *msgtype, *data = skb_mac_header(skb); - unsigned int offset = 0; - __be16 *portn, *seqid; - __be64 *clkid; + struct ptp_header *hdr; + u16 portn, seqid; + u8 msgtype; + u64 clkid; if (unlikely(ptp_class & PTP_CLASS_V1)) return false; - if (ptp_class & PTP_CLASS_VLAN) - offset += VLAN_HLEN; - - switch (ptp_class & PTP_CLASS_PMASK) { - case PTP_CLASS_IPV4: - offset += ETH_HLEN + IPV4_HLEN(data + offset) + UDP_HLEN; - break; - case PTP_CLASS_IPV6: - offset += ETH_HLEN + IP6_HLEN + UDP_HLEN; - break; - case PTP_CLASS_L2: - offset += ETH_HLEN; - break; - default: - return false; - } - - if (skb->len + ETH_HLEN < offset + OFF_PTP_SEQUENCE_ID + sizeof(*seqid)) + hdr = ptp_parse_header(skb, ptp_class); + if (!hdr) return false; - msgtype = data + offset; - clkid = (__be64 *)(data + offset + OFF_PTP_CLOCK_ID); - portn = (__be16 *)(data + offset + OFF_PTP_PORT_NUM); - seqid = (__be16 *)(data + offset + OFF_PTP_SEQUENCE_ID); + msgtype = ptp_get_msgtype(hdr, ptp_class); + clkid = be64_to_cpup((__be64 *)&hdr->source_port_identity.clock_identity.id[0]); + portn = be16_to_cpu(hdr->source_port_identity.port_number); + seqid = be16_to_cpu(hdr->sequence_id); - if (tag_to_msgtype(ts->tag & 0x7) != (*msgtype & 0xf)) { + if (tag_to_msgtype(ts->tag & 0x7) != msgtype) { dev_dbg(dev, "msgtype mismatch ts %hhu != skb %hhu\n", - tag_to_msgtype(ts->tag & 0x7), *msgtype & 0xf); + tag_to_msgtype(ts->tag & 0x7), msgtype); return false; } - if (cpu_to_be64(ts->clkid) != *clkid) { + if (ts->clkid != clkid) { dev_dbg(dev, "clkid mismatch ts %llx != skb %llx\n", - cpu_to_be64(ts->clkid), *clkid); + ts->clkid, clkid); return false; } - if (ts->portnum != ntohs(*portn)) { + if (ts->portnum != portn) { dev_dbg(dev, "portn mismatch ts %hu != skb %hu\n", - ts->portnum, ntohs(*portn)); + ts->portnum, portn); return false; } - if (ts->seqid != ntohs(*seqid)) { + if (ts->seqid != seqid) { dev_dbg(dev, "seqid mismatch ts %hu != skb %hu\n", - ts->seqid, ntohs(*seqid)); + ts->seqid, seqid); return false; } @@ -694,35 +675,16 @@ static void ines_txtstamp_work(struct work_struct *work) static bool is_sync_pdelay_resp(struct sk_buff *skb, int type) { - u8 *data = skb->data, *msgtype; - unsigned int offset = 0; - - if (type & PTP_CLASS_VLAN) - offset += VLAN_HLEN; + struct ptp_header *hdr; + u8 msgtype; - switch (type & PTP_CLASS_PMASK) { - case PTP_CLASS_IPV4: - offset += ETH_HLEN + IPV4_HLEN(data + offset) + UDP_HLEN; - break; - case PTP_CLASS_IPV6: - offset += ETH_HLEN + IP6_HLEN + UDP_HLEN; - break; - case PTP_CLASS_L2: - offset += ETH_HLEN; - break; - default: - return 0; - } - - if (type & PTP_CLASS_V1) - offset += OFF_PTP_CONTROL; - - if (skb->len < offset + 1) - return 0; + hdr = ptp_parse_header(skb, type); + if (!hdr) + return false; - msgtype = data + offset; + msgtype = ptp_get_msgtype(hdr, type); - switch ((*msgtype & 0xf)) { + switch ((msgtype & 0xf)) { case SYNC: case PDELAY_RESP: return true; diff --git a/include/linux/ptp_classify.h b/include/linux/ptp_classify.h index dd00fa41f7e7..8437307cca8c 100644 --- a/include/linux/ptp_classify.h +++ b/include/linux/ptp_classify.h @@ -36,7 +36,6 @@ #define OFF_PTP_SOURCE_UUID 22 /* PTPv1 only */ #define OFF_PTP_SEQUENCE_ID 30 -#define OFF_PTP_CONTROL 32 /* PTPv1 only */ /* Below defines should actually be removed at some point in time. */ #define IP6_HLEN 40 @@ -44,6 +43,30 @@ #define OFF_IHL 14 #define IPV4_HLEN(data) (((struct iphdr *)(data + OFF_IHL))->ihl << 2) +struct clock_identity { + u8 id[8]; +} __packed; + +struct port_identity { + struct clock_identity clock_identity; + __be16 port_number; +} __packed; + +struct ptp_header { + u8 tsmt; /* transportSpecific | messageType */ + u8 ver; /* reserved | versionPTP */ + __be16 message_length; + u8 domain_number; + u8 reserved1; + u8 flag_field[2]; + __be64 correction; + __be32 reserved2; + struct port_identity source_port_identity; + __be16 sequence_id; + u8 control; + u8 log_message_interval; +} __packed; + #if defined(CONFIG_NET_PTP_CLASSIFY) /** * ptp_classify_raw - classify a PTP packet @@ -57,6 +80,46 @@ */ unsigned int ptp_classify_raw(const struct sk_buff *skb); +/** + * ptp_parse_header - Get pointer to the PTP v2 header + * @skb: packet buffer + * @type: type of the packet (see ptp_classify_raw()) + * + * This function takes care of the VLAN, UDP, IPv4 and IPv6 headers. The length + * is checked. + * + * Note, internally skb_mac_header() is used. Make sure that the @skb is + * initialized accordingly. + * + * Return: Pointer to the ptp v2 header or NULL if not found + */ +struct ptp_header *ptp_parse_header(struct sk_buff *skb, unsigned int type); + +/** + * ptp_get_msgtype - Extract ptp message type from given header + * @hdr: ptp header + * @type: type of the packet (see ptp_classify_raw()) + * + * This function returns the message type for a given ptp header. It takes care + * of the different ptp header versions (v1 or v2). + * + * Return: The message type + */ +static inline u8 ptp_get_msgtype(const struct ptp_header *hdr, + unsigned int type) +{ + u8 msgtype; + + if (unlikely(type & PTP_CLASS_V1)) { + /* msg type is located at the control field for ptp v1 */ + msgtype = hdr->control; + } else { + msgtype = hdr->tsmt & 0x0f; + } + + return msgtype; +} + void __init ptp_classifier_init(void); #else static inline void ptp_classifier_init(void) @@ -66,5 +129,10 @@ static inline unsigned int ptp_classify_raw(struct sk_buff *skb) { return PTP_CLASS_NONE; } +static inline struct ptp_header *ptp_parse_header(struct sk_buff *skb, + unsigned int type) +{ + return NULL; +} #endif #endif /* _PTP_CLASSIFY_H_ */ diff --git a/include/net/netlink.h b/include/net/netlink.h index c0411f14fb53..fdd317f8fde4 100644 --- a/include/net/netlink.h +++ b/include/net/netlink.h @@ -181,8 +181,6 @@ enum { NLA_S64, NLA_BITFIELD32, NLA_REJECT, - NLA_EXACT_LEN, - NLA_MIN_LEN, __NLA_TYPE_MAX, }; @@ -199,11 +197,11 @@ struct netlink_range_validation_signed { enum nla_policy_validation { NLA_VALIDATE_NONE, NLA_VALIDATE_RANGE, + NLA_VALIDATE_RANGE_WARN_TOO_LONG, NLA_VALIDATE_MIN, NLA_VALIDATE_MAX, NLA_VALIDATE_RANGE_PTR, NLA_VALIDATE_FUNCTION, - NLA_VALIDATE_WARN_TOO_LONG, }; /** @@ -222,7 +220,7 @@ enum nla_policy_validation { * NLA_NUL_STRING Maximum length of string (excluding NUL) * NLA_FLAG Unused * NLA_BINARY Maximum length of attribute payload - * NLA_MIN_LEN Minimum length of attribute payload + * (but see also below with the validation type) * NLA_NESTED, * NLA_NESTED_ARRAY Length verification is done by checking len of * nested header (or empty); len field is used if @@ -237,11 +235,6 @@ enum nla_policy_validation { * just like "All other" * NLA_BITFIELD32 Unused * NLA_REJECT Unused - * NLA_EXACT_LEN Attribute should have exactly this length, otherwise - * it is rejected or warned about, the latter happening - * if and only if the `validation_type' is set to - * NLA_VALIDATE_WARN_TOO_LONG. - * NLA_MIN_LEN Minimum length of attribute payload * All other Minimum length of attribute payload * * Meaning of validation union: @@ -296,6 +289,11 @@ enum nla_policy_validation { * pointer to a struct netlink_range_validation_signed * that indicates the min/max values. * Use NLA_POLICY_FULL_RANGE_SIGNED(). + * + * NLA_BINARY If the validation type is like the ones for integers + * above, then the min/max length (not value like for + * integers) of the attribute is enforced. + * * All other Unused - but note that it's a union * * Meaning of `validate' field, use via NLA_POLICY_VALIDATE_FN: @@ -309,7 +307,7 @@ enum nla_policy_validation { * static const struct nla_policy my_policy[ATTR_MAX+1] = { * [ATTR_FOO] = { .type = NLA_U16 }, * [ATTR_BAR] = { .type = NLA_STRING, .len = BARSIZ }, - * [ATTR_BAZ] = { .type = NLA_EXACT_LEN, .len = sizeof(struct mystruct) }, + * [ATTR_BAZ] = NLA_POLICY_EXACT_LEN(sizeof(struct mystruct)), * [ATTR_GOO] = NLA_POLICY_BITFIELD32(myvalidflags), * }; */ @@ -335,9 +333,10 @@ struct nla_policy { * nesting validation starts here. * * Additionally, it means that NLA_UNSPEC is actually NLA_REJECT - * for any types >= this, so need to use NLA_MIN_LEN to get the - * previous pure { .len = xyz } behaviour. The advantage of this - * is that types not specified in the policy will be rejected. + * for any types >= this, so need to use NLA_POLICY_MIN_LEN() to + * get the previous pure { .len = xyz } behaviour. The advantage + * of this is that types not specified in the policy will be + * rejected. * * For completely new families it should be set to 1 so that the * validation is enforced for all attributes. For existing ones @@ -349,12 +348,6 @@ struct nla_policy { }; }; -#define NLA_POLICY_EXACT_LEN(_len) { .type = NLA_EXACT_LEN, .len = _len } -#define NLA_POLICY_EXACT_LEN_WARN(_len) \ - { .type = NLA_EXACT_LEN, .len = _len, \ - .validation_type = NLA_VALIDATE_WARN_TOO_LONG, } -#define NLA_POLICY_MIN_LEN(_len) { .type = NLA_MIN_LEN, .len = _len } - #define NLA_POLICY_ETH_ADDR NLA_POLICY_EXACT_LEN(ETH_ALEN) #define NLA_POLICY_ETH_ADDR_COMPAT NLA_POLICY_EXACT_LEN_WARN(ETH_ALEN) @@ -370,19 +363,21 @@ struct nla_policy { { .type = NLA_BITFIELD32, .bitfield32_valid = valid } #define __NLA_ENSURE(condition) BUILD_BUG_ON_ZERO(!(condition)) -#define NLA_ENSURE_UINT_TYPE(tp) \ +#define NLA_ENSURE_UINT_OR_BINARY_TYPE(tp) \ (__NLA_ENSURE(tp == NLA_U8 || tp == NLA_U16 || \ tp == NLA_U32 || tp == NLA_U64 || \ - tp == NLA_MSECS) + tp) + tp == NLA_MSECS || \ + tp == NLA_BINARY) + tp) #define NLA_ENSURE_SINT_TYPE(tp) \ (__NLA_ENSURE(tp == NLA_S8 || tp == NLA_S16 || \ tp == NLA_S32 || tp == NLA_S64) + tp) -#define NLA_ENSURE_INT_TYPE(tp) \ +#define NLA_ENSURE_INT_OR_BINARY_TYPE(tp) \ (__NLA_ENSURE(tp == NLA_S8 || tp == NLA_U8 || \ tp == NLA_S16 || tp == NLA_U16 || \ tp == NLA_S32 || tp == NLA_U32 || \ tp == NLA_S64 || tp == NLA_U64 || \ - tp == NLA_MSECS) + tp) + tp == NLA_MSECS || \ + tp == NLA_BINARY) + tp) #define NLA_ENSURE_NO_VALIDATION_PTR(tp) \ (__NLA_ENSURE(tp != NLA_BITFIELD32 && \ tp != NLA_REJECT && \ @@ -390,14 +385,14 @@ struct nla_policy { tp != NLA_NESTED_ARRAY) + tp) #define NLA_POLICY_RANGE(tp, _min, _max) { \ - .type = NLA_ENSURE_INT_TYPE(tp), \ + .type = NLA_ENSURE_INT_OR_BINARY_TYPE(tp), \ .validation_type = NLA_VALIDATE_RANGE, \ .min = _min, \ .max = _max \ } #define NLA_POLICY_FULL_RANGE(tp, _range) { \ - .type = NLA_ENSURE_UINT_TYPE(tp), \ + .type = NLA_ENSURE_UINT_OR_BINARY_TYPE(tp), \ .validation_type = NLA_VALIDATE_RANGE_PTR, \ .range = _range, \ } @@ -409,13 +404,13 @@ struct nla_policy { } #define NLA_POLICY_MIN(tp, _min) { \ - .type = NLA_ENSURE_INT_TYPE(tp), \ + .type = NLA_ENSURE_INT_OR_BINARY_TYPE(tp), \ .validation_type = NLA_VALIDATE_MIN, \ .min = _min, \ } #define NLA_POLICY_MAX(tp, _max) { \ - .type = NLA_ENSURE_INT_TYPE(tp), \ + .type = NLA_ENSURE_INT_OR_BINARY_TYPE(tp), \ .validation_type = NLA_VALIDATE_MAX, \ .max = _max, \ } @@ -427,6 +422,15 @@ struct nla_policy { .len = __VA_ARGS__ + 0, \ } +#define NLA_POLICY_EXACT_LEN(_len) NLA_POLICY_RANGE(NLA_BINARY, _len, _len) +#define NLA_POLICY_EXACT_LEN_WARN(_len) { \ + .type = NLA_BINARY, \ + .validation_type = NLA_VALIDATE_RANGE_WARN_TOO_LONG, \ + .min = _len, \ + .max = _len \ +} +#define NLA_POLICY_MIN_LEN(_len) NLA_POLICY_MIN(NLA_BINARY, _len) + /** * struct nl_info - netlink source information * @nlh: Netlink message header of original request diff --git a/include/uapi/linux/if_pppol2tp.h b/include/uapi/linux/if_pppol2tp.h index 060b4d1f3129..a91044328bc9 100644 --- a/include/uapi/linux/if_pppol2tp.h +++ b/include/uapi/linux/if_pppol2tp.h @@ -75,7 +75,7 @@ struct pppol2tpv3in6_addr { }; /* Socket options: - * DEBUG - bitmask of debug message categories + * DEBUG - bitmask of debug message categories (not used) * SENDSEQ - 0 => don't send packets with sequence numbers * 1 => send packets with sequence numbers * RECVSEQ - 0 => receive packet sequence numbers are optional diff --git a/include/uapi/linux/l2tp.h b/include/uapi/linux/l2tp.h index 61158f5a1a5b..88a0d32b8c07 100644 --- a/include/uapi/linux/l2tp.h +++ b/include/uapi/linux/l2tp.h @@ -108,7 +108,7 @@ enum { L2TP_ATTR_VLAN_ID, /* u16 (not used) */ L2TP_ATTR_COOKIE, /* 0, 4 or 8 bytes */ L2TP_ATTR_PEER_COOKIE, /* 0, 4 or 8 bytes */ - L2TP_ATTR_DEBUG, /* u32, enum l2tp_debug_flags */ + L2TP_ATTR_DEBUG, /* u32, enum l2tp_debug_flags (not used) */ L2TP_ATTR_RECV_SEQ, /* u8 */ L2TP_ATTR_SEND_SEQ, /* u8 */ L2TP_ATTR_LNS_MODE, /* u8 */ @@ -177,7 +177,9 @@ enum l2tp_seqmode { }; /** - * enum l2tp_debug_flags - debug message categories for L2TP tunnels/sessions + * enum l2tp_debug_flags - debug message categories for L2TP tunnels/sessions. + * + * Unused. * * @L2TP_MSG_DEBUG: verbose debug (if compiled in) * @L2TP_MSG_CONTROL: userspace - kernel interface diff --git a/lib/nlattr.c b/lib/nlattr.c index bc5b5cf608c4..665bdaff02c8 100644 --- a/lib/nlattr.c +++ b/lib/nlattr.c @@ -124,6 +124,7 @@ void nla_get_range_unsigned(const struct nla_policy *pt, range->max = U8_MAX; break; case NLA_U16: + case NLA_BINARY: range->max = U16_MAX; break; case NLA_U32: @@ -140,6 +141,7 @@ void nla_get_range_unsigned(const struct nla_policy *pt, switch (pt->validation_type) { case NLA_VALIDATE_RANGE: + case NLA_VALIDATE_RANGE_WARN_TOO_LONG: range->min = pt->min; range->max = pt->max; break; @@ -157,9 +159,10 @@ void nla_get_range_unsigned(const struct nla_policy *pt, } } -static int nla_validate_int_range_unsigned(const struct nla_policy *pt, - const struct nlattr *nla, - struct netlink_ext_ack *extack) +static int nla_validate_range_unsigned(const struct nla_policy *pt, + const struct nlattr *nla, + struct netlink_ext_ack *extack, + unsigned int validate) { struct netlink_range_validation range; u64 value; @@ -178,15 +181,39 @@ static int nla_validate_int_range_unsigned(const struct nla_policy *pt, case NLA_MSECS: value = nla_get_u64(nla); break; + case NLA_BINARY: + value = nla_len(nla); + break; default: return -EINVAL; } nla_get_range_unsigned(pt, &range); + if (pt->validation_type == NLA_VALIDATE_RANGE_WARN_TOO_LONG && + pt->type == NLA_BINARY && value > range.max) { + pr_warn_ratelimited("netlink: '%s': attribute type %d has an invalid length.\n", + current->comm, pt->type); + if (validate & NL_VALIDATE_STRICT_ATTRS) { + NL_SET_ERR_MSG_ATTR(extack, nla, + "invalid attribute length"); + return -EINVAL; + } + + /* this assumes min <= max (don't validate against min) */ + return 0; + } + if (value < range.min || value > range.max) { - NL_SET_ERR_MSG_ATTR(extack, nla, - "integer out of range"); + bool binary = pt->type == NLA_BINARY; + + if (binary) + NL_SET_ERR_MSG_ATTR(extack, nla, + "binary attribute size out of range"); + else + NL_SET_ERR_MSG_ATTR(extack, nla, + "integer out of range"); + return -ERANGE; } @@ -274,7 +301,8 @@ static int nla_validate_int_range_signed(const struct nla_policy *pt, static int nla_validate_int_range(const struct nla_policy *pt, const struct nlattr *nla, - struct netlink_ext_ack *extack) + struct netlink_ext_ack *extack, + unsigned int validate) { switch (pt->type) { case NLA_U8: @@ -282,7 +310,8 @@ static int nla_validate_int_range(const struct nla_policy *pt, case NLA_U32: case NLA_U64: case NLA_MSECS: - return nla_validate_int_range_unsigned(pt, nla, extack); + case NLA_BINARY: + return nla_validate_range_unsigned(pt, nla, extack, validate); case NLA_S8: case NLA_S16: case NLA_S32: @@ -313,10 +342,7 @@ static int validate_nla(const struct nlattr *nla, int maxtype, BUG_ON(pt->type > NLA_TYPE_MAX); - if ((nla_attr_len[pt->type] && attrlen != nla_attr_len[pt->type]) || - (pt->type == NLA_EXACT_LEN && - pt->validation_type == NLA_VALIDATE_WARN_TOO_LONG && - attrlen != pt->len)) { + if (nla_attr_len[pt->type] && attrlen != nla_attr_len[pt->type]) { pr_warn_ratelimited("netlink: '%s': attribute type %d has an invalid length.\n", current->comm, type); if (validate & NL_VALIDATE_STRICT_ATTRS) { @@ -449,19 +475,10 @@ static int validate_nla(const struct nlattr *nla, int maxtype, "Unsupported attribute"); return -EINVAL; } - /* fall through */ - case NLA_MIN_LEN: if (attrlen < pt->len) goto out_err; break; - case NLA_EXACT_LEN: - if (pt->validation_type != NLA_VALIDATE_WARN_TOO_LONG) { - if (attrlen != pt->len) - goto out_err; - break; - } - /* fall through */ default: if (pt->len) minlen = pt->len; @@ -479,9 +496,10 @@ static int validate_nla(const struct nlattr *nla, int maxtype, break; case NLA_VALIDATE_RANGE_PTR: case NLA_VALIDATE_RANGE: + case NLA_VALIDATE_RANGE_WARN_TOO_LONG: case NLA_VALIDATE_MIN: case NLA_VALIDATE_MAX: - err = nla_validate_int_range(pt, nla, extack); + err = nla_validate_int_range(pt, nla, extack, validate); if (err) return err; break; diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c index 147d52596e17..8a71c60fa357 100644 --- a/net/bridge/br_netlink.c +++ b/net/bridge/br_netlink.c @@ -1095,8 +1095,8 @@ static const struct nla_policy br_policy[IFLA_BR_MAX + 1] = { [IFLA_BR_MCAST_IGMP_VERSION] = { .type = NLA_U8 }, [IFLA_BR_MCAST_MLD_VERSION] = { .type = NLA_U8 }, [IFLA_BR_VLAN_STATS_PER_PORT] = { .type = NLA_U8 }, - [IFLA_BR_MULTI_BOOLOPT] = { .type = NLA_EXACT_LEN, - .len = sizeof(struct br_boolopt_multi) }, + [IFLA_BR_MULTI_BOOLOPT] = + NLA_POLICY_EXACT_LEN(sizeof(struct br_boolopt_multi)), }; static int br_changelink(struct net_device *brdev, struct nlattr *tb[], diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c index f9092c71225f..d2b8737f9fc0 100644 --- a/net/bridge/br_vlan.c +++ b/net/bridge/br_vlan.c @@ -1884,8 +1884,8 @@ out_err: } static const struct nla_policy br_vlan_db_policy[BRIDGE_VLANDB_ENTRY_MAX + 1] = { - [BRIDGE_VLANDB_ENTRY_INFO] = { .type = NLA_EXACT_LEN, - .len = sizeof(struct bridge_vlan_info) }, + [BRIDGE_VLANDB_ENTRY_INFO] = + NLA_POLICY_EXACT_LEN(sizeof(struct bridge_vlan_info)), [BRIDGE_VLANDB_ENTRY_RANGE] = { .type = NLA_U16 }, [BRIDGE_VLANDB_ENTRY_STATE] = { .type = NLA_U8 }, [BRIDGE_VLANDB_ENTRY_TUNNEL_INFO] = { .type = NLA_NESTED }, diff --git a/net/core/datagram.c b/net/core/datagram.c index 639745d4f3b9..9fcaa544f11a 100644 --- a/net/core/datagram.c +++ b/net/core/datagram.c @@ -623,10 +623,11 @@ int __zerocopy_sg_from_iter(struct sock *sk, struct sk_buff *skb, while (length && iov_iter_count(from)) { struct page *pages[MAX_SKB_FRAGS]; + struct page *last_head = NULL; size_t start; ssize_t copied; unsigned long truesize; - int n = 0; + int refs, n = 0; if (frag == MAX_SKB_FRAGS) return -EMSGSIZE; @@ -649,13 +650,37 @@ int __zerocopy_sg_from_iter(struct sock *sk, struct sk_buff *skb, } else { refcount_add(truesize, &skb->sk->sk_wmem_alloc); } - while (copied) { + for (refs = 0; copied != 0; start = 0) { int size = min_t(int, copied, PAGE_SIZE - start); - skb_fill_page_desc(skb, frag++, pages[n], start, size); - start = 0; + struct page *head = compound_head(pages[n]); + + start += (pages[n] - head) << PAGE_SHIFT; copied -= size; n++; + if (frag) { + skb_frag_t *last = &skb_shinfo(skb)->frags[frag - 1]; + + if (head == skb_frag_page(last) && + start == skb_frag_off(last) + skb_frag_size(last)) { + skb_frag_size_add(last, size); + /* We combined this page, we need to release + * a reference. Since compound pages refcount + * is shared among many pages, batch the refcount + * adjustments to limit false sharing. + */ + last_head = head; + refs++; + continue; + } + } + if (refs) { + page_ref_sub(last_head, refs); + refs = 0; + } + skb_fill_page_desc(skb, frag++, head, start, size); } + if (refs) + page_ref_sub(last_head, refs); } return 0; } diff --git a/net/core/ptp_classifier.c b/net/core/ptp_classifier.c index d964a5147f22..e33fde06d528 100644 --- a/net/core/ptp_classifier.c +++ b/net/core/ptp_classifier.c @@ -107,6 +107,36 @@ unsigned int ptp_classify_raw(const struct sk_buff *skb) } EXPORT_SYMBOL_GPL(ptp_classify_raw); +struct ptp_header *ptp_parse_header(struct sk_buff *skb, unsigned int type) +{ + u8 *ptr = skb_mac_header(skb); + + if (type & PTP_CLASS_VLAN) + ptr += VLAN_HLEN; + + switch (type & PTP_CLASS_PMASK) { + case PTP_CLASS_IPV4: + ptr += IPV4_HLEN(ptr) + UDP_HLEN; + break; + case PTP_CLASS_IPV6: + ptr += IP6_HLEN + UDP_HLEN; + break; + case PTP_CLASS_L2: + break; + default: + return NULL; + } + + ptr += ETH_HLEN; + + /* Ensure that the entire header is present in this packet. */ + if (ptr + sizeof(struct ptp_header) > skb->data + skb->len) + return NULL; + + return (struct ptp_header *)ptr; +} +EXPORT_SYMBOL_GPL(ptp_parse_header); + void __init ptp_classifier_init(void) { static struct sock_filter ptp_filter[] __initdata = { diff --git a/net/core/skbuff.c b/net/core/skbuff.c index e18184ffa9c3..a5c11aae9c89 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -5953,8 +5953,7 @@ static int pskb_carve_inside_nonlinear(struct sk_buff *skb, const u32 off, size = SKB_WITH_OVERHEAD(ksize(data)); memcpy((struct skb_shared_info *)(data + size), - skb_shinfo(skb), offsetof(struct skb_shared_info, - frags[skb_shinfo(skb)->nr_frags])); + skb_shinfo(skb), offsetof(struct skb_shared_info, frags[0])); if (skb_orphan_frags(skb, gfp_mask)) { kfree(data); return -ENOMEM; diff --git a/net/core/sock.c b/net/core/sock.c index e4f40b175acb..64d2aec5ed45 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -413,18 +413,6 @@ static int sock_set_timeout(long *timeo_p, sockptr_t optval, int optlen, return 0; } -static void sock_warn_obsolete_bsdism(const char *name) -{ - static int warned; - static char warncomm[TASK_COMM_LEN]; - if (strcmp(warncomm, current->comm) && warned < 5) { - strcpy(warncomm, current->comm); - pr_warn("process `%s' is using obsolete %s SO_BSDCOMPAT\n", - warncomm, name); - warned++; - } -} - static bool sock_needs_netstamp(const struct sock *sk) { switch (sk->sk_family) { @@ -984,7 +972,6 @@ set_sndbuf: break; case SO_BSDCOMPAT: - sock_warn_obsolete_bsdism("setsockopt"); break; case SO_PASSCRED: @@ -1387,7 +1374,6 @@ int sock_getsockopt(struct socket *sock, int level, int optname, break; case SO_BSDCOMPAT: - sock_warn_obsolete_bsdism("getsockopt"); break; case SO_TIMESTAMP_OLD: diff --git a/net/dccp/ccids/ccid3.c b/net/dccp/ccids/ccid3.c index aef72f6a2829..b9ee1a4a8955 100644 --- a/net/dccp/ccids/ccid3.c +++ b/net/dccp/ccids/ccid3.c @@ -608,7 +608,7 @@ static void ccid3_hc_rx_send_feedback(struct sock *sk, */ if (hc->rx_x_recv > 0) break; - /* fall through */ + fallthrough; case CCID3_FBACK_PERIODIC: delta = ktime_us_delta(now, hc->rx_tstamp_last_feedback); if (delta <= 0) diff --git a/net/dccp/feat.c b/net/dccp/feat.c index afc071ea1271..788dd629c420 100644 --- a/net/dccp/feat.c +++ b/net/dccp/feat.c @@ -1407,7 +1407,8 @@ int dccp_feat_parse_options(struct sock *sk, struct dccp_request_sock *dreq, * Negotiation during connection setup */ case DCCP_LISTEN: - server = true; /* fall through */ + server = true; + fallthrough; case DCCP_REQUESTING: switch (opt) { case DCCPO_CHANGE_L: diff --git a/net/dccp/input.c b/net/dccp/input.c index bd9cfdb67436..2cbb757a894f 100644 --- a/net/dccp/input.c +++ b/net/dccp/input.c @@ -64,7 +64,7 @@ static int dccp_rcv_close(struct sock *sk, struct sk_buff *skb) */ if (dccp_sk(sk)->dccps_role != DCCP_ROLE_CLIENT) break; - /* fall through */ + fallthrough; case DCCP_REQUESTING: case DCCP_ACTIVE_CLOSEREQ: dccp_send_reset(sk, DCCP_RESET_CODE_CLOSED); @@ -76,7 +76,7 @@ static int dccp_rcv_close(struct sock *sk, struct sk_buff *skb) queued = 1; dccp_fin(sk, skb); dccp_set_state(sk, DCCP_PASSIVE_CLOSE); - /* fall through */ + fallthrough; case DCCP_PASSIVE_CLOSE: /* * Retransmitted Close: we have already enqueued the first one. @@ -113,7 +113,7 @@ static int dccp_rcv_closereq(struct sock *sk, struct sk_buff *skb) queued = 1; dccp_fin(sk, skb); dccp_set_state(sk, DCCP_PASSIVE_CLOSEREQ); - /* fall through */ + fallthrough; case DCCP_PASSIVE_CLOSEREQ: sk_wake_async(sk, SOCK_WAKE_WAITD, POLL_HUP); } @@ -530,7 +530,7 @@ static int dccp_rcv_respond_partopen_state_process(struct sock *sk, case DCCP_PKT_DATA: if (sk->sk_state == DCCP_RESPOND) break; - /* fall through */ + fallthrough; case DCCP_PKT_DATAACK: case DCCP_PKT_ACK: /* @@ -684,7 +684,7 @@ int dccp_rcv_state_process(struct sock *sk, struct sk_buff *skb, /* Step 8: if using Ack Vectors, mark packet acknowledgeable */ dccp_handle_ackvec_processing(sk, skb); dccp_deliver_input_to_ccids(sk, skb); - /* fall through */ + fallthrough; case DCCP_RESPOND: queued = dccp_rcv_respond_partopen_state_process(sk, skb, dh, len); diff --git a/net/dccp/options.c b/net/dccp/options.c index 51aaba7a5d45..d24cad05001e 100644 --- a/net/dccp/options.c +++ b/net/dccp/options.c @@ -225,7 +225,7 @@ int dccp_parse_options(struct sock *sk, struct dccp_request_sock *dreq, * interested. The RX CCID need not parse Ack Vectors, * since it is only interested in clearing old state. */ - /* fall through */ + fallthrough; case DCCPO_MIN_TX_CCID_SPECIFIC ... DCCPO_MAX_TX_CCID_SPECIFIC: if (ccid_hc_tx_parse_options(dp->dccps_hc_tx_ccid, sk, pkt_type, opt, value, len)) diff --git a/net/dccp/output.c b/net/dccp/output.c index 6433187a5cc4..50e6d5699bb2 100644 --- a/net/dccp/output.c +++ b/net/dccp/output.c @@ -62,7 +62,7 @@ static int dccp_transmit_skb(struct sock *sk, struct sk_buff *skb) switch (dcb->dccpd_type) { case DCCP_PKT_DATA: set_ack = 0; - /* fall through */ + fallthrough; case DCCP_PKT_DATAACK: case DCCP_PKT_RESET: break; @@ -72,12 +72,12 @@ static int dccp_transmit_skb(struct sock *sk, struct sk_buff *skb) /* Use ISS on the first (non-retransmitted) Request. */ if (icsk->icsk_retransmits == 0) dcb->dccpd_seq = dp->dccps_iss; - /* fall through */ + fallthrough; case DCCP_PKT_SYNC: case DCCP_PKT_SYNCACK: ackno = dcb->dccpd_ack_seq; - /* fall through */ + fallthrough; default: /* * Set owner/destructor: some skbs are allocated via @@ -481,7 +481,7 @@ struct sk_buff *dccp_ctl_make_reset(struct sock *sk, struct sk_buff *rcv_skb) case DCCP_RESET_CODE_PACKET_ERROR: dhr->dccph_reset_data[0] = rxdh->dccph_type; break; - case DCCP_RESET_CODE_OPTION_ERROR: /* fall through */ + case DCCP_RESET_CODE_OPTION_ERROR: case DCCP_RESET_CODE_MANDATORY_ERROR: memcpy(dhr->dccph_reset_data, dcb->dccpd_reset_data, 3); break; diff --git a/net/dccp/proto.c b/net/dccp/proto.c index d148ab1530e5..6d705d90c614 100644 --- a/net/dccp/proto.c +++ b/net/dccp/proto.c @@ -101,7 +101,7 @@ void dccp_set_state(struct sock *sk, const int state) if (inet_csk(sk)->icsk_bind_hash != NULL && !(sk->sk_userlocks & SOCK_BINDPORT_LOCK)) inet_put_port(sk); - /* fall through */ + fallthrough; default: if (oldstate == DCCP_OPEN) DCCP_DEC_STATS(DCCP_MIB_CURRESTAB); @@ -834,7 +834,7 @@ int dccp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int nonblock, case DCCP_PKT_CLOSEREQ: if (!(flags & MSG_PEEK)) dccp_finish_passive_close(sk); - /* fall through */ + fallthrough; case DCCP_PKT_RESET: dccp_pr_debug("found fin (%s) ok!\n", dccp_packet_name(dh->dccph_type)); @@ -960,7 +960,7 @@ static void dccp_terminate_connection(struct sock *sk) case DCCP_PARTOPEN: dccp_pr_debug("Stop PARTOPEN timer (%p)\n", sk); inet_csk_clear_xmit_timer(sk, ICSK_TIME_DACK); - /* fall through */ + fallthrough; case DCCP_OPEN: dccp_send_close(sk, 1); @@ -969,7 +969,7 @@ static void dccp_terminate_connection(struct sock *sk) next_state = DCCP_ACTIVE_CLOSEREQ; else next_state = DCCP_CLOSING; - /* fall through */ + fallthrough; default: dccp_set_state(sk, next_state); } diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c index 441794e0034f..e6f5cf52023c 100644 --- a/net/ethtool/ioctl.c +++ b/net/ethtool/ioctl.c @@ -3025,13 +3025,14 @@ ethtool_rx_flow_rule_create(const struct ethtool_rx_flow_spec_input *input) case TCP_V4_FLOW: case TCP_V6_FLOW: match->key.basic.ip_proto = IPPROTO_TCP; + match->mask.basic.ip_proto = 0xff; break; case UDP_V4_FLOW: case UDP_V6_FLOW: match->key.basic.ip_proto = IPPROTO_UDP; + match->mask.basic.ip_proto = 0xff; break; } - match->mask.basic.ip_proto = 0xff; match->dissector.used_keys |= BIT(FLOW_DISSECTOR_KEY_BASIC); match->dissector.offset[FLOW_DISSECTOR_KEY_BASIC] = diff --git a/net/l2tp/Makefile b/net/l2tp/Makefile index 399a7e5db2f4..cf8f27071d3f 100644 --- a/net/l2tp/Makefile +++ b/net/l2tp/Makefile @@ -5,6 +5,8 @@ obj-$(CONFIG_L2TP) += l2tp_core.o +CFLAGS_l2tp_core.o += -I$(src) + # Build l2tp as modules if L2TP is M obj-$(subst y,$(CONFIG_L2TP),$(CONFIG_PPPOL2TP)) += l2tp_ppp.o obj-$(subst y,$(CONFIG_L2TP),$(CONFIG_L2TP_IP)) += l2tp_ip.o diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c index 701fc72ad9f4..560c687f5457 100644 --- a/net/l2tp/l2tp_core.c +++ b/net/l2tp/l2tp_core.c @@ -61,6 +61,10 @@ #include <linux/atomic.h> #include "l2tp_core.h" +#include "trace.h" + +#define CREATE_TRACE_POINTS +#include "trace.h" #define L2TP_DRV_VERSION "V2.0" @@ -151,6 +155,7 @@ l2tp_session_id_hash(struct l2tp_tunnel *tunnel, u32 session_id) static void l2tp_tunnel_free(struct l2tp_tunnel *tunnel) { + trace_free_tunnel(tunnel); sock_put(tunnel->sock); /* the tunnel is freed in the socket destructor */ } @@ -159,6 +164,8 @@ static void l2tp_session_free(struct l2tp_session *session) { struct l2tp_tunnel *tunnel = session->tunnel; + trace_free_session(session); + if (tunnel) { if (WARN_ON(tunnel->magic != L2TP_TUNNEL_MAGIC)) goto out; @@ -381,6 +388,8 @@ int l2tp_session_register(struct l2tp_session *session, hlist_add_head(&session->hlist, head); write_unlock_bh(&tunnel->hlist_lock); + trace_register_session(session); + return 0; err_tlock_pnlock: @@ -409,10 +418,6 @@ static void l2tp_recv_queue_skb(struct l2tp_session *session, struct sk_buff *sk skb_queue_walk_safe(&session->reorder_q, skbp, tmp) { if (L2TP_SKB_CB(skbp)->ns > ns) { __skb_queue_before(&session->reorder_q, skbp, skb); - l2tp_dbg(session, L2TP_MSG_SEQ, - "%s: pkt %hu, inserted before %hu, reorder_q len=%d\n", - session->name, ns, L2TP_SKB_CB(skbp)->ns, - skb_queue_len(&session->reorder_q)); atomic_long_inc(&session->stats.rx_oos_packets); goto out; } @@ -445,9 +450,7 @@ static void l2tp_recv_dequeue_skb(struct l2tp_session *session, struct sk_buff * /* Bump our Nr */ session->nr++; session->nr &= session->nr_max; - - l2tp_dbg(session, L2TP_MSG_SEQ, "%s: updated nr to %hu\n", - session->name, session->nr); + trace_session_seqnum_update(session); } /* call private receive handler */ @@ -472,37 +475,27 @@ static void l2tp_recv_dequeue(struct l2tp_session *session) start: spin_lock_bh(&session->reorder_q.lock); skb_queue_walk_safe(&session->reorder_q, skb, tmp) { - if (time_after(jiffies, L2TP_SKB_CB(skb)->expires)) { + struct l2tp_skb_cb *cb = L2TP_SKB_CB(skb); + + /* If the packet has been pending on the queue for too long, discard it */ + if (time_after(jiffies, cb->expires)) { atomic_long_inc(&session->stats.rx_seq_discards); atomic_long_inc(&session->stats.rx_errors); - l2tp_dbg(session, L2TP_MSG_SEQ, - "%s: oos pkt %u len %d discarded (too old), waiting for %u, reorder_q_len=%d\n", - session->name, L2TP_SKB_CB(skb)->ns, - L2TP_SKB_CB(skb)->length, session->nr, - skb_queue_len(&session->reorder_q)); + trace_session_pkt_expired(session, cb->ns); session->reorder_skip = 1; __skb_unlink(skb, &session->reorder_q); kfree_skb(skb); continue; } - if (L2TP_SKB_CB(skb)->has_seq) { + if (cb->has_seq) { if (session->reorder_skip) { - l2tp_dbg(session, L2TP_MSG_SEQ, - "%s: advancing nr to next pkt: %u -> %u", - session->name, session->nr, - L2TP_SKB_CB(skb)->ns); session->reorder_skip = 0; - session->nr = L2TP_SKB_CB(skb)->ns; + session->nr = cb->ns; + trace_session_seqnum_reset(session); } - if (L2TP_SKB_CB(skb)->ns != session->nr) { - l2tp_dbg(session, L2TP_MSG_SEQ, - "%s: holding oos pkt %u len %d, waiting for %u, reorder_q_len=%d\n", - session->name, L2TP_SKB_CB(skb)->ns, - L2TP_SKB_CB(skb)->length, session->nr, - skb_queue_len(&session->reorder_q)); + if (cb->ns != session->nr) goto out; - } } __skb_unlink(skb, &session->reorder_q); @@ -535,14 +528,13 @@ static int l2tp_seq_check_rx_window(struct l2tp_session *session, u32 nr) */ static int l2tp_recv_data_seq(struct l2tp_session *session, struct sk_buff *skb) { - if (!l2tp_seq_check_rx_window(session, L2TP_SKB_CB(skb)->ns)) { + struct l2tp_skb_cb *cb = L2TP_SKB_CB(skb); + + if (!l2tp_seq_check_rx_window(session, cb->ns)) { /* Packet sequence number is outside allowed window. * Discard it. */ - l2tp_dbg(session, L2TP_MSG_SEQ, - "%s: pkt %u len %d discarded, outside window, nr=%u\n", - session->name, L2TP_SKB_CB(skb)->ns, - L2TP_SKB_CB(skb)->length, session->nr); + trace_session_pkt_outside_rx_window(session, cb->ns); goto discard; } @@ -559,10 +551,10 @@ static int l2tp_recv_data_seq(struct l2tp_session *session, struct sk_buff *skb) * is seen. After nr_oos_count_max in-sequence packets, reset the * sequence number to re-enable packet reception. */ - if (L2TP_SKB_CB(skb)->ns == session->nr) { + if (cb->ns == session->nr) { skb_queue_tail(&session->reorder_q, skb); } else { - u32 nr_oos = L2TP_SKB_CB(skb)->ns; + u32 nr_oos = cb->ns; u32 nr_next = (session->nr_oos + 1) & session->nr_max; if (nr_oos == nr_next) @@ -573,17 +565,10 @@ static int l2tp_recv_data_seq(struct l2tp_session *session, struct sk_buff *skb) session->nr_oos = nr_oos; if (session->nr_oos_count > session->nr_oos_count_max) { session->reorder_skip = 1; - l2tp_dbg(session, L2TP_MSG_SEQ, - "%s: %d oos packets received. Resetting sequence numbers\n", - session->name, session->nr_oos_count); } if (!session->reorder_skip) { atomic_long_inc(&session->stats.rx_seq_discards); - l2tp_dbg(session, L2TP_MSG_SEQ, - "%s: oos pkt %u len %d discarded, waiting for %u, reorder_q_len=%d\n", - session->name, L2TP_SKB_CB(skb)->ns, - L2TP_SKB_CB(skb)->length, session->nr, - skb_queue_len(&session->reorder_q)); + trace_session_pkt_oos(session, cb->ns); goto discard; } skb_queue_tail(&session->reorder_q, skb); @@ -660,16 +645,14 @@ void l2tp_recv_common(struct l2tp_session *session, struct sk_buff *skb, int length) { struct l2tp_tunnel *tunnel = session->tunnel; - u32 ns = 0, nr = 0; int offset; /* Parse and check optional cookie */ if (session->peer_cookie_len > 0) { if (memcmp(ptr, &session->peer_cookie[0], session->peer_cookie_len)) { - l2tp_info(tunnel, L2TP_MSG_DATA, - "%s: cookie mismatch (%u/%u). Discarding.\n", - tunnel->name, tunnel->tunnel_id, - session->session_id); + pr_warn_ratelimited("%s: cookie mismatch (%u/%u). Discarding.\n", + tunnel->name, tunnel->tunnel_id, + session->session_id); atomic_long_inc(&session->stats.rx_cookie_discards); goto discard; } @@ -686,32 +669,21 @@ void l2tp_recv_common(struct l2tp_session *session, struct sk_buff *skb, L2TP_SKB_CB(skb)->has_seq = 0; if (tunnel->version == L2TP_HDR_VER_2) { if (hdrflags & L2TP_HDRFLAG_S) { - ns = ntohs(*(__be16 *)ptr); - ptr += 2; - nr = ntohs(*(__be16 *)ptr); - ptr += 2; - /* Store L2TP info in the skb */ - L2TP_SKB_CB(skb)->ns = ns; + L2TP_SKB_CB(skb)->ns = ntohs(*(__be16 *)ptr); L2TP_SKB_CB(skb)->has_seq = 1; + ptr += 2; + /* Skip past nr in the header */ + ptr += 2; - l2tp_dbg(session, L2TP_MSG_SEQ, - "%s: recv data ns=%u, nr=%u, session nr=%u\n", - session->name, ns, nr, session->nr); } } else if (session->l2specific_type == L2TP_L2SPECTYPE_DEFAULT) { u32 l2h = ntohl(*(__be32 *)ptr); if (l2h & 0x40000000) { - ns = l2h & 0x00ffffff; - /* Store L2TP info in the skb */ - L2TP_SKB_CB(skb)->ns = ns; + L2TP_SKB_CB(skb)->ns = l2h & 0x00ffffff; L2TP_SKB_CB(skb)->has_seq = 1; - - l2tp_dbg(session, L2TP_MSG_SEQ, - "%s: recv data ns=%u, session nr=%u\n", - session->name, ns, session->nr); } ptr += 4; } @@ -722,9 +694,7 @@ void l2tp_recv_common(struct l2tp_session *session, struct sk_buff *skb, * configure it so. */ if (!session->lns_mode && !session->send_seq) { - l2tp_info(session, L2TP_MSG_SEQ, - "%s: requested to enable seq numbers by LNS\n", - session->name); + trace_session_seqnum_lns_enable(session); session->send_seq = 1; l2tp_session_set_header_len(session, tunnel->version); } @@ -733,9 +703,8 @@ void l2tp_recv_common(struct l2tp_session *session, struct sk_buff *skb, * If user has configured mandatory sequence numbers, discard. */ if (session->recv_seq) { - l2tp_warn(session, L2TP_MSG_SEQ, - "%s: recv data has no seq numbers when required. Discarding.\n", - session->name); + pr_warn_ratelimited("%s: recv data has no seq numbers when required. Discarding.\n", + session->name); atomic_long_inc(&session->stats.rx_seq_discards); goto discard; } @@ -746,15 +715,12 @@ void l2tp_recv_common(struct l2tp_session *session, struct sk_buff *skb, * LAC is broken. Discard the frame. */ if (!session->lns_mode && session->send_seq) { - l2tp_info(session, L2TP_MSG_SEQ, - "%s: requested to disable seq numbers by LNS\n", - session->name); + trace_session_seqnum_lns_disable(session); session->send_seq = 0; l2tp_session_set_header_len(session, tunnel->version); } else if (session->send_seq) { - l2tp_warn(session, L2TP_MSG_SEQ, - "%s: recv data has no seq numbers when required. Discarding.\n", - session->name); + pr_warn_ratelimited("%s: recv data has no seq numbers when required. Discarding.\n", + session->name); atomic_long_inc(&session->stats.rx_seq_discards); goto discard; } @@ -847,22 +813,11 @@ static int l2tp_udp_recv_core(struct l2tp_tunnel *tunnel, struct sk_buff *skb) /* Short packet? */ if (!pskb_may_pull(skb, L2TP_HDR_SIZE_MAX)) { - l2tp_info(tunnel, L2TP_MSG_DATA, - "%s: recv short packet (len=%d)\n", - tunnel->name, skb->len); + pr_warn_ratelimited("%s: recv short packet (len=%d)\n", + tunnel->name, skb->len); goto error; } - /* Trace packet contents, if enabled */ - if (tunnel->debug & L2TP_MSG_DATA) { - length = min(32u, skb->len); - if (!pskb_may_pull(skb, length)) - goto error; - - pr_debug("%s: recv\n", tunnel->name); - print_hex_dump_bytes("", DUMP_PREFIX_OFFSET, skb->data, length); - } - /* Point to L2TP header */ optr = skb->data; ptr = skb->data; @@ -873,9 +828,8 @@ static int l2tp_udp_recv_core(struct l2tp_tunnel *tunnel, struct sk_buff *skb) /* Check protocol version */ version = hdrflags & L2TP_HDR_VER_MASK; if (version != tunnel->version) { - l2tp_info(tunnel, L2TP_MSG_DATA, - "%s: recv protocol version mismatch: got %d expected %d\n", - tunnel->name, version, tunnel->version); + pr_warn_ratelimited("%s: recv protocol version mismatch: got %d expected %d\n", + tunnel->name, version, tunnel->version); goto error; } @@ -883,12 +837,8 @@ static int l2tp_udp_recv_core(struct l2tp_tunnel *tunnel, struct sk_buff *skb) length = skb->len; /* If type is control packet, it is handled by userspace. */ - if (hdrflags & L2TP_HDRFLAG_T) { - l2tp_dbg(tunnel, L2TP_MSG_DATA, - "%s: recv control packet, len=%d\n", - tunnel->name, length); + if (hdrflags & L2TP_HDRFLAG_T) goto error; - } /* Skip flags */ ptr += 2; @@ -917,9 +867,8 @@ static int l2tp_udp_recv_core(struct l2tp_tunnel *tunnel, struct sk_buff *skb) l2tp_session_dec_refcount(session); /* Not found? Pass to userspace to deal with */ - l2tp_info(tunnel, L2TP_MSG_DATA, - "%s: no session found (%u/%u). Passing up.\n", - tunnel->name, tunnel_id, session_id); + pr_warn_ratelimited("%s: no session found (%u/%u). Passing up.\n", + tunnel->name, tunnel_id, session_id); goto error; } @@ -953,9 +902,6 @@ int l2tp_udp_encap_recv(struct sock *sk, struct sk_buff *skb) if (!tunnel) goto pass_up; - l2tp_dbg(tunnel, L2TP_MSG_DATA, "%s: received %d bytes\n", - tunnel->name, skb->len); - if (l2tp_udp_recv_core(tunnel, skb)) goto pass_up; @@ -993,8 +939,7 @@ static int l2tp_build_l2tpv2_header(struct l2tp_session *session, void *buf) *bufp++ = 0; session->ns++; session->ns &= 0xffff; - l2tp_dbg(session, L2TP_MSG_SEQ, "%s: updated ns to %u\n", - session->name, session->ns); + trace_session_seqnum_update(session); } return bufp - optr; @@ -1030,9 +975,7 @@ static int l2tp_build_l2tpv3_header(struct l2tp_session *session, void *buf) l2h = 0x40000000 | session->ns; session->ns++; session->ns &= 0xffffff; - l2tp_dbg(session, L2TP_MSG_SEQ, - "%s: updated ns to %u\n", - session->name, session->ns); + trace_session_seqnum_update(session); } *((__be32 *)bufp) = htonl(l2h); @@ -1049,23 +992,6 @@ static void l2tp_xmit_core(struct l2tp_session *session, struct sk_buff *skb, unsigned int len = skb->len; int error; - /* Debug */ - if (session->send_seq) - l2tp_dbg(session, L2TP_MSG_DATA, "%s: send %zd bytes, ns=%u\n", - session->name, data_len, session->ns - 1); - else - l2tp_dbg(session, L2TP_MSG_DATA, "%s: send %zd bytes\n", - session->name, data_len); - - if (session->debug & L2TP_MSG_DATA) { - int uhlen = (tunnel->encap == L2TP_ENCAPTYPE_UDP) ? sizeof(struct udphdr) : 0; - unsigned char *datap = skb->data + uhlen; - - pr_debug("%s: xmit\n", session->name); - print_hex_dump_bytes("", DUMP_PREFIX_OFFSET, - datap, min_t(size_t, 32, len - uhlen)); - } - /* Queue the packet to IP for output */ skb->ignore_df = 1; skb_dst_drop(skb); @@ -1195,8 +1121,6 @@ static void l2tp_tunnel_destruct(struct sock *sk) if (!tunnel) goto end; - l2tp_info(tunnel, L2TP_MSG_CONTROL, "%s: closing...\n", tunnel->name); - /* Disable udp encapsulation */ switch (tunnel->encap) { case L2TP_ENCAPTYPE_UDP: @@ -1255,9 +1179,6 @@ static void l2tp_tunnel_closeall(struct l2tp_tunnel *tunnel) struct hlist_node *tmp; struct l2tp_session *session; - l2tp_info(tunnel, L2TP_MSG_CONTROL, "%s: closing all sessions...\n", - tunnel->name); - write_lock_bh(&tunnel->hlist_lock); tunnel->acpt_newsess = false; for (hash = 0; hash < L2TP_HASH_SIZE; hash++) { @@ -1265,9 +1186,6 @@ again: hlist_for_each_safe(walk, tmp, &tunnel->session_hlist[hash]) { session = hlist_entry(walk, struct l2tp_session, hlist); - l2tp_info(session, L2TP_MSG_CONTROL, - "%s: closing session\n", session->name); - hlist_del_init(&session->hlist); if (test_and_set_bit(0, &session->dead)) @@ -1483,16 +1401,12 @@ int l2tp_tunnel_create(struct net *net, int fd, int version, u32 tunnel_id, u32 tunnel->version = version; tunnel->tunnel_id = tunnel_id; tunnel->peer_tunnel_id = peer_tunnel_id; - tunnel->debug = L2TP_DEFAULT_DEBUG_FLAGS; tunnel->magic = L2TP_TUNNEL_MAGIC; sprintf(&tunnel->name[0], "tunl %u", tunnel_id); rwlock_init(&tunnel->hlist_lock); tunnel->acpt_newsess = true; - if (cfg) - tunnel->debug = cfg->debug; - tunnel->encap = encap; refcount_set(&tunnel->ref_count, 1); @@ -1597,6 +1511,8 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, struct net *net, "l2tp_sock"); sk->sk_allocation = GFP_ATOMIC; + trace_register_tunnel(tunnel); + if (tunnel->fd >= 0) sockfd_put(sock); @@ -1617,6 +1533,7 @@ EXPORT_SYMBOL_GPL(l2tp_tunnel_register); void l2tp_tunnel_delete(struct l2tp_tunnel *tunnel) { if (!test_and_set_bit(0, &tunnel->dead)) { + trace_delete_tunnel(tunnel); l2tp_tunnel_inc_refcount(tunnel); queue_work(l2tp_wq, &tunnel->del_work); } @@ -1628,6 +1545,7 @@ void l2tp_session_delete(struct l2tp_session *session) if (test_and_set_bit(0, &session->dead)) return; + trace_delete_session(session); l2tp_session_unhash(session); l2tp_session_queue_purge(session); if (session->session_close) @@ -1686,12 +1604,8 @@ struct l2tp_session *l2tp_session_create(int priv_size, struct l2tp_tunnel *tunn INIT_HLIST_NODE(&session->hlist); INIT_HLIST_NODE(&session->global_hlist); - /* Inherit debug options from tunnel */ - session->debug = tunnel->debug; - if (cfg) { session->pwtype = cfg->pw_type; - session->debug = cfg->debug; session->send_seq = cfg->send_seq; session->recv_seq = cfg->recv_seq; session->lns_mode = cfg->lns_mode; diff --git a/net/l2tp/l2tp_core.h b/net/l2tp/l2tp_core.h index 3468d6b177a0..07249c5f22ef 100644 --- a/net/l2tp/l2tp_core.h +++ b/net/l2tp/l2tp_core.h @@ -51,7 +51,6 @@ struct l2tp_session_cfg { unsigned int lns_mode:1; /* behave as LNS? * LAC enables sequence numbers under LNS control. */ - int debug; /* bitmask of debug message categories */ u16 l2specific_type; /* Layer 2 specific type */ u8 cookie[8]; /* optional cookie */ int cookie_len; /* 0, 4 or 8 bytes */ @@ -66,6 +65,7 @@ struct l2tp_session_cfg { * Is linked into a per-tunnel session hashlist; and in the case of an L2TPv3 session into * an additional per-net ("global") hashlist. */ +#define L2TP_SESSION_NAME_MAX 32 struct l2tp_session { int magic; /* should be L2TP_SESSION_MAGIC */ long dead; @@ -90,14 +90,13 @@ struct l2tp_session { struct hlist_node hlist; /* hash list node */ refcount_t ref_count; - char name[32]; /* for logging */ + char name[L2TP_SESSION_NAME_MAX]; /* for logging */ char ifname[IFNAMSIZ]; unsigned int recv_seq:1; /* expect receive packets with sequence numbers? */ unsigned int send_seq:1; /* send packets with sequence numbers? */ unsigned int lns_mode:1; /* behave as LNS? * LAC enables sequence numbers under LNS control. */ - int debug; /* bitmask of debug message categories */ int reorder_timeout; /* configured reorder timeout (in jiffies) */ int reorder_skip; /* set if skip to next nr */ enum l2tp_pwtype pwtype; @@ -131,7 +130,6 @@ struct l2tp_session { /* L2TP tunnel configuration */ struct l2tp_tunnel_cfg { - int debug; /* bitmask of debug message categories */ enum l2tp_encap_type encap; /* Used only for kernel-created sockets */ @@ -154,6 +152,7 @@ struct l2tp_tunnel_cfg { * Maintains a hashlist of sessions belonging to the tunnel instance. * Is linked into a per-net list of tunnels. */ +#define L2TP_TUNNEL_NAME_MAX 20 struct l2tp_tunnel { int magic; /* Should be L2TP_TUNNEL_MAGIC */ @@ -170,8 +169,7 @@ struct l2tp_tunnel { u32 peer_tunnel_id; int version; /* 2=>L2TPv2, 3=>L2TPv3 */ - char name[20]; /* for logging */ - int debug; /* bitmask of debug message categories */ + char name[L2TP_TUNNEL_NAME_MAX]; /* for logging */ enum l2tp_encap_type encap; struct l2tp_stats stats; @@ -337,19 +335,6 @@ static inline int l2tp_v3_ensure_opt_in_linear(struct l2tp_session *session, str return 0; } -#define l2tp_printk(ptr, type, func, fmt, ...) \ -do { \ - if (((ptr)->debug) & (type)) \ - func(fmt, ##__VA_ARGS__); \ -} while (0) - -#define l2tp_warn(ptr, type, fmt, ...) \ - l2tp_printk(ptr, type, pr_warn, fmt, ##__VA_ARGS__) -#define l2tp_info(ptr, type, fmt, ...) \ - l2tp_printk(ptr, type, pr_info, fmt, ##__VA_ARGS__) -#define l2tp_dbg(ptr, type, fmt, ...) \ - l2tp_printk(ptr, type, pr_debug, fmt, ##__VA_ARGS__) - #define MODULE_ALIAS_L2TP_PWTYPE(type) \ MODULE_ALIAS("net-l2tp-type-" __stringify(type)) diff --git a/net/l2tp/l2tp_debugfs.c b/net/l2tp/l2tp_debugfs.c index 96cb9601c21b..bca75bef8282 100644 --- a/net/l2tp/l2tp_debugfs.c +++ b/net/l2tp/l2tp_debugfs.c @@ -167,7 +167,7 @@ static void l2tp_dfs_seq_tunnel_show(struct seq_file *m, void *v) tunnel->sock ? refcount_read(&tunnel->sock->sk_refcnt) : 0, refcount_read(&tunnel->ref_count)); seq_printf(m, " %08x rx %ld/%ld/%ld rx %ld/%ld/%ld\n", - tunnel->debug, + 0, atomic_long_read(&tunnel->stats.tx_packets), atomic_long_read(&tunnel->stats.tx_bytes), atomic_long_read(&tunnel->stats.tx_errors), @@ -192,7 +192,7 @@ static void l2tp_dfs_seq_session_show(struct seq_file *m, void *v) session->recv_seq ? 'R' : '-', session->send_seq ? 'S' : '-', session->lns_mode ? "LNS" : "LAC", - session->debug, + 0, jiffies_to_msecs(session->reorder_timeout)); seq_printf(m, " offset 0 l2specific %hu/%hu\n", session->l2specific_type, l2tp_get_l2specific_len(session)); diff --git a/net/l2tp/l2tp_eth.c b/net/l2tp/l2tp_eth.c index 7ed2b4eced94..657edad1263e 100644 --- a/net/l2tp/l2tp_eth.c +++ b/net/l2tp/l2tp_eth.c @@ -128,17 +128,6 @@ static void l2tp_eth_dev_recv(struct l2tp_session *session, struct sk_buff *skb, struct net_device *dev; struct l2tp_eth *priv; - if (session->debug & L2TP_MSG_DATA) { - unsigned int length; - - length = min(32u, skb->len); - if (!pskb_may_pull(skb, length)) - goto error; - - pr_debug("%s: eth recv\n", session->name); - print_hex_dump_bytes("", DUMP_PREFIX_OFFSET, skb->data, length); - } - if (!pskb_may_pull(skb, ETH_HLEN)) goto error; diff --git a/net/l2tp/l2tp_ip.c b/net/l2tp/l2tp_ip.c index df2a35b5714a..7086d97f293c 100644 --- a/net/l2tp/l2tp_ip.c +++ b/net/l2tp/l2tp_ip.c @@ -118,7 +118,6 @@ static int l2tp_ip_recv(struct sk_buff *skb) struct l2tp_session *session; struct l2tp_tunnel *tunnel = NULL; struct iphdr *iph; - int length; if (!pskb_may_pull(skb, 4)) goto discard; @@ -147,20 +146,6 @@ static int l2tp_ip_recv(struct sk_buff *skb) if (!tunnel) goto discard_sess; - /* Trace packet contents, if enabled */ - if (tunnel->debug & L2TP_MSG_DATA) { - length = min(32u, skb->len); - if (!pskb_may_pull(skb, length)) - goto discard_sess; - - /* Point to L2TP header */ - optr = skb->data; - ptr = skb->data; - ptr += 4; - pr_debug("%s: ip recv\n", tunnel->name); - print_hex_dump_bytes("", DUMP_PREFIX_OFFSET, ptr, length); - } - if (l2tp_v3_ensure_opt_in_linear(session, skb, &ptr, &optr)) goto discard_sess; diff --git a/net/l2tp/l2tp_ip6.c b/net/l2tp/l2tp_ip6.c index bc757bc7e264..409ea8927f6c 100644 --- a/net/l2tp/l2tp_ip6.c +++ b/net/l2tp/l2tp_ip6.c @@ -131,7 +131,6 @@ static int l2tp_ip6_recv(struct sk_buff *skb) struct l2tp_session *session; struct l2tp_tunnel *tunnel = NULL; struct ipv6hdr *iph; - int length; if (!pskb_may_pull(skb, 4)) goto discard; @@ -160,20 +159,6 @@ static int l2tp_ip6_recv(struct sk_buff *skb) if (!tunnel) goto discard_sess; - /* Trace packet contents, if enabled */ - if (tunnel->debug & L2TP_MSG_DATA) { - length = min(32u, skb->len); - if (!pskb_may_pull(skb, length)) - goto discard_sess; - - /* Point to L2TP header */ - optr = skb->data; - ptr = skb->data; - ptr += 4; - pr_debug("%s: ip recv\n", tunnel->name); - print_hex_dump_bytes("", DUMP_PREFIX_OFFSET, ptr, length); - } - if (l2tp_v3_ensure_opt_in_linear(session, skb, &ptr, &optr)) goto discard_sess; diff --git a/net/l2tp/l2tp_netlink.c b/net/l2tp/l2tp_netlink.c index def78eebca4c..31a1e27eab20 100644 --- a/net/l2tp/l2tp_netlink.c +++ b/net/l2tp/l2tp_netlink.c @@ -229,9 +229,6 @@ static int l2tp_nl_cmd_tunnel_create(struct sk_buff *skb, struct genl_info *info goto out; } - if (attrs[L2TP_ATTR_DEBUG]) - cfg.debug = nla_get_u32(attrs[L2TP_ATTR_DEBUG]); - ret = -EINVAL; switch (cfg.encap) { case L2TP_ENCAPTYPE_UDP: @@ -307,9 +304,6 @@ static int l2tp_nl_cmd_tunnel_modify(struct sk_buff *skb, struct genl_info *info goto out; } - if (info->attrs[L2TP_ATTR_DEBUG]) - tunnel->debug = nla_get_u32(info->attrs[L2TP_ATTR_DEBUG]); - ret = l2tp_tunnel_notify(&l2tp_nl_family, info, tunnel, L2TP_CMD_TUNNEL_MODIFY); @@ -400,7 +394,7 @@ static int l2tp_nl_tunnel_send(struct sk_buff *skb, u32 portid, u32 seq, int fla if (nla_put_u8(skb, L2TP_ATTR_PROTO_VERSION, tunnel->version) || nla_put_u32(skb, L2TP_ATTR_CONN_ID, tunnel->tunnel_id) || nla_put_u32(skb, L2TP_ATTR_PEER_CONN_ID, tunnel->peer_tunnel_id) || - nla_put_u32(skb, L2TP_ATTR_DEBUG, tunnel->debug) || + nla_put_u32(skb, L2TP_ATTR_DEBUG, 0) || nla_put_u16(skb, L2TP_ATTR_ENCAP_TYPE, tunnel->encap)) goto nla_put_failure; @@ -605,9 +599,6 @@ static int l2tp_nl_cmd_session_create(struct sk_buff *skb, struct genl_info *inf cfg.ifname = nla_data(info->attrs[L2TP_ATTR_IFNAME]); } - if (info->attrs[L2TP_ATTR_DEBUG]) - cfg.debug = nla_get_u32(info->attrs[L2TP_ATTR_DEBUG]); - if (info->attrs[L2TP_ATTR_RECV_SEQ]) cfg.recv_seq = nla_get_u8(info->attrs[L2TP_ATTR_RECV_SEQ]); @@ -689,9 +680,6 @@ static int l2tp_nl_cmd_session_modify(struct sk_buff *skb, struct genl_info *inf goto out; } - if (info->attrs[L2TP_ATTR_DEBUG]) - session->debug = nla_get_u32(info->attrs[L2TP_ATTR_DEBUG]); - if (info->attrs[L2TP_ATTR_RECV_SEQ]) session->recv_seq = nla_get_u8(info->attrs[L2TP_ATTR_RECV_SEQ]); @@ -730,7 +718,7 @@ static int l2tp_nl_session_send(struct sk_buff *skb, u32 portid, u32 seq, int fl nla_put_u32(skb, L2TP_ATTR_SESSION_ID, session->session_id) || nla_put_u32(skb, L2TP_ATTR_PEER_CONN_ID, tunnel->peer_tunnel_id) || nla_put_u32(skb, L2TP_ATTR_PEER_SESSION_ID, session->peer_session_id) || - nla_put_u32(skb, L2TP_ATTR_DEBUG, session->debug) || + nla_put_u32(skb, L2TP_ATTR_DEBUG, 0) || nla_put_u16(skb, L2TP_ATTR_PW_TYPE, session->pwtype)) goto nla_put_failure; diff --git a/net/l2tp/l2tp_ppp.c b/net/l2tp/l2tp_ppp.c index 13c3153b40d6..450637ffa557 100644 --- a/net/l2tp/l2tp_ppp.c +++ b/net/l2tp/l2tp_ppp.c @@ -237,17 +237,9 @@ static void pppol2tp_recv(struct l2tp_session *session, struct sk_buff *skb, int if (sk->sk_state & PPPOX_BOUND) { struct pppox_sock *po; - l2tp_dbg(session, L2TP_MSG_DATA, - "%s: recv %d byte data frame, passing to ppp\n", - session->name, data_len); - po = pppox_sk(sk); ppp_input(&po->chan, skb); } else { - l2tp_dbg(session, L2TP_MSG_DATA, - "%s: recv %d byte data frame, passing to L2TP socket\n", - session->name, data_len); - if (sock_queue_rcv_skb(sk, skb) < 0) { atomic_long_inc(&session->stats.rx_errors); kfree_skb(skb); @@ -259,7 +251,7 @@ static void pppol2tp_recv(struct l2tp_session *session, struct sk_buff *skb, int no_sock: rcu_read_unlock(); - l2tp_info(session, L2TP_MSG_DATA, "%s: no socket\n", session->name); + pr_warn_ratelimited("%s: no socket in recv\n", session->name); kfree_skb(skb); } @@ -710,7 +702,6 @@ static int pppol2tp_connect(struct socket *sock, struct sockaddr *uservaddr, if (!tunnel) { struct l2tp_tunnel_cfg tcfg = { .encap = L2TP_ENCAPTYPE_UDP, - .debug = 0, }; /* Prevent l2tp_tunnel_register() from trying to set up @@ -840,8 +831,6 @@ out_no_ppp: drop_refcnt = false; sk->sk_state = PPPOX_CONNECTED; - l2tp_info(session, L2TP_MSG_CONTROL, "%s: created\n", - session->name); end: if (error) { @@ -1157,9 +1146,7 @@ static int pppol2tp_tunnel_setsockopt(struct sock *sk, switch (optname) { case PPPOL2TP_SO_DEBUG: - tunnel->debug = val; - l2tp_info(tunnel, L2TP_MSG_CONTROL, "%s: set debug=%x\n", - tunnel->name, tunnel->debug); + /* Tunnel debug flags option is deprecated */ break; default: @@ -1185,9 +1172,6 @@ static int pppol2tp_session_setsockopt(struct sock *sk, break; } session->recv_seq = !!val; - l2tp_info(session, L2TP_MSG_CONTROL, - "%s: set recv_seq=%d\n", - session->name, session->recv_seq); break; case PPPOL2TP_SO_SENDSEQ: @@ -1203,9 +1187,6 @@ static int pppol2tp_session_setsockopt(struct sock *sk, PPPOL2TP_L2TP_HDR_SIZE_NOSEQ; } l2tp_session_set_header_len(session, session->tunnel->version); - l2tp_info(session, L2TP_MSG_CONTROL, - "%s: set send_seq=%d\n", - session->name, session->send_seq); break; case PPPOL2TP_SO_LNSMODE: @@ -1214,22 +1195,14 @@ static int pppol2tp_session_setsockopt(struct sock *sk, break; } session->lns_mode = !!val; - l2tp_info(session, L2TP_MSG_CONTROL, - "%s: set lns_mode=%d\n", - session->name, session->lns_mode); break; case PPPOL2TP_SO_DEBUG: - session->debug = val; - l2tp_info(session, L2TP_MSG_CONTROL, "%s: set debug=%x\n", - session->name, session->debug); + /* Session debug flags option is deprecated */ break; case PPPOL2TP_SO_REORDERTO: session->reorder_timeout = msecs_to_jiffies(val); - l2tp_info(session, L2TP_MSG_CONTROL, - "%s: set reorder_timeout=%d\n", - session->name, session->reorder_timeout); break; default: @@ -1297,9 +1270,8 @@ static int pppol2tp_tunnel_getsockopt(struct sock *sk, switch (optname) { case PPPOL2TP_SO_DEBUG: - *val = tunnel->debug; - l2tp_info(tunnel, L2TP_MSG_CONTROL, "%s: get debug=%x\n", - tunnel->name, tunnel->debug); + /* Tunnel debug flags option is deprecated */ + *val = 0; break; default: @@ -1321,32 +1293,23 @@ static int pppol2tp_session_getsockopt(struct sock *sk, switch (optname) { case PPPOL2TP_SO_RECVSEQ: *val = session->recv_seq; - l2tp_info(session, L2TP_MSG_CONTROL, - "%s: get recv_seq=%d\n", session->name, *val); break; case PPPOL2TP_SO_SENDSEQ: *val = session->send_seq; - l2tp_info(session, L2TP_MSG_CONTROL, - "%s: get send_seq=%d\n", session->name, *val); break; case PPPOL2TP_SO_LNSMODE: *val = session->lns_mode; - l2tp_info(session, L2TP_MSG_CONTROL, - "%s: get lns_mode=%d\n", session->name, *val); break; case PPPOL2TP_SO_DEBUG: - *val = session->debug; - l2tp_info(session, L2TP_MSG_CONTROL, "%s: get debug=%d\n", - session->name, *val); + /* Session debug flags option is deprecated */ + *val = 0; break; case PPPOL2TP_SO_REORDERTO: *val = (int)jiffies_to_msecs(session->reorder_timeout); - l2tp_info(session, L2TP_MSG_CONTROL, - "%s: get reorder_timeout=%d\n", session->name, *val); break; default: @@ -1534,7 +1497,7 @@ static void pppol2tp_seq_tunnel_show(struct seq_file *m, void *v) (tunnel == tunnel->sock->sk_user_data) ? 'Y' : 'N', refcount_read(&tunnel->ref_count) - 1); seq_printf(m, " %08x %ld/%ld/%ld %ld/%ld/%ld\n", - tunnel->debug, + 0, atomic_long_read(&tunnel->stats.tx_packets), atomic_long_read(&tunnel->stats.tx_bytes), atomic_long_read(&tunnel->stats.tx_errors), @@ -1580,7 +1543,7 @@ static void pppol2tp_seq_session_show(struct seq_file *m, void *v) session->recv_seq ? 'R' : '-', session->send_seq ? 'S' : '-', session->lns_mode ? "LNS" : "LAC", - session->debug, + 0, jiffies_to_msecs(session->reorder_timeout)); seq_printf(m, " %hu/%hu %ld/%ld/%ld %ld/%ld/%ld\n", session->nr, session->ns, diff --git a/net/l2tp/trace.h b/net/l2tp/trace.h new file mode 100644 index 000000000000..8596eaa12a2e --- /dev/null +++ b/net/l2tp/trace.h @@ -0,0 +1,211 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +#undef TRACE_SYSTEM +#define TRACE_SYSTEM l2tp + +#if !defined(_TRACE_L2TP_H) || defined(TRACE_HEADER_MULTI_READ) +#define _TRACE_L2TP_H + +#include <linux/tracepoint.h> +#include <linux/l2tp.h> +#include "l2tp_core.h" + +#define encap_type_name(e) { L2TP_ENCAPTYPE_##e, #e } +#define show_encap_type_name(val) \ + __print_symbolic(val, \ + encap_type_name(UDP), \ + encap_type_name(IP)) + +#define pw_type_name(p) { L2TP_PWTYPE_##p, #p } +#define show_pw_type_name(val) \ + __print_symbolic(val, \ + pw_type_name(ETH_VLAN), \ + pw_type_name(ETH), \ + pw_type_name(PPP), \ + pw_type_name(PPP_AC), \ + pw_type_name(IP)) + +DECLARE_EVENT_CLASS(tunnel_only_evt, + TP_PROTO(struct l2tp_tunnel *tunnel), + TP_ARGS(tunnel), + TP_STRUCT__entry( + __array(char, name, L2TP_TUNNEL_NAME_MAX) + ), + TP_fast_assign( + memcpy(__entry->name, tunnel->name, L2TP_TUNNEL_NAME_MAX); + ), + TP_printk("%s", __entry->name) +); + +DECLARE_EVENT_CLASS(session_only_evt, + TP_PROTO(struct l2tp_session *session), + TP_ARGS(session), + TP_STRUCT__entry( + __array(char, name, L2TP_SESSION_NAME_MAX) + ), + TP_fast_assign( + memcpy(__entry->name, session->name, L2TP_SESSION_NAME_MAX); + ), + TP_printk("%s", __entry->name) +); + +TRACE_EVENT(register_tunnel, + TP_PROTO(struct l2tp_tunnel *tunnel), + TP_ARGS(tunnel), + TP_STRUCT__entry( + __array(char, name, L2TP_TUNNEL_NAME_MAX) + __field(int, fd) + __field(u32, tid) + __field(u32, ptid) + __field(int, version) + __field(enum l2tp_encap_type, encap) + ), + TP_fast_assign( + memcpy(__entry->name, tunnel->name, L2TP_TUNNEL_NAME_MAX); + __entry->fd = tunnel->fd; + __entry->tid = tunnel->tunnel_id; + __entry->ptid = tunnel->peer_tunnel_id; + __entry->version = tunnel->version; + __entry->encap = tunnel->encap; + ), + TP_printk("%s: type=%s encap=%s version=L2TPv%d tid=%u ptid=%u fd=%d", + __entry->name, + __entry->fd > 0 ? "managed" : "unmanaged", + show_encap_type_name(__entry->encap), + __entry->version, + __entry->tid, + __entry->ptid, + __entry->fd) +); + +DEFINE_EVENT(tunnel_only_evt, delete_tunnel, + TP_PROTO(struct l2tp_tunnel *tunnel), + TP_ARGS(tunnel) +); + +DEFINE_EVENT(tunnel_only_evt, free_tunnel, + TP_PROTO(struct l2tp_tunnel *tunnel), + TP_ARGS(tunnel) +); + +TRACE_EVENT(register_session, + TP_PROTO(struct l2tp_session *session), + TP_ARGS(session), + TP_STRUCT__entry( + __array(char, name, L2TP_SESSION_NAME_MAX) + __field(u32, tid) + __field(u32, ptid) + __field(u32, sid) + __field(u32, psid) + __field(enum l2tp_pwtype, pwtype) + ), + TP_fast_assign( + memcpy(__entry->name, session->name, L2TP_SESSION_NAME_MAX); + __entry->tid = session->tunnel ? session->tunnel->tunnel_id : 0; + __entry->ptid = session->tunnel ? session->tunnel->peer_tunnel_id : 0; + __entry->sid = session->session_id; + __entry->psid = session->peer_session_id; + __entry->pwtype = session->pwtype; + ), + TP_printk("%s: pseudowire=%s sid=%u psid=%u tid=%u ptid=%u", + __entry->name, + show_pw_type_name(__entry->pwtype), + __entry->sid, + __entry->psid, + __entry->sid, + __entry->psid) +); + +DEFINE_EVENT(session_only_evt, delete_session, + TP_PROTO(struct l2tp_session *session), + TP_ARGS(session) +); + +DEFINE_EVENT(session_only_evt, free_session, + TP_PROTO(struct l2tp_session *session), + TP_ARGS(session) +); + +DEFINE_EVENT(session_only_evt, session_seqnum_lns_enable, + TP_PROTO(struct l2tp_session *session), + TP_ARGS(session) +); + +DEFINE_EVENT(session_only_evt, session_seqnum_lns_disable, + TP_PROTO(struct l2tp_session *session), + TP_ARGS(session) +); + +DECLARE_EVENT_CLASS(session_seqnum_evt, + TP_PROTO(struct l2tp_session *session), + TP_ARGS(session), + TP_STRUCT__entry( + __array(char, name, L2TP_SESSION_NAME_MAX) + __field(u32, ns) + __field(u32, nr) + ), + TP_fast_assign( + memcpy(__entry->name, session->name, L2TP_SESSION_NAME_MAX); + __entry->ns = session->ns; + __entry->nr = session->nr; + ), + TP_printk("%s: ns=%u nr=%u", + __entry->name, + __entry->ns, + __entry->nr) +); + +DEFINE_EVENT(session_seqnum_evt, session_seqnum_update, + TP_PROTO(struct l2tp_session *session), + TP_ARGS(session) +); + +DEFINE_EVENT(session_seqnum_evt, session_seqnum_reset, + TP_PROTO(struct l2tp_session *session), + TP_ARGS(session) +); + +DECLARE_EVENT_CLASS(session_pkt_discard_evt, + TP_PROTO(struct l2tp_session *session, u32 pkt_ns), + TP_ARGS(session, pkt_ns), + TP_STRUCT__entry( + __array(char, name, L2TP_SESSION_NAME_MAX) + __field(u32, pkt_ns) + __field(u32, my_nr) + __field(u32, reorder_q_len) + ), + TP_fast_assign( + memcpy(__entry->name, session->name, L2TP_SESSION_NAME_MAX); + __entry->pkt_ns = pkt_ns, + __entry->my_nr = session->nr; + __entry->reorder_q_len = skb_queue_len(&session->reorder_q); + ), + TP_printk("%s: pkt_ns=%u my_nr=%u reorder_q_len=%u", + __entry->name, + __entry->pkt_ns, + __entry->my_nr, + __entry->reorder_q_len) +); + +DEFINE_EVENT(session_pkt_discard_evt, session_pkt_expired, + TP_PROTO(struct l2tp_session *session, u32 pkt_ns), + TP_ARGS(session, pkt_ns) +); + +DEFINE_EVENT(session_pkt_discard_evt, session_pkt_outside_rx_window, + TP_PROTO(struct l2tp_session *session, u32 pkt_ns), + TP_ARGS(session, pkt_ns) +); + +DEFINE_EVENT(session_pkt_discard_evt, session_pkt_oos, + TP_PROTO(struct l2tp_session *session, u32 pkt_ns), + TP_ARGS(session, pkt_ns) +); + +#endif /* _TRACE_L2TP_H */ + +/* This part must be outside protection */ +#undef TRACE_INCLUDE_PATH +#define TRACE_INCLUDE_PATH . +#undef TRACE_INCLUDE_FILE +#define TRACE_INCLUDE_FILE trace +#include <trace/define_trace.h> diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index c8820c4156e6..2c208d2e65cd 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -384,8 +384,8 @@ mptcp_pm_addr_policy[MPTCP_PM_ADDR_ATTR_MAX + 1] = { [MPTCP_PM_ADDR_ATTR_FAMILY] = { .type = NLA_U16, }, [MPTCP_PM_ADDR_ATTR_ID] = { .type = NLA_U8, }, [MPTCP_PM_ADDR_ATTR_ADDR4] = { .type = NLA_U32, }, - [MPTCP_PM_ADDR_ATTR_ADDR6] = { .type = NLA_EXACT_LEN, - .len = sizeof(struct in6_addr), }, + [MPTCP_PM_ADDR_ATTR_ADDR6] = + NLA_POLICY_EXACT_LEN(sizeof(struct in6_addr)), [MPTCP_PM_ADDR_ATTR_PORT] = { .type = NLA_U16 }, [MPTCP_PM_ADDR_ATTR_FLAGS] = { .type = NLA_U32 }, [MPTCP_PM_ADDR_ATTR_IF_IDX] = { .type = NLA_S32 }, diff --git a/net/netlink/policy.c b/net/netlink/policy.c index 2b3e26f7496f..7b1f50531cd3 100644 --- a/net/netlink/policy.c +++ b/net/netlink/policy.c @@ -254,12 +254,6 @@ send_attribute: pt->bitfield32_valid)) goto nla_put_failure; break; - case NLA_EXACT_LEN: - type = NL_ATTR_TYPE_BINARY; - if (nla_put_u32(skb, NL_POLICY_TYPE_ATTR_MIN_LENGTH, pt->len) || - nla_put_u32(skb, NL_POLICY_TYPE_ATTR_MAX_LENGTH, pt->len)) - goto nla_put_failure; - break; case NLA_STRING: case NLA_NUL_STRING: case NLA_BINARY: @@ -269,14 +263,26 @@ send_attribute: type = NL_ATTR_TYPE_NUL_STRING; else type = NL_ATTR_TYPE_BINARY; - if (pt->len && nla_put_u32(skb, NL_POLICY_TYPE_ATTR_MAX_LENGTH, - pt->len)) - goto nla_put_failure; - break; - case NLA_MIN_LEN: - type = NL_ATTR_TYPE_BINARY; - if (nla_put_u32(skb, NL_POLICY_TYPE_ATTR_MIN_LENGTH, pt->len)) + + if (pt->validation_type != NLA_VALIDATE_NONE) { + struct netlink_range_validation range; + + nla_get_range_unsigned(pt, &range); + + if (range.min && + nla_put_u32(skb, NL_POLICY_TYPE_ATTR_MIN_LENGTH, + range.min)) + goto nla_put_failure; + + if (range.max < U16_MAX && + nla_put_u32(skb, NL_POLICY_TYPE_ATTR_MAX_LENGTH, + range.max)) + goto nla_put_failure; + } else if (pt->len && + nla_put_u32(skb, NL_POLICY_TYPE_ATTR_MAX_LENGTH, + pt->len)) { goto nla_put_failure; + } break; case NLA_FLAG: type = NL_ATTR_TYPE_FLAG; diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c index 2c3619165680..47f9128ecb8f 100644 --- a/net/sched/act_ct.c +++ b/net/sched/act_ct.c @@ -1039,7 +1039,7 @@ drop: static const struct nla_policy ct_policy[TCA_CT_MAX + 1] = { [TCA_CT_ACTION] = { .type = NLA_U16 }, - [TCA_CT_PARMS] = { .type = NLA_EXACT_LEN, .len = sizeof(struct tc_ct) }, + [TCA_CT_PARMS] = NLA_POLICY_EXACT_LEN(sizeof(struct tc_ct)), [TCA_CT_ZONE] = { .type = NLA_U16 }, [TCA_CT_MARK] = { .type = NLA_U32 }, [TCA_CT_MARK_MASK] = { .type = NLA_U32 }, @@ -1049,10 +1049,8 @@ static const struct nla_policy ct_policy[TCA_CT_MAX + 1] = { .len = 128 / BITS_PER_BYTE }, [TCA_CT_NAT_IPV4_MIN] = { .type = NLA_U32 }, [TCA_CT_NAT_IPV4_MAX] = { .type = NLA_U32 }, - [TCA_CT_NAT_IPV6_MIN] = { .type = NLA_EXACT_LEN, - .len = sizeof(struct in6_addr) }, - [TCA_CT_NAT_IPV6_MAX] = { .type = NLA_EXACT_LEN, - .len = sizeof(struct in6_addr) }, + [TCA_CT_NAT_IPV6_MIN] = NLA_POLICY_EXACT_LEN(sizeof(struct in6_addr)), + [TCA_CT_NAT_IPV6_MAX] = NLA_POLICY_EXACT_LEN(sizeof(struct in6_addr)), [TCA_CT_NAT_PORT_MIN] = { .type = NLA_U16 }, [TCA_CT_NAT_PORT_MAX] = { .type = NLA_U16 }, }; diff --git a/net/sched/act_ctinfo.c b/net/sched/act_ctinfo.c index b5042f3ea079..e88fa19ea8a9 100644 --- a/net/sched/act_ctinfo.c +++ b/net/sched/act_ctinfo.c @@ -144,9 +144,8 @@ out: } static const struct nla_policy ctinfo_policy[TCA_CTINFO_MAX + 1] = { - [TCA_CTINFO_ACT] = { .type = NLA_EXACT_LEN, - .len = sizeof(struct - tc_ctinfo) }, + [TCA_CTINFO_ACT] = + NLA_POLICY_EXACT_LEN(sizeof(struct tc_ctinfo)), [TCA_CTINFO_ZONE] = { .type = NLA_U16 }, [TCA_CTINFO_PARMS_DSCP_MASK] = { .type = NLA_U32 }, [TCA_CTINFO_PARMS_DSCP_STATEMASK] = { .type = NLA_U32 }, diff --git a/net/sched/act_gate.c b/net/sched/act_gate.c index 1fb8d428d2c1..f5dd4d1d274e 100644 --- a/net/sched/act_gate.c +++ b/net/sched/act_gate.c @@ -159,8 +159,8 @@ static const struct nla_policy entry_policy[TCA_GATE_ENTRY_MAX + 1] = { }; static const struct nla_policy gate_policy[TCA_GATE_MAX + 1] = { - [TCA_GATE_PARMS] = { .len = sizeof(struct tc_gate), - .type = NLA_EXACT_LEN }, + [TCA_GATE_PARMS] = + NLA_POLICY_EXACT_LEN(sizeof(struct tc_gate)), [TCA_GATE_PRIORITY] = { .type = NLA_S32 }, [TCA_GATE_ENTRY_LIST] = { .type = NLA_NESTED }, [TCA_GATE_BASE_TIME] = { .type = NLA_U64 }, diff --git a/net/tipc/bearer.c b/net/tipc/bearer.c index 808b147df7d5..650414110452 100644 --- a/net/tipc/bearer.c +++ b/net/tipc/bearer.c @@ -652,7 +652,7 @@ static int tipc_l2_device_event(struct notifier_block *nb, unsigned long evt, test_and_set_bit_lock(0, &b->up); break; } - /* fall through */ + fallthrough; case NETDEV_GOING_DOWN: clear_bit_unlock(0, &b->up); tipc_reset_bearer(net, b); diff --git a/net/tipc/link.c b/net/tipc/link.c index 107578122973..b7362556da95 100644 --- a/net/tipc/link.c +++ b/net/tipc/link.c @@ -1239,7 +1239,7 @@ static bool tipc_data_input(struct tipc_link *l, struct sk_buff *skb, skb_queue_tail(mc_inputq, skb); return true; } - /* fall through */ + fallthrough; case CONN_MANAGER: skb_queue_tail(inputq, skb); return true; diff --git a/net/tipc/socket.c b/net/tipc/socket.c index 07419f36116a..2679e97e0389 100644 --- a/net/tipc/socket.c +++ b/net/tipc/socket.c @@ -783,7 +783,7 @@ static __poll_t tipc_poll(struct file *file, struct socket *sock, case TIPC_ESTABLISHED: if (!tsk->cong_link_cnt && !tsk_conn_cong(tsk)) revents |= EPOLLOUT; - /* fall through */ + fallthrough; case TIPC_LISTEN: case TIPC_CONNECTING: if (!skb_queue_empty_lockless(&sk->sk_receive_queue)) @@ -2597,7 +2597,7 @@ static int tipc_connect(struct socket *sock, struct sockaddr *dest, * case is EINPROGRESS, rather than EALREADY. */ res = -EINPROGRESS; - /* fall through */ + fallthrough; case TIPC_CONNECTING: if (!timeout) { if (previous == TIPC_CONNECTING) diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c index c04fc6cf6583..e45b95446ea0 100644 --- a/net/wireless/nl80211.c +++ b/net/wireless/nl80211.c @@ -654,10 +654,8 @@ static const struct nla_policy nl80211_policy[NUM_NL80211_ATTR] = { [NL80211_ATTR_RECEIVE_MULTICAST] = { .type = NLA_FLAG }, [NL80211_ATTR_WIPHY_FREQ_OFFSET] = NLA_POLICY_RANGE(NLA_U32, 0, 999), [NL80211_ATTR_SCAN_FREQ_KHZ] = { .type = NLA_NESTED }, - [NL80211_ATTR_HE_6GHZ_CAPABILITY] = { - .type = NLA_EXACT_LEN, - .len = sizeof(struct ieee80211_he_6ghz_capa), - }, + [NL80211_ATTR_HE_6GHZ_CAPABILITY] = + NLA_POLICY_EXACT_LEN(sizeof(struct ieee80211_he_6ghz_capa)), }; /* policy for the key attributes */ @@ -703,7 +701,7 @@ nl80211_wowlan_tcp_policy[NUM_NL80211_WOWLAN_TCP] = { [NL80211_WOWLAN_TCP_DST_MAC] = NLA_POLICY_EXACT_LEN_WARN(ETH_ALEN), [NL80211_WOWLAN_TCP_SRC_PORT] = { .type = NLA_U16 }, [NL80211_WOWLAN_TCP_DST_PORT] = { .type = NLA_U16 }, - [NL80211_WOWLAN_TCP_DATA_PAYLOAD] = { .type = NLA_MIN_LEN, .len = 1 }, + [NL80211_WOWLAN_TCP_DATA_PAYLOAD] = NLA_POLICY_MIN_LEN(1), [NL80211_WOWLAN_TCP_DATA_PAYLOAD_SEQ] = { .len = sizeof(struct nl80211_wowlan_tcp_data_seq) }, @@ -711,8 +709,8 @@ nl80211_wowlan_tcp_policy[NUM_NL80211_WOWLAN_TCP] = { .len = sizeof(struct nl80211_wowlan_tcp_data_token) }, [NL80211_WOWLAN_TCP_DATA_INTERVAL] = { .type = NLA_U32 }, - [NL80211_WOWLAN_TCP_WAKE_PAYLOAD] = { .type = NLA_MIN_LEN, .len = 1 }, - [NL80211_WOWLAN_TCP_WAKE_MASK] = { .type = NLA_MIN_LEN, .len = 1 }, + [NL80211_WOWLAN_TCP_WAKE_PAYLOAD] = NLA_POLICY_MIN_LEN(1), + [NL80211_WOWLAN_TCP_WAKE_MASK] = NLA_POLICY_MIN_LEN(1), }; #endif /* CONFIG_PM */ diff --git a/tools/testing/selftests/net/tcp_mmap.c b/tools/testing/selftests/net/tcp_mmap.c index a61b7b3da549..00f837c9bc6c 100644 --- a/tools/testing/selftests/net/tcp_mmap.c +++ b/tools/testing/selftests/net/tcp_mmap.c @@ -123,6 +123,28 @@ void hash_zone(void *zone, unsigned int length) #define ALIGN_UP(x, align_to) (((x) + ((align_to)-1)) & ~((align_to)-1)) #define ALIGN_PTR_UP(p, ptr_align_to) ((typeof(p))ALIGN_UP((unsigned long)(p), ptr_align_to)) + +static void *mmap_large_buffer(size_t need, size_t *allocated) +{ + void *buffer; + size_t sz; + + /* Attempt to use huge pages if possible. */ + sz = ALIGN_UP(need, map_align); + buffer = mmap(NULL, sz, PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0); + + if (buffer == (void *)-1) { + sz = need; + buffer = mmap(NULL, sz, PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); + if (buffer != (void *)-1) + fprintf(stderr, "MAP_HUGETLB attempt failed, look at /sys/kernel/mm/hugepages for optimal performance\n"); + } + *allocated = sz; + return buffer; +} + void *child_thread(void *arg) { unsigned long total_mmap = 0, total = 0; @@ -135,6 +157,7 @@ void *child_thread(void *arg) void *addr = NULL; double throughput; struct rusage ru; + size_t buffer_sz; int lu, fd; fd = (int)(unsigned long)arg; @@ -142,9 +165,9 @@ void *child_thread(void *arg) gettimeofday(&t0, NULL); fcntl(fd, F_SETFL, O_NDELAY); - buffer = malloc(chunk_size); - if (!buffer) { - perror("malloc"); + buffer = mmap_large_buffer(chunk_size, &buffer_sz); + if (buffer == (void *)-1) { + perror("mmap"); goto error; } if (zflg) { @@ -179,6 +202,10 @@ void *child_thread(void *arg) total_mmap += zc.length; if (xflg) hash_zone(addr, zc.length); + /* It is more efficient to unmap the pages right now, + * instead of doing this in next TCP_ZEROCOPY_RECEIVE. + */ + madvise(addr, zc.length, MADV_DONTNEED); total += zc.length; } if (zc.recv_skip_hint) { @@ -230,7 +257,7 @@ end: ru.ru_nvcsw); } error: - free(buffer); + munmap(buffer, buffer_sz); close(fd); if (zflg) munmap(raddr, chunk_size + map_align); @@ -347,6 +374,7 @@ int main(int argc, char *argv[]) uint64_t total = 0; char *host = NULL; int fd, c, on = 1; + size_t buffer_sz; char *buffer; int sflg = 0; int mss = 0; @@ -437,8 +465,8 @@ int main(int argc, char *argv[]) } do_accept(fdlisten); } - buffer = mmap(NULL, chunk_size, PROT_READ | PROT_WRITE, - MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); + + buffer = mmap_large_buffer(chunk_size, &buffer_sz); if (buffer == (char *)-1) { perror("mmap"); exit(1); @@ -484,6 +512,6 @@ int main(int argc, char *argv[]) total += wr; } close(fd); - munmap(buffer, chunk_size); + munmap(buffer, buffer_sz); return 0; } |