path: root/net
Commit message (Author, Age, Files, Lines)
* tcp: refactor to consolidate TFO passive open code (Yuchung Cheng, 2019-05-01, 1 file, -27/+25)
|   Use a helper to consolidate two identical code blocks for passive TFO.
|
|   Signed-off-by: Yuchung Cheng <ycheng@google.com>
|   Signed-off-by: Neal Cardwell <ncardwell@google.com>
|   Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
|   Signed-off-by: Eric Dumazet <edumazet@google.com>
|   Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: undo cwnd on Fast Open spurious SYNACK retransmit (Yuchung Cheng, 2019-05-01, 1 file, -0/+3)
|   This patch makes passive Fast Open revert the cwnd to the default initial cwnd (10 packets) if the SYNACK timeout is spurious.
|
|   Passive Fast Open uses a full socket during the handshake, so it can use the existing undo logic to detect spurious retransmission by recording the first SYNACK timeout in the key state variable retrans_stamp. Upon receiving the ACK of the SYNACK, if the socket has sent some data before the timeout, the spurious timeout is detected by tcp_try_undo_recovery() in tcp_process_loss() in tcp_ack().
|
|   But if the socket has not sent any data yet, tcp_ack() does not execute the undo code since no data is acknowledged. The fix is to check this case explicitly after tcp_ack() during the ACK processing in SYN_RECV state. In addition this is checked in FIN_WAIT_1 state in case the server closes the socket before the handshake completes.
|
|   Signed-off-by: Yuchung Cheng <ycheng@google.com>
|   Signed-off-by: Neal Cardwell <ncardwell@google.com>
|   Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
|   Signed-off-by: Eric Dumazet <edumazet@google.com>
|   Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: lower congestion window on Fast Open SYNACK timeout (Yuchung Cheng, 2019-05-01, 1 file, -0/+3)
|   A TCP sender uses a congestion window of 1 packet on the second SYN and SYNACK timeout, except for passive TCP Fast Open. This makes passive TFO too aggressive and unfair during congestion at handshake. This patch fixes this issue so TCP (fast open or not, passive or active) always conforms to RFC6298.
|
|   Note that tcp_enter_loss() is called only once during recurring timeouts. This is because during the handshake, high_seq and snd_una are the same, so tcp_enter_loss() would incorrectly set the undo state variables multiple times.
|
|   Signed-off-by: Yuchung Cheng <ycheng@google.com>
|   Signed-off-by: Neal Cardwell <ncardwell@google.com>
|   Signed-off-by: Eric Dumazet <edumazet@google.com>
|   Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: undo init congestion window on false SYNACK timeout (Yuchung Cheng, 2019-05-01, 2 files, -0/+7)
|   Linux implements RFC6298 and uses an initial congestion window of 1 upon establishing the connection if the SYNACK packet is retransmitted 2 or more times. In cellular networks SYNACK timeouts are often spurious if the wireless radio was dormant or idle. Also, some network paths have delays longer than the default SYNACK timeout. In both cases falsely starting with a minimal cwnd is detrimental to performance.
|
|   This patch avoids doing so when the final ACK's TCP timestamp indicates the original SYNACK was delivered. It remembers the original SYNACK timestamp when a SYNACK timeout has occurred and re-uses the function to detect spurious SYN timeout conveniently.
|
|   Note that a server may receive multiple SYNs from the client and immediately retransmit SYNACKs without any SYNACK timeout. This often happens when the client SYNs have timed out due to the wireless delay above. In this case the server will still use the default initial congestion window (e.g. 10) because tp->undo_marker is reset in tcp_init_metrics(). This is an intentional design because packets are not lost but delayed.
|
|   This patch only covers regular TCP passive open. Fast Open is supported in the next patch.
|
|   Signed-off-by: Yuchung Cheng <ycheng@google.com>
|   Signed-off-by: Neal Cardwell <ncardwell@google.com>
|   Signed-off-by: Eric Dumazet <edumazet@google.com>
|   Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: better SYNACK sent timestamp (Yuchung Cheng, 2019-05-01, 2 files, -1/+5)
|   Detecting a spurious SYNACK timeout using the timestamp option requires recording the exact SYNACK skb timestamp. Previously the SYNACK sent timestamp was stamped slightly before the skb was transmitted. This patch uses the SYNACK skb transmission timestamp directly.
|
|   Signed-off-by: Yuchung Cheng <ycheng@google.com>
|   Signed-off-by: Neal Cardwell <ncardwell@google.com>
|   Signed-off-by: Eric Dumazet <edumazet@google.com>
|   Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: undo initial congestion window on false SYN timeout (Yuchung Cheng, 2019-05-01, 2 files, -1/+17)
|   Linux implements RFC6298 and uses an initial congestion window of 1 upon establishing the connection if the SYN packet is retransmitted 2 or more times. In cellular networks SYN timeouts are often spurious if the wireless radio was dormant or idle. Also, some network paths have delays longer than the default SYN timeout. Having a minimal cwnd in both cases is detrimental to TCP startup performance.
|
|   This patch extends the TCP undo feature (RFC3522 aka TCP Eifel) to detect spurious SYN timeouts via TCP timestamps. Since tp->retrans_stamp records the initial SYN timestamp instead of the first retransmission, we have to implement separate undo code. The detection also must happen before tcp_ack() as retrans_stamp is reset when the SYN is acknowledged.
|
|   Note that this patch covers both active regular open and active Fast Open.
|
|   Signed-off-by: Yuchung Cheng <ycheng@google.com>
|   Signed-off-by: Neal Cardwell <ncardwell@google.com>
|   Signed-off-by: Eric Dumazet <edumazet@google.com>
|   Signed-off-by: David S. Miller <davem@davemloft.net>
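The timestamp-based detection described in the message above can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions, not the kernel code: `struct conn_model`, `MODEL_INIT_CWND` and `syn_timeout_was_spurious()` are hypothetical names, `retrans_stamp` is assumed to hold the timestamp value carried by the original SYN, and the check is assumed to run before that field is reset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model of the per-connection state involved. */
struct conn_model {
	uint32_t retrans_stamp; /* timestamp carried by the original SYN; 0 = unset */
	uint32_t undo_marker;   /* non-zero once a SYN timeout lowered cwnd */
	bool     saw_tstamp;    /* the peer's segment carried a timestamp option */
	uint32_t rcv_tsecr;     /* timestamp value echoed back by the peer */
	uint32_t cwnd;          /* congestion window, in packets */
};

enum { MODEL_INIT_CWND = 10 }; /* assumed default initial window */

/*
 * Returns true, and restores cwnd, when the echoed timestamp matches the
 * original SYN: the peer answered the first SYN, so the timeout was spurious.
 */
static bool syn_timeout_was_spurious(struct conn_model *c)
{
	assert(c != NULL);
	if (!c->undo_marker || !c->retrans_stamp || !c->saw_tstamp)
		return false;   /* nothing to undo, or no evidence available */
	if (c->rcv_tsecr != c->retrans_stamp)
		return false;   /* the echo refers to a retransmitted SYN */
	c->undo_marker = 0;
	c->cwnd = MODEL_INIT_CWND;
	return true;
}
```

A caller in this model would invoke `syn_timeout_was_spurious()` on the first segment received from the peer, before any processing that clears `retrans_stamp`.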
* tcp: avoid unconditional congestion window undo on SYN retransmit (Yuchung Cheng, 2019-05-01, 1 file, -2/+2)
|   Previously, if an active TCP open had a SYN timeout, it always undid the cwnd upon receiving the SYNACK. This is because tcp_clean_rtx_queue would reset tp->retrans_stamp when the SYN is acked, which then fools tcp_try_undo_loss and tcp_packet_delayed. Addressing this issue is required to properly support undo for spurious SYN timeout.
|
|   Fixing this is tricky -- for active TCP open tp->retrans_stamp records the time when the handshake starts, not the first retransmission time as the name may suggest. The simplest fix is for tcp_packet_delayed to ensure it is valid before comparing it with another timestamp.
|
|   One side effect of this change concerns active TCP Fast Open that incurred a SYN timeout. Upon receiving a SYN-ACK that only acknowledged the SYN, it would immediately retransmit unacknowledged data in tcp_ack() because the data is marked lost after the SYN timeout. But the retransmission would have an incorrect ack sequence number since rcv_nxt has not been updated yet in tcp_rcv_synsent_state_process(); the retransmission needs to be properly handled by tcp_rcv_fastopen_synack() like before.
|
|   Signed-off-by: Yuchung Cheng <ycheng@google.com>
|   Signed-off-by: Neal Cardwell <ncardwell@google.com>
|   Signed-off-by: Eric Dumazet <edumazet@google.com>
|   Signed-off-by: David S. Miller <davem@davemloft.net>
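The "ensure it is valid before comparing" fix can be illustrated with a small model. The sketch below is an assumption about the shape of such a check, not the kernel function: `ts_before()` and `packet_delayed_model()` are hypothetical names, and a `retrans_stamp` of 0 is taken to mean "reset, not valid".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe "a is earlier than b" for 32-bit timestamps. */
static bool ts_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/*
 * Hypothetical model: the ACK proves the original packet was merely delayed
 * only if we still hold a valid retransmission stamp AND the echoed
 * timestamp predates it. A reset stamp (0) yields "no evidence" rather
 * than an unconditional undo.
 */
static bool packet_delayed_model(uint32_t retrans_stamp, bool saw_tstamp,
				 uint32_t rcv_tsecr)
{
	if (!retrans_stamp)
		return false;
	return saw_tstamp && rcv_tsecr != 0 &&
	       ts_before(rcv_tsecr, retrans_stamp);
}
```

The point of the model is the first test: without it, a stamp that was cleared when the SYN was acknowledged would make every timeout look spurious.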
* devlink: Change devlink health locking mechanism (Moshe Shemesh, 2019-05-01, 1 file, -23/+74)
|   The devlink health reporters create/destroy and user commands currently use the devlink->lock as a locking mechanism. Different reporters have different rules in the driver and are being created/destroyed during different stages of driver load/unload/running. So during execution of a reporter recover, the flow can go through another reporter's destroy and create. Such a flow leads to a deadlock when trying to lock a mutex that is already held.
|
|   With the new locking mechanism the different reporters share a mutex lock only to protect access to the shared reporters list. A refcount per reporter was added to protect the reporters from being destroyed while in use.
|
|   Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
|   Signed-off-by: Jiri Pirko <jiri@mellanox.com>
|   Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
|   Signed-off-by: David S. Miller <davem@davemloft.net>
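The per-reporter refcount idea can be sketched in userspace C. This is a single-threaded illustration with hypothetical names (`struct reporter_model`, `reporter_get()`, `reporter_put()`); kernel code would use an atomic reference counter and take the list mutex only around list manipulation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical reporter object; only the lifetime handling is modelled. */
struct reporter_model {
	int refcount; /* would be an atomic counter in concurrent code */
};

/* A new reporter starts with one reference, owned by the reporters list. */
static struct reporter_model *reporter_create(void)
{
	struct reporter_model *r = calloc(1, sizeof(*r));

	if (r)
		r->refcount = 1;
	return r;
}

/* Taken by a user command (for example a recover) before using the reporter. */
static void reporter_get(struct reporter_model *r)
{
	assert(r && r->refcount > 0);
	r->refcount++;
}

/* Drops one reference; frees the object and returns true if it was the last. */
static bool reporter_put(struct reporter_model *r)
{
	assert(r && r->refcount > 0);
	if (--r->refcount > 0)
		return false;
	free(r);
	return true;
}
```

In this model, "destroy" is just the list dropping its reference: if a command still holds one, the memory stays valid until that command calls `reporter_put()`, so no lock needs to be held across the whole recover flow.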
* net: dsa: Remove legacy probing support (Andrew Lunn, 2019-04-30, 5 files, -774/+0)
|   Now that all drivers can be probed using more traditional methods, remove the legacy probe code.
|
|   Signed-off-by: Andrew Lunn <andrew@lunn.ch>
|   Signed-off-by: David S. Miller <davem@davemloft.net>
* net: dsa: Add more convenient functions for installing port VLANs (Vladimir Oltean, 2019-04-30, 3 files, -21/+36)
|   This hides the need to perform a two-phase transaction and construct a switchdev_obj_port_vlan struct.
|
|   Call graph (including a function that will be introduced in a follow-up patch) looks like this now (same for the *_vlan_del function):
|
|   dsa_slave_vlan_rx_add_vid   dsa_port_setup_8021q_tagging
|               |                        |
|               |                        |
|               |          +-------------+
|               |          |
|               v          v
|              dsa_port_vid_add      dsa_slave_port_obj_add
|                     |                         |
|                     +-------+         +-------+
|                             |         |
|                             v         v
|                          dsa_port_vlan_add
|
|   Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
|   Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
|   Signed-off-by: David S. Miller <davem@davemloft.net>
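What "hiding the two-phase transaction" means can be shown with a small model. Everything below is a hypothetical stand-in (the `*_model` types and functions are not the kernel's structures): a low-level add function validates in the prepare phase and applies in the commit phase, and a convenience wrapper builds the object and runs both phases for a single VID.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MAX_VID 4094

/* Hypothetical stand-ins for the VLAN object, the transaction and a port. */
struct vlan_obj_model { uint16_t vid_begin, vid_end, flags; };
struct trans_model    { bool ph_prepare; };
struct port_model     { bool member[MODEL_MAX_VID + 1]; };

/* Two-phase operation: prepare only validates, commit applies the change. */
static int port_vlan_add_model(struct port_model *p,
			       const struct vlan_obj_model *v,
			       const struct trans_model *t)
{
	if (t->ph_prepare) {
		if (v->vid_begin < 1 || v->vid_end > MODEL_MAX_VID ||
		    v->vid_begin > v->vid_end)
			return -1;
		return 0;
	}
	assert(v->vid_end <= MODEL_MAX_VID); /* commit relies on prepare */
	for (unsigned int vid = v->vid_begin; vid <= v->vid_end; vid++)
		p->member[vid] = true;
	return 0;
}

/* Convenience wrapper: one VID, both phases, no object for the caller to build. */
static int port_vid_add_model(struct port_model *p, uint16_t vid, uint16_t flags)
{
	struct vlan_obj_model v = { .vid_begin = vid, .vid_end = vid, .flags = flags };
	struct trans_model t = { .ph_prepare = true };
	int err = port_vlan_add_model(p, &v, &t);

	if (err)
		return err;          /* nothing was applied */
	t.ph_prepare = false;
	return port_vlan_add_model(p, &v, &t);
}
```

Callers of the wrapper pass only a VID and flags; a failed prepare phase leaves the port untouched.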
* net: dsa: Skip calling .port_vlan_filtering on no change (Vladimir Oltean, 2019-04-30, 1 file, -0/+3)
|   Even if VLAN filtering is global, DSA will call this callback once for each port. Drivers should not have to compare the global state with the requested change. So let DSA do it.
|
|   Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
|   Reviewed-by: Andrew Lunn <andrew@lunn.ch>
|   Signed-off-by: David S. Miller <davem@davemloft.net>
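The effect of doing the comparison in the core can be sketched as follows. The names (`struct gswitch_model`, `driver_port_vlan_filtering_model()`, `core_port_vlan_filtering_model()`) are hypothetical, and the model assumes a switch where the setting is global.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical switch-wide state; names are illustrative only. */
struct gswitch_model {
	bool vlan_filtering; /* current global setting */
	int  driver_calls;   /* how often the driver callback actually ran */
};

/* Stand-in for a driver's .port_vlan_filtering callback. */
static int driver_port_vlan_filtering_model(struct gswitch_model *sw,
					     int port, bool on)
{
	(void)port;          /* the setting is global in this model */
	sw->driver_calls++;
	sw->vlan_filtering = on;
	return 0;
}

/* Core-side wrapper: skip the callback when the request changes nothing. */
static int core_port_vlan_filtering_model(struct gswitch_model *sw,
					   int port, bool on)
{
	assert(sw);
	if (sw->vlan_filtering == on)
		return 0;
	return driver_port_vlan_filtering_model(sw, port, on);
}
```

When the core iterates over four ports with the same request, the driver in this model sees a single call instead of four.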
* net: dsa: Keep the vlan_filtering setting in dsa_switch if it's global (Vladimir Oltean, 2019-04-30, 1 file, -1/+4)
|   The current behavior is not as obvious as one would assume (which is that, if the driver set vlan_filtering_is_global = 1, then checking any dp->vlan_filtering would yield the same result). Only the ports which are actively enslaved into a bridge would have vlan_filtering set.
|
|   This makes it tricky for drivers to check what the global state is. So fix this and make the struct dsa_switch hold this global setting.
|
|   Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
|   Signed-off-by: David S. Miller <davem@davemloft.net>
* net: dsa: Unset vlan_filtering when ports leave the bridge (Vladimir Oltean, 2019-04-30, 1 file, -0/+29)
|   When ports are standalone (after they left the bridge), they should have no VLAN filtering semantics (they should pass all traffic to the CPU). Currently this is not true for switchdev drivers, because the bridge "forgets" to unset that.
|
|   Normally one would think that doing this at the bridge layer would be a better idea, i.e. call br_vlan_filter_toggle() from br_del_if(), similar to how nbp_vlan_init() is called from br_add_if().
|
|   However what complicates that approach, and makes this one preferable, is the fact that for the bridge core, vlan_filtering is a per-bridge setting, whereas for switchdev/DSA it is per-port. Also there are switches where the setting is per the entire device, and unsetting vlan_filtering one by one, for each leaving port, would not be possible from the bridge core without a certain level of awareness. So do this in DSA and let drivers be unaware of it.
|
|   Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
|   Reviewed-by: Andrew Lunn <andrew@lunn.ch>
|   Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
|   Signed-off-by: David S. Miller <davem@davemloft.net>
* net: dsa: Be aware of switches where VLAN filtering is a global setting (Vladimir Oltean, 2019-04-30, 1 file, -7/+45)
|   On some switches, the action of whether to parse VLAN frame headers and use that information for ingress admission is configurable, but not per port. Such is the case for the Broadcom BCM53xx and the NXP SJA1105 families, for example. In that case, DSA can prevent the bridge core from trying to apply different VLAN filtering settings on net devices that belong to the same switch.
|
|   Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
|   Suggested-by: Florian Fainelli <f.fainelli@gmail.com>
|   Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
|   Signed-off-by: David S. Miller <davem@davemloft.net>
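One way such a guard could work is sketched below. This is an assumption about the policy, with hypothetical names (`struct bridge_model`, `struct gport_model`, `can_apply_vlan_filtering_model()`): a request is refused when another port of the same switch sits in a different bridge whose setting would then disagree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: each port is standalone or a member of one bridge. */
struct bridge_model { bool vlan_filtering; };
struct gport_model  { const struct bridge_model *bridge; /* NULL = standalone */ };

/*
 * On a switch with one global VLAN filtering knob, changing the setting for
 * the bridge of port `idx` is only acceptable if no other bridge spanning
 * ports of this switch uses a different value.
 */
static bool can_apply_vlan_filtering_model(const struct gport_model *ports,
					    size_t n, size_t idx, bool want)
{
	assert(idx < n);
	for (size_t i = 0; i < n; i++) {
		const struct bridge_model *other = ports[i].bridge;

		if (!other || other == ports[idx].bridge)
			continue;   /* standalone, or the bridge being changed */
		if (other->vlan_filtering != want)
			return false;
	}
	return true;
}
```

Standalone ports are skipped in this model, so they never block a request.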
* net: dsa: Store vlan_filtering as a property of dsa_port (Vladimir Oltean, 2019-04-30, 1 file, -4/+8)
|   This allows drivers to query the VLAN setting imposed by the bridge driver directly from DSA, instead of keeping their own state based on the .port_vlan_filtering callback.
|
|   Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
|   Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
|   Signed-off-by: David S. Miller <davem@davemloft.net>
* net: dsa: Fix pharse -> phase typo (Vladimir Oltean, 2019-04-30, 1 file, -1/+1)
|   Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
|   Reviewed-by: Andrew Lunn <andrew@lunn.ch>
|   Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
|   Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next (David S. Miller, 2019-04-30, 30 files, -1499/+1226)
|\
| |   Steffen Klassert says:
| |
| |   ====================
| |   pull request (net-next): ipsec-next 2019-04-30
| |
| |   1) A lot of work to remove indirections from the xfrm code. From Florian Westphal.
| |   2) Support ESP offload in combination with gso partial. From Boris Pismenny.
| |   3) Remove some duplicated code from vti4. From Jeremy Sowden.
| |
| |   Please note that there is a merge conflict between commit:
| |
| |     8742dc86d0c7 ("xfrm4: Fix uninitialized memory read in _decode_session4")
| |
| |   from the ipsec tree and commit:
| |
| |     c53ac41e3720 ("xfrm: remove decode_session indirection from afinfo_policy")
| |
| |   from the ipsec-next tree. The merge conflict will appear when those trees get merged during the merge window.
| |
| |   The conflict can be solved as it is done in linux-next:
| |
| |     https://lkml.org/lkml/2019/4/25/1207
| |
| |   Please pull or let me know if there are problems.
| |   ====================
| |
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
| * xfrm: remove unneeded export_symbols (Florian Westphal, 2019-04-23, 3 files, -7/+4)
| |   None of them have any external callers, make them static.
| |
| |   Signed-off-by: Florian Westphal <fw@strlen.de>
| |   Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * xfrm: remove decode_session indirection from afinfo_policy (Florian Westphal, 2019-04-23, 3 files, -229/+222)
| |   No external dependencies, might as well handle this directly. xfrm_afinfo_policy is now 40 bytes on x86_64.
| |
| |   Signed-off-by: Florian Westphal <fw@strlen.de>
| |   Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * xfrm: remove init_path indirection from afinfo_policy (Florian Westphal, 2019-04-23, 3 files, -35/+7)
| |   Handle this directly; it's only used by ipv6.
| |
| |   Signed-off-by: Florian Westphal <fw@strlen.de>
| |   Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * xfrm: remove tos indirection from afinfo_policy (Florian Westphal, 2019-04-23, 3 files, -23/+3)
| |   Only used by ipv4, we can read the fl4 tos value directly instead.
| |
| |   Signed-off-by: Florian Westphal <fw@strlen.de>
| |   Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * xfrm: kconfig: make xfrm depend on inet (Florian Westphal, 2019-04-15, 1 file, -3/+5)
| |   when CONFIG_INET is not enabled:
| |
| |   net/xfrm/xfrm_output.c: In function ‘xfrm4_tunnel_encap_add’:
| |   net/xfrm/xfrm_output.c:234:2: error: implicit declaration of function ‘ip_select_ident’ [-Werror=implicit-function-declaration]
| |     ip_select_ident(dev_net(dst->dev), skb, NULL);
| |
| |   XFRM only supports ipv4 and ipv6 so change dependency to INET and place user-visible options (pfkey sockets, migrate support and the like) under 'if INET' guard as well.
| |
| |   Fixes: 1de70830066b7 ("xfrm: remove output2 indirection from xfrm_mode")
| |   Reported-by: Randy Dunlap <rdunlap@infradead.org>
| |   Signed-off-by: Florian Westphal <fw@strlen.de>
| |   Acked-by: Randy Dunlap <rdunlap@infradead.org>
| |   Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * xfrm: store xfrm_mode directly, not its address (Florian Westphal, 2019-04-08, 12 files, -38/+38)
| |   This structure is now only 4 bytes, so it's more efficient to cache a copy rather than its address.
| |
| |   No significant size difference in allmodconfig vmlinux.
| |
| |   With a non-modular kernel that has all XFRM options enabled, this series reduces vmlinux image size by ~11kb. All xfrm_mode indirections are gone and all modes are built-in.
| |
| |   before (ipsec-next master):
| |       text    data      bss      dec filename
| |   21071494 7233140 11104324 39408958 vmlinux.master
| |
| |   after this series:
| |   21066448 7226772 11104324 39397544 vmlinux.patched
| |
| |   With an allmodconfig kernel, the size increase is only 362 bytes, even though all the xfrm config options removed in this series are modular.
| |
| |   before:
| |       text    data     bss      dec filename
| |   15731286 6936912 4046908 26715106 vmlinux.master
| |
| |   after this series:
| |   15731492 6937068 4046908 26715468 vmlinux
| |
| |   Signed-off-by: Florian Westphal <fw@strlen.de>
| |   Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
| |   Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
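The trade-off named in the message above (cache a copy of a tiny struct instead of its address) can be illustrated with a toy layout. The field set of `struct mode_model` is an assumption made for the example; the sketch only shows that a struct no larger than a pointer can be embedded by value, which removes one pointer dereference on every access.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical small mode descriptor; the real layout may differ. */
struct mode_model {
	uint8_t encap;
	uint8_t family;
	uint8_t flags;
};

/* Embedding is only attractive while the struct is no bigger than a pointer. */
_Static_assert(sizeof(struct mode_model) <= sizeof(void *),
	       "mode_model should fit in the space of a pointer");

struct state_by_pointer { const struct mode_model *outer_mode; };
struct state_by_value   { struct mode_model outer_mode; };

/* Two loads: the pointer, then the field it points to. */
static uint8_t family_via_pointer(const struct state_by_pointer *s)
{
	return s->outer_mode->family;
}

/* One load: the field lives inside the state object itself. */
static uint8_t family_via_value(const struct state_by_value *s)
{
	return s->outer_mode.family;
}
```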
| * xfrm: make xfrm modes builtin (Florian Westphal, 2019-04-08, 18 files, -482/+77)
| |   After the previous changes, xfrm_mode contains no function pointers anymore, and all modules defining such a struct contain no code except init/exit functions to register the xfrm_mode struct with the xfrm core.
| |
| |   Just place the xfrm modes in the core and remove the modules; the run-time xfrm_mode register/unregister functionality is removed.
| |
| |   Before:
| |       text    data     bss      dec filename
| |       7523     200    2364    10087 net/xfrm/xfrm_input.o
| |      40003     628     440    41071 net/xfrm/xfrm_state.o
| |   15730338 6937080 4046908 26714326 vmlinux
| |
| |   After:
| |       7389     200    2364     9953 net/xfrm/xfrm_input.o
| |      40574     656     440    41670 net/xfrm/xfrm_state.o
| |   15730084 6937068 4046908 26714060 vmlinux
| |
| |   The xfrm*_mode_{transport,tunnel,beet} modules are gone.
| |
| |   v2: replace CONFIG_INET6_XFRM_MODE_* IS_ENABLED guards with CONFIG_IPV6 ones rather than removing them.
| |
| |   Signed-off-by: Florian Westphal <fw@strlen.de>
| |   Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
| |   Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * xfrm: remove afinfo pointer from xfrm_mode (Florian Westphal, 2019-04-08, 6 files, -13/+80)
| |   Adds an EXPORT_SYMBOL for afinfo_get_rcu, as it will now be called from ipv6 in case of CONFIG_IPV6=m.
| |
| |   This change has virtually no effect on vmlinux size, but it reduces afinfo size and allows followup patch to make xfrm modes const.
| |
| |   v2: mark if (afinfo) tests as likely (Sabrina)
| |       re-fetch afinfo according to inner_mode in xfrm_prepare_input().
| |
| |   Signed-off-by: Florian Westphal <fw@strlen.de>
| |   Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
| |   Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * xfrm: remove output2 indirection from xfrm_mode (Florian Westphal, 2019-04-08, 5 files, -211/+207)
| |   Similar to the previous patch: no external module dependencies, so we can avoid the indirection by placing this in the core.
| |
| |   This change removes the last indirection from xfrm_mode and the xfrm4|6_mode_{beet,tunnel}.c modules contain (almost) no code anymore.
| |
| |   Before:
| |      text    data     bss     dec     hex filename
| |      3957     136       0    4093     ffd net/xfrm/xfrm_output.o
| |       587      44       0     631     277 net/ipv4/xfrm4_mode_beet.o
| |       649      32       0     681     2a9 net/ipv4/xfrm4_mode_tunnel.o
| |       625      44       0     669     29d net/ipv6/xfrm6_mode_beet.o
| |       599      32       0     631     277 net/ipv6/xfrm6_mode_tunnel.o
| |
| |   After:
| |      text    data     bss     dec     hex filename
| |      5359     184       0    5543    15a7 net/xfrm/xfrm_output.o
| |       171      24       0     195      c3 net/ipv4/xfrm4_mode_beet.o
| |       171      24       0     195      c3 net/ipv4/xfrm4_mode_tunnel.o
| |       172      24       0     196      c4 net/ipv6/xfrm6_mode_beet.o
| |       172      24       0     196      c4 net/ipv6/xfrm6_mode_tunnel.o
| |
| |   v2: fold the *encap_add functions into xfrm*_prepare_output
| |       preserve (move) output2 comment (Sabrina)
| |       use x->outer_mode->encap, not inner
| |       fix a build breakage on ppc (kbuild robot)
| |
| |   Signed-off-by: Florian Westphal <fw@strlen.de>
| |   Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
| |   Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * xfrm: remove input2 indirection from xfrm_mode (Florian Westphal, 2019-04-08, 6 files, -160/+222)
| |   No external dependencies on any module, place this in the core. Increase is about 1800 bytes for xfrm_input.o.
| |
| |   The beet helpers get added to an internal header, as they can be reused from xfrm_output.c in the next patch (kernel contains several copies of them in the xfrm{4,6}_mode_beet.c files).
| |
| |   Before:
| |      text    data     bss     dec filename
| |      5578     176    2364    8118 net/xfrm/xfrm_input.o
| |      1180      64       0    1244 net/ipv4/xfrm4_mode_beet.o
| |       171      40       0     211 net/ipv4/xfrm4_mode_transport.o
| |      1163      40       0    1203 net/ipv4/xfrm4_mode_tunnel.o
| |      1083      52       0    1135 net/ipv6/xfrm6_mode_beet.o
| |       172      40       0     212 net/ipv6/xfrm6_mode_ro.o
| |       172      40       0     212 net/ipv6/xfrm6_mode_transport.o
| |      1056      40       0    1096 net/ipv6/xfrm6_mode_tunnel.o
| |
| |   After:
| |      text    data     bss     dec filename
| |      7373     200    2364    9937 net/xfrm/xfrm_input.o
| |       587      44       0     631 net/ipv4/xfrm4_mode_beet.o
| |       171      32       0     203 net/ipv4/xfrm4_mode_transport.o
| |       649      32       0     681 net/ipv4/xfrm4_mode_tunnel.o
| |       625      44       0     669 net/ipv6/xfrm6_mode_beet.o
| |       172      32       0     204 net/ipv6/xfrm6_mode_ro.o
| |       172      32       0     204 net/ipv6/xfrm6_mode_transport.o
| |       599      32       0     631 net/ipv6/xfrm6_mode_tunnel.o
| |
| |   v2: pass inner_mode to xfrm_inner_mode_encap_remove to fix AF_UNSPEC selector breakage (bisected by Benedict Wong)
| |
| |   Signed-off-by: Florian Westphal <fw@strlen.de>
| |   Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
| |   Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * xfrm: remove gso_segment indirection from xfrm_mode (Florian Westphal, 2019-04-08, 6 files, -53/+78)
| |   These functions are small and we only have versions for tunnel and transport mode for ipv4 and ipv6 respectively.
| |
| |   Just place the 'transport or tunnel' conditional in the protocol specific function instead of using an indirection.
| |
| |   Before:
| |       text    data     bss      dec filename
| |       3226      12       0     3238 net/ipv4/esp4_offload.o
| |       7004     492       0     7496 net/ipv4/ip_vti.o
| |       3339      12       0     3351 net/ipv6/esp6_offload.o
| |      11294     460       0    11754 net/ipv6/ip6_vti.o
| |       1180      72       0     1252 net/ipv4/xfrm4_mode_beet.o
| |        428      48       0      476 net/ipv4/xfrm4_mode_transport.o
| |       1271      48       0     1319 net/ipv4/xfrm4_mode_tunnel.o
| |       1083      60       0     1143 net/ipv6/xfrm6_mode_beet.o
| |        172      48       0      220 net/ipv6/xfrm6_mode_ro.o
| |        429      48       0      477 net/ipv6/xfrm6_mode_transport.o
| |       1164      48       0     1212 net/ipv6/xfrm6_mode_tunnel.o
| |   15730428 6937008 4046908 26714344 vmlinux
| |
| |   After:
| |       text    data     bss      dec filename
| |       3461      12       0     3473 net/ipv4/esp4_offload.o
| |       7000     492       0     7492 net/ipv4/ip_vti.o
| |       3574      12       0     3586 net/ipv6/esp6_offload.o
| |      11295     460       0    11755 net/ipv6/ip6_vti.o
| |       1180      64       0     1244 net/ipv4/xfrm4_mode_beet.o
| |        171      40       0      211 net/ipv4/xfrm4_mode_transport.o
| |       1163      40       0     1203 net/ipv4/xfrm4_mode_tunnel.o
| |       1083      52       0     1135 net/ipv6/xfrm6_mode_beet.o
| |        172      40       0      212 net/ipv6/xfrm6_mode_ro.o
| |        172      40       0      212 net/ipv6/xfrm6_mode_transport.o
| |       1056      40       0     1096 net/ipv6/xfrm6_mode_tunnel.o
| |   15730424 6937008 4046908 26714340 vmlinux
| |
| |   Signed-off-by: Florian Westphal <fw@strlen.de>
| |   Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
| |   Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * xfrm: remove xmit indirection from xfrm_mode (Florian Westphal, 2019-04-08, 5 files, -55/+56)
| |   There are only two versions (tunnel and transport). The ip/ipv6 versions only differ in sizeof(iphdr) vs ipv6hdr.
| |
| |   Place this in the core and use x->outer_mode->encap type to call the correct adjustment helper.
| |
| |   Before:
| |       text    data     bss      dec filename
| |   15730311 6937008 4046908 26714227 vmlinux
| |
| |   After:
| |   15730428 6937008 4046908 26714344 vmlinux
| |
| |   (about 117 byte increase)
| |
| |   v2: use family from x->outer_mode, not inner
| |
| |   Signed-off-by: Florian Westphal <fw@strlen.de>
| |   Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
| |   Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
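Replacing a per-mode function pointer with a direct dispatch on the mode's fields can be sketched as below. All names are hypothetical, and what each mode does with the returned length is an assumption made for the example; the only hard facts used are the fixed base header sizes (20 bytes for IPv4 without options, 40 bytes for IPv6).

```c
#include <assert.h>
#include <stddef.h>

enum encap_model  { ENCAP_TRANSPORT_MODEL, ENCAP_TUNNEL_MODEL };
enum family_model { FAMILY_IPV4_MODEL, FAMILY_IPV6_MODEL };

/* Hypothetical mode descriptor: plain data, no function pointers. */
struct xmode_model {
	enum encap_model  encap;
	enum family_model family;
};

/* The only per-family difference: the size of the fixed IP header. */
static size_t base_header_len_model(enum family_model family)
{
	switch (family) {
	case FAMILY_IPV4_MODEL: return 20;
	case FAMILY_IPV6_MODEL: return 40;
	}
	assert(!"unknown family");
	return 0;
}

/*
 * Direct dispatch on encap replaces calling through a mode->xmit pointer.
 * In this model, tunnel mode skips the fixed outer IP header, while
 * transport mode skips a caller-supplied header length.
 */
static size_t xmit_adjust_len_model(const struct xmode_model *m,
				    size_t transport_hdr_len)
{
	assert(m != NULL);
	switch (m->encap) {
	case ENCAP_TUNNEL_MODEL:
		return base_header_len_model(m->family);
	case ENCAP_TRANSPORT_MODEL:
		return transport_hdr_len;
	}
	assert(!"unknown encap");
	return 0;
}
```

With only two modes and two families, the switch costs a few compares and lets the compiler inline the helpers, which is not possible through an indirect call.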
| * xfrm: remove output indirection from xfrm_mode (Florian Westphal, 2019-04-08, 11 files, -114/+164)
| |   Same as the input indirection. Only exception: we need to export xfrm_outer_mode_output for pktgen.
| |
| |   Increases size of vmlinux by about 163 bytes:
| |
| |   Before:
| |       text    data     bss      dec filename
| |   15730208 6936948 4046908 26714064 vmlinux
| |
| |   After:
| |   15730311 6937008 4046908 26714227 vmlinux
| |
| |   xfrm_inner_extract_output has no more external callers, make it static.
| |
| |   v2: add IS_ENABLED(IPV6) guard in xfrm6_prepare_output
| |       add two missing breaks in xfrm_outer_mode_output (Sabrina Dubroca)
| |       add WARN_ON_ONCE for 'call AF_INET6 related output function, but CONFIG_IPV6=n' case.
| |       make xfrm_inner_extract_output static
| |
| |   Signed-off-by: Florian Westphal <fw@strlen.de>
| |   Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
| |   Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * xfrm: remove input indirection from xfrm_mode (Florian Westphal, 2019-04-08, 7 files, -55/+72)
| |   No need for any indirection or abstraction here, both functions are pretty much the same and quite small, they also have no external dependencies.
| |
| |   xfrm_prepare_input can then be made static.
| |
| |   With allmodconfig build, size increase of vmlinux is 25 bytes:
| |
| |   Before:
| |       text    data     bss      dec filename
| |   15730207 6936924 4046908 26714039 vmlinux
| |
| |   After:
| |   15730208 6936948 4046908 26714064 vmlinux
| |
| |   v2: Fix INET_XFRM_MODE_TRANSPORT name in is-enabled test (Sabrina Dubroca)
| |       change copied comment to refer to transport and network header, not skb->{h,nh}, which don't exist anymore. (Sabrina)
| |       make xfrm_prepare_input static (Eyal Birger)
| |
| |   Signed-off-by: Florian Westphal <fw@strlen.de>
| |   Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
| |   Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * xfrm: prefer family stored in xfrm_mode struct (Florian Westphal, 2019-04-08, 4 files, -5/+5)
| |   Now that we have the family available directly in the xfrm_mode struct, we can use that and avoid one extra dereference.
| |
| |   Signed-off-by: Florian Westphal <fw@strlen.de>
| |   Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
| |   Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * xfrm: place af number into xfrm_mode struct (Florian Westphal, 2019-04-08, 8 files, -48/+27)
| |   This will be useful to know if we're supposed to decode ipv4 or ipv6.
| |
| |   While at it, make the unregister function return void; all module_exit functions just did BUG(), and there is never a point in doing error checks if there is no way to handle such an error.
| |
| |   Signed-off-by: Florian Westphal <fw@strlen.de>
| |   Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
| |   Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * vti4: eliminated some duplicate code. (Jeremy Sowden, 2019-03-24, 1 file, -38/+22)
| |   The ipip tunnel introduced in commit dd9ee3444014 ("vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel") largely duplicated the existing vti_input and vti_recv functions. Refactored to deduplicate the common code.
| |
| |   Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
| |   Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * xfrm: gso partial offload support (Boris Pismenny, 2019-03-24, 2 files, -3/+10)
| |   This patch introduces support for gso partial ESP offload.
| |
| |   Signed-off-by: Boris Pismenny <borisp@mellanox.com>
| |   Signed-off-by: Raed Salem <raeds@mellanox.com>
| |   Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* | dsa: Allow tag drivers to be built as modules (Andrew Lunn, 2019-04-28, 2 files, -29/+73)
| |   Make the CONFIG symbols tristate and add help text.
| |
| |   The Broadcom and Microchip KSZ tag drivers support two different tagging protocols in one driver. Add a configuration option for the drivers, and then options to select the protocol.
| |
| |   Create a submenu for the tagging drivers.
| |
| |   Signed-off-by: Andrew Lunn <andrew@lunn.ch>
| |
| |   v2: tab/space cleanup
| |       Help text wording
| |       NET_DSA_TAG_BRCM_COMMON and NET_DSA_TAG_KZS_COMMON hidden
| |   v3: More tabification
| |       Punctuation
| |   v4: trailler->trailer
| |
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
* | dsa: tag_brcm: Avoid unused symbols (Andrew Lunn, 2019-04-28, 1 file, -2/+6)
| |   It is possible that the driver is compiled with both CONFIG_NET_DSA_TAG_BRCM and CONFIG_NET_DSA_TAG_BRCM_PREPEND disabled. This results in warnings about unused symbols. Add some conditional compilation to avoid this.
| |
| |   Signed-off-by: Andrew Lunn <andrew@lunn.ch>
| |   Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
| |
| |   v2: Reorder patch to before tag drivers can be modules
| |
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
* | dsa: Cleanup unneeded table and make tag structures static (Andrew Lunn, 2019-04-28, 11 files, -76/+11)
| |   Now that tag drivers dynamically register, we don't need the static table. Remove it. This also means the tag driver structures can be made static.
| |
| |   Signed-off-by: Andrew Lunn <andrew@lunn.ch>
| |   Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
* | dsa: Make use of the list of tag drivers (Andrew Lunn, 2019-04-28, 1 file, -5/+34)
| |   Implement the _get and _put functions to make use of the list of tag drivers. Also, trigger the loading of the module, based on the alias information. The _get function takes a reference on the tag driver, so it cannot be unloaded, and the _put function releases the reference.
| |
| |   Signed-off-by: Andrew Lunn <andrew@lunn.ch>
| |   Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
| |
| |   v2: Make tag_driver_register void
| |
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
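The _get/_put pairing over a registered list can be modelled in userspace. The sketch below uses hypothetical names (`struct tag_driver_model` and the `*_model` functions) and a plain integer where kernel code would take a module reference; module loading by alias is only mentioned in a comment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical registered tag driver; refs stands in for a module refcount. */
struct tag_driver_model {
	int proto;
	int refs;
	struct tag_driver_model *next;
};

static struct tag_driver_model *tag_drivers_model; /* head of the list */

/* Called by each tag driver when it is loaded. */
static void tag_driver_register_model(struct tag_driver_model *d)
{
	assert(d != NULL);
	d->refs = 0;
	d->next = tag_drivers_model;
	tag_drivers_model = d;
}

/* Look up the driver for a protocol and pin it so it cannot be unloaded. */
static struct tag_driver_model *tag_driver_get_model(int proto)
{
	for (struct tag_driver_model *d = tag_drivers_model; d; d = d->next) {
		if (d->proto == proto) {
			d->refs++;
			return d;
		}
	}
	return NULL; /* real code could first try loading a module by alias */
}

/* Release the reference taken by the matching _get. */
static void tag_driver_put_model(struct tag_driver_model *d)
{
	assert(d && d->refs > 0);
	d->refs--;
}

/* A driver may only go away while nobody holds a reference to it. */
static int tag_driver_can_unload_model(const struct tag_driver_model *d)
{
	return d->refs == 0;
}
```

A switch driver in this model calls `tag_driver_get_model()` when it is set up and `tag_driver_put_model()` when it is torn down, so the tagger stays pinned exactly as long as a switch uses it.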
* | dsa: Add stub tag driver put method (Andrew Lunn, 2019-04-28, 4 files, -0/+9)
| |   When a DSA switch driver is unloaded, the lock on the tag driver should be released so the module can be unloaded. Add the needed calls, but leave the actual release code as a stub.
| |
| |   Signed-off-by: Andrew Lunn <andrew@lunn.ch>
| |   Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
| |
| |   v2
| |
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
* | dsa: Rename dsa_resolve_tag_protocol() to _get ready for locking (Andrew Lunn, 2019-04-28, 4 files, -4/+5)
| |   dsa_resolve_tag_protocol() is used to find the tagging driver needed by a switch driver. When the tagging drivers become modules, it will be necessary to take a reference on the module to prevent it being unloaded. So rename this function to _get() to indicate it has some locking properties.
| |
| |   Signed-off-by: Andrew Lunn <andrew@lunn.ch>
| |   Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
* | dsa: Register the none tagger ops (Andrew Lunn, 2019-04-28, 1 file, -0/+7)
| |   The none tagger is special in that it does not live in a tag_*.c file, but is within the core. Register/unregister when DSA is loaded/unloaded.
| |
| |   Signed-off-by: Andrew Lunn <andrew@lunn.ch>
| |   Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
* | dsa: Keep link list of tag drivers (Andrew Lunn, 2019-04-28, 1 file, -0/+28)
| |   Let the tag drivers register themselves with the DSA core, keeping them in a linked list.
| |
| |   Signed-off-by: Andrew Lunn <andrew@lunn.ch>
| |
| |   v2
| |
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
* | dsa: Add boilerplate helper to register DSA tag driver modulesAndrew Lunn2019-04-2810-2/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A DSA tag driver module will need to register the tag protocols it implements with the DSA core. Add macros containing this boilerplate. The registration/unregistration code is currently just a stub. A later patch will add the real implementation. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> v2 Fix indent of #endif Rewrite to move list pointer into a new structure v3 Move kdoc next to macro Fix THIS_MODULE indentation Signed-off-by: David S. Miller <davem@davemloft.net>
* | dsa: Add TAG protocol to tag opsAndrew Lunn2019-04-2810-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | In order that we can match the tagging protocol a switch driver requests to the tagger, we need to know what protocol the tagger supports. Add this information to the ops structure. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> v2 Move tag protocol to end of structure to keep hot members at the beginning. Signed-off-by: David S. Miller <davem@davemloft.net>
* | dsa: Add MODULE_LICENSE to tag driversAndrew Lunn2019-04-289-0/+9
| | | | | | | | | | | | | | | | | | All the tag drivers are some variant of GPL. Add a MODULE_LICENSE() indicating this, so the drivers can later be compiled as modules. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | dsa: Add MODULE_ALIAS to taggers in preparation to become modulesAndrew Lunn2019-04-289-0/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | When the tag drivers become modules, we will need to dynamically load them based on what the switch drivers need. Add aliases to map between the TAG protocol and the driver. In order to do this, we need the tag protocol number as something which the C pre-processor can stringify. Only the compiler knows the value of an enum, CPP cannot use them. So add #defines. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
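The point about enums and the preprocessor can be illustrated with a small standalone C sketch. All `DEMO_` names, the protocol number 4, and the alias prefix are hypothetical and chosen only for this example; they are not the kernel's definitions. The sketch shows that stringifying a #define yields the number, while stringifying an enumerator yields only its spelling.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical protocol number exposed as a macro so CPP can see it. */
#define DEMO_TAG_PROTO_EDSA_VALUE 4

/* The enumerator carries the same value, but only the compiler knows it. */
enum demo_tag_protocol {
	DEMO_TAG_PROTO_EDSA = 4,
};

/* Keep the macro and the enumerator in sync at compile time. */
static_assert(DEMO_TAG_PROTO_EDSA == DEMO_TAG_PROTO_EDSA_VALUE,
	      "macro and enumerator must agree");

/* Two-level stringification: the argument is macro-expanded before '#'. */
#define DEMO_STR_1(x) #x
#define DEMO_STR(x) DEMO_STR_1(x)

/* Hypothetical alias prefix, for illustration only. */
#define DEMO_ALIAS_PREFIX "demo_tag-id-"

/* The macro expands to 4 first, so this is "demo_tag-id-4". */
static const char demo_alias_from_define[] =
	DEMO_ALIAS_PREFIX DEMO_STR(DEMO_TAG_PROTO_EDSA_VALUE);

/* The enumerator is not a macro, so CPP stringifies its name instead:
 * "demo_tag-id-DEMO_TAG_PROTO_EDSA". */
static const char demo_alias_from_enum[] =
	DEMO_ALIAS_PREFIX DEMO_STR(DEMO_TAG_PROTO_EDSA);
```

Only the first string is usable as a numeric alias, which is why a #define is needed alongside the enum.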
* | dsa: Move tagger name into its ops structureAndrew Lunn2019-04-2810-43/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Rather than keep a list to map a tagger ops to a name, place the name into the ops structure. This removes the hard coded list, a step towards making the taggers more dynamic. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> v2: Move name to end of structure, keeping the hot entries at the beginning. Signed-off-by: David S. Miller <davem@davemloft.net>
* | dsa: Add SPDX header to tag drivers.Andrew Lunn2019-04-288-52/+8
| | | | | | | | | | | | | | | | Add an SPDX header, and remove the license boilerplate text. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller2019-04-287-105/+1021
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Daniel Borkmann says: ==================== pull-request: bpf-next 2019-04-28 The following pull-request contains BPF updates for your *net-next* tree. The main changes are: 1) Introduce BPF socket local storage map so that BPF programs can store private data they associate with a socket (instead of e.g. separate hash table), from Martin. 2) Add support for bpftool to dump BTF types. This is done through a new `bpftool btf dump` sub-command, from Andrii. 3) Enable BPF-based flow dissector for skb-less eth_get_headlen() calls which was currently not supported since skb was used to lookup netns, from Stanislav. 4) Add an opt-in interface for tracepoints to expose a writable context for attached BPF programs, used here for NBD sockets, from Matt. 5) BPF xadd related arm64 JIT fixes and scalability improvements, from Daniel. 6) Change the skb->protocol for bpf_skb_adjust_room() helper in order to support tunnels such as sit. Add selftests as well, from Willem. 7) Various smaller misc fixes. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>