summaryrefslogtreecommitdiffstats
path: root/net
Commit message (Collapse)AuthorAgeFilesLines
* net: optimize the sockptr_t for unified kernel/user address spacesChristoph Hellwig2020-07-242-7/+13
| | | | | | | | | | | | | | | For architectures like x86 and arm64 we don't need the separate bit to indicate that a pointer is a kernel pointer as the address spaces are unified. That way the sockptr_t can be reduced to a union of two pointers, which leads to nicer calling conventions. The only caveat is that we need to check that users don't pass in kernel address and thus gain access to kernel memory. Thus the USER_SOCKPTR helper is replaced with a init_user_sockptr function that does this check and returns an error if it fails. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: pass a sockptr_t into ->setsockoptChristoph Hellwig2020-07-2451-234/+218
| | | | | | | | | | | Rework the remaining setsockopt code to pass a sockptr_t instead of a plain user pointer. This removes the last remaining set_fs(KERNEL_DS) outside of architecture specific code. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Stefan Schmidt <stefan@datenfreihafen.org> [ieee802154] Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/tcp: switch do_tcp_setsockopt to sockptr_tChristoph Hellwig2020-07-241-18/+16
| | | | | | | | Pass a sockptr_t to prepare for set_fs-less handling of the kernel pointer from bpf-cgroup. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/tcp: switch ->md5_parse to sockptr_tChristoph Hellwig2020-07-243-5/+6
| | | | | | | | Pass a sockptr_t to prepare for set_fs-less handling of the kernel pointer from bpf-cgroup. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/udp: switch udp_lib_setsockopt to sockptr_tChristoph Hellwig2020-07-242-4/+6
| | | | | | | | Pass a sockptr_t to prepare for set_fs-less handling of the kernel pointer from bpf-cgroup. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/ipv6: switch do_ipv6_setsockopt to sockptr_tChristoph Hellwig2020-07-241-33/+33
| | | | | | | | Pass a sockptr_t to prepare for set_fs-less handling of the kernel pointer from bpf-cgroup. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/ipv6: factor out a ipv6_set_opt_hdr helperChristoph Hellwig2020-07-241-75/+75
| | | | | | | | Factour out a helper to set the IPv6 option headers from do_ipv6_setsockopt. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/ipv6: switch ipv6_flowlabel_opt to sockptr_tChristoph Hellwig2020-07-242-8/+10
| | | | | | | | | | | Pass a sockptr_t to prepare for set_fs-less handling of the kernel pointer from bpf-cgroup. Note that the get case is pretty weird in that it actually copies data back to userspace from setsockopt. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/ipv6: split up ipv6_flowlabel_optChristoph Hellwig2020-07-241-143/+166
| | | | | | | | Split ipv6_flowlabel_opt into a subfunction for each action and a small wrapper. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/ipv6: switch ip6_mroute_setsockopt to sockptr_tChristoph Hellwig2020-07-242-9/+11
| | | | | | | | Pass a sockptr_t to prepare for set_fs-less handling of the kernel pointer from bpf-cgroup. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/ipv4: switch do_ip_setsockopt to sockptr_tChristoph Hellwig2020-07-241-35/+33
| | | | | | | | Pass a sockptr_t to prepare for set_fs-less handling of the kernel pointer from bpf-cgroup. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/ipv4: merge ip_options_get and ip_options_get_from_userChristoph Hellwig2020-07-242-35/+15
| | | | | | | Use the sockptr_t type to merge the versions. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/ipv4: switch ip_mroute_setsockopt to sockptr_tChristoph Hellwig2020-07-242-8/+9
| | | | | | | | Pass a sockptr_t to prepare for set_fs-less handling of the kernel pointer from bpf-cgroup. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* bpfilter: switch bpfilter_ip_set_sockopt to sockptr_tChristoph Hellwig2020-07-243-8/+9
| | | | | | | | This is mostly to prepare for cleaning up the callers, as bpfilter by design can't handle kernel pointers. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* netfilter: switch nf_setsockopt to sockptr_tChristoph Hellwig2020-07-249-64/+64
| | | | | | | | Pass a sockptr_t to prepare for set_fs-less handling of the kernel pointer from bpf-cgroup. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* netfilter: switch xt_copy_counters to sockptr_tChristoph Hellwig2020-07-244-21/+19
| | | | | | | | Pass a sockptr_t to prepare for set_fs-less handling of the kernel pointer from bpf-cgroup. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* netfilter: remove the unused user argument to do_update_countersChristoph Hellwig2020-07-241-5/+4
| | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/xfrm: switch xfrm_user_policy to sockptr_tChristoph Hellwig2020-07-243-5/+7
| | | | | | | | Pass a sockptr_t to prepare for set_fs-less handling of the kernel pointer from bpf-cgroup. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: switch sock_set_timeout to sockptr_tChristoph Hellwig2020-07-243-17/+18
| | | | | | | | | Pass a sockptr_t to prepare for set_fs-less handling of the kernel pointer from bpf-cgroup. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: switch sock_set_timeout to sockptr_tChristoph Hellwig2020-07-241-6/+9
| | | | | | | | Pass a sockptr_t to prepare for set_fs-less handling of the kernel pointer from bpf-cgroup. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: switch sock_setbindtodevice to sockptr_tChristoph Hellwig2020-07-241-4/+3
| | | | | | | | Pass a sockptr_t to prepare for set_fs-less handling of the kernel pointer from bpf-cgroup. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: switch copy_bpf_fprog_from_user to sockptr_tChristoph Hellwig2020-07-243-7/+9
| | | | | | | | Pass a sockptr_t to prepare for set_fs-less handling of the kernel pointer from bpf-cgroup. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* bpfilter: reject kernel addressesChristoph Hellwig2020-07-241-0/+4
| | | | | | | | | The bpfilter user mode helper processes the optval address using process_vm_readv. Don't send it kernel addresses fed under set_fs(KERNEL_DS) as that won't work. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/bpfilter: split __bpfilter_process_sockoptChristoph Hellwig2020-07-241-24/+27
| | | | | | | | | Split __bpfilter_process_sockopt into a low-level send request routine and the actual setsockopt hook to split the init time ping from the actual setsockopt processing. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* bpfilter: fix up a sparse annotationChristoph Hellwig2020-07-241-1/+1
| | | | | | | | | The __user doesn't make sense when casting to an integer type, just switch to a uintptr_t cast which also removes the need for the __force. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/sched: cls_flower: Add hash info to flow classificationAriel Levkovich2020-07-241-0/+16
| | | | | | | | | | Adding new cls flower keys for hash value and hash mask and dissect the hash info from the skb into the flow key towards flow classication. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/flow_dissector: add packet hash dissectionAriel Levkovich2020-07-241-0/+17
| | | | | | | | | Retreive a hash value from the SKB and store it in the dissector key for future matching. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: dsa: stop overriding master's ndo_get_phys_port_nameVladimir Oltean2020-07-232-17/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The purpose of this override is to give the user an indication of what the number of the CPU port is (in DSA, the CPU port is a hardware implementation detail and not a network interface capable of traffic). However, it has always failed (by design) at providing this information to the user in a reliable fashion. Prior to commit 3369afba1e46 ("net: Call into DSA netdevice_ops wrappers"), the behavior was to only override this callback if it was not provided by the DSA master. That was its first failure: if the DSA master itself was a DSA port or a switchdev, then the user would not see the number of the CPU port in /sys/class/net/eth0/phys_port_name, but the number of the DSA master port within its respective physical switch. But that was actually ok in a way. The commit mentioned above changed that behavior, and now overrides the master's ndo_get_phys_port_name unconditionally. That comes with problems of its own, which are worse in a way. The idea is that it's typical for switchdev users to have udev rules for consistent interface naming. These are based, among other things, on the phys_port_name attribute. If we let the DSA switch at the bottom to start randomly overriding ndo_get_phys_port_name with its own CPU port, we basically lose any predictability in interface naming, or even uniqueness, for that matter. So, there are reasons to let DSA override the master's callback (to provide a consistent interface, a number which has a clear meaning and must not be interpreted according to context), and there are reasons to not let DSA override it (it breaks udev matching for the DSA master). But, there is an alternative method for users to retrieve the number of the CPU port of each DSA switch in the system: $ devlink port pci/0000:00:00.5/0: type eth netdev swp0 flavour physical port 0 pci/0000:00:00.5/2: type eth netdev swp2 flavour physical port 2 pci/0000:00:00.5/4: type notset flavour cpu port 4 spi/spi2.0/0: type eth netdev sw0p0 flavour physical port 0 spi/spi2.0/1: type eth netdev sw0p1 flavour physical port 1 spi/spi2.0/2: type eth netdev sw0p2 flavour physical port 2 spi/spi2.0/4: type notset flavour cpu port 4 spi/spi2.1/0: type eth netdev sw1p0 flavour physical port 0 spi/spi2.1/1: type eth netdev sw1p1 flavour physical port 1 spi/spi2.1/2: type eth netdev sw1p2 flavour physical port 2 spi/spi2.1/3: type eth netdev sw1p3 flavour physical port 3 spi/spi2.1/4: type notset flavour cpu port 4 So remove this duplicated, unreliable and troublesome method. From this patch on, the phys_port_name attribute of the DSA master will only contain information about itself (if at all). If the users need reliable information about the CPU port they're probably using devlink anyway. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Acked-by: florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* l2tp: cleanup kzalloc callsTom Parkin2020-07-231-2/+2
| | | | | | | | | | | | Passing "sizeof(struct blah)" in kzalloc calls is less readable, potentially prone to future bugs if the type of the pointer is changed, and triggers checkpatch warnings. Tweak the kzalloc calls in l2tp which use this form to avoid the warning. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* l2tp: cleanup netlink tunnel create address handlingTom Parkin2020-07-231-24/+33
| | | | | | | | | | | | | | | | | | When creating an L2TP tunnel using the netlink API, userspace must either pass a socket FD for the tunnel to use (for managed tunnels), or specify the tunnel source/destination address (for unmanaged tunnels). Since source/destination addresses may be AF_INET or AF_INET6, the l2tp netlink code has conditionally compiled blocks to support IPv6. Rather than embedding these directly into l2tp_nl_cmd_tunnel_create (where it makes the code difficult to read and confuses checkpatch to boot) split the handling of address-related attributes into a separate function. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* l2tp: cleanup netlink send of tunnel address informationTom Parkin2020-07-231-56/+70
| | | | | | | | | | | | | l2tp_nl_tunnel_send has conditionally compiled code to support AF_INET6, which makes the code difficult to follow and triggers checkpatch warnings. Split the code out into functions to handle the AF_INET v.s. AF_INET6 cases, which both improves readability and resolves the checkpatch warnings. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* l2tp: check socket address type in l2tp_dfs_seq_tunnel_showTom Parkin2020-07-231-3/+5
| | | | | | | | | | | | checkpatch warns about indentation and brace balancing around the conditionally compiled code for AF_INET6 support in l2tp_dfs_seq_tunnel_show. By adding another check on the socket address type we can make the code more readable while removing the checkpatch warning. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* l2tp: cleanup unnecessary braces in if statementsTom Parkin2020-07-232-17/+12
| | | | | | | | These checks are all simple and don't benefit from extra braces to clarify intent. Remove them for easier-reading code. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* l2tp: cleanup comparisons to NULLTom Parkin2020-07-236-48/+47
| | | | | | | | | | | | | | checkpatch warns about comparisons to NULL, e.g. CHECK: Comparison to NULL could be written "!rt" #474: FILE: net/l2tp/l2tp_ip.c:474: + if (rt == NULL) { These sort of comparisons are generally clearer and more readable the way checkpatch suggests, so update l2tp accordingly. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/ncsi: use eth_zero_addr() to clear mac addressMiaohe Lin2020-07-231-1/+1
| | | | | | | Use eth_zero_addr() to clear mac address insetad of memset(). Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* subflow: introduce and use mptcp_can_accept_new_subflow()Paolo Abeni2020-07-231-0/+7
| | | | | | | | | | So that we can easily perform some basic PM-related adimission checks before creating the child socket. Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Tested-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* subflow: use rsk_ops->send_reset()Paolo Abeni2020-07-231-1/+1
| | | | | | | | | | | | tcp_send_active_reset() is more prone to transient errors (memory allocation or xmit queue full): in stress conditions the kernel may drop the egress packet, and the client will be stuck. Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Tested-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* subflow: explicitly check for plain tcp rskPaolo Abeni2020-07-231-1/+1
| | | | | | | | | | | | When syncookie are in use, the TCP stack may feed into subflow_syn_recv_sock() plain TCP request sockets. We can't access mptcp_subflow_request_sock-specific fields on such sockets. Explicitly check the rsk ops to do safe accesses. Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Tested-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* mptcp: cleanup subflow_finish_connect()Paolo Abeni2020-07-231-31/+25
| | | | | | | | | | | | The mentioned function has several unneeded branches, handle each case - MP_CAPABLE, MP_JOIN, fallback - under a single conditional and drop quite a bit of duplicate code. Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Tested-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* mptcp: explicitly track the fully established statusPaolo Abeni2020-07-234-9/+31
| | | | | | | | | | | | | | | | | | | | | | Currently accepted msk sockets become established only after accept() returns the new sk to user-space. As MP_JOIN request are refused as per RFC spec on non fully established socket, the above causes mp_join self-tests instabilities. This change lets the msk entering the established status as soon as it receives the 3rd ack and propagates the first subflow fully established status on the msk socket. Finally we can change the subflow acceptance condition to take in account both the sock state and the msk fully established flag. Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Tested-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* mptcp: mark as fallback even early onesPaolo Abeni2020-07-231-2/+9
| | | | | | | | | | | | | | | | | In the unlikely event of a failure at connect time, we currently clear the request_mptcp flag - so that the MPC handshake is not started at all, but the msk is not explicitly marked as fallback. This would lead to later insertion of wrong DSS options in the xmitted packets, in violation of RFC specs and possibly fooling the peer. Fixes: e1ff9e82e2ea ("net: mptcp: improve fallback to TCP") Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Tested-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* mptcp: avoid data corruption on reinsertPaolo Abeni2020-07-231-1/+6
| | | | | | | | | | | | | | When updating a partially acked data fragment, we actually corrupt it. This is irrelevant till we send data on a single subflow, as retransmitted data, if any are discarded by the peer as duplicate, but it will cause data corruption as soon as we will start creating non backup subflows. Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Tested-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* subflow: always init 'rel_write_seq'Paolo Abeni2020-07-232-1/+1
| | | | | | | | | | | Currently we do not init the subflow write sequence for MP_JOIN subflows. This will cause bad mapping being generated as soon as we will use non backup subflow. Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Tested-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* l2tp: avoid precidence issues in L2TP_SKB_CB macroTom Parkin2020-07-221-1/+1
| | | | | | | | checkpatch warned about the L2TP_SKB_CB macro's use of its argument: add braces to avoid the problem. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* l2tp: line-break long function prototypesTom Parkin2020-07-221-2/+4
| | | | | | | | | | | | | In l2tp_core.c both l2tp_tunnel_create and l2tp_session_create take quite a number of arguments and have a correspondingly long prototype. This is both quite difficult to scan visually, and triggers checkpatch warnings. Add a line break to make these function prototypes more readable. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* l2tp: prefer seq_puts for unformatted outputTom Parkin2020-07-221-2/+2
| | | | | | | | | checkpatch warns about use of seq_printf where seq_puts would do. Modify l2tp_debugfs accordingly. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* l2tp: prefer using BIT macroTom Parkin2020-07-221-2/+2
| | | | | | | Use BIT(x) rather than (1<<x), reported by checkpatch.pl. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* l2tp: add identifier name in function pointer prototypeTom Parkin2020-07-221-1/+1
| | | | | | | | | | | | Reported by checkpatch: "WARNING: function definition argument 'struct sock *' should also have an identifier name" Add an identifier name to help document the prototype. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* l2tp: cleanup suspect code indentTom Parkin2020-07-221-2/+2
| | | | | | | | | | | l2tp_core has conditionally compiled code in l2tp_xmit_skb for IPv6 support. The structure of this code triggered a checkpatch warning due to incorrect indentation. Fix up the indentation to address the checkpatch warning. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* l2tp: cleanup wonky alignment of line-broken function callsTom Parkin2020-07-223-8/+8
| | | | | | | | | Arguments should be aligned with the function call open parenthesis as per checkpatch. Tweak some function calls which were not aligned correctly. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>