path: root/net
Commit message / Author / Age / Files / Lines
* netfilter: switch xt_copy_counters to sockptr_tChristoph Hellwig2020-07-244-21/+19
| | | | | | | | Pass a sockptr_t to prepare for set_fs-less handling of the kernel pointer from bpf-cgroup. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
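The idea behind these sockptr_t conversions can be sketched in userspace as a pointer tagged with its origin, so that one copy helper serves both kernel and user buffers. This is only an illustration: every demo_* name is hypothetical, memcpy() stands in for copy_from_user(), and the layout is an assumption rather than the kernel's actual definition.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for a pointer tagged with its origin. */
typedef struct {
	union {
		void *kernel;
		void *user;	/* would carry a __user annotation in the kernel */
	};
	bool is_kernel;
} demo_sockptr_t;

static inline demo_sockptr_t demo_kernel_sockptr(void *p)
{
	return (demo_sockptr_t){ .kernel = p, .is_kernel = true };
}

static inline demo_sockptr_t demo_user_sockptr(void *p)
{
	return (demo_sockptr_t){ .user = p, .is_kernel = false };
}

/* Assumption: memcpy() plays the role of copy_from_user() here. */
static int demo_copy_from_user(void *dst, const void *src, size_t len)
{
	memcpy(dst, src, len);
	return 0;
}

/* One helper for both cases: the tag selects the copy routine. */
static int demo_copy_from_sockptr(void *dst, demo_sockptr_t src, size_t len)
{
	if (src.is_kernel) {
		memcpy(dst, src.kernel, len);
		return 0;
	}
	return demo_copy_from_user(dst, src.user, len);
}
```

A setsockopt-style handler written against such a type never needs to know where the option value came from, which is what makes the set_fs() address-limit switch unnecessary.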
* netfilter: remove the unused user argument to do_update_countersChristoph Hellwig2020-07-241-5/+4
| | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/xfrm: switch xfrm_user_policy to sockptr_tChristoph Hellwig2020-07-243-5/+7
| | | | | | | | Pass a sockptr_t to prepare for set_fs-less handling of the kernel pointer from bpf-cgroup. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: switch sock_set_timeout to sockptr_tChristoph Hellwig2020-07-243-17/+18
| | | | | | | | | Pass a sockptr_t to prepare for set_fs-less handling of the kernel pointer from bpf-cgroup. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: switch sock_set_timeout to sockptr_tChristoph Hellwig2020-07-241-6/+9
| | | | | | | | Pass a sockptr_t to prepare for set_fs-less handling of the kernel pointer from bpf-cgroup. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: switch sock_setbindtodevice to sockptr_tChristoph Hellwig2020-07-241-4/+3
| | | | | | | | Pass a sockptr_t to prepare for set_fs-less handling of the kernel pointer from bpf-cgroup. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: switch copy_bpf_fprog_from_user to sockptr_tChristoph Hellwig2020-07-243-7/+9
| | | | | | | | Pass a sockptr_t to prepare for set_fs-less handling of the kernel pointer from bpf-cgroup. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* bpfilter: reject kernel addressesChristoph Hellwig2020-07-241-0/+4
| | | | | | | | | The bpfilter user mode helper processes the optval address using process_vm_readv. Don't send it kernel addresses fed under set_fs(KERNEL_DS) as that won't work. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/bpfilter: split __bpfilter_process_sockoptChristoph Hellwig2020-07-241-24/+27
| | | | | | | | | Split __bpfilter_process_sockopt into a low-level send request routine and the actual setsockopt hook to split the init time ping from the actual setsockopt processing. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* bpfilter: fix up a sparse annotationChristoph Hellwig2020-07-241-1/+1
| | | | | | | | | The __user doesn't make sense when casting to an integer type, just switch to a uintptr_t cast which also removes the need for the __force. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/sched: cls_flower: Add hash info to flow classificationAriel Levkovich2020-07-241-0/+16
| | | | | | | | | | Add new cls_flower keys for hash value and hash mask, and dissect the hash info from the skb into the flow key for flow classification. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/flow_dissector: add packet hash dissectionAriel Levkovich2020-07-241-0/+17
| | | | | | | | | Retrieve a hash value from the SKB and store it in the dissector key for future matching. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: dsa: stop overriding master's ndo_get_phys_port_nameVladimir Oltean2020-07-232-17/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The purpose of this override is to give the user an indication of what the number of the CPU port is (in DSA, the CPU port is a hardware implementation detail and not a network interface capable of traffic). However, it has always failed (by design) at providing this information to the user in a reliable fashion. Prior to commit 3369afba1e46 ("net: Call into DSA netdevice_ops wrappers"), the behavior was to only override this callback if it was not provided by the DSA master. That was its first failure: if the DSA master itself was a DSA port or a switchdev, then the user would not see the number of the CPU port in /sys/class/net/eth0/phys_port_name, but the number of the DSA master port within its respective physical switch. But that was actually ok in a way. The commit mentioned above changed that behavior, and now overrides the master's ndo_get_phys_port_name unconditionally. That comes with problems of its own, which are worse in a way. The idea is that it's typical for switchdev users to have udev rules for consistent interface naming. These are based, among other things, on the phys_port_name attribute. If we let the DSA switch at the bottom to start randomly overriding ndo_get_phys_port_name with its own CPU port, we basically lose any predictability in interface naming, or even uniqueness, for that matter. So, there are reasons to let DSA override the master's callback (to provide a consistent interface, a number which has a clear meaning and must not be interpreted according to context), and there are reasons to not let DSA override it (it breaks udev matching for the DSA master). 
But, there is an alternative method for users to retrieve the number of the CPU port of each DSA switch in the system:

  $ devlink port
  pci/0000:00:00.5/0: type eth netdev swp0 flavour physical port 0
  pci/0000:00:00.5/2: type eth netdev swp2 flavour physical port 2
  pci/0000:00:00.5/4: type notset flavour cpu port 4
  spi/spi2.0/0: type eth netdev sw0p0 flavour physical port 0
  spi/spi2.0/1: type eth netdev sw0p1 flavour physical port 1
  spi/spi2.0/2: type eth netdev sw0p2 flavour physical port 2
  spi/spi2.0/4: type notset flavour cpu port 4
  spi/spi2.1/0: type eth netdev sw1p0 flavour physical port 0
  spi/spi2.1/1: type eth netdev sw1p1 flavour physical port 1
  spi/spi2.1/2: type eth netdev sw1p2 flavour physical port 2
  spi/spi2.1/3: type eth netdev sw1p3 flavour physical port 3
  spi/spi2.1/4: type notset flavour cpu port 4

So remove this duplicated, unreliable and troublesome method. From this patch on, the phys_port_name attribute of the DSA master will only contain information about itself (if at all). If the users need reliable information about the CPU port they're probably using devlink anyway.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Acked-by: florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* l2tp: cleanup kzalloc callsTom Parkin2020-07-231-2/+2
| | | | | | | | | | | | Passing "sizeof(struct blah)" in kzalloc calls is less readable, potentially prone to future bugs if the type of the pointer is changed, and triggers checkpatch warnings. Tweak the kzalloc calls in l2tp which use this form to avoid the warning. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
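A userspace sketch of the allocation style this cleanup moves towards, with calloc() standing in for kzalloc() and a hypothetical struct demo_tunnel in place of the real l2tp types:

```c
#include <stdlib.h>

/* Hypothetical structure; not one of the real l2tp types. */
struct demo_tunnel {
	int id;
	long stats[4];
};

/* calloc() stands in for kzalloc(): both return zeroed memory. */
static struct demo_tunnel *demo_tunnel_alloc(void)
{
	struct demo_tunnel *tunnel;

	/*
	 * sizeof(*tunnel) follows the pointer's type automatically, whereas
	 * sizeof(struct demo_tunnel) would silently go stale if the
	 * declaration of 'tunnel' were ever changed.
	 */
	tunnel = calloc(1, sizeof(*tunnel));
	return tunnel;
}
```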
* l2tp: cleanup netlink tunnel create address handlingTom Parkin2020-07-231-24/+33
| | | | | | | | | | | | | | | | | | When creating an L2TP tunnel using the netlink API, userspace must either pass a socket FD for the tunnel to use (for managed tunnels), or specify the tunnel source/destination address (for unmanaged tunnels). Since source/destination addresses may be AF_INET or AF_INET6, the l2tp netlink code has conditionally compiled blocks to support IPv6. Rather than embedding these directly into l2tp_nl_cmd_tunnel_create (where it makes the code difficult to read and confuses checkpatch to boot) split the handling of address-related attributes into a separate function. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* l2tp: cleanup netlink send of tunnel address informationTom Parkin2020-07-231-56/+70
| | | | | | | | | | | l2tp_nl_tunnel_send has conditionally compiled code to support AF_INET6, which makes the code difficult to follow and triggers checkpatch warnings. Split the code out into functions to handle the AF_INET vs. AF_INET6 cases, which both improves readability and resolves the checkpatch warnings. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* l2tp: check socket address type in l2tp_dfs_seq_tunnel_showTom Parkin2020-07-231-3/+5
| | | | | | | | | | | | checkpatch warns about indentation and brace balancing around the conditionally compiled code for AF_INET6 support in l2tp_dfs_seq_tunnel_show. By adding another check on the socket address type we can make the code more readable while removing the checkpatch warning. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* l2tp: cleanup unnecessary braces in if statementsTom Parkin2020-07-232-17/+12
| | | | | | | | These checks are all simple and don't benefit from extra braces to clarify intent. Remove them for easier-reading code. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* l2tp: cleanup comparisons to NULLTom Parkin2020-07-236-48/+47
|
| checkpatch warns about comparisons to NULL, e.g.
|
|     CHECK: Comparison to NULL could be written "!rt"
|     #474: FILE: net/l2tp/l2tp_ip.c:474:
|     +       if (rt == NULL) {
|
| These sorts of comparisons are generally clearer and more readable the way
| checkpatch suggests, so update l2tp accordingly.
|
| Signed-off-by: Tom Parkin <tparkin@katalix.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* net/ncsi: use eth_zero_addr() to clear mac addressMiaohe Lin2020-07-231-1/+1
| | | | | | | Use eth_zero_addr() to clear mac address instead of memset(). Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* subflow: introduce and use mptcp_can_accept_new_subflow()Paolo Abeni2020-07-231-0/+7
| | | | | | | | | | So that we can easily perform some basic PM-related admission checks before creating the child socket. Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Tested-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* subflow: use rsk_ops->send_reset()Paolo Abeni2020-07-231-1/+1
| | | | | | | | | | | | tcp_send_active_reset() is more prone to transient errors (memory allocation or xmit queue full): in stress conditions the kernel may drop the egress packet, and the client will be stuck. Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Tested-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* subflow: explicitly check for plain tcp rskPaolo Abeni2020-07-231-1/+1
| | | | | | | | | | | | When syncookies are in use, the TCP stack may feed plain TCP request sockets into subflow_syn_recv_sock(). We can't access mptcp_subflow_request_sock-specific fields on such sockets. Explicitly check the rsk ops to do safe accesses. Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Tested-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
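The pattern of checking the ops table before touching subtype-specific fields can be sketched as follows. All demo_* types and names are hypothetical stand-ins, not the kernel's request sock structures:

```c
#include <stddef.h>

/* Hypothetical ops table that identifies the concrete request type. */
struct demo_req_ops {
	const char *name;
};

/* Hypothetical base request sock. */
struct demo_request_sock {
	const struct demo_req_ops *ops;
};

/* Hypothetical subtype; the base must be the first member for the cast. */
struct demo_subflow_request_sock {
	struct demo_request_sock base;
	int mp_capable;
};

static const struct demo_req_ops demo_tcp_ops = { "tcp" };
static const struct demo_req_ops demo_subflow_ops = { "subflow" };

/*
 * Downcast only when the ops pointer proves the request really is the
 * subtype; a plain request yields NULL instead of a bogus pointer.
 */
static struct demo_subflow_request_sock *
demo_subflow_rsk(struct demo_request_sock *req)
{
	if (req->ops != &demo_subflow_ops)
		return NULL;
	return (struct demo_subflow_request_sock *)req;
}
```

A caller that receives NULL knows it is holding a plain request and must not read fields such as mp_capable.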
* mptcp: cleanup subflow_finish_connect()Paolo Abeni2020-07-231-31/+25
| | | | | | | | | | | | The mentioned function has several unneeded branches, handle each case - MP_CAPABLE, MP_JOIN, fallback - under a single conditional and drop quite a bit of duplicate code. Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Tested-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* mptcp: explicitly track the fully established statusPaolo Abeni2020-07-234-9/+31
| | | | | | | | | | | | | | | | | | | | | | Currently accepted msk sockets become established only after accept() returns the new sk to user-space. As MP_JOIN requests are refused, as per the RFC spec, on sockets that are not fully established, the above causes instabilities in the mp_join self-tests. This change lets the msk enter the established status as soon as it receives the 3rd ack, and propagates the first subflow's fully established status to the msk socket. Finally we can change the subflow acceptance condition to take into account both the sock state and the msk fully established flag. Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Tested-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* mptcp: mark as fallback even early onesPaolo Abeni2020-07-231-2/+9
| | | | | | | | | | | | | | | | | In the unlikely event of a failure at connect time, we currently clear the request_mptcp flag - so that the MPC handshake is not started at all, but the msk is not explicitly marked as fallback. This would lead to later insertion of wrong DSS options in the xmitted packets, in violation of RFC specs and possibly fooling the peer. Fixes: e1ff9e82e2ea ("net: mptcp: improve fallback to TCP") Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Tested-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* mptcp: avoid data corruption on reinsertPaolo Abeni2020-07-231-1/+6
| | | | | | | | | | | | | | When updating a partially acked data fragment, we actually corrupt it. This is irrelevant as long as we send data on a single subflow, since retransmitted data, if any, is discarded by the peer as duplicate, but it will cause data corruption as soon as we start creating non-backup subflows. Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Tested-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* subflow: always init 'rel_write_seq'Paolo Abeni2020-07-232-1/+1
| | | | | | | | | | | Currently we do not init the subflow write sequence for MP_JOIN subflows. This will cause bad mappings to be generated as soon as we use non-backup subflows. Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Tested-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* l2tp: avoid precidence issues in L2TP_SKB_CB macroTom Parkin2020-07-221-1/+1
| | | | | | | | checkpatch warned about the L2TP_SKB_CB macro's use of its argument: add braces to avoid the problem. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
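The class of problem checkpatch flags here can be illustrated with a pair of hypothetical macros (the "braces" in the message are parentheses around the macro argument). None of the DEMO_* macros below is the real L2TP_SKB_CB definition:

```c
/* Unparenthesized argument: operators in the caller's expression can
 * bind more tightly to the macro body than to each other. */
#define DEMO_OFFSET_UNSAFE(idx)	(idx * 4)

/* Parenthesized argument: the expression is evaluated as a unit. */
#define DEMO_OFFSET(idx)	((idx) * 4)

/* Hypothetical buffer with a control-block area, loosely skb-like. */
struct demo_skb {
	char cb[48];
};

/* The same rule for a pointer argument: (skb)->cb, not skb->cb. */
#define DEMO_SKB_CB(skb)	((void *)&(skb)->cb[8])
```

With these definitions DEMO_OFFSET_UNSAFE(1 + 2) expands to (1 + 2 * 4), while DEMO_OFFSET(1 + 2) expands to ((1 + 2) * 4). For the pointer form, an argument such as arr + 1 would not even compile without the parentheses.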
* l2tp: line-break long function prototypesTom Parkin2020-07-221-2/+4
| | | | | | | | | | | | | In l2tp_core.c both l2tp_tunnel_create and l2tp_session_create take quite a number of arguments and have a correspondingly long prototype. This is both quite difficult to scan visually, and triggers checkpatch warnings. Add a line break to make these function prototypes more readable. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* l2tp: prefer seq_puts for unformatted outputTom Parkin2020-07-221-2/+2
| | | | | | | | | checkpatch warns about use of seq_printf where seq_puts would do. Modify l2tp_debugfs accordingly. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* l2tp: prefer using BIT macroTom Parkin2020-07-221-2/+2
| | | | | | | Use BIT(x) rather than (1<<x), reported by checkpatch.pl. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
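A minimal userspace sketch of the BIT(x) style, using a hypothetical DEMO_BIT macro and flag names. The unsigned long base is an assumption made for this illustration:

```c
/* Hypothetical equivalent of a BIT() helper: an unsigned long with
 * only bit 'nr' set. */
#define DEMO_BIT(nr)	(1UL << (nr))

/* Hypothetical flag names, for illustration only. */
#define DEMO_FLAG_SEQ	DEMO_BIT(0)
#define DEMO_FLAG_DEBUG	DEMO_BIT(1)

/* Returns non-zero when 'flag' is set in 'flags'. */
static int demo_has_flag(unsigned long flags, unsigned long flag)
{
	return (flags & flag) != 0;
}
```

Besides being shorter, the unsigned base avoids shifting into the sign bit of a plain int, which (1<<31) would do.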
* l2tp: add identifier name in function pointer prototypeTom Parkin2020-07-221-1/+1
| | | | | | | | | | | | Reported by checkpatch: "WARNING: function definition argument 'struct sock *' should also have an identifier name" Add an identifier name to help document the prototype. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* l2tp: cleanup suspect code indentTom Parkin2020-07-221-2/+2
| | | | | | | | | | | l2tp_core has conditionally compiled code in l2tp_xmit_skb for IPv6 support. The structure of this code triggered a checkpatch warning due to incorrect indentation. Fix up the indentation to address the checkpatch warning. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* l2tp: cleanup wonky alignment of line-broken function callsTom Parkin2020-07-223-8/+8
| | | | | | | | | Arguments should be aligned with the function call open parenthesis as per checkpatch. Tweak some function calls which were not aligned correctly. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* l2tp: cleanup difficult-to-read line breaksTom Parkin2020-07-222-44/+31
| | | | | | | | | | | | | Some l2tp code had line breaks which made the code more difficult to read. These were originally motivated by the 80-character line width coding guidelines, but were actually a negative from the perspective of trying to follow the code. Remove these linebreaks for clearer code, even if we do exceed 80 characters in width in some places. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* l2tp: cleanup commentsTom Parkin2020-07-228-68/+47
| | | | | | | | | | | | | | | | | | | | | | Modify some l2tp comments to better adhere to kernel coding style, as reported by checkpatch.pl. Add descriptive comments for the l2tp per-net spinlocks to document their use. Fix an incorrect comment in l2tp_recv_common: RFC2661 section 5.4 states that: "The LNS controls enabling and disabling of sequence numbers by sending a data message with or without sequence numbers present at any time during the life of a session." l2tp handles this correctly in l2tp_recv_common, but the comment around the code was incorrect and confusing. Fix up the comment accordingly. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* l2tp: cleanup whitespace useTom Parkin2020-07-227-48/+48
|
| Fix up various whitespace issues as reported by checkpatch.pl:
|
|  * remove spaces around operators where appropriate,
|  * add missing blank lines following declarations,
|  * remove multiple blank lines, or trailing blank lines at the end of
|    functions.
|
| Signed-off-by: Tom Parkin <tparkin@katalix.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
* devlink: Always use user_ptr[0] for devlink and simplify post_doitParav Pandit2020-07-221-94/+70
| | | | | | | | | | | | | | | | | | | | | | | Currently the devlink instance is searched for on all doit() operations, but it is only optionally stored in user_ptr[0]. This requires rediscovering devlink again during post_doit(). A few devlink commands related to port shared buffers need 3 pointers (devlink, devlink_port, and devlink_sb) while executing doit commands. Though the devlink pointer can be derived from the devlink_port during the post_doit() operation when the doit() callback has acquired the devlink instance lock, relying on such a scheme to access the devlink pointer makes the code very fragile. Hence, to avoid ambiguity in post_doit() and to avoid searching for the devlink instance again, simplify the code by always storing the devlink instance in user_ptr[0] and deriving the devlink_sb pointer in the respective callback routines. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* mptcp: zero token hash at creation time.Paolo Abeni2020-07-221-1/+1
| | | | | | | | | | | | Otherwise the 'chain_len' field will carry random values, and some token creation calls will fail due to excessive chain length, causing unexpected fallback to TCP. Fixes: 2c5ebd001d4f ("mptcp: refactor token container") Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Tested-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: dccp: Add SIOCOUTQ IOCTL support (send buffer fill)Richard Sailer2020-07-221-0/+9
| | | | | | | | | | | | | This adds support for the SIOCOUTQ IOCTL to get the send buffer fill of a DCCP socket, like UDP and TCP sockets already have. Regarding the used data field: DCCP uses per packet sequence numbers, not per byte, so sequence numbers can't be used like in TCP. sk_wmem_queued is not used by DCCP and is always 0, even in tests on highly congested paths. Therefore this uses sk_wmem_alloc like in UDP. Signed-off-by: Richard Sailer <richard_siegfried@systemli.org> Signed-off-by: David S. Miller <davem@davemloft.net>
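From userspace, the send buffer fill is queried with an ioctl() on the socket descriptor. The Linux-only sketch below wraps that call in a hypothetical helper, demo_send_queue_bytes(). DCCP support may not be built into a given kernel, so trying it on a UDP socket, which the message says already supports SIOCOUTQ, is the easier check.

```c
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <linux/sockios.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns the send queue fill reported by
 * SIOCOUTQ for 'fd', or -1 if the ioctl fails (for example because
 * the socket type does not implement it).
 */
static int demo_send_queue_bytes(int fd)
{
	int pending = 0;

	if (ioctl(fd, SIOCOUTQ, &pending) < 0)
		return -1;
	return pending;
}
```

With this commit applied, the same call would be expected to work on a socket created with SOCK_DCCP as well.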
* net: dsa: of: Allow ethernet-ports as encapsulating nodeKurt Kanzenbach2020-07-221-2/+6
| | | | | | | | Due to the unified Ethernet Switch Device Tree Bindings, allow ethernet-ports as an encapsulating node as well. Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: explicitly include <linux/compat.h> in net/core/sock.cChristoph Hellwig2020-07-221-0/+1
| | | | | | | | | The buildbot found a config where the header isn't already implicitly pulled in, so add an explicit include as well. Fixes: 8c918ffbbad4 ("net: remove compat_sock_common_{get,set}sockopt") Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller2020-07-229-103/+472
|\
| | Alexei Starovoitov says:
| |
| | ====================
| | pull-request: bpf-next 2020-07-21
| |
| | The following pull-request contains BPF updates for your *net-next* tree.
| |
| | We've added 46 non-merge commits during the last 6 day(s) which contain
| | a total of 68 files changed, 4929 insertions(+), 526 deletions(-).
| |
| | The main changes are:
| |
| | 1) Run BPF program on socket lookup, from Jakub.
| |
| | 2) Introduce cpumap, from Lorenzo.
| |
| | 3) s390 JIT fixes, from Ilya.
| |
| | 4) teach riscv JIT to emit compressed insns, from Luke.
| |
| | 5) use build time computed BTF ids in bpf iter, from Yonghong.
| | ====================
| |
| | Purely independent overlapping changes in both filter.h and xdp.h
| |
| | Signed-off-by: David S. Miller <davem@davemloft.net>
| * bpf: net: Use precomputed btf_id for bpf iteratorsYonghong Song2020-07-214-4/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | One additional field, btf_id, is added to struct bpf_ctx_arg_aux to store the precomputed btf_ids. The btf_id is computed at build time with the BTF_ID_LIST or BTF_ID_LIST_GLOBAL macro definitions. All existing bpf iterators are changed to use precomputed btf_ids. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200720163403.1393551-1-yhs@fb.com
| * bpf: Make btf_sock_ids globalYonghong Song2020-07-211-28/+2
| | | | | | | | | | | | | | | | | | | | | | | | tcp and udp bpf_iter can reuse some socket ids in btf_sock_ids, so make it global. I put the extern definition in btf_ids.h as a central place so it can be easily discovered by developers. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200720163402.1393427-1-yhs@fb.com
| * bpf: Compute bpf_skc_to_*() helper socket btf ids at build timeYonghong Song2020-07-211-31/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, socket types (struct tcp_sock, udp_sock, etc.) used by bpf_skc_to_*() helpers are computed when vmlinux_btf is first built in the kernel. Commit 5a2798ab32ba ("bpf: Add BTF_ID_LIST/BTF_ID/BTF_ID_UNUSED macros") implemented a mechanism to compute btf_ids at kernel build time which can simplify kernel implementation and reduce runtime overhead by removing in-kernel btf_id calculation. This patch did exactly this, removing in-kernel btf_id computation and utilizing build-time btf_id computation. If CONFIG_DEBUG_INFO_BTF is not defined, BTF_ID_LIST will define an array with size of 5, which is not enough for btf_sock_ids. So define its own static array if CONFIG_DEBUG_INFO_BTF is not defined. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200720163358.1393023-1-yhs@fb.com
| * udp6: Run SK_LOOKUP BPF program on socket lookupJakub Sitnicki2020-07-171-9/+51
| | | | | | | | | | | | | | | | | | | | | | | | Same as for udp4, let BPF program override the socket lookup result, by selecting a receiving socket of its choice or failing the lookup, if no connected UDP socket matched packet 4-tuple. Suggested-by: Marek Majkowski <marek@cloudflare.com> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200717103536.397595-11-jakub@cloudflare.com
| * udp6: Extract helper for selecting socket from reuseport groupJakub Sitnicki2020-07-171-11/+26
| | | | | | | | | | | | | | | | | | Prepare for calling into reuseport from __udp6_lib_lookup as well. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200717103536.397595-10-jakub@cloudflare.com
| * udp: Run SK_LOOKUP BPF program on socket lookupJakub Sitnicki2020-07-171-9/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Following INET/TCP socket lookup changes, modify UDP socket lookup to let BPF program select a receiving socket before searching for a socket by destination address and port as usual. Lookup of connected sockets that match packet 4-tuple is unaffected by this change. BPF program runs, and potentially overrides the lookup result, only if a 4-tuple match was not found. Suggested-by: Marek Majkowski <marek@cloudflare.com> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200717103536.397595-9-jakub@cloudflare.com
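The lookup order described above can be sketched as a small decision function. This is a simplification with hypothetical names: the three candidate sockets are passed in directly rather than looked up in hash tables, and the case where the BPF program fails the lookup outright is not modelled.

```c
#include <stddef.h>

/* Hypothetical socket handle for illustration. */
struct demo_sock {
	int id;
};

/*
 * Pick the receiving socket in the order the message describes:
 *   1. a connected socket matching the packet 4-tuple wins outright,
 *   2. otherwise a socket selected by the BPF program is used,
 *   3. otherwise fall back to the regular destination address/port match.
 * Any argument may be NULL, meaning "that stage found nothing".
 */
static struct demo_sock *demo_udp_lookup(struct demo_sock *connected,
					 struct demo_sock *bpf_selected,
					 struct demo_sock *by_addr_port)
{
	if (connected)
		return connected;
	if (bpf_selected)
		return bpf_selected;
	return by_addr_port;
}
```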