summaryrefslogtreecommitdiffstats
path: root/net/bpf
Commit message (Collapse)AuthorAgeFilesLines
* bpf: fix warning about using plain integer as NULLBo YU2019-03-081-1/+1
| | | | | | | | | | | | | | Sparse warning below: sudo make C=2 CF=-D__CHECK_ENDIAN__ M=net/bpf/ CHECK net/bpf//test_run.c net/bpf//test_run.c:19:77: warning: Using plain integer as NULL pointer ./include/linux/bpf-cgroup.h:295:77: warning: Using plain integer as NULL pointer Fixes: 8bad74f9840f ("bpf: extend cgroup bpf core to allow multiple cgroup storage types") Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Bo YU <tsu.yubo@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller2019-03-041-6/+20
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Daniel Borkmann says: ==================== pull-request: bpf-next 2019-03-04 The following pull-request contains BPF updates for your *net-next* tree. The main changes are: 1) Add AF_XDP support to libbpf. Rationale is to facilitate writing AF_XDP applications by offering higher-level APIs that hide many of the details of the AF_XDP uapi. Sample programs are converted over to this new interface as well, from Magnus. 2) Introduce a new cant_sleep() macro for annotation of functions that cannot sleep and use it in BPF_PROG_RUN() to assert that BPF programs run under preemption disabled context, from Peter. 3) Introduce per BPF prog stats in order to monitor the usage of BPF; this is controlled by kernel.bpf_stats_enabled sysctl knob where monitoring tools can make use of this to efficiently determine the average cost of programs, from Alexei. 4) Split up BPF selftest's test_progs similarly as we already did with test_verifier. This allows to further reduce merge conflicts in future and to get more structure into our quickly growing BPF selftest suite, from Stanislav. 5) Fix a bug in BTF's dedup algorithm which can cause an infinite loop in some circumstances; also various BPF doc fixes and improvements, from Andrii. 6) Various BPF sample cleanups and migration to libbpf in order to further isolate the old sample loader code (so we can get rid of it at some point), from Jakub. 7) Add a new BPF helper for BPF cgroup skb progs that allows to set ECN CE code point and a Host Bandwidth Manager (HBM) sample program for limiting the bandwidth used by v2 cgroups, from Lawrence. 8) Enable write access to skb->queue_mapping from tc BPF egress programs in order to let BPF pick TX queue, from Jesper. 9) Fix a bug in BPF spinlock handling for map-in-map which did not propagate spin_lock_off to the meta map, from Yonghong. 10) Fix a bug in the new per-CPU BPF prog counters to properly initialize stats for each CPU, from Eric. 11) Add various BPF helper prototypes to selftest's bpf_helpers.h, from Willem. 12) Fix various BPF samples bugs in XDP and tracing progs, from Toke, Daniel and Yonghong. 13) Silence preemption splat in test_bpf after BPF_PROG_RUN() enforces it now everywhere, from Anders. 14) Fix a signedness bug in libbpf's btf_dedup_ref_type() to get error handling working, from Dan. 15) Fix bpftool documentation and auto-completion with regards to stream_{verdict,parser} attach types, from Alban. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * bpf/test_run: fix unkillable BPF_PROG_TEST_RUN for flow dissectorStanislav Fomichev2019-02-251-6/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Syzbot found out that running BPF_PROG_TEST_RUN with repeat=0xffffffff makes process unkillable. The problem is that when CONFIG_PREEMPT is enabled, we never see need_resched() return true. This is due to the fact that preempt_enable() (which we do in bpf_test_run_one on each iteration) now handles resched if it's needed. Let's disable preemption for the whole run, not per test. In this case we can properly see whether resched is needed. Let's also properly return -EINTR to the userspace in case of a signal interrupt. This is a follow up for a recently fixed issue in bpf_test_run, see commit df1a2cb7c74b ("bpf/test_run: fix unkillable BPF_PROG_TEST_RUN"). Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2019-02-241-21/+24
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | Three conflicts, one of which, for marvell10g.c is non-trivial and requires some follow-up from Heiner or someone else. The issue is that Heiner converted the marvell10g driver over to use the generic c45 code as much as possible. However, in 'net' a bug fix appeared which makes sure that a new local mask (MDIO_AN_10GBT_CTRL_ADV_NBT_MASK) with value 0x01e0 is cleared. Signed-off-by: David S. Miller <davem@davemloft.net>
| * bpf/test_run: fix unkillable BPF_PROG_TEST_RUNStanislav Fomichev2019-02-191-21/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Syzbot found out that running BPF_PROG_TEST_RUN with repeat=0xffffffff makes process unkillable. The problem is that when CONFIG_PREEMPT is enabled, we never see need_resched() return true. This is due to the fact that preempt_enable() (which we do in bpf_test_run_one on each iteration) now handles resched if it's needed. Let's disable preemption for the whole run, not per test. In this case we can properly see whether resched is needed. Let's also properly return -EINTR to the userspace in case of a signal interrupt. See recent discussion: http://lore.kernel.org/netdev/CAH3MdRWHr4N8jei8jxDppXjmw-Nw=puNDLbu1dQOFQHxfU2onA@mail.gmail.com I'll follow up with the same fix bpf_prog_test_run_flow_dissector in bpf-next. Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* | bpf: add BPF_PROG_TEST_RUN support for flow dissectorStanislav Fomichev2019-01-291-0/+82
|/ | | | | | | | | | The input is packet data, the output is struct bpf_flow_key. This should make it easy to test flow dissector programs without elaborate setup. Signed-off-by: Stanislav Fomichev <sdf@google.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller2018-12-101-2/+13
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Daniel Borkmann says: ==================== pull-request: bpf-next 2018-12-11 The following pull-request contains BPF updates for your *net-next* tree. It has three minor merge conflicts, resolutions: 1) tools/testing/selftests/bpf/test_verifier.c Take first chunk with alignment_prevented_execution. 2) net/core/filter.c [...] case bpf_ctx_range_ptr(struct __sk_buff, flow_keys): case bpf_ctx_range(struct __sk_buff, wire_len): return false; [...] 3) include/uapi/linux/bpf.h Take the second chunk for the two cases each. The main changes are: 1) Add support for BPF line info via BTF and extend libbpf as well as bpftool's program dump to annotate output with BPF C code to facilitate debugging and introspection, from Martin. 2) Add support for BPF_ALU | BPF_ARSH | BPF_{K,X} in interpreter and all JIT backends, from Jiong. 3) Improve BPF test coverage on archs with no efficient unaligned access by adding an "any alignment" flag to the BPF program load to forcefully disable verifier alignment checks, from David. 4) Add a new bpf_prog_test_run_xattr() API to libbpf which allows for proper use of BPF_PROG_TEST_RUN with data_out, from Lorenz. 5) Extend tc BPF programs to use a new __sk_buff field called wire_len for more accurate accounting of packets going to wire, from Petar. 6) Improve bpftool to allow dumping the trace pipe from it and add several improvements in bash completion and map/prog dump, from Quentin. 7) Optimize arm64 BPF JIT to always emit movn/movk/movk sequence for kernel addresses and add a dedicated BPF JIT backend allocator, from Ard. 8) Add a BPF helper function for IR remotes to report mouse movements, from Sean. 9) Various cleanups in BPF prog dump e.g. to make UAPI bpf_prog_info member naming consistent with existing conventions, from Yonghong and Song. 10) Misc cleanups and improvements in allowing to pass interface name via cmdline for xdp1 BPF example, from Matteo. 11) Fix a potential segfault in BPF sample loader's kprobes handling, from Daniel T. 12) Fix SPDX license in libbpf's README.rst, from Andrey. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * bpf: respect size hint to BPF_PROG_TEST_RUN if presentLorenz Bauer2018-12-041-2/+13
| | | | | | | | | | | | | | | | | | Use data_size_out as a size hint when copying test output to user space. ENOSPC is returned if the output buffer is too small. Callers which so far did not set data_size_out are not affected. Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
* | bpf: refactor bpf_test_run() to separate own failures and test program resultRoman Gushchin2018-12-011-6/+15
|/ | | | | | | | | | | | | | | | | | | After commit f42ee093be29 ("bpf/test_run: support cgroup local storage") the bpf_test_run() function may fail with -ENOMEM, if it's not possible to allocate memory for a cgroup local storage. This error shouldn't be mixed with the return value of the testing program. Let's add an additional argument with a pointer where to store the testing program's result; and make bpf_test_run() return either 0 or -ENOMEM. Fixes: f42ee093be29 ("bpf/test_run: support cgroup local storage") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
* bpf: add tests for direct packet access from CGROUP_SKBSong Liu2018-10-191-0/+15
| | | | | | | | | | | | | | | | Tests are added to make sure CGROUP_SKB cannot access: tc_classid, data_meta, flow_keys and can read and write: mark, prority, and cb[0-4] and can read other fields. To make selftest with skb->sk work, a dummy sk is added in bpf_prog_test_run_skb(). Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
* bpf: extend cgroup bpf core to allow multiple cgroup storage typesRoman Gushchin2018-10-011-6/+14
| | | | | | | | | | | | | | | | | | | In order to introduce per-cpu cgroup storage, let's generalize bpf cgroup core to support multiple cgroup storage types. Potentially, per-node cgroup storage can be added later. This commit is mostly a formal change that replaces cgroup_storage pointer with a array of cgroup_storage pointers. It doesn't actually introduce a new storage type, it will be done later. Each bpf program is now able to have one cgroup storage of each type. Signed-off-by: Roman Gushchin <guro@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* bpf/test_run: support cgroup local storageRoman Gushchin2018-08-031-2/+11
| | | | | | | | | | | | | | | | | | Allocate a temporary cgroup storage to use for bpf program test runs. Because the test program is not actually attached to a cgroup, the storage is allocated manually just for the execution of the bpf program. If the program is executed multiple times, the storage is not zeroed on each run, emulating multiple runs of the program, attached to a real cgroup. Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* bpf: fix panic due to oob in bpf_prog_test_run_skbDaniel Borkmann2018-07-111-3/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sykzaller triggered several panics similar to the below: [...] [ 248.851531] BUG: KASAN: use-after-free in _copy_to_user+0x5c/0x90 [ 248.857656] Read of size 985 at addr ffff8808017ffff2 by task a.out/1425 [...] [ 248.865902] CPU: 1 PID: 1425 Comm: a.out Not tainted 4.18.0-rc4+ #13 [ 248.865903] Hardware name: Supermicro SYS-5039MS-H12TRF/X11SSE-F, BIOS 2.1a 03/08/2018 [ 248.865905] Call Trace: [ 248.865910] dump_stack+0xd6/0x185 [ 248.865911] ? show_regs_print_info+0xb/0xb [ 248.865913] ? printk+0x9c/0xc3 [ 248.865915] ? kmsg_dump_rewind_nolock+0xe4/0xe4 [ 248.865919] print_address_description+0x6f/0x270 [ 248.865920] kasan_report+0x25b/0x380 [ 248.865922] ? _copy_to_user+0x5c/0x90 [ 248.865924] check_memory_region+0x137/0x190 [ 248.865925] kasan_check_read+0x11/0x20 [ 248.865927] _copy_to_user+0x5c/0x90 [ 248.865930] bpf_test_finish.isra.8+0x4f/0xc0 [ 248.865932] bpf_prog_test_run_skb+0x6a0/0xba0 [...] After scrubbing the BPF prog a bit from the noise, turns out it called bpf_skb_change_head() for the lwt_xmit prog with headroom of 2. Nothing wrong in that, however, this was run with repeat >> 0 in bpf_prog_test_run_skb() and the same skb thus keeps changing until the pskb_expand_head() called from skb_cow() keeps bailing out in atomic alloc context with -ENOMEM. So upon return we'll basically have 0 headroom left yet blindly do the __skb_push() of 14 bytes and keep copying data from there in bpf_test_finish() out of bounds. Fix to check if we have enough headroom and if pskb_expand_head() fails, bail out with error. Another bug independent of this fix (but related in triggering above) is that BPF_PROG_TEST_RUN should be reworked to reset the skb/xdp buffer to it's original state from input as otherwise repeating the same test in a loop won't work for benchmarking when underlying input buffer is getting changed by the prog each time and reused for the next run leading to unexpected results. Fixes: 1cf1cae963c2 ("bpf: introduce BPF_PROG_TEST_RUN command") Reported-by: syzbot+709412e651e55ed96498@syzkaller.appspotmail.com Reported-by: syzbot+54f39d6ab58f39720a55@syzkaller.appspotmail.com Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
* bpf: making bpf_prog_test run aware of possible data_end ptr changeNikita V. Shirokov2018-04-181-1/+2
| | | | | | | | | | after introduction of bpf_xdp_adjust_tail helper packet length could be changed not only if xdp->data pointer has been changed but xdp->data_end as well. making bpf_prog_test_run aware of this possibility Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* bpf: fix null pointer deref in bpf_prog_test_run_xdpDaniel Borkmann2018-02-011-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | syzkaller was able to generate the following XDP program ... (18) r0 = 0x0 (61) r5 = *(u32 *)(r1 +12) (04) (u32) r0 += (u32) 0 (95) exit ... and trigger a NULL pointer dereference in ___bpf_prog_run() via bpf_prog_test_run_xdp() where this was attempted to run. Reason is that recent xdp_rxq_info addition to XDP programs updated all drivers, but not bpf_prog_test_run_xdp(), where xdp_buff is set up. Thus when context rewriter does the deref on the netdev it's NULL at runtime. Fix it by using xdp_rxq from loopback dev. __netif_get_rx_queue() helper can also be reused in various other locations later on. Fixes: 02dd3291b2f0 ("bpf: finally expose xdp_rxq_info to XDP bpf-programs") Reported-by: syzbot+1eb094057b338eb1fc00@syzkaller.appspotmail.com Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
* bpf: add meta pointer for direct accessDaniel Borkmann2017-09-261-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This work enables generic transfer of metadata from XDP into skb. The basic idea is that we can make use of the fact that the resulting skb must be linear and already comes with a larger headroom for supporting bpf_xdp_adjust_head(), which mangles xdp->data. Here, we base our work on a similar principle and introduce a small helper bpf_xdp_adjust_meta() for adjusting a new pointer called xdp->data_meta. Thus, the packet has a flexible and programmable room for meta data, followed by the actual packet data. struct xdp_buff is therefore laid out that we first point to data_hard_start, then data_meta directly prepended to data followed by data_end marking the end of packet. bpf_xdp_adjust_head() takes into account whether we have meta data already prepended and if so, memmove()s this along with the given offset provided there's enough room. xdp->data_meta is optional and programs are not required to use it. The rationale is that when we process the packet in XDP (e.g. as DoS filter), we can push further meta data along with it for the XDP_PASS case, and give the guarantee that a clsact ingress BPF program on the same device can pick this up for further post-processing. Since we work with skb there, we can also set skb->mark, skb->priority or other skb meta data out of BPF, thus having this scratch space generic and programmable allows for more flexibility than defining a direct 1:1 transfer of potentially new XDP members into skb (it's also more efficient as we don't need to initialize/handle each of such new members). The facility also works together with GRO aggregation. The scratch space at the head of the packet can be multiple of 4 byte up to 32 byte large. Drivers not yet supporting xdp->data_meta can simply be set up with xdp->data_meta as xdp->data + 1 as bpf_xdp_adjust_meta() will detect this and bail out, such that the subsequent match against xdp->data for later access is guaranteed to fail. The verifier treats xdp->data_meta/xdp->data the same way as we treat xdp->data/xdp->data_end pointer comparisons. The requirement for doing the compare against xdp->data is that it hasn't been modified from it's original address we got from ctx access. It may have a range marking already from prior successful xdp->data/xdp->data_end pointer comparisons though. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bpf: rename bpf_compute_data_end into bpf_compute_data_pointersDaniel Borkmann2017-09-261-1/+1
| | | | | | | | | | Just do the rename into bpf_compute_data_pointers() as we'll add one more pointer here to recompute. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bpf: Align packet data properly in program testing framework.David Miller2017-05-021-5/+5
| | | | | | | | Make sure we apply NET_IP_ALIGN when reserving headroom for SKB and XDP test runs, just like a real driver would. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Daniel Borkmann <daniel@iogearbox.net>
* bpf: Do not dereference user pointer in bpf_test_finish().David Miller2017-05-021-4/+5
| | | | | | | | | | Instead, pass the kattr in which has a kernel side copy of this data structure from userspace already. Fix based upon a suggestion from Alexei Starovoitov. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Daniel Borkmann <daniel@iogearbox.net>
* bpf: introduce BPF_PROG_TEST_RUN commandAlexei Starovoitov2017-04-012-0/+173
development and testing of networking bpf programs is quite cumbersome. Despite availability of user space bpf interpreters the kernel is the ultimate authority and execution environment. Current test frameworks for TC include creation of netns, veth, qdiscs and use of various packet generators just to test functionality of a bpf program. XDP testing is even more complicated, since qemu needs to be started with gro/gso disabled and precise queue configuration, transferring of xdp program from host into guest, attaching to virtio/eth0 and generating traffic from the host while capturing the results from the guest. Moreover analyzing performance bottlenecks in XDP program is impossible in virtio environment, since cost of running the program is tiny comparing to the overhead of virtio packet processing, so performance testing can only be done on physical nic with another server generating traffic. Furthermore ongoing changes to user space control plane of production applications cannot be run on the test servers leaving bpf programs stubbed out for testing. Last but not least, the upstream llvm changes are validated by the bpf backend testsuite which has no ability to test the code generated. To improve this situation introduce BPF_PROG_TEST_RUN command to test and performance benchmark bpf programs. Joint work with Daniel Borkmann. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>