summaryrefslogtreecommitdiffstats
path: root/samples/bpf
Commit message (Collapse)AuthorAgeFilesLines
* samples: bpf: fix: error handling regarding kprobe_eventsDaniel T. Lee2018-11-231-9/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, kprobe_events failure won't be handled properly. Due to calling system() indirectly to write to kprobe_events, it can't be identified whether an error is derived from kprobe or system. // buf = "echo '%c:%s %s' >> /s/k/d/t/kprobe_events" err = system(buf); if (err < 0) { printf("failed to create kprobe .."); return -1; } For example, running ./tracex7 sample in ext4 partition, "echo p:open_ctree open_ctree >> /s/k/d/t/kprobe_events" gets 256 error code system() failure. => The error comes from kprobe, but it's not handled correctly. According to man of system(3), it's return value just passes the termination status of the child shell rather than treating the error as -1. (don't care success) Which means, currently it's not working as desired. (According to the upper code snippet) ex) running ./tracex7 with ext4 env. # Current Output sh: echo: I/O error failed to open event open_ctree # Desired Output failed to create kprobe 'open_ctree' error 'No such file or directory' The problem is, error can't be verified whether from child ps or system. But using write() directly can verify the command failure, and it will treat all error as -1. So I suggest using write() directly to 'kprobe_events' rather than calling system(). Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* tools/bpf: do not use pahole if clang/llvm can generate BTF sectionsYonghong Song2018-11-201-0/+8
| | | | | | | | | | Add additional checks in tools/testing/selftests/bpf and samples/bpf such that if clang/llvm compiler can generate BTF sections, do not use pahole. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
* bpf_load: add map name to load_maps error messageShannon Nelson2018-11-071-2/+2
| | | | | | | | | | | To help when debugging bpf/xdp load issues, have the load_map() error message include the number and name of the map that failed. Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* bpf, tracex3_user: erase "ARRAY_SIZE" redefinedBo YU2018-10-041-2/+0
| | | | | | | | | | | | | | | | There is a warning when compiling bpf sample programs in sample/bpf: make -C /home/foo/bpf/samples/bpf/../../tools/lib/bpf/ RM='rm -rf' LDFLAGS= srctree=/home/foo/bpf/samples/bpf/../../ O= HOSTCC /home/foo/bpf/samples/bpf/tracex3_user.o /home/foo/bpf/samples/bpf/tracex3_user.c:20:0: warning: "ARRAY_SIZE" redefined #define ARRAY_SIZE(x) (sizeof(x) / sizeof(*(x))) In file included from /home/foo/bpf/samples/bpf/tracex3_user.c:18:0: ./tools/testing/selftests/bpf/bpf_util.h:48:0: note: this is the location of the previous definition # define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0])) Signed-off-by: Bo YU <tsu.yubo@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* samples/bpf: extend test_cgrp2_attach2 test to use per-cpu cgroup storageRoman Gushchin2018-10-011-1/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit extends the test_cgrp2_attach2 test to cover per-cpu cgroup storage. Bpf program will use shared and per-cpu cgroup storages simultaneously, so a better coverage of corresponding core code will be achieved. Expected output: $ ./test_cgrp2_attach2 Attached DROP prog. This ping in cgroup /foo should fail... ping: sendmsg: Operation not permitted Attached DROP prog. This ping in cgroup /foo/bar should fail... ping: sendmsg: Operation not permitted Attached PASS prog. This ping in cgroup /foo/bar should pass... Detached PASS from /foo/bar while DROP is attached to /foo. This ping in cgroup /foo/bar should fail... ping: sendmsg: Operation not permitted Attached PASS from /foo/bar and detached DROP from /foo. This ping in cgroup /foo/bar should pass... ### override:PASS ### multi:PASS Signed-off-by: Roman Gushchin <guro@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* samples/bpf: fix compilation failurePrashant Bhole2018-09-213-11/+12
| | | | | | | | | | | | | | | | | | | | following commit: commit d58e468b1112 ("flow_dissector: implements flow dissector BPF hook") added struct bpf_flow_keys which conflicts with the struct with same name in sockex2_kern.c and sockex3_kern.c similar to commit: commit 534e0e52bc23 ("samples/bpf: fix a compilation failure") we tried the rename it "flow_keys" but it also conflicted with struct having same name in include/net/flow_dissector.h. Hence renaming the struct to "flow_key_record". Also, this commit doesn't fix the compilation error completely because the similar struct is present in sockex3_kern.c. Hence renaming it in both files sockex3_user.c and sockex3_kern.c Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* samples/bpf: fix a compilation failureYonghong Song2018-09-181-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | samples/bpf build failed with the following errors: $ make samples/bpf/ ... HOSTCC samples/bpf/sockex3_user.o /data/users/yhs/work/net-next/samples/bpf/sockex3_user.c:16:8: error: redefinition of ‘struct bpf_flow_keys’ struct bpf_flow_keys { ^ In file included from /data/users/yhs/work/net-next/samples/bpf/sockex3_user.c:4:0: ./usr/include/linux/bpf.h:2338:9: note: originally defined here struct bpf_flow_keys *flow_keys; ^ make[3]: *** [samples/bpf/sockex3_user.o] Error 1 Commit d58e468b1112d ("flow_dissector: implements flow dissector BPF hook") introduced struct bpf_flow_keys in include/uapi/linux/bpf.h and hence caused the naming conflict with samples/bpf/sockex3_user.c. The fix is to rename struct bpf_flow_keys in samples/bpf/sockex3_user.c to flow_keys to avoid the conflict. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* samples/bpf: remove duplicated includesYueHaibing2018-09-183-3/+0
| | | | | | | | Remove duplicated includes. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* samples/bpf: xdpsock, minor fixesPrashant Bhole2018-09-012-3/+2
| | | | | | | | | | - xsks_map size was fixed to 4, changed it MAX_SOCKS - Remove redundant definition of MAX_SOCKS in xdpsock_user.c - In dump_stats(), add NULL check for xsks[i] Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> Acked-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* bpf: add TCP_SAVE_SYN/TCP_SAVED_SYN sample programNikita V. Shirokov2018-09-012-0/+88
| | | | | | | | | | Sample program which shows TCP_SAVE_SYN/TCP_SAVED_SYN usage example: bpf program which is doing TOS/TCLASS reflection (server would reply with a same TOS/TCLASS as client). Signed-off-by: Nikita V. Shirokov <tehnerd@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* samples/bpf: add -c/--copy -z/--zero-copy flags to xdpsockBjörn Töpel2018-08-291-1/+11
| | | | | | | | The -c/--copy -z/--zero-copy flags enforces either copy or zero-copy mode. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
* samples/bpf: all XDP samples should unload xdp/bpf prog on SIGTERMJesper Dangaard Brouer2018-08-162-2/+4
| | | | | | | | | | | | | | | | | | | | It is common XDP practice to unload/deattach the XDP bpf program, when the XDP sample program is Ctrl-C interrupted (SIGINT) or killed (SIGTERM). The samples/bpf programs xdp_redirect_cpu and xdp_rxq_info, forgot to trap signal SIGTERM (which is the default signal used by the kill command). This was discovered by Red Hat QA, which automated scripts depend on killing the XDP sample program after a timeout period. Fixes: fad3917e361b ("samples/bpf: add cpumap sample program xdp_redirect_cpu") Fixes: 0fca931a6f21 ("samples/bpf: program demonstrating access to xdp_rxq_info") Reported-by: Jean-Tsung Hsiao <jhsiao@redhat.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds2018-08-1513-34/+590
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking updates from David Miller: "Highlights: - Gustavo A. R. Silva keeps working on the implicit switch fallthru changes. - Support 802.11ax High-Efficiency wireless in cfg80211 et al, From Luca Coelho. - Re-enable ASPM in r8169, from Kai-Heng Feng. - Add virtual XFRM interfaces, which avoids all of the limitations of existing IPSEC tunnels. From Steffen Klassert. - Convert GRO over to use a hash table, so that when we have many flows active we don't traverse a long list during accumluation. - Many new self tests for routing, TC, tunnels, etc. Too many contributors to mention them all, but I'm really happy to keep seeing this stuff. - Hardware timestamping support for dpaa_eth/fsl-fman from Yangbo Lu. - Lots of cleanups and fixes in L2TP code from Guillaume Nault. - Add IPSEC offload support to netdevsim, from Shannon Nelson. - Add support for slotting with non-uniform distribution to netem packet scheduler, from Yousuk Seung. - Add UDP GSO support to mlx5e, from Boris Pismenny. - Support offloading of Team LAG in NFP, from John Hurley. - Allow to configure TX queue selection based upon RX queue, from Amritha Nambiar. - Support ethtool ring size configuration in aquantia, from Anton Mikaev. - Support DSCP and flowlabel per-transport in SCTP, from Xin Long. - Support list based batching and stack traversal of SKBs, this is very exciting work. From Edward Cree. - Busyloop optimizations in vhost_net, from Toshiaki Makita. - Introduce the ETF qdisc, which allows time based transmissions. IGB can offload this in hardware. From Vinicius Costa Gomes. - Add parameter support to devlink, from Moshe Shemesh. - Several multiplication and division optimizations for BPF JIT in nfp driver, from Jiong Wang. - Lots of prepatory work to make more of the packet scheduler layer lockless, when possible, from Vlad Buslov. - Add ACK filter and NAT awareness to sch_cake packet scheduler, from Toke Høiland-Jørgensen. - Support regions and region snapshots in devlink, from Alex Vesker. - Allow to attach XDP programs to both HW and SW at the same time on a given device, with initial support in nfp. From Jakub Kicinski. - Add TLS RX offload and support in mlx5, from Ilya Lesokhin. - Use PHYLIB in r8169 driver, from Heiner Kallweit. - All sorts of changes to support Spectrum 2 in mlxsw driver, from Ido Schimmel. - PTP support in mv88e6xxx DSA driver, from Andrew Lunn. - Make TCP_USER_TIMEOUT socket option more accurate, from Jon Maxwell. - Support for templates in packet scheduler classifier, from Jiri Pirko. - IPV6 support in RDS, from Ka-Cheong Poon. - Native tproxy support in nf_tables, from Máté Eckl. - Maintain IP fragment queue in an rbtree, but optimize properly for in-order frags. From Peter Oskolkov. - Improvde handling of ACKs on hole repairs, from Yuchung Cheng" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1996 commits) bpf: test: fix spelling mistake "REUSEEPORT" -> "REUSEPORT" hv/netvsc: Fix NULL dereference at single queue mode fallback net: filter: mark expected switch fall-through xen-netfront: fix warn message as irq device name has '/' cxgb4: Add new T5 PCI device ids 0x50af and 0x50b0 net: dsa: mv88e6xxx: missing unlock on error path rds: fix building with IPV6=m inet/connection_sock: prefer _THIS_IP_ to current_text_addr net: dsa: mv88e6xxx: bitwise vs logical bug net: sock_diag: Fix spectre v1 gadget in __sock_diag_cmd() ieee802154: hwsim: using right kind of iteration net: hns3: Add vlan filter setting by ethtool command -K net: hns3: Set tx ring' tc info when netdev is up net: hns3: Remove tx ring BD len register in hns3_enet net: hns3: Fix desc num set to default when setting channel net: hns3: Fix for phy link issue when using marvell phy driver net: hns3: Fix for information of phydev lost problem when down/up net: hns3: Fix for command format parsing error in hclge_is_all_function_id_zero net: hns3: Add support for serdes loopback selftest bnxt_en: take coredump_record structure off stack ...
| * Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller2018-08-133-2/+160
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Daniel Borkmann says: ==================== pull-request: bpf-next 2018-08-13 The following pull-request contains BPF updates for your *net-next* tree. The main changes are: 1) Add driver XDP support for veth. This can be used in conjunction with redirect of another XDP program e.g. sitting on NIC so the xdp_frame can be forwarded to the peer veth directly without modification, from Toshiaki. 2) Add a new BPF map type REUSEPORT_SOCKARRAY and prog type SK_REUSEPORT in order to provide more control and visibility on where a SO_REUSEPORT sk should be located, and the latter enables to directly select a sk from the bpf map. This also enables map-in-map for application migration use cases, from Martin. 3) Add a new BPF helper bpf_skb_ancestor_cgroup_id() that returns the id of cgroup v2 that is the ancestor of the cgroup associated with the skb at the ancestor_level, from Andrey. 4) Implement BPF fs map pretty-print support based on BTF data for regular hash table and LRU map, from Yonghong. 5) Decouple the ability to attach BTF for a map from the key and value pretty-printer in BPF fs, and enable further support of BTF for maps for percpu and LPM trie, from Daniel. 6) Implement a better BPF sample of using XDP's CPU redirect feature for load balancing SKB processing to remote CPU. The sample implements the same XDP load balancing as Suricata does which is symmetric hash based on IP and L4 protocol, from Jesper. 7) Revert adding NULL pointer check with WARN_ON_ONCE() in __xdp_return()'s critical path as it is ensured that the allocator is present, from Björn. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * samples/bpf: xdp_redirect_cpu load balance like SuricataJesper Dangaard Brouer2018-08-102-2/+105
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This implement XDP CPU redirection load-balancing across available CPUs, based on the hashing IP-pairs + L4-protocol. This equivalent to xdp-cpu-redirect feature in Suricata, which is inspired by the Suricata 'ippair' hashing code. An important property is that the hashing is flow symmetric, meaning that if the source and destination gets swapped then the selected CPU will remain the same. This is helps locality by placing both directions of a flows on the same CPU, in a forwarding/routing scenario. The hashing INITVAL (15485863 the 10^6th prime number) was fairly arbitrary choosen, but experiments with kernel tree pktgen scripts (pktgen_sample04_many_flows.sh +pktgen_sample05_flow_per_thread.sh) showed this improved the distribution. This patch also change the default loaded XDP program to be this load-balancer. As based on different user feedback, this seems to be the expected behavior of the sample xdp_redirect_cpu. Link: https://github.com/OISF/suricata/commit/796ec08dd7a63 Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| | * samples/bpf: add Paul Hsieh's (LGPL 2.1) hash function SuperFastHashJesper Dangaard Brouer2018-08-101-0/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | Adjusted function call API to take an initval. This allow the API user to set the initial value, as a seed. This could also be used for inputting the previous hash. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| * | Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/netDavid S. Miller2018-08-112-3/+3
| |\ \ | | |/ | |/|
| * | samples/bpf: extend test_cgrp2_attach2 test to use cgroup storageRoman Gushchin2018-08-031-1/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The test_cgrp2_attach test covers bpf cgroup attachment code well, so let's re-use it for testing allocation/releasing of cgroup storage. The extension is pretty straightforward: the bpf program will use the cgroup storage to save the number of transmitted bytes. Expected output: $ ./test_cgrp2_attach2 Attached DROP prog. This ping in cgroup /foo should fail... ping: sendmsg: Operation not permitted Attached DROP prog. This ping in cgroup /foo/bar should fail... ping: sendmsg: Operation not permitted Attached PASS prog. This ping in cgroup /foo/bar should pass... Detached PASS from /foo/bar while DROP is attached to /foo. This ping in cgroup /foo/bar should fail... ping: sendmsg: Operation not permitted Attached PASS from /foo/bar and detached DROP from /foo. This ping in cgroup /foo/bar should pass... ### override:PASS ### multi:PASS Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| * | samples: bpf: convert xdpsock_user.c to libbpfJakub Kicinski2018-07-272-10/+30
| | | | | | | | | | | | | | | | | | | | | | | | Convert xdpsock_user.c to use libbpf instead of bpf_load.o. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| * | samples: bpf: convert xdp_fwd_user.c to libbpfJakub Kicinski2018-07-272-12/+24
| | | | | | | | | | | | | | | | | | | | | | | | Convert xdp_fwd_user.c to use libbpf instead of bpf_load.o. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| * | samples/bpf: Add BTF build flags to MakefileTaeung Song2018-07-271-1/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To smoothly test BTF supported binary on samples/bpf, let samples/bpf/Makefile probe llc, pahole and llvm-objcopy for BPF support and use them like tools/testing/selftests/bpf/Makefile changed from the commit c0fa1b6c3efc ("bpf: btf: Add BTF tests"). Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| * | samples/bpf: xdpsock: order memory on AArch64Brian Brooks2018-07-271-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Define u_smp_rmb() and u_smp_wmb() to respective barrier instructions. This ensures the processor will order accesses to queue indices against accesses to queue ring entries. Signed-off-by: Brian Brooks <brian.brooks@linaro.org> Acked-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| * | Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller2018-07-202-1/+4
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Daniel Borkmann says: ==================== pull-request: bpf-next 2018-07-20 The following pull-request contains BPF updates for your *net-next* tree. The main changes are: 1) Add sharing of BPF objects within one ASIC: this allows for reuse of the same program on multiple ports of a device, and therefore gains better code store utilization. On top of that, this now also enables sharing of maps between programs attached to different ports of a device, from Jakub. 2) Cleanup in libbpf and bpftool's Makefile to reduce unneeded feature detections and unused variable exports, also from Jakub. 3) First batch of RCU annotation fixes in prog array handling, i.e. there are several __rcu markers which are not correct as well as some of the RCU handling, from Roman. 4) Two fixes in BPF sample files related to checking of the prog_cnt upper limit from sample loader, from Dan. 5) Minor cleanup in sockmap to remove a set but not used variable, from Colin. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | samples/bpf: test_cgrp2_sock2: fix an off by oneDan Carpenter2018-07-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "prog_cnt" is the number of elements which are filled out in prog_fd[] so the test should be >= instead of >. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| | * | samples: bpf: ensure that we don't load over MAX_PROGS programsDan Carpenter2018-07-161-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I can't see that we check prog_cnt to ensure it doesn't go over MAX_PROGS. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| * | | Merge ra.kernel.org:/pub/scm/linux/kernel/git/torvalds/linuxDavid S. Miller2018-07-206-16/+93
| |\ \ \ | | |/ / | |/| | | | | | | | | | | | | | | | | | All conflicts were trivial overlapping changes, so reasonably easy to resolve. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | samples/bpf: xdp_redirect_cpu handle parsing of double VLAN tagged packetsJesper Dangaard Brouer2018-07-141-1/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | People noticed that the code match on IEEE 802.1ad (ETH_P_8021AD) ethertype, and this implies Q-in-Q or double tagged VLANs. Thus, we better parse the next VLAN header too. It is even marked as a TODO. This is relevant for real world use-cases, as XDP cpumap redirect can be used when the NIC RSS hashing is broken. E.g. the ixgbe driver HW cannot handle double tagged VLAN packets, and places everything into a single RX queue. Using cpumap redirect, users can redistribute traffic across CPUs to solve this, which is faster than the network stacks RPS solution. It is left as an exerise how to distribute the packets across CPUs. It would be convenient to use the RX hash, but that is not _yet_ exposed to XDP programs. For now, users can code their own hash, as I've demonstrated in the Suricata code (where Q-in-Q is handled correctly). Reported-by: Florian Maury <florian.maury-cv@x-cli.eu> Reported-by: Marek Majkowski <marek@cloudflare.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| * | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller2018-07-045-6/+321
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Daniel Borkmann says: ==================== pull-request: bpf-next 2018-07-03 The following pull-request contains BPF updates for your *net-next* tree. The main changes are: 1) Various improvements to bpftool and libbpf, that is, bpftool build speed improvements, missing BPF program types added for detection by section name, ability to load programs from '.text' section is made to work again, and better bash completion handling, from Jakub. 2) Improvements to nfp JIT's map read handling which allows for optimizing memcpy from map to packet, from Jiong. 3) New BPF sample is added which demonstrates XDP in combination with bpf_perf_event_output() helper to sample packets on all CPUs, from Toke. 4) Add a new BPF kselftest case for tracking connect(2) BPF hooks infrastructure in combination with TFO, from Andrey. 5) Extend the XDP/BPF xdp_rxq_info sample code with a cmdline option to read payload from packet data in order to use it for benchmarking. Also for '--action XDP_TX' option implement swapping of MAC addresses to avoid drops on some hardware seen during testing, from Jesper. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | samples/bpf: xdp_rxq_info action XDP_TX must adjust MAC-addrsJesper Dangaard Brouer2018-06-282-1/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | XDP_TX requires also changing the MAC-addrs, else some hardware may drop the TX packet before reaching the wire. This was observed with driver mlx5. If xdp_rxq_info select --action XDP_TX the swapmac functionality is activated. It is also possible to manually enable via cmdline option --swapmac. This is practical if wanting to measure the overhead of writing/updating payload for other action types. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| | * | | samples/bpf: extend xdp_rxq_info to read packet payloadJesper Dangaard Brouer2018-06-282-6/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a cost associated with reading the packet data payload that this test ignored. Add option --read to allow enabling reading part of the payload. This sample/tool helps us analyse an issue observed with a NIC mlx5 (ConnectX-5 Ex) and an Intel(R) Xeon(R) CPU E5-1650 v4. With no_touch of data: Running XDP on dev:mlx5p1 (ifindex:8) action:XDP_DROP options:no_touch XDP stats CPU pps issue-pps XDP-RX CPU 0 14,465,157 0 XDP-RX CPU 1 14,464,728 0 XDP-RX CPU 2 14,465,283 0 XDP-RX CPU 3 14,465,282 0 XDP-RX CPU 4 14,464,159 0 XDP-RX CPU 5 14,465,379 0 XDP-RX CPU total 86,789,992 When not touching data, we observe that the CPUs have idle cycles. When reading data the CPUs are 100% busy in softirq. With reading data: Running XDP on dev:mlx5p1 (ifindex:8) action:XDP_DROP options:read XDP stats CPU pps issue-pps XDP-RX CPU 0 9,620,639 0 XDP-RX CPU 1 9,489,843 0 XDP-RX CPU 2 9,407,854 0 XDP-RX CPU 3 9,422,289 0 XDP-RX CPU 4 9,321,959 0 XDP-RX CPU 5 9,395,242 0 XDP-RX CPU total 56,657,828 The effect seen above is a result of cache-misses occuring when more RXQs are being used. Based on perf-event observations, our conclusion is that the CPUs DDIO (Direct Data I/O) choose to deliver packet into main memory, instead of L3-cache. We also found, that this can be mitigated by either using less RXQs or by reducing NICs the RX-ring size. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| | * | | samples/bpf: Add xdp_sample_pkts exampleToke Høiland-Jørgensen2018-06-273-0/+239
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add an example program showing how to sample packets from XDP using the perf event buffer. The example userspace program just prints the ethernet header for every packet sampled. Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* | | | | Merge tag 'kbuild-v4.19' of ↵Linus Torvalds2018-08-151-11/+11
|\ \ \ \ \ | |_|_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - verify depmod is installed before modules_install - support build salt in case build ids must be unique between builds - allow users to specify additional host compiler flags via HOST*FLAGS, and rename internal variables to KBUILD_HOST*FLAGS - update buildtar script to drop vax support, add arm64 support - update builddeb script for better debarch support - document the pit-fall of if_changed usage - fix parallel build of UML with O= option - make 'samples' target depend on headers_install to fix build errors - remove deprecated host-progs variable - add a new coccinelle script for refcount_t vs atomic_t check - improve double-test coccinelle script - misc cleanups and fixes * tag 'kbuild-v4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (41 commits) coccicheck: return proper error code on fail Coccinelle: doubletest: reduce side effect false positives kbuild: remove deprecated host-progs variable kbuild: make samples really depend on headers_install um: clean up archheaders recipe kbuild: add %asm-generic to no-dot-config-targets um: fix parallel building with O= option scripts: Add Python 3 support to tracing/draw_functrace.py builddeb: Add automatic support for sh{3,4}{,eb} architectures builddeb: Add automatic support for riscv* architectures builddeb: Add automatic support for m68k architecture builddeb: Add automatic support for or1k architecture builddeb: Add automatic support for sparc64 architecture builddeb: Add automatic support for mips{,64}r6{,el} architectures builddeb: Add automatic support for mips64el architecture builddeb: Add automatic support for ppc64 and powerpcspe architectures builddeb: Introduce functions to simplify kconfig tests in set_debarch builddeb: Drop check for 32-bit s390 builddeb: Change architecture detection fallback to use dpkg-architecture builddeb: Skip architecture detection when KBUILD_DEBARCH is set ...
| * | | | kbuild: Rename HOST_LOADLIBES to KBUILD_HOSTLDLIBSLaura Abbott2018-07-181-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In preparation for enabling command line LDLIBS, re-name HOST_LOADLIBES to KBUILD_HOSTLDLIBS as the internal use only flags. Also rename existing usage to HOSTLDLIBS for consistency. This should not have any visible effects. Signed-off-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
| * | | | kbuild: Rename HOSTCFLAGS to KBUILD_HOSTCFLAGSLaura Abbott2018-07-181-5/+5
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In preparation for enabling command line CFLAGS, re-name HOSTCFLAGS to KBUILD_HOSTCFLAGS as the internal use only flags. This should not have any visible effects. Signed-off-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
* | | / samples/bpf: xdp_redirect_cpu adjustment to reproduce teardown race easierJesper Dangaard Brouer2018-08-092-3/+3
| |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The teardown race in cpumap is really hard to reproduce. These changes makes it easier to reproduce, for QA. The --stress-mode now have a case of a very small queue size of 8, that helps to trigger teardown flush to encounter a full queue, which results in calling xdp_return_frame API, in a non-NAPI protect context. Also increase MAX_CPUS, as my QA department have larger machines than me. Tested-by: Jean-Tsung Hsiao <jhsiao@redhat.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* | | samples/bpf: Fix tc and ip paths in xdp2skb_meta.shTaeung Song2018-07-101-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The below path error can occur: # ./xdp2skb_meta.sh --dev eth0 --list ./xdp2skb_meta.sh: line 61: /usr/sbin/tc: No such file or directory So just use command names instead of absolute paths of tc and ip. In addition, it allow callers to redefine $TC and $IP paths Fixes: 36e04a2d78d9 ("samples/bpf: xdp2skb_meta shows transferring info from XDP to SKB") Reviewed-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* | | samples/bpf: add .gitignore fileTaeung Song2018-07-051-0/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For untracked executables of samples/bpf, add this. Untracked files: (use "git add <file>..." to include in what will be committed) samples/bpf/cpustat samples/bpf/fds_example samples/bpf/lathist samples/bpf/load_sock_ops ... Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* | | samples/bpf: Check the error of write() and read()Taeung Song2018-07-051-4/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | test_task_rename() and test_urandom_read() can be failed during write() and read(), So check the result of them. Reviewed-by: David Laight <David.Laight@ACULAB.COM> Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* | | samples/bpf: Check the result of system()Taeung Song2018-07-051-3/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To avoid the below build warning message, use new generate_load() checking the return value. ignoring return value of ‘system’, declared with attribute warn_unused_result And it also refactors the duplicate code of both test_perf_event_all_cpu() and test_perf_event_task() Cc: Teng Qin <qinteng@fb.com> Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* | | samples/bpf: add missing <linux/if_vlan.h>Taeung Song2018-07-051-5/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes build error regarding redefinition: CLANG-bpf samples/bpf/parse_varlen.o samples/bpf/parse_varlen.c:111:8: error: redefinition of 'vlan_hdr' struct vlan_hdr { ^ ./include/linux/if_vlan.h:38:8: note: previous definition is here So remove duplicate 'struct vlan_hdr' in sample code and include if_vlan.h Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* | | samples/bpf: deal with EBUSY return code from sendmsg in xdpsock sampleMagnus Karlsson2018-07-021-1/+1
|/ / | | | | | | | | | | | | | | | | | | | | Sendmsg in the SKB path of AF_XDP can now return EBUSY when a packet was discarded and completed by the driver. Just ignore this message in the sample application. Fixes: b4b8faa1ded7 ("samples/bpf: sample application and documentation for AF_XDP sockets") Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Reported-by: Pavel Odintsov <pavel@fastnetmon.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
* / bpf: Change bpf_fib_lookup to return lookup statusDavid Ahern2018-06-291-4/+4
|/ | | | | | | | | | | | | | | | | For ACLs implemented using either FIB rules or FIB entries, the BPF program needs the FIB lookup status to be able to drop the packet. Since the bpf_fib_lookup API has not reached a released kernel yet, change the return code to contain an encoding of the FIB lookup result and return the nexthop device index in the params struct. In addition, inform the BPF program of any post FIB lookup reason as to why the packet needs to go up the stack. The fib result for unicast routes must have an egress device, so remove the check that it is non-NULL. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* samples/bpf: xdpsock: use skb Tx path for XDP_SKBBjörn Töpel2018-06-051-0/+5
| | | | | | | Make sure that XDP_SKB also uses the skb Tx path. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* samples/bpf: minor *_nb_free performance fixMagnus Karlsson2018-06-041-3/+5
| | | | | Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* samples/bpf: adapted to new uapiBjörn Töpel2018-06-041-48/+36
| | | | | | | | Here, the xdpsock sample application is adjusted to the new descriptor format. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* bpf: flowlabel in bpf_fib_lookup should be flowinfoDavid Ahern2018-06-031-1/+1
| | | | | | | | | | As Michal noted the flow struct takes both the flow label and priority. Update the bpf_fib_lookup API to note that it is flowinfo and not just the flow label. Cc: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller2018-05-246-59/+599
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Alexei Starovoitov says: ==================== pull-request: bpf-next 2018-05-24 The following pull-request contains BPF updates for your *net-next* tree. The main changes are: 1) Björn Töpel cleans up AF_XDP (removes rebind, explicit cache alignment from uapi, etc). 2) David Ahern adds mtu checks to bpf_ipv{4,6}_fib_lookup() helpers. 3) Jesper Dangaard Brouer adds bulking support to ndo_xdp_xmit. 4) Jiong Wang adds support for indirect and arithmetic shifts to NFP 5) Martin KaFai Lau cleans up BTF uapi and makes the btf_header extensible. 6) Mathieu Xhonneux adds an End.BPF action to seg6local with BPF helpers allowing to edit/grow/shrink a SRH and apply on a packet generic SRv6 actions. 7) Sandipan Das adds support for bpf2bpf function calls in ppc64 JIT. 8) Yonghong Song adds BPF_TASK_FD_QUERY command for introspection of tracing events. 9) other misc fixes from Gustavo A. R. Silva, Sirio Balmelli, John Fastabend, and Magnus Karlsson ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * samples/bpf: xdp_monitor use err code from tracepoint xdp:xdp_devmap_xmitJesper Dangaard Brouer2018-05-242-5/+40
| | | | | | | | | | | | | | | | | | | | Update xdp_monitor to use the recently added err code introduced in tracepoint xdp:xdp_devmap_xmit, to show if the drop count is caused by some driver general delivery problem. Other kind of drops will likely just be more normal TX space issues. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| * samples/bpf: xdp_monitor use tracepoint xdp:xdp_devmap_xmitJesper Dangaard Brouer2018-05-242-1/+82
| | | | | | | | | | | | | | | | The xdp_monitor sample/tool is updated to use the new tracepoint xdp:xdp_devmap_xmit the previous patch just introduced. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| * samples/bpf: add a samples/bpf test for BPF_TASK_FD_QUERYYonghong Song2018-05-243-0/+405
| | | | | | | | | | | | | | This is mostly to test kprobe/uprobe which needs kernel headers. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>