summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* net/mlx5e: TLS, add software statisticsBoris Pismenny2018-07-163-1/+17
| | | | | | | | This patch adds software statistics for TLS to count important events. Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/mlx5e: TLS, add Innova TLS rx data pathBoris Pismenny2018-07-163-3/+118
| | | | | | | | | | | | | | | Implement the TLS rx offload data path according to the requirements of the TLS generic NIC offload infrastructure. Special metadata ethertype is used to pass information to the hardware. When hardware loses synchronization a special resync request metadata message is used to request resync. Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/mlx5e: TLS, add innova rx supportBoris Pismenny2018-07-162-15/+46
| | | | | | | | | | Add the mlx5 implementation of the TLS Rx routines to add/del TLS contexts, also add the tls_dev_resync_rx routine to work with the TLS inline Rx crypto offload infrastructure. Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/mlx5: Accel, add TLS rx offload routinesBoris Pismenny2018-07-165-46/+135
| | | | | | | | | | | | | In Innova TLS, TLS contexts are added or deleted via a command message over the SBU connection. The HW then sends a response message over the same connection. Complete the implementation for Innova TLS (FPGA-based) hardware by adding support for rx inline crypto offload. Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/mlx5e: TLS, refactor variable namesBoris Pismenny2018-07-163-8/+8
| | | | | | | | | | For symmetry, we rename mlx5e_tls_offload_context to mlx5e_tls_offload_context_tx before we add mlx5e_tls_offload_context_rx. Signed-off-by: Boris Pismenny <borisp@mellanox.com> Reviewed-by: Aviad Yehezkel <aviadye@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tls: Fix zerocopy_from_iter iov handlingBoris Pismenny2018-07-161-3/+5
| | | | | | | | | | zerocopy_from_iter iterates over the message, but it doesn't revert the updates made by the iov iteration. This patch fixes it. Now, the iov can be used after calling zerocopy_from_iter. Fixes: 3c4d75591 ("tls: kernel TLS support") Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tls: Add rx inline crypto offloadBoris Pismenny2018-07-165-43/+355
| | | | | | | | | | | | | | | | | | | | | | | | | | | This patch completes the generic infrastructure to offload TLS crypto to a network device. It enables the kernel to skip decryption and authentication of some skbs marked as decrypted by the NIC. In the fast path, all packets received are decrypted by the NIC and the performance is comparable to plain TCP. This infrastructure doesn't require a TCP offload engine. Instead, the NIC only decrypts packets that contain the expected TCP sequence number. Out-Of-Order TCP packets are provided unmodified. As a result, at the worst case a received TLS record consists of both plaintext and ciphertext packets. These partially decrypted records must be reencrypted, only to be decrypted. The notable differences between SW KTLS Rx and this offload are as follows: 1. Partial decryption - Software must handle the case of a TLS record that was only partially decrypted by HW. This can happen due to packet reordering. 2. Resynchronization - tls_read_size calls the device driver to resynchronize HW after HW lost track of TLS record framing in the TCP stream. Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tls: Fill software context without allocationBoris Pismenny2018-07-161-12/+22
| | | | | | | | | | This patch allows tls_set_sw_offload to fill the context in case it was already allocated previously. We will use it in TLS_DEVICE to fill the RX software context. Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tls: Split tls_sw_release_resources_rxBoris Pismenny2018-07-162-1/+10
| | | | | | | | | | | | This patch splits tls_sw_release_resources_rx into two functions one which releases all inner software tls structures and another that also frees the containing structure. In TLS_DEVICE we will need to release the software structures without freeeing the containing structure, which contains other information. Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tls: Split decrypt_skb to two functionsBoris Pismenny2018-07-162-18/+28
| | | | | | | | | | | Previously, decrypt_skb also updated the TLS context. Now, decrypt_skb only decrypts the payload using the current context, while decrypt_skb_update also updates the state. Later, in the tls_device Rx flow, we will use decrypt_skb directly. Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tls: Refactor tls_offload variable namesBoris Pismenny2018-07-164-28/+27
| | | | | | | | For symmetry, we rename tls_offload_context to tls_offload_context_tx before we add tls_offload_context_rx. Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: Don't coalesce decrypted and encrypted SKBsBoris Pismenny2018-07-162-0/+15
| | | | | | | | | Prevent coalescing of decrypted and encrypted SKBs in GRO and TCP layer. Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Add TLS rx resync NDOBoris Pismenny2018-07-161-0/+2
| | | | | | | Add new netdev tls op for resynchronizing HW tls context Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Add TLS RX offload featureIlya Lesokhin2018-07-162-0/+3
| | | | | | | | This patch adds a netdev feature to configure TLS RX inline crypto offload. Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com> Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Add decrypted field to skbBoris Pismenny2018-07-162-1/+12
| | | | | | | | | | The decrypted bit is propogated to cloned/copied skbs. This will be used later by the inline crypto receive side offload of tls. Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'mvpp2-add-debugfs-interface'David S. Miller2018-07-168-36/+830
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Maxime Chevallier says: ==================== net: mvpp2: add debugfs interface The PPv2 Header Parser and Classifier are not straightforward to debug, having easy access to some of the many lookup tables configuration is helpful during development and debug. This series adds a basic debugfs interface, allowing to read data from the Header Parser and some of the Classifier tables. For now, the interface is read-only, and contains only some basic info. This was actually used during RSS development, and might be useful to troubleshoot some issues we might find. The first patch of the series converts the mvpp2 files to SPDX, which eases adding the new debugfs dedicated file. The second patch adds the interface, and exposes basic Header Parser data. The 3rd patch adds a hit counter for the Header Parser TCAM. The 4th patch exposes classifier info. The 5th patch adds some hit counters for some of the classifier engines. Changes since V1: - Rebased on the lastest net-next - Made cls_flow_get non static so that it can be used in mvpp2_debugfs ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: mvpp2: debugfs: add classifier hit countersMaxime Chevallier2018-07-164-0/+84
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The classification operations that are used for RSS make use of several lookup tables. Having hit counters for these tables is really helpful to determine what flows were matched by ingress traffic, and see the path of packets among all the classifier tables. This commit adds hit counters for the 3 tables used at the moment : - The decoding table (also called lookup_id table), that links flows identified by the Header Parser to the flow table. There's one entry per flow, located at : .../mvpp2/<controller>/flows/XX/dec_hits Note that there are 21 flows in the decoding table, whereas there are 52 flows in the Header Parser. That's because there are several kind of traffic that will match a given flow. Reading the hit counter from one sub-flow will clear all hit counter that have the same flow_id. This also applies to the flow_hits. - The flow table, that contains all the different lookups to be performed by the classifier for each packet of a given flow. The match is done on the first entry of the flow sequence. - The C2 engine entries, that are used to assign the default rx queue, and enable or disable RSS for a given port. There's one entry per flow, located at: .../mvpp2/<controller>/flows/XX/flow_hits There is one C2 entry per port, so the c2 hit counter is located at : .../mvpp2/<controller>/ethX/c2_hits All hit counter values are 16-bits clear-on-read values. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: mvpp2: debugfs: add entries for classifier flowsMaxime Chevallier2018-07-164-5/+329
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The classifier configuration for RSS is quite complex, with several lookup tables being used. This commit adds useful info in debugfs to see how the different tables are configured : Added 2 new entries in the per-port directory : - .../eth0/default_rxq : The default rx queue on that port - .../eth0/rss_enable : Indicates if RSS is enabled in the C2 entry Added the 'flows' directory : It contains one entry per sub-flow. a 'sub-flow' is a unique path from Header Parser to the flow table. Multiple sub-flows can point to the same 'flow' (each flow has an id from 8 to 29, which is its index in the Lookup Id table) : - .../flows/00/... /01/... ... /51/id : The flow id. There are 21 unique flows. There's one flow per combination of the following parameters : - L4 protocol (TCP, UDP, none) - L3 protocol (IPv4, IPv6) - L3 parameters (Fragmented or not) - L2 parameters (Vlan tag presence or not) .../type : The flow type. This is an even higher level flow, that we manipulate with ethtool. It can be : "udp4" "tcp4" "udp6" "tcp6" "ipv4" "ipv6" "other". .../eth0/... .../eth1/engine : The hash generation engine used for this flow on the given port .../hash_opts : The hash generation options indicating on what data we base the hash (vlan tag, src IP, src port, etc.) Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: mvpp2: debugfs: add hit counter stats for Header Parser entriesMaxime Chevallier2018-07-164-0/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | One helpful feature to help debug the Header Parser TCAM filter in PPv2 is to be able to see if the entries did match something when a packet comes in. This can be done by using the built-in hit counter for TCAM entries. This commit implements reading the counter, and exposing its value on debugfs for each filter entry. The counter is a 16-bits clear-on-read value, located at: .../mvpp2/<controller>/parser/XXX/hits Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: mvpp2: add a debugfs interface for the Header ParserMaxime Chevallier2018-07-166-7/+371
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Marvell PPv2 Packer Header Parser has a TCAM based filter, that is not trivial to configure and debug. Being able to dump TCAM entries from userspace can be really helpful to help development of new features and debug existing ones. This commit adds a basic debugfs interface for the PPv2 driver, focusing on TCAM related features. <mnt>/mvpp2/ --- f2000000.ethernet \- f4000000.ethernet --- parser --- 000 ... | \- 001 | \- ... | \- 255 --- ai | \- header_data | \- lookup_id | \- sram | \- valid \- eth1 ... \- eth2 --- mac_filter \- parser_entries \- vid_filter There's one directory per PPv2 instance, named after pdev->name to make sure names are uniques. In each of these directories, there's : - one directory per interface on the controller, each containing : - "mac_filter", which lists all filtered addresses for this port (based on TCAM, not on the kernel's uc / mc lists) - "parser_entries", which lists the indices of all valid TCAM entries that have this port in their port map - "vid_filter", which lists the vids allowed on this port, based on TCAM - one "parser" directory (the parser is common to all ports), containing : - one directory per TCAM entry (256 of them, from 0 to 255), each containing : - "ai" : Contains the 1 byte Additional Info field from TCAM, and - "header_data" : Contains the 8 bytes Header Data extracted from the packet - "lookup_id" : Contains the 4 bits LU_ID - "sram" : contains the raw SRAM data, which is the result of the TCAM lookup. This readonly at the moment. - "valid" : Indicates if the entry is valid of not. All entries are read-only, and everything is output in hex form. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: mvpp2: switch to SPDX identifiersAntoine Tenart2018-07-166-24/+6
|/ | | | | | | | | Use the appropriate SPDX license identifiers and drop the license text. This patch is only cosmetic. Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller2018-07-1467-929/+3119
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Daniel Borkmann says: ==================== pull-request: bpf-next 2018-07-15 The following pull-request contains BPF updates for your *net-next* tree. The main changes are: 1) Various different arm32 JIT improvements in order to optimize code emission and make the JIT code itself more robust, from Russell. 2) Support simultaneous driver and offloaded XDP in order to allow for advanced use-cases where some work is offloaded to the NIC and some to the host. Also add ability for bpftool to load programs and maps beyond just the cgroup case, from Jakub. 3) Add BPF JIT support in nfp for multiplication as well as division. For the latter in particular, it uses the reciprocal algorithm to emulate it, from Jiong. 4) Add BTF pretty print functionality to bpftool in plain and JSON output format, from Okash. 5) Add build and installation to the BPF helper man page into bpftool, from Quentin. 6) Add a TCP BPF callback for listening sockets which is triggered right after the socket transitions to TCP_LISTEN state, from Andrey. 7) Add a new cgroup tree command to bpftool which iterates over the whole cgroup tree and prints all attached programs, from Roman. 8) Improve xdp_redirect_cpu sample to support parsing of double VLAN tagged packets, from Jesper. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge branch 'bpf-tcp-listen-cb'Daniel Borkmann2018-07-159-69/+88
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Andrey Ignatov says: ==================== This patchset adds TCP-BPF callback for listening sockets. Patch 0001 provides more details and is the main patch in the set. Patch 0006 adds selftest for the new callback. Other patches are bug fixes and improvements in TCP-BPF selftest to make it easier to extend in 0006. ==================== Acked-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| | * selftests/bpf: Test case for BPF_SOCK_OPS_TCP_LISTEN_CBAndrey Ignatov2018-07-153-6/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Cover new TCP-BPF callback in test_tcpbpf: when listen() is called on socket, set BPF_SOCK_OPS_STATE_CB_FLAG so that BPF_SOCK_OPS_STATE_CB callback can be called on future state transition, and when such a transition happens (TCP_LISTEN -> TCP_CLOSE), track it in the map and verify it in user space later. Signed-off-by: Andrey Ignatov <rdna@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| | * selftests/bpf: Better verification in test_tcpbpfAndrey Ignatov2018-07-151-25/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reduce amount of copy/paste for debug info when result is verified in the test and keep that info together with values being checked so that they won't get out of sync. It also improves debug experience: instead of checking manually what doesn't match in debug output for all fields, only unexpected field is printed. Signed-off-by: Andrey Ignatov <rdna@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| | * selftests/bpf: Switch test_tcpbpf_user to cgroup_helpersAndrey Ignatov2018-07-152-34/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Switch to cgroup_helpers to simplify the code and fix cgroup cleanup: before cgroup was not cleaned up after the test. It also removes SYSTEM macro, that only printed error, but didn't terminate the test. Signed-off-by: Andrey Ignatov <rdna@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| | * selftests/bpf: Fix const'ness in cgroup_helpersAndrey Ignatov2018-07-152-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | Lack of const in cgroup helpers signatures forces to write ugly client code. Fix it. Signed-off-by: Andrey Ignatov <rdna@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| | * bpf: Sync bpf.h to tools/Andrey Ignatov2018-07-151-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | Sync BPF_SOCK_OPS_TCP_LISTEN_CB related UAPI changes to tools/. Signed-off-by: Andrey Ignatov <rdna@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| | * bpf: Add BPF_SOCK_OPS_TCP_LISTEN_CBAndrey Ignatov2018-07-152-0/+4
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add new TCP-BPF callback that is called on listen(2) right after socket transition to TCP_LISTEN state. It fills the gap for listening sockets in TCP-BPF. For example BPF program can set BPF_SOCK_OPS_STATE_CB_FLAG when socket becomes listening and track later transition from TCP_LISTEN to TCP_CLOSE with BPF_SOCK_OPS_STATE_CB callback. Before there was no way to do it with TCP-BPF and other options were much harder to work with. E.g. socket state tracking can be done with tracepoints (either raw or regular) but they can't be attached to cgroup and their lifetime has to be managed separately. Signed-off-by: Andrey Ignatov <rdna@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| * bpf: btf: print map dump and lookup with btf infoOkash Khawaja2018-07-141-16/+201
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch augments the output of bpftool's map dump and map lookup commands to print data along side btf info, if the correspondin btf info is available. The outputs for each of map dump and map lookup commands are augmented in two ways: 1. when neither of -j and -p are supplied, btf-ful map data is printed whose aim is human readability. This means no commitments for json- or backward- compatibility. 2. when either -j or -p are supplied, a new json object named "formatted" is added for each key-value pair. This object contains the same data as the key-value pair, but with btf info. "formatted" object promises json- and backward- compatibility. Below is a sample output. $ bpftool map dump -p id 8 [{ "key": ["0x0f","0x00","0x00","0x00" ], "value": ["0x03", "0x00", "0x00", "0x00", ... ], "formatted": { "key": 15, "value": { "int_field": 3, ... } } } ] This patch calls btf_dumper introduced in previous patch to accomplish the above. Indeed, btf-ful info is only displayed if btf data for the given map is available. Otherwise existing output is displayed as-is. Signed-off-by: Okash Khawaja <osk@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| * bpf: btf: add btf print functionalityOkash Khawaja2018-07-142-0/+266
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This consumes functionality exported in the previous patch. It does the main job of printing with BTF data. This is used in the following patch to provide a more readable output of a map's dump. It relies on json_writer to do json printing. Below is sample output where map keys are ints and values are of type struct A: typedef int int_type; enum E { E0, E1, }; struct B { int x; int y; }; struct A { int m; unsigned long long n; char o; int p[8]; int q[4][8]; enum E r; void *s; struct B t; const int u; int_type v; unsigned int w1: 3; unsigned int w2: 3; }; $ sudo bpftool map dump id 14 [{ "key": 0, "value": { "m": 1, "n": 2, "o": "c", "p": [15,16,17,18,15,16,17,18 ], "q": [[25,26,27,28,25,26,27,28 ],[35,36,37,38,35,36,37,38 ],[45,46,47,48,45,46,47,48 ],[55,56,57,58,55,56,57,58 ] ], "r": 1, "s": 0x7ffd80531cf8, "t": { "x": 5, "y": 10 }, "u": 100, "v": 20, "w1": 0x7, "w2": 0x3 } } ] This patch uses json's {} and [] to imply struct/union and array. More explicit information can be added later. For example, a command line option can be introduced to print whether a key or value is struct or union, name of a struct etc. This will however come at the expense of duplicating info when, for example, printing an array of structs. enums are printed as ints without their names. Signed-off-by: Okash Khawaja <osk@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| * bpf: btf: export btf types and name by offset from libOkash Khawaja2018-07-142-20/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces btf__resolve_type() function and exports two existing functions from libbpf. btf__resolve_type follows modifier types like const and typedef until it hits a type which actually takes up memory, and then returns it. This function follows similar pattern to btf__resolve_size but instead of computing size, it just returns the type. These functions will be used in the followig patch which parses information inside array of `struct btf_type *`. btf_name_by_offset is used for printing variable names. Signed-off-by: Okash Khawaja <osk@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| * tools: include reallocarray feature test in FEATURE_TESTS_BASICJakub Kicinski2018-07-141-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | perf propagates its feature check results to libbpf. This means features for which perf probes must be a superset of libbpf's required features. perf depends on FEATURE_TESTS_BASIC for its list of features. commit 531b014e7a2f ("tools: bpf: make use of reallocarray") added reallocarray use to libbpf, make perf also perform the reallocarray feature check. Fixes: 531b014e7a2f ("tools: bpf: make use of reallocarray") Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| * samples/bpf: xdp_redirect_cpu handle parsing of double VLAN tagged packetsJesper Dangaard Brouer2018-07-141-1/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | People noticed that the code match on IEEE 802.1ad (ETH_P_8021AD) ethertype, and this implies Q-in-Q or double tagged VLANs. Thus, we better parse the next VLAN header too. It is even marked as a TODO. This is relevant for real world use-cases, as XDP cpumap redirect can be used when the NIC RSS hashing is broken. E.g. the ixgbe driver HW cannot handle double tagged VLAN packets, and places everything into a single RX queue. Using cpumap redirect, users can redistribute traffic across CPUs to solve this, which is faster than the network stacks RPS solution. It is left as an exerise how to distribute the packets across CPUs. It would be convenient to use the RX hash, but that is not _yet_ exposed to XDP programs. For now, users can code their own hash, as I've demonstrated in the Suricata code (where Q-in-Q is handled correctly). Reported-by: Florian Maury <florian.maury-cv@x-cli.eu> Reported-by: Marek Majkowski <marek@cloudflare.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| * Merge branch 'bpf-xdp-driver-and-hw'Daniel Borkmann2018-07-1323-146/+246
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jakub Kicinski says: ==================== This set is adding support for loading driver and offload XDP at the same time. This enables advanced use cases where some of the work is offloaded to the NIC and some is done by the host. Separate netlink attributes are added for each mode of operation. Driver callbacks for offload are cleaned up a little, including removal of .prog_attached flag. ==================== Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| | * nfp: add support for simultaneous driver and hw XDPJakub Kicinski2018-07-133-40/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Split handling of offloaded and driver programs completely. Since offloaded programs always come with XDP_FLAGS_HW_MODE set in reality there could be no sharing, anyway, programs would only be installed in driver or in hardware. Splitting the handling allows us to install programs in HW and in driver at the same time. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| | * selftests/bpf: add test for multiple programsJakub Kicinski2018-07-131-0/+63
| | | | | | | | | | | | | | | | | | | | | | | | | | | Add tests for having an XDP program attached in the driver and another one attached in HW simultaneously. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| | * netdevsim: add support for simultaneous driver and hw XDPJakub Kicinski2018-07-134-33/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | Allow netdevsim to accept driver and offload attachment of XDP BPF programs at the same time. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| | * xdp: support simultaneous driver and hw XDP attachmentJakub Kicinski2018-07-136-62/+96
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Split the query of HW-attached program from the software one. Introduce new .ndo_bpf command to query HW-attached program. This will allow drivers to install different programs in HW and SW at the same time. Netlink can now also carry multiple programs on dump (in which case mode will be set to XDP_ATTACHED_MULTI and user has to check per-attachment point attributes, IFLA_XDP_PROG_ID will not be present). We reuse IFLA_XDP_PROG_ID skb space for second mode, so rtnl_xdp_size() doesn't need to be updated. Note that the installation side is still not there, since all drivers currently reject installing more than one program at the time. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| | * xdp: factor out common program/flags handling from driversJakub Kicinski2018-07-137-38/+67
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Basic operations drivers perform during xdp setup and query can be moved to helpers in the core. Encapsulate program and flags into a structure and add helpers. Note that the structure is intended as the "main" program information source in the driver. Most drivers will additionally place the program pointer in their fast path or ring structures. The helpers don't have a huge impact now, but they will decrease the code duplication when programs can be installed in HW and driver at the same time. Encapsulating the basic operations in helpers will hopefully also reduce the number of changes to drivers which adopt them. Helpers could really be static inline, but they depend on definition of struct netdev_bpf which means they'd have to be placed in netdevice.h, an already 4500 line header. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| | * xdp: don't make drivers report attachment modeJakub Kicinski2018-07-1315-25/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | prog_attached of struct netdev_bpf should have been superseded by simply setting prog_id long time ago, but we kept it around to allow offloading drivers to communicate attachment mode (drv vs hw). Subsequently drivers were also allowed to report back attachment flags (prog_flags), and since nowadays only programs attached will XDP_FLAGS_HW_MODE can get offloaded, we can tell the attachment mode from the flags driver reports. Remove prog_attached member. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| | * xdp: add per mode attributes for attached programsJakub Kicinski2018-07-132-4/+29
| |/ | | | | | | | | | | | | | | | | | | | | In preparation for support of simultaneous driver and hardware XDP support add per-mode attributes. The catch-all IFLA_XDP_PROG_ID will still be reported, but user space can now also access the program ID in a new IFLA_XDP_<mode>_PROG_ID attribute. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| * Merge branch 'bpf-arm-jit-improvements'Daniel Borkmann2018-07-131-47/+73
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | Russell King says: ==================== Four further jit compiler improves for 32-bit ARM. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| | * ARM: net: bpf: improve 64-bit ALU implementationRussell King2018-07-131-5/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Improbe the 64-bit ALU implementation from: movw r8, #65532 movt r8, #65535 movw r9, #65535 movt r9, #65535 ldr r7, [fp, #-44] adds r7, r7, r8 str r7, [fp, #-44] ldr r7, [fp, #-40] adc r7, r7, r9 str r7, [fp, #-40] to: movw r8, #65532 movt r8, #65535 movw r9, #65535 movt r9, #65535 ldrd r6, [fp, #-44] adds r6, r6, r8 adc r7, r7, r9 strd r6, [fp, #-44] Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| | * ARM: net: bpf: improve 64-bit store implementationRussell King2018-07-131-26/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Improve the 64-bit store implementation from: ldr r6, [fp, #-8] str r8, [r6] ldr r6, [fp, #-8] mov r7, #4 add r7, r6, r7 str r9, [r7] to: ldr r6, [fp, #-8] str r8, [r6] str r9, [r6, #4] We leave the store as two separate STR instructions rather than using STRD as the store may not be aligned, and STR can handle misalignment. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| | * ARM: net: bpf: improve 64-bit sign-extended immediate loadRussell King2018-07-131-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Improve the 64-bit sign-extended immediate from: mov r6, #1 str r6, [fp, #-52] ; 0xffffffcc mov r6, #0 str r6, [fp, #-48] ; 0xffffffd0 to: mov r6, #1 mov r7, #0 strd r6, [fp, #-52] ; 0xffffffcc Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| | * ARM: net: bpf: improve 64-bit load immediate implementationRussell King2018-07-131-12/+20
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rather than writing each 32-bit half of the 64-bit immediate value separately when the register is on the stack: movw r6, #45056 ; 0xb000 movt r6, #60979 ; 0xee33 str r6, [fp, #-44] ; 0xffffffd4 mov r6, #0 str r6, [fp, #-40] ; 0xffffffd8 arrange to use the double-word store when available instead: movw r6, #45056 ; 0xb000 movt r6, #60979 ; 0xee33 mov r7, #0 strd r6, [fp, #-44] ; 0xffffffd4 Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| * Merge branch 'bpf-arm-jit-improvements'Daniel Borkmann2018-07-122-481/+543
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Russell King says: ==================== This series improves the ARM BPF JIT compiler by: - enumerating the stack layout rather than using constants that happen to be multiples of four - rejig the BPF "register" accesses to use negative numbers instead of positive, which could be confused with register numbers in the bpf2a32 array. - since we maintain the ARM FP register as a pointer to the top of our scratch space (or, with frame pointers enabled, a valid ARM frame pointer register), we can access our scratch space using FP, which is constant across all BPF programs, including tail-called programs. - use immediate forms of ARM instructions where possible, rather than first loading the immediate into an ARM register. - use load-with-shift instruction rather than seperate shift instruction followed by load - avoid reloading index and array in the tail-call code - use double-word load/store instructions where available Version 2: - Fix ARMv5 test pointed out by Olof - Fix build error found by 0-day (adding an additional patch) ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| | * ARM: net: bpf: use double-word load/stores where availableRussell King2018-07-122-12/+45
| | | | | | | | | | | | | | | | | | | | | | | | Use double-word load and stores where support for this instruction is supported by the CPU architecture. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| | * ARM: net: bpf: always use odd/even register pairRussell King2018-07-121-14/+14
| | | | | | | | | | | | | | | | | | | | | | | | Always use an odd/even register pair for our 64-bit registers, so that we're able to use the double-word load/store instructions in the future. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>