bpf: direct packet access

Extended BPF carried over two instructions from classic BPF to access
packet data: LD_ABS and LD_IND. They're highly optimized in JITs, but
due to their design they have to do a length check for every access.
When BPF is processing 20M packets per second, a single LD_ABS after JIT
consumes 3% of the CPU. Hence the need to optimize it further by
amortizing the cost of 'off < skb_headlen' over multiple packet accesses.

One option is to introduce two new eBPF instructions, LD_ABS_DW and
LD_IND_DW, with usage similar to skb_header_pointer(). The kernel part
for the interpreter and x64 JIT was implemented in [1], but such new
insns behave like the old ld_abs and abort the program with 'return 0'
if the access is beyond linear data. Such hidden control flow is hard to
work around, plus changing JITs and rolling out a new llvm is
inconvenient.

Therefore allow cls_bpf/act_bpf programs to access skb->data directly:

  int bpf_prog(struct __sk_buff *skb)
  {
    struct iphdr *ip;

    if (skb->data + sizeof(struct iphdr) + ETH_HLEN > skb->data_end)
        /* packet too small */
        return 0;

    ip = skb->data + ETH_HLEN;

    /* access IP header fields with direct loads */
    if (ip->version != 4 || ip->saddr == 0x7f000001)
        return 1;
    [...]
  }

This solution avoids introduction of new instructions. llvm stays the
same and all JITs stay the same, but the verifier has to work extra hard
to prove safety of the above program.

For XDP the direct store instructions can be allowed as well.

The skb->data is NET_IP_ALIGNED, so for common cases the verifier can
check the alignment. The complex packet parsers where the packet pointer
is adjusted incrementally cannot be tracked for alignment, so allow byte
access in such cases and misaligned access on architectures that define
efficient_unaligned_access.

[1] https://git.kernel.org/cgit/linux/kernel/git/ast/bpf.git/?h=ld_abs_dw

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
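The "check once, then load directly" pattern from the commit message can be tried outside the kernel. The following is a minimal userspace sketch, not kernel or BPF code: `struct pkt`, `classify()` and `IPV4_MIN_HLEN` are hypothetical names introduced for illustration, and the header fields are read byte by byte instead of through `struct iphdr`. It assumes an untagged Ethernet frame carrying IPv4.

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_HLEN      14 /* Ethernet header length in bytes */
#define IPV4_MIN_HLEN 20 /* IPv4 header without options (assumed stand-in
                          * for sizeof(struct iphdr)) */

/* Hypothetical stand-in for the skb->data / skb->data_end pair. */
struct pkt {
	const uint8_t *data;     /* first byte of linear packet data */
	const uint8_t *data_end; /* one past the last valid byte */
};

/*
 * Mirrors the logic of bpf_prog() in the commit message:
 * returns 0 if the packet is too small or is unremarkable IPv4,
 * returns 1 if it is not IPv4 or its source address is 127.0.0.1.
 */
int classify(const struct pkt *p)
{
	const uint8_t *ip;
	uint32_t saddr;

	/* Single bounds check covering every load below. This is the same
	 * comparison as 'data + len > data_end', written as a length
	 * comparison so the pointer arithmetic stays well-defined in ISO C. */
	if ((size_t)(p->data_end - p->data) < ETH_HLEN + IPV4_MIN_HLEN)
		return 0; /* packet too small */

	ip = p->data + ETH_HLEN;

	/* Direct loads: no per-access length check is needed any more. */
	saddr = ((uint32_t)ip[12] << 24) | ((uint32_t)ip[13] << 16) |
		((uint32_t)ip[14] << 8)  |  (uint32_t)ip[15];

	if ((ip[0] >> 4) != 4 || saddr == 0x7f000001u)
		return 1;

	return 0;
}
```

In the real BPF program the verifier, rather than the programmer's discipline, enforces that every direct load falls inside the range proven by the comparison against data_end.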
author: Alexei Starovoitov <ast@fb.com> 2016-05-05 19:49:10 -0700
committer: David S. Miller <davem@davemloft.net> 2016-05-06 16:01:53 -0400
commit: 969bf05eb3cedd5a8d4b7c346a85c2ede87a6d6d (patch)
tree: 000a84e285d11b22cc72dead3074c50b325f195c /kernel/bpf/core.c
parent: 1a0dc1ac1d2928e25739ee82d7e04423b01da563 (diff)
download: linux-stable-969bf05eb3cedd5a8d4b7c346a85c2ede87a6d6d.tar.gz
linux-stable-969bf05eb3cedd5a8d4b7c346a85c2ede87a6d6d.tar.bz2
linux-stable-969bf05eb3cedd5a8d4b7c346a85c2ede87a6d6d.zip
1 files changed, 5 insertions, 0 deletions
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index e4248fe79513..d781b077431f 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -794,6 +794,11 @@ void __weak bpf_int_jit_compile(struct bpf_prog *prog)
 {
 }
 
+bool __weak bpf_helper_changes_skb_data(void *func)
+{
+	return false;
+}
+
 /* To execute LD_ABS/LD_IND instructions __bpf_prog_run() may call
  * skb_copy_bits(), so provide a weak definition of it for NET-less config.
  */
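The hunk above relies on weak linkage: core.c supplies a default bpf_helper_changes_skb_data() that returns false, and a strong definition elsewhere in the build replaces it at link time. The sketch below shows that mechanism in plain userspace C with GCC or Clang. The names `WEAK_SYM`, `helper_changes_data()` and `must_revalidate()` are hypothetical, and the caller is only an assumed illustration of how such a predicate might be consumed, not the kernel's verifier logic.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed userspace equivalent of the kernel's __weak annotation. */
#define WEAK_SYM __attribute__((weak))

/*
 * Weak default: used only when no other object file in the link provides
 * a strong (non-weak) definition of the same symbol.
 */
bool WEAK_SYM helper_changes_data(void *func)
{
	(void)func; /* the default ignores its argument */
	return false;
}

/*
 * Hypothetical caller: asks whether previously validated packet pointers
 * have to be checked again after calling the helper 'func'.
 */
bool must_revalidate(void *func)
{
	return helper_changes_data(func);
}
```

Linking in a second source file that defines a non-weak `bool helper_changes_data(void *func)` would make every caller use that version instead, with no change to this file.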