author     Daniel Borkmann <daniel@iogearbox.net>  2023-04-17 15:49:15 +0200
committer  Alexei Starovoitov <ast@kernel.org>  2023-04-17 13:17:41 -0700
commit     59e498a3289f685261c076b998a8a2f8a516874f
tree       4865aa7bb2355048cedaddadd0c331982ff533dd
parent     d40f4f68132e9f6d4b1743c8eca0d6194ea1712f
bpf: Set skb redirect and from_ingress info in __bpf_tx_skb

There are some use-cases where it is desirable to use bpf_redirect()
in combination with ifb device, which currently is not supported, for
example, around filtering inbound traffic with BPF to then push it to
ifb which holds the qdisc for shaping in contrast to doing that on the
egress device.

Toke mentions the following case related to OpenWrt:

  Because there's not always a single egress on the other side. These
  are mainly home routers, which tend to have one or more WiFi devices
  bridged to one or more ethernet ports on the LAN side, and a single
  upstream WAN port. And the objective is to control the total amount
  of traffic going over the WAN link (in both directions), to deal with
  bufferbloat in the ISP network (which is sadly still all too
  prevalent).

  In this setup, the traffic can be split arbitrarily between the links
  on the LAN side, and the only "single bottleneck" is the WAN link. So
  we install both egress and ingress shapers on this, configured to
  something like 95-98% of the true link bandwidth, thus moving the
  queues into the qdisc layer in the router. It's usually necessary to
  set the ingress bandwidth shaper a bit lower than the egress due to
  being "downstream" of the bottleneck link, but it does work
  surprisingly well.

  We usually use something like a matchall filter to put all ingress
  traffic on the ifb, so doing the redirect from BPF has not been an
  immediate requirement thus far. However, it does seem a bit odd that
  this is not possible, and we do have a BPF-based filter that layers
  on top of this kind of setup, which currently uses u32 as the ingress
  filter and so it could presumably be improved to use BPF instead if
  that was available.

Reported-by: Toke Høiland-Jørgensen <toke@redhat.com>
Reported-by: Yafang Shao <laoar.shao@gmail.com>
Reported-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yafang Shao <laoar.shao@gmail.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://git.openwrt.org/?p=project/qosify.git;a=blob;f=README
Link: https://lore.kernel.org/bpf/875y9yzbuy.fsf@toke.dk
Link: https://lore.kernel.org/r/8cebc8b2b6e967e10cbafe2ffd6795050e74accd.1681739137.git.daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

-rw-r--r--  include/linux/skbuff.h  9
-rw-r--r--  net/core/filter.c       1
2 files changed, 10 insertions, 0 deletions
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 494a23a976b0..9ff2e3d57329 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -5041,6 +5041,15 @@ static inline void skb_reset_redirect(struct sk_buff *skb)
 	skb->redirected = 0;
 }
 
+static inline void skb_set_redirected_noclear(struct sk_buff *skb,
+					      bool from_ingress)
+{
+	skb->redirected = 1;
+#ifdef CONFIG_NET_REDIRECT
+	skb->from_ingress = from_ingress;
+#endif
+}
+
 static inline bool skb_csum_is_sctp(struct sk_buff *skb)
 {
 	return skb->csum_not_inet;
diff --git a/net/core/filter.c b/net/core/filter.c
index df0df59814ae..44fb997434ad 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2122,6 +2122,7 @@ static inline int __bpf_tx_skb(struct net_device *dev, struct sk_buff *skb)
 	}
 
 	skb->dev = dev;
+	skb_set_redirected_noclear(skb, skb_at_tc_ingress(skb));
 	skb_clear_tstamp(skb);
 
 	dev_xmit_recursion_inc();
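
The effect of the new helper can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: struct model_skb, model_set_redirected_noclear(), model_ifb_accepts() and model_ifb_reinject() are hypothetical names invented for illustration. The model assumes that an ifb-like device drops packets that are not marked as redirected or that carry no incoming interface index, and that it uses from_ingress to pick the direction in which to re-inject the packet. The CONFIG_NET_REDIRECT guard from the patch is left out for brevity.

```c
#include <stdbool.h>

/* Hypothetical, heavily reduced stand-in for struct sk_buff. */
struct model_skb {
	unsigned int redirected:1;	/* packet was redirected by tc/BPF */
	unsigned int from_ingress:1;	/* redirect happened at tc ingress */
	int skb_iif;			/* incoming ifindex, 0 if unset */
};

/* Mirrors the helper added by the patch (without the config guard). */
static inline void model_set_redirected_noclear(struct model_skb *skb,
						bool from_ingress)
{
	skb->redirected = 1;
	skb->from_ingress = from_ingress;
}

/*
 * Assumption: a simplified ifb-like acceptance check. Packets that were
 * not marked as redirected, or have no incoming ifindex, are dropped.
 */
static inline bool model_ifb_accepts(const struct model_skb *skb)
{
	return skb->redirected && skb->skb_iif != 0;
}

enum model_path {
	MODEL_PATH_EGRESS,	/* hand the packet back to the transmit path */
	MODEL_PATH_INGRESS,	/* hand the packet back to the receive path */
};

/* Assumption: a simplified ifb-like choice of re-injection direction. */
static inline enum model_path model_ifb_reinject(const struct model_skb *skb)
{
	return skb->from_ingress ? MODEL_PATH_INGRESS : MODEL_PATH_EGRESS;
}
```

In this model, a packet with skb_iif set but without the redirect mark is rejected by model_ifb_accepts(). Once model_set_redirected_noclear(&skb, true) has been called, as __bpf_tx_skb() now does with the result of skb_at_tc_ingress(), the packet is accepted and model_ifb_reinject() selects the ingress path.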