summaryrefslogtreecommitdiffstats
path: root/net/core
diff options
context:
space:
mode:
authorJakub Kicinski <jakub.kicinski@netronome.com>2019-08-07 17:03:59 -0700
committerSasha Levin <sashal@kernel.org>2019-08-25 10:10:38 -0400
commitbc110443a73f275de83d691bccbba9ecd25ca9be (patch)
treeb1a0fb8a264223164bf797e7359f5adac0160603 /net/core
parentbfdbef8aca0339bc8b3428d17790e2e87cb9036a (diff)
downloadlinux-stable-bc110443a73f275de83d691bccbba9ecd25ca9be.tar.gz
linux-stable-bc110443a73f275de83d691bccbba9ecd25ca9be.tar.bz2
linux-stable-bc110443a73f275de83d691bccbba9ecd25ca9be.zip
net/tls: prevent skb_orphan() from leaking TLS plain text with offload
[ Upstream commit 414776621d1006e57e80e6db7fdc3837897aaa64 ] sk_validate_xmit_skb() and drivers depend on the sk member of struct sk_buff to identify segments requiring encryption. Any operation which removes or does not preserve the original TLS socket such as skb_orphan() or skb_clone() will cause clear text leaks. Make the TCP socket underlying an offloaded TLS connection mark all skbs as decrypted, if TLS TX is in offload mode. Then in sk_validate_xmit_skb() catch skbs which have no socket (or a socket with no validation) and decrypted flag set. Note that CONFIG_SOCK_VALIDATE_XMIT, CONFIG_TLS_DEVICE and sk->sk_validate_xmit_skb are slightly interchangeable right now, they all imply TLS offload. The new checks are guarded by CONFIG_TLS_DEVICE because that's the option guarding the sk_buff->decrypted member. Second, smaller issue with orphaning is that it breaks the guarantee that packets will be delivered to device queues in-order. All TLS offload drivers depend on that scheduling property. This means skb_orphan_partial()'s trick of preserving partial socket references will cause issues in the drivers. We need a full orphan, and as a result netem delay/throttling will cause all TLS offload skbs to be dropped. Reusing the sk_buff->decrypted flag also protects from leaking clear text when incoming, decrypted skb is redirected (e.g. by TC). See commit 0608c69c9a80 ("bpf: sk_msg, sock{map|hash} redirect through ULP") for justification why the internal flag is safe. The only location which could leak the flag in is tcp_bpf_sendmsg(), which is taken care of by clearing the previously unused bit. v2: - remove superfluous decrypted mark copy (Willem); - remove the stale doc entry (Boris); - rely entirely on EOR marking to prevent coalescing (Boris); - use an internal sendpages flag instead of marking the socket (Boris). v3 (Willem): - reorganize the can_skb_orphan_partial() condition; - fix the flag leak-in through tcp_bpf_sendmsg. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Diffstat (limited to 'net/core')
-rw-r--r--net/core/sock.c19
1 files changed, 14 insertions, 5 deletions
diff --git a/net/core/sock.c b/net/core/sock.c
index aa4a00d381e3..df7b38b60164 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1988,6 +1988,19 @@ void skb_set_owner_w(struct sk_buff *skb, struct sock *sk)
}
EXPORT_SYMBOL(skb_set_owner_w);
+static bool can_skb_orphan_partial(const struct sk_buff *skb)
+{
+#ifdef CONFIG_TLS_DEVICE
+ /* Drivers depend on in-order delivery for crypto offload,
+ * partial orphan breaks out-of-order-OK logic.
+ */
+ if (skb->decrypted)
+ return false;
+#endif
+ return (skb->destructor == sock_wfree ||
+ (IS_ENABLED(CONFIG_INET) && skb->destructor == tcp_wfree));
+}
+
/* This helper is used by netem, as it can hold packets in its
* delay queue. We want to allow the owner socket to send more
* packets, as if they were already TX completed by a typical driver.
@@ -1999,11 +2012,7 @@ void skb_orphan_partial(struct sk_buff *skb)
if (skb_is_tcp_pure_ack(skb))
return;
- if (skb->destructor == sock_wfree
-#ifdef CONFIG_INET
- || skb->destructor == tcp_wfree
-#endif
- ) {
+ if (can_skb_orphan_partial(skb)) {
struct sock *sk = skb->sk;
if (refcount_inc_not_zero(&sk->sk_refcnt)) {