summaryrefslogtreecommitdiffstats
path: root/net/ipv4/tcp_minisocks.c
diff options
context:
space:
mode:
authorEric Dumazet <edumazet@google.com>2018-02-13 06:14:12 -0800
committerDavid S. Miller <davem@davemloft.net>2018-02-14 14:21:45 -0500
commite0f9759f530bf789e984961dce79f525b151ecf3 (patch)
tree8f384615d4648a8fc46c70db06dd9021a0a480b4 /net/ipv4/tcp_minisocks.c
parent1d631583ae5e40681dc9a41d7cfa845222c3c7ff (diff)
downloadlinux-e0f9759f530bf789e984961dce79f525b151ecf3.tar.gz
linux-e0f9759f530bf789e984961dce79f525b151ecf3.tar.bz2
linux-e0f9759f530bf789e984961dce79f525b151ecf3.zip
tcp: try to keep packet if SYN_RCV race is lost
배석진 reported that in some situations, packets for a given 5-tuple end up being processed by different CPUS. This involves RPS, and fragmentation. 배석진 is seeing packet drops when a SYN_RECV request socket is moved into ESTABLISH state. Other states are protected by socket lock. This is caused by a CPU losing the race, and simply not caring enough. Since this seems to occur frequently, we can do better and perform a second lookup. Note that all needed memory barriers are already in the existing code, thanks to the spin_lock()/spin_unlock() pair in inet_ehash_insert() and reqsk_put(). The second lookup must find the new socket, unless it has already been accepted and closed by another cpu. Note that the fragmentation could be avoided in the first place by use of a correct TCP MSS option in the SYN{ACK} packet, but this does not mean we can not be more robust. Many thanks to 배석진 for a very detailed analysis. Reported-by: 배석진 <soukjin.bae@samsung.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Diffstat (limited to 'net/ipv4/tcp_minisocks.c')
-rw-r--r--net/ipv4/tcp_minisocks.c3
1 files changed, 2 insertions, 1 deletions
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index a8384b0c11f8..e7e36433cdb5 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -578,7 +578,7 @@ EXPORT_SYMBOL(tcp_create_openreq_child);
struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb,
struct request_sock *req,
- bool fastopen)
+ bool fastopen, bool *req_stolen)
{
struct tcp_options_received tmp_opt;
struct sock *child;
@@ -785,6 +785,7 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb,
sock_rps_save_rxhash(child, skb);
tcp_synack_rtt_meas(child, req);
+ *req_stolen = !own_req;
return inet_csk_complete_hashdance(sk, child, req, own_req);
listen_overflow: