author	Eric Dumazet <edumazet@google.com>	2014-11-13 09:45:22 -0800
committer	David S. Miller <davem@davemloft.net>	2014-11-13 15:21:44 -0500
commit	d649a7a81f3b5bacb1d60abd7529894d8234a666 (patch)
tree	6df6122eab00941a543a7e38a3cc50b62a6f7600 /net/ipv4
parent	6eba82248ef47fd478f940a418429e3ec95cb3db (diff)
tcp: limit GSO packets to half cwnd
In the datacenter world, GSO packets initially cooked by tcp_sendmsg() are usually big, as sk_pacing_rate is high.

When the network is congested, cwnd can be smaller than the GSO packets found in the socket write queue. tcp_write_xmit() splits GSO packets using the available cwnd, and we end up sending a single GSO packet, consuming all the available cwnd.

With GRO aggregation on the receiver, we might handle a single GRO packet, sending back a single ACK.

1) This single ACK might be lost; TLP or RTO are then forced to attempt a retransmit.

2) This ACK releases a full cwnd, and the sender sends another big GSO packet, in a ping-pong mode.

This behavior does not fill the pipes in the best way, because of scheduling artifacts.

Make sure we always have at least two GSO packets in flight. This allows us to safely increase GRO efficiency without risking spurious retransmits.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
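To make the effect concrete, here is a minimal userspace sketch of the before/after quota computation; the cwnd_quota_old()/cwnd_quota_new() names and the test harness are illustrative only, not kernel code. With cwnd = 10 and an empty pipe, the old logic hands a single GSO packet all 10 segments, while the new logic caps the quota at 5, leaving room for a second GSO packet (and a second ACK) per round trip.

#include <stdio.h>

#define MAX(a, b) ((a) > (b) ? (a) : (b))
#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Old behavior: one skb may consume every remaining segment of the window. */
static unsigned int cwnd_quota_old(unsigned int cwnd, unsigned int in_flight)
{
	if (in_flight < cwnd)
		return cwnd - in_flight;
	return 0;
}

/* New behavior: cap the per-skb quota at half the window (at least 1),
 * so that at least two GSO packets can be in flight at once.
 */
static unsigned int cwnd_quota_new(unsigned int cwnd, unsigned int in_flight)
{
	unsigned int halfcwnd;

	if (in_flight >= cwnd)
		return 0;
	halfcwnd = MAX(cwnd >> 1, 1u);
	return MIN(halfcwnd, cwnd - in_flight);
}

int main(void)
{
	/* cwnd = 10, nothing in flight: old quota is 10 segments in one
	 * burst; new quota is 5, so a second GSO packet follows.
	 */
	printf("old: %u  new: %u\n", cwnd_quota_old(10, 0), cwnd_quota_new(10, 0));
	/* cwnd = 1 (heavy congestion): halfcwnd clamps to 1, not 0. */
	printf("old: %u  new: %u\n", cwnd_quota_old(1, 0), cwnd_quota_new(1, 0));
	return 0;
}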
Diffstat (limited to 'net/ipv4')
-rw-r--r--	net/ipv4/tcp_output.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 0b88158dd4a7..eb73a1dccf56 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1562,7 +1562,7 @@ static unsigned int tcp_mss_split_point(const struct sock *sk,
 static inline unsigned int tcp_cwnd_test(const struct tcp_sock *tp,
 					 const struct sk_buff *skb)
 {
-	u32 in_flight, cwnd;
+	u32 in_flight, cwnd, halfcwnd;
 
 	/* Don't be strict about the congestion window for the final FIN.  */
 	if ((TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN) &&
@@ -1571,10 +1571,14 @@ static inline unsigned int tcp_cwnd_test(const struct tcp_sock *tp,
 	in_flight = tcp_packets_in_flight(tp);
 	cwnd = tp->snd_cwnd;
 
-	if (in_flight < cwnd)
-		return (cwnd - in_flight);
+	if (in_flight >= cwnd)
+		return 0;
 
-	return 0;
+	/* For better scheduling, ensure we have at least
+	 * 2 GSO packets in flight.
+	 */
+	halfcwnd = max(cwnd >> 1, 1U);
+	return min(halfcwnd, cwnd - in_flight);
 }
 
 /* Initialize TSO state of a skb.