diff options
author | SeongJae Park <sjpark@amazon.de> | 2020-02-02 03:38:26 +0000 |
---|---|---|
committer | Jakub Kicinski <kuba@kernel.org> | 2020-02-02 13:33:21 -0800 |
commit | 9603d47bad4642118fa19fd1562569663d9235f6 (patch) | |
tree | f84b794d65af5733f6dfc559cf353ef15eac801d /net/ipv4/tcp_input.c | |
parent | dff6bc1bfd462b76dc13ec19dedc2c134a62ac59 (diff) |
tcp: Reduce SYN resend delay if a suspicous ACK is received
When closing a connection, the two acks that required to change closing
socket's status to FIN_WAIT_2 and then TIME_WAIT could be processed in
reverse order. This is possible in RSS disabled environments such as a
connection inside a host.
For example, expected state transitions and required packets for the
disconnection will be similar to below flow.
00 (Process A) (Process B)
01 ESTABLISHED ESTABLISHED
02 close()
03 FIN_WAIT_1
04 ---FIN-->
05 CLOSE_WAIT
06 <--ACK---
07 FIN_WAIT_2
08 <--FIN/ACK---
09 TIME_WAIT
10 ---ACK-->
11 LAST_ACK
12 CLOSED CLOSED
In some cases such as LINGER option applied socket, the FIN and FIN/ACK
will be substituted to RST and RST/ACK, but there is no difference in
the main logic.
The acks in lines 6 and 8 are the acks. If the line 8 packet is
processed before the line 6 packet, it will be just ignored as it is not
a expected packet, and the later process of the line 6 packet will
change the status of Process A to FIN_WAIT_2, but as it has already
handled line 8 packet, it will not go to TIME_WAIT and thus will not
send the line 10 packet to Process B. Thus, Process B will left in
CLOSE_WAIT status, as below.
00 (Process A) (Process B)
01 ESTABLISHED ESTABLISHED
02 close()
03 FIN_WAIT_1
04 ---FIN-->
05 CLOSE_WAIT
06 (<--ACK---)
07 (<--FIN/ACK---)
08 (fired in right order)
09 <--FIN/ACK---
10 <--ACK---
11 (processed in reverse order)
12 FIN_WAIT_2
Later, if the Process B sends SYN to Process A for reconnection using
the same port, Process A will responds with an ACK for the last flow,
which has no increased sequence number. Thus, Process A will send RST,
wait for TIMEOUT_INIT (one second in default), and then try
reconnection. If reconnections are frequent, the one second latency
spikes can be a big problem. Below is a tcpdump results of the problem:
14.436259 IP 127.0.0.1.45150 > 127.0.0.1.4242: Flags [S], seq 2560603644
14.436266 IP 127.0.0.1.4242 > 127.0.0.1.45150: Flags [.], ack 5, win 512
14.436271 IP 127.0.0.1.45150 > 127.0.0.1.4242: Flags [R], seq 2541101298
/* ONE SECOND DELAY */
15.464613 IP 127.0.0.1.45150 > 127.0.0.1.4242: Flags [S], seq 2560603644
This commit mitigates the problem by reducing the delay for the next SYN
if the suspicous ACK is received while in SYN_SENT state.
Following commit will add a selftest, which can be also helpful for
understanding of this issue.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: SeongJae Park <sjpark@amazon.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Diffstat (limited to 'net/ipv4/tcp_input.c')
-rw-r--r-- | net/ipv4/tcp_input.c | 8 |
1 files changed, 7 insertions, 1 deletions
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index e325b4506e25..316ebdf8151d 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -5908,8 +5908,14 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb, * the segment and return)" */ if (!after(TCP_SKB_CB(skb)->ack_seq, tp->snd_una) || - after(TCP_SKB_CB(skb)->ack_seq, tp->snd_nxt)) + after(TCP_SKB_CB(skb)->ack_seq, tp->snd_nxt)) { + /* Previous FIN/ACK or RST/ACK might be ignored. */ + if (icsk->icsk_retransmits == 0) + inet_csk_reset_xmit_timer(sk, + ICSK_TIME_RETRANS, + TCP_TIMEOUT_MIN, TCP_RTO_MAX); goto reset_and_undo; + } if (tp->rx_opt.saw_tstamp && tp->rx_opt.rcv_tsecr && !between(tp->rx_opt.rcv_tsecr, tp->retrans_stamp, |