draft-ietf-tcpm-fastopen-02.txt | draft-ietf-tcpm-fastopen-03.txt | |||
---|---|---|---|---|
Internet Draft Y. Cheng | Internet Draft Y. Cheng | |||
draft-ietf-tcpm-fastopen-02.txt J. Chu | draft-ietf-tcpm-fastopen-03.txt J. Chu | |||
Intended status: Experimental S. Radhakrishnan | Intended status: Experimental S. Radhakrishnan | |||
Expiration date: April, 2013 A. Jain | Expiration date: August, 2013 A. Jain | |||
Google, Inc. | Google, Inc. | |||
October 22, 2012 | February 25, 2013 | |||
TCP Fast Open | TCP Fast Open | |||
Status of this Memo | Status of this Memo | |||
Distribution of this memo is unlimited. | Distribution of this memo is unlimited. | |||
This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
skipping to change at page 2, line 21 | skipping to change at page 2, line 21 | |||
(3WHS) to complete before data can be exchanged. | (3WHS) to complete before data can be exchanged. | |||
Terminology | Terminology | |||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | |||
document are to be interpreted as described in RFC 2119 [RFC2119]. | document are to be interpreted as described in RFC 2119 [RFC2119]. | |||
TFO refers to TCP Fast Open. Client refers to the TCP's active open | TFO refers to TCP Fast Open. Client refers to the TCP's active open | |||
side and server refers to the TCP's passive open side. | side and server refers to the TCP's passive open side. | |||
Table of Contents | ||||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 | ||||
2. Data In SYN . . . . . . . . . . . . . . . . . . . . . . . . . . 4 | ||||
2.1 Relaxing TCP semantics on duplicated SYNs . . . . . . . . . 4 | ||||
2.2. SYNs with spoofed IP addresses . . . . . . . . . . . . . . 4 | ||||
3. Protocol Overview . . . . . . . . . . . . . . . . . . . . . . . 5 | ||||
4. Protocol Details . . . . . . . . . . . . . . . . . . . . . . . 7 | ||||
4.1. Fast Open Cookie . . . . . . . . . . . . . . . . . . . . . 7 | ||||
4.1.1. TCP Options . . . . . . . . . . . . . . . . . . . . . . 7 | ||||
4.1.2. Server Cookie Handling . . . . . . . . . . . . . . . . 8 | ||||
4.1.3. Client Cookie Handling . . . . . . . . . . . . . . . . 9 | ||||
4.2. Fast Open Protocol . . . . . . . . . . . . . . . . . . . . 9 | ||||
4.2.1. Fast Open Cookie Request . . . . . . . . . . . . . . . 10 | ||||
4.2.2. TCP Fast Open . . . . . . . . . . . . . . . . . . . . . 11 | ||||
5. Reliability and Deployment Issues . . . . . . . . . . . . . . . 13 | ||||
6. Security Considerations . . . . . . . . . . . . . . . . . . . . 14 | ||||
6.1. Server Resource Exhaustion Attack by SYN Flood with Valid | ||||
Cookies . . . . . . . . . . . . . . . . . . . . . . . . . . 14 | ||||
6.2. Amplified Reflection Attack to Random Host . . . . . . . . 15 | ||||
6.3 Attacks from behind sharing public IPs (NATs) . . . . . . . 16 | ||||
7. TFO's Applicability . . . . . . . . . . . . . . . . . . . . . . 17 | ||||
7.1 Duplicate data in SYNs . . . . . . . . . . . . . . . . . . . 17 | ||||
7.2 Potential performance improvement . . . . . . . . . . . . . 17 | ||||
7.3 Example: Web clients and servers . . . . . . . . . . . . . . 17 | ||||
7.3.1 HTTP request replay . . . . . . . . . . . . . . . . . . 17 | ||||
7.3.2 HTTP persistent connection . . . . . . . . . . . . . . . 18 | ||||
8. Performance Experiments . . . . . . . . . . . . . . . . . . . . 18 | ||||
9. Related Work . . . . . . . . . . . . . . . . . . . . . . . . . 19 | ||||
9.1. T/TCP . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 | ||||
9.2. Common Defenses Against SYN Flood Attacks . . . . . . . . . 19 | ||||
9.3. TCP Cookie Transaction (TCPCT) . . . . . . . . . . . . . . 20 | ||||
10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 20 | ||||
11. Acknowledgement . . . . . . . . . . . . . . . . . . . . . . . 20 | ||||
12. References . . . . . . . . . . . . . . . . . . . . . . . . . . 20 | ||||
12.1. Normative References . . . . . . . . . . . . . . . . . . . 20 | ||||
12.2. Informative References . . . . . . . . . . . . . . . . . . 21 | ||||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 23 | ||||
1. Introduction | 1. Introduction | |||
TCP Fast Open (TFO) enables data to be exchanged safely during TCP's | TCP Fast Open (TFO) enables data to be exchanged safely during TCP's | |||
connection handshake. | connection handshake. | |||
This document describes a design that enables applications to save a | This document describes a design that enables applications to save a | |||
round trip while avoiding severe security ramifications. At the core | round trip while avoiding severe security ramifications. At the core | |||
of TFO is a security cookie used by the server side to authenticate a | of TFO is a security cookie used by the server side to authenticate a | |||
client initiating a TFO connection. This document covers the details | client initiating a TFO connection. This document covers the details | |||
of exchanging data during TCP's initial handshake, the protocol for | of exchanging data during TCP's initial handshake, the protocol for | |||
TFO cookies, and potential new security vulnerabilities and their | TFO cookies, and potential new security vulnerabilities and their | |||
mitigation. It also includes discussions of deployment issues and | mitigation. It also includes discussion of deployment issues and | |||
related proposals. TFO requires extensions to the socket API but this | related proposals. TFO requires extensions to the socket API but this | |||
document does not cover that. | document does not cover that. | |||
TFO is motivated by the performance needs of today's Web | TFO is motivated by the performance needs of today's Web | |||
applications. Network latency is determined by the round-trip time | applications. Network latency is largely determined by a connection's | |||
(RTT) and the number of round trips required to transfer application | round-trip time (RTT) and the number of round trips required to | |||
data. RTT consists of propagation delay and queuing delay. Network | transfer application data. RTT consists of propagation delay and | |||
bandwidth has grown substantially over the past two decades, reducing | queuing delay. | |||
queuing delay, while propagation delay is largely constrained by the | ||||
speed of light and has remained unchanged. Therefore reducing the | ||||
number of round trips has become the most effective way to improve | ||||
the latency of Web applications [CDCM11]. | ||||
Standard TCP only permits data exchange after 3WHS [RFC793], which | Network bandwidth has grown substantially over the past two decades, | |||
adds one RTT to the network latency. For short transfers (e.g., web | potentially reducing queuing delay, while propagation delay is | |||
objects) this additional RTT is a significant portion of the network | largely constrained by the speed of light and has remained unchanged. | |||
latency [THK98]. One widely deployed solution is HTTP persistent | Therefore reducing the number of round trips has typically become the | |||
connections. However, this solution is limited since hosts and middle | most effective way to improve the latency of applications like the | |||
boxes terminate idle TCP connections due to resource constraints. For | Web [CDCM11]. | |||
example, the Chrome browser keeps TCP connections idle up to 5 | ||||
minutes but 35% of Chrome HTTP requests are made on new TCP | Current TCP only permits data exchange after 3WHS [RFC793], which | |||
connections. We discuss HTTP persistent connections further in | adds one RTT to network latency. For short transfers (e.g., web | |||
section 7.1. | objects) this additional RTT is a significant portion of overall | |||
network latency [THK98]. One widely deployed solution is HTTP | ||||
persistent connections. However, this solution is limited since hosts | ||||
and middle boxes terminate idle TCP connections due to resource | ||||
constraints. For example, the Chrome browser keeps TCP connections | ||||
idle for up to 5 minutes but 35% of Chrome HTTP requests are made on | ||||
new TCP connections [RCCJR11]. We discuss Web applications and TFO in | ||||
detail later in section 7. | ||||
2. Data In SYN | 2. Data In SYN | |||
Allowing data in SYN packets to be delivered raises two issues | ||||
discussed in the following subsections. These issues make TFO | ||||
undesirable for certain applications. Therefore TCP implementations | ||||
MUST NOT use TFO by default and only use TFO if requested explicitly | ||||
by the application on a per service port basis. Applications need to | ||||
evaluate TFO applicability (described in Section 7) before using TFO. | ||||
2.1 Relaxing TCP semantics on duplicated SYNs | ||||
[RFC793] (section 3.4) already allows data in SYN packets but forbids | [RFC793] (section 3.4) already allows data in SYN packets but forbids | |||
the receiver to deliver the data to the application until 3WHS is | the receiver from delivering the data to the application until 3WHS | |||
completed. This is because TCP's initial handshake serves to capture | is completed. This is because TCP's initial handshake serves to | |||
1) Old or duplicate SYNs and 2)SYNs with spoofed IP addresses. | capture old or duplicate SYNs. | |||
TFO allows data to be delivered to the application before 3WHS is | TFO allows data to be delivered to the application before 3WHS is | |||
completed, thus opening itself to a possible data integrity problem | completed, thus opening itself to a data integrity issue for the | |||
caused by the problematic SYN packets above. This could cause a | applications in Section 2.1 in either of the following cases: | |||
problem in the following two examples: a) the receiver host receives | ||||
both duplicate and original SYNs before and after the host reboots, | ||||
and b) the duplicate is received after the connection created by the | ||||
original SYN has been closed. The receiver will not be protected by | ||||
the 2MSL TIMEWAIT state if the close is initiated by the sender. In | ||||
both cases, the data is replayed. | ||||
2.1. TCP Semantics and Duplicate SYNs | a) the receiver host receives data in a duplicate SYN after it has | |||
forgotten it received the original SYN (e.g. due to a reboot); b) the | ||||
duplicate is received after the connection created by the original | ||||
SYN has been closed and the close was initiated by the sender (so | ||||
the receiver will not be protected by the 2MSL TIMEWAIT state). | ||||
The proposed T/TCP protocol employs a new TCP "TAO" option and | The obsoleted T/TCP protocol employs a new TCP "TAO" option and | |||
connection count to guard against old or duplicate SYNs [RFC1644]. | connection count to guard against old or duplicate SYNs [RFC1644]. | |||
The solution is complex, involving state tracking on a per remote | However it is not widely used due to various vulnerabilities | |||
peer basis, and is vulnerable to IP spoofing attacks. Moreover, it | [PHRACK98]. | |||
has been shown that despite its complexity, T/TCP is still not | ||||
entirely protected. Old or duplicate SYNs may still be accepted by a | ||||
T/TCP server [PHRACK98]. | ||||
Rather than trying to capture all dubious SYN packets to make TFO | Rather than trying to capture all dubious SYN packets to make TFO | |||
100% compatible with TCP semantics, we made a design decision early | 100% compatible with TCP semantics, we made a design decision early | |||
on to accept old SYN packets with data, i.e., to restrict TFO to use | on to accept old SYN packets with data, i.e., to restrict TFO use to | |||
with a class of applications that are tolerant of duplicate SYN | a class of applications (Section 7) that are tolerant of duplicate | |||
packets with data. We believe this is the right design trade-off | SYN packets with data. We believe this is the right design trade-off | |||
balancing complexity with usefulness. Applications that require | balancing complexity with usefulness for certain applications. | |||
transactional semantics already deploy specific mechanisms to | ||||
tolerate similar data replay issues in TCP today. For example, a | ||||
browser reload event may replay any HTTP request even without data in | ||||
SYN. For transactional HTTP requests applications typically include | ||||
unique identifiers in the HTTP headers. Thus, allowing data in SYN | ||||
poses little risk to existing HTTP applications. | ||||
However, we note that some applications may rely on TCP 3-way | ||||
handshake semantics. For this reason, TFO MUST be used explicitly by | ||||
applications on a per service port basis. | ||||
2.2. SYNs with spoofed IP addresses | 2.2. SYNs with spoofed IP addresses | |||
Standard TCP suffers from the SYN flood attack [RFC4987] because | Standard TCP suffers from the SYN flood attack [RFC4987] because | |||
bogus SYN packets, i.e., SYN packets with spoofed source IP addresses | bogus SYN packets, i.e., SYN packets with spoofed source IP addresses | |||
can easily fill up a listener's small queue, causing a service port | can easily fill up a listener's small queue, causing a service port | |||
to be blocked completely until timeouts. Secondary damage comes from | to be blocked completely until timeouts. Secondary damage comes from | |||
these SYN requests taking up memory space, though this is less of an | these SYN requests taking up memory space, though this is less of an | |||
issue today as servers typically have plenty of memory. | issue today as servers typically have plenty of memory. | |||
TFO goes one step further to allow server side TCP to process and | TFO goes one step further to allow server-side TCP to process and | |||
send up data to the application layer before 3WHS is completed. This | send up data to the application layer before 3WHS is completed. This | |||
opens up more serious new vulnerabilities. Applications serving ports | opens up more serious new vulnerabilities. Applications serving ports | |||
that have TFO enabled may waste lots of CPU and memory resources | that have TFO enabled may waste lots of CPU and memory resources | |||
processing the requests and producing the responses. If the response | processing the requests and producing the responses. If the response | |||
is much larger than the request, the attacker can mount an amplified | is much larger than the request, the attacker can mount an amplified | |||
reflection attack against victims of choice beyond the TFO server | reflection attack against victims of choice beyond the TFO server | |||
itself. | itself. | |||
Numerous mitigation techniques against the regular SYN flood attack | Numerous mitigation techniques against regular SYN flood attacks | |||
exist and have been well documented [RFC4987]. Unfortunately none are | exist and have been well documented [RFC4987]. Unfortunately none are | |||
applicable to TFO. We propose a server supplied cookie to mitigate | applicable to TFO. We propose a server-supplied cookie to mitigate | |||
most of the security issues introduced by TFO. We defer further | the primary security issues introduced by TFO in Section 3. We defer | |||
discussion of SYN flood attacks to the "Security Considerations" | further discussion of SYN flood attacks to the "Security | |||
section. | Considerations" section. | |||
3. Protocol Overview | 3. Protocol Overview | |||
The key component of TFO is the Fast Open Cookie (cookie), a message | The key component of TFO is the Fast Open Cookie (cookie), a message | |||
authentication code (MAC) tag generated by the server. The client | authentication code (MAC) tag generated by the server. The client | |||
requests a cookie in one regular TCP connection, then uses it for | requests a cookie in one regular TCP connection, then uses it for | |||
future TCP connections to exchange data during 3WHS: | future TCP connections to exchange data during 3WHS: Requesting a | |||
Requesting a Fast Open Cookie: | Fast Open Cookie: | |||
1. The client sends a SYN with a Fast Open Cookie Request option. | 1. The client sends a SYN with a Fast Open Cookie Request option. | |||
2. The server generates a cookie and sends it through the Fast Open | 2. The server generates a cookie and sends it through the Fast Open | |||
Cookie option of a SYN-ACK packet. | Cookie option of a SYN-ACK packet. | |||
3. The client caches the cookie for future TCP Fast Open connections | 3. The client caches the cookie for future TCP Fast Open connections | |||
(see below). | (see below). | |||
Performing TCP Fast Open: | Performing TCP Fast Open: | |||
skipping to change at page 7, line 18 | skipping to change at page 8, line 17 | |||
Options with invalid Length values, without SYN flag set, or with ACK | Options with invalid Length values, without SYN flag set, or with ACK | |||
flag set MUST be ignored. | flag set MUST be ignored. | |||
4.1.2. Server Cookie Handling | 4.1.2. Server Cookie Handling | |||
The server is in charge of cookie generation and authentication. The | The server is in charge of cookie generation and authentication. The | |||
cookie SHOULD be a message authentication code tag with the following | cookie SHOULD be a message authentication code tag with the following | |||
properties: | properties: | |||
1. The cookie authenticates the client's (source) IP address of the | 1. The cookie authenticates the client's (source) IP address of the | |||
SYN packet. The IP address can be an IPv4 or IPv6 address. | SYN packet. The IP address can be an IPv4 or IPv6 address. | |||
2. The cookie can only be generated by the server and cannot be | 2. The cookie can only be generated by the server and cannot be | |||
fabricated by any other parties including the client. | fabricated by any other parties including the client. | |||
3. The generation and verification are fast relative to the rest of | 3. The generation and verification are fast relative to the rest of | |||
SYN and SYN-ACK processing. | SYN and SYN-ACK processing. | |||
4. A server may encode other information in the cookie, and accept | 4. A server may encode other information in the cookie, and accept | |||
more than one valid cookie per client at any given time. But this | more than one valid cookie per client at any given time. But this | |||
is all server implementation dependent and transparent to the | is all server implementation dependent and transparent to the | |||
client. | client. | |||
5. The cookie expires after a certain amount of time. The reason for | 5. The cookie expires after a certain amount of time. The reason for | |||
cookie expiration is detailed in the "Security Considerations" | cookie expiration is detailed in the "Security Considerations" | |||
section. This can be done by either periodically changing the | section. This can be done by either periodically changing the | |||
server key used to generate cookies or including a timestamp when | server key used to generate cookies or including a timestamp when | |||
generating the cookie. | generating the cookie. | |||
To gradually invalidate cookies over time, the server can | To gradually invalidate cookies over time, the server can | |||
implement key rotation to generate and verify cookies using | implement key rotation to generate and verify cookies using | |||
multiple keys. This approach is useful for large-scale servers to | multiple keys. This approach is useful for large-scale servers to | |||
retain Fast Open rolling key updates. We do not specify a | retain Fast Open rolling key updates. We do not specify a | |||
particular mechanism because the implementation is often server | particular mechanism because the implementation is often server | |||
specific. | specific. | |||
The server supports the cookie generation and verification | The server supports the cookie generation and verification | |||
operations: | operations: | |||
- GetCookie(IP_Address): returns a (new) cookie | - GetCookie(IP_Address): returns a (new) cookie | |||
- IsCookieValid(IP_Address, Cookie): checks if the cookie is valid, | - IsCookieValid(IP_Address, Cookie): checks if the cookie is valid, | |||
i.e., it has not expired and it authenticates the client IP address. | i.e., it has not expired and it authenticates the client IP address. | |||
Example Implementation: a simple implementation is to use AES_128 to | Example Implementation: a simple implementation is to use AES_128 to | |||
encrypt the IPv4 (with padding) or IPv6 address and truncate to 64 | encrypt the IPv4 (with padding) or IPv6 address and truncate to 64 | |||
bits. The server can periodically update the key to expire the | bits. The server can periodically update the key to expire the | |||
cookies. AES encryption on recent processors is fast and takes only a | cookies. AES encryption on recent processors is fast and takes only a | |||
few hundred nanoseconds [RCCJB11]. | few hundred nanoseconds [RCCJR11]. | |||
If only one valid cookie is allowed per-client and the server can | If only one valid cookie is allowed per-client and the server can | |||
regenerate the cookie independently, the best validation process is | regenerate the cookie independently, the best validation process is | |||
to simply regenerate a valid cookie and compare it against the | to simply regenerate a valid cookie and compare it against the | |||
incoming cookie. In that case if the incoming cookie fails the check, | incoming cookie. In that case if the incoming cookie fails the check, | |||
a valid cookie is readily available to be sent to the client. | a valid cookie is readily available to be sent to the client. | |||
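The GetCookie() and IsCookieValid() operations, together with key rotation and the regenerate-and-compare validation described above, can be sketched as follows. This is only an illustration under stated assumptions: it uses HMAC-SHA256 from the Python standard library in place of the AES_128 example in the draft, and the class name, method names, key size, and number of retained keys are all hypothetical.

```python
import hashlib
import hmac
import ipaddress
import os

COOKIE_LEN = 8  # 64-bit cookie, matching the truncation in the draft's example


class CookieServer:
    """Hypothetical helper; HMAC-SHA256 stands in for the AES_128 example."""

    def __init__(self):
        # Newest key first; older keys stay valid until rotated out.
        self.keys = [os.urandom(16)]

    def rotate_key(self, keep=2):
        # Adding a key and dropping the oldest gradually expires old cookies.
        self.keys = ([os.urandom(16)] + self.keys)[:keep]

    def _mac(self, key, ip_address):
        packed = ipaddress.ip_address(ip_address).packed  # 4 or 16 bytes
        return hmac.new(key, packed, hashlib.sha256).digest()[:COOKIE_LEN]

    def get_cookie(self, ip_address):
        # GetCookie(IP_Address): always issued under the newest key.
        return self._mac(self.keys[0], ip_address)

    def is_cookie_valid(self, ip_address, cookie):
        # IsCookieValid(IP_Address, Cookie): regenerate and compare.
        return any(
            hmac.compare_digest(self._mac(key, ip_address), cookie)
            for key in self.keys
        )
```

With keep=2, a cookie survives one rotation and is rejected after the second, which is one way to invalidate cookies gradually rather than all at once.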
The server MAY return a cookie request option, e.g., a null cookie, | The server MAY return a cookie request option, e.g., a null cookie, | |||
to signal the support of Fast Open without generating cookies, for | to signal the support of Fast Open without generating cookies, for | |||
probing or debugging purposes. | probing or debugging purposes. | |||
skipping to change at page 8, line 45 | skipping to change at page 9, line 43 | |||
discovery. | discovery. | |||
Caching RTT allows seeding a more accurate SYN timeout than the | Caching RTT allows seeding a more accurate SYN timeout than the | |||
default value [RFC6298]. This lowers the performance penalty if the | default value [RFC6298]. This lowers the performance penalty if the | |||
network or the server drops the SYN packets with data or the cookie | network or the server drops the SYN packets with data or the cookie | |||
options (See "Reliability and Deployment Issues" section below). | options (See "Reliability and Deployment Issues" section below). | |||
The cache replacement algorithm is not specified and is left for the | The cache replacement algorithm is not specified and is left for the | |||
implementations. | implementations. | |||
Note that before TFO sees wide deployment, clients are advised to | Note that before TFO sees wide deployment, clients SHOULD cache | |||
also cache negative responses from servers in order to reduce the | negative responses from servers in order to reduce the amount of | |||
amount of futile TFO attempts. Since TFO is enabled on a per-service | futile TFO attempts. Since TFO is enabled on a per-service port basis | |||
port basis but cookies are independent of service ports, clients' | but cookies are independent of service ports, clients' cache should | |||
cache should include remote port numbers too. | include remote port numbers too. | |||
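A minimal sketch of such a client cache is shown below. It assumes cookies are stored per server IP address while negative responses are stored per (IP address, port) pair. The class name, the method names, and the expiry time for negative entries are assumptions made for illustration; the draft does not specify them.

```python
import time


class ClientCookieCache:
    """Hypothetical client-side cache; names and the TTL are illustrative."""

    def __init__(self, negative_ttl=3600.0):
        self.cookies = {}   # server IP -> cookie bytes (port independent)
        self.negative = {}  # (server IP, port) -> time the failure was recorded
        self.negative_ttl = negative_ttl

    def store_cookie(self, ip, port, cookie):
        self.cookies[ip] = cookie
        self.negative.pop((ip, port), None)

    def record_no_tfo(self, ip, port, now=None):
        self.negative[(ip, port)] = time.monotonic() if now is None else now

    def cookie_for(self, ip, port, now=None):
        """Return a cookie to try, or None if TFO should not be attempted."""
        now = time.monotonic() if now is None else now
        failed_at = self.negative.get((ip, port))
        if failed_at is not None and now - failed_at < self.negative_ttl:
            return None
        return self.cookies.get(ip)
```

Keying the negative entries by port lets a client keep using TFO toward one service on a host even if another service port on the same host does not support it.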
4.2. Fast Open Protocol | 4.2. Fast Open Protocol | |||
One predominant requirement of TFO is to be fully compatible with | One predominant requirement of TFO is to be fully compatible with | |||
existing TCP implementations, both on the client and the server | existing TCP implementations, both on the client and the server | |||
sides. | sides. | |||
The server keeps two variables per listening port: | The server keeps two variables per listening port: | |||
FastOpenEnabled: default is off. It MUST be turned on explicitly by | FastOpenEnabled: default is off. It MUST be turned on explicitly by | |||
skipping to change at page 9, line 34 | skipping to change at page 10, line 29 | |||
The server keeps a FastOpened flag per TCB to mark if a connection | The server keeps a FastOpened flag per TCB to mark if a connection | |||
has successfully performed a TFO. | has successfully performed a TFO. | |||
4.2.1. Fast Open Cookie Request | 4.2.1. Fast Open Cookie Request | |||
Any client attempting TFO MUST first request a cookie from the server | Any client attempting TFO MUST first request a cookie from the server | |||
with the following steps: | with the following steps: | |||
1. The client sends a SYN packet with a Fast Open Cookie Request | 1. The client sends a SYN packet with a Fast Open Cookie Request | |||
option. | option. | |||
2. The server SHOULD respond with a SYN-ACK based on the procedures | 2. The server SHOULD respond with a SYN-ACK based on the procedures | |||
in the "Server Cookie Handling" section. This SYN-ACK SHOULD | in the "Server Cookie Handling" section. This SYN-ACK SHOULD | |||
contain a Fast Open Cookie option if the server currently supports | contain a Fast Open Cookie option if the server currently supports | |||
TFO for this listener port. | TFO for this listener port. | |||
3. If the SYN-ACK contains a Fast Open Cookie option, the client | 3. If the SYN-ACK contains a Fast Open Cookie option, the client | |||
replaces the cookie and other information as described in the | replaces the cookie and other information as described in the | |||
"Client Cookie Handling" section. Otherwise, if the SYN-ACK is | "Client Cookie Handling" section. Otherwise, if the SYN-ACK is | |||
first seen, i.e., not a (spurious) retransmission, the client MAY | first seen, i.e., not a (spurious) retransmission, the client MAY | |||
remove the server information from the cookie cache. If the SYN- | remove the server information from the cookie cache. If the SYN- | |||
ACK is a spurious retransmission without valid Fast Open Cookie | ACK is a spurious retransmission without valid Fast Open Cookie | |||
Option, the client does nothing to the cookie cache for the | Option, the client does nothing to the cookie cache for the reasons | |||
reasons below. | below. | |||
The network or servers may drop the SYN or SYN-ACK packets with the | The network or servers may drop the SYN or SYN-ACK packets with the | |||
new cookie options which causes SYN or SYN-ACK timeouts. We RECOMMEND | new cookie options which causes SYN or SYN-ACK timeouts. We RECOMMEND | |||
both the client and the server retransmit SYN and SYN-ACK without the | both the client and the server retransmit SYN and SYN-ACK without the | |||
cookie options on timeouts. This ensures the connections of cookie | cookie options on timeouts. This ensures the connections of cookie | |||
requests will go through and lowers the latency penalties (of dropped | requests will go through and lowers the latency penalties (of dropped | |||
SYN/SYN-ACK packets). The obvious downside for maximum compatibility | SYN/SYN-ACK packets). The obvious downside for maximum compatibility | |||
is that any regular SYN drop will fail the cookie (although one can | is that any regular SYN drop will fail the cookie (although one can | |||
argue the delay in the data transmission till after 3WHS is justified | argue the delay in the data transmission till after 3WHS is justified | |||
if the SYN drop is due to network congestion). Next section | if the SYN drop is due to network congestion). Next section | |||
skipping to change at page 10, line 37 | skipping to change at page 11, line 32 | |||
changes relatively small in addition to [RFC793]. | changes relatively small in addition to [RFC793]. | |||
Client: Sending SYN | Client: Sending SYN | |||
To open a TFO connection, the client MUST have obtained the cookie | To open a TFO connection, the client MUST have obtained the cookie | |||
from the server: | from the server: | |||
1. Send a SYN packet. | 1. Send a SYN packet. | |||
a. If the SYN packet does not have enough option space for the | a. If the SYN packet does not have enough option space for the | |||
Fast Open Cookie option, abort TFO and fall back to regular 3WHS. | Fast Open Cookie option, abort TFO and fall back to regular 3WHS. | |||
b. Otherwise, include the Fast Open Cookie option with the cookie | b. Otherwise, include the Fast Open Cookie option with the cookie | |||
of the server. Include any data up to the cached server MSS or | of the server. Include any data up to the cached server MSS or | |||
default 536 bytes. | default 536 bytes. | |||
2. Advance to SYN-SENT state and update SND.NXT to include the data | 2. Advance to SYN-SENT state and update SND.NXT to include the data | |||
accordingly. | accordingly. | |||
3. If RTT is available from the cache, seed SYN timer according to | 3. If RTT is available from the cache, seed SYN timer according to | |||
[RFC6298]. | [RFC6298]. | |||
To deal with network or servers dropping SYN packets with payload or | To deal with network or servers dropping SYN packets with payload or | |||
unknown options, when the SYN timer fires, the client SHOULD | unknown options, when the SYN timer fires, the client SHOULD | |||
retransmit a SYN packet without data and Fast Open Cookie options. | retransmit a SYN packet without data and Fast Open Cookie options. | |||
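Step 1 of the client procedure above can be sketched as a small decision function. The sketch assumes the cookie option uses the usual TCP option layout of one kind byte and one length byte followed by the cookie, and relies on the 40-byte ceiling on TCP option space. The function name and its parameters are hypothetical.

```python
DEFAULT_MSS = 536          # fallback from the draft when no server MSS is cached
MAX_TCP_OPTION_BYTES = 40  # a TCP header carries at most 40 bytes of options


def plan_syn(cookie, other_options_len, data, cached_mss=None):
    """Return (use_tfo, syn_payload) for the first SYN; name is hypothetical."""
    # Assumed layout: Kind (1 byte) + Length (1 byte) + cookie.
    option_len = 2 + len(cookie)
    if other_options_len + option_len > MAX_TCP_OPTION_BYTES:
        return False, b""  # step 1a: fall back to a regular 3WHS
    limit = cached_mss if cached_mss is not None else DEFAULT_MSS
    return True, data[:limit]  # step 1b: cookie option plus data up to the limit
```

For example, with an 8-byte cookie and 20 bytes of other options, a 1000-byte request is truncated to 536 bytes in the SYN when no MSS is cached; the remaining data would follow after the SYN-ACK arrives.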
Server: Receiving SYN and responding with SYN-ACK | Server: Receiving SYN and responding with SYN-ACK | |||
Upon receiving the SYN packet with Fast Open Cookie option: | Upon receiving the SYN packet with Fast Open Cookie option: | |||
1. Initialize and reset a local FastOpened flag. If FastOpenEnabled | 1. Initialize and reset a local FastOpened flag. If FastOpenEnabled | |||
is false, go to step 5. | is false, go to step 5. | |||
2. If PendingFastOpenRequests is over the system limit, go to step 5. | 2. If PendingFastOpenRequests is over the system limit, go to step 5. | |||
3. If IsCookieValid() in section 4.1.2 returns false, go to step 5. | 3. If IsCookieValid() in section 4.1.2 returns false, go to step 5. | |||
4. Buffer the data and notify the application. Set FastOpened flag | 4. Buffer the data and notify the application. Set FastOpened flag | |||
and increment PendingFastOpenRequests. | and increment PendingFastOpenRequests. | |||
5. Send the SYN-ACK packet. The packet MAY include a Fast Open | 5. Send the SYN-ACK packet. The packet MAY include a Fast Open | |||
Option. If FastOpened flag is set, the packet acknowledges the SYN | Option. If FastOpened flag is set, the packet acknowledges the SYN | |||
and data sequence. Otherwise it acknowledges only the SYN | and data sequence. Otherwise it acknowledges only the SYN sequence. | |||
sequence. The server MAY include data in the SYN-ACK packet if the | The server MAY include data in the SYN-ACK packet if the response | |||
response data is readily available. Some applications may favor | data is readily available. Some applications may favor delaying the | |||
delaying the SYN-ACK, allowing the application to process the | SYN-ACK, allowing the application to process the request in order | |||
request in order to produce a response, but this is left to the | to produce a response, but this is left to the implementation. | |||
implementation. | ||||
6. Advance to the SYN-RCVD state. If the FastOpened flag is set, the | 6. Advance to the SYN-RCVD state. If the FastOpened flag is set, the | |||
server MUST follow the congestion control [RFC5681], in particular | server MUST follow the congestion control [RFC5681], in particular | |||
the initial congestion window [RFC3390], to send more data | the initial congestion window [RFC3390], to send more data packets. | |||
packets. | ||||
Note that if SYN-ACK is lost, regular TCP reduces the initial | ||||
congestion window before sending any data. In this case TFO is | ||||
slightly more aggressive in the first data round trip even though | ||||
it does not change the congestion control. | ||||
If the SYN-ACK timer fires, the server SHOULD retransmit a SYN-ACK | If the SYN-ACK timer fires, the server SHOULD retransmit a SYN-ACK | |||
segment with neither data nor Fast Open Cookie options for | segment with neither data nor Fast Open Cookie options for | |||
compatibility reasons. | compatibility reasons. | |||
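The server steps 1 to 5 above can be sketched as follows. This is a minimal illustration and not part of either draft revision: the names ServerState, SynAck, on_syn_with_cookie, pending_limit and the deliver callback are hypothetical, and is_cookie_valid() is only a stand-in for IsCookieValid() in section 4.1.2, whose real definition is not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ServerState:
    fast_open_enabled: bool = True       # FastOpenEnabled
    pending_fast_open_requests: int = 0  # PendingFastOpenRequests
    pending_limit: int = 16              # hypothetical system limit


@dataclass
class SynAck:
    fast_opened: bool      # local FastOpened flag
    ack_covers_data: bool  # True: acks SYN and data; False: acks SYN only


def is_cookie_valid(cookie: bytes, expected: bytes) -> bool:
    # Stand-in only; the real check is IsCookieValid() in section 4.1.2.
    return cookie == expected


def on_syn_with_cookie(state: ServerState, cookie: bytes, expected: bytes,
                       data: bytes, deliver: Callable[[bytes], None]) -> SynAck:
    fast_opened = False                                       # step 1: reset flag
    if not state.fast_open_enabled:                           # step 1
        pass                                                  # go to step 5
    elif state.pending_fast_open_requests > state.pending_limit:  # step 2
        pass                                                  # go to step 5
    elif not is_cookie_valid(cookie, expected):               # step 3
        pass                                                  # go to step 5
    else:                                                     # step 4
        deliver(data)            # buffer the data and notify the application
        fast_opened = True
        state.pending_fast_open_requests += 1
    # Step 5: the SYN-ACK acknowledges the data only if FastOpened is set.
    return SynAck(fast_opened=fast_opened, ack_covers_data=fast_opened)
```

Every failed check falls through to the same SYN-ACK path, so a rejected Fast Open attempt degrades to a regular handshake rather than an error.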
A special case is simultaneous open where the SYN receiver is a | ||||
client in SYN-SENT state. The protocol remains the same because | ||||
[RFC793] already supports both data in SYN and simultaneous open. But | ||||
the client's socket may have data available to read before it's | ||||
connected. This document does not cover the corresponding API change. | ||||
Client: Receiving SYN-ACK | Client: Receiving SYN-ACK | |||
The client SHOULD perform the following steps upon receiving the SYN- | The client SHOULD perform the following steps upon receiving the SYN- | |||
ACK: | ACK: 1. Update the cookie cache if the SYN-ACK has a Fast Open Cookie | |||
1. Update the cookie cache if the SYN-ACK has a Fast Open Cookie | Option or MSS option or both. | |||
Option or MSS option or both. | ||||
2. Send an ACK packet. Set acknowledgment number to RCV.NXT and | 2. Send an ACK packet. Set acknowledgment number to RCV.NXT and | |||
include the data after SND.UNA if data is available. | include the data after SND.UNA if data is available. | |||
3. Advance to the ESTABLISHED state. | 3. Advance to the ESTABLISHED state. | |||
Note there is no latency penalty if the server does not acknowledge | Note there is no latency penalty if the server does not acknowledge | |||
the data in the original SYN packet. The client SHOULD retransmit any | the data in the original SYN packet. The client SHOULD retransmit any | |||
unacknowledged data in the first ACK packet in step 2. The data | unacknowledged data in the first ACK packet in step 2. The data | |||
exchange will start after the handshake like a regular TCP | exchange will start after the handshake like a regular TCP | |||
connection. | connection. | |||
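The client steps above, including the retransmission of unacknowledged SYN data in the first ACK, might look like the sketch below. It is illustrative only: AckPacket, on_syn_ack and the dictionary-based cookie cache are hypothetical names, sequence numbers are treated as plain integers (wraparound is ignored), and the SYN is assumed to occupy sequence number ISS so that the data begins at ISS+1.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class AckPacket:
    ack_number: int  # set to RCV.NXT
    payload: bytes   # data after SND.UNA, possibly empty


def on_syn_ack(cookie_cache: Dict[str, dict], server_ip: str,
               cookie: Optional[bytes], mss: Optional[int],
               iss: int, sent_data: bytes,
               snd_una: int, rcv_nxt: int) -> Tuple[AckPacket, str]:
    # Step 1: update the cache if the SYN-ACK carried a cookie, an MSS, or both.
    if cookie is not None or mss is not None:
        entry = cookie_cache.setdefault(server_ip, {})
        if cookie is not None:
            entry["cookie"] = cookie
        if mss is not None:
            entry["mss"] = mss
    # Step 2: the SYN consumes one sequence number, so data starts at ISS+1.
    # Anything at or beyond SND.UNA was not acknowledged and is sent again.
    offset = max(snd_una - (iss + 1), 0)
    ack = AckPacket(ack_number=rcv_nxt, payload=sent_data[offset:])
    # Step 3: advance to the ESTABLISHED state.
    return ack, "ESTABLISHED"
```

If the server acknowledged only the SYN, the whole request rides on the first ACK, which is why the text notes there is no latency penalty compared with a regular handshake.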
If the client has timed out and retransmitted only regular SYN | If the client has timed out and retransmitted only regular SYN | |||
skipping to change at page 12, line 27 | skipping to change at page 13, line 34 | |||
state. No special handling is required further. | state. No special handling is required further. | |||
5. Reliability and Deployment Issues | 5. Reliability and Deployment Issues | |||
Network or Hosts Dropping SYN packets with data or unknown options | Network or Hosts Dropping SYN packets with data or unknown options | |||
A study [MAF04] found that some middle-boxes and end-hosts may drop | A study [MAF04] found that some middle-boxes and end-hosts may drop | |||
packets with unknown TCP options incorrectly. Studies [LANGLEY06, | packets with unknown TCP options incorrectly. Studies [LANGLEY06, | |||
HNRGHT11] both found that 6% of the probed paths on the Internet drop | HNRGHT11] both found that 6% of the probed paths on the Internet drop | |||
SYN packets with data or with unknown TCP options. The TFO protocol | SYN packets with data or with unknown TCP options. The TFO protocol | |||
deals with this problem by retransmitting SYN without data or cookie | deals with this problem by re-transmitting SYN without data or cookie | |||
options and we recommend tracking these servers in the client. | options and we recommend tracking these servers in the client. | |||
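One way a client could track such servers is a small negative cache, sketched below. Neither draft revision specifies this data structure; the class name TfoBlackholeCache, its methods, and the one-hour retry interval are all assumptions made for illustration. The caller supplies the current time so the logic stays independent of any particular clock.

```python
from typing import Dict


class TfoBlackholeCache:
    """Remembers servers whose path appeared to drop SYNs with data or cookie."""

    def __init__(self, retry_after_s: float = 3600.0) -> None:
        # retry_after_s is a hypothetical policy knob, not a value from the draft.
        self.retry_after_s = retry_after_s
        self._blocked_until: Dict[str, float] = {}

    def record_syn_data_loss(self, server_ip: str, now: float) -> None:
        # Call after a SYN with data or cookie timed out and a plain SYN succeeded.
        self._blocked_until[server_ip] = now + self.retry_after_s

    def may_use_tfo(self, server_ip: str, now: float) -> bool:
        until = self._blocked_until.get(server_ip)
        if until is None:
            return True
        if now >= until:
            # The entry has expired, so allow another Fast Open attempt.
            del self._blocked_until[server_ip]
            return True
        return False
```

Expiring entries lets the client retry Fast Open later, in case the path or middle-box behavior has changed.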
Server Farms | Server Farms | |||
A common server-farm setup is to have many physical hosts behind a | A common server-farm setup is to have many physical hosts behind a | |||
load-balancer sharing the same server IP. The load-balancer forwards | load-balancer sharing the same server IP. The load-balancer forwards | |||
new TCP connections to different physical hosts based on certain | new TCP connections to different physical hosts based on certain | |||
load-balancing algorithms. For TFO to work, the physical hosts need | load-balancing algorithms. For TFO to work, the physical hosts need | |||
to share the same key and update the key at about the same time. | to share the same key and update the key at about the same time. | |||
skipping to change at page 15, line 35 | skipping to change at page 17, line 5 | |||
This enables the server to issue different cookies to clients that | This enables the server to issue different cookies to clients that | |||
share the same IP address, hence can selectively discard those | share the same IP address, hence can selectively discard those | |||
misused cookies from the attacker. However the attacker can simply | misused cookies from the attacker. However the attacker can simply | |||
repeat the attack with new cookies. The server would eventually need | repeat the attack with new cookies. The server would eventually need | |||
to throttle all requests from the IP address just like the current | to throttle all requests from the IP address just like the current | |||
approach. Moreover this approach requires modifying [RFC 1323] to | approach. Moreover this approach requires modifying [RFC 1323] to | |||
send non-zero Timestamp Echo Reply in SYN, potentially causing firewall | send non-zero Timestamp Echo Reply in SYN, potentially causing firewall | |||
issues. Therefore we believe the benefit may not outweigh the | issues. Therefore we believe the benefit may not outweigh the | |||
drawbacks. | drawbacks. | |||
7. Web Performance | 7. TFO's Applicability | |||
7.1. HTTP persistent connection | This section is to help applications considering TFO to evaluate | |||
TFO's benefits and drawbacks using a Web client and server | ||||
applications as an example throughout. | ||||
7.1 Duplicate data in SYNs | ||||
It is possible, though uncommon, that using TFO the first data | ||||
written to a socket is delivered more than once to the application on | ||||
the remote host (Section 2.1). This replay potential only applies to | ||||
data in the SYN but not subsequent data exchanges. Thus applications | ||||
MUST NOT use TFO unless they can tolerate this behavior. | ||||
7.2 Potential performance improvement | ||||
TFO is designed for latency-conscious applications that are sensitive | ||||
to TCP's initial connection setup delay. For example, many | ||||
applications perform short request and response message exchanges. To | ||||
benefit from TFO, the first application data unit (e.g., an HTTP | ||||
request) needs to be no more than TCP's maximum segment size (minus | ||||
options used in SYN). Otherwise the remote server can only process | ||||
the client's application data unit once the rest of it is delivered | ||||
after the initial handshake, diminishing TFO's benefit. | ||||
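The size condition above reduces to simple arithmetic, sketched here. The function names and the example numbers (an MSS of 1460 bytes and 20 bytes of SYN options) are assumptions for illustration; the 40-byte cap is the option space available in a TCP header.

```python
MAX_TCP_OPTION_BYTES = 40  # a TCP header can carry at most 40 bytes of options


def syn_payload_budget(mss: int, syn_option_bytes: int) -> int:
    """Bytes of application data that fit in the SYN: MSS minus SYN options."""
    if not 0 <= syn_option_bytes <= MAX_TCP_OPTION_BYTES:
        raise ValueError("TCP options occupy between 0 and 40 bytes")
    return max(mss - syn_option_bytes, 0)


def fits_in_syn(request_len: int, mss: int, syn_option_bytes: int) -> bool:
    """True if the first application data unit can travel entirely in the SYN."""
    return request_len <= syn_payload_budget(mss, syn_option_bytes)
```

With the assumed values, syn_payload_budget(1460, 20) gives 1440 bytes, so a 1441-byte request would spill past the SYN and the server could only process it once the rest arrives after the handshake.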
To the extent possible, applications SHOULD employ long-lived | ||||
connections to best take advantage of TCP's built-in congestion | ||||
control, and to reduce the impact from TCP's connection setup | ||||
overhead. Note that when an application employs too many short-lived | ||||
connections, it may negatively impact network stability, as these | ||||
connections often exit before TCP's congestion control algorithm | ||||
takes effect. Implementations supporting a large number of short- | ||||
lived connections should employ temporal sharing of TCB data as | ||||
described in [RFC2140]. | ||||
7.3 Example: Web clients and servers | ||||
We look at Web client and server applications that use HTTP and TCP | ||||
protocols and follow the guidelines above to evaluate if TFO is safe | ||||
and useful for the Web. | ||||
7.3.1 HTTP request replay | ||||
We believe TFO is safe for the Web because even with standard TCP the | ||||
Web browser may replay an HTTP request to the remote Web server | ||||
multiple times. After sending an HTTP request, the browser could time | ||||
out and retry the same request on another TCP connection. This | ||||
scenario occurs far more frequently than the SYN duplication issue | ||||
presented by TFO. To ensure transactional behavior, Web sites employ | ||||
application-specific mechanisms such as including unique identifiers | ||||
in the data. | ||||
7.3.2 HTTP persistent connection | ||||
Next we evaluate if the Web can benefit from TFO given that HTTP | ||||
persistent connection support is already widely deployed. | ||||
TCP connection setup overhead has long been identified as a | TCP connection setup overhead has long been identified as a | |||
performance bottleneck for web applications [THK98]. HTTP persistent | performance bottleneck for web applications [THK98]. HTTP persistent | |||
connection was proposed to mitigate this issue and has been widely | connection support was proposed to mitigate this issue and has been | |||
deployed. However, [RCCJR11][AERG11] show that the average number of | widely deployed. However, studies [RCCJR11][AERG11] show that the | |||
transactions per connection is between 2 and 4, based on large-scale | average number of transactions per connection is between 2 and 4, | |||
measurements from both servers and clients. In these studies, the | based on large-scale measurements from both servers and clients. In | |||
servers and clients both kept the idle connections up to several | these studies, the servers and clients both kept idle connections up | |||
minutes, well into the human think time. | to several minutes, well into "human think" time. | |||
Can the utilization rate increase by keeping connections even longer? | Can the utilization rate of such connections increase by keeping idle | |||
Unfortunately, this is problematic due to middle-boxes and rapidly | connections even longer? Unfortunately, such an approach is | |||
growing mobile end hosts. One major issue is NAT. Studies | problematic due to middle-boxes and the rapidly growing share of | |||
mobile end hosts. Thus one major issue faced by persistent | ||||
connections is NAT. Studies [HNESSK10][MQXMZ11] show that the | ||||
majority of home routers and ISPs fail to meet the 124-minute | |||
idle timeout mandated in [RFC5382]. In [MQXMZ11], 35% of mobile ISPs | |||
time out idle connections within 30 minutes. The end hosts attempting | |||
to use these broken connections are often forced to wait for a | ||||
lengthy TCP timeout, as they often receive no signal when middleboxes | ||||
break their connections. Thus browsers risk large performance | ||||
penalties when keeping idle connections open. | ||||
[HNESSK10][MQXMZ11] show that the majority of home routers and ISPs | To circumvent this problem, some applications send frequent TCP keep- | |||
fail to meet the 124-minute idle timeout mandated in [RFC5382]. | alive probes. However, this technique drains power on mobile devices | |||
In [MQXMZ11], 35% of mobile ISPs time out idle connections within 30 | [MQXMZ11]. In fact, power has become such a prominent issue in modern | |||
minutes. NAT boxes do not possess a reliable mechanism to notify end | LTE devices that mobile browsers close HTTP connections within | |||
hosts when idle connections are removed from local tables, either due | seconds or even immediately [SOUDERS11]. | |||
to resource constraints such as mapping table size, memory, or lookup | ||||
overhead, or due to the limited port number and IP address space. | ||||
Moreover, unmapped packets received by NAT boxes are often dropped | ||||
silently. (TCP RST is not required by RFC5382.) The end hosts | ||||
attempting to use these broken connections are often forced to wait | ||||
for a lengthy TCP timeout. Thus the browser risks a large performance | ||||
penalty when keeping idle connections open. To circumvent this | ||||
problem, some applications send frequent TCP keep-alive probes. | ||||
However, this technique drains power on mobile devices [MQXMZ11]. In | ||||
fact, power has become a prominent issue in modern LTE devices that | ||||
mobile browsers close the HTTP connections within seconds or even | ||||
immediately [SOUDERS11]. | ||||
Idle connections also consume more memory resources. Due to the | Since TFO data duplication presents no new issues and HTTP persistent | |||
complexity of today's web applications, the application layer often | connection support has many limitations, Web applications can safely | |||
needs orders of magnitude more memory than the TCP connection | use TFO and will likely achieve performance gains. The next section | |||
footprint. As a result, servers need to implement advanced resource | presents more empirical data of the potential performance benefit. | |||
management in order to support a large number of idle connections. | ||||
7.2 Case Study: Chrome Browser | 8. Performance Experiments | |||
[RCCJR11] studied Chrome browser performance based on 28 days of | [RCCJR11] studied Chrome browser performance based on 28 days of | |||
global statistics. Chrome browser keeps idle HTTP persistent | global statistics. Chrome browser keeps idle HTTP persistent | |||
connections up to 5 to 10 minutes. However the average number of the | connections up to 5 to 10 minutes. However the average number of the | |||
transactions per connection is only 3.3. Due to the low utilization, | transactions per connection is only 3.3. Due to the low utilization, | |||
TCP 3WHS accounts for up to 25% of the HTTP transaction network latency. | TCP 3WHS accounts for up to 25% of the HTTP transaction network latency. | |||
The authors tested a Linux TFO implementation with TFO-enabled Chrome | The authors tested a Linux TFO implementation with TFO-enabled Chrome | |||
browser on popular web sites in emulated environments such as | browser on popular web sites in emulated environments such as | |||
residential broadband and mobile networks. They showed that TFO | residential broadband and mobile networks. They showed that TFO | |||
improves page load time by 10% to 40%. More detailed on the design | improves page load time by 10% to 40%. More details on the design | |||
tradeoffs and measurement can be found at [RCCJB11]. | tradeoffs and measurement can be found at [RCCJR11]. | |||
8. TFO's Applicability | ||||
TFO aims at latency conscious applications that are sensitive to | ||||
TCP's initial connection setup delay. These application protocols | ||||
often employ short-lived TCP connections, or employ long-lived | ||||
connections but are more sensitive to the connection setup delay due | ||||
to, e.g., a more strict connection fail-over requirement. | ||||
Only transaction-type applications where RTT constitutes a | ||||
significant portion of the total end-to-end latency will likely | ||||
benefit from TFO. Moreover, the client request must fit in the SYN | ||||
packet. Otherwise there may not be any saving in the total number of | ||||
round trips required to complete a transaction. | ||||
To the extent possible applications protocols SHOULD employ long- | ||||
lived connections to best take advantage of TCP's built-in congestion | ||||
control algorithm, and to reduce the impact from TCP's connection | ||||
setup overhead. E.g., for the web applications, P-HTTP will likely | ||||
help and is much easier to deploy hence should be attempted first. | ||||
TFO will likely provide further latency reduction on top of P-HTTP. | ||||
But the additional benefit will depend on how much persistency one | ||||
can get from HTTP in a given operating environment. | ||||
One alternative to short-lived TCP connection might be UDP, which is | ||||
connectionless hence doesn't inflict any connection setup delay, and | ||||
is best suited for application protocols that are transactional. | ||||
Practical deployment issues such as middle-box and/or firewall | ||||
traversal may severely limit the use of UDP based application | ||||
protocols though. | ||||
Note that when the application employs too many short-lived | ||||
connections, it may negatively impact network stability, as these | ||||
connections often exit before TCP's congestion control algorithm | ||||
kicks in. Implementations supporting large number of short-lived | ||||
connections should employ temporal sharing of TCB data as described | ||||
in [RFC2140]. | ||||
More discussion on TCP Fast Open and its projected performance | ||||
benefit can be found in [RCCJB11]. | ||||
9. Related Work | 9. Related Work | |||
9.1. T/TCP | 9.1. T/TCP | |||
TCP Extensions for Transactions [RFC1644] attempted to bypass the | TCP Extensions for Transactions [RFC1644] attempted to bypass the | |||
three-way handshake, among other things, hence shared the same goal | three-way handshake, among other things, hence shared the same goal | |||
but also the same set of issues as TFO. It focused most of its effort | but also the same set of issues as TFO. It focused most of its effort | |||
battling old or duplicate SYNs, but paid no attention to security | battling old or duplicate SYNs, but paid no attention to security | |||
vulnerabilities it introduced when bypassing 3WHS. Its TAO option and | vulnerabilities it introduced when bypassing 3WHS. Its TAO option and | |||
skipping to change at page 21, line 21 | skipping to change at page 22, line 11 | |||
[PHRACK98] "T/TCP vulnerabilities", Phrack Magazine, Volume 8, Issue | [PHRACK98] "T/TCP vulnerabilities", Phrack Magazine, Volume 8, Issue | |||
53 article 6. July 8, 1998. URL | 53 article 6. July 8, 1998. URL | |||
http://www.phrack.com/issues.html?issue=53&id=6 | http://www.phrack.com/issues.html?issue=53&id=6 | |||
[QWGMSS11] F. Qian, Z. Wang, A. Gerber, Z. Mao, S. Sen, O. | [QWGMSS11] F. Qian, Z. Wang, A. Gerber, Z. Mao, S. Sen, O. | |||
Spatscheck. "Profiling Resource Usage for Mobile | Spatscheck. "Profiling Resource Usage for Mobile | |||
Applications: A Cross-layer Approach", In Proceedings of | Applications: A Cross-layer Approach", In Proceedings of | |||
International Conference on Mobile Systems. April 2011. | International Conference on Mobile Systems. April 2011. | |||
[RCCJB11] Radhakrishnan, S., Cheng, Y., Chu, J., Jain, A. and B. | [RCCJR11] Radhakrishnan, S., Cheng, Y., Chu, J., Jain, A. and | |||
Raghavan, "TCP Fast Open". In Proceedings of 7th ACM CoNEXT | Raghavan, B., "TCP Fast Open". In Proceedings of 7th ACM | |||
Conference, December 2011. | CoNEXT Conference, December 2011. | |||
[RFC1644] Braden, R., "T/TCP -- TCP Extensions for Transactions | [RFC1644] Braden, R., "T/TCP -- TCP Extensions for Transactions | |||
Functional Specification", RFC 1644, July 1994. | Functional Specification", RFC 1644, July 1994. | |||
[RFC2140] Touch, J., "TCP Control Block Interdependence", RFC2140, | [RFC2140] Touch, J., "TCP Control Block Interdependence", RFC2140, | |||
April 1997. | April 1997. | |||
[RFC4987] Eddy, W., "TCP SYN Flooding Attacks and Common | [RFC4987] Eddy, W., "TCP SYN Flooding Attacks and Common | |||
Mitigations", RFC 4987, August 2007. | Mitigations", RFC 4987, August 2007. | |||
skipping to change at page 22, line 6 | skipping to change at page 23, line 5 | |||
[THK98] Touch, J., Heidemann, J., Obraczka, K., "Analysis of HTTP | [THK98] Touch, J., Heidemann, J., Obraczka, K., "Analysis of HTTP | |||
Performance", USC/ISI Research Report 98-463. December | Performance", USC/ISI Research Report 98-463. December | |||
1998. | 1998. | |||
[BOB12] Briscoe, B., "Some ideas building on draft-ietf-tcpm- | [BOB12] Briscoe, B., "Some ideas building on draft-ietf-tcpm- | |||
fastopen-01", tcpm list, | fastopen-01", tcpm list, | |||
http://www.ietf.org/mail-archive/web/tcpm/current/ | http://www.ietf.org/mail-archive/web/tcpm/current/ | |||
msg07192.html | msg07192.html | |||
Author's Addresses | Authors' Addresses | |||
Yuchung Cheng | Yuchung Cheng | |||
Google, Inc. | Google, Inc. | |||
1600 Amphitheatre Parkway | 1600 Amphitheatre Parkway | |||
Mountain View, CA 94043, USA | Mountain View, CA 94043, USA | |||
EMail: ycheng@google.com | EMail: ycheng@google.com | |||
Jerry Chu | Jerry Chu | |||
Google, Inc. | Google, Inc. | |||
1600 Amphitheatre Parkway | 1600 Amphitheatre Parkway | |||
End of changes. 52 change blocks. | ||||
198 lines changed or deleted | 250 lines changed or added | |||