--- 1/draft-paxson-tcpm-rfc2988bis-00.txt 2011-02-09 13:31:52.000000000 +0100 +++ 2/draft-paxson-tcpm-rfc2988bis-01.txt 2011-02-09 13:31:52.000000000 +0100 @@ -1,18 +1,20 @@ Internet Engineering Task Force V. Paxson INTERNET DRAFT ICSI/UC Berkeley -File: draft-paxson-tcpm-rfc2988bis-00.txt M. Allman +File: draft-paxson-tcpm-rfc2988bis-01.txt M. Allman ICSI J. Chu Google - February 2010 + M. Sargent + CWRU + December 6, 2010 Computing TCP's Retransmission Timer Status of this Memo This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that @@ -23,21 +25,21 @@ months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt. The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. - This Internet-Draft will expire on August 1, 2010. + This Internet-Draft will expire on June 6, 2011. Copyright Notice Copyright (c) 2010 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents @@ -298,20 +300,30 @@ Non-Normative References [AP99] Allman, M. and V. Paxson, "On Estimating End-to-End Network Path Properties", SIGCOMM 99. [Chu09] Chu, J., "Tuning TCP Parameters for the 21st Century", http://www.ietf.org/proceedings/75/slides/tcpm-1.pdf, July 2009. + [SLS09] Schulman, A., Levin, D., and Spring, N., "CRAWDAD data set + umd/sigcomm2008 (v. 
2009-03-02)", + http://crawdad.cs.dartmouth.edu/umd/sigcomm2008, March, + 2009. + + [HKA04] Henderson, T., Kotz, D., and Abyzov, I., "CRAWDAD trace + dartmouth/campus/tcpdump/fall03 (v. 2004-11-09)", + http://crawdad.cs.dartmouth.edu/dartmouth/campus/tcpdump/fall03, + November 2004. + [Jac88] Jacobson, V., "Congestion Avoidance and Control", Computer Communication Review, vol. 18, no. 4, pp. 314-329, Aug. 1988. [JK88] Jacobson, V. and M. Karels, "Congestion Avoidance and Control", ftp://ftp.ee.lbl.gov/papers/congavoid.ps.Z. [KP87] Karn, P. and C. Partridge, "Improving Round-Trip Time Estimates in Reliable Transport Protocols", SIGCOMM 87. Author's Addresses @@ -337,58 +349,122 @@ http://www.icir.org/mallman/ H.K. Jerry Chu Google, Inc. 1600 Amphitheatre Parkway Mountain View, CA 94043 Phone: 650-253-3010 Email: hkchu@google.com + Matt Sargent + Case Western Reserve University Olin Building + 10900 Euclid Avenue + Room 505 + Cleveland, OH 44106 + + Phone: 440-223-5932 + Email: mts71@case.edu + Appendix A Choosing a reasonable initial RTO requires balancing two competing considerations: 1. The initial RTO should be sufficiently large to cover most of the end-to-end paths to avoid spurious retransmissions and their associated negative performance impact. 2. The initial RTO should be small enough to ensure a timely recovery from packet loss occurring before an RTT sample is taken. Traditionally, TCP has used 3 seconds as the initial RTO [RFC1122,RFC2988]. This document calls for lowering this value to 1 - second for the following reasons: + second using the following rationale: - Modern networks are simply faster than the state-of-the-art was at the time the initial RTO of 3 seconds was defined. - - Studies have found that the round-trip time of more than 97.5% of + - Studies have found that the round-trip times of more than 97.5% of the connections observed in a large scale analysis were less than 1 second [Chu09], suggesting that 1 second meets criteria 1 above. 
- - In addition, the studies have observed retransmission rates within the - three-way handshake of roughly 2%. This shows that reducing the - initial RTO has benefit to a non-negligible set of connections. + - In addition, the studies observed retransmission rates within + the three-way handshake of roughly 2%. This shows that reducing + the initial RTO has benefit to a non-negligible set of connections. - However, roughly 2.5% of the connections studied in [Chu09] have an RTT longer than 1 second. For those connections, a 1 second - initial RTO guarantees a retransmission during connection establishment - (needed or not). + initial RTO guarantees a retransmission during connection + establishment (needed or not). When this happens, this document calls for reverting to an initial RTO of 3 seconds for the data transmission phase. Therefore, the implications of the spurious retransmission are modest: (1) an extra SYN is transmitted into the network, and (2) according to [RFC5681] the initial congestion window will be limited to 1 segment. While (2) clearly puts such connections at a disadvantage, this document at least resets the RTO such that the connection will not continually run into problems with a short timeout. (Of course, if the RTT is more than three seconds, the - connection will still encounter difficulties. But that is not a new - issue for TCP.) + connection will still encounter difficulties. But that is not a + new issue for TCP.) - In addition, we note that when using timestamps the TCP will be - able to take an RTT sample even in the presence of a spurious - retransmission, hence avoiding concern (2) above. + In addition, we note that when using timestamps, TCP will be able + to take an RTT sample even in the presence of a spurious + retransmission, facilitating convergence to a correct RTT estimate + when the RTT exceeds 1 second. 
+
+   As an additional check on the results presented in [Chu09], we
+   analyzed packet traces of client behavior collected at four
+   different vantage points at different times, as follows:
+
+   Name       Dates           Pkts.  Cnns.  Clnts.  Servs.
+   --------------------------------------------------------
+   LBL-1      Oct/05--Mar/06  292M   242K   228     74K
+   LBL-2      Nov/09--Feb/10  1.1B   1.2M   1047    38K
+   ICSI-1     Sep/11--18/07   137M   2.1M   193     486K
+   ICSI-2     Sep/11--18/08   163M   1.9M   177     277K
+   ICSI-3     Sep/14--21/09   334M   3.1M   170     253K
+   ICSI-4     Sep/11--18/10   298M   5M     183     189K
+   Dartmouth  Jan/4--21/04    1B     4M     3782    132K
+   SIGCOMM    Aug/17--21/08   11.6M  133K   152     29K
+
+   The "LBL" data was taken at the Lawrence Berkeley National
+   Laboratory, the "ICSI" data from the International Computer Science
+   Institute, the "SIGCOMM" data from the wireless network that served
+   the attendees of SIGCOMM 2008, and the "Dartmouth" data was
+   collected from Dartmouth College's wireless network. The latter two
+   datasets are available from the CRAWDAD data repository
+   [HKA04,SLS09]. The table lists the dates of the data collections,
+   the number of packets collected, the number of TCP connections
+   observed, the number of local clients monitored, and the number of
+   remote servers contacted. We consider only connections initiated
+   near the tracing vantage point.
+
+   Analysis of these datasets finds the prevalence of retransmitted
+   SYNs to be between 0.03% (ICSI-4) and roughly 2% (LBL-1 and
+   Dartmouth).
+
+   We then analyzed the data to determine the number of
+   additional---and spurious---retransmissions that would have been
+   incurred if the initial RTO was assumed to be 1 second. In most of
+   the datasets, the proportion of connections with spurious
+   retransmits was less than 0.1%. However, in the Dartmouth dataset
+   approximately 1.1% of the connections would have sent a spurious
+   retransmit with a lower initial RTO. We attribute this to the fact
+   that the monitored network is wireless and therefore susceptible to
+   additional delays from RF effects.
+
+   Finally, there are obviously performance benefits from
+   retransmitting lost SYNs with a reduced initial RTO. Across our
+   datasets, the percentage of connections that retransmitted a SYN and
+   would realize at least a 10% performance improvement by using the
+   smaller initial RTO specified in this document ranges from 43%
+   (LBL-1) to 87% (ICSI-4). The percentage of connections that would
+   realize at least a 50% performance improvement ranges from 17%
+   (ICSI-1 and SIGCOMM) to 73% (ICSI-4).
+
+   From the data to which we have access, we conclude that the lower
+   initial RTO is likely to be beneficial to many connections, and
+   harmful to relatively few.
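
Illustrative note (not part of either draft revision): the trade-off
argued in Appendix A can be sketched in a few lines of Python. This is
a minimal sketch, not an implementation taken from the draft. The
function names and the constants INITIAL_RTO and FALLBACK_RTO are
hypothetical, and the model assumes a single SYN retransmission with
no further loss or backoff.

```python
# Hypothetical constants for illustration; names are not from the draft.
INITIAL_RTO = 1.0    # seconds; the lowered initial RTO argued for in Appendix A
FALLBACK_RTO = 3.0   # seconds; the traditional initial RTO [RFC1122, RFC2988]


def syn_retransmit_is_spurious(handshake_rtt, initial_rto=INITIAL_RTO):
    """Return True if the SYN timer fires before the SYN-ACK can arrive.

    Assumes no packet loss: if the path RTT exceeds the initial RTO, the
    SYN is retransmitted even though the original was delivered.
    """
    return handshake_rtt > initial_rto


def data_phase_initial_rto(syn_timer_expired, initial_rto=INITIAL_RTO):
    """Return the RTO (seconds) to start the data transmission phase with.

    If the timer expired while waiting for the ACK of the SYN and the
    initial RTO was below 3 seconds, revert to 3 seconds, as Appendix A
    describes. Otherwise keep the initial RTO.
    """
    if syn_timer_expired and initial_rto < FALLBACK_RTO:
        return FALLBACK_RTO
    return initial_rto


def handshake_time_after_one_syn_loss(handshake_rtt, initial_rto=INITIAL_RTO):
    """Return the time (seconds) to complete the handshake when the first
    SYN is lost and the retransmitted SYN succeeds (simplifying assumption).
    """
    return initial_rto + handshake_rtt


if __name__ == "__main__":
    # A 1.5 second path triggers a spurious SYN retransmission at RTO = 1 s,
    # after which the data phase starts from the 3 second fallback.
    spurious = syn_retransmit_is_spurious(1.5)
    print(spurious, data_phase_initial_rto(syn_timer_expired=spurious))

    # A lost SYN on a 100 ms path: recovery time with the old and new RTO.
    print(handshake_time_after_one_syn_loss(0.1, initial_rto=3.0),
          handshake_time_after_one_syn_loss(0.1, initial_rto=1.0))
```

Under these assumptions, a connection that genuinely loses its first
SYN recovers 2 seconds sooner with the 1 second initial RTO, while a
connection whose RTT exceeds 1 second pays one extra SYN and then falls
back to the 3 second value for the data phase.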