--- 1/draft-ietf-tcpm-cubic-02.txt 2016-12-02 08:13:15.449076692 -0800 +++ 2/draft-ietf-tcpm-cubic-03.txt 2016-12-02 08:13:15.485077599 -0800 @@ -1,25 +1,26 @@ TCP Maintenance and Minor Extensions (TCPM) WG I. Rhee Internet-Draft NCSU Intended status: Informational L. Xu -Expires: February 6, 2017 UNL +Expires: June 5, 2017 UNL S. Ha Colorado A. Zimmermann + L. Eggert - R. Scheffenegger NetApp - August 5, 2016 + R. Scheffenegger + December 2, 2016 CUBIC for Fast Long-Distance Networks - draft-ietf-tcpm-cubic-02 + draft-ietf-tcpm-cubic-03 Abstract CUBIC is an extension to the current TCP standards. The protocol differs from the current TCP standards only in the congestion window adjustment function in the sender side. In particular, it uses a cubic function instead of a linear window increase of the current TCP standards to improve scalability and stability under fast and long distance networks. BIC-TCP, a predecessor of CUBIC, has been a default TCP adopted by Linux since year 2005 and has already been @@ -42,21 +43,21 @@ Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." - This Internet-Draft will expire on February 6, 2017. + This Internet-Draft will expire on June 5, 2017. Copyright Notice Copyright (c) 2016 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. 
Please review these documents @@ -69,38 +70,39 @@ Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 2. Conventions . . . . . . . . . . . . . . . . . . . . . . . . . 5 3. CUBIC Congestion Control . . . . . . . . . . . . . . . . . . 5 3.1. Window growth function . . . . . . . . . . . . . . . . . 5 3.2. TCP-friendly region . . . . . . . . . . . . . . . . . . . 6 3.3. Concave region . . . . . . . . . . . . . . . . . . . . . 7 3.4. Convex region . . . . . . . . . . . . . . . . . . . . . . 7 3.5. Multiplicative decrease . . . . . . . . . . . . . . . . . 7 - 3.6. Fast convergence . . . . . . . . . . . . . . . . . . . . 7 + 3.6. Fast convergence . . . . . . . . . . . . . . . . . . . . 8 + 3.7. Timeout . . . . . . . . . . . . . . . . . . . . . . . . . 8 4. Discussion . . . . . . . . . . . . . . . . . . . . . . . . . 8 - 4.1. Fairness to standard TCP . . . . . . . . . . . . . . . . 8 + 4.1. Fairness to standard TCP . . . . . . . . . . . . . . . . 9 4.2. Using Spare Capacity . . . . . . . . . . . . . . . . . . 10 4.3. Difficult Environments . . . . . . . . . . . . . . . . . 11 4.4. Investigating a Range of Environments . . . . . . . . . . 11 4.5. Protection against Congestion Collapse . . . . . . . . . 11 4.6. Fairness within the Alternative Congestion Control Algorithm. . . . . . . . . . . . . . . . . . . . . . . . 11 - 4.7. Performance with Misbehaving Nodes and Outside Attackers 11 - 4.8. Behavior for Application-Limited Flows . . . . . . . . . 11 - 4.9. Responses to Sudden or Transient Events . . . . . . . . . 11 + 4.7. Performance with Misbehaving Nodes and Outside Attackers 12 + 4.8. Behavior for Application-Limited Flows . . . . . . . . . 12 + 4.9. Responses to Sudden or Transient Events . . . . . . . . . 12 4.10. Incremental Deployment . . . . . . . . . . . . . . . . . 12 5. Security Considerations . . . . . . . . . . . . . . . . . . . 12 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 12 7. Acknowledgements . . . . . 
. . . . . . . . . . . . . . . . . 12 8. References . . . . . . . . . . . . . . . . . . . . . . . . . 12 - 8.1. Normative References . . . . . . . . . . . . . . . . . . 12 + 8.1. Normative References . . . . . . . . . . . . . . . . . . 13 8.2. Informative References . . . . . . . . . . . . . . . . . 13 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 14 1. Introduction The low utilization problem of TCP in fast long-distance networks is well documented in [K03][RFC3649]. This problem arises from a slow increase of congestion window following a congestion event in a network with a large bandwidth delay product (BDP). Our experience [HKLRX06] indicates that this problem is frequently observed even in @@ -111,88 +113,88 @@ variants, including TCP-RENO [RFC5681], TCP-NewReno [RFC6582], TCP- SACK [RFC2018], SCTP [RFC4960], TFRC [RFC5348] that use the same linear increase function for window growth, which we refer to collectively as Standard TCP below. CUBIC [HRX08] is a modification to the congestion control mechanism of Standard TCP, in particular, to the window increase function of Standard TCP senders, to remedy this problem. It uses a cubic increase function in terms of the elapsed time from the last congestion event. While most alternative algorithms to Standard TCP - uses a convex increase function where after a loss event, the window - increment is always increasing, CUBIC uses both the concave and - convex profiles of a cubic function for window increase. After a - window reduction following a loss event, it registers the window size - where it got the loss event as W_max and performs a multiplicative - decrease of congestion window and the regular fast recovery and - retransmit of Standard TCP. After it enters into congestion - avoidance from fast recovery, it starts to increase the window using - the concave profile of the cubic function. 
The cubic function is set - to have its plateau at W_max so the concave growth continues until - the window size becomes W_max. After that, the cubic function turns - into a convex profile and the convex window growth begins. This - style of window adjustment (concave and then convex) improves - protocol and network stability while maintaining high network - utilization [CEHRX07]. This is because the window size remains - almost constant, forming a plateau around W_max where network + uses a convex increase function where during congestion avoidance the + window increment is always increasing, CUBIC uses both the concave + and convex profiles of a cubic function for window increase. After a + window reduction following a loss event detected by duplicate ACKs, + it registers the window size where it got the loss event as W_max and + performs a multiplicative decrease of congestion window and the + regular fast recovery and retransmit of Standard TCP. After it + enters into congestion avoidance from fast recovery, it starts to + increase the window using the concave profile of the cubic function. + The cubic function is set to have its plateau at W_max so the concave + growth continues until the window size becomes W_max. After that, + the cubic function turns into a convex profile and the convex window + growth begins. This style of window adjustment (concave and then + convex) improves protocol and network stability while maintaining + high network utilization [CEHRX07]. This is because the window size + remains almost constant, forming a plateau around W_max where network utilization is deemed highest and under steady state, most window size samples of CUBIC are close to W_max, thus promoting high network utilization and protocol stability. 
Note that protocols with convex increase functions have the maximum increments around W_max and introduces a large number of packet bursts around the saturation point of the network, likely causing frequent global loss synchronizations. Another notable feature of CUBIC is that its window increase rate is mostly independent of RTT, and follows a (cubic) function of the - elapsed time since the last loss event. This feature promotes per- - flow fairness to Standard TCP as well as RTT-fairness. Note that - Standard TCP performs well under short RTT and small bandwidth (or - small BDP) networks. Only in a large long RTT and large bandwidth - (or large BDP) networks, it has the scalability problem. An - alternative protocol to Standard TCP designed to be friendly to - Standard TCP at a per-flow basis must operate to increase its window - much less aggressively in small BDP networks than in large BDP - networks. In CUBIC, its window growth rate is slowest around the + elapsed time from the beginning of congestion avoidance. This + feature promotes per-flow fairness to Standard TCP as well as RTT- + fairness. Note that Standard TCP performs well under short RTT and + small bandwidth (or small BDP) networks. Only in a large long RTT + and large bandwidth (or large BDP) networks, it has the scalability + problem. An alternative protocol to Standard TCP designed to be + friendly to Standard TCP at a per-flow basis must operate to increase + its window much less aggressively in small BDP networks than in large + BDP networks. In CUBIC, its window growth rate is slowest around the inflection point of the cubic function and this function does not depend on RTT. In a smaller BDP network where Standard TCP flows are working well, the absolute amount of the window decrease at a loss event is always smaller because of the multiplicative decrease. 
Therefore, in CUBIC, the starting window size after a loss event from which the window starts to increase, is smaller in a smaller BDP network, thus falling nearer to the plateau of the cubic function where the growth rate is slowest. By setting appropriate values of the cubic function parameters, CUBIC sets its growth rate always no faster than Standard TCP around its inflection point. When the cubic function grows slower than the window of Standard TCP, CUBIC simply follows the window size of Standard TCP to ensure fairness to Standard TCP in a small BDP network. We call this region where CUBIC behaves like Standard TCP, the TCP-friendly region. CUBIC maintains the same window growth rate independent of RTTs outside of the TCP-friendly region, and flows with different RTTs have the similar window sizes under steady state when they operate outside the TCP-friendly region. This ensures CUBIC flows with - different RTTs to have their bandwidth shares linearly proportional - to the inverse of their RTT ratio (the longer RTT, the smaller the - share). This behavior is the same as that of Standard TCP under high - statistical multiplexing environments where packet losses are - independent of individual flow rates. However, under low statistical - multiplexing environments, the bandwidth share ratio of Standard TCP - flows with different RTTs is squarely proportional to the inverse of - their RTT ratio [XHR04]. CUBIC always ensures the linear ratio - independent of the levels of statistical multiplexing. This is an - improvement over Standard TCP. While there is no consensus on a - particular bandwidth share ratios of different RTT flows, we believe - that under wired Internet, use of the linear share notion seems more - reasonable than equal share or a higher order shares. HTCP [LS08] - currently uses the equal share. 
+ different RTTs to have their bandwidth shares (approximately, window/ + RTT) linearly proportional to the inverse of their RTT ratio (the + longer RTT, the smaller the share). This behavior is the same as + that of Standard TCP under high statistical multiplexing environments + where packet losses are independent of individual flow rates. + However, under low statistical multiplexing environments, the + bandwidth share ratio of Standard TCP flows with different RTTs is + squarely proportional to the inverse of their RTT ratio [XHR04]. + CUBIC always ensures the linear ratio independent of the levels of + statistical multiplexing. This is an improvement over Standard TCP. + While there is no consensus on a particular bandwidth share ratios of + different RTT flows, we believe that under wired Internet, use of the + linear share notion seems more reasonable than equal share or a + higher order shares. HTCP [LS08] currently uses the equal share. CUBIC sets the multiplicative window decrease factor to 0.7 while Standard TCP uses 0.5. While this improves the scalability of the protocol, a side effect of this decision is slower convergence especially under low statistical multiplexing environments. This design choice is following the observation that the author of HSTCP [RFC3649] has made along with other researchers (e.g., [GV02]): the current Internet becomes more asynchronous with less frequent loss synchronizations with high statistical multiplexing. Under this environment, even strict MIMD can converge. CUBIC flows with the @@ -234,40 +236,46 @@ where C is a constant fixed to determine the aggressiveness of window growth in high BDP networks, t is the elapsed time from the last window reduction (measured right after the fast recovery), and K is the time period that the above function takes to increase the current window size to W_max if there is no further loss event and is calculated by using the following equation: K = cubic_root(W_max*(1-beta_cubic)/C) (Eq. 
2) where beta_cubic is the CUBIC multiplication decrease factor, that - is, when a packet loss occurs, CUBIC reduces its current window cwnd - to cwnd*beta_cubic. We discuss how we set C in the next Section in - more details. + is, when a packet loss (detected by duplicate ACKs) occurs, CUBIC + reduces its current window cwnd to W_cubic(0)=W_max*beta_cubic. We + discuss how we set C in the next Section in more details. Upon receiving an ACK during congestion avoidance, CUBIC computes the window growth rate during the next RTT period using Eq. 1. It sets - W_cubic(t+RTT) as the candidate target value of congestion window. + W_cubic(t+RTT) as the candidate target value of congestion window, + where RTT is the weighted average RTT calculated by the standard TCP. Depending on the value of the current window size cwnd, CUBIC runs in three different modes. First, if cwnd is less than the window size that Standard TCP would reach at time t after the last loss event, then CUBIC is in the TCP friendly region (we describe below how to determine this window size of Standard TCP in terms of time t). Otherwise, if cwnd is less than W_max, then CUBIC is in the concave region, and if cwnd is larger than W_max, CUBIC is in the convex region. Below, we describe the exact actions taken by CUBIC in each region. 3.2. TCP-friendly region + Standard TCP performs well in certain types of networks, for example, + under short RTT and small bandwidth (or small BDP) networks. In + these networks, we use the TCP-friendly region to ensure that CUBIC + achieves at least the same throughput as the standard TCP. + When receiving an ACK in congestion avoidance, we first check whether the protocol is in the TCP region or not. This is done by estimating the average rate of the Standard TCP using a simple analysis described in [FHP00]. 
It considers the Standard TCP as a special case of an Additive Increase and Multiplicative Decrease algorithm (AIMD), which has an additive factor alpha_aimd and a multiplicative factor beta_aimd with the following function: AVG_W_aimd = [ alpha_aimd * (1+beta_aimd) / (2*(1-beta_aimd)*p) ]^0.5 (Eq. 3) @@ -285,46 +293,49 @@ If W_cubic(t) is less than W_aimd(t), then the protocol is in the TCP friendly region and cwnd SHOULD be set to W_aimd(t) at each reception of ACK. 3.3. Concave region When receiving an ACK in congestion avoidance, if the protocol is not in the TCP-friendly region and cwnd is less than W_max, then the protocol is in the concave region. In this region, cwnd MUST be - incremented by (W_cubic(t+RTT) - cwnd)/cwnd for each received ACK. + incremented by (W_cubic(t+RTT) - cwnd)/cwnd for each received ACK, + where W_cubic(t+RTT) is calculated using Eq. 1. 3.4. Convex region When the current window size of CUBIC is larger than W_max, it passes the plateau of the cubic function after which CUBIC follows the convex profile of the cubic function. Since cwnd is larger than the previous saturation point W_max, this indicates that the network conditions might have been perturbed since the last loss event, possibly implying more available bandwidth after some flow departures. Since the Internet is highly asynchronous, some amount of perturbation is always possible without causing a major change in available bandwidth. In this phase, CUBIC is being very careful by very slowly increasing its window size. The convex profile ensures that the window increases very slowly at the beginning and gradually increases its growth rate. We also call this phase as the maximum probing phase since CUBIC is searching for a new W_max. In this region, cwnd MUST be incremented by (W_cubic(t+RTT) - cwnd)/cwnd for - each received ACK. + each received ACK, where W_cubic(t+RTT) is calculated using Eq. 1. 3.5. 
Multiplicative decrease - When a packet loss occurs, CUBIC reduces its window size by a factor - of beta. Parameter beta_cubic SHOULD be set to 0.7. + When a packet loss (detected by duplicate ACKs) occurs, CUBIC updates + its W_max, cwnd, and ssthresh (slow start threshold) as follows. + Parameter beta_cubic SHOULD be set to 0.7. W_max = cwnd; // save window size before reduction + ssthresh = cwnd * beta_cubic; // new slow start threshold cwnd = cwnd * beta_cubic; // window reduction A side effect of setting beta_cubic to a bigger value than 0.5 is slower convergence. We believe that while a more adaptive setting of beta_cubic could result in faster convergence, it will make the analysis of the protocol much harder. This adaptive adjustment of beta_cubic is an item for the next version of CUBIC. 3.6. Fast convergence @@ -346,20 +357,25 @@ W_max = W_max*(1+beta_cubic)/2; // further reduce W_max } else { // check upward trend W_last_max = W_max // remember the last W_max } This allows W_max to be slightly less than the original W_max. Since flows spend most of time around their W_max, flows with larger bandwidth shares tend to spend more time around the plateau allowing more time for flows with smaller shares to increase their windows. +3.7. Timeout + + In case of timeout, CUBIC follows the standard TCP to reduce cwnd, + but sets ssthresh using beta_cubic (same as in Section 3.5). + 4. Discussion With a deterministic loss model where the number of packets between two successive lost events is always 1/p, CUBIC always operates with the concave window profile which greatly simplifies the performance analysis of CUBIC. The average window size of CUBIC can be obtained by the following function: AVG_W_cubic = [C*(3+beta_cubic)/(4*(1-beta_cubic))]^0.25 * (RTT^0.75) / (p^0.75) (Eq. 
5) @@ -477,53 +493,53 @@ Table 3 Our test results in [HKLRX06] indicate that CUBIC uses the spare bandwidth left unused by existing Standard TCP flows in the same bottleneck link without taking away much bandwidth from the existing flows. 4.3. Difficult Environments CUBIC is designed to remedy the poor performance of TCP in fast long- - distance networks. It is not designed for wireless networks. + distance networks. 4.4. Investigating a Range of Environments CUBIC has been extensively studied by using both NS-2 simulation and test-bed experiments covering a wide range of network environments. More information can be found in [HKLRX06]. 4.5. Protection against Congestion Collapse - In case that there is congestion collapse, CUBIC behaves likely - standard TCP since CUBIC modifies only the window adjustment - algorithm of TCP. Thus, it does not modify the ACK clocking and - Timeout behaviors of Standard TCP. + With regard to the potential of causing congestion collapse, CUBIC + behaves like standard TCP since CUBIC modifies only the window + adjustment algorithm of TCP. Thus, it does not modify the ACK + clocking and Timeout behaviors of Standard TCP. 4.6. Fairness within the Alternative Congestion Control Algorithm. CUBIC ensures convergence of competing CUBIC flows with the same RTT in the same bottleneck links to an equal bandwidth share. When competing flows have different RTTs, their bandwidth shares are linearly proportional to the inverse of their RTT ratios. This is true independent of the level of statistical multiplexing in the link. 4.7. Performance with Misbehaving Nodes and Outside Attackers This is not considered in the current CUBIC. 4.8. Behavior for Application-Limited Flows CUBIC does not raise its congestion window size if the flow is currently limited by the application instead of the congestion - window. In cases of idle periods, t in Eq. 1 should not include the + window. In cases of idle periods, t in Eq. 
1 MUST NOT include the idle time; otherwise, W_cubic(t) might be very high after restarting from a long idle time. 4.9. Responses to Sudden or Transient Events In case that there is a sudden congestion, a routing change, or a mobility event, CUBIC behaves the same as Standard TCP. 4.10. Incremental Deployment @@ -629,48 +644,39 @@ North Carolina State University Department of Computer Science Raleigh, NC 27695-7534 US Email: rhee@ncsu.edu Lisong Xu University of Nebraska-Lincoln Department of Computer Science and Engineering - Lincoln, NE 68588-01150 + Lincoln, NE 68588-0115 US Email: xu@unl.edu - Sangtae Ha University of Colorado at Boulder Department of Computer Science Boulder, CO 80309-0430 US Email: sangtae.ha@colorado.edu Alexander Zimmermann - NetApp - Sonnenallee 1 - Kirchheim 85551 - Germany - Phone: +49 89 900594712 - Email: alexander.zimmermann@netapp.com + Phone: +49 175 5766838 + Email: alexander.zimmermann@rwth-aachen.de Lars Eggert NetApp Sonnenallee 1 Kirchheim 85551 Germany Phone: +49 151 12055791 Email: lars@netapp.com + Richard Scheffenegger - NetApp - Am Euro Platz 2 - Vienna 1120 - Austria - Phone: +43 1 3676811 3146 - Email: rs@netapp.com + Email: rscheff@gmx.at
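Illustrative note: the window rules touched by this revision (Eq. 2, the W_cubic(0)=W_max*beta_cubic identity, the per-ACK increment of Sections 3.3 and 3.4, and the ssthresh update added in Section 3.5) can be sketched as below. This is a minimal sketch under stated assumptions, not the authors' implementation. Eq. 1 falls outside the hunks shown here, so the form W_cubic(t) = C*(t-K)^3 + W_max and the value C = 0.4 are assumptions taken from the published CUBIC specification. The function names are hypothetical, windows are assumed to be in segments, and times in seconds.

```python
# Illustrative sketch only: function names, units, and C = 0.4 are assumptions.
C = 0.4           # aggressiveness constant (assumed; not stated in the hunks above)
BETA_CUBIC = 0.7  # multiplicative decrease factor (Section 3.5)


def cubic_k(w_max):
    """Eq. 2: time the cubic function needs to climb back to W_max."""
    return (w_max * (1.0 - BETA_CUBIC) / C) ** (1.0 / 3.0)


def w_cubic(t, w_max):
    """Eq. 1 (assumed form): W_cubic(t) = C*(t-K)^3 + W_max."""
    return C * (t - cubic_k(w_max)) ** 3 + w_max


def on_loss(cwnd):
    """Section 3.5: loss detected by duplicate ACKs.

    Returns the new (W_max, ssthresh, cwnd).
    """
    w_max = cwnd                    # save window size before reduction
    ssthresh = cwnd * BETA_CUBIC    # new slow start threshold
    new_cwnd = cwnd * BETA_CUBIC    # window reduction
    return w_max, ssthresh, new_cwnd


def on_ack(cwnd, t, rtt, w_max):
    """Sections 3.3 and 3.4: per-ACK update outside the TCP-friendly region.

    Applies the increment (W_cubic(t+RTT) - cwnd)/cwnd exactly as written;
    any clamping a real stack might apply is not modeled here.
    """
    target = w_cubic(t + rtt, w_max)
    return cwnd + (target - cwnd) / cwnd
```

Under these assumptions, a loss at cwnd = 100 gives W_max = 100 and a new cwnd of 70, and K is roughly 4.2 seconds. Evaluating w_cubic(0, 100) also yields 70, which matches the W_cubic(0)=W_max*beta_cubic wording introduced in Section 3.1 of the -03 text.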