--- 1/draft-ietf-tcpm-cubic-02.txt 2016-12-02 08:13:15.449076692 -0800
+++ 2/draft-ietf-tcpm-cubic-03.txt 2016-12-02 08:13:15.485077599 -0800
@@ -1,25 +1,26 @@
TCP Maintenance and Minor Extensions (TCPM) WG I. Rhee
Internet-Draft NCSU
Intended status: Informational L. Xu
-Expires: February 6, 2017 UNL
+Expires: June 5, 2017 UNL
S. Ha
Colorado
A. Zimmermann
+
L. Eggert
- R. Scheffenegger
NetApp
- August 5, 2016
+ R. Scheffenegger
+ December 2, 2016
CUBIC for Fast Long-Distance Networks
- draft-ietf-tcpm-cubic-02
+ draft-ietf-tcpm-cubic-03
Abstract
CUBIC is an extension to the current TCP standards. The protocol
differs from the current TCP standards only in the congestion window
adjustment function in the sender side. In particular, it uses a
cubic function instead of a linear window increase of the current TCP
standards to improve scalability and stability under fast and long-
distance networks. BIC-TCP, a predecessor of CUBIC, has been a
default TCP adopted by Linux since year 2005 and has already been
@@ -42,21 +43,21 @@
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
- This Internet-Draft will expire on February 6, 2017.
+ This Internet-Draft will expire on June 5, 2017.
Copyright Notice
Copyright (c) 2016 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
@@ -69,38 +70,39 @@
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Conventions . . . . . . . . . . . . . . . . . . . . . . . . . 5
3. CUBIC Congestion Control . . . . . . . . . . . . . . . . . . 5
3.1. Window growth function . . . . . . . . . . . . . . . . . 5
3.2. TCP-friendly region . . . . . . . . . . . . . . . . . . . 6
3.3. Concave region . . . . . . . . . . . . . . . . . . . . . 7
3.4. Convex region . . . . . . . . . . . . . . . . . . . . . . 7
3.5. Multiplicative decrease . . . . . . . . . . . . . . . . . 7
- 3.6. Fast convergence . . . . . . . . . . . . . . . . . . . . 7
+ 3.6. Fast convergence . . . . . . . . . . . . . . . . . . . . 8
+ 3.7. Timeout . . . . . . . . . . . . . . . . . . . . . . . . . 8
4. Discussion . . . . . . . . . . . . . . . . . . . . . . . . . 8
- 4.1. Fairness to standard TCP . . . . . . . . . . . . . . . . 8
+ 4.1. Fairness to standard TCP . . . . . . . . . . . . . . . . 9
4.2. Using Spare Capacity . . . . . . . . . . . . . . . . . . 10
4.3. Difficult Environments . . . . . . . . . . . . . . . . . 11
4.4. Investigating a Range of Environments . . . . . . . . . . 11
4.5. Protection against Congestion Collapse . . . . . . . . . 11
4.6. Fairness within the Alternative Congestion Control
Algorithm. . . . . . . . . . . . . . . . . . . . . . . . 11
- 4.7. Performance with Misbehaving Nodes and Outside Attackers 11
- 4.8. Behavior for Application-Limited Flows . . . . . . . . . 11
- 4.9. Responses to Sudden or Transient Events . . . . . . . . . 11
+ 4.7. Performance with Misbehaving Nodes and Outside Attackers 12
+ 4.8. Behavior for Application-Limited Flows . . . . . . . . . 12
+ 4.9. Responses to Sudden or Transient Events . . . . . . . . . 12
4.10. Incremental Deployment . . . . . . . . . . . . . . . . . 12
5. Security Considerations . . . . . . . . . . . . . . . . . . . 12
6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 12
7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 12
8. References . . . . . . . . . . . . . . . . . . . . . . . . . 12
- 8.1. Normative References . . . . . . . . . . . . . . . . . . 12
+ 8.1. Normative References . . . . . . . . . . . . . . . . . . 13
8.2. Informative References . . . . . . . . . . . . . . . . . 13
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 14
1. Introduction
The low utilization problem of TCP in fast long-distance networks is
well documented in [K03][RFC3649]. This problem arises from a slow
increase of congestion window following a congestion event in a
network with a large bandwidth delay product (BDP). Our experience
[HKLRX06] indicates that this problem is frequently observed even in
@@ -111,88 +113,88 @@
variants, including TCP-RENO [RFC5681], TCP-NewReno [RFC6582], TCP-
SACK [RFC2018], SCTP [RFC4960], TFRC [RFC5348] that use the same
linear increase function for window growth, which we refer to
collectively as Standard TCP below.
CUBIC [HRX08] is a modification to the congestion control mechanism
of Standard TCP, in particular, to the window increase function of
Standard TCP senders, to remedy this problem. It uses a cubic
increase function in terms of the elapsed time from the last
congestion event. While most alternative algorithms to Standard TCP
- uses a convex increase function where after a loss event, the window
- increment is always increasing, CUBIC uses both the concave and
- convex profiles of a cubic function for window increase. After a
- window reduction following a loss event, it registers the window size
- where it got the loss event as W_max and performs a multiplicative
- decrease of congestion window and the regular fast recovery and
- retransmit of Standard TCP. After it enters into congestion
- avoidance from fast recovery, it starts to increase the window using
- the concave profile of the cubic function. The cubic function is set
- to have its plateau at W_max so the concave growth continues until
- the window size becomes W_max. After that, the cubic function turns
- into a convex profile and the convex window growth begins. This
- style of window adjustment (concave and then convex) improves
- protocol and network stability while maintaining high network
- utilization [CEHRX07]. This is because the window size remains
- almost constant, forming a plateau around W_max where network
+ uses a convex increase function where during congestion avoidance the
+ window increment is always increasing, CUBIC uses both the concave
+ and convex profiles of a cubic function for window increase. After a
+ window reduction following a loss event detected by duplicate ACKs,
+ it registers the window size where it got the loss event as W_max and
+ performs a multiplicative decrease of congestion window and the
+ regular fast recovery and retransmit of Standard TCP. After it
+ enters into congestion avoidance from fast recovery, it starts to
+ increase the window using the concave profile of the cubic function.
+ The cubic function is set to have its plateau at W_max so the concave
+ growth continues until the window size becomes W_max. After that,
+ the cubic function turns into a convex profile and the convex window
+ growth begins. This style of window adjustment (concave and then
+ convex) improves protocol and network stability while maintaining
+ high network utilization [CEHRX07]. This is because the window size
+ remains almost constant, forming a plateau around W_max where network
utilization is deemed highest and under steady state, most window
size samples of CUBIC are close to W_max, thus promoting high network
utilization and protocol stability. Note that protocols with convex
increase functions have the maximum increments around W_max and
introduces a large number of packet bursts around the saturation
point of the network, likely causing frequent global loss
synchronizations.
Another notable feature of CUBIC is that its window increase rate is
mostly independent of RTT, and follows a (cubic) function of the
- elapsed time since the last loss event. This feature promotes per-
- flow fairness to Standard TCP as well as RTT-fairness. Note that
- Standard TCP performs well under short RTT and small bandwidth (or
- small BDP) networks. Only in a large long RTT and large bandwidth
- (or large BDP) networks, it has the scalability problem. An
- alternative protocol to Standard TCP designed to be friendly to
- Standard TCP at a per-flow basis must operate to increase its window
- much less aggressively in small BDP networks than in large BDP
- networks. In CUBIC, its window growth rate is slowest around the
+ elapsed time from the beginning of congestion avoidance. This
+ feature promotes per-flow fairness to Standard TCP as well as RTT-
+ fairness. Note that Standard TCP performs well under short RTT and
+ small bandwidth (or small BDP) networks. Only in a large long RTT
+ and large bandwidth (or large BDP) networks, it has the scalability
+ problem. An alternative protocol to Standard TCP designed to be
+ friendly to Standard TCP at a per-flow basis must operate to increase
+ its window much less aggressively in small BDP networks than in large
+ BDP networks. In CUBIC, its window growth rate is slowest around the
inflection point of the cubic function and this function does not
depend on RTT. In a smaller BDP network where Standard TCP flows are
working well, the absolute amount of the window decrease at a loss
event is always smaller because of the multiplicative decrease.
Therefore, in CUBIC, the starting window size after a loss event from
which the window starts to increase, is smaller in a smaller BDP
network, thus falling nearer to the plateau of the cubic function
where the growth rate is slowest. By setting appropriate values of
the cubic function parameters, CUBIC sets its growth rate always no
faster than Standard TCP around its inflection point. When the cubic
function grows slower than the window of Standard TCP, CUBIC simply
follows the window size of Standard TCP to ensure fairness to
Standard TCP in a small BDP network. We call this region where CUBIC
behaves like Standard TCP, the TCP-friendly region.
CUBIC maintains the same window growth rate independent of RTTs
outside of the TCP-friendly region, and flows with different RTTs
have the similar window sizes under steady state when they operate
outside the TCP-friendly region. This ensures CUBIC flows with
- different RTTs to have their bandwidth shares linearly proportional
- to the inverse of their RTT ratio (the longer RTT, the smaller the
- share). This behavior is the same as that of Standard TCP under high
- statistical multiplexing environments where packet losses are
- independent of individual flow rates. However, under low statistical
- multiplexing environments, the bandwidth share ratio of Standard TCP
- flows with different RTTs is squarely proportional to the inverse of
- their RTT ratio [XHR04]. CUBIC always ensures the linear ratio
- independent of the levels of statistical multiplexing. This is an
- improvement over Standard TCP. While there is no consensus on a
- particular bandwidth share ratios of different RTT flows, we believe
- that under wired Internet, use of the linear share notion seems more
- reasonable than equal share or a higher order shares. H-TCP [LS08]
- currently uses the equal share.
+ different RTTs to have their bandwidth shares (approximately, window/
+ RTT) linearly proportional to the inverse of their RTT ratio (the
+ longer RTT, the smaller the share). This behavior is the same as
+ that of Standard TCP under high statistical multiplexing environments
+ where packet losses are independent of individual flow rates.
+ However, under low statistical multiplexing environments, the
+ bandwidth share ratio of Standard TCP flows with different RTTs is
+ squarely proportional to the inverse of their RTT ratio [XHR04].
+ CUBIC always ensures the linear ratio independent of the levels of
+ statistical multiplexing. This is an improvement over Standard TCP.
+ While there is no consensus on a particular bandwidth share ratios of
+ different RTT flows, we believe that under wired Internet, use of the
+ linear share notion seems more reasonable than equal share or a
+ higher order shares. H-TCP [LS08] currently uses the equal share.
CUBIC sets the multiplicative window decrease factor to 0.7 while
Standard TCP uses 0.5. While this improves the scalability of the
protocol, a side effect of this decision is slower convergence
especially under low statistical multiplexing environments. This
design choice is following the observation that the author of HSTCP
[RFC3649] has made along with other researchers (e.g., [GV02]): the
current Internet becomes more asynchronous with less frequent loss
synchronizations with high statistical multiplexing. Under this
environment, even strict MIMD can converge. CUBIC flows with the
@@ -234,40 +236,46 @@
where C is a constant fixed to determine the aggressiveness of window
growth in high BDP networks, t is the elapsed time from the last
window reduction (measured right after the fast recovery), and K is
the time period that the above function takes to increase the current
window size to W_max if there is no further loss event and is
calculated by using the following equation:
K = cubic_root(W_max*(1-beta_cubic)/C) (Eq. 2)
where beta_cubic is the CUBIC multiplication decrease factor, that
- is, when a packet loss occurs, CUBIC reduces its current window cwnd
- to cwnd*beta_cubic. We discuss how we set C in the next Section in
- more details.
+ is, when a packet loss (detected by duplicate ACKs) occurs, CUBIC
+ reduces its current window cwnd to W_cubic(0)=W_max*beta_cubic. We
+ discuss how we set C in the next Section in more details.
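As a rough illustration of Eq. 2 and of the curve it parameterizes, the
following Python sketch computes K and W_cubic(t). It assumes that Eq. 1
(not shown in this excerpt) has the form W_cubic(t) = C*(t-K)^3 + W_max,
and it assumes C = 0.4 with windows in segments and time in seconds. The
function names are illustrative and do not come from the draft.

```python
C = 0.4           # assumed aggressiveness constant (not given in this excerpt)
BETA_CUBIC = 0.7  # multiplicative decrease factor from Section 3.5


def cubic_k(w_max, c=C, beta=BETA_CUBIC):
    """Eq. 2: time needed to grow from W_max*beta_cubic back to W_max."""
    return (w_max * (1.0 - beta) / c) ** (1.0 / 3.0)


def w_cubic(t, w_max, c=C, beta=BETA_CUBIC):
    """Assumed form of Eq. 1: C*(t-K)^3 + W_max."""
    k = cubic_k(w_max, c, beta)
    return c * (t - k) ** 3 + w_max
```

Under these assumptions the curve starts at W_max*beta_cubic for t = 0 and
reaches its plateau W_max at t = K, which matches the revised wording
W_cubic(0)=W_max*beta_cubic.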
Upon receiving an ACK during congestion avoidance, CUBIC computes the
window growth rate during the next RTT period using Eq. 1. It sets
- W_cubic(t+RTT) as the candidate target value of congestion window.
+ W_cubic(t+RTT) as the candidate target value of congestion window,
+ where RTT is the weighted average RTT calculated by the standard TCP.
Depending on the value of the current window size cwnd, CUBIC runs in
three different modes. First, if cwnd is less than the window size
that Standard TCP would reach at time t after the last loss event,
then CUBIC is in the TCP friendly region (we describe below how to
determine this window size of Standard TCP in term of time t).
Otherwise, if cwnd is less than W_max, then CUBIC is the concave
region, and if cwnd is larger than W_max, CUBIC is in the convex
region. Below, we describe the exact actions taken by CUBIC in each
region.
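The three-way decision described above can be sketched as a small Python
helper. The argument w_tcp stands for the window Standard TCP would have
reached at time t; it is taken as an input here, and the function and
label names are hypothetical.

```python
def cubic_region(cwnd, w_max, w_tcp):
    """Return the operating region named in the text for the current cwnd.

    w_tcp is the window Standard TCP would have reached at time t
    (an input to this sketch, not computed here).
    """
    if cwnd < w_tcp:
        return "tcp-friendly"
    if cwnd < w_max:
        return "concave"
    # The text only names cwnd > W_max as convex; treating cwnd == W_max
    # as convex too is an assumption of this sketch.
    return "convex"
```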
3.2. TCP-friendly region
+ Standard TCP performs well in certain types of networks, for example,
+ under short RTT and small bandwidth (or small BDP) networks. In
+ these networks, we use the TCP-friendly region to ensure that CUBIC
+ achieves at least the same throughput as the standard TCP.
+
When receiving an ACK in congestion avoidance, we first check whether
the protocol is in the TCP region or not. This is done by estimating
the average rate of the Standard TCP using a simple analysis
described in [FHP00]. It considers the Standard TCP as a special
case of an Additive Increase and Multiplicative Decrease algorithm
(AIMD), which has an additive factor alpha_aimd and a multiplicative
factor beta_aimd with the following function:
AVG_W_aimd = [ alpha_aimd * (1+beta_aimd) /
(2*(1-beta_aimd)*p) ]^0.5 (Eq. 3)
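A minimal Python sketch of Eq. 3 follows. The default values alpha_aimd = 1
and beta_aimd = 0.5 are assumptions standing in for Standard TCP; p is the
packet loss rate, and the function name is illustrative.

```python
import math


def avg_w_aimd(p, alpha_aimd=1.0, beta_aimd=0.5):
    """Eq. 3: average AIMD window (in segments) at packet loss rate p."""
    return math.sqrt(alpha_aimd * (1.0 + beta_aimd) /
                     (2.0 * (1.0 - beta_aimd) * p))
```

With these assumed defaults the expression reduces to sqrt(1.5/p).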
@@ -285,46 +293,49 @@
If W_cubic(t) is less than W_aimd(t), then the protocol is in the TCP-
friendly region and cwnd SHOULD be set to W_aimd(t) at each reception
of ACK.
3.3. Concave region
When receiving an ACK in congestion avoidance, if the protocol is not
in the TCP-friendly region and cwnd is less than W_max, then the
protocol is in the concave region. In this region, cwnd MUST be
- incremented by (W_cubic(t+RTT) - cwnd)/cwnd for each received ACK.
+ incremented by (W_cubic(t+RTT) - cwnd)/cwnd for each received ACK,
+ where W_cubic(t+RTT) is calculated using Eq. 1.
3.4. Convex region
When the current window size of CUBIC is larger than W_max, it passes
the plateau of the cubic function after which CUBIC follows the
convex profile of the cubic function. Since cwnd is larger than the
previous saturation point W_max, this indicates that the network
conditions might have been perturbed since the last loss event,
possibly implying more available bandwidth after some flow
departures. Since the Internet is highly asynchronous, some amount
of perturbation is always possible without causing a major change in
available bandwidth. In this phase, CUBIC is being very careful by
very slowly increasing its window size. The convex profile ensures
that the window increases very slowly at the beginning and gradually
increases its growth rate. We also call this phase as the maximum
probing phase since CUBIC is searching for a new W_max. In this
region, cwnd MUST be incremented by (W_cubic(t+RTT) - cwnd)/cwnd for
- each received ACK.
+ each received ACK, where W_cubic(t+RTT) is calculated using Eq. 1.
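The per-ACK update shared by the concave and convex regions might look
like the Python sketch below. It again assumes the cubic form
C*(t-K)^3 + W_max for Eq. 1 and C = 0.4; the function name is
hypothetical, and no clamping of the increment is applied.

```python
def cwnd_after_ack(cwnd, w_max, t, rtt, c=0.4, beta=0.7):
    """Apply one ACK's increment (W_cubic(t+RTT) - cwnd)/cwnd to cwnd."""
    k = (w_max * (1.0 - beta) / c) ** (1.0 / 3.0)   # Eq. 2
    target = c * (t + rtt - k) ** 3 + w_max         # assumed form of Eq. 1
    return cwnd + (target - cwnd) / cwnd
```

Because roughly cwnd ACKs arrive per RTT, the increments add up to about
(target - cwnd) over one round trip, so the window tracks the cubic curve.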
3.5. Multiplicative decrease
- When a packet loss occurs, CUBIC reduces its window size by a factor
- of beta. Parameter beta_cubic SHOULD be set to 0.7.
+ When a packet loss (detected by duplicate ACKs) occurs, CUBIC updates
+ its W_max, cwnd, and ssthresh (slow start threshold) as follows.
+ Parameter beta_cubic SHOULD be set to 0.7.
W_max = cwnd; // save window size before reduction
+ ssthresh = cwnd * beta_cubic; // new slow start threshold
cwnd = cwnd * beta_cubic; // window reduction
A side effect of setting beta_cubic to a bigger value than 0.5 is
slower convergence. We believe that while a more adaptive setting of
beta_cubic could result in faster convergence, it will make the
analysis of the protocol much harder. This adaptive adjustment of
beta_cubic is an item for the next version of CUBIC.
3.6. Fast convergence
@@ -346,20 +357,25 @@
    W_max = W_max*(1+beta_cubic)/2; // further reduce W_max
} else { // check upward trend
    W_last_max = W_max; // remember the last W_max
}
This allows W_max to be slightly less than the original W_max. Since
flows spend most of time around their W_max, flows with larger
bandwidth shares tend to spend more time around the plateau allowing
more time for flows with smaller shares to increase their windows.
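Putting Sections 3.5 and 3.6 together, a loss handler could be sketched in
Python as follows. The opening condition of the fast-convergence branch is
not visible in this excerpt; the test w_max < w_last_max and the update of
w_last_max inside that branch are assumptions, as is the function name.

```python
BETA_CUBIC = 0.7  # value recommended in Section 3.5


def on_loss(cwnd, w_last_max, beta=BETA_CUBIC):
    """Return (cwnd, ssthresh, w_max, w_last_max) after a loss detected
    by duplicate ACKs."""
    w_max = cwnd                 # save window size before reduction
    ssthresh = cwnd * beta       # new slow start threshold
    new_cwnd = cwnd * beta       # window reduction
    if w_max < w_last_max:       # assumed: saturation point is shrinking
        w_last_max = w_max
        w_max = w_max * (1.0 + beta) / 2.0   # further reduce W_max
    else:                        # upward trend
        w_last_max = w_max       # remember the last W_max
    return new_cwnd, ssthresh, w_max, w_last_max
```

For example, a loss at cwnd = 100 after a previous W_max of 120 yields a
new W_max of about 85 rather than 100, leaving room for competing flows.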
+3.7. Timeout
+
+ In case of timeout, CUBIC follows the standard TCP to reduce cwnd,
+ but sets ssthresh using beta_cubic (same as in Section 3.5).
+
4. Discussion
With a deterministic loss model where the number of packets between
two successive lost events is always 1/p, CUBIC always operates with
the concave window profile which greatly simplifies the performance
analysis of CUBIC. The average window size of CUBIC can be obtained
by the following function:
AVG_W_cubic = [C*(3+beta_cubic)/(4*(1-beta_cubic))]^0.25 *
(RTT^0.75) / (p^0.75) (Eq. 5)
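Eq. 5 can be evaluated numerically with a short Python sketch. C = 0.4 is
an assumed value, beta_cubic = 0.7 follows Section 3.5, RTT is in seconds,
and the function name is illustrative.

```python
def avg_w_cubic(p, rtt, c=0.4, beta=0.7):
    """Eq. 5: average CUBIC window (segments) under the deterministic
    loss model, for loss rate p and round-trip time rtt in seconds."""
    coeff = (c * (3.0 + beta) / (4.0 * (1.0 - beta))) ** 0.25
    return coeff * rtt ** 0.75 / p ** 0.75
```

The p^0.75 term means that cutting the loss rate by a factor of 16 raises
the average window by a factor of 8.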
@@ -477,53 +493,53 @@
Table 3
Our test results in [HKLRX06] indicate that CUBIC uses the spare
bandwidth left unused by existing Standard TCP flows in the same
bottleneck link without taking away much bandwidth from the existing
flows.
4.3. Difficult Environments
CUBIC is designed to remedy the poor performance of TCP in fast long-
- distance networks. It is not designed for wireless networks.
+ distance networks.
4.4. Investigating a Range of Environments
CUBIC has been extensively studied by using both NS-2 simulation and
testbed experiments covering a wide range of network environments.
More information can be found in [HKLRX06].
4.5. Protection against Congestion Collapse
- In case that there is congestion collapse, CUBIC behaves likely
- standard TCP since CUBIC modifies only the window adjustment
- algorithm of TCP. Thus, it does not modify the ACK clocking and
- Timeout behaviors of Standard TCP.
+ With regard to the potential of causing congestion collapse, CUBIC
+ behaves like standard TCP since CUBIC modifies only the window
+ adjustment algorithm of TCP. Thus, it does not modify the ACK
+ clocking and Timeout behaviors of Standard TCP.
4.6. Fairness within the Alternative Congestion Control Algorithm.
CUBIC ensures convergence of competing CUBIC flows with the same RTT
in the same bottleneck links to an equal bandwidth share. When
competing flows have different RTTs, their bandwidth shares are
linearly proportional to the inverse of their RTT ratios. This is
true independent of the level of statistical multiplexing in the
link.
4.7. Performance with Misbehaving Nodes and Outside Attackers
This is not considered in the current CUBIC.
4.8. Behavior for Application-Limited Flows
CUBIC does not raise its congestion window size if the flow is
currently limited by the application instead of the congestion
- window. In cases of idle periods, t in Eq. 1 should not include the
+ window. In cases of idle periods, t in Eq. 1 MUST NOT include the
idle time; otherwise, W_cubic(t) might be very high after restarting
from a long idle time.
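One way to keep idle time out of t is to shift the start of the current
epoch forward by the idle duration. The Python sketch below illustrates
that idea only; the class, its method names, and the explicit
idle-start/idle-end notifications are hypothetical and are not specified
by the draft.

```python
class CubicEpoch:
    """Tracks t for Eq. 1 while excluding idle periods (illustrative)."""

    def __init__(self, now):
        self.epoch_start = now   # start of the current growth epoch
        self.idle_since = None   # set while the flow has nothing to send

    def on_idle_start(self, now):
        if self.idle_since is None:
            self.idle_since = now

    def on_idle_end(self, now):
        if self.idle_since is not None:
            # Shift the epoch forward so the idle gap is not counted in t.
            self.epoch_start += now - self.idle_since
            self.idle_since = None

    def elapsed(self, now):
        if self.idle_since is not None:
            now = self.idle_since  # freeze t while idle
        return now - self.epoch_start
```

With this bookkeeping, a flow that was active for 2 seconds, idle for 10,
and active for 1 more second sees t = 3 rather than t = 13.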
4.9. Responses to Sudden or Transient Events
In case that there is a sudden congestion, a routing change, or a
mobility event, CUBIC behaves the same as Standard TCP.
4.10. Incremental Deployment
@@ -629,48 +644,39 @@
North Carolina State University
Department of Computer Science
Raleigh, NC 27695-7534
US
Email: rhee@ncsu.edu
Lisong Xu
University of Nebraska-Lincoln
Department of Computer Science and Engineering
- Lincoln, NE 68588-01150
+ Lincoln, NE 68588-0115
US
Email: xu@unl.edu
-
Sangtae Ha
University of Colorado at Boulder
Department of Computer Science
Boulder, CO 80309-0430
US
Email: sangtae.ha@colorado.edu
Alexander Zimmermann
- NetApp
- Sonnenallee 1
- Kirchheim 85551
- Germany
- Phone: +49 89 900594712
- Email: alexander.zimmermann@netapp.com
+ Phone: +49 175 5766838
+ Email: alexander.zimmermann@rwth-aachen.de
Lars Eggert
NetApp
Sonnenallee 1
Kirchheim 85551
Germany
Phone: +49 151 12055791
Email: lars@netapp.com
+
Richard Scheffenegger
- NetApp
- Am Euro Platz 2
- Vienna 1120
- Austria
- Phone: +43 1 3676811 3146
- Email: rs@netapp.com
+ Email: rscheff@gmx.at