draft-ietf-tcpm-cubic-03.txt | draft-ietf-tcpm-cubic-04.txt | |||
---|---|---|---|---|
TCP Maintenance and Minor Extensions (TCPM) WG I. Rhee | TCP Maintenance and Minor Extensions (TCPM) WG I. Rhee | |||
Internet-Draft NCSU | Internet-Draft NCSU | |||
Intended status: Informational L. Xu | Intended status: Informational L. Xu | |||
Expires: June 5, 2017 UNL | Expires: August 8, 2017 UNL | |||
S. Ha | S. Ha | |||
Colorado | Colorado | |||
A. Zimmermann | A. Zimmermann | |||
L. Eggert | L. Eggert | |||
NetApp | NetApp | |||
R. Scheffenegger | R. Scheffenegger | |||
December 2, 2016 | February 4, 2017 | |||
CUBIC for Fast Long-Distance Networks | CUBIC for Fast Long-Distance Networks | |||
draft-ietf-tcpm-cubic-03 | draft-ietf-tcpm-cubic-04 | |||
Abstract | Abstract | |||
CUBIC is an extension to the current TCP standards. The protocol | CUBIC is an extension to the current TCP standards. The protocol | |||
differs from the current TCP standards only in the congestion window | differs from the current TCP standards only in the congestion window | |||
adjustment function in the sender side. In particular, it uses a | adjustment function in the sender side. In particular, it uses a | |||
cubic function instead of a linear window increase of the current TCP | cubic function instead of a linear window increase function of the | |||
standards to improve scalability and stability under fast and long | current TCP standards to improve scalability and stability under fast | |||
distance networks. BIC-TCP, a predecessor of CUBIC, has been a | and long distance networks. CUBIC and its predecessor algorithm have | |||
default TCP adopted by Linux since year 2005 and has already been | been adopted as default by Linux and have been used for many years. | |||
deployed globally and in use for several years by the Internet | This document provides a specification of CUBIC to enable third party | |||
community at large. CUBIC is using a similar window growth function | implementation and to solicit the community feedback through | |||
as BIC-TCP and is designed to be less aggressive and fairer to TCP in | experimentation on the performance of CUBIC. | |||
bandwidth usage than BIC-TCP while maintaining the strengths of BIC- | ||||
TCP such as stability, window scalability and RTT fairness. Through | ||||
extensive testing in various Internet scenarios, we believe that | ||||
CUBIC is safe for deployment and testing in the global Internet. The | ||||
intent of this document is to provide the protocol specification of | ||||
CUBIC for a third party implementation and solicit the community | ||||
feedback through experimentation on the performance of CUBIC. | ||||
Status of This Memo | Status of This Memo | |||
This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
Drafts is at http://datatracker.ietf.org/drafts/current/. | Drafts is at http://datatracker.ietf.org/drafts/current/. | |||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
This Internet-Draft will expire on June 5, 2017. | This Internet-Draft will expire on August 8, 2017. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2016 IETF Trust and the persons identified as the | Copyright (c) 2017 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
(http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | publication of this document. Please review these documents | |||
carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
include Simplified BSD License text as described in Section 4.e of | include Simplified BSD License text as described in Section 4.e of | |||
the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
described in the Simplified BSD License. | described in the Simplified BSD License. | |||
Table of Contents | Table of Contents | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 | |||
2. Conventions . . . . . . . . . . . . . . . . . . . . . . . . . 5 | 2. Conventions . . . . . . . . . . . . . . . . . . . . . . . . . 3 | |||
3. CUBIC Congestion Control . . . . . . . . . . . . . . . . . . 5 | 3. Design principle of CUBIC . . . . . . . . . . . . . . . . . . 4 | |||
3.1. Window growth function . . . . . . . . . . . . . . . . . 5 | 4. CUBIC Congestion Control . . . . . . . . . . . . . . . . . . 5 | |||
3.2. TCP-friendly region . . . . . . . . . . . . . . . . . . . 6 | 4.1. Window growth function . . . . . . . . . . . . . . . . . 6 | |||
3.3. Concave region . . . . . . . . . . . . . . . . . . . . . 7 | 4.2. TCP-friendly region . . . . . . . . . . . . . . . . . . . 7 | |||
3.4. Convex region . . . . . . . . . . . . . . . . . . . . . . 7 | 4.3. Concave region . . . . . . . . . . . . . . . . . . . . . 7 | |||
3.5. Multiplicative decrease . . . . . . . . . . . . . . . . . 7 | 4.4. Convex region . . . . . . . . . . . . . . . . . . . . . . 7 | |||
3.6. Fast convergence . . . . . . . . . . . . . . . . . . . . 8 | 4.5. Multiplicative decrease . . . . . . . . . . . . . . . . . 8 | |||
3.7. Timeout . . . . . . . . . . . . . . . . . . . . . . . . . 8 | 4.6. Fast convergence . . . . . . . . . . . . . . . . . . . . 8 | |||
4. Discussion . . . . . . . . . . . . . . . . . . . . . . . . . 8 | 4.7. Timeout . . . . . . . . . . . . . . . . . . . . . . . . . 9 | |||
4.1. Fairness to standard TCP . . . . . . . . . . . . . . . . 9 | 5. Discussion . . . . . . . . . . . . . . . . . . . . . . . . . 9 | |||
4.2. Using Spare Capacity . . . . . . . . . . . . . . . . . . 10 | 5.1. Fairness to standard TCP . . . . . . . . . . . . . . . . 9 | |||
4.3. Difficult Environments . . . . . . . . . . . . . . . . . 11 | 5.2. Using Spare Capacity . . . . . . . . . . . . . . . . . . 11 | |||
4.4. Investigating a Range of Environments . . . . . . . . . . 11 | 5.3. Difficult Environments . . . . . . . . . . . . . . . . . 12 | |||
4.5. Protection against Congestion Collapse . . . . . . . . . 11 | 5.4. Investigating a Range of Environments . . . . . . . . . . 12 | |||
4.6. Fairness within the Alternative Congestion Control | 5.5. Protection against Congestion Collapse . . . . . . . . . 12 | |||
Algorithm. . . . . . . . . . . . . . . . . . . . . . . . 11 | 5.6. Fairness within the Alternative Congestion Control | |||
4.7. Performance with Misbehaving Nodes and Outside Attackers 12 | Algorithm. . . . . . . . . . . . . . . . . . . . . . . . 12 | |||
4.8. Behavior for Application-Limited Flows . . . . . . . . . 12 | 5.7. Performance with Misbehaving Nodes and Outside Attackers 12 | |||
4.9. Responses to Sudden or Transient Events . . . . . . . . . 12 | 5.8. Behavior for Application-Limited Flows . . . . . . . . . 12 | |||
4.10. Incremental Deployment . . . . . . . . . . . . . . . . . 12 | 5.9. Responses to Sudden or Transient Events . . . . . . . . . 13 | |||
5. Security Considerations . . . . . . . . . . . . . . . . . . . 12 | 5.10. Incremental Deployment . . . . . . . . . . . . . . . . . 13 | |||
6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 12 | 6. Security Considerations . . . . . . . . . . . . . . . . . . . 13 | |||
7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 12 | 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 13 | |||
8. References . . . . . . . . . . . . . . . . . . . . . . . . . 12 | 8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 13 | |||
8.1. Normative References . . . . . . . . . . . . . . . . . . 13 | 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 13 | |||
8.2. Informative References . . . . . . . . . . . . . . . . . 13 | 9.1. Normative References . . . . . . . . . . . . . . . . . . 13 | |||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 14 | 9.2. Informative References . . . . . . . . . . . . . . . . . 14 | |||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 15 | ||||
1. Introduction | 1. Introduction | |||
The low utilization problem of TCP in fast long-distance networks is | The low utilization problem of TCP in fast long-distance networks is | |||
well documented in [K03][RFC3649]. This problem arises from a slow | well documented in [K03][RFC3649]. This problem arises from a slow | |||
increase of congestion window following a congestion event in a | increase of congestion window following a congestion event in a | |||
network with a large bandwidth delay product (BDP). Our experience | network with a large bandwidth delay product (BDP). Our experience | |||
[HKLRX06] indicates that this problem is frequently observed even in | [HKLRX06] indicates that this problem is frequently observed even in | |||
the range of congestion window sizes over several hundreds of packets | the range of congestion window sizes over several hundreds of packets | |||
(each packet is sized around 1000 bytes) especially under a network | (each packet is sized around 1000 bytes) especially under a network | |||
path with over 100ms round-trip times (RTTs). This problem is | path with over 100ms round-trip times (RTTs). This problem is | |||
equally applicable to all Reno style TCP standards and their | equally applicable to all Reno style TCP standards and their | |||
variants, including TCP-RENO [RFC5681], TCP-NewReno [RFC6582], TCP- | variants, including TCP-RENO [RFC5681], TCP-NewReno [RFC6582], TCP- | |||
SACK [RFC2018], SCTP [RFC4960], TFRC [RFC5348] that use the same | SACK [RFC2018], SCTP [RFC4960], TFRC [RFC5348] that use the same | |||
linear increase function for window growth, which we refer to | linear increase function for window growth, which we refer to | |||
collectively as Standard TCP below. | collectively as Standard TCP below. | |||
CUBIC [HRX08] is a modification to the congestion control mechanism | CUBIC [HRX08] is a modification to the congestion control mechanism | |||
of Standard TCP, in particular, to the window increase function of | of Standard TCP, in particular, to the window increase function of | |||
Standard TCP senders, to remedy this problem. It uses a cubic | Standard TCP senders, to remedy this problem. Specifically, it uses | |||
increase function in terms of the elapsed time from the last | a cubic function instead of a linear window increase function of the | |||
congestion event. While most alternative algorithms to Standard TCP | Standrad TCP to improve scalability and stability under fast and long | |||
uses a convex increase function where during congestion avoidance the | distance networks. | |||
window increment is always increasing, CUBIC uses both the concave | ||||
and convex profiles of a cubic function for window increase. After a | BIC-TCP, a predecessor of CUBIC, has been selected as default TCP | |||
window reduction following a loss event detected by duplicate ACKs, | congestion control algorithm by Linux in the year 2005 and been used | |||
it registers the window size where it got the loss event as W_max and | for several years by the Internet community at large. CUBIC uses a | |||
performs a multiplicative decrease of congestion window and the | similar window growth function as BIC-TCP and is designed to be less | |||
regular fast recovery and retransmit of Standard TCP. After it | aggressive and fairer to TCP in bandwidth usage than BIC-TCP while | |||
enters into congestion avoidance from fast recovery, it starts to | maintaining the strengths of BIC-TCP such as stability, window | |||
increase the window using the concave profile of the cubic function. | scalability and RTT fairness. CUBIC has already been deployed | |||
The cubic function is set to have its plateau at W_max so the concave | globally by Linux. Through extensive testing in various Internet | |||
growth continues until the window size becomes W_max. After that, | scenarios, we believe that CUBIC is safe for testing and deployment | |||
the cubic function turns into a convex profile and the convex window | in the global Internet. | |||
growth begins. This style of window adjustment (concave and then | ||||
convex) improves protocol and network stability while maintaining | In the ensuing sections, we first brefly explain the design principle | |||
high network utilization [CEHRX07]. This is because the window size | of CUBIC, then provide the exact specification of CUBIC, and finally | |||
discuss the safety features of CUBIC following the guidelines | ||||
specified in [RFC5033]. | ||||
2. Conventions | ||||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | ||||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | ||||
document are to be interpreted as described in [RFC2119]. | ||||
3. Design principle of CUBIC | ||||
CUBIC [HRX08] uses a cubic window increase function in terms of the | ||||
elapsed time from the last congestion event. While most alternative | ||||
algorithms to Standard TCP uses a convex increase function where | ||||
during congestion avoidance the window increment is always | ||||
increasing, CUBIC uses both the concave and convex profiles of a | ||||
cubic function for window increase. After a window reduction | ||||
following a loss event detected by duplicate ACKs, it registers the | ||||
window size where it got the loss event as W_max and performs a | ||||
multiplicative decrease of congestion window and the regular fast | ||||
recovery and retransmit of Standard TCP. After it enters into | ||||
congestion avoidance from fast recovery, it starts to increase the | ||||
window using the concave profile of the cubic function. The cubic | ||||
function is set to have its plateau at W_max so the concave growth | ||||
continues until the window size becomes W_max. After that, the cubic | ||||
function turns into a convex profile and the convex window growth | ||||
begins. This style of window adjustment (concave and then convex) | ||||
improves protocol and network stability while maintaining high | ||||
network utilization [CEHRX07]. This is because the window size | ||||
remains almost constant, forming a plateau around W_max where network | remains almost constant, forming a plateau around W_max where network | |||
utilization is deemed highest and under steady state, most window | utilization is deemed highest and under steady state, most window | |||
size samples of CUBIC are close to W_max, thus promoting high network | size samples of CUBIC are close to W_max, thus promoting high network | |||
utilization and protocol stability. Note that protocols with convex | utilization and protocol stability. Note that protocols with convex | |||
increase functions have the maximum increments around W_max and | increase functions have the maximum increments around W_max and | |||
introduces a large number of packet bursts around the saturation | introduces a large number of packet bursts around the saturation | |||
point of the network, likely causing frequent global loss | point of the network, likely causing frequent global loss | |||
synchronizations. | synchronizations. | |||
Another notable feature of CUBIC is that its window increase rate is | Another notable feature of CUBIC is that its window increase rate is | |||
skipping to change at page 5, line 13 ¶ | skipping to change at page 5, line 38 ¶ | |||
higher order shares. HTCP [LS08] currently uses the equal share. | higher order shares. HTCP [LS08] currently uses the equal share. | |||
CUBIC sets the multiplicative window decrease factor to 0.7 while | CUBIC sets the multiplicative window decrease factor to 0.7 while | |||
Standard TCP uses 0.5. While this improves the scalability of the | Standard TCP uses 0.5. While this improves the scalability of the | |||
protocol, a side effect of this decision is slower convergence | protocol, a side effect of this decision is slower convergence | |||
especially under low statistical multiplexing environments. This | especially under low statistical multiplexing environments. This | |||
design choice is following the observation that the author of HSTCP | design choice is following the observation that the author of HSTCP | |||
[RFC3649] has made along with other researchers (e.g., [GV02]): the | [RFC3649] has made along with other researchers (e.g., [GV02]): the | |||
current Internet becomes more asynchronous with less frequent loss | current Internet becomes more asynchronous with less frequent loss | |||
synchronizations with high statistical multiplexing. Under this | synchronizations with high statistical multiplexing. Under this | |||
environment, even strict MIMD can converge. CUBIC flows with the | environment, even strict Multiplicative-Increase Multiplicative- | |||
same RTT always converge to the same share of bandwidth independent | Decrease (MIMD) can converge. CUBIC flows with the same RTT always | |||
of statistical multiplexing, thus achieving intra-protocol fairness. | converge to the same share of bandwidth independent of statistical | |||
We also find that under the environments with sufficient statistical | multiplexing, thus achieving intra-protocol fairness. We also find | |||
multiplexing, the convergence speed of CUBIC flows is reasonable. | that under the environments with sufficient statistical multiplexing, | |||
the convergence speed of CUBIC flows is reasonable. | ||||
In the ensuing sections, we provide the exact specification of CUBIC | ||||
and discuss the safety features of CUBIC following the guidelines | ||||
specified in [RFC5033]. | ||||
2. Conventions | ||||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | ||||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | ||||
document are to be interpreted as described in [RFC2119]. | ||||
3. CUBIC Congestion Control | 4. CUBIC Congestion Control | |||
The unit of all window sizes in this document is segments of the | The unit of all window sizes in this document is segments of the | |||
maximum segment size (MSS), and the unit of all times is seconds. | maximum segment size (MSS), and the unit of all times is seconds. | |||
3.1. Window growth function | 4.1. Window growth function | |||
CUBIC maintains the acknowledgment (ACK) clocking of Standard TCP by | CUBIC maintains the acknowledgment (ACK) clocking of Standard TCP by | |||
increasing congestion window only at the reception of ACK. The | increasing congestion window only at the reception of ACK. The | |||
protocol does not make any change to the fast recovery and retransmit | protocol does not make any change to the fast recovery and retransmit | |||
of TCP, such as TCP-NewReno [RFC6582] and TCP-SACK [RFC2018]. During | of TCP, such as TCP-NewReno [RFC6582] and TCP-SACK [RFC2018]. During | |||
congestion avoidance after fast recovery, CUBIC changes the window | congestion avoidance after fast recovery, CUBIC changes the window | |||
update algorithm of Standard TCP. Suppose that W_max is the window | update algorithm of Standard TCP. Suppose that W_max is the window | |||
size before the window is reduced in the last fast retransmit and | size before the window is reduced in the last fast retransmit and | |||
recovery. | recovery. | |||
The window growth function of CUBIC uses the following function: | The window growth function of CUBIC uses the following function: | |||
W_cubic(t) = C*(t-K)^3 + W_max (Eq. 1) | W_cubic(t) = C*(t-K)^3 + W_max (Eq. 1) | |||
where C is a constant fixed to determine the aggressiveness of window | where C is a constant fixed to determine the aggressiveness of window | |||
growth in high BDP networks, t is the elapsed time from the last | growth in high BDP networks, t is the elapsed time from the last | |||
window reduction (measured right after the fast recovery), and K is | window reduction that is measured right after the fast recovery in | |||
the time period that the above function takes to increase the current | response to duplicate ACKs or after the congestion window reduction | |||
window size to W_max if there is no further loss event and is | in response to ECN-Echo ACKs, and K is the time period that the above | |||
calculated by using the following equation: | function takes to increase the current window size to W_max if there | |||
is no further loss event and is calculated by using the following | ||||
equation: | ||||
K = cubic_root(W_max*(1-beta_cubic)/C) (Eq. 2) | K = cubic_root(W_max*(1-beta_cubic)/C) (Eq. 2) | |||
where beta_cubic is the CUBIC multiplication decrease factor, that | where beta_cubic is the CUBIC multiplication decrease factor, that | |||
is, when a packet loss (detected by duplicate ACKs) occurs, CUBIC | is, when a packet loss detected by duplicate ACKs or a network | |||
reduces its current window cwnd to W_cubic(0)=W_max*beta_cubic. We | congestion detected by ECN-Echo ACKs occurs, CUBIC reduces its | |||
discuss how we set C in the next Section in more details. | current window cwnd to W_cubic(0)=W_max*beta_cubic. We discuss how | |||
we set C in the next Section in more details. | ||||
Upon receiving an ACK during congestion avoidance, CUBIC computes the | Upon receiving an ACK during congestion avoidance, CUBIC computes the | |||
window growth rate during the next RTT period using Eq. 1. It sets | window growth rate during the next RTT period using Eq. 1. It sets | |||
W_cubic(t+RTT) as the candidate target value of congestion window, | W_cubic(t+RTT) as the candidate target value of congestion window, | |||
where RTT is the weithed average RTT calculated by the standard TCP. | where RTT is the weithed average RTT calculated by the standard TCP. | |||
Depending on the value of the current window size cwnd, CUBIC runs in | Depending on the value of the current window size cwnd, CUBIC runs in | |||
three different modes. First, if cwnd is less than the window size | three different modes. First, if cwnd is less than the window size | |||
that Standard TCP would reach at time t after the last loss event, | that Standard TCP would reach at time t after the last loss event, | |||
then CUBIC is in the TCP friendly region (we describe below how to | then CUBIC is in the TCP friendly region (we describe below how to | |||
determine this window size of Standard TCP in term of time t). | determine this window size of Standard TCP in term of time t). | |||
Otherwise, if cwnd is less than W_max, then CUBIC is the concave | Otherwise, if cwnd is less than W_max, then CUBIC is the concave | |||
region, and if cwnd is larger than W_max, CUBIC is in the convex | region, and if cwnd is larger than W_max, CUBIC is in the convex | |||
region. Below, we describe the exact actions taken by CUBIC in each | region. Below, we describe the exact actions taken by CUBIC in each | |||
region. | region. | |||
3.2. TCP-friendly region | 4.2. TCP-friendly region | |||
Standard TCP performs well in certain types of networks, for example, | Standard TCP performs well in certain types of networks, for example, | |||
under short RTT and small bandwidth (or small BDP) networks. In | under short RTT and small bandwidth (or small BDP) networks. In | |||
these networks, we use the TCP-friendly region to ensure that CUBIC | these networks, we use the TCP-friendly region to ensure that CUBIC | |||
achieves at least the same throughput as the standard TCP. | achieves at least the same throughput as the standard TCP. | |||
When receiving an ACK in congestion avoidance, we first check whether | When receiving an ACK in congestion avoidance, we first check whether | |||
the protocol is in the TCP region or not. This is done by estimating | the protocol is in the TCP region or not. This is done by estimating | |||
the average rate of the Standard TCP using a simple analysis | the average rate of the Standard TCP using a simple analysis | |||
described in [FHP00]. It considers the Standard TCP as a special | described in [FHP00]. It considers the Standard TCP as a special | |||
skipping to change at page 7, line 14 ¶ | skipping to change at page 7, line 38 ¶ | |||
alpha_aimd per RTT, we can get the window size of AIMD in terms of | alpha_aimd per RTT, we can get the window size of AIMD in terms of | |||
the elapsed time t as follows: | the elapsed time t as follows: | |||
W_aimd(t) = W_max*beta_aimd + | W_aimd(t) = W_max*beta_aimd + | |||
[3*(1-beta_aimd)/(1+beta_aimd)] * (t/RTT) (Eq. 4) | [3*(1-beta_aimd)/(1+beta_aimd)] * (t/RTT) (Eq. 4) | |||
If W_cubic(t) is less than W_aimd(t), then the protocol is in the TCP | If W_cubic(t) is less than W_aimd(t), then the protocol is in the TCP | |||
friendly region and cwnd SHOULD be set to W_aimd(t) at each reception | friendly region and cwnd SHOULD be set to W_aimd(t) at each reception | |||
of ACK. | of ACK. | |||
3.3. Concave region | 4.3. Concave region | |||
When receiving an ACK in congestion avoidance, if the protocol is not | When receiving an ACK in congestion avoidance, if the protocol is not | |||
in the TCP-friendly region and cwnd is less than W_max, then the | in the TCP-friendly region and cwnd is less than W_max, then the | |||
protocol is in the concave region. In this region, cwnd MUST be | protocol is in the concave region. In this region, cwnd MUST be | |||
incremented by (W_cubic(t+RTT) - cwnd)/cwnd for each received ACK, | incremented by (W_cubic(t+RTT) - cwnd)/cwnd for each received ACK, | |||
where W_cubic(t+RTT) is calculated using Eq. 1. | where W_cubic(t+RTT) is calculated using Eq. 1. | |||
3.4. Convex region | 4.4. Convex region | |||
When the current window size of CUBIC is larger than W_max, it passes | When the current window size of CUBIC is larger than W_max, it passes | |||
the plateau of the cubic function after which CUBIC follows the | the plateau of the cubic function after which CUBIC follows the | |||
convex profile of the cubic function. Since cwnd is larger than the | convex profile of the cubic function. Since cwnd is larger than the | |||
previous saturation point W_max, this indicates that the network | previous saturation point W_max, this indicates that the network | |||
conditions might have been perturbed since the last loss event, | conditions might have been perturbed since the last loss event, | |||
possibly implying more available bandwidth after some flow | possibly implying more available bandwidth after some flow | |||
departures. Since the Internet is highly asynchronous, some amount | departures. Since the Internet is highly asynchronous, some amount | |||
of perturbation is always possible without causing a major change in | of perturbation is always possible without causing a major change in | |||
available bandwidth. In this phase, CUBIC is being very careful by | available bandwidth. In this phase, CUBIC is being very careful by | |||
very slowly increasing its window size. The convex profile ensures | very slowly increasing its window size. The convex profile ensures | |||
that the window increases very slowly at the beginning and gradually | that the window increases very slowly at the beginning and gradually | |||
increases its growth rate. We also call this phase as the maximum | increases its growth rate. We also call this phase as the maximum | |||
probing phase since CUBIC is searching for a new W_max. In this | probing phase since CUBIC is searching for a new W_max. In this | |||
region, cwnd MUST be incremented by (W_cubic(t+RTT) - cwnd)/cwnd for | region, cwnd MUST be incremented by (W_cubic(t+RTT) - cwnd)/cwnd for | |||
each received ACK, where W_cubic(t+RTT) is calculated using Eq. 1. | each received ACK, where W_cubic(t+RTT) is calculated using Eq. 1. | |||
3.5. Multiplicative decrease | 4.5. Multiplicative decrease | |||
When a packet loss (detected by duplicate ACKs) occurs, CUBIC updates | When a packet loss detected by duplicate ACKs or a network congestion | |||
its W_max, cwnd, and ssthresh (slow start threshold) as follows. | detected by ECN-Echo ACKs occurs, CUBIC updates its W_max, cwnd, and | |||
Parameter beta_cubic SHOULD be set to 0.7. | ssthresh (slow start threshold) as follows. Parameter beta_cubic | |||
SHOULD be set to 0.7. | ||||
W_max = cwnd; // save window size before reduction | W_max = cwnd; // save window size before reduction | |||
ssthresh = cwnd * beta_cubic; // new slow start threshold | ssthresh = cwnd * beta_cubic; // new slow start threshold | |||
cwnd = cwnd * beta_cubic; // window reduction | cwnd = cwnd * beta_cubic; // window reduction | |||
A side effect of setting beta_cubic to a bigger value than 0.5 is | A side effect of setting beta_cubic to a bigger value than 0.5 is | |||
slower convergence. We believe that while a more adaptive setting of | slower convergence. We believe that while a more adaptive setting of | |||
beta_cubic could result in faster convergence, it will make the | beta_cubic could result in faster convergence, it will make the | |||
analysis of the protocol much harder. This adaptive adjustment of | analysis of the protocol much harder. This adaptive adjustment of | |||
beta_cubic is an item for the next version of CUBIC. | beta_cubic is an item for the next version of CUBIC. | |||
3.6. Fast convergence | 4.6. Fast convergence | |||
To improve the convergence speed of CUBIC, we add a heuristic in the | To improve the convergence speed of CUBIC, we add a heuristic in the | |||
protocol. When a new flow joins the network, existing flows in the | protocol. When a new flow joins the network, existing flows in the | |||
network need to give up their bandwidth shares to allow the flow some | network need to give up their bandwidth shares to allow the flow some | |||
room for growth if the existing flows have been using all the | room for growth if the existing flows have been using all the | |||
bandwidth of the network. To increase this release of bandwidth by | bandwidth of the network. To increase this release of bandwidth by | |||
existing flows, the following mechanism called fast convergence | existing flows, the following mechanism called fast convergence | |||
SHOULD be implemented. | SHOULD be implemented. | |||
With fast convergence, when a loss event occurs, before a window | With fast convergence, when a loss event occurs, before a window | |||
skipping to change at page 8, line 35 ¶ | skipping to change at page 9, line 17 ¶ | |||
W_max = W_max*(1+beta_cubic)/2; // further reduce W_max | W_max = W_max*(1+beta_cubic)/2; // further reduce W_max | |||
} else { // check upward trend | } else { // check upward trend | |||
W_last_max = W_max // remember the last W_max | W_last_max = W_max // remember the last W_max | |||
} | } | |||
This allows W_max to be slightly less than the original W_max. Since | This allows W_max to be slightly less than the original W_max. Since | |||
flows spend most of time around their W_max, flows with larger | flows spend most of time around their W_max, flows with larger | |||
bandwidth shares tend to spend more time around the plateau allowing | bandwidth shares tend to spend more time around the plateau allowing | |||
more time for flows with smaller shares to increase their windows. | more time for flows with smaller shares to increase their windows. | |||
3.7. Timeout | 4.7. Timeout | |||
In case of timeout, CUBIC follows the standard TCP to reduce cwnd, | In case of timeout, CUBIC follows the standard TCP to reduce cwnd, | |||
but sets ssthresh using beta_cubic (same as in Section 3.5). | but sets ssthresh using beta_cubic (same as in Section 4.5). | |||
4. Discussion | 5. Discussion | |||
In this section, we further discuss the safety features of CUBIC | ||||
following the guidelines specified in [RFC5033]. | ||||
With a deterministic loss model where the number of packets between | With a deterministic loss model where the number of packets between | |||
two successive lost events is always 1/p, CUBIC always operates with | two successive lost events is always 1/p, CUBIC always operates with | |||
the concave window profile which greatly simplifies the performance | the concave window profile which greatly simplifies the performance | |||
analysis of CUBIC. The average window size of CUBIC can be obtained | analysis of CUBIC. The average window size of CUBIC can be obtained | |||
by the following function: | by the following function: | |||
AVG_W_cubic = [C*(3+beta_cubic)/(4*(1-beta_cubic))]^0.25 * | AVG_W_cubic = [C*(3+beta_cubic)/(4*(1-beta_cubic))]^0.25 * | |||
(RTT^0.75) / (p^0.75) (Eq. 5) | (RTT^0.75) / (p^0.75) (Eq. 5) | |||
With beta_cubic set to 0.7, the above formula is reduced to: | With beta_cubic set to 0.7, the above formula is reduced to: | |||
AVG_W_cubic = (C*3.7/1.2)^0.25 * (RTT^0.75) / (p^0.75) (Eq. 6) | AVG_W_cubic = (C*3.7/1.2)^0.25 * (RTT^0.75) / (p^0.75) (Eq. 6) | |||
We will determine the value of C in the following subsection using | We will determine the value of C in the following subsection using | |||
Eq. 6. | Eq. 6. | |||
4.1. Fairness to standard TCP | 5.1. Fairness to standard TCP | |||
In environments where standard TCP is able to make reasonable use of | In environments where standard TCP is able to make reasonable use of | |||
the available bandwidth, CUBIC does not significantly change this | the available bandwidth, CUBIC does not significantly change this | |||
state. | state. | |||
Standard TCP performs well in the following two types of networks: | Standard TCP performs well in the following two types of networks: | |||
1. networks with a small bandwidth-delay product (BDP) | 1. networks with a small bandwidth-delay product (BDP) | |||
2. networks with a short RTT, but not necessarily a small BDP | 2. networks with a short RTT, but not necessarily a small BDP | |||
CUBIC is designed to behave very similarly to standard TCP in the | CUBIC is designed to behave very similarly to standard TCP in the | |||
above two types of networks. The following two tables show the | above two types of networks. The following two tables show the | |||
average window size of standard TCP, HSTCP, and CUBIC. The average | average window size of standard TCP, HSTCP, and CUBIC. The average | |||
window size of standard TCP and HSTCP is from [RFC3649]. The average | window size of standard TCP and HSTCP is from [RFC3649]. The average | |||
window size of CUBIC is calculated by using Eq. 6 and CUBIC TCP | window size of CUBIC is calculated by using Eq. 6 and CUBIC TCP | |||
friendly mode for three different values of C. | friendly mode for three different values of C. | |||
+----------+-------+--------+-------------+-------------+-----------+ | +----------+-------+--------+-------------+-------------+-----------+ | |||
skipping to change at page 10, line 48 ¶ | skipping to change at page 11, line 26 ¶ | |||
bandwidth networks. Based on these observations, we find C=0.4 gives | bandwidth networks. Based on these observations, we find C=0.4 gives | |||
a good balance between TCP-friendliness and aggressiveness of window | a good balance between TCP-friendliness and aggressiveness of window | |||
growth. Therefore, C SHOULD be set to 0.4. With C set to 0.4, Eq. 6 | growth. Therefore, C SHOULD be set to 0.4. With C set to 0.4, Eq. 6 | |||
is reduced to: | is reduced to: | |||
AVG_W_cubic = 1.054 * (RTT^0.75) / (p^0.75) (Eq. 7) | AVG_W_cubic = 1.054 * (RTT^0.75) / (p^0.75) (Eq. 7) | |||
Eq. 7 is then used in the next subsection to show the scalability of | Eq. 7 is then used in the next subsection to show the scalability of | |||
CUBIC. | CUBIC. | |||
4.2. Using Spare Capacity | 5.2. Using Spare Capacity | |||
CUBIC uses a more aggressive window growth function than Standard TCP | CUBIC uses a more aggressive window growth function than Standard TCP | |||
under long RTT and high bandwidth networks. | under long RTT and high bandwidth networks. | |||
The following table shows that to achieve 10Gbps rate, standard TCP | The following table shows that to achieve 10Gbps rate, standard TCP | |||
requires a packet loss rate of 2.0e-10, while CUBIC requires a packet | requires a packet loss rate of 2.0e-10, while CUBIC requires a packet | |||
loss rate of 2.9e-8. | loss rate of 2.9e-8. | |||
+------------------+-----------+---------+---------+---------+ | +------------------+-----------+---------+---------+---------+ | |||
| Throughput(Mbps) | Average W | TCP P | HSTCP P | CUBIC P | | | Throughput(Mbps) | Average W | TCP P | HSTCP P | CUBIC P | | |||
skipping to change at page 11, line 30 ¶ | skipping to change at page 12, line 10 ¶ | |||
achieve a certain throughput. We use 1500-byte packets and an RTT of | achieve a certain throughput. We use 1500-byte packets and an RTT of | |||
0.1 seconds. | 0.1 seconds. | |||
Table 3 | Table 3 | |||
Our test results in [HKLRX06] indicate that CUBIC uses the spare | Our test results in [HKLRX06] indicate that CUBIC uses the spare | |||
bandwidth left unused by existing Standard TCP flows in the same | bandwidth left unused by existing Standard TCP flows in the same | |||
bottleneck link without taking away much bandwidth from the existing | bottleneck link without taking away much bandwidth from the existing | |||
flows. | flows. | |||
4.3. Difficult Environments | 5.3. Difficult Environments | |||
CUBIC is designed to remedy the poor performance of TCP in fast long- | CUBIC is designed to remedy the poor performance of TCP in fast long- | |||
distance networks. | distance networks. | |||
4.4. Investigating a Range of Environments | 5.4. Investigating a Range of Environments | |||
CUBIC has been extensively studied by using both NS-2 simulation and | CUBIC has been extensively studied by using both NS-2 simulation and | |||
test-bed experiments covering a wide range of network environments. | test-bed experiments covering a wide range of network environments. | |||
More information can be found in [HKLRX06]. | More information can be found in [HKLRX06]. | |||
4.5. Protection against Congestion Collapse | Same as Standard TCP, CUBIC is a loss-based congestion control | |||
algorithm. Therefore, CUBIC, which is designed to be more aggressive | ||||
than Standard TCP in fast and long distance networks, can fill large | ||||
drop-tail buffers more quickly than Standard TCP. In this case, | ||||
proper queue sizing and management [RFC7567] could be used to reduce | ||||
the packet queueing delay. | ||||
5.5. Protection against Congestion Collapse | ||||
With regard to the potential of causing congestion collapse, CUBIC | With regard to the potential of causing congestion collapse, CUBIC | |||
behaves like standard TCP since CUBIC modifies only the window | behaves like standard TCP since CUBIC modifies only the window | |||
adjustment algorithm of TCP. Thus, it does not modify the ACK | adjustment algorithm of TCP. Thus, it does not modify the ACK | |||
clocking and Timeout behaviors of Standard TCP. | clocking and Timeout behaviors of Standard TCP. | |||
4.6. Fairness within the Alternative Congestion Control Algorithm. | 5.6. Fairness within the Alternative Congestion Control Algorithm. | |||
CUBIC ensures convergence of competing CUBIC flows with the same RTT | CUBIC ensures convergence of competing CUBIC flows with the same RTT | |||
in the same bottleneck links to an equal bandwidth share. When | in the same bottleneck links to an equal bandwidth share. When | |||
competing flows have different RTTs, their bandwidth shares are | competing flows have different RTTs, their bandwidth shares are | |||
linearly proportional to the inverse of their RTT ratios. This is | linearly proportional to the inverse of their RTT ratios. This is | |||
true independent of the level of statistical multiplexing in the | true independent of the level of statistical multiplexing in the | |||
link. | link. | |||
4.7. Performance with Misbehaving Nodes and Outside Attackers | 5.7. Performance with Misbehaving Nodes and Outside Attackers | |||
This is not considered in the current CUBIC. | This is not considered in the current CUBIC. | |||
4.8. Behavior for Application-Limited Flows | 5.8. Behavior for Application-Limited Flows | |||
CUBIC does not raise its congestion window size if the flow is | CUBIC does not raise its congestion window size if the flow is | |||
currently limited by the application instead of the congestion | currently limited by the application instead of the congestion | |||
window. In cases of idle periods, t in Eq. 1 MUST NOT include the | window. In cases of idle periods, t in Eq. 1 MUST NOT include the | |||
idle time; otherwise, W_cubic(t) might be very high after restarting | idle time; otherwise, W_cubic(t) might be very high after restarting | |||
from a long idle time. | from a long idle time. | |||
4.9. Responses to Sudden or Transient Events | 5.9. Responses to Sudden or Transient Events | |||
In case that there is a sudden congestion, a routing change, or a | In case that there is a sudden congestion, a routing change, or a | |||
mobility event, CUBIC behaves the same as Standard TCP. | mobility event, CUBIC behaves the same as Standard TCP. | |||
4.10. Incremental Deployment | 5.10. Incremental Deployment | |||
CUBIC requires only the change of TCP senders, and does not require | CUBIC requires only the change of TCP senders, and does not require | |||
any assistant of routers. | any assistant of routers. | |||
5. Security Considerations | 6. Security Considerations | |||
This proposal makes no changes to the underlying security of TCP. | This proposal makes no changes to the underlying security of TCP. | |||
6. IANA Considerations | 7. IANA Considerations | |||
There are no IANA considerations regarding this document. | There are no IANA considerations regarding this document. | |||
7. Acknowledgements | 8. Acknowledgements | |||
Alexander Zimmermann and Lars Eggert have received funding from the | Alexander Zimmermann and Lars Eggert have received funding from the | |||
European Union's Horizon 2020 research and innovation program | European Union's Horizon 2020 research and innovation program | |||
2014-2018 under grant agreement No. 644866 (SSICLOPS). This document | 2014-2018 under grant agreement No. 644866 (SSICLOPS). This document | |||
reflects only the authors' views and the European Commission is not | reflects only the authors' views and the European Commission is not | |||
responsible for any use that may be made of the information it | responsible for any use that may be made of the information it | |||
contains. | contains. | |||
8. References | 9. References | |||
8.1. Normative References | ||||
9.1. Normative References | ||||
[RFC2018] Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP | [RFC2018] Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP | |||
Selective Acknowledgment Options", RFC 2018, | Selective Acknowledgment Options", RFC 2018, | |||
DOI 10.17487/RFC2018, October 1996, | DOI 10.17487/RFC2018, October 1996, | |||
<http://www.rfc-editor.org/info/rfc2018>. | <http://www.rfc-editor.org/info/rfc2018>. | |||
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
Requirement Levels", BCP 14, RFC 2119, | Requirement Levels", BCP 14, RFC 2119, | |||
DOI 10.17487/RFC2119, March 1997, | DOI 10.17487/RFC2119, March 1997, | |||
<http://www.rfc-editor.org/info/rfc2119>. | <http://www.rfc-editor.org/info/rfc2119>. | |||
skipping to change at page 13, line 43 ¶ | skipping to change at page 14, line 28 ¶ | |||
[RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion | [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion | |||
Control", RFC 5681, DOI 10.17487/RFC5681, September 2009, | Control", RFC 5681, DOI 10.17487/RFC5681, September 2009, | |||
<http://www.rfc-editor.org/info/rfc5681>. | <http://www.rfc-editor.org/info/rfc5681>. | |||
[RFC6582] Henderson, T., Floyd, S., Gurtov, A., and Y. Nishida, "The | [RFC6582] Henderson, T., Floyd, S., Gurtov, A., and Y. Nishida, "The | |||
NewReno Modification to TCP's Fast Recovery Algorithm", | NewReno Modification to TCP's Fast Recovery Algorithm", | |||
RFC 6582, DOI 10.17487/RFC6582, April 2012, | RFC 6582, DOI 10.17487/RFC6582, April 2012, | |||
<http://www.rfc-editor.org/info/rfc6582>. | <http://www.rfc-editor.org/info/rfc6582>. | |||
8.2. Informative References | [RFC7567] Baker, F., Ed. and G. Fairhurst, Ed., "IETF | |||
Recommendations Regarding Active Queue Management", | ||||
BCP 197, RFC 7567, DOI 10.17487/RFC7567, July 2015, | ||||
<http://www.rfc-editor.org/info/rfc7567>. | ||||
9.2. Informative References | ||||
[CEHRX07] Cai, H., Eun, D., Ha, S., Rhee, I., and L. Xu, "Stochastic | [CEHRX07] Cai, H., Eun, D., Ha, S., Rhee, I., and L. Xu, "Stochastic | |||
Ordering for Internet Congestion Control and its | Ordering for Internet Congestion Control and its | |||
Applications", In Proceedings of IEEE INFOCOM , May 2007. | Applications", In Proceedings of IEEE INFOCOM , May 2007. | |||
[FHP00] Floyd, S., Handley, M., and J. Padhye, "A Comparison of | [FHP00] Floyd, S., Handley, M., and J. Padhye, "A Comparison of | |||
Equation-Based and AIMD Congestion Control", May 2000. | Equation-Based and AIMD Congestion Control", May 2000. | |||
[GV02] Gorinsky, S. and H. Vin, "Extended Analysis of Binary | [GV02] Gorinsky, S. and H. Vin, "Extended Analysis of Binary | |||
Adjustment Algorithms", Technical Report TR2002-29, | Adjustment Algorithms", Technical Report TR2002-29, | |||
End of changes. 38 change blocks. | ||||
117 lines changed or deleted | 150 lines changed or added | |||
This html diff was produced by rfcdiff 1.45. The latest version is available from http://tools.ietf.org/tools/rfcdiff/ |