TCPM L. Xu
Internet-Draft UNL
Obsoletes: 8312 (if approved) S. Ha
Intended status: Standards Track Colorado
Expires: 6 August 2021 I. Rhee
Bowery
V. Goel
Apple Inc.
L. Eggert, Ed.
NetApp
2 February 2021
CUBIC for Fast Long-Distance Networks
draft-eggert-tcpm-rfc8312bis-01
Abstract
CUBIC is an extension to the current TCP standards. It differs from
the current TCP standards only in the congestion control algorithm on
the sender side. In particular, it uses a cubic function instead of
a linear window increase function of the current TCP standards to
improve scalability and stability under fast and long-distance
networks. CUBIC and its predecessor algorithm have been adopted as
defaults by Linux and have been used for many years. This document
provides a specification of CUBIC to enable third-party
implementations and to solicit community feedback through
experimentation on the performance of CUBIC.
This document obsoletes [RFC8312], updating the specification of
CUBIC to conform to the current Linux version.
Note to Readers
Discussion of this draft takes place on the TCPM working group
mailing list (mailto:tcpm@ietf.org), which is archived at
https://mailarchive.ietf.org/arch/browse/tcpm/.
Working Group information can be found at
https://datatracker.ietf.org/wg/tcpm/; source code and issues list
for this draft can be found at https://github.com/NTAP/rfc8312bis.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Xu, et al. Expires 6 August 2021 [Page 1]
Internet-Draft CUBIC February 2021
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on 6 August 2021.
Copyright Notice
Copyright (c) 2021 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights
and restrictions with respect to this document. Code Components
extracted from this document must include Simplified BSD License text
as described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Simplified BSD License.
Table of Contents
1. Introduction
2. Conventions
3. Design Principles of CUBIC
4. CUBIC Congestion Control
4.1. Definitions
4.1.1. Constants of Interest
4.1.2. Variables of Interest
4.2. Window Increase Function
4.3. TCP-Friendly Region
4.4. Concave Region
4.5. Convex Region
4.6. Multiplicative Decrease
4.7. Fast Convergence
4.8. Timeout
4.9. Spurious Congestion Events
4.10. Slow Start
5. Discussion
5.1. Fairness to Standard TCP
5.2. Using Spare Capacity
5.3. Difficult Environments
5.4. Investigating a Range of Environments
5.5. Protection against Congestion Collapse
5.6. Fairness within the Alternative Congestion Control Algorithm
5.7. Performance with Misbehaving Nodes and Outside Attackers
5.8. Behavior for Application-Limited Flows
5.9. Responses to Sudden or Transient Events
5.10. Incremental Deployment
6. Security Considerations
7. IANA Considerations
8. References
8.1. Normative References
8.2. Informative References
Appendix A. Acknowledgements
Appendix B. Evolution of CUBIC
B.1. Since draft-eggert-tcpm-rfc8312bis-00
B.2. Since RFC8312
B.3. Since the Original Paper
Authors' Addresses
1. Introduction
The low utilization problem of TCP in fast long-distance networks is
well documented in [K03] and [RFC3649]. This problem arises from a
slow increase of the congestion window following a congestion event
in a network with a large bandwidth-delay product (BDP). [HKLRX06]
indicates that this problem is frequently observed even in the range
of congestion window sizes over several hundreds of packets. This
problem applies equally to all Reno-style TCP standards and
their variants, including TCP-Reno [RFC5681], TCP-NewReno
[RFC6582][RFC6675], SCTP [RFC4960], and TFRC [RFC5348], all of which
use the same linear increase function for window growth. We refer to
them collectively as "Standard TCP" below.
CUBIC, originally proposed in [HRX08], is a modification to the
congestion control algorithm of Standard TCP to remedy this problem.
This document describes the most recent specification of CUBIC.
Specifically, CUBIC uses a cubic function instead of a linear window
increase function of Standard TCP to improve scalability and
stability under fast and long-distance networks.
Binary Increase Congestion Control (BIC-TCP) [XHR04], a predecessor
of CUBIC, was selected as the default TCP congestion control
algorithm by Linux in the year 2005 and has been used for several
years by the Internet community at large. CUBIC uses a window
increase function similar to that of BIC-TCP and is designed to be less
aggressive and fairer to Standard TCP in bandwidth usage than BIC-TCP
while maintaining the strengths of BIC-TCP such as stability, window
scalability, and RTT fairness. CUBIC has already replaced BIC-TCP as
the default TCP congestion control algorithm in Linux and has been
deployed globally by Linux. Through extensive testing in various
Internet scenarios, we believe that CUBIC is safe for testing and
deployment in the global Internet.
In the following sections, we first briefly explain the design
principles of CUBIC, then provide the exact specification of CUBIC,
and finally discuss the safety features of CUBIC following the
guidelines specified in [RFC5033].
2. Conventions
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in
BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here.
3. Design Principles of CUBIC
CUBIC is designed according to the following design principles:
Principle 1: For better network utilization and stability, CUBIC
uses both the concave and convex profiles of a cubic function to
increase the congestion window size, instead of using just a
convex function.
Principle 2: To be TCP-friendly, CUBIC is designed to behave like
Standard TCP in networks with short RTTs and small bandwidth where
Standard TCP performs well.
Principle 3: For RTT-fairness, CUBIC is designed to achieve linear
bandwidth sharing among flows with different RTTs.
Principle 4: CUBIC appropriately sets its multiplicative window
decrease factor in order to balance between the scalability and
convergence speed.
Principle 1: For better network utilization and stability, CUBIC
[HRX08] uses a cubic window increase function in terms of the elapsed
time from the last congestion event. While most alternative
congestion control algorithms to Standard TCP increase the congestion
window using convex functions, CUBIC uses both the concave and convex
profiles of a cubic function for window growth. When a congestion
event is detected by duplicate ACKs or Explicit Congestion
Notification-Echo (ECN-Echo) ACKs [RFC3168], CUBIC registers the
congestion window size at which the event occurred as _W_(max)_ and
performs a multiplicative decrease of the congestion window. After it
enters congestion
avoidance, it starts to increase the congestion window using the
concave profile of the cubic function. The cubic function is set to
have its plateau at _W_(max)_ so that the concave window increase
continues until the window size becomes _W_(max)_. After that, the
cubic function turns into a convex profile and the convex window
increase begins. This style of window adjustment (concave and then
convex) improves the algorithm stability while maintaining high
network utilization [CEHRX07]. This is because the window size
remains almost constant, forming a plateau around _W_(max)_ where
network utilization is deemed highest. Under steady state, most
window size samples of CUBIC are close to _W_(max)_, thus promoting
high network utilization and stability. Note that those congestion
control algorithms using only convex functions to increase the
congestion window size have the maximum increments around _W_(max)_,
and thus introduce a large number of packet bursts around the
saturation point of the network, likely causing frequent global loss
synchronizations.
Principle 2: CUBIC promotes per-flow fairness to Standard TCP. Note
that Standard TCP performs well under short RTT and small bandwidth
(or small BDP) networks. There is only a scalability problem in
networks with long RTTs and large bandwidth (or large BDP). An
alternative congestion control algorithm to Standard TCP designed to
be friendly to Standard TCP on a per-flow basis must operate to
increase its congestion window less aggressively in small BDP
networks than in large BDP networks. The aggressiveness of CUBIC
mainly depends on the maximum window size before a window reduction,
which is smaller in small BDP networks than in large BDP networks.
Thus, CUBIC increases its congestion window less aggressively in
small BDP networks than in large BDP networks. Furthermore, in cases
when the cubic function of CUBIC increases its congestion window less
aggressively than Standard TCP, CUBIC simply follows the window size
of Standard TCP to ensure that CUBIC achieves at least the same
throughput as Standard TCP in small BDP networks. We call this
region where CUBIC behaves like Standard TCP, the "TCP-friendly
region".
Principle 3: Two CUBIC flows with different RTTs have their
throughput ratio linearly proportional to the inverse of their RTT
ratio, where the throughput of a flow is approximately the size of
its congestion window divided by its RTT. Specifically, CUBIC
maintains a window increase rate independent of RTTs outside of the
TCP-friendly region, and thus flows with different RTTs have similar
congestion window sizes under steady state when they operate outside
the TCP-friendly region. This notion of a linear throughput ratio is
similar to that of Standard TCP under high statistical multiplexing
environments where packet losses are independent of individual flow
rates. However, under low statistical multiplexing environments, the
throughput ratio of Standard TCP flows with different RTTs is
quadratically proportional to the inverse of their RTT ratio [XHR04].
CUBIC always ensures the linear throughput ratio independent of the
levels of statistical multiplexing. This is an improvement over
Standard TCP. While there is no consensus on particular throughput
ratios of different RTT flows, we believe that on the wired Internet,
a linear throughput ratio is more reasonable than equal
throughputs (i.e., the same throughput for flows with different RTTs)
or a higher-order throughput ratio (e.g., the quadratic throughput
ratio of Standard TCP under low statistical multiplexing
environments).
Principle 4: To balance between the scalability and convergence
speed, CUBIC sets the multiplicative window decrease factor to 0.7
while Standard TCP uses 0.5. While this improves the scalability of
CUBIC, a side effect of this decision is slower convergence,
especially under low statistical multiplexing environments. This
design choice follows the observation that the author of
HighSpeed TCP (HSTCP) [RFC3649] has made along with other researchers
(e.g., [GV02]): the current Internet becomes more asynchronous, with
less frequent loss synchronizations, under high statistical
multiplexing. Under this environment, even strict
Multiplicative-Increase Multiplicative-Decrease (MIMD) can converge. CUBIC flows
with the same RTT always converge to the same throughput independent
of statistical multiplexing, thus achieving intra-algorithm fairness.
We also find that under the environments with sufficient statistical
multiplexing, the convergence speed of CUBIC flows is reasonable.
4. CUBIC Congestion Control
In this section, we discuss how the congestion window is updated
during the different stages of the CUBIC congestion controller.
4.1. Definitions
The unit of all window sizes in this document is segments of the
maximum segment size (MSS), and the unit of all times is seconds.
4.1.1. Constants of Interest
_β_(cubic)_: CUBIC multiplicative decrease factor as described in
Section 4.6.
_C_: constant that determines the aggressiveness of CUBIC in
competing with other congestion control algorithms in high BDP
networks. Please see Section 5 for more explanation on how it is
set. The unit for _C_ is
$$ \frac{segment}{second^3} $$
4.1.2. Variables of Interest
Variables required to implement CUBIC are described in this section.
_RTT_: Smoothed round-trip time in seconds calculated as described in
[RFC6298].
_cwnd_: Current congestion window in segments.
_ssthresh_: Current slow start threshold in segments.
_W_(max)_: Size of _cwnd_ in segments just before _cwnd_ is reduced
in the last congestion event.
_K_: The time period in seconds it takes to increase the congestion
window size at the beginning of the current congestion avoidance
stage to _W_(max)_.
_current_time_: Current time of the system in seconds.
_epoch_(start)_: The time in seconds at which the current congestion
avoidance stage starts.
_cwnd_(start)_: The _cwnd_ at the beginning of the current congestion
avoidance stage, i.e., at time _epoch_(start)_.
W_(cubic)(_t_): Target value of the congestion window in segments at
time t in seconds based on the cubic increase function as described
in Section 4.2.
_target_: Target value of congestion window in segments after the
next _RTT_, that is, W_(cubic)(_t_ + _RTT_) as described in
Section 4.2.
_W_(est)_: An estimate for the congestion window in segments in the
TCP-friendly region, that is, an estimate for the congestion window
using the AIMD approach similar to TCP-NewReno congestion controller.
4.2. Window Increase Function
CUBIC maintains the acknowledgment (ACK) clocking of Standard TCP by
increasing the congestion window only at the reception of an ACK. It
does not make any change to the fast recovery and retransmit of TCP,
such as TCP-NewReno [RFC6582][RFC6675]. During congestion avoidance
after a congestion event where a packet loss is detected by duplicate
ACKs or a network congestion is detected by ACKs with ECN-Echo flags
[RFC3168], CUBIC changes the window increase function of Standard
TCP.
CUBIC uses the following window increase function:
$$ W_{cubic}(t) = C \cdot (t - K)^3 + W_{max} $$
Figure 1
where t is the elapsed time in seconds from the beginning of the
current congestion avoidance stage, that is,
$$ t = current\_time - epoch_{start} $$
and where _epoch_(start)_ is the time at which the current congestion
avoidance stage starts. _K_ is the time period that the above
function takes to increase the congestion window size at the
beginning of the current congestion avoidance stage to _W_(max)_ if
there are no further congestion events and is calculated using the
following equation:
$$ K = \sqrt[3]{\frac{W_{max} - cwnd_{start}}{C}} $$
Figure 2
where _cwnd_(start)_ is the congestion window at the beginning of the
current congestion avoidance stage. _cwnd_(start)_ is calculated as
described in Section 4.6 when a congestion event is detected,
although implementations can further adjust _cwnd_(start)_ based on
other fast recovery mechanisms. In special cases, if _cwnd_(start)_
is greater than _W_(max)_, _K_ is set to 0.
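As an illustration only, Figures 1 and 2 can be sketched in Python as
follows. The function names cubic_k and w_cubic and the sample window
sizes are assumptions made for this example and are not part of the
specification; _C_ = 0.4 is the value recommended in Section 5.1.

```python
C = 0.4  # aggressiveness constant, in segments / second^3 (Section 5.1)


def cubic_k(w_max: float, cwnd_start: float, c: float = C) -> float:
    """Figure 2: seconds needed for W_cubic to grow from cwnd_start to W_max."""
    if cwnd_start >= w_max:
        # Special case from the text: K is 0 when cwnd_start exceeds W_max.
        return 0.0
    return ((w_max - cwnd_start) / c) ** (1.0 / 3.0)


def w_cubic(t: float, k: float, w_max: float, c: float = C) -> float:
    """Figure 1: cubic window (in segments) t seconds into the current epoch."""
    return c * (t - k) ** 3 + w_max


# Hypothetical epoch: W_max = 100 segments, cwnd reduced to 70 segments.
k = cubic_k(w_max=100.0, cwnd_start=70.0)
print(f"K = {k:.3f} s")
print(f"W_cubic(0) = {w_cubic(0.0, k, 100.0):.1f}")
print(f"W_cubic(K) = {w_cubic(k, k, 100.0):.1f}")
```

In this sketch the curve starts at _cwnd_(start)_ when t = 0 and
reaches the plateau _W_(max)_ at t = _K_, which is the property that
Figure 2 is constructed to provide.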
Upon receiving an ACK during congestion avoidance, CUBIC computes the
_target_ congestion window size after the next _RTT_ using Figure 1
as follows, where _RTT_ is the smoothed round-trip time. The lower
and upper bounds below ensure that CUBIC's congestion window increase
rate is non-decreasing and is less than the increase rate of slow
start.
$$
target = \begin{cases}
cwnd & \text{if } W_{cubic}(t + RTT) < cwnd \\
1.5 \cdot cwnd & \text{if } W_{cubic}(t + RTT) > 1.5 \cdot cwnd \\
W_{cubic}(t + RTT) & \text{otherwise}
\end{cases}
$$
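The bounds on _target_ amount to clamping W_(cubic)(_t_ + _RTT_) to the
range [_cwnd_, 1.5 * _cwnd_]. A minimal Python sketch follows; the
function name cubic_target and its argument names are assumptions
chosen for this example.

```python
def cubic_target(w_cubic_next: float, cwnd: float) -> float:
    """Clamp W_cubic(t + RTT) to the range [cwnd, 1.5 * cwnd].

    w_cubic_next is the value of Figure 1 evaluated at t + RTT.
    """
    if w_cubic_next < cwnd:
        return cwnd            # lower bound: never aim below the current cwnd
    if w_cubic_next > 1.5 * cwnd:
        return 1.5 * cwnd      # upper bound: grow by at most 50% per RTT
    return w_cubic_next


# Hypothetical values with cwnd = 100 segments.
for candidate in (90.0, 120.0, 200.0):
    print(candidate, "->", cubic_target(candidate, cwnd=100.0))
```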
Depending on the value of the current congestion window size _cwnd_,
CUBIC runs in three different modes.
1. The TCP-friendly region, which ensures that CUBIC achieves at
least the same throughput as Standard TCP.
2. The concave region, if CUBIC is not in the TCP-friendly region
and _cwnd_ is less than _W_(max)_.
3. The convex region, if CUBIC is not in the TCP-friendly region and
_cwnd_ is greater than _W_(max)_.
Below, we describe the exact actions taken by CUBIC in each region.
4.3. TCP-Friendly Region
Standard TCP performs well in certain types of networks, for example,
under short RTT and small bandwidth (or small BDP) networks. In
these networks, we use the TCP-friendly region to ensure that CUBIC
achieves at least the same throughput as Standard TCP.
The TCP-friendly region is designed according to the analysis
described in [FHP00]. The analysis studies the performance of an
Additive Increase and Multiplicative Decrease (AIMD) algorithm with
an additive factor of _α_(aimd)_ (segments per _RTT_) and a
multiplicative factor of _β_(aimd)_, denoted by AIMD(_α_(aimd)_,
_β_(aimd)_). Specifically, the average congestion window size of
AIMD(_α_(aimd)_, _β_(aimd)_) can be calculated using Figure 3. The
analysis shows that AIMD(_α_(aimd)_, _β_(aimd)_) with
$$ \alpha_{aimd} = 3 \cdot \frac{1 - \beta_{cubic}}{1 + \beta_{cubic}} $$
achieves the same average window size as Standard TCP that uses
AIMD(1, 0.5).
$$ AVG\_AIMD(\alpha_{aimd}, \beta_{aimd}) = \sqrt{\frac{\alpha_{aimd} \cdot (1 + \beta_{aimd})}{2 \cdot (1 - \beta_{aimd}) \cdot p}} $$
Figure 3
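The equivalence claimed above can be checked numerically. The
following Python sketch evaluates Figure 3 for AIMD(1, 0.5) and for
AIMD(_α_(aimd)_, 0.7); the function names and the loss rate used are
assumptions made for this example.

```python
import math


def alpha_aimd(beta_cubic: float) -> float:
    """Additive factor that matches Standard TCP's average window."""
    return 3.0 * (1.0 - beta_cubic) / (1.0 + beta_cubic)


def avg_aimd(alpha: float, beta: float, p: float) -> float:
    """Figure 3: average window of AIMD(alpha, beta) at loss rate p."""
    return math.sqrt(alpha * (1.0 + beta) / (2.0 * (1.0 - beta) * p))


p = 1e-4  # hypothetical loss rate
standard_tcp = avg_aimd(1.0, 0.5, p)
cubic_friendly = avg_aimd(alpha_aimd(0.7), 0.7, p)
print(f"alpha_aimd(0.7) = {alpha_aimd(0.7):.4f}")
print(f"AIMD(1, 0.5):        {standard_tcp:.2f} segments")
print(f"AIMD(alpha, 0.7):    {cubic_friendly:.2f} segments")
```

Both expressions reduce to sqrt(1.5 / p), so the two averages agree
for any loss rate, not only for the value chosen here.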
Based on the above analysis, CUBIC uses Figure 4 to estimate the
window size _W_(est)_ of AIMD(_α_(aimd)_, _β_(aimd)_) with
$$ \alpha_{aimd} = 3 \cdot \frac{1 - \beta_{cubic}}{1 + \beta_{cubic}} $$
$$ \beta_{aimd} = \beta_{cubic} $$
which achieves the same average window size as Standard TCP. When
receiving an ACK in congestion avoidance (_cwnd_ could be greater
than or less than _W_(max)_), CUBIC checks whether W_(cubic)(_t_) is
less than _W_(est)_. If so, CUBIC is in the TCP-friendly region and
_cwnd_ SHOULD be set to _W_(est)_ at each reception of an ACK.
_W_(est)_ is set equal to _cwnd_ at the start of the congestion
avoidance stage. After that, on every ACK, _W_(est)_ is updated
using Figure 4.
$$ W_{est} = W_{est} + \alpha_{aimd} \cdot \frac{segments\_acked}{cwnd} $$
Figure 4
Note that once _W_(est)_ reaches _W_(max)_, that is, _W_(est)_ >=
_W_(max)_, _α_(aimd)_ SHOULD be set to 1 to achieve the same
congestion window size as Standard TCP that uses AIMD.
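A per-ACK update of _W_(est)_ along these lines might look like the
following Python sketch. The function names, the argument order, and
the sample values are assumptions for illustration; only the update
rule of Figure 4 and the switch of _α_(aimd)_ to 1 come from the text
above.

```python
def update_w_est(w_est: float, cwnd: float, segments_acked: int,
                 w_max: float, beta_cubic: float = 0.7) -> float:
    """Figure 4: advance the Reno-style window estimate on one ACK."""
    if w_est >= w_max:
        alpha = 1.0  # behave like AIMD with alpha = 1 once W_est reaches W_max
    else:
        alpha = 3.0 * (1.0 - beta_cubic) / (1.0 + beta_cubic)
    return w_est + alpha * segments_acked / cwnd


def in_tcp_friendly_region(w_cubic_t: float, w_est: float) -> bool:
    """CUBIC is in the TCP-friendly region when W_cubic(t) < W_est."""
    return w_cubic_t < w_est


# Hypothetical ACK arriving with cwnd = W_est = 70 segments and W_max = 100.
w_est = update_w_est(w_est=70.0, cwnd=70.0, segments_acked=1, w_max=100.0)
print(f"W_est after one ACK: {w_est:.4f}")
print("TCP-friendly region:", in_tcp_friendly_region(69.5, w_est))
```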
4.4. Concave Region
When receiving an ACK in congestion avoidance, if CUBIC is not in the
TCP-friendly region and _cwnd_ is less than _W_(max)_, then CUBIC is
in the concave region. In this region, _cwnd_ MUST be incremented by
$$ \frac{target - cwnd}{cwnd} $$
for each received ACK, where _target_ is calculated as described in
Section 4.2.
4.5. Convex Region
When receiving an ACK in congestion avoidance, if CUBIC is not in the
TCP-friendly region and _cwnd_ is larger than or equal to _W_(max)_,
then CUBIC is in the convex region. The convex region indicates that
the network conditions might have been perturbed since the last
congestion event, possibly implying more available bandwidth after
some flow departures. Since the Internet is highly asynchronous,
some amount of perturbation is always possible without causing a
major change in available bandwidth. In this region, CUBIC is
very careful and increases its window size very slowly. The convex
profile ensures that the window increases very slowly at the
beginning and that its rate of increase grows gradually. We also call
this region the "maximum probing phase" since CUBIC is searching for
a new _W_(max)_. In this region, _cwnd_ MUST be incremented by
$$ \frac{target - cwnd}{cwnd} $$
for each received ACK, where _target_ is calculated as described in
Section 4.2.
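The concave and convex regions share the same per-ACK increment and
differ only in which side of _W_(max)_ the window is on. A minimal
Python sketch, assuming hypothetical helper names region and
cwnd_after_ack:

```python
def region(cwnd: float, w_max: float) -> str:
    """Classify the cubic region, assuming CUBIC is not TCP-friendly."""
    return "concave" if cwnd < w_max else "convex"


def cwnd_after_ack(cwnd: float, target: float) -> float:
    """Apply the per-ACK increment (target - cwnd) / cwnd."""
    return cwnd + (target - cwnd) / cwnd


# Hypothetical ACK: cwnd = 100 segments, target = 110 segments, W_max = 120.
print(region(100.0, 120.0), cwnd_after_ack(100.0, 110.0))
```

Because roughly _cwnd_ ACKs arrive per _RTT_, dividing the gap to
_target_ by _cwnd_ spreads the increase over about one round trip.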
4.6. Multiplicative Decrease
When a packet loss is detected by duplicate ACKs or a network
congestion is detected by receiving packets marked with ECN-Echo
(ECE), CUBIC updates its _W_(max)_ and reduces its _cwnd_ and
_ssthresh_ immediately as below. For both packet loss and congestion
detection through ECN, the sender MAY employ a fast recovery
algorithm to gradually adjust the congestion window to its new
reduced value. Parameter _β_(cubic)_ SHOULD be set to 0.7.
   ssthresh = cwnd * β_cubic     // new slow-start threshold
   ssthresh = max(ssthresh, 2)   // threshold is at least 2 MSS
   cwnd = ssthresh               // window reduction
A side effect of setting _β_(cubic)_ to a value bigger than 0.5 is
slower convergence. We believe that while a more adaptive setting of
_β_(cubic)_ could result in faster convergence, it will make the
analysis of CUBIC much harder. This adaptive adjustment of
_β_(cubic)_ is an item for the next version of CUBIC.
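The window reduction described in this section can be sketched in
Python as below. The function name multiplicative_decrease and the
returned tuple layout are assumptions for this example; window sizes
are in MSS-sized segments, as elsewhere in this document.

```python
BETA_CUBIC = 0.7  # recommended multiplicative decrease factor


def multiplicative_decrease(cwnd: float, beta: float = BETA_CUBIC):
    """Return (new_cwnd, new_ssthresh) after a congestion event."""
    ssthresh = max(cwnd * beta, 2.0)  # threshold is at least 2 MSS
    return ssthresh, ssthresh         # cwnd is set to the new ssthresh


# Hypothetical congestion events at cwnd = 100 and cwnd = 2 segments.
print(multiplicative_decrease(100.0))
print(multiplicative_decrease(2.0))
```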
4.7. Fast Convergence
To improve the convergence speed of CUBIC, we add a heuristic in
CUBIC. When a new flow joins the network, existing flows in the
network need to give up some of their bandwidth to allow the new flow
some room for growth if the existing flows have been using all the
bandwidth of the network. To speed up this bandwidth release by
existing flows, the following mechanism called "fast convergence"
SHOULD be implemented.
With fast convergence, when a congestion event occurs, we update
_W_(max)_ as follows before the window reduction as described in
Section 4.6.
$$
W_{max} = \begin{cases}
W_{max} \cdot \frac{1 + \beta_{cubic}}{2} & \text{if } cwnd < W_{max} \text{, further reduce } W_{max} \\
cwnd & \text{otherwise, remember } cwnd \text{ before reduction}
\end{cases}
$$
At a congestion event, if the current _cwnd_ is less than _W_(max)_,
this indicates that the saturation point experienced by this flow is
getting reduced because of the change in available bandwidth. Then
we allow this flow to release more bandwidth by reducing _W_(max)_
further. This action effectively lengthens the time for this flow to
increase its congestion window because the reduced _W_(max)_ forces
the flow to have the plateau earlier. This allows more time for the
new flow to catch up to its congestion window size.
The fast convergence is designed for network environments with
multiple CUBIC flows. In network environments with only a single
CUBIC flow and without any other traffic, the fast convergence SHOULD
be disabled.
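The following Python sketch applies the fast convergence update
exactly as the expression in this section is written. The function
name, the enabled flag, and the sample values are assumptions for this
example; the flag models the recommendation to disable fast
convergence for a single isolated flow.

```python
def fast_convergence_w_max(cwnd: float, w_max: float,
                           beta_cubic: float = 0.7,
                           enabled: bool = True) -> float:
    """Return the updated W_max at a congestion event, before reduction."""
    if enabled and cwnd < w_max:
        # The saturation point appears to be shrinking: reduce W_max further.
        return w_max * (1.0 + beta_cubic) / 2.0
    # Otherwise remember cwnd as it was just before the reduction.
    return cwnd


# Hypothetical events with a previous W_max of 100 segments.
print(fast_convergence_w_max(cwnd=80.0, w_max=100.0))
print(fast_convergence_w_max(cwnd=120.0, w_max=100.0))
print(fast_convergence_w_max(cwnd=80.0, w_max=100.0, enabled=False))
```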
4.8. Timeout
In case of timeout, CUBIC follows Standard TCP to reduce _cwnd_
[RFC5681], but sets _ssthresh_ using _β_(cubic)_ (same as in
Section 4.6), which differs from Standard TCP [RFC5681].
During the first congestion avoidance after a timeout, CUBIC
increases its congestion window size using Figure 1, where t is the
elapsed time since the beginning of the current congestion avoidance,
_K_ is set to 0, and _W_(max)_ is set to the congestion window size
at the beginning of the current congestion avoidance. In addition,
for the TCP-friendly region, _W_(est)_ should be set to the
congestion window size at the beginning of the current congestion
avoidance.
4.9. Spurious Congestion Events
For the case where CUBIC reduces its congestion window in response to
detection of packet loss via duplicate ACKs or timeout, there is a
possibility that the missing ACK would arrive after the congestion
window reduction and the corresponding packet retransmission. For
example, packet reordering, which is common in networks, could trigger
this behavior. A high degree of packet reordering could cause
multiple events of congestion window reduction where spurious losses
are incorrectly interpreted as congestion signals, thus degrading
CUBIC's performance significantly.
When there is a congestion event, a CUBIC implementation SHOULD save
the current value of the following variables before the congestion
window reduction.
   prior_cwnd = cwnd
   prior_ssthresh = ssthresh
   prior_W_max = W_max
   prior_K = K
   prior_epoch_start = epoch_start
   prior_W_est = W_est
CUBIC MAY implement an algorithm to detect spurious retransmissions,
such as DSACK [RFC3708], Forward RTO-Recovery [RFC5682] or Eifel
[RFC3522]. Once a spurious congestion event is detected, CUBIC
SHOULD restore the original values of above mentioned variables as
follows if the current _cwnd_ is lower than _prior_cwnd_. Restoring
to the original values ensures that CUBIC's performance is similar to
what it would be if there were no spurious losses.
   if cwnd < prior_cwnd:
      cwnd = prior_cwnd
      ssthresh = prior_ssthresh
      W_max = prior_W_max
      K = prior_K
      epoch_start = prior_epoch_start
      W_est = prior_W_est
In rare cases, when the detection happens long after a spurious loss
event and the current _cwnd_ is already higher than the _prior_cwnd_,
CUBIC SHOULD continue to use the current and the most recent values
of these variables.
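One possible way to organize the save and restore steps is shown in
the Python sketch below. The CubicState class and the save_state and
undo_spurious helpers are hypothetical names introduced for this
example; how a spurious event is detected (DSACK, F-RTO, or Eifel) is
outside the sketch.

```python
from dataclasses import dataclass, replace


@dataclass
class CubicState:
    """The variables that Section 4.9 asks an implementation to save."""
    cwnd: float
    ssthresh: float
    w_max: float
    k: float
    epoch_start: float
    w_est: float


def save_state(state: CubicState) -> CubicState:
    """Snapshot the state just before the congestion window reduction."""
    return replace(state)  # replace() with no changes returns a copy


def undo_spurious(state: CubicState, prior: CubicState) -> CubicState:
    """Restore the snapshot only if cwnd is still below prior_cwnd."""
    if state.cwnd < prior.cwnd:
        return replace(prior)
    return state  # cwnd has already recovered: keep the current values


# Hypothetical values: state before and after a (spurious) reduction.
before = CubicState(100.0, 64.0, 90.0, 2.0, 10.0, 100.0)
prior = save_state(before)
after = CubicState(70.0, 70.0, 100.0, 4.2, 12.0, 70.0)
print(undo_spurious(after, prior))
```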
4.10. Slow Start
CUBIC MUST employ a slow-start algorithm when _cwnd_ is no more than
_ssthresh_. Among the slow-start algorithms, CUBIC MAY choose the
standard TCP slow start [RFC5681] in general networks, or the limited
slow start [RFC3742] or hybrid slow start [HR08] for fast and long-
distance networks.
In the case when CUBIC runs the hybrid slow start [HR08], it may exit
the first slow start without incurring any packet loss and thus
_W_(max)_ is undefined. In this special case, CUBIC switches to
congestion avoidance and increases its congestion window size using
Figure 1, where t is the elapsed time since the beginning of the
current congestion avoidance, _K_ is set to 0, and _W_(max)_ is set
to the congestion window size at the beginning of the current
congestion avoidance.
5. Discussion
In this section, we further discuss the safety features of CUBIC
following the guidelines specified in [RFC5033].
With a deterministic loss model where the number of packets between
two successive packet losses is always _1/p_, CUBIC always operates
with the concave window profile, which greatly simplifies the
performance analysis of CUBIC. The average window size of CUBIC can
be obtained by the following function:
$$ AVG\_W_{cubic} = \sqrt[4]{\frac{C \cdot (3 + \beta_{cubic})}{4 \cdot (1 - \beta_{cubic})}} \cdot \frac{\sqrt[4]{RTT^3}}{\sqrt[4]{p^3}} $$
Figure 5
With _β_(cubic)_ set to 0.7, the above formula is reduced to:
$$ AVG\_W_{cubic} = \sqrt[4]{\frac{C \cdot 3.7}{1.2}} \cdot \frac{\sqrt[4]{RTT^3}}{\sqrt[4]{p^3}} $$
Figure 6
We will determine the value of _C_ in the following subsection using
Figure 6.
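For readers who want to evaluate the response function numerically, a
Python sketch of Figure 5 follows. The function name avg_w_cubic and
the sample operating point are assumptions for this example.

```python
def avg_w_cubic(c: float, rtt: float, p: float,
                beta_cubic: float = 0.7) -> float:
    """Figure 5: average CUBIC window under the deterministic loss model.

    c is in segments / second^3, rtt in seconds, p is the loss rate.
    """
    coeff = (c * (3.0 + beta_cubic) / (4.0 * (1.0 - beta_cubic))) ** 0.25
    return coeff * rtt ** 0.75 / p ** 0.75


# With beta_cubic = 0.7 the coefficient equals (C * 3.7 / 1.2) ** 0.25.
print(f"coefficient for C = 0.4: {avg_w_cubic(0.4, 1.0, 1.0):.4f}")
# Hypothetical operating point: RTT = 0.1 s, p = 1e-4.
print(f"average window: {avg_w_cubic(0.4, 0.1, 1e-4):.1f} segments")
```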
5.1. Fairness to Standard TCP
In environments where Standard TCP is able to make reasonable use of
the available bandwidth, CUBIC does not significantly change this
state.
Standard TCP performs well in the following two types of networks:
1. networks with a small bandwidth-delay product (BDP)
2. networks with short RTTs, but not necessarily a small BDP
CUBIC is designed to behave very similarly to Standard TCP in the
above two types of networks. The following two tables show the
average window sizes of Standard TCP, HSTCP, and CUBIC. The average
window sizes of Standard TCP and HSTCP are from [RFC3649]. The
average window size of CUBIC is calculated using Figure 6 and the
CUBIC TCP-friendly region for three different values of _C_.
+=============+=======+========+================+=========+========+
| Loss Rate P | TCP | HSTCP | CUBIC (C=0.04) | CUBIC | CUBIC |
| | | | | (C=0.4) | (C=4) |
+=============+=======+========+================+=========+========+
| 1.0e-02 | 12 | 12 | 12 | 12 | 12 |
+-------------+-------+--------+----------------+---------+--------+
| 1.0e-03 | 38 | 38 | 38 | 38 | 59 |
+-------------+-------+--------+----------------+---------+--------+
| 1.0e-04 | 120 | 263 | 120 | 187 | 333 |
+-------------+-------+--------+----------------+---------+--------+
| 1.0e-05 | 379 | 1795 | 593 | 1054 | 1874 |
+-------------+-------+--------+----------------+---------+--------+
| 1.0e-06 | 1200 | 12280 | 3332 | 5926 | 10538 |
+-------------+-------+--------+----------------+---------+--------+
| 1.0e-07 | 3795 | 83981 | 18740 | 33325 | 59261 |
+-------------+-------+--------+----------------+---------+--------+
| 1.0e-08 | 12000 | 574356 | 105383 | 187400 | 333250 |
+-------------+-------+--------+----------------+---------+--------+
Table 1: Standard TCP, HSTCP, and CUBIC with RTT = 0.1 seconds
Table 1 describes the response function of Standard TCP, HSTCP, and
CUBIC in networks with _RTT_ = 0.1 seconds. The average window size
is in MSS-sized segments.
+=============+=======+========+================+=========+=======+
| Loss Rate P | TCP | HSTCP | CUBIC (C=0.04) | CUBIC | CUBIC |
| | | | | (C=0.4) | (C=4) |
+=============+=======+========+================+=========+=======+
| 1.0e-02 | 12 | 12 | 12 | 12 | 12 |
+-------------+-------+--------+----------------+---------+-------+
| 1.0e-03 | 38 | 38 | 38 | 38 | 38 |
+-------------+-------+--------+----------------+---------+-------+
| 1.0e-04 | 120 | 263 | 120 | 120 | 120 |
+-------------+-------+--------+----------------+---------+-------+
| 1.0e-05 | 379 | 1795 | 379 | 379 | 379 |
+-------------+-------+--------+----------------+---------+-------+
| 1.0e-06 | 1200 | 12280 | 1200 | 1200 | 1874 |
+-------------+-------+--------+----------------+---------+-------+
| 1.0e-07 | 3795 | 83981 | 3795 | 5926 | 10538 |
+-------------+-------+--------+----------------+---------+-------+
| 1.0e-08 | 12000 | 574356 | 18740 | 33325 | 59261 |
+-------------+-------+--------+----------------+---------+-------+
Table 2: Standard TCP, HSTCP, and CUBIC with RTT = 0.01 seconds
Table 2 describes the response function of Standard TCP, HSTCP, and
CUBIC in networks with _RTT_ = 0.01 seconds. The average window size
is in MSS-sized segments.
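The entries in Tables 1 and 2 can be reproduced, to within rounding,
from the response functions discussed in this section. The Python
sketch below is illustrative only. It assumes the Standard TCP
approximation 1.2/sqrt(p) (inferred from the TCP column), the HSTCP
response function 0.12/p^0.835 from [RFC3649], the general form of
Figure 6, and beta__(cubic)_ = 0.7. It also assumes that each
high-speed variant is never less aggressive than Standard TCP. The
function names are hypothetical and not part of this specification.

```python
def avg_w_tcp(p):
    """Average Standard TCP window in segments (assumed 1.2 / sqrt(p))."""
    return 1.2 / p ** 0.5


def avg_w_hstcp(p):
    """Average HSTCP window, assumed never below the Standard TCP window."""
    return max(0.12 / p ** 0.835, avg_w_tcp(p))


def avg_w_cubic(p, rtt, c=0.4, beta_cubic=0.7):
    """Average CUBIC window: cubic response function (assumed form of
    Figure 6), floored at the Standard TCP window."""
    k = (c * (3 + beta_cubic) / (4 * (1 - beta_cubic))) ** 0.25
    return max(k * (rtt / p) ** 0.75, avg_w_tcp(p))


if __name__ == "__main__":
    for rtt in (0.1, 0.01):
        print(f"RTT = {rtt} s")
        for exp in range(2, 9):
            p = 10.0 ** -exp
            row = [avg_w_tcp(p), avg_w_hstcp(p)]
            row += [avg_w_cubic(p, rtt, c) for c in (0.04, 0.4, 4.0)]
            print(f"  p=1e-{exp:02d}", *(f"{w:10.0f}" for w in row))
```

Taking the maximum with the Standard TCP window is what makes the
CUBIC columns coincide with the TCP column at high loss rates and
short _RTT_.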
Both tables show that CUBIC with any of these three _C_ values is
more friendly to TCP than HSTCP, especially in networks with a short
_RTT_ where TCP performs reasonably well. For example, in a network
with _RTT_ = 0.01 seconds and p=10^-6, TCP has an average window of
1200 packets. If the packet size is 1500 bytes, then TCP can achieve
an average rate of 1.44 Gbps. In this case, CUBIC with _C_=0.04 or
_C_=0.4 achieves exactly the same rate as Standard TCP, whereas HSTCP
is about ten times more aggressive than Standard TCP.
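The arithmetic behind this example can be checked with a few lines of
Python. This is only a sketch: the helper name is hypothetical, and it
assumes one full window of 1500-byte packets is delivered per _RTT_.

```python
def avg_rate_bps(window_packets, packet_bytes, rtt_seconds):
    """Average rate in bit/s when one window of packets is sent per RTT."""
    return window_packets * packet_bytes * 8 / rtt_seconds


# Table 2 row for p = 1.0e-06 (RTT = 0.01 seconds).
tcp_rate = avg_rate_bps(1200, 1500, 0.01)     # Standard TCP, CUBIC C=0.04/0.4
hstcp_rate = avg_rate_bps(12280, 1500, 0.01)  # HSTCP

print(f"Standard TCP: {tcp_rate / 1e9:.2f} Gbps")
print(f"HSTCP:        {hstcp_rate / 1e9:.2f} Gbps")
print(f"HSTCP / TCP:  {hstcp_rate / tcp_rate:.1f}")
```

The ratio of the two rates equals the ratio of the two window sizes,
since packet size and _RTT_ are the same for both flows.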
We can see that _C_ determines the aggressiveness of CUBIC in
competing with other congestion control algorithms for bandwidth.
CUBIC is more friendly to Standard TCP if the value of _C_ is lower.
However, we do not recommend setting _C_ to a very low value like
0.04, since CUBIC with a low _C_ cannot efficiently use the bandwidth
in long-_RTT_ and high-bandwidth networks. Based on these
observations and our experiments, we find _C_=0.4 gives a good
balance between TCP-friendliness and aggressiveness of window
increase. Therefore, _C_ SHOULD be set to 0.4. With _C_ set to 0.4,
Figure 6 is reduced to:
$$ AVG\_W_{cubic} = 1.054 \cdot \frac{RTT^{3/4}}{p^{3/4}} $$
Figure 7
Figure 7 is then used in the next subsection to show the scalability
of CUBIC.
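As a check on the constant in Figure 7, the leading factor can be
worked out by hand. This assumes that Figure 6 has the same general
form as the corresponding equation in [RFC8312] and that
beta__(cubic)_ = 0.7. Substituting _C_ = 0.4 then gives:

```latex
\sqrt[4]{\frac{C\,(3 + \beta_{cubic})}{4\,(1 - \beta_{cubic})}}
  = \sqrt[4]{\frac{0.4 \times 3.7}{4 \times 0.3}}
  = \sqrt[4]{1.2333\ldots}
  \approx 1.054
```

Under the same assumptions, _C_ = 0.04 and _C_ = 4 give factors of
roughly 0.593 and 1.874, which is consistent with the corresponding
CUBIC columns of Tables 1 and 2.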
5.2. Using Spare Capacity
CUBIC uses a more aggressive window increase function than Standard
TCP under long-_RTT_ and high-bandwidth networks.
The following table shows that to achieve a rate of 10 Gbps, Standard
TCP requires a packet loss rate of 2.0e-10, while CUBIC (with
_C_=0.4) requires a packet loss rate of 2.9e-8.
+===================+===========+=========+=========+=========+
| Throughput (Mbps) | Average W | TCP P | HSTCP P | CUBIC P |
+===================+===========+=========+=========+=========+
| 1 | 8.3 | 2.0e-2 | 2.0e-2 | 2.0e-2 |
+-------------------+-----------+---------+---------+---------+
| 10 | 83.3 | 2.0e-4 | 3.9e-4 | 2.9e-4 |
+-------------------+-----------+---------+---------+---------+
| 100 | 833.3 | 2.0e-6 | 2.5e-5 | 1.4e-5 |
+-------------------+-----------+---------+---------+---------+
| 1000 | 8333.3 | 2.0e-8 | 1.5e-6 | 6.3e-7 |
+-------------------+-----------+---------+---------+---------+
| 10000 | 83333.3 | 2.0e-10 | 1.0e-7 | 2.9e-8 |
+-------------------+-----------+---------+---------+---------+
Table 3: Required packet loss rate for Standard TCP, HSTCP,
and CUBIC to achieve a certain throughput
Table 3 describes the required packet loss rate for Standard TCP,
HSTCP, and CUBIC to achieve a certain throughput. We use 1500-byte
packets and an _RTT_ of 0.1 seconds.
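The TCP and CUBIC columns of Table 3 can be approximated by inverting
the response functions. The Python sketch below assumes the Standard
TCP approximation W = 1.2/sqrt(p) and the _C_=0.4 response function of
Figure 7. The function names are hypothetical. The results agree with
the table to within a few percent; the table's own rounding may differ
slightly.

```python
def window_for_throughput(mbps, rtt=0.1, packet_bytes=1500):
    """Average window (packets) needed to sustain `mbps` at the given RTT."""
    return mbps * 1e6 * rtt / (packet_bytes * 8)


def tcp_loss_rate(w):
    """Invert the assumed Standard TCP response W = 1.2 / sqrt(p)."""
    return (1.2 / w) ** 2


def cubic_loss_rate(w, rtt=0.1):
    """Invert Figure 7, W = 1.054 * (RTT / p)^(3/4).

    In the TCP-friendly region CUBIC behaves like Standard TCP, so the
    required loss rate is never lower than the Standard TCP value.
    """
    p_cubic = rtt * (1.054 / w) ** (4 / 3)
    return max(p_cubic, tcp_loss_rate(w))


if __name__ == "__main__":
    for mbps in (1, 10, 100, 1000, 10000):
        w = window_for_throughput(mbps)
        print(f"{mbps:6d} Mbps  W={w:9.1f}  "
              f"TCP p={tcp_loss_rate(w):.1e}  "
              f"CUBIC p={cubic_loss_rate(w):.1e}")
```

At 1 Mbps the two loss rates coincide because the window is small
enough for CUBIC to operate in its TCP-friendly region.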
Our test results in [HKLRX06] indicate that CUBIC uses the spare
bandwidth left unused by existing Standard TCP flows in the same
bottleneck link without taking away much bandwidth from the existing
flows.
5.3. Difficult Environments
CUBIC is designed to remedy the poor performance of TCP in fast and
long-distance networks.
5.4. Investigating a Range of Environments
CUBIC has been extensively studied by using both NS-2 simulation and
test-bed experiments covering a wide range of network environments.
More information can be found in [HKLRX06].
Like Standard TCP, CUBIC is a loss-based congestion control
algorithm. Because CUBIC is designed to be more aggressive (due to a
faster window increase function and bigger multiplicative decrease
factor) than Standard TCP in fast and long-distance networks, it can
fill large drop-tail buffers more quickly than Standard TCP and
increase the risk of a standing queue [RFC8511]. In this case,
proper queue sizing and management [RFC7567] could be used to reduce
the packet queuing delay.
5.5. Protection against Congestion Collapse
With regard to the potential of causing congestion collapse, CUBIC
behaves like Standard TCP since CUBIC modifies only the window
adjustment algorithm of TCP. Thus, it does not modify the ACK
clocking or timeout behaviors of Standard TCP.
5.6. Fairness within the Alternative Congestion Control Algorithm
CUBIC ensures convergence of competing CUBIC flows with the same
_RTT_ in the same bottleneck links to an equal throughput. When
competing flows have different _RTT_ values, their throughput ratio
is linearly proportional to the inverse of their _RTT_ ratio. This
is true independent of the level of statistical multiplexing in the
link.
5.7. Performance with Misbehaving Nodes and Outside Attackers
The current CUBIC specification does not consider this.
5.8. Behavior for Application-Limited Flows
CUBIC does not raise its congestion window size if the flow is
currently limited by the application instead of the congestion
window. If _cwnd_ has not been updated for a long period because of
the application rate limit, such as an idle period, _t_ in Figure 1
MUST NOT include that period; otherwise, W_(cubic)(_t_) might be
very high after the flow restarts.
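One way to satisfy this requirement is to shift the start of the
current epoch forward by the length of each idle period. The Python
sketch below is a minimal illustration, not a normative algorithm. The
class and method names are hypothetical, and the window function is
assumed to have the form W_(cubic)(t) = C * (t - K)^3 + W_(max) from
Figure 1.

```python
class CubicClock:
    """Tracks t for W_cubic(t), excluding application-limited time."""

    def __init__(self, epoch_start):
        self.epoch_start = epoch_start  # time the current epoch began
        self.idle_since = None          # set while the flow is idle

    def on_idle_start(self, now):
        if self.idle_since is None:
            self.idle_since = now

    def on_idle_end(self, now):
        if self.idle_since is not None:
            # Shift the epoch so the idle period does not count toward t.
            self.epoch_start += now - self.idle_since
            self.idle_since = None

    def t(self, now):
        # While idle, t stays frozen at the moment the idle period began.
        ref = self.idle_since if self.idle_since is not None else now
        return ref - self.epoch_start


def w_cubic(t, w_max, k, c=0.4):
    """Assumed form of Figure 1: C * (t - K)^3 + W_max."""
    return c * (t - k) ** 3 + w_max


clock = CubicClock(epoch_start=0.0)
clock.on_idle_start(2.0)   # application stops sending at 2 s
clock.on_idle_end(12.0)    # and resumes 10 s later
print(w_cubic(clock.t(13.0), w_max=100, k=2.0))  # idle time excluded
print(w_cubic(13.0, w_max=100, k=2.0))           # idle time included
```

With the idle period excluded, the target window one second after
restart is close to W_(max); with it included, the target is several
times larger, which is the behavior the requirement is meant to
prevent.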
5.9. Responses to Sudden or Transient Events
If there is sudden congestion, a routing change, or a mobility
event, CUBIC behaves the same as Standard TCP.
5.10. Incremental Deployment
CUBIC requires changes only to TCP senders; it does not require any
changes to TCP receivers. That is, a CUBIC sender works correctly
with Standard TCP receivers. In addition, CUBIC does
not require any changes to the routers and does not require any
assistance from the routers.
6. Security Considerations
This proposal makes no changes to the underlying security of TCP.
More information about TCP security concerns can be found in
[RFC5681].
7. IANA Considerations
This document does not require any IANA actions.
8. References
8.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>.
[RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
of Explicit Congestion Notification (ECN) to IP",
RFC 3168, DOI 10.17487/RFC3168, September 2001,
<https://www.rfc-editor.org/info/rfc3168>.
[RFC5033] Floyd, S. and M. Allman, "Specifying New Congestion
Control Algorithms", BCP 133, RFC 5033,
DOI 10.17487/RFC5033, August 2007,
<https://www.rfc-editor.org/info/rfc5033>.
[RFC5348] Floyd, S., Handley, M., Padhye, J., and J. Widmer, "TCP
Friendly Rate Control (TFRC): Protocol Specification",
RFC 5348, DOI 10.17487/RFC5348, September 2008,
<https://www.rfc-editor.org/info/rfc5348>.
[RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
Control", RFC 5681, DOI 10.17487/RFC5681, September 2009,
<https://www.rfc-editor.org/info/rfc5681>.
[RFC6298] Paxson, V., Allman, M., Chu, J., and M. Sargent,
"Computing TCP's Retransmission Timer", RFC 6298,
DOI 10.17487/RFC6298, June 2011,
<https://www.rfc-editor.org/info/rfc6298>.
[RFC6582] Henderson, T., Floyd, S., Gurtov, A., and Y. Nishida, "The
NewReno Modification to TCP's Fast Recovery Algorithm",
RFC 6582, DOI 10.17487/RFC6582, April 2012,
<https://www.rfc-editor.org/info/rfc6582>.
[RFC6675] Blanton, E., Allman, M., Wang, L., Jarvinen, I., Kojo, M.,
and Y. Nishida, "A Conservative Loss Recovery Algorithm
Based on Selective Acknowledgment (SACK) for TCP",
RFC 6675, DOI 10.17487/RFC6675, August 2012,
<https://www.rfc-editor.org/info/rfc6675>.
[RFC7567] Baker, F., Ed. and G. Fairhurst, Ed., "IETF
Recommendations Regarding Active Queue Management",
BCP 197, RFC 7567, DOI 10.17487/RFC7567, July 2015,
<https://www.rfc-editor.org/info/rfc7567>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/info/rfc8174>.
8.2. Informative References
[CEHRX07] Cai, H., Eun, D., Ha, S., Rhee, I., and L. Xu, "Stochastic
Ordering for Internet Congestion Control and its
Applications", IEEE INFOCOM 2007 - 26th IEEE International
Conference on Computer Communications,
DOI 10.1109/infcom.2007.111, 2007,
<https://doi.org/10.1109/infcom.2007.111>.
[FHP00] Floyd, S., Handley, M., and J. Padhye, "A Comparison of
Equation-Based and AIMD Congestion Control", May 2000,
<https://www.icir.org/tfrc/aimd.pdf>.
[GV02] Gorinsky, S. and H. Vin, "Extended Analysis of Binary
Adjustment Algorithms", Technical Report TR2002-29,
Department of Computer Sciences, The University of
Texas at Austin, 11 August 2002,
<http://www.cs.utexas.edu/ftp/techreports/tr02-39.ps.gz>.
[HKLRX06] Ha, S., Kim, Y., Le, L., Rhee, I., and L. Xu, "A Step
toward Realistic Performance Evaluation of High-Speed TCP
Variants", International Workshop on Protocols for Fast
Long-Distance Networks, February 2006,
<https://pfld.net/2006/paper/s2_03.pdf>.
[HR08] Ha, S. and I. Rhee, "Hybrid Slow Start for High-Bandwidth
and Long-Distance Networks", International Workshop
on Protocols for Fast Long-Distance Networks, March 2008,
<http://www.hep.man.ac.uk/g/GDARN-IT/pfldnet2008/paper/
Sangate_Ha%20Final.pdf>.
[HRX08] Ha, S., Rhee, I., and L. Xu, "CUBIC: a new TCP-friendly
high-speed TCP variant", ACM SIGOPS Operating Systems
Review Vol. 42, pp. 64-74, DOI 10.1145/1400097.1400105,
July 2008, <https://doi.org/10.1145/1400097.1400105>.
[K03] Kelly, T., "Scalable TCP: improving performance in
highspeed wide area networks", ACM SIGCOMM Computer
Communication Review Vol. 33, pp. 83-91,
DOI 10.1145/956981.956989, April 2003,
<https://doi.org/10.1145/956981.956989>.
[RFC3522] Ludwig, R. and M. Meyer, "The Eifel Detection Algorithm
for TCP", RFC 3522, DOI 10.17487/RFC3522, April 2003,
<https://www.rfc-editor.org/info/rfc3522>.
[RFC3649] Floyd, S., "HighSpeed TCP for Large Congestion Windows",
RFC 3649, DOI 10.17487/RFC3649, December 2003,
<https://www.rfc-editor.org/info/rfc3649>.
[RFC3708] Blanton, E. and M. Allman, "Using TCP Duplicate Selective
Acknowledgement (DSACKs) and Stream Control Transmission
Protocol (SCTP) Duplicate Transmission Sequence Numbers
(TSNs) to Detect Spurious Retransmissions", RFC 3708,
DOI 10.17487/RFC3708, February 2004,
<https://www.rfc-editor.org/info/rfc3708>.
[RFC3742] Floyd, S., "Limited Slow-Start for TCP with Large
Congestion Windows", RFC 3742, DOI 10.17487/RFC3742, March
2004, <https://www.rfc-editor.org/info/rfc3742>.
[RFC4960] Stewart, R., Ed., "Stream Control Transmission Protocol",
RFC 4960, DOI 10.17487/RFC4960, September 2007,
<https://www.rfc-editor.org/info/rfc4960>.
[RFC5682] Sarolahti, P., Kojo, M., Yamamoto, K., and M. Hata,
"Forward RTO-Recovery (F-RTO): An Algorithm for Detecting
Spurious Retransmission Timeouts with TCP", RFC 5682,
DOI 10.17487/RFC5682, September 2009,
<https://www.rfc-editor.org/info/rfc5682>.
[RFC8312] Rhee, I., Xu, L., Ha, S., Zimmermann, A., Eggert, L., and
R. Scheffenegger, "CUBIC for Fast Long-Distance Networks",
RFC 8312, DOI 10.17487/RFC8312, February 2018,
<https://www.rfc-editor.org/info/rfc8312>.
[RFC8511] Khademi, N., Welzl, M., Armitage, G., and G. Fairhurst,
"TCP Alternative Backoff with ECN (ABE)", RFC 8511,
DOI 10.17487/RFC8511, December 2018,
<https://www.rfc-editor.org/info/rfc8511>.
[SXEZ19] Sun, W., Xu, L., Elbaum, S., and D. Zhao, "Model-Agnostic
and Efficient Exploration of Numerical State Space of
Real-World TCP Congestion Control Implementations", USENIX
NSDI 2019, February 2019,
<https://www.usenix.org/system/files/nsdi19-sun.pdf>.
[XHR04] Xu, L., Harfoush, K., and I. Rhee, "Binary Increase
Congestion Control (BIC) for Fast Long-Distance Networks",
IEEE INFOCOM 2004, DOI 10.1109/infcom.2004.1354672, March
2004, <https://doi.org/10.1109/infcom.2004.1354672>.
Appendix A. Acknowledgements
Richard Scheffenegger and Alexander Zimmermann originally co-authored
[RFC8312].
Appendix B. Evolution of CUBIC
B.1. Since draft-eggert-tcpm-rfc8312bis-00
* acknowledge former co-authors (#15
(https://github.com/NTAP/rfc8312bis/issues/15))
* prevent _cwnd_ from becoming less than two (#7
(https://github.com/NTAP/rfc8312bis/issues/7))
* add list of variables and constants (#5
(https://github.com/NTAP/rfc8312bis/issues/5), #6
(https://github.com/NTAP/rfc8312bis/issues/6))
* update _K_'s definition and add bounds for CUBIC _target_ _cwnd_
[SXEZ19] (#1 (https://github.com/NTAP/rfc8312bis/issues/1), #14
(https://github.com/NTAP/rfc8312bis/issues/14))
* update _W_(est)_ to use AIMD approach (#20
(https://github.com/NTAP/rfc8312bis/issues/20))
* set alpha__(aimd)_ to 1 once _W_(est)_ reaches _W_(max)_ (#2
(https://github.com/NTAP/rfc8312bis/issues/2))
* add Vidhi as co-author (#17
(https://github.com/NTAP/rfc8312bis/issues/17))
* note for fast recovery during _cwnd_ decrease due to congestion
event (#11 (https://github.com/NTAP/rfc8312bis/issues/11))
* add section for spurious congestion events (#23
(https://github.com/NTAP/rfc8312bis/issues/23))
* initialize _W_(est)_ after timeout and remove variable
_W_(last_max)_ (#28
(https://github.com/NTAP/rfc8312bis/issues/28))
B.2. Since RFC8312
* converted to Markdown and xml2rfc v3
* updated references (as part of the conversion)
* updated author information
* various formatting changes
* move to Standards Track
B.3. Since the Original Paper
CUBIC has gone through a few changes since the initial release
[HRX08] of its algorithm and implementation. Below we highlight the
differences between its original paper and [RFC8312].
* The original paper [HRX08] includes pseudocode of a CUBIC
implementation based on Linux's pluggable congestion control
framework; the pseudocode excludes system-specific optimizations.
This simplified pseudocode might be a good starting point for
understanding CUBIC.
* [HRX08] also includes experimental results showing its performance
and fairness.
* The definition of the beta__(cubic)_ constant was changed in
[RFC8312]. In the original paper, beta__(cubic)_ was the window
decrease constant, while [RFC8312] changed it to the CUBIC
multiplicative decrease factor. With this change, the congestion
window size after a congestion event was beta__(cubic)_ *
_W_(max)_ in [RFC8312], while it was (1-beta__(cubic)_) *
_W_(max)_ in the original paper.
* Its pseudocode used _W_(last_max)_ while [RFC8312] used _W_(max)_.
* Its TCP-friendly window was W_(tcp) while [RFC8312] used
_W_(est)_.
Authors' Addresses
Lisong Xu
University of Nebraska-Lincoln
Department of Computer Science and Engineering
Lincoln, NE 68588-0115
United States of America
Email: xu@unl.edu
URI: https://cse.unl.edu/~xu/
Sangtae Ha
University of Colorado at Boulder
Department of Computer Science
Boulder, CO 80309-0430
United States of America
Email: sangtae.ha@colorado.edu
URI: https://netstech.org/sangtaeha/
Injong Rhee
Bowery Farming
151 W 26TH Street, 12TH Floor
New York, NY 10001
United States of America
Email: injongrhee@gmail.com
Vidhi Goel
Apple Inc.
One Apple Park Way
Cupertino, California 95014
United States of America
Email: vidhi_goel@apple.com
Lars Eggert (editor)
NetApp
Stenbergintie 12 B
FI-02700 Kauniainen
Finland
Email: lars@eggert.org
URI: https://eggert.org/