draft-ietf-tcpm-hystartplusplus-03.txt   draft-ietf-tcpm-hystartplusplus-04.txt 
Network Working Group P. Balasubramanian Network Working Group P. Balasubramanian
Internet-Draft Y. Huang Internet-Draft Y. Huang
Intended status: Standards Track M. Olson Intended status: Standards Track M. Olson
Expires: 26 January 2022 Microsoft Expires: 27 July 2022 Microsoft
25 July 2021 23 January 2022
HyStart++: Modified Slow Start for TCP HyStart++: Modified Slow Start for TCP
draft-ietf-tcpm-hystartplusplus-03 draft-ietf-tcpm-hystartplusplus-04
Abstract Abstract
This document describes HyStart++, a simple modification to the slow This document describes HyStart++, a simple modification to the slow
start phase of TCP congestion control algorithms. Traditional slow start phase of TCP congestion control algorithms. Traditional slow
start can cause overshooting of the ideal send rate and cause large start can overshoot the ideal send rate in many cases, causing high
packet loss within a round-trip time which results in poor packet loss and poor performance. HyStart++ uses a delay increase
performance. HyStart++ uses a delay increase heuristic to exit slow heuristic to find an exit point before possible overshoot. It also
start early while also mitigating poor performance which can result adds a mitigation to prevent jitter from causing premature slow start
from false positives. exit.
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on 26 January 2022. This Internet-Draft will expire on 27 July 2022.
Copyright Notice Copyright Notice
Copyright (c) 2021 IETF Trust and the persons identified as the Copyright (c) 2022 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/ Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document. license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights Please review these documents carefully, as they describe your rights
and restrictions with respect to this document. Code Components and restrictions with respect to this document. Code Components
extracted from this document must include Simplified BSD License text extracted from this document must include Revised BSD License text as
as described in Section 4.e of the Trust Legal Provisions and are described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Simplified BSD License. provided without warranty as described in the Revised BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 3 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 3
3. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 3 3. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 3
4. HyStart++ Algorithm . . . . . . . . . . . . . . . . . . . . . 3 4. HyStart++ Algorithm . . . . . . . . . . . . . . . . . . . . . 3
4.1. Summary . . . . . . . . . . . . . . . . . . . . . . . . . 3 4.1. Summary . . . . . . . . . . . . . . . . . . . . . . . . . 3
4.2. Algorithm Details . . . . . . . . . . . . . . . . . . . . 4 4.2. Algorithm Details . . . . . . . . . . . . . . . . . . . . 4
4.3. Tuning constants . . . . . . . . . . . . . . . . . . . . 6 4.3. Tuning constants . . . . . . . . . . . . . . . . . . . . 6
skipping to change at page 2, line 39 skipping to change at page 2, line 39
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 8 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 8
1. Introduction 1. Introduction
[RFC5681] describes the slow start congestion control algorithm for [RFC5681] describes the slow start congestion control algorithm for
TCP. The slow start algorithm is used when the congestion window TCP. The slow start algorithm is used when the congestion window
(cwnd) is less than the slow start threshold (ssthresh). During slow (cwnd) is less than the slow start threshold (ssthresh). During slow
start, in the absence of packet loss signals, TCP increases cwnd start, in the absence of packet loss signals, TCP increases cwnd
exponentially to probe the network capacity. This fast growth can exponentially to probe the network capacity. This fast growth can
overshoot the ideal sending rate and cause significant packet loss overshoot the ideal sending rate and cause significant packet loss
which cannot always be recovered efficiently, impairing flow which cannot always be recovered efficiently.
completion time.
HyStart++ first uses delay increase as a signal to exit slow start HyStart++ uses delay increase as a signal to exit slow start before
before any packet loss occurs. This is one of two algorithms potential packet loss occurs as a result of overshoot. This is one
specified in [HyStart]. After the HyStart delay algorithm finds an of two algorithms specified in [HyStart]. After the slow start exit,
exit point, a novel Conservative Slow Start (CSS) phase is used to a novel Conservative Slow Start (CSS) phase is used to determine
determine whether the slow start exit was spurious. This provides whether the slow start exit was premature and to resume slow start.
protection against jitter and prevents performance problems that This mitigation improves performance in presence of jitter.
result from early slow start exit due to false positives. HyStart++ HyStart++ reduces packet loss and retransmissions, and improves
reduces packet loss and retransmissions, and improves goodput in lab goodput in lab measurements and real world deployments.
measurements as well as real world deployments.
2. Terminology 2. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119]. document are to be interpreted as described in [RFC2119].
3. Definitions 3. Definitions
We repeat here some definitions from [RFC5681] to aid the reader. We repeat here some definitions from [RFC5681] to aid the reader.
skipping to change at page 3, line 44 skipping to change at page 3, line 44
4. HyStart++ Algorithm 4. HyStart++ Algorithm
4.1. Summary 4.1. Summary
[HyStart] specifies two algorithms (a "Delay Increase" algorithm and [HyStart] specifies two algorithms (a "Delay Increase" algorithm and
an "Inter-Packet Arrival" algorithm) to be run in parallel to detect an "Inter-Packet Arrival" algorithm) to be run in parallel to detect
that the sending rate has reached capacity. In practice, the Inter- that the sending rate has reached capacity. In practice, the Inter-
Packet Arrival algorithm does not perform well and is not able to Packet Arrival algorithm does not perform well and is not able to
detect congestion early, primarily due to ACK compression. The idea detect congestion early, primarily due to ACK compression. The idea
of the Delay Increase algorithm is to look for RTT spikes, which of the Delay Increase algorithm is to look for spikes in RTT (round-
suggest that the bottleneck buffer is filling up. trip time), which suggest that the bottleneck buffer is filling up.
In HyStart++, a TCP sender uses traditional slow start and then uses In HyStart++, a TCP sender uses traditional slow start and then uses
the "Delay Increase" algorithm to trigger an exit from slow start. the "Delay Increase" algorithm to trigger an exit from slow start.
But instead of going straight from slow start to congestion But instead of going straight from slow start to congestion
avoidance, the sender spends a number of RTTs in a Conservative Slow avoidance, the sender spends a number of RTTs in a Conservative Slow
Start (CSS) phase to determine whether the exit was spurious. During Start (CSS) phase to determine whether the exit from slow start was
CSS, the congestion window is grown exponentially like in regular premature. During CSS, the congestion window is grown exponentially
slow start, but with a smaller exponential base, resulting in less like in regular slow start, but with a smaller exponential base,
aggressive growth. If the RTT shrinks at any time during CSS, it's resulting in less aggressive growth. If the RTT reduces during CSS,
concluded that the RTT spike was not related to congestion caused by it's concluded that the RTT spike was not related to congestion
the connection sending too fast (i.e. the exit was spurious), and the caused by the connection sending at a rate greater than the ideal
connection resumes slow start. If the RTT inflation persists send rate, and the connection resumes slow start. If the RTT
throughout CSS, the connection enters congestion avoidance. inflation persists throughout CSS, the connection enters congestion
avoidance.
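The phase transitions summarized above can be sketched as a small state function. This is only an illustrative sketch, not the draft's pseudocode: the names Phase and next_phase and the three boolean/integer inputs are hypothetical, the decision is simplified to once per round, and loss or ECN handling (which moves the connection to congestion avoidance from either phase) is omitted.

```python
from enum import Enum

CSS_ROUNDS = 5  # recommended value from Section 4.3


class Phase(Enum):
    SLOW_START = 1
    CSS = 2
    CONGESTION_AVOIDANCE = 3


def next_phase(phase, delay_increase_detected, rtt_below_css_baseline,
               css_rounds_done):
    """Return the next phase (hypothetical helper, per-round simplification)."""
    if phase is Phase.SLOW_START:
        # A detected delay increase triggers the exit into CSS.
        return Phase.CSS if delay_increase_detected else Phase.SLOW_START
    if phase is Phase.CSS:
        if rtt_below_css_baseline:
            # RTT reduced during CSS: the exit was premature, resume slow start.
            return Phase.SLOW_START
        if css_rounds_done >= CSS_ROUNDS:
            # RTT inflation persisted throughout CSS.
            return Phase.CONGESTION_AVOIDANCE
        return Phase.CSS
    # Congestion avoidance is terminal in this sketch.
    return Phase.CONGESTION_AVOIDANCE
```

Keeping the transition logic in one pure function makes it easy to check each branch in isolation before wiring it to real RTT measurements.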
4.2. Algorithm Details 4.2. Algorithm Details
We assume that Appropriate Byte Counting (as described in [RFC3465]) For the pseudocode, we assume that Appropriate Byte Counting (as
is in use and L is the cwnd increase limit as discussed in RFC 3465. described in [RFC3465]) is in use and L is the cwnd increase limit as
discussed in RFC 3465.
A round is chosen to be approximately the Round-Trip Time (RTT). We lastRoundMinRTT and currentRoundMinRTT are initialized to infinity at
recommend that rounds be measured using sequence numbers. Round can the initialization time
be approximated using sequence numbers as follows:
Define windowEnd as a sequence number initialize to SND.UNA HyStart++ measures rounds using sequence numbers, as follows:
Define windowEnd as a sequence number initialized to SND.UNA
When windowEnd is ACKed, the current round ends and windowEnd is When windowEnd is ACKed, the current round ends and windowEnd is
set to SND.NXT set to SND.NXT
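The round measurement described above could look roughly like the following. This is a minimal sketch under stated assumptions: the class name RoundTracker and its method names are hypothetical, sequence numbers are treated as plain integers without wraparound, and "windowEnd is ACKed" is interpreted as a cumulative ACK at or beyond windowEnd.

```python
class RoundTracker:
    """Tracks rounds using sequence numbers (hypothetical helper)."""

    def __init__(self, snd_una: int) -> None:
        # windowEnd is initialized to SND.UNA.
        self.window_end = snd_una

    def on_ack(self, ack_seq: int, snd_nxt: int) -> bool:
        """Process a cumulative ACK; return True if it ends the current round."""
        # Assumption: no 32-bit sequence wraparound in this sketch.
        if ack_seq >= self.window_end:
            # The round ends and the next one runs until SND.NXT is ACKed.
            self.window_end = snd_nxt
            return True
        return False
```

A real TCP stack would use wraparound-safe sequence comparison instead of the plain `>=` used here.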
At the start of each round during standard slow start ([RFC5681]) and At the start of each round during standard slow start ([RFC5681]) and
CSS: CSS:
lastRoundMinRTT = currentRoundMinRTT lastRoundMinRTT = currentRoundMinRTT
currentRoundMinRTT = infinity currentRoundMinRTT = infinity
skipping to change at page 4, line 48 skipping to change at page 5, line 4
- cwnd = cwnd + min (N, L * SMSS) - cwnd = cwnd + min (N, L * SMSS)
Keep track of minimum observed RTT Keep track of minimum observed RTT
- currentRoundMinRTT = min(currentRoundMinRTT, currRTT) - currentRoundMinRTT = min(currentRoundMinRTT, currRTT)
- where currRTT is the RTT sampled from the latest incoming ACK - where currRTT is the RTT sampled from the latest incoming ACK
- rttSampleCount += 1 - rttSampleCount += 1
For rounds where N_RTT_SAMPLE RTT samples have been obtained and
For rounds where cwnd is at or higher than LOW_CWND and currentRoundMinRTT and lastRoundMinRTT are valid, check if delay
N_RTT_SAMPLE RTT samples have been obtained, check if delay
increase triggers slow start exit increase triggers slow start exit
- if (cwnd >= (LOW_CWND * SMSS) AND rttSampleCount >=
N_RTT_SAMPLE) - if (rttSampleCount >= N_RTT_SAMPLE AND currentRoundMinRTT !=
infinity AND lastRoundMinRTT != infinity)
o RttThresh = clamp(MIN_RTT_THRESH, lastRoundMinRTT / 8, o RttThresh = clamp(MIN_RTT_THRESH, lastRoundMinRTT / 8,
MAX_RTT_THRESH) MAX_RTT_THRESH)
o if (currentRoundMinRTT >= (lastRoundMinRTT + RttThresh)) o if (currentRoundMinRTT >= (lastRoundMinRTT + RttThresh))
+ cssBaselineMinRtt = currentRoundMinRTT + cssBaselineMinRtt = currentRoundMinRTT
+ exit slow start and enter CSS + exit slow start and enter CSS
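The delay increase check in the -04 column can be sketched as follows. The function name should_exit_slow_start and the choice of milliseconds as the unit are assumptions made for illustration. The constants are the recommended values from Section 4.3.

```python
import math

MIN_RTT_THRESH_MS = 4   # MIN_RTT_THRESH = 4 msec
MAX_RTT_THRESH_MS = 16  # MAX_RTT_THRESH = 16 msec
N_RTT_SAMPLE = 8


def clamp(lo, value, hi):
    """Limit value to the inclusive range [lo, hi]."""
    return max(lo, min(value, hi))


def should_exit_slow_start(rtt_sample_count, current_round_min_rtt_ms,
                           last_round_min_rtt_ms):
    """Return True if the delay increase triggers a slow start exit."""
    if rtt_sample_count < N_RTT_SAMPLE:
        return False  # not enough RTT samples in this round yet
    if math.isinf(current_round_min_rtt_ms) or math.isinf(last_round_min_rtt_ms):
        return False  # one of the per-round minimums is not valid yet
    rtt_thresh = clamp(MIN_RTT_THRESH_MS, last_round_min_rtt_ms / 8,
                       MAX_RTT_THRESH_MS)
    return current_round_min_rtt_ms >= last_round_min_rtt_ms + rtt_thresh
```

With a previous-round minimum of 40 ms, the threshold is 40 / 8 = 5 ms, so the check fires once the current-round minimum reaches 45 ms. For very small or very large RTTs the threshold is held at 4 ms or 16 ms respectively.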
skipping to change at page 6, line 12 skipping to change at page 6, line 14
If loss or ECN-marking is observed anytime during standard slow start If loss or ECN-marking is observed anytime during standard slow start
or CSS, enter congestion avoidance. or CSS, enter congestion avoidance.
* ssthresh = cwnd * ssthresh = cwnd
4.3. Tuning constants 4.3. Tuning constants
It is RECOMMENDED that a HyStart++ implementation use the following It is RECOMMENDED that a HyStart++ implementation use the following
constants: constants:
* LOW_CWND = 16
* MIN_RTT_THRESH = 4 msec * MIN_RTT_THRESH = 4 msec
* MAX_RTT_THRESH = 16 msec * MAX_RTT_THRESH = 16 msec
* N_RTT_SAMPLE = 8 * N_RTT_SAMPLE = 8
* CSS_GROWTH_DIVISOR = 4 * CSS_GROWTH_DIVISOR = 4
* CSS_ROUNDS = 5 * CSS_ROUNDS = 5
These constants have been determined with lab measurements and real These constants have been determined with lab measurements and real
world deployments. An implementation MAY tune them for different world deployments. An implementation MAY tune them for different
network characteristics. network characteristics.
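To illustrate how CSS_GROWTH_DIVISOR produces the less aggressive growth mentioned in Section 4.1, the sketch below compares the per-ACK cwnd increment in standard slow start with the one in CSS. The CSS update rule is not shown in this excerpt, so dividing the slow start increment by CSS_GROWTH_DIVISOR is an assumption here. The function name cwnd_increment and the use of integer division are also illustrative.

```python
CSS_GROWTH_DIVISOR = 4  # recommended value from the list above


def cwnd_increment(n_acked_bytes, smss, limit_l, in_css):
    """Bytes to add to cwnd for one ACK (hypothetical helper).

    n_acked_bytes: previously unacknowledged bytes covered by this ACK (N)
    smss: sender maximum segment size in bytes
    limit_l: the cwnd increase limit L from RFC 3465
    in_css: True during Conservative Slow Start
    """
    increment = min(n_acked_bytes, limit_l * smss)
    if in_css:
        # Assumption: CSS scales the slow start increment down by the divisor.
        increment //= CSS_GROWTH_DIVISOR
    return increment
```

For an ACK covering one 1460-byte segment, this yields 1460 bytes in slow start and 365 bytes in CSS.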
Using smaller values of LOW_CWND will cause the algorithm to kick in
before the last round RTT can be measured, particularly if the
implementation uses an initial cwnd of 10 MSS. Higher values will
delay the detection of delay increase and reduce the ability of
HyStart++ to prevent overshoot problems.
The delay increase sensitivity is determined by MIN_RTT_THRESH and The delay increase sensitivity is determined by MIN_RTT_THRESH and
MAX_RTT_THRESH. Smaller values of MIN_RTT_THRESH may cause spurious MAX_RTT_THRESH. Smaller values of MIN_RTT_THRESH may cause spurious
exits from slow start. Larger values of MAX_RTT_THRESH may result in exits from slow start. Larger values of MAX_RTT_THRESH may result in
slow start not exiting until loss is encountered for connections on slow start not exiting until loss is encountered for connections on
large RTT paths. large RTT paths.
A TCP implementation is required to take at least one RTT sample each A TCP implementation is required to take at least one RTT sample each
round. Using lower values of N_RTT_SAMPLE will lower the accuracy of round. Using lower values of N_RTT_SAMPLE will lower the accuracy of
the measured RTT for the round; higher values will improve accuracy the measured RTT for the round; higher values will improve accuracy
at the cost of more processing. at the cost of more processing.
skipping to change at page 7, line 16 skipping to change at page 7, line 10
start (when ssthresh is at its initial, arbitrarily high value per start (when ssthresh is at its initial, arbitrarily high value per
[RFC5681]) and fall back to using traditional slow start for the [RFC5681]) and fall back to using traditional slow start for the
remainder of the connection lifetime. This is acceptable because remainder of the connection lifetime. This is acceptable because
subsequent slow starts will use the discovered ssthresh value to exit subsequent slow starts will use the discovered ssthresh value to exit
slow start and avoid the overshoot problem. An implementation MAY slow start and avoid the overshoot problem. An implementation MAY
use HyStart++ to grow the restart window ([RFC5681]) after a long use HyStart++ to grow the restart window ([RFC5681]) after a long
idle period. idle period.
5. Deployments and Performance Evaluations 5. Deployments and Performance Evaluations
As of the time of writing, HyStart++ draft 01 was default enabled for As of the time of writing, HyStart++ as described in draft versions
all TCP connections in Windows for two years. The original Hystart 01 through 04 was default enabled for all TCP connections in the
has been default-enabled for all TCP connections using the default Windows operating system for over three years. The original Hystart
congestion control module CUBIC ([RFC8312]) for a decade. has been default-enabled for all TCP connections in the Linux
operating system using the default congestion control module CUBIC
([RFC8312]) for a decade.
In lab measurements with Windows TCP, HyStart++ shows both goodput In lab measurements with Windows TCP, HyStart++ shows both goodput
improvements and reductions in packet loss and improvements and reductions in packet loss and
retransmissions. For example, across a variety of tests on a 100 Mbps retransmissions. For example, across a variety of tests on a 100 Mbps
link with a bottleneck buffer size of bandwidth-delay product, link with a bottleneck buffer size of bandwidth-delay product,
HyStart++ reduces bytes retransmitted by 50% and retransmission HyStart++ reduces bytes retransmitted by 50% and retransmission
timeouts by 36%. timeouts by 36%.
In an A/B test for HyStart++ draft 01 across a large Windows device In an A/B test for HyStart++ draft 01 across a large Windows device
population, out of 52 billion TCP connections, 0.7% of connections population, out of 52 billion TCP connections, 0.7% of connections
 End of changes. 18 change blocks. 
57 lines changed or deleted 52 lines changed or added
