draft-ietf-tcpm-alternativebackoff-ecn-00.txt | draft-ietf-tcpm-alternativebackoff-ecn-01.txt | |||
---|---|---|---|---|
Network Working Group N. Khademi | Network Working Group N. Khademi | |||
Internet-Draft M. Welzl | Internet-Draft M. Welzl | |||
Intended status: Experimental University of Oslo | Intended status: Experimental University of Oslo | |||
Expires: August 6, 2017 G. Armitage | Expires: November 5, 2017 G. Armitage | |||
Swinburne University of Technology | Swinburne University of Technology | |||
G. Fairhurst | G. Fairhurst | |||
University of Aberdeen | University of Aberdeen | |||
February 2, 2017 | May 4, 2017 | |||
TCP Alternative Backoff with ECN (ABE) | TCP Alternative Backoff with ECN (ABE) | |||
draft-ietf-tcpm-alternativebackoff-ecn-00 | draft-ietf-tcpm-alternativebackoff-ecn-01 | |||
Abstract | Abstract | |||
This memo updates the TCP sender-side reaction to a congestion | Recent Active Queue Management (AQM) mechanisms instantiate shallow | |||
notification received via Explicit Congestion Notification (ECN). | buffers with burst tolerance to minimise the time that packets spend | |||
The updated method reduces FlightSize in Congestion Avoidance by a | enqueued at a bottleneck. However, shallow buffering can cause | |||
smaller amount than the TCP reaction to loss. The intention is to | noticeable performance degradation when TCP is used over a network | |||
achieve good throughput when the queue at the bottleneck is smaller | path with a large bandwidth-delay-product. Traditional methods rely | |||
than the bandwidth-delay-product of the connection. This is more | on detecting network congestion through reported loss of transport | |||
likely when an Active Queue Management (AQM) mechanism has used ECN | packets. Explicit Congestion Notification (ECN) instead allows a | |||
to CE-mark a packet, than when a packet was lost. Future versions of | router to directly signal incipient congestion. A sending endpoint | |||
this document will also describe a corresponding method for SCTP. | can distinguish when congestion is signalled via ECN, rather than by | |||
packet loss. An ECN signal indicates that an AQM mechanism has done | ||||
its job, and therefore the bottleneck network queue is likely to be | ||||
shallow. This document therefore proposes an update to the TCP | ||||
sender-side ECN reaction in congestion avoidance to reduce the | ||||
FlightSize by a smaller amount than the congestion control | ||||
algorithm's reaction to loss. Future versions of this document will | ||||
also describe a corresponding method for the Stream Control | ||||
Transmission Protocol (SCTP). | ||||
Status of This Memo | Status of This Memo | |||
This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
Drafts is at http://datatracker.ietf.org/drafts/current/. | Drafts is at http://datatracker.ietf.org/drafts/current/. | |||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
This Internet-Draft will expire on August 6, 2017. | This Internet-Draft will expire on November 5, 2017. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2017 IETF Trust and the persons identified as the | Copyright (c) 2017 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
(http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | publication of this document. Please review these documents | |||
carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
include Simplified BSD License text as described in Section 4.e of | include Simplified BSD License text as described in Section 4.e of | |||
the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
described in the Simplified BSD License. | described in the Simplified BSD License. | |||
Table of Contents | Table of Contents | |||
1. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 2 | 1. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 2 | |||
2. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | 2. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | |||
3. Discussion . . . . . . . . . . . . . . . . . . . . . . . . . 4 | 3. Specification . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
3.1. Why Use ECN to Vary the Degree of Backoff? . . . . . . . 4 | 4. Discussion . . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
3.2. Focus on ECN as Defined in RFC3168 . . . . . . . . . . . 4 | 4.1. Why Use ECN to Vary the Degree of Backoff? . . . . . . . 4 | |||
3.3. Discussion: Choice of ABE Multiplier . . . . . . . . . . 4 | 4.2. Focus on ECN as Defined in RFC3168 . . . . . . . . . . . 5 | |||
4. Specification . . . . . . . . . . . . . . . . . . . . . . . . 6 | 4.3. Discussion: Choice of ABE Multiplier . . . . . . . . . . 5 | |||
5. Status of the Update . . . . . . . . . . . . . . . . . . . . 6 | 5. Status of the Update . . . . . . . . . . . . . . . . . . . . 6 | |||
6. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 6 | 6. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 7 | |||
7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 7 | 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 8 | |||
8. Implementation Status . . . . . . . . . . . . . . . . . . . . 7 | 8. Implementation Status . . . . . . . . . . . . . . . . . . . . 8 | |||
9. Security Considerations . . . . . . . . . . . . . . . . . . . 7 | 9. Security Considerations . . . . . . . . . . . . . . . . . . . 8 | |||
10. Revision Information . . . . . . . . . . . . . . . . . . . . 7 | 10. Revision Information . . . . . . . . . . . . . . . . . . . . 8 | |||
11. References . . . . . . . . . . . . . . . . . . . . . . . . . 8 | 11. References . . . . . . . . . . . . . . . . . . . . . . . . . 9 | |||
11.1. Normative References . . . . . . . . . . . . . . . . . . 8 | 11.1. Normative References . . . . . . . . . . . . . . . . . . 9 | |||
11.2. Informative References . . . . . . . . . . . . . . . . . 8 | 11.2. Informative References . . . . . . . . . . . . . . . . . 9 | |||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 9 | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 11 | |||
1. Definitions | 1. Definitions | |||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | |||
document are to be interpreted as described in RFC 2119 [RFC2119]. | document are to be interpreted as described in RFC 2119 [RFC2119]. | |||
2. Introduction | 2. Introduction | |||
Complementing [I-D.AQM-ECN-benefits], [I-D.ECN-exp] enables wider ECN | Explicit Congestion Notification (ECN) [RFC3168] makes it possible | |||
deployment by updating rules in [RFC3168] that prohibited certain | for an Active Queue Management (AQM) mechanism to signal the presence | |||
experiments. Specifically, [I-D.ECN-exp] allows for experiments to | of incipient congestion without incurring packet loss. This lets the | |||
specify a congestion control response to a CE-marked packet that | network deliver some packets to an application that would have been | |||
differs from the response to a dropped packet. This memo defines | dropped if the application or transport did not support ECN. This | |||
such a different congestion control response, called "ABE" | packet loss reduction is the most obvious benefit of ECN, but it is | |||
(Alternative Backoff with ECN). ABE is thus an Experiment in | often relatively modest. There are also significant other benefits | |||
accordance with [I-D.ECN-exp]. | from deploying ECN [RFC8087], including reduced end-to-end network | |||
latency. | ||||
[RFC5681] stipulates that TCP congestion control sets "ssthresh" to | The rules for ECN were originally written to be very conservative, | |||
max(FlightSize / 2, 2*SMSS) in response to packet loss. This | and required the congestion control algorithms of ECN-capable | |||
corresponds to a backoff multiplier of 0.5 (halving cwnd and | transport protocols to treat ECN congestion signals exactly the same | |||
ssthresh after packet loss). Consequently, a standard TCP flow | as they would treat a packet loss [RFC3168]. | |||
using this reaction needs significant network queue space: it can | ||||
only fully utilise a bottleneck when the length of the link queue (or | ||||
the AQM dropping threshold) is at least the bandwidth-delay product | ||||
(BDP) of the flow. | ||||
A backoff multiplier of 0.5 is not the only available strategy. As | Research has demonstrated the benefits of reducing network delays due | |||
defined in [I-D.CUBIC], CUBIC multiplies the current cwnd by 0.7 in | to excessive buffering [BUFFERBLOAT]; this has led to the creation of | |||
response to loss (the Linux implementation of CUBIC has used a | new AQM mechanisms like PIE [RFC8033] and CoDel [CODEL2012] | |||
multiplier of 0.7 since kernel version 2.6.25 released in 2008). | [I-D.CoDel], which avoid causing the bloated queues that are common | |||
Consequently, CUBIC utilises paths well even when the bottleneck | with a simple tail-drop behaviour (also known as a First-In First- | |||
queue is shorter than the bandwidth-delay product of the flow. | Out, FIFO, queue). | |||
However, in the case of a DropTail (FIFO) queue without AQM, such | ||||
less-aggressive backoff increases the risk of creating a standing | ||||
queue [CODEL2012]. | ||||
The standard TCP backoff behaviour defined in [RFC5681] entails | These AQM mechanisms instantiate short queues that are designed to | |||
reduced link utilisation in situations with short queues and low | tolerate packet bursts. However, congestion control mechanisms | |||
statistical multiplexing. This memo proposes a concrete sender-side- | cannot always utilise a bottleneck link well where there are short | |||
only congestion control response that remedies this problem. | queues. For example, to allow a single TCP connection to fully | |||
utilise a network path, the queue at the bottleneck link must be able | ||||
to compensate for TCP halving the "FlightSize" and "ssthresh" | ||||
variables in response to a lost packet [RFC5681]. This requires the | ||||
bottleneck queue to be able to store at least an end-to-end | ||||
bandwidth-delay product (BDP) of data, which effectively doubles both | ||||
the amount of data that can be in flight and the round-trip time | ||||
(RTT) experienced using the network path. | ||||
Devices implementing AQM are likely to be the dominant (and possibly | Modern AQM mechanisms can use ECN to signal the early signs of | |||
only) source of ECN CE-marking for packets from ECN-capable senders. | impending queue buildup long before a tail-drop queue would be forced | |||
AQM mechanisms typically strive to maintain a small average queue | to resort to dropping packets. It is therefore appropriate for the | |||
length, regardless of the bandwidth-delay product of flows passing | transport protocol congestion control algorithm to have a more | |||
through them. Receipt of an ECN CE-mark might therefore reasonably | measured response when an early-warning signal of congestion is | |||
be taken to indicate that a small bottleneck queue exists in the | received in the form of an ECN CE-marked packet. Recognizing these | |||
path, and hence the TCP flow would benefit from using a less | changes in modern AQM practices, more recent rules have relaxed the | |||
aggressive backoff multiplier. | strict requirement that ECN signals be treated identically to packet | |||
loss [I-D.ECN-exp]. Following these newer, more flexible rules, this | ||||
document defines a new sender-side-only congestion control response, | ||||
called "ABE" (Alternative Backoff with ECN). ABE improves the | ||||
performance when routers use shallow buffered AQM mechanisms. | ||||
Much of the background to this proposal can be found in [ABE2015]. | 3. Specification | |||
Using a mix of experiments, theory and simulations with standard | ||||
NewReno and CUBIC, [ABE2015] recommends enabling ECN and letting | This specification describes an update to the congestion control | |||
individual TCP senders use a larger multiplicative decrease factor as | algorithm of an ECN-capable TCP transport protocol. It allows a TCP | |||
a reaction to the receiver reporting ECN CE-marks from AQM-enabled | stack to update the TCP sender response when it receives feedback | |||
bottlenecks. Such a change is noted to result in "...significant | indicating reception of a CE-marked packet. It RECOMMENDS that a TCP | |||
sender multiplies the FlightSize by 0.8 and reduces the slow start | ||||
threshold (ssthresh) in congestion avoidance following reception of a | ||||
TCP segment that sets the ECN-Echo flag (defined in [RFC3168]). | ||||
4. Discussion | ||||
Much of the technical background to this congestion control response | ||||
can be found in a research paper [ABE2017]. This paper used a mix of | ||||
experiments, theory and simulations with standard NewReno and CUBIC | ||||
to evaluate the technique. It examined the impact of enabling ECN | ||||
and letting individual TCP senders back off by a reduced amount in | ||||
reaction to the receiver that reports ECN CE-marks from AQM-enabled | ||||
bottlenecks. The technique was shown to present "...significant | ||||
performance gains in lightly-multiplexed scenarios, without losing | performance gains in lightly-multiplexed scenarios, without losing | |||
the delay-reduction benefits of deploying CoDel or PIE" [I-D.CoDel] | the delay-reduction benefits of deploying CoDel or PIE". The | |||
[I-D.PIE]. This is achieved when reacting to ECN-Echo in Congestion | performance improvement is achieved when reacting to ECN-Echo in | |||
Avoidance by multiplying cwnd and ssthresh by a value in the range | congestion avoidance by multiplying FlightSize and ssthresh by a | |||
[0.7..0.85]. | value in the range [0.7..0.85]. | |||
3. Discussion | 4.1. Why Use ECN to Vary the Degree of Backoff? | |||
3.1. Why Use ECN to Vary the Degree of Backoff? | The classic rule-of-thumb dictates that a network path needs to | |||
provide a BDP of bottleneck buffering if a TCP connection wishes to | ||||
optimise path utilisation. A single TCP bulk transfer running | ||||
through such a bottleneck will have increased its congestion window | ||||
(cwnd) up to 2*BDP by the time that packet loss occurs. When packet | ||||
loss is detected (regarded as a notification of congestion), Standard | ||||
TCP halves the FlightSize and ssthresh [RFC5681], which causes the | ||||
TCP congestion control to go back to allowing only a BDP of packets | ||||
in flight -- just sufficient to maintain 100% utilisation of the | ||||
bottleneck on the network path. | ||||
The classic rule-of-thumb dictates that a transport provides a BDP of | AQM mechanisms such as CoDel [I-D.CoDel] and PIE [RFC8033] set a | |||
bottleneck buffering if a TCP connection wishes to optimise path | delay target in routers and use congestion notifications to constrain | |||
utilisation. A single TCP connection running through such a | the queuing delays experienced by packets, rather than in response to | |||
bottleneck will have opened cwnd up to 2*BDP by the time packet loss | impending or actual bottleneck buffer exhaustion. With current | |||
occurs. [RFC5681]'s halving of cwnd and ssthresh pushes the TCP | default delay targets, CoDel and PIE both effectively emulate a | |||
connection back to allowing only a BDP of packets in flight -- just | shallow buffered bottleneck (section II, [ABE2017]) while also | |||
sufficient to maintain 100% utilisation of the network path. | allowing short traffic bursts into the queue. This provides | |||
acceptable performance for TCP connections over a path with a low | ||||
BDP, or in highly multiplexed scenarios (many concurrent transport | ||||
connections). However, it performs poorly in a lightly-multiplexed | ||||
case (few concurrent connections) over a path with a large BDP. | ||||
Conventional TCP backoff in such cases leads to gaps in packet | ||||
transmission and under-utilisation of the path. | ||||
AQM schemes like CoDel [I-D.CoDel] and PIE [I-D.PIE] use congestion | Instead of discarding packets, an AQM mechanism is allowed to mark | |||
notifications to constrain the queuing delays experienced by packets, | ECN-capable packets with an ECN CE-mark. The reception of a CE-mark | |||
rather than in response to impending or actual bottleneck buffer | not only indicates congestion on the network path, it also indicates | |||
exhaustion. With current default delay targets, CoDel and PIE both | that an AQM mechanism exists at the bottleneck along the path, and | |||
effectively emulate a shallow buffered bottleneck (section II, | hence the CE-mark likely came from a bottleneck with a shallow queue. | |||
[ABE2015]) while allowing short traffic bursts into the queue. This | Reacting differently to an ECN CE-mark than to packet loss can then | |||
interacts acceptably for TCP connections over low BDP paths, or | yield the benefit of a reduced back-off, as with CUBIC [I-D.CUBIC], | |||
highly multiplexed scenarios (many concurrent TCP connections). | when queues are short, yet it can avoid generating excessive delay | |||
However, it interacts badly with lightly-multiplexed cases (few | when queues are long. Using ECN can also be advantageous for several | |||
concurrent connections) over a high BDP path. Conventional TCP | other reasons [RFC8087]. | |||
backoff in such cases leads to gaps in packet transmission and under- | ||||
utilisation of the path. | ||||
The idea to react differently to loss upon detecting an ECN CE-mark | The idea of reacting differently to loss and detection of an ECN CE- | |||
pre-dates [ABE2015]. [ICC2002] also proposed using ECN CE-marks to | mark pre-dates this document. For example, previous research | |||
modify TCP congestion control behaviour, using a larger | proposed using ECN CE-marks to modify TCP congestion control | |||
multiplicative decrease factor in conjunction with a smaller additive | behaviour via a larger multiplicative decrease factor in conjunction | |||
increase factor to work with RED-based bottlenecks that were not | with a smaller additive increase factor [ICC2002]. The goal of this | |||
necessarily configured to emulate a shallow queue. | former work was to operate across AQM bottlenecks using Random Early | |||
Detection (RED) that were not necessarily configured to emulate a | ||||
shallow queue ([RFC7567] notes the current status of RED as an AQM | ||||
method). | ||||
3.2. Focus on ECN as Defined in RFC3168 | 4.2. Focus on ECN as Defined in RFC3168 | |||
Some mechanisms rely on ECN semantics that differ from the | Some transport protocol mechanisms rely on ECN semantics that differ | |||
definitions in [RFC3168] -- for example, Congestion Exposure (ConEx) | from the original ECN definition [RFC3168] -- for example, Congestion | |||
[RFC7713] and DCTCP [I-D.ietf-tcpm-dctcp] need more accurate ECN | Exposure (ConEx) [RFC7713] and Datacenter TCP (DCTCP) | |||
information than the feedback mechanism in [RFC3168] offers (defined | [I-D.ietf-tcpm-dctcp] need more accurate ECN information than that | |||
in [I-D.ietf-tcpm-accurate-ecn]). Such mechanisms allow a sending | offered by the original feedback method. Other mechanisms (e.g., | |||
rate adjustment more frequent than each RTT. These mechanisms are | [I-D.ietf-tcpm-accurate-ecn]) allow the sender to adjust the rate | |||
more frequently than once each path RTT. Use of these mechanisms is | ||||
out of the scope of the current document. | out of the scope of the current document. | |||
3.3. Discussion: Choice of ABE Multiplier | 4.3. Discussion: Choice of ABE Multiplier | |||
Alternative Backoff with ECN (ABE) decouples a TCP sender's reaction | ABE decouples the reaction of a TCP sender to loss and ECN CE-marks | |||
to loss and ECN CE-marks in Congestion Avoidance. The description | when in the congestion avoidance phase. The description respectively | |||
respectively uses beta_{loss} and beta_{ecn} to refer to the | uses beta_{loss} and beta_{ecn} to refer to the multiplicative | |||
multiplicative decrease factors applied in response to packet loss, | decrease factors applied in response to packet loss, and in response | |||
and also in response to a receiver indicating that an ECN CE-mark was | to a receiver indicating that an ECN CE-mark was received on an ECN- | |||
received on an ECN-enabled TCP connection (based on the terms used in | enabled TCP connection. For non-ECN-enabled TCP connections, no ECN | |||
[ABE2015]). For non-ECN-enabled TCP connections, no ECN CE-marks are | CE-marks are received and only beta_{loss} applies. | |||
received and only beta_{loss} applies. | ||||
In other words, in response to detected loss: | In other words, in response to detected loss: | |||
FlightSize_(n+1) = FlightSize_n * beta_{loss} | FlightSize_(n+1) = FlightSize_n * beta_{loss} | |||
and in response to an indication of a received ECN CE-mark: | and in response to an indication of a received ECN CE-mark: | |||
FlightSize_(n+1) = FlightSize_n * beta_{ecn} | FlightSize_(n+1) = FlightSize_n * beta_{ecn} | |||
where, as in [RFC5681], FlightSize is the amount of outstanding data | where FlightSize is the amount of outstanding data in the network, | |||
in the network, upper-bounded by the sender's congestion window | upper-bounded by the sender's cwnd and the receiver's advertised | |||
(cwnd) and the receiver's advertised window (rwnd). The higher the | window (rwnd) [RFC5681]. The higher the values of beta_{loss} and | |||
values of beta_{loss} and beta_{ecn}, the less aggressive the | beta_{ecn}, the less aggressive the response of any individual | |||
response of any individual backoff event. | backoff event. | |||
The appropriate choice for beta_{loss} and beta_{ecn} values is a | The appropriate choice for beta_{loss} and beta_{ecn} values is a | |||
balancing act between path utilisation and draining the bottleneck | balancing act between path utilisation and draining the bottleneck | |||
queue. More aggressive backoff (smaller beta_*) risks underutilising | queue. More aggressive backoff (smaller beta_*) risks underutilising | |||
the path, while less aggressive backoff (larger beta_*) can result in | the path, while less aggressive backoff (larger beta_*) can result in | |||
slower draining of the bottleneck queue. | slower draining of the bottleneck queue. | |||
The Internet has already been running with at least two different | The Internet has already been running with at least two different | |||
beta_{loss} values for several years: the value in [RFC5681] is 0.5, | beta_{loss} values for several years: the standard value is 0.5 | |||
and Linux CUBIC uses 0.7. ABE proposes no change to beta_{loss} used | [RFC5681], and the Linux implementation of CUBIC [I-D.CUBIC] has used | |||
by any current TCP implementations. | a multiplier of 0.7 since kernel version 2.6.25 released in 2008. | |||
ABE proposes no change to beta_{loss} used by current TCP | ||||
implementations. | ||||
beta_{ecn} depends on how the response of a TCP connection to shallow | beta_{ecn} depends on how the response of a TCP connection to shallow | |||
AQM marking thresholds is optimised. beta_{loss} reflects the | AQM marking thresholds is optimised. beta_{loss} reflects the | |||
preferred response of each TCP algorithm when faced with exhaustion | preferred response of each congestion control algorithm when faced | |||
of buffers (of unknown depth) signalled by packet loss. | with exhaustion of buffers (of unknown depth) signalled by packet | |||
Consequently, for any given TCP algorithm the choice of beta_{ecn} is | loss. Consequently, for any given TCP congestion control algorithm | |||
likely to be algorithm-specific, rather than a constant multiple of | the choice of beta_{ecn} is likely to be algorithm-specific, rather | |||
the algorithm's existing beta_{loss}. | than a constant multiple of the algorithm's existing beta_{loss}. | |||
A range of experiments (section IV, [ABE2015]) with NewReno and CUBIC | A range of tests (section IV, [ABE2017]) with NewReno and CUBIC over | |||
over CoDel and PIE in lightly-multiplexed scenarios have explored | CoDel and PIE in lightly-multiplexed scenarios have explored this | |||
this choice of parameter. These experiments indicate that CUBIC | choice of parameter. The results of these tests indicate that CUBIC | |||
connections benefit from beta_{ecn} of 0.85 (cf. beta_{loss} = 0.7), | connections benefit from beta_{ecn} of 0.85 (cf. beta_{loss} = 0.7), | |||
and NewReno connections see improvements with beta_{ecn} in the range | and NewReno connections see improvements with beta_{ecn} in the range | |||
0.7 to 0.85 (cf. beta_{loss} = 0.5). | 0.7 to 0.85 (cf. beta_{loss} = 0.5). | |||
4. Specification | ||||
This document RECOMMENDS that experimental deployments multiply the | ||||
FlightSize by 0.8 and reduce the slow start threshold 'ssthresh' in | ||||
Congestion Avoidance in response to reception of a TCP segment that | ||||
sets the ECN-Echo flag. | ||||
5. Status of the Update | 5. Status of the Update | |||
This update is a sender-side only change. Like other changes to | This update is a sender-side only change. Like other changes to | |||
congestion-control algorithms it does not require any change to the | congestion-control algorithms, it does not require any change to the | |||
TCP receiver or to network devices (except to enable an ECN-marking | TCP receiver or to network devices. It does not require any ABE- | |||
algorithm [RFC3168] [RFC7567]). If the method is only deployed by | specific changes in routers or the use of Accurate ECN feedback | |||
some TCP senders, and not by others, the senders that use this method | [I-D.ietf-tcpm-accurate-ecn] by a receiver. | |||
can gain advantage, possibly at the expense of other flows that do | ||||
not use this updated method. This advantage applies only to ECN- | ||||
marked packets and not to loss indications. Hence, the new method | ||||
can not lead to congestion collapse. | ||||
The present specification has been assigned an Experimental status, | The currently published ECN specification requires that the | |||
to provide Internet deployment experience before being proposed as a | congestion control response to a CE-marked packet is the same as the | |||
Standards-Track update. | response to a dropped packet [RFC3168]. The specification is | |||
currently being updated to allow for specifications that do not | ||||
follow this rule [I-D.ECN-exp]. The present specification defines | ||||
such an experiment and has thus been assigned an Experimental status | ||||
before being proposed as a Standards-Track update. | ||||
This experiment will evaluate the impact of ABE on the Internet. The | The purpose of the Internet experiment is to collect experience with | |||
result will be reported by presentation to the TCPM WG (or IESG) or | deployment of ABE, and confirm the safety in deployed networks using | |||
an implementation report at the end of the experiment. Progressing | this update to TCP congestion control. | |||
the experiment requires support of ECN-marking packets carrying the | ||||
ECT(0) codepoint by routers [I-D.ECN-exp], but does not require any | When used with bottlenecks that do not support ECN-marking, the | |||
ABE-specific changes in routers or Accurate ECN feedback | specification does not modify the transport protocol. | |||
[I-D.ietf-tcpm-accurate-ecn] from receivers. | ||||
To evaluate the benefit, this experiment therefore requires AQM | ||||
routers to enable an ECN-marking mechanism [RFC3168] [RFC7567] that | ||||
marks packets carrying the ECN Capable Transport, ECT(0), codepoint | ||||
[RFC3168]. | ||||
If the method is only deployed by some senders, and not by others, | ||||
the senders that use this method can gain some advantage, possibly at | ||||
the expense of other flows that do not use this updated method. | ||||
Because this advantage applies only to ECN-marked packets and not to | ||||
loss indications, the new method cannot lead to congestion collapse. | ||||
The result of this Internet experiment will be reported by | ||||
presentation to the TCPM WG (or IESG) or an implementation report at | ||||
the end of the experiment. | ||||
6. Acknowledgements | 6. Acknowledgements | |||
Authors N. Khademi, M. Welzl and G. Fairhurst were part-funded by | Authors N. Khademi, M. Welzl and G. Fairhurst were part-funded by | |||
the European Community under its Seventh Framework Programme through | the European Community under its Seventh Framework Programme through | |||
the Reducing Internet Transport Latency (RITE) project (ICT-317700). | the Reducing Internet Transport Latency (RITE) project (ICT-317700). | |||
The views expressed are solely those of the authors. | The views expressed are solely those of the authors. | |||
The authors would like to thank the following people for their | The authors would like to thank Stuart Cheshire for many suggestions | |||
contributions to [ABE2015]: Chamil Kulatunga, David Ros, Stein | when revising the draft, and the following people for their | |||
Gjessing, Sebastian Zander. Thanks to (in alphabetical order) Bob | contributions to [ABE2017]: Chamil Kulatunga, David Ros, Stein | |||
Briscoe, Markku Kojo, John Leslie, Dave Taht and the TCPM WG for | Gjessing, Sebastian Zander. Thanks also to (in alphabetical order) | |||
providing valuable feedback on this document. | Bob Briscoe, Markku Kojo, John Leslie, Dave Taht and the TCPM working | |||
group for providing valuable feedback on this document. | ||||
The authors would like to thank feedback on the congestion control | The authors would finally like to thank everyone who provided | |||
behaviour specified in this update received from the IRTF Internet | feedback on the congestion control behaviour specified in this update | |||
Congestion Control Research Group (ICCRG). | received from the IRTF Internet Congestion Control Research Group | |||
(ICCRG). | ||||
7. IANA Considerations | 7. IANA Considerations | |||
XX RFC ED - PLEASE REMOVE THIS SECTION XXX | XX RFC ED - PLEASE REMOVE THIS SECTION XXX | |||
This memo includes no request to IANA. | This document includes no request to IANA. | |||
8. Implementation Status | 8. Implementation Status | |||
ABE is implemented as a patch for Linux and FreeBSD. It is meant for | ABE is implemented as a patch for Linux and FreeBSD. It is meant for | |||
research and available for download from | research and available for download from | |||
http://heim.ifi.uio.no/naeemk/research/ABE/ This code was used to | http://heim.ifi.uio.no/naeemk/research/ABE/ This code was used to | |||
produce the test results that are reported in [ABE2015]. | produce the test results that are reported in [ABE2017]. | |||
9. Security Considerations | 9. Security Considerations | |||
The described method is a sender-side only transport change, and does | The described method is a sender-side only transport change, and does | |||
not change the protocol messages exchanged. The security | not change the protocol messages exchanged. The security | |||
considerations of [RFC3168] therefore still apply. | considerations for ECN [RFC3168] therefore still apply. | |||
This document describes a change to TCP congestion control with ECN | This is a change to TCP congestion control with ECN that will | |||
that will typically lead to a change in the capacity achieved when | typically lead to a change in the capacity achieved when flows share | |||
flows share a network bottleneck. Similar unfairness in the way that | a network bottleneck. This could result in some flows receiving more | |||
capacity is shared is also exhibited by other congestion control | than their fair share of capacity. Similar unfairness in the way | |||
that capacity is shared is also exhibited by other congestion control | ||||
mechanisms that have been in use in the Internet for many years | mechanisms that have been in use in the Internet for many years | |||
(e.g., CUBIC [I-D.CUBIC]). Unfairness may also be a result of other | (e.g., CUBIC [I-D.CUBIC]). Unfairness may also be a result of other | |||
factors, including the round trip time experienced by a flow. This | factors, including the round trip time experienced by a flow. ABE | |||
advantage applies only to ECN-marked packets and not to loss | applies only when ECN-marked packets are received, not when packets | |||
indications, and will therefore not lead to congestion collapse. | are lost, hence use of ABE cannot lead to congestion collapse. | |||
10. Revision Information | 10. Revision Information | |||
XX RFC ED - PLEASE REMOVE THIS SECTION XXX | XX RFC ED - PLEASE REMOVE THIS SECTION XXX | |||
-01. Text improved, mainly incorporating comments from Stuart | ||||
Cheshire. The reference to a technical report has been updated to a | ||||
published version of the tests [ABE2017]. Used "AQM Mechanism" | ||||
throughout in place of other alternatives, and more consistent use of | ||||
technical language and clarification on the intended purpose of the | ||||
experiments required by EXP status. There was no change to the | ||||
technical content. | ||||
-00. draft-ietf-tcpm-alternativebackoff-ecn-00 replaces draft- | -00. draft-ietf-tcpm-alternativebackoff-ecn-00 replaces draft- | |||
khademi-tcpm-alternativebackoff-ecn-01. Text describing the nature | khademi-tcpm-alternativebackoff-ecn-01. Text describing the nature | |||
of the experiment was added. | of the experiment was added. | |||
-01. This I-D now refers to draft-black-tsvwg-ecn-experimentation- | Individual draft -01. This I-D now refers to draft-black-tsvwg-ecn- | |||
02, which replaces draft-khademi-tsvwg-ecn-response-00 to make a | experimentation-02, which replaces draft-khademi-tsvwg-ecn- | |||
broader update to RFC3168 for the sake of allowing experiments. As a | response-00 to make a broader update to RFC3168 for the sake of | |||
result, some of the motivating and discussing text that was moved | allowing experiments. As a result, some of the motivating and | |||
from draft-khademi-alternativebackoff-ecn-03 to draft-khademi-tsvwg- | discussing text that was moved from draft-khademi-alternativebackoff- | |||
ecn-response-00 has now been re-inserted here. | ecn-03 to draft-khademi-tsvwg-ecn-response-00 has now been re- | |||
inserted here. | ||||
-00. draft-khademi-tsvwg-ecn-response-00 and draft-khademi-tcpm- | Individual draft -00. draft-khademi-tsvwg-ecn-response-00 and draft- | |||
alternativebackoff-ecn-00 replace draft-khademi-alternativebackoff- | khademi-tcpm-alternativebackoff-ecn-00 replace draft-khademi- | |||
ecn-03, following discussion in the TSVWG and TCPM working groups. | alternativebackoff-ecn-03, following discussion in the TSVWG and TCPM | |||
working groups. | ||||
11. References | 11. References | |||
11.1. Normative References | 11.1. Normative References | |||
[I-D.ECN-exp] | [I-D.ECN-exp] | |||
Black, D., "Explicit Congestion Notification (ECN) | Black, D., "Explicit Congestion Notification (ECN) | |||
Experimentation", Internet-draft, IETF work-in-progress | Experimentation", Internet-draft, IETF work-in-progress | |||
draft-black-tsvwg-ecn-experimentation-02, October 2016. | draft-ietf-tsvwg-ecn-experimentation-02, April 2017. | |||
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
Requirement Levels", BCP 14, RFC 2119, | Requirement Levels", BCP 14, RFC 2119, | |||
DOI 10.17487/RFC2119, March 1997, | DOI 10.17487/RFC2119, March 1997, | |||
<http://www.rfc-editor.org/info/rfc2119>. | <http://www.rfc-editor.org/info/rfc2119>. | |||
[RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition | [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition | |||
of Explicit Congestion Notification (ECN) to IP", | of Explicit Congestion Notification (ECN) to IP", | |||
RFC 3168, DOI 10.17487/RFC3168, September 2001, | RFC 3168, DOI 10.17487/RFC3168, September 2001, | |||
<http://www.rfc-editor.org/info/rfc3168>. | <http://www.rfc-editor.org/info/rfc3168>. | |||
skipping to change at page 8, line 35 | skipping to change at page 9, line 45 | |||
Control", RFC 5681, DOI 10.17487/RFC5681, September 2009, | Control", RFC 5681, DOI 10.17487/RFC5681, September 2009, | |||
<http://www.rfc-editor.org/info/rfc5681>. | <http://www.rfc-editor.org/info/rfc5681>. | |||
[RFC7567] Baker, F., Ed. and G. Fairhurst, Ed., "IETF | [RFC7567] Baker, F., Ed. and G. Fairhurst, Ed., "IETF | |||
Recommendations Regarding Active Queue Management", | Recommendations Regarding Active Queue Management", | |||
BCP 197, RFC 7567, DOI 10.17487/RFC7567, July 2015, | BCP 197, RFC 7567, DOI 10.17487/RFC7567, July 2015, | |||
<http://www.rfc-editor.org/info/rfc7567>. | <http://www.rfc-editor.org/info/rfc7567>. | |||
11.2. Informative References | 11.2. Informative References | |||
[ABE2015] Khademi, N., Welzl, M., Armitage, G., Kulatunga, C., Ros, | [ABE2017] Khademi, N., Armitage, G., Welzl, M., Fairhurst, G., | |||
D., Fairhurst, G., Gjessing, S., and S. Zander, | Zander, S., and D. Ros, "Alternative Backoff: Achieving | |||
"Alternative Backoff: Achieving Low Latency and High | Low Latency and High Throughput with ECN and AQM", IFIP | |||
Throughput with ECN and AQM", CAIA Technical Report CAIA- | NETWORKING 2017, Stockholm, Sweden, June 2017. | |||
TR-150710A, Swinburne University of Technology, July 2015, | ||||
<http://caia.swin.edu.au/reports/150710A/ | [BUFFERBLOAT] | |||
CAIA-TR-150710A.pdf>. | "Bufferbloat project", | |||
<https://www.bufferbloat.net/projects/bloat/wiki/ | ||||
Introduction/>. | ||||
[CODEL2012] | [CODEL2012] | |||
Nichols, K. and V. Jacobson, "Controlling Queue Delay", | Nichols, K. and V. Jacobson, "Controlling Queue Delay", | |||
July 2012, <http://queue.acm.org/detail.cfm?id=2209336>. | July 2012, <http://queue.acm.org/detail.cfm?id=2209336>. | |||
[I-D.AQM-ECN-benefits] | ||||
Fairhurst, G. and M. Welzl, "The Benefits of using | ||||
Explicit Congestion Notification (ECN)", Internet-draft, | ||||
IETF work-in-progress draft-ietf-aqm-ecn-benefits-08, | ||||
November 2015. | ||||
[I-D.CoDel] | [I-D.CoDel] | |||
Nichols, K., Jacobson, V., McGregor, V., and J. Iyengar, | Nichols, K., Jacobson, V., McGregor, V., and J. Iyengar, | |||
"Controlled Delay Active Queue Management", Internet- | "Controlled Delay Active Queue Management", Internet- | |||
draft, IETF work-in-progress draft-ietf-aqm-codel-04, June | draft, IETF work-in-progress draft-ietf-aqm-codel-07, | |||
2016. | March 2017. | |||
[I-D.CUBIC] | [I-D.CUBIC] | |||
Rhee, I., Xu, L., Ha, S., Zimmermann, A., Eggert, L., and | Rhee, I., Xu, L., Ha, S., Zimmermann, A., Eggert, L., and | |||
R. Scheffenegger, "CUBIC for Fast Long-Distance Networks", | R. Scheffenegger, "CUBIC for Fast Long-Distance Networks", | |||
Internet-draft, IETF work-in-progress draft-ietf-tcpm- | Internet-draft, IETF work-in-progress draft-ietf-tcpm- | |||
cubic-02, August 2016. | cubic-04, February 2017. | |||
[I-D.ietf-tcpm-accurate-ecn] | [I-D.ietf-tcpm-accurate-ecn] | |||
Briscoe, B., Kuehlewind, M., and R. Scheffenegger, "More | Briscoe, B., Kuehlewind, M., and R. Scheffenegger, "More | |||
Accurate ECN Feedback in TCP", draft-ietf-tcpm-accurate- | Accurate ECN Feedback in TCP", draft-ietf-tcpm-accurate- | |||
ecn-01 (work in progress), June 2016. | ecn-01 (work in progress), June 2016. | |||
[I-D.ietf-tcpm-dctcp] | [I-D.ietf-tcpm-dctcp] | |||
Bensley, S., Eggert, L., Thaler, D., Balasubramanian, P., | Bensley, S., Eggert, L., Thaler, D., Balasubramanian, P., | |||
and G. Judd, "Datacenter TCP (DCTCP): TCP Congestion | and G. Judd, "Datacenter TCP (DCTCP): TCP Congestion | |||
Control for Datacenters", draft-ietf-tcpm-dctcp-02 (work | Control for Datacenters", draft-ietf-tcpm-dctcp-02 (work | |||
in progress), July 2016. | in progress), July 2016. | |||
[I-D.PIE] Pan, R., Natarajan, P., Baker, F., and G. White, "PIE: A | ||||
Lightweight Control Scheme To Address the Bufferbloat | ||||
Problem", Internet-draft, IETF work-in-progress draft- | ||||
ietf-aqm-pie-10, September 2016. | ||||
[ICC2002] Kwon, M. and S. Fahmy, "TCP Increase/Decrease Behavior | [ICC2002] Kwon, M. and S. Fahmy, "TCP Increase/Decrease Behavior | |||
with Explicit Congestion Notification (ECN)", IEEE | with Explicit Congestion Notification (ECN)", IEEE | |||
ICC 2002, New York, New York, USA, May 2002, | ICC 2002, New York, New York, USA, May 2002, | |||
<http://dx.doi.org/10.1109/ICC.2002.997262>. | <http://dx.doi.org/10.1109/ICC.2002.997262>. | |||
[RFC7713] Mathis, M. and B. Briscoe, "Congestion Exposure (ConEx) | [RFC7713] Mathis, M. and B. Briscoe, "Congestion Exposure (ConEx) | |||
Concepts, Abstract Mechanism, and Requirements", RFC 7713, | Concepts, Abstract Mechanism, and Requirements", RFC 7713, | |||
DOI 10.17487/RFC7713, December 2015, | DOI 10.17487/RFC7713, December 2015, | |||
<http://www.rfc-editor.org/info/rfc7713>. | <http://www.rfc-editor.org/info/rfc7713>. | |||
[RFC8033] Pan, R., Natarajan, P., Baker, F., and G. White, | ||||
"Proportional Integral Controller Enhanced (PIE): A | ||||
Lightweight Control Scheme to Address the Bufferbloat | ||||
Problem", RFC 8033, DOI 10.17487/RFC8033, February 2017, | ||||
<http://www.rfc-editor.org/info/rfc8033>. | ||||
[RFC8087] Fairhurst, G. and M. Welzl, "The Benefits of Using | ||||
Explicit Congestion Notification (ECN)", RFC 8087, | ||||
DOI 10.17487/RFC8087, March 2017, | ||||
<http://www.rfc-editor.org/info/rfc8087>. | ||||
Authors' Addresses | Authors' Addresses | |||
Naeem Khademi | Naeem Khademi | |||
University of Oslo | University of Oslo | |||
PO Box 1080 Blindern | PO Box 1080 Blindern | |||
Oslo N-0316 | Oslo N-0316 | |||
Norway | Norway | |||
Email: naeemk@ifi.uio.no | Email: naeemk@ifi.uio.no | |||
Michael Welzl | Michael Welzl | |||
University of Oslo | University of Oslo | |||
PO Box 1080 Blindern | PO Box 1080 Blindern | |||
Oslo N-0316 | Oslo N-0316 | |||
Norway | Norway | |||
Email: michawe@ifi.uio.no | Email: michawe@ifi.uio.no | |||
Grenville Armitage | Grenville Armitage | |||
Centre for Advanced Internet Architectures | Centre for Advanced Internet Architectures | |||
End of changes. 49 change blocks. | ||||
208 lines changed or deleted | 274 lines changed or added | |||