draft-ietf-tcpm-newcwv-00.txt   draft-ietf-tcpm-newcwv-01.txt 
TCPM Working Group G. Fairhurst TCPM Working Group G. Fairhurst
Internet-Draft A. Sathiaseelan Internet-Draft A. Sathiaseelan
Obsoletes: 2861 (if approved) University of Aberdeen Obsoletes: 2861 (if approved) R. Secchi
Updates: 5681 (if approved) February 14, 2013 Updates: 5681 (if approved) University of Aberdeen
Intended status: Standards Track Intended status: Standards Track June 20, 2013
Expires: August 18, 2013 Expires: December 22, 2013
Updating TCP to support Rate-Limited Traffic Updating TCP to support Rate-Limited Traffic
draft-ietf-tcpm-newcwv-00 draft-ietf-tcpm-newcwv-01
Abstract Abstract
This document proposes an update to RFC 5681 to address issues that This document proposes an update to RFC 5681 to address issues that
arise when TCP is used to support traffic that exhibits periods where arise when TCP is used to support traffic that exhibits periods where
the sending rate is limited by the application rather than the the sending rate is limited by the application rather than the
congestion window. It updates TCP to allow a TCP sender to restart congestion window. It updates TCP to allow a TCP sender to restart
quickly following either an idle or rate-limited interval. This quickly following either an idle or rate-limited interval. This
method is expected to benefit applications that send rate-limited method is expected to benefit applications that send rate-limited
traffic using TCP, while also providing an appropriate response if traffic using TCP, while also providing an appropriate response if
congestion is experienced. congestion is experienced.
It also evaluates TCP Congestion Window Validation, CWV, an IETF It also evaluates the Experimental specification of TCP Congestion
experimental specification defined in RFC 2861, and concludes that Window Validation, CWV, defined in RFC 2861, and concludes that RFC
CWV sought to address important issues, but failed to deliver a 2861 sought to address important issues, but failed to deliver a
widely used solution. This document therefore proposes an update to widely used solution. This document therefore recommends that the
the status of RFC 2861 by recommending it is moved from Experimental status of RFC 2861 is moved from Experimental to Historic, and that
to Historic status, and that it is replaced by the current it is replaced by the current specification.
specification.
NOTE: The standards status of this WG document is under review for NOTE: The standards status of this WG document is under review for
consideration as either Experimental (EXP) or Proposed Standard (PS). consideration as either Experimental (EXP) or Proposed Standard (PS).
This decision will be made later as the document is finalised. This decision will be made later as the document is finalised.
Status of this Memo Status of this Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on August 18, 2013. This Internet-Draft will expire on December 22, 2013.
Copyright Notice Copyright Notice
Copyright (c) 2013 IETF Trust and the persons identified as the Copyright (c) 2013 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4
2. Reviewing experience with TCP-CWV . . . . . . . . . . . . . . 4 2. Reviewing experience with TCP-CWV . . . . . . . . . . . . . . 5
3. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 5 3. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 5
4. An updated TCP response to idle and application-limited 4. An updated TCP response to idle and application-limited
periods . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 periods . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
4.1. A method for preserving cwnd during the idle and 4.1. A method for preserving cwnd during the idle and
application-limited periods. . . . . . . . . . . . . . . . 7 application-limited periods. . . . . . . . . . . . . . . . 7
4.2. The nonvalidated phase . . . . . . . . . . . . . . . . . . 7 4.2. Initialisation . . . . . . . . . . . . . . . . . . . . . . 8
4.3. TCP congestion control during the nonvalidated phase . . . 8 4.3. The nonvalidated phase . . . . . . . . . . . . . . . . . . 8
4.3.1. Response to congestion in the nonvalidated phase . . . 9 4.4. TCP congestion control during the nonvalidated phase . . . 8
4.3.2. Adjustment at the end of the nonvalidated phase . . . 9 4.4.1. Response to congestion in the nonvalidated phase . . . 9
5. Determining a safe period to preserve cwnd . . . . . . . . . . 10 4.4.2. Adjustment at the end of the nonvalidated phase . . . 10
6. Security Considerations . . . . . . . . . . . . . . . . . . . 11 4.4.3. Examples of Implementation . . . . . . . . . . . . . . 11
7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 11 5. Determining a safe period to preserve cwnd . . . . . . . . . . 12
8. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 11 6. Security Considerations . . . . . . . . . . . . . . . . . . . 13
9. Author Notes . . . . . . . . . . . . . . . . . . . . . . . . . 12 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 13
9.1. Other related work . . . . . . . . . . . . . . . . . . . . 12 8. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 13
9.2. Revision notes . . . . . . . . . . . . . . . . . . . . . . 14 9. Author Notes . . . . . . . . . . . . . . . . . . . . . . . . . 14
10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 15 9.1. Other related work . . . . . . . . . . . . . . . . . . . . 14
10.1. Normative References . . . . . . . . . . . . . . . . . . . 15 9.2. Revision notes . . . . . . . . . . . . . . . . . . . . . . 16
10.2. Informative References . . . . . . . . . . . . . . . . . . 15 10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 16 10.1. Normative References . . . . . . . . . . . . . . . . . . . 17
10.2. Informative References . . . . . . . . . . . . . . . . . . 18
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 18
1. Introduction 1. Introduction
TCP is used to support a range of application behaviours. The TCP TCP is used to support a range of application behaviours. The TCP
congestion window (cwnd) controls the number of unacknoeledged congestion window (cwnd) controls the number of unacknowledged
packets/bytes that a TCP flow may have in the network at any time, a packets/bytes that a TCP flow may have in the network at any time, a
value known as the FlightSize [RFC5681]. A bulk application will value known as the FlightSize [RFC5681]. A bulk application will
always have data available to transmit. The rate at which it sends always have data available to transmit. The rate at which it sends
is therefore limited by the maximum permitted by the receiver and is therefore limited by the maximum permitted by the receiver
congestion windows. In contrast, a rate-limited application will advertised window and the sender congestion window (cwnd). In
experience periods when the sender is either idle or is unable to contrast, a rate-limited application will experience periods when the
send at the maximum rate permitted by the cwnd. This latter case is sender is either idle or is unable to send at the maximum rate
called rate-limited. The focus of this document is on the operation permitted by the cwnd. This latter case is called rate-limited. The
of TCP in such an idle or rate-limited case. focus of this document is on the operation of TCP in such an idle or
rate-limited case.
Standard TCP [RFC5681] requires the cwnd to be reset to the restart Standard TCP [RFC5681] requires the cwnd to be reset to the restart
window (RW) when an application becomes idle. [RFC2861] noted that window (RW) when an application becomes idle. [RFC2861] noted that
this TCP behaviour was not always observed in current this TCP behaviour was not always observed in current
implementations. Recent experiments [Bis08] confirm this to still be implementations. Recent experiments [Bis08] confirm this to still be
the case. the case.
Standard TCP does not impose additional restrictions on the growth of Standard TCP does not impose additional restrictions on the growth of
the cwnd when a TCP sender is rate-limited. A rate-limited sender the cwnd when a TCP sender is rate-limited. A rate-limited sender
may therefore grow a cwnd far beyond that corresponding to the may therefore grow a cwnd far beyond that corresponding to the
current transmit rate, resulting in a value that does not reflect current transmit rate, resulting in a value that does not reflect
current information about the state of the network path the flow is current information about the state of the network path the flow is
using. Use of such an invalid cwnd may result in reduced application using. Use of such an invalid cwnd may result in reduced application
performance and/or could significantly contribute to network performance and/or could significantly contribute to network
congestion. congestion.
[RFC2861] proposed a solution to these issues in an experimental [RFC2861] proposed a solution to these issues in an experimental
method known as Congestion Window Validation (CWV). CWV was intended method known as Congestion Window Validation (CWV). CWV was intended
to help reduce cases where TCP accumulated an invalid cwnd. The use to help reduce cases where TCP accumulated an invalid cwnd. The use
and drawbacks of using CWV with an application are discussed in and drawbacks of using the CWV algorithm in RFC 2861 with an
Section 2. application are discussed in Section 2.
Section 3 defines relevant terminology. Section 3 defines relevant terminology.
Section 4 specifies an alternative to CWV that seeks to address the Section 4 specifies an alternative to CWV that seeks to address the
same issues, but does this in a way that is expected to mitigate the same issues, but does this in a way that is expected to mitigate the
impact on an application that varies its sending rate. The method impact on an application that varies its sending rate. The method
described applies to both a rate-limited and an idle condition. described applies to both a rate-limited and an idle condition.
Section 5 describes the rationale for selecting the safe period to
preserve the cwnd.
2. Reviewing experience with TCP-CWV 2. Reviewing experience with TCP-CWV
RFC 2861 described a simple modification to the TCP congestion RFC 2861 described a simple modification to the TCP congestion
control algorithm that decayed the cwnd after the transition to a control algorithm that decayed the cwnd after the transition to a
"sufficiently-long" idle period. This used the slow-start threshold "sufficiently-long" idle period. This used the slow-start threshold
(ssthresh) to save information about the previous value of the (ssthresh) to save information about the previous value of the
congestion window. The approach relaxed the standard TCP behaviour congestion window. The approach relaxed the standard TCP behaviour
[RFC5681] for an idle session, intended to improve application [RFC5681] for an idle session, intended to improve application
performance. CWV also modified the behaviour for a rate-limited performance. CWV also modified the behaviour for a rate-limited
skipping to change at page 5, line 22 skipping to change at page 5, line 28
the default behaviour [Bis08]. Analysis (e.g. [Bis10] [Fai12]) has the default behaviour [Bis08]. Analysis (e.g. [Bis10] [Fai12]) has
shown that a TCP sender using CWV is able to use available capacity shown that a TCP sender using CWV is able to use available capacity
on a shared path after an idle period. This can benefit some on a shared path after an idle period. This can benefit some
applications, especially over long delay paths, when compared to the applications, especially over long delay paths, when compared to the
slow-start restart specified by standard TCP. However, CWV would slow-start restart specified by standard TCP. However, CWV would
only benefit an application if the idle period were less than several only benefit an application if the idle period were less than several
Retransmission Time Out (RTO) intervals [RFC6298], since the Retransmission Time Out (RTO) intervals [RFC6298], since the
behaviour would otherwise be the same as for standard TCP, which behaviour would otherwise be the same as for standard TCP, which
resets the cwnd to the TCP Restart Window (RW) after this period. resets the cwnd to the TCP Restart Window (RW) after this period.
Experience with CWV suggests that although CWV benefits the network Experience with RFC 2861 suggests that although the CWV method
in a rate-limited scenario (reducing the probability of network benefited the network in a rate-limited scenario (reducing the
congestion), the behaviour can be too conservative for many common probability of network congestion), the behaviour was too
rate-limited applications. This mechanism does not therefore offer conservative for many common rate-limited applications. This
the desirable increase in application performance for rate-limited mechanism did not therefore offer the desirable increase in
applications and it is unclear whether applications actually use this application performance for rate-limited applications and it is
mechanism in the general Internet. unclear whether applications actually use this mechanism in the
general Internet.
It is therefore concluded that CWV is often a poor solution for many It is therefore concluded that CWV, as defined in RFC 2861, was often
rate-limited applications. It has the correct motivation, but has a poor solution for many rate-limited applications. It had the
the wrong approach to solving this problem. correct motivation, but had the wrong approach to solving this
problem.
3. Terminology 3. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119]. document are to be interpreted as described in [RFC2119].
The document assumes familiarity with the terminology of TCP The document assumes familiarity with the terminology of TCP
congestion control [RFC5681]. congestion control [RFC5681].
The following new terminology is introduced: The following new terminology is introduced:
Validated phase: The phase where the cwnd reflects a current estimate pipeACK: A variable that records the volume of data acknowledged by
of the available path capacity. the network within an RTT.
pipeACK Sampling Period: The maximum period that a measured sample of
the pipeACK may influence the pipeACK variable.
Non-validated phase: The phase where the cwnd reflects a previous Non-validated phase: The phase where the cwnd reflects a previous
measurement of the available path capacity. measurement of the available path capacity.
Non-validated period, NVP: The maximum period for which cwnd is Non-validated period, NVP: The maximum period for which cwnd is
preserved in the non-validated phase. preserved in the non-validated phase.
Rate-limited: A TCP flow that does not consume more than one half of Rate-limited: A TCP flow that does not consume more than one half of
cwnd, and hence operates in the non-validated phase. cwnd, and hence operates in the non-validated phase.
pipe ACK: The measured volume of data that was acknowledged by the Validated phase: The phase where the cwnd reflects a current estimate
network per RTT. of the available path capacity.
4. An updated TCP response to idle and application-limited periods 4. An updated TCP response to idle and application-limited periods
This section proposes an update to the TCP congestion control This section proposes an update to the TCP congestion control
behaviour during an idle or rate-limited period. The new method behaviour during an idle or rate-limited period. The new method
permits a TCP sender to preserve the cwnd when an application becomes permits a TCP sender to preserve the cwnd when an application becomes
idle for a period of time (to be known as the non-validated period, idle for a period of time (the non-validated period, NVP, see section
NVP, see section 5). The period where actual usage is less than 5). The period where actual usage is less than allowed by cwnd, is
allowed by cwnd, is named as the non-validated phase. This method named as the non-validated phase. This method allows an application
allows an application to resume transmission at a previous rate to resume transmission at a previous rate without incurring the delay
without incurring the delay of slow-start. However, if the TCP of slow-start. However, if the TCP sender experiences congestion
sender experiences congestion using the preserved cwnd, it is using the preserved cwnd, it is required to immediately reset the
required to immediately reset the cwnd to an appropriate value cwnd to an appropriate value specified by the method. If a sender
specified by the method. If a sender does not take advantage of the does not take advantage of the preserved cwnd within the NVP, the
preserved cwnd within the NVP, the value of cwnd is reduced, ensuring value of cwnd is reduced, ensuring the value better reflects the
the value better reflects the capacity that was recently actually capacity that was recently actually used.
used.
The method requires that the TCP SACK option [RFC3517] is enabled.
This allows the sender to select an appropriate value for the cwnd
following a congestion event that is based on the measured path
capacity, and better reflects the fair-share. A similar approach was
proposed by TCP Jump Start [Liu07], as a congestion response after
more rapid opening of a TCP connection.
It is expected that this update will satisfy the requirements of many It is expected that this update will satisfy the requirements of many
rate-limited applications and at the same time provide an appropriate rate-limited applications and at the same time provide an appropriate
method for use in the Internet. It also reduces the incentive for an method for use in the Internet. It also reduces the incentive for an
application to send data simply to keep transport congestion state. application to send data simply to keep transport congestion state.
(This is sometimes known as "padding"). (This is sometimes known as "padding").
The new method does not differentiate between times when the sender The new method does not differentiate between times when the sender
has become idle or rate-limited. This is partly a response to has become idle or rate-limited. This is partly a response to
recognition that some applications wish to transmit at a rate less recognition that some applications wish to transmit at a rate less
skipping to change at page 7, line 11 skipping to change at page 7, line 12
expected to encourage applications and TCP stacks to use standards- expected to encourage applications and TCP stacks to use standards-
based congestion control methods. It may also encourage the use of based congestion control methods. It may also encourage the use of
long-lived connections where this offers benefit (such as persistent long-lived connections where this offers benefit (such as persistent
http). http).
The method is specified in the following subsections. The method is specified in the following subsections.
4.1. A method for preserving cwnd during the idle and application- 4.1. A method for preserving cwnd during the idle and application-
limited periods. limited periods.
The method described in this document updates [RFC5681]. Use of the
method REQUIRES a TCP sender and the corresponding receiver to enable
the TCP SACK option [RFC3517].
[RFC5681] defines a variable, FlightSize, that indicates the amount [RFC5681] defines a variable, FlightSize, that indicates the amount
of outstanding data in the network. This is assumed to be equal to of outstanding data in the network. This is assumed to be equal to
the value of Pipe calculated based on the pipe algorithm [RFC3517]. the value of Pipe calculated based on the pipe algorithm [RFC3517].
In RFC5681 this value is used during loss recovery, whereas in this In RFC5681 this value is used during loss recovery, whereas in this
method a new variable "pipeACK" is introduced and used to determine method a new variable "pipeACK" is introduced to measure the
if the sender has validated the cwnd. acknowledged size of the pipe, which is used to determine if the
sender has validated the cwnd.
The value of pipeACK is initialised to the maximum value. This value A sender determines a value for pipeACK by measuring the volume of
is used to inhibit entering the nonvalidated phase until the first data that was acknowledged by the network over the period of a
measurement of pipeACK completes. measured Round Trip Time (RTT). Using the variables defined in
[RFC3517], a value could be measured by caching the value of HighACK
and after one RTT measuring the difference between the cached HighACK
value and the current HighACK value. Other equivalent methods may be
used.
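The HighACK-caching measurement described in the -01 text can be sketched as follows. This is only an illustrative sketch, not code from the draft: the class name PipeAckSampler, its method names, and the use of floating-point seconds are assumptions, and sequence-number wraparound is ignored for brevity.

```python
class PipeAckSampler:
    """Hypothetical helper: yields one pipeACK sample per measured RTT."""

    def __init__(self, high_ack: int, now: float):
        # Cache the cumulative ACK point (HighACK in RFC 3517 terms).
        self.cached_high_ack = high_ack
        self.sample_start = now

    def on_ack(self, high_ack: int, now: float, rtt: float):
        """Return the bytes acknowledged over the last RTT, or None if
        less than one RTT has elapsed since the cached value was taken."""
        if now - self.sample_start < rtt:
            return None
        sample = high_ack - self.cached_high_ack
        # Start the next measurement interval from the current HighACK.
        self.cached_high_ack = high_ack
        self.sample_start = now
        return sample
```

A caller would feed each returned sample into whatever logic maintains the pipeACK variable; as the text notes, other equivalent measurement methods may be used.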
A sender is not required to continuously track the pipeACK value, but A sender is not required to continuously update the pipeACK variable
MUST set this variable to the volume of data that was acknowledged by after each received ACK, but MUST make a measurement at least once
the network per measured Round Trip Time (RTT), with a sampling per RTT when it has sent unacknowledged segments. The pipeACK value
period of not less than one measurement for Min(RTT, 1 second). used by the algorithm MAY consider multiple pipeACK measurements over
Using the variables defined in [RFC3517]. This could be implemented the pipeACK Sampling Period. The calculated pipeACK value MUST NOT
by caching the value of HighACK and after one RTT assigning pipeACK exceed the maximum (highest value) within the sampling period. This
to the difference between the cached HighACK value and the current specification defines the pipeACK Sampling Period as Max(3*RTT, 1
HighACK value. Other equivalent methods may be used. second). This period enables a sender to compensate for large
fluctuations in the sending rate, where there may be pauses in
transmission, and allows pipeACK to reflect the largest recently
measured size of "pipeACK".
4.2. The nonvalidated phase When no measurements are available, the pipeACK variable is set to
the maximum (undefined) value. This value is used to inhibit
entering the nonvalidated phase until the first measurement of
pipeACK completes.
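One way to maintain the pipeACK variable over the pipeACK Sampling Period of Max(3*RTT, 1 second) is sketched below. The names PipeAckFilter and PIPEACK_UNDEFINED are hypothetical, and representing the "maximum (undefined) value" as infinity is an assumption of this sketch. It also assumes that once every sample has aged out of the sampling period, the variable returns to the undefined value; the text does not state this explicitly.

```python
import math
from collections import deque

# Assumed representation of the "maximum (undefined)" pipeACK value.
PIPEACK_UNDEFINED = math.inf


class PipeAckFilter:
    """Hypothetical helper: largest pipeACK sample in the sampling period."""

    def __init__(self):
        self.samples = deque()  # (timestamp in seconds, bytes acknowledged)

    def add_sample(self, now: float, sample: int) -> None:
        self.samples.append((now, sample))

    def value(self, now: float, rtt: float):
        # pipeACK Sampling Period as defined in the -01 text.
        period = max(3 * rtt, 1.0)
        # Discard samples older than the sampling period.
        while self.samples and now - self.samples[0][0] > period:
            self.samples.popleft()
        if not self.samples:
            return PIPEACK_UNDEFINED
        # Never exceeds the highest sample within the period.
        return max(sample for _, sample in self.samples)
```

Taking the maximum lets a sender that pauses briefly keep credit for the largest recently acknowledged volume, which matches the stated purpose of the sampling period.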
The method RECOMMENDS that the TCP SACK option [RFC3517] is enabled.
This allows the sender to more accurately determine the number of
missing bytes during the loss recovery phase, and using this method
will result in a higher cwnd following loss.
4.2. Initialisation
A sender starts a TCP connection in the Validated phase and
initialises the pipeACK variable to the maximum (undefined) value.
4.3. The nonvalidated phase
The updated method creates a new TCP sender phase that captures The updated method creates a new TCP sender phase that captures
whether the cwnd reflects a validated or non-validated value. The whether the cwnd reflects a validated or non-validated value. The
phases are defined as: phases are defined as:
o Validated phase: pipeACK >=(1/2)*cwnd. This is the normal phase, o Validated phase: pipeACK >=(1/2)*cwnd. This is the normal phase,
where cwnd is expected to be an approximate indication of the where cwnd is expected to be an approximate indication of the
available capacity currently available along the network path, and capacity currently available along the network path, and the
the standard methods are used to increase cwnd (currently standard methods are used to increase cwnd (currently [RFC5681]).
[RFC5681]). The rule for transitioning to the non-validated phase The rule for transitioning to the non-validated phase is specified
is specified in section 4.3. in section 4.3.
o Non-validated phase: pipeACK <(1/2)*cwnd. This is the phase where o Non-validated phase: pipeACK <(1/2)*cwnd. This is the phase where
the cwnd has a value based on a previous measurement of the the cwnd has a value based on a previous measurement of the
available capacity, and the usage of this capacity has not been available capacity, and the usage of this capacity has not been
validated in the previous RTT. That is, when it is not known validated in the pipeACK Sampling Period. That is, when it is not
whether the cwnd reflects the currently available capacity along known whether the cwnd reflects the currently available capacity
the network path. The mechanisms to be used in this phase seek to along the network path. The mechanisms to be used in this phase
determine a safe value for cwnd and an appropriate reaction to seek to determine a safe value for cwnd and an appropriate
congestion. These mechanisms are specified in section 4.3. reaction to congestion. These mechanisms are specified in section
4.3.
A sender starts a TCP connection in the Validated phase.
The value 1/2 was selected to reduce the effects of variations in the The value 1/2 was selected to reduce the effects of variations in the
measured pipeACK, and to allow the sender some flexibility in when it measured pipeACK, and to allow the sender some flexibility in when it
sends data. sends data.
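The two phase definitions above reduce to a single comparison, sketched here. The names Phase and classify_phase are hypothetical, and the undefined pipeACK value is assumed to be represented as infinity, so that a sender with no measurement yet is always treated as validated.

```python
import math
from enum import Enum


class Phase(Enum):
    VALIDATED = "validated"
    NON_VALIDATED = "non-validated"


def classify_phase(pipe_ack: float, cwnd: int) -> Phase:
    """Apply the pipeACK >= (1/2)*cwnd test from the phase definitions."""
    if pipe_ack >= cwnd / 2:
        return Phase.VALIDATED
    return Phase.NON_VALIDATED


# An undefined (infinite) pipeACK inhibits entry to the non-validated phase.
assert classify_phase(math.inf, 10000) is Phase.VALIDATED
```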
4.3. TCP congestion control during the nonvalidated phase 4.4. TCP congestion control during the nonvalidated phase
A TCP sender MUST enter the non-validated phase when the measured A TCP sender MUST enter the non-validated phase when the measured
pipeACK is less than (1/2)*cwnd. pipeACK is less than (1/2)*cwnd.
A TCP sender that enters the non-validated phase will preserve the A TCP sender that enters the non-validated phase will preserve the
cwnd (i.e., this neither grows nor reduces while the sender remains cwnd (i.e., this neither grows nor reduces while the sender remains
in this phase). The phase is concluded after a fixed period of time in this phase). If the sender receives an indication of congestion
(the NVP, as explained in section 4.3.2) or when the sender transmits (loss or Explicit Congestion Notification, ECN, mark [RFC3168]) it
sufficient data so that pipeACK > (1/2)*cwnd (i.e. it is no longer uses the method described below. The phase is concluded after a
rate-limited). fixed period of time (the NVP, as explained in section 4.3.2) or when
the sender transmits sufficient data so that pipeACK > (1/2)*cwnd
(i.e. it is no longer rate-limited).
The behaviour in the non-validated phase is specified as: The behaviour in the non-validated phase is specified as:
o The cwnd is not increased when ACK packets are received in this o The cwnd is not increased when ACK packets are received in this
phase. phase.
o If the sender receives an indication of congestion while in the o If the sender receives an indication of congestion while in the
non-validated phase (i.e. detects loss, or an Explicit Congestion non-validated phase (i.e. detects loss, or an ECN mark), the
Notification, ECN, mark [RFC3168]), the sender MUST exit the non- sender MUST exit the non-validated phase (reducing the cwnd as
validated phase (reducing the cwnd as defined in section 4.3.1). defined in section 4.3.1).
o If the Retransmission Time Out (RTO) expires while in the non- o If the Retransmission Time Out (RTO) expires while in the non-
validated phase, the sender MUST exit the non-validated phase. It validated phase, the sender MUST exit the non-validated phase. It
then resumes using the Standard TCP RTO mechanism [RFC5681]. (The then resumes using the Standard TCP RTO mechanism [RFC5681]. (The
resulting reduction of cwnd described in section 4.3.2 is resulting reduction of cwnd described in section 4.3.2 is
appropriate, since any accumulated path history is considered appropriate, since any accumulated path history is considered
unreliable). unreliable).
o A sender that measures a pipeACK greater than (1/2)*cwnd SHOULD o A sender that measures a pipeACK greater than (1/2)*cwnd SHOULD
enter the validated phase. (A rate-limited sender will not enter the validated phase. (A rate-limited sender will not
normally be impacted by whether it is in a validated or non- normally be impacted by whether it is in a validated or non-
validated phase, since it will normally not consume the entire validated phase, since it will normally not consume the entire
cwnd. However a change to the validated phase will release the cwnd. However a change to the validated phase will release the
sender from constraints on the growth of cwnd, and restore the use sender from constraints on the growth of cwnd, and restore the use
of the standard congestion response.) of the standard congestion response.)
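The first rule in the list above (no cwnd growth on ACKs while non-validated) can be illustrated with a small sketch. The function name cwnd_after_ack and its parameters are hypothetical; the validated-phase branch uses the slow-start and congestion-avoidance increases of RFC 5681 as an assumption about what "the standard methods" means here, with all values in bytes.

```python
def cwnd_after_ack(cwnd: int, ssthresh: int, bytes_acked: int,
                   smss: int, non_validated: bool) -> int:
    """Return the cwnd after one ACK that acknowledges new data."""
    if non_validated:
        # Non-validated phase: the cwnd is preserved, not increased.
        return cwnd
    if cwnd < ssthresh:
        # RFC 5681 slow start: grow by at most one SMSS per ACK.
        return cwnd + min(bytes_acked, smss)
    # RFC 5681 congestion avoidance: about one SMSS per RTT,
    # rounded up to at least 1 byte per ACK.
    return cwnd + max(1, smss * smss // cwnd)
```

The congestion and RTO rules in the list are not covered by this sketch, since they depend on the responses specified in the subsections that follow.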
4.3.1. Response to congestion in the nonvalidated phase 4.4.1. Response to congestion in the nonvalidated phase
Reception of congestion feedback while in the non-validated phase is Reception of congestion feedback while in the non-validated phase is
interpreted as an indication that it was inappropriate for the sender interpreted as an indication that it was inappropriate for the sender
to use the preserved cwnd. The sender is therefore required to to use the preserved cwnd. The sender is therefore required to
quickly reduce the rate to avoid further congestion. Since the cwnd quickly reduce the rate to avoid further congestion. Since the cwnd
does not have a validated value, a new cwnd value must be selected does not have a validated value, a new cwnd value must be selected
based on the utilised rate. based on the utilised rate.
A sender that detects a packet-drop or receives an ECN marked packet A sender that detects a packet-drop or receives an ECN marked packet
MUST calculate a safe cwnd, by setting it to the value specified in MUST record the current FlightSize in the variable LossFlightSize and
calculate a safe cwnd, by setting it to the value specified in
Section 3.2 of [RFC5681]. Section 3.2 of [RFC5681].
At the end of the recovery phase, the TCP sender MUST reset the cwnd A TCP sender MUST calculate a safe cwnd to use for loss recovery
using the method below: using the method below:
cwnd = ((FlightSize - R)/2). cwnd = Min(cwnd/2,Max(pipeACK,LossFlightSize)).
Where, R is the volume of data that was reported as unacknowledged by This new cwnd is set to reflect that a nonvalidated cwnd may be much
the SACK information. This follows the method proposed for Jump larger than the actual flightsize, or recently used flightsize
Start [Liu07]. (recorded in pipeACK). The updated cwnd therefore prevents overshoot
by a sender significantly increasing its transmission rate during the
recovery period.
The inclusion of the term R makes this adjustment more conservative At the end of the recovery phase, the TCP sender MUST reset the cwnd
than standard TCP. (This is required, since the sender may have sent using the method below:
more segments than a Standard TCP sender would have done. The cwnd = ((LossFlightSize - R)/2).
additional reduction is beneficial when the FlightSize significantly
overshoots the available path capacity incurring significant loss, Where, R is the volume of data that was retransmitted during the
for instance an intense traffic burst following a non-validated recovery phase. This follows the method proposed for Jump Start
period.) [Liu07]. The inclusion of the term R makes this adjustment more
conservative than standard TCP. (This is required, since the sender
may have sent more segments than a Standard TCP sender would have
done. The additional reduction is beneficial when the LossFlightSize
significantly overshoots the available path capacity incurring
significant loss, for instance an intense traffic burst following a
non-validated period.)
If the sender implements a method that allows it to identify the If the sender implements a method that allows it to identify the
number of ECN-marked segments within a window that were observed by number of ECN-marked segments within a window that were observed by
the receiver, the sender SHOULD use the method above, further the receiver, the sender SHOULD use the method above, further
reducing R by the number of marked segments. reducing R by the number of marked segments.
The sender MUST also re-initialise the pipeACK variable to the maxium The sender MUST also re-initialise the pipeACK variable to the
value. This ensures that standard TCP methods are used immediately maximum (undefined) value. This ensures that standard TCP methods
after completing loss recovery. are used immediately after completing loss recovery.
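The congestion response in the newer text above can be sketched as follows. This is a minimal, informative sketch and not part of either draft: the names NvpState, on_congestion_detected and on_recovery_complete are hypothetical, float("inf") is used as a stand-in for the "maximum (undefined)" pipeACK value, and the floor of one MSS on the final cwnd is an assumption that the text above does not state.

```python
from dataclasses import dataclass

# Assumption: infinity stands in for the "maximum (undefined)" pipeACK value.
PIPEACK_UNDEFINED = float("inf")


@dataclass
class NvpState:
    """Hypothetical per-connection state; all sizes are in bytes."""
    cwnd: int
    pipe_ack: float
    loss_flight_size: int = 0


def on_congestion_detected(state: NvpState, flight_size: int) -> None:
    """Record LossFlightSize and select a safe cwnd for loss recovery."""
    state.loss_flight_size = flight_size
    # cwnd = Min(cwnd/2, Max(pipeACK, LossFlightSize))
    state.cwnd = int(min(state.cwnd // 2, max(state.pipe_ack, flight_size)))


def on_recovery_complete(state: NvpState, retransmitted: int, mss: int) -> None:
    """Reset cwnd at the end of recovery; R is the retransmitted volume."""
    # cwnd = (LossFlightSize - R)/2, floored at one MSS (floor is an assumption).
    state.cwnd = max((state.loss_flight_size - retransmitted) // 2, mss)
    # Re-initialise pipeACK so that standard TCP methods apply afterwards.
    state.pipe_ack = PIPEACK_UNDEFINED
```

For example, a sender with a preserved cwnd of 40000 bytes, a pipeACK of 12000 bytes and 10000 bytes in flight would use a cwnd of 12000 bytes during recovery. If it retransmitted 2000 bytes, it would leave recovery with a cwnd of 4000 bytes.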
4.3.2. Adjustment at the end of the nonvalidated phase 4.4.2. Adjustment at the end of the nonvalidated phase
During the non-validated phase, a sender can produce bursts of data During the non-validated phase, a sender can produce bursts of data
of up to the cwnd in size. While this is no different to standard of up to the cwnd in size. While this is no different to standard
TCP, it is desirable to control the maximum burst size, e.g. by TCP, it is desirable to control the maximum burst size, e.g. by
setting a burst size limit, using a pacing algorithm, or some other setting a burst size limit, using a pacing algorithm, or some other
method [Hug01]. method [Hug01].
An application that remains in the non-validated phase for a period An application that remains in the non-validated phase for a period
greater than the NVP is required to adjust its congestion control greater than the NVP is required to adjust its congestion control
state. If the sender exits the non-validated phase after this state. If the sender exits the non-validated phase after this
skipping to change at page 10, line 19 skipping to change at page 11, line 5
(This adjustment of ssthresh ensures that the sender records that it (This adjustment of ssthresh ensures that the sender records that it
has safely sustained the present rate. The change is beneficial to has safely sustained the present rate. The change is beneficial to
rate-limited flows that encounter occasional congestion, and could rate-limited flows that encounter occasional congestion, and could
otherwise suffer an unwanted additional delay in recovering the otherwise suffer an unwanted additional delay in recovering the
sending rate.) sending rate.)
The sender MUST then update cwnd to be not greater than: The sender MUST then update cwnd to be not greater than:
cwnd = max(1/2*cwnd, IW). cwnd = max(1/2*cwnd, IW).
Where IW is the TCP inital window [RFC5681]. Where IW is the appropriate TCP initial window, used by the TCP
sender (e.g. [RFC5681]).
(This adjustment ensures that the sender responds conservatively at the (This adjustment ensures that the sender responds conservatively at the
end of the non-validated phase by reducing the cwnd to better reflect end of the non-validated phase by reducing the cwnd to better reflect
the current sending rate of the sender. The cwnd update does not the current rate of the sender. The cwnd update does not take into
take into account FlightSize or pipeACK because these values only account FlightSize or pipeACK because these values only reflect data
reflect data during the last RTT and do not reflect the average or during the last RTT and do not reflect the average or maximum sending
peak sending rate.) rate.)
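The cwnd bound above can be illustrated with a short sketch. It is informative only: the function names are hypothetical, sizes are in bytes, and IW is computed using the SMSS-based rule from Section 3.1 of [RFC5681], which is one possible choice of initial window. The ssthresh adjustment that precedes this step is not shown.

```python
def initial_window(smss: int) -> int:
    """Initial window in bytes, per the SMSS-based rule of RFC 5681."""
    if smss > 2190:
        return 2 * smss
    if smss > 1095:
        return 3 * smss
    return 4 * smss


def cwnd_after_nvp(cwnd: int, smss: int) -> int:
    """Upper bound on cwnd at the end of the non-validated phase."""
    # cwnd = max(1/2*cwnd, IW)
    return max(cwnd // 2, initial_window(smss))
```

With an SMSS of 1460 bytes, a preserved cwnd of 40000 bytes is reduced to 20000 bytes, while a cwnd of 6000 bytes is held at the initial window of 4380 bytes.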
After completing this adjustment, the sender MAY re-enter the non- 4.4.3. Examples of Implementation
validated phase, if required (see section 4.2).
This section is intended to provide informative examples of
implementation methods. Implementations may choose to use other
methods that comply with the normative requirements.
XXX This section is work in progress - discussion is welcome to help
complete this section XXX
The pipeACK value may be sampled once each RTT. This reduces the
sender processing burden for calculating after each acknowledgement
and also reduces storage requirements at the sender.
Since application behaviour can be bursty using CWV, it may be
desirable to implement a maximum filter to accumulate the measured
values so that the pipeACK variable records the largest value within
the pipeACK Sampling Period. One simple way to implement this is to
divide the pipeACK Sampling Period into several (e.g. 5) equal length
measurement periods. The sender then records the start time for each
measurement period and the highest measured pipeACK value. At the
end of the measurement period, any measurement(s) that are older than
the pipeACK Sampling Period are discarded. The pipeACK variable is
then assigned the largest of the set of the highest measured values.
+----------+----------+ +----------+---......
| Sample A | Sample B | No | Sample C | Sample D
| | | Sample | |
| |\ 5 | | | |
| | | | | | /\ 4 |
| | | | |\ 3 | | | \ |
| | \ | | \--- | | / \ | /| 2
|/ \------| - | | / \------/ \...
+----------+---------\---/ /-----//--------+-------------> Time
<------------------------------------------------|
Sampling Period Current Time
Figure XX: Example of sampling pipeACK values
Figure XX shows an example of how measurement samples may be
collected. At the time represented by the figure new samples are
being accumulated into sample D. Three previous samples also fall
within the pipeACK Sampling Period: A, B, and C. There was also a
period of inactivity between samples B and C during which no
measurements were taken. The current value of the pipeACK variable
will be 5, the maximum across all samples.
After one further measurement period, Sample A will be discarded,
since it is then older than the pipeACK Sampling Period, and the
pipeACK variable will be recalculated. Its value will be the larger
of Sample C or the final value accumulated in Sample D.
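One possible realisation of the maximum filter described above is sketched below. It is an illustration under stated assumptions rather than a definitive implementation: the class name PipeAckFilter, its methods, and the choice to age measurements by the start time of their measurement period are all hypothetical. Times may be in any consistent unit.

```python
from collections import deque


class PipeAckFilter:
    """Hypothetical maximum filter over the pipeACK Sampling Period."""

    def __init__(self, sampling_period: float, num_periods: int = 5):
        self.sampling_period = sampling_period
        self.period_length = sampling_period / num_periods
        # Each entry is [start_time, highest pipeACK seen in that period].
        self.periods = deque()

    def add_sample(self, now: float, value: int) -> None:
        """Fold a measurement into the current measurement period."""
        if not self.periods or now - self.periods[-1][0] >= self.period_length:
            self.periods.append([now, value])  # start a new measurement period
        else:
            self.periods[-1][1] = max(self.periods[-1][1], value)
        self._discard_old(now)

    def _discard_old(self, now: float) -> None:
        """Drop measurements older than the pipeACK Sampling Period."""
        while self.periods and now - self.periods[0][0] > self.sampling_period:
            self.periods.popleft()

    def pipe_ack(self, now: float):
        """Largest retained value, or None when no measurement remains."""
        self._discard_old(now)
        if not self.periods:
            return None
        return max(value for _, value in self.periods)
```

A period of inactivity simply leaves no entry in the queue, so no special handling is needed for the "No Sample" case.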
The NVP period does not necessarily require a new timer to be
implemented. An alternative is to record a timestamp when the sender
enters the NVP. Each time a sender transmits a new segment, this
timestamp may be used to determine if the NVP period has expired. If
the period expires, the sender may take into account how many units
of the NVP period have passed and make one reduction (as defined in
section 4.4.2) for each NVP period.
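The timestamp approach can be sketched as below. This is informative and rests on assumptions: the function name apply_nvp_expiry is hypothetical, the five-minute value is taken from Section 5, only the cwnd reduction at the end of the non-validated phase is shown (not the accompanying ssthresh adjustment), and advancing the timestamp by whole NVP periods is one possible design choice.

```python
NVP_SECONDS = 300  # five minutes, as chosen in Section 5


def apply_nvp_expiry(cwnd: int, iw: int, nvp_start: float, now: float,
                     nvp: float = NVP_SECONDS):
    """Called when a new segment is sent; returns (cwnd, new timestamp).

    Makes one reduction, cwnd = max(cwnd/2, IW), for each whole NVP
    period that has elapsed since the recorded timestamp.
    """
    elapsed_periods = int((now - nvp_start) // nvp)
    for _ in range(elapsed_periods):
        cwnd = max(cwnd // 2, iw)
    # Assumption: keep any partial period by advancing in whole NVP units.
    return cwnd, nvp_start + elapsed_periods * nvp
```

For example, a sender that entered the non-validated phase at time 0 with a cwnd of 40000 bytes and next transmits at 650 seconds would apply two reductions, giving a cwnd of 10000 bytes and a new timestamp of 600 seconds.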
5. Determining a safe period to preserve cwnd 5. Determining a safe period to preserve cwnd
This section documents the rationale for selecting the maximum period This section documents the rationale for selecting the maximum period
that cwnd may be preserved, known as the non-validated period, NVP. that cwnd may be preserved, known as the non-validated period, NVP.
Limiting the period that cwnd may be preserved avoids undesirable Limiting the period that cwnd may be preserved avoids undesirable
side effects that would result if the cwnd were to be kept side effects that would result if the cwnd were to be kept
unecessarily high for an arbitrarily long period, which was a part of unnecessarily high for an arbitrarily long period, which was a part of
the problem that CWV originally attempted to address. The period a the problem that CWV originally attempted to address. The period a
sender may safely preserve the cwnd, is a function of the period that sender may safely preserve the cwnd, is a function of the period that
a network path is expected to sustain the capacity reflected by cwnd. a network path is expected to sustain the capacity reflected by cwnd.
There is no ideal choice for this time. There is no ideal choice for this time.
A period of five minutes was chosen for this NVP. This is a A period of five minutes was chosen for this NVP. This is a
compromise that was larger than the idle intervals of common compromise that was larger than the idle intervals of common
applications, but not sufficiently larger than the period for which applications, but not sufficiently larger than the period for which
the capacity of an Internet path may commonly be regarded as stable. the capacity of an Internet path may commonly be regarded as stable.
The capacity of wired networks is usually relatively stable for The capacity of wired networks is usually relatively stable for
skipping to change at page 11, line 48 skipping to change at page 13, line 43
discussed in [RFC5681]. This document describes an algorithm that discussed in [RFC5681]. This document describes an algorithm that
updates one aspect of the congestion control procedures, and so the updates one aspect of the congestion control procedures, and so the
considerations described in RFC 5681 also apply to this algorithm. considerations described in RFC 5681 also apply to this algorithm.
7. IANA Considerations 7. IANA Considerations
There are no IANA considerations. There are no IANA considerations.
8. Acknowledgments 8. Acknowledgments
The authors acknowledge the contributions of Dr I Biswas and Dr R The authors acknowledge the contributions of Dr I Biswas and Mr Ziaul
Secchi in supporting the evaluation of CWV and for their help in Hossain in supporting the evaluation of CWV and for their help in
developing the mechanisms proposed in this draft. We also developing the mechanisms proposed in this draft. We also
acknowledge comments received from the Internet Congestion Control acknowledge comments received from the Internet Congestion Control
Research Group, in particular Yuchung Cheng, Mirja Kuehlewind, and Research Group, in particular Yuchung Cheng, Mirja Kuehlewind, and
Joe Touch. Joe Touch. This work was part-funded by the European Community under
its Seventh Framework Programme through the Reducing Internet
Transport Latency (RITE) project (ICT-317700).
9. Author Notes 9. Author Notes
9.1. Other related work 9.1. Other related work
There are several issues to be discussed more widely: There are several issues to be discussed more widely:
o Should the method explicitly state a procedure for limiting o Should the method explicitly state a procedure for limiting
burstiness or pacing? burstiness or pacing?
This is often regarded as good practice, but is not presently a This is often regarded as good practice, but is not presently a
formal part of TCP. draft-hughes-restart-00.txt provides some formal part of TCP. draft-hughes-restart-00.txt provides some
discussion of this topic. discussion of this topic.
o There are potential interactions with the proposal to raise the o There are potential interactions with the Experimental update in
TCP initial Window to ten segments, do these cases need to be [RFC6928] that raises the TCP initial Window to ten segments, do
elaborated? these cases need to be elaborated?
This relates to draft-ietf-tcpm-initcwnd. This relates to the Experimental specification for increasing
the TCP IW defined in RFC 6928.
The two methods have different functions and different response The two methods have different functions and different response
to loss/congestion. to loss/congestion.
IW=10 proposes an experimental update to TCP that would allow RFC 6928 proposes an experimental update to TCP that would
faster opening of the cwnd, and also a large (same size) increase the IW to ten segments. This would allow faster
restart window. This approach is based on the assumption that opening of the cwnd, and also a large (same size) restart
many forward paths can sustain bursts of up to ten segments window. This approach is based on the assumption that many
without (appreciable) loss. Such a significant increase in forward paths can sustain bursts of up to ten segments without
cwnd must be matched with an equally large reduction of cwnd if (appreciable) loss. Such a significant increase in cwnd must
loss/congestion is detected, and such a congestion indication be matched with an equally large reduction of cwnd if loss/
is likely to require future use of IW=10 to be disabled for congestion is detected, and such a congestion indication is
this path for some time. This guards against the unwanted likely to require future use of IW=10 to be disabled for this
behaviour of a series of short flows continuously flooding a path for some time. This guards against the unwanted behaviour
network path without network congestion feedback. of a series of short flows continuously flooding a network path
without network congestion feedback.
In contrast, new-CWV proposes a standards-track update with a In contrast, this document proposes an update with a rationale
rationale that relies on recent previous path history to select that relies on recent previous path history to select an
an appropriate cwnd after restart. appropriate cwnd after restart.
The behaviour differs in three ways: The behaviour differs in three ways:
1) For applications that send little initially, new-cwv may 1) For applications that send little initially, new-cwv may
constrain more than IW=10, but would not require the connection constrain more than RFC 6928, but would not require the
to reset any path information when a restart incurred loss. In connection to reset any path information when a restart
contrast, new-cwv would allow the TCP connection to preserve incurred loss. In contrast, new-cwv would allow the TCP
the cached cwnd; any loss would impact cwnd, but not impact connection to preserve the cached cwnd; any loss would impact
other flows. cwnd, but not impact other flows.
2) For applications that utilise more capacity than provided by 2) For applications that utilise more capacity than provided by
a cwnd=10, this method would permit a larger restart window a cwnd of 10 segments, this method would permit a larger
compared to a restart using IW=10. This is justified by the restart window compared to a restart using the method in RFC
recent path history. 6928. This is justified by the recent path history.
3) new-CWV is intended to also be used for rate-limited 3) new-CWV is intended to also be used for rate-limited
applications, where the application sends, but does not seek to applications, where the application sends, but does not seek to
fully utilise the cwnd. In this case, new-cwv constrains the fully utilise the cwnd. In this case, new-cwv constrains the
cwnd to that justified by the recent path history. The cwnd to that justified by the recent path history. The
performance trade-offs are hence different, and it would be performance trade-offs are hence different, and it would be
possible to enable new-cwv when also using IW=10, and yield the possible to enable new-cwv when also using the method in RFC
benefits of this. 6928, and yield benefits.
o There is potential overlap with the Laminar proposal o There is potential overlap with the Laminar proposal
(draft-mathis-tcpm-tcp-laminar) (draft-mathis-tcpm-tcp-laminar)
The current draft was intended as a standards-track update to The current draft was intended as a standards-track update to
TCP, rather than a new transport variant. At least, it would TCP, rather than a new transport variant. At least, it would
be good to understand how the two interact and whether there is be good to understand how the two interact and whether there is
a possibility of a single method. a possibility of a single method.
o There is potential performance loss in loss of a short burst o There is potential performance loss in loss of a short burst
(off list with M Allman) (off list with M Allman)
A sender can transmit several segments then become idle. If A sender can transmit several segments then become idle. If
the first segments are all ACK'ed the ssthresh collapses to a the first segments are all ACK'ed the ssthresh collapses to a
small value (no new data is sent by the idle sender). Loss of small value (no new data is sent by the idle sender). Loss of
the later data results in congestion (e.g. maybe a RED drop or the later data results in congestion (e.g. maybe a RED drop or
some other cause, rather than the peak rate of this flow). some other cause, rather than the maximum rate of this flow).
When the sender performs loss recovery it may have an appreciable pipeACK When the sender performs loss recovery it may have an appreciable pipeACK
and cwnd, but a very low flight size - the Standard algorithm and cwnd, but a very low flight size - the Standard algorithm
results in an unusually low cwnd (1/2 Flight size). results in an unusually low cwnd (1/2 Flight size).
A constant rate flow would have maintained a flight size A constant rate flow would have maintained a flight size
appropriate to pipeACK (cwnd if it is a bulk flow). appropriate to pipeACK (cwnd if it is a bulk flow).
This could be fixed by adding a new state variable? It could This could be fixed by adding a new state variable? It could
also be argued this is a corner case (e.g. loss of only the also be argued this is a corner case (e.g. loss of only the
last segments would have resulted in RTO), the impact could be last segments would have resulted in RTO), the impact could be
significant. significant.
o There is potential interaction with TCP Control Block Sharing(M
Welzl)
An application that is non-validated can accumulate a cwnd that
is larger than the actual capacity. Is this a fair value to
use in TCB sharing?
9.2. Revision notes 9.2. Revision notes
RFC-Editor note: please remove this section prior to publication. RFC-Editor note: please remove this section prior to publication.
Draft 03 was submitted to ICCRG to receive comments and feedback. Draft 03 was submitted to ICCRG to receive comments and feedback.
Draft 04 contained the first set of clarifications after feedback: Draft 04 contained the first set of clarifications after feedback:
o Changed name to application limited and used the term rate-limited o Changed name to application limited and used the term rate-limited
in all places. in all places.
skipping to change at page 14, line 51 skipping to change at page 17, line 14
Draft 06 contained various updates: Draft 06 contained various updates:
o Required reset of pipeACK after congestion. o Required reset of pipeACK after congestion.
o Added comment on the effect of congestion after a short burst (M. o Added comment on the effect of congestion after a short burst (M.
Allman). Allman).
o Correction of minor Typos. o Correction of minor Typos.
WG draft 01 contained various updates: WG draft 00 contained various updates:
o Updaed initialisation of pipeACK to maximum value. o Updated initialisation of pipeACK to maximum value.
o Added note on intended status still to be determined. o Added note on intended status still to be determined.
WG draft 01 contained:
o Added corrections from Richard Scheffenegger.
o Raffaello Secchi added to the mechanism, based on implementation
experience.
o Removed the requirement for the TCP SACK option [RFC3517] to be
enabled. Although it may be desirable to use SACK, this is not
essential to the algorithm.
o Added the notion of the sampling period to accommodate large rate
variations and ensure that the method is stable. This algorithm
to be validated through implementation.
10. References 10. References
10.1. Normative References 10.1. Normative References
[RFC0793] Postel, J., "Transmission Control Protocol", STD 7, [RFC0793] Postel, J., "Transmission Control Protocol", STD 7,
RFC 793, September 1981. RFC 793, September 1981.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997. Requirement Levels", BCP 14, RFC 2119, March 1997.
skipping to change at page 15, line 37 skipping to change at page 18, line 16
Conservative Selective Acknowledgment (SACK)-based Loss Conservative Selective Acknowledgment (SACK)-based Loss
Recovery Algorithm for TCP", RFC 3517, April 2003. Recovery Algorithm for TCP", RFC 3517, April 2003.
[RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
Control", RFC 5681, September 2009. Control", RFC 5681, September 2009.
[RFC6298] Paxson, V., Allman, M., Chu, J., and M. Sargent, [RFC6298] Paxson, V., Allman, M., Chu, J., and M. Sargent,
"Computing TCP's Retransmission Timer", RFC 6298, "Computing TCP's Retransmission Timer", RFC 6298,
June 2011. June 2011.
[RFC6928] Chu, J., Dukkipati, N., Cheng, Y., and M. Mathis,
"Increasing TCP's Initial Window", RFC 6928, April 2013.
10.2. Informative References 10.2. Informative References
[Bis08] Biswas and Fairhurst, "A Practical Evaluation of [Bis08] Biswas and Fairhurst, "A Practical Evaluation of
Congestion Window Validation Behaviour, 9th Annual Congestion Window Validation Behaviour, 9th Annual
Postgraduate Symposium in the Convergence of Postgraduate Symposium in the Convergence of
Telecommunications, Networking and Broadcasting (PGNet), Telecommunications, Networking and Broadcasting (PGNet),
Liverpool, UK", June 2008. Liverpool, UK", June 2008.
[Bis10] Biswas, Sathiaseelan, Secchi, and Fairhurst, "Analysing [Bis10] Biswas, Sathiaseelan, Secchi, and Fairhurst, "Analysing
TCP for Bursty Traffic, Int'l J. of Communications, TCP for Bursty Traffic, Int'l J. of Communications,
skipping to change at line 687 skipping to change at page 19, line 26
Arjuna Sathiaseelan Arjuna Sathiaseelan
University of Aberdeen University of Aberdeen
School of Engineering School of Engineering
Fraser Noble Building Fraser Noble Building
Aberdeen, Scotland AB24 3UE Aberdeen, Scotland AB24 3UE
UK UK
Email: arjuna@erg.abdn.ac.uk Email: arjuna@erg.abdn.ac.uk
URI: http://www.erg.abdn.ac.uk URI: http://www.erg.abdn.ac.uk
Raffaello Secchi
University of Aberdeen
School of Engineering
Fraser Noble Building
Aberdeen, Scotland AB24 3UE
UK
Email: raffaello@erg.abdn.ac.uk
URI: http://www.erg.abdn.ac.uk
 End of changes. 53 change blocks. 
167 lines changed or deleted 284 lines changed or added
