draft-ietf-tcpm-accecn-reqs-08.txt   rfc7560.txt 
TCP Maintenance and Minor Extensions (tcpm) M. Kuehlewind, Ed. Internet Engineering Task Force (IETF) M. Kuehlewind, Ed.
Internet-Draft ETH Zurich Request for Comments: 7560 ETH Zurich
Intended status: Informational R. Scheffenegger Category: Informational R. Scheffenegger
Expires: September 10, 2015 NetApp, Inc. ISSN: 2070-1721 NetApp, Inc.
B. Briscoe B. Briscoe
BT BT
March 9, 2015 August 2015
Problem Statement and Requirements for a More Accurate ECN Feedback Problem Statement and Requirements for Increased Accuracy
draft-ietf-tcpm-accecn-reqs-08 in Explicit Congestion Notification (ECN) Feedback
Abstract Abstract
Explicit Congestion Notification (ECN) is a mechanism where network Explicit Congestion Notification (ECN) is a mechanism where network
nodes can mark IP packets instead of dropping them to indicate nodes can mark IP packets, instead of dropping them, to indicate
congestion to the end-points. An ECN-capable receiver will feed this congestion to the endpoints. An ECN-capable receiver will feed this
information back to the sender. ECN is specified for TCP in such a information back to the sender. ECN is specified for TCP in such a
way that it can only feed back one congestion signal per Round-Trip way that it can only feed back one congestion signal per Round-Trip
Time (RTT). In contrast, ECN for other transport protocols, such as Time (RTT). In contrast, ECN for other transport protocols, such as
RTP/UDP and SCTP, is specified with more accurate ECN feedback. RTP/UDP and SCTP, is specified with more accurate ECN feedback.
Recent new TCP mechanisms (like ConEx or DCTCP) need more accurate Recent new TCP mechanisms (like Congestion Exposure (ConEx) or Data
ECN feedback in the case where more than one marking is received in Center TCP (DCTCP)) need more accurate ECN feedback in the case where
one RTT. This document specifies requirements for an update to the more than one marking is received in one RTT. This document
TCP protocol to provide more accurate ECN feedback. specifies requirements for an update to the TCP protocol to provide
more accurate ECN feedback.
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This document is not an Internet Standards Track specification; it is
provisions of BCP 78 and BCP 79. published for informational purposes.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months This document is a product of the Internet Engineering Task Force
and may be updated, replaced, or obsoleted by other documents at any (IETF). It represents the consensus of the IETF community. It has
time. It is inappropriate to use Internet-Drafts as reference received public review and has been approved for publication by the
material or to cite them other than as "work in progress." Internet Engineering Steering Group (IESG). Not all documents
approved by the IESG are a candidate for any level of Internet
Standard; see Section 2 of RFC 5741.
This Internet-Draft will expire on September 10, 2015. Information about the current status of this document, any errata,
and how to provide feedback on it may be obtained at
http://www.rfc-editor.org/info/rfc7560.
Copyright Notice Copyright Notice
Copyright (c) 2015 IETF Trust and the persons identified as the Copyright (c) 2015 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1. Terminology . . . . . . . . . . . . . . . . . . . . . . . 4 1.1. Terminology . . . . . . . . . . . . . . . . . . . . . . . 4
2. Recap of Classic ECN and ECN Nonce in IP/TCP . . . . . . . . 4 2. Recap of Classic ECN and ECN Nonce in IP/TCP . . . . . . . . 5
3. Use Cases . . . . . . . . . . . . . . . . . . . . . . . . . . 5 3. Use Cases . . . . . . . . . . . . . . . . . . . . . . . . . . 6
4. Requirements . . . . . . . . . . . . . . . . . . . . . . . . 7 4. Requirements . . . . . . . . . . . . . . . . . . . . . . . . 8
5. Design Approaches . . . . . . . . . . . . . . . . . . . . . . 10 5. Design Approaches . . . . . . . . . . . . . . . . . . . . . . 11
5.1. Re-Definition of ECN/NS Header Bits . . . . . . . . . . . 11 5.1. Redefinition of ECN/NS Header Bits . . . . . . . . . . . 11
5.2. Using Other Header Bits . . . . . . . . . . . . . . . . . 12 5.2. Using Other Header Bits . . . . . . . . . . . . . . . . . 13
5.3. Using a TCP Option . . . . . . . . . . . . . . . . . . . 12 5.3. Using a TCP Option . . . . . . . . . . . . . . . . . . . 13
6. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 13 6. Security Considerations . . . . . . . . . . . . . . . . . . . 14
7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 13 7. References . . . . . . . . . . . . . . . . . . . . . . . . . 14
8. Security Considerations . . . . . . . . . . . . . . . . . . . 13 7.1. Normative References . . . . . . . . . . . . . . . . . . 14
9. References . . . . . . . . . . . . . . . . . . . . . . . . . 14 7.2. Informative References . . . . . . . . . . . . . . . . . 14
9.1. Normative References . . . . . . . . . . . . . . . . . . 14 Appendix A. Ambiguity of the More Accurate ECN Feedback in DCTCP 16
9.2. Informative References . . . . . . . . . . . . . . . . . 14 Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . 17
Appendix A. Ambiguity of the More Accurate ECN Feedback in DCTCP 15 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 17
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 16
1. Introduction 1. Introduction
Explicit Congestion Notification (ECN) [RFC3168] is a mechanism where Explicit Congestion Notification (ECN) [RFC3168] is a mechanism where
network nodes can mark IP packets instead of dropping them to network nodes can mark IP packets instead of dropping them to
indicate congestion to the end-points. An ECN-capable receiver will indicate congestion to the endpoints. An ECN-capable receiver will
feed this information back to the sender. ECN is specified for TCP feed this information back to the sender. ECN is specified for TCP
in such a way that only one feedback signal can be transmitted per in such a way that only one feedback signal can be transmitted per
Round-Trip Time (RTT). This is sufficient for pre-existing TCP Round-Trip Time (RTT). This is sufficient for preexisting TCP
congestion control mechanisms that perform only one reduction in congestion control mechanisms that perform only one reduction in
sending rate per RTT, independent of the number of ECN congestion sending rate per RTT, independent of the number of ECN congestion
marks. But recently proposed or deployed mechanisms like Congestion marks. But recently proposed or deployed mechanisms like Congestion
Exposure (ConEx) [RFC6789] or Data Center TCP (DCTCP) Exposure (ConEx) [RFC6789] or Data Center TCP (DCTCP) [DCTCP] need
[I-D.bensley-tcpm-dctcp] need more accurate ECN feedback than more accurate ECN feedback than 'classic ECN' [RFC3168] to work
'classic' ECN [RFC3168] to work correctly in the case where more than correctly in the case where more than one marking is received in any
one marking is received in any one RTT. one RTT.
For an in-depth discussion of the application benefits of using ECN For an in-depth discussion of the application benefits of using ECN
(including with sufficiently granular feedback) see (including with sufficiently granular feedback), see [ECN-BENEFITS].
[I-D.welzl-ecn-benefits].
ECN is also defined for transport protocols beside TCP. ECN feedback ECN is also defined for transport protocols beside TCP. ECN feedback
as defined for RTP/UDP [RFC6679] provides a very detailed level of as defined for RTP/UDP [RFC6679] provides a very detailed level of
information, delivering individual counters for all four ECN information, delivering individual counters for all four ECN
codepoints as well as lost and duplicate segments, but at the cost of codepoints as well as lost and duplicate segments, but at the cost of
high signaling overhead. ECN feedback for SCTP has been proposed in high signalling overhead. ECN feedback for SCTP has been proposed in
[I-D.stewart-tsvwg-sctpecn]. This delivers a counter for the number [SCTP-ECN]. This delivers a counter for the number of ECN-capable
of CE marked segments between CWR chunks, but also comes at the cost packets that were marked due to congestion (since the last sender-
of increased overhead. side window reduction), but it comes at the cost of increased
overhead.
Today, implementations of DCTCP already exist that alter TCP's ECN Today, implementations of DCTCP already exist that alter TCP's ECN
feedback protocol in proprietary ways (DCTCP was released in feedback protocol in proprietary ways (DCTCP was released in
Microsoft Windows 8, and implementations exist for Linux and Microsoft Windows 8, and implementations exist for Linux and
FreeBSD). The changes DCTCP makes to TCP are not currently the FreeBSD). However, the changes DCTCP makes to TCP omit capability
subject of any IETF standardization activity, and they omit negotiation, relying instead on uniform configuration across all
capability negotiation, relying instead on uniform configuration hosts and network devices with ECN capability. A primary motivation
across all hosts and network devices with ECN capability. A primary for this document is to intervene before each proprietary
motivation for this document is to intervene before each proprietary
implementation invents its own non-interoperable handshake, which implementation invents its own non-interoperable handshake, which
could lead to _de facto_ consumption of the few flags or codepoints could lead to _de facto_ consumption of the few flags or codepoints
that remain available for standardizing capability negotiation. that remain available for standardizing capability negotiation.
This document lists requirements for a robust and interoperable TCP/ This document lists requirements for a robust and interoperable TCP/
ECN feedback protocol that is more accurate than classic ECN ECN feedback protocol that is more accurate than classic ECN
[RFC3168] and that all implementations of new TCP extensions, like [RFC3168] and that all implementations of new TCP extensions, like
ConEx and/or DCTCP, can use. While a new feedback scheme should ConEx and/or DCTCP, can use. While a new feedback scheme should
still deliver as much information as classic ECN, this document also still deliver as much information as classic ECN, this document also
clarifies what has to be taken into consideration in addition. Thus clarifies what has to be taken into consideration in addition. Thus,
the listed requirements should be addressed in the specification of a the listed requirements should be addressed in the specification of a
more accurate ECN feedback scheme. A few solutions have already been more accurate ECN feedback scheme. A few solutions have already been
proposed. Section 5 demonstrates how to use the requirements to proposed. Section 5 demonstrates how to use the requirements to
compare them, by briefly sketching their high level design choices compare them, by briefly sketching their high-level design choices
and discussing the benefits and drawbacks of each. and discussing the benefits and drawbacks of each.
The scope of these requirements is not limited to any specific The scope of these requirements is not limited to any specific
environment and is intended for general deployment over public and environment and is intended for general deployment over public and
private IP networks. Candidate solutions should try to adhere to all private IP networks. Candidate solutions should try to adhere to all
these requirements, but where this is not possible they should these requirements, but, where this is not possible, they should
justify the deviation. The ordering of the requirements listed in justify the deviation. The ordering of the requirements listed in
this document is not to be taken as an order of importance, because this document is not to be taken as an order of importance, because
each requirement might have different weight in different deployment each requirement might have different weight in different deployment
scenarios. scenarios.
These requirements are only concerned with the type and quality of These requirements are only concerned with the type and quality of
the ECN feedback signal. The requirements do not stipulate how a TCP the ECN feedback signal. The requirements do not stipulate how a TCP
sender might react to the improved ECN signal. The requirements also sender might react to the improved ECN signal. The requirements also
do not imply that any modifications to TCP senders or receivers are do not imply that any modifications to TCP senders or receivers are
obligatory. obligatory.
skipping to change at page 4, line 36 skipping to change at page 4, line 51
ECE: the ECN-Echo flag, and ECE: the ECN-Echo flag, and
NS: ECN Nonce Sum. NS: ECN Nonce Sum.
In this document, the ECN feedback scheme as specified in [RFC3168] In this document, the ECN feedback scheme as specified in [RFC3168]
is called 'classic ECN' and any new proposal is called a 'more is called 'classic ECN' and any new proposal is called a 'more
accurate ECN feedback' scheme. A 'congestion mark' is defined as an accurate ECN feedback' scheme. A 'congestion mark' is defined as an
IP packet where the CE codepoint is set. A 'congestion episode' IP packet where the CE codepoint is set. A 'congestion episode'
refers to one or more congestion marks that belong to the same refers to one or more congestion marks that belong to the same
overload situation in the network (usually during one RTT). A TCP overload situation in the network (usually during one RTT). A TCP
segment with the acknowledgment flag set is simply called an ACK. segment with the acknowledgement flag set is simply called an ACK.
2. Recap of Classic ECN and ECN Nonce in IP/TCP 2. Recap of Classic ECN and ECN Nonce in IP/TCP
ECN requires two bits in the IP header. The ECN capability of a ECN requires two bits in the IP header. The ECN capability of a
packet is indicated when either one of the two bits is set. A packet is indicated when either one of the two bits is set. A
network node can set both bits simultaneously when it experiences network node can set both bits simultaneously when it experiences
congestion. This leads to the four codepoints (not-ECT, ECT(0), congestion. This leads to the four codepoints (Not-ECT, ECT(0),
ECT(1), and CE) as listed above. ECT(1), and CE) as listed above.
In the TCP header the first two bits in byte 14 are defined as ECN In the TCP header, the first two bits in byte 14 are defined as ECN
feedback for each half-connection. A TCP receiver signals the feedback for each half-connection. A TCP receiver signals the
reception of a congestion mark using the ECN-Echo (ECE) flag in the reception of a congestion mark using the ECN-Echo (ECE) flag in the
TCP header. For reliability, the receiver continues to set the ECE TCP header. For reliability, the receiver continues to set the ECE
flag on every ACK. To enable the TCP receiver to determine when to flag on every ACK. To enable the TCP receiver to determine when to
stop setting the ECN-Echo flag, the sender sets the CWR flag upon stop setting the ECE flag, the sender sets the CWR flag upon
reception of an ECE feedback signal. This always leads to a full RTT reception of an ECE feedback signal. This always leads to a full RTT
of ACKs with ECE set. Thus the receiver cannot signal back any of ACKs with ECE set. Thus, the receiver cannot signal back any
additional CE markings arriving within the same RTT. additional CE markings arriving within the same RTT.
The ECN Nonce [RFC3540] is an experimental addition to ECN that the The ECN Nonce [RFC3540] is an experimental addition to ECN that the
TCP sender can use to protect itself against accidental or malicious TCP sender can use to protect itself against accidental or malicious
concealment of CE-marked or dropped packets. This addition defines concealment of CE-marked or dropped packets. This addition defines
the last bit of byte 13 in the TCP header as the Nonce Sum (NS) flag. the last bit of byte 13 in the TCP header as the Nonce Sum (NS) flag.
The receiver maintains a nonce sum that counts the occurrence of The receiver maintains a nonce sum that counts the occurrence of
ECT(1) packets, and signals the least significant bit of this sum on ECT(1) packets and signals the least significant bit of this sum on
the NS flag. There are no known deployments of a TCP stack that the NS flag. There are no known deployments of a TCP stack that
makes use of the ECN Nonce extension. makes use of the ECN Nonce extension.
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+ +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
| | | N | C | E | U | A | P | R | S | F | | | | N | C | E | U | A | P | R | S | F |
| Header Length | Reserved | S | W | C | R | C | S | S | Y | I | | Header Length | Reserved | S | W | C | R | C | S | S | Y | I |
| | | | R | E | G | K | H | T | N | N | | | | | R | E | G | K | H | T | N | N |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+ +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
Figure 1: The (post-ECN Nonce) definition of the TCP header flags Figure 1: The (Post-ECN Nonce) Definition of the TCP Header Flags
An alternative for a sender to assure feedback integrity has been An alternative for a sender to assure feedback integrity has been
proposed where the sender occasionally inserts a CE mark or proposed where the sender itself occasionally inserts a CE mark or
reordering itself, and checks that the receiver feeds it back reorders packets, and checks that the receiver feeds these back
faithfully [I-D.moncaster-tcpm-rcv-cheat]. This alternative consumes faithfully [TEST-RCV]. This alternative consumes no header bits or
no header bits or codepoints, as well as releasing the ECT(1) codepoints, and it releases the ECT(1) codepoint in the IP header and
codepoint in the IP header and the NS flag in the TCP header for the NS flag in the TCP header for other uses.
other uses.
3. Use Cases 3. Use Cases
The following two examples serve to show where existing mechanisms The following two examples serve to show where existing mechanisms
would already benefit from more accurate ECN feedback information. would already benefit from more accurate ECN feedback information.
However, as it is hard to predict the future, once a more accurate However, as it is hard to predict the future, once a more accurate
ECN feedback mechanism that adheres to the requirements stated in ECN feedback mechanism that adheres to the requirements stated in
this document is widely deployed, it's very likely that additional this document is widely deployed, it's very likely that additional
uses are found. The examples listed below are in no particular uses will be found. The examples listed below are in no particular
order. order.
ConEx is an experimental approach that allows a sender to relay ConEx is an experimental approach that allows a sender to relay
congestion feedback provided by the receiver into the network along congestion feedback provided by the receiver into the network along
the forward data path. ConEx information can be used for traffic the forward data path. ConEx information can be used for traffic
management to limit traffic proportionate to the actual congestion management to limit traffic proportionate to the actual congestion
being caused, rather than limiting traffic based on rate or volume being caused, rather than limiting traffic based on rate or volume
[RFC6789]. A ConEx sender uses selective acknowledgements (SACK) [RFC6789]. A ConEx sender uses selective acknowledgements (SACK)
[RFC2018] for accurate feedback of loss signals, but currently TCP [RFC2018] for accurate feedback of loss signals, but until now TCP
offers no equivalent accurate feedback for ECN. has offered no equivalent accurate feedback for ECN.
DCTCP offers very low and predictable queuing delay. DCTCP changes DCTCP offers very low and predictable queuing delay. DCTCP changes
the reaction to congestion of a TCP sender and additionally requires the reaction to congestion of a TCP sender and additionally requires
switches/routers to have ECN enabled and configured with a low step switches/routers to have ECN enabled and configured with a low step
threshold and no signal smoothing, so it is currently only used in threshold and no signal smoothing, so it is currently only used in
private networks, e.g. internal to data centers. DCTCP was released private networks, e.g., internal to data centers. DCTCP was released
in Microsoft Windows 8, and implementations exist for Linux and in Microsoft Windows 8, and implementations exist for Linux and
FreeBSD. To retrieve sufficient congestion information, the FreeBSD. To retrieve sufficient congestion information, the
different DCTCP implementations use a proprietary ECN feedback different DCTCP implementations use a proprietary ECN feedback
protocol, but they omit capability negotiation. Moreover, the protocol, but they omit capability negotiation. Moreover, the
feedback protocol proposed in [I-D.bensley-tcpm-dctcp] only works if feedback protocol proposed in [DCTCP] only works if there are no
there are no losses at all, and otherwise it gets very confused (see losses at all, and otherwise it gets very confused (see Appendix A).
Appendix A). Therefore, if a generic more accurate ECN feedback Therefore, if a generic, more accurate ECN feedback scheme were
scheme were available, it would solve two problems for DCTCP: i) need available, it would solve two problems for DCTCP: i) the need for a
for a consistent variant of DCTCP to be deployed network-wide and ii) consistent variant of DCTCP to be deployed network-wide and ii) the
inability to cope with ACK loss. inability to cope with ACK loss.
Classic ECN-TCP would not benefit from more accurate ECN feedback, Classic ECN-TCP would not benefit from more accurate ECN feedback,
but it would not suffer either. The same signal that is currently but it would not suffer either. The same signal that is currently
conveyed with ECN following the specification given in [RFC3168] conveyed with ECN following the specification given in [RFC3168]
would be available. would be available.
The following scenarios should briefly show where accurate ECN The following scenarios should briefly show where accurate ECN
feedback is needed or adds value: feedback is needed or adds value:
A sender with standardised TCP congestion control that supports A sender with standardized TCP congestion control that supports
ConEx: ConEx:
In this case the ConEx mechanism uses the extra information In this case, the ConEx mechanism uses the extra information per
per RTT to re-echo the precise congestion information, but RTT to re-echo the precise congestion information, but the
the congestion control algorithm still ignores multiple marks congestion control algorithm still ignores multiple marks per RTT
per RTT [RFC5681]. [RFC5681].
A sender using DCTCP congestion control without ConEx: A sender using DCTCP congestion control without ConEx:
The congestion control algorithm uses the extra info per RTT The congestion control algorithm uses the extra info per RTT to
to perform its decrease depending on the number of congestion perform its decrease depending on the number of congestion marks.
marks.
A sender using DCTCP congestion control and supporting ConEx: A sender using DCTCP congestion control and supporting ConEx:
Both the congestion control algorithm and ConEx use the more Both the congestion control algorithm and ConEx use the more
accurate ECN feedback mechanism. accurate ECN feedback mechanism.
As-yet-unspecified sender mechanisms: As-yet-unspecified sender mechanisms:
The above are two examples of more general interest in sender The above are two examples of more general interest in sender
mechanisms that respond to the extent of congestion feedback, mechanisms that respond to the extent of congestion feedback, not
not just its existence. It will greatly simplify incremental just its existence. It will greatly simplify incremental
deployment if the sender can unilaterally deploy new deployment if the sender can unilaterally deploy new behaviours
behaviours, and rely on the presence of generic receivers and rely on the presence of generic receivers that have already
that have already implemented more accurate feedback. implemented more accurate feedback.
An RFC5681 TCP sender without ConEx: A TCP sender using congestion control as specified in RFC 5681
No accurate feedback is necessary here. The congestion without ConEx:
control algorithm still reacts to only one signal per RTT. No accurate feedback is necessary here. The congestion control
But it is best to feed back all the information the receiver algorithm still reacts to only one signal per RTT. But, it is
gets, whether the sender uses it or not -- at least as long best to feed back all the information the receiver gets, whether
as overhead is low or zero. or not the sender uses it -- at least as long as overhead is low
or zero.
Using CE for checking integrity: Using CE for checking integrity:
If a more accurate ECN feedback scheme feeds all occurrences If a more accurate ECN feedback scheme feeds all occurrences of CE
of CE marks back, a sender could perform integrity checking marks back, a sender could perform integrity checking by
by occasionally injecting CE marks itself. Specifically, a occasionally injecting CE marks itself. Specifically, a sender
sender can send packets which it randomly marks with CE (at can send packets that it randomly marks with CE (at low
low frequency), then check if feedback is received for these frequency), then check if feedback is received for these packets.
packets. The congestion notification feedback for these The congestion notification feedback for these self-injected
self-injected markings, would not require a congestion markings would not require a congestion control reaction
control reaction [I-D.moncaster-tcpm-rcv-cheat]. [TEST-RCV].
4. Requirements 4. Requirements
The requirements of the accurate ECN feedback protocol are to have The requirements of the accurate ECN feedback protocol are to have
fairly accurate (not necessarily perfect), timely and protected fairly accurate (not necessarily perfect), timely, and protected
signaling. This leads to the following requirements, which should be signalling. This leads to the following requirements, which should
discussed for any proposed more accurate ECN feedback scheme: be discussed for any proposed more accurate ECN feedback scheme:
Resilience Resilience
The ECN feedback signal is carried within the ACK. Pure TCP The ECN feedback signal is carried within the ACK. Pure TCP ACKs
ACKs can get lost without recovery (not just due to can get lost without recovery (not just due to congestion but also
congestion, but also due to deliberate ACK thinning). due to deliberate ACK thinning). Moreover, delayed ACKs are
Moreover, delayed ACKs are commonly used with TCP. commonly used with TCP. Typically, an ACK is triggered after two
Typically, an ACK is triggered after two data segments (or data segments (or more, e.g., due to receive segment coalescing,
more e.g., due to receive segment coalescing, ACK ACK compression, ACK congestion control [RFC5690], or other
compression, ACK congestion control [RFC5690] or other phenomena; see [RFC3449]). In a high-congestion situation where
phenomena, see [RFC3449]). In a high congestion situation most of the packets are marked with CE, an accurate feedback
where most of the packets are marked with CE, an accurate mechanism should still be able to signal sufficient congestion
feedback mechanism should still be able to signal sufficient information. Thus, the accurate ECN feedback extension has to
congestion information. Thus the accurate ECN feedback take delayed ACKs and ACK loss into account. Also, a more
extension has to take delayed ACKs and ACK loss into account. accurate feedback protocol should still provide more accurate
Also, a more accurate feedback protocol should still provide feedback than classic ECN when delayed ACKs cover more than two
more accurate feedback than classic ECN when delayed ACKs segments, or when a thin stream disables Nagle's algorithm
cover more than two segments, or when a thin stream disables [RFC896]. Finally, the feedback mechanism should not be impacted
Nagle's algorithm [RFC0896]. Finally, the feedback mechanism by reordering of ACKs, even when the ACKed sequence number does
should not be impacted by reordering of ACKs, even when the not increase.
ACK'ed sequence number does not increase.
Timeliness Timeliness
A CE mark can be induced by the sending host, or more A CE mark can be induced by the sending host, or more commonly a
commonly a network node on the transmission path, and is then network node on the transmission path, and is then echoed by the
echoed by the receiver in the TCP ACK. Thus when this receiver in the TCP ACK. Thus, when this information arrives at
information arrives at the sender, it is naturally already the sender, it is naturally already about one RTT old. With a
about one RTT old. With a sufficient ACK rate a further sufficient ACK rate, a further delay of a small number of packets
delay of a small number of packets can be tolerated. can be tolerated. However, this information will become stale
However, this information will become stale with large with large delays, given the dynamic nature of networks. TCP
delays, given the dynamic nature of networks. TCP congestion congestion control (which itself partly introduces these dynamics)
control (which itself partly introduces these dynamics) operates on a time scale of one RTT. Thus, to be timely,
operates on a time scale of one RTT. Thus, to be timely, congestion feedback information should be delivered within about
congestion feedback information should be delivered within one RTT.
about one RTT.
Integrity Integrity
The integrity of the feedback in a more accurate ECN feedback The integrity of the feedback in a more accurate ECN feedback
scheme should be assured, at least as well as the ECN Nonce. scheme should be assured, at least as well as the ECN Nonce.
Alternatively, it should at least be possible to give strong Alternatively, it should at least be possible to give strong
incentives for the receiver and network nodes to cooperate incentives for the receiver and network nodes to cooperate
honestly. honestly.
Given there are known problems with ECN Nonce deployment, Given there are known problems with ECN Nonce deployment, this
this document only requires that the integrity of the more document only requires that the integrity of the more accurate ECN
accurate ECN feedback can be assured; it does not require feedback can be assured; it does not require that the ECN Nonce
that the ECN Nonce mechanism is employed to achieve this. mechanism is employed to achieve this. Indeed, if integrity could
Indeed, if integrity could be provided else-wise, a more be provided in another manner, a more accurate ECN feedback
accurate ECN feedback protocol might re-purpose the nonce sum protocol might repurpose the nonce sum (NS) flag in the TCP
(NS) flag in the TCP header. header.
If the more accurate ECN feedback scheme provides sufficient If the more accurate ECN feedback scheme provides sufficient
information, the integrity check could e.g. be performed by information, the integrity check could be performed by, e.g.,
deterministically setting the CE in the sender and monitoring deterministically setting the CE in the sender and monitoring the
the respective feedback (similar to ECT(1) and the ECN Nonce respective feedback (similar to ECT(1) and the ECN Nonce sum).
sum). Whether a sender should enforce when it detects wrong Whether a sender should enforce when it detects wrong feedback
feedback information, and what kind of enforcement it should information, and what kind of enforcement it should apply, are
apply, are policy issues that need not be specified as part policy issues that need not be specified as part of the more
of more accurate ECN feedback signal scheme itself, but accurate ECN feedback signal scheme itself, but rather when
rather when specifying an update to core TCP mechanisms like specifying an update to core TCP mechanisms like congestion
congestion control that makes use of the more accurate ECN control that make use of the more accurate ECN signal.
signal.
Accuracy Accuracy
Classic ECN feeds back one congestion notification per RTT, Classic ECN feeds back one congestion notification per RTT; this
which is sufficient for classic TCP congestion control which is sufficient for classic TCP congestion control, which reduces
reduces the sending rate at most once per RTT. Thus the more the sending rate at most once per RTT. Thus, the more accurate
accurate ECN feedback scheme should ensure that, if a ECN feedback scheme should ensure that, if a congestion episode
congestion episode occurs, at least one congestion occurs, at least one congestion notification is echoed and
notification is echoed and received per RTT as classic ECN received per RTT as classic ECN would do. Of course, the goal of
would do. Of course, the goal of a more accurate ECN a more accurate ECN extension is to reconstruct the number of CE
extension is to reconstruct the number of CE markings more markings more accurately. In the best case, the new scheme should
accurately. In the best case the new scheme should even even allow reconstruction of the exact number of payload bytes
allow reconstruction of the exact number of payload bytes that a CE-marked packet was carrying. However, it is accepted
that a CE marked packet was carrying. However, it is that it may be too complex for a sender to get the exact number of
accepted that it may be too complex for a sender to get the congestion markings or marked bytes in all situations. Ideally,
exact number of congestion markings or marked bytes in all the feedback scheme should preserve the order in which any (of the
situations. Ideally, the feedback scheme should preserve the four) ECN signals were received. And, ideally, it would even be
order in which any (of the four) ECN signals were received. possible for the sender to determine which of the packets covered
And, ideally, it would even be possible for the sender to by one delayed ACK were congestion marked, e.g., if the flow
determine which of the packets covered by one delayed ACK consists of packets of different sizes, or to allow for future
were congestion marked, e.g. if the flow consists of packets protocols where the order of the markings may be important.
of different sizes, or to allow for future protocols where
the order of the markings may be important.
In the best case, a sender that sees more accurate ECN In the best case, a sender that sees more accurate ECN feedback
feedback information would be able to reconstruct the information would be able to reconstruct the occurrence of any of
occurrence of any of the four code points (non-ECT, CE, the four codepoints (Not-ECT, CE, ECT(0), ECT(1)). However,
ECT(0), ECT(1)). However, assuming the sender marks all data assuming the sender marks all data packets as ECN-capable and uses
packets as ECN-capable and uses a default setting of ECT(0) a default setting of ECT(0) (as with [RFC3168]), solely feeding
(as with [RFC3168], solely feeding back the occurrence of CE back the occurrence of CE and ECT(1) might be sufficient. Because
and ECT(1) might be sufficient. Because the sender can keep the sender can keep account of the transmitted segments with any
account of the transmitted segments with any of the three ECN of the three ECN codepoints, conveying any two of these back to
codepoints, conveying any two of these back to the sender is the sender is sufficient for it to reconstruct the third as
sufficient for it to reconstruct the third as observed by the observed by the receiver. Thus, a more accurate ECN feedback
receiver. Thus a more accurate ECN feedback scheme should at scheme should at least provide information on two of these
least provide information on two of these signals, e.g. CE signals, e.g., CE and ECT(1).
and ECT(1).
If a more accurate ECN scheme can reliably deliver feedback If a more accurate ECN scheme can reliably deliver feedback in
in most but not all circumstances, ideally the scheme should most but not all circumstances, ideally the scheme should at least
at least not introduce bias. In other words, undetected loss not introduce bias. In other words, undetected loss of some ACKs
of some ACKs should be as likely to increase as decrease the should be as likely to increase as decrease the sender's estimate
sender's estimate of the probability of ECN marking. of the probability of ECN marking.
Complexity Complexity
Implementation should be as simple as possible and only a Implementation should be as simple as possible, and only a minimum
minimum of additional state information should be needed. of additional state information should be needed. This will
This will enable more accurate ECN feedback to be used as the enable more accurate ECN feedback to be used as the default
default feedback mechanism, even if only one ECN feedback feedback mechanism, even if only one ECN feedback signal per RTT
signal per RTT is needed. is needed.
Overhead Overhead
A more accurate ECN feedback signal should limit the A more accurate ECN feedback signal should limit the additional
additional network load, because ECN feedback is ultimately network load, because ECN feedback is ultimately not critical
not critical information (in the worst case, loss will still information (in the worst case, loss will still be available as a
be available as a congestion signal of last resort). As congestion signal of last resort). As feedback information has to
feedback information has to be provided frequently and in a be provided frequently and in a timely fashion, potentially all or
timely fashion, potentially all or a large fraction of TCP a large fraction of TCP acknowledgements might carry this
acknowledgments might carry this information. Ideally, no information. Ideally, no additional segments should be exchanged
additional segments should be exchanged compared to an compared to a TCP session as specified in RFC 3168, and the
RFC3168 TCP session, and the overhead in each segment should overhead in each segment should be minimized.
be minimized.
Backward and forward compatibility Backward and forward compatibility
Given more accurate ECN feedback will involve a change to the Given more accurate ECN feedback will involve a change to the TCP
TCP protocol, it should be negotiated between the two TCP protocol, it should be negotiated between the two TCP endpoints.
endpoints. If either end does not support the more accurate If either end does not support the more accurate feedback, they
feedback, they should both be able to fall-back to classic should both be able to fall back to classic ECN feedback.
ECN feedback.
A more accurate ECN feedback extension should aim to traverse A more accurate ECN feedback extension should aim to traverse most
most middleboxes, including firewalls and network address middleboxes, including firewalls and Network Address Translators
translators (NAT). Further, a feedback mechanism should (NATs). Further, a feedback mechanism should provide a method to
provide a method to fall back to classic ECN signaling if the fall back to classic ECN signalling if the new signal is
new signal is suppressed by certain middleboxes. suppressed by certain middleboxes.
In order to avoid a fork in the TCP protocol specifications, In order to avoid a fork in the TCP protocol specifications, if
if experiments with the new ECN feedback protocol are experiments with the new ECN feedback protocol are successful, the
successful, it is intended to eventually update RFC3168 for intention is to eventually update RFC 3168 for any TCP/ECN sender,
any TCP/ECN sender, not just for ConEx or DCTCP senders. not just for ConEx or DCTCP senders. Then, future senders will be
Then future senders will be able to unilaterally deploy new able to unilaterally deploy new behaviours that exploit the
behaviours that exploit the existence of more accurate ECN existence of more accurate ECN feedback in receivers (forward
feedback in receivers (forward compatibility). Conversely, compatibility). Conversely, even if another sender only needs one
even if another sender only needs one ECN feedback signal per ECN feedback signal per RTT, it should be able to use more
RTT, it should be able to use more accurate ECN feedback, and accurate ECN feedback and simply ignore the excess information.
simply ignore the excess information.
Furthermore, the receiver should not make assumptions about the Furthermore, the receiver should not make assumptions about the
mechanism that was used to set the markings nor about any mechanism that was used to set the markings nor about any
interpretation or reaction to the congestion signal. The receiver interpretation or reaction to the congestion signal. The receiver
only needs to faithfully reflect congestion information back to the only needs to faithfully reflect congestion information back to the
sender. sender.
5. Design Approaches 5. Design Approaches
This section introduces some possible TCP ECN feedback design This section introduces some possible design approaches for TCP ECN
approaches. The purpose of this section is to give examples of how feedback. The purpose of this section is to give examples of how
trade-offs might be needed between the requirements, as input to trade-offs might be needed between the requirements, as input to
future IETF work to specify a protocol. The order is not significant future IETF work to specify a protocol. The order is not
and there is no intention to endorse any particular approach. significant, and there is no intention to endorse any particular
approach.
All approaches presented below (and proposed so far) are able to All approaches presented below (and proposed so far) are able to
provide accurate ECN feedback information as long as no ACK loss provide accurate ECN feedback information as long as no ACK loss
occurs and the congestion rate is reasonable. In the case of a high occurs and the congestion rate is reasonable. In the case of a high
ACK loss rate or very high congestion (CE marking) rate, the proposed ACK loss rate or very high congestion (CE-marking) rate, the proposed
schemes have different resilience characteristics depending on the schemes have different resilience characteristics depending on the
number of bits used for the encoding. While classic ECN provides number of bits used for the encoding. While classic ECN provides
reliable (but inaccurate) feedback of a maximum of one congestion reliable (but inaccurate) feedback of a maximum of one congestion
signal per RTT, the proposed schemes do not implement an explicit signal per RTT, the proposed schemes do not implement an explicit
acknowledgement mechanism for the feedback (as e.g. the ECE / CWR acknowledgement mechanism for the feedback (as, e.g., the ECE/CWR
exchange of [RFC3168]). exchange of [RFC3168]).
5.1. Re-Definition of ECN/NS Header Bits 5.1. Redefinition of ECN/NS Header Bits
Schemes in this category can additionally use the NS bit for Schemes in this category can additionally use the NS bit for
capability negotiation during the TCP handshake exchange. Thus a capability negotiation during the TCP handshake exchange. Thus a
more accurate ECN could be negotiated without changing the classic more accurate ECN could be negotiated without changing the classic
ECN negotiation and thus being backwards compatible. ECN negotiation and thus being backwards compatible.
Schemes in this category can simply re-define the ECN header flags, Schemes in this category can simply redefine the ECN header flags,
ECE and CWR, to encode the occurrence of a CE marking at the ECE and CWR, to encode the occurrence of a CE marking at the
receiver. This approach provides very limited resilience against receiver. This approach provides very limited resilience against
loss of ACK, particularly pure ACKs (no payload and therefore loss of ACK, particularly pure ACKs (no payload and therefore
delivered unreliably). delivered unreliably).
A couple of schemes have been proposed so far: A couple of schemes have been proposed so far:
o A naive one-bit scheme that sends one ECE for each CE received o A naive 1-bit scheme that sends one ECE for each CE received could
could use CWR to increase robustness against ACK loss by use CWR to increase robustness against ACK loss by introducing
introducing redundant information on the next ACK, but this is redundant information on the next ACK, but this is still
still vulnerable to ACK loss. vulnerable to ACK loss.
o The scheme defined for DCTCP [I-D.bensley-tcpm-dctcp], which o The scheme defined for DCTCP [DCTCP], which toggles the ECE
toggles the ECE feedback on an immediate ACK whenever the CE feedback on an immediate ACK whenever the CE marking changes, and
marking changes, and otherwise feeds back delayed ACKs with the otherwise feeds back delayed ACKs with the ECE value unchanged.
ECE value unchanged. Appendix A demonstrates that this scheme is Appendix A demonstrates that this scheme is still ambiguous to the
still ambiguous to the sender if the ACKs are pure ACKs, and if sender if the ACKs are pure ACKs, and if some may have been lost.
some may have been lost.
Alternatively, the receiver uses the three ECN/NS header flags, ECE, Alternatively, the receiver uses the three ECN/NS header flags, ECE,
CWR and NS to represent a counter that signals the accumulated number CWR, and NS, to represent a counter that signals the accumulated
of CE markings it has received. Resilience against loss is better number of CE markings it has received. Resilience against loss is
than the flag-based schemes, but may not suffice in the presence of better than the flag-based schemes but may not suffice in the
extended ACK loss that otherwise would not affect the TCP sender's presence of extended ACK loss that otherwise would not affect the TCP
performance. sender's performance.
A number of coding schemes have been proposed so far in this A number of coding schemes have been proposed so far in this
category: category:
o A 3-bit counter scheme continuously feeds back the three least o A 3-bit counter scheme continuously feeds back the three least
significant bits of a CE counter; significant bits of a CE counter;
o A scheme that defines a standardised lookup table to map the 8 o A scheme that defines a standardized lookup table to map the eight
codepoints onto either a CE counter or an ECT(1) counter. codepoints onto either a CE counter or an ECT(1) counter.
These proposed schemes provide accumulated information on ECN-CE These proposed schemes provide accumulated information on CE marking
marking feedback, similar to the number of acknowledged bytes in the feedback, similar to the number of acknowledged bytes in the TCP
TCP header. Due to the limited number of bits the ECN feedback header. Due to the limited number of bits, the ECN feedback
information will wrap much more often than the acknowledgement field. information will wrap much more often than the acknowledgement field.
Thus feedback information could be lost due to a relatively small Thus, feedback information could be lost due to a relatively small
sequence of pure-ACK losses. Resilience could be increased by sequence of pure-ACK losses. Resilience could be increased by
introducing redundancy, e.g. send each counter increase two or more introducing redundancy, e.g., send each counter increase two or more
times. Of course any of these additional mechanisms will increase times. Of course, any of these additional mechanisms will increase
the complexity. If the congestion rate is greater than the ACK rate the complexity. If the congestion rate is greater than the ACK rate
(multiplied by the number of congestion marks that can be signaled (multiplied by the number of congestion marks that can be signaled
per ACK), the congestion information cannot correctly be fed back. per ACK), the congestion information cannot correctly be fed back.
Covering the worst case where every packet is CE marked can Covering the worst case (where every packet is CE marked) can
potentially be realized by dynamically adapting the ACK rate and potentially be realized by dynamically adapting the ACK rate and
redundancy. This again increases complexity and perhaps the redundancy. This again increases complexity and perhaps the
signaling overhead as well. Schemes that do not re-purpose the ECN signalling overhead as well. Schemes that do not repurpose the ECN
NS bit, could still support the ECN Nonce. NS bit could still support the ECN Nonce.
5.2. Using Other Header Bits 5.2. Using Other Header Bits
As seen in Figure 1, there are currently three unused flags in the As seen in Figure 1, there are currently three unused flags in the
TCP header. The proposed 3-bit counter or codepoint schemes could be TCP header. The proposed 3-bit counter or codepoint schemes could be
extended by one or more bits to add higher resilience against ACK extended by one or more bits to add higher resilience against ACK
loss. The relative gain would be exponentially higher resilience loss. The relative gain would be exponentially higher resilience
against ACK loss, while the respective drawbacks would remain against ACK loss, while the respective drawbacks would remain
identical. identical.
Alternatively, a new method could standardise the use of the bits in Alternatively, a new method could standardize the use of the bits in
the Urgent Pointer field (see [RFC6093]) to signal more bits of its the Urgent Pointer field (see [RFC6093]) to signal more bits of its
congestion signal counter, but only whenever it does not set the congestion signal counter, but only whenever the Urgent Flag is not
Urgent Flag. As this is often the case, resilience could be set. As this is often the case, resilience could be increased
increased without additional header overhead. without additional header overhead.
Any proposal to use such bits would need to check the likelihood that Any proposal to use such bits would need to check the likelihood that
some middleboxes might discard or 'normalize' the currently unused some middleboxes might discard or 'normalize' the currently unused
flag bits or a non-zero Urgent Pointer when the Urgent Flag is flag bits or a non-zero Urgent Pointer when the Urgent Flag is
cleared. If during experimentation certain bits have been proven to cleared. If during experimentation certain bits have been proven to
be usable, the assignment of any of these bits would then require an be usable, the assignment of any of these bits would then require an
IETF standards action. IETF standards action.
5.3. Using a TCP Option 5.3. Using a TCP Option
Alternatively, a new TCP option could be introduced, to help maintain Alternatively, a new TCP option could be introduced, to help maintain
the accuracy and integrity of ECN feedback between receiver and the accuracy and integrity of ECN feedback between receiver and
sender. Such an option could provide higher resilience and even more sender. Such an option could provide higher resilience and even more
information, perhaps as much as ECN for RTP/UDP [RFC6679], which information, e.g., as much as is provided by a proposal for SCTP that
explicitly provides the number of ECT(0), ECT(1), CE, non-ECT marked counts the number of CE marked packet [SCTP-ECN] since the last CWR
and lost packets, or as much as a proposal for SCTP that counts the was observed, or by ECN for RTP/UDP [RFC6679]. The latter explicitly
number of ECN marks [I-D.stewart-tsvwg-sctpecn] between CWR chunks. provides the total number of packets during a connection where the IP
However, deploying new TCP options has its own challenges. Moreover, ECN field is set to ECT(0), ECT(1), CE, or Not-ECT, as well as the
to actually achieve high resilience, this option would need to be number of lost packets. However, deploying new TCP options has its
carried by most or all ACKs as the receiver cannot know if and when own challenges. Moreover, to actually achieve high resilience, this
ACKs may be dropped. Thus this approach would introduce considerable option would need to be carried by most or all ACKs as the receiver
signaling overhead even though ECN feedback is not extremely critical cannot know if and when ACKs may be dropped. Thus, this approach
information (in the worst case, loss will still be available to would introduce considerable signalling overhead even though ECN
provide a strong congestion feedback signal). Whatever, such a TCP feedback is not extremely critical information (in the worst case,
option could be used in addition to a more accurate ECN feedback loss will still be available to provide a strong congestion feedback
scheme in the TCP header or in addition to classic ECN, only when signal). Nevertheless, such a TCP option could be used in addition
needed and when space is available. to a more accurate ECN feedback scheme in the TCP header or in
addition to classic ECN, only when needed and when space is
6. Acknowledgements available.
Thanks to Gorry Fairhurst for his review and for ideas on CE-based
integrity checking and to Mohammad Alizadeh for suggesting the need
to avoid bias.
Bob Briscoe was part-funded by the European Community under its
Seventh Framework Programme through the Reducing Internet Transport
Latency (RITE) project (ICT-317700) and through the Trilogy 2 project
(ICT-317756). he views expressed here are solely those of the
authors, in the context of the mentioned funding projects
7. IANA Considerations
This memo includes no request to IANA.
8. Security Considerations 6. Security Considerations
ECN feedback information must only be used if the other information ECN feedback information must only be used if the other information
contained in a received TCP segment indicates that the congestion was contained in a received TCP segment indicates that the congestion was
genuinely part of the flow and not spoofed - i.e. the normal TCP genuinely part of the flow and not spoofed. That is, the normal TCP
acceptance techniques have to be used to verify that the segment is acceptance techniques have to be used to verify that the segment is
part of the flow before returning any contained ECN information, and part of the flow before returning any contained ECN information, and,
similarly ECN feedback is only accepted on valid ACKs. similarly, ECN feedback is only accepted on valid ACKs.
Given ECN feedback is used as input for congestion control, the Given ECN feedback is used as input for congestion control, the
respective algorithm would not react appropriately if ECN feedback respective algorithm would not react appropriately if ECN feedback
were lost and the resilience mechanism to recover it was inadequate. were lost and the resilience mechanism to recover it was inadequate.
This resilience requirement is articulated in Section 4. However, it This resilience requirement is articulated in Section 4. However, it
should be noted that ECN feedback is not the last resort against should be noted that ECN feedback is not the last resort against
congestion collapse, because if there is insufficient response to congestion collapse, because if there is insufficient response to
ECN, loss will ensue, and TCP will still react appropriately to loss. ECN, loss will ensue, and TCP will still react appropriately to loss.
A receiver could suppress ECN feedback information leading to its A receiver could suppress ECN feedback information leading to its
connections consuming excess sender or network resources. This connections consuming excess sender or network resources. This
problem is similar to that seen with the classic ECN feedback scheme problem is similar to that seen with the classic ECN feedback scheme
and should be addressed by integrity checking as required in and should be addressed by integrity checking as required in
Section 4. Section 4.
9. References 7. References
9.1. Normative References 7.1. Normative References
[RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
of Explicit Congestion Notification (ECN) to IP", RFC of Explicit Congestion Notification (ECN) to IP",
3168, September 2001. RFC 3168, DOI 10.17487/RFC3168, September 2001,
<http://www.rfc-editor.org/info/rfc3168>.
[RFC3540] Spring, N., Wetherall, D., and D. Ely, "Robust Explicit [RFC3540] Spring, N., Wetherall, D., and D. Ely, "Robust Explicit
Congestion Notification (ECN) Signaling with Nonces", RFC Congestion Notification (ECN) Signaling with Nonces",
3540, June 2003. RFC 3540, DOI 10.17487/RFC3540, June 2003,
<http://www.rfc-editor.org/info/rfc3540>.
9.2. Informative References
[I-D.bensley-tcpm-dctcp]
sbens@microsoft.com, s., Eggert, L., and D. Thaler,
"Microsoft's Datacenter TCP (DCTCP): TCP Congestion
Control for Datacenters", draft-bensley-tcpm-dctcp-02
(work in progress), January 2015.
[I-D.moncaster-tcpm-rcv-cheat] 7.2. Informative References
Moncaster, T., Briscoe, B., and A. Jacquet, "A TCP Test to
Allow Senders to Identify Receiver Non-Compliance", draft-
moncaster-tcpm-rcv-cheat-03 (work in progress), July 2014.
[I-D.stewart-tsvwg-sctpecn] [DCTCP] Bensley, S., Eggert, L., and D. Thaler, "Microsoft's
Stewart, R., Tuexen, M., and X. Dong, "ECN for Stream Datacenter TCP (DCTCP): TCP Congestion Control for
Control Transmission Protocol (SCTP)", draft-stewart- Datacenters", Work in Progress,
tsvwg-sctpecn-05 (work in progress), January 2014. draft-bensley-tcpm-dctcp-05, July 2015.
[I-D.welzl-ecn-benefits] [ECN-BENEFITS]
Welzl, M. and G. Fairhurst, "The Benefits to Applications Fairhurst, G. and M. Welzl, "The Benefits of using
of using Explicit Congestion Notification (ECN)", draft- Explicit Congestion Notification (ECN)", Work in Progress
welzl-ecn-benefits-01 (work in progress), July 2014. draft-ietf-aqm-ecn-benefits-06, July 2015.
[RFC0896] Nagle, J., "Congestion control in IP/TCP internetworks", [RFC896] Nagle, J., "Congestion Control in IP/TCP Internetworks",
RFC 896, January 1984. RFC 896, DOI 10.17487/RFC0896, January 1984,
<http://www.rfc-editor.org/info/rfc896>.
[RFC2018] Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP [RFC2018] Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP
Selective Acknowledgment Options", RFC 2018, October 1996. Selective Acknowledgment Options", RFC 2018,
DOI 10.17487/RFC2018, October 1996,
<http://www.rfc-editor.org/info/rfc2018>.
[RFC3449] Balakrishnan, H., Padmanabhan, V., Fairhurst, G., and M. [RFC3449] Balakrishnan, H., Padmanabhan, V., Fairhurst, G., and M.
Sooriyabandara, "TCP Performance Implications of Network Sooriyabandara, "TCP Performance Implications of Network
Path Asymmetry", BCP 69, RFC 3449, December 2002. Path Asymmetry", BCP 69, RFC 3449, DOI 10.17487/RFC3449,
December 2002, <http://www.rfc-editor.org/info/rfc3449>.
[RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
Control", RFC 5681, September 2009. Control", RFC 5681, DOI 10.17487/RFC5681, September 2009,
<http://www.rfc-editor.org/info/rfc5681>.
[RFC5690] Floyd, S., Arcia, A., Ros, D., and J. Iyengar, "Adding [RFC5690] Floyd, S., Arcia, A., Ros, D., and J. Iyengar, "Adding
Acknowledgement Congestion Control to TCP", RFC 5690, Acknowledgement Congestion Control to TCP", RFC 5690,
February 2010. DOI 10.17487/RFC5690, February 2010,
<http://www.rfc-editor.org/info/rfc5690>.
[RFC6093] Gont, F. and A. Yourtchenko, "On the Implementation of the [RFC6093] Gont, F. and A. Yourtchenko, "On the Implementation of the
TCP Urgent Mechanism", RFC 6093, January 2011. TCP Urgent Mechanism", RFC 6093, DOI 10.17487/RFC6093,
January 2011, <http://www.rfc-editor.org/info/rfc6093>.
[RFC6679] Westerlund, M., Johansson, I., Perkins, C., O'Hanlon, P., [RFC6679] Westerlund, M., Johansson, I., Perkins, C., O'Hanlon, P.,
and K. Carlberg, "Explicit Congestion Notification (ECN) and K. Carlberg, "Explicit Congestion Notification (ECN)
for RTP over UDP", RFC 6679, August 2012. for RTP over UDP", RFC 6679, DOI 10.17487/RFC6679, August
2012, <http://www.rfc-editor.org/info/rfc6679>.
[RFC6789] Briscoe, B., Woundy, R., and A. Cooper, "Congestion [RFC6789] Briscoe, B., Ed., Woundy, R., Ed., and A. Cooper, Ed.,
Exposure (ConEx) Concepts and Use Cases", RFC 6789, "Congestion Exposure (ConEx) Concepts and Use Cases",
December 2012. RFC 6789, DOI 10.17487/RFC6789, December 2012,
<http://www.rfc-editor.org/info/rfc6789>.
[SCTP-ECN] Stewart, R., Tuexen, M., and X. Dong, "ECN for Stream
Control Transmission Protocol (SCTP)", Work in Progress,
draft-stewart-tsvwg-sctpecn-05, January 2014.
[TEST-RCV] Moncaster, T., Briscoe, B., and A. Jacquet, "A TCP Test to
Allow Senders to Identify Receiver Non-Compliance", Work
in Progress, draft-moncaster-tcpm-rcv-cheat-03, July 2014.
Appendix A. Ambiguity of the More Accurate ECN Feedback in DCTCP Appendix A. Ambiguity of the More Accurate ECN Feedback in DCTCP
As defined in [I-D.bensley-tcpm-dctcp], a DCTCP receiver feeds back As defined in [DCTCP], a DCTCP receiver feeds back ECE=0 on delayed
ECE=0 on delayed ACKs as long as CE remains 0, and also immediately ACKs as long as CE remains 0, and also immediately sends an ACK with
sends an ACK with ECE=0 when CE transitions to 1. Similarly, it ECE=0 when CE transitions to 1. Similarly, it continually feeds back
continually feeds back ECE=1 on delayed ACKs while CE remains 1 and ECE=1 on delayed ACKs while CE remains 1 and immediately feeds back
immediately feeds back ECE=1 when CE transitions to 0. A sender can ECE=1 when CE transitions to 0. A sender can unambiguously decode
unambiguously decode this scheme if there is never any ACK loss, and this scheme if there is never any ACK loss, and the sender assumes
the sender assumes there will never be any ACK loss. there will never be any ACK loss.
The following two examples show that the feedback sequence becomes The following two examples show that the feedback sequence becomes
highly ambiguous to the sender, if either of these conditions is highly ambiguous to the sender if either of these conditions is
broken. Below, '0' will represent ECE=0, '1' will represent ECE=1 broken. Below, '0' represents ECE=0, '1' represents ECE=1, and '.'
and '.' will represent a gap of one segment between delayed ACKs. represents a gap of one segment between delayed ACKs. Now imagine
Now imagine that the sender receives the following sequence of that the sender receives the following sequence of feedback on three
feedback on 3 pure ACKs: pure ACKs:
0.0.0 0.0.0
When the receiver sent this sequence it could have been any of the When the receiver sent this sequence, it could have been any of the
following four sequences: following four sequences:
a. 0.0.0 (0 x CE) a. 0.0.0 (0 x CE)
b. 010.0 (1 x CE) b. 010.0 (1 x CE)
c. 0.010 (1 x CE) c. 0.010 (1 x CE)
d. 01010 (2 x CE) d. 01010 (2 x CE)
where any of the 1s represent a possible pure ACK carrying ECE where any of the 1s represent a possible pure ACK carrying ECE
feedback that could have been lost. If the sender guesses (a), it feedback that could have been lost. If the sender guesses (a), it
might be correct, or it might miss 1 or 2 congestion marks over 5 might be correct, or it might miss 1 or 2 congestion marks over 5
packets. Therefore, when confronted with this simple sequence (that packets. Therefore, when confronted with this simple sequence (that
is not contrived), a sender can guess that congestion might have been is not contrived), a sender can guess that congestion might have been
0%, 20% or 40%, but it doesn't know which. 0%, 20%, or 40%, but it doesn't know which.
Sequences with a longer gap (e.g. 0...0.0) become far more ambiguous. Sequences with a longer gap (e.g., 0...0.0) become far more
It helps a little if the sender knows the distance the receiver uses ambiguous. It helps a little if the sender knows the distance the
between delayed ACKs, and it helps a lot if the distance is 1, i.e. receiver uses between delayed ACKs, and it helps a lot if the
no delayed ACKs, but even then there will still be ambiguity whenever distance is 1, i.e., no delayed ACKs. However, even without delayed
there are pure ACK losses. ACKs there will still be ambiguity whenever there are pure ACK
losses.
Acknowledgements
Thanks to Gorry Fairhurst for his review and for ideas on CE-based
integrity checking and to Mohammad Alizadeh for suggesting the need
to avoid bias.
Bob Briscoe was partly funded by the European Community under its
Seventh Framework Programme through the Reducing Internet Transport
Latency (RITE) project (ICT-317700) and through the Trilogy 2 project
(ICT-317756). The views expressed here are solely those of the
authors, in the context of the mentioned funding projects.
Authors' Addresses Authors' Addresses
Mirja Kuehlewind (editor) Mirja Kuehlewind (editor)
ETH Zurich ETH Zurich
Gloriastrasse 35 Gloriastrasse 35
Zurich 8092 Zurich 8092
Switzerland Switzerland
Email: mirja.kuehlewind@tik.ee.ethz.ch Email: mirja.kuehlewind@tik.ee.ethz.ch
skipping to change at page 16, line 45 skipping to change at page 17, line 41
Austria Austria
Phone: +43 1 3676811 3146 Phone: +43 1 3676811 3146
Email: rs@netapp.com Email: rs@netapp.com
Bob Briscoe Bob Briscoe
BT BT
B54/77, Adastral Park B54/77, Adastral Park
Martlesham Heath Martlesham Heath
Ipswich IP5 3RE Ipswich IP5 3RE
UK United Kingdom
Phone: +44 1473 645196 Phone: +44 1473 645196
Email: bob.briscoe@bt.com Email: ietf@bobbriscoe.net
URI: http://bobbriscoe.net/ URI: http://bobbriscoe.net/
 End of changes. 96 change blocks. 
371 lines changed or deleted 367 lines changed or added

This html diff was produced by rfcdiff 1.42. The latest version is available from http://tools.ietf.org/tools/rfcdiff/