draft-ietf-tsvwg-ecn-01.txt   draft-ietf-tsvwg-ecn-02.txt 
Internet Engineering Task Force K. K. Ramakrishnan Internet Engineering Task Force K. K. Ramakrishnan
INTERNET DRAFT TeraOptic Networks INTERNET DRAFT TeraOptic Networks
draft-ietf-tsvwg-ecn-01.txt Sally Floyd draft-ietf-tsvwg-ecn-02.txt Sally Floyd
ACIRI ACIRI
D. Black D. Black
EMC EMC
January, 2001 February, 2001
Expires: July, 2001 Expires: August, 2001
The Addition of Explicit Congestion Notification (ECN) to IP The Addition of Explicit Congestion Notification (ECN) to IP
Status of this Memo Status of this Memo
This document is an Internet-Draft and is in full conformance with This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026. all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
skipping to change at page 1, line 44 skipping to change at page 1, line 44
Abstract Abstract
This document specifies the incorporation of ECN (Explicit Congestion This document specifies the incorporation of ECN (Explicit Congestion
Notification) to TCP and IP, including ECN's use of two bits in the Notification) to TCP and IP, including ECN's use of two bits in the
IP header. We begin by describing TCP's use of packet drops as an IP header. We begin by describing TCP's use of packet drops as an
indication of congestion. Next we explain that with the addition of indication of congestion. Next we explain that with the addition of
active queue management (e.g., RED) to the Internet infrastructure, active queue management (e.g., RED) to the Internet infrastructure,
where routers detect congestion before the queue overflows, routers where routers detect congestion before the queue overflows, routers
are no longer limited to packet drops as an indication of congestion. are no longer limited to packet drops as an indication of congestion.
Routers can instead set the Congestion Experienced (CE) bit in the IP Routers can instead set the Congestion Experienced (CE) codepoint in
header of packets from ECN-capable transports. We describe when the the IP header of packets from ECN-capable transports. We describe
CE bit is to be set in routers, and describe modifications needed to when the CE codepoint is to be set in routers, and describe
TCP to make it ECN-capable. Modifications to other transport modifications needed to TCP to make it ECN-capable. Modifications to
protocols (e.g., unreliable unicast or multicast, reliable multicast, other transport protocols (e.g., unreliable unicast or multicast,
other reliable unicast transport protocols) could be considered as reliable multicast, other reliable unicast transport protocols) could
those protocols are developed and advance through the standards be considered as those protocols are developed and advance through
process. the standards process.
We also describe in this document the issues involving the use of ECN We also describe in this document the issues involving the use of ECN
within IP tunnels, and within IPsec tunnels in particular. within IP tunnels, and within IPsec tunnels in particular.
One of the guiding principles for this document is that all the One of the guiding principles for this document is that all the
mechanisms specified here are incrementally deployable. mechanisms specified here are incrementally deployable.
Table of Contents Table of Contents
1. Introduction 1. Introduction
2. Conventions and Acronyms 2. Conventions and Acronyms
3. Assumptions and General Principles 3. Assumptions and General Principles
4. Active Queue Management (AQM) 4. Active Queue Management (AQM)
5. Explicit Congestion Notification in IP 5. Explicit Congestion Notification in IP
5.1. ECN as an Indication of Persistent Congestion 5.1. ECN as an Indication of Persistent Congestion
5.2. Dropped or Corrupted Packets 5.2. Dropped or Corrupted Packets
5.3. Fragmentation
6. Support from the Transport Protocol 6. Support from the Transport Protocol
6.1. TCP 6.1. TCP
6.1.1. TCP Initialization 6.1.1. TCP Initialization
6.1.1.1. Robust TCP Initialization with an Echoed Reserve Field 6.1.1.1. Robust TCP Initialization with an Echoed Reserve Field
6.1.2. The TCP Sender 6.1.2. The TCP Sender
6.1.3. The TCP Receiver 6.1.3. The TCP Receiver
6.1.4. Congestion on the ACK-path 6.1.4. Congestion on the ACK-path
6.1.5. Retransmitted TCP packets 6.1.5. Retransmitted TCP packets
6.1.6. TCP Window Probes. 6.1.6. TCP Window Probes.
7. Non-compliance by the End Nodes 7. Non-compliance by the End Nodes
8. Non-compliance in the Network 8. Non-compliance in the Network
8.1. Complications Introduced by Split Paths 8.1. Complications Introduced by Split Paths
9. Encapsulated Packets 9. Encapsulated Packets
skipping to change at page 3, line 37 skipping to change at page 3, line 38
9.2. IPsec Tunnels 9.2. IPsec Tunnels
9.2.1. Negotiation between Tunnel Endpoints 9.2.1. Negotiation between Tunnel Endpoints
9.2.1.1. ECN Tunnel Security Association Database Field 9.2.1.1. ECN Tunnel Security Association Database Field
9.2.1.2. ECN Tunnel Security Association Attribute 9.2.1.2. ECN Tunnel Security Association Attribute
9.2.1.3. Changes to IPsec Tunnel Header Processing 9.2.1.3. Changes to IPsec Tunnel Header Processing
9.2.2. Changes to the ECN Field within an IPsec Tunnel. 9.2.2. Changes to the ECN Field within an IPsec Tunnel.
9.2.3. Comments for IPsec Support 9.2.3. Comments for IPsec Support
9.3. IP packets encapsulated in non-IP packet headers. 9.3. IP packets encapsulated in non-IP packet headers.
10. Issues Raised by Monitoring and Policing Devices 10. Issues Raised by Monitoring and Policing Devices
11. Evaluations of ECN 11. Evaluations of ECN
11.1. Related Work Evaluating ECN
11.2. A Discussion of the ECN nonce.
11.2.1. The Incremental Deployment of ECT(1) in Routers.
12. Summary of changes required in IP and TCP 12. Summary of changes required in IP and TCP
13. Conclusions 13. Conclusions
14. Acknowledgements 14. Acknowledgements
15. References 15. References
16. Security Considerations 16. Security Considerations
17. IPv4 Header Checksum Recalculation 17. IPv4 Header Checksum Recalculation
18. Possible Changes to the ECN Field in the Network 18. Possible Changes to the ECN Field in the Network
18.1. Possible Changes to the IP Header 18.1. Possible Changes to the IP Header
18.1.1. Erasing the Congestion Indication 18.1.1. Erasing the Congestion Indication
18.1.2. Falsely Reporting Congestion 18.1.2. Falsely Reporting Congestion
18.1.3. Disabling ECN-Capability 18.1.3. Disabling ECN-Capability
18.1.4. Falsely Indicating ECN-Capability 18.1.4. Falsely Indicating ECN-Capability
18.1.5. Changes with No Functional Effect
18.2. Information carried in the Transport Header 18.2. Information carried in the Transport Header
18.3. Split Paths 18.3. Split Paths
19. Implications of Subverting End-to-End Congestion Control 19. Implications of Subverting End-to-End Congestion Control
19.1. Implications for the Network and for Competing Flows 19.1. Implications for the Network and for Competing Flows
19.2. Implications for the Subverted Flow 19.2. Implications for the Subverted Flow
19.3. Non-ECN-Based Methods of Subverting End-to-end Congestion Control 19.3. Non-ECN-Based Methods of Subverting End-to-end Congestion Control
20. The Motivation for the ECT bit. 20. The Motivation for the ECT Codepoints.
20.1. The Motivation for an ECT Codepoint.
20.2. The Motivation for two ECT Codepoints.
21. Why use Two Bits in the IP Header? 21. Why use Two Bits in the IP Header?
22. Historical Definitions for the IPv4 TOS Octet 22. Historical Definitions for the IPv4 TOS Octet
23. IANA Considerations 23. IANA Considerations
RFC EDITOR - REMOVE THE FOLLOWING PARAGRAPH ON PUBLICATION - To compare RFC EDITOR - REMOVE THE FOLLOWING PARAGRAPH ON PUBLICATION - To compare
this with draft-ietf-tsvwg-ecn-00, compare the following: this with draft-ietf-tsvwg-ecn-01, compare the following:
"http://www.aciri.org/floyd/papers/draft-ietf-tsvwg-ecn-00.troff"
"http://www.aciri.org/floyd/papers/draft-ietf-tsvwg-ecn-01.troff" "http://www.aciri.org/floyd/papers/draft-ietf-tsvwg-ecn-01.troff"
Changes from draft-ietf-tsvwg-ecn-00: "http://www.aciri.org/floyd/papers/draft-ietf-tsvwg-ecn-02.troff"
* Deleted Section 6.1.1.2. on "Robust TCP Initialization with no Changes from draft-ietf-tsvwg-ecn-01:
response to the SYN", and modified the paragraph in the Conclusions Added the ECT(1) codepoint, and changed references about bits to
referring to this. references about codepoints in many places. Also added Section 11.2 on
* Added Section 23 on IANA Considerations. "A Discussion of the ECN nonce", and Section 20.2 on "The Motivation for
* Added two paragraphs to Section 18.2 on denial-of-service attacks. two ECT Codepoints".
* Added some text about the ECN nonce being a research issue. Added a paragraph saying that by default, the discussion of setting
* Moved two paragraphs about setting the CWR bit from Section 6.1.3 to the CE codepoint applies to all Differentiated Services Per-Hop
Section 6.1.2. Behaviors.
* Various small changes: Added Section 5.3 on fragmentation.
Adding several small clarifying sentences in Section 12, 22. Added "A host MUST NOT set ECT on SYN or SYN-ACK packets." to the end
Small clarification to text in Section 19.2. of Section 6.1.1, just to be explicit.
Deleted a few unnecessary sentences in Section 9. Corrected some references to "Section 19" to "Section 22".
Updated some references to Section X. Clarified that ECN is defined identically in IPv4 and in IPv6.
Added more references to RFC 2780.
Deleted references to internet-drafts.
Clarified terminology for "non-ECN-setup SYN packet", including the
following: "Receivers MUST correctly handle all forms of the non-ECN-
setup SYN and SYN-ACK packets."
1. Introduction 1. Introduction
TCP's congestion control and avoidance algorithms are based on the TCP's congestion control and avoidance algorithms are based on the
notion that the network is a black-box [Jacobson88, Jacobson90]. The notion that the network is a black-box [Jacobson88, Jacobson90]. The
network's state of congestion or otherwise is determined by end-sys- network's state of congestion or otherwise is determined by end-
tems probing for the network state, by gradually increasing the load systems probing for the network state, by gradually increasing the
on the network (by increasing the window of packets that are out- load on the network (by increasing the window of packets that are
standing in the network) until the network becomes congested and a outstanding in the network) until the network becomes congested and a
packet is lost. Treating the network as a "black-box" and treating packet is lost. Treating the network as a "black-box" and treating
loss as an indication of congestion in the network is appropriate for loss as an indication of congestion in the network is appropriate for
pure best-effort data carried by TCP, with little or no sensitivity pure best-effort data carried by TCP, with little or no sensitivity
to delay or loss of individual packets. In addition, TCP's conges- to delay or loss of individual packets. In addition, TCP's
tion management algorithms have techniques built-in (such as Fast congestion management algorithms have techniques built-in (such as
Retransmit and Fast Recovery) to minimize the impact of losses, from Fast Retransmit and Fast Recovery) to minimize the impact of losses,
a throughput perspective. However, these mechanisms are not intended from a throughput perspective. However, these mechanisms are not
to help applications that are in fact sensitive to the delay or loss intended to help applications that are in fact sensitive to the delay
of one or more individual packets. Interactive traffic such as tel- or loss of one or more individual packets. Interactive traffic such
net, web-browsing, and transfer of audio and video data can be sensi- as telnet, web-browsing, and transfer of audio and video data can be
tive to packet losses (especially when using an unreliable data sensitive to packet losses (especially when using an unreliable data
delivery transport such as UDP) or to the increased latency of the delivery transport such as UDP) or to the increased latency of the
packet caused by the need to retransmit the packet after a loss (with packet caused by the need to retransmit the packet after a loss (with
the reliable data delivery semantics provided by TCP). the reliable data delivery semantics provided by TCP).
Since TCP determines the appropriate congestion window to use by Since TCP determines the appropriate congestion window to use by
gradually increasing the window size until it experiences a dropped gradually increasing the window size until it experiences a dropped
packet, this causes the queues at the bottleneck router to build up. packet, this causes the queues at the bottleneck router to build up.
With most packet drop policies at the router that are not sensitive With most packet drop policies at the router that are not sensitive
to the load placed by each individual flow (e.g., tail-drop on queue to the load placed by each individual flow (e.g., tail-drop on queue
overflow), this means that some of the packets of latency-sensitive overflow), this means that some of the packets of latency-sensitive
flows may be dropped. In addition, such drop policies lead to syn- flows may be dropped. In addition, such drop policies lead to
chronization of loss across multiple flows. synchronization of loss across multiple flows.
Active queue management mechanisms detect congestion before the queue Active queue management mechanisms detect congestion before the queue
overflows, and provide an indication of this congestion to the end overflows, and provide an indication of this congestion to the end
nodes. Thus, active queue management can reduce unnecessary queueing nodes. Thus, active queue management can reduce unnecessary queueing
delay for all traffic sharing that queue. The advantages of active delay for all traffic sharing that queue. The advantages of active
queue management are discussed in RFC 2309 [RFC2309]. Active queue queue management are discussed in RFC 2309 [RFC2309]. Active queue
management avoids some of the bad properties of dropping on queue management avoids some of the bad properties of dropping on queue
overflow, including the undesirable synchronization of loss across overflow, including the undesirable synchronization of loss across
multiple flows. More importantly, active queue management means that multiple flows. More importantly, active queue management means that
transport protocols with mechanisms for congestion control (e.g., transport protocols with mechanisms for congestion control (e.g.,
TCP) do not have to rely on buffer overflow as the only indication of TCP) do not have to rely on buffer overflow as the only indication of
congestion. congestion.
Active queue management mechanisms may use one of several methods for Active queue management mechanisms may use one of several methods for
indicating congestion to end-nodes. One is to use packet drops, as is indicating congestion to end-nodes. One is to use packet drops, as is
currently done. However, active queue management allows the router to currently done. However, active queue management allows the router to
separate policies of queueing or dropping packets from the policies separate policies of queueing or dropping packets from the policies
for indicating congestion. Thus, active queue management allows for indicating congestion. Thus, active queue management allows
routers to use the Congestion Experienced (CE) bit in a packet header routers to use the Congestion Experienced (CE) codepoint in a packet
as an indication of congestion, instead of relying solely on packet header as an indication of congestion, instead of relying solely on
drops. This has the potential of reducing the impact of loss on packet drops. This has the potential of reducing the impact of loss
latency-sensitive flows. on latency-sensitive flows.
This document is intended to obsolete RFC 2481, "A Proposal to add This document is intended to obsolete RFC 2481, "A Proposal to add
Explicit Congestion Notification (ECN) to IP", which defined ECN as Explicit Congestion Notification (ECN) to IP", which defined ECN as
an Experimental Protocol for the Internet Community. an Experimental Protocol for the Internet Community.
RFC EDITOR - REMOVE THE FOLLOWING PARAGRAPH ON PUBLICATION - This RFC EDITOR - REMOVE THE FOLLOWING PARAGRAPH ON PUBLICATION - This
document obsoletes three subsequent internet-drafts on ECN, "IPsec document obsoletes three subsequent internet-drafts on ECN, "IPsec
Interactions with ECN", "ECN Interactions with IP Tunnels", and "TCP Interactions with ECN", "ECN Interactions with IP Tunnels", and "TCP
with ECN: The Treatment of Retransmitted Data Packets". This with ECN: The Treatment of Retransmitted Data Packets". This
document is intended largely to merge the earlier documents all into document is intended largely to merge the earlier documents all into
skipping to change at page 6, line 19 skipping to change at page 6, line 18
The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD,
SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL, when they appear in this SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL, when they appear in this
document, are to be interpreted as described in [B97]. document, are to be interpreted as described in [B97].
3. Assumptions and General Principles 3. Assumptions and General Principles
In this section, we describe some of the important design principles In this section, we describe some of the important design principles
and assumptions that guided the design choices in this proposal. and assumptions that guided the design choices in this proposal.
* Because ECN is likely to be adopted gradually, accommodating migra- * Because ECN is likely to be adopted gradually, accommodating
tion is essential. Some routers may still only drop packets to indi- migration is essential. Some routers may still only drop packets to
cate congestion, and some end-systems may not be ECN-capable. The indicate congestion, and some end-systems may not be ECN-capable. The
most viable strategy is one that accommodates incremental deployment most viable strategy is one that accommodates incremental deployment
without having to resort to "islands" of ECN-capable and non-ECN- without having to resort to "islands" of ECN-capable and non-ECN-
capable environments. capable environments.
* New mechanisms for congestion control and avoidance need to co- * New mechanisms for congestion control and avoidance need to co-
exist and cooperate with existing mechanisms for congestion control. exist and cooperate with existing mechanisms for congestion control.
In particular, new mechanisms have to co-exist with TCP's current In particular, new mechanisms have to co-exist with TCP's current
methods of adapting to congestion and with routers' current practice methods of adapting to congestion and with routers' current practice
of dropping packets in periods of congestion. of dropping packets in periods of congestion.
* Congestion may persist over different time-scales. The time scales * Congestion may persist over different time-scales. The time scales
that we are concerned with are congestion events that may last longer that we are concerned with are congestion events that may last longer
than a round-trip time. than a round-trip time.
* The number of packets in an individual flow (e.g., TCP connection * The number of packets in an individual flow (e.g., TCP connection
or an exchange using UDP) may range from a small number of packets to or an exchange using UDP) may range from a small number of packets to
quite a large number. We are interested in managing the congestion quite a large number. We are interested in managing the congestion
caused by flows that send enough packets so that they are still caused by flows that send enough packets so that they are still
active when network feedback reaches them. active when network feedback reaches them.
* Asymmetric routing is likely to be a normal occurrence in the * Asymmetric routing is likely to be a normal occurrence in the
Internet. The path (sequence of links and routers) followed by data Internet. The path (sequence of links and routers) followed by data
packets may be different from the path followed by the acknowledgment packets may be different from the path followed by the acknowledgment
packets in the reverse direction. packets in the reverse direction.
* Many routers process the "regular" headers in IP packets more effi- * Many routers process the "regular" headers in IP packets more
ciently than they process the header information in IP options. This efficiently than they process the header information in IP options.
suggests keeping congestion experienced information in the regular This suggests keeping congestion experienced information in the
headers of an IP packet. regular headers of an IP packet.
* It must be recognized that not all end-systems will cooperate in * It must be recognized that not all end-systems will cooperate in
mechanisms for congestion control. However, new mechanisms shouldn't mechanisms for congestion control. However, new mechanisms shouldn't
make it easier for TCP applications to disable TCP congestion con- make it easier for TCP applications to disable TCP congestion
trol. The benefit of lying about participating in new mechanisms control. The benefit of lying about participating in new mechanisms
such as ECN-capability should be small. such as ECN-capability should be small.
4. Active Queue Management (AQM) 4. Active Queue Management (AQM)
Random Early Detection (RED) is one mechanism for Active Queue Man- Random Early Detection (RED) is one mechanism for Active Queue
agement (AQM) that has been proposed to detect incipient congestion Management (AQM) that has been proposed to detect incipient
[FJ93], and is currently being deployed in the Internet [RFC2309]. congestion [FJ93], and is currently being deployed in the Internet
AQM is meant to be a general mechanism using one of several alterna- [RFC2309]. AQM is meant to be a general mechanism using one of
tives for congestion indication, but in the absence of ECN, AQM is several alternatives for congestion indication, but in the absence of
restricted to using packet drops as a mechanism for congestion indi- ECN, AQM is restricted to using packet drops as a mechanism for
cation. AQM drops packets based on the average queue length exceed- congestion indication. AQM drops packets based on the average queue
ing a threshold, rather than only when the queue overflows. However, length exceeding a threshold, rather than only when the queue
because AQM may drop packets before the queue actually overflows, AQM overflows. However, because AQM may drop packets before the queue
is not always forced by memory limitations to discard the packet. actually overflows, AQM is not always forced by memory limitations to
discard the packet.
AQM can set a Congestion Experienced (CE) bit in the packet header AQM can set a Congestion Experienced (CE) codepoint in the packet
instead of dropping the packet, when such a bit is provided in the IP header instead of dropping the packet, when such a field is provided
header and understood by the transport protocol. The use of the CE in the IP header and understood by the transport protocol. The use
bit with ECN allows the receiver(s) to receive the packet, avoiding of the CE codepoint with ECN allows the receiver(s) to receive the
the potential for excessive delays due to retransmissions after packet, avoiding the potential for excessive delays due to
packet losses. We use the term 'CE packet' to denote a packet that retransmissions after packet losses. We use the term 'CE packet' to
has the CE bit set. denote a packet that has the CE codepoint set.
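As a rough illustration of the behavior described in this section, the following Python sketch shows an AQM queue that signals incipient congestion by setting CE on packets from ECN-capable transports and by dropping all others. It is a simplification under stated assumptions, not the RED algorithm of [FJ93]: it uses a single deterministic threshold on an exponentially weighted average queue length, whereas RED marks probabilistically between two thresholds. The class name, the parameter names, and the weight value are hypothetical. The unconditional drop at a full queue follows the statement in Section 5.

```python
from dataclasses import dataclass


@dataclass
class AqmQueue:
    """Toy AQM queue; all names and defaults are illustrative assumptions."""
    capacity: int          # hard buffer limit, in packets
    threshold: float       # average queue length that signals incipient congestion
    weight: float = 0.2    # EWMA weight for the average (assumed value)
    avg: float = 0.0       # running average queue length
    length: int = 0        # instantaneous queue length, in packets

    def on_arrival(self, ecn_capable: bool) -> str:
        """Return 'enqueue', 'mark', or 'drop' for one arriving packet."""
        # Update the average from the queue length seen by this arrival.
        self.avg = (1 - self.weight) * self.avg + self.weight * self.length
        if self.length >= self.capacity:
            # Full queue: the packet is dropped whether or not it is ECN-capable.
            return "drop"
        if self.avg > self.threshold:
            if ecn_capable:
                # Deliver the packet, but with the CE codepoint set.
                self.length += 1
                return "mark"
            # Without ECN, a drop is the only available congestion indication.
            return "drop"
        self.length += 1
        return "enqueue"
```

The point of the sketch is that the marking decision and the dropping decision share the same congestion test; only the form of the indication depends on whether the packet is ECN-capable.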
5. Explicit Congestion Notification in IP 5. Explicit Congestion Notification in IP
This document specifies that the Internet provide a congestion indi- This document specifies that the Internet provide a congestion
cation for incipient congestion (as in RED and earlier work [RJ90]) indication for incipient congestion (as in RED and earlier work
where the notification can sometimes be through marking packets [RJ90]) where the notification can sometimes be through marking
rather than dropping them. This uses an ECN field in the IP header packets rather than dropping them. This uses an ECN field in the IP
with two bits. The ECN-Capable Transport (ECT) bit is set by the header with two bits, making four ECN codepoints, '00' to '11'. The
ECN-Capable Transport (ECT) codepoints '10' and '01' are set by the
data sender to indicate that the end-points of the transport protocol data sender to indicate that the end-points of the transport protocol
are ECN-capable. The CE bit is set by the router to indicate congestion are ECN-capable; we call them ECT(0) and ECT(1) respectively. The
to the end nodes. Routers that have a packet arriving at a full phrase "the ECT codepoint" in this document refers to either of the
queue drop the packet, just as they do in the absence of ECN. two ECT codepoints. Routers treat the ECT(0) and ECT(1) codepoints
as equivalent. Senders are free to use either the ECT(0) or the
ECT(1) codepoint to indicate ECT, on a packet-by-packet basis.
Bits 6 and 7 in the IPv4 TOS octet are designated as the ECN field. The use of two codepoints for ECT, ECT(0) and ECT(1), is
Bit 6 is designated as the ECT bit, and bit 7 is designated as the CE motivated primarily by the desire to allow mechanisms for the data
bit. The IPv4 TOS octet corresponds to the Traffic Class octet in sender to verify that network elements are not erasing the CE
IPv6. The definitions for the IPv4 TOS octet [RFC791] and the IPv6 codepoint, and that data receivers are properly reporting to the
Traffic Class octet have been superseded by the six-bit DS (Differen- sender the receipt of packets with the CE codepoint set, as required
tiated Services) Field [RFC2474, RFC2780]. Bits 6 and 7 are listed by the transport protocol. Guidelines for the senders and receivers
in [RFC2474] as Currently Unused, and are specified in RFC 2780 as to differentiate between the ECT(0) and ECT(1) codepoints will be
approved for experimental use for ECN. Section 19 gives a brief his- addressed in separate documents, for each transport protocol. In
tory of the TOS octet. particular, this document does not address mechanisms for TCP end-
nodes to differentiate between the ECT(0) and ECT(1) codepoints.
Protocols and senders that only require a single ECT codepoint SHOULD
use ECT(0).
The not-ECT codepoint '00' indicates a packet that is not using ECN.
The CE codepoint '11' is set by a router to indicate congestion to
the end nodes. Routers that have a packet arriving at a full queue
drop the packet, just as they do in the absence of ECN.
+-----+-----+
| ECN FIELD |
+-----+-----+
  ECT   CE     The ECT and CE bits defined in RFC 2481.
   0     0     Not-ECT
   0     1     ECT(1)
   1     0     ECT(0)
   1     1     CE
Figure 1: The ECN Field in IP.
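The four codepoints can be read from, and written to, the former IPv4 TOS octet with simple bit masks. The sketch below assumes the usual IETF bit numbering, in which bit 0 is the most significant bit, so that bits 6 and 7 are the two low-order bits of the octet. The names Ecn, ecn_of, with_ecn, and is_ect are hypothetical and chosen only for this illustration.

```python
from enum import IntEnum


class Ecn(IntEnum):
    """ECN codepoints, written as the two-bit value (ECT bit, CE bit)."""
    NOT_ECT = 0b00
    ECT_1 = 0b01
    ECT_0 = 0b10
    CE = 0b11


ECN_MASK = 0b11  # bits 6 and 7 of the octet, i.e. the two low-order bits


def ecn_of(tos: int) -> Ecn:
    """Extract the ECN codepoint from a TOS / Traffic Class octet."""
    return Ecn(tos & ECN_MASK)


def with_ecn(tos: int, codepoint: Ecn) -> int:
    """Return the octet with its ECN field replaced and its DSCP untouched."""
    return (tos & ~ECN_MASK & 0xFF) | int(codepoint)


def is_ect(codepoint: Ecn) -> bool:
    """True for either ECT codepoint; routers treat the two as equivalent."""
    return codepoint in (Ecn.ECT_0, Ecn.ECT_1)
```

For example, with_ecn(0xB8, Ecn.ECT_0) yields 0xBA, and ecn_of(0xBA) recovers Ecn.ECT_0 while the upper six bits are left unchanged. In IPv4, any change to this octet also requires the header checksum to be updated.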
The use of two ECT codepoints essentially gives a one-bit ECN nonce
in packet headers, and routers necessarily "erase" the nonce when
they set the CE codepoint [SCWA99]. For example, routers that erased
the CE codepoint would face additional difficulty in reconstructing
the original nonce, and thus repeated erasure of the CE codepoint
would be more likely to be detected by the end-nodes. The ECN nonce
also can address the problem of misbehaving transport receivers lying
to the transport sender about whether or not the CE codepoint was set
in a packet. The motivations for the use of two ECT codepoints are
discussed in more detail in Section 20, along with some discussion of
alternate possibilities for the fourth ECN codepoint. Backwards
compatibility with earlier ECN implementations that do not understand
the ECT(1) codepoint is discussed in Section 11.
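The sense in which setting CE "erases" the nonce can be made concrete with a small sketch. Both ECT codepoints map to the same CE codepoint, so the sender's original choice cannot be recovered from a CE packet alone. The probability function below rests on an assumption that is not made in this document: that the sender chooses ECT(0) or ECT(1) uniformly at random for each packet, and that a party erasing CE has no other knowledge of that choice, so each restoration is a guess that succeeds with probability one half. All function names are hypothetical.

```python
NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11


def router_mark(codepoint: int) -> int:
    """Set CE on an ECN-capable packet; ECT(0) and ECT(1) both become CE.

    A not-ECT packet is returned unchanged here; a real router would drop
    it instead of marking it.
    """
    return CE if codepoint in (ECT_0, ECT_1) else codepoint


def undetected_erasure_probability(erasures: int) -> float:
    """Chance that every one of `erasures` guessed nonces is correct.

    Assumes independent, uniformly random one-bit nonces (an assumption
    of this sketch, not a mechanism specified here).
    """
    return 0.5 ** erasures
```

Under these assumptions, ten erasures go unnoticed with probability 1/1024, which is why repeated erasure is the case most likely to be detected.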
In RFC 2481 [RFC2481], the ECN field was divided into the ECN-Capable
Transport (ECT) bit and the CE bit. The ECN field with only the ECN-
Capable Transport (ECT) bit set in RFC 2481 corresponds to the ECT(0)
codepoint in this document, and the ECN field with both the ECT and
CE bits set in RFC 2481 corresponds to the CE codepoint in this document.
The '01' codepoint was left undefined in RFC 2481, and this is the
reason for recommending the use of ECT(0) when only a single ECT
codepoint is needed.
0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
+-----+-----+-----+-----+-----+-----+-----+-----+ +-----+-----+-----+-----+-----+-----+-----+-----+
| DS FIELD | ECN FIELD | | DS FIELD, DSCP | ECN FIELD |
| | |
| DSCP | ECT | CE |
+-----+-----+-----+-----+-----+-----+-----+-----+ +-----+-----+-----+-----+-----+-----+-----+-----+
DSCP: differentiated services codepoint DSCP: differentiated services codepoint
ECN: Explicit Congestion Notification ECN: Explicit Congestion Notification
Figure 1: The Differentiated Services and ECN Fields in IP. Figure 2: The Differentiated Services and ECN Fields in IP.
Bits 6 and 7 in the IPv4 TOS octet are designated as the ECN field.
The IPv4 TOS octet corresponds to the Traffic Class octet in IPv6,
and the ECN field is defined identically in both cases. The
definitions for the IPv4 TOS octet [RFC791] and the IPv6 Traffic
Class octet have been superseded by the six-bit DS (Differentiated
Services) Field [RFC2474, RFC2780]. Bits 6 and 7 are listed in
[RFC2474] as Currently Unused, and are specified in RFC 2780 as
approved for experimental use for ECN. Section 22 gives a brief
history of the TOS octet.
Because of the unstable history of the TOS octet, the use of the ECN Because of the unstable history of the TOS octet, the use of the ECN
field as specified in this document cannot be guaranteed to be back- field as specified in this document cannot be guaranteed to be
wards compatible with all past uses of these two bits. The potential backwards compatible with those past uses of these two bits that pre-
dangers of this lack of backwards compatibility are discussed in Sec- date ECN. The potential dangers of this lack of backwards
tion 19. compatibility are discussed in Section 22.
Upon the receipt by an ECN-Capable transport of a single CE packet, Upon the receipt by an ECN-Capable transport of a single CE packet,
the congestion control algorithms followed at the end-systems MUST be the congestion control algorithms followed at the end-systems MUST be
essentially the same as the congestion control response to a *single* essentially the same as the congestion control response to a *single*
dropped packet. For example, for ECN-Capable TCP the source TCP is dropped packet. For example, for ECN-Capable TCP the source TCP is
required to halve its congestion window for any window of data con- required to halve its congestion window for any window of data
taining either a packet drop or an ECN indication. containing either a packet drop or an ECN indication.
One reason for requiring that the congestion-control response to the One reason for requiring that the congestion-control response to the
CE packet be essentially the same as the response to a dropped packet CE packet be essentially the same as the response to a dropped packet
is to accommodate the incremental deployment of ECN in both end-sys- is to accommodate the incremental deployment of ECN in both end-
tems and in routers. Some routers may drop ECN-Capable packets systems and in routers. Some routers may drop ECN-Capable packets
(e.g., using the same AQM policies for congestion detection) while (e.g., using the same AQM policies for congestion detection) while
other routers set the CE bit, for equivalent levels of congestion. other routers set the CE codepoint, for equivalent levels of
Similarly, a router might drop a non-ECN-Capable packet but set the congestion. Similarly, a router might drop a non-ECN-Capable packet
CE bit in an ECN-Capable packet, for equivalent levels of congestion. but set the CE codepoint in an ECN-Capable packet, for equivalent
If there were different congestion control responses to a CE bit levels of congestion. If there were different congestion control
indication than to a packet drop, this could result in unfair treat- responses to a CE codepoint than to a packet drop, this could result
ment for different flows. in unfair treatment for different flows.
An additional goal is that the end-systems should react to congestion An additional goal is that the end-systems should react to congestion
at most once per window of data (i.e., at most once per round-trip at most once per window of data (i.e., at most once per round-trip
time), to avoid reacting multiple times to multiple indications of time), to avoid reacting multiple times to multiple indications of
congestion within a round-trip time. congestion within a round-trip time.
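The once-per-window goal above, together with the window halving required earlier in this section, can be sketched as follows. This is a minimal illustration rather than the mechanism this document specifies: the class name, the `recover` marker, and the use of sequence numbers to delimit a window of data are all assumptions made for the example.

```python
class WindowedSender:
    """Toy sender that reacts to congestion at most once per window of data."""

    def __init__(self, cwnd, snd_nxt=0):
        self.cwnd = cwnd          # congestion window, in segments
        self.snd_nxt = snd_nxt    # next sequence number to be sent
        self.recover = None       # hypothetical marker: snd_nxt at the last reduction

    def on_congestion_indication(self, ack_no):
        """Handle a packet drop or an ECN indication reported with ack_no.

        Returns True if the window was halved, and False if the indication
        belongs to a window of data that has already been reacted to.
        """
        if self.recover is not None and ack_no <= self.recover:
            # This ACK covers only data sent before the last reduction.
            return False
        self.cwnd = max(self.cwnd // 2, 1)   # same response for a drop or a CE packet
        self.recover = self.snd_nxt
        return True
```

A second indication is ignored until the acknowledgements move past the data that was outstanding when the window was last reduced.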
For a router, the CE bit of an ECN-Capable packet should only be set For a router, the CE codepoint of an ECN-Capable packet SHOULD only
if the router would otherwise have dropped the packet as an indica- be set if the router would otherwise have dropped the packet as an
tion of congestion to the end nodes. When the router's buffer is not indication of congestion to the end nodes. When the router's buffer
yet full and the router is prepared to drop a packet to inform end is not yet full and the router is prepared to drop a packet to inform
nodes of incipient congestion, the router should first check to see end nodes of incipient congestion, the router should first check to
if the ECT bit is set in that packet's IP header. If so, then see if the ECT codepoint is set in that packet's IP header. If so,
instead of dropping the packet, the router MAY instead set the CE bit then instead of dropping the packet, the router MAY instead set the
in the IP header. CE codepoint in the IP header.
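The router decision described above can be sketched as follows. This is a hedged illustration only: the function name and the `mark_instead_of_drop` switch are hypothetical, and the two-bit codepoint values are an assumption taken from the assignment later published in RFC 3168, not from this section.

```python
# Assumed two-bit ECN field values (as later published in RFC 3168).
NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11


def on_incipient_congestion(ecn_field, mark_instead_of_drop=True):
    """Decide what to do with a packet the AQM has selected as a congestion signal.

    The buffer is assumed not to be full. Returns (action, ecn_field), where
    action is "forward" or "drop".
    """
    if mark_instead_of_drop and ecn_field != NOT_ECT:
        # ECN-Capable packet: the router MAY set CE instead of dropping.
        # A packet that already carries CE is forwarded with CE unchanged.
        return ("forward", CE)
    # Not ECN-Capable, or the router chooses not to mark: drop as before.
    return ("drop", None)
```

The switch reflects that marking is a MAY: a router that drops ECN-Capable packets at the same point still gives the end nodes an equivalent signal.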
An environment where all end nodes were ECN-Capable could allow new An environment where all end nodes were ECN-Capable could allow new
criteria to be developed for setting the CE bit, and new congestion criteria to be developed for setting the CE codepoint, and new
control mechanisms for end-node reaction to CE packets. However, congestion control mechanisms for end-node reaction to CE packets.
this is a research issue, and as such is not addressed in this docu- However, this is a research issue, and as such is not addressed in
ment. this document.
When a CE packet (i.e., a packet that has the CE bit set) is received When a CE packet (i.e., a packet that has the CE codepoint set) is
by a router, the CE bit is left unchanged, and the packet is trans- received by a router, the CE codepoint is left unchanged, and the
mitted as usual. When severe congestion has occurred and the router's packet is transmitted as usual. When severe congestion has occurred
queue is full, then the router has no choice but to drop some packet and the router's queue is full, then the router has no choice but to
when a new packet arrives. We anticipate that such packet losses drop some packet when a new packet arrives. We anticipate that such
will become relatively infrequent when a majority of end-systems packet losses will become relatively infrequent when a majority of
become ECN-Capable and participate in TCP or other compatible conges- end-systems become ECN-Capable and participate in TCP or other
tion control mechanisms. In an ECN-Capable environment that is ade- compatible congestion control mechanisms. In an ECN-Capable
quately-provisioned network, packet losses should occur primarily environment that is adequately-provisioned, packet losses should
during transients or in the presence of non-cooperating sources. occur primarily during transients or in the presence of non-
cooperating sources.
We expect that routers will set the CE bit in response to incipient The above discussion of when CE may be set instead of dropping a
congestion as indicated by the average queue size, using the RED packet applies by default to all Differentiated Services Per-Hop
algorithms suggested in [FJ93, RFC2309]. To the best of our knowl- Behaviors (PHBs) [RFC 2475]. Specifications for PHBs MAY provide
edge, this is the only proposal currently under discussion in the more specifics on how a compliant implementation is to choose between
IETF for routers to drop packets proactively, before the buffer over- setting CE and dropping a packet, but this is NOT REQUIRED. A router
flows. However, this document does not attempt to specify a particu- MUST NOT set CE instead of dropping a packet when the drop that would
lar mechanism for active queue management, leaving that endeavor, if occur is caused by reasons other than congestion or the desire to
needed, to other areas of the IETF. While ECN is inextricably tied indicate incipient congestion to end nodes (e.g., a diffserv edge
up with the need to have a reasonable active queue management mecha- node may be configured to unconditionally drop certain classes of
nism at the router, the reverse does not hold; active queue manage- traffic to prevent them from entering its diffserv domain).
ment mechanisms have been developed and deployed independent of ECN,
using packet drops as indications of congestion in the absence of ECN We expect that routers will set the CE codepoint in response to
in the IP architecture. incipient congestion as indicated by the average queue size, using
the RED algorithms suggested in [FJ93, RFC2309]. To the best of our
knowledge, this is the only proposal currently under discussion in
the IETF for routers to drop packets proactively, before the buffer
overflows. However, this document does not attempt to specify a
particular mechanism for active queue management, leaving that
endeavor, if needed, to other areas of the IETF. While ECN is
inextricably tied up with the need to have a reasonable active queue
management mechanism at the router, the reverse does not hold; active
queue management mechanisms have been developed and deployed
independent of ECN, using packet drops as indications of congestion
in the absence of ECN in the IP architecture.
5.1. ECN as an Indication of Persistent Congestion 5.1. ECN as an Indication of Persistent Congestion
We emphasize that a *single* packet with the CE bit set in an IP We emphasize that a *single* packet with the CE codepoint set in an
packet causes the transport layer to respond, in terms of congestion IP packet causes the transport layer to respond, in terms of
control, as it would to a packet drop. The instantaneous queue size congestion control, as it would to a packet drop. The instantaneous
is likely to see considerable variations even when the router does queue size is likely to see considerable variations even when the
not experience persistent congestion. As such, it is important that router does not experience persistent congestion. As such, it is
transient congestion at a router, reflected by the instantaneous important that transient congestion at a router, reflected by the
queue size reaching a threshold much smaller than the capacity of the instantaneous queue size reaching a threshold much smaller than the
queue, not trigger a reaction at the transport layer. Therefore, the capacity of the queue, not trigger a reaction at the transport layer.
CE bit should not be set by a router based on the instantaneous queue Therefore, the CE codepoint should not be set by a router based on
size. the instantaneous queue size.
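To illustrate why the average rather than the instantaneous queue size is the appropriate trigger, the sketch below uses an exponentially weighted moving average of the kind RED [FJ93] maintains. The weight, the threshold, and the function names are assumptions chosen for the example, and the single-threshold test is a simplification of RED's probabilistic marking between two thresholds.

```python
def update_average(avg, sample, weight=0.002):
    """Exponentially weighted moving average of the queue size."""
    return (1.0 - weight) * avg + weight * sample


def should_signal(avg, min_threshold):
    """Simplified test: signal congestion only when the average is high."""
    return avg > min_threshold


MIN_THRESHOLD = 15.0  # packets; illustrative value only

# A single transient burst barely moves the average.
avg_after_burst = update_average(0.0, 100)

# A queue that stays long for many arrivals pulls the average up.
avg_persistent = 0.0
for _ in range(2000):
    avg_persistent = update_average(avg_persistent, 100)
```

With these numbers the burst leaves the average far below the threshold even though the instantaneous queue exceeds it, while the persistent queue drives the average well above it.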
For example, since the ATM and Frame Relay mechanisms for congestion For example, since the ATM and Frame Relay mechanisms for congestion
indication have typically been defined without an associated notion indication have typically been defined without an associated notion
of average queue size as the basis for determining that an intermedi- of average queue size as the basis for determining that an
ate node is congested, we believe that they provide a very noisy sig- intermediate node is congested, we believe that they provide a very
nal. The TCP-sender reaction specified in this document for ECN is noisy signal. The TCP-sender reaction specified in this document for
NOT the appropriate reaction for such a noisy signal of congestion ECN is NOT the appropriate reaction for such a noisy signal of
notification. However, if the routers that interface to the ATM net- congestion notification. However, if the routers that interface to
work have a way of maintaining the average queue at the interface, the ATM network have a way of maintaining the average queue at the
and use it to come to a reliable determination that the ATM subnet is interface, and use it to come to a reliable determination that the
congested, they may use the ECN notification that is defined here. ATM subnet is congested, they may use the ECN notification that is
defined here.
We continue to encourage experiments in techniques at layer 2 (e.g., We continue to encourage experiments in techniques at layer 2 (e.g.,
in ATM switches or Frame Relay switches) to take advantage of ECN. in ATM switches or Frame Relay switches) to take advantage of ECN.
For example, using a scheme such as RED (where packet marking is For example, using a scheme such as RED (where packet marking is
based on the average queue length exceeding a threshold), layer 2 based on the average queue length exceeding a threshold), layer 2
devices could provide a reasonably reliable indication of congestion. devices could provide a reasonably reliable indication of congestion.
When all the layer 2 devices in a path set that layer's own Conges- When all the layer 2 devices in a path set that layer's own
tion Experienced bit (e.g., the EFCI bit for ATM, the FECN bit in Congestion Experienced codepoint (e.g., the EFCI bit for ATM, the
Frame Relay) in this reliable manner, then the interface router to FECN bit in Frame Relay) in this reliable manner, then the interface
the layer 2 network could copy the state of that layer 2 Congestion router to the layer 2 network could copy the state of that layer 2
Experienced bit into the CE bit in the IP header. We recognize that Congestion Experienced codepoint into the CE codepoint in the IP
this is not the current practice, nor is it in current standards. header. We recognize that this is not the current practice, nor is
However, encouraging experimentation in this manner may provide the it in current standards. However, encouraging experimentation in this
information needed to enable evolution of existing layer 2 mechanisms manner may provide the information needed to enable evolution of
to provide a more reliable means of congestion indication, when they existing layer 2 mechanisms to provide a more reliable means of
use a single bit for indicating congestion. congestion indication, when they use a single bit for indicating
congestion.
5.2. Dropped or Corrupted Packets 5.2. Dropped or Corrupted Packets
For the proposed use for ECN in this document (that is, for a trans- For the proposed use for ECN in this document (that is, for a
port protocol such as TCP for which a dropped data packet is an indi- transport protocol such as TCP for which a dropped data packet is an
cation of congestion), end nodes detect dropped data packets, and the indication of congestion), end nodes detect dropped data packets, and
congestion response of the end nodes to a dropped data packet is at the congestion response of the end nodes to a dropped data packet is
least as strong as the congestion response to a received CE packet. at least as strong as the congestion response to a received CE
To ensure the reliable delivery of the congestion indication of the packet. To ensure the reliable delivery of the congestion indication
CE bit, the ECT bit MUST NOT be set in a packet unless the loss of of the CE codepoint, an ECT codepoint MUST NOT be set in a packet
that packet in the network would be detected by the end nodes and unless the loss of that packet in the network would be detected by
interpreted as an indication of congestion. the end nodes and interpreted as an indication of congestion.
Transport protocols such as TCP do not necessarily detect all packet Transport protocols such as TCP do not necessarily detect all packet
drops, such as the drop of a "pure" ACK packet; for example, TCP does drops, such as the drop of a "pure" ACK packet; for example, TCP does
not reduce the arrival rate of subsequent ACK packets in response to not reduce the arrival rate of subsequent ACK packets in response to
an earlier dropped ACK packet. Any proposal for extending ECN-Capa- an earlier dropped ACK packet. Any proposal for extending ECN-
bility to such packets would have to address issues such as the case Capability to such packets would have to address issues such as the
of an ACK packet that was marked with the CE bit but was later case of an ACK packet that was marked with the CE codepoint but was
dropped in the network. We believe that this aspect is still the sub- later dropped in the network. We believe that this aspect is still
ject of research, so this document specifies that at this time, the subject of research, so this document specifies that at this
"pure" ACK packets MUST NOT indicate ECN-Capability. time, "pure" ACK packets MUST NOT indicate ECN-Capability.
Similarly, if a CE packet is dropped later in the network due to cor- Similarly, if a CE packet is dropped later in the network due to
ruption (bit errors), the end nodes should still invoke congestion corruption (bit errors), the end nodes should still invoke congestion
control, just as TCP would today in response to a dropped data control, just as TCP would today in response to a dropped data
packet. This issue of corrupted CE packets would have to be consid- packet. This issue of corrupted CE packets would have to be
ered in any proposal for the network to distinguish between packets considered in any proposal for the network to distinguish between
dropped due to corruption, and packets dropped due to congestion or packets dropped due to corruption, and packets dropped due to
buffer overflow. In particular, the ubiquitous deployment of ECN congestion or buffer overflow. In particular, the ubiquitous
would not, in and of itself, be a sufficient development to allow deployment of ECN would not, in and of itself, be a sufficient
end-nodes to interpret packet drops as indications of corruption development to allow end-nodes to interpret packet drops as
rather than congestion. indications of corruption rather than congestion.
5.3. Fragmentation
All ECN-capable packets SHOULD have the DF (Don't Fragment) bit set.
Reassembly of a fragmented packet MUST NOT lose indications of
congestion. In other words, if any fragment of an IP packet to be
reassembled has the CE codepoint set, then one of two actions MUST be
taken:
* The reassembled packet has the CE codepoint set. This MUST NOT
occur if any of the other fragments contributing to this
reassembly carries the Not-ECT codepoint.
* The packet is dropped instead of being reassembled.
If both actions are applicable, either MAY be chosen. Reassembly of
a fragmented packet MUST NOT change the ECN codepoint when all of the
fragments carry the same codepoint.
Situations may arise in which the above specification is
insufficiently precise. For example, it does not place requirements
on reassembly of fragments that carry a mixture of ECT(0), ECT(1)
and/or Not-ECT. In situations where more precise reassembly behavior
would be required, protocol specifications SHOULD instead specify
that DF MUST be set in all packets sent by the protocol.
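The reassembly rules above can be sketched as a small function. This is an illustration under stated assumptions: the function name and return convention are hypothetical, the codepoint values are those later published in RFC 3168, and where either permitted action applies the sketch arbitrarily chooses to deliver the packet with CE set.

```python
# Assumed two-bit ECN field values (as later published in RFC 3168).
NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11


def reassemble_ecn(fragment_codepoints):
    """Return (action, codepoint) for a packet reassembled from fragments.

    action is "deliver", "drop", or "unspecified" (the text places no
    requirement on mixtures of ECT(0), ECT(1) and/or Not-ECT without CE).
    """
    seen = set(fragment_codepoints)
    if not seen:
        raise ValueError("at least one fragment is required")
    if len(seen) == 1:
        # All fragments agree: the codepoint MUST NOT change.
        return ("deliver", seen.pop())
    if CE in seen:
        if NOT_ECT in seen:
            # Setting CE is not allowed here, so the packet is dropped.
            return ("drop", None)
        # Either action is permitted; this sketch keeps the packet.
        return ("deliver", CE)
    return ("unspecified", None)
```

Returning "unspecified" rather than guessing mirrors the text, which recommends setting DF when more precise behavior is needed.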
6. Support from the Transport Protocol 6. Support from the Transport Protocol
ECN requires support from the transport protocol, in addition to the ECN requires support from the transport protocol, in addition to the
functionality given by the ECN field in the IP packet header. The functionality given by the ECN field in the IP packet header. The
transport protocol might require negotiation between the endpoints transport protocol might require negotiation between the endpoints
during setup to determine that all of the endpoints are ECN-capable, during setup to determine that all of the endpoints are ECN-capable,
so that the sender can set the ECT bit in transmitted packets. Sec- so that the sender can set the ECT codepoint in transmitted packets.
ond, the transport protocol must be capable of reacting appropriately Second, the transport protocol must be capable of reacting
to the receipt of CE packets. This reaction could be in the form of appropriately to the receipt of CE packets. This reaction could be
the data receiver informing the data sender of the received CE packet in the form of the data receiver informing the data sender of the
(e.g., TCP), of the data receiver unsubscribing to a layered multi- received CE packet (e.g., TCP), of the data receiver unsubscribing to
cast group (e.g., RLM [MJV96]), or of some other action that ulti- a layered multicast group (e.g., RLM [MJV96]), or of some other
mately reduces the arrival rate of that flow on that congested link. action that ultimately reduces the arrival rate of that flow on that
congested link. CE packets indicate persistent rather than transient
congestion (see Section 5.1), and hence reactions to the receipt of
CE packets should be those appropriate for persistent congestion.
This document only addresses the addition of ECN Capability to TCP, This document only addresses the addition of ECN Capability to TCP,
leaving issues of ECN in other transport protocols to further leaving issues of ECN in other transport protocols to further
research. For TCP, ECN requires three new pieces of functionality: research. For TCP, ECN requires three new pieces of functionality:
negotiation between the endpoints during connection setup to deter- negotiation between the endpoints during connection setup to
mine if they are both ECN-capable; an ECN-Echo (ECE) flag in the TCP determine if they are both ECN-capable; an ECN-Echo (ECE) flag in the
header so that the data receiver can inform the data sender when a CE TCP header so that the data receiver can inform the data sender when
packet has been received; and a Congestion Window Reduced (CWR) flag a CE packet has been received; and a Congestion Window Reduced (CWR)
in the TCP header so that the data sender can inform the data flag in the TCP header so that the data sender can inform the data
receiver that the congestion window has been reduced. The support receiver that the congestion window has been reduced. The support
required from other transport protocols is likely to be different, required from other transport protocols is likely to be different,
particularly for unreliable or reliable multicast transport proto- particularly for unreliable or reliable multicast transport
cols, and will have to be determined as other transport protocols are protocols, and will have to be determined as other transport
brought to the IETF for standardization. protocols are brought to the IETF for standardization.
6.1. TCP 6.1. TCP
The following sections describe in detail the proposed use of ECN in The following sections describe in detail the proposed use of ECN in
TCP. This proposal is described in essentially the same form in TCP. This proposal is described in essentially the same form in
[Floyd94]. We assume that the source TCP uses the standard congestion [Floyd94]. We assume that the source TCP uses the standard congestion
control algorithms of Slow-start, Fast Retransmit and Fast Recovery control algorithms of Slow-start, Fast Retransmit and Fast Recovery
[RFC 2001]. [RFC 2001].
This proposal specifies two new flags in the Reserved field of the This proposal specifies two new flags in the Reserved field of the
TCP header. The TCP mechanism for negotiating ECN-Capability uses TCP header. The TCP mechanism for negotiating ECN-Capability uses
the ECN-Echo (ECE) flag in the TCP header. Bit 9 in the Reserved the ECN-Echo (ECE) flag in the TCP header. Bit 9 in the Reserved
field of the TCP header is designated as the ECN-Echo flag. The field of the TCP header is designated as the ECN-Echo flag. The
location of the 6-bit Reserved field in the TCP header is shown in location of the 6-bit Reserved field in the TCP header is shown in
Figure 3 of RFC 793 [RFC793] (and is reproduced below for complete- Figure 4 of RFC 793 [RFC793] (and is reproduced below for
ness). This specification of the ECN Field leaves the Reserved field completeness). This specification of the ECN Field leaves the
as a 4-bit field using bits 4-7. Reserved field as a 4-bit field using bits 4-7.
To enable the TCP receiver to determine when to stop setting the ECN- To enable the TCP receiver to determine when to stop setting the ECN-
Echo flag, we introduce a second new flag in the TCP header, the CWR Echo flag, we introduce a second new flag in the TCP header, the CWR
flag. The CWR flag is assigned to Bit 8 in the Reserved field of the flag. The CWR flag is assigned to Bit 8 in the Reserved field of the
TCP header. TCP header.
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+ +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
| | | U | A | P | R | S | F | | | | U | A | P | R | S | F |
| Header Length | Reserved | R | C | S | S | Y | I | | Header Length | Reserved | R | C | S | S | Y | I |
| | | G | K | H | T | N | N | | | | G | K | H | T | N | N |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+ +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
Figure 2: The old definition of bytes 13 and 14 of the TCP Figure 3: The old definition of bytes 13 and 14 of the TCP
header. header.
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+ +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
| | | C | E | U | A | P | R | S | F | | | | C | E | U | A | P | R | S | F |
| Header Length | Reserved | W | C | R | C | S | S | Y | I | | Header Length | Reserved | W | C | R | C | S | S | Y | I |
| | | R | E | G | K | H | T | N | N | | | | R | E | G | K | H | T | N | N |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+ +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
Figure 3: The new definition of bytes 13 and 14 of the TCP Figure 4: The new definition of bytes 13 and 14 of the TCP
Header. Header.
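As an illustration of the new layout, the sketch below extracts the CWR and ECE flags from bytes 13 and 14 of a TCP header. The function name and the returned dictionary are hypothetical. Bit 0 is taken to be the most significant bit of the 16-bit word, as in the figure, so bit 8 (CWR) corresponds to mask 0x0080 and bit 9 (ECE) to mask 0x0040.

```python
import struct

CWR_MASK = 1 << (15 - 8)   # bit 8 of the 16-bit word -> 0x0080
ECE_MASK = 1 << (15 - 9)   # bit 9 of the 16-bit word -> 0x0040
ACK_MASK = 1 << (15 - 11)  # bit 11 -> 0x0010
SYN_MASK = 1 << (15 - 14)  # bit 14 -> 0x0002


def parse_tcp_flag_word(tcp_header):
    """Decode bytes 13 and 14 (offsets 12 and 13) of a raw TCP header."""
    (word,) = struct.unpack_from("!H", tcp_header, 12)
    return {
        "header_length": word >> 12,       # bits 0-3, in 32-bit words
        "reserved": (word >> 8) & 0x0F,    # bits 4-7, the remaining Reserved field
        "cwr": bool(word & CWR_MASK),
        "ece": bool(word & ECE_MASK),
        "ack": bool(word & ACK_MASK),
        "syn": bool(word & SYN_MASK),
    }
```

For example, a 20-byte header whose bytes 13 and 14 are 0x50 and 0xC2 decodes as a SYN with both ECE and CWR set.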
Thus, ECN uses the ECT and CE flags in the IP header (as shown in Thus, ECN uses the ECT and CE flags in the IP header (as shown in
Figure 1) for signaling between routers and connection endpoints, and Figure 1) for signaling between routers and connection endpoints, and
uses the ECN-Echo and CWR flags in the TCP header (as shown in Figure uses the ECN-Echo and CWR flags in the TCP header (as shown in Figure
3) for TCP-endpoint to TCP-endpoint signaling. For a TCP connection, 4) for TCP-endpoint to TCP-endpoint signaling. For a TCP connection,
a typical sequence of events in an ECN-based reaction to congestion a typical sequence of events in an ECN-based reaction to congestion
is as follows: is as follows:
* The ECT bit is set in packets transmitted by the sender to indi- * An ECT codepoint is set in packets transmitted by the sender to
cate that ECN is supported by the transport entities for these indicate that ECN is supported by the transport entities for these
packets. packets.
* An ECN-capable router detects impending congestion and detects * An ECN-capable router detects impending congestion and detects
that the ECT bit is set in the packet it is about to drop. that an ECT codepoint is set in the packet it is about to drop.
Instead of dropping the packet, the router chooses to set the CE Instead of dropping the packet, the router chooses to set the CE
bit in the IP header and forwards the packet. codepoint in the IP header and forwards the packet.
* The receiver receives the packet with the CE bit set, and sets * The receiver receives the packet with the CE codepoint set, and
the ECN-Echo flag in its next TCP ACK sent to the sender. sets the ECN-Echo flag in its next TCP ACK sent to the sender.
* The sender receives the TCP ACK with ECN-Echo set, and reacts to * The sender receives the TCP ACK with ECN-Echo set, and reacts to
the congestion as if a packet had been dropped. the congestion as if a packet had been dropped.
* The sender sets the CWR flag in the TCP header of the next * The sender sets the CWR flag in the TCP header of the next
packet sent to the receiver to acknowledge its receipt of and packet sent to the receiver to acknowledge its receipt of and
reaction to the ECN-Echo flag. reaction to the ECN-Echo flag.
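The receiver's side of the sequence above can be sketched as a two-state machine. This is an assumption-laden illustration: the class name is hypothetical, and the rule that the receiver keeps setting ECN-Echo until it sees CWR is inferred from the stated purpose of the CWR flag rather than specified at this point in the text. The order in which CWR and CE are handled on the same packet is likewise an assumption.

```python
class EcnReceiver:
    """Toy data receiver that decides whether to set ECN-Echo on each ACK."""

    def __init__(self):
        self.echoing = False

    def on_data_packet(self, ce, cwr):
        """Process one data packet; return the ECN-Echo value for the ACK."""
        if cwr:
            # The sender has reduced its window: stop echoing.
            self.echoing = False
        if ce:
            # A new congestion indication arrived: start (or keep) echoing.
            self.echoing = True
        return self.echoing
```

Repeating ECN-Echo on successive ACKs protects against the loss of a single ACK, at the cost of needing CWR to end the repetition.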
The negotiation for using ECN by the TCP transport entities and the The negotiation for using ECN by the TCP transport entities and the
use of the ECN-Echo and CWR flags is described in more detail in the use of the ECN-Echo and CWR flags is described in more detail in the
sections below. sections below.
6.1.1 TCP Initialization 6.1.1 TCP Initialization
In the TCP connection setup phase, the source and destination TCPs In the TCP connection setup phase, the source and destination TCPs
exchange information about their willingness to use ECN. Subsequent exchange information about their willingness to use ECN. Subsequent
to the completion of this negotiation, the TCP sender sets the ECT to the completion of this negotiation, the TCP sender sets an ECT
bit in the IP header of data packets to indicate to the network that codepoint in the IP header of data packets to indicate to the network
the transport is capable and willing to participate in ECN for this that the transport is capable and willing to participate in ECN for
packet. This indicates to the routers that they may mark this packet this packet. This indicates to the routers that they may mark this
with the CE bit, if they would like to use that as a method of con- packet with the CE codepoint, if they would like to use that as a
gestion notification. If the TCP connection does not wish to use ECN method of congestion notification. If the TCP connection does not
notification for a particular packet, the sending TCP sets the ECT wish to use ECN notification for a particular packet, the sending TCP
bit equal to 0 (i.e., not set), and the TCP receiver ignores the CE sets the ECN codepoint to not-ECT, and the TCP receiver ignores the
bit in the received packet. CE codepoint in the received packet.
For this discussion, we designate the initiating host as Host A and For this discussion, we designate the initiating host as Host A and
the responding host as Host B. We call a SYN packet with the ECE and the responding host as Host B. We call a SYN packet with the ECE and
CWR flags set an "ECN-setup SYN packet", and we call a SYN packet CWR flags set an "ECN-setup SYN packet", and we call a SYN packet
with at least one of the ECE and CWR flags not set a "non-ECN-setup with at least one of the ECE and CWR flags not set a "non-ECN-setup
SYN packet". Similarly, we call a SYN-ACK packet with only the ECE SYN packet". Similarly, we call a SYN-ACK packet with only the ECE
flag set but the CWR flag not set an "ECN-setup SYN-ACK packet", and flag set but the CWR flag not set an "ECN-setup SYN-ACK packet", and
we call a SYN-ACK packet with any other configuration of the ECE and we call a SYN-ACK packet with any other configuration of the ECE and
CWR flags a "non-ECN-setup SYN-ACK packet". CWR flags a "non-ECN-setup SYN-ACK packet".
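The four packet names defined above depend only on the SYN, ACK, ECE and CWR flags, which the following sketch makes explicit. The function name and the string labels are hypothetical.

```python
def classify_setup_packet(syn, ack, ece, cwr):
    """Name a SYN or SYN-ACK packet according to its ECE and CWR flags."""
    if not syn:
        raise ValueError("only SYN and SYN-ACK packets are classified")
    if ack:
        # SYN-ACK: ECN-setup means ECE set and CWR not set.
        if ece and not cwr:
            return "ECN-setup SYN-ACK"
        return "non-ECN-setup SYN-ACK"
    # SYN: ECN-setup means both ECE and CWR set.
    if ece and cwr:
        return "ECN-setup SYN"
    return "non-ECN-setup SYN"
```

Note the asymmetry: a SYN-ACK with both flags set is a non-ECN-setup SYN-ACK, even though the same flag combination on a SYN is an ECN-setup SYN.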
Before a TCP connection can use ECN, Host A sends an ECN-setup SYN Before a TCP connection can use ECN, Host A sends an ECN-setup SYN
packet, and Host B sends an ECN-setup SYN-ACK packet. For a SYN packet, and Host B sends an ECN-setup SYN-ACK packet. For a SYN
packet, the setting of both ECE and CWR in the ECN-setup SYN packet packet, the setting of both ECE and CWR in the ECN-setup SYN packet
is defined as an indication that the sending TCP is ECN-Capable, is defined as an indication that the sending TCP is ECN-Capable,
rather than as an indication of congestion or of response to conges- rather than as an indication of congestion or of response to
tion. More precisely, an ECN-setup SYN packet indicates that the TCP congestion. More precisely, an ECN-setup SYN packet indicates that
implementation transmitting the SYN packet will participate in ECN as the TCP implementation transmitting the SYN packet will participate
both a sender and receiver. Specifically, as a receiver, it will in ECN as both a sender and receiver. Specifically, as a receiver,
respond to incoming data packets that have the CE bit set in the IP it will respond to incoming data packets that have the CE codepoint
header by setting ECE in outgoing TCP Acknowledgement (ACK) packets. set in the IP header by setting ECE in outgoing TCP Acknowledgement
As a sender, it will respond to incoming packets that have ECE set by (ACK) packets. As a sender, it will respond to incoming packets that
reducing the congestion window and setting CWR when appropriate. An have ECE set by reducing the congestion window and setting CWR when
ECN-setup SYN packet does not commit the TCP sender to setting the appropriate. An ECN-setup SYN packet does not commit the TCP sender
ECT bit in any or all of the packets it may transmit. However, the to setting the ECT codepoint in any or all of the packets it may
commitment to respond appropriately to incoming packets with the CE transmit. However, the commitment to respond appropriately to
bit set remains even if the TCP sender in a later transmission, incoming packets with the CE codepoint set remains even if the TCP
within this TCP connection, sends a SYN packet without ECE and CWR sender in a later transmission, within this TCP connection, sends a
set. SYN packet without ECE and CWR set.
When Host B sends an ECN-setup SYN-ACK packet, it sets the ECE flag When Host B sends an ECN-setup SYN-ACK packet, it sets the ECE flag
but not the CWR flag. An ECN-setup SYN-ACK packet is defined as an but not the CWR flag. An ECN-setup SYN-ACK packet is defined as an
indication that the TCP transmitting the SYN-ACK packet is ECN-Capa- indication that the TCP transmitting the SYN-ACK packet is ECN-
ble. As with the SYN packet, an ECN-setup SYN-ACK packet does not Capable. As with the SYN packet, an ECN-setup SYN-ACK packet does
commit the TCP host to setting the ECT bit in transmitted packets. not commit the TCP host to setting the ECT codepoint in transmitted
packets.
The following rules apply to the sending of ECN-setup packets: The following rules apply to the sending of ECN-setup packets:
* If a host has received an ECN-setup SYN packet, then it MAY send an * If a host has received an ECN-setup SYN packet, then it MAY send an
ECN-setup SYN-ACK packet. Otherwise, it MUST NOT send an ECN-setup ECN-setup SYN-ACK packet. Otherwise, it MUST NOT send an ECN-setup
SYN-ACK packet. SYN-ACK packet.
* A host MUST NOT set ECT on data packets unless it has sent at least * A host MUST NOT set ECT on data packets unless it has sent at least
one ECN-setup SYN or ECN-setup SYN-ACK packet, and has received at one ECN-setup SYN or ECN-setup SYN-ACK packet, and has received at
least one ECN-setup SYN or ECN-setup SYN-ACK packet, and has sent no least one ECN-setup SYN or ECN-setup SYN-ACK packet, and has sent no
non-ECN-setup SYN or non-ECN-setup SYN-ACK packet. If a host has non-ECN-setup SYN or non-ECN-setup SYN-ACK packet. If a host has
received at least one non-ECN-setup SYN or non-ECN-setup SYN-ACK received at least one non-ECN-setup SYN or non-ECN-setup SYN-ACK
packet, then it SHOULD NOT set ECT on data packets. packet, then it SHOULD NOT set ECT on data packets.
* If a host ever sets the ECT bit on a data packet, then that host * If a host ever sets the ECT codepoint on a data packet, then that
MUST correctly set/clear the CWR TCP bit on all subsequent packets in host MUST correctly set/clear the CWR TCP bit on all subsequent
the connection. packets in the connection.
* If a host has sent at least one ECN-setup SYN or ECN-setup SYN-ACK * If a host has sent at least one ECN-setup SYN or ECN-setup SYN-ACK
packet, and has received no non-ECN-setup SYN or non-ECN-setup SYN- packet, and has received no non-ECN-setup SYN or non-ECN-setup SYN-
ACK packet, then if that host receives TCP data packets with ECT and ACK packet, then if that host receives TCP data packets with ECT and
CE bits set in the IP header, then that host MUST process these pack- CE codepoints set in the IP header, then that host MUST process these
ets as specified for an ECN-capable connection. * A host that is not packets as specified for an ECN-capable connection.
willing to use ECN on a TCP connection SHOULD clear both the ECE and * A host that is not willing to use ECN on a TCP connection SHOULD
CWR flags in all non-ECN-setup SYN and/or SYN-ACK packets that it clear both the ECE and CWR flags in all non-ECN-setup SYN and/or SYN-
sends to indicate this unwillingness. Receivers MUST correctly han- ACK packets that it sends to indicate this unwillingness. Receivers
dle all forms of the non-ECN-setup SYN and SYN-ACK packets. MUST correctly handle all forms of the non-ECN-setup SYN and SYN-ACK
packets.
* A host MUST NOT set ECT on SYN or SYN-ACK packets.
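As an illustration only, the rules above can be sketched as a small per-connection bookkeeping class. The flag constants follow the usual layout of the TCP flags byte; the class name, field names, and method names are hypothetical and are not defined by this document. The sketch takes the conservative reading that a host which has received any non-ECN-setup SYN or SYN-ACK does not set ECT at all (the text says SHOULD NOT).

```python
from dataclasses import dataclass

# TCP flag bits as they appear in the flags byte of the TCP header.
SYN, ACK, ECE, CWR = 0x02, 0x10, 0x40, 0x80
_MASK = SYN | ACK | ECE | CWR


def is_ecn_setup_syn(flags: int) -> bool:
    """A SYN (no ACK) with both ECE and CWR set."""
    return (flags & _MASK) == (SYN | ECE | CWR)


def is_ecn_setup_syn_ack(flags: int) -> bool:
    """A SYN-ACK with ECE set and CWR clear."""
    return (flags & _MASK) == (SYN | ACK | ECE)


def _is_setup(flags: int) -> bool:
    return is_ecn_setup_syn(flags) or is_ecn_setup_syn_ack(flags)


@dataclass
class EcnHandshake:
    """Hypothetical record of the SYN / SYN-ACK packets seen on one connection."""
    sent_setup: bool = False
    sent_non_setup: bool = False
    rcvd_setup: bool = False
    rcvd_non_setup: bool = False

    def on_send(self, flags: int) -> None:
        if flags & SYN:  # only SYN and SYN-ACK packets matter here
            if _is_setup(flags):
                self.sent_setup = True
            else:
                self.sent_non_setup = True

    def on_receive(self, flags: int) -> None:
        if flags & SYN:
            if _is_setup(flags):
                self.rcvd_setup = True
            else:
                self.rcvd_non_setup = True

    def may_set_ect_on_data(self) -> bool:
        # MUST NOT unless: sent a setup packet, received a setup packet,
        # and sent no non-setup packet.  The last term is the SHOULD NOT
        # rule, applied strictly here as an assumption of this sketch.
        return (self.sent_setup and self.rcvd_setup
                and not self.sent_non_setup
                and not self.rcvd_non_setup)
```

A SYN-ACK that carries both ECE and CWR is not an ECN-setup SYN-ACK under these definitions, so the initiator in this sketch would not set ECT after receiving one.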
6.1.1.1. Robust TCP Initialization with an Echoed Reserve Field 6.1.1.1. Robust TCP Initialization with an Echoed Reserve Field
There is the question of why we chose to have the TCP sending the SYN There is the question of why we chose to have the TCP sending the SYN
set two ECN-related flags in the Reserved field of the TCP header for set two ECN-related flags in the Reserved field of the TCP header for
the SYN packet, while the responding TCP sending the SYN-ACK sets the SYN packet, while the responding TCP sending the SYN-ACK sets
only one ECN-related flag in the SYN-ACK packet. This asymmetry is only one ECN-related flag in the SYN-ACK packet. This asymmetry is
necessary for the robust negotiation of ECN-capability with some necessary for the robust negotiation of ECN-capability with some
deployed TCP implementations. There exists at least one faulty TCP deployed TCP implementations. There exists at least one faulty TCP
implementation in which TCP receivers set the Reserved field of the implementation in which TCP receivers set the Reserved field of the
TCP header in ACK packets (and hence the SYN-ACK) simply to reflect TCP header in ACK packets (and hence the SYN-ACK) simply to reflect
the Reserved field of the TCP header in the received data packet. the Reserved field of the TCP header in the received data packet.
Because the TCP SYN packet sets the ECN-Echo and CWR flags to indi- Because the TCP SYN packet sets the ECN-Echo and CWR flags to
cate ECN-capability, while the SYN-ACK packet sets only the ECN-Echo indicate ECN-capability, while the SYN-ACK packet sets only the ECN-
flag, the sending TCP correctly interprets a receiver's reflection of Echo flag, the sending TCP correctly interprets a receiver's
its own flags in the Reserved field as an indication that the reflection of its own flags in the Reserved field as an indication
receiver is not ECN-capable. The sending TCP is not misled by a that the receiver is not ECN-capable. The sending TCP is not misled
faulty TCP implementation sending a SYN-ACK packet that simply by a faulty TCP implementation sending a SYN-ACK packet that simply
reflects the Reserved field of the incoming SYN packet. reflects the Reserved field of the incoming SYN packet.
6.1.2. The TCP Sender 6.1.2. The TCP Sender
For a TCP connection using ECN, new data packets are transmitted with For a TCP connection using ECN, new data packets are transmitted with
the ECT bit set in the IP header (set to a "1"). If the sender an ECT codepoint set in the IP header. When only one ECT codepoint
receives an ECN-Echo (ECE) ACK packet (that is, an ACK packet with is needed by a sender for all packets sent on a TCP connection,
the ECN-Echo flag set in the TCP header), then the sender knows that ECT(0) SHOULD be used. If the sender receives an ECN-Echo (ECE) ACK
congestion was encountered in the network on the path from the sender packet (that is, an ACK packet with the ECN-Echo flag set in the TCP
to the receiver. The indication of congestion should be treated just header), then the sender knows that congestion was encountered in the
as a congestion loss in non-ECN-Capable TCP. That is, the TCP source network on the path from the sender to the receiver. The indication
halves the congestion window "cwnd" and reduces the slow start of congestion should be treated just as a congestion loss in non-ECN-
threshold "ssthresh". The sending TCP SHOULD NOT increase the con- Capable TCP. That is, the TCP source halves the congestion window
gestion window in response to the receipt of an ECN-Echo ACK packet. "cwnd" and reduces the slow start threshold "ssthresh". The sending
TCP SHOULD NOT increase the congestion window in response to the
receipt of an ECN-Echo ACK packet.
TCP should not react to congestion indications more than once every TCP should not react to congestion indications more than once every
window of data (or more loosely, more than once every round-trip window of data (or more loosely, more than once every round-trip
time). That is, the TCP sender's congestion window should be reduced time). That is, the TCP sender's congestion window should be reduced
only once in response to a series of dropped and/or CE packets from a only once in response to a series of dropped and/or CE packets from a
single window of data. In addition, the TCP source should not single window of data. In addition, the TCP source should not
decrease the slow-start threshold, ssthresh, if it has been decreased decrease the slow-start threshold, ssthresh, if it has been decreased
within the last round trip time. However, if any retransmitted pack- within the last round trip time. However, if any retransmitted
ets are dropped, then this is interpreted by the source TCP as a new packets are dropped, then this is interpreted by the source TCP as a
instance of congestion. new instance of congestion.
After the source TCP reduces its congestion window in response to a After the source TCP reduces its congestion window in response to a
CE packet, incoming acknowledgements that continue to arrive can CE packet, incoming acknowledgements that continue to arrive can
"clock out" outgoing packets as allowed by the reduced congestion "clock out" outgoing packets as allowed by the reduced congestion
window. If the congestion window consists of only one MSS (maximum window. If the congestion window consists of only one MSS (maximum
segment size), and the sending TCP receives an ECN-Echo ACK packet, segment size), and the sending TCP receives an ECN-Echo ACK packet,
then the sending TCP should in principle still reduce its congestion then the sending TCP should in principle still reduce its congestion
window in half. However, the value of the congestion window is window in half. However, the value of the congestion window is
bounded below by a value of one MSS. If the sending TCP were to con- bounded below by a value of one MSS. If the sending TCP were to
tinue to send, using a congestion window of 1 MSS, this results in continue to send, using a congestion window of 1 MSS, this results in
the transmission of one packet per round-trip time. It is necessary the transmission of one packet per round-trip time. It is necessary
to still reduce the sending rate of the TCP sender even further, on to still reduce the sending rate of the TCP sender even further, on
receipt of an ECN-Echo packet when the congestion window is one. We receipt of an ECN-Echo packet when the congestion window is one. We
use the retransmit timer as a means of reducing the rate further in use the retransmit timer as a means of reducing the rate further in
this circumstance. Therefore, the sending TCP MUST reset the this circumstance. Therefore, the sending TCP MUST reset the
retransmit timer on receiving the ECN-Echo packet when the congestion retransmit timer on receiving the ECN-Echo packet when the congestion
window is one. The sending TCP will then be able to send a new window is one. The sending TCP will then be able to send a new
packet only when the retransmit timer expires. packet only when the retransmit timer expires.
When an ECN-Capable TCP sender reduces its congestion window for any When an ECN-Capable TCP sender reduces its congestion window for any
reason (because of a retransmit timeout, a Fast Retransmit, or in reason (because of a retransmit timeout, a Fast Retransmit, or in
response to an ECN Notification), the TCP sender sets the CWR flag in response to an ECN Notification), the TCP sender sets the CWR flag in
the TCP header of the first new data packet sent after the window the TCP header of the first new data packet sent after the window
reduction. If that data packet is dropped in the network, then the reduction. If that data packet is dropped in the network, then the
sending TCP will have to reduce the congestion window again and sending TCP will have to reduce the congestion window again and
retransmit the dropped packet. retransmit the dropped packet.
We ensure that the "Congestion Window Reduced" information is reli- We ensure that the "Congestion Window Reduced" information is
ably delivered to the TCP receiver. This comes about from the fact reliably delivered to the TCP receiver. This comes about from the
that if the new data packet carrying the CWR flag is dropped, then fact that if the new data packet carrying the CWR flag is dropped,
the TCP sender will have to again reduce its congestion window, and then the TCP sender will have to again reduce its congestion window,
send another new data packet with the CWR flag set. Thus, the CWR and send another new data packet with the CWR flag set. Thus, the
bit in the TCP header SHOULD NOT be set on retransmitted packets. CWR bit in the TCP header SHOULD NOT be set on retransmitted packets.
When the TCP data sender is ready to set the CWR bit after reducing When the TCP data sender is ready to set the CWR bit after reducing
the congestion window, it SHOULD set the CWR bit only on the first the congestion window, it SHOULD set the CWR bit only on the first
new data packet that it transmits. new data packet that it transmits.
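The sender behaviour described in this section might be sketched as follows. This is a simplified model, not a definitive implementation: the class and its fields are hypothetical, sequence-number wraparound is ignored, "once per window of data" is approximated by remembering the highest sequence number sent at the time of the last reduction, and ssthresh is simply set to the reduced window. Setting CWR in the one-MSS case is also an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EcnSender:
    """Hypothetical, simplified sender state (byte counts, no wraparound)."""
    mss: int
    cwnd: int
    ssthresh: int
    snd_nxt: int = 0                     # next new sequence number to send
    reduce_point: Optional[int] = None   # snd_nxt at the last reduction
    cwr_pending: bool = False            # CWR owed on the next new data packet
    timer_resets: int = 0                # stand-in for resetting the RTO timer

    def on_ack(self, ack_seq: int, ece: bool) -> None:
        if not ece:
            return
        # React at most once per window: ignore ECN-Echo on ACKs for data
        # that was already outstanding at the previous reduction.
        if self.reduce_point is not None and ack_seq <= self.reduce_point:
            return
        if self.cwnd <= self.mss:
            # The window cannot go below one MSS, so slow down further by
            # resetting the retransmit timer instead.
            self.timer_resets += 1
        else:
            self.cwnd = max(self.cwnd // 2, self.mss)
            self.ssthresh = self.cwnd    # assumption: ssthresh follows cwnd
        self.reduce_point = self.snd_nxt
        self.cwr_pending = True          # assumption in the one-MSS case

    def cwr_for_next_packet(self, retransmission: bool) -> bool:
        """Return True if the packet about to be sent should carry CWR."""
        if retransmission:
            return False                 # CWR is not set on retransmissions
        cwr, self.cwr_pending = self.cwr_pending, False
        return cwr
```

In this model a second ECN-Echo ACK for the same window leaves cwnd unchanged, and the CWR flag is consumed by the first new data packet only.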
[Floyd94] discusses TCP's response to ECN in more detail. [Floyd98] [Floyd94] discusses TCP's response to ECN in more detail. [Floyd98]
discusses the validation test in the ns simulator, which illustrates discusses the validation test in the ns simulator, which illustrates
a wide range of ECN scenarios. These scenarios include the following: a wide range of ECN scenarios. These scenarios include the following:
an ECN followed by another ECN, a Fast Retransmit, or a Retransmit an ECN followed by another ECN, a Fast Retransmit, or a Retransmit
Timeout; a Retransmit Timeout or a Fast Retransmit followed by an Timeout; a Retransmit Timeout or a Fast Retransmit followed by an
ECN; and a congestion window of one packet followed by an ECN. ECN; and a congestion window of one packet followed by an ECN.
skipping to change at page 16, line 40 skipping to change at page 18, line 44
increasing the congestion window when it receives ACK packets without increasing the congestion window when it receives ACK packets without
the ECN-Echo bit set [RFC2581]. the ECN-Echo bit set [RFC2581].
6.1.3. The TCP Receiver 6.1.3. The TCP Receiver
When TCP receives a CE data packet at the destination end-system, the When TCP receives a CE data packet at the destination end-system, the
TCP data receiver sets the ECN-Echo flag in the TCP header of the TCP data receiver sets the ECN-Echo flag in the TCP header of the
subsequent ACK packet. If there is any ACK withholding implemented, subsequent ACK packet. If there is any ACK withholding implemented,
as in current "delayed-ACK" TCP implementations where the TCP as in current "delayed-ACK" TCP implementations where the TCP
receiver can send an ACK for two arriving data packets, then the ECN- receiver can send an ACK for two arriving data packets, then the ECN-
Echo flag in the ACK packet will be set to the OR of the CE bits of Echo flag in the ACK packet will be set to '1' if the CE codepoint is
all of the data packets being acknowledged. That is, if any of the set in any of the data packets being acknowledged. That is, if any
received data packets are CE packets, then the returning ACK has the of the received data packets are CE packets, then the returning ACK
ECN-Echo flag set. has the ECN-Echo flag set.
To provide robustness against the possibility of a dropped ACK packet To provide robustness against the possibility of a dropped ACK packet
carrying an ECN-Echo flag, the TCP receiver sets the ECN-Echo flag in carrying an ECN-Echo flag, the TCP receiver sets the ECN-Echo flag in
a series of ACK packets sent subsequently. The TCP receiver uses the a series of ACK packets sent subsequently. The TCP receiver uses the
CWR flag received from the TCP sender to determine when to stop set- CWR flag received from the TCP sender to determine when to stop
ting the ECN-Echo flag. setting the ECN-Echo flag.
After a TCP receiver sends an ACK packet with the ECN-Echo bit set, After a TCP receiver sends an ACK packet with the ECN-Echo bit set,
that TCP receiver continues to set the ECN-Echo flag in all the ACK that TCP receiver continues to set the ECN-Echo flag in all the ACK
packets it sends (whether they acknowledge CE data packets or non-CE packets it sends (whether they acknowledge CE data packets or non-CE
data packets) until it receives a CWR packet (a packet with the CWR data packets) until it receives a CWR packet (a packet with the CWR
flag set). After the receipt of the CWR packet, acknowledgements for flag set). After the receipt of the CWR packet, acknowledgements for
subsequent non-CE data packets do not have the ECN-Echo flag set. If subsequent non-CE data packets do not have the ECN-Echo flag set. If
another CE packet is received by the data receiver, the receiver another CE packet is received by the data receiver, the receiver
would once again send ACK packets with the ECN-Echo flag set. While would once again send ACK packets with the ECN-Echo flag set. While
the receipt of a CWR packet does not guarantee that the data sender the receipt of a CWR packet does not guarantee that the data sender
received the ECN-Echo message, this does suggest that the data sender received the ECN-Echo message, this does suggest that the data sender
reduced its congestion window at some point *after* it sent the data reduced its congestion window at some point *after* it sent the data
packet for which the CE bit was set. packet for which the CE codepoint was set.
We have already specified that a TCP sender is not required to reduce We have already specified that a TCP sender is not required to reduce
its congestion window more than once per window of data. Some care its congestion window more than once per window of data. Some care
is required if the TCP sender is to avoid unnecessary reductions of is required if the TCP sender is to avoid unnecessary reductions of
the congestion window when a window of data includes both dropped the congestion window when a window of data includes both dropped
packets and (marked) CE packets. This is illustrated in [Floyd98]. packets and (marked) CE packets. This is illustrated in [Floyd98].
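The receiver-side echo logic amounts to a one-bit latch, which can be sketched as below. The class and method names are hypothetical. Because the latch stays set once any CE packet arrives, an ACK that covers several data packets (delayed ACK) automatically reports the OR of their CE indications.

```python
class EcnReceiver:
    """Hypothetical sketch of the ECN-Echo latch at the TCP data receiver."""

    def __init__(self) -> None:
        self.echo = False

    def on_data(self, ce: bool, cwr: bool) -> None:
        # A CWR packet tells the receiver it may stop echoing ...
        if cwr:
            self.echo = False
        # ... but a CE mark (even on that same packet) starts it again.
        if ce:
            self.echo = True

    def ece_for_next_ack(self) -> bool:
        """Value of the ECN-Echo flag for the next ACK sent."""
        return self.echo
```

Clearing on CWR before setting on CE means that a packet carrying both indications leaves the latch set, so the new congestion indication is not lost.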
6.1.4. Congestion on the ACK-path 6.1.4. Congestion on the ACK-path
For the current generation of TCP congestion control algorithms, pure For the current generation of TCP congestion control algorithms, pure
acknowledgement packets (e.g., packets that do not contain any accom- acknowledgement packets (e.g., packets that do not contain any
panying data) should be sent with the ECT bit off. Current TCP accompanying data) should be sent with the not-ECT codepoint.
receivers have no mechanisms for reducing traffic on the ACK-path in Current TCP receivers have no mechanisms for reducing traffic on the
response to congestion notification. Mechanisms for responding to ACK-path in response to congestion notification. Mechanisms for
congestion on the ACK-path are areas for current and future research. responding to congestion on the ACK-path are areas for current and
(One simple possibility would be for the sender to reduce its conges- future research. (One simple possibility would be for the sender to
tion window when it receives a pure ACK packet with the CE bit set). reduce its congestion window when it receives a pure ACK packet with
For current TCP implementations, a single dropped ACK generally has the CE codepoint set). For current TCP implementations, a single
only a very small effect on the TCP's sending rate. dropped ACK generally has only a very small effect on the TCP's
sending rate.
6.1.5. Retransmitted TCP packets 6.1.5. Retransmitted TCP packets
This document specifies that for ECN-capable TCP implementations, the This document specifies ECN-capable TCP implementations MUST NOT set
ECT bit (ECN-Capable Transport) in the IP header MUST NOT be set on either ECT codepoint (ECT(0) or ECT(1)) in the IP header for
retransmitted data packets, and that the TCP data receiver SHOULD retransmitted data packets, and that the TCP data receiver SHOULD
ignore the ECN field on arriving data packets that are outside of the ignore the ECN field on arriving data packets that are outside of the
receiver's current window. This is for greater security against receiver's current window. This is for greater security against
denial-of-service attacks, as well as for robustness of the ECN con- denial-of-service attacks, as well as for robustness of the ECN
gestion indication with packets that are dropped later in the net- congestion indication with packets that are dropped later in the
work. network.
First, we note that if the TCP sender were to set the ECT bit on a First, we note that if the TCP sender were to set an ECT codepoint on
retransmitted packet, then if an unnecessarily-retransmitted packet a retransmitted packet, then if an unnecessarily-retransmitted packet
was later dropped in the network, the end nodes would never receive was later dropped in the network, the end nodes would never receive
the indication of congestion from the router setting the CE bit. the indication of congestion from the router setting the CE
Thus, setting the ECT bit on retransmitted data packets is not con- codepoint. Thus, setting an ECT codepoint on retransmitted data
sistent with the robust delivery of the congestion indication even packets is not consistent with the robust delivery of the congestion
for packets that are later dropped in the network. indication even for packets that are later dropped in the network.
In addition, an attacker capable of spoofing the IP source address of In addition, an attacker capable of spoofing the IP source address of
the TCP sender could send data packets with arbitrary sequence num- the TCP sender could send data packets with arbitrary sequence
bers, with both the ECT and CE bits set in the IP header. On receiv- numbers, with the CE codepoint set in the IP header. On receiving
ing this spoofed data packet, the TCP data receiver would determine this spoofed data packet, the TCP data receiver would determine that
that the data does not lie in the current receive window, and return the data does not lie in the current receive window, and return a
a duplicate acknowledgement. We define an out-of-window packet at duplicate acknowledgement. We define an out-of-window packet at the
the TCP data receiver as a data packet that lies outside the TCP data receiver as a data packet that lies outside the receiver's
receiver's current window. On receiving an out-of-window packet, the current window. On receiving an out-of-window packet, the TCP data
TCP data receiver has to decide whether or not to treat the CE bit in receiver has to decide whether or not to treat the CE codepoint in
the packet header as a valid indication of congestion, and therefore the packet header as a valid indication of congestion, and therefore
whether to return ECN-Echo indications to the TCP data sender. If whether to return ECN-Echo indications to the TCP data sender. If
the TCP data receiver ignored the CE bit in an out-of-window packet, the TCP data receiver ignored the CE codepoint in an out-of-window
then the TCP data sender would not receive this possibly-legitimate packet, then the TCP data sender would not receive this possibly-
indication of congestion from the network, resulting in a violation legitimate indication of congestion from the network, resulting in a
of end-to-end congestion control. On the other hand, if the TCP data violation of end-to-end congestion control. On the other hand, if
receiver honors the CE indication in the out-of-window packet, and the TCP data receiver honors the CE indication in the out-of-window
reports the indication of congestion to the TCP data sender, then the packet, and reports the indication of congestion to the TCP data
malicious node that created the spoofed, out-of-window packet has sender, then the malicious node that created the spoofed, out-of-
successfully "attacked" the TCP connection by forcing the data sender window packet has successfully "attacked" the TCP connection by
to unnecessarily reduce (halve) its congestion window. To prevent forcing the data sender to unnecessarily reduce (halve) its
such a denial-of-service attack, we specify that a legitimate TCP congestion window. To prevent such a denial-of-service attack, we
data sender MUST NOT set the ECT bit on retransmitted data packets, specify that a legitimate TCP data sender MUST NOT set an ECT
and that the TCP data receiver SHOULD ignore the CE bit on out-of- codepoint on retransmitted data packets, and that the TCP data
window packets. receiver SHOULD ignore the CE codepoint on out-of-window packets.
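The receiver-side half of this rule can be illustrated with a short check. This is a sketch under stated assumptions: the function names and parameters are hypothetical, a packet counts as in-window if any part of it overlaps the receive window, sequence-number wraparound is ignored, and window probes (section 6.1.6) are not modelled.

```python
def in_window(seq: int, length: int, rcv_nxt: int, rcv_wnd: int) -> bool:
    """True if the segment [seq, seq + length) overlaps the receive window
    [rcv_nxt, rcv_nxt + rcv_wnd).  Wraparound is ignored in this sketch."""
    return seq < rcv_nxt + rcv_wnd and seq + length > rcv_nxt


def ce_is_valid(ce: bool, seq: int, length: int,
                rcv_nxt: int, rcv_wnd: int) -> bool:
    """Treat CE as a congestion indication only on in-window data packets."""
    return ce and in_window(seq, length, rcv_nxt, rcv_wnd)
```

With this check, a spoofed packet with an arbitrary sequence number still elicits a duplicate acknowledgement, but its CE codepoint does not cause the receiver to start echoing congestion.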
One drawback of not setting ECT on retransmitted packets denies ECN One drawback of not setting ECT(0) or ECT(1) on retransmitted packets
protection for retransmitted packets. However, for an ECN-capable is that it denies ECN protection for retransmitted packets. However,
TCP connection in a fully-ECN-capable environment with mild conges- for an ECN-capable TCP connection in a fully-ECN-capable environment
tion, packets should rarely be dropped due to congestion in the first with mild congestion, packets should rarely be dropped due to
place, and so instances of retransmitted packets should rarely arise. congestion in the first place, and so instances of retransmitted
If packets are being retransmitted, then there are already packet packets should rarely arise. If packets are being retransmitted,
losses (from corruption or from congestion) that ECN has been unable then there are already packet losses (from corruption or from
to prevent. congestion) that ECN has been unable to prevent.
We note that if the router sets the CE bit for an ECN-capable data We note that if the router sets the CE codepoint for an ECN-capable
packet within a TCP connection, then the TCP connection is guaranteed data packet within a TCP connection, then the TCP connection is
to receive that indication of congestion, or to receive some other guaranteed to receive that indication of congestion, or to receive
indication of congestion within the same window of data, even if this some other indication of congestion within the same window of data,
packet is dropped or reordered in the network. We consider two even if this packet is dropped or reordered in the network. We
cases, when the packet is later retransmitted, and when the packet is consider two cases, when the packet is later retransmitted, and when
not later retransmitted. the packet is not later retransmitted.
In the first case, if the packet is either dropped or delayed, and at In the first case, if the packet is either dropped or delayed, and at
some point retransmitted by the data sender, then the retransmission some point retransmitted by the data sender, then the retransmission
is a result of a Fast Retransmit or a Retransmit Timeout for either is a result of a Fast Retransmit or a Retransmit Timeout for either
that packet or for some prior packet in the same window of data. In that packet or for some prior packet in the same window of data. In
this case, because the data sender already has retransmitted this this case, because the data sender already has retransmitted this
packet, we know that the data sender has already responded to an packet, we know that the data sender has already responded to an
indication of congestion for some packet within the same window of indication of congestion for some packet within the same window of
data as the original packet. Thus, even if the first transmission of data as the original packet. Thus, even if the first transmission of
the packet is dropped in the network, or is delayed, if it had the CE the packet is dropped in the network, or is delayed, if it had the CE
bit set, and is later ignored by the data receiver as an out-of-win- codepoint set, and is later ignored by the data receiver as an out-
dow packet, this is not a problem, because the sender has already of-window packet, this is not a problem, because the sender has
responded to an indication of congestion for that window of data. already responded to an indication of congestion for that window of
data.
In the second case, if the packet is never retransmitted by the data In the second case, if the packet is never retransmitted by the data
sender, then this data packet is the only copy of this data received sender, then this data packet is the only copy of this data received
by the data receiver, and therefore arrives at the data receiver as by the data receiver, and therefore arrives at the data receiver as
an in-window packet, regardless of how much the packet might be an in-window packet, regardless of how much the packet might be
delayed or reordered. In this case, if the CE bit is set on the delayed or reordered. In this case, if the CE codepoint is set on
packet within the network, this will be treated by the data receiver the packet within the network, this will be treated by the data
as a valid indication of congestion. receiver as a valid indication of congestion.
6.1.6. TCP Window Probes. 6.1.6. TCP Window Probes.
When the TCP data receiver advertises a zero window, the TCP data When the TCP data receiver advertises a zero window, the TCP data
sender sends window probes to determine if the receiver's window has sender sends window probes to determine if the receiver's window has
increased. Window probe packets do not contain any user data except increased. Window probe packets do not contain any user data except
for the sequence number, which is a byte. If a window probe packet for the sequence number, which is a byte. If a window probe packet
is dropped in the network, this loss is not detected by the receiver. is dropped in the network, this loss is not detected by the receiver.
Therefore, the TCP data sender MUST NOT set either the ECT or CWR Therefore, the TCP data sender MUST NOT set either an ECT codepoint
bits on window probe packets. or the CWR bit on window probe packets.
However, because window probes use exact sequence numbers, they can- However, because window probes use exact sequence numbers, they
not be easily spoofed in denial-of-service attacks. Therefore, if a cannot be easily spoofed in denial-of-service attacks. Therefore, if
window probe arrives with ECT and CE set, then the receiver SHOULD a window probe arrives with the CE codepoint set, then the receiver
respond to the ECN indications. SHOULD respond to the ECN indications.
7. Non-compliance by the End Nodes 7. Non-compliance by the End Nodes
This section discusses concerns about the vulnerability of ECN to This section discusses concerns about the vulnerability of ECN to
non-compliant end-nodes (i.e., end nodes that set the ECT bit in non-compliant end-nodes (i.e., end nodes that set the ECT codepoint
transmitted packets but do not respond to received CE packets). We in transmitted packets but do not respond to received CE packets).
argue that the addition of ECN to the IP architecture will not sig- We argue that the addition of ECN to the IP architecture will not
nificantly increase the current vulnerability of the architecture to significantly increase the current vulnerability of the architecture
unresponsive flows. to unresponsive flows.
Even for non-ECN environments, there are serious concerns about the Even for non-ECN environments, there are serious concerns about the
damage that can be done by non-compliant or unresponsive flows (that damage that can be done by non-compliant or unresponsive flows (that
is, flows that do not respond to congestion control indications by is, flows that do not respond to congestion control indications by
reducing their arrival rate at the congested link). For example, an reducing their arrival rate at the congested link). For example, an
end-node could "turn off congestion control" by not reducing its con- end-node could "turn off congestion control" by not reducing its
gestion window in response to packet drops. This is a concern for the congestion window in response to packet drops. This is a concern for
current Internet. It has been argued that routers will have to the current Internet. It has been argued that routers will have to
deploy mechanisms to detect and differentially treat packets from deploy mechanisms to detect and differentially treat packets from
non-compliant flows [RFC2309,FF99]. It has also been suggested that non-compliant flows [RFC2309,FF99]. It has also been suggested that
techniques such as end-to-end per-flow scheduling and isolation of techniques such as end-to-end per-flow scheduling and isolation of
one flow from another, differentiated services, or end-to-end reser- one flow from another, differentiated services, or end-to-end
vations could remove some of the more damaging effects of unrespon- reservations could remove some of the more damaging effects of
sive flows. unresponsive flows.
It might seem that dropping packets in itself is an adequate deter- It might seem that dropping packets in itself is an adequate
rent for non-compliance, and that the use of ECN removes this deter- deterrent for non-compliance, and that the use of ECN removes this
rent. We would argue in response that (1) ECN-capable routers pre- deterrent. We would argue in response that (1) ECN-capable routers
serve packet-dropping behavior in times of high congestion; and (2) preserve packet-dropping behavior in times of high congestion; and
even in times of high congestion, dropping packets in itself is not (2) even in times of high congestion, dropping packets in itself is
an adequate deterrent for non-compliance. not an adequate deterrent for non-compliance.
First, ECN-Capable routers will only mark packets (as opposed to First, ECN-Capable routers will only mark packets (as opposed to
dropping them) when the packet marking rate is reasonably low. During dropping them) when the packet marking rate is reasonably low. During
periods where the average queue size exceeds an upper threshold, and periods where the average queue size exceeds an upper threshold, and
therefore the potential packet marking rate would be high, our recommendation therefore the potential packet marking rate would be high, our
is that routers drop packets rather than set the CE bit in recommendation is that routers drop packets rather than set the CE
packet headers. codepoint in packet headers.
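The mark-versus-drop recommendation above can be sketched as follows. This is only an illustration: the function name, its parameters, and the idea of a single boolean "wants_to_signal" input from the queue management algorithm are assumptions made for the example, not something this document specifies.

```python
def congestion_action(avg_queue: float, upper_threshold: float,
                      wants_to_signal: bool, ecn_capable_packet: bool) -> str:
    """Return 'drop', 'mark', or 'forward' for one arriving packet.

    All names here are hypothetical. 'wants_to_signal' stands in for
    whatever decision the active queue management algorithm has made.
    """
    # Above the upper threshold the marking rate would be high,
    # so the packet is dropped even if it is ECN-capable.
    if avg_queue > upper_threshold:
        return "drop"
    # No congestion indication is needed for this packet.
    if not wants_to_signal:
        return "forward"
    # Low or moderate congestion: mark ECN-capable packets, drop the rest.
    return "mark" if ecn_capable_packet else "drop"
```

For example, with an upper threshold of 10 packets, an ECN-capable packet arriving at an average queue size of 12 is dropped, while the same packet at an average queue size of 5 is marked if the router decides to signal congestion.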
During the periods of low or moderate packet marking rates when ECN During the periods of low or moderate packet marking rates when ECN
would be deployed, there would be little deterrent effect on unre- would be deployed, there would be little deterrent effect on
sponsive flows of dropping rather than marking those packets. For unresponsive flows of dropping rather than marking those packets. For
example, delay-insensitive flows using reliable delivery might have example, delay-insensitive flows using reliable delivery might have
an incentive to increase rather than to decrease their sending rate an incentive to increase rather than to decrease their sending rate
in the presence of dropped packets. Similarly, delay-sensitive flows in the presence of dropped packets. Similarly, delay-sensitive flows
using unreliable delivery might increase their use of FEC in response using unreliable delivery might increase their use of FEC in response
to an increased packet drop rate, increasing rather than decreasing to an increased packet drop rate, increasing rather than decreasing
their sending rate. For the same reasons, we do not believe that their sending rate. For the same reasons, we do not believe that
packet dropping itself is an effective deterrent for non-compliance packet dropping itself is an effective deterrent for non-compliance
even in an environment of high packet drop rates, when all flows are even in an environment of high packet drop rates, when all flows are
sharing the same packet drop rate. sharing the same packet drop rate.
Several methods have been proposed to identify and restrict non-com- Several methods have been proposed to identify and restrict non-
pliant or unresponsive flows. The addition of ECN to the network compliant or unresponsive flows. The addition of ECN to the network
environment would not in any way increase the difficulty of designing environment would not in any way increase the difficulty of designing
and deploying such mechanisms. If anything, the addition of ECN to and deploying such mechanisms. If anything, the addition of ECN to
the architecture would make the job of identifying unresponsive flows the architecture would make the job of identifying unresponsive flows
slightly easier. For example, in an ECN-Capable environment routers slightly easier. For example, in an ECN-Capable environment routers
are not limited to information about packets that are dropped or have are not limited to information about packets that are dropped or have
the CE bit set at that router itself; in such an environment, routers the CE codepoint set at that router itself; in such an environment,
could also take note of arriving CE packets that indicate congestion routers could also take note of arriving CE packets that indicate
encountered by that packet earlier in the path. congestion encountered by that packet earlier in the path.
8. Non-compliance in the Network 8. Non-compliance in the Network
This section considers the issues when a router is operating, possi- This section considers the issues when a router is operating,
bly maliciously, to modify either of the bits in the ECN field. In possibly maliciously, to modify either of the bits in the ECN field.
this section we represent the ECN field in the IP header by the tuple
(ECT bit, CE bit).
By tampering with the bits in the ECN field, an adversary (or a bro- By tampering with the bits in the ECN field, an adversary (or a
ken router) could do one or more of the following: falsely report broken router) could do one or more of the following: falsely report
congestion, disable ECN-Capability for an individual packet, erase congestion, disable ECN-Capability for an individual packet, erase
the ECN congestion indication, or falsely indicate ECN-Capability. the ECN congestion indication, or falsely indicate ECN-Capability.
Section 18 systematically examines the various cases by which the ECN Section 18 systematically examines the various cases by which the ECN
field could be modified. The important criterion considered in field could be modified. The important criterion considered in
determining the consequences of such modifications is whether it is determining the consequences of such modifications is whether it is
likely to lead to poorer behavior in any dimension (throughput, likely to lead to poorer behavior in any dimension (throughput,
delay, fairness or functionality) than if a router were to drop a delay, fairness or functionality) than if a router were to drop a
packet. packet.
The first two possible changes, falsely reporting congestion or dis- The first two possible changes, falsely reporting congestion or
abling ECN-Capability for an individual packet, are no worse than if disabling ECN-Capability for an individual packet, are no worse than
the router were to simply drop the packet. From a congestion control if the router were to simply drop the packet. From a congestion
point of view, setting the CE bit in the absence of congestion by a control point of view, setting the CE codepoint in the absence of
non-compliant router would be no worse than a router dropping a congestion by a non-compliant router would be no worse than a router
packet unnecessarily. By "erasing" the ECT bit of a packet that is dropping a packet unnecessarily. By "erasing" an ECT codepoint of a
later dropped in the network, a router's actions could result in an packet that is later dropped in the network, a router's actions could
unnecessary packet drop for that packet later in the network. result in an unnecessary packet drop for that packet later in the
network.
However, as discussed in Section 18, a router that erases the ECN However, as discussed in Section 18, a router that erases the ECN
congestion indication or falsely indicates ECN-Capability could congestion indication or falsely indicates ECN-Capability could
potentially do more damage to the flow than if it had simply dropped potentially do more damage to the flow than if it had simply dropped
the packet. A rogue or broken router that "erased" the CE bit in the packet. A rogue or broken router that "erased" the CE codepoint
arriving CE packets would prevent that indication of congestion from in arriving CE packets would prevent that indication of congestion
reaching downstream receivers. This could result in the failure of from reaching downstream receivers. This could result in the failure
congestion control for that flow and a resulting increase in conges- of congestion control for that flow and a resulting increase in
tion in the network, ultimately resulting in subsequent packets congestion in the network, ultimately resulting in subsequent packets
dropped for this flow as the average queue size increased at the con- dropped for this flow as the average queue size increased at the
gested gateway. congested gateway.
Section 19 considers the potential repercussions of subverting end- Section 19 considers the potential repercussions of subverting end-
to-end congestion control by either falsely indicating ECN-Capabil- to-end congestion control by either falsely indicating ECN-
ity, or by erasing the congestion indication in ECN (the CE-bit). We Capability, or by erasing the congestion indication in ECN (the CE-
observe in Section 19 that the consequence of subverting ECN-based codepoint). We observe in Section 19 that the consequence of
congestion control may lead to potential unfairness, but this is subverting ECN-based congestion control may lead to potential
likely to be no worse than the subversion of either ECN-based or unfairness, but this is likely to be no worse than the subversion of
packet-based congestion control by the end nodes. either ECN-based or packet-based congestion control by the end nodes.
8.1. Complications Introduced by Split Paths 8.1. Complications Introduced by Split Paths
If a router or other network element has access to all of the packets If a router or other network element has access to all of the packets
of a flow, then that router could do no more damage to a flow by of a flow, then that router could do no more damage to a flow by
altering the ECN field than it could by simply dropping all of the altering the ECN field than it could by simply dropping all of the
packets from that flow. However, in some cases, a malicious or bro- packets from that flow. However, in some cases, a malicious or
ken router might have access to only a subset of the packets from a broken router might have access to only a subset of the packets from
flow. The question is as follows: can this router, by altering the a flow. The question is as follows: can this router, by altering
ECN field in this subset of the packets, do more damage to that flow the ECN field in this subset of the packets, do more damage to that
than if it had simply dropped that set of the packets? flow than if it had simply dropped that set of the packets?
This is also discussed in detail in Section 18, which concludes as This is also discussed in detail in Section 18, which concludes as
follows: It is true that the adversary that has access only to a follows: It is true that the adversary that has access only to a
subset of packets in an aggregate might, by subverting ECN-based con- subset of packets in an aggregate might, by subverting ECN-based
gestion control, be able to deny the benefits of ECN to the other congestion control, be able to deny the benefits of ECN to the other
packets in the aggregate. While this is undesirable, this is not a packets in the aggregate. While this is undesirable, this is not a
sufficient concern to result in disabling ECN. sufficient concern to result in disabling ECN.
9. Encapsulated Packets 9. Encapsulated Packets
9.1. IP packets encapsulated in IP 9.1. IP packets encapsulated in IP
The encapsulation of IP packet headers in tunnels is used in many The encapsulation of IP packet headers in tunnels is used in many
places, including IPsec and IP in IP [RFC2003]. This section consid- places, including IPsec and IP in IP [RFC2003]. This section
ers issues related to interactions between ECN and IP tunnels, and considers issues related to interactions between ECN and IP tunnels,
specifies two alternative solutions. This discussion is complemented and specifies two alternative solutions. This discussion is
by RFC 2983's discussion of interactions between Differentiated Ser- complemented by RFC 2983's discussion of interactions between
vices and IP tunnels of various forms [RFC 2983], as Differentiated Differentiated Services and IP tunnels of various forms [RFC 2983],
Services uses the remaining six bits of the IP header octet that is as Differentiated Services uses the remaining six bits of the IP
used by ECN (see Figure 1 in Section 5). header octet that is used by ECN (see Figure 2 in Section 5).
Some IP tunnel modes are based on adding a new "outer" IP header that Some IP tunnel modes are based on adding a new "outer" IP header that
encapsulates the original, or "inner" IP header and its associated encapsulates the original, or "inner" IP header and its associated
packet. In many cases, the new "outer" IP header may be added and packet. In many cases, the new "outer" IP header may be added and
removed at intermediate points along a connection, enabling the net- removed at intermediate points along a connection, enabling the
work to establish a tunnel without requiring endpoint participation. network to establish a tunnel without requiring endpoint
We denote tunnels that specify that the outer header be discarded at participation. We denote tunnels that specify that the outer header
tunnel egress as "simple tunnels". be discarded at tunnel egress as "simple tunnels".
ECN uses the ECT and CE flags in the IP header for signaling between ECN uses the ECN field in the IP header for signaling between routers
routers and connection endpoints. ECN interacts with IP tunnels and connection endpoints. ECN interacts with IP tunnels based on the
based on the treatment of these flags in the IP header. In simple IP treatment of the ECN field in the IP header. In simple IP tunnels
tunnels the octet containing these flags is copied or mapped from the the octet containing the ECN field is copied or mapped from the inner
inner IP header to the outer IP header at IP tunnel ingress, and the IP header to the outer IP header at IP tunnel ingress, and the outer
outer header's copy of this field is discarded at IP tunnel egress. header's copy of this field is discarded at IP tunnel egress. If the
If the outer header were to be simply discarded without taking care outer header were to be simply discarded without taking care to deal
to deal with the ECN related flags, and an ECN-capable router were to with the ECN field, and an ECN-capable router were to set the CE
set the CE (Congestion Experienced) bit within a packet in a simple (Congestion Experienced) codepoint within a packet in a simple IP
IP tunnel, this indication would be discarded at tunnel egress, los- tunnel, this indication would be discarded at tunnel egress, losing
ing the indication of congestion. the indication of congestion.
Thus, the use of ECN over simple IP tunnels would result in routers Thus, the use of ECN over simple IP tunnels would result in routers
attempting to use the outer IP header to signal congestion to end- attempting to use the outer IP header to signal congestion to
points, but those congestion warnings never arriving because the endpoints, but those congestion warnings never arriving because the
outer header is discarded at the tunnel egress point. This problem outer header is discarded at the tunnel egress point. This problem
was encountered with ECN and IPsec in tunnel mode, and RFC 2481 rec- was encountered with ECN and IPsec in tunnel mode, and RFC 2481
ommended that ECN not be used with the older simple IPsec tunnels in recommended that ECN not be used with the older simple IPsec tunnels
order to avoid this behavior and its consequences. When ECN becomes in order to avoid this behavior and its consequences. When ECN
widely deployed, then simple tunnels likely to carry ECN-capable becomes widely deployed, then simple tunnels likely to carry ECN-
traffic will have to be changed. capable traffic will have to be changed.
From a security point of view, the use of ECN in the outer header of From a security point of view, the use of ECN in the outer header of
an IP tunnel might raise security concerns because an adversary could an IP tunnel might raise security concerns because an adversary could
tamper with the ECN information that propagates beyond the tunnel tamper with the ECN information that propagates beyond the tunnel
endpoint. Based on an analysis in Sections 18 and 19 of these con- endpoint. Based on an analysis in Sections 18 and 19 of these
cerns and the resultant risks, our overall approach is to make sup- concerns and the resultant risks, our overall approach is to make
port for ECN an option for IP tunnels, so that an IP tunnel can be support for ECN an option for IP tunnels, so that an IP tunnel can be
specified or configured either to use ECN or not to use ECN in the specified or configured either to use ECN or not to use ECN in the
outer header of the tunnel. Thus, in environments or tunneling pro- outer header of the tunnel. Thus, in environments or tunneling
tocols where the risks of using ECN are judged to outweigh its bene- protocols where the risks of using ECN are judged to outweigh its
fits, the tunnel can simply not use ECN in the outer header. Then benefits, the tunnel can simply not use ECN in the outer header.
the only indication of congestion experienced at routers within the Then the only indication of congestion experienced at routers within
tunnel would be through packet loss. the tunnel would be through packet loss.
The result is that there are two viable options for the behavior of The result is that there are two viable options for the behavior of
ECN-capable connections over an IP tunnel, especially IPsec tunnels: ECN-capable connections over an IP tunnel, especially IPsec tunnels:
* A limited-functionality option in which ECN is preserved in the * A limited-functionality option in which ECN is preserved in the
inner header, but disabled in the outer header. The only mecha- inner header, but disabled in the outer header. The only
nism available for signaling congestion occurring within the tun- mechanism available for signaling congestion occurring within the
nel in this case is dropped packets. tunnel in this case is dropped packets.
* A full-functionality option that supports ECN in both the inner * A full-functionality option that supports ECN in both the inner
and outer headers, and propagates congestion warnings from nodes and outer headers, and propagates congestion warnings from nodes
within the tunnel to endpoints. within the tunnel to endpoints.
Support for these options requires varying amounts of changes to IP Support for these options requires varying amounts of changes to IP
header processing at tunnel ingress and egress. A small subset of header processing at tunnel ingress and egress. A small subset of
these changes sufficient to support only the limited-functionality these changes sufficient to support only the limited-functionality
option would be sufficient to eliminate any incompatibility between option would be sufficient to eliminate any incompatibility between
ECN and IP tunnels. ECN and IP tunnels.
One goal of this document is to give guidance about the tradeoffs One goal of this document is to give guidance about the tradeoffs
between the limited-functionality and full-functionality options. A between the limited-functionality and full-functionality options. A
full discussion of the potential effects of an adversary's modifica- full discussion of the potential effects of an adversary's
tions of the CE and ECT bits is given in Sections 18 and 19. modifications of the ECN field is given in Sections 18 and 19.
9.1.1. The Limited-functionality and Full-functionality Options 9.1.1. The Limited-functionality and Full-functionality Options
The limited-functionality option for ECN encapsulation in IP tunnels The limited-functionality option for ECN encapsulation in IP tunnels
is for the ECT bit in the outside (encapsulating) header to be off is for the non-ECT codepoint to be set in the outside (encapsulating)
(i.e., set to 0), regardless of the value of the ECT bit in the header regardless of the value of the ECN field in the inside
inside (encapsulated) header. With this option, the ECN field in the (encapsulated) header. With this option, the ECN field in the inner
inner header is not altered upon decapsulation. The disadvantage of header is not altered upon decapsulation. The disadvantage of this
this approach is that the flow does not have ECN support for that approach is that the flow does not have ECN support for that part of
part of the path that is using IP tunneling, even if the encapsulated the path that is using IP tunneling, even if the encapsulated packet
packet (from the original TCP sender) is ECN-Capable. That is, if (from the original TCP sender) is ECN-Capable. That is, if the
the encapsulated packet arrives at a congested router that is ECN- encapsulated packet arrives at a congested router that is ECN-
capable, and the router can decide to drop or mark the packet as an capable, and the router can decide to drop or mark the packet as an
indication of congestion to the end nodes, the router will not be indication of congestion to the end nodes, the router will not be
permitted to set the CE bit in the packet header, but instead will permitted to set the CE codepoint in the packet header, but instead
have to drop the packet. will have to drop the packet.
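A minimal sketch of the limited-functionality option, written in the codepoint terms of the newer (right-hand) text, might look like this. The names Ecn, encapsulate_limited, decapsulate_limited, and router_signal_in_tunnel are hypothetical, and the two-bit values given to the codepoints are an assumption that this section does not state.

```python
from enum import Enum

class Ecn(Enum):
    # Bit values are assumed for illustration; the logic uses only the names.
    NOT_ECT = 0b00
    ECT_1 = 0b01
    ECT_0 = 0b10
    CE = 0b11

def encapsulate_limited(inner: Ecn) -> Ecn:
    """Outer header is always not-ECT, whatever the inner header says."""
    return Ecn.NOT_ECT

def decapsulate_limited(inner: Ecn, outer: Ecn) -> Ecn:
    """The inner ECN field is not altered upon decapsulation."""
    return inner

def router_signal_in_tunnel(outer: Ecn) -> str:
    """What a congested ECN-capable router inside the tunnel can do."""
    # A not-ECT outer header leaves packet drop as the only signal.
    return "drop" if outer is Ecn.NOT_ECT else "mark"
```

With this option, router_signal_in_tunnel(encapsulate_limited(inner)) yields "drop" for every inner codepoint, which is the disadvantage described above.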
The full-functionality option for ECN encapsulation is to copy the The full-functionality option for ECN encapsulation is to copy the
ECT bit of the inside header to the outside header on encapsulation, ECN codepoint of the inside header to the outside header on
and to OR the CE bit from the outer header with the CE bit of the encapsulation if the inside header is not-ECT or ECT, and to set the
inside header on decapsulation. That is, for full ECN support the ECN codepoint of the outside header to ECT(0) if the ECN codepoint of
encapsulation and decapsulation processing involves the following: the inside header is CE. On decapsulation, if the CE codepoint is
At tunnel ingress, the full-functionality option copies the value of set on the outside header, then the CE codepoint is also set in the
ECT (bit 6) in the inner header to the outer header. CE (bit 7) is inner header. Otherwise, the ECN codepoint on the inner header is
set to 0 in the outer header. Upon decapsulation at the tunnel left unchanged. That is, for full ECN support the encapsulation and
egress, the full-functionality option sets CE to 1 in the inner decapsulation processing involves the following: At tunnel ingress,
header if the value of ECT (bit 6) in the inner header is 1, and the the full-functionality option sets the ECN codepoint in the outer
value of CE (bit 7) in the outer header is 1. Otherwise, no change header. If the ECN codepoint in the inner header is not-ECT or ECT,
is made to this field of the inner header. then it is copied to the ECN codepoint in the outer header. If the
ECN codepoint in the inner header is CE, then the ECN codepoint in
the outer header is set to ECT(0). Upon decapsulation at the tunnel
egress, the full-functionality option sets the CE codepoint in the
inner header if the CE codepoint is set in the outer header.
Otherwise, no change is made to this field of the inner header.
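The full-functionality processing described in the newer (right-hand) text can be sketched as follows. This is an illustration under stated assumptions: the names Ecn, encapsulate_full, and decapsulate_full are hypothetical, and the two-bit values assigned to the codepoints are assumed rather than taken from this section.

```python
from enum import Enum

class Ecn(Enum):
    # Bit values are assumed for illustration; the logic uses only the names.
    NOT_ECT = 0b00
    ECT_1 = 0b01
    ECT_0 = 0b10
    CE = 0b11

def encapsulate_full(inner: Ecn) -> Ecn:
    """Tunnel ingress: choose the ECN codepoint for the outer header."""
    # A CE inner header is carried as ECT(0) in the outer header.
    if inner is Ecn.CE:
        return Ecn.ECT_0
    # not-ECT and ECT codepoints are copied unchanged.
    return inner

def decapsulate_full(inner: Ecn, outer: Ecn) -> Ecn:
    """Tunnel egress: compute the inner ECN codepoint to forward."""
    # CE set inside the tunnel is propagated to the inner header.
    if outer is Ecn.CE:
        return Ecn.CE
    # Otherwise the inner header is left unchanged.
    return inner
```

The sketch covers only the copy and propagate rules of this paragraph. The separate recommendation to drop certain packets at tunnel egress is not modeled here.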
With the full-functionality option, a flow can take advantage of ECN With the full-functionality option, a flow can take advantage of ECN
in those parts of the path that might use IP tunneling. The disad- in those parts of the path that might use IP tunneling. The
vantage of the full-functionality option from a security perspective disadvantage of the full-functionality option from a security
is that the IP tunnel cannot protect the flow from certain modifica- perspective is that the IP tunnel cannot protect the flow from
tions to the ECN bits in the IP header within the tunnel. The poten- certain modifications to the ECN bits in the IP header within the
tial dangers from modifications to the ECN bits in the IP header are tunnel. The potential dangers from modifications to the ECN bits in
described in detail in Sections 18 and 19. the IP header are described in detail in Sections 18 and 19.
(1) An IP tunnel MUST modify the handling of the DS field octet at (1) An IP tunnel MUST modify the handling of the DS field octet at
IP tunnel endpoints by implementing either the limited-functional- IP tunnel endpoints by implementing either the limited-
ity or the full-functionality option. functionality or the full-functionality option.
(2) Optionally, an IP tunnel MAY enable the endpoints of an IP (2) Optionally, an IP tunnel MAY enable the endpoints of an IP
tunnel to negotiate the choice between the limited-functionality tunnel to negotiate the choice between the limited-functionality
and the full-functionality option for ECN in the tunnel. and the full-functionality option for ECN in the tunnel.
The minimum required to make ECN usable with IP tunnels is the lim- The minimum required to make ECN usable with IP tunnels is the
ited-functionality option, which prevents ECN from being enabled in limited-functionality option, which prevents ECN from being enabled
the outer header of an IPsec tunnel. Full support for ECN requires in the outer header of an IPsec tunnel. Full support for ECN
the use of the full-functionality option. If there are no optional requires the use of the full-functionality option. If there are no
mechanisms for the tunnel endpoints to negotiate a choice between the optional mechanisms for the tunnel endpoints to negotiate a choice
limited-functionality and full-functionality options, there can be a between the limited-functionality and full-functionality options, there
pre-existing agreement between the tunnel endpoints about whether to can be a pre-existing agreement between the tunnel endpoints about
support the limited-functionality or the full-functionality ECN whether to support the limited-functionality or the full-
option. functionality ECN option.
In addition, it is RECOMMENDED that packets with ECT and CE both set In addition, it is RECOMMENDED that packets with the CE codepoint in
to 1 in the outer header be dropped if they arrive at the tunnel the outer header be dropped if they arrive at the tunnel egress point
egress point for a tunnel that uses the limited-functionality option, for a tunnel that uses the limited-functionality option, or for a
or for a tunnel that uses the full-functionality option but for which tunnel that uses the full-functionality option but for which the not-
the ECT bit in the inner header is set to zero. This is motivated by ECT codepoint is set in the inner header. This is motivated by
backwards compatibility and to ensure that no unauthorized modifica- backwards compatibility and to ensure that no unauthorized
tions of the ECN field take place, and is discussed further in the modifications of the ECN field take place, and is discussed further
next Section (9.1.2). in the next Section (9.1.2).
9.1.2. Changes to the ECN Field within an IP Tunnel 9.1.2. Changes to the ECN Field within an IP Tunnel
The presence of a copy of the ECN field in the inner header of an IP The presence of a copy of the ECN field in the inner header of an IP
tunnel mode packet provides an opportunity for detection of unautho- tunnel mode packet provides an opportunity for detection of
rized modifications to the ECT bit in the outer header. Comparison unauthorized modifications to the ECN field in the outer header.
of the ECT bits in the inner and outer headers falls into two cate- Comparison of the ECN fields in the inner and outer headers falls
gories for implementations that conform to this document: into two categories for implementations that conform to this
document:
* If the IP tunnel uses the full-functionality option, then the * If the IP tunnel uses the full-functionality option, then the
values of the ECT bits in the inner and outer headers should be not-ECT codepoint should be set in the outer header if and only if
identical. it is also set in the inner header.
* If the tunnel uses the limited-functionality option, then the * If the tunnel uses the limited-functionality option, then the
ECT bit in the outer header should be 0. not-ECT codepoint should be set in the outer header.
Receipt of a packet not satisfying the appropriate condition could be Receipt of a packet not satisfying the appropriate condition could be
a cause of concern. a cause of concern.
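The two conditions above, in the codepoint form of the newer (right-hand) text, can be expressed as a single check. This is a sketch only: the names Ecn and outer_header_consistent are hypothetical, and the two-bit codepoint values are an assumption not stated in this section.

```python
from enum import Enum

class Ecn(Enum):
    # Bit values are assumed for illustration; the logic uses only the names.
    NOT_ECT = 0b00
    ECT_1 = 0b01
    ECT_0 = 0b10
    CE = 0b11

def outer_header_consistent(inner: Ecn, outer: Ecn,
                            full_functionality: bool) -> bool:
    """Return True if the outer ECN field satisfies the tunnel's condition."""
    if full_functionality:
        # not-ECT in the outer header if and only if not-ECT in the inner.
        return (outer is Ecn.NOT_ECT) == (inner is Ecn.NOT_ECT)
    # Limited-functionality: the outer header should always be not-ECT.
    return outer is Ecn.NOT_ECT
```

A False result corresponds to the "cause of concern" case; what an egress point does with such a packet is a separate policy decision.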
Consider the case of an IP tunnel where the tunnel ingress point has Consider the case of an IP tunnel where the tunnel ingress point has
not been updated to this document's requirements, while the tunnel not been updated to this document's requirements, while the tunnel
egress point has been updated to support ECN. In this case, the IP egress point has been updated to support ECN. In this case, the IP
tunnel is not explicitly configured to support the full-functionality tunnel is not explicitly configured to support the full-functionality
ECN option. However, the tunnel ingress point is behaving identically ECN option. However, the tunnel ingress point is behaving identically
to a tunnel ingress point that supports the full-functionality to a tunnel ingress point that supports the full-functionality
option. If packets from an ECN-capable connection use this tunnel, option. If packets from an ECN-capable connection use this tunnel,
ECT will be set to 1 in the outer header at the tunnel ingress point. the ECT codepoint will be set in the outer header at the tunnel
Congestion within the tunnel may then result in ECN-capable routers ingress point. Congestion within the tunnel may then result in ECN-
setting CE in the outer header. Because the tunnel has not been capable routers setting CE in the outer header. Because the tunnel
explicitly configured to support the full-functionality option, the has not been explicitly configured to support the full-functionality
tunnel egress point expects the ECT bit in the outer header to be 0. option, the tunnel egress point expects the not-ECT codepoint to be
When an ECN-capable tunnel egress point receives a packet with the set in the outer header. When an ECN-capable tunnel egress point
ECT bit in the outer header set to 1, in a tunnel that has not been receives a packet with the ECT or CE codepoint in the outer header,
configured to support the full-functionality option, that packet in a tunnel that has not been configured to support the full-
should be processed, according to whether the CE bit was set, as follows. functionality option, that packet should be processed, according to
It is RECOMMENDED that such packets, with the ECT bit in the outer whether the CE codepoint was set, as follows. It is RECOMMENDED that
header set to 1 on a tunnel that has not been configured to support on a tunnel that has not been configured to support the full-
the full-functionality option, be dropped at the egress point if CE functionality option, packets should be dropped at the egress point
is set to 1 in the outer header but 0 in the inner header, and for- if the CE codepoint is set in the outer header but not in the inner
warded otherwise. header, and should be forwarded otherwise.
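The recommended egress behavior for a tunnel that has not been configured to support the full-functionality option can be sketched as below, following the wording of the newer (right-hand) text of this paragraph only. The names Ecn and egress_action_unconfigured are hypothetical, and the two-bit codepoint values are assumed for illustration.

```python
from enum import Enum

class Ecn(Enum):
    # Bit values are assumed for illustration; the logic uses only the names.
    NOT_ECT = 0b00
    ECT_1 = 0b01
    ECT_0 = 0b10
    CE = 0b11

def egress_action_unconfigured(inner: Ecn, outer: Ecn) -> str:
    """Return 'drop' or 'forward' for a packet leaving the tunnel."""
    # CE in the outer header but not in the inner header means congestion
    # was signalled inside a tunnel that cannot relay it, so drop.
    if outer is Ecn.CE and inner is not Ecn.CE:
        return "drop"
    # Every other combination is forwarded.
    return "forward"
```

Dropping in the first case still delivers a congestion indication to the endpoints, in the form of packet loss.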
An IP tunnel cannot provide protection against erasure of congestion An IP tunnel cannot provide protection against erasure of congestion
indications based on resetting the value of the CE bit in packets for indications based on changing the ECN codepoint from CE to ECT. The
which ECT is set in the outer header. The erasure of congestion erasure of congestion indications may impact the network and other
indications may impact the network and other flows in ways that would flows in ways that would not be possible in the absence of ECN. It
not be possible in the absence of ECN. It is important to note that is important to note that erasure of congestion indications can only
erasure of congestion indications can only be performed to congestion be performed to congestion indications placed by nodes within the
indications placed by nodes within the tunnel; the copy of the CE bit tunnel; the copy of the ECN field in the inner header preserves
in the inner header preserves congestion notifications from nodes congestion notifications from nodes upstream of the tunnel ingress
upstream of the tunnel ingress. If erasure of congestion notifica- (unless the inner header is also erased). If erasure of congestion
tions is judged to be a security risk that exceeds the congestion notifications is judged to be a security risk that exceeds the
management benefits of ECN, then tunnels could be specified or con- congestion management benefits of ECN, then tunnels could be
figured to use the limited-functionality option. specified or configured to use the limited-functionality option.
9.2. IPsec Tunnels 9.2. IPsec Tunnels
IPsec supports secure communication over potentially insecure network IPsec supports secure communication over potentially insecure network
components such as intermediate routers. IPsec protocols support two components such as intermediate routers. IPsec protocols support two
operating modes, transport mode and tunnel mode, that span a wide operating modes, transport mode and tunnel mode, that span a wide
range of security requirements and operating environments. Transport range of security requirements and operating environments. Transport
mode security protocol header(s) are inserted between the IP (IPv4 or mode security protocol header(s) are inserted between the IP (IPv4 or
IPv6) header and higher layer protocol headers (e.g., TCP), and hence IPv6) header and higher layer protocol headers (e.g., TCP), and hence
transport mode can only be used for end-to-end security on a connec- transport mode can only be used for end-to-end security on a
tion. IPsec tunnel mode is based on adding a new "outer" IP header connection. IPsec tunnel mode is based on adding a new "outer" IP
that encapsulates the original, or "inner" IP header and its associ- header that encapsulates the original, or "inner" IP header and its
ated packet. Tunnel mode security headers are inserted between these associated packet. Tunnel mode security headers are inserted between
two IP headers. In contrast to transport mode, the new "outer" IP these two IP headers. In contrast to transport mode, the new "outer"
header and tunnel mode security headers can be added and removed at IP header and tunnel mode security headers can be added and removed
intermediate points along a connection, enabling security gateways to at intermediate points along a connection, enabling security gateways
secure vulnerable portions of a connection without requiring endpoint to secure vulnerable portions of a connection without requiring
participation in the security protocols. An important aspect of tun- endpoint participation in the security protocols. An important
nel mode security is that in the original specification, the outer aspect of tunnel mode security is that in the original specification,
header is discarded at tunnel egress, ensuring that security threats the outer header is discarded at tunnel egress, ensuring that
based on modifying the IP header do not propagate beyond that tunnel security threats based on modifying the IP header do not propagate
endpoint. Further discussion of IPsec can be found in [RFC2401]. beyond that tunnel endpoint. Further discussion of IPsec can be
found in [RFC2401].
The IPsec protocol as originally defined in [ESP, AH] required that The IPsec protocol as originally defined in [ESP, AH] required that
the inner header's ECN field not be changed by IPsec decapsulation the inner header's ECN field not be changed by IPsec decapsulation
processing at a tunnel egress node; this would have ruled out the processing at a tunnel egress node; this would have ruled out the
possibility of full-functionality mode for ECN. At the same time, possibility of full-functionality mode for ECN. At the same time,
this would ensure that an adversary's modifications to the ECN field this would ensure that an adversary's modifications to the ECN field
cannot be used to launch theft- or denial-of-service attacks across cannot be used to launch theft- or denial-of-service attacks across
an IPsec tunnel endpoint, as any such modifications will be discarded an IPsec tunnel endpoint, as any such modifications will be discarded
at the tunnel endpoint. at the tunnel endpoint.
In principle, permitting the use of ECN functionality in the outer In principle, permitting the use of ECN functionality in the outer
header of an IPsec tunnel raises security concerns because an adver- header of an IPsec tunnel raises security concerns because an
sary could tamper with the information that propagates beyond the adversary could tamper with the information that propagates beyond
tunnel endpoint. Based on an analysis (included in Sections 18 and the tunnel endpoint. Based on an analysis (included in Sections 18
19) of these concerns and the associated risks, our overall approach and 19) of these concerns and the associated risks, our overall
has been to provide configuration support for IPsec changes to remove approach has been to provide configuration support for IPsec changes
the conflict with ECN. to remove the conflict with ECN.
In particular, in tunnel mode the IPsec tunnel MUST support either In particular, in tunnel mode the IPsec tunnel MUST support either
the limited-functionality or the full-functionality mode outlined in the limited-functionality or the full-functionality mode outlined in
Section 9.1.1. Section 9.1.1.
This makes permission to use ECN functionality in the outer header of This makes permission to use ECN functionality in the outer header of
an IPsec tunnel a configurable part of the corresponding IPsec Secu- an IPsec tunnel a configurable part of the corresponding IPsec
rity Association (SA), so that it can be disabled in situations where Security Association (SA), so that it can be disabled in situations
the risks are judged to outweigh the benefits. The result is that an where the risks are judged to outweigh the benefits. The result is
IPsec security administrator is presented with two alternatives for that an IPsec security administrator is presented with two
the behavior of ECN-capable connections within an IPsec tunnel, the alternatives for the behavior of ECN-capable connections within an
limited-functionality alternative and full-functionality alternative IPsec tunnel, the limited-functionality alternative and full-
described earlier. All IPsec implementations MUST implement either functionality alternative described earlier. All IPsec
the limited-functionality or the full-functionality alternative in implementations MUST implement either the limited-functionality or
order to eliminate incompatibility between ECN and IPsec tunnels, but the full-functionality alternative in order to eliminate
implementers MAY choose to implement either alternative. incompatibility between ECN and IPsec tunnels, but implementers MAY
choose to implement either alternative.
In addition, this document specifies how the endpoints of an IPsec In addition, this document specifies how the endpoints of an IPsec
tunnel could negotiate enabling ECN functionality in the outer head- tunnel could negotiate enabling ECN functionality in the outer
ers of that tunnel based on security policy. The ability to negoti- headers of that tunnel based on security policy. The ability to
ate ECN usage between tunnel endpoints would enable a security admin- negotiate ECN usage between tunnel endpoints would enable a security
istrator to disable ECN in situations where she believes the risks administrator to disable ECN in situations where she believes the
(e.g., of lost congestion notifications) outweigh the benefits of risks (e.g., of lost congestion notifications) outweigh the benefits
ECN. of ECN.
The IPsec protocol, as defined in [ESP, AH], does not include the IP The IPsec protocol, as defined in [ESP, AH], does not include the IP
header's ECN field in any of its cryptographic calculations (in the header's ECN field in any of its cryptographic calculations (in the
case of tunnel mode, the outer IP header's ECN field is not case of tunnel mode, the outer IP header's ECN field is not
included). Hence modification of the ECN field by a network node has included). Hence modification of the ECN field by a network node has
no effect on IPsec's end-to-end security, because it cannot cause any no effect on IPsec's end-to-end security, because it cannot cause any
IPsec integrity check to fail. As a consequence, IPsec does not pro- IPsec integrity check to fail. As a consequence, IPsec does not
vide any defense against an adversary's modification of the ECN field provide any defense against an adversary's modification of the ECN
(i.e., a man-in-the-middle attack), as the adversary's modification field (i.e., a man-in-the-middle attack), as the adversary's
will also have no effect on IPsec's end-to-end security. In some modification will also have no effect on IPsec's end-to-end security.
environments, the ability to modify the ECN field without affecting In some environments, the ability to modify the ECN field without
IPsec integrity checks may constitute a covert channel; if it is nec- affecting IPsec integrity checks may constitute a covert channel; if
essary to eliminate such a channel or reduce its bandwidth, then the it is necessary to eliminate such a channel or reduce its bandwidth,
IPsec tunnel should be run in limited-functionality mode. then the IPsec tunnel should be run in limited-functionality mode.
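The reason a change to the ECN field cannot cause an integrity check to fail can be illustrated with a deliberately simplified sketch. It assumes an AH-style check in which the whole TOS / Traffic Class octet is treated as mutable and zeroed before the check value is computed. The function name, the 20-octet header layout, and the key are hypothetical; real AH processing also zeroes other mutable fields and covers the AH header itself.

```python
import hashlib
import hmac


def simplified_icv(ip_header: bytes, payload: bytes, key: bytes) -> bytes:
    """Compute a truncated HMAC-SHA-1 over a header whose TOS octet is zeroed.

    Assumption: only octet 1 (IPv4 TOS, which carries the ECN field) is
    masked here. This is an illustration, not a conformant AH computation.
    """
    masked = bytearray(ip_header)
    masked[1] = 0  # the ECN field lives in this octet, so it never reaches the MAC
    digest = hmac.new(key, bytes(masked) + payload, hashlib.sha1).digest()
    return digest[:12]  # 96-bit truncation


# Two headers that differ only in the low-order bits of the TOS octet.
key = b"k" * 20
header_a = bytes([0x45, 0x02]) + bytes(18)
header_b = bytes([0x45, 0x03]) + bytes(18)
same_icv = simplified_icv(header_a, b"data", key) == simplified_icv(header_b, b"data", key)
```

Because the masked input is identical for both headers, the two check values match, which is why the integrity check offers no defense against modification of the ECN field.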
9.2.1. Negotiation between Tunnel Endpoints 9.2.1. Negotiation between Tunnel Endpoints
This section describes the detailed changes to enable usage of ECN This section describes the detailed changes to enable usage of ECN
over IPsec tunnels, including the negotiation of ECN support between over IPsec tunnels, including the negotiation of ECN support between
tunnel endpoints. This is supported by three changes to IPsec: tunnel endpoints. This is supported by three changes to IPsec:
* An optional Security Association Database (SAD) field indicating * An optional Security Association Database (SAD) field indicating
whether tunnel encapsulation and decapsulation processing allows whether tunnel encapsulation and decapsulation processing allows
or forbids ECN usage in the outer IP header. or forbids ECN usage in the outer IP header.
* An optional Security Association Attribute that enables negotia- * An optional Security Association Attribute that enables
tion of this SAD field between the two endpoints of an SA that negotiation of this SAD field between the two endpoints of an SA
supports tunnel mode. that supports tunnel mode.
* Changes to tunnel mode encapsulation and decapsulation process- * Changes to tunnel mode encapsulation and decapsulation
ing to allow or forbid ECN usage in the outer IP header based on processing to allow or forbid ECN usage in the outer IP header
the value of the SAD field. When ECN usage is allowed in the based on the value of the SAD field. When ECN usage is allowed in
outer IP header, ECT is set in the outer header for ECN-capable the outer IP header, the ECT codepoint is set in the outer header
connections and congestion notifications (indicated by the CE bit) for ECN-capable connections and congestion notifications
from such connections are propagated to the inner header at tunnel (indicated by the CE codepoint) from such connections are
egress. propagated to the inner header at tunnel egress.
If negotiation of ECN usage is implemented, then the SAD field SHOULD If negotiation of ECN usage is implemented, then the SAD field SHOULD
also be implemented. On the other hand, negotiation of ECN usage is also be implemented. On the other hand, negotiation of ECN usage is
OPTIONAL in all cases, even for implementations that support the SAD OPTIONAL in all cases, even for implementations that support the SAD
field. The encapsulation and decapsulation processing changes are field. The encapsulation and decapsulation processing changes are
REQUIRED, but MAY be implemented without the other two changes by REQUIRED, but MAY be implemented without the other two changes by
assuming that ECN usage is always forbidden. The full-functionality assuming that ECN usage is always forbidden. The full-functionality
alternative for ECN usage over IPsec tunnels consists of the SAD alternative for ECN usage over IPsec tunnels consists of the SAD
field and the full version of encapsulation and decapsulation pro- field and the full version of encapsulation and decapsulation
cessing changes, with or without the OPTIONAL negotiation support. processing changes, with or without the OPTIONAL negotiation support.
The limited-functionality alternative consists of a subset of the The limited-functionality alternative consists of a subset of the
encapsulation and decapsulation changes that always forbids ECN encapsulation and decapsulation changes that always forbids ECN
usage. usage.
These changes are covered further in the following three subsections. These changes are covered further in the following three subsections.
9.2.1.1. ECN Tunnel Security Association Database Field 9.2.1.1. ECN Tunnel Security Association Database Field
Full ECN functionality adds a new field to the SAD (see [RFC2401]): Full ECN functionality adds a new field to the SAD (see [RFC2401]):
skipping to change at page 29, line 8 skipping to change at page 31, line 14
congestion occurring within the tunnel. The allowed value enables congestion occurring within the tunnel. The allowed value enables
ECN congestion notifications. The forbidden value disables such ECN congestion notifications. The forbidden value disables such
notifications, causing all congestion to be indicated via dropped notifications, causing all congestion to be indicated via dropped
packets. packets.
[OPTIONAL. The value of this field SHOULD be assumed to be [OPTIONAL. The value of this field SHOULD be assumed to be
"forbidden" in implementations that do not support it.] "forbidden" in implementations that do not support it.]
If this attribute is implemented, then the SA specification in a If this attribute is implemented, then the SA specification in a
Security Policy Database (SPD) entry MUST support a corresponding Security Policy Database (SPD) entry MUST support a corresponding
attribute, and this SPD attribute MUST be covered by the SPD adminis- attribute, and this SPD attribute MUST be covered by the SPD
trative interface (currently described in Section 4.4.1 of administrative interface (currently described in Section 4.4.1 of
[RFC2401]). [RFC2401]).
9.2.1.2. ECN Tunnel Security Association Attribute 9.2.1.2. ECN Tunnel Security Association Attribute
A new IPsec Security Association Attribute is defined to enable the A new IPsec Security Association Attribute is defined to enable the
support for ECN congestion notifications based on the outer IP header support for ECN congestion notifications based on the outer IP header
to be negotiated for IPsec tunnels (see [RFC2407]). This attribute to be negotiated for IPsec tunnels (see [RFC2407]). This attribute
is OPTIONAL, although implementations that support it SHOULD also is OPTIONAL, although implementations that support it SHOULD also
support the SAD field defined in Section 9.2.1.1. support the SAD field defined in Section 9.2.1.1.
Attribute Type Attribute Type
class value type class value type
------------------------------------------------- -------------------------------------------------
ECN Tunnel 10 Basic ECN Tunnel 10 Basic
The IPsec SA Attribute value 10 has been allocated by IANA to indi- The IPsec SA Attribute value 10 has been allocated by IANA to
cate that the ECN Tunnel SA Attribute is being negotiated; the type indicate that the ECN Tunnel SA Attribute is being negotiated; the
of this attribute is Basic (see Section 4.5 of [RFC2407]). The Class type of this attribute is Basic (see Section 4.5 of [RFC2407]). The
Values are used to conduct the negotiation. See [RFC2407, RFC2408, Class Values are used to conduct the negotiation. See [RFC2407,
RFC2409] for further information including encoding formats and RFC2408, RFC2409] for further information including encoding formats
requirements for negotiating this SA attribute. and requirements for negotiating this SA attribute.
Class Values Class Values
ECN Tunnel ECN Tunnel
Specifies whether ECN functionality is allowed to Specifies whether ECN functionality is allowed to
be used with Tunnel Encapsulation Mode. be used with Tunnel Encapsulation Mode.
This affects tunnel encapsulation and decapsulation processing - This affects tunnel encapsulation and decapsulation processing -
see Section 9.2.1.3. see Section 9.2.1.3.
RESERVED 0 RESERVED 0
Allowed 1 Allowed 1
Forbidden 2 Forbidden 2
Values 3-61439 are reserved to IANA. Values 61440-65535 are for Values 3-61439 are reserved to IANA. Values 61440-65535 are for
private use. private use.
If unspecified, the default shall be assumed to be Forbidden. If unspecified, the default shall be assumed to be Forbidden.
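As an illustration of the attribute defined above, the following sketch encodes the ECN Tunnel attribute in the ISAKMP Basic (type/value) form, classifies a 16-bit class value into the ranges listed, and applies the Forbidden default. The function and constant names are hypothetical; the wire layout assumed is the Basic attribute format (AF bit set, 15-bit type, 16-bit value) from the ISAKMP data attribute definition.

```python
import struct

ECN_TUNNEL_ATTR_TYPE = 10  # SA attribute number from the table above
ECN_TUNNEL_ALLOWED = 1
ECN_TUNNEL_FORBIDDEN = 2


def encode_ecn_tunnel_attr(value: int) -> bytes:
    """Encode as a Basic attribute: AF bit (0x8000) | type, then a 16-bit value."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("Basic attribute values are 16 bits")
    return struct.pack("!HH", 0x8000 | ECN_TUNNEL_ATTR_TYPE, value)


def classify_value(value: int) -> str:
    """Map a class value onto the ranges given in the text."""
    if value == 0:
        return "reserved"
    if value == ECN_TUNNEL_ALLOWED:
        return "allowed"
    if value == ECN_TUNNEL_FORBIDDEN:
        return "forbidden"
    if 3 <= value <= 61439:
        return "reserved-to-iana"
    if 61440 <= value <= 65535:
        return "private-use"
    raise ValueError("class values are 16 bits")


def effective_setting(negotiated_attrs: dict) -> int:
    """Return the negotiated value, or Forbidden if the attribute is absent."""
    return negotiated_attrs.get(ECN_TUNNEL_ATTR_TYPE, ECN_TUNNEL_FORBIDDEN)
```

For example, `encode_ecn_tunnel_attr(ECN_TUNNEL_ALLOWED)` yields the four octets `80 0a 00 01`.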
ECN Tunnel is a new SA attribute, and hence initiators that use it ECN Tunnel is a new SA attribute, and hence initiators that use it
can expect to encounter responders that do not understand it, and can expect to encounter responders that do not understand it, and
therefore reject proposals containing it. For backwards compatibil- therefore reject proposals containing it. For backwards
ity with such implementations initiators SHOULD always also include a compatibility with such implementations initiators SHOULD always also
proposal without the ECN Tunnel attribute to enable such a responder include a proposal without the ECN Tunnel attribute to enable such a
to select a transform or proposal that does not contain the ECN Tun- responder to select a transform or proposal that does not contain the
nel attribute. RFC 2407 currently requires responders to reject all ECN Tunnel attribute. RFC 2407 currently requires responders to
proposals if any proposal contains an unknown attribute; this reject all proposals if any proposal contains an unknown attribute;
requirement is expected to be changed to require a responder not to this requirement is expected to be changed to require a responder not
select proposals or transforms containing unknown attributes. to select proposals or transforms containing unknown attributes.
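The backwards-compatibility advice above can be sketched as follows. The initiator offers one proposal carrying the ECN Tunnel attribute and an otherwise identical one without it; the responder shown implements the anticipated behaviour of not selecting proposals that contain unknown attributes, rather than the current RFC 2407 rule of rejecting everything. The `Proposal` type and both function names are hypothetical and model only what is needed for the illustration.

```python
from dataclasses import dataclass

ECN_TUNNEL = 10  # SA attribute number defined in this section
ALLOWED = 1


@dataclass(frozen=True)
class Proposal:
    transform: str          # illustrative transform label
    attributes: tuple = ()  # (attribute type, value) pairs


def initiator_proposals(transform: str, base_attrs) -> list:
    """Offer the ECN-enabled proposal first, then a fallback without the attribute."""
    base = tuple(base_attrs)
    return [
        Proposal(transform, base + ((ECN_TUNNEL, ALLOWED),)),
        Proposal(transform, base),
    ]


def responder_select(proposals, known_types):
    """Pick the first proposal whose attributes are all understood (assumed behaviour)."""
    for proposal in proposals:
        if all(attr_type in known_types for attr_type, _ in proposal.attributes):
            return proposal
    return None


# Attribute 4 with value 1 is Encapsulation Mode = Tunnel in the IPsec DOI.
offers = initiator_proposals("ESP", [(4, 1)])
legacy_choice = responder_select(offers, known_types={4})
ecn_aware_choice = responder_select(offers, known_types={4, ECN_TUNNEL})
```

A responder that does not know the attribute ends up with the fallback proposal, while an ECN-aware responder can select the first one.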
9.2.1.3. Changes to IPsec Tunnel Header Processing 9.2.1.3. Changes to IPsec Tunnel Header Processing
For full ECN support, the encapsulation and decapsulation processing For full ECN support, the encapsulation and decapsulation processing
for the IPv4 TOS field and the IPv6 Traffic Class field are changed for the IPv4 TOS field and the IPv6 Traffic Class field are changed
from that specified in [RFC2401] to the following: from that specified in [RFC2401] to the following:
<-- How Outer Hdr Relates to Inner Hdr --> <-- How Outer Hdr Relates to Inner Hdr -->
Outer Hdr at Inner Hdr at Outer Hdr at Inner Hdr at
IPv4 Encapsulator Decapsulator IPv4 Encapsulator Decapsulator
skipping to change at page 30, line 38 skipping to change at page 32, line 44
Header fields: Header fields:
DS Field copied from inner hdr (6) no change DS Field copied from inner hdr (6) no change
ECN Field constructed (7) constructed (8) ECN Field constructed (7) constructed (8)
(5)(6) If the packet will immediately enter a domain for which the (5)(6) If the packet will immediately enter a domain for which the
DSCP value in the outer header is not appropriate, that value MUST DSCP value in the outer header is not appropriate, that value MUST
be mapped to an appropriate value for the domain [RFC 2474]. Also be mapped to an appropriate value for the domain [RFC 2474]. Also
see [RFC 2475] for further information. see [RFC 2475] for further information.
(7) If the value of the ECN Tunnel field in the SAD entry for this (7) If the value of the ECN Tunnel field in the SAD entry for this
SA is "allowed" and the value of ECT (bit 0) is 1 in the inner SA is "allowed" and the ECN field in the inner header is set to
header, set ECT to 1 in the outer header, else set ECT to 0 in the any value other than CE, copy this ECN field to the outer header.
outer header. Set CE (bit 1) to 0 in the outer header. If the ECN field in the inner header is set to CE, then set the
ECN field in the outer header to ECT(0).
(8) If the value of the ECN tunnel field in the SAD entry for this (8) If the value of the ECN tunnel field in the SAD entry for this
SA is "allowed" and the value of ECT (bit 0) in the inner header SA is "allowed" and the ECN field in the inner header is set to
is 1, then set the CE bit (bit 1) in the inner header to the logi- ECT(0) or ECT(1) and the ECN field in the outer header is set to
cal OR of the CE bit in the inner header with the CE bit in the CE, then copy the ECN field from the outer header to the inner
outer header, else make no change to the ECN field. header. Otherwise, make no change to the ECN field in the inner
header.
Notes (5) and (6) are identical here; both numbers are kept to match Notes (5) and (6) are identical here; both numbers are kept to match
usage in [RFC2401], where the two notes differ. usage in [RFC2401], where the two notes differ.
The above description applies to implementations that support the ECN The above description applies to implementations that support the ECN
Tunnel field in the SAD; such implementations MUST implement this Tunnel field in the SAD; such implementations MUST implement this
processing instead of the processing of the IPv4 TOS octet and IPv6 processing instead of the processing of the IPv4 TOS octet and IPv6
Traffic Class octet defined in [RFC2401]. This constitutes the full- Traffic Class octet defined in [RFC2401]. This constitutes the full-
functionality alternative for ECN usage with IPsec tunnels. functionality alternative for ECN usage with IPsec tunnels.
An implementation that does not support the ECN Tunnel field in the An implementation that does not support the ECN Tunnel field in the
SAD MUST implement this processing by assuming that the value of the SAD MUST implement this processing by assuming that the value of the
ECN Tunnel field of the SAD is "forbidden" for every SA. In this ECN Tunnel field of the SAD is "forbidden" for every SA. In this
case, the processing of the ECN field reduces to: case, the processing of the ECN field reduces to:
(7) Set the ECN field (ECT and CE bits) to zero in the outer (7) Set the ECN field to not-ECT in the outer header.
header.
(8) Make no change to the ECN field in the inner header. (8) Make no change to the ECN field in the inner header.
This constitutes the limited functionality alternative for ECN usage This constitutes the limited functionality alternative for ECN usage
with IPsec tunnels. with IPsec tunnels.
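The revised rules (7) and (8), together with their limited-functionality reduction, can be sketched as two small functions. The codepoint names come from the text; the two-bit values shown follow the assignment later standardized in RFC 3168 and are an assumption here, since the rules depend only on the names. The function names and the `ecn_allowed` flag (standing in for the SAD field) are hypothetical.

```python
from enum import Enum


class ECN(Enum):
    NOT_ECT = 0b00
    ECT_1 = 0b01
    ECT_0 = 0b10
    CE = 0b11


def encapsulate(inner: ECN, ecn_allowed: bool) -> ECN:
    """Rule (7): construct the outer header's ECN field at tunnel ingress."""
    if not ecn_allowed:
        return ECN.NOT_ECT  # limited functionality: outer is always not-ECT
    if inner is ECN.CE:
        return ECN.ECT_0    # do not carry an upstream CE mark into the outer header
    return inner            # copy any value other than CE


def decapsulate(inner: ECN, outer: ECN, ecn_allowed: bool) -> ECN:
    """Rule (8): construct the inner header's ECN field at tunnel egress."""
    if ecn_allowed and inner in (ECN.ECT_0, ECN.ECT_1) and outer is ECN.CE:
        return ECN.CE       # propagate congestion experienced inside the tunnel
    return inner            # otherwise the inner field is left unchanged
```

Note that an inner CE mark set upstream of the tunnel survives in both modes, because decapsulation never clears the inner field.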
For backwards compatibility, packets with ECT and CE both set to 1 in For backwards compatibility, packets with the CE codepoint set in the
the outer header SHOULD be dropped if they arrive on an SA that is outer header SHOULD be dropped if they arrive on an SA that is using
using the limited-functionality option, or that is using the full- the limited-functionality option, or that is using the full-
functionality option (i.e., and has set the ECT flag in the outer functionality option with the not-ECT codepoint set in the inner
header to 1) for a packet with the ECT flag set to 0 in the inner
header. header.
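The revised backwards-compatibility check at tunnel egress can be written as a single predicate. This is a minimal sketch with hypothetical parameter names; since the text uses SHOULD, an implementation may have reasons to act differently.

```python
def should_drop_on_egress(outer_is_ce: bool, inner_is_not_ect: bool, ecn_allowed: bool) -> bool:
    """Return True when a packet arriving with CE in the outer header should be dropped.

    ecn_allowed=False models an SA using the limited-functionality option.
    """
    if not outer_is_ce:
        return False  # the check only concerns packets marked CE in the outer header
    # A CE mark is unexpected if the outer header was sent as not-ECT (limited
    # functionality), or if the inner packet never declared ECN capability.
    return (not ecn_allowed) or inner_is_not_ect
```

In both dropping cases the ingress would have sent the outer header as not-ECT, so a CE mark can only have been set by a node that ignored that codepoint or tampered with the field.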
9.2.2. Changes to the ECN Field within an IPsec Tunnel. 9.2.2. Changes to the ECN Field within an IPsec Tunnel.
If the ECN Field is changed inappropriately within an IPsec tunnel, If the ECN Field is changed inappropriately within an IPsec tunnel,
and this change is detected at the tunnel egress, then the receipt of and this change is detected at the tunnel egress, then the receipt of
a packet not satisfying the appropriate condition for its SA is an a packet not satisfying the appropriate condition for its SA is an
auditable event. An implementation MAY create audit records with auditable event. An implementation MAY create audit records with
per-SA counts of incorrect packets over some time period rather than per-SA counts of incorrect packets over some time period rather than
creating an audit record for each erroneous packet. Any such audit creating an audit record for each erroneous packet. Any such audit
record SHOULD contain the headers from at least one erroneous packet, record SHOULD contain the headers from at least one erroneous packet,
but need not contain the headers from every packet represented by the but need not contain the headers from every packet represented by the
entry. entry.
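One way to realize the aggregated audit records permitted above is to keep a per-SA counter and retain the headers of the first erroneous packet seen in each period. The class and method names are hypothetical, and the choice of the first packet as the retained sample is an assumption; the text only asks for the headers of at least one erroneous packet.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class AuditEntry:
    count: int = 0
    sample_headers: Optional[bytes] = None  # headers of one erroneous packet


class EcnAuditLog:
    """Aggregate ECN-related audit events per SA over one reporting period."""

    def __init__(self) -> None:
        self._entries = defaultdict(AuditEntry)

    def record(self, sa_id: int, headers: bytes) -> None:
        entry = self._entries[sa_id]
        entry.count += 1
        if entry.sample_headers is None:
            entry.sample_headers = headers  # keep only the first sample

    def flush(self) -> dict:
        """Return the records for the period just ended and start a new period."""
        records = dict(self._entries)
        self._entries.clear()
        return records


log = EcnAuditLog()
log.record(7, b"hdr-1")
log.record(7, b"hdr-2")
period = log.flush()
```

After the two events above, the flushed record for SA 7 carries a count of 2 and the headers of the first packet only.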
9.2.3. Comments for IPsec Support 9.2.3. Comments for IPsec Support
Substantial comments were received on two areas of this document dur- Substantial comments were received on two areas of this document
ing review by the IPsec working group. This section describes these during review by the IPsec working group. This section describes
comments and explains why the proposed changes were not incorporated. these comments and explains why the proposed changes were not
incorporated.
The first comment indicated that per-node configuration is easier to The first comment indicated that per-node configuration is easier to
implement than per-SA configuration. After serious thought and implement than per-SA configuration. After serious thought and
despite some initial encouragement of per-node configuration, it no despite some initial encouragement of per-node configuration, it no
longer seems to be a good idea. The concern is that as ECN-awareness longer seems to be a good idea. The concern is that as ECN-awareness
is progressively deployed in IPsec, many ECN-aware IPsec implementa- is progressively deployed in IPsec, many ECN-aware IPsec
tions will find themselves communicating with a mixture of ECN-aware implementations will find themselves communicating with a mixture of
and ECN-unaware IPsec tunnel endpoints. In such an environment with ECN-aware and ECN-unaware IPsec tunnel endpoints. In such an
per-node configuration, the only reasonable thing to do is forbid ECN environment with per-node configuration, the only reasonable thing to
usage for all IPsec tunnels, which is not the desired outcome. do is forbid ECN usage for all IPsec tunnels, which is not the
desired outcome.
In the second area, several reviewers noted that SA negotiation is In the second area, several reviewers noted that SA negotiation is
complex, and adding to it is non-trivial. One reviewer suggested complex, and adding to it is non-trivial. One reviewer suggested
using ICMP after tunnel setup as a possible alternative. The addi- using ICMP after tunnel setup as a possible alternative. The
tion to SA negotiation in this document is OPTIONAL and will remain addition to SA negotiation in this document is OPTIONAL and will
so; implementers are free to ignore it. The authors believe that the remain so; implementers are free to ignore it. The authors believe
assurance it provides can be useful in a number of situations. In that the assurance it provides can be useful in a number of
practice, if this is not implemented, it can be deleted at a subse- situations. In practice, if this is not implemented, it can be
quent stage in the standards process. Extending ICMP to negotiate deleted at a subsequent stage in the standards process. Extending
ECN after tunnel setup is more complex than extending SA attribute ICMP to negotiate ECN after tunnel setup is more complex than
negotiation. Some tunnels do not permit traffic to be addressed to extending SA attribute negotiation. Some tunnels do not permit
the tunnel egress endpoint, hence the ICMP packet would have to be traffic to be addressed to the tunnel egress endpoint, hence the ICMP
addressed to somewhere else, scanned for by the egress endpoint, and packet would have to be addressed to somewhere else, scanned for by
discarded there or at its actual destination. In addition, ICMP the egress endpoint, and discarded there or at its actual
delivery is unreliable, and hence there is a possibility of an ICMP destination. In addition, ICMP delivery is unreliable, and hence
packet being dropped, entailing the invention of yet another there is a possibility of an ICMP packet being dropped, entailing the
ack/retransmit mechanism. It seems better simply to specify an invention of yet another ack/retransmit mechanism. It seems better
OPTIONAL extension to the existing SA negotiation mechanism. simply to specify an OPTIONAL extension to the existing SA
negotiation mechanism.
9.3. IP packets encapsulated in non-IP packet headers. 9.3. IP packets encapsulated in non-IP packet headers.
A different set of issues is raised, relative to ECN, when IP pack- A different set of issues is raised, relative to ECN, when IP
ets are encapsulated in tunnels with non-IP packet headers. This packets are encapsulated in tunnels with non-IP packet headers. This
occurs with MPLS [MPLS], GRE [GRE], L2TP [L2TP], and PPTP [PPTP]. occurs with MPLS [MPLS], GRE [GRE], L2TP [L2TP], and PPTP [PPTP].
For these protocols, there is no conflict with ECN; it is just that For these protocols, there is no conflict with ECN; it is just that
ECN cannot be used within the tunnel unless an ECN codepoint can be ECN cannot be used within the tunnel unless an ECN codepoint can be
specified for the header of the encapsulating protocol. Earlier work specified for the header of the encapsulating protocol. Earlier work
considered a preliminary proposal for incorporating ECN into MPLS, considered a preliminary proposal for incorporating ECN into MPLS,
and proposals for incorporating ECN into GRE, L2TP, or PPTP will be and proposals for incorporating ECN into GRE, L2TP, or PPTP will be
considered as the need arises. considered as the need arises.
10. Issues Raised by Monitoring and Policing Devices 10. Issues Raised by Monitoring and Policing Devices
One possibility is that monitoring and policing devices (or more One possibility is that monitoring and policing devices (or more
informally, "penalty boxes") will be installed in the network to mon- informally, "penalty boxes") will be installed in the network to
itor whether best-effort flows are appropriately responding to con- monitor whether best-effort flows are appropriately responding to
gestion, and to preferentially drop packets from flows determined not congestion, and to preferentially drop packets from flows determined
to be using adequate end-to-end congestion control procedures. not to be using adequate end-to-end congestion control procedures.
We recommend that any "penalty box" that detects a flow or an aggre- We recommend that any "penalty box" that detects a flow or an
gate of flows that is not responding to end-to-end congestion control aggregate of flows that is not responding to end-to-end congestion
first change from marking to dropping packets from that flow, before control first change from marking to dropping packets from that flow,
taking any additional action to restrict the bandwidth available to before taking any additional action to restrict the bandwidth
that flow. Thus, initially, the router may drop packets in which the available to that flow. Thus, initially, the router may drop packets
router would otherwise have set the CE bit. This could include in which the router would otherwise have set the CE codepoint.
dropping those arriving packets for that flow that are ECN-Capable This could include dropping those arriving packets for that flow that
and that already have the CE bit set. In this way, any congestion are ECN-Capable and that already have the CE codepoint set. In this
indications seen by that router for that flow will be guaranteed to way, any congestion indications seen by that router for that flow
also be seen by the end nodes, even in the presence of malicious or will be guaranteed to also be seen by the end nodes, even in the
broken routers elsewhere in the path. If we assume that the first presence of malicious or broken routers elsewhere in the path. If we
action taken at any "penalty box" for an ECN-capable flow will be to assume that the first action taken at any "penalty box" for an ECN-
drop packets instead of marking them, then there is no way that an capable flow will be to drop packets instead of marking them, then
adversary that subverts ECN-based end-to-end congestion control can there is no way that an adversary that subverts ECN-based end-to-end
cause a flow to be characterized as being non-cooperative and placed congestion control can cause a flow to be characterized as being non-
into a more severe action within the "penalty box". cooperative and placed into a more severe action within the "penalty
box".
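The recommended first response of a "penalty box" can be sketched as a per-packet decision. This is a minimal illustration with hypothetical names; it models only the switch from marking to dropping and not any later bandwidth restriction.

```python
def congestion_action(signal_congestion: bool, ecn_capable: bool, flow_flagged: bool) -> str:
    """Choose what the router does with an arriving packet.

    signal_congestion: the queue management algorithm wants to notify this flow.
    ecn_capable: the packet carries an ECN-Capable Transport codepoint.
    flow_flagged: the penalty box judged the flow unresponsive to congestion.
    """
    if not signal_congestion:
        return "forward"
    if ecn_capable and not flow_flagged:
        return "mark"  # normal ECN behaviour: set the CE codepoint
    return "drop"      # flagged or non-ECN flows are notified by packet loss
```

Because a drop cannot be erased by a later router, the end nodes of a flagged flow still see every congestion indication this router generates.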
The monitoring and policing devices that are actually deployed could The monitoring and policing devices that are actually deployed could
fall short of the `ideal' monitoring device described above, in that fall short of the `ideal' monitoring device described above, in that
the monitoring is applied not to a single flow, but to an aggregate the monitoring is applied not to a single flow, but to an aggregate
of flows (e.g., those sharing a single IPsec tunnel). In this case, of flows (e.g., those sharing a single IPsec tunnel). In this case,
the switch from marking to dropping would apply to all of the flows the switch from marking to dropping would apply to all of the flows
in that aggregate, denying the benefits of ECN to the other flows in in that aggregate, denying the benefits of ECN to the other flows in
the aggregate also. At the highest level of aggregation, another the aggregate also. At the highest level of aggregation, another
form of the disabling of ECN happens even in the absence of monitor- form of the disabling of ECN happens even in the absence of
ing and policing devices, when ECN-Capable RED queues switch from monitoring and policing devices, when ECN-Capable RED queues switch
marking to dropping packets as an indication of congestion when the from marking to dropping packets as an indication of congestion when
average queue size has exceeded some threshold. the average queue size has exceeded some threshold.
If there were serious operational problems with routers inappropri-
ately erasing the CE bit in packet headers, this could be addressed
to some extent by including a one-bit ECN nonce in packet headers.
Routers would erase the nonce when they set the CE bit [SCWA99].
Routers that erased the CE bit would face additional difficulty in
reconstructing the original nonce, and thus repeated erasure of the
CE bit would be more likely to be detected by the end-nodes. (This
could in fact be done without adding any extra bits for ECN in the IP
header, by using the ECN codepoints (ECT=1, CE=0) and (ECT=0, CE=1)
as the two values for the nonce, and by defining the codepoint
(ECT=0, CE=1) to mean exactly the same as the codepoint (ECT=1,
CE=0).) However, at this point the potential danger of misbehaving
routers does not seem of sufficient concern to warrant this addi-
tional complication of adding an ECN nonce to protect against the
erasure of the CE bit. Additional research is also needed to better
understand the value of such a nonce and appropriate means of gener-
ating sequences of nonce values that an adversary will find suffi-
ciently difficult to reconstruct.
An ECN nonce would also address the problem of misbehaving transport
receivers lying to the transport sender about whether or not the CE
bit was set in a packet. However, another possibility is for the
data sender to test for a misbehaving receiver directly, by occasion-
ally sending a data packet with ECT and CE set, to see if the
receiver reports receiving the CE bit. Of course, if these packets
encountered congestion in the network, the router would make no
change in the packets, because the CE bit would already be set.
Thus, for packets sent with the ECT and CE bits set, the TCP end-
nodes could not determine if some router intended to set the CE bit
in these packets. For this reason, sending packets with the ECT and
CE bits set would have to be done very sparingly. In addition, the TCP
sender would have to remember which packets were sent with the ECT
and CE bits set, so that it doesn't react to them as if there was
congestion in the network. We believe that further research is
needed on possible transport-based mechanisms for verifying that the
transport receiver does not lie to the transport sender about the
receipt of congestion indications.
11. Evaluations of ECN 11. Evaluations of ECN
11.1. Related Work Evaluating ECN
This section discusses some of the related work evaluating the use of This section discusses some of the related work evaluating the use of
ECN. The ECN Web Page [ECN] has pointers to other papers, as well as ECN. The ECN Web Page [ECN] has pointers to other papers, as well as
to implementations of ECN. to implementations of ECN.
[Floyd94] considers the advantages and drawbacks of adding ECN to the [Floyd94] considers the advantages and drawbacks of adding ECN to the
TCP/IP architecture. As shown in the simulation-based comparisons, TCP/IP architecture. As shown in the simulation-based comparisons,
one advantage of ECN is to avoid unnecessary packet drops for short one advantage of ECN is to avoid unnecessary packet drops for short
or delay-sensitive TCP connections. A second advantage of ECN is in or delay-sensitive TCP connections. A second advantage of ECN is in
avoiding some unnecessary retransmit timeouts in TCP. This paper avoiding some unnecessary retransmit timeouts in TCP. This paper
discusses in detail the integration of ECN into TCP's congestion con- discusses in detail the integration of ECN into TCP's congestion
trol mechanisms. The possible disadvantages of ECN discussed in the control mechanisms. The possible disadvantages of ECN discussed in
paper are that a non-compliant TCP connection could falsely advertise the paper are that a non-compliant TCP connection could falsely
itself as ECN-capable, and that a TCP ACK packet carrying an ECN-Echo advertise itself as ECN-capable, and that a TCP ACK packet carrying
message could itself be dropped in the network. The first of these an ECN-Echo message could itself be dropped in the network. The
two issues is discussed in the appendix of this document, and the first of these two issues is discussed in the appendix of this
second is addressed by the addition of the CWR flag in the TCP document, and the second is addressed by the addition of the CWR flag
header. in the TCP header.
Experimental evaluations of ECN include [RFC2884,K98]. The conclu- Experimental evaluations of ECN include [RFC2884,K98]. The
sions of [K98] and [RFC2884] are that ECN TCP gets moderately better conclusions of [K98] and [RFC2884] are that ECN TCP gets moderately
throughput than non-ECN TCP; that ECN TCP flows are fair towards non- better throughput than non-ECN TCP; that ECN TCP flows are fair
ECN TCP flows; and that ECN TCP is robust with two-way traffic (with towards non-ECN TCP flows; and that ECN TCP is robust with two-way
congestion in both directions) and with multiple congested gateways. traffic (with congestion in both directions) and with multiple
Experiments with many short web transfers show that, while most of congested gateways. Experiments with many short web transfers show
the short connections have similar transfer times with or without that, while most of the short connections have similar transfer times
ECN, a small percentage of the short connections have very long with or without ECN, a small percentage of the short connections have
transfer times for the non-ECN experiments as compared to the ECN very long transfer times for the non-ECN experiments as compared to
experiments. the ECN experiments.
12. Summary of changes required in IP and TCP 11.2. A Discussion of the ECN nonce.
This document specified two bits in the IP header, the ECN-Capable The use of two ECT codepoints, ECT(0) and ECT(1), can provide a one-
Transport (ECT) bit and the Congestion Experienced (CE) bit, to be bit ECN nonce in packet headers [SCWA99]. The primary motivation for
used for ECN. The ECT bit set to "0" indicates that the transport this is the desire to allow mechanisms for the data sender to verify
protocol will ignore the CE bit. This is the default value for the that network elements are not erasing the CE codepoint, and that data
ECT bit. The ECT bit set to "1" indicates that the transport proto- receivers are properly reporting to the sender the receipt of packets
col is willing and able to participate in ECN. with the CE codepoint set, as required by the transport protocol.
This section discusses issues of backwards compatibility with IP ECN
implementations in routers conformant with RFC 2481, in which only
one ECT codepoint was defined. We do not believe that the
incremental deployment of ECN implementations that understand the
ECT(1) codepoint will cause significant operational problems. This
is particularly likely to be the case when the deployment of the
ECT(1) codepoint begins with routers, before the ECT(1) codepoint
starts to be used by end-nodes.
The default value for the CE bit is "0". The router sets the CE bit 11.2.1. The Incremental Deployment of ECT(1) in Routers.
to "1" to indicate congestion to the end nodes. The CE bit in a
packet header MUST NOT be reset by a router from "1" to "0".
When viewed in terms of code points, this document has defined three ECN has been an Experimental standard since January 1999, and there
code points for the ECN field, for "not ECT" (ECT=0, CE=0), "ECT but are already implementations of ECN in routers that do not understand
not CE" (ECT=1, CE=0), and "ECT and CE" (ECT=1, CE=1). The code the ECT(1) codepoint. When the use of the ECT(1) codepoint is
point of (ECT=0, CE=1) is not defined in this document. One possi- standardized for TCP or for other transport protocols, this could
bility would be for this code point to be used, some time in the mean that a data sender is using the ECT(1) codepoint, but that this
future, for some other function for non-ECN-capable packets. A sec- codepoint is not understood by a congested router on the path.
ond possibility would be for this code point to be used as an ECN
nonce, as described earlier in the document. A third possibility If allowed by the transport protocol, a data sender would be free not
would be for the code point (ECT=0, CE=1) to be used to indicate that to make use of ECT(1) at all, and to send all ECN-capable packets
the packet is ECN-capable for an alternate semantics for the Conges- with the codepoint ECT(0). However, if an ECN-capable sender is
tion Experienced indication. However, at this time the code point using ECT(1), and the congested router on the path did not understand
(ECT=0, CE=1) remains undefined. the ECT(1) codepoint, then the router would end up marking some of
the ECT(0) packets, and dropping some of the ECT(1) packets, as
indications of congestion. Since TCP is required to react to both
marked and dropped packets, this behavior of dropping packets that
could have been marked poses no significant threat to the network,
and is consistent with the overall approach to ECN that allows
routers to determine when and whether to mark packets as they see fit
(see Section 5).
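The mixed behavior described above can be illustrated with a minimal sketch. It assumes that ECT(0) is the (ECT=1, CE=0) bit pattern already recognized by RFC 2481 routers, that ECT(1) is the (ECT=0, CE=1) pattern, and that an RFC 2481 router tests only the ECT bit. The function names and the two-bit integer encoding are hypothetical and are used only for illustration.

```python
# Two-bit ECN field written as (ECT bit, CE bit); this encoding is an assumption.
NOT_ECT = 0b00
ECT_1 = 0b01   # (ECT=0, CE=1), the codepoint unknown to RFC 2481 routers
ECT_0 = 0b10   # (ECT=1, CE=0), the only ECT codepoint in RFC 2481
CE = 0b11


def legacy_router_signal(ecn):
    """Congestion signal from a hypothetical RFC 2481 router.

    The router looks only at the ECT bit, so ECT(1) packets appear
    not to be ECN-capable and are dropped instead of marked.
    """
    ect_bit = (ecn >> 1) & 1
    if ect_bit:
        return ("mark", CE)
    return ("drop", None)


def updated_router_signal(ecn):
    """Congestion signal from a hypothetical router that knows both ECT codepoints."""
    if ecn in (ECT_0, ECT_1, CE):
        return ("mark", CE)
    return ("drop", None)


if __name__ == "__main__":
    for name, value in (("ECT(0)", ECT_0), ("ECT(1)", ECT_1), ("not-ECT", NOT_ECT)):
        print(name, legacy_router_signal(value), updated_router_signal(value))
```

In both cases the sender receives a congestion indication, either as a mark or as a drop, which is why the older router's behavior is not expected to harm the network.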
12. Summary of changes required in IP and TCP
This document specified two bits in the IP header to be used for ECN.
The not-ECT codepoint indicates that the transport protocol will
ignore the CE codepoint. This is the default value for the ECN
codepoint. The ECT codepoints indicate that the transport protocol
is willing and able to participate in ECN.
The router sets the CE codepoint to indicate congestion to the end
nodes. The CE codepoint in a packet header MUST NOT be reset by a
router.
TCP requires three changes for ECN, a setup phase and two new flags TCP requires three changes for ECN, a setup phase and two new flags
in the TCP header. The ECN-Echo flag is used by the data receiver to in the TCP header. The ECN-Echo flag is used by the data receiver to
inform the data sender of a received CE packet. The Congestion Win- inform the data sender of a received CE packet. The Congestion
dow Reduced (CWR) flag is used by the data sender to inform the data Window Reduced (CWR) flag is used by the data sender to inform the
receiver that the congestion window has been reduced. data receiver that the congestion window has been reduced.
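As a rough illustration of how the two flags interact, the sketch below models only the data receiver's side. It assumes the behavior specified in the body of the document, namely that the receiver keeps setting ECN-Echo on its ACKs until it sees a data packet with CWR set. The class and method names are hypothetical, and the handling of a packet that carries both CWR and CE is one reasonable reading rather than a statement of the specification.

```python
class EcnReceiverSketch:
    """Hypothetical model of the receiver-side ECN-Echo state."""

    def __init__(self):
        # True while the receiver should set ECN-Echo on outgoing ACKs.
        self.echoing = False

    def on_data_packet(self, ce_marked, cwr_flag):
        # CWR from the sender says the window was reduced, so stop echoing.
        if cwr_flag:
            self.echoing = False
        # A CE packet (re)starts echoing, even if it also carried CWR.
        if ce_marked:
            self.echoing = True

    def ack_has_ecn_echo(self):
        return self.echoing


if __name__ == "__main__":
    rx = EcnReceiverSketch()
    rx.on_data_packet(ce_marked=True, cwr_flag=False)
    print(rx.ack_has_ecn_echo())   # ECN-Echo is set after a CE packet
    rx.on_data_packet(ce_marked=False, cwr_flag=True)
    print(rx.ack_has_ecn_echo())   # cleared once CWR is received
```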
When ECN (Explicit Congestion Notification [RFC2481]) is used, it is When ECN (Explicit Congestion Notification [RFC2481]) is used, it is
required that congestion indications generated within an IP tunnel required that congestion indications generated within an IP tunnel
not be lost at the tunnel egress. We specified a minor modification not be lost at the tunnel egress. We specified a minor modification
to the IP protocol's handling of the ECN field during encapsulation to the IP protocol's handling of the ECN field during encapsulation
and de-capsulation to allow flows that will undergo IP tunneling to and de-capsulation to allow flows that will undergo IP tunneling to
use ECN. use ECN.
Two options for ECN in tunnels were specified: Two options for ECN in tunnels were specified:
1) A limited-functionality option that does not use ECN inside the IP 1) A limited-functionality option that does not use ECN inside the IP
tunnel, by turning the ECT bit in the outer header off, and not tunnel, by setting the ECN field in the outer header to not-ECT, and
altering the inner header at the time of decapsulation. not altering the inner header at the time of decapsulation.
2) The full-functionality option, which copies the ECT bit of the 2) The full-functionality option, which sets the ECN field in the
inner header to the encapsulating header. At decapsulation, if the outer header to either not-ECT or to one of the ECT codepoints,
ECT bit is set in the inner header, the CE bit on the outer header is depending on the ECN field in the inner header. At decapsulation, if
ORed with the CE bit of the inner header to update the CE bit of the the CE codepoint is set in the outer header, and the inner header is
packet. set to one of the ECT codepoints, then the CE codepoint is copied to
the inner header.
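The two options above, in their codepoint form, can be sketched as follows. This is a minimal illustration that covers only the cases stated in this summary. The function names and integer encoding are hypothetical, and the choice to send ECT(0) in the outer header when the inner header already carries CE is an assumption that this summary does not state.

```python
# Two-bit ECN field values; the integer encoding is an assumption.
NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11


def encapsulate(inner, full_functionality):
    """Return the ECN field for the outer header at tunnel ingress."""
    if not full_functionality:
        # Limited-functionality option: ECN is not used inside the tunnel.
        return NOT_ECT
    if inner == CE:
        # Assumption: an inner CE packet travels as ECT(0) in the outer header.
        return ECT_0
    # not-ECT or one of the ECT codepoints is carried over from the inner header.
    return inner


def decapsulate(inner, outer, full_functionality):
    """Return the ECN field of the inner header after tunnel egress."""
    if full_functionality and outer == CE and inner in (ECT_0, ECT_1):
        # Congestion seen inside the tunnel is copied to the inner header.
        return CE
    # Otherwise the inner header is left unaltered.
    return inner


if __name__ == "__main__":
    outer = encapsulate(ECT_0, full_functionality=True)
    # Suppose a router inside the tunnel marks the outer header with CE.
    print(decapsulate(ECT_0, CE, full_functionality=True) == CE)
    print(outer == ECT_0)
```

With the limited-functionality option the outer header is always not-ECT, so routers inside the tunnel signal congestion by dropping, and the inner header passes through unchanged.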
All IP tunnels MUST implement one of the two alternative approaches All IP tunnels MUST implement one of the two alternative approaches
described above. For IPsec tunnels, this document also defines an described above. For IPsec tunnels, this document also defines an
optional IPsec Security Association (SA) attribute that enables optional IPsec Security Association (SA) attribute that enables
negotiation of ECN usage within IPsec tunnels and an optional field negotiation of ECN usage within IPsec tunnels and an optional field
in the Security Association Database to indicate whether ECN is per- in the Security Association Database to indicate whether ECN is
mitted in tunnel mode on a SA. The required changes to IPsec tunnels permitted in tunnel mode on a SA. The required changes to IPsec
for ECN usage modify RFC 2401 [RFC2401], which defines the IPsec tunnels for ECN usage modify RFC 2401 [RFC2401], which defines the
architecture and specifies some aspects of its implementation. The IPsec architecture and specifies some aspects of its implementation.
new IPsec SA attribute is in addition to those already defined in The new IPsec SA attribute is in addition to those already defined in
Section 4.5 of [RFC2407]. Section 4.5 of [RFC2407].
This document is intended to obsolete RFC 2481, "A Proposal to add This document is intended to obsolete RFC 2481, "A Proposal to add
Explicit Congestion Notification (ECN) to IP", which defined ECN as Explicit Congestion Notification (ECN) to IP", which defined ECN as
an Experimental Protocol for the Internet Community. The rest of an Experimental Protocol for the Internet Community. The rest of
this section describes the relationship between this document and its this section describes the relationship between this document and its
predecessor. predecessor.
RFC 2481 included a brief discussion of the use of ECN with encapsu- RFC 2481 included a brief discussion of the use of ECN with
lated packets, and noted that for the IPsec specifications at the encapsulated packets, and noted that for the IPsec specifications at
time (January 1999), flows could not safely use ECN if they were to the time (January 1999), flows could not safely use ECN if they were
traverse IPsec tunnels. RFC 2481 also described the changes that to traverse IPsec tunnels. RFC 2481 also described the changes that
could be made to IPsec tunnel specifications to make them compatible could be made to IPsec tunnel specifications to make them compatible
with ECN. with ECN.
This document also incorporates work that was done after RFC 2481. This document also incorporates work that was done after RFC 2481.
First was to describe the changes to IPsec tunnels in detail, and First was to describe the changes to IPsec tunnels in detail, and
extensively discuss the security implications of ECN (now included as extensively discuss the security implications of ECN (now included as
Sections 18 and 19 of this document). Second was to extend the dis- Sections 18 and 19 of this document). Second was to extend the
cussion of IPsec tunnels to include all IP tunnels. Because older IP discussion of IPsec tunnels to include all IP tunnels. Because older
tunnels are not compatible with a flow's use of ECN, the deployment IP tunnels are not compatible with a flow's use of ECN, the
of ECN in the Internet will create strong pressure for older IP tun- deployment of ECN in the Internet will create strong pressure for
nels to be updated to an ECN-compatible version, using either the older IP tunnels to be updated to an ECN-compatible version, using
limited-functionality or the full-functionality option. either the limited-functionality or the full-functionality option.
This document does not address the issue of including ECN in non-IP This document does not address the issue of including ECN in non-IP
tunnels such as MPLS, GRE, L2TP, or PPTP. An earlier preliminary tunnels such as MPLS, GRE, L2TP, or PPTP. An earlier preliminary
document about adding ECN support to MPLS was not advanced. document about adding ECN support to MPLS was not advanced.
A third new piece of work after RFC 2481 was to describe the ECN pro- A third new piece of work after RFC 2481 was to describe the ECN
cedure with retransmitted data packets, namely that the ECT bit should not procedure with retransmitted data packets, namely that an ECT codepoint
be set on retransmitted data packets. The motivation for this addi- should not be set on retransmitted data packets. The motivation for
tional specification is to eliminate a possible avenue for denial-of- this additional specification is to eliminate a possible avenue for
service attacks on an existing TCP connection. Some prior deploy- denial-of-service attacks on an existing TCP connection. Some prior
ments of ECN-capable TCP might not conform to the (new) requirement deployments of ECN-capable TCP might not conform to the (new)
not to set the ECT bit on retransmitted packets; we do not believe requirement not to set an ECT codepoint on retransmitted packets; we
this will cause significant problems in practice. do not believe this will cause significant problems in practice.
This document also expands slightly on the specification of the use This document also expands slightly on the specification of the use
of SYN packets for the negotiation of ECN. While some prior deploy- of SYN packets for the negotiation of ECN. While some prior
ments of ECN-capable TCP might not conform to the requirements speci- deployments of ECN-capable TCP might not conform to the requirements
fied in this document, we do not believe that this will lead to any specified in this document, we do not believe that this will lead to
performance or compatibility problems for TCP connections with a com- any performance or compatibility problems for TCP connections with a
bination of TCP implementations at the endpoints. combination of TCP implementations at the endpoints.
This document also includes the specification of the ECT(1)
codepoint, which may be used by TCP as part of the implementation of
an ECN nonce.
13. Conclusions 13. Conclusions
Given the current effort to implement AQM, we believe this is the Given the current effort to implement AQM, we believe this is the
right time to deploy congestion avoidance mechanisms that do not right time to deploy congestion avoidance mechanisms that do not
depend on packet drops alone. With the increased deployment of depend on packet drops alone. With the increased deployment of
applications and transports sensitive to the delay and loss of a sin- applications and transports sensitive to the delay and loss of a
gle packet (e.g., realtime traffic, short web transfers), depending single packet (e.g., realtime traffic, short web transfers),
on packet loss as a normal congestion notification mechanism appears depending on packet loss as a normal congestion notification
to be insufficient (or at the very least, non-optimal). mechanism appears to be insufficient (or at the very least, non-
optimal).
We examined the consequence of modifications of the ECN field within We examined the consequence of modifications of the ECN field within
the network, analyzing all the opportunities for an adversary to the network, analyzing all the opportunities for an adversary to
change the ECN field. In many cases, the change to the ECN field is change the ECN field. In many cases, the change to the ECN field is
no worse than dropping a packet. However, we noted that some changes no worse than dropping a packet. However, we noted that some changes
have the more serious consequence of subverting end-to-end congestion have the more serious consequence of subverting end-to-end congestion
control. We point out, though, that even then the potential damage control. We point out, though, that even then the potential damage
is limited, and is similar to the threat posed by end-systems inten- is limited, and is similar to the threat posed by end-systems
tionally failing to cooperate with end-to-end congestion control. intentionally failing to cooperate with end-to-end congestion
control.
14. Acknowledgements 14. Acknowledgements
Many people have made contributions to this work and this document, Many people have made contributions to this work and this document,
including many that we have not managed to directly acknowledge in including many that we have not managed to directly acknowledge in
this document. In addition, we would like to thank Kenjiro Cho for this document. In addition, we would like to thank Kenjiro Cho for
the proposal for the TCP mechanism for negotiating ECN-Capability, the proposal for the TCP mechanism for negotiating ECN-Capability,
Kevin Fall for the proposal of the CWR bit, Steve Blake for material Kevin Fall for the proposal of the CWR bit, Steve Blake for material
on IPv4 Header Checksum Recalculation, Jamal Hadi-Salim for discus- on IPv4 Header Checksum Recalculation, Jamal Hadi-Salim for
sions of ECN issues, and Steve Bellovin, Jim Bound, Brian Carpenter, discussions of ECN issues, and Steve Bellovin, Jim Bound, Brian
Paul Ferguson, Stephen Kent, Greg Minshall, and Vern Paxson for dis- Carpenter, Paul Ferguson, Stephen Kent, Greg Minshall, and Vern
cussions of security issues. We also thank the Internet End-to-End Paxson for discussions of security issues. We also thank the
Research Group for ongoing discussions of these issues. Internet End-to-End Research Group for ongoing discussions of these
issues.
Email discussions with a number of people, including Alexey Email discussions with a number of people, including Alexey
Kuznetsov, Jamal Hadi-Salim, and Venkat Venkatsubra, have addressed Kuznetsov, Jamal Hadi-Salim, and Venkat Venkatsubra, have addressed
the issues raised by non-conformant equipment in the Internet that the issues raised by non-conformant equipment in the Internet that
does not respond to TCP SYN packets with the ECE and CWR flags set. does not respond to TCP SYN packets with the ECE and CWR flags set.
We thank Mark Handley, Jitendra Padhye, and others for discussions on We thank Mark Handley, Jitendra Padhye, and others for discussions on
the TCP initialization procedures. the TCP initialization procedures.
The discussion of ECN and IP tunnel considerations draws heavily on The discussion of ECN and IP tunnel considerations draws heavily on
related discussions and documents from the Differentiated Services related discussions and documents from the Differentiated Services
Working Group. We thank Tabassum Bint Haque from Dhaka, Bangladesh, Working Group. We thank Tabassum Bint Haque from Dhaka, Bangladesh,
for feedback on IP tunnels. We thank Derrell Piper and Tero Kivinen for feedback on IP tunnels. We thank Derrell Piper and Tero Kivinen
for proposing modifications to RFC 2407 that improve the usability of for proposing modifications to RFC 2407 that improve the usability of
negotiating the ECN Tunnel SA attribute. negotiating the ECN Tunnel SA attribute.
We thank David Wetherall, David Ely, and Neil Spring for the proposal
for the ECN nonce. We also thank Stefan Savage for discussions on
this issue. We thank Bob Briscoe and Jon Crowcroft for raising the
issue of fragmentation in IP, on alternate semantics for the fourth
ECN codepoint, and several other topics. We thank Richard Wendland
for feedback on several issues in the draft.
15. References 15. References
[AH] Kent, S. and R. Atkinson, "IP Authentication Header", RFC 2402, [AH] Kent, S. and R. Atkinson, "IP Authentication Header", RFC 2402,
November 1998. November 1998.
[B97] Bradner, S., "Key words for use in RFCs to Indicate Requirement [B97] Bradner, S., "Key words for use in RFCs to Indicate Requirement
Levels", BCP 14, RFC 2119, March 1997. Levels", BCP 14, RFC 2119, March 1997.
[ECN] "The ECN Web Page", URL "http://www.aciri.org/floyd/ecn.html". [ECN] "The ECN Web Page", URL "http://www.aciri.org/floyd/ecn.html".
Reference for informational purposes only. Reference for informational purposes only.
skipping to change at page 38, line 30 skipping to change at page 40, line 44
for Congestion Avoidance", IEEE/ACM Transactions on Networking, V.1 for Congestion Avoidance", IEEE/ACM Transactions on Networking, V.1
N.4, August 1993, p. 397-413. N.4, August 1993, p. 397-413.
[Floyd94] Floyd, S., "TCP and Explicit Congestion Notification", ACM [Floyd94] Floyd, S., "TCP and Explicit Congestion Notification", ACM
Computer Communication Review, V. 24 N. 5, October 1994, p. 10-23. Computer Communication Review, V. 24 N. 5, October 1994, p. 10-23.
[Floyd98] Floyd, S., "The ECN Validation Test in the NS Simulator", [Floyd98] Floyd, S., "The ECN Validation Test in the NS Simulator",
URL "http://www-mash.cs.berkeley.edu/ns/", test tcl/test/test-all- URL "http://www-mash.cs.berkeley.edu/ns/", test tcl/test/test-all-
ecn. Reference for informational purposes only. ecn. Reference for informational purposes only.
[FF99] Floyd, S., and Fall, K., "Promoting the Use of End-to-End Con- [FF99] Floyd, S., and Fall, K., "Promoting the Use of End-to-End
gestion Control in the Internet", IEEE/ACM Transactions on Network- Congestion Control in the Internet", IEEE/ACM Transactions on
ing, August 1999. Networking, August 1999.
[FRED] Lin, D., and Morris, R., "Dynamics of Random Early Detection", [FRED] Lin, D., and Morris, R., "Dynamics of Random Early Detection",
SIGCOMM '97, September 1997. SIGCOMM '97, September 1997.
[GRE] S. Hanks, T. Li, D. Farinacci, and P. Traina, Generic Routing [GRE] S. Hanks, T. Li, D. Farinacci, and P. Traina, Generic Routing
Encapsulation (GRE), RFC 1701, October 1994. Encapsulation (GRE), RFC 1701, October 1994.
[Jacobson88] V. Jacobson, "Congestion Avoidance and Control", Proc. [Jacobson88] V. Jacobson, "Congestion Avoidance and Control", Proc.
ACM SIGCOMM '88, pp. 314-329. ACM SIGCOMM '88, pp. 314-329.
[Jacobson90] V. Jacobson, "Modified TCP Congestion Avoidance Algo- [Jacobson90] V. Jacobson, "Modified TCP Congestion Avoidance
rithm", Message to end2end-interest mailing list, April 1990. URL Algorithm", Message to end2end-interest mailing list, April 1990. URL
"ftp://ftp.ee.lbl.gov/email/vanj.90apr30.txt". "ftp://ftp.ee.lbl.gov/email/vanj.90apr30.txt".
[K98] Krishnan, H., "Analyzing Explicit Congestion Notification (ECN) [K98] Krishnan, H., "Analyzing Explicit Congestion Notification (ECN)
benefits for TCP", Master's thesis, UCLA, 1998, URL benefits for TCP", Master's thesis, UCLA, 1998, URL
"http://www.cs.ucla.edu/~hari/software/ecn/ecn_report.ps.gz". "http://www.cs.ucla.edu/~hari/software/ecn/ecn_report.ps.gz".
[L2TP] W. Townsley, A. Valencia, A. Rubens, G. Pall, G. Zorn, and B. [L2TP] W. Townsley, A. Valencia, A. Rubens, G. Pall, G. Zorn, and B.
Palter Layer Two Tunneling Protocol "L2TP", RFC 2661, August 1999. Palter Layer Two Tunneling Protocol "L2TP", RFC 2661, August 1999.
[MJV96] S. McCanne, V. Jacobson, and M. Vetterli, "Receiver-driven [MJV96] S. McCanne, V. Jacobson, and M. Vetterli, "Receiver-driven
skipping to change at page 39, line 40 skipping to change at page 42, line 5
[RFC1701] Hanks, S., Li, T., Farinacci, D., and P. Traina, Generic [RFC1701] Hanks, S., Li, T., Farinacci, D., and P. Traina, Generic
Routing Encapsulation (GRE), RFC 1701, October 1994. Routing Encapsulation (GRE), RFC 1701, October 1994.
[RFC1702] Hanks, S., Li, T., Farinacci, D., and P. Traina, Generic [RFC1702] Hanks, S., Li, T., Farinacci, D., and P. Traina, Generic
Routing Encapsulation over IPv4 networks, RFC 1702, October 1994. Routing Encapsulation over IPv4 networks, RFC 1702, October 1994.
[RFC2003] Perkins, C., IP Encapsulation within IP, RFC 2003, October [RFC2003] Perkins, C., IP Encapsulation within IP, RFC 2003, October
1996. 1996.
[RFC2119] S. Bradner, Key words for use in RFCs to Indicate Require- [RFC2119] S. Bradner, Key words for use in RFCs to Indicate
ment Levels, RFC 2119, March 1997. Requirement Levels, RFC 2119, March 1997.
[RFC2309] Braden, B., et al., "Recommendations on Queue Management [RFC2309] Braden, B., et al., "Recommendations on Queue Management
and Congestion Avoidance in the Internet", RFC 2309, April 1998. and Congestion Avoidance in the Internet", RFC 2309, April 1998.
[RFC2401] S. Kent and R. Atkinson, Security Architecture for the [RFC2401] S. Kent and R. Atkinson, Security Architecture for the
Internet Protocol, RFC 2401, November 1998. Internet Protocol, RFC 2401, November 1998.
[RFC2407] D. Piper, The Internet IP Security Domain of Interpretation [RFC2407] D. Piper, The Internet IP Security Domain of Interpretation
for ISAKMP, RFC 2407, November 1998. for ISAKMP, RFC 2407, November 1998.
skipping to change at page 40, line 15 skipping to change at page 42, line 29
RFC 2409, November 1998. RFC 2409, November 1998.
[RFC2409] D. Harkins and D. Carrel, The Internet Key Exchange (IKE), [RFC2409] D. Harkins and D. Carrel, The Internet Key Exchange (IKE),
RFC 2409, November 1998. RFC 2409, November 1998.
[RFC2474] Nichols, K., Blake, S., Baker, F. and D. Black, "Definition [RFC2474] Nichols, K., Blake, S., Baker, F. and D. Black, "Definition
of the Differentiated Services Field (DS Field) in the IPv4 and IPv6 of the Differentiated Services Field (DS Field) in the IPv4 and IPv6
Headers", RFC 2474, December 1998. Headers", RFC 2474, December 1998.
[RFC2475] S. Blake, D. Black, M. Carlson, E. Davies, Z. Wang, and W. [RFC2475] S. Blake, D. Black, M. Carlson, E. Davies, Z. Wang, and W.
Weiss, An Architecture for Differentiated Services, RFC 2475, Decem- Weiss, An Architecture for Differentiated Services, RFC 2475,
ber 1998. December 1998.
[RFC2481] K. Ramakrishnan and S. Floyd, A Proposal to add Explicit [RFC2481] K. Ramakrishnan and S. Floyd, A Proposal to add Explicit
Congestion Notification (ECN) to IP, RFC 2481, January 1999. Congestion Notification (ECN) to IP, RFC 2481, January 1999.
[RFC2581] M. Allman, V. Paxson, W. Stevens, "TCP Congestion Control", [RFC2581] M. Allman, V. Paxson, W. Stevens, "TCP Congestion Control",
RFC 2581, April 1999. RFC 2581, April 1999.
[RFC2884] Jamal Hadi Salim and Uvaiz Ahmed, "Performance Evaluation [RFC2884] Jamal Hadi Salim and Uvaiz Ahmed, "Performance Evaluation
of Explicit Congestion Notification (ECN) in IP Networks", RFC 2884, of Explicit Congestion Notification (ECN) in IP Networks", RFC 2884,
July 2000. July 2000.
[RFC2983] D. Black, "Differentiated Services and Tunnels", RFC2983, [RFC2983] D. Black, "Differentiated Services and Tunnels", RFC2983,
October 2000. October 2000.
[RFC2780] S. Bradner and V. Paxson, "IANA Allocation Guidelines For [RFC2780] S. Bradner and V. Paxson, "IANA Allocation Guidelines For
Values In the Internet Protocol and Related Headers", RFC 2780, March Values In the Internet Protocol and Related Headers", RFC 2780, March
2000. 2000.
[RJ90] K. K. Ramakrishnan and Raj Jain, "A Binary Feedback Scheme for [RJ90] K. K. Ramakrishnan and Raj Jain, "A Binary Feedback Scheme for
Congestion Avoidance in Computer Networks", ACM Transactions on Com- Congestion Avoidance in Computer Networks", ACM Transactions on
puter Systems, Vol.8, No.2, pp. 158-181, May 1990. Computer Systems, Vol.8, No.2, pp. 158-181, May 1990.
[SCWA99] Stefan Savage, Neal Cardwell, David Wetherall, and Tom [SCWA99] Stefan Savage, Neal Cardwell, David Wetherall, and Tom
Anderson, TCP Congestion Control with a Misbehaving Receiver, ACM Anderson, TCP Congestion Control with a Misbehaving Receiver, ACM
Computer Communications Review, October 1999. Computer Communications Review, October 1999.
16. Security Considerations 16. Security Considerations
Security considerations have been discussed in Sections 7, 8, 18, and Security considerations have been discussed in Sections 7, 8, 18, and
19. 19.
skipping to change at page 41, line 11 skipping to change at page 43, line 25
IPv4 header checksum recalculation is an issue with some high-end IPv4 header checksum recalculation is an issue with some high-end
router architectures using an output-buffered switch, since most if router architectures using an output-buffered switch, since most if
not all of the header manipulation is performed on the input side of not all of the header manipulation is performed on the input side of
the switch, while the ECN decision would need to be made local to the the switch, while the ECN decision would need to be made local to the
output buffer. This is not an issue for IPv6, since there is no IPv6 output buffer. This is not an issue for IPv6, since there is no IPv6
header checksum. The IPv4 TOS octet is the last byte of a 16-bit header checksum. The IPv4 TOS octet is the last byte of a 16-bit
half-word. half-word.
RFC 1141 [RFC1141] discusses the incremental updating of the IPv4 RFC 1141 [RFC1141] discusses the incremental updating of the IPv4
checksum after the TTL field is decremented. The incremental updat- checksum after the TTL field is decremented. The incremental
ing of the IPv4 checksum after the CE bit was set would work as fol- updating of the IPv4 checksum after the CE codepoint was set would
lows: Let HC be the original header checksum, and let HC' be the new work as follows: Let HC be the original header checksum for an ECT(0)
header checksum after the CE bit has been set. Then for header packet, and let HC' be the new header checksum after the CE codepoint
checksums calculated with one's complement subtraction, HC' would be has been set. That is, the ECN field has changed from '10' to '11'.
recalculated as follows: Then for header checksums calculated with one's complement
subtraction, HC' would be recalculated as follows:
HC' = { HC - 1   if HC > 1        HC' = { HC - 1   if HC > 1
      { 0x0000   if HC = 1              { 0x0000   if HC = 1
For header checksums calculated on two's complement machines, HC' would For header checksums calculated on two's complement machines, HC' would
be recalculated as follows after the CE bit was set: be recalculated as follows after the CE bit was set:
HC' = { HC - 1   if HC > 0        HC' = { HC - 1   if HC > 0
      { 0xFFFE   if HC = 0              { 0xFFFE   if HC = 0
A similar incremental updating of the IPv4 checksum can be carried out
when the ECN field is changed from ECT(1) to CE, that is, from '01' to
'11'.
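As an illustration only, the two's complement rule above can be checked against a full recomputation of the header checksum. The sketch below is not part of the draft. The helper names and the sample header contents are assumptions made for the example, and the ECT(1) rule is derived here by the same reasoning rather than taken from the text.

```python
def ones_complement_sum(words):
    """Add 16-bit words with end-around carry."""
    total = 0
    for word in words:
        total += word
        total = (total & 0xFFFF) + (total >> 16)
    return total


def full_checksum(header: bytes) -> int:
    """Recompute the IPv4 header checksum from scratch.

    The checksum field itself (bytes 10-11) is treated as zero.
    """
    words = [int.from_bytes(header[i:i + 2], "big")
             for i in range(0, len(header), 2)]
    words[5] = 0
    return ~ones_complement_sum(words) & 0xFFFF


def ect0_to_ce(hc: int) -> int:
    """Two's complement rule from the text: ECN field '10' -> '11'."""
    return hc - 1 if hc > 0 else 0xFFFE


def ect1_to_ce(hc: int) -> int:
    """Assumed analogue for '01' -> '11' (the 16-bit word grows by 2)."""
    return hc - 2 if hc >= 2 else hc + 0xFFFD


def sample_header(ident: int, tos: int) -> bytes:
    """Hypothetical 20-byte header; only ident and tos vary."""
    return bytes([0x45, tos, 0x00, 0x54,
                  ident >> 8, ident & 0xFF, 0x40, 0x00,
                  0x40, 0x06, 0x00, 0x00,
                  192, 0, 2, 1,
                  192, 0, 2, 2])


before = full_checksum(sample_header(0x1C46, 0x02))  # ECT(0)
after = full_checksum(sample_header(0x1C46, 0x03))   # CE
print(hex(before), hex(after), hex(ect0_to_ce(before)))
```

The incremental result and the recomputed checksum agree because the TOS octet is the low-order byte of its 16-bit word, so changing the ECN field from '10' to '11' adds exactly one to the one's complement sum.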
18. Possible Changes to the ECN Field in the Network 18. Possible Changes to the ECN Field in the Network
This section discusses in detail possible changes to the ECN field in This section discusses in detail possible changes to the ECN field in
the network, such as falsely reporting congestion, disabling ECN- the network, such as falsely reporting congestion, disabling ECN-
Capability for an individual packet, erasing the ECN congestion indi- Capability for an individual packet, erasing the ECN congestion
cation, or falsely indicating ECN-Capability. We represent the ECN indication, or falsely indicating ECN-Capability.
bits in the IP header by the tuple (ECT bit, CE bit).
18.1. Possible Changes to the IP Header 18.1. Possible Changes to the IP Header
18.1.1. Erasing the Congestion Indication 18.1.1. Erasing the Congestion Indication
First, we consider the changes that a router could make that would First, we consider the changes that a router could make that would
result in effectively erasing the congestion indication after it had result in effectively erasing the congestion indication after it had
been set by a router upstream. The convention followed is: been set by a router upstream. The convention followed is:
(ECT, CE) of received packet -> (ECT, CE) of packet transmitted. ECN codepoint of received packet -> ECN codepoint of packet
transmitted.
(1, 1) -> (1, 0): erase only the CE bit that was set. Replacing the CE codepoint with the ECT(0) or ECT(1) codepoint
(1, 1) -> (0, 0): erase both the ECT bit and the CE bit. effectively erases the congestion indication. However, with the use
(1, 1) -> (0, 1): erase the ECT bit of two ECT codepoints, a router erasing the CE codepoint has no way
to know whether the original ECT codepoint was ECT(0) or ECT(1).
Thus, it is possible for the transport protocol to deploy mechanisms
to detect such erasures of the CE codepoint.
The first change turns off the CE bit after it has been set by some The consequence of the erasure of the CE codepoint for the upstream
upstream router along the path. The consequence for the upstream
router is that there is a potential for congestion to build for a router is that there is a potential for congestion to build for a
time, because the congestion indication does not reach the source. time, because the congestion indication does not reach the source.
However, the packet would be received and acknowledged. However, the packet would be received and acknowledged.
The potential effect of erasing the congestion indication is complex, The potential effect of erasing the congestion indication is complex,
and is discussed in depth in Section 19 below. Note that the effect and is discussed in depth in Section 19 below. Note that the effect
of erasing the congestion indication is different from dropping a of erasing the congestion indication is different from dropping a
packet in the network. When a data packet is dropped, the drop is packet in the network. When a data packet is dropped, the drop is
detected by the TCP sender, and interpreted as an indication of con- detected by the TCP sender, and interpreted as an indication of
gestion. Similarly, if a sufficient number of consecutive acknowl- congestion. Similarly, if a sufficient number of consecutive
edgement packets are dropped, causing the cumulative acknowledgement acknowledgement packets are dropped, causing the cumulative
field not to be advanced at the sender, the sender is limited by the acknowledgement field not to be advanced at the sender, the sender is
congestion window from sending additional packets, and ultimately the limited by the congestion window from sending additional packets, and
retransmit timer expires. ultimately the retransmit timer expires.
In contrast, a systematic erasure of the CE bit by a downstream In contrast, a systematic erasure of the CE bit by a downstream
router can have the effect of causing a queue buildup at an upstream router can have the effect of causing a queue buildup at an upstream
router, including the possible loss of packets due to buffer over- router, including the possible loss of packets due to buffer
flow. There is a potential of unfairness in that another flow that overflow. There is a potential of unfairness in that another flow
goes through the congested router could react to the CE bit set while that goes through the congested router could react to the CE bit set
the flow that has the CE bit erased could see better performance. while the flow that has the CE bit erased could see better
The limitations on this potential unfairness are discussed in more performance. The limitations on this potential unfairness are
detail in Section 19 below. discussed in more detail in Section 19 below.
The second change is to turn off both the ECT and the CE bits, thus
erasing the congestion indication and disabling ECN-Capability at the
same time. The third change turns off only the ECT bit, disabling
ECN-Capability.
Within an IP tunnel using the full-functionality option, the third The last of the three changes is to replace the CE codepoint with the
change would not erase the congestion indication, but would only not-ECT codepoint, thus erasing the congestion indication and
disable ECN-Capability for that packet within the rest of the tunnel. disabling ECN-Capability at the same time.
However, when performed outside of an IP tunnel, the third change
would also effectively erase the congestion indication, because an
ECN field of (0, 1) is undefined.
The `erasure' of the congestion indication is only effective if the The `erasure' of the congestion indication is only effective if the
packet does not end up being marked or dropped again by a downstream packet does not end up being marked or dropped again by a downstream
router. With the first change, the packet remains ECN-Capable, and router. If the CE codepoint is replaced by an ECT codepoint, the
could be either marked or dropped by a downstream router as an indi- packet remains ECN-Capable, and could be either marked or dropped by
cation of congestion. With the second and third changes, the packet a downstream router as an indication of congestion. If the CE
is no longer ECN-capable, and can therefore be dropped but not marked codepoint is replaced by the not-ECT codepoint, the packet is no
by a downstream router as an indication of congestion. longer ECN-capable, and can therefore be dropped but not marked by a
downstream router as an indication of congestion.
18.1.2. Falsely Reporting Congestion 18.1.2. Falsely Reporting Congestion
(1, 0) -> (1, 1) This change is to set the CE codepoint when an ECT codepoint was
already set, even though there was no congestion. This change does
This change is to set the CE bit when the ECT bit was already set, not affect the treatment of that packet along the rest of the path.
even though there was no congestion. This change does not affect the In particular, a router does not examine the CE codepoint in deciding
treatment of that packet along the rest of the path. In particular, whether to drop or mark an arriving packet.
a router does not examine the CE bit in deciding whether to drop or
mark an arriving packet.
However, this could result in the application unnecessarily invoking However, this could result in the application unnecessarily invoking
end-to-end congestion control, and reducing its arrival rate. By end-to-end congestion control, and reducing its arrival rate. By
itself, this is no worse (for the application or for the network) itself, this is no worse (for the application or for the network)
than if the tampering router had actually dropped the packet. than if the tampering router had actually dropped the packet.
18.1.3. Disabling ECN-Capability 18.1.3. Disabling ECN-Capability
(1, 0) -> (0, *) This change is to turn off the ECT codepoint of a packet. This means
This change is to turn off the ECT bit of a packet that does not have
the CE bit set. (Section 18.1.1 discussed the case of turning off
the ECT bit of a packet that does have the CE bit set.) This means
that if the packet later encounters congestion (e.g., by arriving at that if the packet later encounters congestion (e.g., by arriving at
a RED queue with a moderate average queue size), it will be dropped a RED queue with a moderate average queue size), it will be dropped
instead of being marked. By itself, this is no worse (for the appli- instead of being marked. By itself, this is no worse (for the
cation) than if the tampering router had actually dropped the packet. application) than if the tampering router had actually dropped the
The saving grace in this particular case is that there is no con- packet. The saving grace in this particular case is that there is no
gested router upstream expecting a reaction from setting the CE bit. congested router upstream expecting a reaction from setting the CE
bit.
18.1.4. Falsely Indicating ECN-Capability 18.1.4. Falsely Indicating ECN-Capability
This change would incorrectly label a packet as ECN-Capable. The This change would incorrectly label a packet as ECN-Capable. The
packet may have been sent either by an ECN-Capable transport or a packet may have been sent either by an ECN-Capable transport or a
transport that is not ECN-Capable. transport that is not ECN-Capable.
(0, *) -> (1, 0);
(0, *) -> (1, 1);
If the packet later encounters moderate congestion at an ECN-Capable If the packet later encounters moderate congestion at an ECN-Capable
router, the router could set the CE bit instead of dropping the router, the router could set the CE codepoint instead of dropping the
packet. If the transport protocol in fact is not ECN-Capable, then packet. If the transport protocol in fact is not ECN-Capable, then
the transport will never receive this indication of congestion, and the transport will never receive this indication of congestion, and
will not reduce its sending rate in response. The potential conse- will not reduce its sending rate in response. The potential
quences of falsely indicating ECN-capability are discussed further in consequences of falsely indicating ECN-capability are discussed
Section 19 below. further in Section 19 below.
If the packet never later encounters congestion at an ECN-Capable If the packet never later encounters congestion at an ECN-Capable
router, then the first of these two changes would have no effect. router, then the first of these two changes would have no effect,
The second change, however, would have the effect of giving false other than possibly interfering with the use of the ECN nonce by the
reports of congestion to a monitoring device along the path. If the transport protocol. The last change, however, would have the effect
transport protocol is ECN-Capable, then the second of these two of giving false reports of congestion to a monitoring device along
changes (when, for example, (0,0) was changed to (1,1)) could also the path. If the transport protocol is ECN-Capable, then this change
have an effect at the transport level, by combining falsely indicat- could also have an effect at the transport level, by combining
ing ECN-Capability with falsely reporting congestion. For an ECN- falsely indicating ECN-Capability with falsely reporting congestion.
capable transport, this would cause the transport to unnecessarily For an ECN-capable transport, this would cause the transport to
react to congestion. In this particular case, the router that is unnecessarily react to congestion. In this particular case, the
incorrectly changing the ECN field could have dropped the packet. router that is incorrectly changing the ECN field could have dropped
Thus for this case of an ECN-capable transport, the consequence of the packet. Thus for this case of an ECN-capable transport, the
this change to the ECN field is no worse than dropping the packet. consequence of this change to the ECN field is no worse than dropping
the packet.
18.1.5. Changes with No Functional Effect
(0, *) -> (0, *)
The CE bit is ignored in a packet that does not have the ECT bit set.
Thus, this change would have no effect, in terms of ECN.
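The changes to the ECN field discussed in Section 18.1 can be summarized, for illustration, as a mapping from (received codepoint, transmitted codepoint) to the kind of change. The sketch below is not part of the draft. It assumes the codepoint encoding not-ECT = '00', ECT(1) = '01', ECT(0) = '10', CE = '11', and the returned labels are chosen for this example only. A change from an ECT codepoint to CE is a false report only when there was no congestion, which the ECN field alone cannot show.

```python
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11
ECT = {ECT0, ECT1}


def classify_change(received: int, transmitted: int) -> str:
    """Name the change a router made to the ECN field of one packet."""
    if received == transmitted:
        return "unchanged"
    if received == CE:
        if transmitted in ECT:
            # Packet stays ECN-Capable but the mark is lost.
            return "erases congestion indication"
        return "erases congestion indication and disables ECN-Capability"
    if received in ECT:
        if transmitted == CE:
            # Legitimate marking, or a false report if no congestion.
            return "reports congestion"
        if transmitted == NOT_ECT:
            return "disables ECN-Capability"
        # ECT(0) <-> ECT(1): may interfere with the ECN nonce.
        return "switches ECT codepoint"
    # received is not-ECT from here on.
    if transmitted == CE:
        return "falsely indicates ECN-Capability and reports congestion"
    return "falsely indicates ECN-Capability"


print(classify_change(CE, ECT0))
print(classify_change(NOT_ECT, CE))
```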
18.2. Information carried in the Transport Header 18.2. Information carried in the Transport Header
For TCP, an ECN-capable TCP receiver informs its TCP peer that it is For TCP, an ECN-capable TCP receiver informs its TCP peer that it is
ECN-capable at the TCP level, conveying this information in the TCP ECN-capable at the TCP level, conveying this information in the TCP
header at the time the connection is set up. This document does not header at the time the connection is set up. This document does not
consider potential dangers introduced by changes in the transport consider potential dangers introduced by changes in the transport
header within the network. In the case of IPsec tunnels, the IPsec header within the network. In the case of IPsec tunnels, the IPsec
tunnel protects the transport header. tunnel protects the transport header.
Another issue concerns TCP packets with a spoofed IP source address Another issue concerns TCP packets with a spoofed IP source address
carrying invalid ECN information in the transport header. For com- carrying invalid ECN information in the transport header. For
pleteness, we examine here some possible ways that a node spoofing completeness, we examine here some possible ways that a node spoofing
the IP source address of another node could use the two ECN flags in the IP source address of another node could use the two ECN flags in
the TCP header to launch a denial-of-service attack. However, these the TCP header to launch a denial-of-service attack. However, these
attacks would require an ability for the attacker to use valid TCP attacks would require an ability for the attacker to use valid TCP
sequence numbers, and any attacker with this ability and with the sequence numbers, and any attacker with this ability and with the
ability to spoof IP source addresses could damage the TCP connection ability to spoof IP source addresses could damage the TCP connection
without using the ECN flags. Therefore, ECN does not add any new without using the ECN flags. Therefore, ECN does not add any new
vulnerabilities in this respect. vulnerabilities in this respect.
An acknowledgement packet with a spoofed IP source address of the TCP An acknowledgement packet with a spoofed IP source address of the TCP
data receiver could include the ECE bit set. If accepted by the TCP data receiver could include the ECE bit set. If accepted by the TCP
data sender as a valid packet, this spoofed acknowledgement packet data sender as a valid packet, this spoofed acknowledgement packet
could result in the TCP data sender unnecessarily halving its conges- could result in the TCP data sender unnecessarily halving its
tion window. However, to be accepted by the data sender, such a congestion window. However, to be accepted by the data sender, such
spoofed acknowledgement packet would have to have the correct 32-bit a spoofed acknowledgement packet would have to have the correct
sequence number as well as a valid acknowledgement number. An 32-bit sequence number as well as a valid acknowledgement number. An
attacker that could successfully send such a spoofed acknowledgement attacker that could successfully send such a spoofed acknowledgement
packet could also send a spoofed RST packet, or do other equally dam- packet could also send a spoofed RST packet, or do other equally
aging operations to the TCP connection. damaging operations to the TCP connection.
Packets with a spoofed IP source address of the TCP data sender could Packets with a spoofed IP source address of the TCP data sender could
include the CWR bit set. Again, to be accepted, such a packet would include the CWR bit set. Again, to be accepted, such a packet would
have to have a valid sequence number. In addition, such a spoofed have to have a valid sequence number. In addition, such a spoofed
packet would have a limited performance impact. Spoofing a data packet would have a limited performance impact. Spoofing a data
packet with the CWR bit set could result in the TCP data receiver packet with the CWR bit set could result in the TCP data receiver
sending fewer ECE packets than it would otherwise, if the data sending fewer ECE packets than it would otherwise, if the data
receiver was sending ECE packets when it received the spoofed CWR receiver was sending ECE packets when it received the spoofed CWR
packet. packet.
skipping to change at page 45, line 30 skipping to change at page 47, line 30
that set of packets? that set of packets?
We will classify the packets in the flow as A packets and B packets, We will classify the packets in the flow as A packets and B packets,
and assume that the adversary only has access to A packets. Assume and assume that the adversary only has access to A packets. Assume
that the adversary is subverting end-to-end congestion control along that the adversary is subverting end-to-end congestion control along
the path traveled by A packets only, by either falsely indicating the path traveled by A packets only, by either falsely indicating
ECN-Capability upstream of the point where congestion occurs, or ECN-Capability upstream of the point where congestion occurs, or
erasing the congestion indication downstream. Consider also that erasing the congestion indication downstream. Consider also that
there exists a monitoring device that sees both the A and B packets, there exists a monitoring device that sees both the A and B packets,
and will "punish" both the A and B packets if the total flow is and will "punish" both the A and B packets if the total flow is
determined not to be properly responding to indications of conges- determined not to be properly responding to indications of
tion. Another key characteristic that we believe is likely to be congestion. Another key characteristic that we believe is likely to
true is that the monitoring device, before `punishing' the A&B flow, be true is that the monitoring device, before `punishing' the A&B
will first drop packets instead of setting the CE bit, and will drop flow, will first drop packets instead of setting the CE codepoint,
arriving packets of that flow that already have the ECT and CE bits and will drop arriving packets of that flow that already have the CE
set. If the end nodes are in fact using end-to-end congestion con- codepoint set. If the end nodes are in fact using end-to-end
trol, they will see all of the indications of congestion seen by the congestion control, they will see all of the indications of
monitoring device, and will begin to respond to these indications of congestion seen by the monitoring device, and will begin to respond
congestion. Thus, the monitoring device is successful in providing to these indications of congestion. Thus, the monitoring device is
the indications to the flow at an early stage. successful in providing the indications to the flow at an early
stage.
It is true that the adversary that has access only to the A packets It is true that the adversary that has access only to the A packets
might, by subverting ECN-based congestion control, be able to deny might, by subverting ECN-based congestion control, be able to deny
the benefits of ECN to the other packets in the A&B aggregate. While the benefits of ECN to the other packets in the A&B aggregate. While
this is unfortunate, this is not a reason to disable ECN within an this is unfortunate, this is not a reason to disable ECN within an
IPsec tunnel. IPsec tunnel.
A variant of falsely reporting congestion occurs when there are two A variant of falsely reporting congestion occurs when there are two
adversaries along a path, where the first adversary falsely reports adversaries along a path, where the first adversary falsely reports
congestion, and the second adversary `erases' those reports. (Unlike congestion, and the second adversary `erases' those reports. (Unlike
packet drops, ECN congestion reports can be `reversed' later in the packet drops, ECN congestion reports can be `reversed' later in the
network by a malicious or broken router.) While this would be trans- network by a malicious or broken router. However, the use of the ECN
parent to the end node, it is possible that a monitoring device nonce could help the transport to detect this behavior.) While this
between the first and second adversaries would see the false indica- would be transparent to the end node, it is possible that a
tions of congestion. Keep in mind our recommendation in this docu- monitoring device between the first and second adversaries would see
ment, that before `punishing' a flow for not responding appropriately the false indications of congestion. Keep in mind our recommendation
to congestion, the router will first switch to dropping rather than in this document, that before `punishing' a flow for not responding
marking as an indication of congestion, for that flow. When this appropriately to congestion, the router will first switch to dropping
includes dropping arriving packets from that flow that have the CE rather than marking as an indication of congestion, for that flow.
bit set, this ensures that these indications of congestion are being When this includes dropping arriving packets from that flow that have
seen by the end nodes. Thus, there is no additional harm that we are the CE codepoint set, this ensures that these indications of
able to postulate as a result of multiple conflicting adversaries. congestion are being seen by the end nodes. Thus, there is no
additional harm that we are able to postulate as a result of multiple
conflicting adversaries.
19. Implications of Subverting End-to-End Congestion Control 19. Implications of Subverting End-to-End Congestion Control
This section focuses on the potential repercussions of subverting This section focuses on the potential repercussions of subverting
end-to-end congestion control by either falsely indicating ECN-Capa- end-to-end congestion control by either falsely indicating ECN-
bility, or by erasing the congestion indication in ECN (the CE-bit). Capability, or by erasing the congestion indication in ECN (the CE
Subverting end-to-end congestion control by either of these two meth- codepoint). Subverting end-to-end congestion control by either of
ods can have consequences both for the application and for the net- these two methods can have consequences both for the application and
work. We discuss these separately below. for the network. We discuss these separately below.
The first method to subvert end-to-end congestion control, that of The first method to subvert end-to-end congestion control, that of
falsely indicating ECN-Capability, effectively subverts end-to-end falsely indicating ECN-Capability, effectively subverts end-to-end
congestion control only if the packet later encounters congestion congestion control only if the packet later encounters congestion
that results in the setting of the CE bit. In this case, the trans- that results in the setting of the CE codepoint. In this case, the
port protocol (which may not be ECN-capable) does not receive the transport protocol (which may not be ECN-capable) does not receive
indication of congestion from these downstream congested routers. the indication of congestion from these downstream congested routers.
The second method to subvert end-to-end congestion control, `erasing' The second method to subvert end-to-end congestion control, `erasing'
the (set) CE bit in a packet, effectively subverts end-to-end conges- the CE codepoint in a packet, effectively subverts end-to-end
tion control only when the CE bit in the packet was set earlier by a congestion control only when the CE codepoint in the packet was set
congested router. In this case, the transport protocol does not earlier by a congested router. In this case, the transport protocol
receive the indication of congestion from the upstream congested does not receive the indication of congestion from the upstream
routers. congested routers.
Either of these two methods of subverting end-to-end congestion con- Either of these two methods of subverting end-to-end congestion
trol can potentially introduce more damage to the network (and possi- control can potentially introduce more damage to the network (and
bly to the flow itself) than if the adversary had simply dropped possibly to the flow itself) than if the adversary had simply dropped
packets from that flow. However, as we discuss later in this section packets from that flow. However, as we discuss later in this section
and in Section 7, this potential damage is limited. and in Section 7, this potential damage is limited.
19.1. Implications for the Network and for Competing Flows 19.1. Implications for the Network and for Competing Flows
The CE bit of the ECN field is only used by routers as an indication The CE codepoint of the ECN field is only used by routers as an
of congestion during periods of *moderate* congestion. ECN-capable indication of congestion during periods of *moderate* congestion.
routers should drop rather than mark packets during heavy congestion ECN-capable routers should drop rather than mark packets during heavy
even if the router's queue is not yet full. For example, for routers congestion even if the router's queue is not yet full. For example,
using active queue management based on RED, the router should drop for routers using active queue management based on RED, the router
rather than mark packets that arrive while the average queue size should drop rather than mark packets that arrive while the average
exceeds the RED queue's maximum threshold. queue size exceeds the RED queue's maximum threshold.
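The drop-versus-mark rule in the preceding paragraph might be sketched as follows. This is a simplified illustration, not a RED implementation. The function name, the minimum threshold parameter, and the `selected` flag (standing in for RED's probabilistic choice of which packets to notify) are assumptions made for the example.

```python
def red_action(avg_queue: float, min_th: float, max_th: float,
               ecn_capable: bool, selected: bool) -> str:
    """Decide what a RED queue does with one arriving packet.

    selected: True if RED's probabilistic test picked this packet to
    carry a congestion indication (the test itself is not modeled).
    """
    if avg_queue > max_th:
        # Heavy congestion: drop even ECN-Capable packets.
        return "drop"
    if avg_queue < min_th or not selected:
        return "forward"
    # Moderate congestion: mark if possible, otherwise drop.
    return "mark" if ecn_capable else "drop"


print(red_action(12.0, 5.0, 15.0, ecn_capable=True, selected=True))
print(red_action(20.0, 5.0, 15.0, ecn_capable=True, selected=True))
```

Under these assumptions a packet that falsely indicates ECN-Capability is marked rather than dropped only while the average queue stays at or below the maximum threshold. Above it, every arriving packet is dropped.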
One consequence for the network of subverting end-to-end congestion One consequence for the network of subverting end-to-end congestion
control is that flows that do not receive the congestion indications control is that flows that do not receive the congestion indications
from the network might increase their sending rate until they drive from the network might increase their sending rate until they drive
the network into heavier congestion. Then, the congested router the network into heavier congestion. Then, the congested router
could begin to drop rather than mark arriving packets. For flows could begin to drop rather than mark arriving packets. For flows
that are not isolated by some form of per-flow scheduling or other that are not isolated by some form of per-flow scheduling or other
per-flow mechanisms, but are instead aggregated with other flows in a per-flow mechanisms, but are instead aggregated with other flows in a
single queue in an undifferentiated fashion, this packet-dropping at single queue in an undifferentiated fashion, this packet-dropping at
the congested router would apply to all flows that share that queue. the congested router would apply to all flows that share that queue.
Thus, the consequences would be to increase the level of congestion Thus, the consequences would be to increase the level of congestion
in the network. in the network.
In some cases, the increase in the level of congestion will lead to a In some cases, the increase in the level of congestion will lead to a
substantial buffer buildup at the congested queue that will be suffi- substantial buffer buildup at the congested queue that will be
cient to drive the congested queue from the packet-marking to the sufficient to drive the congested queue from the packet-marking to
packet-dropping regime. This transition could occur either because the packet-dropping regime. This transition could occur either
of buffer overflow, or because of the active queue management policy because of buffer overflow, or because of the active queue management
described above that drops packets when the average queue is above policy described above that drops packets when the average queue is
RED's maximum threshold. At this point, all flows, including the above RED's maximum threshold. At this point, all flows, including
subverted flow, will begin to see packet drops instead of packet the subverted flow, will begin to see packet drops instead of packet
marks, and a malicious or broken router will no longer be able to marks, and a malicious or broken router will no longer be able to
`erase' these indications of congestion in the network. If the end `erase' these indications of congestion in the network. If the end
nodes are deploying appropriate end-to-end congestion control, then nodes are deploying appropriate end-to-end congestion control, then
the subverted flow will reduce its arrival rate in response to con- the subverted flow will reduce its arrival rate in response to
gestion. When the level of congestion is sufficiently reduced, the congestion. When the level of congestion is sufficiently reduced,
congested queue can return from the packet-dropping regime to the the congested queue can return from the packet-dropping regime to the
packet-marking regime. The steady-state pattern could be one of the packet-marking regime. The steady-state pattern could be one of the
congested queue oscillating between these two regimes. congested queue oscillating between these two regimes.
In other cases, the consequences of subverting end-to-end congestion In other cases, the consequences of subverting end-to-end congestion
control will not be severe enough to drive the congested link into control will not be severe enough to drive the congested link into
sufficiently-heavy congestion that packets are dropped instead of sufficiently-heavy congestion that packets are dropped instead of
being marked. In this case, the implications for competing flows in being marked. In this case, the implications for competing flows in
the network will be a slightly-increased rate of packet marking or the network will be a slightly-increased rate of packet marking or
dropping, and a corresponding decrease in the bandwidth available to dropping, and a corresponding decrease in the bandwidth available to
those flows. This can be a stable state if the arrival rate of the those flows. This can be a stable state if the arrival rate of the
subverted flow is sufficiently small, relative to the link bandwidth, subverted flow is sufficiently small, relative to the link bandwidth,
that the average queue size at the congested router remains under that the average queue size at the congested router remains under
control. In particular, the subverted flow could have a limited control. In particular, the subverted flow could have a limited
bandwidth demand on the link at this router, while still getting more bandwidth demand on the link at this router, while still getting more
than its "fair" share of the link. This limited demand could be due than its "fair" share of the link. This limited demand could be due
to a limited demand from the data source; a limitation from the TCP to a limited demand from the data source; a limitation from the TCP
advertised window; a lower-bandwidth access pipe; or other factors. advertised window; a lower-bandwidth access pipe; or other factors.
Thus the subversion of ECN-based congestion control can still lead to Thus the subversion of ECN-based congestion control can still lead to
unfairness, which we believe is appropriate to note here. unfairness, which we believe is appropriate to note here.
The threat to the network posed by the subversion of ECN-based con- The threat to the network posed by the subversion of ECN-based
gestion control in the network is essentially the same as the threat congestion control in the network is essentially the same as the
posed by an end-system that intentionally fails to cooperate with threat posed by an end-system that intentionally fails to cooperate
end-to-end congestion control. The deployment of mechanisms in with end-to-end congestion control. The deployment of mechanisms in
routers to address this threat is an open research question, and is routers to address this threat is an open research question, and is
discussed further in Section 10. discussed further in Section 10.
Let us take the example described in Section 18.1.1, where the CE bit Let us take the example described in Section 18.1.1, where the CE
that was set in a packet is erased: {(1, 1) -> (1, 0)}. The conse- codepoint that was set in a packet is erased: {'11' -> '10' or '11'
quence for the congested upstream router that set the CE bit is that -> '01'}. The consequence for the congested upstream router that set
this congestion indication does not reach the end nodes for that the CE codepoint is that this congestion indication does not reach
flow. The source (even one which is completely cooperative and not the end nodes for that flow. The source (even one which is completely
malicious) is thus allowed to continue to increase its sending rate cooperative and not malicious) is thus allowed to continue to
(if it is a TCP flow, by increasing its congestion window). The flow increase its sending rate (if it is a TCP flow, by increasing its
potentially achieves better throughput than the other flows that also congestion window). The flow potentially achieves better throughput
share the congested router, especially if there are no policing mech- than the other flows that also share the congested router, especially
anisms or per-flow queueing mechanisms at that router. Consider the if there are no policing mechanisms or per-flow queueing mechanisms
behavior of the other flows, especially if they are cooperative: that at that router. Consider the behavior of the other flows, especially
is, the flows that do not experience subverted end-to-end congestion if they are cooperative: that is, the flows that do not experience
control. They are likely to reduce their load (e.g., by reducing subverted end-to-end congestion control. They are likely to reduce
their window size) on the congested router, thus benefiting our sub- their load (e.g., by reducing their window size) on the congested
verted flow. This results in unfairness. As we discussed above, this router, thus benefiting our subverted flow. This results in
unfairness could either be transient (because the congested queue is unfairness. As we discussed above, this unfairness could either be
driven into the packet-marking regime), oscillatory (because the con- transient (because the congested queue is driven into the packet-
gested queue oscillates between the packet marking and the packet marking regime), oscillatory (because the congested queue oscillates
dropping regime), or more moderate but a persistent stable state between the packet marking and the packet dropping regime), or more
(because the congested queue is never driven to the packet dropping moderate but a persistent stable state (because the congested queue
regime). is never driven to the packet dropping regime).
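The erasure example above can be stated as a small check. This is only a sketch: the constant and function names are chosen here for illustration, and the assignment of '10' to ECT(0), '01' to ECT(1), and '00' to not-ECT is an assumption about the codepoint layout, since this section shows only the '11' -> '10' and '11' -> '01' transitions.

```python
# ECN field values as two-bit codepoints, in the '11' / '10' / '01' notation
# used in the text. Names and the ECT(0)/ECT(1)/not-ECT mapping are
# assumptions made for this sketch.
NOT_ECT = 0b00
ECT_1 = 0b01
ECT_0 = 0b10
CE = 0b11


def erases_ce(before: int, after: int) -> bool:
    """Return True if a CE codepoint was rewritten to an ECT codepoint.

    Covers only the two transitions named in the example:
    '11' -> '10' and '11' -> '01'.
    """
    return before == CE and after in (ECT_0, ECT_1)
```

A packet that stays at '11', or one that was never marked, is not an erasure under this check; the end nodes see whatever codepoint finally arrives.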
The results would be similar if the subverted flow was intentionally The results would be similar if the subverted flow was intentionally
avoiding end-to-end congestion control. One difference is that a avoiding end-to-end congestion control. One difference is that a
flow that is intentionally avoiding end-to-end congestion control at flow that is intentionally avoiding end-to-end congestion control at
the end nodes can avoid end-to-end congestion control even when the the end nodes can avoid end-to-end congestion control even when the
congested queue is in packet-dropping mode, by refusing to reduce its congested queue is in packet-dropping mode, by refusing to reduce its
sending rate in response to packet drops in the network. Thus the sending rate in response to packet drops in the network. Thus the
problems for the network from the subversion of ECN-based congestion problems for the network from the subversion of ECN-based congestion
control are less severe than the problems caused by the intentional control are less severe than the problems caused by the intentional
avoidance of end-to-end congestion control in the end nodes. It is avoidance of end-to-end congestion control in the end nodes. It is
also the case that it is considerably more difficult to control the also the case that it is considerably more difficult to control the
behavior of the end nodes than it is to control the behavior of the behavior of the end nodes than it is to control the behavior of the
infrastructure itself. This is not to say that the problems for the infrastructure itself. This is not to say that the problems for the
network posed by the network's subversion of ECN-based congestion network posed by the network's subversion of ECN-based congestion
control are small; just that they are dwarfed by the problems for the control are small; just that they are dwarfed by the problems for the
network posed by the subversion of either ECN-based or other cur- network posed by the subversion of either ECN-based or other
rently known packet-based congestion control mechanisms by the end currently known packet-based congestion control mechanisms by the end
nodes. nodes.
19.2. Implications for the Subverted Flow 19.2. Implications for the Subverted Flow
When a source indicates that it is ECN-capable, there is an expecta- When a source indicates that it is ECN-capable, there is an
tion that the routers in the network that are capable of participat- expectation that the routers in the network that are capable of
ing in ECN will use the CE bit for indication of congestion. There is participating in ECN will use the CE codepoint for indication of
the potential benefit of using ECN in reducing the amount of packet congestion. There is the potential benefit of using ECN in reducing
loss (in addition to the reduced queueing delays because of active the amount of packet loss (in addition to the reduced queueing delays
queue management policies). When the packet flows through a tunnel because of active queue management policies). When the packet flows
where the nodes that the tunneled packets traverse are untrusted in through a tunnel where the nodes that the tunneled packets traverse
some way, the expectation is that IPsec will protect the flow from are untrusted in some way, the expectation is that IPsec will protect
subversion that results in undesirable consequences. the flow from subversion that results in undesirable consequences.
In many cases, a subverted flow will benefit from the subversion of In many cases, a subverted flow will benefit from the subversion of
end-to-end congestion control for that flow in the network, by end-to-end congestion control for that flow in the network, by
receiving more bandwidth than it would have otherwise, relative to receiving more bandwidth than it would have otherwise, relative to
competing non-subverted flows. If the congested queue reaches the competing non-subverted flows. If the congested queue reaches the
packet-dropping stage, then the subversion of end-to-end congestion packet-dropping stage, then the subversion of end-to-end congestion
control might or might not be of overall benefit to the subverted control might or might not be of overall benefit to the subverted
flow, depending on that flow's relative tradeoffs between throughput, flow, depending on that flow's relative tradeoffs between throughput,
loss, and delay. loss, and delay.
One form of subverting end-to-end congestion control is to falsely One form of subverting end-to-end congestion control is to falsely
indicate ECN-capability by setting the ECT bit. This has the conse- indicate ECN-capability by setting the ECT codepoint. This has the
quence of downstream congested routers setting the CE bit in vain. consequence of downstream congested routers setting the CE codepoint
However, as described in Section 9.1.2, if the ECT bit is changed in in vain. However, as described in Section 9.1.2, if an ECT codepoint
an IP tunnel, this can be detected at the egress point of the tunnel, is changed in an IP tunnel, this can be detected at the egress point
as long as the inner header was not changed within the tunnel. of the tunnel, as long as the inner header was not changed within the
tunnel.
The second form of subverting end-to-end congestion control is to The second form of subverting end-to-end congestion control is to
erase the congestion indication, either by erasing the CE bit erase the congestion indication by erasing the CE codepoint. In this
directly, or by erasing the ECT bit when the CE bit is already set. case, it is the upstream congested routers that set the CE codepoint
In this case, it is the upstream congested routers that set the CE in vain.
bit in vain.
If the ECT bit is erased within an IP tunnel, then this can be If an ECT codepoint is erased within an IP tunnel, then this can be
detected at the egress point of the tunnel, as long as the inner detected at the egress point of the tunnel, as long as the inner
header was not changed within the tunnel. If the CE bit is set header was not changed within the tunnel. If the CE codepoint is set
upstream of the IP tunnel, then any erasure of the outer header's CE upstream of the IP tunnel, then any erasure of the outer header's CE
bit within the tunnel will have no effect because the inner header codepoint within the tunnel will have no effect because the inner
preserves the set value of the CE bit. However, if the CE bit is set header preserves the set value of the CE codepoint. However, if the
within the tunnel, and erased either within or downstream of the tun- CE codepoint is set within the tunnel, and erased either within or
nel, this is not necessarily detected at the egress point of the tun- downstream of the tunnel, this is not necessarily detected at the
nel. egress point of the tunnel.
With this subversion of end-to-end congestion control, an end-system With this subversion of end-to-end congestion control, an end-system
transport does not respond to the congestion indication. Along with transport does not respond to the congestion indication. Along with
the increased unfairness for the non-subverted flows described in the the increased unfairness for the non-subverted flows described in the
previous section, the congested router's queue could continue to previous section, the congested router's queue could continue to
build, resulting in packet loss at the congested router - which is a build, resulting in packet loss at the congested router - which is a
means for indicating congestion to the transport in any case. In the means for indicating congestion to the transport in any case. In the
interim, the flow might experience higher queueing delays, possibly interim, the flow might experience higher queueing delays, possibly
along with an increased bandwidth relative to other non-subverted along with an increased bandwidth relative to other non-subverted
flows. But transports do not inherently make assumptions of consis- flows. But transports do not inherently make assumptions of
tently experiencing carefully managed queueing in the path. We consistently experiencing carefully managed queueing in the path. We
believe that these forms of subverting end-to-end congestion control believe that these forms of subverting end-to-end congestion control
are no worse for the subverted flow than if the adversary had simply are no worse for the subverted flow than if the adversary had simply
dropped the packets of that flow itself. dropped the packets of that flow itself.
19.3. Non-ECN-Based Methods of Subverting End-to-end Congestion Control 19.3. Non-ECN-Based Methods of Subverting End-to-end Congestion Control
We have shown that, in many cases, a malicious or broken router that We have shown that, in many cases, a malicious or broken router that
is able to change the bits in the ECN field can do no more damage is able to change the bits in the ECN field can do no more damage
than if it had simply dropped the packet in question. However, this than if it had simply dropped the packet in question. However, this
is not true in all cases, in particular in the cases where the broken is not true in all cases, in particular in the cases where the broken
router subverted end-to-end congestion control by either falsely router subverted end-to-end congestion control by either falsely
indicating ECN-Capability or by erasing the ECN congestion indication indicating ECN-Capability or by erasing the ECN congestion indication
(in the CE-bit). While there are many ways that a router can harm a (in the CE codepoint). While there are many ways that a router can
flow by dropping packets, a router cannot subvert end-to-end conges- harm a flow by dropping packets, a router cannot subvert end-to-end
tion control by dropping packets. As an example, a router cannot congestion control by dropping packets. As an example, a router
subvert TCP congestion control by dropping data packets, acknowledge- cannot subvert TCP congestion control by dropping data packets,
ment packets, or control packets. acknowledgement packets, or control packets.
Even though packet-dropping cannot be used to subvert end-to-end con- Even though packet-dropping cannot be used to subvert end-to-end
gestion control, there *are* non-ECN-based methods for subverting congestion control, there *are* non-ECN-based methods for subverting
end-to-end congestion control that a broken or malicious router could end-to-end congestion control that a broken or malicious router could
use. For example, a broken router could duplicate data packets, thus use. For example, a broken router could duplicate data packets, thus
effectively negating the effects of end-to-end congestion control effectively negating the effects of end-to-end congestion control
along some portion of the path. (For a router that duplicated pack- along some portion of the path. (For a router that duplicated
ets within an IPsec tunnel, the security administrator can cause the packets within an IPsec tunnel, the security administrator can cause
duplicate packets to be discarded by configuring anti-replay protec- the duplicate packets to be discarded by configuring anti-replay
tion for the tunnel.) This duplication of packets within the network protection for the tunnel.) This duplication of packets within the
would have similar implications for the network and for the subverted network would have similar implications for the network and for the
flow as those described in Sections 18.1.1 and 18.1.4 above. subverted flow as those described in Sections 18.1.1 and 18.1.4
above.
20. The Motivation for the ECT bit. 20. The Motivation for the ECT Codepoints.
The need for the ECT bit is motivated by the fact that ECN will be 20.1. The Motivation for an ECT Codepoint.
deployed incrementally in an Internet where some transport protocols
and routers understand ECN and some do not. With the ECT bit, the
router can drop packets from flows that are not ECN-capable, but can
*instead* set the CE bit in packets that *are* ECN-capable. Because
the ECT bit allows an end node to have the CE bit set in a packet
*instead* of having the packet dropped, an end node might have some
incentive to deploy ECN.
If there was no ECT indication, then the router would have to set the The need for an ECT codepoint is motivated by the fact that ECN will
CE bit for packets from both ECN-capable and non-ECN-capable flows. be deployed incrementally in an Internet where some transport
In this case, there would be no incentive for end-nodes to deploy protocols and routers understand ECN and some do not. With an ECT
ECN, and no viable path of incremental deployment from a non-ECN codepoint, the router can drop packets from flows that are not ECN-
world to an ECN-capable world. Consider the first stages of such an capable, but can *instead* set the CE codepoint in packets that *are*
incremental deployment, where a subset of the flows are ECN-capable. ECN-capable. Because an ECT codepoint allows an end node to have the
At the onset of congestion, when the packet dropping/marking rate CE codepoint set in a packet *instead* of having the packet dropped,
would be low, routers would only set CE bits, rather than dropping an end node might have some incentive to deploy ECN.
packets. However, only those flows that are ECN-capable would under-
stand and respond to CE packets. The result is that the ECN-capable If there was no ECT codepoint, then the router would have to set the
flows would back off, and the non-ECN-capable flows would be unaware CE codepoint for packets from both ECN-capable and non-ECN-capable
of the ECN signals and would continue to open their congestion win- flows. In this case, there would be no incentive for end-nodes to
dows. deploy ECN, and no viable path of incremental deployment from a non-
ECN world to an ECN-capable world. Consider the first stages of such
an incremental deployment, where a subset of the flows are ECN-
capable. At the onset of congestion, when the packet
dropping/marking rate would be low, routers would only set CE
codepoints, rather than dropping packets. However, only those flows
that are ECN-capable would understand and respond to CE packets. The
result is that the ECN-capable flows would back off, and the non-ECN-
capable flows would be unaware of the ECN signals and would continue
to open their congestion windows.
In this case, there are two possible outcomes: (1) the ECN-capable In this case, there are two possible outcomes: (1) the ECN-capable
flows back off, the non-ECN-capable flows get all of the bandwidth, flows back off, the non-ECN-capable flows get all of the bandwidth,
and congestion remains mild, or (2) the ECN-capable flows back off, and congestion remains mild, or (2) the ECN-capable flows back off,
the non-ECN-capable flows don't, and congestion increases until the the non-ECN-capable flows don't, and congestion increases until the
router transitions from setting the CE bit to dropping packets. router transitions from setting the CE codepoint to dropping packets.
While this second outcome evens out the fairness, the ECN-capable While this second outcome evens out the fairness, the ECN-capable
flows would still receive little benefit from being ECN-capable, flows would still receive little benefit from being ECN-capable,
because the increased congestion would drive the router to packet- because the increased congestion would drive the router to packet-
dropping behavior. dropping behavior.
A flow that advertised itself as ECN-Capable but does not respond to A flow that advertised itself as ECN-Capable but does not respond to
CE bits is functionally equivalent to a flow that turns off conges- CE codepoints is functionally equivalent to a flow that turns off
tion control, as discussed earlier in this document. congestion control, as discussed earlier in this document.
Thus, in a world where a subset of the flows are ECN-capable, but Thus, in a world where a subset of the flows are ECN-capable, but
where ECN-capable flows have no mechanism for indicating that fact to where ECN-capable flows have no mechanism for indicating that fact to
the routers, there would be less effective and less fair congestion the routers, there would be less effective and less fair congestion
control in the Internet, resulting in a strong incentive for end control in the Internet, resulting in a strong incentive for end
nodes not to deploy ECN. nodes not to deploy ECN.
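The role of the ECT indication in a router's congestion response can be sketched as follows. This is an illustrative model rather than a specified algorithm: the function name, the `dropping_regime` flag (standing in for buffer overflow or an average queue above the maximum threshold), and the numeric codepoint values other than '11' for CE are assumptions.

```python
# Two-bit ECN field values; the mapping of the non-CE values is assumed.
NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11


def signal_congestion(ecn_field: int, dropping_regime: bool = False):
    """Decide how to signal congestion on a packet selected by the queue.

    Returns (action, ecn_field_after). A dropped packet has no field
    afterwards, so the second element is None in that case.
    """
    # In the packet-dropping regime, every selected packet is dropped.
    # Outside it, packets from non-ECN-capable flows are still dropped.
    if dropping_regime or ecn_field == NOT_ECT:
        return ("drop", None)
    # ECN-capable packets are marked instead of being dropped.
    return ("mark", CE)
```

Without the `NOT_ECT` test the router would mark every packet, which is the situation described above in which only the ECN-capable flows back off.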
20.2. The Motivation for two ECT Codepoints.
The primary motivation for the two ECT codepoints is to provide a
one-bit ECN nonce. The ECN nonce allows the development of
mechanisms for the sender to probabilistically verify that network
elements are not erasing the CE codepoint, and that data receivers
are properly reporting to the sender the receipt of packets with the
CE codepoint set.
Another possibility for senders to detect misbehaving network
elements or receivers would be for the data sender to occasionally
send a data packet with the CE codepoint set, to see if the receiver
reports receiving the CE codepoint. Of course, if these packets
encountered congestion in the network, the router might make no
change in the packets, because the CE codepoint would already be set.
Thus, for packets sent with the CE codepoint set, the TCP end-nodes
could not determine if some router intended to set the CE codepoint
in these packets. For this reason, sending packets with the CE
codepoint would have to be done sparingly, and would be a less
effective check against misbehaving network elements and receivers
than would be the ECN nonce.
The assignment of the fourth ECN codepoint to ECT(1) precludes the
use of this codepoint for other purposes. For clarity, we briefly
list those possible purposes here.
One possibility might have been for the data sender to use the fourth
ECN codepoint to indicate an alternate semantics for ECN. However,
this seems to us more appropriate to be signalled using a
differentiated services codepoint in the DS field.
A second possible use for the fourth ECN codepoint would have been to
give the router two separate codepoints for the indication of
congestion, CE(0) and CE(1), for mild and severe congestion
respectively. While this could be useful in some cases, this
certainly does not seem a compelling requirement at this point. If
there was judged to be a compelling need for this, the complications
of incremental deployment would most likely necessitate more than
just one codepoint for this function.
A third use that has been informally proposed for the ECN codepoint
is for use in some forms of multicast congestion control, based on
randomized procedures for duplicating marked packets at routers.
Some proposed multicast packet duplication procedures are based on a
new ECN codepoint that (1) conveys the fact that congestion occurred
upstream of the duplication point that marked the packet with this
codepoint and (2) can detect congestion downstream of that
duplication point. ECT(1) can serve this purpose because it is both
distinct from ECT(0) and is replaced by CE when ECN marking occurs in
response to congestion or incipient congestion. Explanation of how
this enhanced version of ECN would be used by multicast congestion
control is beyond the scope of this document, as are ECN-aware
multicast packet duplication procedures and the processing of the ECN
field at multicast receivers in all cases (i.e., irrespective of the
multicast packet duplication procedure(s) used).
The specification of IP tunnel modifications for ECN in this document
assumes that the only change made to the outer IP header's ECN field
between tunnel endpoints is to set the CE codepoint to indicate
congestion. This is not consistent with some of the proposed uses of
ECT(1) by the multicast duplication procedures in the previous
paragraph, and such procedures SHOULD NOT be deployed within tunnels
configured for full ECN functionality. Limited ECN functionality may
be used instead, although in practice many tunnel protocols
(including IPsec) will not work correctly if multicast traffic
duplication occurs within the tunnel.
21. Why use Two Bits in the IP Header? 21. Why use Two Bits in the IP Header?
Given the need for an ECT indication in the IP header, there still Given the need for an ECT indication in the IP header, there still
remains the question of whether the ECT (ECN-Capable Transport) and remains the question of whether the ECT (ECN-Capable Transport) and
CE (Congestion Experienced) indications should have been overloaded CE (Congestion Experienced) codepoints should have been overloaded on
on a single bit. This overloaded-one-bit alternative, explored in a single bit. This overloaded-one-bit alternative, explored in
[Floyd94], would have involved a single bit with two values. One [Floyd94], would have involved a single bit with two values. One
value, "ECT and not CE", would represent an ECN-Capable Transport, value, "ECT and not CE", would represent an ECN-Capable Transport,
and the other value, "CE or not ECT", would represent either Conges- and the other value, "CE or not ECT", would represent either
tion Experienced or a non-ECN-Capable transport. Congestion Experienced or a non-ECN-Capable transport.
One difference between the one-bit and two-bit implementations con- One difference between the one-bit and two-bit implementations
cerns packets that traverse multiple congested routers. Consider a concerns packets that traverse multiple congested routers. Consider
CE packet that arrives at a second congested router, and is selected a CE packet that arrives at a second congested router, and is
by the active queue management at that router for either marking or selected by the active queue management at that router for either
dropping. In the one-bit implementation, the second congested router marking or dropping. In the one-bit implementation, the second
has no choice but to drop the CE packet, because it cannot distin- congested router has no choice but to drop the CE packet, because it
guish between a CE packet and a non-ECT packet. In the two-bit cannot distinguish between a CE packet and a non-ECT packet. In the
implementation, the second congested router has the choice of either two-bit implementation, the second congested router has the choice of
dropping the CE packet, or of leaving it alone with the CE bit set. either dropping the CE packet, or of leaving it alone with the CE
codepoint set.
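The difference at a second congested router can be made concrete with a short sketch. The encodings and names here are hypothetical: the one-bit values 1 ("ECT and not CE") and 0 ("CE or not ECT") are chosen only for illustration, as are the action labels and the non-CE two-bit values.

```python
# Two-bit ECN field values; the mapping of the non-CE values is assumed.
NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11

# Hypothetical encoding of the overloaded single bit.
ECT_AND_NOT_CE = 1
CE_OR_NOT_ECT = 0


def options_two_bit(ecn_field: int) -> list:
    """Actions open to a router whose queue management selects this packet."""
    if ecn_field == CE:
        # Already marked upstream: drop it, or leave it with CE set.
        return ["drop", "leave_ce"]
    if ecn_field in (ECT_0, ECT_1):
        return ["drop", "mark_ce"]
    return ["drop"]  # not ECN-capable


def options_one_bit(bit: int) -> list:
    """Same decision when ECT and CE share one overloaded bit."""
    if bit == ECT_AND_NOT_CE:
        return ["drop", "mark_ce"]
    # A CE packet is indistinguishable from a non-ECT packet here.
    return ["drop"]
```

With two bits, a packet already carrying CE can pass the second router unchanged; with one bit, the same packet falls into the drop-only branch.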
Another difference between the one-bit and two-bit implementations Another difference between the one-bit and two-bit implementations
comes from the fact that with the one-bit implementation, receivers comes from the fact that with the one-bit implementation, receivers
in a single flow cannot distinguish between CE and non-ECT packets. in a single flow cannot distinguish between CE and non-ECT packets.
Thus, in the one-bit implementation an ECN-capable data sender would Thus, in the one-bit implementation an ECN-capable data sender would
have to unambiguously indicate to the receiver or receivers whether have to unambiguously indicate to the receiver or receivers whether
each packet had been sent as ECN-Capable or as non-ECN-Capable. One each packet had been sent as ECN-Capable or as non-ECN-Capable. One
possibility would be for the sender to indicate in the transport possibility would be for the sender to indicate in the transport
header whether the packet was sent as ECN-Capable. A second possi- header whether the packet was sent as ECN-Capable. A second
bility that would involve a functional limitation for the one- bit possibility that would involve a functional limitation for the one-
implementation would be for the sender to unambiguously indicate that bit implementation would be for the sender to unambiguously indicate
it was going to send *all* of its packets as ECN-Capable or as non- that it was going to send *all* of its packets as ECN-Capable or as
ECN-Capable. For a multicast transport protocol, this unambiguous non-ECN-Capable. For a multicast transport protocol, this
indication would have to be apparent to receivers joining an on-going unambiguous indication would have to be apparent to receivers joining
multicast session. an on-going multicast session.
Another concern that was described earlier (and recommended in this Another concern that was described earlier (and recommended in this
document) is that transports (particularly TCP) should not mark pure document) is that transports (particularly TCP) should not mark pure
ACK packets or retransmitted packets as being ECN-Capable. A pure ACK packets or retransmitted packets as being ECN-Capable. A pure
ACK packet from a non-ECN-capable transport could be dropped, without ACK packet from a non-ECN-capable transport could be dropped, without
necessarily having an impact on the transport from a congestion con- necessarily having an impact on the transport from a congestion
trol perspective (because subsequent ACKs are cumulative). An ECN- control perspective (because subsequent ACKs are cumulative). An
capable transport reacting to the CE bit set in a pure ACK packet by ECN-capable transport reacting to the CE codepoint in a pure ACK
reducing the window would be at a disadvantage in comparison to a packet by reducing the window would be at a disadvantage in
non-ECN-capable transport. For this reason (and for reasons described comparison to a non-ECN-capable transport. For this reason (and for
earlier in relation to retransmitted packets), it is desirable to reasons described earlier in relation to retransmitted packets), it
have the ECN-Capable bit indication on a per-packet basis. is desirable to have the ECT codepoint set on a per-packet basis.
Another advantage of the two-bit approach is that it is somewhat more Another advantage of the two-bit approach is that it is somewhat more
robust. The most critical issue, discussed in Section 8, is that the robust. The most critical issue, discussed in Section 8, is that the
default indication should be that of a non-ECN-Capable transport. In default indication should be that of a non-ECN-Capable transport. In
a two-bit implementation, this requirement for the default value sim- a two-bit implementation, this requirement for the default value
ply means that the ECT bit should be `OFF' by default. In the one- simply means that the non-ECT codepoint should be the default. In
bit implementation, this means that the single overloaded bit should the one-bit implementation, this means that the single overloaded bit
by default be in the "CE or not ECT" position. This is less clear should by default be in the "CE or not ECT" position. This is less
and straightforward, and possibly more open to incorrect implementa- clear and straightforward, and possibly more open to incorrect
tions either in the end nodes or in the routers. implementations either in the end nodes or in the routers.
In summary, while the one-bit implementation could be a possible In summary, while the one-bit implementation could be a possible
implementation, it has the following significant limitations relative implementation, it has the following significant limitations relative
to the two-bit implementation. First, the one-bit implementation has to the two-bit implementation. First, the one-bit implementation has
more limited functionality for the treatment of CE packets at a sec- more limited functionality for the treatment of CE packets at a
ond congested router. Second, the one-bit implementation requires second congested router. Second, the one-bit implementation requires
either that extra information be carried in the transport header of either that extra information be carried in the transport header of
packets from ECN-Capable flows (to convey the functionality of the packets from ECN-Capable flows (to convey the functionality of the
second bit elsewhere, namely in the transport header), or that second bit elsewhere, namely in the transport header), or that
senders in ECN-Capable flows accept the limitation that receivers senders in ECN-Capable flows accept the limitation that receivers
must be able to determine a priori which packets are ECN-Capable and must be able to determine a priori which packets are ECN-Capable and
which are not ECN-Capable. Third, the one-bit implementation is pos- which are not ECN-Capable. Third, the one-bit implementation is
sibly more open to errors from faulty implementations that choose the possibly more open to errors from faulty implementations that choose
wrong default value for the ECN bit. We believe that the use of the the wrong default value for the ECN bit. We believe that the use of
extra bit in the IP header for the ECT-bit is extremely valuable to the extra bit in the IP header for the ECT-bit is extremely valuable
overcome these limitations. to overcome these limitations.
22. Historical Definitions for the IPv4 TOS Octet 22. Historical Definitions for the IPv4 TOS Octet
RFC 791 [RFC791] defined the ToS (Type of Service) octet in the IP RFC 791 [RFC791] defined the ToS (Type of Service) octet in the IP
header. In RFC 791, bits 6 and 7 of the ToS octet are listed as header. In RFC 791, bits 6 and 7 of the ToS octet are listed as
"Reserved for Future Use", and are shown set to zero. The first two "Reserved for Future Use", and are shown set to zero. The first two
fields of the ToS octet were defined as the Precedence and Type of fields of the ToS octet were defined as the Precedence and Type of
Service (TOS) fields. Service (TOS) fields.
0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
skipping to change at page 53, line 45 skipping to change at page 57, line 20
The IPv4 TOS octet was redefined in RFC 1349 [RFC1349] as follows: The IPv4 TOS octet was redefined in RFC 1349 [RFC1349] as follows:
   0     1     2     3     4     5     6     7                 0     1     2     3     4     5     6     7
+-----+-----+-----+-----+-----+-----+-----+-----+           +-----+-----+-----+-----+-----+-----+-----+-----+
|   PRECEDENCE    |          TOS          | MBZ | RFC 1349  |   PRECEDENCE    |          TOS          | MBZ | RFC 1349
+-----+-----+-----+-----+-----+-----+-----+-----+           +-----+-----+-----+-----+-----+-----+-----+-----+
Bit 6 in the TOS field was defined in RFC 1349 for "Minimize Monetary Bit 6 in the TOS field was defined in RFC 1349 for "Minimize Monetary
Cost". In addition to the Precedence and Type of Service (TOS) Cost". In addition to the Precedence and Type of Service (TOS)
fields, the last field, MBZ (for "must be zero") was defined as cur- fields, the last field, MBZ (for "must be zero") was defined as
rently unused. RFC 1349 stated that "The originator of a datagram currently unused. RFC 1349 stated that "The originator of a datagram
sets [the MBZ] field to zero (unless participating in an Internet sets [the MBZ] field to zero (unless participating in an Internet
protocol experiment which makes use of that bit)." protocol experiment which makes use of that bit)."
RFC 1455 [RFC1455] defined an experimental standard that used all RFC 1455 [RFC1455] defined an experimental standard that used all
four bits in the TOS field to request a guaranteed level of link four bits in the TOS field to request a guaranteed level of link
security. security.
RFC 1349 and RFC 1455 have been obsoleted by "Definition of the Dif- RFC 1349 and RFC 1455 have been obsoleted by "Definition of the
ferentiated Services Field (DS Field) in the IPv4 and IPv6 Headers" Differentiated Services Field (DS Field) in the IPv4 and IPv6
[RFC2474] in which bits 6 and 7 of the DS field are listed as Cur- Headers" [RFC2474] in which bits 6 and 7 of the DS field are listed
rently Unused (CU). RFC 2780 [RFC2780] specified ECN as an experi- as Currently Unused (CU). RFC 2780 [RFC2780] specified ECN as an
mental use of the two-bit CU field. RFC 2780 updated the definition experimental use of the two-bit CU field. RFC 2780 updated the
of the DS Field to only encompass the first six bits of this octet definition of the DS Field to only encompass the first six bits of
rather than all eight bits; these first six bits are defined as the this octet rather than all eight bits; these first six bits are
Differentiated Services CodePoint (DSCP): defined as the Differentiated Services CodePoint (DSCP):
   0     1     2     3     4     5     6     7                     0     1     2     3     4     5     6     7
+-----+-----+-----+-----+-----+-----+-----+-----+               +-----+-----+-----+-----+-----+-----+-----+-----+
|               DSCP                |    CU     | RFCs 2474,    |               DSCP                |    CU     | RFCs 2474,
                                                  2780                                                            2780
+-----+-----+-----+-----+-----+-----+-----+-----+               +-----+-----+-----+-----+-----+-----+-----+-----+
Because of this unstable history, the definition of the ECN field in Because of this unstable history, the definition of the ECN field in
this document cannot be guaranteed to be backwards compatible with this document cannot be guaranteed to be backwards compatible with
all past uses of these two bits. all past uses of these two bits.
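As a rough illustration of the layout above (not part of either draft revision), the following Python sketch splits the octet into the six-bit DSCP and the two-bit field that this document calls the ECN field. The function name split_ds_octet is hypothetical; the only assumption is the RFC diagram convention that bit 0 is the most significant bit of the octet.

```python
def split_ds_octet(octet):
    """Split the former IPv4 TOS octet into (dscp, ecn_field).

    Bit 0 is taken to be the most significant bit, as in the diagrams,
    so bits 0-5 are the DSCP and bits 6-7 are the two-bit field.
    """
    if not 0 <= octet <= 0xFF:
        raise ValueError("octet must be in the range 0..255")
    dscp = octet >> 2         # bits 0-5: top six bits of the octet
    ecn_field = octet & 0b11  # bits 6-7: bottom two bits of the octet
    return dscp, ecn_field
```

For example, an octet of 0xB8 carries a DSCP of 46 with both of the last two bits zero.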
Prior to RFC 2474, routers were not permitted to modify bits in Prior to RFC 2474, routers were not permitted to modify bits in
either the DSCP or ECN field of packets forwarded through them, and either the DSCP or ECN field of packets forwarded through them, and
hence routers that comply only with RFCs prior to 2474 should have no hence routers that comply only with RFCs prior to 2474 should have no
effect on ECN. For end nodes, bit 7 (the ECN CE bit) must be trans- effect on ECN. For end nodes, bit 7 (the second ECN bit) must be
mitted as zero for any implementation compliant only with RFCs prior transmitted as zero for any implementation compliant only with RFCs
to 2474. Such nodes may transmit bit 6 (the ECN ECT bit) as one for prior to 2474. Such nodes may transmit bit 6 (the first ECN bit) as
the "Minimize Monetary Cost" provision of RFC 1349 or the experiment one for the "Minimize Monetary Cost" provision of RFC 1349 or the
authorized by RFC 1455; neither this aspect of RFC 1349 nor the experiment authorized by RFC 1455; neither this aspect of RFC 1349
experiment in RFC 1455 was widely implemented or used. The damage nor the experiment in RFC 1455 was widely implemented or used. The
that could be done by a broken, non-conformant router would be to damage that could be done by a broken, non-conformant router would
"erase" the CE bit for an ECN-capable packet that arrived at the include "erasing" the CE codepoint for an ECN-capable packet that
router with the CE bit set, or set the CE bit even in the absence of arrived at the router with the CE codepoint set, or setting the CE
congestion. This has been discussed in the section on "Non-compli- codepoint even in the absence of congestion. This has been discussed
ance in the Network". in the section on "Non-compliance in the Network".
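To make the overlap with RFC 1349 concrete, here is a small Python sketch (an illustration only, not text from either draft revision). It decodes an octet under the RFC 1349 layout and then reads the same octet's bits 6 and 7, the two bits this document uses for the ECN field. The names parse_rfc1349 and bits_6_and_7 are hypothetical; the bit positions follow the RFC 1349 diagram, with bit 0 as the most significant bit.

```python
# RFC 1349 TOS field value for "Minimize Monetary Cost".
# The four-bit TOS field occupies octet bits 3-6, so this value sets bit 6.
TOS_MINIMIZE_MONETARY_COST = 0b0001


def parse_rfc1349(octet):
    """Decode an octet as (precedence, tos, mbz) per the RFC 1349 layout."""
    precedence = octet >> 5      # bits 0-2
    tos = (octet >> 1) & 0b1111  # bits 3-6
    mbz = octet & 0b1            # bit 7, "must be zero"
    return precedence, tos, mbz


def bits_6_and_7(octet):
    """Return (bit 6, bit 7) of the octet, the two bits of the ECN field."""
    return (octet >> 1) & 1, octet & 1


# An octet built by a pre-RFC 2474 end node asking for minimum monetary cost.
legacy_octet = TOS_MINIMIZE_MONETARY_COST << 1
```

Here legacy_octet is 0x02: parse_rfc1349 reads it as precedence 0, TOS 0001, MBZ 0, while bits_6_and_7 reports bit 6 as one and bit 7 as zero, which is the case the paragraph above describes.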
The damage that could be done in an ECN-capable environment by a non- The damage that could be done in an ECN-capable environment by a non-
ECN-capable end-node transmitting packets with the ECT bit set has ECN-capable end-node transmitting packets with the ECT codepoint set
been discussed in the section on "Non-compliance by the End Nodes". has been discussed in the section on "Non-compliance by the End
Nodes".
23. IANA Considerations 23. IANA Considerations
The bits for ECT and CE in the ECN Field of the IP header and the The codepoints for the ECN Field of the IP header and the bits for
bits for CWR and ECE in the TCP header are specified by the Standards CWR and ECE in the TCP header are specified by the Standards Action
Action of this RFC, as is required by RFC 2780. We would note that of this RFC, as is required by RFC 2780.
this RFC does not define the codepoint of (ECT=0, CE=1) for the ECT
and CE bits.
IANA allocated the IPSEC Security Association Attribute value 10 for IANA allocated the IPSEC Security Association Attribute value 10 for
the ECN Tunnel use described in Section 9.2.1.2 above at the request the ECN Tunnel use described in Section 9.2.1.2 above at the request
of David Black in November 1999. If this draft is approved for pub- of David Black in November 1999. If this draft is approved for
lication as an RFC, IANA should change the Reference for this alloca- publication as an RFC, IANA should change the Reference for this
tion from David Black's request to this RFC based on its RFC number. allocation from David Black's request to this RFC based on its RFC
number.
AUTHORS' ADDRESSES AUTHORS' ADDRESSES
K. K. Ramakrishnan K. K. Ramakrishnan
TeraOptic Networks, Inc. TeraOptic Networks, Inc.
Phone: +1 (408) 666-8650 Phone: +1 (408) 666-8650
Email: kk@teraoptic.com Email: kk@teraoptic.com
Sally Floyd Sally Floyd
Phone: +1 (510) 666-2989 Phone: +1 (510) 666-2989
skipping to change at page 55, line 31 skipping to change at page 59, line 7
Email: floyd@aciri.org Email: floyd@aciri.org
URL: http://www.aciri.org/floyd/ URL: http://www.aciri.org/floyd/
David L. Black David L. Black
EMC Corporation EMC Corporation
42 South St. 42 South St.
Hopkinton, MA 01748 Hopkinton, MA 01748
Phone: +1 (508) 435-1000 x75140 Phone: +1 (508) 435-1000 x75140
Email: black_david@emc.com Email: black_david@emc.com
This draft was created in January 2001. This draft was created in February 2001.
It expires July 2001. It expires August 2001.
 End of changes. 
