--- 1/draft-ietf-tcpm-ecnsyn-03.txt 2008-01-09 05:12:11.000000000 +0100 +++ 2/draft-ietf-tcpm-ecnsyn-04.txt 2008-01-09 05:12:11.000000000 +0100 @@ -1,22 +1,22 @@ Internet Engineering Task Force A. Kuzmanovic INTERNET-DRAFT A. Mondal Intended status: Proposed Standard Northwestern University -Expires: 18 May 2008 S. Floyd +Expires: 8 July 2008 S. Floyd ICIR K.K. Ramakrishnan AT&T - 18 November 2007 + 8 January 2008 Adding Explicit Congestion Notification (ECN) Capability to TCP's SYN/ACK Packets - draft-ietf-tcpm-ecnsyn-03.txt + draft-ietf-tcpm-ecnsyn-04.txt Status of this Memo By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that @@ -50,60 +50,76 @@ timeout, this document specifies the use of ECN for the SYN/ACK packet itself, when sent in response to a SYN packet with the two ECN flags set in the TCP header, indicating a willingness to use ECN. Setting TCP SYN/ACK packets as ECN-Capable can be of great benefit to the TCP connection, avoiding the severe penalty of a retransmit timeout for a connection that has not yet started placing a load on the network. The sender of the SYN/ACK packet must respond to a report of an ECN-marked SYN/ACK packet by reducing its initial congestion window from two, three, or four segments to one segment, thereby reducing the subsequent load from that connection on the - network. + network. This document is intended to update RFC 3168. Table of Contents 1. Introduction ....................................................4 - 2. Conventions .....................................................5 + 2. Conventions and Terminology .....................................6 3. 
Proposal ........................................................6 4. Discussion ......................................................9 5. Related Work ...................................................12 6. Performance Evaluation .........................................13 6.1. The Costs and Benefit of Adding ECN-Capability ............13 6.2. An Evaluation of Different Responses to ECN-Marked SYN/ACK Packets ........................................................14 7. Security Considerations ........................................15 8. Conclusions ....................................................16 9. Acknowledgements ...............................................17 A. Report on Simulations ..........................................17 - A.1. Simulations with RED in Packet Mode .......................18 + A.1. Simulations with RED in Packet Mode .......................17 A.2. Simulations with RED in Byte Mode .........................19 - Normative References ..............................................20 - Informative References ............................................20 - IANA Considerations ...............................................22 - Full Copyright Statement ..........................................22 - Intellectual Property .............................................23 + B. Issues of Incremental Deployment ...............................20 + Normative References ..............................................23 + Informative References ............................................23 + IANA Considerations ...............................................24 + Full Copyright Statement ..........................................25 + Intellectual Property .............................................25 NOTE TO RFC EDITOR: PLEASE DELETE THIS NOTE UPON PUBLICATION. + Changes from draft-ietf-tcpm-ecnsyn-03: + + * General editing. This includes using the terms "initiator" + and "responder" for the two ends of the TCP connection. 
+ Feedback from Alfred Hoenes. + + * Added some text to the backwards compatibility discussion, + now in Appendix B, about the pros and cons of using a TCP + flag for the TCP initiator to signal that it understands + ECN-Capable SYN/ACK packets. The consensus at this time is + not to use such a flag. Also added a recommendation that + TCP implementations include a management interface to turn + off the use of ECN for SYN/ACK packets. From email from + Bob Briscoe. + Changes from draft-ietf-tcpm-ecnsyn-02: * Added to the discussion in the Security section of whether ECN-Capable TCP SYN packets have problems with firewalls, over and above the known problems of TCP data packets (e.g., as in the Microsoft report). From a question raised at the TCPM meeting at the July 2007 IETF. * Added a sentence to the discussion of routers or middleboxes that *might* drop TCP SYN packets on the basis of IP header fields. Feedback from Remi Denis-Courmont. - * General editing. Feedback from Alfred Henes. + * General editing. Feedback from Alfred Hoenes. Changes from draft-ietf-tcpm-ecnsyn-01: * Changes in response to feedback from Anil Agarwal. * Added a look at the costs of adding ECN-Capability to SYN/ACKs in a highly-congested scenario. From feedback from Mark Allman and Janardhan Iyengar. * Added a comparative evaluation of two possible responses @@ -180,25 +196,25 @@ congestion while avoiding unnecessary retransmissions and, in some cases, unnecessary retransmit timeouts. Thus, using ECN has several benefits: 1) For short transfers, a TCP connection's congestion window may be small. For example, if the current window contains only one packet, and that packet is dropped, TCP will have to wait for a retransmit timeout to recover, reducing its overall throughput. 
Similarly, if the current window contains only a few packets and one of those packets is dropped, there might not be enough duplicate - acknowledgements for a fast retransmission, and the sender might have - to wait for a delay of several round-trip times using Limited - Transmit [RFC3042]. With the use of ECN, short flows are less likely - to have packets dropped, sometimes avoiding unnecessary delays or - costly retransit timeouts. + acknowledgements for a fast retransmission, and the sender of the + data packet might have to wait for a delay of several round-trip + times using Limited Transmit [RFC3042]. With the use of ECN, short + flows are less likely to have packets dropped, sometimes avoiding + unnecessary delays or costly retransmit timeouts. 2) While longer flows may not see substantially improved throughput with the use of ECN, they experience lower loss. This may benefit TCP applications that are latency- and loss-sensitive, because of the avoidance of retransmissions. RFC 3168 only specifies marking the Congestion Experienced codepoint on TCP's data packets, and not on SYN and SYN/ACK packets. RFC 3168 specifies the negotiation of the use of ECN between the two TCP end- points in the TCP SYN and SYN-ACK exchange, using flags in the TCP @@ -219,44 +235,49 @@ benefits: 1) Avoidance of a retransmit timeout; 2) Improvement in the throughput of short connections. This draft specifies ECN+, a modification to RFC 3168 to allow TCP SYN/ACK packets to be ECN-Capable. Section 3 contains the specification of the change, while Section 4 discusses some of the issues, and Section 5 discusses related work. Section 6 contains an evaluation of the proposed change. -2. Conventions +2. Conventions and Terminology The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC 2119]. -3. 
Proposal - - This section specifies the modification to RFC 3168 to allow TCP - SYN/ACK packets to be ECN-Capable. We use the following terminology - from RFC 3168: + We use the following terminology from RFC 3168: The ECN field in the IP header: o CE: the Congestion Experienced codepoint; and o ECT: either one of the two ECN-Capable Transport codepoints. The ECN flags in the TCP header: o CWR: the Congestion Window Reduced flag; and o ECE: the ECN-Echo flag. ECN-setup packets: o ECN-setup SYN packet: a SYN packet with the ECE and CWR flags; o ECN-setup SYN-ACK packet: a SYN-ACK packet with ECE but not CWR. + In this document we use the terms "initiator" and "responder" to + refer to the sender of the SYN packet and of the SYN-ACK packet, + respectively. + +3. Proposal + + This section specifies the modification to RFC 3168 to allow TCP + SYN/ACK packets to be ECN-Capable. + RFC 3168 in Section 6.1.1. states that "A host MUST NOT set ECT on SYN or SYN-ACK packets." In this section, we specify that a TCP node MAY respond to an ECN-setup SYN packet by setting ECT in the responding ECN-setup SYN/ACK packet, indicating to routers that the SYN/ACK packet is ECN-Capable. This allows a congested router along the path to mark the packet instead of dropping the packet as an indication of congestion. Assume that TCP node A transmits to TCP node B an ECN-setup SYN packet, indicating willingness to use ECN for this connection. As @@ -282,34 +303,34 @@ 3-second timer expires <--- ECN-setup SYN/ACK, not ECT <--- ECN-setup SYN/ACK Data/ACK ---> Data/ACK ---> <--- Data (one to four segments) --------------------------------------------------------------- Figure 1: SYN exchange with the SYN/ACK packet dropped. - If the SYN/ACK packet is dropped in the network, the TCP host (node + If the SYN/ACK packet is dropped in the network, the responder (node B) responds by waiting three seconds for the retransmit timer to expire [RFC2988]. 
If a SYN/ACK packet with the ECT codepoint is - dropped, the TCP node SHOULD resend the SYN/ACK packet without the + dropped, the responder SHOULD resend the SYN/ACK packet without the ECN-Capable codepoint. (Although we are not aware of any middleboxes that drop SYN/ACK packets that contain an ECN-Capable codepoint in the IP header, we have learned to design our protocols defensively in this regard [RFC3360].) - We note that if syn-cookies were used by Node B in the exchange in - Figure 1, TCP Node B wouldn't set a timer upon transmission of the - SYN/ACK packet [SYN-COOK]. In this case, if the SYN/ACK packet was - lost, the initiator (Node A) would have to timeout and retransmit the - SYN packet in order to trigger another SYN-ACK. + We note that if syn-cookies were used by the responder (node B) in + the exchange in Figure 1, the responder wouldn't set a timer upon + transmission of the SYN/ACK packet [SYN-COOK]. In this case, if the + SYN/ACK packet was lost, the initiator (Node A) would have to timeout + and retransmit the SYN packet in order to trigger another SYN-ACK. Figure 2 shows an interchange with the SYN/ACK packet sent as ECN- Capable, and ECN-marked instead of dropped at the congested router. --------------------------------------------------------------- TCP Node A Router TCP Node B ---------- ------ ---------- ECN-setup SYN packet ---> ECN-setup SYN packet ---> @@ -319,48 +340,53 @@ <--- ECN-setup SYN/ACK, CE Data/ACK, ECN-Echo ---> Data/ACK, ECN-Echo ---> Window reduced to one segment. <--- Data, CWR (one segment only) --------------------------------------------------------------- Figure 2: SYN exchange with the SYN/ACK packet marked. - If the receiving node (node A) receives a SYN/ACK packet that has - been marked by the congested router, with the CE codepoint set, the - receiving node MUST respond by setting the ECN-Echo flag in the TCP - header of the responding ACK packet. 
As specified in RFC 3168, the - receiving node continues to set the ECN-Echo flag in packets until it + If the initiator (node A) receives a SYN/ACK packet that has been + marked by the congested router, with the CE codepoint set, the + initiator MUST respond by setting the ECN-Echo flag in the TCP header + of the responding ACK packet. As specified in RFC 3168, the + initiator continues to set the ECN-Echo flag in packets until it receives a packet with the CWR flag set. - When the sending node (node B) receives the ECN-Echo packet reporting - the Congestion Experienced indication in the SYN/ACK packet, the node - MUST set the initial congestion window to one segment, instead of two - segments as allowed by [RFC2581], or three or four segments allowed - by [RFC3390]. If the sending node (node B) was going to use an - initial window of one segment, and receives an ECN-Echo packet - informing it of a Congestion Experienced indication on its SYN/ACK - packet, the sending node MAY continue to send with an initial window - of one segment, without waiting for a retransmit timeout. We note - that this updates RFC 3168, which specifies that "the sending TCP - MUST reset the retransmit timer on receiving the ECN-Echo packet when - the congestion window is one." As specified by RFC 3168, the sending - node (node B) also sets the CWR flag in the TCP header of the next - data packet sent, to acknowledge its receipt of and reaction to the - ECN-Echo flag. + When the responder (node B) receives the ECN-Echo packet reporting + the Congestion Experienced indication in the SYN/ACK packet, the + responder MUST set the initial congestion window to one segment, + instead of two segments as allowed by [RFC2581], or three or four + segments allowed by [RFC3390]. 
If the responder (node B) was going + to use an initial window of one segment, and receives an ECN-Echo + packet informing it of a Congestion Experienced indication on its + SYN/ACK packet, the responder MAY continue to send with an initial + window of one segment, without waiting for a retransmit timeout. We + note that this updates RFC 3168, which specifies that "the sending + TCP MUST reset the retransmit timer on receiving the ECN-Echo packet + when the congestion window is one." As specified by RFC 3168, the + responder (node B) also sets the CWR flag in the TCP header of the + next data packet sent, to acknowledge its receipt of and reaction to + the ECN-Echo flag. If the data transfer in Figure 2 is entirely from Node A to Node B, then data packets from Node A continue to set the ECN-Echo flag in data packets, waiting for the CWR flag from Node B acknowledging a response to the ECN-Echo flag. + The TCP implementation using ECN-Capable SYN/ACK packets SHOULD + include a management interface to allow the use of ECN to be turned + off for SYN/ACK packets. This is to deal with possible backwards + compatibility problems such as those discussed in Appendix B. + 4. Discussion Motivation: The rationale for the proposed change is the following. When node B receives a TCP SYN packet with ECN-Echo bit set in the TCP header, this indicates that node A is ECN-capable. If node B is also ECN- capable, there are no obstacles to immediately setting one of the ECN-Capable codepoints in the IP header in the responding TCP SYN/ACK packet. @@ -400,49 +426,20 @@ Second, the ECN-Capable codepoint in TCP SYN packets could be misused by malicious clients to `improve' the well-known TCP SYN attack. By setting an ECN-Capable codepoint in TCP SYN packets, a malicious host might be able to inject a large number of TCP SYN packets through a potentially congested ECN-enabled router, congesting it even further. 
For both these reasons, we continue the restriction that the TCP SYN packet MUST NOT have the ECN-Capable codepoint in the IP header set. - Backwards compatibility: - In order for TCP node B to send a SYN/ACK packet as ECN-Capable, node - B must have received an ECN-setup SYN packet from node A. However, - it is possible that node A supports ECN, but either ignores the CE - codepoint on received SYN/ACK packets, or ignores SYN/ACK packets - with the ECT or CE codepoint set. If the TCP sender ignores the CE - codepoint on received SYN/ACK packets, this would mean that the TCP - connection would not respond to this congestion indication. However, - this seems to us an acceptable cost to pay in the incremental - deployment of ECN-Capability for TCP's SYN/ACK packets. It would - mean that the sender of the SYN/ACK packet would not reduce the - initial congestion window from two, three, or four segments down to - one segment, as it should. However, the TCP sender would still - respond correctly to any subsequent CE indications on data packets - later on in the connection. Thus, to be explicit, when a TCP - connection includes a sender that supports ECN but *does not* support - ECN-Capability for SYN/ACK packets, in combination with a receiver - that *does* support ECN-Capabililty for SYN/ACK packets, it is quite - possible that the ECN-Capable SYN/ACK packets will be marked rather - than dropped in the network, and that the sender will not respond to - the ECN mark on the SYN/ACK packet. - - It is also possible that in some older TCP implementation, the TCP - sender would ignore arriving SYN/ACK packets that had the ECT or CE - codepoint set. This would result in a delay in connection set-up for - that TCP connection, with the TCP sender re-sending the SYN packet - after a retransmit timeout. We are not aware of any TCP - implementations with this behavior. 
- SYN/ACK packets and packet size: There are a number of router buffer architectures that have smaller dropping rates for small (SYN) packets than for large (data) packets. For example, for a Drop Tail queue in units of packets, where each packet takes a single slot in the buffer regardless of packet size, small and large packets are equally likely to be dropped. However, for a Drop Tail queue in units of bytes, small packets are less likely to be dropped than are large ones. Similarly, for RED in packet mode, small and large packets are equally likely to be dropped or marked, while for RED in byte mode, a packet's chance of being @@ -461,38 +458,38 @@ We believe that there are a wide range of behaviors in the real world in terms of the drop or mark behavior at routers as a function of packet size [Tools] (Section 10). We note that all of these alternatives listed above are available in the NS simulator (Drop Tail queues are by default in units of packets, while the default for RED queue management has been changed from packet mode to byte mode). Response to ECN-marking of SYN/ACK packets: One question is why TCP SYN/ACK packets should be treated differently - from other packets in terms of the packet sender's response to an - ECN-marked packet. Section 5 of RFC 3168 specifies the following: + from other packets in terms of the end node's response to an ECN- + marked packet. Section 5 of RFC 3168 specifies the following: "Upon the receipt by an ECN-Capable transport of a single CE packet, the congestion control algorithms followed at the end-systems MUST be essentially the same as the congestion control response to a *single* dropped packet. For example, for ECN-Capable TCP the source TCP is required to halve its congestion window for any window of data containing either a packet drop or an ECN indication." 
In particular, Section 6.1.2 of RFC 3168 specifies that when the TCP congestion window consists of a single packet and that packet is ECN- - marked in the network, then the sender must reduce the sending rate - below one packet per round-trip time, by waiting for one RTO before - sending another packet. If the RTO was set to the average round-trip - time, this would result in halving the sending rate; because the RTO - is in fact larger than the average round-trip time, the sending rate - is reduced to less than half of its previous value. + marked in the network, then the data sender must reduce the sending + rate below one packet per round-trip time, by waiting for one RTO + before sending another packet. If the RTO was set to the average + round-trip time, this would result in halving the sending rate; + because the RTO is in fact larger than the average round-trip time, + the sending rate is reduced to less than half of its previous value. TCP's congestion control response to the *dropping* of a SYN/ACK packet is to wait a default time before sending another packet. This document argues that ECN gives end-systems a wider range of possible responses to the *marking* of a SYN/ACK packet, and that waiting a default time before sending a data packet is not the desired response. On the conservative end, one could assume an effective congestion window of one packet for the SYN/ACK packet, and respond to an ECN- @@ -504,28 +501,28 @@ seconds before sending a data packet. However, we note that for an ECN-marked SYN/ACK packet, halving the *congestion window* is not the same as halving the *sending rate*; there is no `sending rate' associated with an ECN-Capable SYN/ACK packet, as such packets are only sent as the first packet in a connection from that host. Further, a router's marking of a SYN/ACK packet is not affected by any past history of that connection. 
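The responder behavior specified in Section 3 for a reported ECN mark on the SYN/ACK packet can be sketched as follows. This is only an illustrative sketch, not part of the specification: the function name, its parameters, and the returned tuple are hypothetical, and the default initial window is assumed to be between one and four segments as allowed by [RFC2581] and [RFC3390].

```python
def responder_initial_window(default_iw: int, sent_ect: bool, ece_on_ack: bool):
    """Hypothetical helper: choose the responder's initial congestion window.

    default_iw  -- initial window (in segments) the responder would otherwise use
    sent_ect    -- True if the SYN/ACK packet was sent with an ECT codepoint
    ece_on_ack  -- True if the ACK of the SYN/ACK carried the ECN-Echo flag

    Returns (initial_window_in_segments, set_cwr_on_first_data_packet).
    """
    if not 1 <= default_iw <= 4:
        raise ValueError("default initial window assumed to be 1..4 segments")

    if sent_ect and ece_on_ack:
        # The SYN/ACK was reported as ECN-marked: start with one segment,
        # send it without waiting for a retransmit timeout, and set CWR on
        # the first data packet to acknowledge the ECN-Echo.
        return 1, True

    # No congestion report on the SYN/ACK: keep the default initial window.
    return default_iw, False


if __name__ == "__main__":
    print(responder_initial_window(4, sent_ect=True, ece_on_ack=True))
    print(responder_initial_window(3, sent_ect=True, ece_on_ack=False))
```

Note that the sketch covers only the choice of initial window. If the single data packet is itself ECN-marked or dropped, the responder then waits an RTO before sending another packet, as described in the text.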
Adding ECN-Capability to SYN/ACK packets allows the simple response - of setting the initial congestion window to one packet, instead of - its allowed default value of two, three, or four packets, with the - host proceeding with a cautious sending rate of one packet per round- - trip time. If that packet is ECN-marked or dropped, then the sender - will wait an RTO before sending another packet. This document argues - that this approach is useful to users, with no dangers of congestion - collapse or of starvation of competing traffic. This is discussed in - more detail below in Section 6.2. + of the responder setting the initial congestion window to one packet, + instead of its allowed default value of two, three, or four packets, + with the responder proceeding with a cautious sending rate of one + packet per round-trip time. If that data packet is ECN-marked or + dropped, then the responder will wait an RTO before sending another + packet. This document argues that this approach is useful to users, + with no dangers of congestion collapse or of starvation of competing + traffic. This is discussed in more detail below in Section 6.2. We note that if the data transfer is entirely from Node A to Node B, then there is no effective difference between the two possible responses to an ECN-marked SYN/ACK packet outlined above. In either case, Node B sends no data packets, only sending acknowledgement packets in response to received data packets. 5. Related Work The addition of ECN-capability to TCP's SYN/ACK packets was proposed @@ -632,21 +630,21 @@ Thus, the degree of benefit of adding ECN-Capability to SYN/ACK packets depends not only on the overall packet drop rate in the network, but also on the queue management architecture at the congested link. 6.2. 
An Evaluation of Different Responses to ECN-Marked SYN/ACK Packets This document specifies that the end-node responds to the report of an ECN-marked SYN/ACK packet by setting the initial congestion window to one segment, instead of its possible default value of two to four - segments. We call this ECN+ with NoWaiting. However, in Section 4 + segments. We call this ECN+ with NoWaiting. However, Section 4 discussed another possible response to an ECN-marked SYN/ACK packet, of the end-node waiting an RTT before sending a data packet. We call this approach ECN+ with Waiting. Simulations comparing the performance with Standard ECN (without ECN- marked SYN/ACK packets), ECN+ with NoWaiting, and ECN+ with Waiting show little difference, in terms of aggregate congestion, between ECN+ with NoWaiting and ECN+ with Waiting. The details are given in Appendix A below. Our conclusions are that ECN+ with NoWaiting is perfectly safe, and there are no congestion-related reasons for @@ -685,27 +683,27 @@ Capable or CE codepoint in the IP header (over and above the routers already known to crash when a data packet arrives with either ECT(0) or ECT(1)), but we have not conducted any measurement studies of this [F07]. Congestion collapse: Because TCP SYN/ACK packets carrying an ECT codepoint could be ECN- marked instead of dropped at an ECN-capable router, the concern is whether this can either invoke congestion, or worsen performance in highly congested scenarios. However, after learning that a SYN/ACK - packet was ECN-marked, the sender of that packet will only send one - data packet; if this data packet is ECN-marked, the sender will then - wait for a retransmission timeout. In addition, routers are free to - drop rather than mark arriving packets in times of high congestion, - regardless of whether the packets are ECN-capable. When congestion - is very high and a router's buffer is full, the router has no choice - but to drop rather than to mark an arriving packet. 
+ packet was ECN-marked, the responder will only send one data packet; + if this data packet is ECN-marked, the responder will then wait for a + retransmission timeout. In addition, routers are free to drop rather + than mark arriving packets in times of high congestion, regardless of + whether the packets are ECN-capable. When congestion is very high + and a router's buffer is full, the router has no choice but to drop + rather than to mark an arriving packet. The simulations reported in Appendix A show that even with demanding traffic mixes dominated by short flows and high levels of congestion, the aggregate packet dropping rates are not significantly different with Standard ECN, ECN+ with NoWaiting, or ECN+ with Waiting. In particular, the simulations show that in periods of very high congestion the packet-marking rate is low with or without ECN+, and the use of ECN+ does not significantly increase the number of dropped or marked packets. @@ -744,22 +742,23 @@ the server to more appropriately adjust the initial load it places on the network. Future work will address the more general question of adding ECN- Capability to relevant handshake packets in other protocols that use retransmission-based reliability in their setup phase (e.g., SCTP, DCCP, HIP, and the like). 9. Acknowledgements - We thank Anil Agarwal, Mark Allman, Wesley Eddy, Janardhan Iyengar, - and Pasi Sarolahti for feedback on earlier versions of this draft. + We thank Anil Agarwal, Mark Allman, Remi Denis-Courmont, Wesley Eddy, + Alfred Hoenes, Janardhan Iyengar, and Pasi Sarolahti for feedback on + earlier versions of this draft. A. Report on Simulations This section reports on simulations showing the costs of adding ECN+ in highly-congested scenarios. This section also reports on simulations for a comparative evaluation between ECN+ with NoWaiting and ECN+ with Waiting. The simulations are run with a range of file-size distributions. 
As a baseline, they use the empirical heavy-tailed distribution reported @@ -768,28 +767,28 @@ lower and higher values to get distributions with mean file sizes of 3 KBytes, 5 KBytes, 14 KBytes and 17 KBytes. The congested link is 100 Mbps. RED is run in gentle mode, and arriving ECN-Capable packets are only dropped instead of marked if the buffer is full (and the router has no choice). We explore two alternatives for a TCP node's response to a report of an ECN-marked SYN/ACK packet. With ECN+ with NoWaiting, the TCP node sends a data packet immediately (with an initial congestion window of one segment). With the alternative ECN+ with Waiting, the TCP node - waits a round-trip time before sending a data packet; the sender + waits a round-trip time before sending a data packet; the responder already has one measurement of the round-trip time when the acknowledgement for the SYN/ACK packet is received. In the tables below, ECN+ refers to ECN+ with NoWaiting, where the - sender starts transmitting immediately, and ECN+/wait refers to ECN+ - with Waiting, where the sender waits a round-trip time before sending - a data packet into the network. + responder starts transmitting immediately, and ECN+/wait refers to + ECN+ with Waiting, where the responder waits a round-trip time before + sending a data packet into the network. The simulation scripts are available on [ECN-SYN], along with graphs showing the distribution of response times for the TCP connections. A.1. Simulations with RED in Packet Mode The simulations with RED in packet mode and with the queue in packets show that ECN+ is useful in times of moderate congestion, though it adds little benefit in times of high congestion. 
The simulations show a minimal increase in levels of congestion with either ECN+ with @@ -852,21 +852,21 @@ Traffic Load = 200%: ECN ECN+ ECN+/wait ------- ------- ------- Loss rate 29.99% 30.22% 30.23% Table 1: Simulations with an average flow size of 3 Kbytes, RED in packet mode, queue in packets. A.2. Simulations with RED in Byte Mode - Table 3 below shows simulations with RED in byte mode and the queue + Table 2 below shows simulations with RED in byte mode and the queue in bytes. Like the simulations with RED in packet mode, there is no significant increase in aggregate congestion with the use of ECN+ or ECN+/wait, and no congestion-related reason to prefer ECN+/wait over ECN+. However, unlike the simulations with RED in packet mode, the simulations with RED in byte mode show little benefit from the use of ECN+ or ECN+/wait, in that the packet marking rate with ECN+ or ECN+/wait is not much different than the packet marking rate with Standard ECN. This is because with RED in byte mode, small packets @@ -889,43 +889,153 @@ Marked 4,086 4,644 4,826 Loss rate 5.90% 5.78% 5.81% Traffic Load = 125%: ECN ECN+ ECN+/wait ------- ------- ------- Dropped 157,305 157,435 158,368 Marked 2,183 2,363 2,663 Loss rate 9.89% 9.87% 9.93% - Table 3: Simulations with an average flow size of 3 Kbytes, RED in + Table 2: Simulations with an average flow size of 3 Kbytes, RED in byte mode, queue in bytes. +B. Issues of Incremental Deployment + + In order for TCP node B to send a SYN/ACK packet as ECN-Capable, node + B must have received an ECN-setup SYN packet from node A. However, + it is possible that node A supports ECN, but either ignores the CE + codepoint on received SYN/ACK packets, or ignores SYN/ACK packets + with the ECT or CE codepoint set. If the TCP initiator ignores the + CE codepoint on received SYN/ACK packets, this would mean that the + TCP responder would not respond to this congestion indication. 
+ However, this seems to us an acceptable cost to pay in the + incremental deployment of ECN-Capability for TCP's SYN/ACK packets. + It would mean that the responder would not reduce the initial + congestion window from two, three, or four segments down to one + segment, as it should. However, the TCP end nodes would still + respond correctly to any subsequent CE indications on data packets + later on in the connection. + + Figure 3 shows an interchange with the SYN/ACK packet ECN-marked, but + with the ECN mark ignored by the TCP initiator. + + --------------------------------------------------------------- + TCP Node A Router TCP Node B + ---------- ------ ---------- + + ECN-setup SYN packet ---> + ECN-setup SYN packet ---> + + <--- ECN-setup SYN/ACK, ECT + <--- Sets CE on SYN/ACK + <--- ECN-setup SYN/ACK, CE + + Data/ACK, No ECN-Echo ---> + Data/ACK ---> + <--- Data (up to four packets) + --------------------------------------------------------------- + + Figure 3: SYN exchange with the SYN/ACK packet marked, + but with the ECN mark ignored by the TCP initiator. + + Thus, to be explicit, when a TCP connection includes an initiator + that supports ECN but *does not* support ECN-Capability for SYN/ACK + packets, in combination with a responder that *does* support ECN- + Capability for SYN/ACK packets, it is possible that the ECN-Capable + SYN/ACK packets will be marked rather than dropped in the network, + and that the responder will not learn about the ECN mark on the + SYN/ACK packet. This would not be a problem if most packets from the + responder supporting ECN for SYN/ACK packets were in long-lived TCP + connections, but it would be more problematic if most of the packets + were from TCP connections consisting of four data packets, and the + TCP responder for these connections was ready to send its data + packets immediately after the SYN/ACK exchange. 
+   Of course, with
+   *severe* congestion, the SYN/ACK packets would likely be dropped
+   rather than ECN-marked at the congested router, preventing the TCP
+   responder from adding to the congestion by sending its initial window
+   of four data packets.
+
+   It is also possible that in some older TCP implementation, the
+   initiator would ignore arriving SYN/ACK packets that had the ECT or
+   CE codepoint set. This would result in a delay in connection set-up
+   for that TCP connection, with the initiator re-sending the SYN packet
+   after a retransmit timeout. We are not aware of any TCP
+   implementations with this behavior.
+
+   One possibility for coping with problems of backwards compatibility
+   would be for TCP initiators to use a TCP flag that means "I
+   understand ECN-Capable SYN/ACK packets". If this document were to
+   standardize the use of such an "ECN-SYN" flag, then the TCP responder
+   would only send a SYN/ACK packet as ECN-capable if the incoming SYN
+   packet had the "ECN-SYN" flag set. An ECN-SYN flag would prevent the
+   backwards compatibility problems described in the paragraphs above.
+
+   One drawback to the use of an ECN-SYN flag is that it would use one
+   of the four remaining reserved bits in the TCP header, for a
+   transient backwards compatibility problem. This drawback is limited
+   by the fact that the "ECN-SYN" flag would be defined only for use
+   with ECN-setup SYN packets; that bit in the TCP header could be
+   defined to have other uses for other kinds of TCP packets.
+
+   Factors in deciding not to use an ECN-SYN flag include the following:
+
+   (1) The limited installed base: At the time that this document was
+   written, the TCP implementations in Microsoft Vista and Mac OS X
+   included ECN, but ECN was not enabled by default [SBT07]. Thus,
+   there was not a large deployed base of ECN-Capable TCP
+   implementations. This limits the scope of any backwards
+   compatibility problems.
+
+   (2) Limits to the scope of the problem: The backwards compatibility
+   problem would not be serious enough to cause congestion collapse;
+   with severe congestion, the buffer at the congested router will
+   overflow, and the congested router will drop rather than ECN-mark
+   arriving SYN packets. Some active queue management mechanisms might
+   switch from packet-marking to packet-dropping in times of high
+   congestion before buffer overflow, as recommended in Section 19.1 of
+   RFC 3168. This helps to prevent congestion collapse problems with
+   the use of ECN.
+
+   (3) Detection of and response to backwards-compatibility problems: A
+   TCP responder such as a web server can't differentiate between a
+   SYN/ACK packet that is not ECN-marked in the network, and a SYN/ACK
+   packet that is ECN-marked, but where the ECN mark is ignored by the
+   TCP initiator. However, a TCP responder *can* detect if a SYN/ACK
+   packet is sent as ECN-capable and not reported as ECN-marked, but
+   data packets are dropped or marked from the initial window of data.
+   We will call this scenario "initial-window-congestion". If a web
+   server frequently experienced initial-window congestion (without
+   SYN/ACK congestion), then the web server *might* be experiencing
+   backwards compatibility problems with ECN-Capable SYN/ACK packets,
+   and could respond by not sending SYN/ACK packets as ECN-Capable.
+
 Normative References

    [RFC 2119] S. Bradner, Key words for use in RFCs to Indicate
    Requirement Levels, RFC 2119, March 1997.

    [RFC3168] K.K. Ramakrishnan, S. Floyd, and D. Black, The Addition of
    Explicit Congestion Notification (ECN) to IP, RFC 3168, Proposed
    Standard, September 2001.

 Informative References

    [ECN+] A. Kuzmanovic, The Power of Explicit Congestion Notification,
    SIGCOMM 2005.

    [ECN-SYN] ECN-SYN web page with simulation scripts, URL to be added.

    [F07] S. Floyd, "[BEHAVE] Response of firewalls and middleboxes to
    TCP SYN packets that are ECN-Capable?", August 2, 2007, email sent
    to the BEHAVE mailing list, URL "http://www1.ietf.org/mail-
-   archive/web/behave/current/msg02644.html".`
+   archive/web/behave/current/msg02644.html".

    [Kelson00] Dax Kelson, note sent to the Linux kernel mailing list,
    September 10, 2000.

    [MAF05] A. Medina, M. Allman, and S. Floyd. Measuring the Evolution
    of Transport Protocols in the Internet, ACM CCR, April 2005.

    [PI] C. Hollot, V. Misra, W. Gong, and D. Towsley, On Designing
    Improved Controllers for AQM Routers Supporting TCP Flows, April
    1998.