draft-ietf-aqm-recommendation-01.txt   draft-ietf-aqm-recommendation-02.txt 
Network Working Group F. Baker, Ed. Network Working Group F. Baker, Ed.
Internet-Draft Cisco Systems Internet-Draft Cisco Systems
Obsoletes: 2309 (if approved) G. Fairhurst, Ed. Obsoletes: 2309 (if approved) G. Fairhurst, Ed.
Intended status: Best Current Practice University of Aberdeen Intended status: Best Current Practice University of Aberdeen
Expires: August 3, 2014 January 30, 2014 Expires: August 18, 2014 February 14, 2014
IETF Recommendations Regarding Active Queue Management IETF Recommendations Regarding Active Queue Management
draft-ietf-aqm-recommendation-01 draft-ietf-aqm-recommendation-02
Abstract Abstract
This memo presents recommendations to the Internet community This memo presents recommendations to the Internet community
concerning measures to improve and preserve Internet performance. It concerning measures to improve and preserve Internet performance. It
presents a strong recommendation for testing, standardization, and presents a strong recommendation for testing, standardization, and
widespread deployment of active queue management (AQM) in network widespread deployment of active queue management (AQM) in network
devices, to improve the performance of today's Internet. It also devices, to improve the performance of today's Internet. It also
urges a concerted effort of research, measurement, and ultimate urges a concerted effort of research, measurement, and ultimate
deployment of AQM mechanisms to protect the Internet from flows that deployment of AQM mechanisms to protect the Internet from flows that
skipping to change at page 1, line 41 skipping to change at page 1, line 41
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on August 3, 2014. This Internet-Draft will expire on August 18, 2014.
Copyright Notice Copyright Notice
Copyright (c) 2014 IETF Trust and the persons identified as the Copyright (c) 2014 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 2, line 28 skipping to change at page 2, line 28
4. Conclusions and Recommendations . . . . . . . . . . . . . . . 10 4. Conclusions and Recommendations . . . . . . . . . . . . . . . 10
4.1. Operational deployments SHOULD use AQM procedures . . . 11 4.1. Operational deployments SHOULD use AQM procedures . . . 11
4.2. Signaling to the transport endpoints . . . . . . . . . . 11 4.2. Signaling to the transport endpoints . . . . . . . . . . 11
4.2.1. AQM and ECN . . . . . . . . . . . . . . . . . . . . . 12 4.2.1. AQM and ECN . . . . . . . . . . . . . . . . . . . . . 12
4.3. AQM algorithms deployed SHOULD NOT require operational 4.3. AQM algorithms deployed SHOULD NOT require operational
tuning . . . . . . . . . . . . . . . . . . . . . . . . . 13 tuning . . . . . . . . . . . . . . . . . . . . . . . . . 13
4.4. AQM algorithms SHOULD respond to measured congestion, not 4.4. AQM algorithms SHOULD respond to measured congestion, not
application profiles. . . . . . . . . . . . . . . . . . . 14 application profiles. . . . . . . . . . . . . . . . . . . 14
4.5. AQM algorithms SHOULD NOT be dependent on specific 4.5. AQM algorithms SHOULD NOT be dependent on specific
transport protocol behaviours . . . . . . . . . . . . . . 15 transport protocol behaviours . . . . . . . . . . . . . . 15
4.6. Interactions with congestion control algorithms . . . . . 15 4.6. Interactions with congestion control algorithms . . . . . 16
4.7. The need for further research . . . . . . . . . . . . . . 16 4.7. The need for further research . . . . . . . . . . . . . . 17
5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 17 5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 17
6. Security Considerations . . . . . . . . . . . . . . . . . . . 17 6. Security Considerations . . . . . . . . . . . . . . . . . . . 17
7. Privacy Considerations . . . . . . . . . . . . . . . . . . . 18 7. Privacy Considerations . . . . . . . . . . . . . . . . . . . 18
8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 18 8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 18
9. References . . . . . . . . . . . . . . . . . . . . . . . . . 18 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 18
9.1. Normative References . . . . . . . . . . . . . . . . . . 18 9.1. Normative References . . . . . . . . . . . . . . . . . . 18
9.2. Informative References . . . . . . . . . . . . . . . . . 19 9.2. Informative References . . . . . . . . . . . . . . . . . 19
Appendix A. Change Log . . . . . . . . . . . . . . . . . . . . . 21 Appendix A. Change Log . . . . . . . . . . . . . . . . . . . . . 22
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 22 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 22
1. Introduction 1. Introduction
The Internet protocol architecture is based on a connectionless end- The Internet protocol architecture is based on a connectionless end-
to-end packet service using the Internet Protocol, whether IPv4 to-end packet service using the Internet Protocol, whether IPv4
[RFC0791] or IPv6 [RFC2460]. The advantages of its connectionless [RFC0791] or IPv6 [RFC2460]. The advantages of its connectionless
design: flexibility and robustness, have been amply demonstrated. design: flexibility and robustness, have been amply demonstrated.
However, these advantages are not without cost: careful design is However, these advantages are not without cost: careful design is
required to provide good service under heavy load. In fact, lack of required to provide good service under heavy load. In fact, lack of
attention to the dynamics of packet forwarding can result in severe attention to the dynamics of packet forwarding can result in severe
service degradation or "Internet meltdown". This phenomenon was service degradation or "Internet meltdown". This phenomenon was
first observed during the early growth phase of the Internet of the first observed during the early growth phase of the Internet in the
mid 1980s [RFC0896][RFC0970], and is technically called "congestive mid 1980s [RFC0896][RFC0970], and is technically called "congestive
collapse". collapse".
The original fix for Internet meltdown was provided by Van Jacobson. The original fix for Internet meltdown was provided by Van Jacobson.
Beginning in 1986, Jacobson developed the congestion avoidance Beginning in 1986, Jacobson developed the congestion avoidance
mechanisms that are now required in TCP implementations [Jacobson88] mechanisms that are now required in TCP implementations [Jacobson88]
[RFC1122]. These mechanisms operate in Internet hosts to cause TCP [RFC1122]. These mechanisms operate in Internet hosts to cause TCP
connections to "back off" during congestion. We say that TCP flows connections to "back off" during congestion. We say that TCP flows
are "responsive" to congestion signals (i.e., marked or dropped are "responsive" to congestion signals (i.e., marked or dropped
packets) from the network. It is primarily these TCP congestion packets) from the network. It is primarily these TCP congestion
avoidance algorithms that prevent the congestive collapse of today's avoidance algorithms that prevent the congestive collapse of today's
Internet. Internet. Similar algorithms are specified for other non-TCP
transports.
However, that is not the end of the story. Considerable research has However, that is not the end of the story. Considerable research has
been done on Internet dynamics since 1988, and the Internet has been done on Internet dynamics since 1988, and the Internet has
grown. It has become clear that the TCP congestion avoidance grown. It has become clear that the TCP congestion avoidance
mechanisms [RFC5681], while necessary and powerful, are not mechanisms [RFC5681], while necessary and powerful, are not
sufficient to provide good service in all circumstances. Basically, sufficient to provide good service in all circumstances. Basically,
there is a limit to how much control can be accomplished from the there is a limit to how much control can be accomplished from the
edges of the network. Some mechanisms are needed in the network edges of the network. Some mechanisms are needed in the network
devices to complement the endpoint congestion avoidance mechanisms. devices to complement the endpoint congestion avoidance mechanisms.
These mechanisms may be implemented in network devices that include These mechanisms may be implemented in network devices that include
skipping to change at page 3, line 40 skipping to change at page 3, line 41
algorithms. To a rough approximation, queue management algorithms algorithms. To a rough approximation, queue management algorithms
manage the length of packet queues by marking or dropping packets manage the length of packet queues by marking or dropping packets
when necessary or appropriate, while scheduling algorithms determine when necessary or appropriate, while scheduling algorithms determine
which packet to send next and are used primarily to manage the which packet to send next and are used primarily to manage the
allocation of bandwidth among flows. While these two AQM mechanisms allocation of bandwidth among flows. While these two AQM mechanisms
are closely related, they address different performance issues. are closely related, they address different performance issues.
This memo highlights two performance issues: This memo highlights two performance issues:
The first issue is the need for an advanced form of queue management The first issue is the need for an advanced form of queue management
that we call "active queue management." Section 2 summarizes the that we call "Active Queue Management", AQM. Section 2 summarizes
benefits that active queue management can bring. A number of Active the benefits that active queue management can bring. A number of AQM
Queue Management (AQM) procedures are described in the literature, procedures are described in the literature, with different
with different characteristics. This document does not recommend any characteristics. This document does not recommend any of them in
of them in particular, but does make recommendations that ideally particular, but does make recommendations that ideally would affect
would affect the choice of procedure used in a given implementation. the choice of procedure used in a given implementation.
The second issue, discussed in Section 3 of this memo, is the The second issue, discussed in Section 3 of this memo, is the
potential for future congestive collapse of the Internet due to flows potential for future congestive collapse of the Internet due to flows
that are unresponsive, or not sufficiently responsive, to congestion that are unresponsive, or not sufficiently responsive, to congestion
indications. Unfortunately, there is no consensus solution to indications. Unfortunately, there is currently no consensus solution
controlling congestion caused by such aggressive flows; significant to controlling congestion caused by such aggressive flows;
research and engineering will be required before any solution will be significant research and engineering will be required before any
available. It is imperative that this work be energetically pursued, solution will be available. It is imperative that this work be
to ensure the future stability of the Internet. energetically pursued, to ensure the future stability of the
Internet.
Section 4 concludes the memo with a set of recommendations to the Section 4 concludes the memo with a set of recommendations to the
Internet community concerning these topics. Internet community concerning these topics.
The discussion in this memo applies to "best-effort" traffic, which The discussion in this memo applies to "best-effort" traffic, which
is to say, traffic generated by applications that accept the is to say, traffic generated by applications that accept the
occasional loss, duplication, or reordering of traffic in flight. It occasional loss, duplication, or reordering of traffic in flight. It
also applies to other traffic, such as real-time traffic that can also applies to other traffic, such as real-time traffic that can
adapt its sending rate to reduce loss and/or delay. It is most adapt its sending rate to reduce loss and/or delay. It is most
effective, when the adaption occurs on time scales of a single RTT or effective, when the adaption occurs on time scales of a single Round
a small number of RTTs, for elastic traffic [RFC1633]. Trip Time (RTT) or a small number of RTTs, for elastic traffic
[RFC1633].
[RFC2309] resulted from past discussions of end-to-end performance, [RFC2309] resulted from past discussions of end-to-end performance,
Internet congestion, and Random Early Discard (RED) in the End-to-End Internet congestion, and Random Early Discard (RED) in the End-to-End
Research Group of the Internet Research Task Force (IRTF). This Research Group of the Internet Research Task Force (IRTF). This
update results from experience with this and other algorithms, and update results from experience with this and other algorithms, and
the AQM discussion within the IETF [AQM-WG]. the AQM discussion within the IETF [AQM-WG].
1.1. Requirements Language 1.1. Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
skipping to change at page 5, line 17 skipping to change at page 5, line 20
become full. It is important to reduce the steady-state queue become full. It is important to reduce the steady-state queue
size, and this is perhaps the most important goal for queue size, and this is perhaps the most important goal for queue
management. management.
The naive assumption might be that there is a simple tradeoff The naive assumption might be that there is a simple tradeoff
between delay and throughput, and that the recommendation that between delay and throughput, and that the recommendation that
queues be maintained in a "non-full" state essentially translates queues be maintained in a "non-full" state essentially translates
to a recommendation that low end-to-end delay is more important to a recommendation that low end-to-end delay is more important
than high throughput. However, this does not take into account than high throughput. However, this does not take into account
the critical role that packet bursts play in Internet the critical role that packet bursts play in Internet
performance. Even though TCP constrains the congestion window of performance. For example, even though TCP constrains the
a flow, packets often arrive at network devices in bursts congestion window of a flow, packets often arrive at network
[Leland94]. If the queue is full or almost full, an arriving devices in bursts [Leland94]. If the queue is full or almost
burst will cause multiple packets to be dropped. This can result full, an arriving burst will cause multiple packets to be
in a global synchronization of flows throttling back, followed by dropped. This can result in a global synchronization of flows
a sustained period of lowered link utilization, reducing overall throttling back, followed by a sustained period of lowered link
throughput. utilization, reducing overall throughput.
The point of buffering in the network is to absorb data bursts The point of buffering in the network is to absorb data bursts
and to transmit them during the (hopefully) ensuing bursts of and to transmit them during the (hopefully) ensuing bursts of
silence. This is essential to permit the transmission of bursty silence. This is essential to permit the transmission of bursty
data. Normally small queues are prefered in network devices, data. Normally small queues are preferred in network devices,
with sufficient queue capacity to absorb the bursts. The with sufficient queue capacity to absorb the bursts. The
counter-intuitive result is that maintaining normally-small counter-intuitive result is that maintaining normally-small
queues can result in higher throughput as well as lower end-to- queues can result in higher throughput as well as lower end-to-
end delay. In summary, queue limits should not reflect the end delay. In summary, queue limits should not reflect the
steady state queues we want to be maintained in the network; steady state queues we want to be maintained in the network;
instead, they should reflect the size of bursts that a network instead, they should reflect the size of bursts that a network
device needs to absorb. device needs to absorb.
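As a rough illustration of why queue limits should reflect burst size rather than the steady-state queue, the following Python sketch counts the packets a tail-drop queue would lose when a burst arrives. The function name, its parameters, and the example numbers are illustrative assumptions and do not come from the memo; the sketch also assumes that no packets leave the queue while the burst is arriving.

```python
def tail_drop_losses(queue_limit: int, standing_queue: int, burst: int) -> int:
    """Packets dropped when `burst` packets arrive back-to-back at a
    tail-drop queue that already holds `standing_queue` packets.

    Simplifying assumption: nothing is dequeued during the burst.
    """
    free_slots = max(0, queue_limit - standing_queue)
    return max(0, burst - free_slots)


# Same buffer and same burst, but different steady-state occupancy.
nearly_full = tail_drop_losses(queue_limit=100, standing_queue=95, burst=20)
normally_small = tail_drop_losses(queue_limit=100, standing_queue=10, burst=20)
```

With these assumed numbers, the nearly full queue loses most of the burst, while the normally small queue absorbs all of it even though both have the same limit.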
Besides tail drop, two alternative queue disciplines that can be Besides tail drop, two alternative queue disciplines that can be
applied when a queue becomes full are "random drop on full" or "drop applied when a queue becomes full are "random drop on full" or "drop
skipping to change at page 7, line 38 skipping to change at page 7, line 41
overall average queue sizes, so that arriving bursts can be overall average queue sizes, so that arriving bursts can be
accommodated without dropping packets. In addition, AQM should accommodated without dropping packets. In addition, AQM should
be used to control the queue size for each individual flow or be used to control the queue size for each individual flow or
class, so that they do not experience unnecessarily high delay. class, so that they do not experience unnecessarily high delay.
Therefore, AQM should be applied across the classes or flows as Therefore, AQM should be applied across the classes or flows as
well as within each class or flow. well as within each class or flow.
In short, scheduling algorithms and queue management should be seen In short, scheduling algorithms and queue management should be seen
as complementary, not as replacements for each other. as complementary, not as replacements for each other.
An AQM method may use Explicit Congestion Notification (ECN)
[RFC3168] instead of dropping to mark packets under mild or moderate
congestion (see Section 4.2.1).
It is also important to differentiate the choice of buffer size for a It is also important to differentiate the choice of buffer size for a
queue in a switch/router or other network device, and the queue in a switch/router or other network device, and the
threshold(s) and other parameters that determine how and when an AQM threshold(s) and other parameters that determine how and when an AQM
algorithm operates. On the one hand, the optimum buffer size is a algorithm operates. On the one hand, the optimum buffer size is a
function of operational requirements and should generally be sized to function of operational requirements and should generally be sized to
be sufficient to buffer the largest normal traffic burst that is be sufficient to buffer the largest normal traffic burst that is
expected. This size depends on the number and burstiness of traffic expected. This size depends on the number and burstiness of traffic
arriving at the queue and the rate at which traffic leaves the queue. arriving at the queue and the rate at which traffic leaves the queue.
Different types of traffic and deployment scenarios will lead to Different types of traffic and deployment scenarios will lead to
different requirements. On the other hand, the choice of AQM different requirements. On the other hand, the choice of AQM
skipping to change at page 8, line 37 skipping to change at page 8, line 42
congestion occurs, and (3) flows that are responsive but are not TCP- congestion occurs, and (3) flows that are responsive but are not TCP-
friendly. The last two classes contain more aggressive flows that friendly. The last two classes contain more aggressive flows that
pose significant threats to Internet performance, which we will now pose significant threats to Internet performance, which we will now
discuss. discuss.
1. TCP-Friendly flows 1. TCP-Friendly flows
A TCP-friendly flow responds to congestion notification within a A TCP-friendly flow responds to congestion notification within a
small number of path Round Trip Times (RTT), and in steady-state small number of path Round Trip Times (RTT), and in steady-state
it uses no more capacity than a conformant TCP running under it uses no more capacity than a conformant TCP running under
comparable conditions (drop rate, RTT, MTU, etc.). This is comparable conditions (drop rate, RTT, packet size, etc.). This
described in the remainder of the document. is described in the remainder of the document.
2. Non-Responsive Flows 2. Non-Responsive Flows
The User Datagram Protocol (UDP) [RFC0768] provides a minimal, The User Datagram Protocol (UDP) [RFC0768] provides a minimal,
best-effort transport to applications and upper-layer protocols best-effort transport to applications and upper-layer protocols
(both simply called "applications" in the remainder of this (both simply called "applications" in the remainder of this
document) and does not itself provide mechanisms to prevent document) and does not itself provide mechanisms to prevent
congestion collapse and establish a degree of fairness [RFC5405]. congestion collapse and establish a degree of fairness [RFC5405].
There is a growing set of UDP-based applications whose congestion There is a growing set of UDP-based applications whose congestion
avoidance algorithms are inadequate or nonexistent (i.e., a flow avoidance algorithms are inadequate or nonexistent (i.e., a flow
that does not throttle its sending rate when it experiences that does not throttle its sending rate when it experiences
congestion). Examples include some UDP streaming applications congestion). Examples include some UDP streaming applications
for packet voice and video, and some multicast bulk data for packet voice and video, and some multicast bulk data
transport. If no action is taken, such unresponsive flows could transport. If no action is taken, such unresponsive flows could
lead to a new congestive collapse [RFC2309]. lead to a new congestive collapse [RFC2309].
In general, UDP-based applications need to incorporate effective In general, UDP-based applications need to incorporate effective
congestion avoidance mechanisms [RFC5405]. Further research and congestion avoidance mechanisms [RFC5405]. Further research and
development of ways to accomplish congestion avoidance for development of ways to accomplish congestion avoidance for
presently unresponsive applications continue to be presently unresponsive applications continue to be important.
important.Network devices need to be able to protect themselves Network devices need to be able to protect themselves against
against unresponsive flows, and mechanisms to accomplish this unresponsive flows, and mechanisms to accomplish this must be
must be developed and deployed. Deployment of such mechanisms developed and deployed. Deployment of such mechanisms would
would provide an incentive for all applications to become provide an incentive for all applications to become responsive by
responsive by either using a congestion-controlled transport either using a congestion-controlled transport (e.g. TCP, SCTP,
(e.g. TCP, SCTP, DCCP) or by incorporating their own congestion DCCP) or by incorporating their own congestion control in the
control in the application. [RFC5405]. application [RFC5405].
3. Non-TCP-friendly Transport Protocols 3. Non-TCP-friendly Transport Protocols
A second threat is posed by transport protocol implementations A second threat is posed by transport protocol implementations
that are responsive to congestion, but, either deliberately or that are responsive to congestion, but, either deliberately or
through faulty implementation, are not TCP-friendly. Such through faulty implementation, are not TCP-friendly. Such
applications may gain an unfair share of the available network applications may gain an unfair share of the available network
capacity. capacity.
For example, the popularity of the Internet has caused a For example, the popularity of the Internet has caused a
skipping to change at page 9, line 42 skipping to change at page 9, line 47
deliberately be implemented with congestion avoidance algorithms deliberately be implemented with congestion avoidance algorithms
that are more aggressive in their use of capacity than other TCP that are more aggressive in their use of capacity than other TCP
implementations; this would allow a vendor to claim to have a implementations; this would allow a vendor to claim to have a
"faster TCP". The logical consequence of such implementations "faster TCP". The logical consequence of such implementations
would be a spiral of increasingly aggressive TCP implementations, would be a spiral of increasingly aggressive TCP implementations,
leading back to the point where there is effectively no leading back to the point where there is effectively no
congestion avoidance and the Internet is chronically congested. congestion avoidance and the Internet is chronically congested.
Another example could be an RTP/UDP video flow that uses an Another example could be an RTP/UDP video flow that uses an
adaptive codec, but responds incompletely to indications of adaptive codec, but responds incompletely to indications of
congestion or over responds over an excessively long time period. congestion or responds over an excessively long time period.
Such flows are unlikely to be responsive to congestion signals in Such flows are unlikely to be responsive to congestion signals in
a time frame comparable to a small number of end-to-end a timeframe comparable to a small number of end-to-end
transmission delays. However, over a longer timescale, perhaps transmission delays. However, over a longer timescale, perhaps
seconds in duration, they could moderate their speed, or increase seconds in duration, they could moderate their speed, or increase
their speed if they determine capacity to be available. their speed if they determine capacity to be available.
Tunneled traffic aggregates carrying multiple (short) TCP flows Tunneled traffic aggregates carrying multiple (short) TCP flows
can be more aggressive than standard bulk TCP. Applications can be more aggressive than standard bulk TCP. Applications
(e.g. web browsers and peer-to-peer file-sharing) have exploited (e.g. web browsers and peer-to-peer file-sharing) have exploited
this by opening multiple connections to the same endpoint. this by opening multiple connections to the same endpoint.
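The definition of a TCP-friendly flow in item 1 above compares a flow's steady-state rate with that of a conformant TCP under comparable conditions (drop rate, RTT, packet size). A minimal Python sketch of such a comparison follows. It assumes the widely used simplified steady-state TCP throughput approximation, rate = (packet size / RTT) * sqrt(3/2) / sqrt(drop rate), which is not part of this memo; the function names are hypothetical. It checks only the steady-state part of the definition, not how quickly the flow responds.

```python
import math


def tcp_reference_rate(packet_size_bytes: float, rtt_s: float, drop_rate: float) -> float:
    """Approximate steady-state rate (bytes/second) of a conformant TCP.

    Assumption: the simplified square-root throughput approximation,
    which is only meaningful for a small, non-zero drop rate.
    """
    if rtt_s <= 0 or not 0 < drop_rate <= 1:
        raise ValueError("rtt_s must be > 0 and drop_rate must be in (0, 1]")
    return (packet_size_bytes / rtt_s) * math.sqrt(1.5) / math.sqrt(drop_rate)


def is_tcp_friendly(observed_rate: float, packet_size_bytes: float,
                    rtt_s: float, drop_rate: float) -> bool:
    """True if the flow uses no more capacity than the TCP reference rate."""
    return observed_rate <= tcp_reference_rate(packet_size_bytes, rtt_s, drop_rate)


# Assumed example conditions: 1460-byte packets, 100 ms RTT, 1% drop rate.
reference = tcp_reference_rate(1460, 0.1, 0.01)
```

A flow measured at a lower rate than `reference` under the same conditions would pass this check; one measured well above it would not.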
The projected increase in the fraction of total Internet traffic for The projected increase in the fraction of total Internet traffic for
skipping to change at page 10, line 21 skipping to change at page 10, line 24
managing such flows. This raises many difficult issues in managing such flows. This raises many difficult issues in
identifying and isolating unresponsive or non-TCP-friendly flows at identifying and isolating unresponsive or non-TCP-friendly flows at
an acceptable overhead cost. Finally, there is as yet little an acceptable overhead cost. Finally, there is as yet little
measurement or simulation evidence available about the rate at which measurement or simulation evidence available about the rate at which
these threats are likely to be realized, or about the expected these threats are likely to be realized, or about the expected
benefit of algorithms for managing such flows. benefit of algorithms for managing such flows.
Another topic requiring consideration is the appropriate granularity Another topic requiring consideration is the appropriate granularity
of a "flow" when considering a queue management method. There are a of a "flow" when considering a queue management method. There are a
few "natural" answers: 1) a transport (e.g. TCP or UDP) flow (source few "natural" answers: 1) a transport (e.g. TCP or UDP) flow (source
address/port, destination address/port, DSCP); 2) a source/ address/port, destination address/port, Differentiated Services Code
destination host pair (IP addresses, DSCP); 3) a given source host or Point - DSCP); 2) a source/destination host pair (IP addresses,
a given destination host. We suggest that the source/destination DSCP); 3) a given source host or a given destination host. We
host pair gives the most appropriate granularity in many suggest that the source/destination host pair gives the most
circumstances. However, it is possible that different vendors/ appropriate granularity in many circumstances. However, it is
providers could set different granularities for defining a flow (as a possible that different vendors/providers could set different
way of "distinguishing" themselves from one another), or that granularities for defining a flow (as a way of "distinguishing"
different granularities could be chosen for different places in the themselves from one another), or that different granularities could
network. It may be the case that the granularity is less important be chosen for different places in the network. It may be the case
than the fact that a network device needs to be able to deal with that the granularity is less important than the fact that a network
more unresponsive flows at *some* granularity. The granularity of device needs to be able to deal with more unresponsive flows at
flows for congestion management is, at least in part, a question of *some* granularity. The granularity of flows for congestion
policy that needs to be addressed in the wider IETF community. management is, at least in part, a question of policy that needs to
be addressed in the wider IETF community.
4. Conclusions and Recommendations 4. Conclusions and Recommendations
The IRTF, in publishing [RFC2309], and the IETF in subsequent The IRTF, in publishing [RFC2309], and the IETF in subsequent
discussion, have developed a set of specific recommendations regarding discussion, have developed a set of specific recommendations regarding
the implementation and operational use of AQM procedures. This the implementation and operational use of AQM procedures. This
document updates these to include: document updates these to include:
1. Network devices SHOULD implement some AQM mechanism to manage 1. Network devices SHOULD implement some AQM mechanism to manage
queue lengths, reduce end-to-end latency, and avoid lock-out queue lengths, reduce end-to-end latency, and avoid lock-out
skipping to change at page 11, line 32 skipping to change at page 11, line 35
envisaged in this document in which the recommendation does not envisaged in this document in which the recommendation does not
apply. However, care should be taken in concluding that one's use apply. However, care should be taken in concluding that one's use
case falls in that category; during the life of the Internet, such case falls in that category; during the life of the Internet, such
use cases have been rarely if ever observed and reported on. To the use cases have been rarely if ever observed and reported on. To the
contrary, available research [Papagiannaki] says that even high speed contrary, available research [Papagiannaki] says that even high speed
links in network cores that are normally very stable in depth and links in network cores that are normally very stable in depth and
behavior experience occasional issues that need moderation. behavior experience occasional issues that need moderation.
4.1. Operational deployments SHOULD use AQM procedures 4.1. Operational deployments SHOULD use AQM procedures
AQM procedures are designed to minimize delay induced in the network AQM procedures are designed to minimize the delay induced in the
by queues that have filled as a result of host behavior. Marking and network by queues that have filled as a result of host behavior.
loss behaviors provide a signal that buffers within network devices Marking and loss behaviors provide a signal that buffers within
are becoming unnecessarily full, and that the sender would do well to network devices are becoming unnecessarily full, and that the sender
moderate its behavior. would do well to moderate its behavior.
4.2. Signaling to the transport endpoints 4.2. Signaling to the transport endpoints
There are a number of ways a network device may signal to the end There are a number of ways a network device may signal to the end
point that the network is becoming congested and trigger a reduction point that the network is becoming congested and trigger a reduction
in rate. The signalling methods include: in rate. The signalling methods include:
o Delaying transport segments (packets) in flight, such as in a o Delaying transport segments (packets) in flight, such as in a
queue. queue.
skipping to change at page 12, line 20 skipping to change at page 12, line 24
Increased network latency can be used as an implicit signal of Increased network latency can be used as an implicit signal of
congestion. E.g., in TCP additional delay can affect ACK Clocking congestion. E.g., in TCP additional delay can affect ACK Clocking
and has the result of reducing the rate of transmission of new data. and has the result of reducing the rate of transmission of new data.
In RTP, network latency impacts the RTCP-reported RTT and increased In RTP, network latency impacts the RTCP-reported RTT and increased
latency can trigger a sender to adjust its rate. Methods such as latency can trigger a sender to adjust its rate. Methods such as
LEDBAT [RFC6817] assume increased latency as a primary signal of LEDBAT [RFC6817] assume increased latency as a primary signal of
congestion. congestion.
It is essential that all Internet hosts respond to loss [RFC5681], It is essential that all Internet hosts respond to loss [RFC5681],
[RFC5405][RFC2960][RFC4340]. Packet dropping by network devices that [RFC5405][RFC4960][RFC4340]. Packet dropping by network devices that
are under load has two effects: It protects the network, which is the are under load has two effects: It protects the network, which is the
primary reason that network devices drop packets. The detection of primary reason that network devices drop packets. The detection of
loss also provides a signal to a reliable transport (e.g. TCP, SCTP) loss also provides a signal to a reliable transport (e.g. TCP, SCTP)
that there is potential congestion using a pragmatic heuristic; "when that there is potential congestion using a pragmatic heuristic; "when
the network discards a message in flight, it may imply the presence the network discards a message in flight, it may imply the presence
of faulty equipment or media in a path, and it may imply the presence of faulty equipment or media in a path, and it may imply the presence
of congestion. To be conservative, a transport must assume the latter." of congestion. To be conservative, a transport must assume the latter."
Unreliable transports (e.g. using UDP) need to similarly react to Unreliable transports (e.g. using UDP) need to similarly react to
loss [RFC5405] loss [RFC5405]
Network devices SHOULD use use an AQM algorithm to determine the Network devices SHOULD use an AQM algorithm to determine the packets
packets that are marked or discarded due to congestion. that are marked or discarded due to congestion.
Loss also has an effect on the efficiency of a flow and can Loss also has an effect on the efficiency of a flow and can
significantly impact some classes of application. In reliable significantly impact some classes of application. In reliable
transports the dropped data must be subsequently retransmitted. transports the dropped data must be subsequently retransmitted.
While other applications/transports may adapt to the absence of lost While other applications/transports may adapt to the absence of lost
data, this still implies inefficient use of available capacity and data, this still implies inefficient use of available capacity and
the dropped traffic can affect other flows. Hence, loss is not the dropped traffic can affect other flows. Hence, loss is not
entirely positive; it is a necessary evil. entirely positive; it is a necessary evil.
4.2.1. AQM and ECN 4.2.1. AQM and ECN
Explicit Congestion Notification (ECN) [RFC4301] [RFC4774] [RFC6040] Explicit Congestion Notification (ECN) [RFC4301] [RFC4774] [RFC6040]
[RFC6679]. is a network-layer function that allows a transport to [RFC6679] is a network-layer function that allows a transport to
receive network congestion information from a network device without receive network congestion information from a network device without
incurring the unintended consequences of loss. ECN includes both incurring the unintended consequences of loss. ECN includes both
transport mechanisms and functions implemented in network devices, transport mechanisms and functions implemented in network devices,
the latter rely upon using AQM to decide whether to ECN-mark. the latter rely upon using AQM to decide whether to ECN-mark.
Congestion for ECN-capable transports is signalled by a network Congestion for ECN-capable transports is signalled by a network
device setting the "Congestion Experienced (CE)" codepoint in the IP device setting the "Congestion Experienced (CE)" codepoint in the IP
header. This codepoint is noted by the remote receiving end point header. This codepoint is noted by the remote receiving end point
and signalled back to the sender using a transport protocol and signalled back to the sender using a transport protocol
mechanism, allowing the sender to trigger timely congestion control. mechanism, allowing the sender to trigger timely congestion control.
skipping to change at page 13, line 18 skipping to change at page 13, line 23
configured with a threshold. Non-ECN capable flows (the default) are configured with a threshold. Non-ECN capable flows (the default) are
dropped under congestion. dropped under congestion.
Network devices SHOULD use an AQM algorithm that marks ECN-capable Network devices SHOULD use an AQM algorithm that marks ECN-capable
traffic when making decisions about the response to congestion. traffic when making decisions about the response to congestion.
Network devices need to implement this method by marking ECN-capable Network devices need to implement this method by marking ECN-capable
traffic or by dropping non-ECN-capable traffic. traffic or by dropping non-ECN-capable traffic.
Safe deployment of ECN requires that network devices drop excessive Safe deployment of ECN requires that network devices drop excessive
traffic, even when marked as originating from an ECN-capable traffic, even when marked as originating from an ECN-capable
transport. This is necessary because (1) A non-conformant, broken or transport. This is a necessary safety precaution because (1) A non-
malicious receiver could conceal an ECN mark, and not report this to conformant, broken or malicious receiver could conceal an ECN mark,
the sender (2) A non-conformant, broken or malicious sender could and not report this to the sender (2) A non-conformant, broken or
ignore a reported ECN mark, as it could ignore a loss without using malicious sender could ignore a reported ECN mark, as it could ignore
ECN (3) A malfunctioning or non-conforming network device may a loss without using ECN (3) A malfunctioning or non-conforming
similarly "hide" an ECN mark. In normal operation such cases should network device may similarly "hide" an ECN mark. In normal operation
be very uncommon. such cases should be very uncommon.
Network devices SHOULD use an algorithm to drop excessive traffic, Network devices SHOULD use an algorithm to drop excessive traffic,
even when marked as originating from an ECN-capable transport. even when marked as originating from an ECN-capable transport.
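
The mark-versus-drop behaviour described in this section can be sketched as follows. This is an illustration only, not text from either draft revision: the names `Packet`, `Action`, `congestion_action`, `aqm_signals` and `overloaded` are hypothetical, and how a device decides that traffic is "excessive" is left open by the draft.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    FORWARD = "forward"
    MARK = "mark"   # set the Congestion Experienced (CE) codepoint
    DROP = "drop"


@dataclass
class Packet:
    # True if the IP header indicates an ECN-capable transport.
    ecn_capable: bool


def congestion_action(pkt: Packet, aqm_signals: bool, overloaded: bool) -> Action:
    """Choose how to treat one packet.

    aqm_signals: hypothetical output of the AQM algorithm; True when it
                 wants to signal congestion on this packet.
    overloaded:  hypothetical flag; True when traffic is excessive, for
                 example when the buffer is close to exhaustion.
    """
    if overloaded:
        # Safety precaution: drop excessive traffic even when it is
        # marked as originating from an ECN-capable transport.
        return Action.DROP
    if not aqm_signals:
        return Action.FORWARD
    # Mark ECN-capable traffic; drop non-ECN-capable traffic.
    return Action.MARK if pkt.ecn_capable else Action.DROP
```

Checking the overload condition first reflects the safety argument above: a receiver, sender, or network device that hides or ignores ECN marks cannot prevent the device from protecting itself by dropping.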
4.3. AQM algorithms deployed SHOULD NOT require operational tuning 4.3. AQM algorithms deployed SHOULD NOT require operational tuning
A number of AQM algorithms have been proposed. Many require some A number of AQM algorithms have been proposed. Many require some
form of tuning or setting of parameters for initial network form of tuning or setting of parameters for initial network
conditions. This can make these algorithms difficult to use in conditions. This can make these algorithms difficult to use in
operational networks. operational networks.
AQM algorithms need to consider both "initial conditions" and AQM algorithms need to consider both "initial conditions" and
"operational conditions". The former includes values that exist "operational conditions". The former includes values that exist
before any experience is gathered about the use of the algorithm, before any experience is gathered about the use of the algorithm,
such as the configured speed of interface, support for full duplex such as the configured speed of interface, support for full duplex
communication, interface MTU and other properties of the link. The communication, interface MTU and other properties of the link. The
latter includes information observed from monitoring the size of the latter includes information observed from monitoring the size of the
queue, experienced queueing delay, rate of packet discards, etc. queue, experienced queueing delay, rate of packet discard, etc.
This document therefore recommends that AQM algorithm proposed for This document therefore specifies that AQM algorithms that are
deployment in the Internet: proposed for deployment in the Internet have the following
properties:
o SHOULD NOT require tuning of initial or configuration parameters. o SHOULD NOT require tuning of initial or configuration parameters.
An algorithm needs to provide a default behaviour that auto-tunes An algorithm needs to provide a default behaviour that auto-tunes
to a reasonable performance for typical network conditions. This to a reasonable performance for typical network operational
is expected to ease deployment and operation. conditions. This is expected to ease deployment and operation.
Initial conditions, such as the interface rate and MTU size or
other values derived from these, MAY be required by an AQM
algorithm.
o MAY support further manual tuning that could improve performance o MAY support further manual tuning that could improve performance
in a specific deployed network. Algorithms that lack such in a specific deployed network. Algorithms that lack such
variables are acceptable, but if such variables exist, they SHOULD variables are acceptable, but if such variables exist, they SHOULD
be externalized (made visible to the operator). Guidance needs to be externalized (made visible to the operator). Guidance needs to
be provided on the cases where autotuning is unlikely to achieve be provided on the cases where autotuning is unlikely to achieve
satisfactory performance and to identify the set of parameters satisfactory performance and to identify the set of parameters
that can be tuned. This is expected to enable the algorithm to be that can be tuned. This is expected to enable the algorithm to be
deployed in networks that have specific characteristics (variable/ deployed in networks that have specific characteristics (variable/
larger delay; networks where capacity is impacted by interactions larger delay; networks where capacity is impacted by interactions
with lower layer mechanisms, etc) with lower layer mechanisms, etc).
o MAY provide logging and alarm signals to assist in identifying if o MAY provide logging and alarm signals to assist in identifying if
an algorithm using manual or auto-tuning is functioning as an algorithm using manual or auto-tuning is functioning as
expected. (e.g., this could be based on an internal consistency expected. (e.g., this could be based on an internal consistency
check between input, output, and mark/drop rates over time). This check between input, output, and mark/drop rates over time). This
is expected to encourage deployment by default and allow operators is expected to encourage deployment by default and allow operators
to identify potential interactions with other network functions. to identify potential interactions with other network functions.
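
One possible reading of the "internal consistency check" mentioned in the last bullet is a simple packet-conservation test over a measurement interval. The draft does not define the check, so the sketch below is an assumption: the function name, the counter names and the `tolerance` parameter are all hypothetical.

```python
def counters_consistent(arrived: int, departed: int, dropped: int,
                        backlog_start: int, backlog_end: int,
                        tolerance: int = 0) -> bool:
    """Return True if the packet counters for one interval balance.

    Every packet that arrived during the interval must have departed
    (marked or unmarked), been dropped, or still be in the queue.
    Marked packets are forwarded, so they are counted in `departed`.
    """
    expected_backlog_end = backlog_start + arrived - departed - dropped
    return abs(expected_backlog_end - backlog_end) <= tolerance


def should_raise_alarm(arrived: int, departed: int, dropped: int,
                       backlog_start: int, backlog_end: int) -> bool:
    # An alarm is raised when the counters do not add up, which may
    # indicate that the algorithm is not functioning as expected.
    return not counters_consistent(arrived, departed, dropped,
                                   backlog_start, backlog_end)
```

A deployed check would probably also compare the mark/drop rate against the load over time; this sketch covers only the counter balance.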
Hence, self-tuning algorithms are to be preferred. Algorithms Hence, self-tuning algorithms are to be preferred. Algorithms
recommended for general Internet deployment by the IETF need to be recommended for general Internet deployment by the IETF need to be
skipping to change at page 14, line 41 skipping to change at page 14, line 49
Not all applications transmit packets of the same size. Although Not all applications transmit packets of the same size. Although
applications may be characterised by particular profiles of packet applications may be characterised by particular profiles of packet
size this should not be used as the basis for AQM (see next section). size this should not be used as the basis for AQM (see next section).
Other methods exist, e.g. Differentiated Services queueing, Pre- Other methods exist, e.g. Differentiated Services queueing, Pre-
Congestion Notification (PCN) [RFC5559], that can be used to Congestion Notification (PCN) [RFC5559], that can be used to
differentiate and police classes of application. Network devices may differentiate and police classes of application. Network devices may
combine AQM with these traffic classification mechanisms and perform combine AQM with these traffic classification mechanisms and perform
AQM only on specific queues within a network device. AQM only on specific queues within a network device.
An AQM algorithm should not deliberately try to prejudice the size of An AQM algorithm should not deliberately try to prejudice the size of
packet that performs best (i.e. preferentially drop/mark based only packet that performs best (i.e. Preferentially drop/mark based only
on packet size). Procedures for selecting packets to mark/drop on packet size). Procedures for selecting packets to mark/drop
SHOULD observe actual or projected time a packet is in a queue (bytes SHOULD observe the actual or projected time that a packet is in a
at a rate being an analog to time). When an AQM algorithm decides queue (bytes at a rate being an analog to time). When an AQM
whether to drop (or mark) a packet, it is RECOMMENDED that the size algorithm decides whether to drop (or mark) a packet, it is
of the particular packet should not be taken into account [Byte-pkt]. RECOMMENDED that the size of the particular packet should not be
taken into account [Byte-pkt].
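
The phrase "bytes at a rate being an analog to time" can be illustrated with a small calculation. This is a minimal sketch under stated assumptions: the function names and the `target_delay_s` threshold are hypothetical, and neither draft revision specifies a threshold value.

```python
def projected_queue_delay(backlog_bytes: int, drain_rate_bps: float) -> float:
    """Estimate how long a newly queued packet would wait, in seconds.

    backlog_bytes:  bytes already waiting in the queue.
    drain_rate_bps: rate at which the queue is served, in bits per second.
    """
    if drain_rate_bps <= 0:
        raise ValueError("drain rate must be positive")
    return backlog_bytes * 8 / drain_rate_bps


def should_signal(backlog_bytes: int, drain_rate_bps: float,
                  target_delay_s: float) -> bool:
    # The decision depends on the projected time in the queue only;
    # the size of the arriving packet is deliberately not an input.
    return projected_queue_delay(backlog_bytes, drain_rate_bps) > target_delay_s
```

For example, a backlog of 125,000 bytes served at 10 Mb/s corresponds to a projected delay of 0.1 seconds, regardless of whether that backlog consists of many small packets or a few large ones.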
Applications (or transports) generally know the packet size that they Applications (or transports) generally know the packet size that they
are using and can hence make their judgments about whether to use are using and can hence make their judgments about whether to use
small or large packets based on the data they wish to send and the small or large packets based on the data they wish to send and the
expected impact on the delay or throughput, or other performance expected impact on the delay or throughput, or other performance
parameter. When a transport or application responds to a dropped or parameter. When a transport or application responds to a dropped or
marked packet, the size of the rate reduction should be proportionate marked packet, the size of the rate reduction should be proportionate
to the size of the packet that was sent [Byte-pkt]. to the size of the packet that was sent [Byte-pkt].
AQM-enabled system MAY instantiate different instances of an AQM AQM-enabled system MAY instantiate different instances of an AQM
algorithm to be applied within the same traffic class. Traffic algorithm to be applied within the same traffic class. Traffic
classes may be differentiated based on an Access Control List (ACL), classes may be differentiated based on an Access Control List (ACL),
the packet DiffServ Code Point (DSCP) [RFC5559], setting of the ECN the packet DiffServ Code Point (DSCP) [RFC5559], setting of the ECN
field [RFC3168] [RFC4774] or an equivalent codepoint at a lower layer. field [RFC3168] [RFC4774] or an equivalent codepoint at a lower layer.
This recommendation goes beyond what is defined in RFC 3168, by This recommendation goes beyond what is defined in RFC 3168, by
allowing more than one instance of an AQM to handle both ECN-capable allowing that an implementation MAY use more than one instance of an
and non-ECN-capable packets. AQM algorithm to handle both ECN-capable and non-ECN-capable packets.
4.5. AQM algorithms SHOULD NOT be dependent on specific transport 4.5. AQM algorithms SHOULD NOT be dependent on specific transport
protocol behaviours protocol behaviours
In deploying AQM, network devices need to support a range of Internet In deploying AQM, network devices need to support a range of Internet
traffic and SHOULD NOT make implicit assumptions about the traffic and SHOULD NOT make implicit assumptions about the
characteristics desired by the set of transports/applications the characteristics desired by the set of transports/applications the
network supports. That is, AQM methods should be opaque to the network supports. That is, AQM methods should be opaque to the
choice of transport and application. choice of transport and application.
skipping to change at page 16, line 4 skipping to change at page 16, line 15
4.6. Interactions with congestion control algorithms 4.6. Interactions with congestion control algorithms
Applications and transports need to react to received implicit or Applications and transports need to react to received implicit or
explicit signals that indicate the presence of congestion. This explicit signals that indicate the presence of congestion. This
section identifies issues that can impact the design of transport section identifies issues that can impact the design of transport
protocols when using paths that use AQM. protocols when using paths that use AQM.
Transport protocols and applications need timely signals of Transport protocols and applications need timely signals of
congestion. The time taken to detect and respond to congestion is congestion. The time taken to detect and respond to congestion is
increased when network devices queue packets in buffers. It can be increased when network devices queue packets in buffers. It can be
difficult to detect tail losses at a higher layer and may sometimes difficult to detect tail losses at a higher layer and this may
require transport timers or probe packets to detect and respond to sometimes require transport timers or probe packets to detect and
such loss. Loss patterns may also impact timely detection, e.g. the respond to such loss. Loss patterns may also impact timely
time may be reduced when network devices do not drop long runs of detection, e.g. the time may be reduced when network devices do not
packets from the same flow. drop long runs of packets from the same flow.
A common objective is to deliver data from its source end point to A common objective is to deliver data from its source end point to
its destination in the least possible time. When speaking of TCP its destination in the least possible time. When speaking of TCP
performance, the terms "knee" and "cliff" are defined by [Jain94]. performance, the terms "knee" and "cliff" are defined by [Jain94].
They respectively refer to the minimum congestion window that They respectively refer to the minimum congestion window that
maximises throughput and the maximum congestion window that avoids maximises throughput and the maximum congestion window that avoids
loss. An application that transmits at the rate determined by this loss. An application that transmits at the rate determined by this
window has the effect of maximizing the rate or throughput. For the window has the effect of maximizing the rate or throughput. For the
sender, exceeding the cliff is ineffective, as it (by definition) sender, exceeding the cliff is ineffective, as it (by definition)
induces loss; operating at a point close to the cliff has a negative induces loss; operating at a point close to the cliff has a negative
skipping to change at page 17, line 9 skipping to change at page 17, line 22
We have learned that the problems of congestion, latency and buffer- We have learned that the problems of congestion, latency and buffer-
sizing have not gone away, and are becoming more important to many sizing have not gone away, and are becoming more important to many
users. A number of self-tuning AQM algorithms have been found that users. A number of self-tuning AQM algorithms have been found that
offer significant advantages for deployed networks. There is also offer significant advantages for deployed networks. There is also
renewed interest in deploying AQM and the potential of ECN. renewed interest in deploying AQM and the potential of ECN.
In 2013, an obvious example of further research is the need to In 2013, an obvious example of further research is the need to
consider the use of Map/Reduce applications in data centers; do we consider the use of Map/Reduce applications in data centers; do we
need to extend our taxonomy of TCP/SCTP sessions to include not only need to extend our taxonomy of TCP/SCTP sessions to include not only
"mice" and "elephants", but "lemmings". Where "Lemmings" are flash "mice" and "elephants", but "lemmings". Where "Lemmings" are flash
crowds of "mice" that the network inadvertently tries to signal to as crowds of "mice" that the network inadvertently try to signal to as
if they were elephant flows, resulting in head of line blocking in if they were elephant flows, resulting in head of line blocking in
data center applications. data center applications.
Examples of other required research include: Examples of other required research include:
o Research into new AQM and scheduling algorithms. o Research into new AQM and scheduling algorithms.
o Research into the use of and deployment of ECN alongside AQM. o Research into the use of and deployment of ECN alongside AQM.
o Tools for enabling AQM (and ECN) deployment and measuring the o Tools for enabling AQM (and ECN) deployment and measuring the
skipping to change at page 18, line 21 skipping to change at page 18, line 33
The original recommendation in [RFC2309] was written by the End-to- The original recommendation in [RFC2309] was written by the End-to-
End Research Group, which is to say Bob Braden, Dave Clark, Jon End Research Group, which is to say Bob Braden, Dave Clark, Jon
Crowcroft, Bruce Davie, Steve Deering, Deborah Estrin, Sally Floyd, Crowcroft, Bruce Davie, Steve Deering, Deborah Estrin, Sally Floyd,
Van Jacobson, Greg Minshall, Craig Partridge, Larry Peterson, KK Van Jacobson, Greg Minshall, Craig Partridge, Larry Peterson, KK
Ramakrishnan, Scott Shenker, John Wroclawski, and Lixia Zhang. This Ramakrishnan, Scott Shenker, John Wroclawski, and Lixia Zhang. This
is an edited version of that document, with much of its text and is an edited version of that document, with much of its text and
arguments unchanged. arguments unchanged.
The need for an updated document was agreed to in the tsvarea meeting The need for an updated document was agreed to in the tsvarea meeting
at IETF 86. This document was reviewed on the aqm@ietf.org list. at IETF 86. This document was reviewed on the aqm@ietf.org list.
Comments came from Colin Perkins, Richard Scheffenegger, and Dave Comments came from Colin Perkins, Richard Scheffenegger, Dave Taht,
Taht. and many others.
Gorry Fairhurst was in part supported by the European Community under Gorry Fairhurst was in part supported by the European Community under
its Seventh Framework Programme through the Reducing Internet its Seventh Framework Programme through the Reducing Internet
Transport Latency (RITE) project (ICT-317700). Transport Latency (RITE) project (ICT-317700).
9. References 9. References
9.1. Normative References 9.1. Normative References
[Byte-pkt] [Byte-pkt]
skipping to change at page 21, line 9 skipping to change at page 21, line 24
[RFC2474] Nichols, K., Blake, S., Baker, F., and D. Black, [RFC2474] Nichols, K., Blake, S., Baker, F., and D. Black,
"Definition of the Differentiated Services Field (DS "Definition of the Differentiated Services Field (DS
Field) in the IPv4 and IPv6 Headers", RFC 2474, December Field) in the IPv4 and IPv6 Headers", RFC 2474, December
1998. 1998.
[RFC2475] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z., [RFC2475] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z.,
and W. Weiss, "An Architecture for Differentiated and W. Weiss, "An Architecture for Differentiated
Services", RFC 2475, December 1998. Services", RFC 2475, December 1998.
[RFC2960] Stewart, R., Xie, Q., Morneault, K., Sharp, C.,
Schwarzbauer, H., Taylor, T., Rytina, I., Kalla, M.,
Zhang, L., and V. Paxson, "Stream Control Transmission
Protocol", RFC 2960, October 2000.
[RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram [RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram
Congestion Control Protocol (DCCP)", RFC 4340, March 2006. Congestion Control Protocol (DCCP)", RFC 4340, March 2006.
[RFC4960] Stewart, R., "Stream Control Transmission Protocol", RFC [RFC4960] Stewart, R., "Stream Control Transmission Protocol", RFC
4960, September 2007. 4960, September 2007.
[RFC5348] Floyd, S., Handley, M., Padhye, J., and J. Widmer, "TCP [RFC5348] Floyd, S., Handley, M., Padhye, J., and J. Widmer, "TCP
Friendly Rate Control (TFRC): Protocol Specification", RFC Friendly Rate Control (TFRC): Protocol Specification", RFC
5348, September 2008. 5348, September 2008.
skipping to change at page 21, line 51 skipping to change at page 22, line 14
Appendix A. Change Log Appendix A. Change Log
Initial Version: March 2013 Initial Version: March 2013
Minor update of the algorithms that the IETF recommends SHOULD NOT Minor update of the algorithms that the IETF recommends SHOULD NOT
require operational (especially manual) configuration or tuningdate: require operational (especially manual) configuration or tuningdate:
April 2013 April 2013
Major surgery. This draft is for discussion at IETF-87 and expected Major surgery. This draft is for discussion at IETF-87 and expected
to be further updated. to be further updated.
July 2013 July 2013
-00 WG Draft - Updated transport recommendations; revised deployment -00 WG Draft - Updated transport recommendations; revised deployment
configuration section; numerous minor edits. configuration section; numerous minor edits.
Oct 2013 Oct 2013
-01 WG Draft - Updated transport recommendations; revised deployment -01 WG Draft - Updated transport recommendations; revised deployment
configuration section; numerous minor edits. configuration section; numerous minor edits.
Jan 2014 - Feedback from WG. Jan 2014 - Feedback from WG.
-02 WG Draft - Minor edits Feb 2014 - Mainly language fixes.
Authors' Addresses Authors' Addresses
Fred Baker (editor) Fred Baker (editor)
Cisco Systems Cisco Systems
Santa Barbara, California 93117 Santa Barbara, California 93117
USA USA
Email: fred@cisco.com Email: fred@cisco.com
Godred Fairhurst (editor) Godred Fairhurst (editor)
 End of changes. 36 change blocks. 
99 lines changed or deleted 108 lines changed or added

This html diff was produced by rfcdiff 1.41. The latest version is available from http://tools.ietf.org/tools/rfcdiff/