draft-ietf-aqm-fq-codel-01.txt   draft-ietf-aqm-fq-codel-02.txt 
AQM working group T. Hoeiland-Joergensen AQM working group T. Hoeiland-Joergensen
Internet-Draft Karlstad University Internet-Draft Karlstad University
Intended status: Informational P. McKenney Intended status: Informational P. McKenney
Expires: January 5, 2016 IBM Linux Technology Center Expires: April 21, 2016 IBM Linux Technology Center
D. Taht D. Taht
Teklibre Teklibre
J. Gettys J. Gettys
E. Dumazet E. Dumazet
Google, Inc. Google, Inc.
July 04, 2015 October 19, 2015
FlowQueue-Codel FlowQueue-Codel
draft-ietf-aqm-fq-codel-01 draft-ietf-aqm-fq-codel-02
Abstract Abstract
This memo presents the FQ-CoDel hybrid packet scheduler/AQM This memo presents the FQ-CoDel hybrid packet scheduler/AQM
algorithm, a powerful tool for fighting bufferbloat and reducing algorithm, a powerful tool for fighting bufferbloat and reducing
latency. latency.
FQ-CoDel mixes packets from multiple flows and reduces the impact of FQ-CoDel mixes packets from multiple flows and reduces the impact of
head of line blocking from bursty traffic. It provides isolation for head of line blocking from bursty traffic. It provides isolation for
low-rate traffic such as DNS, web, and videoconferencing traffic. It low-rate traffic such as DNS, web, and videoconferencing traffic. It
skipping to change at page 1, line 47 skipping to change at page 1, line 47
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on January 5, 2016. This Internet-Draft will expire on April 21, 2016.
Copyright Notice Copyright Notice
Copyright (c) 2015 IETF Trust and the persons identified as the Copyright (c) 2015 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 2, line 35 skipping to change at page 2, line 35
2. CoDel . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 2. CoDel . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
3. Flow Queueing . . . . . . . . . . . . . . . . . . . . . . . . 5 3. Flow Queueing . . . . . . . . . . . . . . . . . . . . . . . . 5
4. FQ-CoDel Parameters and Data Structures . . . . . . . . . . . 6 4. FQ-CoDel Parameters and Data Structures . . . . . . . . . . . 6
4.1. Parameters . . . . . . . . . . . . . . . . . . . . . . . 6 4.1. Parameters . . . . . . . . . . . . . . . . . . . . . . . 6
4.1.1. Interval . . . . . . . . . . . . . . . . . . . . . . 6 4.1.1. Interval . . . . . . . . . . . . . . . . . . . . . . 6
4.1.2. Target . . . . . . . . . . . . . . . . . . . . . . . 7 4.1.2. Target . . . . . . . . . . . . . . . . . . . . . . . 7
4.1.3. Packet limit . . . . . . . . . . . . . . . . . . . . 7 4.1.3. Packet limit . . . . . . . . . . . . . . . . . . . . 7
4.1.4. Quantum . . . . . . . . . . . . . . . . . . . . . . . 7 4.1.4. Quantum . . . . . . . . . . . . . . . . . . . . . . . 7
4.1.5. Flows . . . . . . . . . . . . . . . . . . . . . . . . 8 4.1.5. Flows . . . . . . . . . . . . . . . . . . . . . . . . 8
4.1.6. ECN . . . . . . . . . . . . . . . . . . . . . . . . . 8 4.1.6. ECN . . . . . . . . . . . . . . . . . . . . . . . . . 8
4.1.7. CE_THRESHOLD . . . . . . . . . . . . . . . . . . . . 8 4.1.7. CE threshold . . . . . . . . . . . . . . . . . . . . 8
4.2. Data structures . . . . . . . . . . . . . . . . . . . . . 8 4.2. Data structures . . . . . . . . . . . . . . . . . . . . . 8
4.2.1. Internal queues . . . . . . . . . . . . . . . . . . . 8 4.2.1. Internal queues . . . . . . . . . . . . . . . . . . . 8
4.2.2. New and old queues lists . . . . . . . . . . . . . . 9 4.2.2. New and old queues lists . . . . . . . . . . . . . . 9
5. The FQ-CoDel scheduler . . . . . . . . . . . . . . . . . . . 9 5. The FQ-CoDel scheduler . . . . . . . . . . . . . . . . . . . 9
5.1. Enqueue . . . . . . . . . . . . . . . . . . . . . . . . . 9 5.1. Enqueue . . . . . . . . . . . . . . . . . . . . . . . . . 9
5.1.1. Alternative classification schemes . . . . . . . . . 10 5.1.1. Alternative classification schemes . . . . . . . . . 10
5.2. Dequeue . . . . . . . . . . . . . . . . . . . . . . . . . 10 5.2. Dequeue . . . . . . . . . . . . . . . . . . . . . . . . . 10
6. Implementation considerations . . . . . . . . . . . . . . . . 11 6. Implementation considerations . . . . . . . . . . . . . . . . 11
6.1. Probability of hash collisions . . . . . . . . . . . . . 11 6.1. Probability of hash collisions . . . . . . . . . . . . . 11
6.2. Memory Overhead . . . . . . . . . . . . . . . . . . . . . 12 6.2. Memory Overhead . . . . . . . . . . . . . . . . . . . . . 12
6.3. Per-Packet Timestamping . . . . . . . . . . . . . . . . . 13 6.3. Per-Packet Timestamping . . . . . . . . . . . . . . . . . 13
6.4. Other forms of "Fair Queueing" . . . . . . . . . . . . . 13 6.4. Other forms of "Fair Queueing" . . . . . . . . . . . . . 13
6.5. Differences between CoDel and FQ-CoDel behaviour . . . . 13 6.5. Differences between CoDel and FQ-CoDel behaviour . . . . 13
7. Limitations of flow queueing . . . . . . . . . . . . . . . . 14 7. Limitations of flow queueing . . . . . . . . . . . . . . . . 14
7.1. Fairness between things other than flows . . . . . . . . 14 7.1. Fairness between things other than flows . . . . . . . . 14
7.2. Flow bunching by opaque encapsulation . . . . . . . . . . 15 7.2. Flow bunching by opaque encapsulation . . . . . . . . . . 15
7.3. Low-priority congestion control algorithms . . . . . . . 15 7.3. Low-priority congestion control algorithms . . . . . . . 15
8. Deployment status and future work . . . . . . . . . . . . . . 16 8. Deployment status and future work . . . . . . . . . . . . . . 16
9. Security Considerations . . . . . . . . . . . . . . . . . . . 16 9. Security Considerations . . . . . . . . . . . . . . . . . . . 16
10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 16 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 17
11. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 17 11. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 17
12. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 17 12. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . 17
13. References . . . . . . . . . . . . . . . . . . . . . . . . . 17 13. References . . . . . . . . . . . . . . . . . . . . . . . . . 17
13.1. Normative References . . . . . . . . . . . . . . . . . . 17 13.1. Normative References . . . . . . . . . . . . . . . . . . 17
13.2. Informative References . . . . . . . . . . . . . . . . . 17 13.2. Informative References . . . . . . . . . . . . . . . . . 18
13.3. URIs . . . . . . . . . . . . . . . . . . . . . . . . . . 18 13.3. URIs . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 18 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 18
1. Introduction 1. Introduction
The FQ-CoDel algorithm is a combined packet scheduler and AQM The FQ-CoDel algorithm is a combined packet scheduler and AQM
developed as part of the bufferbloat-fighting community effort. It developed as part of the bufferbloat-fighting community effort. It
is based on a modified Deficit Round Robin (DRR) queue scheduler, is based on a modified Deficit Round Robin (DRR) queue scheduler,
with the CoDel AQM algorithm operating on each queue. This document with the CoDel AQM algorithm operating on each queue. This document
describes the combined algorithm; reference implementations are describes the combined algorithm; reference implementations are
skipping to change at page 8, line 28 skipping to change at page 8, line 28
4.1.6. ECN 4.1.6. ECN
ECN is _enabled_ by default. Rather than do anything special with ECN is _enabled_ by default. Rather than do anything special with
misbehaved ECN flows, FQ-CoDel relies on the packet scheduling system misbehaved ECN flows, FQ-CoDel relies on the packet scheduling system
to minimise their impact, thus unresponsive packets in a flow being to minimise their impact, thus unresponsive packets in a flow being
marked with ECN can grow to the overall packet limit, but will not marked with ECN can grow to the overall packet limit, but will not
otherwise affect the performance of the system. otherwise affect the performance of the system.
It can be disabled by specifying the _noecn_ parameter. It can be disabled by specifying the _noecn_ parameter.
4.1.7. CE_THRESHOLD 4.1.7. CE threshold
This parameter enables DCTCP-like processing to enable CE marking ECT This parameter enables DCTCP-like processing to enable CE marking ECT
packets at a lower setpoint than the default codel target. packets at a lower setpoint than the default codel target.
ce_threshold is disabled by default and can be set to a number of The parameter, _ce_threshold_, is disabled by default and can be set
usec to enable. to a number of usec to enable.
4.2. Data structures 4.2. Data structures
4.2.1. Internal queues 4.2.1. Internal queues
The main data structure of FQ-CoDel is the array of queues, which is The main data structure of FQ-CoDel is the array of queues, which is
instantiated to the number of queues specified by the _flows_ instantiated to the number of queues specified by the _flows_
parameter at instantiation time. Each queue consists simply of an parameter at instantiation time. Each queue consists simply of an
ordered list of packets with FIFO semantics, two state variables ordered list of packets with FIFO semantics, two state variables
tracking the queue deficit and total number of bytes enqueued, and tracking the queue deficit and total number of bytes enqueued, and
skipping to change at page 12, line 5 skipping to change at page 12, line 5
6. Implementation considerations 6. Implementation considerations
6.1. Probability of hash collisions 6.1. Probability of hash collisions
Since the Linux FQ-CoDel implementation by default uses 1024 hash Since the Linux FQ-CoDel implementation by default uses 1024 hash
buckets, the probability that (say) 100 VoIP sessions will all hash buckets, the probability that (say) 100 VoIP sessions will all hash
to the same bucket is something like ten to the power of minus 300. to the same bucket is something like ten to the power of minus 300.
Thus, the probability that at least one of the VoIP sessions will Thus, the probability that at least one of the VoIP sessions will
hash to some other queue is very high indeed. hash to some other queue is very high indeed.
Conversely, the probability that each of the 100 VoIP sessions will Expanding on this, based on analytical equations for hash collision
get its own queue is given by (1023!/(924!*1024^99)) or about 0.007; probabilities, for 100 VoIP sessions, the probability of no collision
so not all that probable. The probability rises sharply, however, if is 90.78%; the probability that no more than two of the 100 VoIP
we are willing to accept a few collisions. For example, there is sessions will be involved in any given collision = 99.57%; and the
about an 86% probability that no more than two of the 100 VoIP probability that no more than three of the 100 VoIP sessions will be
sessions will be involved in any given collision, and about a 99% involved in any given collision = 99.99%.
probability that no more than three of the VoIP sessions will be
involved in any given collision. These last two results were These probabilities can be improved upon by using set-associative
computed using Monte Carlo simulations: Oddly enough, the mathematics hashing, a technique used in the Cake algorithm currently being
for VoIP-session collision exactly matches that of hardware cache developed as a further development upon the FQ-CoDel principles. For
overflow. a 4-way associative hash with the same number of total queues, the
probability of no collisions for 100 VoIP sessions is 99.93%, while
for an 8-way associative hash it is ~100%.
6.2. Memory Overhead 6.2. Memory Overhead
FQ-CoDel can be implemented with a very low memory footprint (less FQ-CoDel can be implemented with a very low memory footprint (less
than 64 bytes per queue on 64 bit systems). These are the data than 64 bytes per queue on 64 bit systems). These are the data
structures used in the Linux implementation: structures used in the Linux implementation:
struct codel_vars { struct codel_vars {
u32 count; u32 count;
u32 lastcount; u32 lastcount;
skipping to change at page 16, line 9 skipping to change at page 16, line 9
afforded to them by the FQ-CoDel scheduler. However, low-priority afforded to them by the FQ-CoDel scheduler. However, low-priority
congestion control mechanisms may be able to take steps to continue congestion control mechanisms may be able to take steps to continue
to be low priority, for instance by taking into account the vastly to be low priority, for instance by taking into account the vastly
reduced level of delay afforded by an AQM, or by using a coupled reduced level of delay afforded by an AQM, or by using a coupled
approach to observing the behaviour of multiple flows. approach to observing the behaviour of multiple flows.
8. Deployment status and future work 8. Deployment status and future work
The FQ-CoDel algorithm as described in this document has been shipped The FQ-CoDel algorithm as described in this document has been shipped
as part of the Linux kernel since version 3.5, released on the 21st as part of the Linux kernel since version 3.5, released on the 21st
of July, 2012. The CE_THRESHOLD was added in version 4.0. It has of July, 2012, with the ce_threshold being added in version 4.2. The
seen widespread testing in a variety of contexts and is configured as algorithm has seen widespread testing in a variety of contexts and is
the default queueing discipline in a number of mainline Linux configured as the default queueing discipline in a number of mainline
distributions (as of this writing at least OpenWRT, Arch Linux and Linux distributions (as of this writing at least OpenWRT, Arch Linux
Fedora). We believe it to be a safe default and encourage people and Fedora). We believe it to be a safe default and encourage people
running Linux to turn it on: It is a massive improvement over the running Linux to turn it on: It is a massive improvement over the
previous default FIFO queue. previous default FIFO queue.
Of course there is always room for improvement, and this document has Of course there is always room for improvement, and this document has
listed some of the know limitations of the algorithm. As such, we listed some of the know limitations of the algorithm. As such, we
encourage further research into algorithm refinements and addressing encourage further research into algorithm refinements and addressing
of limitations. One such effort is undertaken by the bufferbloat of limitations. One such effort is undertaken by the bufferbloat
community in the form of the Cake [1] queue management scheme. In community in the form of the Cake [1] queue management scheme. In
addition to this we believe the following (non-exhaustive) list of addition to this we believe the following (non-exhaustive) list of
issues to be worthy of further enquiry: issues to be worthy of further enquiry:
skipping to change at page 16, line 42 skipping to change at page 16, line 42
control algorithms and scavenger protocols. control algorithms and scavenger protocols.
o Other scheduling mechanisms to replace the DRR portion of the o Other scheduling mechanisms to replace the DRR portion of the
algorithm, e.g. QFQ or WFQ. algorithm, e.g. QFQ or WFQ.
o Sensitivity of parameters, most notably the number of queues and o Sensitivity of parameters, most notably the number of queues and
the CoDel parameters. the CoDel parameters.
9. Security Considerations 9. Security Considerations
There are no specific security exposures associated with FQ-CoDel. There are no specific security exposures associated with FQ-CoDel
Some exposures present in current FIFO systems are in fact reduced that are not also present in current FIFO systems. And some are in
(e.g. simple minded packet floods). fact reduced (e.g. simple minded packet floods). However, some care
is needed in the implementation to ensure this is the case. These
are included in the description above, however we reiterate them
here:
o To prevent packets in the new queues from starving old queues, it
is important that when a queue on the list of new queues empties,
it is moved to _the end of_ the list of old queues. This is
described at the end of Section 5.2.
o To prevent an attacker targeting a specific flow for a denial of
service attack, the hash that maps packets to queues should not be
predictable. To achieve this, FQ-CoDel salts the hash, as
described in the beginning of Section 5.1. The size of the salt
and the strength of the hash function is obviously a tradeoff
between performance and security. The Linux implementation uses a
32 bit random value as the salt and a Jenkins hash function. This
makes it possible to achieve very high throughput, and we consider
it sufficient to ward off the most obvious attacks.
10. IANA Considerations 10. IANA Considerations
This document has no actions for IANA. This document has no actions for IANA.
11. Acknowledgements 11. Acknowledgements
Our deepest thanks to Kathie Nichols, Van Jacobson, and all the Our deepest thanks to Kathie Nichols, Van Jacobson, and all the
members of the bufferbloat.net effort. members of the bufferbloat.net effort for all the help on developing
and testing the algorithm. In addition, our thanks to Anil Agarwal
for his help with getting the hash collision probabilities in this
document right.
12. Conclusions 12. Conclusions
FQ-CoDel is a very general, efficient, nearly parameterless queue FQ-CoDel is a very general, efficient, nearly parameterless queue
management approach combining flow queueing with CoDel. It is a very management approach combining flow queueing with CoDel. It is a very
powerful tool for solving bufferbloat, and we believe it to be safe powerful tool for solving bufferbloat, and we believe it to be safe
to turn on by default, as has already happened in a number of Linux to turn on by default, as has already happened in a number of Linux
distributions. In this document we have documented the Linux distributions. In this document we have documented the Linux
implementation in sufficient detail for an independent implementation in sufficient detail for an independent
implementation, and we encourage such implementations be widely implementation, and we encourage such implementations be widely
skipping to change at page 17, line 32 skipping to change at page 17, line 49
13.1. Normative References 13.1. Normative References
[CODELDRAFT] [CODELDRAFT]
Nichols, K., Jacobson, V., McGregor, A., and J. Iyengar, Nichols, K., Jacobson, V., McGregor, A., and J. Iyengar,
"Controlled Delay Active Queue Management", October 2014, "Controlled Delay Active Queue Management", October 2014,
<https://datatracker.ietf.org/doc/draft-ietf-aqm-codel/>. <https://datatracker.ietf.org/doc/draft-ietf-aqm-codel/>.
[RFC6817] Shalunov, S., Hazel, G., Iyengar, J., and M. Kuehlewind, [RFC6817] Shalunov, S., Hazel, G., Iyengar, J., and M. Kuehlewind,
"Low Extra Delay Background Transport (LEDBAT)", RFC 6817, "Low Extra Delay Background Transport (LEDBAT)", RFC 6817,
December 2012. DOI 10.17487/RFC6817, December 2012,
<http://www.rfc-editor.org/info/rfc6817>.
[RTCWEB-QOS] [RTCWEB-QOS]
Dhesikan, S., Jennings, C., Druta, D., Jones, P., and J. Dhesikan, S., Jennings, C., Druta, D., Jones, P., and J.
Polk, "DSCP and other packet markings for RTCWeb QoS", Polk, "DSCP and other packet markings for RTCWeb QoS",
December 2014, <https://datatracker.ietf.org/doc/draft- December 2014, <https://datatracker.ietf.org/doc/draft-
dhesikan-tsvwg-rtcweb-qos/>. dhesikan-tsvwg-rtcweb-qos/>.
13.2. Informative References 13.2. Informative References
[CODEL2012] [CODEL2012]
Nichols, K. and V. Jacobson, "Controlling Queue Delay", Nichols, K. and V. Jacobson, "Controlling Queue Delay",
July 2012, <http://queue.acm.org/detail.cfm?id=2209336>. July 2012, <http://queue.acm.org/detail.cfm?id=2209336>.
[DRR] Shreedhar, M. and G. Varghese, "Efficient Fair Queueing [DRR] Shreedhar, M. and G. Varghese, "Efficient Fair Queueing
Using Deficit Round Robin", June 1996, Using Deficit Round Robin", June 1996,
<http://users.ece.gatech.edu/~siva/ECE4607/presentations/ <http://users.ece.gatech.edu/~siva/ECE4607/presentations/
DRR.pdf>. DRR.pdf>.
[DRRPP] MacGregor, M. and W. Shi, "Deficits for Bursty Latency- [DRRPP] MacGregor, M. and W. Shi, "Deficits for Bursty Latency-
critical Flows: DRR++", 2000, <http://ieeexplore.ieee.org/ critical Flows: DRR++", 2000,
xpls/abs_all.jsp?arnumber=875803>. <http://ieeexplore.ieee.org/xpls/
abs_all.jsp?arnumber=875803>.
[GONG2014] [GONG2014]
Gong, Y., Rossi, D., Testa, C., Valenti, S., and D. Taht, Gong, Y., Rossi, D., Testa, C., Valenti, S., and D. Taht,
"Fighting the bufferbloat: on the coexistence of AQM and "Fighting the bufferbloat: on the coexistence of AQM and
low priority congestion control", July 2014, low priority congestion control", July 2014,
<http://perso.telecom-paristech.fr/~drossi/paper/ <http://perso.telecom-paristech.fr/~drossi/paper/
rossi14comnet-b.pdf>. rossi14comnet-b.pdf>.
[SQF2012] Bonald, T., Muscariello, L., and N. Ostallo, "On the [SQF2012] Bonald, T., Muscariello, L., and N. Ostallo, "On the
impact of TCP and per-flow scheduling on Internet impact of TCP and per-flow scheduling on Internet
Performance - IEEE/ACM transactions on Networking", April Performance - IEEE/ACM transactions on Networking", April
2012, <http://perso.telecom-paristech.fr/~bonald/ 2012, <http://perso.telecom-
Publications_files/BMO2011.pdf>. paristech.fr/~bonald/Publications_files/BMO2011.pdf>.
13.3. URIs 13.3. URIs
[1] http://www.bufferbloat.net/projects/codel/wiki/Cake [1] http://www.bufferbloat.net/projects/codel/wiki/Cake
Authors' Addresses Authors' Addresses
Toke Hoeiland-Joergensen Toke Hoeiland-Joergensen
Karlstad University Karlstad University
Dept. of Computer Science Dept. of Computer Science
 End of changes. 16 change blocks. 
35 lines changed or deleted 60 lines changed or added

This html diff was produced by rfcdiff 1.42. The latest version is available from http://tools.ietf.org/tools/rfcdiff/