--- 1/draft-ietf-tsvwg-l4sops-01.txt 2021-10-25 17:14:21.040428164 -0700 +++ 2/draft-ietf-tsvwg-l4sops-02.txt 2021-10-25 17:14:21.092429469 -0700 @@ -1,18 +1,18 @@ Transport Area Working Group G. White, Ed. Internet-Draft CableLabs -Intended status: Informational 12 July 2021 -Expires: 13 January 2022 +Intended status: Informational 25 October 2021 +Expires: 28 April 2022 Operational Guidance for Deployment of L4S in the Internet - draft-ietf-tsvwg-l4sops-01 + draft-ietf-tsvwg-l4sops-02 Abstract This document is intended to provide guidance in order to ensure successful deployment of Low Latency Low Loss Scalable throughput (L4S) in the Internet. Other L4S documents provide guidance for running an L4S experiment, but this document is focused solely on potential interactions between L4S flows and flows using the original ('Classic') ECN over a Classic ECN bottleneck link. The document discusses the potential outcomes of these interactions, describes @@ -29,21 +29,21 @@ Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at https://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." - This Internet-Draft will expire on 13 January 2022. + This Internet-Draft will expire on 28 April 2022. Copyright Notice Copyright (c) 2021 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/ license-info) in effect on the date of publication of this document. @@ -58,50 +58,50 @@ 1. Introduction . . . . . . . . . . . . . . . . . . . . . . 
. . 3 2. Per-Flow Fairness . . . . . . . . . . . . . . . . . . . . . . 5 3. Flow Queuing Systems . . . . . . . . . . . . . . . . . . . . 7 4. Detection of Classic ECN Bottlenecks . . . . . . . . . . . . 7 4.1. Recent Studies . . . . . . . . . . . . . . . . . . . . . 7 4.2. Future Experiments . . . . . . . . . . . . . . . . . . . 9 5. Operator of an L4S host . . . . . . . . . . . . . . . . . . . 9 5.1. Server Type . . . . . . . . . . . . . . . . . . . . . . . 10 5.1.1. General purpose servers (e.g. web servers) . . . . . 10 5.1.2. Specialized servers handling long-running sessions - (e.g. cloud gaming) . . . . . . . . . . . . . . . . . 10 + (e.g. cloud gaming) . . . . . . . . . . . . . . . . . 11 5.2. Server deployment environment . . . . . . . . . . . . . . 11 5.2.1. Edge Servers . . . . . . . . . . . . . . . . . . . . 11 5.2.2. Other hosts . . . . . . . . . . . . . . . . . . . . . 12 6. Operator of a Network Employing RFC3168 FIFO Bottlenecks . . 13 6.1. Preferred Options . . . . . . . . . . . . . . . . . . . . 13 - 6.1.1. Upgrade AQMs to an L4S-aware AQM . . . . . . . . . . 13 + 6.1.1. Upgrade AQMs to an L4S-aware AQM . . . . . . . . . . 14 6.1.2. Configure Non-Coupled Dual Queue with Shallow - Target . . . . . . . . . . . . . . . . . . . . . . . 13 - 6.1.3. Approximate Fair Dropping . . . . . . . . . . . . . . 14 - 6.1.4. Replace RFC3168 FIFO with RFC3168 FQ . . . . . . . . 14 - 6.1.5. Do Nothing . . . . . . . . . . . . . . . . . . . . . 14 - 6.2. Less Preferred Options . . . . . . . . . . . . . . . . . 14 + Target . . . . . . . . . . . . . . . . . . . . . . . 14 + 6.1.3. Approximate Fair Dropping . . . . . . . . . . . . . . 15 + 6.1.4. Replace RFC3168 FIFO with RFC3168 FQ . . . . . . . . 15 + 6.1.5. Do Nothing . . . . . . . . . . . . . . . . . . . . . 15 + 6.2. Non-Preferred Options . . . . . . . . . . . . . . . . . . 15 6.2.1. Configure Non-Coupled Dual Queue Treating ECT(1) as - NotECT . . . . . . . . . . . . . . . . . . . . . . . 14 - 6.2.2. 
WRED with ECT(1) Differentiation . . . . . . . . . . . 15 - 6.2.3. Configure AQM to treat ECT(1) as NotECT . . . . . . . 15 - 6.2.4. ECT(1) Tunnel Bypass . . . . . . . . . . . . . . . . 15 - 6.3. Last Resort Options . . . . . . . . . . . . . . . . . . . 15 - 6.3.1. Disable RFC3168 Support . . . . . . . . . . . . . . . 16 - 6.3.2. Re-mark ECT(1) to NotECT Prior to AQM . . . . . . . . 16 - 7. Operator of a Network Employing RFC3168 FQ Bottlenecks . . . 16 - 8. Conclusion of the L4S experiment . . . . . . . . . . . . . . 17 - 8.1. Termination of a successful L4S experiment . . . . . . . 17 - 8.2. Termination of an unsuccessful L4S experiment . . . . . . 18 - 9. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 18 - 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 18 - 11. Security Considerations . . . . . . . . . . . . . . . . . . . 18 - 12. Informative References . . . . . . . . . . . . . . . . . . . 18 - Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 21 + NotECT . . . . . . . . . . . . . . . . . . . . . . . 16 + 6.2.2. WRED with ECT(1) Differentiation . . . . . . . . . . . 16 + 6.2.3. Configure AQM to treat ECT(1) as NotECT . . . . . . . 16 + 6.2.4. ECT(1) Tunnel Bypass . . . . . . . . . . . . . . . . 16 + 6.3. Last Resort Options . . . . . . . . . . . . . . . . . . . 17 + 6.3.1. Disable RFC3168 Support . . . . . . . . . . . . . . . 17 + 6.3.2. Re-mark ECT(1) to NotECT Prior to AQM . . . . . . . . 17 + 7. Operator of a Network Employing RFC3168 FQ Bottlenecks . . . 17 + 8. Conclusion of the L4S experiment . . . . . . . . . . . . . . 18 + 8.1. Termination of a successful L4S experiment . . . . . . . 19 + 8.2. Termination of an unsuccessful L4S experiment . . . . . . 19 + 9. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 19 + 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 19 + 11. Security Considerations . . . . . . . . . . . . . . . . . . . 19 + 12. Informative References . . . . . . . . . . 
. . . . . . . . . 19 + Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 22 1. Introduction Low-latency, low-loss, scalable throughput (L4S) [I-D.ietf-tsvwg-l4s-arch] traffic is designed to provide lower queuing delay than conventional traffic via a new network service based on a modified Explicit Congestion Notification (ECN) response from the network. L4S traffic is identified by the ECT(1) codepoint, and network bottlenecks that support L4S should congestion-mark ECT(1) packets to enable L4S congestion feedback. However, L4S @@ -412,44 +412,59 @@ expected for deployers of L4S technology. As part of that validation, it is recommended that deployers consider the issue of RFC3168 FIFO bottlenecks and conduct experiments as described in the previous section, or otherwise assess the impact that the L4S technology will have in the networks in which it is to be deployed, and take action as is described further in this section. This sort of progressive (incremental) deployment helps to ensure that any issues are discovered when the scale of those issues is relatively small. - TODO: discussion of risk of incorrectly classifying a path + Some of the recommendations in this section involve the sender + determining (through various means) the likelihood of a particular + path having a bottleneck that implements single queue RFC3168 AQM. + Since this determination can be imprecise, there exists some risk + that a path is incorrectly classified. In the case of false- + positives (where a path is erroneously believed to contain RFC3168), + discontinuing the use of L4S on that path would result in a lost + opportunity for low-latency low-loss service, and thus likely an + unnecessary degradation in the quality of experience for the user. + In the case of false-negatives, the use of L4S has the potential to + result in a reduction in the throughput of non-L4S flows while the + L4S flow is active. 
In environments where the risk of false- + negatives is significant, it is recommended that hosts limit the use + of L4S congestion control to application-limited flows that are + especially sensitive to latency, latency variation and loss. 5.1. Server Type If pre-deployment testing raises concerns about issues with RFC3168 bottlenecks, the actions taken may depend on the server type. 5.1.1. General purpose servers (e.g. web servers) * Out-of-band active testing could be performed by the server. For example, a JavaScript application could run simultaneous downloads (i.e. with and without L4S) during page reading time in order to survey for the presence of RFC3168 FIFO bottlenecks on paths to users (e.g. as described in Section 4 of [Briscoe]). * In-band testing could be built into the transport protocol implementation at the sender in order to perform detection (see Section 5 of [Briscoe], though note that this mechanism does not differentiate between FIFO and FQ). - * Discontinuing use of L4S based on the detection of RFC3168 FIFO - bottlenecks is likely not needed for short transactional transfers - (e.g. sub 10 seconds) since these are unlikely to achieve the - steady-state conditions where unfairness has been observed. + * Depending on the details of the L4S congestion control + implementation, taking action based on the detection of RFC3168 + FIFO bottlenecks may not be needed for short transactional + transfers that are unlikely to achieve the steady-state conditions + where unfairness is likely to occur. * For longer file transfers, it may be possible to fall back to Classic behavior in real time (i.e. when doing in-band testing), or to cache those destinations where RFC3168 has been detected, and disable L4S for subsequent long file transfers to those destinations. 5.1.2. Specialized servers handling long-running sessions (e.g. 
cloud gaming) @@ -476,20 +491,23 @@ Some hosts (such as CDN leaf nodes and servers internal to an ISP) are deployed in environments in which they serve content to a constrained set of networks or clients. The operator of such hosts may be able to determine whether there is the possibility of [RFC3168] FIFO bottlenecks being present, and utilize this information to make decisions on selectively deploying L4S and/or disabling it (e.g. bleaching ECN). Furthermore, such an operator may be able to determine the likelihood of an L4S bottleneck being present, and use this information as well. + It is recommended that L4S experimental deployments begin with such + servers. + For example, if a particular network is known to have deployed legacy [RFC3168] FIFO bottlenecks, usage of L4S for long capacity-seeking file transfers on that network could be delayed until those bottlenecks can be upgraded to mitigate any potential issues as discussed in the next section. Prior to deploying L4S on edge servers, a server operator should: * Consult with network operators on the presence of legacy [RFC3168] FIFO bottlenecks @@ -514,25 +532,33 @@ potential risk for any unfairness to be experienced by end users. 5.2.2. Other hosts Hosts that are deployed in locations that serve a wide variety of networks face a more difficult prospect in terms of handling the potential presence of RFC3168 FIFO bottlenecks. Nonetheless, the steps listed in the earlier section (based on server type) can be taken to minimize the risk of unfairness. + It is recommended that operators of such hosts consider carefully + whether these hosts are appropriate for early experimentation with + L4S. + The interpretation of studies on ECN usage and their deployment context (see Section 4.1) has so far concluded that RFC3168 FIFO bottlenecks are likely to be rare, and so detections using these - techniques may also prove to be rare. 
Therefore, it may be possible - for a host to cache a list of end host IP addresses where an RFC3168 + techniques may also prove to be rare. Additionally, the most recent + large-scale study [Holland] indicated that there were a small number + of networks in which RFC3168 bottlenecks are more prevalent than the + global average. Therefore, it may be possible for a host to maintain + a list of networks where L4S should not be enabled, and, for other + networks, to cache a list of end host IP addresses where an RFC3168 bottleneck has been detected. Entries in such a cache would need to age out after a period of time to account for IP address changes, path changes, equipment upgrades, etc. [TODO: more info on ways to cache/maintain such a list] It has been suggested that a public block-list of domains that implement RFC3168 FIFO bottlenecks could be maintained. There are a number of significant issues that would seem to make this idea infeasible, not the least of which is the fact that the presence of RFC3168 FIFO bottlenecks or L4S bottlenecks is not a property of a @@ -554,28 +580,40 @@ be fully saturated) that is configured with a legacy [RFC3168] FIFO AQM can take certain steps in order to improve rate fairness between classic traffic and L4S traffic, and thus enable L4S to be deployed in a greater number of paths. Some of the options listed in this section may not be feasible in all networking equipment. 6.1. Preferred Options + The options in this section preserve the ability of the bottleneck to + CE-mark ECT(1) packets as well as ECT(0) packets. The result of + these options is that hosts utilizing classic (RFC3168) ECN and hosts + utilizing L4S ECN receive the benefit of ECN. Further, with these + options, the hosts that choose to use L4S ECN see the benefit of + reduced latency and latency-variation compared to hosts that choose + instead to use classic ECN. + 6.1.1. 
Upgrade AQMs to an L4S-aware AQM If the RFC3168 AQM implementation can be upgraded to enable support for L4S, either via [I-D.ietf-tsvwg-aqm-dualq-coupled] or via an L4S- aware FQ implementation, this is the preferred approach to addressing potential unfairness, because it additionally enables all of the benefits of L4S. + Section 4.2 of [I-D.ietf-tsvwg-l4s-arch] contains a description of + the options available, including a discussion about L4S-aware FQ + implementations. + 6.1.2. Configure Non-Coupled Dual Queue with Shallow Target Equipment supporting [RFC3168] may be configurable to enable two parallel queues for the same traffic class, with classification done based on the ECN field. * Configure 2 queues, both with ECN; 50:50 WRR scheduler - Queue #1: ECT(1) & CE packets - Shallow immediate AQM target @@ -591,20 +629,33 @@ This option would allow L4S flows to achieve low latency, low loss and scalable throughput, but would sacrifice the more precise flow balance offered by [I-D.ietf-tsvwg-aqm-dualq-coupled]. This option would be expected to result in some reordering of previously CE marked packets sent by Classic ECN senders, which is a trait shared with [I-D.ietf-tsvwg-aqm-dualq-coupled]. As is discussed in [I-D.ietf-tsvwg-ecn-l4s-id], this reordering would be either zero risk or very low risk. + If classification based on the ECN field isn't possible in the + bottleneck, this option may still be useful if an external system can + be configured to reflect the ECN codepoint to another field that + could then be used as an alternative identifier to classify traffic + into Queue #1. For example, if at network ingress an edge router can + apply a local-use DSCP to ECT(1) & CE packets, the bottleneck can + then utilize a DSCP classifier. Similarly, in MPLS networks, ECT(1) + & CE packets could use a different EXP value [RFC5129] than classic + packets. 
More generally, any tunnelling protocol can be used to + proxy the ECN value of the encapsulated packet to its outer header, + enabling bottlenecks to classify packets based on their input virtual + interface. + 6.1.3. Approximate Fair Dropping The Approximate Fair Dropping ([AFD]) algorithm tracks individual flow rates and introduces either packet drops or CE-marks to each flow in proportion to the amount by which the flow rate exceeds a computed per-flow fair-share rate. Where an implementation of AFD or an equivalent algorithm is available, it could be enabled on an interface with a single-queue RFC3168 AQM as a fairly lightweight way to inject additional ECN marks into any significantly higher rate flows. See also [Cisco-N9000]. @@ -618,26 +669,32 @@ 6.1.5. Do Nothing If it is infeasible to implement any of the above options, it may be preferable for an operator of RFC3168 FIFO bottlenecks to leave them unchanged. In many deployment situations the risk of fairness issues may be very low, and the impact if they occur may not be particularly troublesome. This could, for instance, be true in bottlenecks where there is a high degree of flow aggregation or in high-speed bottlenecks (e.g. greater than 100 Mbps). -6.2. Less Preferred Options +6.2. Non-Preferred Options - In the case that there is a concern about per-flow fairness between - L4S flows and Classic flows in an RFC3168 FIFO bottleneck, and none - of the remedies in the previous section can be implemented, the - options listed in this section could be considered. + The options in this section come with a downside that they treat + ECT(1) packets as NotECT, and thus don't provide the latency/loss + benefit to flows marked ECT(1) (i.e. L4S flows). In the case that + there is a strong concern about per-flow fairness between L4S flows + and Classic flows in an RFC3168 FIFO bottleneck, and none of the + remedies in the previous section can be implemented, the options + listed in this section could be considered. 
These options are non- + preferred because bottlenecks that implement them create a dilemma + for operators of hosts, in that the application could see better + performance if it uses classic (RFC3168) ECN rather than L4S ECN. 6.2.1. Configure Non-Coupled Dual Queue Treating ECT(1) as NotECT * Configure 2 queues, both with AQM; 50:50 WRR scheduler - Queue #1: ECT(1) & NotECT packets - ECN disabled - Queue #2: ECT(0) & CE packets - ECN enabled * Outcome @@ -634,39 +691,42 @@ 6.2.1. Configure Non-Coupled Dual Queue Treating ECT(1) as NotECT * Configure 2 queues, both with AQM; 50:50 WRR scheduler - Queue #1: ECT(1) & NotECT packets - ECN disabled - Queue #2: ECT(0) & CE packets - ECN enabled * Outcome + - ECT(1) treated as NotECT - Flow balance for the 2 queues is the same as in Section 6.1.2 - This option would not allow L4S flows to achieve low latency, low - loss and scalable throughput in this bottleneck link. As a result it - is the less preferred option. + This option could potentially be implemented using an identifier + other than the ECN field, as discussed in Section 6.1.2. 6.2.2. WRED with ECT(1) Differentiation This configuration is similar to the option described in Section 6.2.1, but uses a single queue with WRED functionality. * Configure the queue with two WRED classes - Class #1: ECT(1) & NotECT packets - ECN disabled - Class #2: ECT(0) & CE packets - ECN enabled + This option could potentially be implemented using an identifier + other than the ECN field, as discussed in Section 6.1.2. + 6.2.3. Configure AQM to treat ECT(1) as NotECT If equipment is configurable in such a way as to only supply CE marks to ECT(0) packets, and treat ECT(1) packets identically to NotECT, or is upgradable to support this capability, doing so will eliminate the risk of unfairness. 6.2.4. ECT(1) Tunnel Bypass Tunnel ECT(1) traffic through the RFC3168 bottleneck with the outer @@ -828,22 +887,22 @@ Internet", 98th IETF MAPRG Presentation , 2017, . [Briscoe] Briscoe, B. 
and A.S. Ahmed, "TCP Prague Fall-back on Detection of a Classic ECN AQM", ArXiv , February 2021, . [Cisco-N9000] - Cisco, "Intelligent Buffer Management on Cisco Nexus 9000 - Series Switches White Paper", Cisco Product + "Intelligent Buffer Management on Cisco Nexus 9000 Series + Switches White Paper", Cisco Product Document 1486580292771926, 6 June 2017, . [Ha] Ha, S., Rhee, I., and L. Xu, "CUBIC: A New TCP-Friendly High-Speed TCP Variant", ACM SIGOPS Operating Systems Review , 2008, . @@ -895,23 +954,25 @@ Assignments", 2018, . [Mandalari] Mandalari, AM., Lutu, A., Briscoe, B., Bagnulo, M., and O. Alay, "Measuring ECN++: Good News for ++, Bad News for ECN over Mobile", DOI 10.1109/MCOM.2018.1700739, IEEE Communications Magazine vol. 56, no. 3, March 2018, . - [Palmei] Palmei, J. and X. et al., "Design and Evaluation of COBALT - Queue Discipline", IEEE International Symposium on Local - and Metropolitan Area Networks 2019, 2019, + [Palmei] Palmei, J., Gupta, S., Imputato, P., Morton, J., + Tahiliani, M., Avallone, S., and D. Taht, "Design and + Evaluation of COBALT Queue Discipline", IEEE International + Symposium on Local and Metropolitan Area Networks 2019, + 2019, . [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition of Explicit Congestion Notification (ECN) to IP", RFC 3168, DOI 10.17487/RFC3168, September 2001, . [RFC5348] Floyd, S., Handley, M., Padhye, J., and J. Widmer, "TCP Friendly Rate Control (TFRC): Protocol Specification", RFC 5348, DOI 10.17487/RFC5348, September 2008,