--- 1/draft-ietf-ippm-model-based-metrics-11.txt 2017-09-15 12:13:17.268655294 -0700 +++ 2/draft-ietf-ippm-model-based-metrics-12.txt 2017-09-15 12:13:17.376657880 -0700 @@ -1,77 +1,64 @@ IP Performance Working Group M. Mathis Internet-Draft Google, Inc Intended status: Experimental A. Morton -Expires: January 1, 2018 AT&T Labs - June 30, 2017 +Expires: March 19, 2018 AT&T Labs + September 15, 2017 Model Based Metrics for Bulk Transport Capacity - draft-ietf-ippm-model-based-metrics-11.txt + draft-ietf-ippm-model-based-metrics-12.txt Abstract We introduce a new class of Model Based Metrics designed to assess if a complete Internet path can be expected to meet a predefined Target Transport Performance by applying a suite of IP diagnostic tests to successive subpaths. The subpath-at-a-time tests can be robustly applied to critical infrastructure, such as network interconnections or even individual devices, to accurately detect if any part of the infrastructure will prevent paths traversing it from meeting the Target Transport Performance. - Model Based Metrics rely on peer-reviewed mathematical models to - specify a Targeted Suite of IP Diagnostic tests, designed to assess - whether common transport protocols can be expected to meet a - predetermined Target Transport Performance over an Internet path. + Model Based Metrics rely on mathematical models to specify a Targeted + Suite of IP Diagnostic tests, designed to assess whether common + transport protocols can be expected to meet a predetermined Target + Transport Performance over an Internet path. - For Bulk Transport Capacity IP diagnostics are built using test + For Bulk Transport Capacity the IP diagnostics are built using test streams and statistical criteria for evaluating the packet transfer that mimic TCP over the complete path. The temporal structure of the test stream (bursts, etc) mimics TCP or other transport protocols carrying bulk data over a long path.
However, they are constructed to be independent of the details of the subpath under test, end systems or applications. Likewise, the success criteria evaluate the packet transfer statistics of the subpath against criteria determined by protocol performance models applied to the Target Transport Performance of the complete path. The success criteria also do not depend on the details of the subpath, end systems or application. - Model Based Metrics exhibit several important new properties not - present in other Bulk Transport Capacity Metrics, including the - ability to reason about concatenated or overlapping subpaths. The - results are vantage independent which is critical for supporting - independent validation of tests by comparing results from multiple - measurement points. - - This document provides a framework for designing suites of IP - diagnostic tests that are tailored to confirming that infrastructure - can meet the predetermined Target Transport Performance. It does not - fully specify the IP diagnostics tests needed to assure any specific - target performance. - Status of This Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." - This Internet-Draft will expire on January 1, 2018. + This Internet-Draft will expire on March 19, 2018. Copyright Notice Copyright (c) 2017 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents @@ -101,46 +88,46 @@ 6.2. Constant window pseudo CBR . . . . . . . . . . . . . . . 27 6.3. Scanned window pseudo CBR . . . . . . . . . . . . . . . . 28 6.4. Concurrent or channelized testing . . . . . . . . . . . . 29 7. Interpreting the Results . . . . . . . . . . . . . . . . . . 30 7.1. Test outcomes . . . . . . . . . . . . . . . . . . . . . . 30 7.2. Statistical criteria for estimating run_length . . . . . 31 7.3. Reordering Tolerance . . . . . . . . . . . . . . . . . . 34 8. IP Diagnostic Tests . . . . . . . . . . . . . . . . . . . . . 34 8.1. Basic Data Rate and Packet Transfer Tests . . . . . . . . 35 8.1.1. Delivery Statistics at Paced Full Data Rate . . . . . 35 - 8.1.2. Delivery Statistics at Full Data Windowed Rate . . . 36 - 8.1.3. Background Packet Transfer Statistics Tests . . . . . 36 + 8.1.2. Delivery Statistics at Full Data Windowed Rate . . . 35 + 8.1.3. Background Packet Transfer Statistics Tests . . . . . 35 8.2. Standing Queue Tests . . . . . . . . . . . . . . . . . . 36 - 8.2.1. Congestion Avoidance . . . . . . . . . . . . . . . . 38 - 8.2.2. Bufferbloat . . . . . . . . . . . . . . . . . . . . . 38 + 8.2.1. Congestion Avoidance . . . . . . . . . . . . . . . . 37 + 8.2.2. Bufferbloat . . . . . . . . . . . . . . . . . . . . . 37 8.2.3. Non excessive loss . . . . . . . . . . . . . . . . . 38 - 8.2.4. Duplex Self Interference . . . . . . . . . . . . . . 39 + 8.2.4. Duplex Self Interference . . . . . . . . . . . . . . 38 8.3. Slowstart tests . . . . . . . . . . . . . . . . . . . . . 39 8.3.1. Full Window slowstart test . . . . . . . . . . . . . 39 - 8.3.2. Slowstart AQM test . . . . . . . . . . . . . . . . . 40 + 8.3.2. Slowstart AQM test . . . . . . . . . . . . . . . . . 39 8.4. Sender Rate Burst tests . . . . . . . . . . . . . 
. . . . 40 8.5. Combined and Implicit Tests . . . . . . . . . . . . . . . 41 8.5.1. Sustained Bursts Test . . . . . . . . . . . . . . . . 41 8.5.2. Passive Measurements . . . . . . . . . . . . . . . . 42 9. An Example . . . . . . . . . . . . . . . . . . . . . . . . . 43 9.1. Observations about applicability . . . . . . . . . . . . 44 - 10. Validation . . . . . . . . . . . . . . . . . . . . . . . . . 45 + 10. Validation . . . . . . . . . . . . . . . . . . . . . . . . . 44 11. Security Considerations . . . . . . . . . . . . . . . . . . . 46 - 12. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 47 + 12. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 46 13. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 47 - 14. References . . . . . . . . . . . . . . . . . . . . . . . . . 47 + 14. Informative References . . . . . . . . . . . . . . . . . . . 47 Appendix A. Model Derivations . . . . . . . . . . . . . . . . . 51 - A.1. Queueless Reno . . . . . . . . . . . . . . . . . . . . . 52 - Appendix B. The effects of ACK scheduling . . . . . . . . . . . 53 - Appendix C. Version Control . . . . . . . . . . . . . . . . . . 54 - Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 54 + A.1. Queueless Reno . . . . . . . . . . . . . . . . . . . . . 51 + Appendix B. The effects of ACK scheduling . . . . . . . . . . . 52 + Appendix C. Version Control . . . . . . . . . . . . . . . . . . 53 + Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 53 1. Introduction Model Based Metrics (MBM) rely on peer-reviewed mathematical models to specify a Targeted Suite of IP Diagnostic tests, designed to assess whether common transport protocols can be expected to meet a predetermined Target Transport Performance over an Internet path. This note describes the modeling framework to derive the test parameters for assessing an Internet path's ability to support a predetermined Bulk Transport Capacity. 
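The modeling step described above, deriving test parameters from a predetermined Bulk Transport Capacity, can be illustrated with a small numeric sketch. This is a hypothetical illustration, assuming the window and run-length relations developed later in this note (target_window_size in Section 5.2, the queueless Reno model in Appendix A); the function name, the 64 byte header_overhead default, and the rounding choices are illustrative, not normative:

```python
import math

def derive_tids_parameters(target_rate_bps, target_rtt_s, target_mtu,
                           header_overhead=64):
    """Hypothetical sketch of deriving TIDS parameters from the
    Target Transport Performance.  Illustrative only."""
    mss = target_mtu - header_overhead  # usable payload per packet, bytes
    # Average packets in flight needed to sustain the target data rate
    # at the target RTT (rate is in bits/s, hence the factor of 8).
    target_window_size = math.ceil(target_rate_bps * target_rtt_s / (8 * mss))
    # Queueless Reno model: roughly one loss or ECN CE mark allowed
    # per 3 * target_window_size^2 delivered packets.
    target_run_length = 3 * target_window_size ** 2
    return target_window_size, target_run_length

# e.g. a 10 Mb/s target over a 100 ms path with a 1500 byte MTU
w, rl = derive_tids_parameters(10e6, 0.1, 1500)
```

The point of the sketch is that once the Target Transport Performance is fixed, the window and run-length targets for every subpath test follow mechanically from the model.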
@@ -158,28 +145,38 @@ protocols to meet the specific performance objective over some network path. In most cases, the IP diagnostic tests can be implemented by combining existing IPPM metrics with additional controls for generating test streams having a specified temporal structure (bursts or standing queues caused by constant bit rate streams, etc.) and statistical criteria for evaluating packet transfer. The temporal structure of the test streams mimics transport protocol behavior over the complete path; the statistical criteria model the transport - protocol's response to less than ideal IP packet transfer. + protocol's response to less than ideal IP packet transfer. In + control theory terms, the tests are "open loop". Note that running a + test requires the coordinated activity of sending and receiving + measurement points. This note addresses Bulk Transport Capacity. It describes an alternative to the approach presented in "A Framework for Defining Empirical Bulk Transfer Capacity Metrics" [RFC3148]. Other Model Based Metrics may cover other applications and transports, such as VoIP over UDP and RTP, and new transport protocols. + This note assumes a traditional Reno TCP style self clocked, window + controlled transport protocol that uses packet loss and ECN CE marks + for congestion feedback. There are currently some experimental + protocols and congestion control algorithms that are rate based or + otherwise fall outside of these assumptions. In the future these new + protocols and algorithms may call for revised models. + The MBM approach, mapping Target Transport Performance to a Targeted IP Diagnostic Suite (TIDS) of IP tests, solves some intrinsic problems with using TCP or other throughput maximizing protocols for measurement. In particular, all throughput maximizing protocols (and TCP congestion control especially) cause some level of congestion in order to detect when they have reached the available capacity limitation of the network.
This self inflicted congestion obscures the network properties of interest and introduces non-linear dynamic equilibrium behaviors that make any resulting measurements useless as metrics because they have no predictive value for conditions or paths @@ -202,21 +199,41 @@ publication. RFC Editor: The reference to draft-ietf-tcpm-rack is to attribute an idea. This document should not block waiting for the completion of that one. Please send comments about this draft to ippm@ietf.org. See http://goo.gl/02tkD for more information including: interim drafts, an up to date todo list and information on contributing. - Formatted: Thu Jun 29 19:08:08 PDT 2017 + Formatted: Fri Sep 15 11:14:13 PDT 2017 + + Changes since -11 draft: + + o (From IESG review comments.) + o Ben Campbell: Shorten the Abstract. + o Mirja Kuhlewind: Reduced redundancy. (See message) + o MK: Mention open loop in the introduction. + o MK: Spelled out ECN and reference RFC3168. + o MK: Added a paragraph to the introduction about assuming a + traditional self clocked, window controlled transport protocol. + o MK: Added language about initial window to the list about + bursts at the end of section 4.1. + o MK: Network power is defined in the terminology section. + o MK: The introduction mentions coordinated activity of both + endpoints. + o MK: The security section restates that some of the tests are not + intended for frequent monitoring, as the high load can impact + other traffic negatively. + o MK: Restored "Informative References" section name. + o And a few minor nits. Changes since -10 draft: o A few more nits from various sources. o (From IETF LC review comments.) o David Mandelberg: design metrics to prevent DDOS. o From Robert Sparks: * Remove all legacy 2119 language. * Fixed Xr notation inconsistency. @@ -214,28 +231,29 @@ Changes since -10 draft: o A few more nits from various sources. o (From IETF LC review comments.) o David Mandelberg: design metrics to prevent DDOS.
o From Robert Sparks: * Remove all legacy 2119 language. * Fixed Xr notation inconsistency. * Adjusted abstract: tests are only partially specified. + * Avoid rather than suppress the effects of congestion control * Removed the unnecessary, excessively abstract and unclear thought about IP vs TCP measurements. * Changed "thwarted" to "not fulfilled". * Qualified language about burst models. * Replaced "infinitesimal" with other language. * Added citations for the reordering strawman. - * Pointed out that psuedo CBR tests depend on self clock. + * Pointed out that pseudo CBR tests depend on self clock. * Fixed some run on sentences. o Update language to reflect RFC7567, AQM recommendations. o Suggestion from Merry Mou (MIT) Changes since -09 draft: o Five last minute editing nits. Changes since -08 draft: @@ -515,20 +534,21 @@ bottlenecks elsewhere, such as in the application itself. IP diagnostic tests: Measurements or diagnostics to determine if packet transfer statistics meet some precomputed target. traffic patterns: The temporal patterns or burstiness of traffic generated by applications over transport protocols such as TCP. There are several mechanisms that cause bursts at various time scales as described in Section 4.1. Our goal here is to mimic the range of common patterns (burst sizes and rates, etc), without tying our applicability to specific applications, implementations or technologies, which are sure to become stale. + Explicit Congestion Notification (ECN): See [RFC3168]. packet transfer statistics: Raw, detailed or summary statistics about packet transfer properties of the IP layer including packet losses, ECN Congestion Experienced (CE) marks, reordering, or any other properties that may be germane to transport performance. packet loss ratio: As defined in [RFC7680]. apportioned: To divide and allocate, for example budgeting packet loss across multiple subpaths such that the losses will accumulate to less than a specified end-to-end loss ratio. 
Apportioning metrics is essentially the inverse of the process described in [RFC5835]. @@ -615,31 +635,33 @@ Without loss of generality this is assumed to be the size for returning acknowledgments (ACKs). For TCP, the Maximum Segment Size (MSS) is the Target MTU minus the header_overhead. Basic parameters common to models and subpath tests are defined here and described in more detail in Section 5.2. Note that these are mixed between application transport performance (excludes headers) and IP performance (which includes TCP headers and retransmissions as part of the IP payload). + Network power: The observed data rate divided by the observed RTT. + Network power indicates how effectively a transport protocol is + filling a network. Window [size]: The total quantity of data carried by packets in-flight plus the data represented by ACKs circulating in the network is referred to as the window. See Section 4.1. Sometimes used with other qualifiers (congestion window, cwnd or receiver window) to indicate which mechanism is controlling the window. pipe size: A general term for the number of packets needed in flight (the window size) to exactly fill some network path or subpath. - It corresponds to the window size which maximizes network power, - the observed data rate divided by the observed RTT. Often used - with additional qualifiers to specify which path, or under what - conditions, etc. + It corresponds to the window size which maximizes network power. + Often used with additional qualifiers to specify which path, or + under what conditions, etc. target_window_size: The average number of packets in flight (the window size) needed to meet the Target Data Rate, for the specified Target RTT, and MTU. It implies the scale of the bursts that the network might experience. run length: A general term for the observed, measured, or specified number of packets that are (expected to be) delivered between losses or ECN Congestion Experienced (CE) marks.
Nominally one over the sum of the loss and ECN CE marking probabilities, if they are independently and identically distributed. target_run_length: The target_run_length is an estimate of the @@ -907,23 +930,24 @@ can tolerate bursts at the scales that can be caused by the above mechanisms. Three cases are believed to be sufficient: o Two level slowstart bursts sufficient to get connections started properly. o Ubiquitous sender interface rate bursts caused by efficiency algorithms. We assume 4 packet bursts to be the most common case, since it matches the effects of delayed ACK during slowstart. These bursts should be assumed not to significantly affect packet transfer statistics. - o Infrequent sender interface rate bursts that are full - target_window_size. Target_run_length may be derated for these - large fast bursts. + o Infrequent sender interface rate bursts that are the maximum of + the full target_window_size and the initial window size (10 + segments in [RFC6928]). The Target_run_length may be derated for + these large fast bursts. If a subpath can meet the required packet loss ratio for bursts at all of these scales then it has sufficient buffering at all potential bottlenecks to tolerate any of the bursts that are likely introduced by TCP or other transport protocols. 4.2. Diagnostic Approach A complete path of a given RTT and MTU, which are equal to or smaller than the Target RTT and equal to or larger than the Target MTU @@ -1586,66 +1608,40 @@ reordering should be instrumented and the maximum reordering that can be properly characterized by the test (because of the bound on history buffers) should be recorded with the measurement results. Reordering tolerance and diagnostic limitations, such as the size of the history buffer used to diagnose packets that are way out-of-order, must be specified in a FS-TIDS. 8.
IP Diagnostic Tests - The IP diagnostic tests below are organized by traffic pattern: basic - data rate and packet transfer statistics, standing queues, slowstart - bursts, and sender rate bursts. We also introduce some combined - tests which are more efficient when networks are expected to pass, - but conflate diagnostic signatures when they fail. + The IP diagnostic tests below are organized by the technique used to + generate the test stream as described in Section 6. All of the + results are evaluated in accordance with Section 7, possibly with + additional test specific criteria. - There are a number of test details which are not fully defined here. - They must be fully specified in a FS-TIDS. From a standardization - perspective, this lack of specificity will weaken this version of - Model Based Metrics, however it is anticipated that this weakness is - than offset by the extent to which MBM suppresses the problems caused - by using transport protocols for measurement. e.g. non-specific MBM - metrics are likely to have better repeatability than many existing - BTC like metrics. Once we have good field experience, the missing - details can be fully specified. + We also introduce some combined tests which are more efficient when + networks are expected to pass, but conflate diagnostic signatures + when they fail. 8.1. Basic Data Rate and Packet Transfer Tests We propose several versions of the basic data rate and packet - transfer statistics test. All measure the number of packets - delivered between losses or ECN Congestion Experienced (CE) marks, - using a data stream that is rate controlled at approximately the - target_data_rate. - - The tests below differ in how the data rate is controlled. The data - can be paced on a timer, or window controlled (and self clocked). - The first two tests implicitly confirm that sub_path has sufficient - raw capacity to carry the target_data_rate.
They are recommended for - relatively infrequent testing, such as an installation or periodic - auditing process. The third, background packet transfer statistics, - is a low rate test designed for ongoing monitoring for changes in - subpath quality. - - All rely on the data receiver accumulating packet transfer statistics - as described in Section 7.2 to score the outcome: - - Pass: it is statistically significant that the observed interval - between losses or ECN CE marks is larger than the target_run_length. - - Fail: it is statistically significant that the observed interval - between losses or ECN CE marks is smaller than the target_run_length. - - A test is considered to be inconclusive if it failed to generate the - data rate as specified below, meet the qualifications defined in - Section 5.4 or neither run length statistical hypothesis was - confirmed in the allotted test duration. + transfer statistics test that differ in how the data rate is + controlled. The data can be paced on a timer, or window controlled + (and self clocked). The first two tests implicitly confirm that + sub_path has sufficient raw capacity to carry the target_data_rate. + They are recommended for relatively infrequent testing, such as an + installation or periodic auditing process. The third, background + packet transfer statistics, is a low rate test designed for ongoing + monitoring for changes in subpath quality. 8.1.1. Delivery Statistics at Paced Full Data Rate Confirm that the observed run length is at least the target_run_length while relying on a timer to send data at the target_rate using the procedure described in Section 6.1 with a burst size of 1 (single packets) or 2 (packet pairs). The test is considered to be inconclusive if the packet transmission cannot be accurately controlled for any reason.
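As a concrete illustration of the paced test described above: the sender schedules single packets or packet pairs on a timer so the aggregate matches the target rate, and the receiver scores the intervals between losses or ECN CE marks. A hedged sketch, with assumed helper names; the actual pass/fail criterion is the sequential statistical test of Section 7.2, which this simple run-length split deliberately does not reproduce:

```python
def paced_interval_s(target_data_rate_bps, packet_size_bytes, burst_size=1):
    """Inter-burst send interval for timer-paced transmission.

    A burst_size of 1 paces single packets; 2 paces packet pairs,
    as permitted by the paced full data rate test.  Illustrative only.
    """
    return 8 * packet_size_bytes * burst_size / target_data_rate_bps

def observed_run_lengths(events):
    """Split a receive trace into observed run lengths.

    `events` is a sequence of booleans: True marks a loss or ECN CE
    mark, False a clean delivery.  A real tester would feed these
    counts into the statistical criteria of Section 7.2 rather than
    inspect raw runs directly.
    """
    runs, current = [], 0
    for marked in events:
        if marked:
            runs.append(current)  # run ended by a loss or CE mark
            current = 0
        else:
            current += 1
    return runs

# 1500 byte packets at a 10 Mb/s target: one packet every 1.2 ms.
interval = paced_interval_s(10e6, 1500)
```

Note that accurate timer pacing at these sub-millisecond intervals is exactly what the inconclusive clause above guards: if the sender cannot hold the computed interval, the run-length evidence is void.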
@@ -2172,24 +2168,28 @@ Bulk Transport Capacity are sensitive to RTT and as a consequence often yield very different results when run local to an ISP or interconnect and when run over a customer's complete path. Neither the ISP nor customer can repeat the other's measurements, leading to high levels of distrust and acrimony. Model Based Metrics are expected to greatly improve this situation. Note that in situ measurements sometimes require sending synthetic measurement traffic between arbitrary locations in the network, and as such are potentially attractive platforms for launching DDOS - attacks. All active measurement tools and protocols must be deigned + attacks. All active measurement tools and protocols must be designed to minimize the opportunities for these misuses. See the discussion in section 7 of [RFC7594]. + Some of the tests described in this note are not intended for + frequent network monitoring since they have the potential to cause + high network loads and might adversely affect other traffic. + This document only describes a framework for designing a Fully Specified Targeted IP Diagnostic Suite. Each FS-TIDS must include its own security section. 12. Acknowledgments Ganga Maguluri suggested the statistical test for measuring loss probability in the target run length. Thanks to Alex Gilgur and Merry Mou for helping with the statistics. @@ -2198,38 +2198,43 @@ Ruediger Geib provided feedback which greatly improved the document. This work was inspired by Measurement Lab: open tools running on an open platform, using open tools to collect open data. See http://www.measurementlab.net/ 13. IANA Considerations This document has no actions for IANA. -14. References +14. Informative References [RFC0863] Postel, J., "Discard Protocol", STD 21, RFC 863, May 1983. [RFC0864] Postel, J., "Character Generator Protocol", STD 22, RFC 864, May 1983. [RFC2330] Paxson, V., Almes, G., Mahdavi, J., and M. Mathis, "Framework for IP Performance Metrics", RFC 2330, May 1998.
[RFC2861] Handley, M., Padhye, J., and S. Floyd, "TCP Congestion Window Validation", RFC 2861, June 2000. [RFC3148] Mathis, M. and M. Allman, "A Framework for Defining Empirical Bulk Transfer Capacity Metrics", RFC 3148, July 2001. + [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition + of Explicit Congestion Notification (ECN) to IP", + RFC 3168, DOI 10.17487/RFC3168, September 2001, + . + [RFC3465] Allman, M., "TCP Congestion Control with Appropriate Byte Counting (ABC)", RFC 3465, February 2003. [RFC4015] Ludwig, R. and A. Gurtov, "The Eifel Response Algorithm for TCP", RFC 4015, February 2005. [RFC4737] Morton, A., Ciavattone, L., Ramachandran, G., Shalunov, S., and J. Perser, "Packet Reordering Metrics", RFC 4737, November 2006.