Internet Engineering Task Force                            R. Geib, Ed.
Internet-Draft                                         Deutsche Telekom
Intended status: Standards Track                              A. Morton
Expires: April 27, 2011                                       AT&T Labs
                                                              R. Fardid
                                                   Cariden Technologies
                                                           A. Steinmitz
                                                                HS Fulda
                                                        October 24, 2010

                   IPPM standard advancement testing
                    draft-ietf-ippm-metrictest-01
Abstract

This document specifies tests to determine if multiple independent
instantiations of a performance metric RFC have implemented the
specifications in the same way.  This is the performance metric
equivalent of interoperability, required to advance RFCs along the
standards track.  Results from different implementations of metric
RFCs will be collected under the same underlying network conditions
and compared using state of the art statistical methods.  The goal is
skipping to change at page 1, line 44

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF).  Note that other groups may also distribute
working documents as Internet-Drafts.  The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time.  It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

This Internet-Draft will expire on April 27, 2011.
Copyright Notice

Copyright (c) 2010 IETF Trust and the persons identified as the
document authors.  All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document.  Please review these documents
skipping to change at page 2, line 22

the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents

1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
  1.1.  Requirements Language . . . . . . . . . . . . . . . . . .  6
2.  Basic idea . . . . . . . . . . . . . . . . . . . . . . . . . .  6
3.  Verification of conformance to a metric specification . . . .  8
  3.1.  Tests of an individual implementation against a metric
        specification . . . . . . . . . . . . . . . . . . . . . .  9
  3.2.  Test setup resulting in identical live network testing
        conditions  . . . . . . . . . . . . . . . . . . . . . . . 11
  3.3.  Tests of two or more different implementations against
        a metric specification  . . . . . . . . . . . . . . . . . 15
  3.4.  Clock synchronisation . . . . . . . . . . . . . . . . . . 16
  3.5.  Recommended Metric Verification Measurement Process . . . 17
  3.6.  Miscellaneous . . . . . . . . . . . . . . . . . . . . . . 20
  3.7.  Proposal to determine an "equivalence" threshold for
        each metric evaluated . . . . . . . . . . . . . . . . . . 21
4.  Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 22
5.  Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 22
6.  IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 22
7.  Security Considerations . . . . . . . . . . . . . . . . . . . 22
8.  References . . . . . . . . . . . . . . . . . . . . . . . . . . 23
  8.1.  Normative References  . . . . . . . . . . . . . . . . . . 23
  8.2.  Informative References  . . . . . . . . . . . . . . . . . 24
Appendix A.  An example on a One-way Delay metric validation . . . 25
  A.1.  Compliance to Metric specification requirements  . . . . . 25
  A.2.  Examples related to statistical tests for One-way Delay  . 26
Appendix B.  Anderson-Darling 2 sample C++ code . . . . . . . . . . 28
Appendix C.  A tunneling set up for remote metric
             implementation testing  . . . . . . . . . . . . . . . 36
Appendix D.  Glossary  . . . . . . . . . . . . . . . . . . . . . . 38
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 38
1.  Introduction

The Internet Standards Process RFC2026 [RFC2026] requires that for an
IETF specification to advance beyond the Proposed Standard level, at
least two genetically unrelated implementations must be shown to
interoperate correctly with all features and options.  This
requirement can be met by supplying:

o  evidence that (at least a sub-set of) the specification has been
skipping to change at page 5, line 14

The metric RFC advancement process begins with a request for protocol
action accompanied by a memo that documents the supporting tests and
results.  The procedures of [RFC2026] are expanded in [RFC5657],
including sample implementation and interoperability reports.
Section 3 of [morton-advance-metrics-01] can serve as a template for
a metric RFC report which accompanies the protocol action request to
the Area Director, including a description of the test set-up,
procedures, results for each implementation and conclusions.
Changes from WG -00 to WG -01 draft

o  Discussion on merits and requirements of a distributed lab test
   using only local load generators.

o  Proposal of metrics suitable for tests using the proposed
   measurement configuration.

o  Hint on delay caused by software-based L2TPv3 implementations.

o  Added an appendix with a test configuration allowing remote tests
   comparing different implementations across the network.

o  Proposal for a maximum error of "equivalence", based on performance
   comparison of identical implementations.  This may be useful for
   both ADK and non-ADK comparisons.
Changes from prior ID -02 to WG -00 draft

o  Incorporation of aspects of reporting to support the protocol
   action request in the Introduction and section 3.5

o  Overhaul of section 3.2 regarding tunneling: Added generic
   tunneling requirements and L2TPv3 as an example tunneling
   mechanism fulfilling the tunneling requirements.  Removed and
   adapted some of the prior references to other tunneling protocols
skipping to change at page 8, line 42

3.  Verification of conformance to a metric specification

This section specifies how to verify compliance of two or more IPPM
implementations against a metric specification.  This document only
proposes a general methodology.  Compliance criteria for a specific
metric implementation need to be defined for each individual metric
specification.  The only exception is the statistical test comparing
two metric implementations which are simultaneously tested.  This
test is applicable without metric-specific decision criteria.
Several testing options exist to compare two or more implementations:

o  Use a single test lab to compare the implementations and emulate
   the Internet with an impairment generator.

o  Use a single test lab to compare the implementations and measure
   across the Internet.

o  Use remotely separated test labs to compare the implementations
   and emulate the Internet with two "identically" configured
   impairment generators.

o  Use remotely separated test labs to compare the implementations
   and measure across the Internet.

o  Use remotely separated test labs to compare the implementations
   and measure across the Internet, and include a single impairment
   generator to impact all measurement flows in a non-discriminatory
   way.

The first two approaches work, but cause higher expenses than the
other ones (due to travel and/or shipping+installation).  For the
third option, ensuring two identically configured impairment
generators requires well defined test cases and possibly identical
hard- and software.  >>>Comment: for some specific tests, impairment
generator accuracy requirements are less demanding than others, and
in such cases there is more flexibility in impairment generator
configuration. <<<

It is a fair question whether the last two options can result in any
applicable test set up at all.  While an experimental approach is
given in Appendix C, the tradeoff that measurement packets from
different sites pass the same path segments, but in a different
segment order, probably can't be avoided.

The question of which option above results in identical networking
conditions and is broadly accepted can't be answered without more
practical experience in comparing implementations.  The last proposal
has the advantage that, while the measurement equipment is remotely
distributed, a single network impairment generator and the Internet
can be used in combination to impact all measurement flows.
3.1.  Tests of an individual implementation against a metric
      specification

A metric implementation MUST support the requirements classified as
"MUST" and "REQUIRED" of the related metric specification to be
compliant to the latter.

Further, supported options of a metric implementation SHOULD be
documented in sufficient detail.  The documentation of chosen options
is RECOMMENDED to minimise (and recognise) differences in the test
skipping to change at page 14, line 18

length of 4 Bytes.  By the time of writing, between 1 and 4 Labels
seems to be a fair guess of what can be expected.

The applicability of one or more of the following tunneling protocols
may be investigated by interested parties if Ethernet over L2TPv3 is
felt to be not suitable: IP in IP [RFC2003] or Generic Routing
Encapsulation (GRE) [RFC2784].  RFC 4928 [RFC4928] proposes measures
to avoid ECMP treatment in MPLS networks.

L2TP is a commodity tunneling protocol [RFC2661].  By the time of
writing, L2TPv3 [RFC3931] is the latest version of L2TP.  If L2TPv3
is applied, software-based implementations of this protocol are not
suitable for the test set up, as such implementations may cause
incalculable delay shifts.

Ethernet Pseudo Wires may also be set up on MPLS networks [RFC4448].
While there's no technical issue with this solution, MPLS interfaces
are mostly found in the network provider domain.  Hence not all of
the above tunneling criteria are met.

Appendix C provides an experimental tunneling set up for metric
implementation testing between two (or more) remote sites.
Each test is repeated several times.  WAN conditions may change over
time.  Sequential testing is desirable, but may not be a useful
metric test option.  It is RECOMMENDED that tests be carried out by
establishing N different parallel measurement flows.  Two or three
linecards per implementation serving to send or receive measurement
flows should be sufficient to create 5 or more parallel measurement
flows.  If three linecards are used, each card sends and receives 2
flows.  Other options are to separate flows by DiffServ marks
(without deploying any QoS in the inner or outer tunnel) or using a
single CBR flow and evaluating every n-th singleton to belong to a

skipping to change at page 17, line 29
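A simple way to realise the "single CBR flow" option mentioned in the
(partly elided) paragraph above is to assign every n-th singleton of
that flow to one of N virtual flows and then compare the virtual
flows as if they were independent parallel flows.  The following C++
sketch illustrates this idea only; the flow count and the data type
are arbitrary assumptions, not part of any referenced specification.

   #include <cstddef>
   #include <vector>

   /* Assign the singletons of a single CBR measurement flow
    * round-robin to n_flows virtual flows, so that every n-th
    * singleton belongs to the same virtual flow. */
   std::vector<std::vector<double> >
   demux_singletons(const std::vector<double>& singletons,
                    std::size_t n_flows)
   {
       std::vector<std::vector<double> > flows(n_flows);
       for (std::size_t i = 0; i < singletons.size(); ++i)
           flows[i % n_flows].push_back(singletons[i]);
       return flows;
   }

The resulting virtual flows could then be compared pairwise, for
example with the ADK routine of Appendix B.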
In order to meet their obligations under the IETF Standards Process
the IESG must be convinced that each metric specification advanced to
Draft Standard or Internet Standard status is clearly written, that
there are the required multiple verifiably equivalent
implementations, and that all options have been implemented.

In the context of this document, metrics are designed to measure some
characteristic of a data network.  An aim of any metric definition is
that it be specified in a way that allows the specific characteristic
to be measured reliably and repeatably across multiple independent
implementations.

Each metric, statistic or option of those to be validated MUST be
compared against a reference measurement or another implementation by
at least 5 different basic data sets, each one with sufficient size
to reach the specified level of confidence, as specified by this
document.

Finally, the metric definitions, embodied in the text of the RFCs,
are the objects that require evaluation and possible revision in
order to advance to the next step on the standards track.
skipping to change at page 18, line 9

THEN the details of each implementation should be audited along with
the exact definition text, to determine if there is a lack of clarity
that has caused the implementations to vary in a way that affects the
correspondence of the results.

IF there was a lack of clarity or multiple legitimate interpretations
of the definition text,

THEN the text should be modified and the resulting memo proposed for
consensus and (possible) advancement along the standards track.

Finally, all the findings MUST be documented in a report that can
support advancement on the standards track, similar to those
described in [RFC5657].  The list of measurement devices used in
testing satisfies the implementation requirement, while the test
results provide information on the quality of each specification in
the metric RFC (the surrogate for feature interoperability).

The complete process of advancing a metric specification to a
standard as defined by this document is illustrated in Figure 3.
[ASCII flowchart omitted; its layout is not recoverable from this
diff.  It shows: Start -> implementations 1..n of the metric RFC ->
"Check for Equivalence under relevant identical network conditions"
-> decision "was RFC clause x clear?" -> YES: report results and
request(?) RFC advancement; No: modify the specification and repeat.]

          Illustration of the metric standardisation process

                               Figure 3
Any recommendation for the advancement of a metric specification MUST
be accompanied by an implementation report, as is the case with all
requests for the advancement of IETF specifications.  The
implementation report needs to include the tests performed, the
applied test setup, the specific metrics in the RFC and reports of
skipping to change at page 20, line 34

o  Different IP options.

o  Different DSCP.

o  If the N measurements are captured using sequential measurements
   instead of simultaneous ones, then the following factors come into
   play: Time varying paths and load conditions.

3.6.  Miscellaneous
A minimum number of singletons per metric is required if results are
to be compared.  To avoid accidental singletons from impacting a
metric comparison, a minimum number of 5 singletons per compared
interval was proposed above.  Commercial Internet service is not
operated to reliably create enough rare-event singletons to
characterize bad measurement engineering or bad implementations.  In
the case that a metric validation requires capturing rare events, an
impairment generator may have to be added to the test set up.
Inclusion of an impairment generator and the parameterisation of the
impairments generated MUST be documented.

A metric characterising a common impairment condition is one which,
by expectation, creates a singleton result for each measured packet.
Delay or Delay Variation are examples of this type, and in such
cases, the Internet may be used to compare metric implementations.

Rare events are those where, by expectation, no or only a rather low
number of "event is present" singletons are captured during a
measurement interval.  Packet duplications, packet loss rates above
one-digit percentages, loss patterns and packet reordering are
examples.  Note especially that a packet reordering or loss pattern
metric implementation comparison may require a more sophisticated
test set up than described here.  Spatial and temporal effects
combine in the case of packet re-ordering, and measurements with
different packet rates may always lead to different results.

As specified above, 5 singletons are the recommended basis to
minimise interference of random events with the statistical test
proposed by this document.  In the case of ratio measurements (like
packet loss), the underlying sum of basic events, against which the
metric's monitored singletons are "rated", determines the resolution
of the test.  A packet loss statistic with a resolution of 1%
requires one packet loss statistic datapoint to consist of 500 delay
singletons (of which at least 5 were lost).  To compare EDFs on
packet loss requires one hundred such statistics per flow.  That
means, all in all, at least 50 000 delay singletons are required per
single measurement flow.  Live network packet loss is assumed to be
present during main traffic hours only.  Let this interval be 5
hours.  The required minimum rate of a single measurement flow in
that case is 2.8 packets/sec (assuming a loss of 1% during 5 hours).
If this measurement is too demanding under live network conditions,
an impairment generator should be used.
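The sample size arithmetic of the preceding paragraph can be
reproduced with a few lines of code.  The values below (1% loss
resolution, at least 5 lost packets per datapoint, 100 EDF datapoints
per flow and a 5 hour busy period) are the example values used above,
not normative parameters.

   #include <cstdio>

   int main()
   {
       const double loss_resolution   = 0.01;         /* 1% resolution */
       const int    min_lost_per_stat = 5;            /* lost packets per datapoint */
       const int    edf_datapoints    = 100;          /* loss statistics per flow */
       const double interval_seconds  = 5.0 * 3600.0; /* 5 busy hours */

       /* 5 lost packets at 1% resolution require 500 singletons per
        * datapoint; 100 datapoints require 50 000 singletons per flow. */
       const int singletons_per_stat =
           static_cast<int>(min_lost_per_stat / loss_resolution);
       const int singletons_per_flow =
           singletons_per_stat * edf_datapoints;
       const double min_rate = singletons_per_flow / interval_seconds;

       std::printf("singletons per flow: %d\n", singletons_per_flow);
       std::printf("minimum flow rate:   %.1f packets/sec\n", min_rate);
       return 0;
   }

The program prints 50000 singletons per flow and a minimum rate of
2.8 packets/sec, matching the figures given above.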
3.7.  Proposal to determine an "equivalence" threshold for each metric
      evaluated

This section describes a proposal for a maximum error of
"equivalence", based on a performance comparison of identical
implementations.  This comparison may be useful for both ADK and
non-ADK comparisons.

Each metric is tested by two or more implementations (cross-
implementation testing).

Each metric is also tested twice simultaneously by the *same*
implementation, using different Src/Dst Address pairs and other
differences, such that the connectivity differences of the cross-
implementation tests are also experienced and measured by the same
implementation.

Comparative results for the same implementation represent a bound on
cross-implementation equivalence.  This should be particularly useful
when the metric does *not* produce a continuous distribution of
singleton values, such as with a loss metric or a duplication metric.
Appendix A indicates how the ADK will work for One-way Delay, and
should be likewise applicable to distributions of delay variation.

Proposal: the implementation with the largest difference in
homogeneous comparison results is the lower bound on the equivalence
threshold, noting that there may be other systematic errors to
account for when comparing between implementations.

Thus, when evaluating equivalence in cross-implementation results:

Maximum_Error = Same_Implementation_Error + Systematic_Error

and only the systematic error need be decided beforehand.

In the case of ADK comparison, the largest same-implementation
resolution of distribution equivalence can be used as a limit on
cross-implementation resolutions (at the same confidence level).
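The threshold rule proposed above can be expressed as a small helper
function.  The function and variable names below are chosen for this
sketch only, and taking the larger of the two same-implementation
errors is one possible reading of "largest difference in homogeneous
comparison results"; as stated above, the systematic error term has
to be agreed beforehand.

   #include <algorithm>
   #include <cmath>

   /* Equivalence check following the proposal above: the largest
    * difference observed when the *same* implementation measures
    * twice bounds the difference allowed between two *different*
    * implementations, plus an agreed systematic error. */
   bool cross_implementation_equivalent(double same_impl_error_1,
                                        double same_impl_error_2,
                                        double systematic_error,
                                        double cross_impl_difference)
   {
       const double same_implementation_error =
           std::max(same_impl_error_1, same_impl_error_2);
       const double maximum_error =
           same_implementation_error + systematic_error;
       return std::fabs(cross_impl_difference) <= maximum_error;
   }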
4.  Acknowledgements

Gerhard Hasslinger commented on a first version of this document,
suggested statistical tests and the evaluation of time series
information.  Henk Uijterwaal and Lars Eggert have encouraged and
helped to organise this work.  Mike Hamilton, Scott Bradner, David
McDysan and Emile Stephan commented on this draft.  Carol Davids
reviewed the 01 version of the ID before it was promoted to WG draft.
5.  Contributors

Scott Bradner, Vern Paxson and Allison Mankin drafted bradner-
metrictest [bradner-metrictest], and major parts of it are included
in this document.

6.  IANA Considerations

This memo includes no request to IANA.
skipping to change at page 28, line 18

as shown in the first two columns of table 1 clearly fails an ADK
test with 95% confidence.

The results of Implemnt_2 are now reduced by the difference of the
averages of column 2 (rounded to 6581 us) and column 1 (rounded to
5029 us), which is 1552 us.  The result may be found in column 3 of
table 1.  Comparing column 1 and column 3 of the table by an ADK test
shows that the data contained in these columns passes an ADK test
with 95% confidence.

>>> Comment: Extensive averaging was used in this example, because of
the vastly different sampling frequencies.  As a result, the
distributions compared do not exactly align with a metric in
[RFC2679], but illustrate the ADK process adequately. <<<
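The mean shift applied in this example (subtracting the 1552 us
difference of the sample averages from the results of Implemnt_2
before repeating the ADK test) could be coded as follows.  The
function adk_2_sample() stands in for the routine of Appendix B, and
its boolean interface is an assumption made for this sketch, not the
literal code of that appendix.

   #include <cstddef>
   #include <numeric>
   #include <vector>

   /* Assumed interface to the Anderson-Darling 2-sample routine of
    * Appendix B; the real code computes adk_result and compares it
    * against adk_criterium. */
   bool adk_2_sample(const std::vector<double>& sample1,
                     const std::vector<double>& sample2);

   /* Arithmetic mean of a non-empty sample. */
   static double mean(const std::vector<double>& v)
   {
       return std::accumulate(v.begin(), v.end(), 0.0) / v.size();
   }

   /* Remove the constant delay offset between two implementations
    * and test the shifted distributions for equivalence. */
   bool equivalent_after_mean_shift(const std::vector<double>& implemnt_1,
                                    std::vector<double> implemnt_2)
   {
       const double shift = mean(implemnt_2) - mean(implemnt_1);
       for (std::size_t i = 0; i < implemnt_2.size(); ++i)
           implemnt_2[i] -= shift;
       return adk_2_sample(implemnt_1, implemnt_2);
   }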
Appendix B.  Anderson-Darling 2 sample C++ code

   /* Routines for computing the Anderson-Darling 2 sample
    * test statistic.
    *
    * Implemented based on the description in
    * "Anderson-Darling K Sample Test" Heckert, Alan and
    * Filliben, James, editors, Dataplot Reference Manual,
    * Chapter 15 Auxiliary, NIST, 2004.
    * Official Reference by 2010
    * Heckert, N. A. (2001). Dataplot website at the
    * National Institute of Standards and Technology:
skipping to change at page 36, line 43

                 * n_total * (k - 1))
                 * (sum_adk_samp1 / n_sample1
                   + sum_adk_samp2 / n_sample2);

   /* if(adk_result <= adk_criterium)
    * adk_2_sample test is passed
    */

                               Figure 4
Appendix C.  A tunneling set up for remote metric implementation
             testing

Testing metric compliance is most convenient for the interested
parties if all of them can stay in their local test laboratories.
Figure 5 shows a test configuration which may enable remote metric
compliance testing.
[ASCII figure omitted; its layout is not recoverable from this diff.
It shows two remote sites, each with measurement clients/line cards
(LC10/LC11 on VLANs V10/V11, and LC20/LC21 on VLANs V20/V21) behind a
local Ethernet switch and a tunnel head router (A at site 1, B at
site 2), connected across the Internet, with an optional impairment
generator in the path between the tunnel head routers.  At each site
the remote site's VLANs are bridged: V20 to V21 at site 1, and V10 to
V11 at site 2.]

                               Figure 5
LC10 and the other LCxy denote measurement clients / line cards.  V10
and the others denote VLANs.  All VLANs are using the same tunnel
from A to B and in the reverse direction.  The remote site VLANs are
U-bridged at the local site Ethernet switch.  The measurement packets
of site 1 travel tunnel A->B first, are U-bridged at site 2 and
travel tunnel B->A second.  Measurement packets of site 2 travel
tunnel B->A first, are U-bridged at site 1 and travel tunnel A->B
second.  So all measurement packets pass the same tunnel segments,
but in a different segment order.  An experiment to prove or reject
the test set up shown in Figure 5 has been agreed, but not yet
scheduled, between Deutsche Telekom and RIPE.

Figure 5 includes an optional impairment generator.  If this
impairment generator is inserted in the IP path between the tunnel
head end routers, it equally impacts all measurement packets and
flows.  This avoids the trouble of ensuring an identical test set up
by configuring two separate impairment generators identically (which
was another proposal allowing remote metric compliance testing).
Appendix D.  Glossary

+-------------+-----------------------------------------------------+
| ADK         | Anderson-Darling K-Sample test, a test used to      |
|             | check whether two samples have the same statistical |
|             | distribution.                                        |
| ECMP        | Equal Cost Multipath, a load balancing mechanism    |
|             | evaluating MPLS label stacks, IP addresses and      |
|             | ports.                                               |
| EDF         | The "Empirical Distribution Function" of a set of   |
|             | scalar measurements is a function F(x) which for    |