draft-ietf-ippm-metrictest-01.txt   draft-ietf-ippm-metrictest-02.txt 
Internet Engineering Task Force R. Geib, Ed. Internet Engineering Task Force R. Geib, Ed.
Internet-Draft Deutsche Telekom Internet-Draft Deutsche Telekom
Intended status: Standards Track A. Morton Intended status: Standards Track A. Morton
Expires: April 27, 2011 AT&T Labs Expires: September 15, 2011 AT&T Labs
R. Fardid R. Fardid
Cariden Technologies Cariden Technologies
A. Steinmitz A. Steinmitz
HS Fulda HS Fulda
October 24, 2010 March 14, 2011
IPPM standard advancement testing IPPM standard advancement testing
draft-ietf-ippm-metrictest-01 draft-ietf-ippm-metrictest-02
Abstract Abstract
This document specifies tests to determine if multiple independent This document specifies tests to determine if multiple independent
instantiations of a performance metric RFC have implemented the instantiations of a performance metric RFC have implemented the
specifications in the same way. This is the performance metric specifications in the same way. This is the performance metric
equivalent of interoperability, required to advance RFCs along the equivalent of interoperability, required to advance RFCs along the
standards track. Results from different implementations of metric standards track. Results from different implementations of metric
RFCs will be collected under the same underlying network conditions RFCs will be collected under the same underlying network conditions
and compared using state of the art statistical methods. The goal is and compared using state of the art statistical methods. The goal is
skipping to change at page 1, line 44 skipping to change at page 1, line 44
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on April 27, 2011. This Internet-Draft will expire on September 15, 2011.
Copyright Notice Copyright Notice
Copyright (c) 2010 IETF Trust and the persons identified as the Copyright (c) 2011 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
skipping to change at page 5, line 14 skipping to change at page 5, line 14
The metric RFC advancement process begins with a request for protocol The metric RFC advancement process begins with a request for protocol
action accompanied by a memo that documents the supporting tests and action accompanied by a memo that documents the supporting tests and
results. The procedures of [RFC2026] are expanded in [RFC5657], results. The procedures of [RFC2026] are expanded in [RFC5657],
including sample implementation and interoperability reports. including sample implementation and interoperability reports.
Section 3 of [morton-advance-metrics-01] can serve as a template for Section 3 of [morton-advance-metrics-01] can serve as a template for
a metric RFC report which accompanies the protocol action request to a metric RFC report which accompanies the protocol action request to
the Area Director, including description of the test set-up, the Area Director, including description of the test set-up,
procedures, results for each implementation and conclusions. procedures, results for each implementation and conclusions.
Changes from WG-01 to WG-02:
o Clarification of the number of test streams recommended in section
3.2.
o Clarifications on testing details in sections 3.3 and 3.4.
o Spelling corrections throughout.
Changes from WG -00 to WG -01 draft Changes from WG -00 to WG -01 draft
o Discussion on merits and requirements of a distributed lab test o Discussion on merits and requirements of a distributed lab test
using only local load generators. using only local load generators.
o Proposal of metrics suitable for tests using the proposed o Proposal of metrics suitable for tests using the proposed
measurement configuration. measurement configuration.
o Hint on delay caused by software based L2TPv3 implementation. o Hint on delay caused by software based L2TPv3 implementation.
o Added an appendix with a test configuration allowing remote tests o Added an appendix with a test configuration allowing remote tests
comparing different implementations accross the network. comparing different implementations across the network.
o Proposal for maximum error of "equivalence", based on performance o Proposal for maximum error of "equivalence", based on performance
comparison of identical implementations. This may be useful for comparison of identical implementations. This may be useful for
both ADK and non-ADK comparisons. both ADK and non-ADK comparisons.
Changes from prior ID -02 to WG -00 draft Changes from prior ID -02 to WG -00 draft
o Incorporation of aspects of reporting to support the protocol o Incorporation of aspects of reporting to support the protocol
action request in the Introduction and section 3.5 action request in the Introduction and section 3.5
o Overhaul of sectcion 3.2 regarding tunneling: Added generic o Overhaul of section 3.2 regarding tunneling: Added generic
tunneling requirements and L2TPv3 as an example tunneling tunneling requirements and L2TPv3 as an example tunneling
mechanism fulfilling the tunneling requirements. Removed and mechanism fulfilling the tunneling requirements. Removed and
adapted some of the prior references to other tunneling protocols adapted some of the prior references to other tunneling protocols
o Softened a requirement within section 3.4 (MUST to SHOULD on o Softened a requirement within section 3.4 (MUST to SHOULD on
precision) and removed some comments of the authors. precision) and removed some comments of the authors.
o Updated contact information of one author and added a new author. o Updated contact information of one author and added a new author.
o Added example C++ code of an Anderson-Darling two sample test o Added example C++ code of an Anderson-Darling two sample test
implementation. implementation.
Changes from ID -01 to ID -02 version Changes from ID -01 to ID -02 version
o Major editorial review, rewording and clarifications on all o Major editorial review, rewording and clarifications on all
contents. contents.
o Additional text on parrallel testing using VLANs and GRE or o Additional text on parallel testing using VLANs and GRE or
Pseudowire tunnels. Pseudowire tunnels.
o Additional examples and a glossary. o Additional examples and a glossary.
Changes from ID -00 to ID -01 version Changes from ID -00 to ID -01 version
o Addition of a comparison of individual metric implementations o Addition of a comparison of individual metric implementations
against the metric specification (trying to pick up problems and against the metric specification (trying to pick up problems and
solutions for metric advancement [morton-advance-metrics]). solutions for metric advancement [morton-advance-metrics]).
skipping to change at page 9, line 25 skipping to change at page 9, line 34
other ones (due to travel and/or shipping+installation). For the other ones (due to travel and/or shipping+installation). For the
third option, ensuring two identically configured impairment third option, ensuring two identically configured impairment
generators requires well defined test cases and possibly identical generators requires well defined test cases and possibly identical
hard- and software. >>>Comment: for some specific tests, impairment hard- and software. >>>Comment: for some specific tests, impairment
generator accuracy requirements are less-demanding than others, and generator accuracy requirements are less-demanding than others, and
in such cases there is more flexibility in impairment generator in such cases there is more flexibility in impairment generator
configuration. <<< configuration. <<<
It is a fair question, whether the last two options can result in any It is a fair question, whether the last two options can result in any
applicable test set up at all. While an experimental approach is applicable test set up at all. While an experimental approach is
given in Appendix C, the tradeoff that measurement packets of given in Appendix C, the trade off that measurement packets of
different sites pass the path segments but always in a different different sites pass the path segments but always in a different
order of segments probably can't be avoided. order of segments probably can't be avoided.
The question of which option above results in identical networking The question of which option above results in identical networking
conditions and is broadly accepted can't be answered without more conditions and is broadly accepted can't be answered without more
practical experience in comparing implementations. The last proposal practical experience in comparing implementations. The last proposal
has the advantage that, while the measurement equipment is remotely has the advantage that, while the measurement equipment is remotely
distributed, a single network impairment generator and the Internet distributed, a single network impairment generator and the Internet
can be used in combination to impact all measurement flows. can be used in combination to impact all measurement flows.
skipping to change at page 9, line 51 skipping to change at page 10, line 11
compliant to the latter. compliant to the latter.
Further, supported options of a metric implementation SHOULD be Further, supported options of a metric implementation SHOULD be
documented in sufficient detail. The documentation of chosen options documented in sufficient detail. The documentation of chosen options
is RECOMMENDED to minimise (and recognise) differences in the test is RECOMMENDED to minimise (and recognise) differences in the test
setup if two metric implementations are compared. Further, this setup if two metric implementations are compared. Further, this
documentation is used to validate and improve the underlying metric documentation is used to validate and improve the underlying metric
specification option, to remove options which saw no implementation specification option, to remove options which saw no implementation
or which are badly specified from the metric specification to be or which are badly specified from the metric specification to be
promoted to a standard. This documentation SHOULD be made for all promoted to a standard. This documentation SHOULD be made for all
implementation relevant specifications of a metric picked for a implementation-relevant specifications of a metric picked for a
comparison, which aren't explicitly marked as "MUST" or "REQUIRED" in comparison that are not explicitly marked as "MUST" or "REQUIRED" in
the metric specification. This applies for the following sections of the RFC text. This applies for the following sections of all metric
all metric specifications: specifications:
o Singleton Definition of the Metric. o Singleton Definition of the Metric.
o Sample Definition of the Metric. o Sample Definition of the Metric.
o Statistics Definition of the Metric. As statistics are compared o Statistics Definition of the Metric. As statistics are compared
by the test specified here, this documentation is required even in by the test specified here, this documentation is required even in
the case, that the metric specification does not contain a the case, that the metric specification does not contain a
Statistics Definition. Statistics Definition.
skipping to change at page 11, line 26 skipping to change at page 11, line 31
different flows in the network. Measuring by separate parallel probe different flows in the network. Measuring by separate parallel probe
flows results in repeated collection of data. If both measures are flows results in repeated collection of data. If both measures are
combined, WAN network conditions are identical for a number of combined, WAN network conditions are identical for a number of
independent measurement flows, no matter what the network conditions independent measurement flows, no matter what the network conditions
are in detail. are in detail.
Any measurement setup MUST be made to avoid the probing traffic Any measurement setup MUST be made to avoid the probing traffic
itself to impede the metric measurement. The created measurement itself to impede the metric measurement. The created measurement
load MUST NOT result in congestion at the access link connecting the load MUST NOT result in congestion at the access link connecting the
measurement implementation to the WAN. The created measurement load measurement implementation to the WAN. The created measurement load
MUST NOT overload the measurement implementation itself, eg. by MUST NOT overload the measurement implementation itself, e.g., by
causing a high CPU load or by creating imprecisions due to internal causing a high CPU load or by creating imprecisions due to internal
transmit (receive respectively) probe packet collisions. transmit (receive respectively) probe packet collisions.
Tunneling multiple flows reaching a network element on a single Tunneling multiple flows reaching a network element on a single
physical port may allow to transmit all packets of the tunnel via the physical port may allow to transmit all packets of the tunnel via the
same path. Applying tunnels to avoid undesired influence of standard same path. Applying tunnels to avoid undesired influence of standard
routing for measurement purposes is a concept known from literature, routing for measurement purposes is a concept known from literature,
see e.g. GRE encapsulated multicast probing [GU+Duffield]. An see e.g. GRE encapsulated multicast probing [GU+Duffield]. An
existing IP in IP tunnel protocol can be applied to avoid Equal-Cost existing IP in IP tunnel protocol can be applied to avoid Equal-Cost
Multi-Path (ECMP) routing of different measurement streams if it Multi-Path (ECMP) routing of different measurement streams if it
meets the following criteria: meets the following criteria:
o Inner IP packets from different measurement implementations are o Inner IP packets from different measurement implementations are
mapped into a single tunnel with single outer IP origin and mapped into a single tunnel with single outer IP origin and
destination address as well as origing and destination port destination address as well as origin and destination port numbers
numbers which are identical for all packets. which are identical for all packets.
o An easily accessible commodity tunneling protocol allows to carry o An easily accessible commodity tunneling protocol allows to carry
out a metric test from more test sites. out a metric test from more test sites.
o A low operational overhead may enable a broader audience to set up o A low operational overhead may enable a broader audience to set up
a metric test with the desired properties. a metric test with the desired properties.
o The tunneling protocol should be reliable and stable in set up and o The tunneling protocol should be reliable and stable in set up and
operation to avoid disturbances or influence on the test results. operation to avoid disturbances or influence on the test results.
o The tunneling protocol should not incurr any extra cost for those o The tunneling protocol should not incur any extra cost for those
interested in setting up a metric test. interested in setting up a metric test.
An illustration of a test setup with two tunnels and two flows An illustration of a test setup with two tunnels and two flows
between two linecards of one implementation is given in Figure 1. between two linecards of one implementation is given in Figure 1.
Implementation ,---. +--------+ Implementation ,---. +--------+
+~~~~~~~~~~~/ \~~~~~~| Remote | +~~~~~~~~~~~/ \~~~~~~| Remote |
+------->-----F2->-| / \ |->---+ | +------->-----F2->-| / \ |->---+ |
| +---------+ | Tunnel 1( ) | | | | +---------+ | Tunnel 1( ) | | |
| | transmit|-F1->-| ( ) |->+ | | | | transmit|-F1->-| ( ) |->+ | |
skipping to change at page 14, line 21 skipping to change at page 14, line 21
The applicability of one or more of the following tunneling protocols The applicability of one or more of the following tunneling protocols
may be investigated by interested parties if Ethernet over L2TPv3 is may be investigated by interested parties if Ethernet over L2TPv3 is
felt to be not suitable: IP in IP [RFC2003] or Generic Routing felt to be not suitable: IP in IP [RFC2003] or Generic Routing
Encapsulation (GRE) [RFC2784]. RFC 4928 [RFC4928] proposes measures Encapsulation (GRE) [RFC2784]. RFC 4928 [RFC4928] proposes measures
how to avoid ECMP treatment in MPLS networks. how to avoid ECMP treatment in MPLS networks.
L2TP is a commodity tunneling protocol [RFC2661]. By the time of L2TP is a commodity tunneling protocol [RFC2661]. By the time of
writing, L2TPv3 [RFC3931] is the latest version of L2TP. If L2TPv3 is writing, L2TPv3 [RFC3931] is the latest version of L2TP. If L2TPv3 is
applied, software based implementations of this protocol are not applied, software based implementations of this protocol are not
suitable for the test set up, as such implementations may cause suitable for the test set up, as such implementations may cause
uncalculable delay shifts. incalculable delay shifts.
Ethernet Pseudo Wires may also be set up on MPLS networks [RFC4448]. Ethernet Pseudo Wires may also be set up on MPLS networks [RFC4448].
While there's no technical issue with this solution, MPLS interfaces While there's no technical issue with this solution, MPLS interfaces
are mostly found in the network provider domain. Hence not all of are mostly found in the network provider domain. Hence not all of
the above tunneling criteria are met. the above tunneling criteria are met.
Appendix C provides an experimental tunneling set up for metric Appendix C provides an experimental tunneling set up for metric
implementation testing between two (or more) remote sites. implementation testing between two (or more) remote sites.
Each test is repeated several times. WAN conditions may change over Each test SHOULD be conducted multiple times. Sequential testing is
time. Sequential testing is desirable, but may not be a useful possible, but may not be a useful metric test option because WAN
metric test option. It is RECOMMENDED that tests be carried out by conditions are likely to change over time. It is RECOMMENDED that
establishing N different parallel measurement flows. Two or three tests be carried out by establishing at least 2 different parallel
linecards per implementation serving to send or receive measurement measurement flows. Two linecards per implementation that send and
flows should be sufficient to create 5 or more parallel measurement receive measurement flows should be sufficient to create 4 parallel
flows. If three linecards are used, each card sends and receives 2 measurement flows (when each card sends and receives 2 flows). Other
flows. Other options are to separate flows by DiffServ marks options are to separate flows by DiffServ marks (without deploying
(without deploying any QoS in the inner or outer tunnel) or using a any QoS in the inner or outer tunnel) or using a single CBR flow and
single CBR flow and evaluating every n-th singleton to belong to a evaluating every n-th singleton to belong to a specific measurement
specific measurement flow. flow.
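The last option above, treating every n-th singleton of a single CBR flow as one measurement flow, can be sketched as follows. This is a minimal illustration only: the function name split_cbr_flow, the sample delay values and the choice of n = 4 are assumptions made for this example and are not taken from the draft.

```python
def split_cbr_flow(singletons, n):
    """Assign every n-th singleton of one CBR flow to the same sub-flow.

    `singletons` is assumed to be ordered by sending time.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # Sub-flow k receives singletons k, k+n, k+2n, ... in sending order.
    return [singletons[k::n] for k in range(n)]


# Hypothetical one-way delay singletons in milliseconds.
delays_ms = [10.1, 10.3, 10.2, 10.4, 10.0, 10.5, 10.2, 10.3]
flows = split_cbr_flow(delays_ms, 4)
```

Because the sub-flows are interleaved in time, each one samples the same path over the same time interval, which is the property the parallel-flow recommendation is after.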
Some additional rules to calculate and compare samples have to be Some additional rules to calculate and compare samples have to be
respected to perform a metric test: respected to perform a metric test:
o To compare different probes of a common underlying distribution in o To compare different probes of a common underlying distribution in
terms of metrics characterising a communication network requires terms of metrics characterising a communication network requires
to respect the temporal nature for which the assumption of common to respect the temporal nature for which the assumption of common
underlying distribution may hold. Any singletons or samples to be underlying distribution may hold. Any singletons or samples to be
compared MUST be captured within the same time interval. compared MUST be captured within the same time interval.
skipping to change at page 15, line 28 skipping to change at page 15, line 28
implementation. Note that the Anderson-Darling test detects small implementation. Note that the Anderson-Darling test detects small
differences in distributions fairly well and will fail for high differences in distributions fairly well and will fail for high
number of compared results (RFC2330 mentions an example with 8192 number of compared results (RFC2330 mentions an example with 8192
measurements where an Anderson-Darling test always failed). measurements where an Anderson-Darling test always failed).
o Generally, the Anderson-Darling test is sensitive to differences o Generally, the Anderson-Darling test is sensitive to differences
in the accuracy or bias associated with varying implementations or in the accuracy or bias associated with varying implementations or
test conditions. These dissimilarities may result in differing test conditions. These dissimilarities may result in differing
averages of samples to be compared. An example may be different averages of samples to be compared. An example may be different
packet sizes, resulting in a constant delay difference between packet sizes, resulting in a constant delay difference between
compared samples. Therefore samples to be compared by an Anderson compared samples. Therefore samples to be compared by an
Darling test MAY be calibrated by the difference of the average Anderson-Darling test MAY be calibrated by the difference of the
values of the samples. Any calibration of this kind MUST be average values of the samples. Any calibration of this kind MUST
documented in the test result. be documented in the test result.
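The calibration by the difference of the sample averages mentioned in the last bullet can be sketched as below. This is a hedged illustration, not the draft's procedure in code: the function name calibrate_by_mean and the example values are hypothetical, and the returned offset is the quantity that would have to be documented in the test result.

```python
from statistics import mean


def calibrate_by_mean(sample_a, sample_b):
    """Shift sample_b so that its average equals the average of sample_a.

    Returns the shifted copy of sample_b and the offset that was removed.
    """
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    offset = mean(sample_b) - mean(sample_a)
    calibrated_b = [x - offset for x in sample_b]
    return calibrated_b, offset


# Hypothetical delay samples (ms) with a constant difference,
# as could result from different packet sizes.
sample_a = [10.0, 10.2, 10.4]
sample_b = [10.5, 10.7, 10.9]
calibrated_b, offset = calibrate_by_mean(sample_a, sample_b)
```

The calibrated sample, not the raw one, would then be passed to the Anderson-Darling comparison.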
3.3. Tests of two or more different implementations against a metric 3.3. Tests of two or more different implementations against a metric
specification specification
RFC2330 expects "a methodology for a given metric [to] exhibit RFC2330 expects "a methodology for a given metric [to] exhibit
continuity if, for small variations in conditions, it results in continuity if, for small variations in conditions, it results in
small variations in the resulting measurements. Slightly more small variations in the resulting measurements. Slightly more
precisely, for every positive epsilon, there exists a positive delta, precisely, for every positive epsilon, there exists a positive delta,
such that if two sets of conditions are within delta of each other, such that if two sets of conditions are within delta of each other,
then the resulting measurements will be within epsilon of each then the resulting measurements will be within epsilon of each
other." A small variation in conditions in the context of the metric other." A small variation in conditions in the context of the metric
test proposed here can be seen as different implementations measuring test proposed here can be seen as different implementations measuring
the same metric along the same path. the same metric along the same path.
IPPM metric specification however allow for implementor options to IPPM metric specifications however allow for implementor options to
the largest possible degree. It can't be expected that two the largest possible degree. It can not be expected that two
implementors pick identical options for the implementations. implementors pick identical value ranges in options for the
Implementors SHOULD to the highest degree possible pick the same implementations. Implementors SHOULD to the highest degree possible
configurations for their systems when comparing their implementations pick the same configurations for their systems when comparing their
by a metric test. implementations by a metric test.
In some cases, a goodness of fit test may not be possible or show In some cases, a goodness of fit test may not be possible or show
disappointing results. To clarify the difficulties arising from disappointing results. To clarify the difficulties arising from
different implementation options, the individual options picked for different implementation options, the individual options picked for
every compared implementation SHOULD be documented in sufficient every compared implementation SHOULD be documented in sufficient
detail. Based on this documentation, the underlying metric detail. Based on this documentation, the underlying metric
specification should be improved before it is promoted to a standard. specification should be improved before it is promoted to a standard.
The same statistical test as applicable to quantify precision of a The same statistical test as applicable to quantify precision of a
single metric implementation MUST be passed to compare metric single metric implementation MUST be used to compare metric result
conformance of different implementations. To document compatibility, equivalence for different implementations. To document
the smallest measurement resolution at which the compared compatibility, the smallest measurement resolution at which the
implementations passed the ADK sample test MUST be documented. compared implementations passed the ADK sample test MUST be
documented.
For different implementations of the same metric, "variations in For different implementations of the same metric, "variations in
conditions" are reasonably expected. The ADK test comparing samples conditions" are reasonably expected. The ADK test comparing samples
of the different implementations may result in a lower precision than of the different implementations MAY result in a lower precision than
the test for precision of each implementation individually. the test for precision in the same-implementation comparison.
3.4. Clock synchronisation 3.4. Clock synchronisation
Clock synchronization effects require special attention. Accuracy of Clock synchronization effects require special attention. Accuracy of
one-way active delay measurements for any metrics implementation one-way active delay measurements for any metrics implementation
depends on clock synchronization between the source and destination depends on clock synchronization between the source and destination
of tests. Ideally, one-way active delay measurement (RFC 2679, of tests. Ideally, one-way active delay measurement (RFC 2679,
[RFC2679]) test endpoints either have direct access to independent [RFC2679]) test endpoints either have direct access to independent
GPS or CDMA-based time sources or indirect access to nearby NTP GPS or CDMA-based time sources or indirect access to nearby NTP
primary (stratum 1) time sources, equipped with GPS receivers. primary (stratum 1) time sources, equipped with GPS receivers.
skipping to change at page 17, line 11 skipping to change at page 17, line 12
Examination of the second condition requires RTT measurement for Examination of the second condition requires RTT measurement for
reference, e.g., based on TWAMP (RFC 5357 [RFC5357]), in reference, e.g., based on TWAMP (RFC 5357 [RFC5357]), in
conjunction with one-way delay measurement. conjunction with one-way delay measurement.
Specification of X% to strike a balance between identification of Specification of X% to strike a balance between identification of
unreliable one-way delay samples and misidentification of reliable unreliable one-way delay samples and misidentification of reliable
samples under a wide range of Internet path RTTs probably requires samples under a wide range of Internet path RTTs probably requires
further study. further study.
An IPPM compliant metric implementation whose measurement requires An implementation of an RFC that requires synchronized clocks is
synchronized clocks is however expected to provide precise expected to provide precise measurement results in order to claim
measurement results. Any IPPM metric implementation SHOULD be of a that the metric measured is compliant.
precision of 1 ms (+/- 500 us) with a confidence of 95% if the metric
is captured along an Internet path which is stable and not congested IF an implementation publishes a specification of its precision, such
during a measurement duration of an hour or more. as "a precision of 1 ms (+/- 500 us) with a confidence of 95%", then
the specification SHOULD be met over a useful measurement duration.
For example, if the metric is measured along an Internet path which
is stable and not congested, then the precision specification SHOULD
be met over durations of an hour or more.
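One possible way to check a published precision specification of the form "+/- 500 us with a confidence of 95%" is sketched below. The interpretation used here is an assumption, not a definition from the draft: the specification is taken to be met when at least 95% of the observed errors against a reference measurement fall within +/- 500 us. The function name meets_precision and the example values are hypothetical.

```python
def meets_precision(errors_us, half_width_us=500.0, confidence=0.95):
    """Return True if the required share of errors lies within the bound.

    `errors_us` holds the differences (in microseconds) between measured
    values and a reference; how the reference is obtained is out of scope.
    """
    if not errors_us:
        raise ValueError("at least one error value is required")
    within = sum(1 for e in errors_us if abs(e) <= half_width_us)
    return within / len(errors_us) >= confidence


# Hypothetical error values: 97 small errors and 3 outliers.
good_run = [100.0] * 97 + [900.0] * 3
# Hypothetical error values: 90 small errors and 10 outliers.
bad_run = [100.0] * 90 + [900.0] * 10
```

To cover the duration aspect of the text, such a check would be applied to errors collected over the whole measurement interval, for example an hour on a stable and uncongested path.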
3.5. Recommended Metric Verification Measurement Process 3.5. Recommended Metric Verification Measurement Process
In order to meet their obligations under the IETF Standards Process In order to meet their obligations under the IETF Standards Process
the IESG must be convinced that each metric specification advanced to the IESG must be convinced that each metric specification advanced to
Draft Standard or Internet Standard status is clearly written, that Draft Standard or Internet Standard status is clearly written, that
there are the required multiple verifiably equivalent there are a sufficient number of verified equivalent
implementations, and that all options have been implemented. implementations, and that all options have been implemented.
In the context of this document, metrics are designed to measure some In the context of this document, metrics are designed to measure some
characteristic of a data network. An aim of any metric definition characteristic of a data network. An aim of any metric definition
should be that it should be specified in a way that can reliably should be that it should be specified in a way that can reliably
measure the specific characteristic in a repeatable way across measure the specific characteristic in a repeatable way across
multiple independent implementations. multiple independent implementations.
Each metric, statistic or option of those to be validated MUST be Each metric, statistic or option of those to be validated MUST be
compared against a reference measurement or another implementation by compared against a reference measurement or another implementation by
skipping to change at page 21, line 19 skipping to change at page 21, line 24
up than described here. Spatial and temporal effects combine in the up than described here. Spatial and temporal effects combine in the
case of packet re-ordering and measurements with different packet case of packet re-ordering and measurements with different packet
rates may always lead to different results. rates may always lead to different results.
As specified above, 5 singletons are the recommended basis to As specified above, 5 singletons are the recommended basis to
minimise interference of random events with the statistical test minimise interference of random events with the statistical test
proposed by this document. In the case of ratio measurements (like proposed by this document. In the case of ratio measurements (like
packet loss), the underlying sum of basic events, against which packet loss), the underlying sum of basic events, against which
the metric's monitored singletons are "rated", determines the the metric's monitored singletons are "rated", determines the
resolution of the test. A packet loss statistic with a resolution of resolution of the test. A packet loss statistic with a resolution of
1% requires one packet loss statistic-datapoint to consist of 500 1% requires one packet loss statistic-data point to consist of 500
delay singletons (of which at least 5 were lost). To compare EDFs on delay singletons (of which at least 5 were lost). To compare EDFs on
packet loss requires one hundred such statistics per flow. That packet loss requires one hundred such statistics per flow. That
means, all in all at least 50 000 delay singletons are required per means, all in all at least 50 000 delay singletons are required per
single measurement flow. Live network packet loss is assumed to be single measurement flow. Live network packet loss is assumed to be
present during main traffic hours only. Let this interval be 5 present during main traffic hours only. Let this interval be 5
hours. The required minimum rate of a single measurement flow in hours. The required minimum rate of a single measurement flow in
that case is 2.8 packets/sec (assuming a loss of 1% during 5 hours). that case is 2.8 packets/sec (assuming a loss of 1% during 5 hours).
If this measurement is too demanding under live network conditions, If this measurement is too demanding under live network conditions,
an impairment generator should be used. an impairment generator should be used.
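The arithmetic behind the 2.8 packets/sec figure can be reproduced with a short sketch. The input values (5 lost packets per data point, 1% resolution, one hundred data points per flow, a 5 hour interval) are taken from the paragraph above; the function and parameter names are assumptions made for illustration only.

```python
def min_flow_rate(min_lost_per_point=5, loss_percent=1,
                  points_per_flow=100, interval_hours=5):
    """Return (singletons per data point, singletons per flow, packets/sec)."""
    # Singletons needed so that min_lost_per_point losses correspond to
    # loss_percent of one data point (integer ceiling division).
    per_point = -(-min_lost_per_point * 100 // loss_percent)
    # One hundred such data points are needed to compare EDFs.
    per_flow = per_point * points_per_flow
    # Spread the singletons evenly over the interval with expected loss.
    rate = per_flow / (interval_hours * 3600)
    return per_point, per_flow, rate


per_point, per_flow, rate = min_flow_rate()
print(per_point, per_flow, round(rate, 2))
```

With the default values this gives 500 singletons per data point, 50 000 singletons per flow, and roughly 2.78 packets/sec, which the text rounds to 2.8.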
skipping to change at page 22, line 11 skipping to change at page 22, line 16
singleton values, such as with a loss metric, or a duplication singleton values, such as with a loss metric, or a duplication
metric. Appendix A indicates how the ADK will work for One-way metric. Appendix A indicates how the ADK will work for One-way
delay, and should be likewise applicable to distributions of delay delay, and should be likewise applicable to distributions of delay
variation. variation.
Proposal: the implementation with the largest difference in Proposal: the implementation with the largest difference in
homogeneous comparison results is the lower bound on the equivalence homogeneous comparison results is the lower bound on the equivalence
threshold, noting that there may be other systematic errors to threshold, noting that there may be other systematic errors to
account for when comparing between implementations. account for when comparing between implementations.
Thus, when evaluationg equivalence in cross-implementation results: Thus, when evaluating equivalence in cross-implementation results:
Maximum_Error = Same_Implementation_Error + Systematic_Error Maximum_Error = Same_Implementation_Error + Systematic_Error
and only the systematic error need be decided beforehand. and only the systematic error need be decided beforehand.
In the case of ADK comparison, the largest same-implementation In the case of ADK comparison, the largest same-implementation
resolution of distribution equivalence can be used as a limit on resolution of distribution equivalence can be used as a limit on
cross-implementation resolutions (at the same confidence level). cross-implementation resolutions (at the same confidence level).
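The Maximum_Error relation above can be sketched as follows, assuming that the same-implementation error is taken as the largest difference observed in the homogeneous comparisons, as the proposal suggests. The function names and the numeric values are hypothetical and serve only to illustrate the relation.

```python
def maximum_error(same_implementation_errors, systematic_error):
    """Maximum_Error = Same_Implementation_Error + Systematic_Error.

    same_implementation_errors -- differences seen when comparing each
                                  implementation against itself
    systematic_error           -- value agreed on beforehand
    """
    # The largest same-implementation difference is the lower bound
    # on the equivalence threshold.
    return max(same_implementation_errors) + systematic_error


def is_equivalent(cross_implementation_difference,
                  same_implementation_errors, systematic_error):
    """True if a cross-implementation difference is within Maximum_Error."""
    limit = maximum_error(same_implementation_errors, systematic_error)
    return abs(cross_implementation_difference) <= limit


# Made-up values in the metric's own unit, for illustration only.
homogeneous = [0.25, 0.5]
print(maximum_error(homogeneous, 0.125))
print(is_equivalent(0.5, homogeneous, 0.125))
```

Only systematic_error has to be chosen before the test; the other term follows from the homogeneous comparison results.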
4. Acknowledgements 4. Acknowledgements
 End of changes. 25 change blocks. 
54 lines changed or deleted 69 lines changed or added
