--- 1/draft-ietf-ippm-metrictest-03.txt 2011-10-24 21:14:03.506670670 +0200
+++ 2/draft-ietf-ippm-metrictest-04.txt 2011-10-24 21:14:03.582670560 +0200
@@ -1,23 +1,23 @@
Internet Engineering Task Force R. Geib, Ed.
Internet-Draft Deutsche Telekom
Intended status: Standards Track A. Morton
-Expires: December 31, 2011 AT&T Labs
+Expires: April 26, 2012 AT&T Labs
R. Fardid
Cariden Technologies
A. Steinmitz
Deutsche Telekom
- June 29, 2011
+ October 24, 2011
IPPM standard advancement testing
- draft-ietf-ippm-metrictest-03
+ draft-ietf-ippm-metrictest-04
Abstract
This document specifies tests to determine if multiple independent
instantiations of a performance metric RFC have implemented the
specifications in the same way. This is the performance metric
equivalent of interoperability, required to advance RFCs along the
standards track. Results from different implementations of metric
RFCs will be collected under the same underlying network conditions
and compared using state of the art statistical methods. The goal is
@@ -33,21 +33,21 @@
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
- This Internet-Draft will expire on December 31, 2011.
+ This Internet-Draft will expire on April 26, 2012.
Copyright Notice
Copyright (c) 2011 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
@@ -76,21 +76,22 @@
4. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 22
5. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 22
6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 22
7. Security Considerations . . . . . . . . . . . . . . . . . . . 23
8. References . . . . . . . . . . . . . . . . . . . . . . . . . . 23
8.1. Normative References . . . . . . . . . . . . . . . . . . . 23
8.2. Informative References . . . . . . . . . . . . . . . . . . 24
Appendix A. An example on a One-way Delay metric validation . . . 25
A.1. Compliance to Metric specification requirements . . . . . 25
A.2. Examples related to statistical tests for One-way Delay . 27
- Appendix B. Anderson-Darling 2 sample C++ code . . . . . . . . . 29
+ Appendix B. Anderson-Darling K-sample Reference and 2 sample
+ C++ code . . . . . . . . . . . . . . . . . . . . . . 29
Appendix C. Glossary . . . . . . . . . . . . . . . . . . . . . . 37
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 38
1. Introduction
The Internet Standards Process RFC2026 [RFC2026] requires that for an
IETF specification to advance beyond the Proposed Standard level, at
least two genetically unrelated implementations must be shown to
interoperate correctly with all features and options. This
requirement can be met by supplying:
@@ -186,20 +187,25 @@
The metric RFC advancement process begins with a request for protocol
action accompanied by a memo that documents the supporting tests and
results. The procedures of [RFC2026] are expanded in [RFC5657],
including sample implementation and interoperability reports.
Section 3 of [morton-advance-metrics-01] can serve as a template for
a metric RFC report which accompanies the protocol action request to
the Area Director, including description of the test setup,
procedures, results for each implementation and conclusions.
+ Changes from WG-03 to WG-04:
+
+ o Revisions to Appendix B code and add reference to "R" in the
+ Appendix and the text of section 3.6.
+
Changes from WG-02 to WG-03:
o Changes stemming from experiments that implemented this plan, in
general.
o Adoption of the VLAN loopback figure in the main body of the memo
(section 3.2).
Changes from WG-01 to WG-02:
@@ -986,21 +992,23 @@
differences such that the connectivity differences of the cross-
implementation tests are also experienced and measured by the same
implementation.
Comparative results for the same implementation represent a bound on
cross-implementation equivalence. This should be particularly useful
when the metric does *not* produce a continuous distribution of
singleton values, such as with a loss metric, or a duplication
metric. Appendix A indicates how the ADK will work for One-way
delay, and should be likewise applicable to distributions of delay
- variation.
+ variation. Appendix B discusses two possible ways to perform the ADK
+ analysis, the R statistical language [Rtool] with ADK package [Radk]
+ and C++ code.
Proposal: the implementation with the largest difference in
homogeneous comparison results is the lower bound on the equivalence
threshold, noting that there may be other systematic errors to
account for when comparing between implementations.
Thus, when evaluating equivalence in cross-implementation results:
Maximum_Error = Same_Implementation_Error + Systematic_Error
@@ -1026,21 +1034,21 @@
Scott Bradner, Vern Paxson and Allison Mankin drafted bradner-
metrictest [bradner-metrictest], and major parts of it are included
in this document.
6. IANA Considerations
This memo includes no request to IANA.
7. Security Considerations
- This draft does not raise any specific security issues.
+ This memo does not raise any specific security issues.
8. References
8.1. Normative References
[RFC2003] Perkins, C., "IP Encapsulation within IP", RFC 2003,
October 1996.
[RFC2026] Bradner, S., "The Internet Standards Process -- Revision
3", BCP 9, RFC 2026, October 1996.
@@ -1105,20 +1113,29 @@
[GU+Duffield]
Gu, Y., Duffield, N., Breslau, L., and S. Sen, "GRE
Encapsulated Multicast Probing: A Scalable Technique for
Measuring One-Way Loss", SIGMETRICS'07 San Diego,
California, USA, June 2007.
[RFC5357] Hedayat, K., Krzanowski, R., Morton, A., Yum, K., and J.
Babiarz, "A Two-Way Active Measurement Protocol (TWAMP)",
RFC 5357, October 2008.
+ [Radk] Scholz, F., "adk: Anderson-Darling K-Sample Test and
+ Combinations of Such Tests. R package version 1.0.", ,
+ 2008.
+
+ [Rtool] R Development Core Team, "R: A language and environment
+ for statistical computing. R Foundation for Statistical
+ Computing, Vienna, Austria. ISBN 3-900051-07-0, URL
+ http://www.R-project.org/", , 2011.
+
[Rule of thumb]
Hardy, M., "Confidence interval", March 2010.
[bradner-metrictest]
Bradner, S., Mankin, A., and V. Paxson, "Advancement of
metrics specifications on the IETF Standards Track",
draft-bradner-metricstest-03, (work in progress),
July 2007.
[morton-advance-metrics]
@@ -1292,21 +1309,39 @@
table 1. Comparing column 1 and column 3 of the table by an ADK test
shows that the data contained in these columns passes an ADK test
with 95% confidence.
>>> Comment: Extensive averaging was used in this example, because of
the vastly different sampling frequencies. As a result, the
distributions compared do not exactly align with a metric in
[RFC2679], but illustrate the ADK process adequately.
-Appendix B. Anderson-Darling 2 sample C++ code
+Appendix B. Anderson-Darling K-sample Reference and 2 sample C++ code
+
+ There are many statistical tools available, and this Appendix
+ describes two that are familiar to the authors.
+
+ The "R tool" is a language and command-line environment for
+ statistical computing and plotting [Rtool]. With the optional "adk"
+ package installed [Radk], it can perform individual and combined
+ sample ADK computations. The user must consult the package
+ documentation and the original paper [ADK] to interpret the results,
+ but this is as it should be.
+
+ The C++ code below will perform a 2-sample AD comparison when
+ compiled and presented with two column vectors in a file (using white
+ space as separation). This version contains modifications to use the
+ vectors and run as a standalone module by Wes Eddy, Sept 2011. The
+ status of the comparison can be checked on the command line with "$
+ echo $?" or the last line can be replaced with a printf statement for
+ adk_result instead.
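The input format described above (two whitespace-separated columns) can be read as sketched below. This is a minimal illustration, not part of the draft's listing: the names TwoSamples and read_two_columns are hypothetical, and the sketch assumes each column may be sorted independently because the two samples are compared as unordered sets. It tests the stream after each read instead of calling eof() beforehand, so a trailing newline does not append an extra pair. The listing treats index 0 of each vector as a dummy value; a caller following that convention would need to prepend a placeholder element.

```cpp
#include <algorithm>
#include <cassert>
#include <istream>
#include <sstream>
#include <vector>

// Hypothetical helper type, not part of the draft's listing.
struct TwoSamples {
    std::vector<double> sample1;
    std::vector<double> sample2;
};

// Reads "x y" pairs until extraction fails. Checking the stream state
// after the read keeps trailing whitespace from adding a spurious pair.
TwoSamples read_two_columns(std::istream& in) {
    TwoSamples out;
    double a = 0.0, b = 0.0;
    while (in >> a >> b) {
        out.sample1.push_back(a);
        out.sample2.push_back(b);
    }
    // The listing expects ascending order, so sort each column here.
    std::sort(out.sample1.begin(), out.sample1.end());
    std::sort(out.sample2.begin(), out.sample2.end());
    return out;
}

// Small self-check on an in-memory stream.
void read_two_columns_self_check() {
    std::istringstream in("3.0 10.5\n1.0 12.0\n2.0 11.0\n");
    TwoSamples s = read_two_columns(in);
    assert(s.sample1.size() == 3 && s.sample2.size() == 3);
    assert(s.sample1.front() == 1.0 && s.sample2.back() == 12.0);
}
```

A caller could pass std::cin to read_two_columns to keep the command-line behaviour the paragraph describes.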
/* Routines for computing the Anderson-Darling 2 sample
* test statistic.
*
* Implemented based on the description in
* "Anderson-Darling K Sample Test" Heckert, Alan and
* Filliben, James, editors, Dataplot Reference Manual,
* Chapter 15 Auxiliary, NIST, 2004.
* Official Reference by 2010
* Heckert, N. A. (2001). Dataplot website at the
@@ -1315,146 +1350,132 @@
* June 2001.
*/
#include <iostream>
#include <fstream>
#include <math.h>
#include <vector>
using namespace std;
+ int main() {
vector<double> vec1, vec2;
double adk_result;
- double adk_criterium = 1.993;
-
- /* vec1 and vec2 to be initialised with sample 1 and
- * sample 2 values in ascending order.
- */
-
- /* example for iterating the vectors
- * for(vector<double>::iterator it = vec1->begin();
- * it != vec1->end(); it++
- * {
- * cout << *it << endl;
- * }
- */
-
static int k, val_st_z_samp1, val_st_z_samp2,
val_eq_z_samp1, val_eq_z_samp2,
j, n_total, n_sample1, n_sample2, L,
max_number_samples, line, maxnumber_z;
static int column_1, column_2;
static double adk, n_value, z, sum_adk_samp1,
sum_adk_samp2, z_aux;
static double H_j, F1j, hj, F2j, denom_1_aux, denom_2_aux;
static bool next_z_sample2, equal_z_both_samples;
static int stop_loop1, stop_loop2, stop_loop3,old_eq_line2,
old_eq_line1;
static double adk_criterium = 1.993;
+ /* vec1 and vec2 to be initialised with sample 1 and
+ * sample 2 values in ascending order */
+ while (!cin.eof()) {
+ double f1, f2;
+ cin >> f1;
+ cin >> f2;
+ vec1.push_back(f1);
+ vec2.push_back(f2);
+ }
+
k = 2;
- n_sample1 = vec1->size() - 1;
- n_sample2 = vec2->size() - 1;
+ n_sample1 = vec1.size() - 1;
+ n_sample2 = vec2.size() - 1;
// -1 because vec[0] is a dummy value

n_total = n_sample1 + n_sample2;
/* value equal to the line with a value = zj in sample 1.
* Here j=1, so the line is 1.
*/

val_eq_z_samp1 = 1;
/* value equal to the line with a value = zj in sample 2.
* Here j=1, so the line is 1.
*/

val_eq_z_samp2 = 1;
/* value equal to the last line with a value < zj
* in sample 1. Here j=1, so the line is 0.
*/

val_st_z_samp1 = 0;
/* value equal to the last line with a value < zj
* in sample 2. Here j=1, so the line is 0.
*/

val_st_z_samp2 = 0;
sum_adk_samp1 = 0;
sum_adk_samp2 = 0;
j = 1;
// as mentioned above, j=1

equal_z_both_samples = false;
+
next_z_sample2 = false;
//assuming the next z to be of sample 1

stop_loop1 = n_sample1 + 1;
// + 1 because vec[0] is a dummy, see n_sample1 declaration

stop_loop2 = n_sample2 + 1;
stop_loop3 = n_total + 1;
/* The required z values are calculated until all values
* of both samples have been taken into account. See the
* lines above for the stoploop values. Construct required
* to avoid a mathematical operation in the While condition
*/

while (((stop_loop1 > val_eq_z_samp1)
|| (stop_loop2 > val_eq_z_samp2)) && stop_loop3 > j)
{
if(val_eq_z_samp1 < n_sample1+1)
{

/* here, a preliminary zj value is set.
* See below how to calculate the actual zj.
*/

- z = (*vec1)[val_eq_z_samp1];
+ z = vec1[val_eq_z_samp1];
/* this while sequence calculates the number of values
* equal to z.
*/

while ((val_eq_z_samp1+1 < n_sample1)
- && z == (*vec1)[val_eq_z_samp1+1] )
+ && z == vec1[val_eq_z_samp1+1] )
{
val_eq_z_samp1++;
}
}
else
{
val_eq_z_samp1 = 0;
val_st_z_samp1 = n_sample1;
// this should be val_eq_z_samp1 - 1 = n_sample1
}
if(val_eq_z_samp2 < n_sample2+1)
{
- z_aux = (*vec2)[val_eq_z_samp2];;
+ z_aux = vec2[val_eq_z_samp2];;
/* this while sequence calculates the number of values
* equal to z_aux
*/
while ((val_eq_z_samp2+1 < n_sample2)
- && z_aux == (*vec2)[val_eq_z_samp2+1] )
+ && z_aux == vec2[val_eq_z_samp2+1] )
{
val_eq_z_samp2++;
}
/* the smaller of the two actual data values is picked
* as the next zj.
*/
if(z > z_aux)
{
@@ -1684,29 +1700,30 @@
next_z_sample2 = false;
equal_z_both_samples = false;
/* index to count the z. It is only required to prevent
* the while loop from executing endlessly
*/
j++;
}
// calculating the adk value is the final step.

adk_result = (double) (n_total - 1) / (n_total
* n_total * (k - 1))
* (sum_adk_samp1 / n_sample1
+ sum_adk_samp2 / n_sample2);
/* if(adk_result <= adk_criterium)
* adk_2_sample test is passed
*/
+ return adk_result <= adk_criterium;
+ }
Figure 5
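For readers who want to cross-check the listing, the 2-sample statistic can also be written compactly. The sketch below is an illustration under stated assumptions, not the draft's code: the function names ad2_statistic and mid_count are hypothetical, the vectors carry no dummy element at index 0, and the formula is the tie-aware form from the Dataplot reference as understood here, which should be verified against that reference and [ADK] before use. It returns only the statistic; comparing it with a criterion such as the listing's adk_criterium is left to the caller.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Number of values below z plus half the number equal to z, for a
// vector that is already sorted ascending. Hypothetical helper.
static double mid_count(const std::vector<double>& sorted, double z) {
    auto range = std::equal_range(sorted.begin(), sorted.end(), z);
    double below = static_cast<double>(range.first - sorted.begin());
    double equal = static_cast<double>(range.second - range.first);
    return below + equal / 2.0;
}

// 2-sample Anderson-Darling statistic with tie handling (assumed form):
//   (n-1)/(n^2 (k-1)) * sum_i (1/n_i) sum_j h_j (n F_ij - n_i H_j)^2
//                                           / (H_j (n - H_j) - n h_j / 4)
double ad2_statistic(std::vector<double> s1, std::vector<double> s2) {
    assert(!s1.empty() && !s2.empty());
    std::sort(s1.begin(), s1.end());
    std::sort(s2.begin(), s2.end());

    // Pooled sorted sample; its distinct values are the z_j.
    std::vector<double> pooled(s1);
    pooled.insert(pooled.end(), s2.begin(), s2.end());
    std::sort(pooled.begin(), pooled.end());

    const double n1 = static_cast<double>(s1.size());
    const double n2 = static_cast<double>(s2.size());
    const double n = n1 + n2;
    const int k = 2;

    double sum1 = 0.0, sum2 = 0.0;
    for (std::size_t i = 0; i < pooled.size();) {
        const double z = pooled[i];
        std::size_t next = i;
        while (next < pooled.size() && pooled[next] == z) ++next;
        const double h = static_cast<double>(next - i);    // ties at z
        const double H = static_cast<double>(i) + h / 2.0; // pooled mid count
        const double d1 = n * mid_count(s1, z) - n1 * H;
        const double d2 = n * mid_count(s2, z) - n2 * H;
        const double denom = H * (n - H) - n * h / 4.0;
        if (denom > 0.0) { // skip the degenerate all-values-equal case
            sum1 += h * d1 * d1 / denom;
            sum2 += h * d2 * d2 / denom;
        }
        i = next;
    }
    return (n - 1.0) / (n * n * (k - 1)) * (sum1 / n1 + sum2 / n2);
}
```

Two identical samples give a statistic of zero under this form, and well-separated samples give a clearly larger value, which makes a quick sanity check for any implementation.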
Appendix C. Glossary
+------+-----------------------------------------------------+
| ADK  | Anderson-Darling K-Sample test, a test used to      |
|      | check whether two samples have the same statistical |
|      | distribution.                                       |
| ECMP | Equal Cost Multipath, a load balancing mechanism    |