Network Working Group                                     B. Constantine
Internet-Draft                                                       JDSU
Intended status: Informational                                  G. Forget
Expires: March 24, 2011                    Bell Canada (Ext. Consultant)
                                                         Reinhard Schrage
                                                       Schrage Consulting
                                                       September 24, 2010


                  Framework for TCP Throughput Testing
               draft-ietf-ippm-tcp-throughput-tm-07.txt
Abstract

This document describes a framework for measuring sustained TCP
throughput performance in an end-to-end managed network environment.
This document is intended to provide a practical methodology to help
users validate the TCP layer performance of a managed network, which
should provide a better indication of end-user experience. In the
framework, various TCP and network parameters are identified that
should be tested as part of the network verification at the TCP layer.

Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].

Status of this Memo
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

This Internet-Draft will expire on March 24, 2011.

Copyright Notice

Copyright (c) 2010 IETF Trust and the persons identified as the
document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents

   1. Introduction
      1.1 Test Set-up and Terminology
   2. Scope and Goals of this Methodology
      2.1 TCP Equilibrium State Throughput
      2.2 Metrics for TCP Throughput Tests
   3. TCP Throughput Testing Methodology
      3.1 Determine Network Path MTU
      3.2 Baseline Round Trip Time and Bandwidth
         3.2.1 Techniques to Measure Round Trip Time
         3.2.2 Techniques to Measure End-end Bandwidth
      3.3 TCP Throughput Tests
         3.3.1 Calculate Ideal TCP Window Size
         3.3.2 Conducting the TCP Throughput Tests
         3.3.3 Single vs. Multiple TCP Connection Testing
         3.3.4 Interpretation of the TCP Throughput Results
      3.4 Traffic Management Tests
         3.4.1 Traffic Shaping Tests
            3.4.1.1 Interpretation of Traffic Shaping Test Results
         3.4.2 RED Tests
            3.4.2.1 Interpretation of RED Results
   4. Security Considerations
   5. IANA Considerations
      5.1 Registry Specification
      5.2 Registry Contents
   6. Acknowledgments
   7. References
      7.1 Normative References
      7.2 Informative References
   Authors' Addresses
1. Introduction

Network providers are coming to the realization that Layer 2/3
testing and TCP layer testing are required to more adequately ensure
end-user satisfaction. Testing an operational network prior to
customer activation is referred to as "turn-up" testing, and the SLA
(Service Level Agreement) is generally based upon Layer 2/3
information rate, packet delay, loss, and delay variation. Therefore,
the network provider community desires to measure network throughput
performance at the TCP layer. Measuring TCP throughput provides a
meaningful measure of the end-user experience (and can ultimately
help reach a level of TCP testing interoperability, which does not
exist today).
Additionally, end-users (business enterprises) seek to conduct
repeatable TCP throughput tests between enterprise locations. Since
these enterprises rely on the networks of the providers, a common
test methodology (and metrics) would be equally beneficial to both
parties.

The intent behind this TCP throughput methodology is to define a
method for testing sustained TCP layer performance. In this document,
sustained TCP throughput is the amount of data per unit time that TCP
transports during equilibrium (steady state), i.e. after the initial
slow start phase. We refer to this state as TCP Equilibrium, and the
equilibrium throughput is the maximum achievable for the TCP
connection(s).

There are many variables to consider when conducting a TCP throughput
test, and this methodology focuses on some of the most common
parameters that MUST be considered, such as:

- Path MTU and Maximum Segment Size (MSS)
- RTT and Bottleneck BW
- Ideal TCP Window (Bandwidth Delay Product)
- Single Connection and Multiple Connection testing
This methodology proposes a test which SHOULD be performed in
addition to traditional Layer 2/3 type tests, which are conducted to
verify the integrity of the network before conducting TCP tests.
Examples include iperf (UDP mode) or manual packet layer test
techniques where packet throughput, loss, and delay measurements are
conducted. When available, standardized testing similar to RFC 2544
[RFC2544] but adapted for use on operational networks may be used
(because RFC 2544 methods are not intended for use outside the lab
environment).
1.1 Test Set-up and Terminology

This section provides a general overview of the test configuration
for this methodology. The test is intended to be conducted on an
end-end operational network, so there is a multitude of network
architectures and topologies that can be tested. The test set-up
diagram below is very general, and its main intent is to illustrate
the segmentation of the end-user and network-provider domains.

Common terminologies used in the test methodology are:

- Customer Provided Equipment (CPE), refers to customer owned
  equipment
- Customer Edge (CE), refers to the provider owned demarcation device
- Provider Edge (PE), refers to provider located distribution
  equipment
- P (Provider), refers to provider core network equipment
- Bottleneck Bandwidth*, the lowest bandwidth along the complete
  network path
- Round-Trip Time (RTT), refers to the Layer 4 back and forth delay
- Round-Trip Delay (RTD), refers to the Layer 1 back and forth delay
- Network Under Test (NUT), refers to the tested IP network path
- TCP Throughput Test Device (TCP TTD), refers to a compliant TCP
  host that generates traffic and measures metrics as defined in
  this methodology
   +----+ +----+ +----+  +----+ +---+  +---+ +----+  +----+ +----+ +----+
   |    | |    | |    |  |    | |   |  |   | |    |  |    | |    | |    |
   | TCP|-| CPE|-| CE |--| PE |-| P |--| P |-| PE |--| CE |-| CPE|-| TCP|
   | TD | |    | |    |BB|    | |   |  |   | |    |BB|    | |    | | TD |
   +----+ +----+ +----+**+----+ +---+  +---+ +----+**+----+ +----+ +----+

          <------------------------- NUT ------------------------->
          <------------------------- RTT ------------------------->
* Bottleneck Bandwidth and Bandwidth are used synonymously in this
  document.
** Most of the time the Bottleneck Bandwidth is in the access portion
   of the wide area network (CE - PE).

Note that the NUT may consist of a variety of devices including (but
not limited to) load balancers, proxy servers, and WAN acceleration
devices. The detailed topology of the NUT MUST be considered when
conducting the TCP throughput tests, but this methodology makes no
attempt to characterize TCP performance related to specific network
architectures.
2. Scope and Goals of this Methodology

Before defining the goals of this methodology, it is important to
clearly define the areas that are out-of-scope for this methodology.

- The methodology is not intended to predict TCP throughput behavior
during the transient stages of a TCP connection, such as initial slow
start.

- The methodology is not intended to definitively benchmark TCP
implementations of one OS to another, although some users MAY find
some value in conducting qualitative experiments.

- The methodology is not intended to provide detailed diagnosis of
problems within end-points or the network itself as related to
non-optimal TCP performance, although a results interpretation
section for each test step MAY provide insight into potential issues
within the network.
- The methodology does not propose a method to operate permanently
with high measurement loads. TCP performance and optimization data of
operational networks MAY be captured and evaluated by using data from
the "TCP Extended Statistics MIB" [RFC4898].

- The methodology is not intended to measure TCP throughput as part
of an SLA, to compare the TCP performance between service providers,
or to compare between implementations of this methodology (test
equipment).
In contrast to the above exclusions, the goals of this methodology
are to define a method to conduct a structured, end-to-end assessment
of sustained TCP performance within a managed business class IP
network. A key goal is to establish a set of "best practices" that an
engineer SHOULD apply when validating the ability of a managed
network to carry end-user TCP applications.

The specific goals are to:

- Provide a practical test approach that specifies well understood,
end-user configurable TCP parameters such as TCP Window size, MSS
(Maximum Segment Size), number of connections, and how these affect
the outcome of TCP performance over a network.

- Provide specific test conditions (link speed, RTT, TCP Window size,
etc.) and maximum achievable TCP throughput under TCP Equilibrium
conditions. For guideline purposes, provide examples of these test
conditions and the maximum achievable TCP throughput during the
equilibrium state. Section 2.1 provides specific details concerning
the definition of TCP Equilibrium within the context of this
methodology.

- Define three (3) basic metrics that can be used to compare the
performance of TCP connections under various network conditions.

- In test situations where the RECOMMENDED procedure does not yield
the maximum achievable TCP throughput result, this methodology
provides some possible areas within the end host or network that
SHOULD be considered for investigation (although again, this
methodology is not intended to provide a detailed diagnosis of these
issues).
2.1 TCP Equilibrium State Throughput

TCP connections have three (3) fundamental congestion window phases
as documented in [RFC5681]. These phases are:

- Slow Start, which occurs during the beginning of a TCP transmission
or after a retransmission time out event.

- Congestion avoidance, which is the phase during which TCP ramps up
to establish the maximum attainable throughput on an end-end network
path. Retransmissions are a natural by-product of the TCP congestion
avoidance algorithm as it seeks to achieve maximum throughput on the
network path.

- Retransmission phase, which includes Fast Retransmit (Tahoe) and
Fast Recovery (Reno and New Reno). When a packet is lost, the
congestion avoidance phase transitions to a Fast Retransmission or
Recovery phase, dependent upon the TCP implementation.

The following diagram depicts these phases.
              |          ssthresh
    TCP       |         |
    Through-  |         | Equilibrium
    put       |         |\   /\/\/\/\/\    Retransmit    /\/\ ...
              |         | \ /             | Time-out    /
              |         |  \ /            |  _______  _/
              |  Slow  _/                 |/        | /
              | Start _/    Congestion    |/   |Start_/   Congestion
              |     _/      Avoidance    Loss  | _/       Avoidance
              |   _/                     Event | _/
              | _/                             |/
              |/_______________________________________________________
                                                                   Time
This TCP methodology provides guidelines to measure the equilibrium
throughput, which refers to the maximum sustained rate obtained by
congestion avoidance before packet loss conditions occur (which MAY
cause the state change from congestion avoidance to a retransmission
phase). All maximum achievable throughputs specified in Section 3 are
with respect to this equilibrium state.

2.2 Metrics for TCP Throughput Tests

This framework focuses on a TCP throughput methodology and also
provides several basic metrics to compare results of various
throughput tests. It is recognized that the complexity and
unpredictability of TCP makes it impossible to develop a complete set
of metrics that account for the myriad of variables (i.e. RTT
variation, loss conditions, TCP implementation, etc.). However, these
basic metrics will facilitate TCP throughput comparisons under
varying network conditions and between network traffic management
techniques.
The first metric is the TCP Transfer Time, which is simply the
measured time it takes to transfer a block of data across
simultaneous TCP connections. The concept is useful when benchmarking
traffic management techniques, where multiple connections are
typically required.
The TCP Transfer Time MAY also be used to provide a normalized ratio
of the actual TCP Transfer Time versus the Ideal TCP Transfer Time.
This ratio is called the TCP Transfer Time Index and is defined as:

                      Actual TCP Transfer Time
                      -------------------------
                       Ideal TCP Transfer Time

The Ideal TCP Transfer Time is derived from the network path
bottleneck bandwidth and the various Layer 1/2/3 overheads associated
with the network path. Additionally, the TCP Window size must be
tuned to equal the bandwidth delay product (BDP) as described in
Section 3.3.1.

The following table illustrates a single connection TCP Transfer and
the Ideal TCP Transfer Time for a 100 MB file, with the ideal TCP
window size based on the BDP.
   Table 2.2: Link Speed, RTT, TCP Throughput, and Ideal TCP Transfer
              Time

   Link                 Maximum Achievable       Ideal TCP Transfer
   Speed    RTT (ms)    TCP Throughput (Mbps)    Time (seconds)
   --------------------------------------------------------------------
   T1          20              1.17                  684.93
   T1          50              1.40                  570.61
   T1         100              1.40                  570.61
   T3          10             42.05                   19.03
   T3          15             42.05                   19.03
   T3          25             41.52                   18.82
   T3(ATM)     10             36.50                   21.92
   T3(ATM)     15             36.23                   22.14
   T3(ATM)     25             36.27                   22.05
   100M         1             91.98                    8.70
   100M         2             93.44                    8.56
   100M         5             93.44                    8.56
   1Gig         0.1          919.82                    0.87
   1Gig         0.5          934.47                    0.86
   1Gig         1            934.47                    0.86
   10Gig        0.05       9,344.67                    0.09
   10Gig        0.3        9,344.67                    0.09

   * The Ideal TCP Transfer Time is calculated as File Size in Bytes
     x 8 / TCP Throughput.
   ** TCP Throughput values are derived from Table 3.3.
To illustrate the TCP Transfer Time Index, consider the bulk transfer
of 100 MB over 5 simultaneous TCP connections (each connection
uploading 100 MB). In this example, the Ethernet service provides a
Committed Access Rate (CAR) of 500 Mbit/s. Each connection MAY
achieve different throughputs during a test, and the overall
throughput rate is not always easy to determine (especially as the
number of connections increases).

The Ideal TCP Transfer Time would be ~8 seconds, but in this example,
the actual TCP Transfer Time was 12 seconds. The TCP Transfer Time
Index would be 12/8 = 1.5, which indicates that the transfer across
all connections took 1.5 times longer than the ideal.
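
For illustration purposes only, the calculation above can be
expressed as a short Python sketch. It assumes the bottleneck
bandwidth is already expressed at Layer 4 (i.e. Layer 1-3 overheads
have been subtracted); the function names are illustrative and not
part of this methodology.

   def ideal_tcp_transfer_time(block_bytes, connections, bottleneck_bps):
       # Ideal time to move the aggregate payload at the Layer 4
       # bottleneck rate (Layer 1-3 overheads assumed already removed).
       total_bits = block_bytes * 8 * connections
       return total_bits / bottleneck_bps

   def tcp_transfer_time_index(actual_seconds, ideal_seconds):
       # Normalized ratio of actual versus ideal time (1.0 = ideal).
       return actual_seconds / ideal_seconds

   # Example from the text: 5 connections x 100 MB over a 500 Mbit/s CAR.
   ideal = ideal_tcp_transfer_time(100 * 10**6, 5, 500 * 10**6)  # 8.0 s
   index = tcp_transfer_time_index(12.0, ideal)                  # 1.5
   print("Ideal: %.1f s, TCP Transfer Time Index: %.1f" % (ideal, index))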
The second metric is the TCP Efficiency metric, which is the
percentage of bytes that were not retransmitted and is defined as:

          Transmitted Bytes - Retransmitted Bytes
          ---------------------------------------  x 100
                    Transmitted Bytes

Transmitted Bytes are the total number of TCP payload bytes
transmitted, which includes both the original and the retransmitted
bytes. This metric provides a comparative measure between various QoS
mechanisms such as traffic management, congestion avoidance, and also
various TCP implementations (i.e. Reno, Vegas, etc.).
As an example, if 100,000 payload bytes were sent and 2,000 bytes had
to be retransmitted, the Transmitted Bytes would equal 102,000 and
the TCP Efficiency would be calculated as:

               102,000 - 2,000
               ---------------  x 100 = 98.03%
                   102,000

Note that a given byte MAY be retransmitted more than once; each such
retransmission is added to the Retransmitted Bytes count (and to the
Transmitted Bytes count).
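
A minimal sketch of the TCP Efficiency calculation follows
(illustrative only; the byte counters are assumed to come from the
test tool or from packet capture analysis):

   def tcp_efficiency(transmitted_bytes, retransmitted_bytes):
       # Percentage of transmitted TCP payload bytes that were not
       # retransmissions; both counters include every retransmission.
       return (transmitted_bytes - retransmitted_bytes) / transmitted_bytes * 100.0

   # Example from the text: 100,000 original bytes plus 2,000 retransmitted.
   print("TCP Efficiency: %.2f%%" % tcp_efficiency(102000, 2000))
   # prints 98.04 (the example above truncates this to 98.03)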
The third metric is the Buffer Delay Percentage, which represents the
increase in RTT during a TCP throughput test over the inherent
network RTT (baseline RTT). The baseline RTT is the round-trip time
inherent to the network path under non-congested conditions.

The Buffer Delay Percentage is defined as:

          Average RTT during Transfer - Baseline RTT
          ------------------------------------------  x 100
                          Baseline RTT
As an example, assume the baseline RTT for the network path is 25
msec. During the course of a TCP transfer, the average RTT across the
entire transfer increased to 32 msec. In this example, the Buffer
Delay Percentage would be calculated as:

               32 - 25
               -------  x 100 = 28%
                 25
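
The same Buffer Delay Percentage arithmetic, expressed as an
illustrative sketch (the RTT values are assumed to be measured as
described in Section 3.2.1):

   def buffer_delay_percentage(avg_rtt_ms, baseline_rtt_ms):
       # Increase of the RTT observed during the transfer over the
       # baseline (non-congested) RTT, expressed as a percentage.
       return (avg_rtt_ms - baseline_rtt_ms) / baseline_rtt_ms * 100.0

   # Example from the text: 25 msec baseline, 32 msec average during transfer.
   print("Buffer Delay Percentage: %.0f%%" % buffer_delay_percentage(32.0, 25.0))  # 28%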
Note that the TCP Transfer Time, TCP Efficiency, and Buffer Delay
Percentage metrics MUST be measured during each throughput test. A
poor TCP Transfer Time Index (TCP Transfer Time greater than the
Ideal TCP Transfer Time) MAY be diagnosed by correlating it with
sub-optimal TCP Efficiency and/or Buffer Delay Percentage metrics.
3. TCP Throughput Testing Methodology

As stated in Section 1, it is considered best practice to verify the
integrity of the network by conducting Layer 2/3 stress tests such as
[RFC2544] or other methods of network stress tests. If the network is
not performing properly in terms of packet loss, jitter, etc., then
the TCP layer testing will not be meaningful, since the equilibrium
throughput MAY be very difficult to achieve (in a "dysfunctional"
network).

TCP throughput testing MAY require cooperation between the end-user
customer and the network provider. In a Layer 2/3 VPN architecture,
the testing SHOULD be conducted on the Customer Edge (CE) router and
not the Provider Edge (PE) router.
The following represents the sequential order of steps to conduct the
TCP throughput testing methodology:

1. Identify the Path MTU. Packetization Layer Path MTU Discovery, or
PLPMTUD [RFC4821], MUST be conducted to verify the maximum network
path MTU. Conducting PLPMTUD establishes the upper limit for the MSS
to be used in subsequent steps.

2. Baseline Round Trip Time and Bandwidth. This step establishes the
inherent, non-congested Round Trip Time (RTT) and the bottleneck
bandwidth of the end-end network path. These measurements are used to
provide estimates of the ideal TCP window size, which SHOULD be used
in subsequent test steps. These measurements reference [RFC2681] and
[RFC4898] to measure RTD (and the associated RTT). Also, [RFC5136] is
referenced to measure network capacity.

3. TCP Connection Throughput Tests. With baseline measurements of
Round Trip Time and bottleneck bandwidth, a series of single and
multiple TCP connection throughput tests SHOULD be conducted to
baseline the network performance expectations.

4. Traffic Management Tests. Various traffic management and queuing
techniques SHOULD be tested in this step, using multiple TCP
connections. Multiple connection testing SHOULD verify that the
network is configured properly for traffic shaping versus policing,
various queuing implementations, and RED.
Important to note are some of the key characteristics and
considerations for the TCP test instrument. The test host MAY be a
standard computer or a dedicated communications test instrument, and
in either case these TCP test hosts must be capable of emulating both
a client and a server.
Whether the TCP test host is a standard computer or a compliant TCP
TTD, the following areas SHOULD be considered when selecting a test
host:

- TCP implementation used by the test host OS version, i.e. Linux OS
kernel using TCP Reno, TCP options supported, etc. This will
obviously be more important when using custom test equipment where
the TCP implementation MAY be customized or tuned to run in higher
performance hardware. When a compliant TCP TTD is used, the TCP
implementation SHOULD be identified in the test results. The
compliant TCP TTD SHOULD be usable for complete end-to-end testing
through network security elements and SHOULD also be usable for
testing network sections.

- Most importantly, the TCP test host must be capable of generating
and receiving stateful TCP test traffic at the full link speed of the
network under test. As a general rule of thumb, testing TCP
throughput at rates greater than 100 Mbit/sec MAY require high
performance server hardware or dedicated hardware based test tools.
Devices that cannot generate traffic at these rates cannot measure
higher TCP throughput, and user expectations SHOULD be set
accordingly in the user manual or in notes on the results report.

- Measuring RTT and TCP Efficiency per connection will generally
require dedicated hardware based test tools. In the absence of
dedicated hardware based test tools, these measurements MAY need to
be conducted with packet capture tools (conduct TCP throughput tests
and analyze RTT and retransmission results with packet captures).
Another option MAY be to use the "TCP Extended Statistics MIB" per
[RFC4898].

- The compliant TCP TTD and its access to the network under test MUST
NOT introduce a performance bottleneck of any kind.
3.1. Determine Network Path MTU

TCP implementations SHOULD use Path MTU Discovery techniques (PMTUD).
PMTUD relies on ICMP 'need to frag' messages to learn the path MTU.
When a device has a packet to send which has the Don't Fragment (DF)
bit in the IP header set and the packet is larger than the Maximum
Transmission Unit (MTU) of the next hop link, the packet is dropped
and the device sends an ICMP 'need to frag' message back to the host
that originated the packet. The ICMP 'need to frag' message includes
the next hop MTU, which PMTUD uses to tune the TCP Maximum Segment
Size (MSS). Unfortunately, because many network managers completely
disable ICMP, this technique does not always prove reliable in real
world situations.

Packetization Layer Path MTU Discovery, or PLPMTUD [RFC4821], MUST be
conducted to verify the minimum network path MTU. PLPMTUD can be used
with or without ICMP. The following sections provide a summary of the
PLPMTUD approach and an example using the TCP protocol. [RFC4821]
specifies a search_high and a search_low parameter for the MTU. As
specified in [RFC4821], a value of 1024 is a generally safe value to
choose for search_low in modern networks.

It is important to determine the overhead of the links in the path,
and then to select a TCP MSS size corresponding to the Layer 3 MTU.
For example, if the MTU is 1024 bytes and the TCP/IP headers are 40
bytes, then the MSS would be set to 984 bytes.
An example scenario is a network where the actual path MTU is 1240
bytes. The TCP client probe MUST be capable of setting the MSS for
the probe packets and could start at MSS = 984 (which corresponds to
an MTU size of 1024 bytes).

The TCP client probe would open a TCP connection and advertise the
MSS as 984. Note that the client probe MUST generate these packets
with the DF bit set. The TCP client probe then sends test traffic per
a nominal window size (8 KB, etc.). The window size SHOULD be kept
small to minimize the possibility of congesting the network, which
MAY induce congestive loss. The duration of the test should also be
short (10-30 seconds), again to minimize congestive effects during
the test.

In the example of a 1240 byte path MTU, probing with an MSS equal to
984 would yield a successful probe and the test client packets would
be successfully transferred to the test server.

Also note that the test client MUST verify that the MSS advertised is
indeed negotiated. Network devices with built-in Layer 4 capabilities
can intercede during the connection establishment process and reduce
the advertised MSS to avoid fragmentation. This is certainly a
desirable feature from a network perspective, but can yield erroneous
test results if the client test probe does not confirm the negotiated
MSS.
The next test probe would use the search_high value, and this would
be set to MSS = 1460 to correspond to a 1500 byte MTU. In this
example, the test client MUST retransmit based upon time-outs (since
no ACKs will be received from the test server). This test probe is
marked as a conclusive failure if none of the test packets are
ACK'ed. If any of the test packets are ACK'ed, congestive network
loss MAY be the cause and the test probe is not conclusive.
Re-testing at other times of the day is RECOMMENDED to further
isolate the cause.

The test is repeated until the desired granularity of the MTU is
discovered. The method can yield precise results at the expense of
probing time. One approach MAY be to reduce the probe size to halfway
between the unsuccessful search_high and the successful search_low
value, and to increase by increments of 1/2 when seeking the upper
limit.
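
The probing procedure described above is essentially a binary search
on the MSS. The Python sketch below illustrates only that search
logic; probe_succeeds() is a placeholder for the actual TCP probe
described in this section, and the 40 byte TCP/IP overhead assumes no
IP or TCP options.

   TCP_IP_OVERHEAD = 40  # bytes of TCP/IP headers, assuming no options

   def probe_succeeds(mss):
       # Placeholder: open a TCP connection advertising 'mss', confirm
       # the negotiated MSS, send a small window of DF-marked test
       # traffic, and return True only if the segments are ACK'ed.
       raise NotImplementedError

   def discover_path_mss(search_low_mtu=1024, search_high_mtu=1500):
       # Binary search between search_low and search_high, returning
       # the largest MSS the path carries without fragmentation.
       low = search_low_mtu - TCP_IP_OVERHEAD    # e.g. 984, assumed good
       high = search_high_mtu - TCP_IP_OVERHEAD  # e.g. 1460
       while low < high:
           candidate = (low + high + 1) // 2     # probe halfway, rounding up
           if probe_succeeds(candidate):
               low = candidate                   # works; search higher
           else:
               high = candidate - 1              # too big; search lower
       return low                                # path MTU = MSS + 40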
3.2. Baseline Round Trip Time and Bandwidth

Before stateful TCP testing can begin, it is important to determine
the baseline Round Trip Time (non-congested inherent delay) and the
bottleneck bandwidth of the end-end network to be tested. These
measurements are used to provide estimates of the ideal TCP window
size, which SHOULD be used in subsequent test steps. These latency
and bandwidth tests SHOULD be run during the time of day for which
the TCP throughput tests will occur.

The baseline RTT is used to predict the bandwidth delay product and
the TCP Transfer Time for the subsequent throughput tests. Since this
methodology requires that RTT be measured during the entire
throughput test, the extent to which the RTT varied during the
throughput test can be quantified.
3.2.1 Techniques to Measure Round Trip Time

Following the definitions used in Section 1.1, Round Trip Time (RTT)
is the time elapsed between the clocking in of the first bit of a
payload packet and the receipt of the last bit of the corresponding
Acknowledgment. Round Trip Delay (RTD) is used synonymously to twice
the Link Latency. RTT measurements SHOULD use techniques defined in
[RFC2681] or statistics available from MIBs defined in [RFC4898].

The RTT SHOULD be baselined during "off-peak" hours to obtain a
reliable figure for the inherent network latency, as opposed to the
additional delay caused by network buffering.

During the actual sustained TCP throughput tests, RTT MUST be
measured along with TCP throughput. Buffer delay effects can be
isolated if RTT is concurrently measured.
This is not meant to provide an exhaustive list, but the following
summarizes some of the more common ways to determine Round Trip Time
(RTT) through the network. The desired resolution of the measurement
(i.e. msec versus usec) may dictate whether the RTT measurement can
be achieved with standard tools such as ICMP ping techniques or
whether specialized test equipment with high precision timers would
be required. The objective in this section is to list several
techniques in order of decreasing accuracy.
- Use test equipment on each end of the network, "looping" the
far-end tester so that a packet stream can be measured end-end. This
test equipment RTT measurement MAY be compatible with delay
measurement protocols specified in [RFC5357].

- Conduct packet captures of TCP test applications using, for
example, "iperf" or FTP. By running multiple experiments, the packet
captures can be studied to estimate RTT based upon the SYN -> SYN-ACK
handshakes within the TCP connection set-up (see the sketch following
this list).
- ICMP pings MAY also be adequate to provide round trip time
estimations. Some limitations of ICMP ping MAY include msec
resolution and whether the network elements respond to pings (or
block them).
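
As a rough illustration of the handshake-based estimation, the
following sketch times TCP connection establishment from the client
side using ordinary socket calls. This is an approximation only: it
measures connection set-up to an assumed reachable port rather than
analyzing packet captures, and it offers millisecond-level resolution
at best.

   import socket
   import time

   def handshake_rtt_ms(host, port=80, samples=5):
       # Estimate the baseline RTT by timing TCP three-way handshakes;
       # connect() returns once the SYN-ACK has been received. The
       # minimum of several samples approximates the non-congested RTT.
       results = []
       for _ in range(samples):
           start = time.perf_counter()
           with socket.create_connection((host, port), timeout=5):
               pass
           results.append((time.perf_counter() - start) * 1000.0)
       return min(results)

   # Example (hypothetical far-end test host):
   # print("Baseline RTT estimate: %.1f ms" % handshake_rtt_ms("test.example.net"))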
3.2.2 Techniques to Measure End-end Bandwidth

There are many well established techniques available to provide
estimated measures of bandwidth over a network. This measurement
SHOULD be conducted in both directions of the network, especially for
access networks, which MAY be asymmetrical. Measurements SHOULD use
network capacity techniques defined in [RFC5136].

The bandwidth measurement test MUST be run with stateless IP streams
(not stateful TCP) in order to determine the available bandwidth in
each direction. This test SHOULD be performed at various intervals
throughout a business day (or even across a week). Ideally, the
bandwidth test SHOULD produce a log output of the bandwidth achieved
across the test interval.
3.3. TCP Throughput Tests

This methodology specifically defines TCP throughput techniques to
verify sustained TCP performance in a managed business network. As
defined in Section 2.1, the equilibrium throughput reflects the
maximum rate achieved by a TCP connection within the congestion
avoidance phase on an end-end network path. This section and the
following sections define the method to conduct these sustained
throughput tests and provide guidelines for the predicted results.

With baseline measurements of Round Trip Time and bandwidth from
Section 3.2, a series of single and multiple TCP connection
throughput tests can be conducted to baseline network performance
against expectations.

It is RECOMMENDED to run the tests in each direction independently
first, and then to run both directions simultaneously. In each case,
the TCP Transfer Time, TCP Efficiency, and Buffer Delay metrics MUST
be measured in each direction.
3.3.1 Calculate Optimum TCP Window Size | 3.3.1 Calculate Ideal TCP Window Size | |||
The optimum TCP window size can be calculated from the bandwidth | The ideal TCP window size can be calculated from the bandwidth | |||
delay product (BDP), which is: | delay product (BDP), which is: | |||
BDP (bits) = RTT (sec) x Bandwidth (bps) | BDP (bits) = RTT (sec) x Bandwidth (bps) | |||
By dividing the BDP by 8, the "ideal" TCP window size is calculated. | By dividing the BDP by 8, the "ideal" TCP window size is calculated. | |||
An example would be a T3 link with 25 msec RTT. The BDP would equal | An example would be a T3 link with 25 msec RTT. The BDP would equal | |||
~1,105,000 bits and the ideal TCP window would equal ~138,000 bytes. | ~1,105,000 bits and the ideal TCP window would equal ~138,000 bytes. | |||
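   A minimal worked example of this calculation, using the T3 figures
   quoted above:

      # Ideal TCP window from the bandwidth delay product (BDP).
      def ideal_tcp_window_bytes(rtt_ms, bandwidth_bps):
          bdp_bits = (rtt_ms / 1000.0) * bandwidth_bps
          return bdp_bits / 8.0

      # T3 (44.21 Mbps payload) at 25 msec RTT:
      #   BDP    ~= 1,105,250 bits
      #   window ~= 138,156 bytes (~138 kBytes)
      print(ideal_tcp_window_bytes(25, 44.21e6))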
The following table provides some representative network link speeds, | The following table provides some representative network link speeds, | |||
latency, BDP, and associated "optimum" TCP window size. Sustained | latency, BDP, and associated ideal TCP window size. Sustained | |||
TCP transfers should reach nearly 100% throughput, minus the overhead | TCP transfers SHOULD reach nearly 100% throughput, minus the overhead | |||
of Layers 1-3 and the divisor of the MSS into the window. | of Layers 1-3 and the divisor of the MSS into the TCP Window. | |||
For this single connection baseline test, the MSS size will affect | For this single connection baseline test, the MSS size will affect | |||
the achieved throughput (especially for smaller TCP window sizes). | the achieved throughput (especially for smaller TCP Window sizes). | |||
Table 3.2 provides the achievable, equilibrium TCP throughput (at | Table 3.3 provides the achievable, equilibrium TCP throughput (at | |||
Layer 4) using 1460 byte MSS. Also in this table, the 58 byte L1-L4 | Layer 4) using 1460 byte MSS. Also in this table, the 58 byte L1-L4 | |||
overhead including the Ethernet CRC32 is used for simplicity. | overhead including the Ethernet CRC32 is used for simplicity. | |||
Table 3.2: Link Speed, RTT and calculated BDP, TCP Throughput | Table 3.3: Link Speed, RTT and calculated BDP, TCP Throughput | |||
Link Ideal TCP Maximum Achievable | Link Ideal TCP Maximum Achievable | |||
Speed* RTT (ms) BDP (bits) Window (kbytes) TCP Throughput(Mbps) | Speed* RTT (ms) BDP (bits) Window (kBytes) TCP Throughput(Mbps) | |||
--------------------------------------------------------------------- | --------------------------------------------------------------------- | |||
T1 20 30,720 3.84 1.17 | T1 20 30,720 3.84 1.17 | |||
T1 50 76,800 9.60 1.40 | T1 50 76,800 9.60 1.40 | |||
T1 100 153,600 19.20 1.40 | T1 100 153,600 19.20 1.40 | |||
T3 10 442,100 55.26 42.05 | T3 10 442,100 55.26 42.05 | |||
T3 15 663,150 82.89 42.05 | T3 15 663,150 82.89 42.05 | |||
T3 25 1,105,250 138.16 41.52 | T3 25 1,105,250 138.16 41.52 | |||
T3(ATM) 10 407,040 50.88 36.50 | T3(ATM) 10 407,040 50.88 36.50 | |||
T3(ATM) 15 610,560 76.32 36.23 | T3(ATM) 15 610,560 76.32 36.23 | |||
T3(ATM) 25 1,017,600 127.20 36.27 | T3(ATM) 25 1,017,600 127.20 36.27 | |||
100M 1 100,000 12.50 91.98 | 100M 1 100,000 12.50 91.98 | |||
100M 2 200,000 25.00 93.44 | 100M 2 200,000 25.00 93.44 | |||
100M 5 500,000 62.50 93.44 | 100M 5 500,000 62.50 93.44 | |||
1Gig 0.1 100,000 12.50 919.82 | 1Gig 0.1 100,000 12.50 919.82 | |||
1Gig 0.5 500,000 62.50 934.47 | 1Gig 0.5 500,000 62.50 934.47 | |||
1Gig 1 1,000,000 125.00 934.47 | 1Gig 1 1,000,000 125.00 934.47 | |||
10Gig 0.05 500,000 62.50 9,344.67 | 10Gig 0.05 500,000 62.50 9,344.67 | |||
10Gig 0.3 3,000,000 375.00 9,344.67 | 10Gig 0.3 3,000,000 375.00 9,344.67 | |||
* Note that link speed is the minimum link speed throughout the | * Note that link speed is the bottleneck bandwidth for the NUT | |||
network; i.e. a WAN with a T1 link, etc. | ||||
Also, the following link speeds (available payload bandwidth) were | Also, the following link speeds (available payload bandwidth) were | |||
used for the WAN entries: | used for the WAN entries: | |||
- T1 = 1.536 Mbits/sec (B8ZS line encoding facility) | - T1 = 1.536 Mbits/sec (B8ZS line encoding facility) | |||
- T3 = 44.21 Mbits/sec (C-Bit Framing) | - T3 = 44.21 Mbits/sec (C-Bit Framing) | |||
- T3(ATM) = 36.86 Mbits/sec (C-Bit Framing & PLCP, 96000 Cells per | - T3(ATM) = 36.86 Mbits/sec (C-Bit Framing & PLCP, 96000 Cells per | |||
second) | second) | |||
The calculation method used in this document is a 3 step process : | The calculation method used in this document is a 3 step process : | |||
1 - We determine what should be the optimal TCP Window size value | 1 - Determine what SHOULD be the optimal TCP Window size value | |||
based on the optimal quantity of "in-flight" octets discovered by | based on the optimal quantity of "in-flight" octets discovered by | |||
the BDP calculation. We take into consideration that the TCP | the BDP calculation. We take into consideration that the TCP | |||
Window size has to be an exact multiple value of the MSS. | Window size has to be an exact multiple value of the MSS. | |||
2 - Then we calculate the achievable layer 2 throughput by | 2 - Calculate the achievable layer 2 throughput by multiplying the | |||
multiplying the value determined in step 1 with the | value determined in step 1 with the MSS & (MSS + L2 + L3 + L4 | |||
MSS & (MSS + L2 + L3 + L4 Overheads) divided by the RTT. | Overheads) divided by the RTT. | |||
3 - Finally, we multiply the calculated value of step 2 by the MSS | 3 - Finally, multiply the calculated value of step 2 by the MSS | |||
versus (MSS + L2 + L3 + L4 Overheads) ratio. | versus (MSS + L2 + L3 + L4 Overheads) ratio. | |||
This gives us the achievable TCP Throughput value. Sometimes, the | This provides the achievable TCP Throughput value. Sometimes, the | |||
maximum achievable throughput is limited by the maximum achievable | maximum achievable throughput is limited by the maximum achievable | |||
quantity of Ethernet Frames per second on the physical media. Then | quantity of Ethernet Frames per second on the physical media. Then | |||
this value is used in step 2 instead of the calculated one. | this value is used in step 2 instead of the calculated one. | |||
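   A sketch of the three-step calculation, using the 1460 byte MSS and
   58 byte L1-L4 overhead assumed for Table 3.3; the exact overhead
   accounting used to produce the table MAY differ slightly, so treat
   the output as approximate.

      def achievable_tcp_throughput(rtt_s, link_bps, mss=1460, overhead=58):
          frame = mss + overhead
          # Step 1: ideal window from the BDP, rounded down to an exact
          # multiple of the MSS (expressed here as segments in flight).
          bdp_bytes = (rtt_s * link_bps) / 8.0
          segments_in_flight = int(bdp_bytes // mss)
          # Step 2: achievable Layer 2 throughput, capped by the maximum
          # frame rate the physical media can carry.
          l2_bps = segments_in_flight * frame * 8 / rtt_s
          l2_bps = min(l2_bps, link_bps)
          # Step 3: scale by MSS / (MSS + overheads) to get TCP throughput.
          return l2_bps * mss / frame

      # T3 (44.21 Mbps) at 25 msec RTT -> roughly 42 Mbps at Layer 4.
      print(achievable_tcp_throughput(0.025, 44.21e6) / 1e6)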
The following diagram compares achievable TCP throughputs on a T3 link | The following diagram compares achievable TCP throughputs on a T3 link | |||
with Windows 2000/XP TCP window sizes of 16KB versus 64KB. | with Windows 2000/XP TCP window sizes of 16KB versus 64KB. | |||
45| | 45| | |||
| _____42.1M | | _____42.1M | |||
40| |64K| | 40| |64K| | |||
skipping to change at page 14, line 6 | skipping to change at page 17, line 6 | |||
15| 14.5M____| | | | | | | 15| 14.5M____| | | | | | | |||
| |16K| | | | | | | | |16K| | | | | | | |||
10| | | | 9.6M+---+ | | | | 10| | | | 9.6M+---+ | | | | |||
| | | | |16K| | 5.8M____+ | | | | | | |16K| | 5.8M____+ | | |||
5| | | | | | | |16K| | | 5| | | | | | | |16K| | | |||
|______+___+___+_______+___+___+_______+__ +___+_______ | |______+___+___+_______+___+___+_______+__ +___+_______ | |||
10 15 25 | 10 15 25 | |||
RTT in milliseconds | RTT in milliseconds | |||
The following diagram shows the achievable TCP throughput on a 25ms | The following diagram shows the achievable TCP throughput on a 25ms | |||
T3 when the TCP Window size is increased and with the RFC1323 TCP | T3 when the TCP Window size is increased and with the [RFC1323] TCP | |||
Window scaling option. | Window scaling option. | |||
45| | 45| | |||
| +-----+42.47M | | +-----+42.47M | |||
40| | | | 40| | | | |||
TCP | | | | TCP | | | | |||
Throughput 35| | | | Throughput 35| | | | |||
in Mbps | | | | in Mbps | | | | |||
30| | | | 30| | | | |||
| | | | | | | | |||
skipping to change at page 14, line 39 | skipping to change at page 17, line 39 | |||
3.3.2 Conducting the TCP Throughput Tests | 3.3.2 Conducting the TCP Throughput Tests | |||
There are several TCP tools that are commonly used in the network | There are several TCP tools that are commonly used in the network | |||
world and one of the most common is the "iperf" tool. With this tool, | world and one of the most common is the "iperf" tool. With this tool, | |||
hosts are installed at each end of the network segment; one as client | hosts are installed at each end of the network segment; one as client | |||
and the other as server. The TCP Window size of both the client and | and the other as server. The TCP Window size of both the client and | |||
the server can be manually set and the achieved throughput is | the server can be manually set and the achieved throughput is | |||
measured, either uni-directionally or bi-directionally. For higher | measured, either uni-directionally or bi-directionally. For higher | |||
BDP situations in lossy networks (long fat networks or satellite | BDP situations in lossy networks (long fat networks or satellite | |||
links, etc.), TCP options such as Selective Acknowledgment should be | links, etc.), TCP options such as Selective Acknowledgment SHOULD be | |||
considered and also become part of the window size / throughput | considered and also become part of the window size / throughput | |||
characterization. | characterization. | |||
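   As a rough illustration of what such a tool does (and not a
   substitute for one), the sketch below pins the local socket send
   buffer, which bounds the offered TCP window, and times a bulk
   transfer. The host, port, window, and payload values are
   placeholders; note that operating systems may round or double the
   requested buffer size, and the measured rate counts bytes handed to
   the kernel rather than bytes acknowledged.

      import socket, time

      def timed_transfer(host, port, window_bytes, payload_bytes):
          s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
          # Set the buffer before connect() so it applies to the connection.
          s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, window_bytes)
          s.connect((host, port))
          chunk = b"\x00" * 65536
          sent, start = 0, time.monotonic()
          while sent < payload_bytes:
              sent += s.send(chunk[:payload_bytes - sent])
          elapsed = time.monotonic() - start
          s.close()
          return sent * 8 / elapsed        # achieved throughput, bits/sec

      # e.g. timed_transfer("192.0.2.20", 5001, 64 * 1024, 100 * 10**6)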
Host hardware performance must be well understood before conducting | Host hardware performance MUST be well understood before conducting | |||
the TCP throughput tests and other tests in the following sections. | the TCP throughput tests and other tests in the following sections. | |||
Dedicated test equipment will generally be required, especially for | Dedicated test equipment will generally be REQUIRED, especially for | |||
line rates of GigE and 10 GigE. | line rates of GigE and 10 GigE. A compliant TCP TTD SHOULD provide a | |||
warning message when the expected test throughput will exceed 10% of | ||||
the network bandwidth capacity. If the throughput test is expected | ||||
to exceed 10% of the provider bandwidth, then the test SHOULD be | ||||
coordinated with the network provider. This does not include the | ||||
customer premise bandwidth, the 10% refers directly to the provider's | ||||
bandwidth (Provider Edge to Provider router). | ||||
The TCP throughput test should be run over a a long enough duration | The TCP throughput test SHOULD be run over a long enough duration | |||
to properly exercise network buffers and also characterize | to properly exercise network buffers and also characterize | |||
performance during different time periods of the day. The results | performance during different time periods of the day. | |||
must be logged at the desired interval and the test must record RTT | ||||
and TCP retransmissions at each interval. | ||||
This correlation of retransmissions and RTT over the course of the | ||||
test will clearly identify which portions of the transfer reached | ||||
TCP Equilibrium state and to what extent increased RTT (congestive | ||||
effects) may have been the cause of reduced equilibrium performance. | ||||
Additionally, the TCP Efficiency and TCP Transfer time metrics should | Note that the TCP Transfer Time, TCP Efficiency, and Buffer | |||
be logged in order to further characterize the window size tests. | Delay metrics MUST be measured during each throughput test. | |||
Poor TCP Transfer Time Indexes (TCP Transfer Time greater than Ideal | ||||
TCP Transfer Times) MAY be diagnosed by correlating with sub-optimal | ||||
TCP Efficiency and/or Buffer Delay Percentage metrics. | ||||
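   A minimal sketch of the per-test bookkeeping, assuming the metric
   definitions of Section 2 take their usual form (TCP Efficiency as the
   percentage of transmitted bytes that were not retransmitted, Buffer
   Delay as the percentage increase of the average RTT over the baseline
   RTT, and the TCP Transfer Time compared against its ideal value):

      def tcp_efficiency(tx_bytes, retx_bytes):
          # 100% means no retransmissions were needed.
          return 100.0 * (tx_bytes - retx_bytes) / tx_bytes

      def buffer_delay(baseline_rtt_ms, avg_rtt_ms):
          # 0% means no RTT increase over the baseline measurement.
          return 100.0 * (avg_rtt_ms - baseline_rtt_ms) / baseline_rtt_ms

      def transfer_time_index(actual_s, ideal_s):
          return actual_s / ideal_s        # 1.0 = ideal, > 1.0 = degraded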
3.3.3 Single vs. Multiple TCP Connection Testing | 3.3.3 Single vs. Multiple TCP Connection Testing | |||
The decision whether to conduct single or multiple TCP connection | The decision whether to conduct single or multiple TCP connection | |||
tests depends upon the size of the BDP in relation to the window | tests depends upon the size of the BDP in relation to the window | |||
sizes configured in the end-user environment. For example, if the | sizes configured in the end-user environment. For example, if the | |||
BDP for a long-fat pipe turns out to be 2MB, then it is probably more | BDP for a long-fat pipe turns out to be 2MB, then it is probably more | |||
realistic to test this pipe with multiple connections. Assuming | realistic to test this pipe with multiple connections. Assuming | |||
typical host computer window settings of 64 KB, using 32 connections | typical host computer window settings of 64 KB, using 32 connections | |||
would realistically test this pipe. | would realistically test this pipe. | |||
skipping to change at page 15, line 37 | skipping to change at page 18, line 39 | |||
#Connections | #Connections | |||
Window to Fill Link | Window to Fill Link | |||
------------------------ | ------------------------ | |||
16KB 20 | 16KB 20 | |||
32KB 10 | 32KB 10 | |||
64KB 5 | 64KB 5 | |||
128KB 3 | 128KB 3 | |||
The TCP Transfer Time metric is useful for conducting multiple | The TCP Transfer Time metric is useful for conducting multiple | |||
connection tests. Each connection should be configured to transfer | connection tests. Each connection SHOULD be configured to transfer | |||
a certain payload (e.g. 100 MB), and the TCP Transfer time provides | payloads of the same size (e.g. 100 MB), and the TCP Transfer time | |||
a simple metric to verify the actual versus expected results. | SHOULD provide a simple metric to verify the actual versus expected | |||
results. | ||||
Note that the TCP transfer time is the time for all connections to | Note that the TCP transfer time is the time for all connections to | |||
complete the transfer of the configured payload size. From the | complete the transfer of the configured payload size. From the | |||
example table listed above, the 64KB window is considered. Each of | example table listed above, the 64KB window is considered. Each of | |||
the 5 connections would be configured to transfer 100MB, and each | the 5 connections would be configured to transfer 100MB, and each | |||
TCP should obtain a maximum of 100 Mb/sec per connection. So for | TCP should obtain a maximum of 100 Mb/sec per connection. So for | |||
this example, the 100MB payload should be transferred across the | this example, the 100MB payload should be transferred across the | |||
connections in approximately 8 seconds (which would be the ideal TCP | connections in approximately 8 seconds (which would be the ideal TCP | |||
transfer time for these conditions). | transfer time for these conditions). | |||
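   The sizing arithmetic behind this example can be sketched as follows,
   assuming the 500 Mbps bottleneck and 5 msec RTT that yield roughly
   100 Mbps per 64 KB-window connection:

      def connections_to_fill(bottleneck_bps, window_bytes, rtt_s):
          per_conn_bps = window_bytes * 8 / rtt_s
          return int(-(-bottleneck_bps // per_conn_bps))   # ceiling division

      def ideal_transfer_time(payload_bytes, bottleneck_bps, n_connections):
          per_conn_bps = bottleneck_bps / n_connections
          return payload_bytes * 8 / per_conn_bps          # seconds, all connections

      # 500 Mbps bottleneck, 5 msec RTT, 64 KB windows -> 5 connections;
      # 100 MB per connection -> ~8 second ideal transfer time.
      n = connections_to_fill(500e6, 64 * 1024, 0.005)
      print(n, ideal_transfer_time(100 * 10**6, 500e6, n))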
Additionally, the TCP Efficiency metric should be computed for each | Additionally, the TCP Efficiency metric SHOULD be computed for each | |||
connection tested (defined in section 2.2). | connection tested (defined in section 2.2). | |||
3.3.4 Interpretation of the TCP Throughput Results | 3.3.4 Interpretation of the TCP Throughput Results | |||
At the end of this step, the user will document the theoretical BDP | At the end of this step, the user will document the theoretical BDP | |||
and a set of Window size experiments with measured TCP throughput for | and a set of Window size experiments with measured TCP throughput for | |||
each TCP window size setting. For cases where the sustained TCP | each TCP window size setting. For cases where the sustained TCP | |||
throughput does not equal the predicted value, some possible causes | throughput does not equal the ideal value, some possible causes | |||
are listed: | are listed: | |||
- Network congestion causing packet loss; the TCP Efficiency metric | - Network congestion causing packet loss which MAY be inferred from | |||
is a useful gauge to compare network performance | a poor TCP Efficiency metric (100% = no loss) | |||
- Network congestion not causing packet loss but increasing RTT | - Network congestion causing an increase in RTT which MAY be inferred | |||
from the Buffer Delay metric (0% = no increase in RTT over baseline) | ||||
- Intermediate network devices which actively regenerate the TCP | - Intermediate network devices which actively regenerate the TCP | |||
connection and can alter window size, MSS, etc. | connection and can alter window size, MSS, etc. | |||
- Over utilization of available link or rate limiting (policing). | - Rate limiting (policing). More discussion of traffic management | |||
More discussion of traffic management tests follows in section 3.4 | tests follows in section 3.4 | |||
3.4. Traffic Management Tests | 3.4. Traffic Management Tests | |||
In most cases, the network connection between two geographic | In most cases, the network connection between two geographic | |||
locations (branch offices, etc.) is lower than the network connection | locations (branch offices, etc.) is lower than the network connection | |||
of the host computers. An example would be LAN connectivity of GigE | of the host computers. An example would be LAN connectivity of GigE | |||
and WAN connectivity of 100 Mbps. The WAN connectivity may be | and WAN connectivity of 100 Mbps. The WAN connectivity may be | |||
physically 100 Mbps or logically 100 Mbps (over a GigE WAN | physically 100 Mbps or logically 100 Mbps (over a GigE WAN | |||
connection). In the latter case, rate limiting is used to provide the | connection). In the latter case, rate limiting is used to provide the | |||
WAN bandwidth per the SLA. | WAN bandwidth per the SLA. | |||
Traffic management techniques are employed to provide various forms | Traffic management techniques are employed to provide various forms | |||
of QoS, the more common include: | of QoS, the more common include: | |||
- Traffic Shaping | - Traffic Shaping | |||
- Priority Queueing | - Priority queuing | |||
- Random Early Discard (RED, etc.) | - Random Early Discard (RED, etc.) | |||
Configuring the end-end network with these various traffic management | Configuring the end-end network with these various traffic management | |||
mechanisms is a complex undertaking. For traffic shaping and RED | mechanisms is a complex undertaking. For traffic shaping and RED | |||
techniques, the end goal is to provide better performance for bursty | techniques, the end goal is to provide better performance for bursty | |||
traffic such as TCP (RED is specifically intended for TCP). | traffic such as TCP (RED is specifically intended for TCP). | |||
This section of the methodology provides guidelines to test traffic | This section of the methodology provides guidelines to test traffic | |||
shaping and RED implementations. As in section 3.3, host hardware | shaping and RED implementations. As in section 3.3, host hardware | |||
performance must be well understood before conducting the traffic | performance MUST be well understood before conducting the traffic | |||
shaping and RED tests. Dedicated test equipment will generally be | shaping and RED tests. Dedicated test equipment will generally be | |||
required, especially for line rates of GigE and 10 GigE. | REQUIRED for line rates of GigE and 10 GigE. If the throughput test | |||
is expected to exceed 10% of the provider bandwidth, then the test | ||||
SHOULD be coordinated with the network provider. This does not | ||||
include the customer premise bandwidth, the 10% refers directly to | ||||
the provider's bandwidth (Provider Edge to Provider router). | ||||
3.4.1 Traffic Shaping Tests | 3.4.1 Traffic Shaping Tests | |||
For services where the available bandwidth is rate limited, there are | For services where the available bandwidth is rate limited, there are | |||
two (2) techniques used to implement rate limiting: traffic policing | two (2) techniques used to implement rate limiting: traffic policing | |||
and traffic shaping. | and traffic shaping. | |||
Simply stated, traffic policing marks and/or drops packets which | Simply stated, traffic policing marks and/or drops packets which | |||
exceed the SLA bandwidth (in most cases, excess traffic is dropped). | exceed the SLA bandwidth (in most cases, excess traffic is dropped). | |||
Traffic shaping employs the use of queues to smooth the bursty | Traffic shaping employs the use of queues to smooth the bursty | |||
skipping to change at page 17, line 23 | skipping to change at page 20, line 29 | |||
reduced, which in turn optimizes TCP throughput for the given | reduced, which in turn optimizes TCP throughput for the given | |||
available bandwidth. Through this section, the available | available bandwidth. Through this section, the available | |||
rate-limited bandwidth shall be referred to as the | rate-limited bandwidth shall be referred to as the | |||
"bottleneck bandwidth". | "bottleneck bandwidth". | |||
The ability to detect proper traffic shaping is more easily diagnosed | The ability to detect proper traffic shaping is more easily diagnosed | |||
when conducting a multiple TCP connection test. Proper shaping will | when conducting a multiple TCP connection test. Proper shaping will | |||
provide a fair distribution of the available bottleneck bandwidth, | provide a fair distribution of the available bottleneck bandwidth, | |||
while traffic policing will not. | while traffic policing will not. | |||
The traffic shaping tests build upon the concepts of multiple | The traffic shaping tests are built upon the concepts of multiple | |||
connection testing as defined in section 3.3.3. Calculating the BDP | connection testing as defined in section 3.3.3. Calculating the BDP | |||
for the bottleneck bandwidth is first required and then selecting | for the bottleneck bandwidth is first REQUIRED before selecting the | |||
the number of connections / window size per connection. | number of connections and TCP Window size per connection. | |||
Similar to the example in section 3.3, a typical test scenario might | Similar to the example in section 3.3, a typical test scenario might | |||
be: GigE LAN with a 500 Mbps bottleneck bandwidth (rate limited | be: GigE LAN with a 500 Mbps bottleneck bandwidth (rate limited | |||
logical interface), and 5 msec RTT. This would require five (5) TCP | logical interface), and 5 msec RTT. This would require five (5) TCP | |||
connections of 64 KB window size to evenly fill the bottleneck bandwidth | connections of 64 KB window size to evenly fill the bottleneck | |||
(about 100 Mbps per connection). | (about 100 Mbps per connection). | |||
The traffic shaping should be run over a long enough duration to | The traffic shaping test SHOULD be run over a long enough duration to | |||
properly exercise network buffers and also characterize performance | properly exercise network buffers (greater than 30 seconds) and also | |||
during different time periods of the day. The throughput of each | characterize performance during different time periods of the day. | |||
connection must be logged during the entire test, along with the TCP | The throughput of each connection MUST be logged during the entire | |||
Efficiency and TCP Transfer time metric. Additionally, it is | test, along with the TCP Transfer Time, TCP Efficiency, and | |||
recommended to log RTT and retransmissions per connection over the | Buffer Delay metrics. | |||
test interval. | ||||
3.4.1.1 Interpretation of Traffic Shaping Test Restults | 3.4.1.1 Interpretation of Traffic Shaping Test Results | |||
By plotting the throughput achieved by each TCP connection, the fair | By plotting the throughput achieved by each TCP connection, the fair | |||
sharing of the bandwidth is generally very obvious when traffic | sharing of the bandwidth is generally very obvious when traffic | |||
shaping is properly configured for the bottleneck interface. For the | shaping is properly configured for the bottleneck interface. For the | |||
previous example of 5 connections sharing 500 Mbps, each connection | previous example of 5 connections sharing 500 Mbps, each connection | |||
would consume ~100 Mbps with a smooth variation. If traffic policing | would consume ~100 Mbps with a smooth variation. If traffic policing | |||
was present on the bottleneck interface, the bandwidth sharing would | was present on the bottleneck interface, the bandwidth sharing MAY | |||
not be fair and the resulting throughput plot would reveal "spikey" | not be fair and the resulting throughput plot MAY reveal "spikey" | |||
throughput consumption of the competing TCP connections (due to the | throughput consumption of the competing TCP connections (due to the | |||
retransmissions). | retransmissions). | |||
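   One illustrative way (not part of the methodology) to post-process
   the per-connection throughput logs is to compare the per-connection
   means and their variation; roughly equal, smooth rates suggest
   shaping, while unequal or "spikey" rates suggest policing. The 20%
   thresholds below are arbitrary placeholders.

      def summarize(samples_mbps):
          mean = sum(samples_mbps) / len(samples_mbps)
          var = sum((x - mean) ** 2 for x in samples_mbps) / len(samples_mbps)
          cov = (var ** 0.5) / mean if mean else 0.0
          return mean, cov                 # mean rate, coefficient of variation

      def shaping_suspected(per_conn_samples, max_cov=0.2, max_spread=0.2):
          stats = [summarize(s) for s in per_conn_samples]
          means = [m for m, _ in stats]
          balanced = (max(means) - min(means)) / (sum(means) / len(means)) < max_spread
          smooth = all(cov < max_cov for _, cov in stats)
          return balanced and smooth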
3.4.2 RED Tests | 3.4.2 RED Tests | |||
Random Early Discard techniques are specifically targeted to provide | Random Early Discard techniques are specifically targeted to provide | |||
congestion avoidance for TCP traffic. Before the network element | congestion avoidance for TCP traffic. Before the network element | |||
queue "fills" and enters the tail drop state, RED drops packets at | queue "fills" and enters the tail drop state, RED drops packets at | |||
configurable queue depth thresholds. This action causes TCP | configurable queue depth thresholds. This action causes TCP | |||
connections to back-off which helps to prevent tail drop, which in | connections to back-off which helps to prevent tail drop, which in | |||
turn helps to prevent global TCP synchronization. | turn helps to prevent global TCP synchronization. | |||
Again, rate limited interfaces can benefit greatly from RED based | Again, rate limited interfaces can benefit greatly from RED based | |||
techniques. Without RED, TCP is generally not able to achieve the | techniques. Without RED, TCP is generally not able to achieve the | |||
full bandwidth of the bottleneck interface. With RED enabled, TCP | full bandwidth of the bottleneck interface. With RED enabled, TCP | |||
congestion avoidance throttles the connections on the higher speed | congestion avoidance throttles the connections on the higher speed | |||
interface (i.e. LAN) and can reach equalibrium with the bottleneck | interface (i.e. LAN) and can reach equilibrium with the bottleneck | |||
bandwidth (achieving closer to full throughput). | bandwidth (achieving closer to full throughput). | |||
The ability to detect proper RED configuration is more easily | The ability to detect proper RED configuration is more easily | |||
diagnosed when conducting a multiple TCP connection test. Multiple | diagnosed when conducting a multiple TCP connection test. Multiple | |||
TCP connections provide the multiple bursty sources that emulate the | TCP connections provide the multiple bursty sources that emulate the | |||
real-world conditions for which RED was intended. | real-world conditions for which RED was intended. | |||
The RED tests also build upon the concepts of multiple connection | The RED tests also build upon the concepts of multiple connection | |||
testing as defined in secion 3.3.3. Calculating the BDP for the | testing as defined in section 3.3.3. Calculating the BDP for the | |||
bottleneck bandwidth is first required and then selecting the number | bottleneck bandwidth is first REQUIRED before selecting the number | |||
of connections / window size per connection. | of connections and TCP Window size per connection. | |||
For RED testing, the desired effect is to cause the TCP connections | For RED testing, the desired effect is to cause the TCP connections | |||
to burst beyond the bottleneck bandwidth so that queue drops will | to burst beyond the bottleneck bandwidth so that queue drops will | |||
occur. Using the same example from section 3.4.1 (traffic shaping), | occur. Using the same example from section 3.4.1 (traffic shaping), | |||
the 500 Mbps bottleneck bandwidth requires 5 TCP connections (with | the 500 Mbps bottleneck bandwidth requires 5 TCP connections (with | |||
window size of 64KB) to fill the capacity. Some experimentation is | window size of 64KB) to fill the capacity. Some experimentation is | |||
required,but it is recommended to start with double the number of | REQUIRED, but it is RECOMMENDED to start with double the number of | |||
connections to stress the network element buffers / queues. In this | connections to stress the network element buffers / queues. In this | |||
example, 10 connections would produce TCP bursts of 64KB for each | example, 10 connections SHOULD produce TCP bursts of 64KB for each | |||
connection. If the timing of the TCP tester permits, these TCP | connection. If the timing of the TCP tester permits, these TCP | |||
bursts could stress queue sizes in the 512KB range. Again | bursts SHOULD stress queue sizes in the 512KB range. Again | |||
experimentation will be required and the proper number of TCP | experimentation will be REQUIRED and the proper number of TCP | |||
connections / window size will be dictated by the size of the network | connections and TCP window size will be dictated by the size of the | |||
element queue. | network element queue. | |||
3.4.2.1 Interpretation of RED Results | 3.4.2.1 Interpretation of RED Results | |||
The default queuing technique for most network devices is FIFO based. | The default queuing technique for most network devices is FIFO based. | |||
Without RED, the FIFO based queue will cause excessive loss to all of | Without RED, the FIFO based queue will cause excessive loss to all of | |||
the TCP connections and in the worst case global TCP synchronization. | the TCP connections and in the worst case global TCP synchronization. | |||
By plotting the aggregate throughput achieved on the bottleneck | By plotting the aggregate throughput achieved on the bottleneck | |||
interface, proper RED operation can be determined if the bottleneck | interface, proper RED operation MAY be determined if the bottleneck | |||
bandwidth is fully utilized. For the previous example of 10 | bandwidth is fully utilized. For the previous example of 10 | |||
connections (window = 64 KB) sharing 500 Mbps, each connection should | connections (window = 64 KB) sharing 500 Mbps, each connection SHOULD | |||
consume ~50 Mbps. If RED was not properly enabled on the interface, | consume ~50 Mbps. If RED was not properly enabled on the interface, | |||
then the TCP connections will retransmit at a higher rate and the net | then the TCP connections will retransmit at a higher rate and the | |||
effect is that the bottleneck bandwidth is not fully utilized. | net effect is that the bottleneck bandwidth is not fully utilized. | |||
Another means to study non-RED versus RED implementation is to use | Another means to study non-RED versus RED implementation is to use | |||
the TCP Transfer Time metric for all of the connections. In this | the TCP Transfer Time metric for all of the connections. In this | |||
example, a 100 MB payload transfer should take ideally 16 seconds | example, a 100 MB payload transfer SHOULD take ideally 16 seconds | |||
across all 10 connections (with RED enabled). With RED not enabled, | across all 10 connections (with RED enabled). With RED not enabled, | |||
the throughput across the bottleneck bandwidth would be greatly | the throughput across the bottleneck bandwidth MAY be greatly | |||
reduced (generally 20-40%) and the TCP Transfer time would be | reduced (generally 20-40%) and the actual TCP Transfer time MAY be | |||
proportionally longer than the ideal transfer time. | proportionally longer than the Ideal TCP Transfer time. | |||
Additionally, the TCP Efficiency metric is useful, since | Additionally, the TCP Efficiency metric is useful, since | |||
non-RED implementations will exhibit a lower TCP Efficiency | non-RED implementations MAY exhibit a lower TCP Efficiency. | |||
than RED implementations. | ||||
4. Security Considerations | 4. Security Considerations | |||
The security considerations that apply to any active measurement of | The security considerations that apply to any active measurement of | |||
live networks are relevant here as well. See [RFC4656] and | live networks are relevant here as well. See [RFC4656] and | |||
[RFC5357]. | [RFC5357]. | |||
5. IANA Considerations | 5. IANA Considerations | |||
This memo does not require and IANA registration for ports dedicated | This document does not REQUIRE an IANA registration for ports | |||
to the TCP testing described in this memo. | dedicated to the TCP testing described in this document. | |||
6. Acknowledgements | ||||
The author would like to thank Gilles Forget, Loki Jorgenson, | 6. Acknowledgments | |||
and Reinhard Schrage for technical review and original contributions | ||||
to this draft-06. | ||||
Also thanks to Matt Mathis, Matt Zekauskas, Al Morton, and Yaakov | Thanks to Matt Mathis, Matt Zekauskas, Al Morton, Rudi Geib, | |||
Stein for many good comments and for pointing us to great sources of | Yaakov Stein, and Loki Jorgenson for many good comments and for pointing | |||
information pertaining to past works in the TCP capacity area. | us to great sources of information pertaining to past works in the | |||
TCP capacity area. | ||||
7. References | 7. References | |||
7.1 Normative References | 7.1 Normative References | |||
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
Requirement Levels", BCP 14, RFC 2119, March 1997. | Requirement Levels", BCP 14, RFC 2119, March 1997. | |||
[RFC4656] Shalunov, S., Teitelbaum, B., Karp, A., Boote, J., and M. | [RFC4656] Shalunov, S., Teitelbaum, B., Karp, A., Boote, J., and M. | |||
Zekauskas, "A One-way Active Measurement Protocol | Zekauskas, "A One-way Active Measurement Protocol | |||
(OWAMP)", RFC 4656, September 2006. | (OWAMP)", RFC 4656, September 2006. | |||
[RFC5681] Allman, M., Paxson, V., Stevens W., "TCP Congestion | [RFC5681] Allman, M., Paxson, V., Stevens W., "TCP Congestion | |||
Control", RFC 5681, September 2009. | Control", RFC 5681, September 2009. | |||
[RFC2544] Bradner, S., McQuaid, J., "Benchmarking Methodology for | [RFC2544] Bradner, S., McQuaid, J., "Benchmarking Methodology for | |||
Network Interconnect Devices", RFC 2544, June 1999 | Network Interconnect Devices", RFC 2544, June 1999 | |||
[RFC3449] Balakrishnan, H., Padmanabhan, V. N., Fairhurst, G., | ||||
Sooriyabandara, M., "TCP Performance Implications of | ||||
Network Path Asymmetry", RFC 3449, December 2002 | ||||
[RFC5357] Hedayat, K., Krzanowski, R., Morton, A., Yum, K., Babiarz, | [RFC5357] Hedayat, K., Krzanowski, R., Morton, A., Yum, K., Babiarz, | |||
J., "A Two-Way Active Measurement Protocol (TWAMP)", | J., "A Two-Way Active Measurement Protocol (TWAMP)", | |||
RFC 5357, October 2008 | RFC 5357, October 2008 | |||
[RFC4821] Mathis, M., Heffner, J., "Packetization Layer Path MTU | [RFC4821] Mathis, M., Heffner, J., "Packetization Layer Path MTU | |||
Discovery", RFC 4821, June 2007 | Discovery", RFC 4821, June 2007 | |||
[btc-cap] Allman, M., "A Bulk Transfer Capacity Methodology for | [btc-cap] Allman, M., "A Bulk Transfer Capacity Methodology for | |||
Cooperating Hosts", draft-ietf-ippm-btc-cap-00 (work in | Cooperating Hosts", draft-ietf-ippm-btc-cap-00 (work in | |||
progress), August 2001 | progress), August 2001 | |||
[RFC2681] Almes G., Kalidindi S., Zekauskas, M., "A Round-trip Delay | ||||
Metric for IPPM", RFC 2681, September, 1999 | ||||
[RFC4898] Mathis, M., Heffner, J., Raghunarayan, R., "TCP Extended | ||||
Statistics MIB", May 2007 | ||||
[RFC5136] Chimento P., Ishac, J., "Defining Network Capacity", | ||||
February 2008 | ||||
[RFC1323] Jacobson, V., Braden, R., Borman D., "TCP Extensions for | ||||
High Performance", May 1992 | ||||
7.2. Informative References | 7.2. Informative References | |||
Authors' Addresses | Authors' Addresses | |||
Barry Constantine | Barry Constantine | |||
JDSU, Test and Measurement Division | JDSU, Test and Measurement Division | |||
One Milestone Center Court | One Milestone Center Court | |||
Germantown, MD 20876-7100 | Germantown, MD 20876-7100 | |||
USA | USA | |||
skipping to change at page 20, line 47 | skipping to change at page 23, line 49 | |||
barry.constantine@jdsu.com | barry.constantine@jdsu.com | |||
Gilles Forget | Gilles Forget | |||
Independent Consultant to Bell Canada. | Independent Consultant to Bell Canada. | |||
308, rue de Monaco, St-Eustache | 308, rue de Monaco, St-Eustache | |||
Qc. CANADA, Postal Code : J7P-4T5 | Qc. CANADA, Postal Code : J7P-4T5 | |||
Phone: (514) 895-8212 | Phone: (514) 895-8212 | |||
gilles.forget@sympatico.ca | gilles.forget@sympatico.ca | |||
Loki Jorgenson | ||||
nooCore | ||||
Phone: (604) 908-5833 | ||||
ljorgenson@nooCore.com | ||||
Reinhard Schrage | Reinhard Schrage | |||
Schrage Consulting | Schrage Consulting | |||
Phone: +49 (0) 5137 909540 | Phone: +49 (0) 5137 909540 | |||
reinhard@schrageconsult.com | reinhard@schrageconsult.com | |||
End of changes. 123 change blocks. | ||||
321 lines changed or deleted | 455 lines changed or added | |||