draft-ietf-bmwg-protection-meth-06.txt   draft-ietf-bmwg-protection-meth-07.txt 
Network Working Group R. Papneja Network Working Group R. Papneja
Internet Draft Isocore Internet Draft Isocore
Intended Status: Informational Intended Status: Informational
Expires: April 2010 S. Vapiwala Expires: June 2010 S. Vapiwala
J. Karthik J. Karthik
Cisco Systems Cisco Systems
S. Poretsky S. Poretsky
Allot Communications Allot Communications
S. Rao S. Rao
Qwest Communications Qwest Communications
J.L. Le Roux J.L. Le Roux
France Telecom France Telecom
October 2009 December 2009
Methodology for benchmarking MPLS protection mechanisms Methodology for benchmarking MPLS protection mechanisms
draft-ietf-bmwg-protection-meth-06.txt draft-ietf-bmwg-protection-meth-07.txt
Status of this Memo Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with the This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet- other groups may also distribute working documents as Internet-
Drafts. Drafts.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt. http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
This Internet-Draft will expire on April 15, 2010. This Internet-Draft will expire on June 20, 2010.
Copyright Notice Copyright Notice
Copyright (c) 2009 IETF Trust and the persons identified as the Copyright (c) 2009 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents in effect on the date of Provisions Relating to IETF Documents in effect on the date of
publication of this document (http://trustee.ietf.org/license-info). publication of this document (http://trustee.ietf.org/license-info).
Please review these documents carefully, as they describe your rights Please review these documents carefully, as they describe your rights
and restrictions with respect to this document. and restrictions with respect to this document.
skipping to change at page 4, line 30 skipping to change at page 4, line 30
A correlated failure is the simultaneous occurrence A correlated failure is the simultaneous occurrence
of two or more failures. A typical example is failure of a logical of two or more failures. A typical example is failure of a logical
resource (e.g. layer-2 links) due to a dependency on a common resource (e.g. layer-2 links) due to a dependency on a common
physical resource (e.g. common conduit) that fails. Within physical resource (e.g. common conduit) that fails. Within
the context of MPLS protection mechanisms, failures that arise due the context of MPLS protection mechanisms, failures that arise due
to Shared Risk Link Groups (SRLG) [MPLS-FRR-EXT] can be considered to Shared Risk Link Groups (SRLG) [MPLS-FRR-EXT] can be considered
as correlated failures. Not all correlated failures are as correlated failures. Not all correlated failures are
predictable in advance, for example, those caused by natural predictable in advance, for example, those caused by natural
disasters. disasters.
MPLS Fast Re-Route (MPLS-FRR) allows for the possibility that the
Label Switched Paths can be re-optimized in the minutes following
Failover. IP Traffic would be re-routed according to the preferred
path for the post-failure topology. Thus, MPLS-FRR includes an
additional step to the General model:
1. Failover Event - Primary Path (Working Path) fails
2. Failure Detection - Failover Event is detected
3. a. Failover - Working Path switched to Backup path
3. b. Re-Optimization of Working Path (possible change from Backup
Path)
4. Restoration - Primary Path recovers from a Failover Event
5. Reversion (optional) - Working Path returns to Primary Path
2. Document Scope 2. Document Scope
This document provides detailed test cases along with different This document provides detailed test cases along with different
topologies and scenarios that should be considered to effectively topologies and scenarios that should be considered to effectively
benchmark MPLS protection mechanisms and failover times on the benchmark MPLS protection mechanisms and failover times on the
Data Plane. Different Failover Events and scaling considerations Data Plane. Different Failover Events and scaling considerations
are also provided in this document. are also provided in this document.
All benchmarking testcases defined in this document apply to both All benchmarking testcases defined in this document apply to both
facility backup and local protection enabled in detour mode. The facility backup and local protection enabled in detour mode. The
test cases cover all possible failure scenarios and the test cases cover all possible failure scenarios and the
associated procedures benchmark the performance of the Device associated procedures benchmark the performance of the Device
Under Test (DUT) to recover from failures. Data plane traffic is Under Test (DUT) to recover from failures. Data plane traffic is
used to benchmark failover times. used to benchmark failover times.
Benchmarking of correlated failures is out of scope of this Benchmarking of correlated failures is out of scope of this
document. Protection from Bi-directional Forwarding Detection document. Protection from Bi-directional Forwarding Detection
(BFD) is outside the scope of this document. (BFD) is outside the scope of this document.
As described above, MPLS-FRR may include a Re-optimization of the
Working Path, with possible packet transfer impairments.
Characterization of Re-optimization is beyond the scope of this memo.
Protection Mechanisms Protection Mechanisms
3. Existing definitions 3. Existing definitions
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in BCP 14, RFC 2119 document are to be interpreted as described in BCP 14, RFC 2119
[Br97]. RFC 2119 defines the use of these key words to help make the [Br97]. RFC 2119 defines the use of these key words to help make the
intent of standards track documents as clear as possible. While this intent of standards track documents as clear as possible. While this
document uses these keywords, this document is not a standards track document uses these keywords, this document is not a standards track
skipping to change at page 8, line 45 skipping to change at page 8, line 45
Fast Reroute provides a method to return or restore an original Fast Reroute provides a method to return or restore an original
primary LSP upon recovery from the failure (Restoration) and to primary LSP upon recovery from the failure (Restoration) and to
switch traffic from the Backup Path to the restored Primary Path switch traffic from the Backup Path to the restored Primary Path
(Reversion). In MPLS-FRR, Reversion can be implemented as Global (Reversion). In MPLS-FRR, Reversion can be implemented as Global
Reversion or Local Reversion. It is important to include Reversion or Local Reversion. It is important to include
Restoration and Reversion as a step in each test case to measure Restoration and Reversion as a step in each test case to measure
the amount of packet loss, out of order packets, or duplicate the amount of packet loss, out of order packets, or duplicate
packets that is produced. packets that is produced.
Note: In addition to restoration and reversion, re-optimization
can take place before the failure has recovered, depending on
the user configuration and the re-optimization timers.
5.7. Offered Load 5.7. Offered Load
It is suggested that there be one or more traffic streams as long as It is suggested that there be one or more traffic streams as long as
there is a steady and constant rate of flow for all the streams. In there is a steady and constant rate of flow for all the streams. In
order to monitor the DUT performance for recovery times, a set of order to monitor the DUT performance for recovery times, a set of
route prefixes should be advertised before traffic is sent. The route prefixes should be advertised before traffic is sent. The
traffic should be configured towards these routes. traffic should be configured towards these routes.
A typical example would be configuring the traffic generator to send At least 16 flows should be used, and more if possible. Prefix-
the traffic to the first, middle and last of the advertised routes. dependency behaviors are key in IP and tests with route-specific
(First, middle and last could be decided by the numerically
smallest, median and the largest respectively of the advertised flows spread across the routing table will reveal this dependency.
prefix). Generating traffic to all of the prefixes reachable by the Generating traffic to all of the prefixes reachable by the
protected tunnel (probably in a Round-Robin fashion, where the protected tunnel (probably in a Round-Robin fashion, where the
traffic is destined to all the prefixes but one prefix at a time in traffic is destined to all the prefixes but one prefix at a time in
a cyclic manner) is not recommended. The reason why traffic a cyclic manner) is not recommended. The reason why traffic
generation is not recommended in a Round-Robin fashion to all the generation is not recommended in a Round-Robin fashion to all the
prefixes, one at a time is that if there are many prefixes reachable prefixes, one at a time is that if there are many prefixes reachable
through the LSP the time interval between 2 packets destined to one through the LSP the time interval between 2 packets destined to one
prefix may be significantly high and may be comparable with the prefix may be significantly high and may be comparable with the
failover time being measured which does not aid in getting an failover time being measured which does not aid in getting an
accurate failover measurement. accurate failover measurement.
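The Round-Robin concern above can be illustrated with a short calculation. The following is a minimal sketch, not part of either draft revision: the function names, the 16-flow default, and the example numbers are assumptions chosen for illustration. It estimates the gap between two packets sent to the same prefix when the generator cycles through every prefix, and shows one possible way to pick flows spread across the advertised routes.

```python
def round_robin_gap_ms(num_prefixes: int, offered_load_pps: float) -> float:
    """Gap in milliseconds between two packets sent to the same prefix
    when the generator cycles through every prefix one packet at a time."""
    if num_prefixes <= 0 or offered_load_pps <= 0:
        raise ValueError("num_prefixes and offered_load_pps must be positive")
    return num_prefixes * 1000.0 / offered_load_pps


def spread_flows(prefixes, num_flows=16):
    """Pick num_flows prefixes evenly spaced across the sorted table,
    always including the numerically smallest and largest entries.
    (Hypothetical helper; the draft does not prescribe a selection rule.)"""
    ordered = sorted(prefixes)
    if num_flows < 2:
        raise ValueError("num_flows must be at least 2")
    if num_flows >= len(ordered):
        return ordered
    last = len(ordered) - 1
    # Integer arithmetic keeps the indices distinct and ends at the last entry.
    return [ordered[i * last // (num_flows - 1)] for i in range(num_flows)]
```

For example, cycling through 10,000 prefixes at an Offered Load of 100,000 pps leaves a gap of 100 ms between packets to any single prefix, so the per-prefix measurement granularity may be no finer than the Failover Time itself.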
skipping to change at page 9, line 30 skipping to change at page 9, line 30
It is RECOMMENDED that the Tester used to execute each test case It is RECOMMENDED that the Tester used to execute each test case
have the following capabilities: have the following capabilities:
1. Ability to establish MPLS-TE tunnels and push/pop labels. 1. Ability to establish MPLS-TE tunnels and push/pop labels.
2. Ability to produce Failover Event [TERM-ID]. 2. Ability to produce Failover Event [TERM-ID].
3. Ability to insert a timestamp in each data packet's IP 3. Ability to insert a timestamp in each data packet's IP
payload. payload.
4. An internal time clock to control timestamping, time 4. An internal time clock to control timestamping, time
measurements, and time calculations. measurements, and time calculations.
5. Ability to disable or tune specific Layer-2 and Layer-3 5. Ability to disable or tune specific Layer-2 and Layer-3
protocol functions on any interface(s). protocol functions on any interface(s).
6. Ability to react upon the receipt of path error from the PLR.
The Tester MAY be capable of making non-data plane convergence The Tester MAY be capable of making non-data plane convergence
observations and use those observations for measurements. observations and use those observations for measurements.
6. Reference Test Setup 6. Reference Test Setup
In addition to the general reference topology shown in figure 1, In addition to the general reference topology shown in figure 1,
this section provides detailed insight into various proposed test this section provides detailed insight into various proposed test
setups that should be considered for comprehensively benchmarking setups that should be considered for comprehensively benchmarking
the failover time in different roles along the primary tunnel: the failover time in different roles along the primary tunnel:
skipping to change at page 16, line 5 skipping to change at page 15, line 44
The procedure described in this section can be applied to all the 8 The procedure described in this section can be applied to all the 8
base test cases and the associated topologies. The backup as well as base test cases and the associated topologies. The backup as well as
the primary tunnels are configured to be alike in terms of bandwidth the primary tunnels are configured to be alike in terms of bandwidth
usage. In order to benchmark failover with all possible label stack usage. In order to benchmark failover with all possible label stack
depth applicable as seen with current deployments, it is RECOMMENDED depth applicable as seen with current deployments, it is RECOMMENDED
to perform all of the test cases provided in this section. The to perform all of the test cases provided in this section. The
forwarding performance test cases in section 7.1 MUST be performed forwarding performance test cases in section 7.1 MUST be performed
prior to performing the failover test cases. prior to performing the failover test cases.
The considerations of Section 4 of [RFC2544] are applicable when
evaluating the results obtained using these methodologies as well.
7.1. MPLS FRR Forwarding Performance 7.1. MPLS FRR Forwarding Performance
Benchmarking Failover Time [TERM-ID] for MPLS protection first Benchmarking Failover Time [TERM-ID] for MPLS protection first
requires baseline measurement of the forwarding performance of the requires baseline measurement of the forwarding performance of the
test topology including the DUT. Forwarding performance is test topology including the DUT. Forwarding performance is
benchmarked by the metric Throughput as defined in [Br91] and benchmarked by the Throughput as defined in [MPLS-FWD] and
measured in units pps. This section provides two test cases to measured in units pps. This section provides two test cases to
benchmark forwarding performance. These are with the DUT benchmark forwarding performance. These are with the DUT
configured as a Headend PLR, Mid-Point PLR, and Egress PLR. configured as a Headend PLR, Mid-Point PLR, and Egress PLR.
7.1.1. Headend PLR Forwarding Performance 7.1.1. Headend PLR Forwarding Performance
Objective Objective
To benchmark the maximum rate (pps) on the PLR (as headend) over To benchmark the maximum rate (pps) on the PLR (as headend) over
primary LSP and backup LSP. primary LSP and backup LSP.
skipping to change at page 17, line 16 skipping to change at page 17, line 16
7.1.2. Mid-Point PLR Forwarding Performance 7.1.2. Mid-Point PLR Forwarding Performance
Objective Objective
To benchmark the maximum rate (pps) on the PLR (as mid-point) over To benchmark the maximum rate (pps) on the PLR (as mid-point) over
primary LSP and backup LSP. primary LSP and backup LSP.
Test Setup Test Setup
- Select any one topology out of 8 from section 6. - Select any one topology out of 9 from section 6.
- Select overlay technologies (e.g. IGP, VPN, or VC) with DUT - Select overlay technologies (e.g. IGP, VPN, or VC) with DUT
as Mid-Point PLR. as Mid-Point PLR.
- The DUT will also have 2 interfaces connected to the traffic - The DUT will also have 2 interfaces connected to the traffic
generator. generator.
Procedure Procedure
1. Establish the primary LSP on R1 required by the topology 1. Establish the primary LSP on R1 required by the topology
selected. selected.
2. Establish the backup LSP on R2 required by the selected 2. Establish the backup LSP on R2 required by the selected
skipping to change at page 21, line 17 skipping to change at page 21, line 17
7.4. Headend PLR with Node Failure 7.4. Headend PLR with Node Failure
Objective Objective
To benchmark the MPLS failover time due to Node failure events To benchmark the MPLS failover time due to Node failure events
described in section 5.1 experienced by the DUT which is the described in section 5.1 experienced by the DUT which is the
Headend PLR. Headend PLR.
Test Setup Test Setup
- Select any one topology from section 6.5 to 6.8 - Select any one topology from section 6
- Select overlay technology for FRR test (e.g. IGP, VPN, or VC) - Select overlay technology for FRR test (e.g. IGP, VPN, or VC)
- The DUT will also have 2 interfaces connected to the traffic - The DUT will also have 2 interfaces connected to the traffic
generator. generator.
Test Configuration Test Configuration
1. Configure the number of primaries on R2 and the backups on R2 1. Configure the number of primaries on R2 and the backups on R2
as required by the topology selected. as required by the topology selected.
2. Configure the test setup to support Reversion. 2. Configure the test setup to support Reversion.
3. Advertise prefixes (as per FRR Scalability table described in 3. Advertise prefixes (as per FRR Scalability table described in
skipping to change at page 22, line 17 skipping to change at page 22, line 17
7.5. Mid-Point PLR with Node failure 7.5. Mid-Point PLR with Node failure
Objective Objective
To benchmark the MPLS failover time due to Node failure events To benchmark the MPLS failover time due to Node failure events
described in section 5.1 experienced by the DUT which is the described in section 5.1 experienced by the DUT which is the
Mid-Point PLR. Mid-Point PLR.
Test Setup Test Setup
- Select any one topology from section 6.5 to 6.8. - Select any one topology from section 6.1 to 6.2.
- Select overlay technology for FRR test as Mid-Point LSPs. - Select overlay technology for FRR test as Mid-Point LSPs.
- The DUT will also have 2 interfaces connected to the traffic - The DUT will also have 2 interfaces connected to the traffic
generator. generator.
Test Configuration Test Configuration
1. Configure the number of primaries on R1 and the backups on 1. Configure the number of primaries on R1 and the backups on
R2 as required by the topology selected. R2 as required by the topology selected.
2. Configure the test setup to support Reversion. 2. Configure the test setup to support Reversion.
3. Advertise prefixes (as per FRR Scalability table described in 3. Advertise prefixes (as per FRR Scalability table described in
skipping to change at page 23, line 18 skipping to change at page 23, line 18
For each test, it is recommended that the results be reported in the For each test, it is recommended that the results be reported in the
following format. following format.
Parameter Units Parameter Units
IGP used for the test ISIS-TE/ OSPF-TE IGP used for the test ISIS-TE/ OSPF-TE
Interface types Gige,POS,ATM,VLAN etc. Interface types Gige,POS,ATM,VLAN etc.
Packet Sizes offered to the DUT Bytes Packet Sizes offered to the DUT Bytes (at layer 3)
Forwarding rate packets per second Offered Load packets per second
IGP routes advertised Number of IGP routes IGP routes advertised Number of IGP routes
Penultimate Hop Popping Used/Not Used Penultimate Hop Popping Used/Not Used
RSVP hello timers Milliseconds RSVP hello timers Milliseconds
Number of FRR tunnels Number of tunnels Number of Protected tunnels Number of tunnels
Number of VPN routes installed Number of VPN routes Number of VPN routes installed Number of VPN routes
on the Headend on the Headend
Number of VC tunnels Number of VC tunnels Number of VC tunnels Number of VC tunnels
Number of BGP routes BGP routes installed
Number of mid-point tunnels Number of tunnels Number of mid-point tunnels Number of tunnels
Number of Prefixes protected by Number of LSPs Number of Prefixes protected by Number of LSPs
Primary Primary
Topology being used Section number, and Topology being used Section number, and
figure reference figure reference
Failover Event Event type Failover Event Event type
Re-optimization Yes/No
Benchmarks (to be recorded for each test case): Benchmarks (to be recorded for each test case):
Failover- Failover-
Failover Time seconds Failover Time seconds
Failover Packet Loss packets Failover Packet Loss packets
Additive Backup Delay seconds Additive Backup Delay seconds
Out-of-Order Packets packets Out-of-Order Packets packets
Duplicate Packets packets Duplicate Packets packets
Failover Time Calculation Method Method Used
Reversion- Reversion-
Reversion Time seconds Reversion Time seconds
Reversion Packet Loss packets Reversion Packet Loss packets
Additive Backup Delay seconds Additive Backup Delay seconds
Out-of-Order Packets packets Out-of-Order Packets packets
Duplicate Packets packets Duplicate Packets packets
Failover Time Calculation Method Method Used
Failover Time suggested above is calculated using one of the Failover Time suggested above is calculated using one of the
following three methods following three methods
1. Packet-Based Loss method (PBLM): (Number of packets 1. Packet-Loss Based method (PLBM): (Number of packets
dropped/packets per second * 1000) milliseconds. This method dropped/packets per second * 1000) milliseconds. This method
could also be referred to as Rate Derived method. could also be referred to as Loss-Derived method.
2. Time-Based Loss Method (TBLM): This method relies on the 2. Time-Based Loss Method (TBLM): This method relies on the
ability of the Traffic generators to provide statistics which ability of the Traffic generators to provide statistics which
reveal the duration of failure in milliseconds based on when reveal the duration of failure in milliseconds based on when
the packet loss occurred (interval between non-zero packet loss the packet loss occurred (interval between non-zero packet loss
and zero loss). and zero loss).
3. Timestamp Based Method (TBM): This method of failover 3. Timestamp Based Method (TBM): This method of failover
calculation is based on the timestamp that gets transmitted as calculation is based on the timestamp that gets transmitted as
payload in the packets originated by the generator. The Traffic payload in the packets originated by the generator. The Traffic
Analyzer records the timestamp of the last packet received Analyzer records the timestamp of the last packet received
before the failover event and the first packet after the before the failover event and the first packet after the
failover and derives the time based on the difference between failover and derives the time based on the difference between
these 2 timestamps. Note: The payload could also contain these 2 timestamps. Note: The payload could also contain
sequence numbers for out-of-order packet calculation and sequence numbers for out-of-order packet calculation and
duplicate packets. duplicate packets.
The Timestamp Based Method can detect Reversion impairments beyond
loss; it is therefore the RECOMMENDED method for calculating
Failover Time.
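The calculation methods listed above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not text from either draft revision: the function and parameter names are hypothetical, timestamps are assumed to be in seconds from a single Tester clock, and the Offered Load is assumed to be constant during the test.

```python
def failover_time_plbm_ms(packets_dropped: int, offered_load_pps: float) -> float:
    """Packet-Loss Based method: packets dropped divided by the offered
    rate in packets per second, scaled to milliseconds."""
    if offered_load_pps <= 0:
        raise ValueError("offered_load_pps must be positive")
    return packets_dropped * 1000.0 / offered_load_pps


def failover_time_tbm_ms(last_ts_before_s: float, first_ts_after_s: float) -> float:
    """Timestamp Based Method: difference between the payload timestamp of
    the last packet received before the Failover Event and that of the
    first packet received after it, in milliseconds."""
    if first_ts_after_s < last_ts_before_s:
        raise ValueError("first_ts_after_s must not precede last_ts_before_s")
    return (first_ts_after_s - last_ts_before_s) * 1000.0
```

For example, 5,000 packets dropped at an Offered Load of 100,000 pps corresponds to a Failover Time of 50 ms under the loss-based calculation.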
9. Security Considerations 9. Security Considerations
Documents of this type do not directly affect the security of
Internet or corporate networks as long as benchmarking is not Benchmarking activities as described in this memo are limited to
performed on devices or systems connected to production networks. technology characterization using controlled stimuli in a laboratory
Security threats and how to counter these in SIP and the media environment, with dedicated address space and the constraints
layer is discussed in RFC3261, RFC3550, and RFC3711 and various specified in the sections above.
other drafts. This document attempts to formalize a set of
common methodology for benchmarking performance of failover The benchmarking network topology will be an independent test setup
mechanisms in a lab environment. and MUST NOT be connected to devices that may forward the test
traffic into a production network, or misroute traffic to the test
management network.
Further, benchmarking is performed on a "black-box" basis, relying
solely on measurements observable external to the DUT/SUT.
Special capabilities SHOULD NOT exist in the DUT/SUT specifically for
benchmarking purposes. Any implications for network security arising
from the DUT/SUT SHOULD be identical in the lab and in production
networks.
10. IANA Considerations 10. IANA Considerations
This document requires no IANA considerations. This document requires no IANA considerations.
11. References 11. References
11.1. Informative References 11.1. Informative References
NONE NONE
11.2. Normative References 11.2. Normative References
skipping to change at page 25, line 52 skipping to change at page 25, line 62
[Br97] Bradner, S., "Key words for use in RFCs to Indicate [Br97] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", RFC 2119, July 1997. Requirement Levels", RFC 2119, July 1997.
[Ma98] Mandeville, R., "Benchmarking Terminology for LAN [Ma98] Mandeville, R., "Benchmarking Terminology for LAN
Switching Devices", RFC 2285, February 1998. Switching Devices", RFC 2285, February 1998.
[Po06] Poretsky, S., et al., "Terminology for Benchmarking [Po06] Poretsky, S., et al., "Terminology for Benchmarking
Network-layer Traffic Control Mechanisms", RFC 4689, Network-layer Traffic Control Mechanisms", RFC 4689,
November 2006. November 2006.
[MPLS-FWD] Akhter, A., et al., "MPLS Forwarding Benchmarking
for IP Flows", draft-ietf-bmwg-mpls-forwarding-meth-06,
September 2009.
12. Acknowledgments 12. Acknowledgments
We would like to thank Jean Philip Vasseur for his invaluable input We would like to thank Jean Philip Vasseur for his invaluable input
to the document and Curtis Villamizar for his contribution in suggesting to the document and Curtis Villamizar for his contribution in suggesting
text on definition and need for benchmarking Correlated failures. text on definition and need for benchmarking Correlated failures.
Additionally we would like to thank Al Morton, Arun Gandhi, Additionally we would like to thank Al Morton, Arun Gandhi,
Amrit Hanspal, Karu Ratnam, Raveesh Janardan, Andrey Kiselev, and Amrit Hanspal, Karu Ratnam, Raveesh Janardan, Andrey Kiselev, and
Mohan Nanduri for their formal reviews of this document. Mohan Nanduri for their formal reviews of this document.
 End of changes. 27 change blocks. 
28 lines changed or deleted 73 lines changed or added
