Network Working Group                                         S. Bradner
Request for Comments: 1944                            Harvard University
Category: Informational                                       J. McQuaid
                                                            Bay Networks
                                                                May 1996

       Benchmarking Methodology for Network Interconnect Devices

Status of This Memo
This memo provides information for the Internet community. This memo
does not specify an Internet standard of any kind. Distribution of
this memo is unlimited.
Abstract

This document discusses and defines a number of tests that may be
used to describe the performance characteristics of a network
interconnecting device. In addition to defining the tests this
document also describes specific formats for reporting the results of
the tests. Appendix A lists the tests and conditions that we believe
should be included for specific cases and gives additional
information about testing practices. Appendix B is a reference
listing of maximum frame rates to be used with specific frame sizes
on various media and Appendix C gives some examples of frame formats
to be used in testing.
[...]
measure and report the performance characteristics of network
devices. The results of these tests will provide the user comparable
data from different vendors with which to evaluate these devices.

A previous document, "Benchmarking Terminology for Network
Interconnect Devices" (RFC 1242), defined many of the terms that are
used in this document. The terminology document should be consulted
before attempting to make use of this document.
2. Real world

In producing this document the authors attempted to keep in mind the
requirement that apparatus to perform the described tests must
actually be built. We do not know of "off the shelf" equipment
available to implement all of the tests but it is our opinion that
such equipment can be constructed.
3. Tests to be run

There are a number of tests described in this document. Not all of
the tests apply to all types of devices under test (DUTs). Vendors
should perform all of the tests that can be supported by a specific
type of product. The authors understand that it will take a
considerable period of time to perform all of the recommended tests
under all of the recommended conditions. We believe that the results
are worth the effort. Appendix A lists some of the tests and
conditions that we believe should be included for specific cases.
4. Evaluating the results

Performing all of the recommended tests will result in a great deal
of data. Much of this data will not apply to the evaluation of the
devices under each circumstance. For example, the rate at which a
router forwards IPX frames will be of little use in selecting a
router for an environment that does not (and will not) support that
protocol. Evaluating even that data which is relevant to a
particular network installation will require experience which may not
be readily available. Furthermore, selection of the tests to be run
and evaluation of the test data must be done with an understanding of
generally accepted testing practices regarding repeatability,
[...]
An implementation is not compliant if it fails to satisfy one or more
of the MUST requirements for the protocols it implements. An
implementation that satisfies all the MUST and all the SHOULD
requirements for its protocols is said to be "unconditionally
compliant"; one that satisfies all the MUST requirements but not all
the SHOULD requirements for its protocols is said to be
"conditionally compliant".
6. Test set up

The ideal way to implement this series of tests is to use a tester
with both transmitting and receiving ports. Connections are made
from the sending ports of the tester to the receiving ports of the
DUT and from the sending ports of the DUT back to the tester. (see
Figure 1) Since the tester both sends the test traffic and receives
it back, after the traffic has been forwarded by the DUT, the tester
can easily determine if all of the transmitted packets were received
and verify that the correct packets were received. The same
functionality can be obtained with separate transmitting and
receiving devices (see Figure 2) but unless they are remotely
controlled by some computer in a way that simulates the single
tester, the labor required to accurately perform some of the tests
(particularly the throughput test) can be prohibitive.
             +------------+
             |            |
+------------|   tester   |<-------------+
|            |            |              |
|            +------------+              |
|                                        |
|            +------------+              |
|            |            |              |
+----------->|    DUT     |--------------+
             |            |
             +------------+

                Figure 1
+--------+       +------------+       +----------+
|        |       |            |       |          |
| sender |------>|    DUT     |------>| receiver |
|        |       |            |       |          |
+--------+       +------------+       +----------+

                   Figure 2
6.1 Test set up for multiple media types

Two different setups could be used to test a DUT which is used in
real-world networks to connect networks of differing media type,
local Ethernet to a backbone FDDI ring for example. The tester could
support both media types in which case the set up shown in Figure 1
would be used.

Two identical DUTs are used in the other test set up. (see Figure 3)
In many cases this set up may more accurately simulate the real
world. For example, connecting two LANs together with a WAN link or
high speed backbone. This set up would not be as good at simulating
a system where clients on an Ethernet LAN were interacting with a
server on an FDDI backbone.
                      +-----------+
                      |           |
+---------------------|  tester   |<---------------------+
|                     |           |                      |
|                     +-----------+                      |
|                                                        |
|        +----------+            +----------+            |
|        |          |            |          |            |
+------->|  DUT 1   |----------->|  DUT 2   |------------+
         |          |            |          |
         +----------+            +----------+

                       Figure 3
7. DUT set up

Before starting to perform the tests, the DUT to be tested MUST be
configured following the instructions provided to the user.
Specifically, it is expected that all of the supported protocols will
be configured and enabled during this set up (See Appendix A). It is
expected that all of the tests will be run without changing the
configuration or setup of the DUT in any way other than that required
to do the specific test. For example, it is not acceptable to change
the size of frame handling buffers between tests of frame handling
rates or to disable all but one transport protocol when testing the
throughput of that protocol. It is necessary to modify the
[...]
To determine the latency as defined in RFC 1242.

Procedure:
First determine the throughput for the DUT at each of the listed
frame sizes. Send a stream of frames at a particular frame size
through the DUT at the determined throughput rate to a specific
destination. The stream SHOULD be at least 120 seconds in
duration. An identifying tag SHOULD be included in one frame
after 60 seconds with the type of tag being implementation
dependent. The time at which this frame is fully transmitted is
recorded (timestamp A). The receiver logic in the test equipment
MUST recognize the tag information in the frame stream and record
the time at which the tagged frame was received (timestamp B).
The latency is timestamp B minus timestamp A as per the relevant
definition from RFC 1242, namely latency as defined for store and
forward devices or latency as defined for bit forwarding devices.
The test MUST be repeated at least 20 times with the reported
value being the average of the recorded values.

This test SHOULD be performed with the test frame addressed to the
same destination as the rest of the data stream and also with each
of the test frames addressed to a new destination network.
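The bookkeeping above reduces to simple arithmetic once the tester has captured the two timestamps. The following is a minimal sketch with hypothetical names, assuming the tester supplies one (timestamp A, timestamp B) pair per trial:

```python
# Hypothetical sketch of the latency calculation described above.
# Each trial yields timestamp A (tagged frame fully transmitted) and
# timestamp B (tagged frame received); which RFC 1242 definition of
# latency results depends on where the tester takes its timestamps,
# and that choice MUST be stated in the report.

def average_latency(trials):
    """trials: list of (timestamp_a, timestamp_b) pairs, one per trial."""
    if len(trials) < 20:
        raise ValueError("the procedure requires at least 20 trials")
    latencies = [b - a for a, b in trials]
    return sum(latencies) / len(latencies)
```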
Reporting format:
The report MUST state which definition of latency (from RFC 1242)
was used for this test. The latency results SHOULD be reported
in the format of a table with a row for each of the tested frame
sizes. There SHOULD be columns for the frame size, the rate at
which the latency test was run for that frame size, for the media
types tested, and for the resultant latency values for each
type of data stream tested.
26.3 Frame loss rate

Objective:
To determine the frame loss rate, as defined in RFC 1242, of a DUT
throughout the entire range of input data rates and frame sizes.

Procedure:
Send a specific number of frames at a specific rate through the
DUT to be tested and count the frames that are transmitted by the
[...]
Send a burst of frames with minimum inter-frame gaps to the DUT
and count the number of frames forwarded by the DUT. If the count
of transmitted frames is equal to the number of frames forwarded
the length of the burst is increased and the test is rerun. If
the number of forwarded frames is less than the number
transmitted, the length of the burst is reduced and the test is
rerun.

The back-to-back value is the number of frames in the longest
burst that the DUT will handle without the loss of any frames.
The trial length MUST be at least 2 seconds and SHOULD be
repeated at least 50 times with the average of the recorded values
being reported.
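The increase/reduce search described above can be organized as a binary search over burst lengths. This is an illustrative sketch, not a prescribed implementation; `send_burst` is a hypothetical stand-in for the tester and returns the count of frames the DUT forwarded out of a burst:

```python
# Binary search for the longest lossless burst, following the
# increase-on-success / reduce-on-loss procedure described above.
# send_burst(n) is assumed to transmit a burst of n frames with
# minimum inter-frame gaps and return the number forwarded.

def longest_lossless_burst(send_burst, upper_bound):
    lo, hi = 1, upper_bound
    best = 0
    while lo <= hi:
        burst = (lo + hi) // 2
        if send_burst(burst) == burst:   # no loss: try a longer burst
            best = burst
            lo = burst + 1
        else:                            # loss: try a shorter burst
            hi = burst - 1
    return best
```

A linear grow/shrink loop, as literally described, would also satisfy the procedure; the binary search simply converges in fewer trials.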
Reporting format:
The back-to-back results SHOULD be reported in the format of a
table with a row for each of the tested frame sizes. There SHOULD
be columns for the frame size and for the resultant average frame
count for each type of data stream tested. The standard deviation
for each measurement MAY also be reported.
[...]
Procedure:
First determine the throughput for a DUT at each of the listed
frame sizes.

Send a stream of frames at a rate 110% of the recorded throughput
rate or the maximum rate for the media, whichever is lower, for at
least 60 seconds. At Timestamp A reduce the frame rate to 50% of
the above rate and record the time of the last frame lost
(Timestamp B). The system recovery time is determined by
subtracting Timestamp A from Timestamp B. The test SHOULD be
repeated a number of times with the average of the recorded values
being reported.
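As a rough sketch of the arithmetic above (hypothetical helper names; the tester is assumed to supply the measured throughput, the media maximum rate, and the timestamps):

```python
# Sketch of the recovery-time bookkeeping described above.  The
# overload rate is 110% of the measured throughput, capped at the
# medium's maximum frame rate.  Recovery time is Timestamp B (last
# frame lost) minus Timestamp A (the moment the rate dropped to 50%),
# since the last loss occurs after the rate reduction.

def overload_rate(throughput, media_max_rate):
    return min(1.10 * throughput, media_max_rate)

def recovery_time(trials):
    """trials: list of (timestamp_a, timestamp_b) pairs, averaged."""
    times = [b - a for a, b in trials]
    return sum(times) / len(times)
```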
Reporting format:
The system recovery results SHOULD be reported in the format of a
table with a row for each of the tested frame sizes. There SHOULD
be columns for the frame size, the frame rate used as the
throughput rate for each type of data stream tested, and for the
measured recovery time for each type of data stream tested.
[...]
SHOULD be tested.

Reporting format:
The reset value SHOULD be reported in a simple set of statements,
one for each reset type.
27. Security Considerations

Security issues are not addressed in this document.
28. Editors' Addresses

Scott Bradner
Harvard University
1350 Mass. Ave, room 813
Cambridge, MA 02138

Phone: +1 617 495-3864
Fax: +1 617 496-8500
EMail: sob@harvard.edu

Jim McQuaid
Bay Networks
3 Federal Street
Billerica, MA 01821

Phone: +1 508 436-3915
Fax: +1 508 670-8145
EMail: jmcquaid@baynetworks.com
Appendix A: Testing Considerations

A.1 Scope Of This Appendix

This appendix discusses certain issues in the benchmarking
methodology where experience or judgment may play a role in the tests
selected to be run or in the approach to constructing the test with a
particular DUT. As such, this appendix MUST not be read as an
amendment to the methodology described in the body of this document
[...]
between the protocol node address and the MAC address. The
Address Resolution Protocol (ARP) is used to perform this
function in TCP/IP. No such procedure is required in XNS or
IPX because the MAC address is used as the protocol node
address.

In the ideal case the tester would be able to respond to ARP
requests from the DUT. In cases where this is not possible an
ARP request should be sent to the router's "output" port. This
request should be seen as coming from the immediate destination
of the test frame stream. (i.e. the phantom router (Figure 2)
or the end node if adjacent network routing is being used.) It
is assumed that the router will cache the MAC address of the
requesting device. The ARP request should be sent 5 seconds
before the test frame stream starts in each trial. Trial
lengths of longer than 50 seconds may require that the router
be configured for an extended ARP timeout.
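One way the tester might construct such an ARP request is sketched below, assuming an Ethernet medium. The function name is hypothetical, and the frame omits the FCS and the minimum-length padding that real test equipment would add:

```python
import struct

def build_arp_request(src_mac, src_ip, target_ip):
    """Minimal Ethernet/ARP request frame (no FCS, no padding).

    src_mac: 6 bytes; src_ip and target_ip: 4 bytes each.  The source
    addresses would be those of the phantom router (or end node) so
    that the DUT caches that MAC address, per the text above.
    """
    # Ethernet header: broadcast destination, EtherType 0x0806 (ARP)
    eth = struct.pack("!6s6sH", b"\xff" * 6, src_mac, 0x0806)
    # ARP body: Ethernet/IPv4, opcode 1 (request), unknown target MAC
    arp = struct.pack("!HHBBH6s4s6s4s",
                      1,          # htype: Ethernet
                      0x0800,     # ptype: IPv4
                      6, 4,       # hlen, plen
                      1,          # oper: request
                      src_mac, src_ip,
                      b"\x00" * 6, target_ip)
    return eth + arp
```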
          +--------+            +------------+
          |        |            |  phantom   |------ P LAN A
IN A------|  DUT   |------------|            |------ P LAN B
          |        |   OUT A    |  router    |------ P LAN C
          +--------+            +------------+

                     Figure 2

In the case where full routing is being used
C.2.4.2 Routing Update Frame

If the test does not involve adjacent net routing the tester
must supply proper routing information using a routing update.
A single routing update is used before each trial on each
"destination" port (see section C.2.4). This update includes
the network addresses that are reachable through a phantom
router on the network attached to the port. For a full mesh
test, one destination network address is present in the routing
update for each of the "input" ports. The test stream on each
"input" port consists of a repeating sequence of frames, one to
each of the "output" ports.
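As an illustration only, a routing update of this kind could be encoded as a RIP version 1 response advertising the phantom networks. The text does not mandate RIP, so treat this as one possible realization (hypothetical function name; RIP payload only, with the UDP/IP/MAC encapsulation omitted):

```python
import struct

def build_rip1_update(destinations):
    """RIP-1 response advertising the networks behind a phantom router.

    destinations: list of (ip_bytes, metric) tuples, one entry per
    network address reachable through the phantom router.
    """
    pkt = struct.pack("!BBH", 2, 1, 0)   # command=2 (response), version=1
    for ip, metric in destinations:
        # 20-byte entry: AF_INET (2), zero pad, IP, 8 zero bytes, metric
        pkt += struct.pack("!HH4s8xI", 2, 0, ip, metric)
    return pkt
```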
C.2.4.3 Management Query Frame

The management overhead test uses SNMP to query a set of
variables that should be present in all DUTs that support SNMP.
The variables for a single interface only are read by an NMS
at the appropriate intervals. The list of variables to
retrieve follows:

sysUpTime
ifInOctets
ifOutOctets
ifInUcastPkts
ifOutUcastPkts
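The counters above only become meaningful as rates when the interface is polled at least twice. A sketch of that reduction follows (hypothetical function, dict-based samples; MIB-II sysUpTime is in hundredths of a second, and counter-wrap handling is omitted for brevity):

```python
# Turn two successive polls of the MIB-II variables listed above into
# bit and frame rates for the single polled interface.

def interface_rates(sample1, sample2):
    """Each sample: dict with sysUpTime, ifInOctets, ifOutOctets,
    ifInUcastPkts, ifOutUcastPkts as polled by the NMS."""
    interval = (sample2["sysUpTime"] - sample1["sysUpTime"]) / 100.0
    return {
        "in_bps":  8 * (sample2["ifInOctets"] - sample1["ifInOctets"]) / interval,
        "out_bps": 8 * (sample2["ifOutOctets"] - sample1["ifOutOctets"]) / interval,
        "in_fps":  (sample2["ifInUcastPkts"] - sample1["ifInUcastPkts"]) / interval,
        "out_fps": (sample2["ifOutUcastPkts"] - sample1["ifOutUcastPkts"]) / interval,
    }
```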
C.2.4.4 Test Frames

The test frame is a UDP Echo Request with enough data to fill
out the required frame size. The data should not be all bits