INTERNET-DRAFT                                    Editor: Nick Duffield
draft-ietf-psamp-framework-00                      AT&T Labs - Research

September 03, 2002

               A Framework for Passive Packet Measurement

    Copyright (C) The Internet Society (2001).  All Rights Reserved.

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at

   The list of Internet-Draft Shadow Directories can be accessed at


   A wide range of traffic engineering and troubleshooting tasks rely
   on reliable, timely, and detailed traffic measurements. We describe
   a passive packet measurement framework that (a) is general enough
   to serve as the basis for a wide range of operational tasks, and
   (b) relies on a small set of primitives that facilitate uniform
   deployment in router interfaces or dedicated measurement devices,
   even at very high speeds. This document describes the motivation
   for such a framework through several operational examples, defines
   the measurement primitives (filtering, sampling, and hashing), and
   illustrates their use.

   Comments on this document should be addressed to the PSAMP WG
   mailing list: psamp@ops.ietf.org
   To subscribe: psamp-request@ops.ietf.org, in body: subscribe
   Archive: https://ops.ietf.org/lists/psamp/

1 Motivation

   Framework: This document describes a framework for a standard
   set of capabilities for network elements to sample packets and
   report on them.  One motivation to standardize these capabilities
   comes from the requirement for measurement-based support for
   network management and control across multivendor domains.  This
   requires domain-wide consistency in the types of sampling schemes
   available, the manner in which the resulting measurements are
   presented, and consequently, consistency of the interpretation that
   can be put on them.

   Relation to other work: The measurement capabilities are positioned
   as suppliers of packet samples to higher level consumers, including
   both remote collectors and applications, and on-board
   measurement-based applications.  Indeed, development of the
   standards within the framework described here should take into
   account the measurement requirements of standards in other IETF
   WGs, including IPPM and TEWG. Conversely, we expect that aspects of
   this framework not specifically concerned with the central issue of
   packet sampling may be able to leverage work in other WGs. The
   prime example is the format and export of measurement reports,
   which may leverage the work of IPFIX.

   Applications: We first describe several representative operational
   applications that require traffic measurements at various levels of
   temporal and spatial granularity.

   Example 1: Troubleshooting

   A network operator typically monitors aggregate statistics on a
   per-link basis. Such aggregate statistics may include the total
   number of packets and bytes, and the number of dropped packets and
   bytes. These statistics are typically moving averages over
   relatively long time windows (e.g., 5 minutes), and serve as a
   coarse-grained indication of the operational health of the network.
   The most common method of obtaining such measurements is through the
   appropriate SNMP MIBs (MIB-II and vendor-specific MIBs).

   Suppose an operator detects a link that is persistently overloaded
   and experiences significant packet drop rates. There is a wide
   range of potential causes: routing parameters (e.g., OSPF link
   weights) that are poorly adapted to the traffic matrix, e.g.,
   because of a shift in that matrix; a denial of service attack or a
   flash crowd; a routing problem (link flapping). In most cases,
   aggregate link statistics are not sufficient to distinguish between
   such causes, and to decide on an appropriate corrective action. For
   example, if routing over two links is unstable, and the links flap
   between being overloaded and inactive, this might be averaged out
   in a 5-minute window, indicating moderate loads on both links.

   Hence, the operator must be able to drill down into the traffic on
   a link, and obtain measurements that are more fine-grained both in
   space and in time. The operator has to be able to determine how
   many bytes/packets are generated for each source/destination
   address, port number, and prefix, or other attributes, such as
   protocol number, MPLS forwarding equivalence class (FEC), type of
   service, etc. This allows the operator to pinpoint precisely the
   nature of the offending traffic. For example, in the case of a DDoS
   attack, the operator would see a significant fraction of traffic
   with an identical destination address.

   Example 2: Characterizing Demand

   Traffic engineering has two goals: optimizing the quality of
   service provided to customers, and optimizing the use of network
   resources.  This is achieved through network-wide control of
   routing, traffic classification and differentiation, and resource
   allocation. Traffic measurements are necessarily part of such a
   closed control loop.  Specifically, the operator has to be able to
   measure the total network-wide traffic demand at several levels of
   granularity and time scales.

   For example, in order to optimize intradomain routing by modifying
   OSPF link weights or by configuring MPLS tunnels, the volume per
   ingress-egress pair has to be measured (the traffic matrix).  At a
   longer time scale (weeks to months), measurements also drive
   topology and capacity planning and the management of peering
   agreements.  Topology and capacity planning involves upgrading
   links and routers and modifying the network topology to be
   well-adapted to the prevailing traffic pattern. This includes
   deciding where new customers should be attached. A natural
   representation for traffic demand to drive topology and capacity
   planning is a previous/next-hop AS traffic matrix, which
   characterizes demand in terms of neighboring ASs.  Managing peering
   agreements, i.e., making strategic decisions about setting up and
   retiring peering agreements, and modifying the terms of existing
   ones (e.g., where to interconnect with peers), benefits from a
   source/destination AS traffic matrix, because the set of
   neighboring ASs may change as a result of peering management.

   Therefore, in general, it is necessary to obtain averages over
   various time scales of the entire traffic carried by a network
   domain.  The spatial resolution of these averages includes the
   source and destination IP address, AS, prefix, port number, and the
   previous and next hop AS with respect to the measurement domain.
   Furthermore, if a service provider uses multiple service types, it
   should also be possible to measure these matrices individually per
   service type.

   Example 3: Direct Observation of Network Behavior

   In certain circumstances, precise information about the spatial
   flow of traffic through the network domain is required to detect
   and diagnose problems and verify correct network behavior.  For
   example, in the case of the overloaded link in Example 1, it would
   be very helpful to know the precise set of paths that packets
   traversing this link follow. This would readily reveal a routing
   problem such as a loop, or a link with a misconfigured weight. More
   generally, complex diagnosis scenarios can benefit from measurement
   of traffic intensities (and other attributes) over a set of paths
   that is constrained in some way. For example, if a multihomed
   customer complains about performance problems on one of the access
   links from a particular source address prefix, the operator should
   be able to examine in detail the traffic from that source prefix
   which also traverses the specified access link towards the
   customer.

   While it is in principle possible to obtain the spatial flow of
   traffic through auxiliary network state information, e.g., by
   downloading routing and forwarding tables from routers, this
   information is often unreliable, outdated, voluminous, and
   contingent on a network model. For operational purposes, a direct
   observation of traffic flow is more reliable, as it does not depend
   on any such auxiliary information. For example, if there were a bug
   in a router's software, direct observation would allow the operator
   to diagnose the effect of this bug, while an indirect method would
   not.

2 Goals

   The main goal of this proposal is to define a measurement framework
   that relies on three canonical primitives: packet sampling,
   filtering, and hashing.  A wide spectrum of applications, including
   those described in the previous section, is enabled by
   measurements obtained through combinations of these three
   primitives.  Furthermore, a sampling device based on these
   measurement primitives is relatively simple, as (a) it requires
   only minimal per-packet processing, and (b) it requires little
   (local) memory. Therefore, the proposed framework represents an
   effective tradeoff between implementation complexity and the range
   of traffic engineering applications and other operational tasks it
   can support.

   More generally, the following goals motivate the proposed framework:

   o Greatly assist a very wide range of applications that can be
   built on traffic measurement (Section 4), from a very small set of
   primitives implemented ubiquitously.

   o Aim for ubiquity, by including in the minimal set of primitives
   functions that can be implemented at maximal line rate with minimal
   additional state.

   o Aim for ubiquity, by not forcing tight integration with packet
   control actions (policing, marking, shaping, queueing).

   o Allow for extensibility, which can be applied where needed
   (depending on the application) for enhanced functionality.

   o Aim for flexibility in data export format and options.

   o Support different applications, teams, and organizations (e.g.,
   traffic engineering, marketing, billing) from a common data stream.

   o Allow for flexibility in implementation.  In particular, export
   of local router state information can be decoupled from export of
   usage information.

   o Allow easy configuration of sampling and export parameters, e.g.,
   for automated remote reconfiguration in response to measurements.

   o Allow transparent interpretation of measurements through
   inclusion of sampling configuration in the reporting stream.

   o Allow robust interpretation of measurements with respect to
   reports missing due to loss in transport, or omission at the
   measurement device.

3 Measurement Functionality

   3.1 Measurement Information Flow

   The framework for passive measurement has three main parts: the
   selection of packets for measurement, the creation and export of
   measurement reports, and the content and format of the measurement
   records.  Because of the increasing number of distinct measurement
   applications, we believe it is desirable to set up parallel
   measurement information flows from the stream of packets.  Each
   information flow should consist of independently-configurable
   pipelines for selecting packets and exporting measurement records.

   The processing of each measurement information flow should, as far
   as possible, be independent. However, resource constraints may
   prevent complete reporting on a packet selected for multiple
   information flows. In this case, reporting for the packet must be
   complete for at least one information flow; other information flows
   need only report that they selected the packet. The priority
   amongst information flows to report packets must be configurable.

   3.2 Packet Selection

   The function of packet selection is to select a subset out of the
   stream of all packets.  Selection may be used to select a subset of
   packets of interest based on their content, and/or to reduce the rate
   of packets into the measurement flow regardless of content.  Packet
   selection is performed by combining a number of measurement primitives
   described below. In this document we do not set any restrictions on
   the form these combinations can take.

   o Hashing:

   A hashing function operates on a subset of packet bits and
   associates the resulting hash with the packet.  Bit positions can
   be excluded from the input to the hashing function by masking. This
   ability would be used, for example, by applications that require
   the hash to be independent of packet header fields, such as TTL or
   header checksum, that may change during the packet's passage through
   the network.

   o Filtering:

   Filtering is accomplished by applying mask/match operations to any
   combination of bit positions from the packet and the configured
   hashes.  The mask/match operation is configurable independently for
   each filter. Higher level interfaces to the mask/match primitive
   may be used to specify masks and matches for particular fields, for
   example, for IP addresses and/or TCP/UDP port numbers.

   o Sampling:

   Each sampler will be individually configurable to sample packets
   with a certain probability p.  Examples are probabilistic sampling,
   in which each packet is selected quasirandomly with probability p,
   and deterministic sampling, in which packets are sampled
   periodically with period 1/p. In some sampling schemes, the
   sampling probability may depend on the packet content. Sampling at
   full line rate with probability p=1 is not excluded in principle,
   although resource constraints may not support it in practice.
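
   As an illustration only, the two sampling examples above might be
   sketched as follows. The class names, the select() method, and the
   use of Python's random module as the quasirandom source are
   assumptions made for this sketch; the framework does not prescribe
   them.

```python
import random


class ProbabilisticSampler:
    """Selects each packet independently with probability p."""

    def __init__(self, p, seed=None):
        if not 0.0 < p <= 1.0:
            raise ValueError("p must be in (0, 1]")
        self.p = p
        self._rng = random.Random(seed)  # assumed quasirandom source

    def select(self, packet):
        # random() returns a value in [0.0, 1.0), so p = 1 selects everything.
        return self._rng.random() < self.p


class PeriodicSampler:
    """Selects every N-th packet, where N is 1/p rounded to an integer."""

    def __init__(self, p):
        if not 0.0 < p <= 1.0:
            raise ValueError("p must be in (0, 1]")
        self.period = max(1, round(1.0 / p))
        self._count = 0

    def select(self, packet):
        self._count += 1
        if self._count == self.period:
            self._count = 0
            return True
        return False
```

   Neither sampler inspects the packet content here; a content-dependent
   scheme would compute p from the packet before making the decision.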

   In order to be able to function at line rates, each measurement
   primitive takes as its input only a packet itself, or quantities
   that have been calculated from the packet previously by other
   measurement primitives. Router state is not assumed to be available
   to the measurement primitives.
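
   The hashing and filtering primitives described in this section can
   be sketched in the same spirit. This is a minimal sketch under
   stated assumptions: packets are byte strings, masks are byte strings
   of per-bit selectors, and zlib.crc32 stands in for whatever hash
   function a device would actually use. The function names are
   hypothetical.

```python
import zlib


def masked_hash(packet: bytes, mask: bytes) -> int:
    """Hash only the bit positions of `packet` that are set in `mask`."""
    # Bytes of the packet beyond the end of the mask are excluded entirely.
    masked = bytes(b & m for b, m in zip(packet, mask))
    return zlib.crc32(masked)  # stand-in hash, an assumption of this sketch


def mask_match(packet: bytes, mask: bytes, match: bytes) -> bool:
    """Return True if the masked packet bits equal the masked match bits."""
    if len(mask) != len(match):
        raise ValueError("mask and match must have the same length")
    if len(packet) < len(mask):
        return False
    return all((b & m) == (v & m) for b, m, v in zip(packet, mask, match))
```

   For example, with an IPv4 header a mask that clears byte 8 (TTL) and
   bytes 10-11 (header checksum) gives the same hash for a packet
   before and after a hop, and a mask selecting the first three bytes
   of the source address implements a /24 source prefix filter.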

   3.3 Report Generation

   Although the primary goal of this draft is to set up a framework
   for the sampling operations themselves, utilization of the
   resulting measurements places requirements on the information
   available for export, and the methods by which reports are
   exported. Any scheme that can accommodate the framework described
   in this section and section 3.4 is a convenient candidate for the

   Report preparation involves selecting fields of interest from each
   sampled packet, then adjoining subsidiary information (e.g., hash
   values, byte and packet counts, timestamps, etc.) from the
   selection process and router state information.  The router state
   values may depend on the packet content (e.g., the IP prefix or
   Autonomous System associated with the destination address in the IP
   header, the input and output interfaces that carried the packet,
   etc.).  Reports may also include subsidiary quantities calculated
   as a function of the selected packet and the router state. To
   simplify the design, some of the subsidiary information and router
   state may be incorporated when the records are exported, rather
   than when the packets are selected. However, all such router state
   information must be included for reporting in a timely manner, in
   order that it reflects the actual state encountered by the packet.

   3.4 Measurement Record Format

   Report export involves the bundling of one or more measurement records
   and sending a packet to the collection system. This happens separately
   for every measurement flow. A report includes several types of
   information, such as:

   o Per-packet information: The measurement record for each sampled
   packet includes various header fields (e.g., IP addresses, port
   numbers, ToS bits, TCP flags, etc.), as well as subsidiary
   information (e.g., timestamp, input and output links, other router
   state, hash values, etc.).

   o Configuration information: The stream of reports should provide
   information about the configuration of the measurement flow (e.g.,
   the sampling frequency, the sampling technique and associated
   parameters, the match/mask filter, etc.).  This ensures that the
   measurement data are self-describing and allows the collection
   system to analyze the measurement data without a separate feed of
   the configuration state. Changes in configuration must be
   immediately reflected in the report stream.

   o Aggregate information: The reports should include sufficient
   information for the collection system to account for discarded
   measurement records and lost exported packets.  For example, the
   reports could include sequence numbers to enable the collection
   machine to detect lost reports.  The reports could include a count
   of the number of bytes and packets that matched the filter, or that
   passed both the filtering and sampling stages.
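
   One possible in-memory shape for such a report is sketched below.
   All class and field names are hypothetical and chosen only to mirror
   the three kinds of information listed above; no wire format is
   implied. The helper shows how a collection system might use the
   sequence numbers, assuming reports arrive in order and the counter
   does not wrap.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PacketRecord:
    # Per-packet information
    timestamp: float
    header_fields: Dict[str, int]  # e.g. addresses, ports, ToS bits
    subsidiary: Dict[str, int] = field(default_factory=dict)  # e.g. hash, links


@dataclass
class Report:
    sequence_number: int      # aggregate: lets the collector detect lost reports
    config: Dict[str, str]    # configuration: sampling technique, filter, ...
    packets_matched: int      # aggregate: packets that matched the filter
    bytes_matched: int        # aggregate: bytes that matched the filter
    records: List[PacketRecord] = field(default_factory=list)


def count_lost_reports(received: List[Report]) -> int:
    """Count gaps in the sequence numbers of reports received in order."""
    lost = 0
    for prev, cur in zip(received, received[1:]):
        lost += max(0, cur.sequence_number - prev.sequence_number - 1)
    return lost
```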

   To conserve storage space and network bandwidth, the device may
   compress the measurement records as they are stored or exported.
   Compression should be quite effective since the sampled packets may
   share many fields in common (especially if the filter focuses on
   packets with certain values in particular header fields).

   3.5 Export

   The device generating the measurement records is configured to
   transmit each measurement data flow to a collection system, identified
   by IP address and port number.  Exporting these records to external
   systems introduces several practical issues that have important
   implications on the analysis of the data:

   o Reliable vs. unreliable transport: The export of measurement records
   does not necessarily require reliable export.  In fact, retransmission
   of lost measurement packets could consume additional network resources
   and would require state on the device generating the records.  The
   device would have to be addressable, able to receive and process
   acknowledgments, and to store unacknowledged data.  These requirements
   would be a significant impediment to having uniform support for
   measurement on high-speed interface cards in IP routers.  Instead, we
   propose that PSAMP devices can support an unreliable export mechanism.
   Sequence numbers on the packets or the measurement records within the
   packet would indicate when loss has occurred, and the analysis of the
   measurement data can account for this loss.  In some sense, packet
   loss becomes just another form of sampling (albeit a less desirable,
   and less controlled, form of sampling).

   o Maximum delay in exporting records: The device may queue
   measurement records in order to export multiple records in a single
   packet.  However, the device should bound the delay in exporting
   measurement records, even if the number of records is small.  This
   is important for two reasons.  First, having an upper bound on the
   export delay ensures that the collection system has up-to-date
   information about the sampled packets.  Second, in some scenarios,
   the device may associate a timestamp with the record(s) at the
   export stage.  Limiting the delay in exporting the records places a
   tight bound on the inaccuracy in the timestamp information.

   The device can impose a (configurable) Maximum Transmission Unit
   (MTU) size for reports.

   o Configurable export rate: The device should impose a (configurable)
   limit on the number of measurement records per unit time.  Otherwise,
   the measurement device could overload the network and the collection
   system.  This problem would be exacerbated in the reliable transport
   mode, where the device would retransmit any lost packets (thereby
   imposing an additional load on the network).  At times, the device may
   generate new records faster than the allowed export rate.  In this
   situation, the device should discard the excess records rather than
   transmitting them to the collection system.  The device may record
   information (such as sequence numbers, or packet and byte counter
   values accumulated at the inputs and outputs of a packet selector) to
   aid the collection system in compensating for the missing data in any
   subsequent analysis.

   o Local Export: Packet reports may also be directly exported to
   on-board measurement-based applications, for example those that form
   composite statistics from more than one packet. Local export may be
   presented through an interface direct to the higher level
   applications, i.e., without employing the transport used for off-board
   export.
   3.6 Congestion-aware Transport

   Exported measurement traffic competes for resources with other
   Internet transfers.  Congestion-aware export is important to ensure
   that the measurement records do not overwhelm the capacity of the
   network or unduly degrade the performance of other applications.

   The collection server is the recipient of the traffic from the PSAMP
   device(s).  This server can detect congestion along the path from the
   PSAMP device through lost packets (which manifest themselves as gaps
   in the sequence numbers, or the absence of packets for a period of
   time).  The server can run an appropriate congestion-control algorithm
   to compute a new sending rate, and can reconfigure the PSAMP device
   with the new rate.  This is an attractive alternative to requiring the
   PSAMP device to receive acknowledgment packets.  The server can
   reconfigure the sending rate using an SNMP MIB or a command-line
   interface.  (Alternatively, the PSAMP group could define a
   light-weight configuration protocol for this purpose, if necessary.)
   Implementing the congestion control algorithm in the collection server
   has the added advantages of flexibility in adapting the sending rate
   and the ability to incorporate new congestion-control algorithms as
   they become available.
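
   A minimal sketch of such a server-side controller is given below,
   assuming an additive-increase, multiplicative-decrease rule. The
   class name, parameters, and default values are hypothetical; the
   step that pushes the new rate to the device (via a MIB, a
   command-line interface, or another protocol) is deliberately left
   out.

```python
class CollectorRateController:
    """AIMD-style export rate controller run by the collection server."""

    def __init__(self, rate, min_rate, max_rate, increase=10.0, decrease=2.0):
        self.rate = rate          # export rate, e.g. in records per second
        self.min_rate = min_rate
        self.max_rate = max_rate
        self.increase = increase  # additive step when no loss is seen
        self.decrease = decrease  # divisor applied when loss is seen
        self.expected_seq = None

    def on_interval(self, seqs):
        """Process the sequence numbers received during one interval.

        Returns the new rate that should be configured on the device.
        """
        # Receiving nothing at all is treated as loss, not as spare capacity.
        loss = not seqs
        for seq in seqs:
            if self.expected_seq is not None and seq > self.expected_seq:
                loss = True       # gap in the sequence numbers
            self.expected_seq = seq + 1
        if loss:
            self.rate = max(self.min_rate, self.rate / self.decrease)
        else:
            self.rate = min(self.max_rate, self.rate + self.increase)
        return self.rate
```

   Because the algorithm lives in the server, the divisor and step can
   be changed without touching the device.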

   o Changing the export rate: In congestion-aware transport, the export
   rate is dictated by the congestion control algorithm. The sender has
   to adapt to fluctuations in the export rate.  For example, a device
   could conceivably react to a reduction in the available export rate
   due to congestion by discarding excess records.  This approach may be
   an appropriate reaction to transient congestion.  Or, the device could
   operate with a smaller sampling rate.  This may be a more appropriate
   reaction to long-term congestion.  Or, the device could include fewer
   fields in the measurement records to reduce the volume of data that
   are exported to the collection machine.  In some cases, a collection
   server may receive measurement records from more than one device, and
   could decide to reduce the export rate at one device rather than the
   other (or reduce the rate for one of the filter banks at a device
   rather than the other), in order to prioritize the measurement data.
   This type of flexibility is valuable for network operators that
   collect measurement data from multiple locations to drive multiple
   applications.

   o Notions of fairness: In some cases, it may be reasonable to allow
   the collection server to have flexibility in deciding how aggressively
   to respond to congestion.  For example, the PSAMP device and the
   collection server may have a very small round-trip time relative to
   other traffic.  Conventional TCP-friendly congestion control would
   allocate a very large share of the bandwidth to this traffic.
   Instead, the collection server could apply an algorithm that reacts
   more aggressively to congestion to give a larger share of the
   bandwidth to other traffic (with larger RTTs).  In other cases, the
   measurement records may require a larger share of the bandwidth than
   other flows.  For example, consider a link that carries tens of
   thousands of flows, including some non TCP-friendly DoS attack
   traffic.  Restricting the PSAMP traffic to a "fair share" allocation
   may be too restrictive, and might limit the collection of the data
   necessary to diagnose the DoS attack.  In this situation, the
   collection server could conceivably have a less aggressive reaction to
   congestion (say, dividing the sending rate by 1.5 rather than 2 in the
   presence of loss) to claim a larger fraction of the link bandwidth.
   The collection server could also employ policies that allocate
   bandwidth in certain proportions among streams of measurement records
   from different PSAMP devices (or filter banks in a single device).

   o Behavior under overload and failure: The congestion control
   algorithm has to be robust to severe overload or complete loss of
   connectivity between the device and the collection system, and also to
   the failure of the device or the collection system. For example, in a
   scenario where the collection system is unable to reconfigure the
   export rate because of loss of reverse (collection system to device)
   connectivity, it is desirable that the device reduce the export rate
   automatically. Similarly, if no measurement reports reach the
   collection system because of loss of forward connectivity, the
   collection system should not react to this by increasing the export
   rate. This problem may be solved through periodic heartbeat packets in both
   directions (i.e., measurement reports in the forward direction,
   configuration refresh messages in the reverse direction). This allows
   each side to detect a loss in connectivity or outright failure and to
   react appropriately.
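
   On the device side, the automatic reduction described above could
   look like the following sketch. The exponential back-off per missed
   refresh period, the floor rate, and all names are assumptions made
   for illustration; the text above only requires that the device
   reduce its export rate when reverse connectivity is lost.

```python
class DeviceExportGuard:
    """Device-side fallback when configuration refreshes stop arriving."""

    def __init__(self, rate, refresh_timeout, floor_rate, backoff=2.0):
        self.rate = rate                        # last rate set by the collector
        self.refresh_timeout = refresh_timeout  # seconds allowed between refreshes
        self.floor_rate = floor_rate            # lowest rate the device falls to
        self.backoff = backoff                  # divisor per missed refresh period
        self.last_refresh = 0.0

    def on_refresh(self, new_rate, now):
        """Record a configuration refresh message from the collection system."""
        self.rate = new_rate
        self.last_refresh = now

    def current_rate(self, now):
        # Divide the rate once for each full timeout elapsed since the
        # last refresh, but never go below the floor rate.
        missed = int((now - self.last_refresh) // self.refresh_timeout)
        return max(self.floor_rate, self.rate / (self.backoff ** missed))
```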

   3.7 Configuration and Management

   All configuration parameters associated with the sampling of packets
   and export of measurements are to be contained in a MIB. A secure
   protocol is to be used to access the MIB for reconfiguration and
   retrieval of the parameters.

4 Applications

   We describe a representative set of operational applications
   enabled by the passive measurement device described in the previous
   section, by referring back to the examples in Section 1.

   Example 1: Troubleshooting

   Packet sampling is ideally suited to determine the composition of
   the traffic (e.g., on a link) in terms of various attributes
   (source and destination address and port numbers, prefix, protocol
   number, type of service, etc.). Typically, unfiltered sampling would
   be used to obtain a coarse-grained view of the traffic on a link.
   Once the characteristics of an interesting subset of traffic
   (e.g., a service type, or a source address prefix corresponding to
   some customer) have been identified, the resolution can be refined
   by filtering for this traffic, and by boosting the sampling rate
   correspondingly. In this way, the traffic can be examined and
   characterized ("sliced and diced") arbitrarily.
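
   To make the first step concrete, the sketch below estimates the
   composition of traffic on a link from packets sampled with a known
   probability p, by letting each sampled packet stand for roughly 1/p
   packets. The dictionary-based record layout and the field names
   ("dst", "length") are assumptions of this sketch, not a report
   format.

```python
from collections import Counter


def estimate_bytes_by_key(samples, p, key):
    """Estimate per-key byte totals from packets sampled with probability p.

    Each sampled packet is scaled by 1/p to account for unsampled packets.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    totals = Counter()
    for pkt in samples:
        totals[pkt[key]] += pkt["length"] / p
    return totals
```

   Calling most_common() on the result ranks the heaviest values of the
   chosen attribute, for instance the destination address receiving the
   largest estimated byte volume.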

   Example 2: Characterizing Demand

   Characterizing demand for an entire network domain will likely be
   achieved by sampling packets on all the ingress links, or some
   other well-chosen cut set. The sampling rate would typically be
   chosen relatively low, given that we are interested in averages
   over longer time scales, e.g., to detect significant systemic
   shifts in demand not due to random fluctuations.  Some of the
   subsidiary fields included in reports, such as source and
   destination AS, and input and output link, will be useful,
   depending on the spatial granularity of demand characterization.

   Example 3: Direct Observation of Network Behavior

   Direct observation of the spatial flow of traffic through the
   domain can be achieved through a method called trajectory sampling,
   which relies on the hash function to make sampling decisions
   [DG01].  Specifically, the hash function is computed over a
   predefined set of fields of the IP packet header and payload. If
   the hash function for a packet falls within a configurable interval
   [a,b], then the packet should be sampled; otherwise, it should not
   be sampled. This feature yields the full paths followed by sampled
   packets, by ensuring that a packet is sampled on every router it
   traverses, or on no router at all.  This requires that the hash
   function and the set of packet fields over which it is computed are
   the same everywhere.
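
   The selection rule can be sketched as follows. This is an
   illustration under stated assumptions, not the hash function or
   field set of [DG01]: packets are raw IPv4 byte strings, zlib.crc32
   stands in for the hash function, and the hash input is the IP header
   plus the first 20 bytes of payload with the TTL and header checksum
   zeroed, since those fields change from hop to hop. The function
   names are hypothetical.

```python
import zlib


def invariant_bytes(packet: bytes, payload_bytes: int = 20) -> bytes:
    """Return the part of an IPv4 packet that is unchanged along its path."""
    ihl = (packet[0] & 0x0F) * 4   # IPv4 header length in bytes
    buf = bytearray(packet[: ihl + payload_bytes])
    buf[8] = 0                     # TTL is decremented at every hop
    buf[10:12] = b"\x00\x00"       # header checksum changes with the TTL
    return bytes(buf)


def sample_on_router(packet: bytes, a: int, b: int) -> bool:
    """Sample the packet if its hash falls within the interval [a, b]."""
    # zlib.crc32 returns an unsigned 32-bit value, so 0 <= hash < 2**32.
    return a <= zlib.crc32(invariant_bytes(packet)) <= b
```

   Every router configured with the same a, b, hash function, and field
   set reaches the same decision for a given packet. If the hash values
   were uniformly distributed, the sampled fraction would be about
   (b - a + 1) / 2**32.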

   A similar use of hash functions has also been considered for
   hash-based IP traceback of distributed denial-of-service (DDoS)
   attacks [SPSJTKS01].

5 References

   [B88] R. T. Braden, A pseudo-machine for packet monitoring and
   statistics, Proc. ACM SIGCOMM 1988.

   [DG01] N. G. Duffield and M. Grossglauser, Trajectory Sampling for
   Direct Traffic Observation, IEEE/ACM Trans. on Networking, 9(3), pp.
   280-292, June 2001.

   [SPSJTKS01] A. C. Snoeren, C. Partridge, L. A. Sanchez, C. E. Jones,
   F. Tchakountio, S. T. Kent, W. T. Strayer, Hash-Based IP Traceback,
   Proc. ACM SIGCOMM 2001, San Diego, CA, September 2001.

6 Authors' Addresses

   Nick Duffield
   AT&T Labs - Research
   Room B-139
   180 Park Ave
   Florham Park NJ 07932, USA
   Phone: +1 973-360-8726
   Email: duffield@research.att.com

   Albert Greenberg
   AT&T Labs - Research
   Room A-161
   180 Park Ave
   Florham Park NJ 07932, USA
   Phone: +1 973-360-8730
   Email: albert@research.att.com

   Matthias Grossglauser
   AT&T Labs - Research
   Room A-167
   180 Park Ave
   Florham Park NJ 07932, USA
   Phone: +1 973-360-7172
   Email: mgross@research.att.com

   Jennifer Rexford
   AT&T Labs - Research
   Room A-169
   180 Park Ave
   Florham Park NJ 07932, USA
   Phone: +1 973-360-8728
   Email: jrex@research.att.com

7 Intellectual Property Statement

   AT&T Corp. may own intellectual property applicable to this
   contribution. AT&T is currently reviewing its licensing intent
   relative to the Intellectual Property and will notify the IETF when
   AT&T has made a determination of that intent.

8 Full Copyright Statement

   Copyright (C) The Internet Society (1999).  All Rights Reserved.

   This document and translations of it may be copied and furnished to others,
   and derivative works that comment on or otherwise explain it or assist in
   its implementation may be prepared, copied, published and distributed, in
   whole or in part, without restriction of any kind, provided that the above
   copyright notice and this paragraph are included on all such copies and
   derivative works.  However, this document itself may not be modified in any
   way, such as by removing the copyright notice or references to the Internet
   Society or other Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for copyrights
   defined in the Internet Standards process must be followed, or as required
   to translate it into languages other than English.

   The limited permissions granted above are perpetual and will not be revoked
   by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an "AS IS"
