draft-ietf-mpls-forwarding-09.txt   rfc7325.txt 
MPLS C. Villamizar, Ed. Internet Engineering Task Force (IETF) C. Villamizar, Ed.
Internet-Draft OCCNC Request for Comments: 7325 OCCNC
Intended status: Informational K. Kompella Category: Informational K. Kompella
Expires: September 5, 2014 Juniper Networks ISSN: 2070-1721 Juniper Networks
S. Amante S. Amante
Apple Inc. Apple Inc.
A. Malis A. Malis
Huawei Huawei
C. Pignataro C. Pignataro
Cisco Cisco
March 4, 2014 August 2014
MPLS Forwarding Compliance and Performance Requirements MPLS Forwarding Compliance and Performance Requirements
draft-ietf-mpls-forwarding-09
Abstract Abstract
This document provides guidelines for implementers regarding MPLS This document provides guidelines for implementers regarding MPLS
forwarding and a basis for evaluations of forwarding implementations. forwarding and a basis for evaluations of forwarding implementations.
Guidelines cover many aspects of MPLS forwarding. Topics are Guidelines cover many aspects of MPLS forwarding. Topics are
highlighted where implementers might otherwise overlook practical highlighted where implementers might otherwise overlook practical
requirements which are unstated or under emphasized or are optional requirements that are unstated or underemphasized, or that are
for conformance to RFCs but are often considered mandatory by optional for conformance to RFCs but often considered mandatory by
providers. providers.
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This document is not an Internet Standards Track specification; it is
provisions of BCP 78 and BCP 79. published for informational purposes.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months This document is a product of the Internet Engineering Task Force
and may be updated, replaced, or obsoleted by other documents at any (IETF). It represents the consensus of the IETF community. It has
time. It is inappropriate to use Internet-Drafts as reference received public review and has been approved for publication by the
material or to cite them other than as "work in progress." Internet Engineering Steering Group (IESG). Not all documents
approved by the IESG are a candidate for any level of Internet
Standard; see Section 2 of RFC 5741.
This Internet-Draft will expire on September 5, 2014. Information about the current status of this document, any errata,
and how to provide feedback on it may be obtained at
http://www.rfc-editor.org/info/rfc7325.
Copyright Notice Copyright Notice
Copyright (c) 2014 IETF Trust and the persons identified as the Copyright (c) 2014 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction and Document Scope . . . . . . . . . . . . . . . 3 1. Introduction and Document Scope . . . . . . . . . . . . . . . 4
1.1. Abbreviations . . . . . . . . . . . . . . . . . . . . . . 4 1.1. Abbreviations . . . . . . . . . . . . . . . . . . . . . . 4
1.2. Use of Requirements Language . . . . . . . . . . . . . . 8 1.2. Use of Requirements Language . . . . . . . . . . . . . . 8
1.3. Apparent Misconceptions . . . . . . . . . . . . . . . . . 8 1.3. Apparent Misconceptions . . . . . . . . . . . . . . . . . 9
1.4. Target Audience . . . . . . . . . . . . . . . . . . . . . 10 1.4. Target Audience . . . . . . . . . . . . . . . . . . . . . 10
2. Forwarding Issues . . . . . . . . . . . . . . . . . . . . . . 10 2. Forwarding Issues . . . . . . . . . . . . . . . . . . . . . . 11
2.1. Forwarding Basics . . . . . . . . . . . . . . . . . . . . 10 2.1. Forwarding Basics . . . . . . . . . . . . . . . . . . . . 11
2.1.1. MPLS Special Purpose Labels . . . . . . . . . . . . . 11 2.1.1. MPLS Special-Purpose Labels . . . . . . . . . . . . . 12
2.1.2. MPLS Differentiated Services . . . . . . . . . . . . 12 2.1.2. MPLS Differentiated Services . . . . . . . . . . . . 13
2.1.3. Time Synchronization . . . . . . . . . . . . . . . . 13 2.1.3. Time Synchronization . . . . . . . . . . . . . . . . 14
2.1.4. Uses of Multiple Label Stack Entries . . . . . . . . 14 2.1.4. Uses of Multiple Label Stack Entries . . . . . . . . 14
2.1.5. MPLS Link Bundling . . . . . . . . . . . . . . . . . 15 2.1.5. MPLS Link Bundling . . . . . . . . . . . . . . . . . 15
2.1.6. MPLS Hierarchy . . . . . . . . . . . . . . . . . . . 15 2.1.6. MPLS Hierarchy . . . . . . . . . . . . . . . . . . . 16
2.1.7. MPLS Fast Reroute (FRR) . . . . . . . . . . . . . . . 15 2.1.7. MPLS Fast Reroute (FRR) . . . . . . . . . . . . . . . 16
2.1.8. Pseudowire Encapsulation . . . . . . . . . . . . . . 17 2.1.8. Pseudowire Encapsulation . . . . . . . . . . . . . . 17
2.1.8.1. Pseudowire Sequence Number . . . . . . . . . . . 17 2.1.8.1. Pseudowire Sequence Number . . . . . . . . . . . 17
2.1.9. Layer-2 and Layer-3 VPN . . . . . . . . . . . . . . . 19 2.1.9. Layer 2 and Layer 3 VPN . . . . . . . . . . . . . . . 19
2.2. MPLS Multicast . . . . . . . . . . . . . . . . . . . . . 19 2.2. MPLS Multicast . . . . . . . . . . . . . . . . . . . . . 20
2.3. Packet Rates . . . . . . . . . . . . . . . . . . . . . . 20 2.3. Packet Rates . . . . . . . . . . . . . . . . . . . . . . 21
2.4. MPLS Multipath Techniques . . . . . . . . . . . . . . . . 22 2.4. MPLS Multipath Techniques . . . . . . . . . . . . . . . . 23
2.4.1. Pseudowire Control Word . . . . . . . . . . . . . . . 23 2.4.1. Pseudowire Control Word . . . . . . . . . . . . . . . 24
2.4.2. Large Microflows . . . . . . . . . . . . . . . . . . 23 2.4.2. Large Microflows . . . . . . . . . . . . . . . . . . 24
2.4.3. Pseudowire Flow Label . . . . . . . . . . . . . . . . 24 2.4.3. Pseudowire Flow Label . . . . . . . . . . . . . . . . 25
2.4.4. MPLS Entropy Label . . . . . . . . . . . . . . . . . 24 2.4.4. MPLS Entropy Label . . . . . . . . . . . . . . . . . 25
2.4.5. Fields Used for Multipath Load Balance . . . . . . . 25 2.4.5. Fields Used for Multipath Load Balance . . . . . . . 25
2.4.5.1. MPLS Fields in Multipath . . . . . . . . . . . . 25 2.4.5.1. MPLS Fields in Multipath . . . . . . . . . . . . 26
2.4.5.2. IP Fields in Multipath . . . . . . . . . . . . . 27 2.4.5.2. IP Fields in Multipath . . . . . . . . . . . . . 27
2.4.5.3. Fields Used in Flow Label . . . . . . . . . . . . 28 2.4.5.3. Fields Used in Flow Label . . . . . . . . . . . . 29
2.4.5.4. Fields Used in Entropy Label . . . . . . . . . . 29 2.4.5.4. Fields Used in Entropy Label . . . . . . . . . . 29
2.5. MPLS-TP and UHP . . . . . . . . . . . . . . . . . . . . . 29 2.5. MPLS-TP and UHP . . . . . . . . . . . . . . . . . . . . . 30
2.6. Local Delivery of Packets . . . . . . . . . . . . . . . . 29 2.6. Local Delivery of Packets . . . . . . . . . . . . . . . . 30
2.6.1. DoS Protection . . . . . . . . . . . . . . . . . . . 30 2.6.1. DoS Protection . . . . . . . . . . . . . . . . . . . 31
2.6.2. MPLS OAM . . . . . . . . . . . . . . . . . . . . . . 32 2.6.2. MPLS OAM . . . . . . . . . . . . . . . . . . . . . . 33
2.6.3. Pseudowire OAM . . . . . . . . . . . . . . . . . . . 33 2.6.3. Pseudowire OAM . . . . . . . . . . . . . . . . . . . 34
2.6.4. MPLS-TP OAM . . . . . . . . . . . . . . . . . . . . . 33 2.6.4. MPLS-TP OAM . . . . . . . . . . . . . . . . . . . . . 34
2.6.5. MPLS OAM and Layer-2 OAM Interworking . . . . . . . . 35 2.6.5. MPLS OAM and Layer 2 OAM Interworking . . . . . . . . 35
2.6.6. Extent of OAM Support by Hardware . . . . . . . . . . 35 2.6.6. Extent of OAM Support by Hardware . . . . . . . . . . 36
2.6.7. Support for IPFIX in Hardware . . . . . . . . . . . . 36 2.6.7. Support for IPFIX in Hardware . . . . . . . . . . . . 37
2.7. Number and Size of Flows . . . . . . . . . . . . . . . . 36 2.7. Number and Size of Flows . . . . . . . . . . . . . . . . 37
3. Questions for Suppliers . . . . . . . . . . . . . . . . . . . 37 3. Questions for Suppliers . . . . . . . . . . . . . . . . . . . 38
3.1. Basic Compliance . . . . . . . . . . . . . . . . . . . . 37 3.1. Basic Compliance . . . . . . . . . . . . . . . . . . . . 38
3.2. Basic Performance . . . . . . . . . . . . . . . . . . . . 39 3.2. Basic Performance . . . . . . . . . . . . . . . . . . . . 40
3.3. Multipath Capabilities and Performance . . . . . . . . . 40 3.3. Multipath Capabilities and Performance . . . . . . . . . 41
3.4. Pseudowire Capabilities and Performance . . . . . . . . . 40 3.4. Pseudowire Capabilities and Performance . . . . . . . . . 41
3.5. Entropy Label Support and Performance . . . . . . . . . . 41 3.5. Entropy Label Support and Performance . . . . . . . . . . 42
3.6. DoS Protection . . . . . . . . . . . . . . . . . . . . . 41 3.6. DoS Protection . . . . . . . . . . . . . . . . . . . . . 42
3.7. OAM Capabilities and Performance . . . . . . . . . . . . 41 3.7. OAM Capabilities and Performance . . . . . . . . . . . . 42
4. Forwarding Compliance and Performance Testing . . . . . . . . 42 4. Forwarding Compliance and Performance Testing . . . . . . . . 43
4.1. Basic Compliance . . . . . . . . . . . . . . . . . . . . 42 4.1. Basic Compliance . . . . . . . . . . . . . . . . . . . . 43
4.2. Basic Performance . . . . . . . . . . . . . . . . . . . . 43 4.2. Basic Performance . . . . . . . . . . . . . . . . . . . . 44
4.3. Multipath Capabilities and Performance . . . . . . . . . 44 4.3. Multipath Capabilities and Performance . . . . . . . . . 45
4.4. Pseudowire Capabilities and Performance . . . . . . . . . 44 4.4. Pseudowire Capabilities and Performance . . . . . . . . . 46
4.5. Entropy Label Support and Performance . . . . . . . . . . 45 4.5. Entropy Label Support and Performance . . . . . . . . . . 46
4.6. DoS Protection . . . . . . . . . . . . . . . . . . . . . 46 4.6. DoS Protection . . . . . . . . . . . . . . . . . . . . . 47
4.7. OAM Capabilities and Performance . . . . . . . . . . . . 46 4.7. OAM Capabilities and Performance . . . . . . . . . . . . 47
5. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 47 5. Security Considerations . . . . . . . . . . . . . . . . . . . 48
6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 48 6. Organization of References Section . . . . . . . . . . . . . 50
7. Security Considerations . . . . . . . . . . . . . . . . . . . 48 7. References . . . . . . . . . . . . . . . . . . . . . . . . . 50
8. Organization of References Section . . . . . . . . . . . . . 50 7.1. Normative References . . . . . . . . . . . . . . . . . . 50
9. References . . . . . . . . . . . . . . . . . . . . . . . . . 50 7.2. Informative References . . . . . . . . . . . . . . . . . 53
9.1. Normative References . . . . . . . . . . . . . . . . . . 50 Appendix A. Acknowledgements . . . . . . . . . . . . . . . . . . 59
9.2. Informative References . . . . . . . . . . . . . . . . . 52
9.3. URIs . . . . . . . . . . . . . . . . . . . . . . . . . . 58
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 58
1. Introduction and Document Scope 1. Introduction and Document Scope
The initial purpose of this document was to address concerns raised The initial purpose of this document was to address concerns raised
on the MPLS WG mailing list about shortcomings in implementations of on the MPLS WG mailing list about shortcomings in implementations of
MPLS forwarding. Documenting existing misconceptions and potential MPLS forwarding. Documenting existing misconceptions and potential
pitfalls might help avoid repeating past mistakes. The pitfalls might help avoid repeating past mistakes. The
document has grown to address a broad set of forwarding requirements. document has grown to address a broad set of forwarding requirements.
The focus of this document is MPLS forwarding, base pseudowire The focus of this document is MPLS forwarding, base pseudowire
forwarding, and MPLS Operations, Administration, and Maintenance forwarding, and MPLS Operations, Administration, and Maintenance
(OAM). The use of pseudowire control word, and sequence number are (OAM). The use of pseudowire Control Word and the use of pseudowire
discussed. Specific pseudowire Attachment Circuit (AC) and Native Sequence Number are discussed. Specific pseudowire Attachment
Service Processing (NSP) are out of scope. Specific pseudowire Circuit (AC) and Native Service Processing (NSP) are out of scope.
applications, such as various forms of Virtual Private Network (VPN), Specific pseudowire applications, such as various forms of Virtual
are out of scope. Private Network (VPN), are out of scope.
MPLS support for multipath techniques is considered essential by many MPLS support for multipath techniques is considered essential by many
service providers and is useful for other high capacity networks. In service providers and is useful for other high-capacity networks. In
order to obtain sufficient entropy from MPLS traffic, service order to obtain sufficient entropy from MPLS traffic, service
providers and others find it essential for the MPLS implementation to providers and others find it essential for the MPLS implementation to
interpret the MPLS payload as IPv4 or IPv6 based on the contents of interpret the MPLS payload as IPv4 or IPv6 based on the contents of
the first nibble of payload. The use of IP addresses, the IP the first nibble of payload. The use of IP addresses, the IP
protocol field, and UDP and TCP port number fields in multipath load protocol field, and UDP and TCP port number fields in multipath load
balancing is considered within scope. The use of any other IP balancing is considered within scope. The use of any other IP
protocol fields, such as tunneling protocols carried within IP, is protocol fields, such as tunneling protocols carried within IP, is
out of scope. out of scope.
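The payload inspection described above can be sketched as follows. This is only an illustrative sketch, not a description of any particular forwarding engine: the function names `multipath_key` and `select_path`, the use of CRC-32 as the hash, and the choice to skip port numbers for IPv4 fragments are all assumptions made for the example. It assumes `payload` starts immediately after the bottom-of-stack label stack entry and handles IPv6 only when TCP or UDP directly follows the fixed header.

```python
import struct
import zlib

TCP, UDP = 6, 17  # IP protocol numbers


def multipath_key(payload: bytes) -> bytes:
    """Return the bytes to hash for an MPLS payload, or b"" if it is not
    recognizably IPv4 or IPv6 (hypothetical helper for this sketch)."""
    if not payload:
        return b""
    version = payload[0] >> 4  # first nibble of the MPLS payload
    if version == 4 and len(payload) >= 20:
        ihl = (payload[0] & 0x0F) * 4  # IPv4 header length in bytes
        if ihl < 20 or len(payload) < ihl:
            return b""
        proto = payload[9]
        key = payload[12:20] + bytes([proto])  # source, destination, protocol
        # More Fragments flag plus fragment offset; nonzero means a fragment.
        frag_bits = struct.unpack("!H", payload[6:8])[0] & 0x3FFF
        l4 = payload[ihl:ihl + 4]  # TCP/UDP source and destination ports
    elif version == 6 and len(payload) >= 40:
        proto = payload[6]  # Next Header; extension headers are not followed
        key = payload[8:40] + bytes([proto])
        frag_bits = 0
        l4 = payload[40:44]
    else:
        return b""
    # Assumption: ports are used only for unfragmented TCP or UDP packets.
    if proto in (TCP, UDP) and frag_bits == 0 and len(l4) == 4:
        key += l4
    return key


def select_path(payload: bytes, n_paths: int) -> int:
    """Pick one of n_paths component links from the hashed key.
    A non-IP payload yields an empty key; a real implementation would
    fall back to other fields such as the label stack."""
    return zlib.crc32(multipath_key(payload)) % n_paths
```

Because the key depends only on header fields that are constant for a flow, every packet of a given TCP or UDP flow maps to the same path, which preserves packet order within the flow.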
Implementation details are a local matter and are out of scope. Most Implementation details are a local matter and are out of scope. Most
interfaces today operate at 1 Gb/s or greater. It is assumed that interfaces today operate at 1 Gb/s or greater. It is assumed that
all forwarding operations are implemented in specialized forwarding all forwarding operations are implemented in specialized forwarding
hardware rather than on a general purpose processor. This is often hardware rather than on a general-purpose processor. This is often
referred to as "fast path" and "slow path" processing. Some referred to as "fast path" and "slow path" processing. Some
recommendations are made regarding implementing control or management recommendations are made regarding implementing control or
plane functionality in specialized hardware or with limited management-plane functionality in specialized hardware or with
assistance from specialized hardware. This advice is based on limited assistance from specialized hardware. This advice is based
expected control or management protocol loads and on the need for on expected control or management protocol loads and on the need for
denial of service (DoS) protection. denial of service (DoS) protection.
1.1. Abbreviations 1.1. Abbreviations
The following abbreviations are used. The following abbreviations are used.
AC Attachment Circuit ([RFC3985]) AC Attachment Circuit ([RFC3985])
ACH Associated Channel Header (pseudowires) ACH Associated Channel Header (pseudowires)
ACK Acknowledgement (TCP flag and type of TCP packet) ACK Acknowledgement (TCP flag and type of TCP packet)
AIS Alarm Indication Signal (MPLS-TP OAM) AIS Alarm Indication Signal (MPLS-TP OAM)
ATM Asynchronous Transfer Mode (legacy switched circuits) ATM Asynchronous Transfer Mode (legacy switched circuits)
BFD Bidirectional Forwarding Detection BFD Bidirectional Forwarding Detection
BGP Border Gateway Protocol BGP Border Gateway Protocol
CC-CV Connectivity Check and Connectivity Verification CC-CV Continuity Check and Connectivity Verification
CE Customer Edge (LDP, RSVP-TE, other protocols) CE Customer Edge ([RFC4364])
CPU Central Processing Unit (computer or microprocessor) CPU Central Processing Unit (computer or microprocessor)
CT Class Type ([RFC4124]) CT Class Type ([RFC4124])
CW Control Word ([RFC4385]) CW Control Word ([RFC4385])
DCCP Datagram Congestion Control Protocol DCCP Datagram Congestion Control Protocol
DDoS Distributed Denial of Service DDoS Distributed Denial of Service
DM Delay Measurement (MPLS-TP OAM) DM Delay Measurement (MPLS-TP OAM)
DSCP Differentiated Services Code Point ([RFC2474]) DSCP Differentiated Services Code Point ([RFC2474])
DWDM Dense Wavelength Division Multiplexing DWDM Dense Wavelength Division Multiplexing
DoS Denial of Service DoS Denial of Service
E-LSP EXP-Inferred-PSC LSP ([RFC3270]) E-LSP Explicitly TC-encoded-PSC LSP ([RFC5462])
EBGP External BGP EBGP External BGP
ECMP Equal Cost Multi-Path ECMP Equal-Cost Multipath
ECN Explicit Congestion Notification ([RFC3168] and [RFC5129]) ECN Explicit Congestion Notification ([RFC3168] and [RFC5129])
EL Entropy Label ([RFC6790]) EL Entropy Label ([RFC6790])
ELI Entropy Label Indicator ([RFC6790]) ELI Entropy Label Indicator ([RFC6790])
EXP Experimental (field in MPLS renamed to TC in [RFC5462]) EXP Experimental (field in MPLS renamed to "TC" in [RFC5462])
FEC Forwarding Equivalence Classes (LDP), also Forward Error FEC Forwarding Equivalence Classes ([RFC3031]); also Forward Error
Correction in other context Correction in other context
FR Frame Relay (legacy switched circuits) FR Frame Relay (legacy switched circuits)
FRR Fast Reroute ([RFC4090]) FRR Fast Reroute ([RFC4090])
G-ACh Generic Associated Channel ([RFC5586]) G-ACh Generic Associated Channel ([RFC5586])
GAL Generic Associated Channel Label ([RFC5586]) GAL Generic Associated Channel Label ([RFC5586])
GFP Generic Framing Protocol (used in OTN) GFP Generic Framing Procedure (used in OTN)
GMPLS Generalized MPLS ([RFC3471]) GMPLS Generalized MPLS ([RFC3471])
GTSM Generalized TTL Security Mechanism ([RFC5082]) GTSM Generalized TTL Security Mechanism ([RFC5082])
Gb/s Gigabits per second (billion bits per second) Gb/s Gigabits per second (billion bits per second)
IANA Internet Assigned Numbers Authority IANA Internet Assigned Numbers Authority
ILM Incoming Label Map ([RFC3031]) ILM Incoming Label Map ([RFC3031])
skipping to change at page 6, line 26 skipping to change at page 6, line 46
LER Label Edge Router ([RFC3031]) LER Label Edge Router ([RFC3031])
LM Loss Measurement (MPLS-TP OAM) LM Loss Measurement (MPLS-TP OAM)
LSP Label Switched Path ([RFC3031]) LSP Label Switched Path ([RFC3031])
LSR Label Switching Router ([RFC3031]) LSR Label Switching Router ([RFC3031])
MP2MP Multipoint to Multipoint MP2MP Multipoint to Multipoint
MPLS MultiProtocol Label Switching ([RFC3031]) MPLS Multiprotocol Label Switching ([RFC3031])
MPLS-TP MPLS Transport Profile ([RFC5317]) MPLS-TP MPLS Transport Profile ([RFC5317])
Mb/s Megabits per second (million bits per second) Mb/s Megabits per second (million bits per second)
NSP Native Service Processing ([RFC3985]) NSP Native Service Processing ([RFC3985])
NTP Network Time Protocol NTP Network Time Protocol
OAM Operations, Administration, and Maintenance ([RFC6291]) OAM Operations, Administration, and Maintenance ([RFC6291])
OOB Out-of-band (not carried within a data channel) OOB Out-of-band (not carried within a data channel)
OTN Optical Transport Network OTN Optical Transport Network
P Provider router (LDP, RSVP-TE, other protocols) P Provider router ([RFC4364])
P2MP Point to Multi-Point P2MP Point to Multipoint
PE Provider Edge router (LDP, RSVP-TE, other protocols) PE Provider Edge router ([RFC4364])
PHB Per-Hop-Behavior ([RFC2475]) PHB Per-Hop Behavior ([RFC2475])
PHP Penultimate Hop Popping ([RFC3443]) PHP Penultimate Hop Popping ([RFC3443])
POS Packet over SONET
POS PPP over SONET
PSC This abbreviation has multiple interpretations. PSC This abbreviation has multiple interpretations.
1. Packet Switch Capable ([RFC3471]) 1. Packet Switch Capable ([RFC3471])
2. PHB Scheduling Class ([RFC3270]) 2. PHB Scheduling Class ([RFC3270])
3. Protection State Coordination ([RFC6378]) 3. Protection State Coordination ([RFC6378])
PTP Precision Time Protocol PTP Precision Time Protocol
skipping to change at page 7, line 26 skipping to change at page 7, line 46
PW Pseudowire PW Pseudowire
QoS Quality of Service QoS Quality of Service
RA Router Alert ([RFC3032]) RA Router Alert ([RFC3032])
RDI Remote Defect Indication (MPLS-TP OAM) RDI Remote Defect Indication (MPLS-TP OAM)
RSVP-TE RSVP Traffic Engineering RSVP-TE RSVP Traffic Engineering
RTP Real-Time Transport Protocol RTP Real-time Transport Protocol
SCTP Stream Control Transmission Protocol SCTP Stream Control Transmission Protocol
SDH Synchronous Digital Hierarchy (European SONET, a form of TDM) SDH Synchronous Digital Hierarchy (European SONET, a form of TDM)
SONET Synchronous Optical Network (US SDH, a form of TDM) SONET Synchronous Optical Network (US SDH, a form of TDM)
T-LDP Targeted LDP (LDP sessions over more than one hop) T-LDP Targeted LDP (LDP sessions over more than one hop)
TC Traffic Class ([RFC5462]) TC Traffic Class ([RFC5462])
TCP Transmission Control Protocol TCP Transmission Control Protocol
TDM Time-Division Multiplexing (legacy encapsulations) TDM Time-Division Multiplexing (legacy encapsulations)
skipping to change at page 8, line 14 skipping to change at page 8, line 34
VLAN Virtual Local Area Network (Ethernet) VLAN Virtual Local Area Network (Ethernet)
VOQ Virtual Output Queuing (switch fabric design) VOQ Virtual Output Queuing (switch fabric design)
VPN Virtual Private Network VPN Virtual Private Network
WG Working Group WG Working Group
1.2. Use of Requirements Language 1.2. Use of Requirements Language
This document is informational. The upper case [RFC2119] key words This document is Informational. The uppercase [RFC2119] key words
"MUST", "MUST NOT", "SHOULD", "SHOULD NOT", and "MAY" are used in "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", and "MAY" are used in
this document in the following cases. this document in the following cases.
1. RFC 2119 keywords are used where requirements stated in this 1. RFC 2119 keywords are used where requirements stated in this
document are called for in referenced RFCs. In most cases the document are called for in referenced RFCs. In most cases, the
RFC containing the requirement is cited within the statement RFC containing the requirement is cited within the statement
using an RFC 2119 keyword. using an RFC 2119 keyword.
2. RFC 2119 keywords are used where it is explicitly noted that 2. RFC 2119 keywords are used where it is explicitly noted that
operator experience indicates a requirement for which no operator experience indicates a requirement for which no
RFC requirement currently exists. RFC requirement currently exists.
Advice provided by this document may be ignored by implementations. Advice provided by this document may be ignored by implementations.
Similarly, implementations not claiming conformance to specific RFCs Similarly, implementations not claiming conformance to specific RFCs
may ignore the requirements of those RFCs. In both cases, may ignore the requirements of those RFCs. In both cases,
implementers should consider the risk of doing so. implementers should consider the risk of doing so.
1.3. Apparent Misconceptions 1.3. Apparent Misconceptions
In early generations of forwarding silicon (which might now be behind In early generations of forwarding silicon (which might now be behind
us), there apparently were some misconceptions about MPLS. The us), there apparently were some misconceptions about MPLS. The
following statements provide clarifications. following statements provide clarifications.
1. There are practical reasons to have more than one or two labels 1. There are practical reasons to have more than one or two labels
in an MPLS label stack. Under some circumstances the label stack in an MPLS label stack. Under some circumstances, the label
can become quite deep. See Section 2.1. stack can become quite deep. See Section 2.1.
2. The label stack MUST be considered to be arbitrarily deep. 2. The label stack MUST be considered to be arbitrarily deep.
Section 3.27.4. "Hierarchy: LSP Tunnels within LSPs" of RFC3031 Section 3.27.4 ("Hierarchy: LSP Tunnels within LSPs") of RFC 3031
states "The label stack mechanism allows LSP tunneling to nest to states "The label stack mechanism allows LSP tunneling to nest to
any depth." [RFC3031] If the bottom of the label stack cannot be any depth" [RFC3031]. If the bottom of the label stack cannot be
found, but a sufficient number of labels exist to forward, an LSR found, but a sufficient number of labels exist to forward, an LSR
MUST forward the packet. An LSR MUST NOT assume the packet is MUST forward the packet. An LSR MUST NOT assume the packet is
malformed unless the end of packet is found before bottom of malformed unless the end of packet is found before the bottom of
stack. See Section 2.1. the stack. See Section 2.1.
3. In networks where deep label stacks are encountered, they are not 3. In networks where deep label stacks are encountered, they are not
rare. Full packet rate performance is required regardless of rare. Full packet rate performance is required regardless of
label stack depth, except where multiple pop operations are label stack depth, except where multiple pop operations are
required. See Section 2.1. required. See Section 2.1.
4. Research has shown that long bursts of short packets with 40 byte 4. Research has shown that long bursts of short packets with 40-byte
or 44 byte IP payload sizes in these bursts are quite common. or 44-byte IP payload sizes in these bursts are quite common.
This is due to TCP ACK compression [ACK-compression]. The This is due to TCP ACK compression [ACK-compression]. The
following two sub-bullets constitutes advice that reflects very following two sub-bullets constitute advice that reflects very
common non-negotiable requirements of providers. Implementers common nonnegotiable requirements of providers. Implementers may
may ignore this advice but should consider the risk of doing so. ignore this advice but should consider the risk of doing so.
A. A forwarding engine SHOULD, if practical, be able to sustain A. A forwarding engine SHOULD, if practical, be able to sustain
an arbitrarily long sequence of small packets arriving at an arbitrarily long sequence of small packets arriving at
full interface rate. full interface rate.
B. If indefinite full packet rate for small packets is not B. If indefinitely sustained full packet rate for small packets
practical, a forwarding engine MUST be able to buffer a long is not practical, a forwarding engine MUST be able to buffer
sequence of small packets inbound to the on-chip decision a long sequence of small packets inbound to the on-chip
engine and sustain full interface rate for some reasonable decision engine and sustain full interface rate for some
average packet rate. Absent this small on-chip buffering, reasonable average packet rate. Absent this small on-chip
QoS agnostic packet drops can occur. buffering, QoS-agnostic packet drops can occur.
See Section 2.3. See Section 2.3.
5. The implementations and system designs MUST support pseudowire 5. The implementations and system designs MUST support pseudowire
control word (CW) if MPLS-TP is supported or if ACH [RFC5586] is Control Word (CW) if MPLS-TP is supported or if ACH [RFC5586] is
being used on a pseudowire. The implementation and system design being used on a pseudowire. The implementation and system
SHOULD support pseudowire CW even if MPLS-TP and ACH [RFC5586] designs SHOULD support pseudowire CW even if MPLS-TP and ACH
are not used, using instead CW and VCCV Type 1 [RFC5085] to allow
the use of multipath in the underlying network topology without [RFC5586] are not used, using instead CW and VCCV Type 1
impacting the PW traffic. [RFC7079] does note that there are [RFC5085] to allow the use of multipath in the underlying network
still some deployments where the CW is not always used. It also topology without impacting the PW traffic. [RFC7079] does note
notes that many service providers do enable the CW. See that there are still some deployments where the CW is not always
Section 2.4.1 for more discussion on why deployments SHOULD used. It also notes that many service providers do enable the
enable the pseudowire CW. CW. See Section 2.4.1 for more discussion on why deployments
SHOULD enable the pseudowire CW.
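The label stack handling in statement 2 above can be sketched as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the names `LSE`, `make_lse`, `parse_lse`, and `read_stack` and the `max_lookahead` parameter are hypothetical. The 32-bit label stack entry layout (20-bit label, 3-bit TC, 1-bit bottom-of-stack flag, 8-bit TTL) is the one defined in RFC 3032. The key point is that a packet is flagged as malformed only when the packet ends before a bottom-of-stack entry is seen, never merely because the stack is deeper than the engine chose to examine.

```python
import struct
from typing import List, NamedTuple, Optional, Tuple


class LSE(NamedTuple):
    label: int  # 20 bits
    tc: int     # 3 bits, Traffic Class
    s: int      # 1 bit, bottom of stack
    ttl: int    # 8 bits


def make_lse(label: int, tc: int, s: int, ttl: int) -> bytes:
    """Encode one label stack entry (helper for building test packets)."""
    return struct.pack("!I", (label << 12) | (tc << 9) | (s << 8) | ttl)


def parse_lse(word: bytes) -> LSE:
    """Decode one 4-byte label stack entry."""
    (v,) = struct.unpack("!I", word)
    return LSE(v >> 12, (v >> 9) & 0x7, (v >> 8) & 0x1, v & 0xFF)


def read_stack(packet: bytes,
               max_lookahead: Optional[int] = None
               ) -> Tuple[List[LSE], bool, bool]:
    """Return (entries, found_bottom, malformed).

    max_lookahead models an engine that examines only the first few
    entries. Reaching that limit without seeing S=1 is NOT an error:
    the packet can still be forwarded on the entries already read.
    """
    entries: List[LSE] = []
    offset = 0
    while max_lookahead is None or len(entries) < max_lookahead:
        if offset + 4 > len(packet):
            # End of packet reached before bottom of stack: malformed.
            return entries, False, True
        entry = parse_lse(packet[offset:offset + 4])
        entries.append(entry)
        offset += 4
        if entry.s:
            return entries, True, False
    # Lookahead exhausted within the window examined; not malformed.
    return entries, False, False
```

For example, a three-label packet read with `max_lookahead=2` returns two entries with `found_bottom` false and `malformed` false, so the forwarding decision proceeds on the top label as usual.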
The following statements provide clarification regarding more recent The following statements provide clarification regarding more recent
requirements that are often missed. requirements that are often missed.
1. The implementer and system designer SHOULD support adding a 1. The implementer and system designer SHOULD support adding a
pseudowire Flow Label [RFC6391]. Deployments MAY enable this pseudowire Flow Label [RFC6391]. Deployments MAY enable this
feature for appropriate pseudowire types. See Section 2.4.3. feature for appropriate pseudowire types. See Section 2.4.3.
2. The implementer and system designer SHOULD support adding an MPLS 2. The implementer and system designer SHOULD support adding an MPLS
entropy label [RFC6790]. Deployments MAY enable this feature. Entropy Label [RFC6790]. Deployments MAY enable this feature.
See Section 2.4.4. See Section 2.4.4.
Non-IETF definitions of MPLS exist and these should not be used as Non-IETF definitions of MPLS exist, and these should not be used as
normative texts in place of the relevant IETF RFCs. [RFC5704] normative texts in place of the relevant IETF RFCs. [RFC5704]
documents incompatibilities between the IETF definition of MPLS and documents incompatibilities between the IETF definition of MPLS and
one such alternative MPLS definition which led to significant issues one such alternative MPLS definition, which led to significant issues
in the resulting non-IETF specification. in the resulting non-IETF specification.
1.4. Target Audience 1.4. Target Audience
This document is intended for multiple audiences: implementer This document is intended for multiple audiences: implementer
(implementing MPLS forwarding in silicon or in software); systems (implementing MPLS forwarding in silicon or in software); systems
designer (putting together an MPLS forwarding system); deployer designer (putting together an MPLS forwarding system); deployer
(running an MPLS network). These guidelines are intended to serve (running an MPLS network). These guidelines are intended to serve
the following purposes: the following purposes:
skipping to change at page 10, line 32 skipping to change at page 11, line 6
2. Highlight pitfalls to look for when implementing an MPLS 2. Highlight pitfalls to look for when implementing an MPLS
forwarding chip. (audience: implementer) forwarding chip. (audience: implementer)
3. Provide a checklist of features and performance specifications to 3. Provide a checklist of features and performance specifications to
request. (audience: systems designer, deployer) request. (audience: systems designer, deployer)
4. Provide a set of tests to perform. (audience: systems designer, 4. Provide a set of tests to perform. (audience: systems designer,
deployer). deployer).
The implementer, systems designer, and deployer have a transitive The implementer, systems designer, and deployer have a transitive
supplier customer relationship. It is in the best interest of the supplier-customer relationship. It is in the best interest of the
supplier to review their product against their customer's checklist supplier to review their product against their customer's checklist
and secondary customer's checklist if applicable. and secondary customer's checklist if applicable.
This document identifies and explains many details and potential pit- This document identifies and explains many details and potential
falls of MPLS forwarding. It is likely that the identified set of pitfalls of MPLS forwarding. It is likely that the identified set of
potential pit-falls will later prove to be an incomplete set. potential pitfalls will later prove to be an incomplete set.
2. Forwarding Issues 2. Forwarding Issues
A brief review of forwarding issues is provided in the subsections A brief review of forwarding issues is provided in the subsections
that follow. This section provides some background on why some of that follow. This section provides some background on why some of
these requirements exist. The questions to ask of suppliers are these requirements exist. The questions to ask of suppliers are
covered in Section 3. Some guidelines for testing are provided in covered in Section 3. Some guidelines for testing are provided in
Section 4. Section 4.
2.1. Forwarding Basics 2.1. Forwarding Basics
Basic MPLS architecture and MPLS encapsulation, and therefore packet Basic MPLS architecture and MPLS encapsulation, and therefore packet
forwarding are defined in [RFC3031] and [RFC3032]. RFC3031 and forwarding, are defined in [RFC3031] and [RFC3032]. RFC 3031 and RFC
RFC3032 are somewhat LDP centric. RSVP-TE supports traffic 3032 are somewhat LDP centric. RSVP-TE supports traffic engineering
engineering (TE) and fast reroute, features that LDP lacks. The base (TE) and fast reroute, features that LDP lacks. The base document
document for RSVP-TE based MPLS is [RFC3209]. for MPLS RSVP-TE is [RFC3209].
A few RFCs update RFC3032. Those with impact on forwarding include A few RFCs update RFC 3032. Those with impact on forwarding include
the following. the following.
1. TTL processing is clarified in [RFC3443]. 1. TTL processing is clarified in [RFC3443].
2. The use of MPLS Explicit NULL is modified in [RFC4182]. 2. The use of MPLS Explicit NULL is modified in [RFC4182].
3. Differentiated Services is supported by [RFC3270] and [RFC4124]. 3. Differentiated Services is supported by [RFC3270] and [RFC4124].
The "EXP" field is renamed to "Traffic Class" in [RFC5462], The "EXP" field is renamed to "Traffic Class" in [RFC5462],
removing any misconception that it was available for removing any misconception that it was available for
experimentation or could be ignored. experimentation or could be ignored.
4. ECN is supported by [RFC5129]. 4. ECN is supported by [RFC5129].
5. The MPLS G-ACh and GAL are defined in [RFC5586]. 5. The MPLS G-ACh and GAL are defined in [RFC5586].
6. [RFC5332] redefines the two data link layer codepoints for MPLS 6. [RFC5332] redefines the two data link layer codepoints for MPLS
packets. packets.
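The updates listed above all operate on the 32-bit label stack entry defined in [RFC3032]: a 20-bit label, the 3-bit field now called Traffic Class, a 1-bit bottom-of-stack flag, and an 8-bit TTL. The following is a minimal sketch of that layout in Python. The names LabelStackEntry, decode_lse, encode_lse, and parse_label_stack are illustrative assumptions and are not taken from any RFC.

```python
from typing import List, NamedTuple


class LabelStackEntry(NamedTuple):
    label: int  # 20-bit label value
    tc: int     # 3-bit Traffic Class (named "EXP" before RFC 5462)
    s: int      # 1-bit bottom-of-stack flag
    ttl: int    # 8-bit time to live


def decode_lse(word: int) -> LabelStackEntry:
    """Split one 32-bit label stack entry into its four fields."""
    return LabelStackEntry(
        label=(word >> 12) & 0xFFFFF,
        tc=(word >> 9) & 0x7,
        s=(word >> 8) & 0x1,
        ttl=word & 0xFF,
    )


def encode_lse(entry: LabelStackEntry) -> int:
    """Pack the four fields back into a 32-bit word."""
    return (entry.label << 12) | (entry.tc << 9) | (entry.s << 8) | entry.ttl


def parse_label_stack(data: bytes) -> List[LabelStackEntry]:
    """Read entries from the start of `data` until the S bit is set."""
    stack = []
    for off in range(0, len(data) - 3, 4):
        entry = decode_lse(int.from_bytes(data[off:off + 4], "big"))
        stack.append(entry)
        if entry.s:
            break
    return stack
```

A forwarding engine would perform the same field extraction in hardware; the sketch is only meant to make the bit positions concrete.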
Tunneling encapsulations carrying MPLS, such as MPLS in IP [RFC4023], Tunneling encapsulations carrying MPLS, such as MPLS in IP [RFC4023],
MPLS in GRE [RFC4023], MPLS in L2TPv3 [RFC4817], or MPLS in UDP MPLS in GRE [RFC4023], MPLS in L2TPv3 [RFC4817], or MPLS in UDP
[I-D.ietf-mpls-in-udp], are out of scope. [MPLS-IN-UDP], are out of scope.
Other RFCs have implications to MPLS Forwarding and do not update Other RFCs have implications to MPLS Forwarding and do not update RFC
RFC3032 or RFC3209, including: 3032 or RFC 3209, including:
1. The pseudowire (PW) Associated Channel Header (ACH), defined by 1. The pseudowire (PW) Associated Channel Header (ACH) is defined by
[RFC5085], later generalized by the MPLS G-ACh [RFC5586]. [RFC5085] and was later generalized by the MPLS G-ACh [RFC5586].
2. The entropy label indicator (ELI) and entropy label (EL) are 2. The Entropy Label Indicator (ELI) and Entropy Label (EL) are
defined by [RFC6790]. defined by [RFC6790].
A few RFCs update RFC3209. Those that are listed as updating RFC3209 A few RFCs update RFC 3209. Those that are listed as updating RFC
generally impact only RSVP-TE signaling. Forwarding is modified by 3209 generally impact only RSVP-TE signaling. Forwarding is modified
major extension built upon RFC3209. by major extensions built upon RFC 3209.
RFCs which impact forwarding are discussed in the following RFCs that impact forwarding are discussed in the following
subsections. subsections.
2.1.1. MPLS Special Purpose Labels 2.1.1. MPLS Special-Purpose Labels
[RFC3032] specifies that label values 0-15 are special purpose labels [RFC3032] specifies that label values 0-15 are special-purpose labels
with special meanings. [I-D.ietf-mpls-special-purpose-labels] with special meanings. [RFC7274] renamed these from the term
renamed these from the term "reserved labels" used in [RFC3032] to "reserved labels" used in [RFC3032] to "special-purpose labels".
"special purpose labels". Three values of NULL label are defined Three values of NULL label are defined (two of which are later
(two of which are later updated by [RFC4182]) and a router-alert updated by [RFC4182]) and a Router Alert Label is defined. The
label is defined. The original intent was that special purpose original intent was that special-purpose labels, except the NULL
labels, except the NULL labels, could be sent to the routing engine labels, could be sent to the routing engine CPU rather than be
CPU rather than be processed in forwarding hardware. Hardware processed in forwarding hardware. Hardware support is required by
support is required by new RFCs such as those defining entropy label new RFCs such as those defining Entropy Label and OAM processed as a
and OAM processed as a result of receiving a GAL. For new special result of receiving a GAL. For new special-purpose labels, some
purpose labels, some accommodation is needed for LSR that will send accommodation is needed for LSRs that will send the labels to a
the labels to a general purpose CPU or other highly programmable general-purpose CPU or other highly programmable hardware. For
hardware. For example, ELI will only be sent to LSR which have example, ELI will only be sent to LSRs that have signaled support for
signaled support for [RFC6790] and high OAM packet rate must be [RFC6790], and a high OAM packet rate must be negotiated among
negotiated among endpoints. endpoints.
[RFC3429] reserves a label for ITU-T Y.1711, however Y.1711 does not [RFC3429] reserves a label for ITU-T Y.1711; however, Y.1711 does not
work with multipath and its use is strongly discouraged. work with multipath and its use is strongly discouraged.
The current list of special purpose labels can be found on the The current list of special-purpose labels can be found on the
"Multiprotocol Label Switching Architecture (MPLS) Label Values" "Multiprotocol Label Switching Architecture (MPLS) Label Values"
registry reachable at IANA's pages at http://www.iana.org. registry reachable at IANA's pages at <http://www.iana.org>.
[I-D.ietf-mpls-special-purpose-labels] introduces an IANA "Extended [RFC7274] introduces an IANA "Extended Special-Purpose MPLS Label
Special Purpose MPLS Label Values" registry and makes use of the Values" registry and makes use of the "extension" label, label 15, to
"extension" label, label 15, to indicate that the next label is an indicate that the next label is an extended special-purpose label and
extended special purpose label and requires special handling. The requires special handling. The range of only 16 values for special-
range of only 16 values for special purpose labels allows a table to purpose labels allows a table to be used. The range of extended
be used. The range of extended special purpose labels with 20 bits special-purpose labels with 20 bits available for use may have to be
available for use may have to be handled in some other way in the handled in some other way in the unlikely event that in the future
unlikely event that in the future the range of currently reserved the range of currently reserved values 256-1048575 is used. If only
values 256-1048575 are used. If only the standards action range, the Standards Action range, 16-239, and the Experimental range,
16-239, and the experimental range, 240-255, are used, then a table 240-255, are used, then a table of 256 entries can be used.
of 256 entries can be used.
Unknown special purpose labels and unknown extended special purpose Unknown special-purpose labels and unknown extended special-purpose
labels are handled the same. When an unknown special purpose label labels are handled the same. When an unknown special-purpose label
is encountered or a special purpose label not directly handled in is encountered or a special purpose label not directly handled in
forwarding hardware is encountered, the packet should be sent to a forwarding hardware is encountered, the packet should be sent to a
general purpose CPU by default. If this capability is supported, general-purpose CPU by default. If this capability is supported,
there must be an option to either drop or rate limit such packets on there must be an option to either drop or rate limit such packets
a per special purpose label value basis. based on the value of each special-purpose label.
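The default handling described above can be sketched as a small classification step. The following Python is a hedged illustration only: the set of labels handled in the fast path (HW_SPL), the action names, the policy dictionary keyed by (is_extended, label value), and the choice to drop a packet whose extension label is not followed by another label are all assumptions made for the example. The label values themselves (0 and 2 for the Explicit NULL labels, 7 for ELI, 13 for GAL, 15 for the extension label) are the registered ones.

```python
from typing import Dict, List, Tuple

EXTENSION_LABEL = 15  # "extension" label (RFC 7274)
PUNT = "punt-to-cpu"
DROP = "drop"
RATE_LIMIT = "rate-limit"

# Assumed example: labels that this hypothetical hardware handles itself.
HW_SPL: Dict[int, str] = {
    0: "ipv4-explicit-null",
    2: "ipv6-explicit-null",
    7: "entropy-label-indicator",
    13: "gal",
}
HW_EXT_SPL: Dict[int, str] = {}  # no extended labels handled in hardware

# Operator policy per label value: (is_extended, value) -> DROP or RATE_LIMIT.
Policy = Dict[Tuple[bool, int], str]


def classify(labels: List[int], policy: Policy) -> str:
    """Return the action for a packet given its label values, top first."""
    top = labels[0]
    if top > 15:
        return "normal-ilm-lookup"      # not a special-purpose label
    extended = top == EXTENSION_LABEL
    if extended:
        if len(labels) < 2:
            return DROP                 # assumption: malformed stack is dropped
        value, table = labels[1], HW_EXT_SPL
    else:
        value, table = top, HW_SPL
    if value in table:
        return table[value]             # handled in forwarding hardware
    # Unknown or unsupported: punt by default unless policy overrides.
    return policy.get((extended, value), PUNT)
```

Keying the policy by label value reflects the requirement that dropping or rate limiting be configurable for each special-purpose label rather than globally.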
2.1.2. MPLS Differentiated Services 2.1.2. MPLS Differentiated Services
[RFC2474] deprecates the IP Type of Service (TOS) and IP Precedence [RFC2474] deprecates the IP Type of Service (TOS) and IP Precedence
(Prec) fields and replaces them with the Differentiated Services (Prec) fields and replaces them with the Differentiated Services
Field more commonly known as the Differentiated Services Code Point Field more commonly known as the Differentiated Services Code Point
(DSCP) field. [RFC2475] defines the Differentiated Services (DSCP) field. [RFC2475] defines the Differentiated Services
architecture, which in other forums, is often called a Quality of architecture, which in other forums, is often called a Quality of
Service (QoS) architecture. Service (QoS) architecture.
MPLS uses the Traffic Class (TC) field to support Differentiated MPLS uses the Traffic Class (TC) field to support Differentiated
Services [RFC5462]. There are two primary documents describing how Services [RFC5462]. There are two primary documents describing how
DSCP is mapped into TC. DSCP is mapped into TC.
1. [RFC3270] defines E-LSP and L-LSP. E-LSP use a static mapping of 1. [RFC3270] defines E-LSP and L-LSP. E-LSP uses a static mapping
DSCP into TC. L-LSP uses a per LSP mapping of DSCP into TC, with of DSCP into TC. L-LSP uses a per-LSP mapping of DSCP into TC,
one PHB Scheduling Class (PSC) per L-LSP. Each PSC can use with one PHB Scheduling Class (PSC) per L-LSP. Each PSC can use
multiple Per-Hop Behavior (PHB) values. For example, the Assured multiple Per-Hop Behavior (PHB) values. For example, the Assured
Forwarding service defines three PSC, each with three PHB Forwarding service defines three PSCs, each with three PHB
[RFC2597]. [RFC2597].
2. [RFC4124] defines assignment of a class-type (CT) to an LSP, 2. [RFC4124] defines assignment of a class-type (CT) to an LSP,
where a per CT static mapping of TC to PHB is used. [RFC4124] where a per-CT static mapping of TC to PHB is used. [RFC4124]
provides a means to support up to eight E-LSP-like mappings of provides a means to support up to eight E-LSP-like mappings of
DSCP to TC. DSCP to TC.
To meet Differentiated Services requirements specified in [RFC3270], To meet Differentiated Services requirements specified in [RFC3270],
the following forwarding requirements must be met. An ingress LER the following forwarding requirements must be met. An ingress LER
MUST be able to select an LSP and then apply a per LSP map of DSCP MUST be able to select an LSP and then apply a per-LSP map of DSCP
into TC. A midpoint LSR MUST be able to apply a per LSP map of TC to into TC. A midpoint LSR MUST be able to apply a per-LSP map of TC to
PHB. The number of mappings supported will be far less than the PHB. The number of mappings supported will be far less than the
number of LSP supported. number of LSPs supported.
To meet Differentiated Services requirements specified in [RFC4124], To meet Differentiated Services requirements specified in [RFC4124],
the following forwarding requirements must be met. An ingress LER the following forwarding requirements must be met. An ingress LER
MUST be able to select an LSP and then apply a per LSP map of DSCP MUST be able to select an LSP and then apply a per-LSP map of DSCP
into TC. A midpoint LSR MUST be able to apply a per LSP map to CT into TC. A midpoint LSR MUST be able to map LSP number to Class Type
map and then use Class Type (CT) to map TC to PHB. Since there are (CT), then use a per-CT map to map TC to PHB. Since there are only
only eight allowed values of CT, only eight maps of TC to PHB need to eight allowed values of CT, only eight maps of TC to PHB need to be
be supported. The LSP label can be used directly to find the TC to supported. The LSP label can be used directly to find the TC-to-PHB
PHB mapping, as is needed to support [RFC3270] L-LSP. mapping, as is needed to support L-LSPs as defined by [RFC3270].
While support for [RFC4124] and not [RFC3270] would allow support for While support for [RFC4124] and not [RFC3270] would allow support for
only eight mappings of TC to PHB, it is common to support both and only eight mappings of TC to PHB, it is common to support both and
simply state a limit on the number of unique TC to PHB mappings which simply state a limit on the number of unique TC-to-PHB mappings that
can be supported. can be supported.
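One way to picture the midpoint LSR behavior described above is a small pool of 8-entry TC-to-PHB tables shared by many LSPs, with each LSP holding only an index into the pool. The sketch below assumes that structure; the class name DiffServMidpoint, its methods, and the PHB names used as strings are hypothetical. For [RFC4124], the index would be the Class Type and the pool size would be eight.

```python
from typing import Dict, List


class DiffServMidpoint:
    """Per-LSP selection of a TC-to-PHB map, with a bounded number of maps."""

    def __init__(self, max_maps: int) -> None:
        self.max_maps = max_maps
        self.maps: List[List[str]] = []        # map id -> PHB for TC 0..7
        self.lsp_map_id: Dict[int, int] = {}   # incoming label -> map id

    def add_map(self, tc_to_phb: List[str]) -> int:
        """Install a map, reusing an identical one if already present."""
        if len(tc_to_phb) != 8:
            raise ValueError("a TC-to-PHB map needs exactly 8 entries")
        if tc_to_phb in self.maps:
            return self.maps.index(tc_to_phb)
        if len(self.maps) >= self.max_maps:
            raise RuntimeError("limit on unique TC-to-PHB maps reached")
        self.maps.append(list(tc_to_phb))
        return len(self.maps) - 1

    def bind_lsp(self, label: int, map_id: int) -> None:
        if not 0 <= map_id < len(self.maps):
            raise ValueError("unknown map id")
        self.lsp_map_id[label] = map_id

    def phb(self, label: int, tc: int) -> str:
        """Look up the PHB for a packet from its top label and TC value."""
        return self.maps[self.lsp_map_id[label]][tc]
```

Because many LSPs share one map, the number of maps can stay far below the number of LSPs, which matches the stated expectation that an implementation simply advertises a limit on unique mappings.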
2.1.3. Time Synchronization 2.1.3. Time Synchronization
PTP or NTP may be carried over MPLS [I-D.ietf-tictoc-1588overmpls]. PTP or NTP may be carried over MPLS [TIMING-OVER-MPLS]. Generally,
Generally NTP will be carried within IP with IP carried in MPLS NTP will be carried within IP, and IP will be carried in MPLS
[RFC5905]. Both PTP and NTP benefit from accurate time stamping of [RFC5905]. Both PTP and NTP benefit from accurate timestamping of
incoming packets and the ability to insert accurate time stamps in incoming packets and the ability to insert accurate timestamps in
outgoing packets. PTP correction which occurs when forwarding outgoing packets. PTP correction that occurs when forwarding
requires updating a timestamp compensation field based on the requires updating a timestamp compensation field based on the
difference between packet arrival at an LSR and packet transmit time difference between packet arrival at an LSR and packet transmit time
at that same LSR. at that same LSR.
Since the label stack depth may vary, hardware should allow a Since the label stack depth may vary, hardware should allow a
timestamp to be placed in an outgoing packet at any specified byte timestamp to be placed in an outgoing packet at any specified byte
position. It may be necessary to modify layer-2 checksums or frame position. It may be necessary to modify Layer 2 checksums or frame
check sequences after insertion. PTP and NTP timestamp formats check sequences after insertion. PTP and NTP timestamp formats
differ in such a way as to require different implementations of the differ in such a way as to require different implementations of the
timestamp correction. If NTP or PTP is carried over UDP/IP or UDP/IP timestamp correction. If NTP or PTP is carried over UDP/IP or
/MPLS, the UDP checksum will also have to be updated. UDP/IP/MPLS, the UDP checksum will also have to be updated.
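The checksum adjustment mentioned above does not require recomputing over the whole datagram. A minimal sketch follows, using the incremental update from RFC 1624 (HC' = ~(~HC + ~m + m')). It assumes the buffer starts at the UDP header, that the timestamp field begins at an even byte offset and has even length, and it ignores both the Layer 2 FCS and the UDP rule that a computed checksum of zero is sent as 0xFFFF. The function names are illustrative.

```python
def fold16(total: int) -> int:
    """Fold carries back in to produce a 16-bit one's complement sum."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def internet_checksum(data: bytes) -> int:
    """Full one's complement checksum, used here only as a reference."""
    data = bytes(data)
    if len(data) % 2:
        data += b"\x00"
    total = sum((data[i] << 8) | data[i + 1] for i in range(0, len(data), 2))
    return ~fold16(total) & 0xFFFF


def insert_timestamp(buf: bytearray, offset: int, stamp: bytes,
                     old_cksum: int) -> int:
    """Overwrite buf[offset:] with `stamp`; return the adjusted checksum."""
    if offset % 2 or len(stamp) % 2:
        raise ValueError("sketch assumes 16-bit aligned, even-length field")
    if offset + len(stamp) > len(buf):
        raise ValueError("timestamp does not fit in the buffer")
    total = ~old_cksum & 0xFFFF
    for i in range(0, len(stamp), 2):
        old_word = (buf[offset + i] << 8) | buf[offset + i + 1]
        new_word = (stamp[i] << 8) | stamp[i + 1]
        total += (~old_word & 0xFFFF) + new_word   # remove old, add new
    buf[offset:offset + len(stamp)] = stamp
    return ~fold16(total) & 0xFFFF
```

Taking the byte offset as a parameter reflects the point that the label stack depth, and therefore the position of the timestamp, can differ from packet to packet.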
Accurate time synchronization in addition to being generally useful Accurate time synchronization, in addition to being generally useful,
is required for MPLS-TP delay measurement (DM) OAM. See is required for MPLS-TP Delay Measurement (DM) OAM. See
Section 2.6.4. Section 2.6.4.
2.1.4. Uses of Multiple Label Stack Entries 2.1.4. Uses of Multiple Label Stack Entries
MPLS deployments in the early part of the prior decade (circa 2000) MPLS deployments in the early part of the prior decade (circa 2000)
tended to support either LDP or RSVP-TE. LDP was favored by some for tended to support either LDP or RSVP-TE. LDP was favored by some for
its ability to scale to a very large number of PE devices at the edge its ability to scale to a very large number of PE devices at the edge
of the network, without adding deployment complexity. RSVP-TE was of the network, without adding deployment complexity. RSVP-TE was
favored, generally in the network core, where traffic engineering and favored, generally in the network core, where traffic engineering
/or fast reroute were considered important. and/or fast reroute were considered important.
Both LDP and RSVP-TE are used simultaneously within major Service Both LDP and RSVP-TE are used simultaneously within major service
Provider networks using a technique known as "LDP over RSVP-TE provider networks using a technique known as "LDP over RSVP-TE
Tunneling". This technique allows service providers to carry LDP Tunneling". This technique allows service providers to carry LDP
tunnels inside RSVP-TE tunnels. This makes it possible to take tunnels inside RSVP-TE tunnels. This makes it possible to take
advantage of the Traffic Engineering and Fast Re-Route on more advantage of the traffic engineering and fast reroute on more
expensive Inter-City and Inter-Continental transport paths. The expensive intercity and intercontinental transport paths. The
ingress RSVP-TE PEs places many LDP tunnels on a single RSVP-TE LSP ingress RSVP-TE PE places many LDP tunnels on a single RSVP-TE LSP
and carries it to the egress RSVP-TE PE. The LDP PEs are situated and carries it to the egress RSVP-TE PE. The LDP PEs are situated
further from the core, for example within a metro network. LDP over further from the core, for example, within a metro network. LDP over
RSVP-TE tunneling requires a minimum of two MPLS labels: one each for RSVP-TE tunneling requires a minimum of two MPLS labels: one each for
LDP and RSVP-TE. LDP and RSVP-TE.
The use of MPLS FRR [RFC4090] might add one more label to MPLS The use of MPLS FRR [RFC4090] might add one more label to MPLS
traffic, but only when FRR protection is in use (active). If LDP traffic but only when FRR protection is in use (active). If LDP over
over RSVP-TE is in use, and FRR protection is in use, then at least RSVP-TE is in use, and FRR protection is in use, then at least three
three MPLS labels are present on the label stack on the links through MPLS labels are present on the label stack on the links through which
which the Bypass LSP traverses. FRR is covered in Section 2.1.7. the Bypass LSP traverses. FRR is covered in Section 2.1.7.
LDP L2VPN, LDP IPVPN, BGP L2VPN, and BGP IPVPN added support for VPN LDP L2VPN, LDP IPVPN, BGP L2VPN, and BGP IPVPN added support for VPN
services that are deployed by the vast majority of service providers. services that are deployed by the vast majority of service providers.
These VPN services added yet another label, bringing the label stack These VPN services added yet another label, bringing the label stack
depth (when FRR is active) to four. depth (when FRR is active) to four.
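The label counts in the preceding paragraphs can be restated as simple arithmetic. The function below is only a worked illustration of those counts; its name and parameters are hypothetical, and it does not account for hierarchy, Entropy Label, or Flow Label, which can add further labels.

```python
def label_stack_depth(ldp_over_rsvp_te: bool, frr_active: bool,
                      vpn_service: bool) -> int:
    """Count labels for the deployment scenarios described in the text."""
    depth = 2 if ldp_over_rsvp_te else 1   # one label each for LDP and RSVP-TE
    if frr_active:
        depth += 1                         # bypass label while protection is in use
    if vpn_service:
        depth += 1                         # service label for L2VPN or IPVPN
    return depth
```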
Pseudowires and VPN are discussed in further detail in Section 2.1.8 Pseudowires and VPN are discussed in further detail in Sections 2.1.8
and Section 2.1.9. and 2.1.9.
MPLS hierarchy as described in [RFC4206] and updated by [RFC7074] can MPLS hierarchy as described in [RFC4206] and updated by [RFC7074] can
in principle add at least one additional label. MPLS hierarchy is in principle add at least one additional label. MPLS hierarchy is
discussed in Section 2.1.6. discussed in Section 2.1.6.
Other features such as Entropy Label (discussed in Section 2.4.4) and Other features such as Entropy Label (discussed in Section 2.4.4) and
Flow Label (discussed in Section 2.4.3) can add additional labels to Flow Label (discussed in Section 2.4.3) can add additional labels to
the label stack. the label stack.
Although theoretical scenarios can easily result in eight or more Although theoretical scenarios can easily result in eight or more
labels, such cases are rare if they occur at all today. For the labels, such cases are rare if they occur at all today. For the
purpose of forwarding, only the top label needs to be examined if PHP purpose of forwarding, only the top label needs to be examined if PHP
is used, a few more if UHP is used (see Section 2.5). For deep label is used, and a few more if UHP is used (see Section 2.5). For deep
stacks, quite a few labels may have to be examined for the purpose of label stacks, quite a few labels may have to be examined for the
load balancing across parallel links (see Section 2.4), however this purpose of load balancing across parallel links (see Section 2.4);
depth can be bounded by a provider through use of Entropy Label. however, this depth can be bounded by a provider through use of
Entropy Label.
Other creative use of MPLS within the IETF, such as the use of MPLS Other creative uses of MPLS within the IETF, such as the use of MPLS
label stack in source routing, may result in label stacks that are label stack in source routing, may result in label stacks that are
considerably deeper than those encountered today. considerably deeper than those encountered today.
2.1.5. MPLS Link Bundling 2.1.5. MPLS Link Bundling
MPLS Link Bundling was the first RFC to address the need for multiple MPLS Link Bundling was the first RFC to address the need for multiple
parallel links between nodes [RFC4201]. MPLS Link Bundling is parallel links between nodes [RFC4201]. MPLS Link Bundling is
notable in that it tried not to change MPLS forwarding, except in notable in that it tried not to change MPLS forwarding, except in
specifying the "All-Ones" component link. MPLS Link Bundling is specifying the "all-ones" component link. MPLS Link Bundling is
seldom if ever deployed. Instead multipath techniques described in seldom if ever deployed. Instead, multipath techniques described in
Section 2.4 are used. Section 2.4 are used.
2.1.6. MPLS Hierarchy 2.1.6. MPLS Hierarchy
MPLS hierarchy is defined in [RFC4206] and updated by [RFC7074]. MPLS hierarchy is defined in [RFC4206] and updated by [RFC7074].
Although RFC4206 is considered part of GMPLS, the Packet Switching Although RFC 4206 is considered part of GMPLS, the Packet Switching
Capable (PSC) portion of the MPLS hierarchy are applicable to MPLS Capable (PSC) portion of the MPLS hierarchy is applicable to MPLS and
and may be supported in an otherwise GMPLS free implementation. The may be supported in an otherwise GMPLS-free implementation. The MPLS
MPLS PSC hierarchy remains the most likely means of providing further PSC hierarchy remains the most likely means of providing further
scaling in an RSVP-TE MPLS network, particularly where the network is scaling in an RSVP-TE MPLS network, particularly where the network is
designed to provide RSVP-TE connectivity to the edges. This is the designed to provide RSVP-TE connectivity to the edges. This is the
case for envisioned MPLS-TP networks. The use of the MPLS PSC case for envisioned MPLS-TP networks. The use of the MPLS PSC
hierarchy can add at least one additional label to a label stack, hierarchy can add at least one additional label to a label stack,
though it is likely that only one layer of PSC will be used in the though it is likely that only one layer of PSC will be used in the
near future. near future.
2.1.7. MPLS Fast Reroute (FRR) 2.1.7. MPLS Fast Reroute (FRR)
Fast reroute is defined by [RFC4090]. Two significantly different Fast reroute is defined by [RFC4090]. Two significantly different
methods are defined in RFC4090, the "One-to-One Backup" method which methods are defined in RFC 4090: the "One-to-One Backup" method,
uses the "Detour LSP" and the "Facility Backup" which uses a "bypass which uses the "Detour LSP", and the "Facility Backup", which uses a
tunnel". These are commonly referred to as the detour and bypass "bypass tunnel". These are commonly referred to as the detour and
methods respectively. bypass methods, respectively.
The detour method makes use of a presignaled LSP. Hardware The detour method makes use of a presignaled LSP. Hardware
assistance is needed for detour FRR only if necessary to accomplish assistance may be needed for detour FRR in order to accomplish local
local repair of a large number of LSP within the 10s of milliseconds repair of a large number of LSPs within the target of tens of
target. For each affected LSP a swap operation must be reprogrammed milliseconds. For each affected LSP, a swap operation must be
or otherwise switched over. The use of detour FRR doubles the number reprogrammed or otherwise switched over. The use of detour FRR
of LSP terminating at any given hop and will increase the number of doubles the number of LSPs terminating at any given hop and will
LSP within a network by a factor dependent on the average detour path increase the number of LSPs within a network by a factor dependent on
length. the average detour path length.
The bypass method makes use of a tunnel that is unused when no fault The bypass method makes use of a tunnel that is unused when no fault
exists but may carry many LSP when a local repair is required. There exists but may carry many LSPs when a local repair is required.
is no presignaling indicating which working LSP will be diverted into There is no presignaling indicating which working LSP will be
any specific bypass LSP. If interface label space is used the bypass diverted into any specific bypass LSP. If interface label space is
LSP MUST extend one hop beyond the merge point, except if the merge used, the bypass LSP MUST extend one hop beyond the merge point,
point is the egress and PHP is used. If the bypass LSP are not except if the merge point is the egress and PHP is used. If the
extended in this way, then the merge LSR (egress LSR of the bypass bypass LSPs are not extended in this way, then the merge LSR (egress
LSP) MUST use platform label space (as defined in [RFC3031]) so that LSR of the bypass LSP) MUST use platform label space (as defined in
an LSP working path on any given interface can be backed up using a [RFC3031]) so that an LSP working path on any given interface can be
bypass LSP terminating on any other interface. Hardware assistance backed up using a bypass LSP terminating on any other interface.
is needed if necessary to accomplish local repair of a large number Hardware assistance may be needed to accomplish local repair of a
of LSP within the 10s of milliseconds target. For each affected LSP large number of LSPs within the target of tens of milliseconds. For
a swap operation must be reprogrammed or otherwise switched over with each affected LSP a swap operation must be reprogrammed or otherwise
an additional push of the bypass LSP label. The use of platform switched over with an additional push of the bypass LSP label. The
label space impacts the size of the LSR ILM for LSR with a very large use of platform label space impacts the size of the LSR ILM for an
number of interfaces. LSR with a very large number of interfaces.
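The bypass operation described above, a swap followed by an additional push of the bypass LSP label, can be sketched as follows. This is a simplified model under stated assumptions: the IlmEntry structure, the interface names, and the decision to raise an error for an unprotected LSP are all hypothetical, and a real implementation would switch over precomputed state rather than test for failure on each packet.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class IlmEntry:
    out_label: int                      # label expected by the next hop / merge point
    out_if: str                         # primary outgoing interface
    bypass_label: Optional[int] = None  # label of the bypass LSP, if protected
    bypass_if: Optional[str] = None     # interface on which the bypass LSP leaves


def forward(stack: List[int], ilm: Dict[int, IlmEntry],
            failed_ifs: Set[str]) -> Tuple[List[int], str]:
    """Return the outgoing label stack (top first) and interface."""
    entry = ilm[stack[0]]
    swapped = [entry.out_label] + stack[1:]        # normal swap
    if entry.out_if not in failed_ifs:
        return swapped, entry.out_if
    if entry.bypass_label is None or entry.bypass_if is None:
        raise RuntimeError("primary interface failed and LSP is unprotected")
    # Local repair: keep the swap, then push the bypass LSP label on top.
    return [entry.bypass_label] + swapped, entry.bypass_if
```

The extra label appears only while protection is active, which is why the label stack is one entry deeper on the links traversed by the bypass LSP.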
IP/LDP Fast Reroute (IP/LDR FRR) [RFC5714] is also applicable in MPLS IP/LDP Fast Reroute (IP/LDP FRR) [RFC5714] is also applicable in MPLS
networks. ECMP and Loop-Free Alternates (LFA) [RFC5286] are well networks. ECMP and Loop-Free Alternates (LFAs) [RFC5286] are well-
established IP/LDP FRR techniques and were the first methods to be established IP/LDP FRR techniques and were the first methods to be
widely deployed. Work on IP/LDP FRR is ongoing within the IETF widely deployed. Work on IP/LDP FRR is ongoing within the IETF
RTGWG. Two topics actively discussed in RTGWG are microloops and RTGWG. Two topics actively discussed in RTGWG are microloops and
partial coverage of the established techniques in some network partial coverage of the established techniques in some network
topologies. [RFC5715] covers the topic of IP/LDP Fast Reroute topologies. [RFC5715] covers the topic of IP/LDP Fast Reroute
microloops and microloops prevention. RTGWG has developed additional microloops and microloop prevention. RTGWG has developed additional
IP/LDP FRR techniques to handle coverage concerns. RTGWG is IP/LDP FRR techniques to handle coverage concerns. RTGWG is
extending LFA through the use of remote LFA extending LFA through the use of remote LFA [REMOTE-LFA]. Other
[I-D.ietf-rtgwg-remote-lfa]. Other techniques that require new techniques that require new forwarding paths to be established are
forwarding paths to be established are also under consideration, also under consideration, including the IPFRR "not-via" technique
including the IPFRR "not-via" technique defined in [RFC6981] and defined in [RFC6981] and maximally redundant trees (MRT) [MRT].
maximally redundant trees (MRT) ECMP, LFA (but not remote LFA), and MRT swap the top label to an
[I-D.ietf-rtgwg-mrt-frr-architecture]. ECMP, LFA (but not remote alternate MPLS label. The other methods operate in a similar manner
LFA) and MRT swap the top label to an alternate MPLS label. The to the facility backup described in RFC 4090 and push an additional
other methods operate in a similar manner to RFC 4090 facility backup label. IP/LDP FRR methods that push more than one label have been
and push an additional label. IP/LDP FRR methods which push more suggested but are in early discussion.
than one label have been suggested but are in early discussion.
2.1.8. Pseudowire Encapsulation 2.1.8. Pseudowire Encapsulation
The pseudowire (PW) architecture is defined in [RFC3985]. A The pseudowire (PW) architecture is defined in [RFC3985]. A
pseudowire, when carried over MPLS, adds one or more additional label pseudowire, when carried over MPLS, adds one or more additional label
entries to the MPLS label stack. A PW Control Word is defined in entries to the MPLS label stack. A PW Control Word is defined in
[RFC4385] with motivation for defining the control word in [RFC4928]. [RFC4385] with motivation for defining the Control Word in [RFC4928].
The PW Associated Channel defined in [RFC4385] is used for OAM in The PW Associated Channel defined in [RFC4385] is used for OAM in
[RFC5085]. The PW Flow Label is defined in [RFC6391] and is [RFC5085]. The PW Flow Label is defined in [RFC6391] and is
discussed further in this document in Section 2.4.3. discussed further in this document in Section 2.4.3.
There are numerous pseudowire encapsulations, supporting emulation of There are numerous pseudowire encapsulations, supporting emulation of
services such as Frame Relay, ATM, Ethernet, TDM, and SONET/SDH over services such as Frame Relay, ATM, Ethernet, TDM, and SONET/SDH over
packet switched networks (PSNs) using IP or MPLS. packet switched networks (PSNs) using IP or MPLS.
The pseudowire encapsulation is out of scope for this document. The pseudowire encapsulation is out of scope for this document.
Pseudowire impact on MPLS forwarding at midpoint LSR is within scope. Pseudowire impact on MPLS forwarding at the midpoint LSR is within
The impact on ingress MPLS push and egress MPLS UHP pop are within scope. The impact on ingress MPLS push and egress MPLS UHP pop are
scope. While pseudowire encapsulation is out of scope, some advice within scope. While pseudowire encapsulation is out of scope, some
is given on sequence number support. advice is given on Sequence Number support.
2.1.8.1. Pseudowire Sequence Number 2.1.8.1. Pseudowire Sequence Number
Pseudowire (PW) sequence number support is most important for PW Pseudowire (PW) Sequence Number support is most important for PW
payload types with a high expectation of lossless and/or in-order payload types with a high expectation of lossless and/or in-order
delivery. Identifying lost PW packets and the exact amount of lost delivery. Identifying lost PW packets and the exact amount of lost
payload is critical for PW services which maintain bit timing, such payload is critical for PW services that maintain bit timing, such as
as Time Division Multiplexing (TDM) services since these services Time Division Multiplexing (TDM) services since these services MUST
MUST compensate lost payload on a bit-for-bit basis. compensate lost payload on a bit-for-bit basis.
With PW services which maintain bit timing, packets that have been With PW services that maintain bit timing, packets that have been
received out of order also MUST be identified and MAY be either re- received out of order also MUST be identified and MAY be either
ordered or dropped. Resequencing requires, in addition to sequence reordered or dropped. Resequencing requires, in addition to sequence
numbering, a "reorder buffer" in the egress PE, and ability to numbering, a "reorder buffer" in the egress PE, and the ability to
reorder is limited by the depth of this buffer. The down side of reorder is limited by the depth of this buffer. The down side of
maintaining a large reorder buffer is added end-to-end service delay. maintaining a large reorder buffer is added end-to-end service delay.
For PW services which maintain bit timing or any other service where For PW services that maintain bit timing or any other service where
jitter must be bounded, a jitter buffer is always necessary. The jitter must be bounded, a jitter buffer is always necessary. The
jitter buffer is needed regardless of whether reordering is done. In jitter buffer is needed regardless of whether reordering is done. In
order to be effective, a reorder buffer must often be larger than a order to be effective, a reorder buffer must often be larger than a
jitter buffer needs to be creating a tradeoff between reducing loss jitter buffer needs to be, thus creating a tradeoff between reducing
and minimizing delay. loss and minimizing delay.
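As an illustrative sketch only, the bounded reorder buffer described above could be modeled as follows. The class name ReorderBuffer, the depth parameter, and the use of plain increasing integers as sequence numbers are assumptions made for this example; wraparound of real PW Sequence Numbers is deliberately ignored.

```python
class ReorderBuffer:
    """Toy egress-PE reorder buffer with bounded depth (hypothetical model)."""

    def __init__(self, depth, first_seq=1):
        self.depth = depth          # max packets held while waiting for a gap to fill
        self.expected = first_seq   # next sequence number to deliver
        self.held = {}              # seq -> payload for packets that arrived early
        self.lost = 0               # sequence numbers given up as lost

    def receive(self, seq, payload):
        """Accept one packet and return the payloads released in order."""
        if seq < self.expected or seq in self.held:
            return []               # arrived too late, or duplicate: drop
        self.held[seq] = payload
        released = []
        self._drain(released)
        # Depth exceeded: stop waiting for the gap and skip ahead to the
        # oldest packet that is actually held.
        while len(self.held) > self.depth:
            oldest = min(self.held)
            self.lost += oldest - self.expected
            self.expected = oldest
            self._drain(released)
        return released

    def _drain(self, released):
        """Release every consecutively numbered packet now available."""
        while self.expected in self.held:
            released.append(self.held.pop(self.expected))
            self.expected += 1
```

A deeper buffer tolerates larger reordering before declaring loss, but every packet held behind a gap is delayed until the gap fills or the depth is exceeded, which is the loss vs. delay tradeoff noted above.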
PW services which are not timing critical bit streams in nature are PW services that are not timing critical bit streams in nature are
cell oriented or frame oriented. Though resequencing support may be cell oriented or frame oriented. Though resequencing support may be
beneficial to PW cell and frame oriented payloads such as ATM, FR and beneficial to PW cell- and frame-oriented payloads such as ATM, FR,
Ethernet, this support is desirable but not required. Requirements and Ethernet, this support is desirable but not required.
to handle out of order packets at all vary among services and Requirements to handle out-of-order packets at all vary among
deployments. For example for Ethernet PW, occasional (very rare) services and deployments. For example, for Ethernet PW, occasional
reordering is usually acceptable. If the Ethernet PW is carrying (very rare) reordering is usually acceptable. If the Ethernet PW is
MPLS-TP, then this reordering may be acceptable. carrying MPLS-TP, then this reordering may be acceptable.
Reducing jitter is best done by an end-system, given that the Reducing jitter is best done by an end-system, given that the
tradeoff of loss vs delay varies among services. For example with tradeoff of loss vs. delay varies among services. For example, with
interactive real time services low delay is preferred, while with interactive real-time services, low delay is preferred, while with
non-interactive (one way) real time services low loss is preferred. non-interactive (one-way) real-time services, low loss is preferred.
The same end-site may be receiving both types of traffic. Regardless The same end-site may be receiving both types of traffic. Regardless
of this, bounded jitter is sometimes a requirement for specific of this, bounded jitter is sometimes a requirement for specific
deployments. deployments.
Packet reordering should be rare except in a small number of Packet reordering should be rare except in a small number of
circumstances, most of which are due to network design or equipment circumstances, most of which are due to network design or equipment
design errors: design errors:
1. The most common case is where reordering is rare, occurring only 1. The most common case is where reordering is rare, occurring only
when a network or equipment fault forces traffic on a new path when a network or equipment fault forces traffic on a new path
with different delay. The packet loss that accompanies a network with different delay. The packet loss that accompanies a network
or equipment fault is generally more disruptive than any or equipment fault is generally more disruptive than any
reordering which may occur. reordering that may occur.
2. A path change can be caused by reasons other than a network or 2. A path change can be caused by reasons other than a network or
equipment fault, such as administrative routing change. This may equipment fault, such as an administrative routing change. This
result in packet reordering but generally without any packet may result in packet reordering but generally without any packet
loss. loss.
3. If the edge is not using pseudowire control word (CW) and the 3. If the edge is not using pseudowire Control Word (CW) and the
core is using multipath, reordering will be far more common. If core is using multipath, reordering will be far more common. If
this is occurring, using CW on the edge will solve the problem. this is occurring, using CW on the edge will solve the problem.
Without CW, resequencing is not possible since the sequence Without CW, resequencing is not possible since the Sequence
number is contained in the CW. Number is contained in the CW.
4. Another avoidable case is where some core equipment has multipath 4. Another avoidable case is where some core equipment has multipath
and for some reason insists on periodically installing a new and for some reason insists on periodically installing a new
random number as the multipath hash seed. If supporting MPLS-TP, random number as the multipath hash seed. If supporting MPLS-TP,
equipment MUST provide a means to disable periodic hash reseeding equipment MUST provide a means to disable periodic hash
and deployments MUST disable periodic hash reseeding. Operator reseeding, and deployments MUST disable periodic hash reseeding.
experience dictates that even if not supporting MPLS-TP, Operator experience dictates that even if not supporting MPLS-TP,
equipment SHOULD provide a means to disable periodic hash equipment SHOULD provide a means to disable periodic hash
reseeding and deployments SHOULD disable periodic hash reseeding. reseeding, and deployments SHOULD disable periodic hash
reseeding.
In provider networks which use multipath techniques and which may In provider networks that use multipath techniques and that may
occasionally rebalance traffic or which may change PW paths occasionally rebalance traffic or that may change PW paths
occasionally for other reasons, reordering may be far more common occasionally for other reasons, reordering may be far more common
than loss. Where reordering is more common than loss, resequencing than loss. Where reordering is more common than loss, resequencing
packets is beneficial, rather than dropping packets at egress when packets is beneficial, rather than dropping packets at egress when
out of order arrival occurs. Resequencing is most important for PW out-of-order arrival occurs. Resequencing is most important for PW
payload types with a high expectation of lossless delivery since in payload types with a high expectation of lossless delivery since in
such cases out of order delivery within the network results in PW such cases out-of-order delivery within the network results in PW
loss. loss.
2.1.9. Layer-2 and Layer-3 VPN 2.1.9. Layer 2 and Layer 3 VPN
Layer-2 VPN [RFC4664] and Layer-3 VPN [RFC4110] add one or more label Layer 2 VPN [RFC4664] and Layer 3 VPN [RFC4110] add one or more label
entries to the MPLS label stack. VPN encapsulations are out of scope entries to the MPLS label stack. VPN encapsulations are out of scope
for this document. Its impact on forwarding at midpoint LSR is for this document. Their impact on forwarding at the midpoint LSR
within scope. is within scope.
Any of these services may be used on an MPLS entropy label enabled Any of these services may be used on an ingress and egress that are
ingress and egress (see Section 2.4.4 for discussion of entropy MPLS Entropy Label enabled (see Section 2.4.4 for discussion of
label) which would add an additional two labels to the MPLS label Entropy Label); this would add an additional two labels to the MPLS
stack. The need to provide a useful entropy label value impacts the label stack. The need to provide a useful Entropy Label value
requirements of the VPN ingress LER but is out of scope for this impacts the requirements of the VPN ingress LER but is out of scope
document. for this document.
2.2. MPLS Multicast 2.2. MPLS Multicast
MPLS Multicast encapsulation is clarified in [RFC5332]. MPLS MPLS Multicast encapsulation is clarified in [RFC5332]. MPLS
Multicast may be signaled using RSVP-TE [RFC4875] or LDP [RFC6388]. Multicast may be signaled using RSVP-TE [RFC4875] or LDP [RFC6388].
[RFC4875] defines a root initiated RSVP-TE LSP setup rather than leaf [RFC4875] defines a root-initiated RSVP-TE LSP setup rather than the
initiated join used in IP multicast. [RFC6388] defines a leaf leaf-initiated join used in IP multicast. [RFC6388] defines a leaf-
initiated LDP setup. Both [RFC4875] and [RFC6388] define point to initiated LDP setup. Both [RFC4875] and [RFC6388] define point-to-
multipoint (P2MP) LSP setup. [RFC6388] also defines multipoint to multipoint (P2MP) LSP setup. [RFC6388] also defines multipoint-to-
multipoint (MP2MP) LSP setup. multipoint (MP2MP) LSP setup.
The P2MP LSP have a single source. An LSR may be a leaf node, an The P2MP LSPs have a single source. An LSR may be a leaf node, an
intermediate node, or a "bud" node. A bud serves as both a leaf and intermediate node, or a "bud" node. A bud serves as both a leaf and
intermediate. At a leaf an MPLS pop is performed. The payload may intermediate. At a leaf, an MPLS pop is performed. The payload may
be a IP Multicast packet that requires further replication. At an be an IP multicast packet that requires further replication. At an
intermediate node a MPLS swap operation is performed. The bud intermediate node, an MPLS swap operation is performed. The bud
requires that both a pop operation and a swap operation be performed requires that both a pop operation and a swap operation be performed
for the same incoming packet. for the same incoming packet.
One strategy to support P2MP functionality is to pop at the LSR One strategy to support P2MP functionality is to pop at the LSR
interface serving as ingress to the P2MP traffic and then optionally interface serving as ingress to the P2MP traffic and then optionally
push labels at each LSR interface serving as egress to the P2MP push labels at each LSR interface serving as egress to the P2MP
traffic at that same LSR. A given LSR egress chip may support traffic at that same LSR. A given LSR egress chip may support
multiple egress interfaces, each of which requires a copy, but each multiple egress interfaces, each of which requires a copy, but each
with a different set of added labels and layer-2 encapsulation. Some with a different set of added labels and Layer 2 encapsulation. Some
physical interfaces may have multiple sub-interfaces (such as physical interfaces may have multiple sub-interfaces (such as
Ethernet VLAN or channelized interfaces) each requiring a copy. Ethernet VLAN or channelized interfaces), each requiring a copy.
If packet replication is performed at LSR ingress, then the ingress If packet replication is performed at LSR ingress, then the ingress
interface performance may suffer. If the packet replication is interface performance may suffer. If the packet replication is
performed within an LSR switching fabric and at LSR egress, congestion performed within an LSR switching fabric and at LSR egress, congestion
of egress interfaces cannot make use of backpressure to ingress of egress interfaces cannot make use of backpressure to ingress
interfaces using techniques such as virtual output queuing (VOQ). If interfaces using techniques such as virtual output queuing (VOQ). If
buffering is primarily supported at egress, then the need for buffering is primarily supported at egress, then the need for
backpressure is minimized. There may be no good solution for high backpressure is minimized. There may be no good solution for high
volumes of multicast traffic if VOQ is used. volumes of multicast traffic if VOQ is used.
Careful consideration should be given to the performance Careful consideration should be given to the performance
characteristics of high fanout multicast for equipment that is characteristics of high-fanout multicast for equipment that is
intended to be used in such a role. intended to be used in such a role.
MP2MP LSP differ in that any branch may provide an input, including a MP2MP LSPs differ in that any branch may provide an input, including
leaf. Packets must be replicated onto all other branches. This a leaf. Packets must be replicated onto all other branches. This
forwarding is often implemented as multiple P2MP forwarding trees, forwarding is often implemented as multiple P2MP forwarding trees,
one for each potential input interface at a given LSR. one for each potential input interface at a given LSR.
2.3. Packet Rates 2.3. Packet Rates
While average packet size of Internet traffic may be large, long While average packet size of Internet traffic may be large, long
sequences of small packets have both been predicted in theory and sequences of small packets have both been predicted in theory and
observed in practice. Traffic compression and TCP ACK compression observed in practice. Traffic compression and TCP ACK compression
can conspire to create long sequences of packets of 40-44 bytes in can conspire to create long sequences of packets of 40-44 bytes in
payload length. If carried over Ethernet, the 64 byte minimum payload length. If carried over Ethernet, the 64-byte minimum
payload applies, yielding a packet rate of approximately 150 Mpps payload applies, yielding a packet rate of approximately 150 Mpps
(million packets per second) for the duration of the burst on a (million packets per second) for the duration of the burst on a
nominal 100 Gb/s link. The peak rate for other encapsulations can be nominal 100 Gb/s link. The peak rate for other encapsulations can be
as high as 250 Mpps (for example IP or MPLS encapsulated using GFP as high as 250 Mpps (for example, when IP or MPLS is encapsulated
over OTN ODU4). using GFP over OTN ODU4).
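The Ethernet figure above can be checked with a short calculation. This is a sketch under stated assumptions: each 64-byte frame is taken to cost an extra 20 bytes on the wire (8 bytes of preamble and start-of-frame delimiter plus a 12-byte inter-frame gap), a value that is standard for Ethernet but is not stated in this document. The function name is hypothetical.

```python
def peak_packet_rate_pps(link_bps, frame_bytes, overhead_bytes=20):
    """Packets per second for back-to-back frames of a single size.

    overhead_bytes is the per-frame wire cost outside the frame itself
    (assumed: 8B preamble/SFD + 12B inter-frame gap for Ethernet).
    """
    bits_on_wire = (frame_bytes + overhead_bytes) * 8
    return link_bps / bits_on_wire


rate = peak_packet_rate_pps(100e9, 64)
print(f"{rate / 1e6:.1f} Mpps on a nominal 100 Gb/s link")
```

Under these assumptions the result is just under 150 Mpps, consistent with the approximate figure given above.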
It is possible that the packet rates achieved by a specific It is possible that the packet rates achieved by a specific
implementation is acceptable for a minimum payload size, such as 64 implementation are acceptable for a minimum payload size, such as a
byte (64B) payload for Ethernet, but the achieved rate declines to an 64-byte (64B) payload for Ethernet, but the achieved rate declines to
unacceptable level for other packet sizes, such as 65B payload. an unacceptable level for other packet sizes, such as a 65B payload.
There are other packet rates of interest besides TCP ACK. For There are other packet rates of interest besides TCP ACK. For
example, a TCP ACK carried over an Ethernet PW over MPLS over example, a TCP ACK carried over an Ethernet PW over MPLS over
Ethernet may occupy 82B or 82B plus an increment of 4B if additional Ethernet may occupy 82B or 82B plus an increment of 4B if additional
MPLS labels are present. MPLS labels are present.
A graph of packet rate vs. packet size often displays a sawtooth. A graph of packet rate vs. packet size often displays a sawtooth.
The sawtooth is commonly due to a memory bottleneck and memory The sawtooth is commonly due to a memory bottleneck and memory
widths, sometimes internal cache, but often a very wide external widths, sometimes an internal cache, but often a very wide external
buffer memory interface. In some cases it may be due to a fabric buffer memory interface. In some cases, it may be due to a fabric
transfer width. A fine packing, rounding up to the nearest 8B or 16B transfer width. A fine packing, rounding up to the nearest 8B or 16B
will result in a fine sawtooth with small degradation for 65B, and will result in a fine sawtooth with small degradation for 65B, and
even less for 82B packets. A course packing, rounding up to 64B can even less for 82B packets. A coarse packing, rounding up to 64B can
yield a sharper drop in performance for 65B packets, or perhaps more yield a sharper drop in performance for 65B packets, or perhaps more
important, a larger drop for 82B packets. important, a larger drop for 82B packets.
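The sawtooth can be illustrated with a simple model, offered only as a sketch: assume the bottleneck moves data in fixed-width units, so each packet is rounded up to a multiple of that width. The function name and the efficiency metric (useful bytes divided by transferred bytes) are assumptions chosen for illustration, not measurements of any implementation.

```python
def packing_efficiency(packet_bytes, width_bytes):
    """Fraction of transferred bytes that are useful when each packet is
    rounded up to a whole number of width_bytes units."""
    units = (packet_bytes + width_bytes - 1) // width_bytes  # integer ceiling
    return packet_bytes / (units * width_bytes)


for width in (8, 16, 64):
    row = ", ".join(
        f"{size}B: {packing_efficiency(size, width):.2f}" for size in (64, 65, 82)
    )
    print(f"width {width:2d}B -> {row}")
```

In this model a 65B packet packed at 64B granularity occupies two units, so roughly half of the transferred bytes are padding, while 8B or 16B granularity wastes far less.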
The loss of some TCP ACK packets is not the primary concern when The loss of some TCP ACK packets is not the primary concern when
such a burst occurs. When a burst occurs, any other packets, such a burst occurs. When a burst occurs, any other packets,
regardless of packet length and packet QoS, are dropped once on-chip regardless of packet length and packet QoS, are dropped once on-chip
input buffers prior to the decision engine are exceeded. Buffers in input buffers prior to the decision engine are exceeded. Buffers in
front of the packet decision engine are often very small or non- front of the packet decision engine are often very small or
existent (less than one packet of buffer) causing significant QoS nonexistent (less than one packet of buffer) causing significant QoS-
agnostic packet drop. agnostic packet drop.
Internet service providers and content providers at one time Internet service providers and content providers at one time
specified full rate forwarding with 40 byte payload packets as a specified full rate forwarding with 40-byte payload packets as a
requirement. Today, this requirement often can be waived if the requirement. Today, this requirement often can be waived if the
provider can be convinced that when long sequence of short packets provider can be convinced that when long sequences of short packets
occur no packets will be dropped. occur no packets will be dropped.
Many equipment suppliers have pointed out that the extra cost in Many equipment suppliers have pointed out that the extra cost in
designing hardware capable of processing the minimum size packets at designing hardware capable of processing the minimum size packets at
full line rate is significant for very high speed interfaces. If full line rate is significant for very-high-speed interfaces. If
hardware is not capable of processing the minimum size packets at hardware is not capable of processing the minimum size packets at
full line rate, then that hardware MUST be capable of handling large full line rate, then that hardware MUST be capable of handling large
burst of small packets, a condition which is often observed. This bursts of small packets, a condition that is often observed. This
level of performance is necessary to meet Differentiated Services level of performance is necessary to meet Differentiated Services
[RFC2475] requirements for without it, packets are lost prior to [RFC2475] requirements; without it, packets are lost prior to
inspection of the IP DSCP field [RFC2474] or MPLS TC field [RFC5462]. inspection of the IP DSCP field [RFC2474] or MPLS TC field [RFC5462].
With adequate on-chip buffers before the packet decision engine, an With adequate on-chip buffers before the packet decision engine, an
LSR can absorb a long sequence of short packets. Even if the output LSR can absorb a long sequence of short packets. Even if the output
is slowed to the point where light congestion occurs, the packets, is slowed to the point where light congestion occurs, the packets,
having cleared the decision process, can make use of larger VOQ or having cleared the decision process, can make use of larger VOQ or
output side buffers and be dealt with according to configured QoS output side buffers and be dealt with according to configured QoS
treatment, rather than dropped completely at random. treatment, rather than dropped completely at random.
The buffering before the packet decision engine should be arranged
such that 1) it can hold a relatively large number of small packets,
2) it can hold a small number of large packets, and 3) it can hold a
mix of packets of different sizes.
These on-chip buffers need not contribute significant delay since These on-chip buffers need not contribute significant delay since
they are only used when the packet decision engine is unable to keep they are only used when the packet decision engine is unable to keep
up, not in response to congestion, plus these buffers are quite up, not in response to congestion, plus these buffers are quite
small. For example, an on-chip buffer capable of handling 4K packets small. For example, an on-chip buffer capable of handling 4K packets
of 64 bytes in length, or 256KB, corresponds to 200 usec on a 10 Gb/s of 64 bytes in length, or 256KB, corresponds to 200 microseconds on a
link and 20 usec on a 100 Gb/s link. If the packet decision engine 10 Gb/s link and 20 microseconds on a 100 Gb/s link. If the packet
is capable of handling packets at 90% of the full rate for small decision engine is capable of handling packets at 90% of the full
packets, then the maximum added delay is 20 usec and 2 usec rate for small packets, then the maximum added delay is 20
respectively, and this delay only applies if a 4K burst of short microseconds and 2 microseconds, respectively, and this delay only
packets occurs. When no burst of short packets is being processed, applies if a 4K burst of short packets occurs. When no burst of
no delay is added. These buffers are only needed on high speed short packets is being processed, no delay is added. These buffers
interfaces where it is difficult to process small packets at full are only needed on high-speed interfaces where it is difficult to
line rate. process small packets at full line rate.
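The delay figures in the example above can be reproduced approximately with the sketch below. It assumes 4K means 4096 packets, ignores any per-packet link overhead, and treats the added delay as the backlog left after a line-rate burst when the decision engine serves only a fraction of line rate. That simple model appears to match the rounded figures in the text, but it is an assumption of this example, and the function names are hypothetical.

```python
def buffer_time_us(packets, packet_bytes, link_bps):
    """Time in microseconds to send a full buffer at line rate."""
    return packets * packet_bytes * 8 / link_bps * 1e6


def backlog_time_us(packets, packet_bytes, link_bps, engine_fraction):
    """Approximate backlog, as line-rate time, after a line-rate burst when
    the decision engine keeps up with only engine_fraction of line rate."""
    return (1.0 - engine_fraction) * buffer_time_us(packets, packet_bytes, link_bps)


for gbps in (10, 100):
    full = buffer_time_us(4096, 64, gbps * 1e9)
    added = backlog_time_us(4096, 64, gbps * 1e9, 0.90)
    print(f"{gbps:3d} Gb/s: full buffer {full:.1f} us, backlog at 90% {added:.1f} us")
```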
Packet rate requirements apply regardless of which network tier Packet rate requirements apply regardless of which network tier the
equipment is deployed in. Whether deployed in the network core or equipment is deployed in. Whether deployed in the network core or
near the network edges, one of the two conditions MUST be met if near the network edges, one of the two conditions MUST be met if
Differentiated Services requirements are to be met: Differentiated Services requirements are to be met:
1. Packets must be processed at full line rate with minimum sized 1. Packets must be processed at full line rate with minimum-sized
packets. -OR- packets. -OR-
2. Packets must be processed at a rate well under generally accepted 2. Packets must be processed at a rate well under generally accepted
average packet sizes, with sufficient buffering prior to the average packet sizes, with sufficient buffering prior to the
packet decision engine to accommodate long bursts of small packet decision engine to accommodate long bursts of small
packets. packets.
2.4. MPLS Multipath Techniques 2.4. MPLS Multipath Techniques
In any large provider, service providers and content providers, hash In any large provider, service providers, and content providers,
based multipath techniques are used in the core and in the edge. In hash-based multipath techniques are used in the core and in the edge.
many of these providers hash based multipath is also used in the In many of these providers, hash-based multipath is also used in the
larger metro networks. larger metro networks.
The Differentiated Services requirements for good reasons dictate For good reason, the Differentiated Services requirements dictate
that packets within a common microflow SHOULD NOT be reordered that packets within a common microflow SHOULD NOT be reordered
[RFC2474]. Service providers generally impose stronger requirements, [RFC2474]. Service providers generally impose stronger requirements,
commonly requiring that packets within a microflow MUST NOT be commonly requiring that packets within a microflow MUST NOT be
reordered except in rare circumstances such as load balancing across reordered except in rare circumstances such as load balancing across
multiple links or path change for load balancing or path change for multiple links, path change for load balancing, or path change for
other reasons. other reasons.
The most common multipath techniques are ECMP applied at the IP The most common multipath techniques are ECMP applied at the IP
forwarding level, Ethernet LAG with inspection of the IP payload, and forwarding level, Ethernet Link Aggregation Group (LAG) with
multipath on links carrying both IP and MPLS, where the IP header is inspection of the IP payload, and multipath on links carrying both IP
inspected below the MPLS label stack. In most core networks, the and MPLS, where the IP header is inspected below the MPLS label
vast majority of traffic is MPLS encapsulated. stack. In most core networks, the vast majority of traffic is MPLS
encapsulated.
In order to support an adequately balanced load distribution across In order to support an adequately balanced load distribution across
multiple links, IP header information must be used. Common practice multiple links, IP header information must be used. Common practice
today is to reinspect the IP headers at each LSR and use the label today is to reinspect the IP headers at each LSR and use the label
stack and IP header information in a hash performed at each LSR. stack and IP header information in a hash performed at each LSR.
Further details are provided in Section 2.4.5. Further details are provided in Section 2.4.5.
The use of this technique is so ubiquitous in provider networks that The use of this technique is so ubiquitous in provider networks that
lack of support for multipath makes any product unsuitable for use in lack of support for multipath makes any product unsuitable for use in
large core networks. This will continue to be the case in the near large core networks. This will continue to be the case in the near
future, even as deployment of MPLS entropy label begins to relax the future, even as deployment of the MPLS Entropy Label begins to relax
core LSR multipath performance requirements given the existing the core LSR multipath performance requirements given the existing
deployed base of edge equipment without the ability to add an entropy deployed base of edge equipment without the ability to add an Entropy
label. Label.
A generation of edge equipment supporting the ability to add an MPLS A generation of edge equipment supporting the ability to add an MPLS
entropy label is needed before the performance requirements for core Entropy Label is needed before the performance requirements for core
LSR can be relaxed. However, it is likely that two generations of LSRs can be relaxed. However, it is likely that two generations of
deployment in the future will allow core LSR to support full packet deployment in the future will allow core LSRs to support full packet
rate only when a relatively small number of MPLS labels need to be rate only when a relatively small number of MPLS labels need to be
inspected before hashing. For now, don't count on it. inspected before hashing. For now, don't count on it.
Common practice today is to reinspect the packet at each LSR and use Common practice today is to reinspect the packet at each LSR and use
information from the packet combined plus a hash seed that is information from the packet combined with a hash seed that is
selected by each LSR. Where flow labels or entropy labels are used, selected by each LSR. Where Flow Labels or Entropy Labels are used,
a hash seed must be used when creating these labels. a hash seed must be used when creating these labels.
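A minimal sketch of seeded hash-based link selection follows. It is not any vendor's algorithm: the choice of CRC-32, the key layout, the simple modulo mapping, and all names are assumptions made for illustration. The point shown is that packets of one microflow always map to the same component link at a given LSR, while each LSR mixes in its own seed.

```python
import struct
import zlib


def select_component_link(label_stack, src_ip, dst_ip, seed, num_links):
    """Pick a component link index from packet fields plus a per-LSR seed.

    label_stack: list of MPLS label values, top label first.
    src_ip, dst_ip: addresses as strings (only used as hash input here).
    seed: 32-bit value chosen by this LSR (hypothetical configuration).
    """
    key = struct.pack("!I", seed)
    for label in label_stack:
        key += struct.pack("!I", label)
    key += src_ip.encode() + b"|" + dst_ip.encode()
    return zlib.crc32(key) % num_links


# Same microflow, same LSR seed: the same link is chosen for every packet.
link = select_component_link([100, 200], "192.0.2.1", "198.51.100.7", 0x5EED, 4)
print("component link:", link)
```

A fixed modulo mapping like this one is the simple load balance that is vulnerable to large microflows, since the hash places flows without regard to their bandwidth.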
2.4.1. Pseudowire Control Word 2.4.1. Pseudowire Control Word
Within the core of a network some form of multipath is almost certain Within the core of a network, some form of multipath is almost
to be used. Multipath techniques deployed today are likely to be certain to be used. Multipath techniques deployed today are likely
looking beneath the label stack for an opportunity to hash on IP to be looking beneath the label stack for an opportunity to hash on
addresses. IP addresses.
A pseudowire encapsulated at a network edge must have a means to A pseudowire encapsulated at a network edge must have a means to
prevent reordering within the core if the pseudowire will be crossing prevent reordering within the core if the pseudowire will be crossing
a network core, or any part of a network topology where multipath is a network core, or any part of a network topology where multipath is
used (see [RFC4385] and [RFC4928]). used (see [RFC4385] and [RFC4928]).
Not supporting the ability to encapsulate a pseudowire with a control Not supporting the ability to encapsulate a pseudowire with a Control
word may lock a product out from consideration. A pseudowire Word may lock a product out from consideration. A pseudowire
capability without control word support might be sufficient for capability without Control Word support might be sufficient for
applications that are strictly both intra-metro and low bandwidth. applications that are strictly both intra-metro and low bandwidth.
However a provider with other applications will very likely not However, a provider with other applications will very likely not
tolerate having equipment which can only support a subset of their tolerate having equipment that can only support a subset of their
pseudowire needs. pseudowire needs.
2.4.2. Large Microflows 2.4.2. Large Microflows
Where multipath makes use of a simple hash and simple load balance Where multipath makes use of a simple hash and simple load balance
such as modulo or other fixed allocation (see Section 2.4) the such as modulo or other fixed allocation (see Section 2.4), there can
presence of large microflows that each consumes 10% of the capacity be the presence of large microflows that each consume 10% of the
of a component link of a potentially congested composite link, one capacity of a component link of a potentially congested composite
such microflow can upset the traffic balance and more than one can in link. One such microflow can upset the traffic balance, and more
effect reduce the effective capacity of the entire composite link by than one can reduce the effective capacity of the entire composite
more than 10%. link by more than 10%.
When even a very small number of large microflows are present, there When even a very small number of large microflows are present, there
is a significant probability that more than one of these large is a significant probability that more than one of these large
microflows could fall on the same component link. If the traffic microflows could fall on the same component link. If the traffic
contribution from large microflows is small, the probability for contribution from large microflows is small, the probability for
three or more large microflows on the same component link drops three or more large microflows on the same component link drops
significantly. Therefore in a network where a significant number of significantly. Therefore, in a network where a significant number of
parallel 10 Gb/s links exists, even a 1 Gb/s pseudowire or other parallel 10 Gb/s links exists, even a 1 Gb/s pseudowire or other
large microflow that could not otherwise be subdivided into smaller large microflow that could not otherwise be subdivided into smaller
flows should carry a flow label or entropy label if possible. flows should carry a Flow Label or Entropy Label if possible.
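
The probability claim above can be checked with a small calculation.
This is a sketch under a simplifying assumption that is not stated in
the document: each large microflow is placed on a component link
uniformly and independently of the others. The function name is
hypothetical.

```python
from math import perm

def p_shared_link(num_large_flows: int, num_links: int) -> float:
    """Probability that at least two large microflows land on the same
    component link, assuming uniform and independent placement."""
    if num_large_flows > num_links:
        return 1.0  # more flows than links: a shared link is unavoidable
    # Probability that all flows land on distinct links.
    all_distinct = perm(num_links, num_large_flows) / num_links ** num_large_flows
    return 1.0 - all_distinct

print(p_shared_link(3, 8))   # 0.34375
print(p_shared_link(4, 10))  # about 0.496
```

Under this assumption, three large microflows spread over eight
component links share a link roughly one time in three.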
Active management of the hash space to better accommodate large Active management of the hash space to better accommodate large
microflows has been implemented and deployed in the past, however microflows has been implemented and deployed in the past; however,
such techniques are out of scope for this document. such techniques are out of scope for this document.
2.4.3. Pseudowire Flow Label 2.4.3. Pseudowire Flow Label
Unlike a pseudowire control word, a pseudowire flow label [RFC6391], Unlike a pseudowire Control Word, a pseudowire Flow Label [RFC6391]
is required only for relatively large capacity pseudowires. There is required only for pseudowires that have a relatively large
are many cases where a pseudowire flow label makes sense. Any capacity. There are many cases where a pseudowire Flow Label makes
service such as a VPN which carries IP traffic within a pseudowire sense. Any service such as a VPN that carries IP traffic within a
can make use of a pseudowire flow label. pseudowire can make use of a pseudowire Flow Label.
Any pseudowire carried over MPLS which makes use of the pseudowire Any pseudowire carried over MPLS that makes use of the pseudowire
control word and does not carry a flow label is in effect a single Control Word and does not carry a Flow Label is in effect a single
microflow (in [RFC2475] terms) and may result in the types of microflow (in the terms defined in [RFC2475]) and may result in the
problems described in Section 2.4.2. types of problems described in Section 2.4.2.
2.4.4. MPLS Entropy Label 2.4.4. MPLS Entropy Label
The MPLS entropy label simplifies flow group identification [RFC6790] The MPLS Entropy Label simplifies flow group identification [RFC6790]
at midpoint LSRs. Prior to the MPLS entropy label midpoint LSRs at midpoint LSRs. Prior to the MPLS Entropy Label, midpoint LSRs
needed to inspect the entire label stack and often the IP headers to needed to inspect the entire label stack and often the IP headers to
provide an adequate distribution of traffic when using multipath provide an adequate distribution of traffic when using multipath
techniques (see Section 2.4.5). With the use of MPLS entropy label, techniques (see Section 2.4.5). With the use of the MPLS Entropy
a hash can be performed closer to network edges, placed in the label Label, a hash can be performed closer to network edges, placed in the
stack, and used by midpoint LSRs without fully reinspecting the label label stack, and used by midpoint LSRs without fully reinspecting the
stack and inspecting the payload. label stack and inspecting the payload.
The MPLS entropy label is capable of avoiding full label stack and The MPLS Entropy Label is capable of avoiding full label stack and
payload inspection within the core where performance levels are most payload inspection within the core where performance levels are most
difficult to achieve (see Section 2.3). The label stack inspection difficult to achieve (see Section 2.3). The label stack inspection
can be terminated as soon as the first entropy label is encountered, can be terminated as soon as the first Entropy Label is encountered,
which is generally after a small number of labels are inspected. which is generally after a small number of labels are inspected.
In order to provide these benefits in the core, LSR closer to the In order to provide these benefits in the core, an LSR closer to the
edge must be capable of adding an entropy label. This support may edge must be capable of adding an Entropy Label. This support may
not be required in the access tier, the tier closest to the customer, not be required in the access tier, the tier closest to the customer,
but is likely to be required in the edge or the border to the network but is likely to be required in the edge or the border to the network
core. LSR peering with external networks will also need to be able core. An LSR peering with external networks will also need to be
to add an entropy label on incoming traffic. able to add an Entropy Label on incoming traffic.
2.4.5. Fields Used for Multipath Load Balance 2.4.5. Fields Used for Multipath Load Balance
The most common multipath techniques are based on a hash over a set The most common multipath techniques are based on a hash over a set
of fields. Regardless of whether a hash is used or some other method of fields. Regardless of whether a hash is used or some other method
is used, the there is a limited set of fields which can safely be is used, there is a limited set of fields that can safely be used for
used for multipath. multipath.
2.4.5.1. MPLS Fields in Multipath 2.4.5.1. MPLS Fields in Multipath
If the "outer" or "first" layer of encapsulation is MPLS, then label If the "outer" or "first" layer of encapsulation is MPLS, then label
stack entries are used in the hash. Within a finite amount of time stack entries are used in the hash. Within a finite amount of time
(and for small packets arriving at high speed that time can be quite (and for small packets arriving at high speed, that time can be quite
limited) only a finite number of label entries can be inspected. limited), only a finite number of label entries can be inspected.
Pipelined or parallel architectures improve this, but the limit is Pipelined or parallel architectures improve this, but the limit is
still finite. still finite.
The following guidelines are provided for use of MPLS fields in The following guidelines are provided for use of MPLS fields in
multipath load balancing. multipath load balancing.
1. Only the 20 bit label field SHOULD be used. The TTL field SHOULD 1. Only the 20-bit label field SHOULD be used. The TTL field SHOULD
NOT be used. The S bit MUST NOT be used. The TC field (formerly NOT be used. The S bit MUST NOT be used. The TC field (formerly
EXP) MUST NOT be used. See text following this list for reasons. EXP) MUST NOT be used. See text following this list for reasons.
2. If an ELI label is found, then if the LSR supports entropy label, 2. If an ELI label is found, then if the LSR supports Entropy
the EL label field in the next label entry (the EL) SHOULD be Labels, the EL label field in the next label entry (the EL)
used and label entries below that label SHOULD NOT be used and SHOULD be used, label entries below that label SHOULD NOT be
the MPLS payload SHOULD NOT be used. See below this list for used, and the MPLS payload SHOULD NOT be used. See below this
reasons. list for reasons.
3. Special purpose labels (label values 0-15) MUST NOT be used. 3. Special-purpose labels (label values 0-15) MUST NOT be used.
Extended special purpose labels (any label following label 15) Extended special-purpose labels (any label following label 15)
MUST NOT be used. In particular, GAL and RA MUST NOT be used so MUST NOT be used. In particular, GAL and RA MUST NOT be used so
that OAM traffic follows the same path as payload packets with that OAM traffic follows the same path as payload packets with
the same label stack. the same label stack.
4. If a new special purpose label or extended special purpose label 4. If a new special-purpose label or extended special-purpose label
is defined which requires special load balance processing, then, is defined that requires special load-balance processing, then,
as is the case for the ELI label, a special action may be needed as is the case for the ELI label, a special action may be needed
rather than skipping the special purpose label or extended rather than skipping the special-purpose label or extended
special purpose label. special-purpose label.
5. The most entropy is generally found in the label stack entries 5. The most entropy is generally found in the label stack entries
near the bottom of the label stack (innermost label, closest to near the bottom of the label stack (innermost label, closest to
S=1 bit). If the entire label stack cannot be used (or entire S=1 bit). If the entire label stack cannot be used (or entire
stack up to an EL), then it is better to use as many labels as stack up to an EL), then it is better to use as many labels as
possible closest to the bottom of stack. possible closest to the bottom of stack.
6. If no ELI is encountered, and the first nibble of payload 6. If no ELI is encountered, and the first nibble of payload
contains a 4 (IPv4) or 6 (IPv6), an implementation SHOULD support contains a 4 (IPv4) or 6 (IPv6), an implementation SHOULD support
the ability to interpret the payload as IPv4 or IPv6 and extract the ability to interpret the payload as IPv4 or IPv6 and extract
and use appropriate fields from the IP headers. This feature is and use appropriate fields from the IP headers. This feature is
considered a non-negotiable requirement by many service considered a nonnegotiable requirement by many service providers.
providers. If supported, there MUST be a way to disable it (if, If supported, there MUST be a way to disable it (if, for example,
for example, PW without CW are used). This ability to disable PW without CW are used). This ability to disable this feature is
this feature is considered a non-negotiable requirement by many considered a nonnegotiable requirement by many service providers.
service providers. Therefore an implementation has a very strong Therefore, an implementation has a very strong incentive to
incentive to support both options. support both options.
7. A label which is popped at egress (UHP pop) SHOULD NOT be used. 7. A label that is popped at egress (UHP pop) SHOULD NOT be used. A
A label which is popped at the penultimate hop (PHP pop) SHOULD label that is popped at the penultimate hop (PHP pop) SHOULD be
be used. used.
Apparently some chips have made use of the TC (formerly EXP) bits as Apparently, some chips have made use of the TC (formerly EXP) bits as
a source of entropy. This is very harmful since it will reorder a source of entropy. This is very harmful since it will reorder
Assured Forwarding (AF) traffic [RFC2597] when a subset does not Assured Forwarding (AF) traffic [RFC2597] when a subset does not
conform to the configured rates and is remarked but not dropped at a conform to the configured rates and is remarked but not dropped at a
prior LSR. Traffic which uses MPLS ECN [RFC5129] can also be prior LSR. Traffic that uses MPLS ECN [RFC5129] can also be
reordered if TC is used for entropy. Therefore, as stated in the reordered if TC is used for entropy. Therefore, as stated in the
guidelines above, the TC field (formerly EXP) MUST NOT be used in guidelines above, the TC field (formerly EXP) MUST NOT be used in
multipath load balancing as it violates Differentiated Services multipath load balancing as it violates Differentiated Services
Ordered Aggregate (OA) requirements in these two instances. Ordered Aggregate (OA) requirements in these two instances.
Use of the MPLS label entry S bit would result in putting OAM traffic Use of the MPLS label entry S bit would result in putting OAM traffic
on a different path if the addition of a GAL at the bottom of stack on a different path if the addition of a GAL at the bottom of stack
removed the S bit from the prior label. removed the S bit from the prior label.
If an ELI label is found, then if the LSR supports entropy label, the If an ELI label is found, then if the LSR supports Entropy Labels,
EL label field in the next label entry (the EL) SHOULD be used and the EL label field in the next label entry (the EL) SHOULD be used,
the search for additional entropy within the packet SHOULD be and the search for additional entropy within the packet SHOULD be
terminated. Failure to terminate the search will impact client MPLS- terminated. Failure to terminate the search will impact client MPLS-
TP LSP carried within server MPLS LSP. A network operator has the TP LSPs carried within server MPLS LSPs. A network operator has the
option to use administrative attributes as a means to identify LSR option to use administrative attributes as a means to identify LSRs
which do not terminate the entropy search at the first EL. that do not terminate the entropy search at the first EL.
Administrative attributes are defined in [RFC3209]. Some Administrative attributes are defined in [RFC3209]. Some
configuration is required to support this. configuration is required to support this.
If the label removed by a PHP pop is not used, then for any PW for If the label removed by a PHP pop is not used, then for any PW for
which CW is used, there is no basis for multipath load split. In which CW is used, there is no basis for multipath load split. In
some networks it is infeasible to put all PW traffic on one component some networks, it is infeasible to put all PW traffic on one
link. Any PW which does not use CW will be improperly split component link. Any PW that does not use CW will be improperly
regardless of whether the label removed by a PHP pop is used. split, regardless of whether the label removed by a PHP pop is used.
Therefore the PHP pop label SHOULD be used as recommended above. Therefore, the PHP pop label SHOULD be used as recommended above.
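
Several of the guidelines in this section can be sketched as a walk
over the label stack. This is a simplified illustration, not a
definitive implementation. The function and helper names are
hypothetical. Labels found above an ELI are kept, which is an
assumption (the text only excludes entries below the EL). The
distinction between UHP and PHP labels (guideline 7) and the limit on
how many labels can be inspected (guideline 5) are not modeled.

```python
import struct

ELI = 7   # Entropy Label Indicator
XL = 15   # extension label; the entry after it is an extended special-purpose label

def lse(label, tc=0, s=0, ttl=64):
    """Build one 32-bit label stack entry (test helper, hypothetical)."""
    return struct.pack("!I", (label << 12) | (tc << 9) | (s << 8) | ttl)

def mpls_hash_inputs(packet: bytes):
    """Return (labels_to_hash, inspect_ip_payload) for an MPLS packet."""
    labels = []
    offset = 0
    skip_next = False
    while offset + 4 <= len(packet):
        (entry,) = struct.unpack_from("!I", packet, offset)
        offset += 4
        label = entry >> 12               # 20-bit label field; TC and TTL are never hashed
        bottom = bool((entry >> 8) & 1)   # S bit only locates the stack end; it is not hashed
        if skip_next:
            skip_next = False             # extended special-purpose label: not used
        elif label == ELI and not bottom and offset + 4 <= len(packet):
            (el_entry,) = struct.unpack_from("!I", packet, offset)
            # Use the EL and stop: no entries below it and no payload.
            return labels + [el_entry >> 12], False
        elif label == XL:
            skip_next = True
        elif label >= 16:
            labels.append(label)          # special-purpose labels 0-15 are skipped
        if bottom:
            # No ELI found: a first payload nibble of 4 or 6 suggests IPv4 or IPv6.
            inspect_ip = offset < len(packet) and packet[offset] >> 4 in (4, 6)
            return labels, inspect_ip
    return labels, False

ip_payload = b"\x45" + bytes(19)
print(mpls_hash_inputs(lse(1000) + lse(ELI) + lse(777, s=1) + ip_payload))
# ([1000, 777], False)
print(mpls_hash_inputs(lse(1000) + lse(2000, s=1) + ip_payload))
# ([1000, 2000], True)
```

Because only the label field is extracted, two packets that differ
only in TC or TTL yield the same hash input, and a GAL at the bottom
of the stack contributes nothing to the hash.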
2.4.5.2. IP Fields in Multipath 2.4.5.2. IP Fields in Multipath
Inspecting the IP payload provides the most entropy in provider Inspecting the IP payload provides the most entropy in provider
networks. The practice of looking past the bottom of stack label for networks. The practice of looking past the bottom of stack label for
an IP payload is well accepted and documented in [RFC4928] and in an IP payload is well accepted and documented in [RFC4928] and in
other RFCs. other RFCs.
Where IP is mentioned in the document, both IPv4 and IPv6 apply. All Where IP is mentioned in the document, both IPv4 and IPv6 apply. All
LSRs MUST fully support IPv6. LSRs MUST fully support IPv6.
skipping to change at page 27, line 26 skipping to change at page 28, line 16
apply: apply:
1. Both the IP source address and IP destination address SHOULD be 1. Both the IP source address and IP destination address SHOULD be
used. There MAY be an option to reverse the order of these used. There MAY be an option to reverse the order of these
addresses, improving the ability to provide symmetric paths in addresses, improving the ability to provide symmetric paths in
some cases. Many service providers require that both addresses some cases. Many service providers require that both addresses
be used. be used.
2. Implementations SHOULD allow inspection of the IP protocol field 2. Implementations SHOULD allow inspection of the IP protocol field
and use of the UDP or TCP port numbers. For many service and use of the UDP or TCP port numbers. For many service
providers this feature is considered mandatory, particularly for providers, this feature is considered mandatory, particularly for
enterprise, data center, or edge equipment. If this feature is enterprise, data center, or edge equipment. If this feature is
provided, it SHOULD be possible to disable use of TCP and UDP provided, it SHOULD be possible to disable use of TCP and UDP
ports. Many service providers consider it a non-negotiable ports. Many service providers consider it a nonnegotiable
requirement that use of UDP and TCP ports can be disabled. requirement that use of UDP and TCP ports can be disabled.
Therefore there is a strong incentive for implementations to Therefore, there is a strong incentive for implementations to
provide both options. provide both options.
3. Equipment suppliers MUST NOT make assumptions that because the IP 3. Equipment suppliers MUST NOT make assumptions that because the IP
version field is equal to 4 (an IPv4 packet) that the IP protocol version field is equal to 4 (an IPv4 packet) that the IP protocol
will either be TCP (IP protocol 6) or UDP (IP protocol 17) and will either be TCP (IP protocol 6) or UDP (IP protocol 17) and
blindly fetch the data at the offset where the TCP or UDP ports blindly fetch the data at the offset where the TCP or UDP ports
would be found. With IPv6, TCP and UDP port numbers are not at would be found. With IPv6, TCP and UDP port numbers are not at
fixed offsets. With IPv4 packets carrying IP options, TCP and fixed offsets. With IPv4 packets carrying IP options, TCP and
UDP port numbers are not at fixed offsets. UDP port numbers are not at fixed offsets.
4. The IPv6 header flow field SHOULD be used. This is the explicit 4. The IPv6 header flow field SHOULD be used. This is the explicit
purpose of the IPv6 flow field, however observed flow fields purpose of the IPv6 flow field; however, observed flow fields
rarely contains a non-zero value. Some uses of the flow field rarely contain a non-zero value. Some uses of the flow field
have been defined such as [RFC6438]. In the absence of MPLS have been defined, such as [RFC6438]. In the absence of MPLS
encapsulation, the IPv6 flow field can serve a role equivalent to encapsulation, the IPv6 flow field can serve a role equivalent to
entropy label. the Entropy Label.
5. Support for other protocols that share a common Layer-4 header 5. Support for other protocols that share a common Layer 4 header
such as RTP [RFC3550], UDP-Lite [RFC3828], SCTP [RFC4960] and such as RTP [RFC3550], UDP-Lite [RFC3828], SCTP [RFC4960], and
DCCP [RFC4340] SHOULD be provided, particularly for edge or DCCP [RFC4340] SHOULD be provided, particularly for edge or
access equipment where additional entropy may be needed. access equipment where additional entropy may be needed.
Equipment SHOULD also use RTP, UDP-lite, SCTP and DCCP headers Equipment SHOULD also use RTP, UDP-lite, SCTP, and DCCP headers
when creating an entropy label. when creating an Entropy Label.
6. The following IP header fields should not or must not be used: 6. The following IP header fields should not or must not be used:
A. Similar to avoiding TC in MPLS, the IP DSCP, and ECN bits A. Similar to avoiding TC in MPLS, the IP DSCP, and ECN bits
MUST NOT be used. MUST NOT be used.
B. The IPv4 TTL or IPv6 Hop Count SHOULD NOT be used. B. The IPv4 TTL or IPv6 Hop Count SHOULD NOT be used.
C. Note that the IP TOS field was deprecated ([RFC0791] was C. Note that the IP TOS field was deprecated. ([RFC0791] was
updated by [RFC2474]). No part of the IP DSCP field can be updated by [RFC2474].) No part of the IP DSCP field can be
used (formerly IP PREC and IP TOS bits). used (formerly IP PREC and IP TOS bits).
7. Some IP encapsulations support tunneling, such as IP-in-IP, GRE, 7. Some IP encapsulations support tunneling, such as IP-in-IP, GRE,
L2TPv3, and IPSEC. These provide a greater source of entropy L2TPv3, and IPsec. These provide a greater source of entropy
which some provider networks carrying large amounts of tunneled that some provider networks carrying large amounts of tunneled
traffic may need, for example as used in [RFC5640] for GRE and traffic may need, for example, as used in [RFC5640] for GRE and
L2TPv3. The use of tunneling header information is out of scope L2TPv3. The use of tunneling header information is out of scope
for this document. for this document.
This document makes the following recommendations. These This document makes the following recommendations. These
recommendations are not required to claim compliance to any existing recommendations are not required to claim compliance to any existing
RFC therefore implementers are free to ignore them, but due to RFC; therefore, implementers are free to ignore them, but due to
service provider requirements should consider the risk of doing so. service provider requirements should consider the risk of doing so.
The use of IP addresses MUST be supported and TCP and UDP ports The use of IP addresses MUST be supported, and TCP and UDP ports
(conditional on the protocol field and properly located) MUST be (conditional on the protocol field and properly located) MUST be
supported. The ability to disable use of UDP and TCP ports MUST be supported. The ability to disable use of UDP and TCP ports MUST be
available. available.
Though potentially very useful in some networks, it is uncommon to Though potentially very useful in some networks, it is uncommon to
support using payloads of tunneling protocols carried over IP. support using payloads of tunneling protocols carried over IP.
Though the use of tunneling protocol header information is out of Though the use of tunneling protocol header information is out of
scope for this document, it is not discouraged. scope for this document, it is not discouraged.
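
The IPv4 part of these recommendations can be sketched as follows.
This is a minimal illustration under stated assumptions: the function
name and the use_ports parameter are hypothetical, fragment handling
is omitted, and IPv6 (which would also use the flow field and require
locating the transport header) is not covered.

```python
import struct

TCP, UDP = 6, 17

def ipv4_hash_fields(pkt: bytes, use_ports: bool = True):
    """Return the IPv4 fields to hash, or None if pkt is not parseable IPv4."""
    if len(pkt) < 20 or pkt[0] >> 4 != 4:
        return None
    ihl = (pkt[0] & 0x0F) * 4   # header length in bytes; more than 20 with IP options
    if ihl < 20 or len(pkt) < ihl:
        return None
    proto = pkt[9]
    src, dst = pkt[12:16], pkt[16:20]
    # DSCP/ECN (byte 1) and TTL (byte 8) are deliberately never read.
    # Ports are fetched only when the protocol field says TCP or UDP,
    # and at the offset given by IHL rather than at a fixed offset.
    if use_ports and proto in (TCP, UDP) and len(pkt) >= ihl + 4:
        sport, dport = struct.unpack_from("!HH", pkt, ihl)
        return (src, dst, sport, dport)
    return (src, dst)

# Hypothetical IPv4 header with 4 bytes of options (IHL = 6) carrying UDP.
src, dst = bytes([192, 0, 2, 1]), bytes([198, 51, 100, 7])
header = bytes([0x46, 0xB8, 0, 0, 0, 0, 0, 0, 64, UDP, 0, 0]) + src + dst + bytes(4)
packet = header + struct.pack("!HH", 1234, 53)
print(ipv4_hash_fields(packet))                    # addresses plus ports 1234 and 53
print(ipv4_hash_fields(packet, use_ports=False))   # addresses only
```

The use_ports switch corresponds to the ability to disable use of TCP
and UDP ports.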
2.4.5.3. Fields Used in Flow Label 2.4.5.3. Fields Used in Flow Label
The ingress to a pseudowire (PW) can extract information from the The ingress to a pseudowire (PW) can extract information from the
payload being encapsulated to create a flow label. [RFC6391] payload being encapsulated to create a Flow Label. [RFC6391]
references IP carried in Ethernet as an example. The Native Service references IP carried in Ethernet as an example. The Native Service
Processing (NSP) function defined in [RFC3985] differs with Processing (NSP) function defined in [RFC3985] differs with
pseudowire type. It is in the NSP function where information for a pseudowire type. It is in the NSP function where information for a
specific type of PW can be extracted for use in a flow label. Which specific type of PW can be extracted for use in a Flow Label.
fields to use for any given PW NSP is out of scope for this document. Determining which fields to use for any given PW NSP is out of scope
for this document.
2.4.5.4. Fields Used in Entropy Label 2.4.5.4. Fields Used in Entropy Label
An entropy label is added at the ingress to an LSP. The payload An Entropy Label is added at the ingress to an LSP. The payload
being encapsulated is most often MPLS, a PW, or IP. The payload type being encapsulated is most often MPLS, a PW, or IP. The payload type
is identified by the layer-2 encapsulation (Ethernet, GFP, POS, etc). is identified by the Layer 2 encapsulation (Ethernet, GFP, POS,
etc.).
If the payload is MPLS, then the information used to create an If the payload is MPLS, then the information used to create an
entropy label is the same information used for local load balancing Entropy Label is the same information used for local load balancing
(see Section 2.4.5.1). This information MUST be extracted for use in (see Section 2.4.5.1). This information MUST be extracted for use in
generating an entropy label even if the LSR local egress interface is generating an Entropy Label even if the LSR local egress interface is
not a multipath. not a multipath.
Of the non-MPLS payload types, only payloads that are forwarded are Of the non-MPLS payload types, only payloads that are forwarded are
of interest. For example, ARP is not forwarded and CNLP (used only of interest. For example, payloads using the Address Resolution
for ISIS) is not forwarded. Protocol (ARP) are not forwarded, and payloads using the
Connectionless-mode Network Protocol (CLNP), which is used only for
IS-IS, are not forwarded.
The non-MPLS payload type of greatest interest are IPv4 and IPv6. The non-MPLS payload types of greatest interest are IPv4 and IPv6.
The guidelines in Section 2.4.5.2 apply to fields used to create and The guidelines in Section 2.4.5.2 apply to fields used to create an
entropy label. Entropy Label.
The IP tunneling protocols mentioned in Section 2.4.5.2 may be more The IP tunneling protocols mentioned in Section 2.4.5.2 may be more
applicable to generation of an entropy label at edge or access where applicable to generation of an Entropy Label at the edge or access
deep packet inspection is practical due to lower interface speeds where deep packet inspection is practical due to lower interface
than in the core where deep packet inspection may be impractical. speeds than in the core where deep packet inspection may be
impractical.
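
Turning the extracted fields into an Entropy Label value can be
sketched as below. The sketch assumes a keyed software hash (BLAKE2b)
with the LSR's hash seed as the key, and it keeps the result out of
the special-purpose range 0-15 on the assumption that such values are
not valid Entropy Label values. The function name and inputs are
hypothetical.

```python
import hashlib

def make_entropy_label(hash_input: bytes, lsr_seed: bytes) -> int:
    """Derive a 20-bit label value in the range 16..2**20-1 from flow fields."""
    digest = hashlib.blake2b(hash_input, digest_size=4, key=lsr_seed).digest()
    h = int.from_bytes(digest, "big")
    # Fold the hash into the label space that excludes special-purpose labels.
    return 16 + h % ((1 << 20) - 16)

el = make_entropy_label(b"192.0.2.1|198.51.100.7|1234|53", b"seed-of-ingress-lsr")
```

All packets of one flow produce the same label value at a given
ingress, so midpoint LSRs that hash on it keep the flow on one path.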
2.5. MPLS-TP and UHP 2.5. MPLS-TP and UHP
MPLS-TP introduces forwarding demands that will be extremely MPLS-TP introduces forwarding demands that will be extremely
difficult to meet in a core network. Most troublesome is the difficult to meet in a core network. Most troublesome is the
requirement for Ultimate Hop Popping (UHP, the opposite of requirement for Ultimate Hop Popping (UHP), the opposite of
Penultimate Hop Popping or PHP). Using UHP opens the possibility of Penultimate Hop Popping (PHP). Using UHP opens the possibility of
one or more MPLS pop operation plus an MPLS swap operation for each one or more MPLS pop operations plus an MPLS swap operation for each
packet. The potential for multiple lookups and multiple counter packet. The potential for multiple lookups and multiple counter
instances per packet exists. instances per packet exists.
As networks grow and tunneling of LDP LSPs into RSVP-TE LSPs is used, As networks grow and tunneling of LDP LSPs into RSVP-TE LSPs is used,
and/or RSVP-TE hierarchy is used, the requirement to perform one or and/or RSVP-TE hierarchy is used, the requirement to perform one or
two or more MPLS pop operations plus a MPLS swap operation (and more MPLS pop operations plus an MPLS swap operation (and possibly a
possibly a push or two) increases. If MPLS-TP LM (link monitoring) push or two) increases. If MPLS-TP LM (link monitoring) OAM is
OAM is enabled at each layer, then a packet and byte count MUST be enabled at each layer, then a packet and byte count MUST be
maintained for each pop and swap operation so as to offer OAM for maintained for each pop and swap operation so as to offer OAM for
each layer. each layer.
2.6. Local Delivery of Packets 2.6. Local Delivery of Packets
There are a number of situations in which packets are destined to a There are a number of situations in which packets are destined to a
local address or where a return packet must be generated. There is a local address or where a return packet must be generated. There is a
need to mitigate the potential for outage as a result of either need to mitigate the potential for outage as a result of either
attacks on network infrastructure, or in some cases unintentional attacks on network infrastructure, or in some cases unintentional
misconfiguration resulting in processor overload. Some hardware misconfiguration resulting in processor overload. Some hardware
assistance is needed for all traffic destined to the general purpose assistance is needed for all traffic destined to the general-purpose
CPU that is used in MPLS control protocol processing or network CPU that is used in processing of the MPLS control protocol or the
management protocol processing and in most cases to other general network management protocol and in most cases to other general-
purpose CPUs residing on an LSR. This is due to the ease of purpose CPUs residing on an LSR. This is due to the ease of
overwhelming such a processor with traffic arriving on LSR high speed overwhelming such a processor with traffic arriving on LSR high-speed
interfaces, whether the traffic is malicious or not. interfaces, whether the traffic is malicious or not.
Denial of service (DoS) protection is an area requiring hardware Denial of service (DoS) protection is an area requiring hardware
support that is often overlooked or inadequately considered. support that is often overlooked or inadequately considered.
Hardware assist is also needed for OAM, particularly the more Hardware assists are also needed for OAM, particularly the more
demanding MPLS-TP OAM. demanding MPLS-TP OAM.
2.6.1. DoS Protection 2.6.1. DoS Protection
Modern equipment supports a number of control plane and management Modern equipment supports a number of control-plane and management-
plane protocols. Generally no single means of protecting network plane protocols. Generally, no single means of protecting network
equipment from denial of service (DoS) attacks is sufficient, equipment from DoS attacks is sufficient, particularly for high-speed
particularly for high speed interfaces. This problem is not specific interfaces. This problem is not specific to MPLS but is a topic that
to MPLS, but is a topic that cannot be ignored when implementing or cannot be ignored when implementing or evaluating MPLS
evaluating MPLS implementations. implementations.
Two types of protections are often cited as primary means of Two types of protections are often cited as the primary means of
protecting against attacks of all kinds. protecting against attacks of all kinds.
Isolated Control/Management Traffic Isolated Control/Management Traffic
Control and Management traffic can be carried out-of-band (OOB), Control and management traffic can be carried out-of-band (OOB),
meaning not intermixed with payload. For MPLS, use of G-ACh and meaning not intermixed with payload. For MPLS, use of G-ACh and
GAL to carry control and management traffic provides a means of GAL to carry control and management traffic provides a means of
isolation from potentially malicious payload. Used alone, the isolation from potentially malicious payloads. Used alone, the
compromise of a single node, including a small computer at a compromise of a single node, including a small computer at a
network operations center, could compromise an entire network. network operations center, could compromise an entire network.
Implementations which send all G-ACh/GAL traffic directly to a Implementations that send all G-ACh/GAL traffic directly to a
routing engine CPU are subject to DoS attack as a result of such routing engine CPU are subject to DoS attack as a result of such
a compromise. a compromise.
Cryptographic Authentication Cryptographic Authentication
Cryptographic authentication can very effectively prevent Cryptographic authentication can very effectively prevent
malicious injection of control or management traffic. malicious injection of control or management traffic.
Cryptographic authentication can in some circumstances be subject Cryptographic authentication can in some circumstances be subject
to DoS attack by overwhelming the capacity of the decryption with to DoS attack by overwhelming the capacity of the decryption with
a high volume of malicious traffic. For very low speed a high volume of malicious traffic. For very-low-speed
interfaces, cryptographic authentication can be performed by the interfaces, cryptographic authentication can be performed by the
general purpose CPU used as a routing engine. For all other general-purpose CPU used as a routing engine. For all other
cases, cryptographic hardware may be needed. For very high speed cases, cryptographic hardware may be needed. For very-high-speed
interfaces, even cryptographic hardware can be overwhelmed. interfaces, even cryptographic hardware can be overwhelmed.
Some control and management protocols are often carried with payload Some control and management protocols are often carried with payload
traffic. This is commonly the case with BGP, T-LDP, and SNMP. It is traffic. This is commonly the case with BGP, T-LDP, and SNMP. It is
often the case with RSVP-TE. Even when carried over G-ACh/GAL often the case with RSVP-TE. Even when carried over G-ACh/GAL,
additional measures can reduce the potential for a minor breach to be additional measures can reduce the potential for a minor breach to be
leveraged to a full network attack. leveraged to a full network attack.
Some of the additional protections are supported by hardware packet Some of the additional protections are supported by hardware packet
filtering. filtering.
GTSM GTSM
[RFC5082] defines a mechanism that uses the IPv4 TTL or IPv6 Hop [RFC5082] defines a mechanism that uses the IPv4 TTL or IPv6 Hop
Limit fields to insure control traffic that can only originate Limit fields to ensure control traffic that can only originate
from an immediate neighbor is not forged and originating from a from an immediate neighbor is not forged and is not originating
distant source. GTSM can be applied to many control protocols from a distant source. GTSM can be applied to many control
which are routable, for example LDP [RFC6720]. protocols that are routable, for example, LDP [RFC6720].
IP Filtering IP Filtering
At the very minimum, packet filtering plus classification and use At the very minimum, packet filtering plus classification and use
of multiple queues supporting rate limiting is needed for traffic of multiple queues supporting rate limiting is needed for traffic
that could potentially be sent to a general purpose CPU used as a that could potentially be sent to a general-purpose CPU used as a
routing engine. The first level of filtering only allows routing engine. The first level of filtering only allows
connections to be initiated from specific IP prefixes to specific connections to be initiated from specific IP prefixes to specific
destination ports and then preferably passes traffic directly to destination ports and then preferably passes traffic directly to
a cryptographic engine and/or rate limits. The second level of a cryptographic engine and/or rate limits. The second level of
filtering passes connected traffic, such as TCP connections filtering passes connected traffic, such as TCP connections
having received at least one authenticated SYN or having been having received at least one authenticated SYN or having been
locally initiated. The second level of filtering only passes locally initiated. The second level of filtering only passes
traffic to specific address and port pairs to be checked for traffic to specific address and port pairs to be checked for
cryptographic authentication. cryptographic authentication.
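
The GTSM check and the first level of filtering described above can be sketched as follows. This is a minimal illustration and not part of either document version: the function and class names, the example prefix and port, and the token-bucket form of rate limiting are all assumptions. Only the sending TTL of 255 comes from [RFC5082].

```python
import ipaddress
from dataclasses import dataclass

GTSM_SENT_TTL = 255  # RFC 5082: the sender sets TTL / Hop Limit to 255


def gtsm_accept(ttl: int, max_hops: int = 1) -> bool:
    """Accept only packets that cannot have crossed more than max_hops.

    max_hops is a hypothetical knob; with the default of 1 the received
    TTL must still be 255, i.e. the sender is an immediate neighbor.
    """
    return ttl >= GTSM_SENT_TTL - (max_hops - 1)


@dataclass
class TokenBucket:
    """Packet-based rate limiter (one token per packet), an assumed design."""
    rate: float    # tokens added per second
    burst: float   # maximum bucket depth
    tokens: float  # current fill level
    last: float    # time of the previous update, in seconds

    def allow(self, now: float) -> bool:
        # Refill according to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Hypothetical policy: (source prefix, destination port) pairs allowed to
# initiate connections toward the routing engine.
ALLOWED = [(ipaddress.ip_network("192.0.2.0/24"), 179)]


def first_level(src: str, dst_port: int, ttl: int,
                bucket: TokenBucket, now: float) -> str:
    """Return "pass" or a drop reason for a packet destined to the CPU."""
    addr = ipaddress.ip_address(src)
    if not any(addr in net and dst_port == port for net, port in ALLOWED):
        return "drop:filter"
    if not gtsm_accept(ttl):
        return "drop:gtsm"
    if not bucket.allow(now):
        return "drop:rate"
    return "pass"
```

In this sketch the cheap checks run first, so packets rejected by the prefix/port filter or by GTSM never consume rate-limit tokens. A hardware implementation would make the same decisions in the forwarding path rather than in software.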
The cryptographic authentication is generally the last resort in DoS The cryptographic authentication is generally the last resort in DoS
attack mitigation. If a packet must be first sent to a general attack mitigation. If a packet must be first sent to a general-
purpose CPU, then sent to a cryptographic engine, a DoS attack is purpose CPU, then sent to a cryptographic engine, a DoS attack is
possible on high speed interfaces. Only where hardware can fully possible on high-speed interfaces. Only where hardware can fully
process a cryptographic authentication without intervention from a process a cryptographic authentication without intervention from a
general purpose CPU to find the authentication field and to identify general-purpose CPU (to find the authentication field and to identify
the portion of packet to run the cryptographic algorithm over is the portion of packet to run the cryptographic algorithm over) is
cryptographic authentication beneficial in protecting against DoS cryptographic authentication beneficial in protecting against DoS
attacks. attacks.
For chips supporting multiple 100 Gb/s interfaces, only a very large For chips supporting multiple 100 Gb/s interfaces, only a very large
number of parallel cryptographic engines can provide the processing number of parallel cryptographic engines can provide the processing
capacity to handle a large scale DoS or distributed DoS (DDoS) capacity to handle a large-scale DoS or distributed DoS (DDoS)
attack. For many forwarding chips this much processing power attack. For many forwarding chips, this much processing power
requires significant chip real estate and power, and therefore requires significant chip real estate and power, and therefore
reduces system space and power density. For this reason, reduces system space and power density. For this reason,
cryptographic authentication is not considered a viable first line of cryptographic authentication is not considered a viable first line of
defense. defense.
For some networks the first line of defense is some means of For some networks, the first line of defense is some means of
supporting OOB control and management traffic. In the past this OOB supporting OOB control and management traffic. In the past, this OOB
channel might make use of overhead bits in SONET or OTN or a channel might make use of overhead bits in SONET or OTN or a
dedicated DWDM wavelength. G-ACh and GAL provide an alternative OOB dedicated DWDM wavelength. G-ACh and GAL provide an alternative OOB
mechanism which is independent of underlying layers. In other mechanism that is independent of underlying layers. In other
networks, including most IP/MPLS networks, perimeter filtering serves networks, including most IP/MPLS networks, perimeter filtering serves
a similar purpose, though less effective without extreme vigilance. a similar purpose, though it is less effective without extreme
vigilance.
A second line of defense is filtering, including GTSM. For protocols A second line of defense is filtering, including GTSM. For protocols
such as EBGP, GTSM and other filtering is often the first line of such as EBGP, GTSM and other filtering are often the first line of
defense. Cryptographic authentication is usually the last line of defense. Cryptographic authentication is usually the last line of
defense and insufficient by itself to mitigate DoS or DDoS attacks. defense and insufficient by itself to mitigate DoS or DDoS attacks.
2.6.2. MPLS OAM 2.6.2. MPLS OAM
[RFC4377] defines requirements for MPLS OAM that predate MPLS-TP. [RFC4377] defines requirements for MPLS OAM that predate MPLS-TP.
[RFC4379] defines what is commonly referred to as LSP Ping and LSP [RFC4379] defines what is commonly referred to as LSP Ping and LSP
Traceroute. [RFC4379] is updated by [RFC6424] supporting MPLS Traceroute. [RFC4379] is updated by [RFC6424], which supports MPLS
tunnels and stitched LSP and P2MP LSP. [RFC4379] is updated by tunnels and stitched LSP and P2MP LSP. [RFC4379] is updated by
[RFC6425] supporting P2MP LSP. [RFC4379] is updated by [RFC6426] to [RFC6425], which supports P2MP LSP. [RFC4379] is updated by
support MPLS-TP connectivity verification (CV) and route tracing. [RFC6426] to support MPLS-TP connectivity verification (CV) and route
tracing.
[RFC4950] extends the ICMP format to support TTL expiration that may [RFC4950] extends the ICMP format to support TTL expiration that may
occur when using IP traceroute within an MPLS tunnel. The ICMP occur when using IP Traceroute within an MPLS tunnel. The ICMP
message generation can be implemented in forwarding hardware, but if message generation can be implemented in forwarding hardware, but if
sent to a general purpose CPU must be rate limited to avoid a the ICMP packets are sent to a general-purpose CPU, this packet flow
potential denial or service (DoS) attack. must be rate limited to avoid a potential DoS attack.
[RFC5880] defines Bidirectional Forwarding Detection (BFD), a [RFC5880] defines Bidirectional Forwarding Detection (BFD), a
protocol intended to detect faults in the bidirectional path between protocol intended to detect faults in the bidirectional path between
two forwarding engines. [RFC5884] and [RFC5885] define BFD for MPLS. two forwarding engines. [RFC5884] and [RFC5885] define BFD for MPLS.
BFD can provide failure detection on any kind of path between BFD can provide failure detection on any kind of path between
systems, including direct physical links, virtual circuits, tunnels, systems, including direct physical links, virtual circuits, tunnels,
MPLS Label Switched Paths (LSPs), multihop routed paths, and MPLS Label Switched Paths (LSPs), multihop routed paths, and
unidirectional links as long as there is some return path. unidirectional links as long as there is some return path.
The processing requirements for BFD are less than for LSP Ping, The processing requirements for BFD are less than for LSP Ping,
making BFD somewhat better suited for relatively high rate proactive making BFD somewhat better suited for relatively high-rate proactive
monitoring. BFD does not verify that the data plane matches the monitoring. BFD does not verify that the data plane matches the
control plane, whereas LSP Ping does. LSP Ping is somewhat better control plane, whereas LSP Ping does. LSP Ping is somewhat better
suited for on-demand monitoring including relatively low rate suited for on-demand monitoring including relatively low-rate
periodic verification of data plane and as a diagnostic tool. periodic verification of the data plane and as a diagnostic tool.
Hardware assistance is often provided for BFD response where BFD Hardware assistance is often provided for BFD response where BFD
setup or parameter change is not involved and may be necessary for setup or parameter change is not involved and may be necessary for
relatively high rate proactive monitoring. If both BFD and LSP Ping relatively high-rate proactive monitoring. If both BFD and LSP Ping
are recognized in filtering prior to passing traffic to a general are recognized in filtering prior to passing traffic to a general-
purpose CPU, appropriate DoS protection can be applied (see purpose CPU, appropriate DoS protection can be applied (see
Section 2.6.1). Failure to recognize BFD and LSP Ping and at least Section 2.6.1). Failure to recognize BFD and LSP Ping and at least
rate limit creates the potential for misconfiguration to cause to rate limit creates the potential for misconfiguration to cause
outages rather than cause errors in the misconfigured OAM. outages rather than cause errors in the misconfigured OAM.
2.6.3. Pseudowire OAM 2.6.3. Pseudowire OAM
Pseudowire OAM makes use of the control channel provided by Virtual Pseudowire OAM makes use of the control channel provided by Virtual
Circuit Connectivity Verification (VCCV) [RFC5085]. VCCV makes use Circuit Connectivity Verification (VCCV) [RFC5085]. VCCV makes use
of the Pseudowire Control Word. BFD support over VCCV is defined by of the pseudowire Control Word. BFD support over VCCV is defined by
[RFC5885]. [RFC5885] is updated by [RFC6478] in support of static [RFC5885]. [RFC5885] is updated by [RFC6478] in support of static
pseudowires. [RFC4379] is updated by [RFC6829] supporting LSP Ping pseudowires. [RFC4379] is updated by [RFC6829] to support LSP Ping
for Pseudowire FEC advertised over IPv6. for Pseudowire FEC advertised over IPv6.
G-ACh/GAL (defined in [RFC5586]) is the preferred MPLS-TP OAM control G-ACh/GAL (defined in [RFC5586]) is the preferred MPLS-TP OAM control
channel and applies to any MPLS-TP end points, including Pseudowire. channel and applies to any MPLS-TP endpoints, including pseudowire.
See Section 2.6.4 for an overview of MPLS-TP OAM. See Section 2.6.4 for an overview of MPLS-TP OAM.
2.6.4. MPLS-TP OAM 2.6.4. MPLS-TP OAM
[RFC6669] summarizes the MPLS-TP OAM toolset, the set of protocols [RFC6669] summarizes the MPLS-TP OAM toolset, the set of protocols
supporting the MPLS-TP OAM requirements specified in [RFC5860] and supporting the MPLS-TP OAM requirements specified in [RFC5860] and
supported by the MPLS-TP OAM framework defined in [RFC6371]. supported by the MPLS-TP OAM framework defined in [RFC6371].
The MPLS-TP OAM toolset includes: The MPLS-TP OAM toolset includes:
CC-CV CC-CV
[RFC6428] defines BFD extensions to support proactive [RFC6428] defines BFD extensions to support proactive Continuity
Connectivity Check and Connectivity Verification (CC-CV) Check and Connectivity Verification (CC-CV) applications.
applications. [RFC6426] provides LSP ping extensions that are [RFC6426] provides LSP Ping extensions that are used to implement
used to implement on-demand connectivity verification. on-demand connectivity verification.
RDI RDI
Remote Defect Indication (RDI) is triggered by failure of Remote Defect Indication (RDI) is triggered by failure of
proactive CC-CV, which is BFD based. For fast RDI initiation, proactive CC-CV, which is BFD based. For fast RDI, RDI SHOULD be
RDI SHOULD be initiated and handled by hardware if BFD is handled initiated and handled by hardware if BFD is handled in forwarding
in forwarding hardware. [RFC6428] provides an extension for BFD hardware. [RFC6428] provides an extension for BFD that includes
that includes the RDI indication in the BFD format and a the RDI in the BFD format and a specification of how this
specification of how this indication is to be used. indication is to be used.
Route Tracing Route Tracing
[RFC6426] specifies that the LSP ping enhancements for MPLS-TP [RFC6426] specifies that the LSP Ping enhancements for MPLS-TP
on-demand connectivity verification include information on the on-demand connectivity verification include information on the
use of LSP ping for route tracing of an MPLS-TP path. use of LSP Ping for route tracing of an MPLS-TP path.
Alarm Reporting Alarm Reporting
[RFC6427] describes the details of a new protocol supporting [RFC6427] describes the details of a new protocol supporting
Alarm Indication Signal (AIS), Link Down Indication, and fault Alarm Indication Signal (AIS), Link Down Indication (LDI), and
management. Failure to support this functionality in forwarding fault management. Failure to support this functionality in
hardware can potentially result in failure to meet protection forwarding hardware can potentially result in failure to meet
recovery time requirements and is therefore strongly recommended. protection recovery time requirements; therefore, support of this
functionality is strongly recommended.
Lock Instruct Lock Instruct
Lock instruct is initiated on-demand and therefore need not be Lock instruct is initiated on demand and therefore need not be
implemented in forwarding hardware. [RFC6435] defines a lock implemented in forwarding hardware. [RFC6435] defines a lock
instruct protocol. instruct protocol.
Lock Reporting Lock Reporting
[RFC6427] covers lock reporting. Lock reporting need not be [RFC6427] covers lock reporting. Lock reporting need not be
implemented in forwarding hardware. implemented in forwarding hardware.
Diagnostic Diagnostic
[RFC6435] defines protocol support for loopback. Loopback [RFC6435] defines protocol support for loopback. Loopback
initiation is on-demand and therefore need not be implemented in initiation is on demand and therefore need not be implemented in
forwarding hardware. Loopback of packet traffic SHOULD be forwarding hardware. Loopback of packet traffic SHOULD be
implemented in forwarding hardware on high speed interfaces. implemented in forwarding hardware on high-speed interfaces.
Packet Loss and Delay Measurement Packet Loss and Delay Measurement
[RFC6374] and [RFC6375] define a protocol and profile for packet [RFC6374] and [RFC6375] define a protocol and profile for Packet
loss measurement (LM) and delay measurement (DM). LM requires a Loss Measurement (LM) and Delay Measurement (DM). LM requires a
very accurate capture and insertion of packet and byte counters very accurate capture and insertion of packet and byte counters
when a packet is transmitted and capture of packet and byte when a packet is transmitted and capture of packet and byte
counters when a packet is received. This capture and insertion counters when a packet is received. This capture and insertion
MUST be implemented in forwarding hardware for LM OAM if high MUST be implemented in forwarding hardware for LM OAM if high
accuracy is needed. DM requires very accurate capture and accuracy is needed. DM requires very accurate capture and
insertion of a timestamp on transmission and capture of timestamp insertion of a timestamp on transmission and capture of timestamp
when a packet is received. This timestamp capture and insertion when a packet is received. This timestamp capture and insertion
MUST be implemented in forwarding hardware for DM OAM if high MUST be implemented in forwarding hardware for DM OAM if high
accuracy is needed. accuracy is needed.
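
To illustrate why the counter and timestamp capture must be accurate, the following sketch shows the kind of arithmetic a loss or delay measurement might perform on the captured values. It is an illustration only: the function names, the 32-bit counter width, and the four-timestamp layout are assumptions, loosely following the general approach of [RFC6374].

```python
COUNTER_BITS = 32  # assumed counter width; real implementations may differ


def counter_delta(curr: int, prev: int, bits: int = COUNTER_BITS) -> int:
    """Difference between two counter snapshots, tolerating one wrap."""
    return (curr - prev) % (1 << bits)


def tx_loss(a_tx_prev: int, a_tx_curr: int,
            b_rx_prev: int, b_rx_curr: int) -> int:
    """Packets sent by A but not received by B between two LM messages.

    a_tx_* are A's transmit counters captured when each LM message was
    sent; b_rx_* are B's receive counters captured when each arrived.
    """
    return counter_delta(a_tx_curr, a_tx_prev) - counter_delta(b_rx_curr, b_rx_prev)


def two_way_delay(t1: int, t2: int, t3: int, t4: int) -> int:
    """Round-trip delay excluding the responder's processing time.

    t1: query sent, t2: query received, t3: response sent,
    t4: response received. All values share one time unit.
    """
    return (t4 - t1) - (t3 - t2)
```

Any error in when a counter or timestamp is captured appears directly in these differences, which is why capture in software after queuing tends to give poor accuracy.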
See Section 2.6.2 for discussion of hardware support necessary for See Section 2.6.2 for discussion of hardware support necessary for
BFD and LSP Ping. BFD and LSP Ping.
CC-CV and alarm reporting are tied to protection and therefore SHOULD CC-CV and alarm reporting are tied to protection and therefore SHOULD
be supported in forwarding hardware in order to provide protection be supported in forwarding hardware in order to provide protection
for a large number of affected LSP within target response intervals. for a large number of affected LSPs within target response intervals.
Since CC-CV is supported by BFD, for MPLS-TP providing hardware When using MPLS-TP, since CC-CV is supported by BFD, providing
assistance for BFD processing helps insure that protection recovery hardware assistance for BFD processing helps ensure that protection
time requirements can be met even for faults affecting a large number recovery time requirements can be met even for faults affecting a
of LSP. large number of LSPs.
MPLS-TP Protection State Coordination (PSC) is defined by [RFC6378] MPLS-TP Protection State Coordination (PSC) is defined by [RFC6378]
and updated by [I-D.ietf-mpls-psc-updates], correcting some errors in and updated by [RFC7324], which corrects some errors in [RFC6378].
[RFC6378].
2.6.5. MPLS OAM and Layer-2 OAM Interworking 2.6.5. MPLS OAM and Layer 2 OAM Interworking
[RFC6670] provides the reasons for selecting a single MPLS-TP OAM [RFC6670] provides the reasons for selecting a single MPLS-TP OAM
solution and examines the consequences were ITU-T to develop a second solution and examines the consequences were ITU-T to develop a second
OAM solution that is based on Ethernet encodings and mechanisms. OAM solution that is based on Ethernet encodings and mechanisms.
[RFC6310] and [RFC7023] specifies the mapping of defect states [RFC6310] and [RFC7023] specify the mapping of defect states between
between many types of hardware Attachment Circuits (ACs) and many types of hardware Attachment Circuits (ACs) and associated
associated Pseudowires (PWs). This functionality SHOULD be supported pseudowires (PWs). This functionality SHOULD be supported in
in forwarding hardware. forwarding hardware.
It is beneficial if an MPLS OAM implementation can interwork with the It is beneficial if an MPLS OAM implementation can interwork with the
underlying server layer and provide a means to interwork with a underlying server layer and provide a means to interwork with a
client layer. For example, [RFC6427] specifies an inter-layer client layer. For example, [RFC6427] specifies an inter-layer
propagation of AIS and LDI from MPLS server layer to client MPLS propagation of AIS and LDI from MPLS server layer to client MPLS
layers. Where the server layer is a Layer-2, such as Ethernet, PPP layers. Where the server layer uses a Layer 2 mechanism, such as
over SONET/SDH, or GFP over OTN, interwork among layers is also Ethernet, PPP over SONET/SDH, or GFP over OTN, interwork among layers
beneficial. For high speed interfaces, supporting this interworking is also beneficial. For high-speed interfaces, supporting this
in forwarding hardware helps insure that protection based on this interworking in forwarding hardware helps ensure that protection
interworking can meet recovery time requirements even for faults based on this interworking can meet recovery time requirements even
affecting a large number of LSP. for faults affecting a large number of LSPs.
2.6.6. Extent of OAM Support by Hardware 2.6.6. Extent of OAM Support by Hardware
Where certain requirements must be met, such as relatively high CC-CV Where certain requirements must be met, such as relatively high CC-CV
rates and a large number of interfaces, or strict protection recovery rates and a large number of interfaces, or strict protection recovery
time requirements and a moderate number of affected LSP, some OAM time requirements and a moderate number of affected LSPs, some OAM
functionality must be supported by forwarding hardware. In other functionality must be supported by forwarding hardware. In other
cases, such as highly accurate LM and DM OAM or strict protection cases, such as highly accurate LM and DM OAM or strict protection
recovery time requirements with a large number of affected LSP, OAM recovery time requirements with a large number of affected LSPs, OAM
functionality must be entirely implemented in forwarding hardware. functionality must be entirely implemented in forwarding hardware.
Where possible, implementation in forwarding hardware should be in Where possible, implementation in forwarding hardware should be in
programmable hardware such that if standards are later changed or programmable hardware such that if standards are later changed or
extended these changes are likely to be accommodated with hardware extended these changes are likely to be accommodated with hardware
reprogramming rather than replacement. reprogramming rather than replacement.
For some functionality there is a strong case for an implementation For some functionality, there is a strong case for an implementation
in dedicated forwarding hardware. Examples include packet and byte in dedicated forwarding hardware. Examples include packet and byte
counters needed for LM OAM as well as needed for management counters needed for LM OAM as well as needed for management
protocols. Similarly the capture and insertion of packet and byte protocols. Similarly, the capture and insertion of packet and byte
counts or timestamps needed for transmitted LM or DM or time counts or timestamps needed for transmitted LM or DM or time
synchronization packets MUST be implemented in forwarding hardware if synchronization packets MUST be implemented in forwarding hardware if
high accuracy is required. high accuracy is required.
For some functions there is a strong case to provide limited support For some functions, there is a strong case to provide limited support
in forwarding hardware but may make use of an external general in forwarding hardware, but an external general-purpose processor may
purpose processor if performance criteria can be met. For example be used if performance criteria can be met. For example, origination
origination of RDI triggered by CC-CV, response to RDI, and of RDI triggered by CC-CV, response to RDI, and Protection State
Protection State Coordination (PSC) functionality may be supported by Coordination (PSC) functionality may be supported by hardware, but
hardware, but expansion to a large number of client LSP and expansion to a large number of client LSPs and transmission of AIS or
transmission of AIS or RDI to the client LSP may occur in a general RDI to the client LSPs may occur in a general-purpose processor.
purpose processor. Some forwarding hardware supports one or more on- Some forwarding hardware supports one or more on-chip general-purpose
chip general purpose processors which may be well suited for such a processors that may be well suited for such a role. [RFC7324], being
role. [I-D.ietf-mpls-psc-updates], being a very recent document that a very recent document that affects a protection state machine that
affects a protection state machine that requires hardware support, requires hardware support, underscores the importance of having a
underscores the importance of having a degree of programmability in degree of programmability in forwarding hardware.
forwarding hardware.
The customer (system supplier or provider) should not dictate design, The customer (system supplier or provider) should not dictate design,
but should independently validate target functionality and but should independently validate target functionality and
performance. However, it is not uncommon for service providers and performance. However, it is not uncommon for service providers and
system implementers to insist on reviewing design details (under NDA) system implementers to insist on reviewing design details (under a
due to past experiences with suppliers and to reject suppliers who non-disclosure agreement) due to past experiences with suppliers and
are unwilling to provide details. to reject suppliers who are unwilling to provide details.
2.6.7. Support for IPFIX in Hardware 2.6.7. Support for IPFIX in Hardware
The IPFIX architecture is defined by [RFC5470]. IPFIX supports per The IPFIX architecture is defined by [RFC5470]. IPFIX supports per-
flow statistics. IPFIX infomation elements (IEs) are defined in flow statistics. IPFIX information elements (IEs) are defined in
[RFC5102] and include IEs for MPLS. [RFC7012] and include IEs for MPLS.
The forwarding chips used in core routers are not optimized for high The forwarding chips used in core routers are not optimized for high-
touch applications like IPFIX. Often support for IPFIX in core touch applications like IPFIX. Often, support for IPFIX in core
routers is limited to optional IPFIX metering, which involves a routers is limited to optional IPFIX metering, which involves a
1-in-N packet sampling, limited filtering support, and redirection to 1-in-N packet sampling, limited filtering support, and redirection to
either an internal CPU or an external interface. The CPU or device either an internal CPU or an external interface. The CPU or device
at the other end of the external interface then implements the full at the other end of the external interface then implements the full
IPFIX filtering and IPFIX collector functionality. IPFIX filtering and IPFIX collector functionality.
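
The 1-in-N packet sampling mentioned above can be sketched as a simple counter. This is a hypothetical illustration: the class and function names are assumptions, and real metering hardware may use random rather than deterministic selection.

```python
class OneInNSampler:
    """Deterministic 1-in-N sampler: selects every Nth packet seen."""

    def __init__(self, n: int) -> None:
        if n < 1:
            raise ValueError("n must be at least 1")
        self.n = n
        self.seen = 0

    def sample(self) -> bool:
        """Return True if the current packet should be redirected."""
        self.seen += 1
        return self.seen % self.n == 0


def estimate_total(sampled_count: int, n: int) -> int:
    """Scale a sampled packet count back up to an estimated total."""
    return sampled_count * n
```

Only the sampled packets are redirected to the CPU or external interface, so the load on that path is roughly 1/N of the forwarded packet rate, at the cost of statistics that are estimates rather than exact counts.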
LSR which are intended to be deployed further from the core may LSRs that are intended to be deployed further from the core may
support lower capacity interfaces but support higher touch support lower-capacity interfaces but support higher-touch
applications on the forwarding hardware and may provide dedicated applications on the forwarding hardware and may provide dedicated
hardware to support a greater subset IPFIX functionality before hardware to support a greater subset of IPFIX functionality before
handing off to a general purpose CPU. In some cases, far from the handing off to a general-purpose CPU. In some cases, far from the
core the entire IPFIX functionality up to and including the collector core the entire IPFIX functionality up to and including the collector
may be implemented in hardware and firmware in the forwarding may be implemented in hardware and firmware in the forwarding
silicon. It is also worth noting that at lower speeds a general silicon. It is also worth noting that at lower speeds a general-
purpose CPU may become adequate to implement IPFIX, particularly if purpose CPU may become adequate to implement IPFIX, particularly if
metering is used. metering is used.
2.7. Number and Size of Flows 2.7. Number and Size of Flows
Service provider networks may carry up to hundreds of millions of Service provider networks may carry up to hundreds of millions of
flows on 10 Gb/s links. Most flows are very short lived, many under flows on 10 Gb/s links. Most flows are very short lived, many under
a second. A subset of the flows are low capacity and somewhat long a second. A subset of the flows are low capacity and somewhat long
lived. When Internet traffic dominates capacity a very small subset lived. When Internet traffic dominates capacity, a very small subset
of flows are high capacity and/or very long lived. of flows are high capacity and/or very long lived.
Two types of limitations with regard to number and size of flows have Two types of limitations with regard to number and size of flows have
been observed. been observed.
1. Some hardware cannot handle some high capacity flows because of 1. Some hardware cannot handle some high-capacity flows because of
internal paths which are limited, such as per packet backplane internal paths that are limited, such as per-packet backplane
paths or paths internal or external to chips such as buffer paths or paths internal or external to chips such as buffer
memory paths. Such designs can handle aggregates of smaller memory paths. Such designs can handle aggregates of smaller
flows. Some hardware with acknowledged limitations has been flows. Some hardware with acknowledged limitations has been
successfully deployed but may be increasingly problematic if the successfully deployed but may be increasingly problematic if the
capacity of large microflows in deployed networks continues to capacity of large microflows in deployed networks continues to
grow. grow.
2. Some hardware approaches cannot handle a large number of flows, 2. Some hardware approaches cannot handle a large number of flows,
or a large number of large flows due to attempting to count per or a large number of large flows, due to attempting to count per
flow, rather than deal with aggregates of flows. Hash techniques flow, rather than deal with aggregates of flows. Hash techniques
scale with regard to number of flows due to a fixed hash size scale with regard to number of flows due to a fixed hash size
with many flows falling into the same hash bucket. Techniques with many flows falling into the same hash bucket. Techniques
that identify individual flows have been implemented but have that identify individual flows have been implemented but have
never been successfully deployed for Internet traffic. never been successfully deployed for Internet traffic.
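
The scaling property of hash techniques noted in item 2 can be illustrated with a fixed-size table of per-bucket counters. This is a sketch under stated assumptions: the flow key layout, the use of CRC-32 as the hash, and all names are hypothetical and do not describe any particular forwarding chip.

```python
import zlib


def bucket_of(flow: tuple, buckets: int) -> int:
    """Map a flow key (e.g. addresses, ports, protocol) to a bucket index."""
    key = "|".join(str(field) for field in flow).encode()
    return zlib.crc32(key) % buckets


class BucketCounters:
    """Byte counters kept per hash bucket rather than per flow."""

    def __init__(self, buckets: int) -> None:
        self.bytes = [0] * buckets

    def add(self, flow: tuple, length: int) -> None:
        # Many flows share a bucket; state never grows with flow count.
        self.bytes[bucket_of(flow, len(self.bytes))] += length
```

However many flows arrive, the table stays at its configured size, whereas a per-flow table would need one entry for each of the flows present.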
3. Questions for Suppliers 3. Questions for Suppliers
The following questions should be asked of a supplier. These The following questions should be asked of a supplier. These
questions are grouped into broad categories. The questions questions are grouped into broad categories and are intended to be
themselves are intended to be an open ended question to the supplier. open-ended questions to the supplier. The tests in Section 4 are
The tests in Section 4 are intended to verify whether the supplier intended to verify whether the supplier disclosed any compliance or
disclosed any compliance or performance limitations completely and performance limitations completely and accurately.
accurately.
3.1. Basic Compliance
Q#1 Can the implementation forward packets with an arbitrarily
large stack depth? What limitations exist, and under what
circumstances do further limitations come into play (such as
high packet rate or specific features enabled or specific types
of packet processing)? See Section 2.1.
Q#2 Is the entire set of basic MPLS functionality described in
Section 2.1 supported?
Q#3 Is the set of MPLS special-purpose labels handled correctly and
with adequate performance? Are extended special-purpose labels
handled correctly and with adequate performance? See
Section 2.1.1.
Q#4 Are mappings of label value and TC to PHB handled correctly,
including L-LSP mappings (RFC 3270) and CT mappings (RFC 4124)
to PHB? See Section 2.1.2.
Q#5 Is time synchronization adequately supported in forwarding
hardware?
A. Are both PTP and NTP formats supported?
B. Is the accuracy of timestamp insertion and incoming
stamping sufficient?
See Section 2.1.3.
Q#6 Is link bundling supported?
A. Can an LSP be pinned to specific components?
B. Is the "all-ones" component link supported?
See Section 2.1.5.
Q#7 Is MPLS hierarchy supported?
A. Are both PHP and UHP supported? What limitations exist on
the number of pop operations with UHP?
B. Are the pipe, short-pipe, and uniform models supported?
Are TTL and TC values updated correctly at egress where
applicable?
See Section 2.1.6 regarding MPLS hierarchy. See [RFC3443]
regarding PHP, UHP, and pipe, short-pipe, and uniform models.
Q#8 Is FRR supported?
A. Are both "One-to-One Backup" and "Facility Backup"
supported?
B. What forms of IP/LDP FRR are supported?
C. How quickly does protection recovery occur?
D. Does protection recovery speed increase when a fault
affects a large number of protected LSPs? And if so, by
how much?
See Section 2.1.7.
Q#9 Are pseudowire Sequence Numbers handled correctly? See
Section 2.1.8.1.
Q#10 Is VPN LER functionality handled correctly and without
performance issues? See Section 2.1.9.
Q#11 Is MPLS multicast (P2MP and MP2MP) handled correctly?
A. Are packets dropped on uncongested outputs if some outputs
are congested?
B. Is performance limited in high-fanout situations?
See Section 2.2.
3.2. Basic Performance
Q#12 Can very small packets be forwarded at full line rate on all
interfaces indefinitely? What limitations exist? And under
what circumstances do further limitations come into play (such
as specific features enabled or specific types of packet
processing)?
Q#13 Customers must decide whether to relax the prior requirement and
to what extent. If the answer to the prior question indicates
that limitations exist, then:
A. What is the smallest packet size where full line rate
forwarding can be supported?
B. What is the longest burst of full-rate small packets that
can be supported?
Specify circumstances (such as specific features enabled or
specific types of packet processing) that often impact these
rates and burst sizes.
Q#14 How many pop operations can be supported along with a swap
operation at full line rate while maintaining per-LSP packet and
byte counts for each pop and swap? This requirement is
particularly relevant for MPLS-TP.
Q#15 How many label push operations can be supported? While this
limitation is rarely an issue, it applies to both PHP and UHP,
unlike the pop limit, which applies to UHP.
Q#16 For a worst case where all packets arrive on one LSP, what is
the counter overflow time? Are any means provided to avoid
polling all counters at short intervals? This applies to both
MPLS and MPLS-TP.
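The scale behind Q#12 and Q#16 can be illustrated with some simple arithmetic. The following is a minimal sketch, not a measurement method. It assumes an Ethernet link with 20 bytes of per-frame overhead (preamble, start-of-frame delimiter, and inter-frame gap) and a hypothetical 32-bit hardware counter; the function names are illustrative only.

```python
# Assumption: Ethernet framing, 7-byte preamble + 1-byte SFD + 12-byte gap.
ETHERNET_OVERHEAD_BYTES = 20


def packets_per_second(link_bps, frame_bytes,
                       overhead_bytes=ETHERNET_OVERHEAD_BYTES):
    """Packet rate needed to fill a link with frames of one size."""
    return link_bps / ((frame_bytes + overhead_bytes) * 8)


def overflow_seconds(counter_bits, events_per_second):
    """Time until a counter of the given width wraps at a constant rate."""
    return (2 ** counter_bits) / events_per_second


# Worst case for Q#16: one 100 Gb/s link, all traffic on a single LSP.
pps = packets_per_second(100e9, 64)          # minimum-size Ethernet frames
packet_wrap = overflow_seconds(32, pps)      # 32-bit packet counter
byte_wrap = overflow_seconds(32, pps * 64)   # 32-bit byte counter

print(f"{pps / 1e6:.1f} Mpps")
print(f"32-bit packet counter wraps in {packet_wrap:.1f} s")
print(f"32-bit byte counter wraps in {byte_wrap:.2f} s")
```

Under these assumptions, a 32-bit byte counter wraps in well under a second, which is why the means of avoiding short polling intervals matters.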
3.3. Multipath Capabilities and Performance
Multipath capabilities and performance do not apply to MPLS-TP, but
they apply to MPLS and apply if MPLS-TP is carried in MPLS.
Q#17 How are large microflows accommodated? Is there active
management of the hash space mapping to output ports? See
Section 2.4.2.
Q#18 How many MPLS labels can be included in a hash based on the MPLS
label stack?
Q#19 Is packet rate performance decreased beyond some number of
labels?
Q#20 Can the IP header and payload information below the MPLS stack
be used in the hash? If so, which IP fields, payload types, and
payload fields are supported?
Q#21 At what maximum MPLS label stack depth can Bottom of Stack and
an IP header appear without impacting packet rate performance?
Q#22 Are special-purpose labels excluded from the label stack hash?
Are extended special-purpose labels excluded from the label
stack hash? See Section 2.4.5.1.
Q#23 How is multipath performance affected by high-capacity flows, an
extremely large number of flows, or very short-lived flows? See
Section 2.7.
3.4. Pseudowire Capabilities and Performance
Q#24 Is the pseudowire Control Word supported?
Q#25 What is the maximum rate of pseudowire encapsulation and
decapsulation? Apply the same questions as in Section 3.2
("Basic Performance") for any packet-based pseudowire, such as
IP VPN or Ethernet.
Q#26 Does inclusion of a pseudowire Control Word impact performance?
Q#27 Are Flow Labels supported?
Q#28 If so, what fields are hashed on for the Flow Label for
different types of pseudowires?
Q#29 Does inclusion of a Flow Label impact performance?
3.5. Entropy Label Support and Performance
Q#30 Can an Entropy Label be added when acting as an ingress LER, and
can it be removed when acting as an egress LER?
Q#31 If an Entropy Label can be added, what fields are hashed on for
the Entropy Label?
Q#32 Does adding or removing an Entropy Label impact packet rate
performance?
Q#33 Can an Entropy Label be detected in the label stack, used in the
hash, and properly terminate the search for further information
to hash on?
Q#34 Does using an Entropy Label have any negative impact on
performance? It should have no impact or a positive impact.
3.6. DoS Protection
Q#35 For each control- and management-plane protocol in use, what
measures are taken to provide DoS attack hardening?
Q#36 Have DoS attack tests been performed?
Q#37 Can compromise of an internal computer on a management subnet be
leveraged for any form of attack including DoS attack?
3.7. OAM Capabilities and Performance
Q#38 What OAM proactive and on-demand mechanisms are supported?
Q#39 What performance limits exist under high proactive monitoring
rates?
Q#40 Can excessively high proactive monitoring rates impact
control-plane performance or cause control-plane instability?
Q#41 Ask the prior questions for each of the following.
A. MPLS OAM
B. Pseudowire OAM
C. MPLS-TP OAM
D. Layer 2 OAM Interworking
See Section 2.6.
4. Forwarding Compliance and Performance Testing
Packet rate performance testing of equipment supporting a large
number of 10 Gb/s or 100 Gb/s links is not possible using desktop
computers or workstations. The use of high-end workstations as a
source of test traffic was barely viable 20 years ago but is no
longer at all viable. Though custom microcode has been used on
specialized router forwarding cards to serve the purpose of
generating test traffic and measuring it, for the most part,
performance testing will require specialized test equipment. There
are multiple sources of suitable equipment.
The set of tests listed here does not correspond one-to-one to the
set of questions in Section 3. The same categorization is used, and
these tests largely serve to validate answers provided to the prior
questions. They can also provide answers where a supplier is
unwilling to disclose compliance or performance.
Performance testing is the domain of the IETF Benchmark Methodology
Working Group (BMWG). Below are brief descriptions of conformance
and performance tests. Some very basic tests, specified in
[RFC5695], partially cover only the basic performance test T#3.
The following tests should be performed by the systems designer or
deployer; or, if it is not practical for the potential customer to
perform the tests directly, they may be performed by the supplier on
their behalf. These tests are grouped into broad categories.
The tests in Section 4.1 should be repeated under various conditions
to retest basic performance when critical capabilities are enabled.
Complete repetition of the performance tests enabling each capability
and combinations of capabilities would be very time intensive;
therefore, a reduced set of performance tests can be used to gauge
the impact of enabling specific capabilities.
4.1. Basic Compliance
T#1 Test forwarding at a high rate for packets with a varying number
of label entries. While packets with more than a dozen label
entries are unlikely to be used in any practical scenario today,
it is useful to know if limitations exist.
T#2 For each of the questions listed under "Basic Compliance" in
Section 3, verify the claimed compliance. For any functionality
considered critical to a deployment, the applicable performance
using each capability under load should be verified in addition
to basic compliance.
4.2. Basic Performance
T#3 Test packet forwarding at full line rate with small packets.
See [RFC5695]. The most likely case to fail is the smallest
packet size. Also, test with packet sizes in 4-byte increments
ranging from payload sizes of 40 to 128 bytes.
T#4 If the prior tests did not succeed for all packet sizes, then
perform the following tests.
A. Increase the packet size by 4 bytes until a size is found
that can be forwarded at full rate.
B. Inject bursts of consecutive small packets into a stream of
larger packets. Allow some time for recovery between
bursts. Increase the number of packets in the burst until
packets are dropped.
T#5 Send test traffic where a swap operation is required. Also set
up multiple LSPs carried over other LSPs where the device under
test (DUT) is the egress of these LSPs. Create test packets
such that the swap operation is performed after pop operations,
increasing the number of pop operations until forwarding of
small packets at full line rate can no longer be supported.
Also, check to see how many pop operations can be supported
before the full set of counters can no longer be maintained.
This requirement is particularly relevant for MPLS-TP.
T#6 Send all traffic on one LSP and see if the counters become
inaccurate. Often, counters on silicon are much smaller than
the 64-bit packet and byte counters in various IETF MIBs.
System developers should consider what counter polling rate is
necessary to maintain accurate counters and whether those
polling rates are practical. Relevant MIBs for MPLS are
discussed in [RFC4221] and [RFC6639].
T#7 [RFC6894] provides a good basis for MPLS FRR testing. Similar
testing should be performed to determine restoration times;
however, this testing is far more difficult to perform due to
the need for a simulated test topology that is capable of
simulating the signaling used in restoration. The simulated
topology should be comparable with the target deployment in the
number of nodes and links and in resource usage flooding and
setup delays. Some commercial test equipment can support this
type of testing.
4.3. Multipath Capabilities and Performance
Multipath capabilities do not apply to MPLS-TP but apply to MPLS and
apply if MPLS-TP is carried in MPLS.
T#8 Send traffic at a rate well exceeding the capacity of a single
multipath component link, and where entropy exists only below
the top of stack. If only the top label is used, this test will
fail immediately.
T#9 Move the labels with entropy down in the stack until either the
full forwarding rate can no longer be supported or most or all
packets try to use the same component link.
T#10 Repeat the two tests above with the entropy contained in IP
headers or IP payload fields below the label stack rather than
in the label stack. Test with the set of IP headers or IP
payload fields considered relevant to the deployment or to the
target market.
T#11 Determine whether traffic that contains a pseudowire Control
Word is interpreted as IP traffic. Information in the payload
MUST NOT be used in the load balancing if the first nibble of
the packet is not 4 or 6 (IPv4 or IPv6).
T#12 Determine whether special-purpose labels and extended
special-purpose labels are excluded from the label stack hash.
They MUST be excluded.
T#13 Perform testing in the presence of combinations of:
A. Very large microflows.
B. Relatively short-lived high-capacity flows.
C. Extremely large numbers of flows.
D. Very short-lived small flows.
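The first-nibble rule in T#11 can be sketched as follows. This is a minimal illustration that assumes the standard 32-bit label stack entry layout (20-bit label, 3-bit TC, 1-bit Bottom of Stack flag, 8-bit TTL). The helper names are hypothetical, and the sketch only decides whether the payload is eligible to be treated as IP.

```python
import struct


def lse(label, tc, s, ttl):
    """Build one 4-byte label stack entry (helper for test packets)."""
    return struct.pack("!I", (label << 12) | (tc << 9) | (s << 8) | ttl)


def parse_label_stack(packet):
    """Return (entries, payload_offset); entries are (label, tc, s, ttl)."""
    entries = []
    offset = 0
    while offset + 4 <= len(packet):
        (word,) = struct.unpack_from("!I", packet, offset)
        entry = (word >> 12, (word >> 9) & 0x7, (word >> 8) & 0x1,
                 word & 0xFF)
        entries.append(entry)
        offset += 4
        if entry[2]:  # Bottom of Stack bit set
            return entries, offset
    raise ValueError("no Bottom of Stack bit found")


def payload_may_be_hashed_as_ip(packet):
    """True only if the first nibble after the label stack is 4 or 6."""
    _, offset = parse_label_stack(packet)
    if offset >= len(packet):
        return False
    return (packet[offset] >> 4) in (4, 6)


ipv4_packet = lse(1000, 0, 1, 64) + bytes([0x45]) + bytes(19)
cw_packet = lse(1000, 0, 1, 64) + bytes(4) + b"\xff" * 14  # Control Word
print(payload_may_be_hashed_as_ip(ipv4_packet))
print(payload_may_be_hashed_as_ip(cw_packet))
```

A test for T#11 would send traffic shaped like `cw_packet` with varying payload contents and confirm that the component link selection does not change with the payload.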
4.4. Pseudowire Capabilities and Performance
T#14 Ensure that a pseudowire can be set up with a pseudowire label
and pseudowire Control Word added at ingress and the pseudowire
label and pseudowire Control Word removed at egress.
T#15 For a pseudowire that contains variable-length payload packets,
repeat performance tests listed under "Basic Performance" for
pseudowire ingress and egress functions.
T#16 Repeat pseudowire performance tests with and without a
pseudowire Control Word.
T#17 Determine whether a pseudowire can be set up with a pseudowire
label, Flow Label, and pseudowire Control Word added at ingress
and the pseudowire label, Flow Label, and pseudowire Control
Word removed at egress.
T#18 Determine which payload fields are used to create the Flow Label
and whether the set of fields and algorithm provide sufficient
entropy for load balancing.
T#19 Repeat pseudowire performance tests with Flow Labels included.
4.5. Entropy Label Support and Performance
T#20 Determine whether Entropy Labels can be added at ingress and
removed at egress.
T#21 Determine which fields are used to create an Entropy Label.
Labels further down in the stack, including Entropy Labels
further down and IP headers or IP payload fields where
applicable, should be used. Determine whether the set of fields
and algorithm provide sufficient entropy for load balancing.
T#22 Repeat performance tests under "Basic Performance" when Entropy
Labels are used, where ingress or egress is the device under
test (DUT).
T#23 Determine whether an ELI is detected when acting as a midpoint
LSR and whether the search for further information on which to
base the load balancing is used. Information below the Entropy
Label SHOULD NOT be used.
T#24 Ensure that the Entropy Label Indicator and Entropy Label (ELI
and EL) are removed from the label stack during UHP and PHP
operations.
T#25 Ensure that operations on the TC field when adding and removing
an Entropy Label are correctly carried out. If TC is changed
during a swap operation, the ability to transfer that change
MUST be provided. The ability to suppress the transfer of TC
MUST also be provided. See the pipe, short-pipe, and uniform
models in [RFC3443].
T#26 Repeat performance tests for a midpoint LSR with Entropy Labels
found at various label stack depths.
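The midpoint behavior checked in T#23 can be sketched as follows. This is a minimal illustration and one possible policy, not the only conforming one. It assumes the ELI is label value 7, that the entry immediately after the ELI is the Entropy Label, and that label values 0-15 are special purpose. The function name and return convention are hypothetical.

```python
ELI = 7                   # Entropy Label Indicator label value
SPECIAL_PURPOSE_MAX = 15  # label values 0-15 are special purpose


def hash_input(labels):
    """Select hash input from a label stack given top first.

    Returns (values_to_hash, stop_search). When an ELI is found, only the
    Entropy Label that follows it is used, and stop_search is True, so
    nothing below the Entropy Label is examined.
    """
    for index, label in enumerate(labels[:-1]):
        if label == ELI:
            return [labels[index + 1]], True
    # No ELI/EL pair: fall back to the ordinary labels, and let the caller
    # decide whether to look at the payload below the stack.
    ordinary = [value for value in labels if value > SPECIAL_PURPOSE_MAX]
    return ordinary, False


print(hash_input([1000, 7, 524288, 2000]))
print(hash_input([1000, 2000]))
```

A test following T#23 would vary the labels and payload below the Entropy Label while holding the Entropy Label constant and confirm that the selected component link does not change.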
4.6. DoS Protection
T#27 Actively attack LSRs under high protocol churn load and
determine control-plane performance impact or successful DoS
under test conditions. Specifically, test for the following.
A. TCP SYN attack against control-plane and management-plane
protocols using TCP, including CLI access (typically
SSH-protected login), NETCONF, etc.
B. High traffic volume attack against control-plane and
management-plane protocols not using TCP.
C. Attacks that can be performed from a compromised management
subnet computer, but not one with authentication keys.
D. Attacks that can be performed from a compromised peer within
the control plane (internal domain and external domain).
Assume that keys that are per peering and keys that are per
router ID, rather than network-wide keys, are in use.
See Section 2.6.1.
4.7. OAM Capabilities and Performance
T#28 Determine maximum sustainable rates of BFD traffic. If BFD
requires CPU intervention, determine both maximum rates and CPU
loading when multiple interfaces are active.
T#29 Verify LSP Ping and LSP Traceroute capability.
T#30 Determine maximum rates of MPLS-TP CC-CV traffic. If CC-CV
requires CPU intervention, determine both maximum rates and CPU
loading when multiple interfaces are active.
T#31 Determine MPLS-TP DM precision.
T#32 Determine MPLS-TP LM accuracy.
T#33 Verify MPLS-TP AIS/RDI and Protection State Coordination (PSC)
functionality, protection speed, and AIS/RDI notification speed
when a large number of Maintenance Entities (MEs) must be
notified with AIS/RDI.
5. Acknowledgements
Numerous very useful comments have been received in private email.
Some of these contributions are acknowledged here, approximately in
chronologic order.
Paul Doolan provided a brief review resulting in a number of
clarifications, most notably regarding on-chip vs. system buffering,
100 Gb/s link speed assumptions in the 150 Mpps figure, and handling
of large microflows. Pablo Frank reminded us of the sawtooth effect
in PPS vs. packet size graphs, prompting the addition of a few
paragraphs on this. Comments from Lou Berger at IETF-85 prompted the
addition of Section 2.7.
Valuable comments were received on the BMWG mailing list. Jay
Karthik pointed out testing methodology hints that after discussion
were deemed out of scope and were removed but may benefit later work
in BMWG.
Nabil Bitar pointed out the need to cover QoS (Differentiated
Services), MPLS multicast (P2MP and MP2MP), and MPLS-TP OAM. Nabil
also provided a number of clarifications to the questions and tests
in Section 3 and Section 4.
Mark Szczesniak provided a thorough review and a number of useful
comments and suggestions that improved the document.
Gregory Mirsky and Thomas Beckhaus provided useful comments during
the MPLS RT review.
Tal Mizrahi provided comments that prompted clarifications regarding
timestamp processing, local delivery of packets, and the need for
hardware assistance in processing OAM traffic.
Alexander (Sasha) Vainshtein pointed out errors in Section 2.1.8.1
and suggested new text which after lengthy discussion resulted in
restating the summarization of requirements from PWE3 RFCs and more
clearly stating the benefits and drawbacks of packet resequencing
based on PW sequence number.
Loa Andersson provided useful comments and corrections prior to WGLC.
Adrian Farrel provided useful comments and corrections as part of the
AD review.
Discussion with Steve Kent during SecDir review resulted in expansion
of Section 7, briefly summarizing security considerations related to
forwarding in normative references. Tom Petch pointed out some
editorial errors in private email plus an important math error. Al
Morton during OpsDir review prompted clarification in the target
audience section, suggested more clear wording in places, and found
numerous editorial errors.
Discussion with Steward Bryant and Alia Atlas as part of IESG review
resulted in coverage of IPFIX and improvements to document coverage
of MPLS FRR, and IP/LDP FRR, plus some corrections to the text
elsewhere.
6. IANA Considerations
This memo includes no request to IANA.
7. Security Considerations 5. Security Considerations
This document reviews forwarding behavior specified elsewhere and This document reviews forwarding behavior specified elsewhere and
points out compliance and performance requirements. As such it points out compliance and performance requirements. As such, it
introduces no new security requirements or concerns. introduces no new security requirements or concerns.
Discussion of hardware support and other equipment hardening against Discussion of hardware support and other equipment hardening against
DoS attack can be found in Section 2.6.1. Section 3.6 provides a DoS attack can be found in Section 2.6.1. Section 3.6 provides a
list of question regarding DoS to be asked of suppliers. Section 4.6 list of questions regarding DoS to be asked of suppliers.
suggests types of testing that can provide some assurance of the Section 4.6 suggests types of testing that can provide some assurance
effectiveness of supplier DoS hardening claims. of the effectiveness of a supplier's claims about DoS hardening.
Knowledge of potential performance shortcomings may serve to help new Knowledge of potential performance shortcomings may serve to help new
implementations avoid pitfalls. It is unlikely that such knowledge implementations avoid pitfalls. It is unlikely that such knowledge
could be the basis of new denial of service as these pitfalls are could be the basis of new denial of service, as these pitfalls are
already widely known in the service provider community and among already widely known in the service provider community and among
leading equipment suppliers. In practice extreme data and packet leading equipment suppliers. In practice, extreme data and packet
rate are needed to affect existing equipment and to affect networks rates are needed to affect existing equipment and to affect networks
that may be still vulnerable due to failure to implement adequate that may be still vulnerable due to failure to implement adequate
protection. The extreme data and packet rates make this type of protection. The extreme data and packet rates make this type of
denial of service unlikely and make undetectable denial of service of denial of service unlikely and make undetectable denial of service of
this type impossible. this type impossible.
The set of normative references each contain security considerations. Each normative reference contains security considerations. A brief
A brief summarization of MPLS security considerations applicable to summarization of MPLS security considerations applicable to
forwarding follows: forwarding follows:
1. MPLS encapsulation does not support an authentication extension. 1. MPLS encapsulation does not support an authentication extension.
This is reflected in the security section of [RFC3032]. This is reflected in the security section of [RFC3032].
Documents which clarify MPLS header fields such as TTL Documents that clarify MPLS header fields such as TTL [RFC3443],
[RFC3443], the explicit null label [RFC4182], renaming EXP to TC the explicit null label [RFC4182], renaming EXP to TC [RFC5462],
[RFC5462], ECN for MPLS [RFC5129], and MPLS Ethernet ECN for MPLS [RFC5129], and MPLS Ethernet encapsulation
encapsulation [RFC5332] make no changes to security [RFC5332] make no changes to security considerations in
considerations in [RFC3032]. [RFC3032].
2. Some cited RFCs are related to Diffserv forwarding. [RFC3270] 2. Some cited RFCs are related to Diffserv forwarding. [RFC3270]
refers to MPLS and Diffserv security. [RFC2474] mentions theft refers to MPLS and Diffserv security. [RFC2474] mentions theft
of service and denial of service due to mismarking. [RFC2474] of service and denial of service due to mismarking. [RFC2474]
mentions IPsec interaction, but with MPLS, not being carried by mentions IPsec interaction, but with MPLS, not being carried by
IP, this type of interaction in [RFC2474] is not relevant. IP, the type of interaction in [RFC2474] is not relevant.
3. [RFC3209] is cited here due only to make-before-break forwarding 3. [RFC3209] is cited here due only to make-before-break forwarding
requirements. This is related to resource sharing and the theft requirements. This is related to resource sharing and the
of service and denial of service concerns in [RFC2474] apply. theft-of-service and denial-of-service concerns in [RFC2474]
apply.
4. [RFC4090] defines FRR which provides protection but does not add 4. [RFC4090] defines FRR, which provides protection but does not
security concerns. RFC4201 defines link bundling but raises no add security concerns. RFC 4201 defines link bundling but
additional security concerns. raises no additional security concerns.
5. Various OAM control channels are defined in [RFC4385] (PW CW), 5. Various OAM control channels are defined in [RFC4385] (PW CW),
[RFC5085] (VCCV), [RFC5586] (G-Ach and GAL). These documents [RFC5085] (VCCV), and [RFC5586] (G-Ach and GAL). These
describe potential abuse of these OAM control channels. documents describe potential abuse of these OAM control
channels.
6. [RFC4950] defines ICMP extensions when MPLS TTL expires and 6. [RFC4950] defines ICMP extensions when MPLS TTL expires and the
payload is IP. This provides MPLS header information which is payload is IP. This provides MPLS header information that is of
of no use to an IP attacker, but sending this information can be no use to an IP attacker, but sending this information can be
suppressed through configuration. suppressed through configuration.
7. GTSM [RFC5082] provides a means to improve protection against 7. GTSM [RFC5082] provides a means to improve protection against
high traffic volume spoofing as a form of DoS attack. high traffic volume spoofing as a form of DoS attack.
8. BFD [RFC5880] [RFC5884] [RFC5885] provides a form of OAM used in 8. BFD [RFC5880] [RFC5884] [RFC5885] provides a form of OAM used in
MPLS and MPLS-TP. The security considerations related to the MPLS and MPLS-TP. The security considerations related to the
OAM control channel are relevant. The BFD payload supports OAM control channel are relevant. The BFD payload supports
authentication unlike the MPLS encapsulation or MPLS or PW authentication. The MPLS encapsulation, the MPLS control
control channel encapsulation is carried in. Where an IP return channel, or the PW control channel, which BFD may be carried in,
OAM path is used IPsec is suggested as a means of securing the do not support authentication. Where an IP return OAM path is
return path. used, IPsec is suggested as a means of securing the return path.
9. Other forms of OAM are supported by [RFC6374] [RFC6375] (Loss 9. Other forms of OAM are supported by [RFC6374] [RFC6375] (Loss
and Delay Measurement), [RFC6428] (Connectivity Check/ and Delay Measurement), [RFC6428] (Continuity Check/Verification
Verification based on BFD), and [RFC6427] (Fault Management). based on BFD), and [RFC6427] (Fault Management). The security
The security considerations related to the OAM control channel considerations related to the OAM control channel are relevant.
are relevant. IP return paths, where used, can be secured with IP return paths, where used, can be secured with IPsec.
IPsec.
10. Linear protection is defined by [RFC6378] and updated by 10. Linear protection is defined by [RFC6378] and updated by
[I-D.ietf-mpls-psc-updates]. Security concerns related to MPLS [RFC7324]. Security concerns related to MPLS encapsulation and
encapsulation and OAM control channels apply. Security concerns OAM control channels apply. Security concerns reiterate
reiterate [RFC5920] as applied to protection switching. [RFC5920] as applied to protection switching.
11. The PW Flow Label [RFC6391] and MPLS Entropy Label [RFC6790] 11. The PW Flow Label [RFC6391] and MPLS Entropy Label [RFC6790]
affect multipath load balancing. Security concerns reiterate affect multipath load balancing. Security concerns reiterate
[RFC5920]. Security impacts would be limited to load [RFC5920]. Security impacts would be limited to load
distribution. distribution.
MPLS security including data plane security is discussed in greater MPLS security including data-plane security is discussed in greater
detail in [RFC5920] (MPLS/GMPLS Security Framework). The MPLS-TP detail in [RFC5920] (MPLS/GMPLS Security Framework). The MPLS-TP
security framework [RFC6941] build upon this, focusing largely on the security framework [RFC6941] builds upon this, focusing largely on
MPLS-TP OAM additions and OAM channels with some attention given to the MPLS-TP OAM additions and OAM channels with some attention given
using network management in place of control plane setup. In both to using network management in place of control-plane setup. In both
security framework documents MPLS is assumed to run within a "trusted security framework documents, MPLS is assumed to run within a
zone", defined as being where a single service provider (SP) has "trusted zone", defined as being where a single service provider has
total operational control over that part of the network. total operational control over that part of the network.
If control plane security and management plane security are If control-plane security and management-plane security are
sufficiently robust, compromise of a single network element may sufficiently robust, compromise of a single network element may
result in chaos in the data plane anywhere in the network through result in chaos in the data plane anywhere in the network through
denial of service attacks, but not a Byzantine security failure in denial-of-service attacks, but not a Byzantine security failure in
which other network elements are fully compromised. which other network elements are fully compromised.
MPLS security, or lack of, can affect whether traffic can be MPLS security, or lack thereof, can affect whether traffic can be
misrouted and lost, or intercepted, or intercepted and reinserted (a misrouted and lost, or intercepted, or intercepted and reinserted (a
man-in-the-middle attack) or spoofed. End user applications, man-in-the-middle attack), or spoofed. End-user applications,
including control plane and management plane protocols used by the including control-plane and management-plane protocols used by the
SP, are expected to make use of appropriate end-to-end authentication service provider, are expected to make use of appropriate end-to-end
and where appropriate end-to-end encryption. authentication and, where appropriate, end-to-end encryption.
8. Organization of References Section 6. Organization of References Section
The References section is split into Normative and Informative The References section is split into Normative and Informative
subsections. References that directly specify forwarding subsections. References that directly specify forwarding
encapsulations or behaviors are listed as normative. References encapsulations or behaviors are listed as normative. References that
which describe signaling only, though normative with respect to describe signaling only, though normative with respect to signaling,
signaling, are listed as informative. They are informative with are listed as informative. They are informative with respect to MPLS
respect to MPLS forwarding. forwarding.
9. References
9.1. Normative References 7. References
[I-D.ietf-mpls-psc-updates] 7.1. Normative References
Osborne, E., "Updates to PSC", draft-ietf-mpls-psc-
updates-01 (work in progress), January 2014.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997. Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC3032] Rosen, E., Tappan, D., Fedorkow, G., Rekhter, Y., [RFC3032] Rosen, E., Tappan, D., Fedorkow, G., Rekhter, Y.,
Farinacci, D., Li, T., and A. Conta, "MPLS Label Stack Farinacci, D., Li, T., and A. Conta, "MPLS Label Stack
Encoding", RFC 3032, January 2001. Encoding", RFC 3032, January 2001.
[RFC3209] Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V., [RFC3209] Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V.,
and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP
skipping to change at page 52, line 46 skipping to change at page 52, line 43
[RFC6428] Allan, D., Swallow Ed. , G., and J. Drake Ed. , "Proactive [RFC6428] Allan, D., Swallow Ed. , G., and J. Drake Ed. , "Proactive
Connectivity Verification, Continuity Check, and Remote Connectivity Verification, Continuity Check, and Remote
Defect Indication for the MPLS Transport Profile", RFC Defect Indication for the MPLS Transport Profile", RFC
6428, November 2011. 6428, November 2011.
[RFC6790] Kompella, K., Drake, J., Amante, S., Henderickx, W., and [RFC6790] Kompella, K., Drake, J., Amante, S., Henderickx, W., and
L. Yong, "The Use of Entropy Labels in MPLS Forwarding", L. Yong, "The Use of Entropy Labels in MPLS Forwarding",
RFC 6790, November 2012. RFC 6790, November 2012.
9.2. Informative References [RFC7324] Osborne, E., "Updates to MPLS Transport Profile Linear
Protection", RFC 7324, July 2014.
7.2. Informative References
[ACK-compression] [ACK-compression]
, , and , "Observations and Dynamics of a Congestion Zhang, L., Shenker, S., and D. Clark, "Observations and
Control Algorithm: The Effects of Two-Way Traffic", Proc. Dynamics of a Congestion Control Algorithm: The Effects of
ACM SIGCOMM, ACM Computer Communications Review (CCR) Vol Two-Way Traffic", Proc. ACM SIGCOMM, ACM Computer
21, No 4, 1991, pp.133-147., 1991. Communications Review (CCR) Vol. 21, No. 4, pp. 133-147.,
1991.
[I-D.ietf-mpls-in-udp] [MPLS-IN-UDP]
Xu, X., Sheth, N., Yong, L., Pignataro, C., and F. Xu, X., Sheth, N., Yong, L., Pignataro, C., and F.
Yongbing, "Encapsulating MPLS in UDP", draft-ietf-mpls-in- Yongbing, "Encapsulating MPLS in UDP", Work in Progress,
udp-05 (work in progress), January 2014. January 2014.
[I-D.ietf-mpls-special-purpose-labels]
Kompella, K., Andersson, L., and A. Farrel, "Allocating
and Retiring Special Purpose MPLS Labels", draft-ietf-
mpls-special-purpose-labels-03 (work in progress), July
2013.
[I-D.ietf-rtgwg-mrt-frr-architecture] [MRT] Atlas, A., Kebler, R., Bowers, C., Envedi, G., Csaszar,
Atlas, A., Kebler, R., Envedi, G., Csaszar, A., Tantsura, A., Tantsura, J., Konstantynowicz, M., and R. White, "An
J., Konstantynowicz, M., and R. White, "An Architecture Architecture for IP/LDP Fast-Reroute Using Maximally
for IP/LDP Fast-Reroute Using Maximally Redundant Trees", Redundant Trees", Work in Progress, July 2014.
draft-ietf-rtgwg-mrt-frr-architecture-03 (work in
progress), July 2013.
[I-D.ietf-rtgwg-remote-lfa] [REMOTE-LFA]
Bryant, S., Filsfils, C., Previdi, S., Shand, M., and S. Bryant, S., Filsfils, C., Previdi, S., Shand, M., and S.
Ning, "Remote LFA FRR", draft-ietf-rtgwg-remote-lfa-04 Ning, "Remote LFA FRR", Work in Progress, May 2014.
(work in progress), November 2013.
[I-D.ietf-tictoc-1588overmpls]
Davari, S., Oren, A., Bhatia, M., Roberts, P., and L.
Montini, "Transporting Timing messages over MPLS
Networks", draft-ietf-tictoc-1588overmpls-05 (work in
progress), June 2013.
[RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791, September [RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791, September
1981. 1981.
[RFC2474] Nichols, K., Blake, S., Baker, F., and D. Black, [RFC2474] Nichols, K., Blake, S., Baker, F., and D. Black,
"Definition of the Differentiated Services Field (DS "Definition of the Differentiated Services Field (DS
Field) in the IPv4 and IPv6 Headers", RFC 2474, December Field) in the IPv4 and IPv6 Headers", RFC 2474, December
1998. 1998.
[RFC2475] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z., [RFC2475] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z.,
skipping to change at page 55, line 5 skipping to change at page 54, line 48
Hierarchy with Generalized Multi-Protocol Label Switching Hierarchy with Generalized Multi-Protocol Label Switching
(GMPLS) Traffic Engineering (TE)", RFC 4206, October 2005. (GMPLS) Traffic Engineering (TE)", RFC 4206, October 2005.
[RFC4221] Nadeau, T., Srinivasan, C., and A. Farrel, "Multiprotocol [RFC4221] Nadeau, T., Srinivasan, C., and A. Farrel, "Multiprotocol
Label Switching (MPLS) Management Overview", RFC 4221, Label Switching (MPLS) Management Overview", RFC 4221,
November 2005. November 2005.
[RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram [RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram
Congestion Control Protocol (DCCP)", RFC 4340, March 2006. Congestion Control Protocol (DCCP)", RFC 4340, March 2006.
[RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
Networks (VPNs)", RFC 4364, February 2006.
[RFC4377] Nadeau, T., Morrow, M., Swallow, G., Allan, D., and S. [RFC4377] Nadeau, T., Morrow, M., Swallow, G., Allan, D., and S.
Matsushima, "Operations and Management (OAM) Requirements Matsushima, "Operations and Management (OAM) Requirements
for Multi-Protocol Label Switched (MPLS) Networks", RFC for Multi-Protocol Label Switched (MPLS) Networks", RFC
4377, February 2006. 4377, February 2006.
[RFC4379] Kompella, K. and G. Swallow, "Detecting Multi-Protocol [RFC4379] Kompella, K. and G. Swallow, "Detecting Multi-Protocol
Label Switched (MPLS) Data Plane Failures", RFC 4379, Label Switched (MPLS) Data Plane Failures", RFC 4379,
February 2006. February 2006.
[RFC4664] Andersson, L. and E. Rosen, "Framework for Layer 2 Virtual [RFC4664] Andersson, L. and E. Rosen, "Framework for Layer 2 Virtual
skipping to change at page 55, line 36 skipping to change at page 55, line 36
[RFC4928] Swallow, G., Bryant, S., and L. Andersson, "Avoiding Equal [RFC4928] Swallow, G., Bryant, S., and L. Andersson, "Avoiding Equal
Cost Multipath Treatment in MPLS Networks", BCP 128, RFC Cost Multipath Treatment in MPLS Networks", BCP 128, RFC
4928, June 2007. 4928, June 2007.
[RFC4960] Stewart, R., "Stream Control Transmission Protocol", RFC [RFC4960] Stewart, R., "Stream Control Transmission Protocol", RFC
4960, September 2007. 4960, September 2007.
[RFC5036] Andersson, L., Minei, I., and B. Thomas, "LDP [RFC5036] Andersson, L., Minei, I., and B. Thomas, "LDP
Specification", RFC 5036, October 2007. Specification", RFC 5036, October 2007.
[RFC5102] Quittek, J., Bryant, S., Claise, B., Aitken, P., and J.
Meyer, "Information Model for IP Flow Information Export",
RFC 5102, January 2008.
[RFC5286] Atlas, A. and A. Zinin, "Basic Specification for IP Fast [RFC5286] Atlas, A. and A. Zinin, "Basic Specification for IP Fast
Reroute: Loop-Free Alternates", RFC 5286, September 2008. Reroute: Loop-Free Alternates", RFC 5286, September 2008.
[RFC5317] Bryant, S. and L. Andersson, "Joint Working Team (JWT) [RFC5317] Bryant, S. and L. Andersson, "Joint Working Team (JWT)
Report on MPLS Architectural Considerations for a Report on MPLS Architectural Considerations for a
Transport Profile", RFC 5317, February 2009. Transport Profile", RFC 5317, February 2009.
[RFC5462] Andersson, L. and R. Asati, "Multiprotocol Label Switching [RFC5462] Andersson, L. and R. Asati, "Multiprotocol Label Switching
(MPLS) Label Stack Entry: "EXP" Field Renamed to "Traffic (MPLS) Label Stack Entry: "EXP" Field Renamed to "Traffic
Class" Field", RFC 5462, February 2009. Class" Field", RFC 5462, February 2009.
skipping to change at page 58, line 18 skipping to change at page 58, line 13
6894, March 2013. 6894, March 2013.
[RFC6941] Fang, L., Niven-Jenkins, B., Mansfield, S., and R. [RFC6941] Fang, L., Niven-Jenkins, B., Mansfield, S., and R.
Graveman, "MPLS Transport Profile (MPLS-TP) Security Graveman, "MPLS Transport Profile (MPLS-TP) Security
Framework", RFC 6941, April 2013. Framework", RFC 6941, April 2013.
[RFC6981] Bryant, S., Previdi, S., and M. Shand, "A Framework for IP [RFC6981] Bryant, S., Previdi, S., and M. Shand, "A Framework for IP
and MPLS Fast Reroute Using Not-Via Addresses", RFC 6981, and MPLS Fast Reroute Using Not-Via Addresses", RFC 6981,
August 2013. August 2013.
[RFC7012] Claise, B. and B. Trammell, "Information Model for IP Flow
Information Export (IPFIX)", RFC 7012, September 2013.
[RFC7023] Mohan, D., Bitar, N., Sajassi, A., DeLord, S., Niger, P., [RFC7023] Mohan, D., Bitar, N., Sajassi, A., DeLord, S., Niger, P.,
and R. Qiu, "MPLS and Ethernet Operations, Administration, and R. Qiu, "MPLS and Ethernet Operations, Administration,
and Maintenance (OAM) Interworking", RFC 7023, October and Maintenance (OAM) Interworking", RFC 7023, October
2013. 2013.
[RFC7074] Berger, L. and J. Meuric, "Revised Definition of the GMPLS [RFC7074] Berger, L. and J. Meuric, "Revised Definition of the GMPLS
Switching Capability and Type Fields", RFC 7074, November Switching Capability and Type Fields", RFC 7074, November
2013. 2013.
[RFC7079] Del Regno, N. and A. Malis, "The Pseudowire (PW) and [RFC7079] Del Regno, N. and A. Malis, "The Pseudowire (PW) and
Virtual Circuit Connectivity Verification (VCCV) Virtual Circuit Connectivity Verification (VCCV)
Implementation Survey Results", RFC 7079, November 2013. Implementation Survey Results", RFC 7079, November 2013.
9.3. URIs [RFC7274] Kompella, K., Andersson, L., and A. Farrel, "Allocating
and Retiring Special-Purpose MPLS Labels", RFC 7274, June
2014.
[1] http://www.iana.org [TIMING-OVER-MPLS]
Davari, S., Oren, A., Bhatia, M., Roberts, P., and L.
Montini, "Transporting Timing messages over MPLS
Networks", Work in Progress, April 2014.
Appendix A. Acknowledgements
Numerous very useful comments have been received in private email.
Some of these contributions are acknowledged here, approximately in
chronologic order.
Paul Doolan provided a brief review resulting in a number of
clarifications, most notably regarding on-chip vs. system buffering,
100 Gb/s link speed assumptions in the 150 Mpps figure, and handling
of large microflows. Pablo Frank reminded us of the sawtooth effect
in PPS vs. packet-size graphs, prompting the addition of a few
paragraphs on this. Comments from Lou Berger at IETF 85 prompted the
addition of Section 2.7.
Valuable comments were received on the BMWG mailing list. Jay
Karthik pointed out testing methodology hints that after discussion
were deemed out of scope and were removed but may benefit later work
in BMWG.
Nabil Bitar pointed out the need to cover QoS (Differentiated
Services), MPLS multicast (P2MP and MP2MP), and MPLS-TP OAM. Nabil
also provided a number of clarifications to the questions and tests
in Sections 3 and 4.
Mark Szczesniak provided a thorough review and a number of useful
comments and suggestions that improved the document.
Gregory Mirsky and Thomas Beckhaus provided useful comments during
the review by the MPLS Review Team.
Tal Mizrahi provided comments that prompted clarifications regarding
timestamp processing, local delivery of packets, and the need for
hardware assistance in processing OAM traffic.
Alexander (Sasha) Vainshtein pointed out errors in Section 2.1.8.1
and suggested new text that, after lengthy discussion, resulted in
restating the summarization of requirements from PWE3 RFCs and more
clearly stating the benefits and drawbacks of packet resequencing
based on PW Sequence Number.
Loa Andersson provided useful comments and corrections prior to WGLC.
Adrian Farrel provided useful comments and corrections as part of the
AD review.
Discussion with Steve Kent during SecDir review resulted in expansion
of Section 5, briefly summarizing security considerations related to
forwarding in normative references. Tom Petch pointed out some
editorial errors in private email plus an important math error. Al
Morton during OpsDir review prompted clarification in the section
about the target audience, suggested more clear wording in places,
and found numerous editorial errors.
Discussion with Stewart Bryant and Alia Atlas as part of IESG review
resulted in coverage of IPFIX and improvements to document coverage
of MPLS FRR, and IP/LDP FRR, plus some corrections to the text
elsewhere.
Authors' Addresses Authors' Addresses
Curtis Villamizar (editor) Curtis Villamizar (editor)
Outer Cape Cod Network Consulting, LLC Outer Cape Cod Network Consulting, LLC
Email: curtis@occnc.com EMail: curtis@occnc.com
Kireeti Kompella Kireeti Kompella
Juniper Networks Juniper Networks
Email: kireeti@juniper.net EMail: kireeti@juniper.net
Shane Amante Shane Amante
Apple Inc. Apple Inc.
1 Infinite Loop 1 Infinite Loop
Cupertino, California 95014 Cupertino, California 95014
Email: samante@apple.com EMail: amante@apple.com
Andrew Malis Andrew Malis
Huawei Technologies Huawei Technologies
Email: agmalis@gmail.com EMail: agmalis@gmail.com
Carlos Pignataro Carlos Pignataro
Cisco Systems Cisco Systems
7200-12 Kit Creek Road 7200-12 Kit Creek Road
Research Triangle Park, NC 27709 Research Triangle Park, NC 27709
US US
Email: cpignata@cisco.com EMail: cpignata@cisco.com
 End of changes. 409 change blocks. 
1030 lines changed or deleted 1030 lines changed or added
