Network Working Group                                        K. Kompella
Internet-Draft                                                  J. Drake
Updates: 3031 (if approved)                             Juniper Networks
Intended status: Standards Track                               S. Amante
Expires: November 8, 2012                    Level 3 Communications, LLC
                                                           W. Henderickx
                                                          Alcatel-Lucent
                                                                 L. Yong
                                                              Huawei USA
                                                             May 7, 2012

              The Use of Entropy Labels in MPLS Forwarding
                    draft-ietf-mpls-entropy-label-02

Abstract

   Load balancing is a powerful tool for engineering traffic across a
   network.  This memo suggests ways of improving load balancing across
   MPLS networks using the concept of "entropy labels".  It defines the
   concept, describes why entropy labels are useful, enumerates
   properties of entropy labels that allow maximal benefit, and shows
   how they can be signaled and used for various applications.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on November 8, 2012.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Conventions used
     1.2.  Motivation
   2.  Approaches
   3.  Entropy Labels and Their Structure
   4.  Data Plane Processing of Entropy Labels
     4.1.  Egress LSR
     4.2.  Ingress LSR
     4.3.  Transit LSR
     4.4.  Penultimate Hop LSR
   5.  Signaling for Entropy Labels
     5.1.  LDP Signaling
     5.2.  BGP Signaling
     5.3.  RSVP-TE Signaling
   6.  Operations, Administration, and Maintenance (OAM) and
       Entropy Labels
   7.  MPLS-TP and Entropy Labels
   8.  Point-to-Multipoint LSPs and Entropy Labels
   9.  Entropy Labels in Various Scenarios
     9.1.  LDP Tunnel
     9.2.  LDP Over RSVP-TE
     9.3.  MPLS Applications
   10. Security Considerations
   11. IANA Considerations
     11.1. Reserved Label for ELI
     11.2. LDP Entropy Label Capability TLV
     11.3. BGP Entropy Label Capability Attribute
     11.4. RSVP-TE Entropy Label Capability flag
   12. Acknowledgments
   13. References
     13.1. Normative References
     13.2. Informative References
   Appendix A.  Applicability of LDP Entropy Label Capability TLV
   Authors' Addresses

1.  Introduction

   Load balancing, or multi-pathing, is an attempt to balance traffic
   across a network by allowing the traffic to use multiple paths.  Load
   balancing has several benefits: it eases capacity planning; it can
   help absorb traffic surges by spreading them across multiple paths;
   it allows better resilience by offering alternate paths in the event
   of a link or node failure.

   As providers scale their networks, they use several techniques to
   achieve greater bandwidth between nodes.  Two widely used techniques
   are: Link Aggregation Group (LAG) and Equal-Cost Multi-Path (ECMP).
   LAG is used to bond together several physical circuits between two
   adjacent nodes so they appear to higher-layer protocols as a single,
   higher bandwidth 'virtual' pipe.  ECMP is used between two nodes
   separated by one or more hops, to allow load balancing over several
   shortest paths in the network.  This is typically obtained by
   arranging IGP metrics such that there are several equal cost paths
   between source-destination pairs.  Both of these techniques may, and
   often do, co-exist in various parts of a given provider's network,
   depending on various choices made by the provider.

   A very important requirement when load balancing is that packets
   belonging to a given 'flow' must be mapped to the same path, i.e.,
   the same exact sequence of links across the network.  This is to
   avoid jitter, latency and re-ordering issues for the flow.  What
   constitutes a flow varies considerably.  A common example of a flow
   is a TCP session.  Other examples are an L2TP session corresponding
   to a given broadband user, or traffic within an ATM virtual circuit.

   To meet this requirement, a node uses certain fields, termed 'keys',
   within a packet's header as input to a load balancing function
   (typically a hash function) that selects the path for all packets in
   a given flow.  The keys chosen for the load balancing function depend
   on the packet type; a typical set (for IP packets) is the IP source
   and destination addresses, the protocol type, and (for TCP and UDP
   traffic) the source and destination port numbers.  An overly
   conservative choice of fields may lead to many flows mapping to the
   same hash value (and consequently poorer load balancing); an overly
   aggressive choice may map a flow to multiple values, potentially
   violating the above requirement.
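   As a rough illustration of the paragraph above, the following sketch
   hashes a typical IP key set to pick one of several equal-cost paths.
   The names FlowKeys and select_path are hypothetical, and SHA-256 is
   used only for convenience; real forwarding hardware normally uses a
   much cheaper hash.

```python
import hashlib
from typing import NamedTuple


class FlowKeys(NamedTuple):
    """Typical load balancing keys for an IP packet (illustrative)."""
    src_ip: str
    dst_ip: str
    protocol: int
    src_port: int
    dst_port: int


def select_path(keys: FlowKeys, num_paths: int) -> int:
    """Map a flow to one of num_paths paths.

    Identical keys always produce the same index, so all packets of a
    flow follow the same path.
    """
    if num_paths <= 0:
        raise ValueError("num_paths must be positive")
    material = "|".join(str(field) for field in keys).encode()
    digest = hashlib.sha256(material).digest()
    return int.from_bytes(digest[:4], "big") % num_paths


flow = FlowKeys("192.0.2.1", "198.51.100.7", 6, 49152, 443)
path_index = select_path(flow, 4)
```

   Dropping the port numbers from FlowKeys would be the "overly
   conservative" choice described above: every session between the same
   two hosts would then land on the same path.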

   For MPLS networks, most of the same principles (and benefits) apply.
   However, finding useful keys in a packet for the purpose of load
   balancing can be more of a challenge.  In many cases, MPLS
   encapsulation may require fairly deep inspection of packets to find
   these keys at transit LSRs.

   One way to eliminate the need for this deep inspection is to have the
   ingress LSR of an MPLS Label Switched Path extract the appropriate
   keys from a given packet, input them to its load balancing function,
   and place the result in an additional label, termed the 'entropy
   label', as part of the MPLS label stack it pushes onto that packet.

   The packet's entire MPLS label stack can then be used by transit LSRs
   to perform load balancing, as the entropy label introduces the right
   level of "entropy" into the label stack.

   There are five key reasons why this is beneficial:

   1.  at the ingress LSR, MPLS encapsulation hasn't yet occurred, so
       deep inspection is not necessary;

   2.  the ingress LSR has more context and information about incoming
       packets than transit LSRs;

   3.  ingress LSRs usually operate at lower bandwidths than transit
       LSRs, allowing them to do more work per packet;

   4.  transit LSRs do not need to perform deep packet inspection and
       can load balance effectively using only a packet's MPLS label
       stack; and

   5.  transit LSRs, not having the full context that an ingress LSR
       does, have the hard choice between potentially misinterpreting
       fields in a packet as valid keys for load balancing (causing
       packet ordering problems) or adopting a conservative approach
       (giving rise to sub-optimal load balancing).  Entropy labels
       relieve them of making this choice.

   This memo describes why entropy labels are needed and defines the
   properties of entropy labels; in particular how they are generated
   and received, and the expected behavior of transit LSRs.  Finally, it
   describes in general how signaling works and what needs to be
   signaled, as well as specifics for the signaling of entropy labels
   for LDP ([RFC5036]), BGP ([RFC3107]), and RSVP-TE ([RFC3209]).

1.1.  Conventions used

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   The following acronyms are used:

      BoS: Bottom of Stack

      CE: Customer Edge device

      ECMP: Equal Cost Multi-Path

      EL: Entropy Label

      ELC: Entropy Label Capability

      ELI: Entropy Label Indicator

      FEC: Forwarding Equivalence Class

      LAG: Link Aggregation Group

      LER: Label Edge Router

      LSR: Label Switching Router

      PE: Provider Edge Router

      PHP: Penultimate Hop Popping

      TC: Traffic Class

      TTL: Time-to-Live

      UHP: Ultimate Hop Popping

      VPLS: Virtual Private LAN (Local Area Network) Service

      VPN: Virtual Private Network

   The term ingress (or egress) LSR is used interchangeably with ingress
   (or egress) LER.  The term application throughout the text refers to
   an MPLS application (such as a VPN or VPLS).

   A label stack (say of three labels) is denoted by <L1, L2, L3>, where
   L1 is the "outermost" label and L3 the innermost (closest to the
   payload).  Packet flows are depicted left to right, and signaling is
   shown right to left (unless otherwise indicated).

   The term 'label' is used both for the entire 32-bit label and the 20-
   bit label field within a label.  It should be clear from the context
   which is meant.

1.2.  Motivation

   MPLS is a very successful generic forwarding substrate that
   transports several dozen types of protocols, most notably: IP, PWE3,
   VPLS and IP VPNs.  Within each type of protocol, there typically
   exist several
   variants, each with a different set of load balancing keys, e.g., for
   IP: IPv4, IPv6, IPv6 in IPv4, etc.; for PWE3: Ethernet, ATM, Frame-
   Relay, etc.  There are also several different types of Ethernet over
   PW encapsulation, ATM over PW encapsulation, etc. as well.  Finally,
   given the popularity of MPLS, it is likely that it will continue to
   be extended to transport new protocols.

   Currently, each transit LSR along the path of a given LSP has to try
   to infer the underlying protocol within an MPLS packet in order to
   extract appropriate keys for load balancing.  Unfortunately, if the
   transit LSR is unable to infer the MPLS packet's protocol (as is
   often the case), it will typically use the topmost (or all) MPLS
   labels in the label stack as keys for the load balancing function.
   The result may be an extremely inequitable distribution of traffic
   across equal-cost paths exiting that LSR.  This is because MPLS
   labels are generally fairly coarse-grained forwarding labels that
   typically describe a next-hop, or provide some form of demultiplexing
   and/or forwarding function, and do not describe the packet's
   underlying protocol.

   On the other hand, an ingress LSR (e.g., a PE router) has detailed
   knowledge of a packet's contents, typically through a priori
   configuration of the encapsulation(s) that are expected at a given
   PE-CE interface (e.g., IPv4, IPv6, VPLS, etc.).  Ingress LSRs also
   have more flexible forwarding hardware.  PE routers need this
   information and these capabilities to:

      a) apply the required services for the CE;

      b) discern the packet's CoS forwarding treatment;

      c) apply filters to forward or block traffic to/from the CE;

      d) forward routing/control traffic to an onboard management
      processor; and,

      e) load-balance the traffic on its uplinks to transit LSRs (e.g.,
      P routers).

   By knowing the expected encapsulation types, an ingress LSR
   can apply a more specific set of payload parsing routines to extract
   the keys appropriate for a given protocol.  This allows for
   significantly improved accuracy in determining the appropriate load
   balancing behavior for each protocol.

   If the ingress LSR were to capture the flow information so gathered
   in a convenient form for downstream transit LSRs, transit LSRs could
   remain completely oblivious to the contents of each MPLS packet, and
   use only the captured flow information to perform load balancing.  In
   particular, there will be no reason to duplicate an ingress LSR's
   complex packet/payload parsing functionality in a transit LSR.  This
   will result in less complex transit LSRs, enabling them to more
   easily scale to higher forwarding rates, larger port density, lower
   power consumption, etc.  The idea in this memo is to capture this
   flow information as a label, the so-called entropy label.

   Ingress LSRs can also adapt more readily to new protocols and extract
   the appropriate keys to use for load balancing packets of those
   protocols.  This means that deploying new protocols or services in
   edge devices requires fewer concomitant changes in the core,
   resulting in higher edge service velocity and at the same time more
   stable core networks.

2.  Approaches

   There are two main approaches to encoding load balancing information
   in the label stack.  The first allocates multiple labels for a
   particular Forwarding Equivalence Class (FEC).  These labels are
   equivalent in terms of forwarding semantics, but having multiple
   labels allows flexibility in assigning labels to flows belonging to
   the same FEC.  This approach has the advantage that the label stack
   has the same depth whether or not one uses label-based load
   balancing; and so, consequently, there is no change to forwarding
   operations on transit and egress LSRs.  However, it has a major
   drawback in that there is a significant increase in both signaling
   and forwarding state.

   The other approach encodes the load balancing information as an
   additional label in the label stack, thus increasing the depth of the
   label stack by one.  With this approach, there is minimal change to
   signaling state for a FEC; also, there is no change in forwarding
   operations in transit LSRs, and no increase of forwarding state in
   any LSR.  The only purpose of the additional label is to increase the
   entropy in the label stack, so this is called an "entropy label".
   This memo focuses solely on this approach.

   This latter approach uses upstream generated entropy labels, which
   may conflict with downstream allocated application labels.  There are
   a few approaches to deal with this: 1) allocate a pair of labels for
   each FEC, one that must have an entropy label below it, and one that
   must not; 2) use a label (the "Entropy Label Indicator") to indicate
   that the next label is an entropy label; and 3) allow entropy labels
   only where there is no possible confusion.  The first doubles control
   and data plane state in the network; the last is too restrictive.
   The approach taken here is the second.  In making both the above
   choices, the trade-off is to increase label stack depth rather than
   control and data plane state in the network.

   Finally, one may choose to associate ELs with MPLS tunnels (LSPs), or
   with MPLS applications (e.g., VPNs).  (What this entails is described
   in later sections.)  We take the former approach, for the following
   reasons:

   1.  There are a small number of tunneling protocols for MPLS, but a
       large and growing number of applications.  Defining ELs on a
       tunnel basis means simpler standards, lower development,
       interoperability and testing efforts.

   2.  As a consequence, there will be much less churn in the network as
       new applications (services) are defined and deployed.

   3.  Processing application labels in the data plane is more complex
       than processing tunnel labels.  Thus, it is preferable to burden
       the latter rather than the former with EL processing.

   4.  Associating ELs with tunnels makes it simpler to deal with
       hierarchy, be it LDP-over-RSVP-TE or Carrier's Carrier VPNs.
       Each layer in the hierarchy can choose independently whether or
       not they want ELs.

   The cost of this approach is that ELIs will be mandatory; again, the
   trade-off is the size of the label stack.  To summarize, the net
   increase in the label stack to use entropy labels is two: one
   reserved label for the ELI, and the entropy label itself.

3.  Entropy Labels and Their Structure

   An entropy label (as used here) is a label:

   1.  that is not used for forwarding;

   2.  that is not signaled; and

   3.  whose only purpose in the label stack is to provide 'entropy' to
       improve load balancing.

   Entropy labels are generated by an ingress LSR, based entirely on
   load balancing information.  However, they MUST NOT have values in
   the reserved label space (0-15) [IANA MPLS Label Values].  To ensure
   that they are not used inadvertently for forwarding, entropy labels
   SHOULD have a TTL of 0.  The CoS field of an entropy label can be set
   to any value deemed appropriate.

   Since entropy labels are generated by an ingress LSR, an egress LSR
   MUST be able to distinguish unambiguously between entropy labels and
   application labels.  This is accomplished by REQUIRING that the label
   immediately preceding an entropy label (EL) in the MPLS label stack
   be an 'entropy label indicator' (ELI).  The ELI is a reserved label
   with value (TBD by IANA).  An ELI MUST have 'Bottom of Stack' (BoS)
   bit = 0 ([RFC3032]).  The TTL SHOULD be set to whatever value the
   label above it in the stack has.  The CoS field can be set to any
   value deemed appropriate; typically, this will be the value in the
   label above the ELI in the label stack.
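   The field constraints above can be sketched as follows, assuming the
   32-bit label stack entry layout of [RFC3032] (20-bit label, 3-bit
   CoS/TC field, 1-bit S, 8-bit TTL).  The helper names are
   hypothetical, and the ELI value is passed in as a parameter because
   this memo leaves it TBD by IANA.

```python
import struct

RESERVED_LABEL_MAX = 15  # label values 0-15 are reserved


def encode_lse(label: int, tc: int, bos: bool, ttl: int) -> bytes:
    """Pack one label stack entry: label(20) | TC(3) | S(1) | TTL(8)."""
    if not 0 <= label < (1 << 20):
        raise ValueError("label must fit in 20 bits")
    if not 0 <= tc < 8:
        raise ValueError("TC must fit in 3 bits")
    if not 0 <= ttl < 256:
        raise ValueError("TTL must fit in 8 bits")
    word = (label << 12) | (tc << 9) | (int(bos) << 8) | ttl
    return struct.pack("!I", word)


def encode_entropy_label(value: int, tc: int, bos: bool) -> bytes:
    """Entropy label: never a reserved value, and sent with TTL 0."""
    if value <= RESERVED_LABEL_MAX:
        raise ValueError("entropy label must not be a reserved label")
    return encode_lse(value, tc, bos, ttl=0)


def encode_eli(eli_value: int, tc: int, ttl: int) -> bytes:
    """ELI: the BoS bit is always 0 because an entropy label follows."""
    return encode_lse(eli_value, tc, bos=False, ttl=ttl)
```

   For instance, encode_entropy_label(16, 0, True) yields the four
   bytes 00 01 01 00: label 16, TC 0, S = 1, TTL 0.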

   Entropy labels are useful for pseudowires ([RFC4447]).
   [I-D.ietf-pwe3-fat-pw] explains how entropy labels can be used for
   RFC 4447-style pseudowires, and thus is complementary to this memo,
   which focuses on how entropy labels can be used for tunnels, and thus
   for all other MPLS applications.

4.  Data Plane Processing of Entropy Labels

4.1.  Egress LSR

   Suppose egress LSR Y is capable of processing entropy labels for a
   tunnel.  Y indicates this to all ingresses via signaling (see
   Section 5).  Y MUST be prepared to deal both with packets with an
   imposed EL and those without; the ELI will distinguish these cases.
   If a particular ingress chooses not to impose an EL, Y's processing
   of the received label stack (which might be empty) is as if Y chose
   not to accept ELs.

   If an ingress X chooses to impose an EL, then Y will receive a tunnel
   termination packet with label stack <TL, ELI, EL> <remaining packet
   header>, where TL is the tunnel label.  Y recognizes TL as the label
   it distributed to its upstreams for the tunnel, and pops it.  (Note
   that TL may be the implicit null label, in which case it doesn't
   appear in the label stack.)  Y then recognizes the ELI and pops two
   labels: the ELI and the EL.  Y then processes the remaining packet
   header as normal; this may require further processing of tunnel
   termination, perhaps with further ELI+EL pairs.  When processing the
   final tunnel termination, Y MAY enqueue the packet based on that
   tunnel TL's or ELI's TC value, and MAY use the tunnel TL's or ELI's
   TTL to compute the TTL of the remaining packet header.  The EL's TTL
   MUST be ignored.

   If any ELI processed by Y has BoS bit set, Y MUST discard the packet,
   and MAY log an error.  The EL's BoS bit will indicate whether or not
   there are more labels in the stack.

4.2.  Ingress LSR

   If an egress LSR Y indicates via signaling that it can process ELs on
   a particular tunnel, an ingress LSR X can choose whether or not to
   insert ELs for packets going into that tunnel.  Y MUST handle both
   cases.

   The steps that X performs to insert ELs are as follows:

   1.  On an incoming packet, identify the application to which the
       packet belongs, and thereby pick the fields to input to the load
       balancing function; call the output LB.

   2.  Determine the application label AL (if any).  Push <AL> onto the
       packet.

   3.  Based on the application, the load balancing output LB and other
       factors, determine the egress LSR Y, the tunnel to Y, the
       specific interface to the next hop, and thus the tunnel label TL.
       Use LB to generate the entropy label EL.

   4.  If, for the chosen tunnel, Y has not indicated that it can
       process ELs, push <TL> onto the packet.  If Y has indicated that
       it can process ELs for the tunnel, push <TL, ELI, EL> onto the
       packet.  X SHOULD put the same TTL and TC fields for the ELI as
       it does for TL.  The TTL for the EL MUST be zero.  The TC for the
       EL may be any value.

   5.  X then determines whether further tunnel hierarchy is needed; if
       so, X goes back to step 3, possibly with a new egress Y for the
       new tunnel.  Otherwise, X is done, and sends out the packet.

   Notes:

   a.  X computes load balancing information and generates the EL based
       on the incoming application packet, even though the signaling of
       EL capability is associated with tunnels.

   b.  X MAY insert several entropy labels in the stack (each, of
       course, preceded by an ELI), potentially one for each
       hierarchical tunnel, provided that the egress for that tunnel has
       indicated that it can process ELs for that tunnel.

   c.  X MUST NOT include an entropy label for a given tunnel unless the
       egress LSR Y has indicated that it can process entropy labels for
       that tunnel.

   d.  The signaling and use of entropy labels in one direction
       (signaling from Y to X, and data path from X to Y) is completely
       independent of the signaling and use of entropy labels in the
       reverse direction (signaling from X to Y, and data path from Y to
       X).
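   A minimal sketch of steps 1 through 4 for one tunnel level follows.
   It assumes CRC-32 as the load balancing function and a simple
   modulo fold into the non-reserved label space; both are choices
   made for this illustration only.  The ELI value is a placeholder
   (TBD by IANA), and the names LSE, load_balance_output,
   entropy_from_lb and impose_tunnel are hypothetical.

```python
import zlib
from typing import List, NamedTuple

ELI = 7  # hypothetical placeholder; the real value is TBD by IANA


class LSE(NamedTuple):
    """One label stack entry; stacks list the outermost entry first."""
    label: int
    tc: int
    bos: bool
    ttl: int


def load_balance_output(keys: bytes) -> int:
    """Step 1: reduce the chosen packet fields to an integer LB."""
    return zlib.crc32(keys)


def entropy_from_lb(lb: int) -> int:
    """Step 3: fold LB into a 20-bit value outside the range 0-15."""
    return 16 + lb % ((1 << 20) - 16)


def impose_tunnel(stack: List[LSE], tl: int, tc: int, ttl: int,
                  lb: int, egress_elc: bool) -> List[LSE]:
    """Step 4: push <TL> or <TL, ELI, EL> above the current stack."""
    bottom = not stack  # the new bottom entry sets S only if nothing is below
    if not egress_elc:
        return [LSE(tl, tc, bottom, ttl)] + list(stack)
    el = LSE(entropy_from_lb(lb), 0, bottom, 0)  # the EL carries TTL 0
    eli = LSE(ELI, tc, False, ttl)               # same TTL and TC as TL
    return [LSE(tl, tc, False, ttl), eli, el] + list(stack)


lb = load_balance_output(b"192.0.2.1|198.51.100.7|6|49152|443")
app_stack = [LSE(200, 0, True, 64)]  # step 2: <AL> already pushed
out = impose_tunnel(app_stack, tl=100, tc=0, ttl=64, lb=lb, egress_elc=True)
```

   Step 5 (tunnel hierarchy) corresponds to calling impose_tunnel again
   on the result with the outer tunnel's label and the outer egress's
   capability.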

4.3.  Transit LSR

   Transit LSRs MAY operate with no change in forwarding behavior.  The
   following are suggestions for optimizations that improve load
   balancing, reduce the amount of packet data processed, and/or enhance
   backward compatibility.

   If a transit LSR recognizes the ELI, it MAY choose to load balance
   solely on the following label (the EL); otherwise, it SHOULD use as
   much of the whole label stack as feasible as keys for the load
   balancing function, with the exception that reserved labels MUST NOT
   be used.

   Some transit LSRs look beyond the label stack for better load
   balancing information.  This is a simple, backward compatible
   approach in networks where some ingress LSRs impose ELs and others
   don't.  However, this is of limited incremental value if an EL is
   indeed present, and requires more packet processing from the LSR.  A
   transit LSR MAY choose to parse the label stack for the presence of
   the ELI, and look beyond the label stack only if it does not find it,
   thus retaining the old behavior when needed, yet avoiding unnecessary
   work if not.
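   The key selection suggested above might look like the following
   sketch.  It is one possible reading, not a required algorithm: the
   ELI value is a placeholder (TBD by IANA), CRC-32 stands in for
   whatever hash an LSR actually uses, and when several ELI+EL pairs
   are present the sketch simply uses the outermost EL.

```python
import zlib
from typing import List

ELI = 7                  # hypothetical placeholder; TBD by IANA
RESERVED_LABEL_MAX = 15  # label values 0-15 are reserved


def transit_select(labels: List[int], num_paths: int) -> int:
    """Choose a next-hop index from label values (outermost first)."""
    if num_paths <= 0:
        raise ValueError("num_paths must be positive")
    if ELI in labels[:-1]:
        # ELI recognized: balance solely on the label that follows it.
        keys = [labels[labels.index(ELI) + 1]]
    else:
        # No ELI: use the whole stack, skipping reserved labels.
        keys = [lab for lab in labels if lab > RESERVED_LABEL_MAX]
    material = b"".join(lab.to_bytes(3, "big") for lab in keys)
    return zlib.crc32(material) % num_paths
```

   Two packets that carry the same EL map to the same next hop even if
   their other labels differ, which is the property that lets a transit
   LSR avoid looking past the label stack.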

4.4.  Penultimate Hop LSR

   No change is needed at penultimate hop LSRs.

5.  Signaling for Entropy Labels

   An egress LSR Y can signal to ingress LSR(s) its ability to process
   entropy labels (henceforth called "Entropy Label Capability" or ELC)
   on a given tunnel.  Note that Entropy Label Capability may be
   asymmetric: if LSRs X and Y are at opposite ends of a tunnel, X may
   be able to process entropy labels, whereas Y may not.  The signaling
   extensions below allow for this asymmetry.

   For an illustration of signaling and forwarding with entropy labels,
   see Section 9.

5.1.  LDP Signaling

   A new LDP TLV ([RFC5036]) is defined to signal an egress's ability to
   process entropy labels.  This is called the ELC TLV, and may appear
   as an Optional Parameter of the Label Mapping Message TLV.

   The presence of the ELC TLV in a Label Mapping Message indicates to
   ingress LSRs that the egress LSR can process entropy labels for the
   associated LDP tunnel.  The ELC TLV has Type (TBD by IANA) and Length
   0.

   The structure of the ELI is zero, this indicates the
   egress doesn't need an ELI for the signaled application; if not, the
   egress requires the given ELI with entropy labels.  An example where
   an ELI is needed is when the signaled application is an LSP that can
   carry IP traffic.

   The structure of the Entropy Label sub-TLV ELC TLV is shown below.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |U|F|        Type (TBD)         |           Length (8)          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               Value                   |     Must Be Zero (0)          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                  Figure 1: Entropy Label sub-TLV Capability TLV

   where:

      U: Unknown bit.  This bit MUST be set to 1.  If the Entropy Label
      sub-TLV
      Capability TLV is not understood, then the TLV is not known to the
      receiver and MUST be ignored.

      F: Forward bit.  This bit MUST be set be set to 1.  Since this
      sub-TLV
      Capability TLV is going to be propagated hop-by-hop, the sub-TLV TLV
      should be forwarded even by nodes that may not understand it.

      Type: sub-TLV Type field, as specified field.  To be assigned by IANA.

      Length: sub-TLV Length field.  This field specifies the total length in
      octets of the Entropy Label sub-TLV.

      Value: value of the Entropy Label Indicator Label. ELC TLV, and is currently defined to be 0.
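
   As an illustration of the encoding, the following sketch builds the
   four-octet ELC TLV header with the U and F bits set and a Length of
   0, as stated in the text.  The Type value used is a hypothetical
   placeholder, since the real value is to be assigned by IANA, and the
   function names are invented for this example.

```python
import struct

# Hypothetical placeholder; the real Type is to be assigned by IANA.
ELC_TLV_TYPE = 0x3F00


def encode_elc_tlv(tlv_type=ELC_TLV_TYPE):
    """Encode an ELC TLV: U=1, F=1, 14-bit Type, 16-bit Length of 0."""
    if not 0 <= tlv_type < (1 << 14):
        raise ValueError("LDP TLV Type must fit in 14 bits")
    first16 = (1 << 15) | (1 << 14) | tlv_type  # U bit, F bit, Type
    return struct.pack("!HH", first16, 0)       # network byte order


def decode_tlv_header(data):
    """Split the first four octets of an LDP TLV into its fields."""
    first16, length = struct.unpack("!HH", data[:4])
    return {
        "u": first16 >> 15,
        "f": (first16 >> 14) & 1,
        "type": first16 & 0x3FFF,
        "length": length,
    }
```

   A receiver that decodes such a header would see both bits set and no
   value octets following the header.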

5.2.  BGP Signaling

   When BGP [RFC4271] is used for distributing Network Layer
   Reachability Information (NLRI) as described in, for example,
   [RFC3107], [RFC4364] and [RFC4761], the BGP UPDATE message may
   include the ELC attribute as part of the Path Attributes.  This is an
   optional, transitive BGP attribute of type TBD (to be assigned by
   IANA).  The inclusion of this attribute with an NLRI indicates that
   the advertising BGP router can process entropy labels as an egress
   LSR for all routes in that NLRI.

   A BGP speaker S that originates an UPDATE should include the ELC
   attribute only if both of the following are true:

   A1:  S sets the BGP NEXT_HOP attribute to itself; AND

   A2:  S can process entropy labels.

   Suppose a BGP speaker T receives an UPDATE U with the ELC attribute.
   T has two choices.  T can simply re-advertise U with the ELC
   attribute if either of the following is true:

   B1:  T does not change the NEXT_HOP attribute; OR

   B2:  T simply swaps labels without popping the entire label stack and
        processing the payload below.

   An example of the use of B1 is Route Reflectors; an example of the
   use of B2 is illustrated in Section 9.3.1.2.

   However, if T changes the NEXT_HOP attribute for U and in the data
   plane pops the entire label stack to process the payload, T MAY
   include an ELC attribute for UPDATE U' if both of the following are
   true:

   C1:  T sets the NEXT_HOP attribute of U' to itself; AND

   C2:  T can process entropy labels.

   Otherwise, T MUST remove the ELC attribute.
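
   The rules A1/A2, B1/B2 and C1/C2 above can be summarized in a short
   sketch.  The function and parameter names are invented for this
   example, and the "MAY" in the C1/C2 case is modeled, as a simplifying
   assumption, as always including the attribute when permitted.

```python
def originate_with_elc(sets_next_hop_self, can_process_el):
    """Rules A1 and A2: an originating speaker S includes the ELC
    attribute only if it sets NEXT_HOP to itself and can process ELs."""
    return sets_next_hop_self and can_process_el


def readvertise_with_elc(changes_next_hop, swaps_labels_only,
                         sets_next_hop_self, can_process_el):
    """Decide whether speaker T keeps an ELC attribute it received."""
    # B1: NEXT_HOP unchanged (e.g. a Route Reflector), or
    # B2: T only swaps labels and never processes the payload.
    if not changes_next_hop or swaps_labels_only:
        return True
    # T pops the whole stack: the attribute now describes T itself.
    # C1 and C2 permit (MAY) inclusion; otherwise T MUST remove it.
    return sets_next_hop_self and can_process_el
```

   For example, a Route Reflector that leaves NEXT_HOP untouched passes
   the attribute on regardless of its own capabilities, whereas a
   speaker that pops the full label stack and cannot process entropy
   labels removes it.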

5.3.  RSVP-TE Signaling

   Entropy Label support is signaled in RSVP-TE [RFC3209] using the
   Entropy Label Capability (ELC) flag in the Attribute Flags TLV of the
   LSP_ATTRIBUTES object [RFC5420].  The presence of the ELC flag in a
   Path message indicates that the ingress can process entropy labels in
   the upstream direction; this only makes sense for a bidirectional LSP
   and MUST be ignored otherwise.  The presence of the ELC flag in a
   Resv message indicates that the egress can process entropy labels in
   the downstream direction.

   The bit number for the ELC flag is to be assigned by IANA.
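
   The Path/Resv semantics of the ELC flag in Section 5.3 can be
   sketched as a small decision function.  This is only an illustration:
   the function name, its parameters and the returned strings are
   invented for this example, and no bit position is assumed since it is
   to be assigned by IANA.

```python
def elc_flag_meaning(message, elc_flag, bidirectional):
    """Interpret the ELC flag carried in a Path or Resv message.

    Returns a short description, or None when the flag is absent or
    has to be ignored.
    """
    if not elc_flag:
        return None
    if message == "Resv":
        # Signaled by the egress for the downstream direction.
        return "egress can process ELs (downstream)"
    if message == "Path":
        # Only meaningful for a bidirectional LSP; ignored otherwise.
        if bidirectional:
            return "ingress can process ELs (upstream)"
        return None
    raise ValueError("expected 'Path' or 'Resv'")
```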

6.  Operations, Administration, and Maintenance (OAM) and Entropy Labels

   Generally OAM comprises a set of functions operating in the data
   plane to allow a network operator to monitor its network
   infrastructure and to implement mechanisms in order to enhance the
   general behavior and the level of performance of its network, e.g.,
   the efficient and automatic detection, localization, diagnosis and
   handling of defects.

   Currently defined OAM mechanisms for MPLS include LSP Ping/Traceroute
   [RFC4379] and Bidirectional Forwarding Detection (BFD) for MPLS
   [RFC5884].  The latter provides connectivity verification between the
   endpoints of an LSP, and recommends establishing a separate BFD
   session for every path between the endpoints.

   The LSP traceroute procedures of [RFC4379] allow an ingress LSR to
   obtain label ranges that can be used to send packets on every path to
   the egress LSR.  It works by having the ingress LSR sequentially ask
   the transit LSRs along a particular path to a given egress LSR to
   return a label range such that the inclusion of a label in that range
   in a packet will cause the replying transit LSR to send that packet
   out the egress interface for that path.  The ingress provides the
   label range returned by transit LSR N to transit LSR N + 1, which
   returns a label range which is less than or equal in span to the
   range provided to it.  This process iterates until the penultimate
   transit LSR replies to the ingress LSR with a label range that is
   acceptable to it and to all LSRs along the path preceding it for
   forwarding a packet along the path.

   However, the LSP traceroute procedures do not specify where in the
   label stack the value from the label range is to be placed, whether
   deep packet inspection is allowed and if so, which keys and key
   values are to be used.

   This memo updates LSP traceroute by specifying that the value from
   the label range is to be placed in the entropy label.  Deep packet
   inspection is thus not necessary, although an LSR may use it,
   provided it does so consistently, i.e., if the label range to go to
   a given downstream LSR is computed with deep packet inspection, then
   the data path should use the same approach and the same keys.

   In order to have a BFD session on a given path, a value from the
   label range for that path should be used as the EL value for BFD
   packets sent on that path.
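
   The hop-by-hop narrowing of the label range can be sketched as
   follows.  As a simplifying assumption, each transit LSR is modeled as
   accepting one contiguous range of values for the path in question;
   real implementations may return other encodings.  The function names
   and the numeric ranges are invented for this example.

```python
def narrow(offered, acceptable):
    """Intersect the offered range with what one transit LSR accepts.

    Ranges are inclusive (low, high) tuples; returns None if disjoint.
    """
    low = max(offered[0], acceptable[0])
    high = min(offered[1], acceptable[1])
    return (low, high) if low <= high else None


def discover_range(initial, per_hop_acceptable):
    """Feed the range returned by LSR N to LSR N + 1, hop by hop."""
    current = initial
    for acceptable in per_hop_acceptable:
        current = narrow(current, acceptable)
        if current is None:
            return None  # no single value follows this path end to end
    return current
```

   Any value from the resulting range could then be used as the EL
   value for LSP traceroute or BFD packets intended for that path.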

   As part of the MPLS-TP work, an in-band OAM channel is defined in
   [RFC5586].  Packets sent in this channel are identified with a
   reserved label, the Generic Associated Channel Label (GAL), placed at
   the bottom of the MPLS label stack.  In order to use the in-band OAM
   channel with entropy labels, this memo relaxes the restriction that
   the GAL must be at the bottom of the MPLS label stack.  Rather, the
   GAL is placed in the MPLS label stack above the entropy label so that
   it effectively functions as an application label.

7.  MPLS-TP and Entropy Labels

   Since MPLS-TP does not use ECMP, entropy labels are not applicable to
   an MPLS-TP deployment.

8.  Point-to-Multipoint LSPs and Entropy Labels

   Point-to-Multipoint (P2MP) LSPs [RFC4875] typically do not use ECMP
   for load balancing, as the combination of replication and
   multipathing can lead to duplicate traffic delivery.  However, P2MP
   LSPs can traverse bundled links [RFC4201] and LAGs.  In both these
   cases, load balancing is useful, and hence entropy labels can be of
   value for P2MP LSPs.

   There is a potential complication with the use of entropy labels in
   the context of P2MP LSPs, a consequence of the fact that the entire
   label stack below the P2MP label must be the same for all egress
   LSRs.  Specifically, all egress LSRs must be willing to receive
   entropy labels; if even one egress LSR is not willing, then entropy
   labels MUST NOT be used for this P2MP LSP.

   In this regard, the ingress LSR MUST keep track of the ability of
   each egress LSR to process entropy labels, especially since the set
   of egress LSRs of a given P2MP LSP may change over time.  Whenever an
   existing egress LSR leaves, or a new egress LSR joins the P2MP LSP,
   the ingress MUST re-evaluate whether or not to include entropy labels
   for the P2MP LSP.
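
   The bookkeeping required of the ingress LSR can be sketched as
   follows.  The class and method names are invented for this example;
   the sketch assumes that the ingress learns each egress LSR's
   capability through the signaling of Section 5.

```python
class P2MPIngress:
    """Tracks per-egress Entropy Label Capability for one P2MP LSP."""

    def __init__(self):
        self.egress_elc = {}  # egress LSR name -> bool (is it ELC?)

    def use_entropy_labels(self):
        # Entropy labels may be used only if every current egress LSR
        # is willing to receive them (and there is at least one egress).
        return bool(self.egress_elc) and all(self.egress_elc.values())

    def join(self, egress, elc):
        self.egress_elc[egress] = bool(elc)
        return self.use_entropy_labels()  # re-evaluate on every join

    def leave(self, egress):
        self.egress_elc.pop(egress, None)
        return self.use_entropy_labels()  # re-evaluate on every leave
```

   A single non-capable egress joining the LSP turns entropy labels off
   for all egresses; they can be turned back on once it leaves.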

   In some cases, it may be feasible to deploy two P2MP LSPs, one to ELC
   egress LSRs, and the other to the remaining non-ELC egress LSRs.
   However, this requires more state in the network, more bandwidth, and
   more operational overhead (tracking ELC LSRs, and provisioning P2MP
   LSPs accordingly).  Alternatively, an ingress LSR may choose to
   signal two separate P2MP LSPs, one to ELC egresses, the other to
   non-ELC egresses, trading off implementation complexity for
   operational complexity.

9.  Entropy Labels and Applications in Various Scenarios

   This section describes the use of entropy labels in various
   scenarios.

   In the figures below, the following conventions are used to depict
   processing between X and Y.  Note that control plane signaling goes
   right to left, whereas data plane processing goes left to right.

   Protocols
   Y:        <--- [L, E]                         Y signals L to X
       X ------------- Y
   LS:   <L, ELI, EL>                            Label stack
   X:  +<L, ELI, EL>                             X pushes <L, ELI, EL>
   Y:                  -<L, ELI, EL>             Y pops <L, ELI, EL>

   This means that Y signals to X label L for an LDP tunnel.  E can be
   one of:

      0: meaning egress is NOT entropy label capable, or

      1: meaning egress is entropy label capable.

   The line with LS: shows the label stack on the wire.  Below that is
   the operation that each LSR does in the data plane, where + means
   push the following label stack, - means pop the following label
   stack, L~L' means swap L with L', and * means that the operation is
   not depicted.

9.1.  LDP Tunnel

   The following illustrates several simple intra-AS LDP tunnels.  The
   first diagram shows ultimate hop popping (UHP) with ingress inserting
   an EL, the second UHP with no ELs, the third PHP with ELs, and
   finally, PHP with no ELs, but also with an application label AL
   (which could, for example, be a VPN label).

   Note that, in all the cases below, the MPLS application does not
   matter; it may be that X pushes some more labels (perhaps for a VPN
   or VPLS) below the ones shown, and Y pops them.

   A:        <--- [TL4, 1]
   B:                     <-- [TL3, 1]
   ...
   W:                           <-- [TL1, 1]
   Y:                                        <-- [TL0, 1]
       X --------------- A --------- B ... W ---------- Y
   LS:    <TL4, ELI, EL>   <TL3,ELI,EL>      <TL0,ELI,EL>
   X:  +<TL4, ELI, EL>
   A:                    TL4~TL3
   B:                                TL3~TL2
   ...
   W:                                      TL1~TL0
   Y:                                                   -<TL0, ELI, EL>

                     LDP with UHP; ingress inserts ELs

   A:        <--- [TL4, 1]
   B:                     <-- [TL3, 1]
   ...
   W:                           <-- [TL1, 1]
   Y:                                        <-- [TL0, 1]
       X --------------- A --------- B ... W ---------- Y
   LS:        <TL4>          <TL3>              <TL0>
   X:  +<TL4>
   A:                    TL4~TL3
   B:                                TL3~TL2
   ...
   W:                                      TL1~TL0
   Y:                                                   -<TL0>

                 LDP with UHP; ingress does not insert ELs

   A:        <--- [TL4, 1]
   B:                     <-- [TL3, 1]
   ...
   W:                           <-- [TL1, 1]
   Y:                                          <-- [3, 1]
       X --------------- A --------- B ... W ---------- Y
   X:  +<TL4, ELI, EL>
   A:                    TL4~TL3
   B:                                TL3~TL2
   ...
   W:                                      -TL1
   Y:                                                   -<ELI, EL>

                     LDP with PHP; ingress inserts ELs

   A:        <--- [TL4, 1]
   B:                     <-- [TL3, 1]
   ...
   W:                           <-- [TL1, 1]
   Y:                                          <-- [3, 1]
   VPN:  <------------------------------------------ [AL]
       X --------------- A --------- B ... W ---------- Y
   LS:      <TL4, AL>      <TL3, AL>             <AL>
   X:  +<TL4, AL>
   A:                    TL4~TL3
   B:                                TL3~TL2
   ...
   W:                                      -TL1
   Y:                                                   -<AL>

              LDP with PHP + VPN; ingress does not insert ELs
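
   The data plane operations in the "LDP with PHP; ingress inserts ELs"
   case can be traced with a small sketch.  Labels are represented as
   strings, the helper names are invented for this example, and the
   LSRs elided by "..." in the diagram are collapsed, as an assumption,
   into a single hop that swaps TL2 with TL1.

```python
def push(stack, labels):
    """'+' in the diagrams: push labels (top first) onto the stack."""
    return list(labels) + list(stack)


def swap(stack, old, new):
    """'L~L'' in the diagrams: swap the top label old with new."""
    if not stack or stack[0] != old:
        raise ValueError("unexpected top label")
    return [new] + list(stack[1:])


def pop(stack, labels):
    """'-' in the diagrams: pop the given labels off the top."""
    labels = list(labels)
    if list(stack[:len(labels)]) != labels:
        raise ValueError("unexpected labels at top of stack")
    return list(stack[len(labels):])


def ldp_php_with_els():
    stack = push([], ["TL4", "ELI", "EL"])  # X: +<TL4, ELI, EL>
    stack = swap(stack, "TL4", "TL3")       # A: TL4~TL3
    stack = swap(stack, "TL3", "TL2")       # B: TL3~TL2
    stack = swap(stack, "TL2", "TL1")       # assumed stand-in for "..."
    stack = pop(stack, ["TL1"])             # W: -TL1 (penultimate hop)
    arriving_at_y = stack
    stack = pop(stack, ["ELI", "EL"])       # Y: -<ELI, EL>
    return arriving_at_y, stack
```

   In this trace the packet arrives at Y carrying only <ELI, EL>, which
   Y pops before processing whatever lies below.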

9.2.  LDP Over RSVP-TE

   The following illustrates "LDP over RSVP-TE" tunnels.  X and Y are
   the ingress and egress (respectively) of the LDP tunnel; A and W are
   the ingress and egress of the RSVP-TE tunnel.  It is assumed that
   both the LDP and RSVP-TE tunnels have PHP.

   LDP with ELs, RSVP-TE without ELs
   LDP:       <--- [L4, 1]  <------- [L3, 1]  <--- [3, 1]
   RSVP-TE:                <-- [Rn, 0]
                                  <-- [3, 0]
       X --------------- A --------- B ... W ---------- Y
   LS:    <L4, ELI, EL>   <Rn,L3,ELI,EL> ...  <ELI, EL>
   DP: +<L4, ELI, EL>    L4~<Rn, L3> *     -L1          -<ELI, EL>

                    Figure 2: LDP over RSVP-TE Tunnels

9.3.  MPLS Applications

   An ingress LSR X must keep state per unicast tunnel as to whether the
   egress for that tunnel can process entropy labels.  X does not have
   to keep state per application running over that tunnel.  However, an
   ingress PE can choose on a per-application basis whether or not to
   insert ELs.  For example, X may have an application for which it does
   not wish to use ECMP (e.g., circuit emulation), or for which it does
   not know which keys to use for load balancing (e.g., Appletalk over a
   pseudowire).  In either of those cases, X may choose not to insert
   entropy labels, but may choose to insert entropy labels for an IP VPN
   over the same tunnel.

9.3.1.  Inter-AS BGP VPNs

   There are three commonly used options for inter-AS IP VPNs and BGP
   VPLS, known informally as "Option A", "Option B" and "Option C".
   This section describes how entropy labels can be used in these
   options.

9.3.1.1.  Option A Inter-AS VPNs

   In option A, an ASBR pops the full label stack of a VPN packet
   exiting an AS, processes the payload header (IP or Ethernet), and
   forwards the packet natively (i.e., as IP or Ethernet, but not as
   MPLS) to the peer ASBR.  Thus, entropy label signaling and insertion
   are completely local to each AS.  The inter-AS paths do not use
   entropy labels, as they do not use a label stack.

9.3.1.2.  Option B Inter-AS VPNs

   The ASBRs in option B inter-AS VPNs have a choice (usually determined
   by configuration) of whether to just swap labels (from within the AS
   to the neighbor AS or vice versa), or to pop the full label stack and
   process the packet natively.  This choice occurs at each ASBR in each
   direction.  In the case of native packet processing at an ASBR,
   entropy label signaling and insertion is local to each AS and to the
   inter-AS paths (which, unlike option A, do have labeled packets).

   In the case of simple label swapping at an ASBR, the ASBR can
   propagate received entropy label signaling onward.  That is, if a PE
   signals to its ASBR that it can process entropy labels (via an ELC
   attribute), the ASBR can propagate that attribute to its peer ASBR;
   if a peer ASBR signals that it can process entropy labels, the ASBR
   can propagate that to all PEs within its AS.  Note that this is the
   case even though ASBRs change the BGP NEXT_HOP attribute to "self",
   because of clause B2 in Section 5.2.

9.3.1.3.  Option C Inter-AS VPNs

   In Option C inter-AS VPNs, the ASBRs are not involved in signaling;
   they do not have VPN state; they simply swap labels of inter-AS
   tunnels.  Signaling is PE to PE, usually via Route Reflectors;
   however, if RRs are used, the RRs do not change the BGP NEXT_HOP
   attribute.  Thus, entropy label signaling and insertion are on a
   PE-pair basis, and the intermediate routers, ASBRs and RRs do not
   play a role.
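
   The per-tunnel state and per-application choice described for
   ingress LSR X in Section 9.3 can be sketched as follows.  The
   application names, the set of applications that opt out, and the
   function name are all invented for this example; labels are
   represented as strings.

```python
ELI = "ELI"

# Hypothetical local policy: applications for which X chooses not to
# insert entropy labels (e.g. no ECMP wanted, or hash keys unknown).
APPS_WITHOUT_ELS = {"circuit-emulation", "appletalk-pw"}


def build_label_stack(tunnel_label, egress_is_elc, app, app_labels,
                      entropy_label):
    """Return the label stack X pushes, top of stack first.

    egress_is_elc is per-tunnel state; the application only influences
    whether X chooses to make use of that capability.
    """
    stack = [tunnel_label]
    if egress_is_elc and app not in APPS_WITHOUT_ELS:
        stack += [ELI, entropy_label]
    # Application labels, if any, sit below the ones shown above.
    return stack + list(app_labels)
```

   With this policy, an IP VPN packet over an ELC tunnel would carry
   <TL, ELI, EL, VL>, while a circuit emulation packet over the same
   tunnel would carry only <TL, PL>.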

10.  Security Considerations

   This document describes advertisement of the capability to support
   receipt of entropy labels which an ingress LSR may insert in MPLS
   packets in order to allow transit LSRs to attain better load
   balancing across LAG and/or ECMP paths in the network.

   This document does not introduce new security vulnerabilities to LDP,
   BGP or RSVP-TE.  Please refer to the Security Considerations section
   of these protocols ([RFC5036], [RFC4271] and [RFC3209]) for security
   mechanisms applicable to each.

   Given that there is no end-user control over the values used for
   entropy labels, there is little risk of Entropy Label forgery which
   could cause uneven load-balancing in the network.

   If Entropy Label Capability is not signaled from an egress PE to an
   ingress PE, due to, for example, malicious configuration activity on
   the egress PE, then the PE will fall back to not using entropy labels
   for load-balancing traffic over LAG or ECMP paths, which is in
   general no worse than the behavior observed in current production
   networks.  That said, it is recommended that operators monitor
   changes to PE configurations and, more importantly, the fairness of
   load distribution over LAG or ECMP paths.  If the fairness of load
   distribution over a set of paths changes, that could indicate a
   misconfiguration, bug or other non-optimal behavior on their PEs and
   they should take corrective action.

11.  IANA Considerations

11.1.  Reserved Label for ELI

   IANA is being
   re-used requested to signal entropy allocate a reserved label capability, there is little to no
   additional risk that traffic could be misdirected into an
   inappropriate IPVPN VRF or VPLS VSI at the egress PE.

   In the context of downstream-signaled entropy labels that require for the
   use of an Entropy Label
   Indicator (ELI), there should be little to no
   additional risk because the egress PE is solely responsible for
   allocating an ELI value and ensuring that ELI label value DOES NOT
   conflict with other MPLS labels it has previously allocated.  On the
   other hand, for upstream-signaled entropy labels, e.g.: RSVP-TE
   point-to-point or point-to-multipoint LSP's or Multicast LDP (mLDP)
   point-to-multipoint or multipoint-to-multipoint LSP's, there is a
   risk that the head-end MPLS LER may choose an ELI value that is
   already in use by a downstream LSR or LER.  In this case, it is the
   responsibility of (ELI) from the downstream LSR or LER to ensure that it MUST
   NOT accept signaling for an ELI value that conflicts with MPLS
   label(s) that are already in use.

11.  IANA Considerations

11.1. "Multiprotocol Label Switching Architecture
   (MPLS) Label Values" Registry.

11.2.  LDP Entropy Label Capability TLV

   IANA is requested to allocate the next available value from the IETF
   Consensus range in the LDP TLV Type Name Space Registry as the
   "Entropy Label Capability TLV".

11.3.  BGP Entropy Label Capability Attribute

   IANA is requested to allocate the next available Path Attribute Type
   Code from the "BGP Path Attributes" registry as the "BGP Entropy
   Label Capability Attribute".

11.4.  RSVP-TE Entropy Label Capability flag

   IANA is requested to allocate a new bit from the "Attribute Flags"
   sub-registry of the "RSVP TE Parameters" registry.

   Bit | Name                     | Attribute  | Attribute  | RRO
   No  |                          | Flags Path | Flags Resv |
   ----+--------------------------+------------+------------+-----
   TBD   Entropy Label Capability   Yes          Yes          No

11.5.  Attributes TLV for LSP_Attributes Object

   IANA is requested to allocate the next available value from the
   "Attributes TLV" sub-registry of the "RSVP TE Parameters" registry.

12.  Acknowledgments

   We wish to thank Ulrich Drafz for his contributions, as well as the
   entire 'hash label' team for their valuable comments and discussion.

   Sincere thanks to Nischal Sheth for his many suggestions and
   comments, and his careful reading of the document, especially with
   regard to data plane processing of entropy labels.

13.  References

13.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3032]  Rosen, E., Tappan, D., Fedorkow, G., Rekhter, Y.,
              Farinacci, D., Li, T., and A. Conta, "MPLS Label Stack
              Encoding", RFC 3032, January 2001.

   [RFC3107]  Rekhter, Y. and E. Rosen, "Carrying Label Information in
              BGP-4", RFC 3107, May 2001.

   [RFC3209]  Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V.,
              and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP
              Tunnels", RFC 3209, December 2001.

   [RFC5420]  Farrel, A., Papadimitriou, D., Vasseur, JP., and A.
              Ayyangar, "Encoding of Attributes for MPLS LSP
              Establishment Using Resource Reservation Protocol Traffic
              Engineering (RSVP-TE)", RFC 5420, February 2009.

13.2.  Informative References

   [I-D.ietf-pwe3-fat-pw]
              Bryant, S., Filsfils, C., Drafz, U., Kompella, V., Regan,
              J., and S. Amante, "Flow Aware Transport of Pseudowires
              over an MPLS Packet Switched Network",
              draft-ietf-pwe3-fat-pw-07 (work in progress), July 2011.

   [RFC4201]  Kompella, K., Rekhter, Y., and L. Berger, "Link Bundling
              in MPLS Traffic Engineering (TE)", RFC 4201, October 2005.

   [RFC4271]  Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
              Protocol 4 (BGP-4)", RFC 4271, January 2006.

   [RFC4364]  Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
              Networks (VPNs)", RFC 4364, February 2006.

   [RFC4379]  Kompella, K. and G. Swallow, "Detecting Multi-Protocol
              Label Switched (MPLS) Data Plane Failures", RFC 4379,
              February 2006.

   [RFC4447]  Martini, L., Rosen, E., El-Aawar, N., Smith, T., and G.
              Heron, "Pseudowire Setup and Maintenance Using the Label
              Distribution Protocol (LDP)", RFC 4447, April 2006.

   [RFC4761]  Kompella, K. and Y. Rekhter, "Virtual Private LAN Service
              (VPLS) Using BGP for Auto-Discovery and Signaling",
              RFC 4761, January 2007.

   [RFC4762]  Lasserre, M. and V. Kompella, "Virtual Private LAN Service
              (VPLS) Using Label Distribution Protocol (LDP) Signaling",
              RFC 4762, January 2007.

   [RFC4875]  Aggarwal, R., Papadimitriou, D., and S. Yasukawa,
              "Extensions to Resource Reservation Protocol - Traffic
              Engineering (RSVP-TE) for Point-to-Multipoint TE Label
              Switched Paths (LSPs)", RFC 4875, May 2007.

   [RFC5036]  Andersson, L., Minei, I., and B. Thomas, "LDP
              Specification", RFC 5036, October 2007.

   [RFC5586]  Bocci, M., Vigoureux, M., and S. Bryant, "MPLS Generic
              Associated Channel", RFC 5586, June 2009.

   [RFC5884]  Aggarwal, R., Kompella, K., Nadeau, T., and G. Swallow,
              "Bidirectional Forwarding Detection (BFD) for MPLS Label
              Switched Paths (LSPs)", RFC 5884, June 2010.

Appendix A.  Applicability of LDP Entropy Label Capability TLV

   In the case of unlabeled IPv4 (Internet) traffic, the Best Current
   Practice is for an egress LSR to propagate eBGP learned routes within
   a SP's Autonomous System after resetting the BGP next-hop attribute
   to one of its Loopback IP addresses.  That Loopback IP address is
   injected into the Service Provider's IGP and, concurrently, a label
   assigned to it via LDP.  Thus, when an ingress LSR is performing a
   forwarding lookup for a BGP destination it recursively resolves the
   associated next-hop to a Loopback IP address and associated LDP label
   of the egress LSR.

   Thus, in the context of unlabeled IPv4 traffic, the LDP Entropy Label
   Capability TLV will typically be applied only to the FEC for the
   Loopback IP address of the egress LSR and the egress LSR need not
   announce an entropy label capability for the eBGP learned route.
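
   The recursive resolution described in this appendix can be sketched
   as follows.  This is a non-normative illustration only: the table
   names, prefixes, label value and field names are hypothetical and do
   not correspond to any real implementation.

```python
# Hypothetical state on an ingress LSR; all values are illustrative.
# eBGP-learned prefix -> BGP next hop (the egress LSR's Loopback address)
BGP_ROUTES = {"203.0.113.0/24": "192.0.2.1"}
# Loopback FEC -> LDP label binding, with the capability learned via LDP
LDP_FECS = {"192.0.2.1": {"label": 1001, "entropy_label_capable": True}}


def resolve(prefix):
    """Resolve a BGP prefix to the egress LSR's LDP binding.

    The entropy label capability is read from the Loopback FEC, not
    from the eBGP-learned route itself.
    """
    next_hop = BGP_ROUTES[prefix]
    binding = LDP_FECS[next_hop]
    return next_hop, binding["label"], binding["entropy_label_capable"]
```

   In this sketch the ingress LSR learns whether it may insert an
   entropy label solely from the binding for the Loopback FEC, which is
   why no capability needs to accompany the eBGP learned route.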

Authors' Addresses

   Kireeti Kompella
   Juniper Networks
   1194 N. Mathilda Ave.
   Sunnyvale, CA  94089
   US

   Email: kireeti@juniper.net

   John Drake
   Juniper Networks
   1194 N. Mathilda Ave.
   Sunnyvale, CA  94089
   US

   Email: jdrake@juniper.net

   Shane Amante
   Level 3 Communications, LLC
   1025 Eldorado Blvd
   Broomfield, CO  80021
   US

   Email: shane@level3.net

   Wim Henderickx
   Alcatel-Lucent
   Copernicuslaan 50
   2018 Antwerp
   Belgium

   Email: wim.henderickx@alcatel-lucent.com

   Lucy Yong
   Huawei USA
   5340 Legacy Dr.
   Plano, TX  75024
   US

   Email: lucy.yong@huawei.com