Network Working Group                                       A. Bashandy
Internet Draft                                             B. Pithawala
Intended status: Standards Track                               K. Patel
Expires: September 2012                                   Cisco Systems
                                                          March 2, 2012

          Scalable BGP FRR Protection against Edge Node Failure
                 draft-bashandy-bgp-edge-node-frr-02.txt


Abstract

Consider a BGP free core scenario. Suppose the edge BGP speakers PE1,
PE2,..., PEn know about a prefix P/m via the external routers CE1,
CE2,..., CEm.  If the edge router PEi crashes or becomes totally
disconnected from the core, it is desirable for a core router "P"
carrying traffic to the failed edge router PEi to immediately restore
traffic by re-tunneling packets originally tunneled to PEi and
destined to the prefix P/m to one of the other edge routers that
advertised P/m, say PEj, until BGP re-converges. In doing so, it is
highly desirable to keep the core BGP-free while not imposing
restrictions on external connectivity. Thus (1) a core router should
not be required to learn any BGP prefix, (2) the size of the
forwarding and routing tables in the core routers should be
independent of the number of BGP prefixes, (3) there should be no
special router (or group of routers) that handles restoring traffic
or the need for one router to store the forwarding table of another
router, (4) re-routing traffic without waiting for re-convergence
must not cause loops, and (5) there should be no restrictions on
which edge routers advertise which prefixes. For labeled prefixes,
(6) the repairing core router must swap the label stack advertised by
the failed edge router PEi for the prefix P/m with the label stack
advertised for the same prefix by the edge router PEj before re-
tunneling the packet to PEj.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008. The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s)
   controlling the copyright in such materials, this document may not
   be modified outside the IETF Standards Process, and derivative
   works of it may not be created outside the IETF Standards Process,




Bashandy              Expires September 2, 2012                [Page 1]


Internet-Draft      BGP FRR For Edge Node Failure            March 2012

   except to format it for publication as an RFC or to translate it
   into languages other than English.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time.  It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on September 5, 2012.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document. Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.



Table of Contents

   1. Introduction...................................................3
      1.1. Conventions used in this document.........................4
      1.2. Terminology...............................................5
      1.3. Problem definition........................................6
   2. Control Plane Operation........................................7
      2.1. Step 1: Calculation of the Repair PE......................8
      2.2. Step 2: Assigning and Advertising the BGP Next-hop........8
      2.3. Step 3: Informing core routers about the repair path......9
      2.4. Step 4: How a repairing P router (a core router) programs
           its forwarding plane.....................................10
   3. Rules for Choosing and Managing The Repair path...............11
      3.1. General Rules for Managing the Repair Path...............11
      3.2. Rules for the "Push" Flag................................12
      3.3. Rules for Choosing the Repair Path for Labeled Prefixes..13
   4. Forwarding Plane Operation....................................14
   5. Inter-operability with Existing IP FRR Mechanisms.............15
   6. Example.......................................................16
   7. Security Considerations.......................................18
   8. IANA Considerations...........................................18
   9. Conclusions...................................................18
   10. References...................................................18
      10.1. Normative References....................................18
      10.2. Informative References..................................18
   11. Acknowledgments..............................................19
   Appendix A. Changes from Version 01..............................20

1. Introduction

   In a BGP free core, where traffic is tunneled between edge routers,
   BGP speakers advertise reachability information about prefixes to
   other edge routers, not to core routers. For labeled address
   families, namely AFI/SAFI 1/4, 2/4, 1/128, and 2/128, an edge
   router assigns local labels to prefixes and associates the local
   label with each advertised prefix, as in L3VPN [6], 6PE [7], and
   Softwire [5]. Suppose that a given edge router is chosen as the
   best next-hop for a prefix P/m. An ingress router that receives a
   packet from an external router and destined to the prefix P/m
   "tunnels" the packet across the core to that egress router. If the
   prefix P/m is a labeled prefix, the ingress router pushes the label
   advertised by the egress router before tunneling the packet to the
   egress router. Upon receiving the packet from the core, the egress
   router takes the appropriate forwarding decision based on the
   content of the packet or the label pushed on the packet.

   In modern networks, it is not uncommon to have a prefix reachable
   via multiple edge routers. One example is the best external path
   [4]. Another more common and widely deployed scenario is L3VPN [6]
   with multi-homed VPN sites. As an example, consider the L3VPN
   topology depicted in Figure 1.

                                PE1 .............+
                                                |
                                       +--------+---------------+
                                       |                        |
                                       |   VPN 1 Network        |
                                       |                        |
                                       |            VPN prefix  |
                                       |           (10.0.0.0/8) |
                                       |                        |
                                       +---+--------------------+
                                           |
                                   /------CE1
                                  /
                                 /
    BGP-free core      P--------PE0
                                 \
                                  \
                                   \------CE2
                                           |
                                       +---+--------------------+
                                       |                        |
                                       |   VPN 2 Network        |
                                       |                        |
                                       |            VPN prefix  |
                                       |           (20.0.0.0/8) |
                                       |                        |
                                       +--------+---------------+
                                                |
                               PE2 .............+

             Figure 1 VPN prefix reachable via multiple PEs



   As illustrated in Figure 1, the edge router PE0 is the primary
   next-hop for both 10.0.0.0/8 and 20.0.0.0/8. At the same time, both
   10.0.0.0/8 and 20.0.0.0/8 are reachable through the other edge
   routers PE1 and PE2, respectively.

   1.1. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
   this document are to be interpreted as described in RFC-2119 [1].

   In this document, these words will appear with that interpretation
   only when in ALL CAPS. Lower case uses of these words are not to be
   interpreted as carrying RFC-2119 significance.

   1.2. Terminology

   This section defines the terms used in this document. For ease of
   use, we will use terms similar to those used by L3VPN [6].

   o  BGP-Free core: A network where BGP prefixes are only known to
      the edge routers and traffic is tunneled between edge routers

   o  Protected prefix: It is a prefix P/m (of any AFI) that a BGP
      speaker has an external path to. The BGP speaker may learn about
      the prefix from an external peer through BGP, some other
      protocol, or manual configuration. The protected prefix is
      advertised to some or all of the internal peers.

   o  Primary egress PE (Primary PE for simplicity): It is an IBGP
      peer that can reach the protected prefix P/m through an external
      path and advertised the prefix to the other IBGP peers. The
      primary egress PE was chosen as the best path by one or more
      internal peers. In other words, the primary egress PE is an
      egress PE that will normally be used by some ingress PEs when
      there is no failure. Referring to Figure 1, PE0 is a primary
      egress PE.

   o  Primary next-hop: It is an IPv4 or IPv6 host address belonging
      to the primary egress PE. If the prefix is advertised via BGP,
      then the primary next-hop is the next-hop attribute in the BGP
      update message [2][3].

   o  CE: It is an external router through which an egress PE can
      reach a prefix P/m. The routers "CE1" and "CE2" in Figure 1 are
      examples of such CEs.

   o  Ingress PE: It is a BGP speaker that learns about a prefix
      through another IBGP peer and chooses that IBGP peer as the
      next-hop for the prefix.

   o  Repairing P router (Also "Repairing core router" and "repairing
      router"): A core router that attempts to restore traffic when
      the primary egress PE is no longer reachable without waiting for
      IGP or BGP to re-converge. The repairing P router restores the
      traffic by rerouting the traffic (through a tunnel) towards the
      pre-calculated repair PE when it detects that the primary egress
      PE is no longer reachable. Referring to Figure 1, the router "P"
      is the repairing P router.

   o  Repair egress PE (Repair PE for simplicity): It is an egress PE
      other than the primary egress PE that can reach the protected
      prefix P/m through an external neighbor. The repair PE is pre-
      calculated via other PEs prior to any failure. Referring to
      Figure 1, PE1 is the repair PE for 10.0.0.0/8 while PE2 is the
      repair PE for 20.0.0.0/8.

   o  Underlying Repair label stack: The underlying repair label stack
      is the label stack that will be pushed or swapped in when the
      repairing P router re-tunnels traffic to the repair PE after
      detecting that the primary egress PE is no longer reachable.

   o  Protected egress PE (Protected PE for simplicity): Any primary
      egress PE protected by a repairing P router.

   o  Protected edge router: Any protected egress PE.

   o  Repair next-hop: It is an IPv4 or IPv6 host address belonging to
      the repair egress PE. If the protected prefix is advertised via
      BGP, then the repair next-hop MAY be the next-hop attribute in
      the BGP update message [2][3].

   o  Repair path (Also Repair Egress Path): It is the repair next-
      hop. If an underlying repair label exists, the repair path is
      the repair next-hop together with the underlying repair label
      that will be pushed or swapped in when the repairing P router
      reroutes traffic to the repair PE.

   o  Primary tunnel: It is the tunnel from the ingress PE to the
      primary egress PE

   o  Repair tunnel: It is the tunnel from the repairing P router to
      the repair egress PE

   1.3. Problem definition

   The problem that we are trying to solve is as follows:

   o  Even though multiple prefixes may share the same egress router,
      they may have different repair edge routers. In Figure 1 above,
      both 10.0.0.0/8 and 20.0.0.0/8 share the same primary next-hop
      PE0, but the routing protocol(s) must identify that the repair
      node for 10.0.0.0/8 is PE1 while the repair node for 20.0.0.0/8
      is PE2.

   o  On losing connection to the edge router, the core router "P"
      MUST reroute traffic towards the *correct* repair edge router
      without waiting for IGP or BGP to re-converge and update the
      routing tables. On the failure of PE0 illustrated in Figure 1,
      the core router P needs to reroute traffic for 10.0.0.0/8
      towards PE1 and traffic for 20.0.0.0/8 towards PE2.

   o  The repairing core router P MUST NOT be forced to learn about
      the BGP prefixes on any of the edge routers. The same applies to
      all core routers.

   o  There SHOULD NOT be a need for a special router or group of
      routers to handle rerouting traffic on edge node failure. Also a
      router SHOULD NOT be forced to learn or mirror part or all of
      the routing or forwarding table of other router(s).

   o  The size of the routing table on any core router MUST be
      independent of the number of BGP prefixes in the network.

   o  Rerouting traffic without waiting for IGP and BGP to re-converge
      after a failure MUST NOT cause loops.

   o  For labeled prefixes, the repair edge router MUST forward the
      re-routed traffic correctly to the external neighbor. Thus the
      core router MUST either swap the label stack advertised by the
      primary egress edge router (PE0 in Figure 1) with the underlying
      repair label stack advertised by the repair edge router (PE1
      and PE2 in Figure 1) or push a label on top of the label stack
      advertised by the primary edge router to allow the repair edge
      router to forward the packet correctly.

2. Control Plane Operation

   This section specifies the control plane operation needed to solve
   the problem defined in Section 1.3. The control plane operation
   covers both labeled (AFI/SAFI 1/4, 2/4, 1/128, and 2/128) and
   unlabeled (AFI/SAFI 1/1, 2/1, 1/2, and 2/2) protected prefixes.

   The control plane operation can be summarized in 4 steps:

   1. Calculation of the repair PE

   2. Assigning and advertising the next-hop for protected prefixes

   3. Informing core routers about repair paths

   4. How a repairing P router (a core router) programs its forwarding
      plane



   2.1. Step 1: Calculation of the Repair PE

   1. Consider the prefix P/m learnt by an egress edge router PEi via
      an external neighbor. If the prefix P/m is labeled, then PEi
      allocates a local label for each prefix it can reach through an
      external neighbor CE. PEi MAY also allocate a repair label
      as specified in [11].

   2. Each edge router advertises the unlabeled prefix P/m (belonging
      to AFI/SAFI 1/1, 2/1, 1/2, and 2/2) and possibly a repair label
      [11] to some or all of its IBGP peers. If P/m is a labeled
      prefix (belonging to AFI/SAFI 1/4, 2/4, 1/128, and 2/128), the
      edge router also advertises the local label.

   3. As a result, an edge router PEi having an external path to the
      prefix P/m may learn about the prefix through other IBGP peers.

   4. The edge router PEi MAY choose a repair PE for the external
      prefix P/m from among the IBGP peers advertising the prefix P/m.
      Rules for choosing the repair PE are specified in Section 3.

   5. The edge router PEi chooses one of the labels advertised by the
      other edge router PEj for the prefix P/m as the "underlying
      repair label". The algorithm for choosing the underlying repair
      label is specified in Section 3.

   6. In the end, if the edge router PEi can reach the prefix P/m
      through an external path and the prefix P/m is advertised by at
      least one other PE, the edge router PEi will have

       o a primary path towards the CE through which the prefix is
          reachable, and

       o a repair path consisting of a repair PE and possibly an
          underlying repair label advertised by the chosen repair PE.
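
   The outcome of Step 1 can be sketched as follows. This is only an
   illustration under stated assumptions: the function name, the
   dictionary keys, and the "first matching peer wins" policy are
   hypothetical, since the method of choosing among several candidate
   repair PEs is outside the scope of this document.

```python
def choose_repair_path(prefix, local_pe, adverts):
    """Pick (repair PE, underlying repair label) for one prefix.

    adverts: iterable of dicts with keys "pe", "prefix", and optional
             "label" and "repair_label", as learnt over IBGP.
    Returns None when no other PE advertises the prefix.
    """
    for adv in adverts:
        # The repair PE must be a different PE advertising the same prefix.
        if adv["prefix"] != prefix or adv["pe"] == local_pe:
            continue
        # Prefer a dedicated repair label; fall back to the primary label.
        label = adv.get("repair_label")
        if label is None:
            label = adv.get("label")
        return adv["pe"], label
    return None


# Hypothetical advertisements seen by PE0 in a Figure 1 style topology.
adverts = [
    {"pe": "PE1", "prefix": "10.0.0.0/8", "label": 17, "repair_label": 99},
    {"pe": "PE2", "prefix": "20.0.0.0/8", "label": 18},
]
repair_for_vpn1 = choose_repair_path("10.0.0.0/8", "PE0", adverts)
```

   For an unlabeled prefix with no repair label, the second element of
   the result is None and the repair path is the repair PE alone.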

   2.2. Step 2: Assigning and Advertising the BGP Next-hop

   1. The edge router PEi groups all BGP prefixes for which PEi has an
      external path and a repair path as follows:

       a. If the edge router received a repair label from the repair
          PE: Two prefixes belong to the same group Gi if they share
          the same repair PE and underlying repair label

       b. If the edge router did NOT receive a repair label from the
          repair PE: Two prefixes belong to the same group if they
          share the same repair PE



   2. For each group, the PE assigns a distinct local next-hop. Thus
      if a prefix P/m belongs to group Gi, its primary next-hop is
      NHi.

   3. When advertising the prefix and its primary next-hop to its IBGP
      peers, the PE router uses NHi as the next-hop attribute of
      prefixes belonging to the group Gi.
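
   The grouping and next-hop assignment of this step can be sketched
   as follows. The function name, the data layout, and the example
   addresses are hypothetical; how a PE allocates its local next-hop
   addresses is not specified here.

```python
from collections import defaultdict


def assign_next_hops(protected, next_hop_pool):
    """Group protected prefixes and give each group its own next-hop.

    protected: dict mapping prefix -> (repair_pe, repair_label or None).
    next_hop_pool: list of local host addresses usable as next-hops
                   (an assumption made for this sketch).
    Returns a dict mapping prefix -> primary next-hop NHi.
    """
    groups = defaultdict(list)
    for prefix, (repair_pe, repair_label) in protected.items():
        # Case (a): key on (repair PE, label). Case (b): the label is
        # None, so the key reduces to the repair PE alone.
        groups[(repair_pe, repair_label)].append(prefix)

    if len(groups) > len(next_hop_pool):
        raise ValueError("not enough local next-hop addresses")

    result = {}
    # Sort the group keys only to make the assignment deterministic.
    for next_hop, key in zip(next_hop_pool, sorted(groups, key=repr)):
        for prefix in groups[key]:
            result[prefix] = next_hop
    return result


nh = assign_next_hops(
    {
        "10.0.0.0/8": ("PE1", 100),
        "30.0.0.0/8": ("PE1", 100),
        "20.0.0.0/8": ("PE2", 200),
    },
    ["192.0.2.1", "192.0.2.2"],
)
```

   Two prefixes repaired by the same PE with the same label share one
   next-hop, so the number of next-hops follows the number of groups,
   not the number of prefixes.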

   2.3. Step 3: Informing core routers about the repair path

   In step 2 (Section 2.2), the egress PE assigns a primary next-hop
   NHi and a repair path (consisting of a repair next-hop and possibly
   an underlying repair label) for protected prefixes belonging to
   group Gi. The core routers are then informed as follows:

   1. The primary next-hop NHi is advertised into IGP as usual.

   2. The repair next-hop is a host address local to or advertised by
      the repair PE. The repair next-hop MAY be the BGP next-hop
      attribute advertised for the prefix P/m by the repair edge
      router PEj or some other, possibly future, attribute in the BGP
      advertisement. Denote the repair next-hop for prefixes belonging
      to group Gi by "rNHi". Denote the underlying repair label for
      prefixes belonging to group Gi by "rLi". Because rNHi is the
      next-hop attribute advertised by the repair PE, rNHi will also
      be known to all core routers via IGP.

   3. The repairing egress PE MUST advertise the pair (NHi, rNHi) OR
      the quadruple (NHi, rNHi, rLi, Push) to core routers that are
      designated as candidate repairing routers. Note that designating
      a core router as a candidate repairing router may be subject to
      administrative actions and/or policy. For example, an
      administrator may limit candidate repairing routers to only core
      routers that are directly connected to edge routers. The
      mechanism by which a core router is designated as a candidate
      repair router is beyond the scope of this document.

   4. The pair (NHi, rNHi) or the quadruple (NHi, rNHi, rLi, Push) may
      be advertised through various means, such as an optional IS-IS
      TLV. The structure and method of advertising (NHi, rNHi) and/or
      (NHi, rNHi, rLi, Push) are beyond the scope of this document.

   5. The semantics of the pair (NHi, rNHi) are: If the next-hop NHi
      becomes unreachable, then traffic tunneled to the next-hop NHi
      SHOULD be re-tunneled to the next-hop rNHi because rNHi can
      reach protected prefixes reachable via the next-hop NHi.

   6. The semantics of the quadruple (NHi, rNHi, rLi, Push) are: If
      the next-hop NHi becomes unreachable, then traffic destined to
      the next-hop NHi should be re-tunneled to the next-hop rNHi and

       a. If the PUSH flag is cleared, the label pushed by the ingress
          PE MUST be swapped with the label rLi before re-tunneling to
          the repair PE, irrespective of the value of the label pushed
          by the ingress PE.

       b. If the PUSH flag is set, then the label rLi MUST be pushed
          on the packet before re-tunneling to the repair PE.

   7. Because of the previous steps, candidate repairing core routers
      become aware of the repair path for protected BGP prefixes
      reachable via the primary egress edge router. Note that all core
      routers remain totally unaware of the BGP prefixes.
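
   The semantics of the pair and the quadruple can be modelled as
   follows. This is a sketch only: the class and function names are
   hypothetical, and the encoding used on the wire is outside the
   scope of this document.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RepairAdvertisement:
    nh: str                      # primary next-hop NHi
    rnh: str                     # repair next-hop rNHi
    label: Optional[int] = None  # underlying repair label rLi, if any
    push: Optional[bool] = None  # "Push" flag, present only with a label

    def __post_init__(self):
        # A pair carries neither label nor flag; a quadruple carries both.
        if (self.label is None) != (self.push is None):
            raise ValueError("label and Push flag must appear together")


def on_unreachable(adv):
    """Return (label operation, label, new tunnel destination)."""
    if adv.label is None:
        return ("none", None, adv.rnh)       # pair: re-tunnel only
    if adv.push:
        return ("push", adv.label, adv.rnh)  # "Push" flag set
    return ("swap", adv.label, adv.rnh)      # "Push" flag cleared


pair = RepairAdvertisement("NH1", "rNH1")
quad = RepairAdvertisement("NH2", "rNH2", label=300, push=False)
```

   Note that nothing in the advertisement refers to a BGP prefix, so a
   core router holding it learns only next-hops and labels.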

   2.4. Step 4: How a repairing P router (a core router) programs its
      forwarding plane

   1. Through usual IGP mechanisms, the P router has a prefix matching
      every BGP next-hop. Let the primary next-hop NHi match the IGP
      route Ri

   2. Any next-hop of prefix Ri is on the path towards the protected
      egress edge router PEi. A next-hop of the prefix Ri is considered
      the primary path for the prefix Ri

   3. Thus the FIB entry for Ri is programmed as follows

       a. Primary path: All next hop routers on the path towards NHi

       b. Repair path when the candidate repair router receives the
          pair (NHi,rNHi)

           i. Primary next-hop: the next router on the path towards
               NHi

          ii. Repair next-hop: the next-router on the path towards
               rNHi

       c. Repair path when the candidate repair router receives the
          quadruple (NHi, rNHi, rLi, Push)

           i. If the "Push" flag is *cleared*

                    Pop the label in the packet right under the tunnel
                    header (irrespective of the value of that label)

               endif

          ii. Push the underlying repair label rLi

         iii. Re-tunnel the packet towards the repair next-hop rNHi
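
   The FIB entry described above can be sketched as a small data
   structure. The function name, the dictionary keys, and the action
   tuples are hypothetical; real forwarding planes encode this state
   in implementation-specific ways.

```python
def program_fib_entry(primary_hops, rnh, repair_hops,
                      repair_label=None, push=False):
    """Build a FIB entry for the IGP route Ri that matches NHi.

    primary_hops: next routers on the path towards NHi.
    rnh:          repair next-hop rNHi (the new tunnel destination).
    repair_hops:  next routers on the path towards rNHi.
    repair_label, push: supplied only when a quadruple was received.
    """
    actions = []
    if repair_label is not None:
        if not push:
            # "Push" cleared: pop the label under the tunnel header,
            # whatever its value, so that pop + push acts as a swap.
            actions.append(("pop",))
        actions.append(("push", repair_label))
    actions.append(("tunnel", rnh))
    return {
        "primary": tuple(primary_hops),
        "repair_next_hops": tuple(repair_hops),
        "repair_actions": actions,
    }


entry = program_fib_entry(["PE0"], "rNH1", ["P2"], repair_label=300)
```

   When only the pair is received, the action list reduces to the
   single re-tunnel step.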

3. Rules for Choosing and Managing The Repair path

   This section specifies rules governing how an egress edge router
   PEi chooses and advertises the repair path. Other than the rules in
   this section, the method of choosing the repair path is beyond the
   scope of this document.

   3.1. General Rules for Managing the Repair Path

   This section specifies general rules for choosing the repair path
   for both labeled and unlabeled prefixes.

   1. A repair PE MUST be another edge router PEj that advertises the
      same prefix to the primary edge router PEi via IBGP peering.

   2. A primary PE MAY advertise more than one repair path for the
      same primary next-hop NHi. In that case, all advertised repair
      next-hops identify valid repair PEs for the primary next-hop
      NHi. Thus a core router MAY choose any repair PE as the repair
      path for the primary next-hop NHi.

   3. If a repairing "P" router determines that the path taken by the
      repair tunnel to a repair edge router PEj passes through the
      protected edge router PEi, then the repairing router "P" SHOULD
      NOT install this repair path in its forwarding plane.

   4. Let the primary next-hop NHi match the IGP route Ri. If the
      repairing "P" router determines that the repair tunnel to a
      repair edge router passes through a next-hop of the IGP route
      Ri, then the repairing router SHOULD NOT install this repair
      path in its forwarding plane.

   5. A primary next-hop identifies an egress PE. Thus a primary next-
      hop NH MUST NOT be advertised by two different PEs. However, a
      primary next-hop of one PE MAY be the repair next-hop for
      another PE.

   6. At any point in time, for the same primary and repair next-hops
      NHi and rNHi, only one advertisement is valid. Thus for the same
      value of NHi and rNHi, an advertisement of the pair (NHi,rNHi)
      or the quadruple (NHi,rNHi,rLi,Push) for any values of "rLi" and
      "Push" MUST override or be preceded by the withdrawal of any
      previously advertised pair (NHi,rNHi) or the quadruple
      (NHi,rNHi,rLi,Push).



   If rules (3) and (4) are not applied, then the tunnel to the repair
   edge router PEj does not provide protection against the failure of
   the edge node PEi. Instead it provides core protection against the
   failure of the path through the core leading to the protected edge
   node PEi. Thus existing core FRR protection mechanisms such as those
   specified in [8], [9], and [10] can be used instead.

   Rules (5) and (6) ensure that there is no ambiguity about the
   primary and repair next-hops.
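
   Rules (3) and (4) amount to a simple check on a candidate repair
   path. The sketch below assumes that the repairing router has
   already computed the list of routers traversed by the repair tunnel
   (for example from its IGP database); the function and parameter
   names are hypothetical.

```python
def repair_path_usable(repair_tunnel_path, protected_pe, primary_next_hops):
    """Apply rules (3) and (4) to one candidate repair path.

    repair_tunnel_path: routers traversed from P to the repair PE.
    protected_pe:       the protected edge router PEi.
    primary_next_hops:  next-hops of the IGP route Ri matching NHi.
    """
    # Rule (3): the repair tunnel must not pass through PEi itself.
    if protected_pe in repair_tunnel_path:
        return False
    # Rule (4): it must not pass through any next-hop of route Ri.
    if set(repair_tunnel_path) & set(primary_next_hops):
        return False
    return True


usable = repair_path_usable(["P2", "PE1"], "PE0", ["PE0"])
```

   A path that fails this check protects against a core failure rather
   than an edge node failure, which is why it is not installed.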

   3.2. Rules for the "Push" Flag

   This section covers basic rules for advertising, setting, and
   clearing the "Push" flag

   1. If the repairing PE advertises a repair label, then the "Push"
      flag MUST be advertised. I.e., the repairing PE MUST either
      advertise the pair (NHi, rNHi) or the quadruple (NHi, rNHi, rLi,
      Push)

   2. If the repair PE advertises a repair label, then the repair PE MAY
      advertise the "Push" flag with the repair label. In that case, if
      the primary egress PE (PEi) decides to advertise the quadruple
      (NHi, rNHi, rLi, Push), then the primary egress PE SHOULD set the
      "Push" flag to the same value that is received from the repair PE.

   3. If the protected prefix is unlabeled (i.e. belongs to AFI/SAFI
      1/1, 2/1, 1/2, or 2/2) and the repairing router advertises the
      quadruple (NHi, rNHi, rLi, Push), then the "Push" flag MUST be
      set.

   4. If the protected prefix is labeled (i.e. belongs to AFI/SAFI 1/4,
      2/4, 1/128, or 2/128), then

       a. The repairing router MUST advertise the quadruple (NHi, rNHi,
          rLi, Push)

       b. The repairing router MAY set the "Push" flag

   Rule (1) is required because the repairing core router(s) need to
   know what to do with the underlying repair label if it exists.

   Rule (2) is required because the repair label is allocated by the
   repair PE. Hence the repair PE should be able to specify what to do
   with it. For example, suppose the repair PE has eiBGP paths for the
   protected prefix and the protected prefix is unlabeled. In that
   case, to make sure that the repair PE does not loop the repaired
   packet back to the primary egress PE, the repair PE advertises a
   repair label for the unlabeled protected prefix with semantics
   defined in [11].

   Rule (3) is needed because the protected prefix is unlabeled. Hence
   the tunneled packet arriving at the repairing core router "P" has no
   label and thus the label swapping operation cannot be performed.

   This document defines the minimum set of rules governing the "Push"
   flag. Additional rules may be set by other documents.
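
   Rules (1), (3), and (4) can be expressed as a consistency check on
   what is about to be advertised. The function below is a sketch with
   hypothetical names; rule (2) is omitted because it depends on what
   was received from the repair PE.

```python
def check_push_rules(labeled, label=None, push=None):
    """Return the list of violated Section 3.2 rules (empty if none).

    labeled: True if the protected prefix belongs to a labeled AFI/SAFI.
    label:   underlying repair label rLi, or None if only a pair is sent.
    push:    value of the "Push" flag, or None if it is not advertised.
    """
    errors = []
    has_quadruple = label is not None or push is not None
    if (label is None) != (push is None):
        errors.append("rule 1: label and Push flag must appear together")
    if not labeled and has_quadruple and push is not True:
        errors.append("rule 3: Push must be set for unlabeled prefixes")
    if labeled and not has_quadruple:
        errors.append("rule 4a: labeled prefixes require the quadruple")
    return errors


problems = check_push_rules(labeled=False, label=300, push=False)
```

   In this usage example the prefix is unlabeled but the flag is
   cleared, so the check reports a violation of rule (3).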

   3.3. Rules for Choosing the Repair Path for Labeled Prefixes

   This section specifies rules, in addition to those in Section 3.1,
   by which an egress edge router PEi chooses and advertises the
   repair path for a protected labeled prefix P/m.

   1. A primary edge router PEi SHOULD only choose the edge router PEj
      and the underlying repair label rLi as a repair path for the
      prefix P/m if the label advertised for the prefix P/m by the
      repair edge router PEj is allocated on per-VPN or per-CE/per-
      next-hop basis.

      The reason for this is as follows. As mentioned in Section 1.3,
      the core of the network MUST remain BGP-free and the size of the
      routing table on a core router MUST remain independent of the
      number of BGP prefixes. BGP prefix grouping in Section 2.2
      requires two prefixes to belong to two different groups if the
      labels advertised by the repair PE for the two prefixes are
      different. Thus if the repair edge router allocates labels on
      a per-prefix basis, the protected edge router PEi will advertise
      a different primary next-hop for each protected prefix. This is
      equivalent to having the core router "P" know about every BGP
      prefix. In addition, the size of the routing table of the "P"
      router becomes comparable to the number of BGP prefixes.

   2. If the repair edge router PEj advertises a repair label as
      described in [11] and the protected edge router understands the
      repair label attribute described in [11], then the protected
      edge router PEi SHOULD choose the repair label advertised by PEj
      as the underlying repair label for the prefix P/m.

      Using the repair label specified in [11] has a few advantages:

       o A repair edge router PEj need not change the primary
          label allocation policy (which may be per-prefix) but can be
          chosen as repair PE if the repair labels are allocated on
          per-CE or per-VRF basis.

       o As mentioned in [11], an edge router does not repair a
          packet arriving with a repair label. Hence using the repair
          label when re-tunneling the packet towards PEj guarantees
          loop freedom in case of PE-CE link failure.

       o If the repair PE programs an eiBGP multipath for the
          protected prefix, then choosing the repair label as
          specified in [11] guarantees that the repaired packet will
          not be looped back towards the primary egress PE during
          repair.
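
   The scaling argument behind rule 1 can be illustrated with a small
   sketch. It is not part of the specification: it assumes only the
   grouping rule restated above (prefixes can share a group only if
   the repair PE advertises the same label for them), and the function
   name, prefixes, and label values are hypothetical.

```python
from collections import defaultdict


def group_prefixes(repair_paths):
    """Group prefixes by (repair PE, label advertised by the repair PE).

    repair_paths maps prefix -> (repair_pe, label). Each resulting group
    needs its own primary next-hop, which the core routers must carry.
    """
    groups = defaultdict(list)
    for prefix, key in repair_paths.items():
        groups[key].append(prefix)
    return dict(groups)


# 1000 illustrative prefixes protected by the same repair PE.
prefixes = [f"10.{i // 256}.{i % 256}.0/24" for i in range(1000)]

# Hypothetical per-CE allocation: PEj advertises one label for all of them.
per_ce = {p: ("PEj", 3001) for p in prefixes}

# Hypothetical per-prefix allocation: a distinct label for every prefix.
per_prefix = {p: ("PEj", 5000 + i) for i, p in enumerate(prefixes)}

print(len(group_prefixes(per_ce)))      # 1
print(len(group_prefixes(per_prefix)))  # 1000
```

   With per-CE labels the core carries a single additional next-hop
   for all 1000 prefixes. With per-prefix labels it would carry 1000,
   one per BGP prefix, which is what rule 1 is meant to prevent.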

4. Forwarding Plane Operation

   This section specifies the forwarding plane operation on the core
   router "P" when it detects that the protected edge router PEi is no
   longer reachable. We assume that the core router has pre-programmed
   its forwarding plane according to Section 2.4.

   As soon as the "P" router detects that the primary next-hop for Ri
   is not reachable, it does the following for any arriving packet
   destined to the protected edge router PEi:

   1. Decapsulate the tunnel header to expose the tunneled packet.

   2. If the underlying repair label rLi is programmed in the
      forwarding plane:

       a. If the "Push" flag is set:

               Push the underlying repair label rLi.

       b. Else:

               Swap the label on the top of the packet (irrespective
               of the value of that label) with the underlying repair
               label rLi.

   3. Tunnel the packet towards the repair egress PE identified by
      rNHj.
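
   The three steps above can be sketched as follows. This is a
   minimal illustration rather than a forwarding plane implementation:
   the packet model (a tunnel destination plus an inner label stack
   with the top label first) and all class, field, and function names
   are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RepairEntry:
    repair_next_hop: str          # rNHj, identifies the repair egress PE
    repair_label: Optional[int]   # rLi, or None if none is programmed
    push: bool = False            # the "Push" flag


@dataclass
class TunneledPacket:
    tunnel_dest: str              # stands in for the whole tunnel header
    labels: List[int]             # inner label stack, top label first
    payload: str


def repair(pkt: TunneledPacket, entry: RepairEntry) -> TunneledPacket:
    # Step 1: decapsulate, keeping only the tunneled packet.
    labels = list(pkt.labels)

    # Step 2: apply the underlying repair label if one is programmed.
    if entry.repair_label is not None:
        if entry.push:
            labels.insert(0, entry.repair_label)
        else:
            if not labels:
                # Assumption of this sketch: a swap needs a top label.
                raise ValueError("swap requested on an unlabeled packet")
            # Swap regardless of the value of the current top label.
            labels[0] = entry.repair_label

    # Step 3: re-tunnel towards the repair egress PE.
    return TunneledPacket(entry.repair_next_hop, labels, pkt.payload)


labeled = repair(TunneledPacket("PEi", [100], "data"),
                 RepairEntry("PEj", 200))
unlabeled = repair(TunneledPacket("PEi", [], "data"),
                   RepairEntry("PEj", 200, push=True))
print(labeled.tunnel_dest, labeled.labels)      # PEj [200]
print(unlabeled.tunnel_dest, unlabeled.labels)  # PEj [200]
```

   In both cases the packet leaves "P" tunneled to PEj and carrying
   rLi as its top label; the "Push" flag only decides whether an
   existing label is replaced or a new one is added.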

5. Inter-operability with Existing IP FRR Mechanisms

   Existing IP FRR mechanisms can be divided into two categories: core
   protection and edge protection. Core protection techniques, such as
   [8], [9], and [10], provide protection against internal node and/or
   link failure, whereas the technique proposed in this document
   protects against edge node failure. The two are therefore
   independent of each other. If the failure of an internal node or
   link results in completely disconnecting a protectable edge node,
   then an administrator MAY configure the repairing router to prefer
   the technique proposed in this document over existing IP FRR
   mechanisms.

   Edge protection techniques, such as [12] and its practical
   implementation [11], provide protection against the failure of the
   link between PE and CE routers. Existing PE-CE link protection can
   therefore co-exist with the technique proposed in this document
   because the two techniques are independent of each other.

6. Example

   We will use an LDP core as an example. Consider the diagram
   depicted in Figure 2 below. We assume that the PEs advertise repair
   labels as specified in [11].

   +-----------------------------------+
   |                                   |
   |   LDP Core                        |
   |                                   |
   |                                  PE1
   |                                   |\
   |                                   | \
   |                                   |  \
   |                                   |   \
   |                                   |  CE1....... VPN prefix
   |                                   |   /       (10.0.0.0/8)
   |                                   |  /        (11.0.0.0/8)
   |                                   | /
   |                                   |/
   PEx                        P--------PE0    Lo1 = 1.1.1.1/32
   |                                   |\     Lo2 = 2.2.2.2/32
   |                                   | \
   |                                   |  \
   |                                   |   \
   |                                   |  CE2....... VPN prefix
   |                                   |   /       (20.0.0.0/8)
   |                                   |  /        (21.0.0.0/8)
   |                                   | /
   |                                   |/
   |                                  PE2
   |                                   |
   |                                   |
   +-----------------------------------+
                Figure 2 : Edge node BGP FRR in LDP core


   1. As we can see, PE0 has 4 prefixes: 10.0.0.0/8, 11.0.0.0/8,
      20.0.0.0/8, and 21.0.0.0/8. PE0 may assign a separate label to
      each prefix. The method and policy of assigning primary labels
      to prefixes are irrelevant to this document.

   2. PE1 advertises the repair label rL1 for prefixes 10.0.0.0/8 and
      11.0.0.0/8.

   3. PE2 advertises the repair label rL2 for prefixes 20.0.0.0/8 and
      21.0.0.0/8.

   4. As such, PE0 divides its prefixes into two groups:
      G1 = {10.0.0.0/8, 11.0.0.0/8}
      G2 = {20.0.0.0/8, 21.0.0.0/8}

   5. When advertising the next-hop to its IBGP peers, PE0 advertises
      1.1.1.1 as the next-hop for prefixes belonging to group G1 and
      2.2.2.2 as the next-hop for prefixes belonging to group G2.

   6. PE0 advertises the prefixes 1.1.1.1/32 and 2.2.2.2/32 using the
      usual IGP mechanism.

   7. When advertising 1.1.1.1/32 into the core, PE0 advertises rL1
      and PE1 as a repair path. When advertising 2.2.2.2/32 into the
      core, PE0 advertises rL2 and PE2 as a repair path. The mechanism
      by which a repair path is advertised is beyond the scope of this
      document.

   8. On the penultimate hop router "P", LDP assigns a different LDP
      label to 1.1.1.1/32 and 2.2.2.2/32. Core routers other than
      penultimate hop routers may employ some sort of label
      aggregation to reduce the number of LDP labels.

   9. Assume that the penultimate hop router "P" assigns the local LDP
      label L1 for prefix 1.1.1.1/32 and L2 for prefix 2.2.2.2/32.

   10. On the penultimate hop router P, the forwarding entry for L1
       will be as follows:

       Primary path:
        - Next-hop is PE0.
        - Swap the incoming outer label with the LDP label towards
          1.1.1.1.

       Repair path:
        - Pop the incoming LDP label.
        - Swap the internal label with the repair label rL1.
        - Push the LDP label towards PE1.
        - Forward the packet.

   11. On the penultimate hop router P, the forwarding entry for L2
       will be as follows:

       Primary path: Same as for L1, except that the outgoing LDP
       label is the one towards 2.2.2.2.

       Repair path:
        - Pop the incoming LDP label.
        - Swap the internal label with the repair label rL2.
        - Push the LDP label towards PE2.
        - Forward the packet.

   12. If the P router detects that PE0 is no longer reachable, it can
       use the repair path already pre-programmed in the forwarding
       plane as described above. Because the repair path is
       pre-programmed, as in the case of TE and IP FRR, the P router
       can re-route traffic very quickly.
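
   The repair paths programmed on P in this example can be traced with
   a short sketch. The numeric label values are hypothetical (the
   example above names the labels only symbolically), as are the table
   layout and the function name; label stacks are written with the top
   label first.

```python
# Hypothetical label values standing in for the symbolic names above.
L1, L2 = 16, 17                   # P's local LDP labels for 1.1.1.1/32, 2.2.2.2/32
RL1, RL2 = 1001, 1002             # repair labels advertised by PE1 and PE2
LDP_TO_PE1, LDP_TO_PE2 = 31, 32   # LDP labels P uses to reach PE1 and PE2

# Repair half of P's forwarding entries for L1 and L2.
REPAIR_PATH = {
    L1: {"repair_label": RL1, "ldp_label": LDP_TO_PE1, "repair_pe": "PE1"},
    L2: {"repair_label": RL2, "ldp_label": LDP_TO_PE2, "repair_pe": "PE2"},
}


def repair_forward(stack):
    """Apply the repair path to a stack [incoming LDP label, internal label].

    Returns the repair PE and the outgoing label stack.
    """
    entry = REPAIR_PATH[stack[0]]
    inner = list(stack[1:])              # pop the incoming LDP label
    inner[0] = entry["repair_label"]     # swap the internal label with rL
    out = [entry["ldp_label"]] + inner   # push the LDP label towards the repair PE
    return entry["repair_pe"], out


# A packet for a prefix in G1 (tunneled to 1.1.1.1) and one for G2.
print(repair_forward([L1, 500]))   # ('PE1', [31, 1001])
print(repair_forward([L2, 777]))   # ('PE2', [32, 1002])
```

   The size of the table depends only on the number of protected
   next-hops (two here), not on the number of VPN prefixes behind
   them, and the internal label is overwritten without P having to
   know its value.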

7. Security Considerations

   No additional security risk is introduced by using the mechanisms
   proposed in this document.

8. IANA Considerations

   This document makes no requests of IANA.

9. Conclusions

   This document proposes a method that allows fast re-route
   protection against edge node failure or complete disconnection
   from the core in a BGP-free core.

10. References

   10.1. Normative References

   [1]   Bradner, S., "Key words for use in RFCs to Indicate Requirement
         Levels", BCP 14, RFC 2119, March 1997.

   [2]   Rekhter, Y., Li, T., and S. Hares, "A Border Gateway Protocol 4
         (BGP-4)", RFC 4271, January 2006.

   [3]   Bates, T., Chandra, R., Katz, D., and Y. Rekhter,
         "Multiprotocol Extensions for BGP-4", RFC 4760, January 2007.

   10.2. Informative References

   [4]   Marques, P., Fernando, R., Chen, E., Mohapatra, P., and H.
         Gredler, "Advertisement of the best external route in BGP",
         draft-ietf-idr-best-external-04.txt, April 2011.

   [5]   Wu, J., Cui, Y., Metz, C., and E. Rosen, "Softwire Mesh
         Framework", RFC 5565, June 2009.

   [6]   Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private Networks
         (VPNs)", RFC 4364, February 2006.

   [7]   De Clercq, J., Ooms, D., Prevost, S., and F. Le Faucheur,
         "Connecting IPv6 Islands over IPv4 MPLS Using IPv6 Provider
         Edge Routers (6PE)", RFC 4798, February 2007.

   [8]   Atlas, A. and A. Zinin, "Basic Specification for IP Fast
         Reroute: Loop-Free Alternates", RFC 5286, September 2008.

   [9]   Shand, M. and S. Bryant, "IP Fast Reroute Framework", RFC 5714,
         January 2010.

   [10]  Shand, M. and S. Bryant, "A Framework for Loop-Free
         Convergence", RFC 5715, January 2010.

   [11]  Bashandy, A., Pithawala, B., and J. Heitz, "Scalable,
         Loop-Free BGP FRR using Repair Label",
         draft-bashandy-idr-bgp-repair-label-02.txt, July 2011.

   [12]  Bonaventure, O., Filsfils, C., and P. Francois, "Achieving
         sub-50 milliseconds recovery upon BGP peering link failures",
         IEEE/ACM Transactions on Networking, 15(5):1123-1135, 2007.

11. Acknowledgments

   Special thanks to Les Ginsberg and Anton Smirnov for their valuable
   comments.

   This document was prepared using 2-Word-v2.0.template.dot.

Appendix A. Changes from Version 01

   1. Use the term "underlying repair label" instead of just "repair
      label" to avoid confusion with the term "repair label" used in
      [11].

   2. In version 01, it was assumed in many places that the repairing
      router is the penultimate hop P router. Although this would
      probably be the most common case, it is not always true. Hence in
      this version the repairing router may be any core router.

   3. Merged handling labeled and unlabeled prefixes into a single
      algorithm.

   4. Allowed sending a repair label for unlabeled prefixes and added
      the "Push" flag. This ensures loop-free repair even for unlabeled
      prefixes in case the repair PE has eiBGP paths, as mentioned in
      Section 3.3.

   5. In Section 3.3, which discusses the rules governing the choice of
      the underlying repair label for labeled prefixes, changed the
      wording so that the primary egress PE "SHOULD" instead of "MAY"
      use the repair label advertised according to [11] as an
      underlying repair label.

   6. Replaced all occurrences of the term "backup" with "repair", as
      "repair" is the term commonly used in the IP FRR context, such as
      in [8], [9], and [10].

   7. Added the definition of primary and repair tunnels in Section 1.2.

   8. Added a definition of the term "Repair Next-hop" in Section 1.2.

   9. Modified the definition of "repair path" in Section 1.2 to being
      the repair next-hop plus the underlying repair label instead of
      being the repair PE plus the underlying repair label.

   10. Outlined inter-operability with existing IP FRR techniques in
       Section 5.

   11. Made a few editorial corrections.

Authors' Addresses

   Ahmed Bashandy
   Cisco Systems
   170 West Tasman Dr, San Jose, CA 95134
   Email: bashandy@cisco.com

   Burjiz Pithawala
   Cisco Systems
   170 West Tasman Dr, San Jose, CA 95134
   Email: bpithaw@cisco.com

   Keyur Patel
   Cisco Systems
   170 West Tasman Dr, San Jose, CA 95134
   Email: keyupate@cisco.com