
INTERNET DRAFT  A Framework for Loop-free Convergence        Dec 2005




Network Working Group                                         S. Bryant
Internet Draft                                                 M. Shand
Expiration Date: Dec 2005                                 Cisco Systems

                                                               Jun 2005




                A Framework for Loop-free Convergence
              <draft-bryant-shand-lf-conv-frmwk-01.txt>


Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time.  It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Abstract
   This draft describes mechanisms that may be used to prevent or to
   suppress the formation of micro-loops when an IP or MPLS network
   undergoes topology change due to failure, repair or management
   action.

Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
   this document are to be interpreted as described in RFC 2119
   [RFC2119].


Bryant, Shand              Expires Jun 2004                   [Page 1]




Table of Contents
1. Introduction

2. The Nature of Micro-loops

3. Applicability

4. Micro-loop Control Strategies

5. Loop mitigation

6. Micro-loop Prevention
 6.1. Incremental Cost Advertisement
 6.2. Single Tunnel Per Router
 6.3. Distributed Tunnels
 6.4. Packet Marking
 6.5. Ordered SPFs
 6.6. Synchronised FIB Updates
7. Loop Suppression

8. Compatibility Issues

9. Comparison of Loop-free Convergence Methods

10. IANA considerations

11. Security Considerations

12. Intellectual Property Statement

13. Full copyright statement

14. Normative References

15. Informative References

16. Authors' Addresses



















1.  Introduction

   When there is a change to the network topology (due to the failure
   or restoration of a link or router, or as a result of management
   action) the routers need to converge on a common view of the new
   topology, and the paths to be used for forwarding traffic to each
   destination. During this process, referred to as a routing
   transition, packet delivery between certain source/destination
   pairs may be disrupted. This occurs due to the time it takes for
   the topology change to be propagated around the network together
   with the time it takes each individual router to determine and then
   update the forwarding information base (FIB) for the affected
   destinations. During this transition, packets are lost due to the
   continuing attempts to use the failed component, and due to
   forwarding loops. Forwarding loops arise due to the inconsistent
   FIBs that occur as a result of the difference in time taken by
   routers to execute the transition process. This is a problem that
   occurs in both IP networks and MPLS networks that use LDP [RFC3036]
   as the label switched path (LSP) signaling protocol.

   The service failures caused by routing transitions are largely
   hidden by higher-level protocols that retransmit the lost data.
   However, new Internet services are emerging which are more sensitive
   to the packet disruption that occurs during a transition. To make
   the transition transparent to their users, these services require a
   short routing transition. Ideally, routing transitions would be
   completed in zero time with no packet loss.

   Regardless of how optimally the mechanisms involved have been
   designed and implemented, it is inevitable that a routing
   transition will take some minimum interval that is greater than
   zero. This has led to the development of a TE fast-reroute
   mechanism for MPLS [MPLS-TE]. Alternative mechanisms that might be
   deployed in an MPLS network and mechanisms that may be used in an
   IP network are work in progress in the IETF [IPFRR]. Any repair
   mechanism may however be disrupted by the formation of micro-loops
   during the period between the time when the failure is announced,
   and the time when all FIBs have been updated to reflect the new
   topology.

   There is, however, little point in introducing new mechanisms into
   an IP network to provide fast re-route, without also deploying
   mechanisms that prevent the disruptive effects of micro-loops, which
   may starve the repair or cause congestion loss as a result of
   looping packets.

   The disruptive effect of micro-loops is not confined to periods
   when there is a component failure. Micro-loops can, for example,
   form when a component is put back into service following repair.
   Micro-loops can also form as a result of a network maintenance
   action such as adding a new network component, removing a network
   component or modifying a link cost.

   This framework provides a summary of the mechanisms that have been
   proposed to address the micro-loop issue.



2.  The Nature of Micro-loops

   Micro-loops may form during the periods when a network is
   re-converging following ANY topology change, and are caused by
   inconsistent FIBs in the routers. During the transition,
   micro-loops may occur over a single link between a pair of routers
   that temporarily use each other as the next hop for a prefix.
   Micro-loops may also form when each router in a cycle has the next
   router in the cycle as its next hop for a prefix. Cyclic
   micro-loops always involve at least one link with an asymmetric
   cost, and/or at least two cost changes on symmetric cost links
   within the convergence time.

   Micro-loops have two undesirable side-effects: congestion and
   repair starvation. A looping packet consumes bandwidth until it
   either escapes as a result of the re-synchronization of the FIBs,
   or its TTL expires. This transiently increases the traffic over a
   link by as much as 128 times, and may cause the link to congest.
   This congestion reduces the bandwidth available to other traffic
   (which is not otherwise affected by the topology change). As a
   result the "innocent" traffic using the link experiences increased
   latency, and is liable to congestive packet loss.
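
   The source of the "128 times" figure is not derived in the text.
   One plausible reading, assumed here, is a packet that enters a
   single-link loop with close to the maximum 8-bit TTL. The sketch
   below counts how often such a packet crosses the link in each
   direction. The function name and the simplified TTL model are
   illustrative only.

```python
def loop_crossings(hops_remaining):
    """Count link crossings for one packet bouncing between routers X and Y.

    Model assumption: the packet enters the loop at X with
    `hops_remaining` forwarding hops left before its TTL expires, and
    the loop persists for the packet's whole lifetime.
    """
    x_to_y = 0
    y_to_x = 0
    at_x = True
    while hops_remaining > 0:
        hops_remaining -= 1      # every forwarding hop decrements the TTL
        if at_x:
            x_to_y += 1
        else:
            y_to_x += 1
        at_x = not at_x          # the packet is now at the other router
    return x_to_y, y_to_x


print(loop_crossings(255))       # (128, 127)
```

   Under this model a single packet can load one direction of the
   link up to 128 times, unless the FIBs re-synchronize first.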

   In cases where the link or node failure has been protected by a
   fast re-route repair, the inconsistency in the FIBs prevents some
   traffic from reaching the failure and hence being repaired. The
   repair may thus become starved of traffic and hence become
   ineffective. Thus in addition to the congestive damage, the repair
   is rendered ineffective by the micro-loop. Similarly, if the
   topology change is the result of management action, the link could
   have been retained in service throughout the transition (i.e. the
   link acts as its own repair path). However, if micro-loops form,
   they prevent productive forwarding during the transition.

   Unless otherwise controlled, micro-loops may form in any part of
   the network that forwards (or in the case of a new link, will
   forward) packets over a path that includes the affected topology
   change. The time taken to propagate the topology change through the
   network, and the non-uniform time taken by each router to calculate
   the new SPT and update its FIB may significantly extend the
   duration of the packet disruption caused by the micro-loops. In
   some cases a packet may be subject to disruption from micro-loops
   which occur sequentially at links along the path, thus further
   extending the period of disruption beyond that required to resolve
   a single loop.




3.  Applicability

   Loop free convergence techniques are applicable [APPL] to any
   situation in which micro-loops may form. For example the
   convergence of a network following:

   1) Component failure.

   2) Component repair.

   3) Management withdrawal of a component.

   4) Management insertion of a component.

   5) Management change of link cost (either positive or negative).

   6) External cost change, for example change of external gateway as
      a result of a BGP change.

   7) An SRLG failure.

   In each case, a component may be a link or a router.

   Loop free convergence techniques are applicable to both IP networks
   and MPLS enabled networks that use LDP, including LDP networks that
   use the single-hop tunnel fast-reroute mechanism.



4.  Micro-loop Control Strategies

   Micro-loop control strategies fall into three basic classes:

     1. Micro-loop mitigation

     2. Micro-loop prevention

     3. Micro-loop suppression

   A micro-loop mitigation scheme works by re-converging the network
   in such a way that it reduces, but does not eliminate, the
   formation of micro-loops. Such schemes cannot guarantee the
   productive forwarding of packets during the transition.

   A micro-loop prevention mechanism controls the re-convergence of
   the network in such a way that no micro-loops form. Such a micro-loop
   prevention mechanism allows the continued use of any fast repair
   method until the network has converged on its new topology, and
   prevents the collateral damage that occurs to other traffic for the
   duration of each micro-loop. These mechanisms normally extend the
   duration of the re-convergence process. In the case of a fast re-
   route repair this means that the network requires the repair to
   remain in place longer than would otherwise be the case. This
   causes extended problems to any traffic which is NOT repaired by an
   imperfect repair (as does ANY method which delays re-convergence).

   When a component is returned to service, or when a network
   management action has taken place, this additional delay does not
   cause traffic disruption, because there is no repair involved.
   However the extended delay is undesirable, because it increases the
   time that the network takes to be ready for another failure, and
   hence leaves it vulnerable to multiple failures.

   A micro-loop suppression mechanism attempts to eliminate the
   collateral damage done by micro-loops to other traffic. This may be
   achieved by, for example, using a packet monitoring method, which
   detects that a packet is looping and drops it. Such schemes make no
   attempt to productively forward the packet throughout the network
   transition.



5.  Loop mitigation

   The only known loop mitigation approach is the safe-neighbors
   method described in [ZININ]. In this method, a micro-loop free
   next-hop safety condition is defined as follows:

   In a symmetric cost network, it is safe for router X to change to
   the use of neighbor Y as its next-hop for a specific destination if
   the path through Y to that destination satisfies both of the
   following criteria:

     1.   X considers Y as its loop-free neighbor based on the
          topology before the change AND

     2.   X considers Y as its downstream neighbor based on the
          topology after the change.

   In an asymmetric cost network, a stricter safety condition is
   needed, and the criterion is that:

          X considers Y as its downstream neighbor based on the
          topology both before and after the change.

   Based on these criteria, destinations are classified by each router
   into three classes:

    Type A destinations: Destinations unaffected by the change and
    also destinations whose next hop after the change satisfies the
    safety criteria.

    Type B destinations: Destinations that cannot be sent via the new
    primary next-hop because the safety criteria are not satisfied,
    but which can be sent via another next-hop that does satisfy the
    safety criteria.




    Type C destinations: All other destinations.
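
   The classification above can be sketched for the symmetric cost
   case as follows. This is an illustration under stated assumptions,
   not the procedure from [ZININ]. It assumes the usual IP fast
   re-route definitions: Y is a loop-free neighbor of X if
   D(Y,dest) < D(Y,X) + D(X,dest), and a downstream neighbor if
   D(Y,dest) < D(X,dest). A destination is treated as "unaffected" when
   its next hop does not change. All function names and the example
   topology are hypothetical.

```python
import heapq

INF = float("inf")


def build(edges):
    """Build a symmetric-cost adjacency map from (a, b, cost) tuples."""
    graph = {}
    for a, b, cost in edges:
        graph.setdefault(a, {})[b] = cost
        graph.setdefault(b, {})[a] = cost
    return graph


def dijkstra(graph, src):
    """Shortest-path cost from src to every reachable node."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                      # stale heap entry
        for v, w in graph[u].items():
            if d + w < dist.get(v, INF):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def cost(graph, a, b):
    return dijkstra(graph, a).get(b, INF)


def next_hop(graph, x, dest):
    """Neighbor of x on a shortest path to dest (ties broken by name)."""
    return min(sorted(graph[x]),
               key=lambda n: graph[x][n] + cost(graph, n, dest))


def is_safe(old, new, x, y, dest):
    """Symmetric-cost safety condition for X switching to neighbor Y."""
    loop_free_before = cost(old, y, dest) < cost(old, y, x) + cost(old, x, dest)
    downstream_after = cost(new, y, dest) < cost(new, x, dest)
    return loop_free_before and downstream_after


def classify(old, new, x, dest):
    """Return "A", "B" or "C" for destination dest as seen by router x."""
    new_nh = next_hop(new, x, dest)
    if new_nh == next_hop(old, x, dest) or is_safe(old, new, x, new_nh, dest):
        return "A"
    if any(is_safe(old, new, x, y, dest) for y in new[x] if y != new_nh):
        return "B"
    return "C"


# Link X-D fails. Y reached D via X before the failure, so Y is not safe.
base = [("X", "D", 1), ("X", "Y", 1), ("Y", "Z", 1), ("Z", "D", 5)]
old, new = build(base), build(base[1:])
print(classify(old, new, "X", "D"))       # C
print(classify(old, new, "X", "Z"))       # A (next hop unchanged)

# Adding a longer path via W gives X a safe, non-shortest next hop.
extra = [("X", "W", 4), ("W", "D", 4)]
old_w, new_w = build(base + extra), build(base[1:] + extra)
print(classify(old_w, new_w, "X", "D"))   # B
```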

   Following a topology change, Type A destinations are immediately
   changed to go via the new topology. Type B destinations are
   immediately changed to go via the next hop that satisfies the
   safety criteria, even though this is not the shortest path. Type B
   destinations continue to go via this path until all routers have
   changed their Type C destinations over to the new next hop. Routers
   must not change their Type C destinations until all routers have
   changed their Type A2 and Type B destinations to the new or
   intermediate (safe) next hop.

   Simulations indicate that this approach produces a significant
   reduction in the number of links that are subject to micro-looping.
   However, unlike the micro-loop prevention methods, it is only a
   partial solution. In particular, micro-loops may form on any link
   joining a pair of type C routers.

   Because routers delay updating their Type C destination FIB
   entries, they will continue to route towards the failure during the
   time when the routers are changing their Type A and B destinations,
   and hence will continue to productively forward packets provided
   that viable repair paths exist.

   A backwards compatibility issue arises with the safe-next-hop
   scheme. If a router is not capable of micro-loop control, it will
   not correctly delay its FIB update. If all such routers had only
   type A destinations this loop mitigation mechanism would work as it
   was designed. Alternatively, if all such incapable routers had only
   type C destinations, the "covert" announcement mechanism used to
   trigger the tunnel based schemes could be used to cause the Type A
   and Type B destinations to be changed, with the incapable routers
   and routers having type C destinations delaying until they received
   the "real" announcement. Unfortunately, these two approaches are
   mutually incompatible.

   To recap, routers classify their destinations into three types: A,
   B or C. Routers update their FIBs in three phases. A router first
   updates destinations that it has classified as type A or type B. It
   then updates destinations that it has classified as type C, and
   finally it corrects the temporary next hop used for destinations
   classified as type B.

   Note that simulations indicate that in most topologies treating
   type B destinations as type C results in only a small degradation
   in loop prevention. Also note that early simulation results appear
   to indicate that in production networks where some, but not all,
   links have asymmetric costs, using the asymmetric cost criterion
   actually REDUCES the number of loop free destinations.

   This mechanism operates identically for "bad-news" events,
   "good-news" events and SRLG failures.





6.  Micro-loop Prevention

   Six micro-loop prevention methods have been proposed:

     1. Incremental cost advertisement

     2. Single Tunnel

     3. Distributed Tunnels

     4. Packet Marking

     5. Ordered SPF

     6. Synchronized FIBs



   Both of the tunnel methods, packet marking and ordered SPF could be
   combined with safe-neighbors [ZININ] to reduce the traffic that
   uses the advanced method. Specifically all traffic could use safe
   neighbors except traffic between a pair of routers both of which
   consider the destination to be type C. The type C to type C traffic
   would be protected from micro-looping through the use of a loop
   prevention method.

   However, determining whether the new next hop router considers a
   destination to be type C may be computationally intensive. An
   alternative approach would be to use a loop prevention method for
   all local type C destinations. This would not require any
   additional computation, but would require the additional loop
   prevention method to be used in cases which would not have
   generated loops (i.e. when the new next-hop router considered this
   to be a type A or B destination).

   The amount of traffic that would use safe neighbors is highly
   dependent on the network topology and the specific change, but
   would be expected to be in the region of 70% to 90% in typical
   networks.


6.1.  Incremental Cost Advertisement

   When a link fails, the cost of the link is normally changed from
   its assigned metric to "infinity" in one step.  However, it can be
   proved that no micro-loops will form if the link cost is increased
   in suitable increments, and the network is allowed to stabilize
   before the next cost increment is advertised. Once the link cost
   has been increased to a value greater than that of the lowest
   alternative cost around the link, the link may be disabled without
   causing a micro-loop.





   The criterion for a link cost change to be safe is that any link
   which is subjected to a cost change of x can only cause loops in a
   part of the network that has a cyclic cost less than or equal to x.
   Because there may exist links which have a cost of one in each
   direction, resulting in a cyclic cost of two, this can result in
   the link cost having to be raised in increments of one. However the
   increment can be larger where the minimum cost permits. Determining
   the minimum link cost in the network is trivial, but unfortunately,
   calculating the optimum increment is thought to be a costly
   calculation.
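
   As an illustration of the procedure, the sketch below generates
   the sequence of costs a router might advertise for a link being
   withdrawn. It does not compute the optimum increment. It assumes
   integer metrics and reads the criterion above conservatively: the
   increment is kept below the smallest possible cyclic cost, taken
   here as twice the minimum link cost. The function and parameter
   names are hypothetical.

```python
def cost_schedule(current_cost, lowest_alternative_cost, min_link_cost):
    """Return the successive costs to advertise for a link being withdrawn.

    Assumptions for this sketch: integer metrics, and a conservative
    increment of 2 * min_link_cost - 1, i.e. one less than the
    smallest possible cyclic cost.
    """
    increment = 2 * min_link_cost - 1
    schedule = []
    cost = current_cost
    # Stop once the cost exceeds the lowest alternative cost around the
    # link; after that the link may be disabled in one step.
    while cost <= lowest_alternative_cost:
        cost += increment
        schedule.append(cost)    # advertise, then let the network stabilize
    return schedule


print(cost_schedule(10, 15, 1))  # [11, 12, 13, 14, 15, 16]
print(cost_schedule(10, 15, 3))  # [15, 20]
```

   The first call shows why the method can be slow: with a minimum
   link cost of one, every unit of metric costs one full convergence
   cycle.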

   This approach has the advantage that it requires no change to the
   routing protocol. It will work in any network that uses a link-
   state IGP because it does not require any co-operation from the
   other routers in the network. However the method can be extremely
   slow, particularly if large metrics are used. For the duration of
   the transition some parts of the network continue to use the old
   forwarding path, and hence use any repair mechanism for an extended
   period. In the case of a failure that cannot be fully repaired,
   some destinations may become unreachable for an extended period.

   Where the micro-loop prevention mechanism was being used to support
   a fast re-route repair the network may be vulnerable to a second
   failure for the duration of the controlled re-convergence.

   Where the micro-loop prevention mechanism was being used to support
   a reconfiguration of the network the extended time is less of an
   issue. In this case, because the real forwarding path is available
   throughout the whole transition, there is no conflict between
   concurrent change actions throughout the network.

   It will be appreciated that when a link is returned to service, its
   cost is reduced in small steps from "infinity" to its final cost,
   thereby providing similar micro-loop prevention during a
   "good-news" event. Note that the link cost may be decreased from
   "infinity" to any value greater than that of the lowest alternative
   cost around the link in one step without causing a micro-loop.

   When the failure is an SRLG the link cost increments must be
   coordinated across all members of the SRLG. This may be achieved by
   completing the transition of one link before starting the next, or
   by interleaving the changes. This can be achieved without the need
   for any protocol extensions by, for example, using existing
   identifiers to establish the ordering and the arrival of LSP/LSAs
   to trigger the generation of the next increment.


6.2.  Single Tunnel Per Router

   This mechanism works by creating an overlay network using tunnels
   whose path is not affected by the topology change and carrying the
   traffic affected by the change in that new network. When all the
   traffic is in the new, tunnel based, network, the real network is
   allowed to converge on the new topology. Because all the traffic
   that would be affected by the change is carried in the overlay
   network no micro-loops form. When all micro-loop preventing routers
   have their tunnels in place, all the routers in the network are
   informed of the change in the normal way, at which point micro-
   loops may form within isolated islands of non-micro-loop preventing
   routers. However, only traffic entering the network via such
   routers can micro-loop. All traffic entering the network via a
   micro-loop preventing router will be tunneled correctly to the
   nearest repairing router, including, if necessary being tunneled
   via a non-micro-loop preventing router, and will not micro-loop.
   When all the non-micro-loop preventing routers have converged, the
   micro-loop preventing routers can change from tunneling the packets
   to forwarding normally according to the new topology. This
   transition can occur in any order without micro-loops forming.

   When a failure is detected (or a link is withdrawn from service),
   the router adjacent to the failure issues a new ("covert") routing
   message announcing the topology change. This message is propagated
   through the network by all routers, but is only understood by
   routers capable of using one of the tunnel based micro-loop
   prevention mechanisms.

   Each of the micro-loop preventing routers builds a tunnel to the
   closest router adjacent to the failure. They then determine which
   of their traffic would transit the failure and place that traffic
   in the tunnel. When all of these tunnels are in place, the failure
   is then announced as normal. Because these tunnels will be
   unaffected by the transition, and because the routers protecting
   the link will continue the repair (or forward across the link being
   withdrawn), no traffic will be disrupted by the failure. When the
   network has converged these tunnels are withdrawn, allowing traffic
   to be forwarded along its new "natural" path. The order of tunnel
   insertion and withdrawal is not important, provided that the
   tunnels are all in place before the normal announcement is issued.
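
   The per-router decision described above (which router to tunnel
   to, and which traffic to place in the tunnel) can be sketched as
   follows. This is a minimal illustration assuming symmetric link
   costs and a link failure. A destination is treated as affected if
   any shortest path to it crosses the failed link. The function names
   and the example topology are hypothetical.

```python
import heapq


def dijkstra(graph, src):
    """Shortest-path cost from src to every reachable node."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                      # stale heap entry
        for v, w in graph[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def tunnel_plan(graph, router, failed_link):
    """Return (tunnel endpoint, destinations to tunnel) for one router.

    graph is the pre-failure topology; failed_link is the pair (a, b).
    """
    a, b = failed_link
    from_router = dijkstra(graph, router)
    dist_from = {a: dijkstra(graph, a), b: dijkstra(graph, b)}
    # Tunnel to the closest router adjacent to the failure.
    endpoint = min((a, b), key=lambda n: from_router[n])
    affected = []
    for dest in sorted(graph):
        if dest == router:
            continue
        for near, far in ((a, b), (b, a)):
            via_link = (from_router[near] + graph[near][far]
                        + dist_from[far][dest])
            if via_link == from_router[dest]:   # shortest path uses the link
                affected.append(dest)
                break
    return endpoint, affected


graph = {
    "S": {"A": 1},
    "A": {"S": 1, "B": 1, "C": 2},
    "B": {"A": 1, "C": 2, "T": 1},
    "C": {"A": 2, "B": 2},
    "T": {"B": 1},
}
print(tunnel_plan(graph, "S", ("A", "B")))   # ('A', ['B', 'T'])
```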

   This method completes in bounded time, and is much faster than the
   incremental cost method. Depending on the exact design it completes
   in two or three flood-SPF-FIB update cycles.

   Where there is no requirement to prevent the formation of micro-
   loops involving non-micro-loop preventing routers, a single,
   "normal" announcement may be made, and a local timer used to
   determine the time at which transition from tunneled forwarding to
   normal forwarding over the new topology may commence.

   This technique has the disadvantage that it requires traffic to be
   tunneled during the transition. This is an issue in IP networks
   because not all router designs are capable of high performance IP
   tunneling. It is also an issue in MPLS networks because the
   encapsulating router has to know the label set that the
   decapsulating router is distributing.





   A further disadvantage of this method is that it requires
   co-operation from all the routers within the routing domain to
   fully protect the network against micro-loops. However it can be
   shown that these micro-loops will be confined to contiguous groups
   of routers not executing this micro-loop prevention mechanism, and
   that it will only affect traffic arriving at the network through
   one of those routers.

   When a new link is added, the mechanism is run in reverse. When the
   "covert" announcement is heard, routers determine which traffic
   they will send over the new link, and tunnel that traffic to the
   router on the near side of that link. This path will not be
   affected by the presence of the new link. When the "normal"
   announcement is heard, they then update their FIB to send the
   traffic normally according to the new topology. Any traffic
   encountering a router that has not yet updated its FIB will be
   tunneled to the near side of the link, and will therefore not loop.

   When a management change to the topology is required, again exactly
   the same mechanism protects against micro-looping of packets by the
   micro-loop preventing routers.

   When the failure is an SRLG, the required strategy is to classify
   traffic according to the first member of the SRLG that it will
   traverse on its way to the destination, and to tunnel that traffic
   to the router that is closest to that SRLG member. This will
   require multiple tunnel destinations, in the limiting case, one per
   SRLG member.
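
   The SRLG classification step can be sketched as below. The example
   assumes that each router already knows its pre-failure forwarding
   path to every destination as a list of routers. The function names
   and sample paths are hypothetical.

```python
def first_srlg_member(path, srlg):
    """Return the first SRLG link crossed by path, or None.

    path is a list of routers in forwarding order; srlg is a set of
    frozenset({a, b}) link identifiers.
    """
    for a, b in zip(path, path[1:]):
        if frozenset((a, b)) in srlg:
            return (a, b)        # a is the near-side router on this path
    return None


def tunnel_groups(paths, srlg):
    """Group destinations by the first SRLG member their path crosses."""
    groups = {}
    for dest, path in paths.items():
        member = first_srlg_member(path, srlg)
        if member is not None:   # unaffected destinations are not tunneled
            groups.setdefault(member, []).append(dest)
    return groups


srlg = {frozenset(("A", "B")), frozenset(("C", "D"))}
paths = {
    "B": ["S", "A", "B"],
    "D": ["S", "C", "D"],
    "E": ["S", "A", "B", "C", "D", "E"],
    "F": ["S", "F"],
}
print(tunnel_groups(paths, srlg))
# {('A', 'B'): ['B', 'E'], ('C', 'D'): ['D']}
```

   Each key of the result corresponds to one tunnel destination, which
   matches the limiting case of one tunnel per SRLG member.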


6.3.  Distributed Tunnels

   In the distributed tunnels loop prevention method, each router
   calculates its own PQ repair [TUNNEL] for its traffic affected by
   the failure. The path to the P router will not be affected by the
   convergence process. In a manner similar to the single tunnel case,
   traffic is repaired in response to the "covert" announcement and
   moved to a "natural" path using the new topology in response to a
   "normal" announcement.

   This reduces the load on the tunnel endpoints, but the length of
   time taken to calculate the repairs increases the convergence time.

   This method suffers from the same disadvantages as the single
   tunnel method.


6.4.  Packet Marking

   If packets could be marked in some way, this information could be
   used to assign them to the new topology, the old topology or a
   transition topology. They would then be correctly forwarded
   during the transition. This could, for example, be achieved by
   allocating a Type of Service bit to the task [RFC791]. This
   mechanism works identically for both "bad-news" and "good-news"
   events. It also works identically for SRLG failure. There are three
   problems with this solution:

     1. The packet marking bit is generally not available.

     2. The mechanism would introduce a non-standard forwarding
        procedure.

     3. Packet marking using either the old or the new topology would
        double the size of the FIB, although the use of a transition
        topology, for example always via the failure and its repair,
        would have a trivial impact on FIB size.


6.5.  Ordered SPFs

   Micro-loops occur following a failure or a cost increase, when a
   router closer to the failed component revises its routes to take
   account of the failure before a router which is further away. By
   analyzing the reverse spanning tree over which traffic is directed
   to the failed component in the old topology, it is possible to
   determine a strict ordering which ensures that nodes closer to the
   root always process the failure after any nodes further away, and
   hence micro-loops are prevented.

   When the failure has been announced, each router waits a multiple
   of some time delay value. The multiple is determined by the node's
   position in the reverse spanning tree, and the delay value is
   chosen to guarantee that a node can complete its processing within
   this time. The convergence time may be reduced by employing a
   signaling mechanism to notify the parent when all the children have
   completed their processing, and hence when it is safe for the
   parent to instantiate its new routes.

   This approach therefore imposes a delay which is bounded by the
   network diameter, although in many cases it will be much less.
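
   The timer-based ordering for a failure can be sketched as follows.
   The text does not spell out how the multiple is derived from a
   node's position in the tree. This example assumes it is the height
   of the node in the reverse spanning tree (the longest chain of
   descendants below it), which makes every parent wait longer than
   all of its children. The function name, the parent map and the
   delay value are hypothetical.

```python
def update_delays(parent, delay_value):
    """Return the FIB update delay for each router after a failure.

    parent maps each router to its next hop towards the failed
    component in the old topology; the root maps to None.
    """
    children = {}
    for node, up in parent.items():
        if up is not None:
            children.setdefault(up, []).append(node)

    rank = {}

    def height(node):
        # Leaves have height 0; a parent is one more than its tallest child.
        if node not in rank:
            rank[node] = 1 + max(
                (height(c) for c in children.get(node, [])), default=-1)
        return rank[node]

    return {node: height(node) * delay_value for node in parent}


# A is adjacent to the failure; C reaches it via B, D reaches it directly.
parent = {"A": None, "B": "A", "C": "B", "D": "A"}
print(update_delays(parent, 500))
# {'A': 1000, 'B': 500, 'C': 0, 'D': 0}
```

   The largest delay grows with the depth of the tree, consistent with
   the bound by network diameter stated above.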

   When a link is returned to service the convergence process above is
   reversed. A router first calculates the reverse spanning tree in
   the new topology rooted at the far end of the new link, and
   determines its distance from the new link (in hops). It then waits
   a time that is proportional to that distance before updating its
   FIB.  It will be seen that network management actions can similarly
   be undertaken by treating a cost increase in a manner similar to a
   failure and a cost decrease similar to a restoration.

   The ordered SPF mechanism requires all nodes in the domain to
   operate according to these procedures, and the presence of non
   co-operating nodes can give rise to loops for any traffic which
   traverses them (not just traffic which is originated through them).
   Without additional mechanisms these loops could remain in place for
   a significant time.

   It should be noted that this method requires per router ordering,
   but not per prefix ordering. A router must wait its turn to update
   its FIB, but it should then update its entire FIB.

   Another way of viewing the operation of this method is to realize
   that there is a horizon of routers affected by the failure. Routers
   beyond the horizon do not send packets via the failure. Routers at
   the horizon have a neighbor that does not send packets via the
   failure. Routers at the horizon can therefore use their neighbor
   that is over the horizon as a loop-free alternate to the
   destination, and can hence update their FIBs immediately. Once
   these routers have updated their FIBs, they move over the horizon
   with respect to the failure, and their neighbors that are closer to
   the failure become the new horizon routers.

   Only routers within the horizon need to change their FIBs and hence
   only those routers need to delay changing their FIBs.
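
   The horizon view can be illustrated with a small sketch. It is a
   simplification: a router is treated as being at the horizon when at
   least one of its neighbors in the post-failure topology is not in
   the set of affected routers, without any per-destination analysis.
   The function name horizon_rounds and the example topology are
   assumptions made for the illustration.

```python
def horizon_rounds(neighbors, affected):
    """Group the affected routers into successive horizons.

    neighbors -- router -> list of neighbors in the post-failure
                 topology (the failed link must not be included)
    affected  -- routers that currently send packets via the failure
    """
    remaining = set(affected)
    rounds = []
    while remaining:
        # At the horizon: some neighbor is already over the horizon.
        horizon = {
            router for router in remaining
            if any(n not in remaining for n in neighbors[router])
        }
        if not horizon:
            raise ValueError("no affected router is at the horizon")
        rounds.append(sorted(horizon))
        remaining -= horizon           # these move over the horizon
    return rounds


# Hypothetical chain A-B-C-D-Y-X in which link A-X has failed.
# A, B, C and D all used A-X; Y and X never sent packets via it.
NEIGHBORS = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "Y"],
    "Y": ["D", "X"],
    "X": ["Y"],
}
ROUNDS = horizon_rounds(NEIGHBORS, affected={"A", "B", "C", "D"})
```

   Here D is the first horizon router, followed by C, B and finally A,
   which is adjacent to the failure.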

   When an SRLG failure occurs, a router must classify traffic into
   the classes that pass over each member of the SRLG. Ordered SPF
   convergence is then carried out on each SRLG member individually,
   and the FIB is updated for only those prefixes allowed to change at
   each epoch. Again, as for the single failure case, signaling may be
   used to speed up the convergence process. Note that the special
   SRLG case of a full or partial node failure can be dealt with
   without using per-prefix ordering, by running a single reverse SPF
   rooted at the failed node (or at the common point of the subset of
   failing links in the partial case).

   There are two classes of signaling optimization that can be applied
   to the ordered SPF loop-prevention method:

     1. When the router makes NO change, it can signal immediately.
        This significantly reduces the time taken by the network to
        process long chains of routers that have no change to make to
        their FIB.

     2. When a router HAS changed, it can signal that it has
        completed. This is more problematic since this may be
        difficult to determine, particularly in a distributed
        architecture, and the optimization obtained is only the
        difference between the actual time taken to make the FIB
        change and the worst case timer value.
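
   The effect of the two optimizations can be illustrated with a
   simple model. It assumes that the waiting relationship between
   routers forms a tree (each parent waits for its children), that a
   timer-only router waits one worst-case interval per level below it,
   and that signals are delivered instantly. The function name, the
   chain of routers and all time values are hypothetical.

```python
def start_times(children, update_time, worst_case, root):
    """Return (timer_start, signal_start) for every router.

    children    -- router -> list of routers it must wait for
    update_time -- router -> actual FIB update time; absent or 0
                   means the router has no change to make
    worst_case  -- worst-case timer value allowed per router
    """
    rank, timer_start = {}, {}
    signal_start, signal_done = {}, {}

    def visit(router):
        kids = children.get(router, [])
        for kid in kids:
            visit(kid)
        # Timer only: one worst-case interval per level below.
        rank[router] = 1 + max((rank[k] for k in kids), default=-1)
        timer_start[router] = rank[router] * worst_case
        # With signaling: start once every child has signaled.
        signal_start[router] = max(
            (signal_done[k] for k in kids), default=0.0)
        signal_done[router] = (
            signal_start[router] + update_time.get(router, 0.0))

    visit(root)
    return timer_start, signal_start


# Hypothetical chain: R1 waits for R2, R2 for R3, R3 for R4.
# R2 and R3 have no change to make to their FIBs.
CHILDREN = {"R1": ["R2"], "R2": ["R3"], "R3": ["R4"]}
UPDATE_TIME = {"R1": 0.3, "R4": 0.2}
TIMER, SIGNAL = start_times(CHILDREN, UPDATE_TIME, 1.0, "R1")
```

   In this model R1 starts after three worst-case intervals when only
   timers are used, but as soon as R4 has finished when signaling is
   used, because R2 and R3 signal immediately.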

   There is another method of executing ordered SPF which is based on
   pure signaling [OB]. Methods that use signaling as an optimization
   are safe because they eventually fall back on the established IGP
   mechanisms, which ensure that networks converge under conditions of
   packet loss. However, a mechanism that relies on signaling in order
   to converge requires a reliable signaling mechanism, which must be
   proven to recover from any failure circumstance.

6.6.     Synchronised FIB Updates

   Micro-loops form because of the asynchronous nature of the FIB
   update process during a network transition. In many router
   architectures it is the time taken to update the FIB itself that is
   the dominant term. One approach would be to have two FIBs and, in a
   synchronized action throughout the network, to switch from the old
   to the new. One way to achieve this synchronized change would be to
   signal or otherwise determine the wall clock time of the change,
   and then execute the change at that time, using NTP to synchronize
   the wall clocks in the routers.

   This approach has two major issues. Firstly, two complete FIBs are
   needed, which may create a scaling issue. Secondly, a suitable
   network-wide synchronization method is needed. However, neither of
   these is an insurmountable problem.

   Since the FIB change synchronization will not be perfect there may
   be some interval during which micro-loops form. Whether this scheme
   is classified as a micro-loop prevention mechanism or a micro-loop
   avoidance mechanism within this taxonomy is therefore dependent on
   the degree of synchronization achieved.
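
   The two-FIB switch-over, and the window that opens when clocks are
   not perfectly synchronized, can be sketched as follows. The class
   DualFibRouter, its methods, the clock offset and the FIB contents
   are all hypothetical; the true time is passed in explicitly so that
   the example is deterministic.

```python
class DualFibRouter:
    """Router holding an active FIB and a staged replacement."""

    def __init__(self, fib, clock_offset=0.0):
        self.active = dict(fib)    # FIB currently used to forward
        self.staged = None         # new FIB awaiting the switch
        self.switch_at = None      # agreed wall clock switch time
        self.clock_offset = clock_offset   # local clock error (s)

    def stage(self, new_fib, switch_at):
        """Install the new FIB alongside the old one."""
        self.staged = dict(new_fib)
        self.switch_at = switch_at

    def next_hop(self, prefix, true_time):
        """Look up `prefix`, switching FIBs once the time is due."""
        local_time = true_time + self.clock_offset
        if self.staged is not None and local_time >= self.switch_at:
            self.active, self.staged = self.staged, None
        return self.active[prefix]


# Hypothetical change: A moves prefix P from B to C, and B moves it
# from D to A. A's clock runs 10 ms behind B's.
router_a = DualFibRouter({"P": "B"}, clock_offset=-0.010)
router_b = DualFibRouter({"P": "D"})
router_a.stage({"P": "C"}, switch_at=100.0)
router_b.stage({"P": "A"}, switch_at=100.0)
```

   Between true times 100.000 and 100.010 in this example B already
   forwards P to A while A still forwards P to B, so a micro-loop can
   exist for the duration of the clock error.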

   This mechanism works identically for both "bad-news" and
   "good-news" events. It also works identically for SRLG failure.

   Further consideration needs to be given to interoperating with
   routers that do not support this mechanism. Without a suitable
   interoperating mechanism, loops may form for the duration of the
   synchronization delay.



7.    Loop Suppression

   A micro-loop suppression mechanism recognizes that a packet is
   looping and drops it. One such approach would be for a router to
   recognize, by some means, that it had seen the same packet before.
   It is difficult to see how sufficiently reliable discrimination
   could be achieved without some form of per-router signature such as
   route recording. A packet-recognizing approach therefore seems
   infeasible.

   An alternative approach would be to recognize that a packet was
   looping by recognizing that it was being sent back to the place
   that it had just come from. This would work for the types of loop
   that form in symmetric cost networks, but would not suppress the
   cyclic loops that form in asymmetric networks.
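
   The following sketch illustrates this check and its limitation.
   Each router drops a packet whose next hop is the neighbor from
   which the packet has just arrived. The function names and the two
   small topologies are assumptions made for the example.

```python
def forward(fib, prefix, arrived_from):
    """Return the next hop, or None if the packet is dropped."""
    next_hop = fib[prefix]
    if next_hop == arrived_from:
        return None        # would go back to where it came from
    return next_hop


def trace(fibs, prefix, start, max_hops=16):
    """Follow a packet from `start`; return (path, outcome)."""
    path = [start]
    previous, node = None, start
    for _ in range(max_hops):
        next_hop = forward(fibs[node], prefix, previous)
        if next_hop is None:
            return path, "dropped"
        path.append(next_hop)
        if next_hop not in fibs:
            return path, "delivered"   # left the modelled routers
        previous, node = node, next_hop
    return path, "hop limit"


# Loop between two routers, as forms in symmetric cost networks.
TWO_NODE = {"A": {"P": "B"}, "B": {"P": "A"}}
# Cyclic loop around three routers.
THREE_NODE = {"A": {"P": "B"}, "B": {"P": "C"}, "C": {"P": "A"}}
```

   With these tables the packet in the two-router loop is dropped at
   B, whereas the packet in the three-router cycle is never sent back
   to its previous hop and keeps circulating until the hop limit of
   the sketch is reached.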

   This mechanism operates identically for "bad-news" events,
   "good-news" events and SRLG failure.



   The problem with this class of micro-loop control strategies is
   that whilst they prevent collateral damage they do nothing to
   enhance the productive forwarding of packets during the network
   transition.



8.    Compatibility Issues

   Deployment of any micro-loop control mechanism is a major change to
   a network. Full consideration must be given to interoperation
   between routers that are capable of micro-loop control, and those
   that are not. Additionally there may be a desire to limit the
   complexity of micro-loop control by choosing a method based purely
   on its simplicity. Any such decision must take into account that if
   a more capable scheme is needed in the future, its deployment will
   be complicated by interaction with the scheme previously deployed.



9.    Comparison of Loop-free Convergence Methods

   Safe-neighbors is an efficient mechanism to prevent the formation
   of micro-loops, but is only a partial solution. It is a useful
   adjunct to one of the complete solutions.

   Incremental cost advertisement is impractical because it takes too
   long to complete.

   Packet Marking is impractical because of the need to find the
   marking bit.

   Of the remaining methods, distributed tunnels is significantly more
   complex than single tunnels. It should only be considered if a
   tunnel solution is preferred and, even with the use of loop
   mitigation, the tunnel decapsulation load on the router adjacent to
   the topology change needs to be reduced.

   Synchronised FIBs is a fast method, but has the issue that a
   suitable synchronization mechanism needs to be defined. One method
   would be to use NTP; however, the coupling of routing convergence
   to a protocol that uses the network may be a problem. During the
   transition there will be some micro-looping for a short interval
   because it is not possible to achieve complete synchronization of
   the FIB changeover.

   The ordered SPF mechanism has the major advantage that it is a
   control plane only solution. However, SRLGs require a
   per-destination calculation, and the convergence delay is high,
   bounded by the network diameter. When combined with signaling as an
   accelerator and safe-neighbors to reduce the number of destinations
   that experience the full delay, this method is one of the two best
   choices.

   The single tunnel method deals relatively easily with SRLGs and
   uncorrelated changes. The convergence delay would be small. However
   the method requires the use of tunneled forwarding which is not
   supported on all router hardware, and raises issues of forwarding
   performance. When used with safe-neighbors, the amount of traffic
   that was tunneled would be significantly reduced, thus reducing the
   forwarding performance concerns. This method would be a good choice
   in combination with a tunneled IPFRR method. It is the other
   promising loop prevention candidate.



10.     IANA considerations

   There are no IANA considerations that arise from this draft.



11.     Security Considerations

   All micro-loop control mechanisms raise significant security issues
   which must be addressed in their detailed technical description.



12.     Intellectual Property Statement


   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed
   to pertain to the implementation or use of the technology described
   in this document or the extent to which any license under such
   rights might or might not be available; nor does it represent that
   it has made any independent effort to identify any such rights.
   Information on the procedures with respect to rights in RFC
   documents can be found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use
   of such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository
   at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.





13.      Full copyright statement

   Copyright (C) The Internet Society (2005). This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.

   This document and the information contained herein are provided on
   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES,
   EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT
   THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR
   ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A
   PARTICULAR PURPOSE.



14.     Normative References

   There are no normative references.



15.     Informative References

   Internet-drafts are works in progress available from
   <http://www.ietf.org/internet-drafts/>

   [APPL]        Bryant, S., Shand, M., "Applicability of
                 Loop-free Convergence",
                 <draft-bryant-shand-lf-applicability-00.txt>,
                 Jun 2005, (work in progress).

   [OB]          Francois, P., Bonaventure, O., "Avoiding
                 transient loops during IGP convergence",
                 IEEE INFOCOM 2005, March 2005, Miami, Fl., USA.

   [IPFRR]       Shand, M., "IP Fast-reroute Framework",
                 <draft-ietf-rtgwg-ipfrr-framework-01.txt>, June
                 2004, (work in progress).

   [LDP]         Andersson, L., Doolan, P., Feldman, N.,
                 Fredette, A. and B. Thomas, "LDP
                 Specification", RFC 3036,
                 January 2001.

   [MPLS-TE]     Ping Pan, et al, "Fast Reroute Extensions to
                 RSVP-TE for LSP Tunnels",
                 <draft-ietf-mpls-rsvp-lsp-fastreroute-07.txt>,
                 (work in progress).

   [RFC791]      Postel, J., "Internet Protocol", RFC 791,
                 September 1981.

   [TUNNEL]      Bryant, S., Shand, M., "IP Fast Reroute using
                 tunnels", <draft-bryant-ipfrr-tunnels-02.txt>,
                 Apr 2005 (work in progress).

   [ZININ]       Zinin, A., "Analysis and Minimization of
                 Microloops in Link-state Routing Protocols",
                 <draft-zinin-microloop-analysis-01.txt>, May
                 2005 (work in progress).




16.    Authors' Addresses


   Mike Shand
   Cisco Systems,
   250, Longwater,
   Green Park,
   Reading, RG2 6GB,
   United Kingdom.             Email: mshand@cisco.com



   Stewart Bryant
   Cisco Systems,
   250, Longwater,
   Green Park,
   Reading, RG2 6GB,
   United Kingdom.             Email: stbryant@cisco.com





















