draft-ietf-rtgwg-ipfrr-framework-07.txt   draft-ietf-rtgwg-ipfrr-framework-08.txt 
Network Working Group M. Shand Network Working Group M. Shand
Internet Draft S. Bryant Internet-Draft S. Bryant
Expiration Date: Dec 2007 Cisco Systems Intended status: Informational Cisco Systems
IP Fast Reroute Framework Expires: August 28, 2008 February 25, 2008
draft-ietf-rtgwg-ipfrr-framework-07.txt IP Fast Reroute Framework
draft-ietf-rtgwg-ipfrr-framework-08.txt
Status of this Memo Status of this Memo
By submitting this Internet-Draft, each author represents that any By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79. aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that other Task Force (IETF), its areas, and its working groups. Note that
groups may also distribute working documents as Internet-Drafts. other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress". material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/1id-abstracts.txt http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
This Internet-Draft will expire on August 28, 2008.
Copyright Notice Copyright Notice
Copyright (C) The IETF Trust (2007). All rights reserved. Copyright (C) The IETF Trust (2008).
Abstract Abstract
This document provides a framework for the development of IP This document provides a framework for the development of IP fast-
fast-reroute mechanisms which provide protection against link or reroute mechanisms which provide protection against link or router
router failure by invoking locally determined repair paths. Unlike failure by invoking locally determined repair paths. Unlike MPLS
MPLS Fast-reroute, the mechanisms are applicable to a network Fast-reroute, the mechanisms are applicable to a network employing
employing conventional IP routing and forwarding. An essential part conventional IP routing and forwarding. An essential part of such
of such mechanisms is the prevention of packet loss caused by the mechanisms is the prevention of packet loss caused by the loops which
loops which normally occur during the re-convergence of the network normally occur during the re-convergence of the network following a
following a failure. failure.
Terminology Table of Contents
1. Terminology
2. Introduction
3. Problem Analysis
4. Mechanisms for IP Fast-reroute
4.1. Mechanisms for fast failure detection
4.2. Mechanisms for repair paths
4.2.1. Scope of repair paths
4.2.2. Analysis of repair coverage
4.2.3. Link or node repair
4.2.4. Maintenance of Repair paths
4.2.5. Multiple failures and Shared Risk Link Groups
4.3. Local Area Networks
4.4. Mechanisms for micro-loop prevention
5. Management Considerations
6. Scope and applicability
7. IANA Considerations
8. Security Considerations
9. Acknowledgements
10. Informative References
Authors' Addresses
Intellectual Property and Copyright Statements
1. Terminology
This section defines words and acronyms used in this draft and other This section defines words and acronyms used in this draft and other
drafts discussing IP Fast-reroute. drafts discussing IP Fast-reroute.
D Used to denote the destination router under D Used to denote the destination router under
discussion. discussion.
Distance_opt(A,B) The distance of the shortest path from A Distance_opt(A,B) The distance of the shortest path from A to B.
to B.
Downstream Path This is a subset of the loop-free alternates Downstream Path This is a subset of the loop-free alternates
where the neighbor N meets the following where the neighbor N meets the following
condition:- condition:-
Distance_opt(N, D) < Distance_opt(S,D) Distance_opt(N, D) < Distance_opt(S,D)
E Used to denote the router which is the E Used to denote the router which is the primary
primary next-hop neighbor to get from S to next-hop neighbor to get from S to the
the destination D. Where there is an ECMP set destination D. Where there is an ECMP set for the
for the shortest path from S to D, these are shortest path from S to D, these are referred to
referred to as E_1, E_2, etc. as E_1, E_2, etc.
ECMP Equal cost multi-path: Where, for a ECMP Equal cost multi-path: Where, for a particular
particular destination D, multiple primary destination D, multiple primary next-hops are
next-hops are used to forward traffic because used to forward traffic because there exist
there exist multiple shortest paths from S multiple shortest paths from S via different
via different output layer-3 interfaces. output layer-3 interfaces.
FIB Forwarding Information Base. The database FIB Forwarding Information Base. The database used
used by the packet forwarder to determine by the packet forwarder to determine what actions
what actions to perform on a packet. to perform on a packet.
IPFRR IP fast-reroute IPFRR IP fast-reroute.
Link(A->B) A link connecting router A to router B. Link(A->B) A link connecting router A to router B.
LFA Loop-Free Alternate. This is a neighbor N, that is not a primary
next-hop neighbor E, whose shortest path to the destination D does not
go back through the router S. The neighbor N must meet the following
condition:-
Distance_opt(N, D) < Distance_opt(N, S) + Distance_opt(S, D)
Loop-Free Neighbor A neighbor N_i, which is not the particular primary
neighbor E_k under discussion, and whose shortest path to D does not
traverse S. For example, if there are two primary neighbors E_1 and
E_2, E_1 is a loop-free neighbor with regard to E_2 and vice versa.
Loop-Free Link-Protecting Alternate This is a path via a Loop-Free
Neighbor N_i which does not go through the particular link of S which
is being protected to reach the destination D.
Loop-Free Node-Protecting Alternate This is a path via a Loop-Free
Neighbor N_i which does not go through the particular primary neighbor
of S which is being protected to reach the destination D.
Micro-loop A temporary forwarding loop which exists during a routing
transition as a result of temporary inconsistencies between FIBs.
N_i The ith neighbor of S. N_i The ith neighbor of S.
Primary Neighbor A neighbor N_i of S which is one of the next Primary Neighbor A neighbor N_i of S which is one of the next hops
hops for destination D in S's FIB prior to for destination D in S's FIB prior to any
any failure. failure.
R_i_j The jth neighbor of N_i. R_i_j The jth neighbor of N_i.
Routing The process whereby routers converge on a new Routing Transition The process whereby routers converge on a new
transition topology. In conventional networks this topology. In conventional networks this process
process frequently causes some disruption to frequently causes some disruption to packet
packet delivery. delivery.
RPF Reverse Path Forwarding. I.e. checking that a RPF Reverse Path Forwarding. I.e. checking that a
packet is received over the interface which packet is received over the interface which would
would be used to send packets addressed to be used to send packets addressed to the source
the source address of the packet. address of the packet.
S Used to denote a router that is the source of S Used to denote a router that is the source of a
a repair that is computed in anticipation of repair that is computed in anticipation of the
the failure of a neighboring router denoted failure of a neighboring router denoted as E, or
as E, or of the link between S and E. It is of the link between S and E. It is the viewpoint
the viewpoint from which IP Fast-Reroute is from which IP Fast-Reroute is described.
described.
S_i The set of neighbors of E, in addition to S, S_i The set of neighbors of E, in addition to S,
which will independently take the role of S which will independently take the role of S for
for the traffic they carry. the traffic they carry.
SPF Shortest Path First, e.g. Dijkstra's SPF Shortest Path First, e.g. Dijkstra's algorithm.
algorithm.
SPT Shortest path tree SPT Shortest path tree
Upstream This is a forwarding loop which involves a Upstream Forwarding Loop
Forwarding Loop set of routers, none of which are directly This is a forwarding loop which involves a set of
connected to the link which has caused the routers, none of which are directly connected to
topology change that triggered a new SPF in the link which has caused the topology change
any of the routers. that triggered a new SPF in any of the routers.
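The Distance_opt conditions in the Loop-Free Alternate and Downstream Path definitions of this section can be checked mechanically once the shortest path distances are known. The following is a minimal sketch, not a specified algorithm: it assumes a small connected topology held as a Python dictionary of link costs, and the names distance_opt, classify_neighbors and GRAPH are hypothetical and introduced only for illustration.

```python
import heapq

def distance_opt(graph, src):
    """Dijkstra's algorithm: shortest distance from src to each reachable node."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost in graph[u].items():
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def classify_neighbors(graph, s, d):
    """Label each neighbor N of S with respect to destination D.

    Assumes every node can reach D (a connected example topology).
    """
    dist_s = distance_opt(graph, s)
    labels = {}
    for n, cost in graph[s].items():
        dist_n = distance_opt(graph, n)
        if cost + dist_n[d] == dist_s[d]:
            labels[n] = "primary"          # N is on a shortest path from S to D
        elif dist_n[d] < dist_n[s] + dist_s[d]:
            # Loop-free alternate; it is also a downstream path when
            # Distance_opt(N, D) < Distance_opt(S, D).
            labels[n] = "downstream" if dist_n[d] < dist_s[d] else "lfa"
        else:
            labels[n] = "none"             # N's shortest path returns through S
    return labels

# Hypothetical topology with symmetric link costs.
GRAPH = {
    "S":  {"E": 1, "N1": 1, "N2": 1, "N3": 3},
    "E":  {"S": 1, "D": 1},
    "N1": {"S": 1, "D": 2},
    "N2": {"S": 1},
    "N3": {"S": 3, "D": 1},
    "D":  {"E": 1, "N1": 2, "N3": 1},
}

if __name__ == "__main__":
    print(classify_neighbors(GRAPH, "S", "D"))
```

In this topology E is the primary neighbor, N1 satisfies only the loop-free condition, N3 also satisfies the downstream condition, and N2 can reach D only by returning through S.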
1. Introduction 2. Introduction
When a link or node failure occurs in a routed network, there is When a link or node failure occurs in a routed network, there is
inevitably a period of disruption to the delivery of traffic until inevitably a period of disruption to the delivery of traffic until
the network re-converges on the new topology. Packets for the network re-converges on the new topology. Packets for
destinations which were previously reached by traversing the failed destinations which were previously reached by traversing the failed
component may be dropped or may suffer looping. Traditionally such component may be dropped or may suffer looping. Traditionally such
disruptions have lasted for periods of at least several seconds, and disruptions have lasted for periods of at least several seconds, and
most applications have been constructed to tolerate such a quality of most applications have been constructed to tolerate such a quality of
service. service.
skipping to change at page 5, line 31 skipping to change at page 5, line 41
Addressing these issues is difficult because the distributed nature Addressing these issues is difficult because the distributed nature
of the network imposes an intrinsic limit on the minimum convergence of the network imposes an intrinsic limit on the minimum convergence
time which can be achieved. time which can be achieved.
However, there is an alternative approach, which is to compute backup However, there is an alternative approach, which is to compute backup
routes that allow the failure to be repaired locally by the router(s) routes that allow the failure to be repaired locally by the router(s)
detecting the failure without the immediate need to inform other detecting the failure without the immediate need to inform other
routers of the failure. In this case, the disruption time can be routers of the failure. In this case, the disruption time can be
limited to the small time taken to detect the adjacent failure and limited to the small time taken to detect the adjacent failure and
invoke the backup routes. This is analogous to the technique employed invoke the backup routes. This is analogous to the technique
by MPLS Fast-Reroute [MPLSFRR], but the mechanisms employed for the employed by MPLS Fast-Reroute [RFC4090], but the mechanisms employed
backup routes in pure IP networks are necessarily very different. for the backup routes in pure IP networks are necessarily very
different.
This document provides a framework for the development of this This document provides a framework for the development of this
approach. approach.
2. Problem Analysis 3. Problem Analysis
The duration of the packet delivery disruption caused by a The duration of the packet delivery disruption caused by a
conventional routing transition is determined by a number of factors: conventional routing transition is determined by a number of factors:
1. The time taken to detect the failure. This may be of the order 1. The time taken to detect the failure. This may be of the order
of a few ms when it can be detected at the physical layer, up to of a few ms when it can be detected at the physical layer, up to
several tens of seconds when a routing protocol hello is several tens of seconds when a routing protocol hello is
employed. During this period packets will be unavoidably lost. employed. During this period packets will be unavoidably lost.
2. The time taken for the local router to react to the failure. 2. The time taken for the local router to react to the failure.
This will typically involve generating and flooding new routing This will typically involve generating and flooding new routing
updates, perhaps after some hold-down delay, and re-computing updates, perhaps after some hold-down delay, and re-computing the
the router's FIB. router's FIB.
3. The time taken to pass the information about the failure to 3. The time taken to pass the information about the failure to other
other routers in the network. In the absence of routing protocol routers in the network. In the absence of routing protocol
packet loss, this is typically between 10 ms and 100 ms per hop. packet loss, this is typically between 10 ms and 100 ms per hop.
4. The time taken to re-compute the forwarding tables. This is 4. The time taken to re-compute the forwarding tables. This is
typically a few ms for a link state protocol using Dijkstra's typically a few ms for a link state protocol using Dijkstra's
algorithm. algorithm.
5. The time taken to load the revised forwarding tables into the 5. The time taken to load the revised forwarding tables into the
forwarding hardware. This time is very implementation dependent forwarding hardware. This time is very implementation dependent
and also depends on the number of prefixes affected by the and also depends on the number of prefixes affected by the
failure, but may be several hundred ms. failure, but may be several hundred ms.
The disruption will last until the routers adjacent to the failure The disruption will last until the routers adjacent to the failure
have completed steps 1 and 2, and then all the routers in the network have completed steps 1 and 2, and then all the routers in the network
whose paths are affected by the failure have completed the remaining whose paths are affected by the failure have completed the remaining
steps. steps.
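As a rough illustration of how these factors add up, the sketch below sums the five steps for the router furthest from the failure. It is a simplification that treats the steps as strictly serial. The function name, the parameter names and all of the numeric inputs are assumptions chosen from within the ranges quoted above; the reaction time in particular is hypothetical, since this document gives no figure for it.

```python
def conventional_disruption_ms(detect, react, flood_per_hop, hops, spf, fib_load):
    """Serial estimate of the disruption seen via the furthest affected router.

    detect        step 1: failure detection time (ms)
    react         step 2: local reaction time (ms), hypothetical value
    flood_per_hop step 3: per-hop flooding delay (ms)
    hops          number of hops to the furthest affected router
    spf           step 4: forwarding table re-computation (ms)
    fib_load      step 5: loading the forwarding hardware (ms)
    """
    return detect + react + flood_per_hop * hops + spf + fib_load

# Illustrative inputs only: 20 ms detection, 50 ms reaction, 50 ms per hop
# over 5 hops, 5 ms SPF and 300 ms FIB load.
estimate = conventional_disruption_ms(20, 50, 50, 5, 5, 300)

if __name__ == "__main__":
    print(estimate, "ms")
```

With these inputs the flooding and FIB load terms dominate the total, which is in line with the observation that step 5 is often the largest factor.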
The initial packet loss is caused by the router(s) adjacent to the The initial packet loss is caused by the router(s) adjacent to the
failure continuing to attempt to transmit packets across the failure failure continuing to attempt to transmit packets across the failure
until it is detected. This loss is unavoidable, but the detection until it is detected. This loss is unavoidable, but the detection
time can be reduced to a few tens of ms as described in section 3.1. time can be reduced to a few tens of ms as described in Section 4.1.
Subsequent packet loss is caused by the "micro-loops" which form Subsequent packet loss is caused by the "micro-loops" which form
because of temporary inconsistencies between routers' forwarding because of temporary inconsistencies between routers' forwarding
tables. These occur as a result of the different times at which tables. These occur as a result of the different times at which
routers update their forwarding tables to reflect the failure. These routers update their forwarding tables to reflect the failure. These
variable delays are caused by steps 3, 4 and 5 above and in many variable delays are caused by steps 3, 4 and 5 above and in many
routers it is step 5 which is both the largest factor and which has routers it is step 5 which is both the largest factor and which has
the greatest variance between routers. The large variance arises from the greatest variance between routers. The large variance arises
implementation differences and from the differing impact that a from implementation differences and from the differing impact that a
failure has on each individual router. For example, the number of failure has on each individual router. For example, the number of
prefixes affected by the failure may vary dramatically from one prefixes affected by the failure may vary dramatically from one
router to another. router to another.
In order to achieve packet disruption times which are commensurate In order to achieve packet disruption times which are commensurate
with the failure detection times it is necessary to perform two with the failure detection times it is necessary to perform two
distinct tasks: distinct tasks:
1. Provide a mechanism for the router(s) adjacent to the failure to 1. Provide a mechanism for the router(s) adjacent to the failure to
rapidly invoke a repair path, which is unaffected by any rapidly invoke a repair path, which is unaffected by any
skipping to change at page 7, line 15 skipping to change at page 7, line 38
Similarly, micro-loop avoidance can be used in isolation to prevent Similarly, micro-loop avoidance can be used in isolation to prevent
loops arising from pre-planned management action, because the link or loops arising from pre-planned management action, because the link or
node being shut down can remain in service for a short time after its node being shut down can remain in service for a short time after its
removal has been announced into the network, and hence it can removal has been announced into the network, and hence it can
function as its own "repair path". function as its own "repair path".
Note that micro-loops can also occur when a link or node is restored Note that micro-loops can also occur when a link or node is restored
to service and thus a micro-loop avoidance mechanism is required for to service and thus a micro-loop avoidance mechanism is required for
both link up and link down cases. both link up and link down cases.
3. Mechanisms for IP Fast-reroute 4. Mechanisms for IP Fast-reroute
The set of mechanisms required for an effective solution to the The set of mechanisms required for an effective solution to the
problem can be broken down into the following sub-problems. problem can be broken down into the following sub-problems.
3.1. Mechanisms for fast failure detection 4.1. Mechanisms for fast failure detection
It is critical that the failure detection time is minimized. A number It is critical that the failure detection time is minimized. A
of approaches are possible, such as: number of approaches are possible, such as:
1. Physical detection; for example, loss of light. 1. Physical detection; for example, loss of light.
2. Routing protocol independent protocol detection; for example, 2. Routing protocol independent protocol detection; for example, The
The Bidirectional Forwarding Detection protocol [BFD]. Bidirectional Forwarding Detection protocol [I-D.ietf-bfd-base].
3. Routing protocol detection; for example, use of "fast hellos". 3. Routing protocol detection; for example, use of "fast hellos".
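For the hello-based approaches, the detection time is bounded by the hello interval multiplied by the number of consecutive hellos that may be missed. The following sketch shows only that timer logic. It is not the BFD state machine, and the class name HelloMonitor and its methods are hypothetical.

```python
class HelloMonitor:
    """Declares a neighbor down after `multiplier` hello intervals of silence."""

    def __init__(self, interval_ms, multiplier):
        self.dead_ms = interval_ms * multiplier   # worst-case detection time
        self.last_heard_ms = 0

    def hello_received(self, now_ms):
        """Record the arrival time of a hello from the neighbor."""
        self.last_heard_ms = now_ms

    def neighbor_down(self, now_ms):
        """True once no hello has arrived for the whole dead interval."""
        return now_ms - self.last_heard_ms >= self.dead_ms

if __name__ == "__main__":
    monitor = HelloMonitor(interval_ms=10, multiplier=3)
    monitor.hello_received(now_ms=100)
    print(monitor.neighbor_down(now_ms=129))   # still within the dead interval
    print(monitor.neighbor_down(now_ms=130))   # dead interval has elapsed
```

A 10 ms interval with a multiplier of 3 gives a 30 ms bound, which matches the "few tens of ms" order of magnitude quoted in the problem analysis.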
3.2. Mechanisms for repair paths 4.2. Mechanisms for repair paths
Once a failure has been detected by one of the above mechanisms, Once a failure has been detected by one of the above mechanisms,
traffic which previously traversed the failure is transmitted over traffic which previously traversed the failure is transmitted over
one or more repair paths. The design of the repair paths should be one or more repair paths. The design of the repair paths should be
such that they can be pre-calculated in anticipation of each local such that they can be pre-calculated in anticipation of each local
failure and made available for invocation with minimal delay. There failure and made available for invocation with minimal delay. There
are three basic categories of repair paths: are three basic categories of repair paths:
1. Equal cost multi-paths (ECMP). Where such paths exist, and one 1. Equal cost multi-paths (ECMP). Where such paths exist, and one
or more of the alternate paths do not traverse the failure, they or more of the alternate paths do not traverse the failure, they
may trivially be used as repair paths. may trivially be used as repair paths.
2. Loop free alternate paths. Such a path exists when a direct 2. Loop free alternate paths. Such a path exists when a direct
neighbor of the router adjacent to the failure has a path to the neighbor of the router adjacent to the failure has a path to the
destination which can be guaranteed not to traverse the failure. destination which can be guaranteed not to traverse the failure.
3. Multi-hop repair paths. When there is no feasible loop free 3. Multi-hop repair paths. When there is no feasible loop free
alternate path it may still be possible to locate a router, alternate path it may still be possible to locate a router, which
which is more than one hop away from the router adjacent to the is more than one hop away from the router adjacent to the
failure, from which traffic will be forwarded to the destination failure, from which traffic will be forwarded to the destination
without traversing the failure. without traversing the failure.
ECMP and loop free alternate paths (as described in [BASE]) offer the ECMP and loop free alternate paths (as described in
simplest repair paths and would normally be used when they are [I-D.ietf-rtgwg-ipfrr-spec-base]) offer the simplest repair paths and
available. It is anticipated that around 80% of failures (see section would normally be used when they are available. It is anticipated
3.2.2) can be repaired using these basic methods alone. that around 80% of failures (see Section 4.2.2) can be repaired using
these basic methods alone.
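The preference among the three categories can be sketched as a simple selection over candidates that have already been computed for a given failure. This is an illustration under stated assumptions rather than a required procedure: the function select_repair and its arguments are hypothetical, and the candidate lists are assumed to contain only paths already known not to traverse the failure.

```python
def select_repair(ecmp_next_hops, failed, lfa_neighbors, multihop_paths):
    """Pick a repair for one destination, preferring the simplest category.

    ecmp_next_hops  primary next hops for the destination before the failure
    failed          the next hop that has just failed
    lfa_neighbors   pre-computed loop free alternate neighbors
    multihop_paths  pre-computed multi-hop repair paths (lists of routers)
    """
    surviving = [nh for nh in ecmp_next_hops if nh != failed]
    if surviving:
        return ("ecmp", surviving[0])        # category 1: another equal cost path
    if lfa_neighbors:
        return ("lfa", lfa_neighbors[0])     # category 2: loop free alternate
    if multihop_paths:
        return ("multi-hop", multihop_paths[0])  # category 3
    return ("none", None)                    # destination cannot be repaired

if __name__ == "__main__":
    print(select_repair(["E_1", "E_2"], "E_1", ["N_1"], []))
    print(select_repair(["E_1"], "E_1", [], [["N_2", "R_2_1"]]))
```

Because the selection depends only on pre-computed inputs, its result can be stored per destination in advance, so that invoking the repair after detection involves no further computation.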
Multi-hop repair paths are more complex, both in the computations Multi-hop repair paths are more complex, both in the computations
required to determine their existence, and in the mechanisms required required to determine their existence, and in the mechanisms required
to invoke them. They can be further classified as: to invoke them. They can be further classified as:
1. Mechanisms where one or more alternate FIBs are pre-computed in 1. Mechanisms where one or more alternate FIBs are pre-computed in
all routers and the repaired packet is instructed to be all routers and the repaired packet is instructed to be forwarded
forwarded using a "repair FIB" by some method of per packet using a "repair FIB" by some method of per packet signaling such
signaling such as detecting a "U-turn" [U-TURNS, FIFR] or by as detecting a "U-turn" [I-D.atlas-ip-local-protect-uturn],
marking the packet [SIMULA]. [FIFR] or by marking the packet [SIMULA].
2. Mechanisms functionally equivalent to a loose source route which 2. Mechanisms functionally equivalent to a loose source route which
is invoked using the normal FIB. These include tunnels is invoked using the normal FIB. These include tunnels
[TUNNELS], alternative shortest paths [ALT-SP] and label based [I-D.bryant-ipfrr-tunnels], alternative shortest paths
mechanisms. [I-D.tian-frr-alt-shortest-path] and label based mechanisms.
3. Mechanisms employing special addresses or labels which are 3. Mechanisms employing special addresses or labels which are
installed in the FIBs of all routers with routes pre-computed to installed in the FIBs of all routers with routes pre-computed to
avoid certain components of the network. For example [NOT-VIA]. avoid certain components of the network. For example
[I-D.ietf-rtgwg-ipfrr-notvia-addresses].
In many cases a repair path which reaches two hops away from the In many cases a repair path which reaches two hops away from the
router detecting the failure will suffice, and it is anticipated that router detecting the failure will suffice, and it is anticipated that
around 98% of failures (see section 3.2.2) can be repaired by this around 98% of failures (see Section 4.2.2) can be repaired by this
method. However, to provide complete repair coverage some use of method. However, to provide complete repair coverage some use of
longer multi-hop repair paths is generally necessary. longer multi-hop repair paths is generally necessary.
3.2.1. Scope of repair paths 4.2.1. Scope of repair paths
A particular repair path may be valid for all destinations which A particular repair path may be valid for all destinations which
require repair or may only be valid for a subset of destinations. If require repair or may only be valid for a subset of destinations. If
a repair path is valid for a node immediately downstream of the a repair path is valid for a node immediately downstream of the
failure, then it will be valid for all destinations previously failure, then it will be valid for all destinations previously
reachable by traversing the failure. However, in cases where such a reachable by traversing the failure. However, in cases where such a
repair path is difficult to achieve because it requires a high order repair path is difficult to achieve because it requires a high order
multi-hop repair path, it may still be possible to identify lower multi-hop repair path, it may still be possible to identify lower
order repair paths (possibly even loop free alternate paths) which order repair paths (possibly even loop free alternate paths) which
allow the majority of destinations to be repaired. When IPFRR is allow the majority of destinations to be repaired. When IPFRR is
skipping to change at page 9, line 16 skipping to change at page 9, line 49
be repaired using only the "basic" repair mechanism, leaving a be repaired using only the "basic" repair mechanism, leaving a
smaller subset of the destinations to be repaired using one of the smaller subset of the destinations to be repaired using one of the
more complex multi-hop methods. Such a hybrid approach may go some more complex multi-hop methods. Such a hybrid approach may go some
way to resolving the conflict between completeness and complexity. way to resolving the conflict between completeness and complexity.
The use of repair paths may result in excessive traffic passing over The use of repair paths may result in excessive traffic passing over
a link, resulting in congestion discard. This reduces the a link, resulting in congestion discard. This reduces the
effectiveness of IPFRR. Mechanisms to influence the distribution of effectiveness of IPFRR. Mechanisms to influence the distribution of
repaired traffic to minimize this effect are therefore desirable. repaired traffic to minimize this effect are therefore desirable.
3.2.2. Analysis of repair coverage 4.2.2. Analysis of repair coverage
In some cases the repair strategy will permit the repair of all In some cases the repair strategy will permit the repair of all
single link or node failures in the network for all possible single link or node failures in the network for all possible
destinations. This can be defined as 100% coverage. However, where destinations. This can be defined as 100% coverage. However, where
the coverage is less than 100% it is important for the purposes of the coverage is less than 100% it is important for the purposes of
comparisons between different proposed repair strategies to define comparisons between different proposed repair strategies to define
what is meant by such a percentage. There are four possibilities: what is meant by such a percentage. There are four possibilities:
1. The percentage of links (or nodes) which can be fully protected 1. The percentage of links (or nodes) which can be fully protected
for all destinations. This is appropriate where the requirement for all destinations. This is appropriate where the requirement
is to protect all traffic, but some percentage of the possible is to protect all traffic, but some percentage of the possible
failures may be identified as being un-protectable. failures may be identified as being un-protectable.
2. The percentage of destinations which can be fully protected for 2. The percentage of destinations which can be fully protected for
all link (or node) failures. This is appropriate where the all link (or node) failures. This is appropriate where the
requirement is to protect against all possible failures, but requirement is to protect against all possible failures, but some
some percentage of destinations may be identified as being percentage of destinations may be identified as being un-
un-protectable. protectable.
3. For all destinations (d) and for all failures (f), the 3. For all destinations (d) and for all failures (f), the percentage
percentage of the total potential failure cases (d*f) which are of the total potential failure cases (d*f) which are protected.
protected. This is appropriate where the requirement is an This is appropriate where the requirement is an overall "best
overall "best effort" protection. effort" protection.
4. The percentage of packets normally passing through the network 4. The percentage of packets normally passing through the network
that will continue to reach their destination. This requires a that will continue to reach their destination. This requires a
traffic matrix for the network as part of the analysis. traffic matrix for the network as part of the analysis.
The coverage obtained is dependent on the repair strategy and highly The coverage obtained is dependent on the repair strategy and highly
dependent on the detailed topology and metrics. Any figures quoted in dependent on the detailed topology and metrics. Any figures quoted
this document are for illustrative purposes only. in this document are for illustrative purposes only.
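The first three definitions can give quite different figures for the same network. The sketch below computes them from a table recording, for each failure and each destination, whether a repair exists. The function name coverage and the layout of its input are assumptions made for this illustration; the fourth definition is omitted because it requires a traffic matrix.

```python
def coverage(protected):
    """Return the three coverage percentages for a protection table.

    protected[f][d] is True when destination d can be repaired for failure f.
    Every failure row is assumed to list the same set of destinations.
    """
    failures = list(protected)
    dests = sorted({d for row in protected.values() for d in row})

    # Definition 1: failures that are protected for all destinations.
    by_failure = sum(all(protected[f][d] for d in dests) for f in failures)
    # Definition 2: destinations that are protected for all failures.
    by_dest = sum(all(protected[f][d] for f in failures) for d in dests)
    # Definition 3: protected cases out of the d*f potential failure cases.
    by_case = sum(protected[f][d] for f in failures for d in dests)

    return (100 * by_failure / len(failures),
            100 * by_dest / len(dests),
            100 * by_case / (len(failures) * len(dests)))

# Hypothetical table: two failures, two destinations, one unprotected case.
EXAMPLE = {
    "f1": {"d1": True, "d2": True},
    "f2": {"d1": True, "d2": False},
}

if __name__ == "__main__":
    print(coverage(EXAMPLE))
```

For this table a single unprotected case leaves 75% of the potential failure cases protected under the third definition, but only 50% of the failures and 50% of the destinations fully protected under the first two.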
3.2.3. Link or node repair 4.2.3. Link or node repair
A repair path may be computed to protect against failure of an A repair path may be computed to protect against failure of an
adjacent link, or failure of an adjacent node. In general, link adjacent link, or failure of an adjacent node. In general, link
protection is simpler to achieve. A repair which protects against protection is simpler to achieve. A repair which protects against
node failure will also protect against link failure for all node failure will also protect against link failure for all
destinations except those for which the adjacent node is a single destinations except those for which the adjacent node is a single
point of failure. point of failure.
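The difference between link and node protection can be sketched as a reachability check. The Python below is an illustrative assumption, not a repair mechanism from the draft: it asks only whether any path exists once the failed link or node is removed, and the topology and names (S, E, N, D1, D2) are hypothetical.

```python
from collections import deque


def reachable(adj, src, dst, banned_links=frozenset(), banned_nodes=frozenset()):
    """Breadth-first search from src to dst, skipping failed links and nodes.

    Links are undirected and are written as frozenset({u, v}).
    """
    if src in banned_nodes or dst in banned_nodes:
        return False
    seen = {src}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            return True
        for v in adj.get(u, ()):
            if v in seen or v in banned_nodes:
                continue
            if frozenset((u, v)) in banned_links:
                continue
            seen.add(v)
            queue.append(v)
    return False


# S normally forwards via E. D1 hangs off E only; D2 is dual-homed to E and N.
adj = {
    "S": ["E", "N"],
    "E": ["S", "N", "D1", "D2"],
    "N": ["S", "E", "D2"],
    "D1": ["E"],
    "D2": ["E", "N"],
}
link_se = frozenset(("S", "E"))

link_repair_d1 = reachable(adj, "S", "D1", banned_links={link_se})
node_repair_d1 = reachable(adj, "S", "D1", banned_nodes={"E"})
node_repair_d2 = reachable(adj, "S", "D2", banned_nodes={"E"})
```

In this topology E is a single point of failure for D1, so only a link repair is possible for that destination, whereas D2 can be protected against the failure of node E.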
In some cases it may be necessary to distinguish between a link or In some cases it may be necessary to distinguish between a link or
node failure in order that the optimal repair strategy is invoked. node failure in order that the optimal repair strategy is invoked.
Methods for link/node failure determination may be based on Methods for link/node failure determination may be based on
techniques such as BFD. This determination may be made prior to techniques such as BFD[I-D.ietf-bfd-base]. This determination may be
invoking any repairs, but this will increase the period of packet made prior to invoking any repairs, but this will increase the period
loss following a failure unless the determination can be performed as of packet loss following a failure unless the determination can be
part of the failure detection mechanism itself. Alternatively, a performed as part of the failure detection mechanism itself.
subsequent determination can be used to optimise an already invoked Alternatively, a subsequent determination can be used to optimise an
default strategy. already invoked default strategy.
3.2.4. Maintenance of Repair paths 4.2.4. Maintenance of Repair paths
In order to meet the response time goals, it is expected (though not In order to meet the response time goals, it is expected (though not
required) that repair paths, and their associated FIB entries, will required) that repair paths, and their associated FIB entries, will
be pre-computed and installed ready for invocation when a failure is be pre-computed and installed ready for invocation when a failure is
detected. Following invocation the repair paths remain in effect detected. Following invocation the repair paths remain in effect
until they are no longer required. This will normally be when the until they are no longer required. This will normally be when the
routing protocol has re-converged on the new topology taking into routing protocol has re-converged on the new topology taking into
account the failure, and traffic will no longer be using the repair account the failure, and traffic will no longer be using the repair
paths. paths.
The repair paths have the property that they are unaffected by any The repair paths have the property that they are unaffected by any
topology changes resulting from the failure which caused their topology changes resulting from the failure which caused their
instantiation. Therefore there is no need to re-compute them during instantiation. Therefore there is no need to re-compute them during
the convergence period. They may be affected by an unrelated the convergence period. They may be affected by an unrelated
simultaneous topology change, but such events are out of scope of simultaneous topology change, but such events are out of scope of
this work (see section 3.2.5). this work (see Section 4.2.5).
Once the routing protocol has re-converged it is necessary for all Once the routing protocol has re-converged it is necessary for all
repair paths to take account of the new topology. Various repair paths to take account of the new topology. Various
optimizations may permit the efficient identification of repair paths optimizations may permit the efficient identification of repair paths
which are unaffected by the change, and hence do not require full which are unaffected by the change, and hence do not require full re-
re-computation. Since the new repair paths will not be required until computation. Since the new repair paths will not be required until
the next failure occurs, the re-computation may be performed as a the next failure occurs, the re-computation may be performed as a
background task and be subject to a hold-down, but excessive delay in background task and be subject to a hold-down, but excessive delay in
completing this operation will increase the risk of a new failure completing this operation will increase the risk of a new failure
occurring before the repair paths are in place. occurring before the repair paths are in place.
3.2.5. Multiple failures and Shared Risk Link Groups 4.2.5. Multiple failures and Shared Risk Link Groups
Complete protection against multiple unrelated failures is out of Complete protection against multiple unrelated failures is out of
scope of this work. However, it is important that the occurrence of a scope of this work. However, it is important that the occurrence of
second failure while one failure is undergoing repair should not a second failure while one failure is undergoing repair should not
result in a level of service which is significantly worse than that result in a level of service which is significantly worse than that
which would have been achieved in the absence of any repair strategy. which would have been achieved in the absence of any repair strategy.
Shared Risk Link Groups are an example of multiple related failures, Shared Risk Link Groups are an example of multiple related failures,
and the more complex aspects of their protection are a matter for and the more complex aspects of their protection are a matter for
further study. further study.
One specific example of an SRLG which is clearly within the scope of One specific example of an SRLG which is clearly within the scope of
this work is a node failure. This causes the simultaneous failure of this work is a node failure. This causes the simultaneous failure of
multiple links, but their closely defined topological relationship multiple links, but their closely defined topological relationship
makes the problem more tractable. makes the problem more tractable.
3.3. Local Area Networks 4.3. Local Area Networks
Protection against partial or complete failure of LANs is more Protection against partial or complete failure of LANs is more
complex than the point to point case. In general there is a tradeoff complex than the point to point case. In general there is a tradeoff
between the simplicity of the repair and the ability to provide between the simplicity of the repair and the ability to provide
complete and optimal repair coverage. complete and optimal repair coverage.
3.4. Mechanisms for micro-loop prevention 4.4. Mechanisms for micro-loop prevention
Control of micro-loops is important not only because they can cause Control of micro-loops is important not only because they can cause
packet loss in traffic which is affected by the failure, but because packet loss in traffic which is affected by the failure, but because
by saturating a link with looping packets they can also cause by saturating a link with looping packets they can also cause
congestion loss of traffic flowing over that link which would congestion loss of traffic flowing over that link which would
otherwise be unaffected by the failure. otherwise be unaffected by the failure.
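How a micro-loop forms can be sketched by following next hops through per-router forwarding state captured part-way through convergence. The Python below is a hypothetical illustration: the trace function, the router names and the tables are assumptions, and a single next hop per router for one destination is assumed.

```python
def trace(next_hop, src, dst):
    """Follow next hops from src toward dst.

    Returns (path, outcome), where outcome is "delivered", "loop"
    or "dropped" (no forwarding entry at some router).
    """
    path = [src]
    seen = {src}
    node = src
    while node != dst:
        node = next_hop.get(node)
        if node is None:
            return path, "dropped"
        path.append(node)
        if node in seen:
            return path, "loop"
        seen.add(node)
    return path, "delivered"


# Links: A-B, B-D, A-C, C-D. Link B-D fails.
# Converged state after the failure: B reaches D via A, A via C.
converged = {"A": "C", "B": "A", "C": "D"}
# Transient state: B has updated, but A still forwards toward B.
transient = {"A": "B", "B": "A", "C": "D"}

path_ok, outcome_ok = trace(converged, "B", "D")
path_loop, outcome_loop = trace(transient, "B", "D")
```

In the transient state packets bounce between A and B until A updates its forwarding entry, consuming capacity on the A-B link that other, unaffected traffic also depends on.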
A number of solutions to the problem of micro-loop formation have A number of solutions to the problem of micro-loop formation have
been proposed and are summarized in [MICROLOOP]. The following been proposed and are summarized in [I-D.ietf-rtgwg-lf-conv-frmwk].
factors are significant in their classification: The following factors are significant in their classification:
1. Partial or complete protection against micro-loops. 1. Partial or complete protection against micro-loops.
2. Delay imposed upon convergence. 2. Delay imposed upon convergence.
3. Tolerance of multiple failures (from node failures, and in 3. Tolerance of multiple failures (from node failures, and in
general). general).
4. Computational complexity (pre-computed or real time). 4. Computational complexity (pre-computed or real time).
5. Applicability to scheduled events. 5. Applicability to scheduled events.
6. Applicability to link/node reinstatement. 6. Applicability to link/node reinstatement.
4. Management Considerations 5. Management Considerations
While many of the management requirements will be specific to While many of the management requirements will be specific to
particular IPFRR solutions, the following general aspects need to be particular IPFRR solutions, the following general aspects need to be
addressed: addressed:
1. Configuration 1. Configuration
a. Enabling/disabling IPFRR support. A. Enabling/disabling IPFRR support.
b. Enabling/disabling protection on a per link/node basis. B. Enabling/disabling protection on a per link/node basis.
c. Expressing preferences regarding the links/nodes used for C. Expressing preferences regarding the links/nodes used for
repair paths. repair paths.
d. Configuration of failure detection mechanisms. D. Configuration of failure detection mechanisms.
e. Configuration of loop avoidance strategies. E. Configuration of loop avoidance strategies
2. Monitoring 2. Monitoring
a. Notification of links/nodes/destinations which cannot be A. Notification of links/nodes/destinations which cannot be
protected. protected.
b. Notification of pre-computed repair paths, and anticipated B. Notification of pre-computed repair paths, and anticipated
traffic patterns. traffic patterns.
c. Counts of failure detections, protection invocations and C. Counts of failure detections, protection invocations and
packets forwarded over repair paths. packets forwarded over repair paths.
5. Scope and applicability 6. Scope and applicability
The initial scope of this work is in the context of link state IGPs. The initial scope of this work is in the context of link state IGPs.
Link state protocols provide ubiquitous topology information, which Link state protocols provide ubiquitous topology information, which
facilitates the computation of repair paths. facilitates the computation of repair paths.
Provision of similar facilities in non-link state IGPs and BGP is a Provision of similar facilities in non-link state IGPs and BGP is a
matter for further study, but the correct operation of the repair matter for further study, but the correct operation of the repair
mechanisms for traffic with a destination outside the IGP domain is mechanisms for traffic with a destination outside the IGP domain is
an important consideration for solutions based on this framework. an important consideration for solutions based on this framework.
6. IANA considerations 7. IANA Considerations
There are no IANA considerations that arise from this framework There are no IANA considerations that arise from this framework
document. document.
7. Security Considerations 8. Security Considerations
This framework document does not itself introduce any security This framework document does not itself introduce any security
issues, but attention must be paid to the security implications of issues, but attention must be paid to the security implications of
any proposed solutions to the problem. any proposed solutions to the problem.
8. IPR Disclosure Acknowledgement
Certain IPR may be applicable to the mechanisms outlined in this
document. Please check the detailed specifications for possible IPR
notices.
The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.
9. Acknowledgements 9. Acknowledgements
The authors would like to acknowledge contributions made by Alia The authors would like to acknowledge contributions made by Alia
Atlas and Alex Zinin. Atlas, Clarence Filsfils, Pierre Francois, Joel Halpern, Stefano
Previdi and Alex Zinin.
10. Normative References 10. Informative References
Internet-drafts are works in progress available from [FIFR] Nelakuditi, S., Lee, S., Lu, Y., Zhang, Z., and C. Chuah,
http://www.ietf.org/internet-drafts/ "Fast local rerouting for handling transient link
failures."", Tech. Rep. TR-2004-004, 2004.
11. Informative References [I-D.atlas-ip-local-protect-uturn]
Atlas, A., "U-turn Alternates for IP/LDP Fast-Reroute",
draft-atlas-ip-local-protect-uturn-03 (work in progress),
March 2006.
Internet-drafts are works in progress available from [I-D.bryant-ipfrr-tunnels]
http://www.ietf.org/internet-drafts/ Bryant, S., Filsfils, C., Previdi, S., and M. Shand, "IP
Fast Reroute using tunnels", draft-bryant-ipfrr-tunnels-03
(work in progress), November 2007.
[ALT-SP] Tian, A., Chen, N., "Fast Reroute using [I-D.ietf-bfd-base]
Alternative Shortest Paths", draft-tian-frr- Katz, D. and D. Ward, "Bidirectional Forwarding
alt-shortest-path-01.txt, (work in progress) Detection", draft-ietf-bfd-base-07 (work in progress),
January 2008.
[BASE] Atlas, A., Zinin, A., "Basic Specification [I-D.ietf-rtgwg-ipfrr-notvia-addresses]
for IP Fast-Reroute: Loop-free Alternates", Bryant, S., "IP Fast Reroute Using Not-via Addresses",
draft-ietf-rtgwg-ipfrr-spec-base-06.txt, draft-ietf-rtgwg-ipfrr-notvia-addresses-01 (work in
(work in progress) progress), July 2007.
[BFD] Katz, D. and Ward, D., "Bidirectional [I-D.ietf-rtgwg-ipfrr-spec-base]
Forwarding Detection", Atlas, A., Zinin, A., Torvi, R., Choudhury, G., Martin,
draft-ietf-bfd-base-06.txt, (work in C., Imhoff, B., and D. Fedyk, "Basic Specification for IP
progress). Fast-Reroute: Loop-free Alternates",
draft-ietf-rtgwg-ipfrr-spec-base-10 (work in progress),
November 2007.
[FIFR] S. Nelakuditi, S. Lee, Y. Yu, Z.-L. Zhang, [I-D.ietf-rtgwg-lf-conv-frmwk]
and C.-N. Chuah, "Fast local rerouting for Shand, M. and S. Bryant, "A Framework for Loop-free
handling transient link failures.," Tech. Convergence", draft-ietf-rtgwg-lf-conv-frmwk-02 (work in
Rep. TR-2004-004, University of South progress), February 2008.
Carolina, 2004.
[MPLSFRR] Pan, P. et al, "Fast Reroute Extensions to [I-D.tian-frr-alt-shortest-path]
RSVP-TE for LSP Tunnels", RFC 4090. Tian, A., "Fast Reroute using Alternative Shortest Paths",
draft-tian-frr-alt-shortest-path-01 (work in progress),
July 2004.
[MICROLOOP] Bryant, S. and Shand, M., "A Framework for [RFC4090] Pan, P., Swallow, G., and A. Atlas, "Fast Reroute
Loop-free Convergence", Extensions to RSVP-TE for LSP Tunnels", RFC 4090,
draft-bryant-shand-lf-conv-frmwk-04.txt, May 2005.
(work in progress).
[NOT-VIA] Bryant, S., Previdi, S., Shand, M., "IP Fast [SIMULA] Lysne, O., Kvalbein, A., Cicic, T., Gjessing, S., and A.
Reroute Using Notvia Addresses",
draft-ietf-rtgwg-ipfrr-notvia-addresses-
01.txt, (work in progress).
[SIMULA] Lysne, O., et al, "Fast IP Network Recovery Hansen, "Fast IP Network Recovery using Multiple Routing
using Multiple Routing Configurations", Configurations."", Infocom 10.1109/INFOCOM.2006.227, 2006,
http://folk.uio.no/amundk/infocom06.pdf <http://folk.uio.no/amundk/infocom06.pdf>.
[TUNNELS] Bryant, S. et al, "IP Fast Reroute using Authors' Addresses
tunnels", draft-bryant-ipfrr-tunnels-02.txt,
(work in progress).
[U-TURNS] Atlas, A. et al, "IP/LDP Local Protection", Mike Shand
draft-atlas-ip-local-protect-03.txt, (work in Cisco Systems
progress). 250, Longwater Avenue.
Reading, Berks RG2 6GB
UK
12. Authors' Addresses Email: mshand@cisco.com
Stewart Bryant Stewart Bryant
Cisco Systems, Cisco Systems
250, Longwater Avenue, 250, Longwater Avenue.
Green Park, Reading, Berks RG2 6GB
Reading, RG2 6GB, UK
United Kingdom. Email: stbryant@cisco.com
Mike Shand Email: stbryant@cisco.com
Cisco Systems,
250, Longwater Avenue,
Green Park,
Reading, RG2 6GB,
United Kingdom. Email: mshand@cisco.com
Disclaimer of Validity Full Copyright Statement
Copyright (C) The IETF Trust (2008).
This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors
retain all their rights.
This document and the information contained herein are provided on an This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Copyright statement Intellectual Property
Copyright (C) The IETF Trust (2007). This document is subject to the
rights, licenses and restrictions contained in BCP 78, and except as The IETF takes no position regarding the validity or scope of any
set forth therein, the authors retain all their rights. Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.
Acknowledgment
Funding for the RFC Editor function is provided by the IETF
Administrative Support Activity (IASA).
 End of changes. 91 change blocks. 
240 lines changed or deleted 244 lines changed or added
