draft-ietf-rtgwg-ipfrr-framework-07.txt | draft-ietf-rtgwg-ipfrr-framework-08.txt | |||
---|---|---|---|---|
Network Working Group M. Shand | Network Working Group M. Shand | |||
Internet Draft S. Bryant | Internet-Draft S. Bryant | |||
Expiration Date: Dec 2007 Cisco Systems | Intended status: Informational Cisco Systems | |||
IP Fast Reroute Framework | Expires: August 28, 2008 February 25, 2008 | |||
draft-ietf-rtgwg-ipfrr-framework-07.txt | IP Fast Reroute Framework | |||
draft-ietf-rtgwg-ipfrr-framework-08.txt | ||||
Status of this Memo | Status of this Memo | |||
By submitting this Internet-Draft, each author represents that any | By submitting this Internet-Draft, each author represents that any | |||
applicable patent or other IPR claims of which he or she is aware | applicable patent or other IPR claims of which he or she is aware | |||
have been or will be disclosed, and any of which he or she becomes | have been or will be disclosed, and any of which he or she becomes | |||
aware will be disclosed, in accordance with Section 6 of BCP 79. | aware will be disclosed, in accordance with Section 6 of BCP 79. | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF), its areas, and its working groups. Note that other | Task Force (IETF), its areas, and its working groups. Note that | |||
groups may also distribute working documents as Internet-Drafts. | other groups may also distribute working documents as Internet- | |||
Drafts. | ||||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress". | material or to cite them other than as "work in progress." | |||
The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
http://www.ietf.org/1id-abstracts.txt | http://www.ietf.org/ietf/1id-abstracts.txt. | |||
The list of Internet-Draft Shadow Directories can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
http://www.ietf.org/shadow.html. | http://www.ietf.org/shadow.html. | |||
This Internet-Draft will expire on August 28, 2008. | ||||
Copyright Notice | Copyright Notice | |||
Copyright (C) The IETF Trust (2007). All rights reserved. | Copyright (C) The IETF Trust (2008). | |||
Abstract | Abstract | |||
This document provides a framework for the development of IP | This document provides a framework for the development of IP fast- | |||
fast-reroute mechanisms which provide protection against link or | reroute mechanisms which provide protection against link or router | |||
router failure by invoking locally determined repair paths. Unlike | failure by invoking locally determined repair paths. Unlike MPLS | |||
MPLS Fast-reroute, the mechanisms are applicable to a network | Fast-reroute, the mechanisms are applicable to a network employing | |||
employing conventional IP routing and forwarding. An essential part | conventional IP routing and forwarding. An essential part of such | |||
of such mechanisms is the prevention of packet loss caused by the | mechanisms is the prevention of packet loss caused by the loops which | |||
loops which normally occur during the re-convergence of the network | normally occur during the re-convergence of the network following a | |||
following a failure. | failure. | |||
Terminology | Table of Contents | |||
1. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 3 | ||||
2. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 5 | ||||
3. Problem Analysis . . . . . . . . . . . . . . . . . . . . . . . 6 | ||||
4. Mechanisms for IP Fast-reroute . . . . . . . . . . . . . . . . 7 | ||||
4.1. Mechanisms for fast failure detection . . . . . . . . . . 7 | ||||
4.2. Mechanisms for repair paths . . . . . . . . . . . . . . . 8 | ||||
4.2.1. Scope of repair paths . . . . . . . . . . . . . . . . 9 | ||||
4.2.2. Analysis of repair coverage . . . . . . . . . . . . . 9 | ||||
4.2.3. Link or node repair . . . . . . . . . . . . . . . . . 10 | ||||
4.2.4. Maintenance of Repair paths . . . . . . . . . . . . . 11 | ||||
4.2.5. Multiple failures and Shared Risk Link Groups . . . . 11 | ||||
4.3. Local Area Networks . . . . . . . . . . . . . . . . . . . 12 | ||||
4.4. Mechanisms for micro-loop prevention . . . . . . . . . . . 12 | ||||
5. Management Considerations . . . . . . . . . . . . . . . . . . 12 | ||||
6. Scope and applicability . . . . . . . . . . . . . . . . . . . 13 | ||||
7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 13 | ||||
8. Security Considerations . . . . . . . . . . . . . . . . . . . 13 | ||||
9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 13 | ||||
10. Informative References . . . . . . . . . . . . . . . . . . . . 14 | ||||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 15 | ||||
Intellectual Property and Copyright Statements . . . . . . . . . . 16 | ||||
1. Terminology | ||||
This section defines words and acronyms used in this draft and other | This section defines words and acronyms used in this draft and other | |||
drafts discussing IP Fast-reroute. | drafts discussing IP Fast-reroute. | |||
D Used to denote the destination router under | D Used to denote the destination router under | |||
discussion. | discussion. | |||
Distance_opt(A,B) The distance of the shortest path from A | Distance_opt(A,B) The distance of the shortest path from A to B. | |||
to B. | ||||
Downstream Path This is a subset of the loop-free alternates | Downstream Path This is a subset of the loop-free alternates | |||
where the neighbor N meets the following | where the neighbor N meets the following | |||
condition:- | condition:- | |||
Distance_opt(N, D) < Distance_opt(S,D) | Distance_opt(N, D) < Distance_opt(S,D) | |||
E Used to denote the router which is the | E Used to denote the router which is the primary | |||
primary next-hop neighbor to get from S to | next-hop neighbor to get from S to the | |||
the destination D. Where there is an ECMP set | destination D. Where there is an ECMP set for the | |||
for the shortest path from S to D, these are | shortest path from S to D, these are referred to | |||
referred to as E_1, E_2, etc. | as E_1, E_2, etc. | |||
ECMP Equal cost multi-path: Where, for a | ECMP Equal cost multi-path: Where, for a particular | |||
particular destination D, multiple primary | destination D, multiple primary next-hops are | |||
next-hops are used to forward traffic because | used to forward traffic because there exist | |||
there exist multiple shortest paths from S | multiple shortest paths from S via different | |||
via different output layer-3 interfaces. | output layer-3 interfaces. | |||
FIB Forwarding Information Base. The database | FIB Forwarding Information Base. The database used | |||
used by the packet forwarder to determine | by the packet forwarder to determine what actions | |||
what actions to perform on a packet. | to perform on a packet. | |||
IPFRR IP fast-reroute | IPFRR IP fast-reroute. | |||
Link(A->B) A link connecting router A to router B. | Link(A->B) A link connecting router A to router B. | |||
Loop-Free This is a neighbor N, that is not a primary | LFA Loop Free Alternate. This is a neighbor N, that | |||
Alternate next-hop neighbor E, whose shortest path to | is not a primary next-hop neighbor E, whose | |||
the destination D does not go back through | shortest path to the destination D does not go | |||
the router S. The neighbor N must meet the | back through the router S. The neighbor N must | |||
following condition:- | meet the following condition:- | |||
Distance_opt(N, D) < Distance_opt(N, S) + | Distance_opt(N, D) < Distance_opt(N, S) + | |||
Distance_opt(S, D) | Distance_opt(S, D) | |||
Loop Free Neighbor A neighbor N_i, which is not the particular | ||||
primary neighbor E_k under discussion, and whose | ||||
shortest path to D does not traverse S. For | ||||
example, if there are two primary neighbors E_1 | ||||
and E_2, E_1 is a loop-free neighbor with regard | ||||
to E_2 and vice versa. | ||||
Loop-Free A neighbor N_i, which is not the particular | Loop Free Link Protecting Alternate | |||
Neighbor primary neighbor E_k under discussion, and | This is a path via a Loop-Free Neighbor N_i which | |||
whose shortest path to D does not traverse S. | does not go through the particular link of S | |||
For example, if there are two primary | which is being protected to reach the destination | |||
neighbors E_1 and E_2, E_1 is a loop-free | D. | |||
neighbor with regard to E_2 and vice versa. | ||||
Loop-Free This is a path via a Loop-Free Neighbor N_i | ||||
Link-Protecting which does not go through the particular link | ||||
Alternate of S which is being protected to reach the | ||||
destination D. | ||||
Loop-Free This is a path via a Loop-Free Neighbor N_i | ||||
Node-Protecting which does not go through the particular | ||||
Alternate primary neighbor of S which is being | ||||
protected to reach the destination D. | ||||
Micro-loop A temporary forwarding loop which exists | Loop Free Node-protecting Alternate | |||
during a routing transition as a result of | This is a path via a Loop-Free Neighbor N_i which | |||
temporary inconsistencies between FIBs. | does not go through the particular primary | |||
neighbor of S which is being protected to reach | ||||
the destination D. | ||||
N_i The ith neighbor of S. | N_i The ith neighbor of S. | |||
Primary Neighbor A neighbor N_i of S which is one of the next | Primary Neighbor A neighbor N_i of S which is one of the next hops | |||
hops for destination D in S's FIB prior to | for destination D in S's FIB prior to any | |||
any failure. | failure. | |||
R_i_j The jth neighbor of N_i. | R_i_j The jth neighbor of N_i. | |||
Routing The process whereby routers converge on a new | Routing Transition The process whereby routers converge on a new | |||
transition topology. In conventional networks this | topology. In conventional networks this process | |||
process frequently causes some disruption to | frequently causes some disruption to packet | |||
packet delivery. | delivery. | |||
RPF Reverse Path Forwarding. I.e. checking that a | RPF Reverse Path Forwarding. I.e. checking that a | |||
packet is received over the interface which | packet is received over the interface which would | |||
would be used to send packets addressed to | be used to send packets addressed to the source | |||
the source address of the packet. | address of the packet. | |||
S Used to denote a router that is the source of | S Used to denote a router that is the source of a | |||
a repair that is computed in anticipation of | repair that is computed in anticipation of the | |||
the failure of a neighboring router denoted | failure of a neighboring router denoted as E, or | |||
as E, or of the link between S and E. It is | of the link between S and E. It is the viewpoint | |||
the viewpoint from which IP Fast-Reroute is | from which IP Fast-Reroute is described. | |||
described. | ||||
S_i The set of neighbors of E, in addition to S, | S_i The set of neighbors of E, in addition to S, | |||
which will independently take the role of S | which will independently take the role of S for | |||
for the traffic they carry. | the traffic they carry. | |||
SPF Shortest Path First, e.g. Dijkstra's | SPF Shortest Path First, e.g. Dijkstra's algorithm. | |||
algorithm. | ||||
SPT Shortest path tree | SPT Shortest path tree | |||
Upstream This is a forwarding loop which involves a | Upstream Forwarding Loop | |||
Forwarding Loop set of routers, none of which are directly | This is a forwarding loop which involves a set of | |||
connected to the link which has caused the | routers, none of which are directly connected to | |||
topology change that triggered a new SPF in | the link which has caused the topology change | |||
any of the routers. | that triggered a new SPF in any of the routers. | |||
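The Loop-Free Alternate and Downstream Path inequalities defined above can be checked mechanically once shortest-path distances are known. The following is a minimal sketch in Python, not taken from either draft: the graph representation (a dict of neighbor-to-cost dicts with symmetric link costs), the helper names, and the four-router topology are all illustrative assumptions. It checks only the distance inequalities; it does not check that N is not already a primary next-hop.

```python
import heapq

def distance_opt(graph, src):
    """Dijkstra: shortest distance from src to every reachable router."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost in graph[u].items():
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def is_loop_free_alternate(graph, s, n, d):
    """Distance_opt(N, D) < Distance_opt(N, S) + Distance_opt(S, D)."""
    from_n = distance_opt(graph, n)
    from_s = distance_opt(graph, s)
    return from_n[d] < from_n[s] + from_s[d]

def is_downstream(graph, s, n, d):
    """Distance_opt(N, D) < Distance_opt(S, D)."""
    return distance_opt(graph, n)[d] < distance_opt(graph, s)[d]

def make_graph(links):
    """Build a symmetric graph from (a, b, cost) tuples (assumed format)."""
    graph = {}
    for a, b, cost in links:
        graph.setdefault(a, {})[b] = cost
        graph.setdefault(b, {})[a] = cost
    return graph

# Hypothetical topology: S reaches D via E (cost 2); N is another neighbor of S.
topology = make_graph([("S", "E", 1), ("E", "D", 1), ("S", "N", 1), ("N", "D", 2)])
print(is_loop_free_alternate(topology, "S", "N", "D"))  # True
print(is_downstream(topology, "S", "N", "D"))           # False
```

In this example N satisfies the loop-free condition (2 < 1 + 2) but is not a downstream path (2 is not less than 2), which illustrates that downstream paths are a strict subset of loop-free alternates.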
1. Introduction | 2. Introduction | |||
When a link or node failure occurs in a routed network, there is | When a link or node failure occurs in a routed network, there is | |||
inevitably a period of disruption to the delivery of traffic until | inevitably a period of disruption to the delivery of traffic until | |||
the network re-converges on the new topology. Packets for | the network re-converges on the new topology. Packets for | |||
destinations which were previously reached by traversing the failed | destinations which were previously reached by traversing the failed | |||
component may be dropped or may suffer looping. Traditionally such | component may be dropped or may suffer looping. Traditionally such | |||
disruptions have lasted for periods of at least several seconds, and | disruptions have lasted for periods of at least several seconds, and | |||
most applications have been constructed to tolerate such a quality of | most applications have been constructed to tolerate such a quality of | |||
service. | service. | |||
skipping to change at page 5, line 31 | skipping to change at page 5, line 41 | |||
Addressing these issues is difficult because the distributed nature | Addressing these issues is difficult because the distributed nature | |||
of the network imposes an intrinsic limit on the minimum convergence | of the network imposes an intrinsic limit on the minimum convergence | |||
time which can be achieved. | time which can be achieved. | |||
However, there is an alternative approach, which is to compute backup | However, there is an alternative approach, which is to compute backup | |||
routes that allow the failure to be repaired locally by the router(s) | routes that allow the failure to be repaired locally by the router(s) | |||
detecting the failure without the immediate need to inform other | detecting the failure without the immediate need to inform other | |||
routers of the failure. In this case, the disruption time can be | routers of the failure. In this case, the disruption time can be | |||
limited to the small time taken to detect the adjacent failure and | limited to the small time taken to detect the adjacent failure and | |||
invoke the backup routes. This is analogous to the technique employed | invoke the backup routes. This is analogous to the technique | |||
by MPLS Fast-Reroute [MPLSFRR], but the mechanisms employed for the | employed by MPLS Fast-Reroute [RFC4090], but the mechanisms employed | |||
backup routes in pure IP networks are necessarily very different. | for the backup routes in pure IP networks are necessarily very | |||
different. | ||||
This document provides a framework for the development of this | This document provides a framework for the development of this | |||
approach. | approach. | |||
2. Problem Analysis | 3. Problem Analysis | |||
The duration of the packet delivery disruption caused by a | The duration of the packet delivery disruption caused by a | |||
conventional routing transition is determined by a number of factors: | conventional routing transition is determined by a number of factors: | |||
1. The time taken to detect the failure. This may be of the order | 1. The time taken to detect the failure. This may be of the order | |||
of a few ms when it can be detected at the physical layer, up to | of a few ms when it can be detected at the physical layer, up to | |||
several tens of seconds when a routing protocol hello is | several tens of seconds when a routing protocol hello is | |||
employed. During this period packets will be unavoidably lost. | employed. During this period packets will be unavoidably lost. | |||
2. The time taken for the local router to react to the failure. | 2. The time taken for the local router to react to the failure. | |||
This will typically involve generating and flooding new routing | This will typically involve generating and flooding new routing | |||
updates, perhaps after some hold-down delay, and re-computing | updates, perhaps after some hold-down delay, and re-computing the | |||
the router's FIB. | router's FIB. | |||
3. The time taken to pass the information about the failure to | 3. The time taken to pass the information about the failure to other | |||
other routers in the network. In the absence of routing protocol | routers in the network. In the absence of routing protocol | |||
packet loss, this is typically between 10 ms and 100 ms per hop. | packet loss, this is typically between 10 ms and 100 ms per hop. | |||
4. The time taken to re-compute the forwarding tables. This is | 4. The time taken to re-compute the forwarding tables. This is | |||
typically a few ms for a link state protocol using Dijkstra's | typically a few ms for a link state protocol using Dijkstra's | |||
algorithm. | algorithm. | |||
5. The time taken to load the revised forwarding tables into the | 5. The time taken to load the revised forwarding tables into the | |||
forwarding hardware. This time is very implementation dependent | forwarding hardware. This time is very implementation dependent | |||
and also depends on the number of prefixes affected by the | and also depends on the number of prefixes affected by the | |||
failure, but may be several hundred ms. | failure, but may be several hundred ms. | |||
The disruption will last until the routers adjacent to the failure | The disruption will last until the routers adjacent to the failure | |||
have completed steps 1 and 2, and then all the routers in the network | have completed steps 1 and 2, and then all the routers in the network | |||
whose paths are affected by the failure have completed the remaining | whose paths are affected by the failure have completed the remaining | |||
steps. | steps. | |||
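How the five steps combine can be sketched as simple arithmetic. The model below is a simplification assumed for illustration, not something stated in either draft: the routers adjacent to the failure finish steps 1 and 2, and the disruption then lasts until the slowest affected router has received the update, run its SPF, and loaded its forwarding tables. The function name and all sample figures are hypothetical, chosen only to fall within the ranges quoted in the list above.

```python
def disruption_ms(detect_ms, react_ms, remote_routers):
    """Estimate total disruption in milliseconds.

    remote_routers: list of (hops_from_failure, per_hop_flood_ms,
                             spf_ms, fib_load_ms) tuples, one per
                    affected router (assumed input format).
    """
    local_done = detect_ms + react_ms  # steps 1 and 2
    slowest_remote = max(
        (hops * per_hop + spf + fib  # steps 3, 4 and 5
         for hops, per_hop, spf, fib in remote_routers),
        default=0,
    )
    return local_done + slowest_remote

# Illustrative figures only.
routers = [(1, 10, 5, 200), (3, 50, 5, 300)]
print(disruption_ms(20, 50, routers))  # 525
```

With these assumed figures the second router dominates (3 * 50 + 5 + 300 = 455 ms), which is consistent with the observation that forwarding table loading is often the largest factor.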
The initial packet loss is caused by the router(s) adjacent to the | The initial packet loss is caused by the router(s) adjacent to the | |||
failure continuing to attempt to transmit packets across the failure | failure continuing to attempt to transmit packets across the failure | |||
until it is detected. This loss is unavoidable, but the detection | until it is detected. This loss is unavoidable, but the detection | |||
time can be reduced to a few tens of ms as described in section 3.1. | time can be reduced to a few tens of ms as described in Section 4.1. | |||
Subsequent packet loss is caused by the "micro-loops" which form | Subsequent packet loss is caused by the "micro-loops" which form | |||
because of temporary inconsistencies between routers' forwarding | because of temporary inconsistencies between routers' forwarding | |||
tables. These occur as a result of the different times at which | tables. These occur as a result of the different times at which | |||
routers update their forwarding tables to reflect the failure. These | routers update their forwarding tables to reflect the failure. These | |||
variable delays are caused by steps 3, 4 and 5 above and in many | variable delays are caused by steps 3, 4 and 5 above and in many | |||
routers it is step 5 which is both the largest factor and which has | routers it is step 5 which is both the largest factor and which has | |||
the greatest variance between routers. The large variance arises from | the greatest variance between routers. The large variance arises | |||
implementation differences and from the differing impact that a | from implementation differences and from the differing impact that a | |||
failure has on each individual router. For example, the number of | failure has on each individual router. For example, the number of | |||
prefixes affected by the failure may vary dramatically from one | prefixes affected by the failure may vary dramatically from one | |||
router to another. | router to another. | |||
In order to achieve packet disruption times which are commensurate | In order to achieve packet disruption times which are commensurate | |||
with the failure detection times it is necessary to perform two | with the failure detection times it is necessary to perform two | |||
distinct tasks: | distinct tasks: | |||
1. Provide a mechanism for the router(s) adjacent to the failure to | 1. Provide a mechanism for the router(s) adjacent to the failure to | |||
rapidly invoke a repair path, which is unaffected by any | rapidly invoke a repair path, which is unaffected by any | |||
skipping to change at page 7, line 15 | skipping to change at page 7, line 38 | |||
Similarly, micro-loop avoidance can be used in isolation to prevent | Similarly, micro-loop avoidance can be used in isolation to prevent | |||
loops arising from pre-planned management action, because the link or | loops arising from pre-planned management action, because the link or | |||
node being shut down can remain in service for a short time after its | node being shut down can remain in service for a short time after its | |||
removal has been announced into the network, and hence it can | removal has been announced into the network, and hence it can | |||
function as its own "repair path". | function as its own "repair path". | |||
Note that micro-loops can also occur when a link or node is restored | Note that micro-loops can also occur when a link or node is restored | |||
to service and thus a micro-loop avoidance mechanism is required for | to service and thus a micro-loop avoidance mechanism is required for | |||
both link up and link down cases. | both link up and link down cases. | |||
3. Mechanisms for IP Fast-reroute | 4. Mechanisms for IP Fast-reroute | |||
The set of mechanisms required for an effective solution to the | The set of mechanisms required for an effective solution to the | |||
problem can be broken down into the following sub-problems. | problem can be broken down into the following sub-problems. | |||
3.1. Mechanisms for fast failure detection | 4.1. Mechanisms for fast failure detection | |||
It is critical that the failure detection time is minimized. A number | It is critical that the failure detection time is minimized. A | |||
of approaches are possible, such as: | number of approaches are possible, such as: | |||
1. Physical detection; for example, loss of light. | 1. Physical detection; for example, loss of light. | |||
2. Routing protocol independent protocol detection; for example, | 2. Routing protocol independent protocol detection; for example, the | |||
the Bidirectional Forwarding Detection protocol [BFD]. | Bidirectional Forwarding Detection protocol [I-D.ietf-bfd-base]. | |||
3. Routing protocol detection; for example, use of "fast hellos". | 3. Routing protocol detection; for example, use of "fast hellos". | |||
3.2. Mechanisms for repair paths | 4.2. Mechanisms for repair paths | |||
Once a failure has been detected by one of the above mechanisms, | Once a failure has been detected by one of the above mechanisms, | |||
traffic which previously traversed the failure is transmitted over | traffic which previously traversed the failure is transmitted over | |||
one or more repair paths. The design of the repair paths should be | one or more repair paths. The design of the repair paths should be | |||
such that they can be pre-calculated in anticipation of each local | such that they can be pre-calculated in anticipation of each local | |||
failure and made available for invocation with minimal delay. There | failure and made available for invocation with minimal delay. There | |||
are three basic categories of repair paths: | are three basic categories of repair paths: | |||
1. Equal cost multi-paths (ECMP). Where such paths exist, and one | 1. Equal cost multi-paths (ECMP). Where such paths exist, and one | |||
or more of the alternate paths do not traverse the failure, they | or more of the alternate paths do not traverse the failure, they | |||
may trivially be used as repair paths. | may trivially be used as repair paths. | |||
2. Loop free alternate paths. Such a path exists when a direct | 2. Loop free alternate paths. Such a path exists when a direct | |||
neighbor of the router adjacent to the failure has a path to the | neighbor of the router adjacent to the failure has a path to the | |||
destination which can be guaranteed not to traverse the failure. | destination which can be guaranteed not to traverse the failure. | |||
3. Multi-hop repair paths. When there is no feasible loop free | 3. Multi-hop repair paths. When there is no feasible loop free | |||
alternate path it may still be possible to locate a router, | alternate path it may still be possible to locate a router, which | |||
which is more than one hop away from the router adjacent to the | is more than one hop away from the router adjacent to the | |||
failure, from which traffic will be forwarded to the destination | failure, from which traffic will be forwarded to the destination | |||
without traversing the failure. | without traversing the failure. | |||
ECMP and loop free alternate paths (as described in [BASE]) offer the | ECMP and loop free alternate paths (as described in | |||
simplest repair paths and would normally be used when they are | [I-D.ietf-rtgwg-ipfrr-spec-base]) offer the simplest repair paths and | |||
available. It is anticipated that around 80% of failures (see section | would normally be used when they are available. It is anticipated | |||
3.2.2) can be repaired using these basic methods alone. | that around 80% of failures (see Section 4.2.2) can be repaired using | |||
these basic methods alone. | ||||
Multi-hop repair paths are more complex, both in the computations | Multi-hop repair paths are more complex, both in the computations | |||
required to determine their existence, and in the mechanisms required | required to determine their existence, and in the mechanisms required | |||
to invoke them. They can be further classified as: | to invoke them. They can be further classified as: | |||
1. Mechanisms where one or more alternate FIBs are pre-computed in | 1. Mechanisms where one or more alternate FIBs are pre-computed in | |||
all routers and the repaired packet is instructed to be | all routers and the repaired packet is instructed to be forwarded | |||
forwarded using a "repair FIB" by some method of per packet | using a "repair FIB" by some method of per packet signaling such | |||
signaling such as detecting a "U-turn" [U-TURNS, FIFR] or by | as detecting a "U-turn" [I-D.atlas-ip-local-protect-uturn], | |||
marking the packet [SIMULA]. | [FIFR] or by marking the packet [SIMULA]. | |||
2. Mechanisms functionally equivalent to a loose source route which | 2. Mechanisms functionally equivalent to a loose source route which | |||
is invoked using the normal FIB. These include tunnels | is invoked using the normal FIB. These include tunnels | |||
[TUNNELS], alternative shortest paths [ALT-SP] and label based | [I-D.bryant-ipfrr-tunnels], alternative shortest paths | |||
mechanisms. | [I-D.tian-frr-alt-shortest-path] and label based mechanisms. | |||
3. Mechanisms employing special addresses or labels which are | 3. Mechanisms employing special addresses or labels which are | |||
installed in the FIBs of all routers with routes pre-computed to | installed in the FIBs of all routers with routes pre-computed to | |||
avoid certain components of the network. For example [NOT-VIA]. | avoid certain components of the network. For example | |||
[I-D.ietf-rtgwg-ipfrr-notvia-addresses]. | ||||
In many cases a repair path which reaches two hops away from the | In many cases a repair path which reaches two hops away from the | |||
router detecting the failure will suffice, and it is anticipated that | router detecting the failure will suffice, and it is anticipated that | |||
around 98% of failures (see section 3.2.2) can be repaired by this | around 98% of failures (see Section 4.2.2) can be repaired by this | |||
method. However, to provide complete repair coverage some use of | method. However, to provide complete repair coverage some use of | |||
longer multi-hop repair paths is generally necessary. | longer multi-hop repair paths is generally necessary. | |||
3.2.1. Scope of repair paths | 4.2.1. Scope of repair paths | |||
A particular repair path may be valid for all destinations which | A particular repair path may be valid for all destinations which | |||
require repair or may only be valid for a subset of destinations. If | require repair or may only be valid for a subset of destinations. If | |||
a repair path is valid for a node immediately downstream of the | a repair path is valid for a node immediately downstream of the | |||
failure, then it will be valid for all destinations previously | failure, then it will be valid for all destinations previously | |||
reachable by traversing the failure. However, in cases where such a | reachable by traversing the failure. However, in cases where such a | |||
repair path is difficult to achieve because it requires a high order | repair path is difficult to achieve because it requires a high order | |||
multi-hop repair path, it may still be possible to identify lower | multi-hop repair path, it may still be possible to identify lower | |||
order repair paths (possibly even loop free alternate paths) which | order repair paths (possibly even loop free alternate paths) which | |||
allow the majority of destinations to be repaired. When IPFRR is | allow the majority of destinations to be repaired. When IPFRR is | |||
skipping to change at page 9, line 16 | skipping to change at page 9, line 49 | |||
be repaired using only the "basic" repair mechanism, leaving a | be repaired using only the "basic" repair mechanism, leaving a | |||
smaller subset of the destinations to be repaired using one of the | smaller subset of the destinations to be repaired using one of the | |||
more complex multi-hop methods. Such a hybrid approach may go some | more complex multi-hop methods. Such a hybrid approach may go some | |||
way to resolving the conflict between completeness and complexity. | way to resolving the conflict between completeness and complexity. | |||
The use of repair paths may result in excessive traffic passing over | The use of repair paths may result in excessive traffic passing over | |||
a link, resulting in congestion discard. This reduces the | a link, resulting in congestion discard. This reduces the | |||
effectiveness of IPFRR. Mechanisms to influence the distribution of | effectiveness of IPFRR. Mechanisms to influence the distribution of | |||
repaired traffic to minimize this effect are therefore desirable. | repaired traffic to minimize this effect are therefore desirable. | |||
3.2.2. Analysis of repair coverage | 4.2.2. Analysis of repair coverage | |||
In some cases the repair strategy will permit the repair of all | In some cases the repair strategy will permit the repair of all | |||
single link or node failures in the network for all possible | single link or node failures in the network for all possible | |||
destinations. This can be defined as 100% coverage. However, where | destinations. This can be defined as 100% coverage. However, where | |||
the coverage is less than 100% it is important for the purposes of | the coverage is less than 100% it is important for the purposes of | |||
comparisons between different proposed repair strategies to define | comparisons between different proposed repair strategies to define | |||
what is meant by such a percentage. There are four possibilities: | what is meant by such a percentage. There are four possibilities: | |||
1. The percentage of links (or nodes) which can be fully protected | 1. The percentage of links (or nodes) which can be fully protected | |||
for all destinations. This is appropriate where the requirement | for all destinations. This is appropriate where the requirement | |||
is to protect all traffic, but some percentage of the possible | is to protect all traffic, but some percentage of the possible | |||
failures may be identified as being un-protectable. | failures may be identified as being un-protectable. | |||
2. The percentage of destinations which can be fully protected for | 2. The percentage of destinations which can be fully protected for | |||
all link (or node) failures. This is appropriate where the | all link (or node) failures. This is appropriate where the | |||
requirement is to protect against all possible failures, but | requirement is to protect against all possible failures, but some | |||
some percentage of destinations may be identified as being | percentage of destinations may be identified as being un- | |||
un-protectable. | protectable. | |||
3. For all destinations (d) and for all failures (f), the | 3. For all destinations (d) and for all failures (f), the percentage | |||
percentage of the total potential failure cases (d*f) which are | of the total potential failure cases (d*f) which are protected. | |||
protected. This is appropriate where the requirement is an | This is appropriate where the requirement is an overall "best | |||
overall "best effort" protection. | effort" protection. | |||
4. The percentage of packets normally passing through the network | 4. The percentage of packets normally passing through the network | |||
that will continue to reach their destination. This requires a | that will continue to reach their destination. This requires a | |||
traffic matrix for the network as part of the analysis. | traffic matrix for the network as part of the analysis. | |||
The coverage obtained is dependent on the repair strategy and highly | The coverage obtained is dependent on the repair strategy and highly | |||
dependent on the detailed topology and metrics. Any figures quoted in | dependent on the detailed topology and metrics. Any figures quoted | |||
this document are for illustrative purposes only. | in this document are for illustrative purposes only. | |||
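The first three coverage definitions above can give quite different figures for the same network. The following is a minimal illustrative sketch, not part of either draft revision: the function name, the `protected` table, and the example values are hypothetical, and the table is assumed to hold an entry for every (failure, destination) pair. Definition 4 is omitted because it additionally needs a traffic matrix.

```python
from fractions import Fraction


def coverage_metrics(protected):
    """Compute coverage under definitions 1-3.

    `protected` maps (failure, destination) -> bool, True when a repair
    exists for that destination under that failure. The table is assumed
    to be complete: every failure paired with every destination.
    """
    failures = sorted({f for f, _ in protected})
    dests = sorted({d for _, d in protected})

    # Definition 1: failures that are fully protected for all destinations.
    full_failures = [f for f in failures
                     if all(protected[(f, d)] for d in dests)]
    # Definition 2: destinations fully protected for all failures.
    full_dests = [d for d in dests
                  if all(protected[(f, d)] for f in failures)]
    # Definition 3: protected share of all (d*f) failure cases.
    protected_cases = sum(1 for ok in protected.values() if ok)

    return {
        "by_failure": Fraction(len(full_failures), len(failures)),
        "by_destination": Fraction(len(full_dests), len(dests)),
        "by_case": Fraction(protected_cases, len(protected)),
    }


# Hypothetical example: two link failures, three destinations.
example = {
    ("L1", "A"): True, ("L1", "B"): True,  ("L1", "C"): True,
    ("L2", "A"): True, ("L2", "B"): False, ("L2", "C"): True,
}

if __name__ == "__main__":
    for name, value in coverage_metrics(example).items():
        print(name, value)
```

For this example table the three definitions yield 1/2, 2/3 and 5/6 respectively, which is why a quoted coverage percentage is only meaningful once the definition in use is stated.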
3.2.3. Link or node repair | 4.2.3. Link or node repair | |||
A repair path may be computed to protect against failure of an | A repair path may be computed to protect against failure of an | |||
adjacent link, or failure of an adjacent node. In general, link | adjacent link, or failure of an adjacent node. In general, link | |||
protection is simpler to achieve. A repair which protects against | protection is simpler to achieve. A repair which protects against | |||
node failure will also protect against link failure for all | node failure will also protect against link failure for all | |||
destinations except those for which the adjacent node is a single | destinations except those for which the adjacent node is a single | |||
point of failure. | point of failure. | |||
In some cases it may be necessary to distinguish between a link or | In some cases it may be necessary to distinguish between a link or | |||
node failure in order that the optimal repair strategy is invoked. | node failure in order that the optimal repair strategy is invoked. | |||
Methods for link/node failure determination may be based on | Methods for link/node failure determination may be based on | |||
techniques such as BFD. This determination may be made prior to | techniques such as BFD [I-D.ietf-bfd-base]. This determination may be | |||
invoking any repairs, but this will increase the period of packet | made prior to invoking any repairs, but this will increase the period | |||
loss following a failure unless the determination can be performed as | of packet loss following a failure unless the determination can be | |||
part of the failure detection mechanism itself. Alternatively, a | performed as part of the failure detection mechanism itself. | |||
subsequent determination can be used to optimise an already invoked | Alternatively, a subsequent determination can be used to optimise an | |||
default strategy. | already invoked default strategy. | |||
3.2.4. Maintenance of Repair paths | 4.2.4. Maintenance of Repair paths | |||
In order to meet the response time goals, it is expected (though not | In order to meet the response time goals, it is expected (though not | |||
required) that repair paths, and their associated FIB entries, will | required) that repair paths, and their associated FIB entries, will | |||
be pre-computed and installed ready for invocation when a failure is | be pre-computed and installed ready for invocation when a failure is | |||
detected. Following invocation the repair paths remain in effect | detected. Following invocation the repair paths remain in effect | |||
until they are no longer required. This will normally be when the | until they are no longer required. This will normally be when the | |||
routing protocol has re-converged on the new topology taking into | routing protocol has re-converged on the new topology taking into | |||
account the failure, and traffic will no longer be using the repair | account the failure, and traffic will no longer be using the repair | |||
paths. | paths. | |||
The repair paths have the property that they are unaffected by any | The repair paths have the property that they are unaffected by any | |||
topology changes resulting from the failure which caused their | topology changes resulting from the failure which caused their | |||
instantiation. Therefore there is no need to re-compute them during | instantiation. Therefore there is no need to re-compute them during | |||
the convergence period. They may be affected by an unrelated | the convergence period. They may be affected by an unrelated | |||
simultaneous topology change, but such events are out of scope of | simultaneous topology change, but such events are out of scope of | |||
this work (see section 3.2.5). | this work (see Section 4.2.5). | |||
Once the routing protocol has re-converged it is necessary for all | Once the routing protocol has re-converged it is necessary for all | |||
repair paths to take account of the new topology. Various | repair paths to take account of the new topology. Various | |||
optimizations may permit the efficient identification of repair paths | optimizations may permit the efficient identification of repair paths | |||
which are unaffected by the change, and hence do not require full | which are unaffected by the change, and hence do not require full re- | |||
re-computation. Since the new repair paths will not be required until | computation. Since the new repair paths will not be required until | |||
the next failure occurs, the re-computation may be performed as a | the next failure occurs, the re-computation may be performed as a | |||
background task and be subject to a hold-down, but excessive delay in | background task and be subject to a hold-down, but excessive delay in | |||
completing this operation will increase the risk of a new failure | completing this operation will increase the risk of a new failure | |||
occurring before the repair paths are in place. | occurring before the repair paths are in place. | |||
3.2.5. Multiple failures and Shared Risk Link Groups | 4.2.5. Multiple failures and Shared Risk Link Groups | |||
Complete protection against multiple unrelated failures is out of | Complete protection against multiple unrelated failures is out of | |||
scope of this work. However, it is important that the occurrence of a | scope of this work. However, it is important that the occurrence of | |||
second failure while one failure is undergoing repair should not | a second failure while one failure is undergoing repair should not | |||
result in a level of service which is significantly worse than that | result in a level of service which is significantly worse than that | |||
which would have been achieved in the absence of any repair strategy. | which would have been achieved in the absence of any repair strategy. | |||
Shared Risk Link Groups are an example of multiple related failures, | Shared Risk Link Groups are an example of multiple related failures, | |||
and the more complex aspects of their protection are a matter for | and the more complex aspects of their protection are a matter for | |||
further study. | further study. | |||
One specific example of an SRLG which is clearly within the scope of | One specific example of an SRLG which is clearly within the scope of | |||
this work is a node failure. This causes the simultaneous failure of | this work is a node failure. This causes the simultaneous failure of | |||
multiple links, but their closely defined topological relationship | multiple links, but their closely defined topological relationship | |||
makes the problem more tractable. | makes the problem more tractable. | |||
3.3. Local Area Networks | 4.3. Local Area Networks | |||
Protection against partial or complete failure of LANs is more | Protection against partial or complete failure of LANs is more | |||
complex than the point to point case. In general there is a tradeoff | complex than the point to point case. In general there is a tradeoff | |||
between the simplicity of the repair and the ability to provide | between the simplicity of the repair and the ability to provide | |||
complete and optimal repair coverage. | complete and optimal repair coverage. | |||
3.4. Mechanisms for micro-loop prevention | 4.4. Mechanisms for micro-loop prevention | |||
Control of micro-loops is important not only because they can cause | Control of micro-loops is important not only because they can cause | |||
packet loss in traffic which is affected by the failure, but because | packet loss in traffic which is affected by the failure, but because | |||
by saturating a link with looping packets they can also cause | by saturating a link with looping packets they can also cause | |||
congestion loss of traffic flowing over that link which would | congestion loss of traffic flowing over that link which would | |||
otherwise be unaffected by the failure. | otherwise be unaffected by the failure. | |||
A number of solutions to the problem of micro-loop formation have | A number of solutions to the problem of micro-loop formation have | |||
been proposed and are summarized in [MICROLOOP]. The following | been proposed and are summarized in [I-D.ietf-rtgwg-lf-conv-frmwk]. | |||
factors are significant in their classification: | The following factors are significant in their classification: | |||
1. Partial or complete protection against micro-loops. | 1. Partial or complete protection against micro-loops. | |||
2. Delay imposed upon convergence. | 2. Delay imposed upon convergence. | |||
3. Tolerance of multiple failures (from node failures, and in | 3. Tolerance of multiple failures (from node failures, and in | |||
general). | general). | |||
4. Computational complexity (pre-computed or real time). | 4. Computational complexity (pre-computed or real time). | |||
5. Applicability to scheduled events. | 5. Applicability to scheduled events. | |||
6. Applicability to link/node reinstatement. | 6. Applicability to link/node reinstatement. | |||
4. Management Considerations | 5. Management Considerations | |||
While many of the management requirements will be specific to | While many of the management requirements will be specific to | |||
particular IPFRR solutions, the following general aspects need to be | particular IPFRR solutions, the following general aspects need to be | |||
addressed: | addressed: | |||
1. Configuration | 1. Configuration | |||
a. Enabling/disabling IPFRR support. | A. Enabling/disabling IPFRR support. | |||
b. Enabling/disabling protection on a per link/node basis. | B. Enabling/disabling protection on a per link/node basis. | |||
c. Expressing preferences regarding the links/nodes used for | C. Expressing preferences regarding the links/nodes used for | |||
repair paths. | repair paths. | |||
d. Configuration of failure detection mechanisms. | D. Configuration of failure detection mechanisms. | |||
e. Configuration of loop avoidance strategies. | E. Configuration of loop avoidance strategies | |||
2. Monitoring | 2. Monitoring | |||
a. Notification of links/nodes/destinations which cannot be | A. Notification of links/nodes/destinations which cannot be | |||
protected. | protected. | |||
b. Notification of pre-computed repair paths, and anticipated | B. Notification of pre-computed repair paths, and anticipated | |||
traffic patterns. | traffic patterns. | |||
c. Counts of failure detections, protection invocations and | C. Counts of failure detections, protection invocations and | |||
packets forwarded over repair paths. | packets forwarded over repair paths. | |||
5. Scope and applicability | 6. Scope and applicability | |||
The initial scope of this work is in the context of link state IGPs. | The initial scope of this work is in the context of link state IGPs. | |||
Link state protocols provide ubiquitous topology information, which | Link state protocols provide ubiquitous topology information, which | |||
facilitates the computation of repair paths. | facilitates the computation of repair paths. | |||
Provision of similar facilities in non-link state IGPs and BGP is a | Provision of similar facilities in non-link state IGPs and BGP is a | |||
matter for further study, but the correct operation of the repair | matter for further study, but the correct operation of the repair | |||
mechanisms for traffic with a destination outside the IGP domain is | mechanisms for traffic with a destination outside the IGP domain is | |||
an important consideration for solutions based on this framework. | an important consideration for solutions based on this framework. | |||
6. IANA considerations | 7. IANA Considerations | |||
There are no IANA considerations that arise from this framework | There are no IANA considerations that arise from this framework | |||
document. | document. | |||
7. Security Considerations | 8. Security Considerations | |||
This framework document does not itself introduce any security | This framework document does not itself introduce any security | |||
issues, but attention must be paid to the security implications of | issues, but attention must be paid to the security implications of | |||
any proposed solutions to the problem. | any proposed solutions to the problem. | |||
8. IPR Disclosure Acknowledgement | ||||
Certain IPR may be applicable to the mechanisms outlined in this | ||||
document. Please check the detailed specifications for possible IPR | ||||
notices. | ||||
The IETF takes no position regarding the validity or scope of any | ||||
Intellectual Property Rights or other rights that might be claimed to | ||||
pertain to the implementation or use of the technology described in | ||||
this document or the extent to which any license under such rights | ||||
might or might not be available; nor does it represent that it has | ||||
made any independent effort to identify any such rights. Information | ||||
on the procedures with respect to rights in RFC documents can be | ||||
found in BCP 78 and BCP 79. | ||||
Copies of IPR disclosures made to the IETF Secretariat and any | ||||
assurances of licenses to be made available, or the result of an | ||||
attempt made to obtain a general license or permission for the use of | ||||
such proprietary rights by implementers or users of this | ||||
specification can be obtained from the IETF on-line IPR repository at | ||||
http://www.ietf.org/ipr. | ||||
The IETF invites any interested party to bring to its attention any | ||||
copyrights, patents or patent applications, or other proprietary | ||||
rights that may cover technology that may be required to implement | ||||
this standard. Please address the information to the IETF at | ||||
ietf-ipr@ietf.org. | ||||
9. Acknowledgements | 9. Acknowledgements | |||
The authors would like to acknowledge contributions made by Alia | The authors would like to acknowledge contributions made by Alia | |||
Atlas and Alex Zinin. | Atlas, Clarence Filsfils, Pierre Francois, Joel Halpern, Stefano | |||
Previdi and Alex Zinin. | ||||
10. Normative References | 10. Informative References | |||
Internet-drafts are works in progress available from | [FIFR] Nelakuditi, S., Lee, S., Lu, Y., Zhang, Z., and C. Chuah, | |||
http://www.ietf.org/internet-drafts/ | "Fast local rerouting for handling transient link | |||
failures.", Tech. Rep. TR-2004-004, 2004. | ||||
11. Informative References | [I-D.atlas-ip-local-protect-uturn] | |||
Atlas, A., "U-turn Alternates for IP/LDP Fast-Reroute", | ||||
draft-atlas-ip-local-protect-uturn-03 (work in progress), | ||||
March 2006. | ||||
Internet-drafts are works in progress available from | [I-D.bryant-ipfrr-tunnels] | |||
http://www.ietf.org/internet-drafts/ | Bryant, S., Filsfils, C., Previdi, S., and M. Shand, "IP | |||
Fast Reroute using tunnels", draft-bryant-ipfrr-tunnels-03 | ||||
(work in progress), November 2007. | ||||
[ALT-SP] Tian, A., Chen, N., "Fast Reroute using | [I-D.ietf-bfd-base] | |||
Alternative Shortest Paths", draft-tian-frr- | Katz, D. and D. Ward, "Bidirectional Forwarding | |||
alt-shortest-path-01.txt, (work in progress) | Detection", draft-ietf-bfd-base-07 (work in progress), | |||
January 2008. | ||||
[BASE] Atlas, A., Zinin, A., "Basic Specification | [I-D.ietf-rtgwg-ipfrr-notvia-addresses] | |||
for IP Fast-Reroute: Loop-free Alternates", | Bryant, S., "IP Fast Reroute Using Not-via Addresses", | |||
draft-ietf-rtgwg-ipfrr-spec-base-06.txt, | draft-ietf-rtgwg-ipfrr-notvia-addresses-01 (work in | |||
(work in progress) | progress), July 2007. | |||
[BFD] Katz, D. and Ward, D., "Bidirectional | [I-D.ietf-rtgwg-ipfrr-spec-base] | |||
Forwarding Detection", | Atlas, A., Zinin, A., Torvi, R., Choudhury, G., Martin, | |||
draft-ietf-bfd-base-06.txt, (work in | C., Imhoff, B., and D. Fedyk, "Basic Specification for IP | |||
progress). | Fast-Reroute: Loop-free Alternates", | |||
draft-ietf-rtgwg-ipfrr-spec-base-10 (work in progress), | ||||
November 2007. | ||||
[FIFR] S. Nelakuditi, S. Lee, Y. Yu, Z.-L. Zhang, | [I-D.ietf-rtgwg-lf-conv-frmwk] | |||
and C.-N. Chuah, "Fast local rerouting for | Shand, M. and S. Bryant, "A Framework for Loop-free | |||
handling transient link failures.," Tech. | Convergence", draft-ietf-rtgwg-lf-conv-frmwk-02 (work in | |||
Rep. TR-2004-004, University of South | progress), February 2008. | |||
Carolina, 2004. | ||||
[MPLSFRR] Pan, P. et al, "Fast Reroute Extensions to | [I-D.tian-frr-alt-shortest-path] | |||
RSVP-TE for LSP Tunnels", RFC 4090. | Tian, A., "Fast Reroute using Alternative Shortest Paths", | |||
draft-tian-frr-alt-shortest-path-01 (work in progress), | ||||
July 2004. | ||||
[MICROLOOP] Bryant, S. and Shand, M., "A Framework for | [RFC4090] Pan, P., Swallow, G., and A. Atlas, "Fast Reroute | |||
Loop-free Convergence", | Extensions to RSVP-TE for LSP Tunnels", RFC 4090, | |||
draft-bryant-shand-lf-conv-frmwk-04.txt, | May 2005. | |||
(work in progress). | ||||
[NOT-VIA] Bryant, S., Previdi, S., Shand, M., "IP Fast | [SIMULA] Lysne, O., Kvalbein, A., Cicic, T., Gjessing, S., and A. | |||
Reroute Using Notvia Addresses", | ||||
draft-ietf-rtgwg-ipfrr-notvia-addresses- | ||||
01.txt, (work in progress). | ||||
[SIMULA] Lysne, O., et al, "Fast IP Network Recovery | Hansen, "Fast IP Network Recovery using Multiple Routing | |||
using Multiple Routing Configurations", | Configurations.", Infocom 10.1109/INFOCOM.2006.227, 2006, | |||
http://folk.uio.no/amundk/infocom06.pdf | <http://folk.uio.no/amundk/infocom06.pdf>. | |||
[TUNNELS] Bryant, S. et al, "IP Fast Reroute using | Authors' Addresses | |||
tunnels", draft-bryant-ipfrr-tunnels-02.txt, | ||||
(work in progress). | ||||
[U-TURNS] Atlas, A. et al, "IP/LDP Local Protection", | Mike Shand | |||
draft-atlas-ip-local-protect-03.txt, (work in | Cisco Systems | |||
progress). | 250, Longwater Avenue. | |||
Reading, Berks RG2 6GB | ||||
UK | ||||
12. Authors' Addresses | Email: mshand@cisco.com | |||
Stewart Bryant | Stewart Bryant | |||
Cisco Systems, | Cisco Systems | |||
250, Longwater Avenue, | 250, Longwater Avenue. | |||
Green Park, | Reading, Berks RG2 6GB | |||
Reading, RG2 6GB, | UK | |||
United Kingdom. Email: stbryant@cisco.com | ||||
Mike Shand | Email: stbryant@cisco.com | |||
Cisco Systems, | ||||
250, Longwater Avenue, | ||||
Green Park, | ||||
Reading, RG2 6GB, | ||||
United Kingdom. Email: mshand@cisco.com | ||||
Disclaimer of Validity | Full Copyright Statement | |||
Copyright (C) The IETF Trust (2008). | ||||
This document is subject to the rights, licenses and restrictions | ||||
contained in BCP 78, and except as set forth therein, the authors | ||||
retain all their rights. | ||||
This document and the information contained herein are provided on an | This document and the information contained herein are provided on an | |||
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS | "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS | |||
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND | OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND | |||
THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS | THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS | |||
OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF | OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF | |||
THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED | THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED | |||
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. | WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. | |||
Copyright statement | Intellectual Property | |||
Copyright (C) The IETF Trust (2007). This document is subject to the | ||||
rights, licenses and restrictions contained in BCP 78, and except as | The IETF takes no position regarding the validity or scope of any | |||
set forth therein, the authors retain all their rights. | Intellectual Property Rights or other rights that might be claimed to | |||
pertain to the implementation or use of the technology described in | ||||
this document or the extent to which any license under such rights | ||||
might or might not be available; nor does it represent that it has | ||||
made any independent effort to identify any such rights. Information | ||||
on the procedures with respect to rights in RFC documents can be | ||||
found in BCP 78 and BCP 79. | ||||
Copies of IPR disclosures made to the IETF Secretariat and any | ||||
assurances of licenses to be made available, or the result of an | ||||
attempt made to obtain a general license or permission for the use of | ||||
such proprietary rights by implementers or users of this | ||||
specification can be obtained from the IETF on-line IPR repository at | ||||
http://www.ietf.org/ipr. | ||||
The IETF invites any interested party to bring to its attention any | ||||
copyrights, patents or patent applications, or other proprietary | ||||
rights that may cover technology that may be required to implement | ||||
this standard. Please address the information to the IETF at | ||||
ietf-ipr@ietf.org. | ||||
Acknowledgment | ||||
Funding for the RFC Editor function is provided by the IETF | ||||
Administrative Support Activity (IASA). | ||||
End of changes. 91 change blocks. | ||||
240 lines changed or deleted | 244 lines changed or added | |||