draft-ietf-rtgwg-ordered-fib-04.txt   draft-ietf-rtgwg-ordered-fib-05.txt 
Network Working Group Pierre Francois Network Working Group Pierre Francois
Internet-Draft Olivier Bonaventure Internet-Draft Olivier Bonaventure
Intended status: Experimental Universite catholique de Louvain Intended status: Experimental Universite catholique de Louvain
Expires: April 25, 2011 Mike Shand Expires: October 22, 2011 Mike Shand
Stewart Bryant Stewart Bryant
Stefano Previdi Stefano Previdi
Clarence Filsfils Clarence Filsfils
Cisco Systems Cisco Systems
October 22, 2010 April 20, 2011
Loop-free convergence using oFIB Loop-free convergence using oFIB
draft-ietf-rtgwg-ordered-fib-04 draft-ietf-rtgwg-ordered-fib-05
Abstract Abstract
This draft describes a mechanism for use in conjunction with link This document describes a mechanism for use in conjunction with link
state routing protocols which prevents the transient loops which state routing protocols which prevents the transient loops which
would otherwise occur during topology changes. It does this by would otherwise occur during topology changes. It does this by
correctly sequencing the FIB updates on the routers. correctly sequencing the FIB updates on the routers.
This mechanism can be used in the case of non-urgent link or node This mechanism can be used in the case of non-urgent link or node
shutdowns and restarts or link metric changes. It can also be used shutdowns and restarts or link metric changes. It can also be used
in conjunction with a FRR mechanism which converts a sudden link or in conjunction with a fast re-route mechanism which converts a sudden
node failure into a non-urgent topology change. This is possible link or node failure into a non-urgent topology change. This is
where a complete repair path is provided for all affected possible where a complete repair path is provided for all affected
destinations. destinations.
After a non-urgent topology change, each router computes a rank that After a non-urgent topology change, each router computes a rank that
defines the time at which it can safely update its FIB. A method for defines the time at which it can safely update its FIB. A method for
accelerating this loop-free convergence process by the use of accelerating this loop-free convergence process by the use of
completion messages is also described. completion messages is also described.
The technology described in this document has been subject to
extensive simulation using real network topologies and costs, and
pathological convergence behaviour. A variant of the technology
described here has been experimentally deployed in a production
network.
Status of this Memo Status of this Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on April 25, 2011.
This Internet-Draft will expire on October 22, 2011.
Copyright Notice Copyright Notice
Copyright (c) 2010 IETF Trust and the persons identified as the Copyright (c) 2011 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
skipping to change at page 3, line 17 skipping to change at page 3, line 17
1. Conventions used in the document . . . . . . . . . . . . . . . 4 1. Conventions used in the document . . . . . . . . . . . . . . . 4
2. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 2. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4
3. The required FIB update order . . . . . . . . . . . . . . . . 5 3. The required FIB update order . . . . . . . . . . . . . . . . 5
3.1. Single Link Events . . . . . . . . . . . . . . . . . . . . 5 3.1. Single Link Events . . . . . . . . . . . . . . . . . . . . 5
3.1.1. Link Down / Metric Increase . . . . . . . . . . . . . 5 3.1.1. Link Down / Metric Increase . . . . . . . . . . . . . 5
3.1.2. Link Up / Metric Decrease . . . . . . . . . . . . . . 6 3.1.2. Link Up / Metric Decrease . . . . . . . . . . . . . . 6
3.2. Multi-link events . . . . . . . . . . . . . . . . . . . . 7 3.2. Multi-link events . . . . . . . . . . . . . . . . . . . . 7
3.2.1. Router Down events . . . . . . . . . . . . . . . . . . 7 3.2.1. Router Down events . . . . . . . . . . . . . . . . . . 7
3.2.2. Router Up events . . . . . . . . . . . . . . . . . . . 7 3.2.2. Router Up events . . . . . . . . . . . . . . . . . . . 7
3.2.3. Linecard Failure/Restoration Events . . . . . . . . . 7 3.2.3. Linecard Failure/Restoration Events . . . . . . . . . 7
4. Applying ordered FIB updates . . . . . . . . . . . . . . . . . 7 4. Applying ordered FIB updates . . . . . . . . . . . . . . . . . 8
4.1. Deducing the topology change . . . . . . . . . . . . . . . 7 4.1. Deducing the topology change . . . . . . . . . . . . . . . 8
4.2. Deciding if ordered FIB updates applies . . . . . . . . . 8 4.2. Deciding if ordered FIB updates applies . . . . . . . . . 8
5. Computation of the ordering . . . . . . . . . . . . . . . . . 9 5. Computation of the ordering . . . . . . . . . . . . . . . . . 9
5.1. Link or Router Down or Metric Increase . . . . . . . . . . 9 5.1. Link or Router Down or Metric Increase . . . . . . . . . . 9
5.2. Link or Router Up or Metric Decrease . . . . . . . . . . . 10 5.2. Link or Router Up or Metric Decrease . . . . . . . . . . . 10
6. Acceleration of Ordered Convergence . . . . . . . . . . . . . 10 6. Acceleration of Ordered Convergence . . . . . . . . . . . . . 10
6.1. Construction of the waiting list and notification list . . 11 6.1. Construction of the waiting list and notification list . . 11
6.1.1. Down events . . . . . . . . . . . . . . . . . . . . . 11 6.1.1. Down events . . . . . . . . . . . . . . . . . . . . . 11
6.1.2. Up Events . . . . . . . . . . . . . . . . . . . . . . 11 6.1.2. Up Events . . . . . . . . . . . . . . . . . . . . . . 11
6.2. Format of Completion Messages . . . . . . . . . . . . . . 11 6.2. Format of Completion Messages . . . . . . . . . . . . . . 11
7. Fall back to Conventional Convergence . . . . . . . . . . . . 12 7. Fall back to Conventional Convergence . . . . . . . . . . . . 12
8. oFIB state machine . . . . . . . . . . . . . . . . . . . . . . 12 8. oFIB state machine . . . . . . . . . . . . . . . . . . . . . . 12
8.1. OFIB_STABLE . . . . . . . . . . . . . . . . . . . . . . . 12 8.1. OFIB_STABLE . . . . . . . . . . . . . . . . . . . . . . . 12
8.2. OFIB_HOLDING_DOWN . . . . . . . . . . . . . . . . . . . . 13 8.2. OFIB_HOLDING_DOWN . . . . . . . . . . . . . . . . . . . . 13
8.3. OFIB_HOLDING_UP . . . . . . . . . . . . . . . . . . . . . 14 8.3. OFIB_HOLDING_UP . . . . . . . . . . . . . . . . . . . . . 14
8.4. OFIB_ONGOING . . . . . . . . . . . . . . . . . . . . . . . 15 8.4. OFIB_ONGOING . . . . . . . . . . . . . . . . . . . . . . . 15
8.5. OFIB_ABANDONED . . . . . . . . . . . . . . . . . . . . . . 15 8.5. OFIB_ABANDONED . . . . . . . . . . . . . . . . . . . . . . 15
9. IANA considerations . . . . . . . . . . . . . . . . . . . . . 16 9. IANA considerations . . . . . . . . . . . . . . . . . . . . . 16
10. Security considerations . . . . . . . . . . . . . . . . . . . 16 10. Security considerations . . . . . . . . . . . . . . . . . . . 16
11. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 16 11. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 16
12. Informative References . . . . . . . . . . . . . . . . . . . . 16 12. Informative References . . . . . . . . . . . . . . . . . . . . 16
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 17 Appendix A. Mechanisms for Safely Abandoning Loop-Free
Convergence (AAH) . . . . . . . . . . . . . . . . . . 17
A.1. Possible Solutions . . . . . . . . . . . . . . . . . . . . 17
A.2. Hold-down timer only . . . . . . . . . . . . . . . . . . . 17
A.3. AAH messages . . . . . . . . . . . . . . . . . . . . . . . 18
A.3.1. Per Router State Machine . . . . . . . . . . . . . . . 19
A.3.2. Per Neighbor State Machine . . . . . . . . . . . . . . 21
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 22
1. Conventions used in the document 1. Conventions used in the document
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in BCP 14, [4]. document are to be interpreted as described in RFC2119 [4].
2. Introduction 2. Introduction
With link-state protocols, such as IS-IS [1] and OSPF [5], each time With link-state protocols, such as IS-IS [1] and OSPF [5], each time
the network topology changes, some routers need to modify their the network topology changes, some routers need to modify their
Forwarding Information Base (FIB) to take into account the new Forwarding Information Base (FIB) to take into account the new
topology. Each topology change causes a convergence phase. During topology. Each topology change causes a convergence phase. During
this phase, routers may transiently have inconsistent FIBs, which may this phase, routers may transiently have inconsistent FIBs, which may
lead to packet loops and losses, even if the reachability of the lead to packet loops and losses, even if the reachability of the
destinations is not compromised after the topology change. Packet destinations is not compromised after the topology change. Packet
skipping to change at page 5, line 8 skipping to change at page 5, line 8
| | | |
| | | |
S---------------------------R S---------------------------R
2 2
Figure 1: A simple topology Figure 1: A simple topology
It should be noted that the loops can occur remotely from the It should be noted that the loops can occur remotely from the
failure, not just adjacent to it. failure, not just adjacent to it.
The goal of this draft is to define a mechanism which sequences the The goal of this document is to define a mechanism which sequences
router FIB updates to maintain consistency throughout the network. the router FIB updates to maintain consistency throughout the
By correctly setting the FIB change order no looping or packet loss network. By correctly setting the FIB change order no looping or
can occur. This mechanism may be applied to the case of managed packet loss can occur. This mechanism may be applied to the case of
link-state changes, i.e. link metric change, manual link down/up, managed link-state changes, i.e. link metric change, manual link
manual router down/up, and managed state changes of a set of links down/up, manual router down/up, and managed state changes of a set of
attached to one router. It may also be applied to the case where one links attached to one router. It may also be applied to the case
or more network elements are protected by a fast re-route mechanism where one or more network elements are protected by a fast re-route
[7] [6]. The mechanisms that are used in the failure case are mechanism [7] [6]. The mechanisms that are used in the failure case
exactly the same as those used for managed changes. For simplicity are exactly the same as those used for managed changes. For
this draft makes no further distinction between managed and unplanned simplicity this document makes no further distinction between managed
changes. and unplanned changes.
The technology described in this document has been subject to
extensive simulation using real network topologies and costs and
pathological convergence behaviour. A variant of the technology
described here has been experimentally deployed in a production
network.
3. The required FIB update order 3. The required FIB update order
This section provides an overview of the required ordering of the FIB This section provides an overview of the required ordering of the FIB
updates. A more detailed analysis of the rerouting dynamics and updates. A more detailed analysis of the rerouting dynamics and
correctness proofs of the mechanism can be found in [3]. correctness proofs of the mechanism can be found in [3].
3.1. Single Link Events 3.1. Single Link Events
For simplicity the correct ordering for single link changes are For simplicity the correct ordering for single link changes are
described first. The draft then builds on this to demonstrate that described first. The document then builds on this to demonstrate
the same principles can be applied to more complex scenarios such as that the same principles can be applied to more complex scenarios
line card or node changes. such as line card or node changes.
3.1.1. Link Down / Metric Increase 3.1.1. Link Down / Metric Increase
First consider the non-urgent failure of a link or the increase of a First consider the non-urgent failure of a link or the increase of a
link metric. In this case, a router R MUST NOT update its FIB until link metric. In this case, a router R MUST NOT update its FIB until
all other routers that send traffic via R and the affected link have all other routers that send traffic via R and the affected link have
first updated their FIBs. first updated their FIBs.
The following argument shows that this rule ensures the correct order The following argument shows that this rule ensures the correct order
of FIB change when the link X->Y is shut down or its metric is of FIB change when the link X->Y is shut down or its metric is
skipping to change at page 8, line 50 skipping to change at page 9, line 11
If an event is received within the hold down period which does NOT If an event is received within the hold down period which does NOT
reference the common router (R) then in this version of the reference the common router (R) then in this version of the
specification normal convergence is invoked immediately (see specification normal convergence is invoked immediately (see
Section 7). Section 7).
The sudden failure of a link or a set of links that are not protected The sudden failure of a link or a set of links that are not protected
using a FRR mechanism must be processed using the conventional mode using a FRR mechanism must be processed using the conventional mode
of operation. of operation.
In summary an ordered FIB process is applicable iif the set of link In summary an ordered FIB process is applicable if the set of link
state notifications received between the first event and the hold state notifications received between the first event and the hold
down period reference a common router R, and one of the following down period reference a common router R, and one of the following
assertions is verified : assertions is verified :
. The set of notifications refer to link down events concerning . The set of notifications refer to link down events concerning
protected links and metric increase events protected links and metric increase events
. The set of notifications refer to link up events and metric . The set of notifications refer to link up events and metric
decrease events. decrease events.
5. Computation of the ordering 5. Computation of the ordering
This section describes how the required ordering is computed. This section describes how the required ordering is computed.
5.1. Link or Router Down or Metric Increase 5.1. Link or Router Down or Metric Increase
To respect the proposed ordering, routers compute a rank that will be To respect the proposed ordering, routers compute a rank that will be
used to determine the time at which they are permitted to perform used to determine the time at which they are permitted to perform
their FIB update. In the case of a failure event rooted at router Y their FIB update. In the case of a failure event rooted at router Y
or an increase of the metric of link X->Y, router R computes the or an increase of the metric of link X->Y, router R computes the
reverse Shortest Path Tree in the topology before the failure reverse Shortest Path Tree (rSPT) in the topology before the failure
(rSPT_OLD) rooted at Y. This rSPT gives the shortest paths to reach Y (rSPT_OLD) rooted at Y. This rSPT gives the shortest paths to reach Y
before the failure. The branch of the reverse SPT that is below R before the failure. The branch of the reverse SPT that is below R
corresponds to the set of shortest paths to R that are used by the corresponds to the set of shortest paths to R that are used by the
routers that reach Y via R. routers that reach Y via R.
The rank of router R is defined as the depth (in number of hops) of The rank of router R is defined as the depth (in number of hops) of
this branch. In the case of ECMP, the maximum depth of the ECMP path this branch. In the case of ECMP, the maximum depth of the ECMP path
set is used. set is used.
Router R is required to update its FIB at time Router R is required to update its FIB at time
T0 + H + rank * MAX_FIB T0 + H + rank * MAX_FIB
where T0 is the arrival time of the link-state packet containing the where T0 is the arrival time of the link-state packet containing the
topology change, H is the hold-down time and MAX_FIB is a network- topology change, H is the hold-down time and MAX_FIB is a network-
wide constant that reflects the maximum time required to update a FIB wide constant that reflects the maximum time required to update a FIB
irrespective of the change required. The value of MAX_FIB is network irrespective of the change required. The value of MAX_FIB is network
specific and its determination is out of the scope of this document. specific and its determination is out of the scope of this document.
This value must be agreed by all the routers in the network. This This value must be agreed by all the routers in the network. This
agreement can be performed by using a capability TLV as defined in agreement can be performed by using a capability TLV as defined in
[8].
[9].
All the routers that use R to reach Y will compute a lower rank than All the routers that use R to reach Y will compute a lower rank than
R, and hence the correct order will be respected. It should be noted R, and hence the correct order will be respected. It should be noted
that only the routers that used Y before the event need to compute that only the routers that used Y before the event need to compute
their rank. their rank.
5.2. Link or Router Up or Metric Decrease 5.2. Link or Router Up or Metric Decrease
In the case of a link or router up event rooted at Y or a link metric In the case of a link or router up event rooted at Y or a link metric
decrease affecting link Y->W, a router R must have a rank that is decrease affecting link Y->W, a router R must have a rank that is
skipping to change at page 11, line 29 skipping to change at page 11, line 36
list of R is equal to the latter. list of R is equal to the latter.
Note that R could include all its neighbors except those in the Note that R could include all its neighbors except those in the
Waiting list in the Notification list, this has no impact on the Waiting list in the Notification list, this has no impact on the
correctness of the protocol, but would be unnecessarily inefficient. correctness of the protocol, but would be unnecessarily inefficient.
6.1.2. Up Events 6.1.2. Up Events
Consider a link or node up event rooted at router Y or the cost Consider a link or node up event rooted at router Y or the cost
decrease of the link Y->X. A router R will compute its new SPT decrease of the link Y->X. A router R will compute its new SPT
(SPT_new(R)). The Waiting list is the set of nexthop routers that R (SPT_new(R)). The Waiting list is the set of next hop routers that R
uses to reach Y in SPT_new(R). uses to reach Y in SPT_new(R).
In a simple implementation the notification list of R is all the In a simple implementation the notification list of R is all the
neighbours of R excluding those in the Waiting list. This may be neighbours of R excluding those in the Waiting list. This may be
further optimized by computing rSPT_new(Y) to determine those routers further optimized by computing rSPT_new(Y) to determine those routers
that are waiting for R to complete. that are waiting for R to complete.
6.2. Format of Completion Messages 6.2. Format of Completion Messages
The format of completion messages and means of their delivery is The format of completion messages and means of their delivery is
skipping to change at page 12, line 11 skipping to change at page 12, line 18
. Node ID of the near end of the link . Node ID of the near end of the link
. Node ID of the far end of the link . Node ID of the far end of the link
. Old Metric . Old Metric
. New Metric . New Metric
7. Fall back to Conventional Convergence 7. Fall back to Conventional Convergence
In circumstances where a router detects that it is dealing with In circumstances where a router detects that it is dealing with
incomplete or inconsistent link state information, or when a further incomplete or inconsistent link state information, or when a further
topology event is received before completion of the current ordered topology event is received before completion of the current ordered
FIB update process it may be expedient to abandon the controlled FIB update process, it may be expedient to abandon the controlled
convergence process. Fall back mechanisms are investigated in [9]. convergence process. A number of possible fall back mechanisms are
The state machine defined in this version of the draft does not make described in Appendix A. The state machine defined in the body of
an assumption on which fall back mechanism will be used. this document does not make any assumption about which fall back
mechanism will be used.
8. oFIB state machine 8. oFIB state machine
An ofib capable router maintains an ofib state value which can be one An oFIB capable router maintains an oFIB state value which can be one
of : OFIB_STABLE, OFIB_HOLDING_DOWN, OFIB_HOLDING_UP, OFIB_ABANDONED, of : OFIB_STABLE, OFIB_HOLDING_DOWN, OFIB_HOLDING_UP, OFIB_ABANDONED,
OFIB_ONGOING. OFIB_ONGOING.
An ofib capable router maintains a timer, Hold_down_timer. An ofib An oFIB capable router maintains a timer, Hold_down_timer. An oFIB
capable router is configured with a value refered to as capable router is configured with a value referred to as
HOLD_DOWN_DURATION. This configuration can be performed manually or HOLD_DOWN_DURATION. This configuration can be performed manually or
using [8]. using [9].
An ofib capable router maintains a timer, rank_timer. An oFIB capable router maintains a timer, rank_timer.
8.1. OFIB_STABLE 8.1. OFIB_STABLE
OFIB_STABLE is the state of a router which is not currently involved OFIB_STABLE is the state of a router which is not currently involved
in any convergence process. This router is ready to process an event in any convergence process. This router is ready to process an event
by applying ofib. by applying oFIB.
EVENT : Reception of a link-state packet describing an event of the EVENT : Reception of a link-state packet describing an event of the
type link X--Y down or metric increase to be processed using oFIB. type link X--Y down or metric increase to be processed using oFIB.
ACTION : Set state to OFIB_HOLDING_DOWN. Start Hold_down_timer. ACTION : Set state to OFIB_HOLDING_DOWN. Start Hold_down_timer.
ofib_current_common_set = {X,Y}. Compute rank with respect to the ofib_current_common_set = {X,Y}. Compute rank with respect to the
event, as defined in Section 5. Store Waiting List and Notification event, as defined in Section 5. Store Waiting List and Notification
List for X--Y obtained from the rank computation. List for X--Y obtained from the rank computation.
EVENT : Reception of a link-state packet describing an event of the EVENT : Reception of a link-state packet describing an event of the
skipping to change at page 16, line 9 skipping to change at page 16, line 15
ACTION : Trigger AAH, reset Hold_down_timer. ACTION : Trigger AAH, reset Hold_down_timer.
EVENT : Hold_down_timer expires. EVENT : Hold_down_timer expires.
ACTION : Set state to OFIB_STABLE ACTION : Set state to OFIB_STABLE
9. IANA considerations 9. IANA considerations
There are no IANA considerations which arise from this document. Any There are no IANA considerations which arise from this document. Any
such considerations will be called out in protocol specific documents such considerations will be called out in protocol specific documents
such as [8]and [2] such as [9]and [2]
10. Security considerations 10. Security considerations
This draft requires only minor modifications to existing routing This document requires only minor modifications to existing routing
protocols and therefore does not add significant additional security protocols and therefore does not add significant additional security
risks. However a full security analysis would need to be provided risks. However a full security analysis would need to be provided
within the protocol specific specifications proposed for deployment. within the protocol specific specifications proposed for deployment.
11. Acknowledgments 11. Acknowledgments
We would like to thank Jean-Philippe Vasseur for his useful We would like to thank Jean-Philippe Vasseur and Les Ginsberg for
suggestions and comments. their useful suggestions and comments.
12. Informative References 12. Informative References
[1] International Organization for Standardization, "Intermediate [1] International Organization for Standardization, "Intermediate
system to Intermediate system intra-domain routeing information system to Intermediate system intra-domain routeing information
exchange protocol for use in conjunction with the protocol for exchange protocol for use in conjunction with the protocol for
providing the connectionless-mode Network Service (ISO 8473)", providing the connectionless-mode Network Service (ISO 8473)",
ISO/IEC 10589:2002, Second Edition, Nov 2002. ISO/IEC 10589:2002, Second Edition, Nov 2002.
[2] Bonaventure, O., "ISIS extensions for ordered FIB updates", [2] Bonaventure, O., "ISIS extensions for ordered FIB updates",
skipping to change at page 17, line 5 skipping to change at page 17, line 13
Levels", BCP 14, RFC 2119, March 1997. Levels", BCP 14, RFC 2119, March 1997.
[5] Moy, J., "OSPF Version 2", STD 54, RFC 2328, April 1998. [5] Moy, J., "OSPF Version 2", STD 54, RFC 2328, April 1998.
[6] Pan, P., Swallow, G., and A. Atlas, "Fast Reroute Extensions to [6] Pan, P., Swallow, G., and A. Atlas, "Fast Reroute Extensions to
RSVP-TE for LSP Tunnels", RFC 4090, May 2005. RSVP-TE for LSP Tunnels", RFC 4090, May 2005.
[7] Shand, M. and S. Bryant, "IP Fast Reroute Framework", RFC 5714, [7] Shand, M. and S. Bryant, "IP Fast Reroute Framework", RFC 5714,
January 2010. January 2010.
[8] K, A. and S. Bryant, "Synchronisation of Loop Free Timer [8] Shand, M. and S. Bryant, "A Framework for Loop-Free
Convergence", RFC 5715, January 2010.
[9] K, A. and S. Bryant, "Synchronisation of Loop Free Timer
Values", draft-atlas-bryant-shand-lf-timers-04 (work in Values", draft-atlas-bryant-shand-lf-timers-04 (work in
progress), February 2008. progress), February 2008.
[9] Shand, M., Bryant, S., and P. Francois, "Mechanisms for safely Appendix A. Mechanisms for Safely Abandoning Loop-Free Convergence
abandoning loop-free convergence (AAH)", (AAH)
draft-bryant-francois-shand-ipfrr-aah-01 (work in progress),
October 2008. IPFRR[7] and loop-free convergence techniques [8] can deal with
single topology change events, multiple correlated change events, and
in some cases even certain uncorrelated events. However, in all
cases there are events which cannot be dealt with and the mechanism
needs to quickly revert to normal convergence. This is known as
"Abandoning All Hope" (AAH).
This appendix describes the outcome of a design study into the AAH
problem, and is included here to trigger discussion on the trade-offs
between complexity and robustness in the AAH solution-space.
A.1. Possible Solutions
Two approaches to this problem have been proposed:
1. Hold-down timer only.
2. Synchronization of AAH state using AAH messages.
These are described below.
A.2. Hold-down timer only
This method uses a hold-down to acquire a set of LSPs which should be
processed together. On expiry of the local hold-down timer, the
router begins processing the batch of LSPs according to the loop free
prevention algorithm.
There are a number of problems with this simple approach. In some
cases the timer value will be too short to ensure that all the
related events have arrived at all routers (perhaps because there was
some unexpected propagation delay, or one or more of the events are
slow in being detected). In other cases, a completely unrelated
event may occur after the timer has expired, but before the
processing is complete. In addition, since the timer is started at
each router on reception of the first LSP announcing a topology
change, the actual starting time is dependant upon the propagation
time of the first LSP. So, for a subsequent event occurring around
the time of the timer expiry, because of variations in propagation
delay it may reach some routers before the timer expires and others
after it has expired. In the former case this LSP will be included
in the set of changes to be considered, while in the latter it will
be excluded leasing to serious routing inconsistency. In such cases
continuing to operate the loop-free convergence protocol may
exacerbate the situation.
The simple approach to this would be to revert to normal convergence
(AAH) whenever an LSP is received after the timer has expired.
However this also has problems for the reasons above and therefore
AAH must be a synchronous operation, i.e. it is necessary to arrange
that an AAH invoked anywhere in the network causes ALL routers to
AAH.
It is also necessary to consider the means of exiting the AAH state.
Again the simplest method is to use a timer. However while in AAH
state any topology changes previously received, or which are
subsequently received, should be processed immediately using the
traditional convergence algorithms i.e. without invoking controlled
convergence. If the exit from the AAH state is not correctly
synchronized, a new event may be processed by some routers
immediately (as AAH), while those which have already left AAH state
will treat it as the first of a new batch of changes and attempt
controlled convergence. Thus both entry and exit from the AAH state
needs to be synchronised. A method of achieving this is described in
Appendix A.3.
A.3. AAH messages
Like the simple timer AAH method, this method uses a hold-down to
acquire a set of LSPs which should be processed together. On expiry
of the local hold-down timer, the router begins processing the batch
of LSPs according to the loop free prevention algorithm. This is the
same behaviour as the hold-down timer only method. However, if any
router, having started the loop-free convergence process receives an
LSP which would trigger a topology change, it locally abandons the
controlled convergence process, and sends an AAH message to all its
neighbors. This eventually triggers all routers to abandon the
controlled convergence. The routers remain in AAH state (i.e.
processing topology changes using normal "fast" convergence), until a
period of quiescence has elapsed. The exit from AAH state is
synchronized by using a two step process. To achieve the required
synchronization, two additional messages are required, AAH and AAH
ACK. The AAH message is reliably exchanged between neighbours using
the AAH ACK message. These could be implemented as a new message
within the routing protocol or carried in existing routing hello
messages. Two types of state machines are needed. A per-router AAH
state machine and a per neighbour AAH state machine(PNSM). These are
described below.
A.3.1. Per Router State Machine
Per Router State Table
+-------------+-----------+---------+--------+------------+----------+
| EVENT | Q | Hold | CC | AAH | AAH-hold |
+=============+===========+=========+========+============+==========+
| RX LSP | Start | - | TX-AAH | Re-start | TX-AAH |
| triggering | hold-down | | Start | AAH timer. | Start |
| change | timer | | AAH | [AAH] | AAH |
| | [Hold] | | timer. | | timer. |
| | | | [AAH] | | [AAH] |
+-------------+-----------+---------+--------+------------+----------+
| RX AAH | TX-AAH | TX-AAH | TX-AAH | [AAH] | TX-AAH |
| (Neighbor's | Start AAH | Start | Start | | Start |
| PNSM | timer. | AAH | AAH | | AAH |
| processes | [AAH] | timer | timer. | | timer. |
| RX AAH.) | | [AAH] | [AAH] | | [AAH] |
+-------------+-----------+---------+--------+------------+----------+
| Timer | - | Trigger | - | Start | [Q] |
| expiry | | CC. | | AAH-hold | |
| | | [CC] | | timer. | |
| | | | | [AAH-hold] | |
+-------------+-----------+---------+--------+------------+----------+
| Controlled | - | - | [Q] | - | - |
| convergence | | | | | |
| completed | | | | | |
+-------------+-----------+---------+--------+------------+----------+
TX-AAH = Send "goto TX-AAH" to all other PNSMs.
Operation of the per-router state machine is as follows:
Operation of this state machine under normal topology change involves
only states: Quiescent (Q), Hold-down (Hold) and Controlled
Convergence (CC). The remaining states are associated with an AAH
event.
The resting state is Quiescent. When the router in the Quiescent
state receives an LSP indicating a topology change, which would
normally trigger an SPF, it starts the Hold-down timer and changes
state to Hold-down. It normally remains in this state, collecting
additional LSPs until the Hold-down timer expires. Note that all
routers MUST use a common value for the Hold-down timer. When the
Hold-down timer expires the router then enters Controlled Convergence
(CC) state and executes the CC mechanism to re-converge the topology.
When the CC process has completed on the router, the router re-enters
the Quiescent state.
If this router receives a topology changing LSP whilst it is in the
CC state, it enters AAH state, and sends a "goto TX-AAH" command to
all per neighbour state machines which causes each per-neighbour
state machine to signal this state change to its neighbour.
Alternatively, if this router receives an AAH message from any of its
neighbors whilst in any state except AAH, it starts the AAH timer and
enters the AAH state. The per neighbor state machine corresponding
to the neighbor from which the AAH was received executes the RX AAH
action (which causes it to send an AAH ACK), while the remainder are
sent the "goto TX-AAH" command. The result is that the AAH is
acknowledged to the neighbor from which it was received and
propagated to all other neighbors. On entering AAH state, all CC
timers are expired and normal convergence takes place.
Whilst in the AAH state, LSPs are processed in the traditional
manner. Each time an LSP is received, the AAH timer is restarted.
In an unstable network ALL routers will remain in this state for some
time and the network will behave in the traditional uncontrolled
convergence manner.
When the AAH timer expires, the router enters AAH-hold state and
starts the AAH hold timer. The purpose of the AAH-hold state is to
synchronize the transition of the network from AAH to Quiescent. The
additional state ensures that the network cannot contain a mixture of
routers in both AAH and Quiescent states. If, whilst in AAH-Hold
state the router receives a topology changing LSP, it re-enters AAH
state and commands all per neighbour state machines to "goto TX-AAH".
If, whilst in AAH-Hold state the router receives an AAH message from
one of its neighbours, it re-enters the AAH state and commands all
other per neighbour state machines to "goto TX-AAH". Note that the
per-neighbor state machine receiving the AAH message will
autonomously acknowledge receipt of the AAH message. Commanding the
per-neighbour state machine to "goto TX-AAH" is necessary, because
routers may be in a mixture of Quiescent, Hold-down and AAH-hold
state, and it is necessary to rendezvous the entire network back to
AAH state.
When the AAH Hold timer expires the router changes to state Quiescent
and is ready for loop free convergence.
A.3.2. Per Neighbor State Machine
Per Neighbor State Table
+----------------------------+--------------+------------------------+
| EVENT | Idle | TX-AAH |
+============================+==============+========================+
| RX AAH | Send ACK. | Send ACK. |
| | | Cancel timer. |
| | [IDLE] | [IDLE] |
+----------------------------+--------------+------------------------+
| RX ACK | ignore | Cancel timer. |
| | | [IDLE] |
+----------------------------+--------------+------------------------+
| RX "goto TX-AAH" from | Send AAH | ignore |
| Router State Machine | [TX-AAH] | |
+----------------------------+--------------+------------------------+
| Timer expires | impossible | Send AAH |
| | | Restart timer. |
| | | [TX-AAH] |
+----------------------------+--------------+------------------------+
There is one instance of the per-neighbour (PN) state machine for
each neighbour within the convergence control domain.
The normal state is IDLE.
On command ("goto TX-AAH") from the router state machine, the state
machine enters TX-AAH state, transmits an AAH message to its
neighbour and starts a timer.
On receipt of an AAH ACK in state TX-AAH the state machine cancels
the timer and enters IDLE state.
In states IDLE, any AAH ACK message received is ignored.
On expiry of the timer in state TX-AAH the state machine transmits an
AAH message to the neighbour and restarts the timer. (The timer
cannot expire in any other state.)
In any state, receipt of an AAH causes the state machine to transmit
an AAH ACK and enter the IDLE state.
Note that for correct operation the state machine MUST remain in
state TX-AAH, until an AAH ACK or an AAH is received, or the state
machine is deleted. Deletion of the per neighbor state machine
occurs when routing determines that the neighbour has gone away, or
when the interface goes away.
When routing detects a new neighbour it creates a new instance of the
per-neighbour state machine in state Idle. The consequent generation
of the router's own LSP will then cause the router state machine to
execute the LSP receipt actions, which will if necessary result in
the new per-neighbour state machine receiving a "goto TX-AAH" command
and transitioning to TX-AAH state.
Authors' Addresses Authors' Addresses
Pierre Francois Pierre Francois
Universite catholique de Louvain Universite catholique de Louvain
Place Ste Barbe, 2 Place Ste Barbe, 2
Louvain-la-Neuve 1348 Louvain-la-Neuve 1348
BE BE
URI: http://inl.info.ucl.ac.be/ URI: http://inl.info.ucl.ac.be/
 End of changes. 28 change blocks. 
51 lines changed or deleted 305 lines changed or added

This html diff was produced by rfcdiff 1.41. The latest version is available from http://tools.ietf.org/tools/rfcdiff/