draft-ietf-rtgwg-backoff-algo-07.txt   draft-ietf-rtgwg-backoff-algo-08.txt 
Network Working Group B. Decraene Network Working Group B. Decraene
Internet-Draft Orange Internet-Draft Orange
Intended status: Standards Track S. Litkowski Intended status: Standards Track S. Litkowski
Expires: June 13, 2018 Orange Business Service Expires: August 30, 2018 Orange Business Service
H. Gredler H. Gredler
RtBrick Inc RtBrick Inc
A. Lindem A. Lindem
Cisco Systems Cisco Systems
P. Francois P. Francois
C. Bowers C. Bowers
Juniper Networks, Inc. Juniper Networks, Inc.
December 10, 2017 February 26, 2018
SPF Back-off algorithm for link state IGPs SPF Back-off algorithm for link state IGPs
draft-ietf-rtgwg-backoff-algo-07 draft-ietf-rtgwg-backoff-algo-08
Abstract Abstract
This document defines a standard algorithm to back-off link-state IGP This document defines a standard algorithm to temporarily postpone or
Shortest Path First (SPF) computations. 'back-off' link-state IGP Shortest Path First (SPF) computations.
This reduces the computational load and churn on IGP nodes when
multiple temporally close network events trigger multiple SPF
computations.
Having one standard algorithm improves interoperability by reducing Having one standard algorithm improves interoperability by reducing
the probability and/or duration of transient forwarding loops during the probability and/or duration of transient forwarding loops during
the IGP convergence when the IGP reacts to multiple temporally close the IGP convergence when the IGP reacts to multiple temporally close
IGP events. IGP events.
Requirements Language Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
document are to be interpreted as described in [RFC2119]. "OPTIONAL" in this document are to be interpreted as described in
[BCP14] [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here.
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on June 13, 2018.
This Internet-Draft will expire on August 30, 2018.
Copyright Notice Copyright Notice
Copyright (c) 2017 IETF Trust and the persons identified as the Copyright (c) 2018 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of (https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3
2. High level goals . . . . . . . . . . . . . . . . . . . . . . 3 2. High level goals . . . . . . . . . . . . . . . . . . . . . . 3
3. Definitions and parameters . . . . . . . . . . . . . . . . . 4 3. Definitions and parameters . . . . . . . . . . . . . . . . . 4
4. Principles of SPF delay algorithm . . . . . . . . . . . . . . 5 4. Principles of SPF delay algorithm . . . . . . . . . . . . . . 5
5. Specification of the SPF delay state machine . . . . . . . . 5 5. Specification of the SPF delay state machine . . . . . . . . 6
5.1. States . . . . . . . . . . . . . . . . . . . . . . . . . 6 5.1. State Machine . . . . . . . . . . . . . . . . . . . . . . 6
5.2. Timers . . . . . . . . . . . . . . . . . . . . . . . . . 6 5.2. State . . . . . . . . . . . . . . . . . . . . . . . . . . 7
5.3. States Transitions . . . . . . . . . . . . . . . . . . . 6 5.3. Timers . . . . . . . . . . . . . . . . . . . . . . . . . 8
5.4. FSM Events . . . . . . . . . . . . . . . . . . . . . . . 7 5.4. FSM Events . . . . . . . . . . . . . . . . . . . . . . . 8
6. Parameters . . . . . . . . . . . . . . . . . . . . . . . . . 9 6. Parameters . . . . . . . . . . . . . . . . . . . . . . . . . 10
7. Partial Deployment . . . . . . . . . . . . . . . . . . . . . 10 7. Partial Deployment . . . . . . . . . . . . . . . . . . . . . 11
8. Impact on micro-loops . . . . . . . . . . . . . . . . . . . . 10 8. Impact on micro-loops . . . . . . . . . . . . . . . . . . . . 11
9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 11 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 11
10. Security considerations . . . . . . . . . . . . . . . . . . . 11 10. Security considerations . . . . . . . . . . . . . . . . . . . 11
11. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 11 11. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 12
12. References . . . . . . . . . . . . . . . . . . . . . . . . . 11 12. References . . . . . . . . . . . . . . . . . . . . . . . . . 12
12.1. Normative References . . . . . . . . . . . . . . . . . . 11 12.1. Normative References . . . . . . . . . . . . . . . . . . 12
12.2. Informative References . . . . . . . . . . . . . . . . . 11 12.2. Informative References . . . . . . . . . . . . . . . . . 12
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 12 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 13
1. Introduction 1. Introduction
Link state IGPs, such as IS-IS [ISO10589-Second-Edition] and OSPF Link state IGPs, such as IS-IS [ISO10589-Second-Edition], OSPF
[RFC2328], perform distributed route computation on all routers in [RFC2328] and OSPFv3 [RFC5340], perform distributed route computation
the area/level. In order to have consistent routing tables across on all routers in the area/level. In order to have consistent
the network, such distributed computation requires that all routers routing tables across the network, such distributed computation
have the same version of the network topology (Link State DataBase requires that all routers have the same version of the network
(LSDB)) and perform their computation at the same time. topology (Link State DataBase (LSDB)) and perform their computation
essentially at the same time.
In general, when the network is stable, there is a desire to compute In general, when the network is stable, there is a desire to trigger
a new Shortest Path First (SPF) as soon as a failure is detected in a new Shortest Path First (SPF) computation as soon as a failure is
order to quickly route around the failure. However, when the network detected in order to quickly route around the failure. However, when
is experiencing multiple temporally close failures over a short the network is experiencing multiple failures over a short period of
period of time, there is a conflicting desire to limit the frequency time, there is a conflicting desire to limit the frequency of SPF
of SPF computations. Indeed, this allows a reduction in control computations, which would allow a reduction in control plane
plane resources used by IGPs and all protocols/subsystems reacting on resources used by IGPs and all protocols/subsystems reacting on the
the attendant route change, such as LDP [RFC5036], RSVP-TE [RFC3209], attendant route change, such as LDP [RFC5036], RSVP-TE [RFC3209], BGP
BGP [RFC4271], Fast ReRoute computations (e.g. Loop Free Alternates [RFC4271], Fast ReRoute computations (e.g., Loop Free Alternates
(LFA) [RFC5286], FIB updates... This also reduces the churn on (LFA) [RFC5286]), FIB updates, etc. This also reduces network churn
routers and in the network and, in particular, reduces the side and, in particular, reduces the side effects such as micro-loops
effects such as micro-loops [RFC5715] that ensue during IGP [RFC5715] that ensue during IGP convergence.
convergence.
To allow for this, IGPs implement an SPF back-off algorithm. To allow for this, IGPs usually implement an SPF back-off algorithm
However, different implementations have choosen different algorithms. that postpones or backs-off the SPF computation. However, different
Hence, in a multi-vendor network, it's not possible to ensure that implementations have chosen different algorithms. Hence, in a multi-
all routers trigger their SPF computation after the same delay. This vendor network, it's not possible to ensure that all routers trigger
situation increases the average and maximum differential delay their SPF computation after the same delay. This situation increases
between routers completing their SPF computation. It also increases the average and maximum differential delay between routers completing
the probability that different routers compute their FIBs based on their SPF computation. It also increases the probability that
different LSDB versions. Both factors increase the probability and/ different routers compute their FIBs based on different LSDB
or duration of micro-loops as discussed in Section 8. versions. Both factors increase the probability and/or duration of
micro-loops as discussed in Section 8.
To allow multi-vendor networks to have all routers delay their SPF To allow multi-vendor networks to have all routers delay their SPF
computations for the same duration, this document specifies a computations for the same duration, this document specifies a
standard algorithm. Optionally, implementations may also offer standard algorithm. Optionally, implementations may also offer
alternative algorithms. alternative algorithms.
2. High level goals 2. High level goals
The high level goals of this algorithm are the following: The high level goals of this algorithm are the following:
o Very fast convergence for a single event (e.g., link failure). o Very fast convergence for a single event (e.g., link failure).
o Paced fast convergence for multiple temporally close IGP events o Paced fast convergence for multiple temporally close IGP events
while IGP stability is considered acceptable. while IGP stability is considered acceptable.
o Delayed convergence when IGP stability is problematic. This will o Delayed convergence when IGP stability is problematic. This will
allow the IGP and related processes to conserve resources during allow the IGP and related processes to conserve resources during
the period of instability. the period of instability.
o Always try to avoid different SPF_DELAY timers values across o Always try to avoid different SPF_DELAY Section 3 timer values
different routers in the area/level. Even though not all routers across different routers in the area/level. This requires
will receive IGP messages at the same time, due to differences specific consideration as different routers may receive IGP
messages at different interval or even order, due to differences
both in the distance from the originator of the IGP event and in both in the distance from the originator of the IGP event and in
flooding implementations. flooding implementations.
3. Definitions and parameters 3. Definitions and parameters
IGP events: The reception or origination of an IGP LSDB change IGP events: The reception or origination of an IGP LSDB change
requiring a new routing table computation. Examples are a topology requiring a new routing table computation. Examples are a topology
change, a prefix change, a metric change on a link or prefix... Note change, a prefix change and a metric change on a link or prefix.
that locally triggering a routing table computation is not considered Note that locally triggering a routing table computation is not
as an IGP event since other IGP routers are unaware of this considered as an IGP event since other IGP routers are unaware of
occurrence. this occurrence.
Routing table computation: Computation of the routing table, by the Routing table computation: Computation of the routing table, by the
IGP, using the IGP LSDB. No distinction is made between the type of IGP implementation, using the IGP LSDB. No distinction is made
computation performed. e.g., full SPF, incremental SPF, Partial Route between the type of computation performed. e.g., full SPF,
Computation (PRC). The type of computation is a local consideration. incremental SPF, Partial Route Computation (PRC). The type of
This document may interchangeably use the terms routing table computation is a local consideration. This document may
computation and SPF computation. interchangeably use the terms routing table computation and SPF
computation.
SPF_DELAY: The delay between the first IGP event triggering a new SPF_DELAY: The delay between the first IGP event triggering a new
routing table computation and the start of that routing table routing table computation and the start of that routing table
computation. It can take the following values: computation. It can take the following values:
INITIAL_SPF_DELAY: A very small delay to quickly handle link INITIAL_SPF_DELAY: A very small delay to quickly handle a single
failure, e.g., 0 milliseconds. isolated link failure, e.g., 0 milliseconds.
SHORT_SPF_DELAY: A small delay to have a fast convergence in case of SHORT_SPF_DELAY: A small delay to provide fast convergence in the
a single failure (node, SRLG..), e.g., 50-100 milliseconds. case of a single component failure (node, Shared Risk Link Group
(SRLG)..) that leads to multiple IGP events, e.g., 50-100
milliseconds.
LONG_SPF_DELAY: A long delay when the IGP is unstable, e.g., 2 LONG_SPF_DELAY: A long delay when the IGP is unstable, e.g., 2
seconds. Note that this allows the IGP network to stabilize. seconds. Note that this allows the IGP network to stabilize.
TIME_TO_LEARN_INTERVAL: This is the maximum duration typically needed TIME_TO_LEARN_INTERVAL: This is the maximum duration typically needed
to learn all the IGP events related to a single component failure to learn all the IGP events related to a single component failure
(e.g., router failure, SRLG failure), e.g., 1 second. It's mostly (e.g., router failure, SRLG failure), e.g., 1 second. It's mostly
dependent on failure detection time variation between all routers dependent on failure detection time variation between all routers
that are adjacent to the failure. Additionally, it may depend on the that are adjacent to the failure. Additionally, it may depend on the
different IGP implementations/parameters across the network, related different IGP implementations/parameters across the network, related
to origination and flooding of their link state advertisements. to origination and flooding of their link state advertisements.
HOLDDOWN_INTERVAL: The time required with no received IGP events HOLDDOWN_INTERVAL: The time required with no received IGP events
before considering the IGP to be stable again and allowing the before considering the IGP to be stable again and allowing the
SPF_DELAY to be restored to INITIAL_SPF_DELAY. e.g., 3 seconds. The SPF_DELAY to be restored to INITIAL_SPF_DELAY. e.g. a
HOLDDOWN_INTERVAL MUST be defaulted or configured to be longer than HOLDDOWN_INTERVAL of 3 seconds. The HOLDDOWN_INTERVAL MUST be
the TIME_TO_LEARN_INTERVAL. defaulted and configured to be longer than the
TIME_TO_LEARN_INTERVAL.
4. Principles of SPF delay algorithm 4. Principles of SPF delay algorithm
For this first IGP event, we assume that there has been a single For this first IGP event, we assume that there has been a single
simple change in the network which can be taken into account using a simple change in the network which can be taken into account using a
single routing computation (e.g., link failure, prefix (metric) single routing computation (e.g., link failure, prefix (metric)
change) and we optimize for very fast convergence, delaying the change) and we optimize for very fast convergence, delaying the
routing computation by INITIAL_SPF_DELAY. Under this assumption, routing computation by INITIAL_SPF_DELAY. Under this assumption,
there is no benefit in delaying the routing computation. In a there is no benefit in delaying the routing computation. In a
typical network, this is the most common type of IGP event. Hence, typical network, this is the most common type of IGP event. Hence,
it makes sense to optimize this case. it makes sense to optimize this case.
If subsequent IGP events are received in a short period of time If subsequent IGP events are received in a short period of time
(TIME_TO_LEARN_INTERVAL), we then assume that a single component (TIME_TO_LEARN_INTERVAL), we then assume that a single component
failed, but that this failure requires the knowledge of multiple IGP failed, but that this failure requires the knowledge of multiple IGP
events in order for IGP routing to converge. Under this assumption, events in order for IGP routing to converge. Under this assumption,
we want fast convergence since this is a normal network situation. we want fast convergence since this is a normal network situation.
However, there is a benefit in waiting for all IGP events related to However, there is a benefit in waiting for all IGP events related to
this single component failure so that the IGP can compute the post- this single component failure so that the IGP can compute the post-
failure routing table in a single route computation. In this failure routing table in a single additional route computation. In
situation, we delay the routing computation by SHORT_SPF_DELAY. this situation, we delay the routing computation by SHORT_SPF_DELAY.
If IGP events are still received after TIME_TO_LEARN_INTERVAL from If IGP events are still received after TIME_TO_LEARN_INTERVAL from
the initial IGP event received in QUIET state, then the network is the initial IGP event received in QUIET state Figure 1, then the
presumably experiencing multiple independent failures. In this case, network is presumably experiencing multiple independent failures. In
while waiting for network stability, the computations are delayed for this case, while waiting for network stability, the computations are
a longer time represented by LONG_SPF_DELAY. This SPF delay is kept delayed for a longer time represented by LONG_SPF_DELAY. This SPF
until no IGP events are received for HOLDDOWN_INTERVAL. delay is kept until no IGP events are received for HOLDDOWN_INTERVAL.
Note that previously implemented SPF delay algorithms counted the Note that in order to increase the consistency network wide, the
number of SPF computations. However, as all routers may receive the algorithm uses a delay (TIME_TO_LEARN_INTERVAL) from the initial IGP
IGP events at different times, we cannot assume that all routers will event, rather than the number of SPF computation performed. Indeed,
perform the same number of SPF computations or that they will as all routers may receive the IGP events at different times, we
schedule them at the same time. For example, assuming that the SPF cannot assume that all routers will perform the same number of SPF
delay is 50 ms, router R1 may receive 3 IGP events (E1, E2, E3) in computations. For example, assuming that the SPF delay is 50 ms,
those 50 ms and hence will perform a single routing computation. router R1 may receive 3 IGP events (E1, E2, E3) in those 50 ms and
While another router R2 may only receive 2 events (E1, E2) in those hence will perform a single routing computation. While another
50 ms and hence will schedule another routing computation when router R2 may only receive 2 events (E1, E2) in those 50 ms and hence
receiving E3. That's why this document uses a time will schedule another routing computation when receiving E3.
(TIME_TO_LEARN_INTERVAL) from the initial event detection/reception
as opposed to counting the number of SPF computations to determine
when the IGP is unstable.
5. Specification of the SPF delay state machine 5. Specification of the SPF delay state machine
5.1. States
This section describes the state machine. The naming and semantics This section describes the abstract finite state machine (FSM)
of each state corresponds directly to the SPF delay used for IGP intended to control the timing of the execution of SPF calculations
events received in that state. Three states are defined: in response to IGP events.
QUIET: This is the initial state, when no IGP events have occured for
at least HOLDDOWN_INTERVAL since the previous routing table
computation. The state is meant to handle link failures very
quickly.
SHORT_WAIT: State entered when an IGP event has been received in
QUIET state. This state is meant to handle single component failure
requiring multiple IGP events (e.g., node, SRLG).
LONG_WAIT: State reached after TIME_TO_LEARN_INTERVAL. In other
words, state reached after TIME_TO_LEARN_INTERVAL in state
SHORT_WAIT. This state is meant to handle multiple independent
component failures during periods of IGP instability.
5.2. Timers
SPF_TIMER: The Finite State Machine (FSM) abstract timer that uses
the computed SPF delay. Upon expiration, the Route Table Computation
(as defined in Section 3) is performed.
HOLDDOWN_TIMER: The Finite State Machine (FSM) abstract timer that is
(re)started whan an IGP event is received and set to
HOLDDOWN_INTERVAL. Upon expiration, the FSM is moved to the QUIET
state.
LEARN_TIMER: The Finite State Machine (FSM) abstract timer that is
started when an IGP event is recevied while the FSM is in the QUIET
state. Upon expiration, the FSM is moved to the LONG_WAIT state.
5.3. States Transitions 5.1. State Machine
The FSM is initialized to the QUIET state with all three timers The FSM is initialized to the QUIET state with all three timers
timers (SPF_TIMER, HOLDDOWN_TIMER, LEARN_TIMER) deactivated. timers (SPF_TIMER, HOLDDOWN_TIMER, LEARN_TIMER) deactivated.
The events which may change the FSM states are an IGP event or the The events which may change the FSM states are an IGP event or the
expiration of one timer (SPF_TIMER, HOLDDOWN_TIMER, LEARN_TIMER). expiration of one timer (SPF_TIMER, HOLDDOWN_TIMER, LEARN_TIMER).
The following diagram briefly describes the state transitions. The following diagram briefly describes the state transitions.
+-------------------+ +-------------------+
skipping to change at page 7, line 37 skipping to change at page 7, line 37
+-------------------+ | +-------------------+ |
+---->| | | +---->| | |
| | LONG_WAIT |------------>-------+ | | LONG_WAIT |------------>-------+
+-----| | +-----| |
4: +-------------------+ 5: HOLDDOWN_TIMER 4: +-------------------+ 5: HOLDDOWN_TIMER
IGP event expiration IGP event expiration
9: SPF_TIMER expiration 9: SPF_TIMER expiration
Figure 1: State Machine Figure 1: State Machine
5.2. State
The naming and semantics of each state corresponds directly to the
SPF delay used for IGP events received in that state. Three states
are defined:
QUIET: This is the initial state, when no IGP events have occurred
for at least HOLDDOWN_INTERVAL since the previous routing table
computation. The state is meant to handle link failures very
quickly.
SHORT_WAIT: State entered when an IGP event has been received in
QUIET state. This state is meant to handle single component failure
requiring multiple IGP events (e.g., node, SRLG).
LONG_WAIT: State reached after TIME_TO_LEARN_INTERVAL. In other
words, state reached after TIME_TO_LEARN_INTERVAL in state
SHORT_WAIT. This state is meant to handle multiple independent
component failures during periods of IGP instability.
5.3. Timers
SPF_TIMER: The FSM abstract timer that uses the computed SPF delay.
Upon expiration, the Route Table Computation (as defined in
Section 3) is performed.
HOLDDOWN_TIMER: The FSM abstract timer that is (re)started whan an
IGP event is received and set to HOLDDOWN_INTERVAL. Upon expiration,
the FSM is moved to the QUIET state.
LEARN_TIMER: The FSM abstract timer that is started when an IGP event
is recevied while the FSM is in the QUIET state. Upon expiration,
the FSM is moved to the LONG_WAIT state.
5.4. FSM Events 5.4. FSM Events
This section describes the events and the actions performed in This section describes the events and the actions performed in
response. response.
Transition 1: IGP event, while in QUIET_STATE. Transition 1: IGP event, while in QUIET state.
Actions on event 1: Actions on event 1:
o If SPF_TIMER is not already running, start it with value o If SPF_TIMER is not already running, start it with value
INITIAL_SPF_DELAY. INITIAL_SPF_DELAY.
o Start LEARN_TIMER with TIME_TO_LEARN_INTERVAL. o Start LEARN_TIMER with TIME_TO_LEARN_INTERVAL.
o Start HOLDDOWN_TIMER with HOLDDOWN_INTERVAL. o Start HOLDDOWN_TIMER with HOLDDOWN_INTERVAL.
skipping to change at page 9, line 31 skipping to change at page 10, line 19
Transition 9: SPF_TIMER expiration, while in LONG_WAIT. Transition 9: SPF_TIMER expiration, while in LONG_WAIT.
Actions on event 9: Actions on event 9:
o Compute SPF. o Compute SPF.
o Remain in current state. o Remain in current state.
6. Parameters 6. Parameters
All the parameters MUST be configurable [I-D.ietf-isis-yang-isis-cfg] All the parameters MUST be configurable at the protocol instance
[I-D.ietf-ospf-yang] at the protocol instance granularity. They MAY granularity. They MAY be configurable at the area/level granularity.
be configurable at the area/level granularity. All the delays All the delays (INITIAL_SPF_DELAY, SHORT_SPF_DELAY, LONG_SPF_DELAY,
(INITIAL_SPF_DELAY, SHORT_SPF_DELAY, LONG_SPF_DELAY,
TIME_TO_LEARN_INTERVAL, HOLDDOWN_INTERVAL) SHOULD be configurable at TIME_TO_LEARN_INTERVAL, HOLDDOWN_INTERVAL) SHOULD be configurable at
the millisecond granularity. They MUST be configurable at least at the millisecond granularity. They MUST be configurable at least at
the tenth of second granularity. The configurable range for all the the tenth of second granularity. The configurable range for all the
parameters SHOULD at least be from 0 milliseconds to 60 seconds. parameters SHOULD at least be from 0 milliseconds to 60 seconds.
This document does not propose default values for the parameters This document does not propose default values for the parameters
because these values are expected to be context dependent. because these values are expected to be context dependent.
Implementations are free to propose their own default values. Implementations are free to propose their own default values.
However the HOLDDOWN_INTERVAL MUST be defaulted or configured to be
longer than the TIME_TO_LEARN_INTERVAL.
In order to satisfy the goals stated in Section 2, operators are In order to satisfy the goals stated in Section 2, operators are
RECOMMENDED to configure delay intervals such that SPF_INITIAL_DELAY RECOMMENDED to configure delay intervals such that INITIAL_SPF_DELAY
<= SPF_SHORT_DELAY and SPF_SHORT_DELAY <= SPF_LONG_DELAY. <= SHORT_SPF_DELAY and SHORT_SPF_DELAY <= LONG_SPF_DELAY.
When setting (default) values, one SHOULD consider the customers and When setting (default) values, one should consider the customers and
their application requirements, the computational power of the their application requirements, the computational power of the
routers, the size of the network, and, in particular, the number of routers, the size of the network, and, in particular, the number of
IP prefixes advertised in the IGP, the frequency and number of IGP IP prefixes advertised in the IGP, the frequency and number of IGP
events, the number of protocols reactions/computations triggered by events, the number of protocols reactions/computations triggered by
IGP SPF (e.g., BGP, PCEP, Traffic Engineering CSPF, Fast ReRoute IGP SPF computation (e.g., BGP, PCEP, Traffic Engineering CSPF, Fast
computations). ReRoute computations).
Note that some or all of these factors may change over the life of Note that some or all of these factors may change over the life of
the network. In case of doubt, it's RECOMMENDED to play it safe and the network. In case of doubt, it's RECOMMENDED that timer intervals
start with safe, i.e., longer timers. should be chosen conservatively (i.e., longer timer values).
For the standard algorithm to be effective in mitigating micro-loops, For the standard algorithm to be effective in mitigating micro-loops,
it is RECOMMENDED that all routers in the IGP domain, or at least all it is RECOMMENDED that all routers in the IGP domain, or at least all
the routers in the same area/level, have exactly the same configured the routers in the same area/level, have exactly the same configured
values. values.
7. Partial Deployment 7. Partial Deployment
In general, the SPF delay algorithm is only effective in mitigating In general, the SPF delay algorithm is only effective in mitigating
micro-loops if it is deployed, with the same parameters, on all micro-loops if it is deployed, with the same parameters, on all
routers, in the IGP domain or, at least, all routers in an IGP area/ routers in the IGP domain or, at least, all routers in an IGP area/
level. The impact of partial deployment is based on the particular level. The impact of partial deployment is dependent on the
event, topology, and the SPF algorithm(s) used on other routers in particular event, topology, and the SPF algorithm(s) used on other
the IGP area/level. In cases where the previous SPF algorithm was routers in the IGP area/level. In cases where the previous SPF
implemented uniformly, partial deployment will increase the frequency algorithm was implemented uniformly, partial deployment will increase
and duration of micro-loops. Hence, it is RECOMMENDED that all the frequency and duration of micro-loops. Hence, it is RECOMMENDED
routers in the IGP domain or at least within the same area/level be that all routers in the IGP domain or at least within the same area/
migrated to the SPF algorithm described herein at roughly the same level be migrated to the SPF algorithm described herein at roughly
time. the same time.
Note that this is not a new consideration as over times, network Note that this is not a new consideration as over times, network
operators have changed SPF delay parameters in order to accommodate operators have changed SPF delay parameters in order to accommodate
new customer requirements for fast convergence, as permitted by new new customer requirements for fast convergence, as permitted by new
software and hardware. They may also have progressively replaced an software and hardware. They may also have progressively replaced an
implementation with a given SPF delay algorithm by another implementation with a given SPF delay algorithm by another
implementation with a different one. implementation with a different one.
8. Impact on micro-loops 8. Impact on micro-loops
Micro-loops during IGP convergence are due to a non-synchronized or Micro-loops during IGP convergence are due to a non-synchronized or
non-ordered update of the forwarding information tables (FIB) non-ordered update of the forwarding information tables (FIB)
[RFC5715] [RFC6976] [I-D.ietf-rtgwg-spf-uloop-pb-statement]. FIBs [RFC5715] [RFC6976] [I-D.ietf-rtgwg-spf-uloop-pb-statement]. FIBs
are installed after multiple steps such as flooding of the IGP event are installed after multiple steps such as flooding of the IGP event
across the network, SPF wait time, SPF computation, FIB distribution across the network, SPF wait time, SPF computation, FIB distribution
across line cards, and FIB update. This document only addresses the across line cards, and FIB update. This document only addresses the
first contribution. This standardized procedure reduces the contribution from the SPF wait time. This standardized procedure
probability and/or duration of micro-loops when IGPs experience reduces the probability and/or duration of micro-loops when IGPs
multiple temporally close events. It does not prevent all micro- experience multiple temporally close events. It does not prevent all
loops. However, it is beneficial and is less complex and costly to micro-loops. However, it is beneficial and is less complex and
implement when compared to full solutions such as [RFC5715] or costly to implement when compared to full solutions such as [RFC5715]
[RFC6976]. or [RFC6976].
9. IANA Considerations 9. IANA Considerations
No IANA actions required. No IANA actions required.
10. Security considerations 10. Security considerations
The algorithm presented in this document does not compromise IGP The algorithm presented in this document does not compromise IGP
security. An attacker having the ability to generate IGP events security. An attacker having the ability to generate IGP events
would be able to delay the IGP convergence time. The LONG_SPF_DELAY would be able to delay the IGP convergence time. The LONG_SPF_DELAY
skipping to change at page 11, line 34 skipping to change at page 12, line 22
12. References 12. References
12.1. Normative References 12.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997, DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>. <https://www.rfc-editor.org/info/rfc2119>.
12.2. Informative References [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
[I-D.ietf-isis-yang-isis-cfg] May 2017, <https://www.rfc-editor.org/info/rfc8174>.
Litkowski, S., Yeung, D., Lindem, A., Zhang, Z., and L.
Lhotka, "YANG Data Model for IS-IS protocol", draft-ietf-
isis-yang-isis-cfg-19 (work in progress), November 2017.
[I-D.ietf-ospf-yang] 12.2. Informative References
Yeung, D., Qu, Y., Zhang, Z., Chen, I., and A. Lindem,
"Yang Data Model for OSPF Protocol", draft-ietf-ospf-
yang-09 (work in progress), October 2017.
[I-D.ietf-rtgwg-spf-uloop-pb-statement] [I-D.ietf-rtgwg-spf-uloop-pb-statement]
Litkowski, S., Decraene, B., and M. Horneffer, "Link State Litkowski, S., Decraene, B., and M. Horneffer, "Link State
protocols SPF trigger and delay algorithm impact on IGP protocols SPF trigger and delay algorithm impact on IGP
micro-loops", draft-ietf-rtgwg-spf-uloop-pb-statement-05 micro-loops", draft-ietf-rtgwg-spf-uloop-pb-statement-06
(work in progress), December 2017. (work in progress), January 2018.
[ISO10589-Second-Edition] [ISO10589-Second-Edition]
International Organization for Standardization, International Organization for Standardization,
"Intermediate system to Intermediate system intra-domain "Intermediate system to Intermediate system intra-domain
routeing information exchange protocol for use in routeing information exchange protocol for use in
conjunction with the protocol for providing the conjunction with the protocol for providing the
connectionless-mode Network Service (ISO 8473)", ISO/ connectionless-mode Network Service (ISO 8473)", ISO/
IEC 10589:2002, Second Edition, Nov 2002. IEC 10589:2002, Second Edition, Nov 2002.
[RFC2328] Moy, J., "OSPF Version 2", STD 54, RFC 2328, [RFC2328] Moy, J., "OSPF Version 2", STD 54, RFC 2328,
skipping to change at page 12, line 36 skipping to change at page 13, line 19
[RFC5036] Andersson, L., Ed., Minei, I., Ed., and B. Thomas, Ed., [RFC5036] Andersson, L., Ed., Minei, I., Ed., and B. Thomas, Ed.,
"LDP Specification", RFC 5036, DOI 10.17487/RFC5036, "LDP Specification", RFC 5036, DOI 10.17487/RFC5036,
October 2007, <https://www.rfc-editor.org/info/rfc5036>. October 2007, <https://www.rfc-editor.org/info/rfc5036>.
[RFC5286] Atlas, A., Ed. and A. Zinin, Ed., "Basic Specification for [RFC5286] Atlas, A., Ed. and A. Zinin, Ed., "Basic Specification for
IP Fast Reroute: Loop-Free Alternates", RFC 5286, IP Fast Reroute: Loop-Free Alternates", RFC 5286,
DOI 10.17487/RFC5286, September 2008, DOI 10.17487/RFC5286, September 2008,
<https://www.rfc-editor.org/info/rfc5286>. <https://www.rfc-editor.org/info/rfc5286>.
[RFC5340] Coltun, R., Ferguson, D., Moy, J., and A. Lindem, "OSPF
for IPv6", RFC 5340, DOI 10.17487/RFC5340, July 2008,
<https://www.rfc-editor.org/info/rfc5340>.
[RFC5715] Shand, M. and S. Bryant, "A Framework for Loop-Free [RFC5715] Shand, M. and S. Bryant, "A Framework for Loop-Free
Convergence", RFC 5715, DOI 10.17487/RFC5715, January Convergence", RFC 5715, DOI 10.17487/RFC5715, January
2010, <https://www.rfc-editor.org/info/rfc5715>. 2010, <https://www.rfc-editor.org/info/rfc5715>.
[RFC6976] Shand, M., Bryant, S., Previdi, S., Filsfils, C., [RFC6976] Shand, M., Bryant, S., Previdi, S., Filsfils, C.,
Francois, P., and O. Bonaventure, "Framework for Loop-Free Francois, P., and O. Bonaventure, "Framework for Loop-Free
Convergence Using the Ordered Forwarding Information Base Convergence Using the Ordered Forwarding Information Base
(oFIB) Approach", RFC 6976, DOI 10.17487/RFC6976, July (oFIB) Approach", RFC 6976, DOI 10.17487/RFC6976, July
2013, <https://www.rfc-editor.org/info/rfc6976>. 2013, <https://www.rfc-editor.org/info/rfc6976>.
 End of changes. 39 change blocks. 
163 lines changed or deleted 175 lines changed or added

This html diff was produced by rfcdiff 1.46. The latest version is available from http://tools.ietf.org/tools/rfcdiff/