draft-ietf-rift-rift-01.txt   draft-ietf-rift-rift-02.txt 
RIFT Working Group T. Przygienda, Ed. RIFT Working Group T. Przygienda, Ed.
Internet-Draft Juniper Networks Internet-Draft Juniper Networks
Intended status: Standards Track A. Sharma Intended status: Standards Track A. Sharma
Expires: October 28, 2018 Comcast Expires: December 23, 2018 Comcast
P. Thubert P. Thubert
Cisco Cisco
A. Atlas A. Atlas
Individual
J. Drake J. Drake
Juniper Networks Juniper Networks
Apr 26, 2018 Jun 21, 2018
RIFT: Routing in Fat Trees RIFT: Routing in Fat Trees
draft-ietf-rift-rift-01 draft-ietf-rift-rift-02
Abstract Abstract
This document outlines a specialized, dynamic routing protocol for This document outlines a specialized, dynamic routing protocol for
Clos and fat-tree network topologies. The protocol (1) deals with Clos and fat-tree network topologies. The protocol (1) deals with
automatic construction of fat-tree topologies based on detection of automatic construction of fat-tree topologies based on detection of
links, (2) minimizes the amount of routing state held at each level, links, (2) minimizes the amount of routing state held at each level,
(3) automatically prunes the topology distribution exchanges to a (3) automatically prunes the topology distribution exchanges to a
sufficient subset of links, (4) supports automatic disaggregation of sufficient subset of links, (4) supports automatic disaggregation of
prefixes on link and node failures to prevent black-holing and prefixes on link and node failures to prevent black-holing and
skipping to change at page 1, line 48 skipping to change at page 1, line 49
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on October 28, 2018. This Internet-Draft will expire on December 23, 2018.
Copyright Notice Copyright Notice
Copyright (c) 2018 IETF Trust and the persons identified as the Copyright (c) 2018 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of (https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 3, line 51 skipping to change at page 3, line 51
5.1. Normal Operation . . . . . . . . . . . . . . . . . . . . 54 5.1. Normal Operation . . . . . . . . . . . . . . . . . . . . 54
5.2. Leaf Link Failure . . . . . . . . . . . . . . . . . . . . 55 5.2. Leaf Link Failure . . . . . . . . . . . . . . . . . . . . 55
5.3. Partitioned Fabric . . . . . . . . . . . . . . . . . . . 56 5.3. Partitioned Fabric . . . . . . . . . . . . . . . . . . . 56
5.4. Northbound Partitioned Router and Optional East-West 5.4. Northbound Partitioned Router and Optional East-West
Links . . . . . . . . . . . . . . . . . . . . . . . . . . 58 Links . . . . . . . . . . . . . . . . . . . . . . . . . . 58
6. Implementation and Operation: Further Details . . . . . . . . 59 6. Implementation and Operation: Further Details . . . . . . . . 59
6.1. Considerations for Leaf-Only Implementation . . . . . . . 60 6.1. Considerations for Leaf-Only Implementation . . . . . . . 60
6.2. Adaptations to Other Proposed Data Center Topologies . . 60 6.2. Adaptations to Other Proposed Data Center Topologies . . 60
6.3. Originating Non-Default Route Southbound . . . . . . . . 61 6.3. Originating Non-Default Route Southbound . . . . . . . . 61
7. Security Considerations . . . . . . . . . . . . . . . . . . . 61 7. Security Considerations . . . . . . . . . . . . . . . . . . . 61
8. Information Elements Schema . . . . . . . . . . . . . . . . . 62 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 62
8.1. common.thrift . . . . . . . . . . . . . . . . . . . . . . 62 9. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 62
8.2. encoding.thrift . . . . . . . . . . . . . . . . . . . . . 67 10. References . . . . . . . . . . . . . . . . . . . . . . . . . 62
9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 72 10.1. Normative References . . . . . . . . . . . . . . . . . . 62
10. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 73 10.2. Informative References . . . . . . . . . . . . . . . . . 65
11. References . . . . . . . . . . . . . . . . . . . . . . . . . 73 Appendix A. Information Elements Schema . . . . . . . . . . . . 66
11.1. Normative References . . . . . . . . . . . . . . . . . . 73 A.1. common.thrift . . . . . . . . . . . . . . . . . . . . . . 67
11.2. Informative References . . . . . . . . . . . . . . . . . 75 A.2. encoding.thrift . . . . . . . . . . . . . . . . . . . . . 72
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 77 Appendix B. Finite State Machines . . . . . . . . . . . . . . . 77
B.1. LIE . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
B.2. ZTP . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
Appendix C. Constants . . . . . . . . . . . . . . . . . . . . . 87
C.1. Configurable Protocol Constants . . . . . . . . . . . . . 87
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 87
1. Introduction 1. Introduction
ANISOTROPIC
Clos [CLOS] and Fat-Tree [FATTREE] have gained prominence in today's Clos [CLOS] and Fat-Tree [FATTREE] have gained prominence in today's
networking, primarily as a result of the paradigm shift towards a networking, primarily as a result of the paradigm shift towards a
centralized data-center based architecture that is poised to deliver centralized data-center based architecture that is poised to deliver
a majority of computation and storage services in the future. a majority of computation and storage services in the future.
Today's routing protocols were geared towards a network with an Today's routing protocols were geared towards a network with an
irregular topology and low degree of connectivity originally but irregular topology and low degree of connectivity originally but
given they were the only available mechanisms, consequently several given they were the only available mechanisms, consequently several
attempts to apply those to Clos have been made. Most successfully attempts to apply those to Clos have been made. Most successfully
BGP [RFC4271] [RFC7938] has been extended to this purpose, not as BGP [RFC4271] [RFC7938] has been extended to this purpose, not as
much due to its inherent suitability to solve the problem but rather much due to its inherent suitability to solve the problem but rather
skipping to change at page 6, line 9 skipping to change at page 6, line 15
graphs (DAG). graphs (DAG).
Level: Clos and Fat Tree networks are trees and 'level' denotes the Level: Clos and Fat Tree networks are trees and 'level' denotes the
set of nodes at the same height in such a network, where the set of nodes at the same height in such a network, where the
bottom level is level 0. A node has links to nodes one level down bottom level is level 0. A node has links to nodes one level down
and/or one level up. Under some circumstances, a node may have and/or one level up. Under some circumstances, a node may have
links to nodes at the same level. As a footnote: Clos terminology links to nodes at the same level. As a footnote: Clos terminology
often uses the concept of "stage" but due to the folded nature of often uses the concept of "stage" but due to the folded nature of
the Fat Tree we do not use it to prevent misunderstandings. the Fat Tree we do not use it to prevent misunderstandings.
Spine/Aggregation/Edge Levels: Traditional names for Level 2, 1 and Superspine/Aggregation or Spine/Edge Levels: Traditional names for
0 respectively. Level 0 is often called leaf as well. Level 2, 1 and 0 respectively. Level 0 is often called leaf as
well.
Point of Delivery (PoD): A self-contained vertical slice of a Clos Point of Delivery (PoD): A self-contained vertical slice of a Clos
or Fat Tree network containing normally only level 0 and level 1 or Fat Tree network containing normally only level 0 and level 1
nodes. It communicates with nodes in other PoDs via the spine. nodes. It communicates with nodes in other PoDs via the spine.
We number PoDs to distinguish them and use PoD #0 to denote We number PoDs to distinguish them and use PoD #0 to denote
"undefined" PoD. "undefined" PoD.
Spine: The set of nodes that provide inter-PoD communication. These Superspine: The set of nodes that provide inter-PoD communication
nodes are also organized into levels (typically one, three, or and have no northbound adjacencies. Superspine nodes do not
five levels). Spine nodes do not belong to any PoD and are belong to any PoD and are assigned "undefined" PoD value to
assigned "undefined" PoD value to indicate the equivalent of "any" indicate the equivalent of "any" PoD.
PoD.
Leaf: A node without southbound adjacencies. Its level is 0 (except Leaf: A node without southbound adjacencies. Its level is 0 (except
cases where it is deriving its level via ZTP and is running cases where it is deriving its level via ZTP and is running
without LEAF_ONLY which will be explained in Section 4.2.9). without LEAF_ONLY which will be explained in Section 4.2.9).
Connected Spine: In case a spine level represents a connected graph Connected Spine: In case a spine level represents a connected graph
(discounting links terminating at different levels), we call it a (discounting links terminating at different levels), we call it a
"connected spine", in case a spine level consists of multiple "connected spine", in case a spine level consists of multiple
partitions, we call it a "disconnected" or "partitioned spine". partitions, we call it a "disconnected" or "partitioned spine".
In other terms, a spine without East-West links is disconnected In other terms, a spine without East-West links is disconnected
skipping to change at page 13, line 41 skipping to change at page 13, line 41
procedures aimed to fulfill the described requirements. procedures aimed to fulfill the described requirements.
4.2. Specification 4.2. Specification
4.2.1. Transport 4.2.1. Transport
All protocol elements are carried over UDP. Once QUIC [QUIC] All protocol elements are carried over UDP. Once QUIC [QUIC]
achieves the desired stability in deployments it may prove a valuable achieves the desired stability in deployments it may prove a valuable
candidate for TIE transport. candidate for TIE transport.
All packet formats are defined in Thrift models in Section 8. All packet formats are defined in Thrift models in Appendix A.
Future versions may include a [PROTOBUF] schema. Future versions may include a [PROTOBUF] schema.
4.2.2. Link (Neighbor) Discovery (LIE Exchange) 4.2.2. Link (Neighbor) Discovery (LIE Exchange)
LIE exchange happens over well-known administratively locally scoped LIE exchange happens over well-known administratively locally scoped
IPv4 multicast address [RFC2365] or link-local multicast scope and configured or otherwise well-known IPv4 multicast address
[RFC4291] for IPv6 [RFC8200] and SHOULD be sent with a TTL of 1 to [RFC2365] or link-local multicast scope [RFC4291] for IPv6 [RFC8200]
using a configured or otherwise a well-known destination UDP port
defined in Appendix C.1. LIEs SHOULD be sent with a TTL of 1 to
prevent RIFT information reaching beyond a single L3 next-hop in the prevent RIFT information reaching beyond a single L3 next-hop in the
topology. LIEs are exchanged over all links running RIFT. topology. LIEs SHOULD be sent with network control precedence.
Originating port of the LIE has no further significance. LIEs are
exchanged over all links running RIFT. An implementation MAY listen
and send LIEs on IPv4 and/or IPv6 multicast addresses. LIEs on the same
link are considered part of the same negotiation independent of the
address family they arrive on. Observe further that the LIE source
address may not identify the peer uniquely in unnumbered or link-
local address cases so the transmission should occur over the same
interface the LIEs have been received on. A node can use any of the
neighbor's LIE source addresses to send TIEs.
Unless Section 4.2.9 is used, each node is provisioned with the level Unless Section 4.2.9 is used, each node is provisioned with the level
at which it is operating and its PoD (or otherwise a default level at which it is operating and its PoD (or otherwise a default level
and "undefined" PoD are assumed; meaning that leafs do not need to be and "undefined" PoD are assumed; meaning that leafs do not need to be
configured at all if initial configuration values are all left at 0). configured at all if initial configuration values are all left at 0).
Nodes in the spine are configured with "any" PoD which has the same Nodes in the spine are configured with "any" PoD which has the same
value as "undefined" PoD, hence we will talk about "undefined/any" PoD. value as "undefined" PoD, hence we will talk about "undefined/any" PoD.
This information is propagated in the LIEs exchanged. This information is propagated in the LIEs exchanged.
A node tries to form a three way adjacency if and only if A node tries to form a three way adjacency if and only if
skipping to change at page 15, line 51 skipping to change at page 16, line 15
TIEs contain sequence numbers, lifetimes and a type. Each type has a TIEs contain sequence numbers, lifetimes and a type. Each type has a
large identifying number space and information is spread across large identifying number space and information is spread across
possibly many TIEs of a certain type by the means of a hash function possibly many TIEs of a certain type by the means of a hash function
that a node or deployment can individually determine. One extreme that a node or deployment can individually determine. One extreme
point of the design space is a prefix per TIE which leads to BGP-like point of the design space is a prefix per TIE which leads to BGP-like
behavior vs. dense packing into few TIEs leading to more traditional behavior vs. dense packing into few TIEs leading to more traditional
IGP trade-off with fewer TIEs. An implementation may even rehash at IGP trade-off with fewer TIEs. An implementation may even rehash at
the cost of significant amount of re-advertisements of TIEs. the cost of significant amount of re-advertisements of TIEs.
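The packing trade-off described above can be sketched as follows. This is an illustration only, not the draft's method: the text leaves the hash function to the node or deployment, so the use of CRC32 and the names `tie_number_for`, `pack_prefixes` and `tie_count` are assumptions made for this example.

```python
import zlib
from collections import defaultdict


def tie_number_for(prefix: str, tie_count: int) -> int:
    """Map a prefix to one of `tie_count` TIE numbers.

    CRC32 is an assumed, hypothetical choice; any stable hash would do.
    """
    if tie_count < 1:
        raise ValueError("tie_count must be at least 1")
    return zlib.crc32(prefix.encode("ascii")) % tie_count


def pack_prefixes(prefixes, tie_count):
    """Group prefixes by the TIE number they hash to."""
    ties = defaultdict(list)
    for prefix in prefixes:
        ties[tie_number_for(prefix, tie_count)].append(prefix)
    return dict(ties)
```

With `tie_count` set to 1 all prefixes land in a single TIE (dense, IGP-like packing); a large `tie_count` moves towards the prefix-per-TIE extreme. Changing `tie_count` moves prefixes between TIEs, which corresponds to the re-advertisement cost of rehashing mentioned above.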
More information about the TIE structure can be found in the schema More information about the TIE structure can be found in the schema
in Section 8. in Appendix A.
4.2.3.2. South- and Northbound Representation 4.2.3.2. South- and Northbound Representation
As a central concept to RIFT, each node represents itself differently As a central concept to RIFT, each node represents itself differently
depending on the direction in which it is advertising information. depending on the direction in which it is advertising information.
More precisely, a spine node represents two different databases to More precisely, a spine node represents two different databases to
its neighbors depending on whether it advertises TIEs to the north or to its neighbors depending on whether it advertises TIEs to the north or to
the south/sideways. We call those differing TIE databases either the south/sideways. We call those differing TIE databases either
south- or northbound (S-TIEs and N-TIEs) depending on the direction south- or northbound (S-TIEs and N-TIEs) depending on the direction
of distribution. of distribution.
The N-TIEs hold all of the node's adjacencies, local prefixes and The N-TIEs hold all of the node's adjacencies, local prefixes and
northbound policy-guided prefixes while the S-TIEs hold only all of northbound policy-guided prefixes while the S-TIEs hold only all of
the node's adjacencies and the default prefix with necessary the node's adjacencies, the default prefix with necessary
disaggregated prefixes and southbound policy-guided prefixes. We disaggregated prefixes, local prefixes and southbound policy-guided
will explain this in detail further in Section 4.2.8 and prefixes. We will explain this in detail further in Section 4.2.8
Section 4.2.4. and Section 4.2.4.
The TIE types are symmetric in both directions and Table 1 provides a The TIE types are symmetric in both directions and Table 1 provides a
quick reference to the different TIE types including direction and quick reference to the different TIE types including direction and
their function. their function.
+----------+--------------------------------------------------------+ +----------+--------------------------------------------------------+
| TIE-Type | Content | | TIE-Type | Content |
+----------+--------------------------------------------------------+ +----------+--------------------------------------------------------+
| node | node properties, adjacencies and information helping | | node | node properties, adjacencies and information helping |
| N-TIE | in complex disaggregation scenarios | | N-TIE | in complex disaggregation scenarios |
skipping to change at page 17, line 45 skipping to change at page 17, line 45
As an example illustrating databases holding both representations, As an example illustrating databases holding both representations,
consider the topology in Figure 2 with the optional link between node consider the topology in Figure 2 with the optional link between node
111 and node 112 (so that the flooding on an East-West link can be 111 and node 112 (so that the flooding on an East-West link can be
shown). This example assumes unnumbered interfaces. First, here are shown). This example assumes unnumbered interfaces. First, here are
the TIEs generated by some nodes. For simplicity, the key value the TIEs generated by some nodes. For simplicity, the key value
elements and the PGP elements which may be included in their S-TIEs elements and the PGP elements which may be included in their S-TIEs
or N-TIEs are not shown. or N-TIEs are not shown.
Spine21 S-TIEs: Spine21 S-TIEs:
Node S-TIE: Node S-TIE:
NodeElement(layer=2, neighbors((Node111, layer 1, cost 1), NodeElement(level=2, neighbors((Node111, level 1, cost 1),
(Node112, layer 1, cost 1), (Node121, layer 1, cost 1), (Node112, level 1, cost 1), (Node121, level 1, cost 1),
(Node122, layer 1, cost 1))) (Node122, level 1, cost 1)))
Prefix S-TIE: Prefix S-TIE:
SouthPrefixesElement(prefixes(0/0, cost 1), (::/0, cost 1)) SouthPrefixesElement(prefixes(0/0, cost 1), (::/0, cost 1))
Node111 S-TIEs: Node111 S-TIEs:
Node S-TIE: Node S-TIE:
NodeElement(layer=1, neighbors((Spine21, layer 2, cost 1, links(...)), NodeElement(level=1, neighbors((Spine21, level 2, cost 1, links(...)),
(Spine22, layer 2, cost 1, links(...)), (Spine22, level 2, cost 1, links(...)),
(Node112, layer 1, cost 1, links(...)), (Node112, level 1, cost 1, links(...)),
(Leaf111, layer 0, cost 1, links(...)), (Leaf111, level 0, cost 1, links(...)),
(Leaf112, layer 0, cost 1, links(...)))) (Leaf112, level 0, cost 1, links(...))))
Prefix S-TIE: Prefix S-TIE:
SouthPrefixesElement(prefixes(0/0, cost 1), (::/0, cost 1)) SouthPrefixesElement(prefixes(0/0, cost 1), (::/0, cost 1))
Node111 N-TIEs: Node111 N-TIEs:
Node N-TIE: Node N-TIE:
NodeElement(layer=1, NodeElement(level=1,
neighbors((Spine21, layer 2, cost 1, links(...)), neighbors((Spine21, level 2, cost 1, links(...)),
(Spine22, layer 2, cost 1, links(...)), (Spine22, level 2, cost 1, links(...)),
(Node112, layer 1, cost 1, links(...)), (Node112, level 1, cost 1, links(...)),
(Leaf111, layer 0, cost 1, links(...)), (Leaf111, level 0, cost 1, links(...)),
(Leaf112, layer 0, cost 1, links(...)))) (Leaf112, level 0, cost 1, links(...))))
Prefix N-TIE: Prefix N-TIE:
NorthPrefixesElement(prefixes(Node111.loopback)) NorthPrefixesElement(prefixes(Node111.loopback))
Node121 S-TIEs: Node121 S-TIEs:
Node S-TIE: Node S-TIE:
NodeElement(layer=1, neighbors((Spine21,layer 2,cost 1), NodeElement(level=1, neighbors((Spine21,level 2,cost 1),
(Spine22, layer 2, cost 1), (Leaf121, layer 0, cost 1), (Spine22, level 2, cost 1), (Leaf121, level 0, cost 1),
(Leaf122, layer 0, cost 1))) (Leaf122, level 0, cost 1)))
Prefix S-TIE: Prefix S-TIE:
SouthPrefixesElement(prefixes(0/0, cost 1), (::/0, cost 1)) SouthPrefixesElement(prefixes(0/0, cost 1), (::/0, cost 1))
Node121 N-TIEs: Node121 N-TIEs:
Node N-TIE: Node N-TIE:
NodeLinkElement(layer=1, NodeElement(level=1,
neighbors((Spine21, layer 2, cost 1, links(...)), neighbors((Spine21, level 2, cost 1, links(...)),
(Spine22, layer 2, cost 1, links(...)), (Spine22, level 2, cost 1, links(...)),
(Leaf121, layer 0, cost 1, links(...)), (Leaf121, level 0, cost 1, links(...)),
(Leaf122, layer 0, cost 1, links(...)))) (Leaf122, level 0, cost 1, links(...))))
Prefix N-TIE: Prefix N-TIE:
NorthPrefixesElement(prefixes(Node121.loopback)) NorthPrefixesElement(prefixes(Node121.loopback))
Leaf112 N-TIEs: Leaf112 N-TIEs:
Node N-TIE: Node N-TIE:
NodeLinkElement(layer=0, NodeElement(level=0,
neighbors((Node111, layer 1, cost 1, links(...)), neighbors((Node111, level 1, cost 1, links(...)),
(Node112, layer 1, cost 1, links(...)))) (Node112, level 1, cost 1, links(...))))
Prefix N-TIE: Prefix N-TIE:
NorthPrefixesElement(prefixes(Leaf112.loopback, Prefix112, NorthPrefixesElement(prefixes(Leaf112.loopback, Prefix112,
Prefix_MH)) Prefix_MH))
Figure 3: example TIEs generated in a 2 level spine-and-leaf topology Figure 3: example TIEs generated in a 2 level spine-and-leaf topology
4.2.3.3. Flooding 4.2.3.3. Flooding
The mechanism used to distribute TIEs is the well-known (albeit The mechanism used to distribute TIEs is the well-known (albeit
modified in several respects to address fat tree requirements) modified in several respects to address fat tree requirements)
skipping to change at page 22, line 35 skipping to change at page 22, line 35
originates in its south prefix TIE such a default route IIF originates in its south prefix TIE such a default route IIF
1. all other nodes at X's level are overloaded OR 1. all other nodes at X's level are overloaded OR
2. all other nodes at X's level have NO northbound adjacencies OR 2. all other nodes at X's level have NO northbound adjacencies OR
3. X has computed reachability to a default route during N-SPF. 3. X has computed reachability to a default route during N-SPF.
The term "all other nodes at X's level" describes obviously just the The term "all other nodes at X's level" describes obviously just the
nodes at the same level in the POD with a viable lower layer nodes at the same level in the POD with a viable lower level
(otherwise the node S-TIEs cannot be reflected and the nodes in e.g. (otherwise the node S-TIEs cannot be reflected and the nodes in e.g.
POD 1 and POD 2 are "invisible" to each other). POD 1 and POD 2 are "invisible" to each other).
A node originating a southbound default route MUST install a default A node originating a southbound default route MUST install a default
discard route if it did not compute a default route during N-SPF. discard route if it did not compute a default route during N-SPF.
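The three origination conditions and the discard-route requirement above can be written as a small predicate. This is a minimal sketch: `PeerView` and `default_route_decision` are hypothetical names, and any preconditions on node X stated outside this excerpt are not modeled.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class PeerView:
    """What X knows about another node at its own level (assumed model)."""
    overloaded: bool
    has_northbound_adjacency: bool


def default_route_decision(
    peers: Iterable[PeerView], computed_default_in_nspf: bool
) -> Tuple[bool, bool]:
    """Return (originate_default, install_discard_route) for node X."""
    peers = list(peers)
    # Conditions 1 and 2 quantify over "all other nodes at X's level";
    # with no such peers, all() is vacuously true.
    all_overloaded = all(p.overloaded for p in peers)
    none_has_north = all(not p.has_northbound_adjacency for p in peers)
    originate = all_overloaded or none_has_north or computed_default_in_nspf
    # A node that originates the default without having computed one
    # during N-SPF installs a default discard route.
    install_discard = originate and not computed_default_in_nspf
    return originate, install_discard
```

The second element of the result captures the MUST in the last paragraph: origination triggered only by conditions 1 or 2 comes with a discard route, so traffic attracted by the default is dropped rather than looped.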
4.2.3.8. Northbound TIE Flooding Reduction 4.2.3.8. Northbound TIE Flooding Reduction
Section 1.4 of the Optimized Link State Routing Protocol [RFC3626] Section 1.4 of the Optimized Link State Routing Protocol [RFC3626]
(OLSR) introduces the concept of a "multipoint relay" (MPR) that (OLSR) introduces the concept of a "multipoint relay" (MPR) that
skipping to change at page 27, line 6 skipping to change at page 27, line 6
easier troubleshooting, the approach taken in RIFT is that a node's easier troubleshooting, the approach taken in RIFT is that a node's
southbound policy-guided prefixes are sent in its S-TIE and the southbound policy-guided prefixes are sent in its S-TIE and the
receiver does inbound filtering based on the associated communities receiver does inbound filtering based on the associated communities
(an egress policy is imaginable but would lead to different S-TIEs (an egress policy is imaginable but would lead to different S-TIEs
per neighbor possibly which is not considered in RIFT protocol per neighbor possibly which is not considered in RIFT protocol
procedures). A southbound policy-guided prefix can only use links in procedures). A southbound policy-guided prefix can only use links in
the south direction. If a PGP S-TIE is received on an East-West or the south direction. If a PGP S-TIE is received on an East-West or
northbound link, it must be discarded by ingress filtering. northbound link, it must be discarded by ingress filtering.
Conceptually, a southbound policy-guided prefix guides traffic from Conceptually, a southbound policy-guided prefix guides traffic from
the leaves up to at most the north-most layer. It is also necessary the leaves up to at most the north-most level. It is also necessary
to have northbound policy-guided prefixes to guide traffic from to have northbound policy-guided prefixes to guide traffic from
the north-most layer down to the appropriate leaves. Therefore, RIFT the north-most level down to the appropriate leaves. Therefore, RIFT
includes northbound policy-guided prefixes in its N PGP-TIE and the includes northbound policy-guided prefixes in its N PGP-TIE and the
receiver does inbound filtering based on the associated communities. receiver does inbound filtering based on the associated communities.
A northbound policy-guided prefix can only use links in the northern A northbound policy-guided prefix can only use links in the northern
direction. If an N PGP TIE is received on an East-West or southbound direction. If an N PGP TIE is received on an East-West or southbound
link, it must be discarded by ingress filtering. link, it must be discarded by ingress filtering.
By separating southbound and northbound policy-guided prefixes and By separating southbound and northbound policy-guided prefixes and
requiring that the cost associated with a PGP is strictly requiring that the cost associated with a PGP is strictly
monotonically increasing at each hop, the path cannot loop. Because monotonically increasing at each hop, the path cannot loop. Because
the costs are strictly increasing, it is not possible to have a loop the costs are strictly increasing, it is not possible to have a loop
skipping to change at page 27, line 31 skipping to change at page 27, line 31
counting to infinity would become an issue to be solved. If complete counting to infinity would become an issue to be solved. If complete
generality of path - such as including East-West links and using both generality of path - such as including East-West links and using both
north and south links in arbitrary sequence - is desired, then a Path Vector north and south links in arbitrary sequence - is desired, then a Path Vector
protocol or a similar solution must be considered. protocol or a similar solution must be considered.
If a node has received the same prefix, after ingress filtering, as a If a node has received the same prefix, after ingress filtering, as a
PGP in an S-TIE and in an N-TIE, then the node determines which PGP in an S-TIE and in an N-TIE, then the node determines which
policy-guided prefix to use based upon the advertised cost. policy-guided prefix to use based upon the advertised cost.
A policy-guided prefix is always preferred to a regular prefix, even A policy-guided prefix is always preferred to a regular prefix, even
if the policy-guided prefix has a larger cost. Section 8 provides if the policy-guided prefix has a larger cost. Appendix A provides
normative indication of prefix preferences. normative indication of prefix preferences.
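The two preference rules above can be expressed as a single sort key. This is a sketch under a stated assumption: the text says only that the choice between the S-TIE and N-TIE PGP is "based upon the advertised cost", and the code assumes the lower cost wins. `Candidate` and `select_candidate` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Candidate:
    source: str   # e.g. "S-TIE PGP", "N-TIE PGP", "regular"
    is_pgp: bool
    cost: int


def select_candidate(candidates: Iterable[Candidate]) -> Candidate:
    """Pick the route to use for one prefix after ingress filtering.

    Policy-guided prefixes sort ahead of regular ones regardless of cost;
    among equals, the lower advertised cost is assumed to win.
    """
    return min(candidates, key=lambda c: (not c.is_pgp, c.cost))
```

For example, a regular prefix with cost 1 loses to a PGP with cost 10, because the PGP flag is compared before the cost.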
The set of policy-guided prefixes received in a TIE is subject to The set of policy-guided prefixes received in a TIE is subject to
ingress filtering and then re-originated to be sent out in the ingress filtering and then re-originated to be sent out in the
receiver's appropriate TIE. Both the ingress filtering and the re- receiver's appropriate TIE. Both the ingress filtering and the re-
origination use the communities associated with the policy-guided origination use the communities associated with the policy-guided
prefixes to determine the correct behavior. The cost on re- prefixes to determine the correct behavior. The cost on re-
advertisement MUST increase in a strictly monotonic fashion. advertisement MUST increase in a strictly monotonic fashion.
4.2.4.1. Ingress Filtering 4.2.4.1. Ingress Filtering
When a node X receives a PGP S-TIE or a PGP N-TIE that is originated When a node X receives a PGP S-TIE or a PGP N-TIE that is originated
from a node Y which does not have an adjacency with X, all PGPs in from a node Y which does not have an adjacency with X, all PGPs in
such a TIE MUST be filtered. Similarly, if node Y is at the same such a TIE MUST be filtered. Similarly, if node Y is at the same
layer as node X, then X MUST filter out PGPs in such S- and N-TIEs to level as node X, then X MUST filter out PGPs in such S- and N-TIEs to
prevent loops. prevent loops.
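The two structural checks in the paragraph above can be sketched as a predicate applied before any policy is evaluated. The function name and parameters are hypothetical, and the link-direction checks described earlier in this section are deliberately not modeled here.

```python
def passes_structural_ingress_filter(
    receiver_level: int,
    originator_level: int,
    originator_is_adjacent: bool,
) -> bool:
    """Return True if PGPs in a received PGP TIE may proceed to policy.

    Covers only the two checks from the paragraph above.
    """
    if not originator_is_adjacent:
        # Originator Y has no adjacency with receiver X: filter all PGPs.
        return False
    if originator_level == receiver_level:
        # Same-level originator: filter to prevent loops.
        return False
    return True
```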
Next, policy can be applied to determine which policy-guided prefixes Next, policy can be applied to determine which policy-guided prefixes
to accept. Since ingress filtering is chosen rather than egress to accept. Since ingress filtering is chosen rather than egress
filtering and per-neighbor PGPs, policy that applies to links is done filtering and per-neighbor PGPs, policy that applies to links is done
at the receiver. Because the RIFT adjacency is between nodes and at the receiver. Because the RIFT adjacency is between nodes and
there may be parallel links between the two nodes, the policy-guided there may be parallel links between the two nodes, the policy-guided
prefix is considered to start with the next-hop set that has all prefix is considered to start with the next-hop set that has all
links to the originating node Y. links to the originating node Y.
skipping to change at page 30, line 31 skipping to change at page 30, line 31
To compute reachability, a node runs conceptually a northbound and a To compute reachability, a node runs conceptually a northbound and a
southbound SPF. We call that N-SPF and S-SPF. southbound SPF. We call that N-SPF and S-SPF.
Since neither computation can "loop" (with due considerations given Since neither computation can "loop" (with due considerations given
to PGPs), it is possible to compute non-equal-cost or even k-shortest to PGPs), it is possible to compute non-equal-cost or even k-shortest
paths [EPPSTEIN] and "saturate" the fabric to the extent desired. paths [EPPSTEIN] and "saturate" the fabric to the extent desired.
4.2.5.1. Northbound SPF 4.2.5.1. Northbound SPF
N-SPF uses northbound and East-West adjacencies in North Node TIEs N-SPF uses northbound and East-West adjacencies in the computing
when progressing Dijkstra. Observe that this is really just a one node's node N-TIEs (since if the node is a leaf it may not have
hop variety since South Node TIEs are not re-flooded southbound generated a node S-TIE) when starting Dijkstra. Observe that N-SPF
beyond a single level (or East-West) and with that the computation is really just a one hop variety since Node S-TIEs are not re-flooded
cannot progress beyond adjacent nodes. southbound beyond a single level (or East-West) and with that the
computation cannot progress beyond adjacent nodes.
Once progressing, we are using the next level's node S-TIEs to find
according adjacencies to verify backlink connectivity. Just as in
case of IS-IS or OSPF, two unidirectional links are associated
together to confirm bidirectional connectivity.
Default route found when crossing an E-W link is used IFF Default route found when crossing an E-W link is used IFF
1. the node itself does NOT have any northbound adjacencies AND 1. the node itself does NOT have any northbound adjacencies AND
2. the adjacent node has one or more northbound adjacencies 2. the adjacent node has one or more northbound adjacencies
This rule forms a "one-hop default route split-horizon" and prevents This rule forms a "one-hop default route split-horizon" and prevents
looping over default routes while allowing for "one-hop protection" looping over default routes while allowing for "one-hop protection"
of nodes that lost all northbound adjacencies. of nodes that lost all northbound adjacencies.
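The two conditions above can be sketched as a single predicate. This is only an illustrative sketch: the function name and the idea of passing adjacency counts as integers are assumptions made for the example and are not part of the specification.

```python
def use_ew_default_route(own_northbound_adjacencies: int,
                         neighbor_northbound_adjacencies: int) -> bool:
    """Return True if a default route learned across an E-W link may be used.

    Hypothetical helper: both arguments are simple counts of northbound
    adjacencies, on the computing node and on the E-W neighbor respectively.
    """
    # Condition 1: the node itself has no northbound adjacencies.
    node_is_cut_off = own_northbound_adjacencies == 0
    # Condition 2: the adjacent node has one or more northbound adjacencies.
    neighbor_can_go_north = neighbor_northbound_adjacencies >= 1
    return node_is_cut_off and neighbor_can_go_north
```

A node that still has its own northbound adjacencies never uses the E-W default, which is what gives the "one-hop split-horizon" behavior.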
skipping to change at page 31, line 15 skipping to change at page 31, line 20
2. the node does not originate a non-default supersuming prefix 2. the node does not originate a non-default supersuming prefix
itself. itself.
i.e. the E-W link can be used as the gateway of last resort for a i.e. the E-W link can be used as the gateway of last resort for a
specific prefix only. Using south prefixes across E-W link can be specific prefix only. Using south prefixes across E-W link can be
beneficial e.g. on automatic de-aggregation in pathological fabric beneficial e.g. on automatic de-aggregation in pathological fabric
partitioning scenarios. partitioning scenarios.
A detailed example can be found in Section 5.4. A detailed example can be found in Section 5.4.
For N-SPF we are using the South Node TIEs to find according
adjacencies to verify backlink connectivity. Just as in case of IS-
IS or OSPF, two unidirectional links are associated together to
confirm bidirectional connectivity.
4.2.5.2. Southbound SPF 4.2.5.2. Southbound SPF
S-SPF uses only the southbound adjacencies in the south node TIEs, S-SPF uses only the southbound adjacencies in the node S-TIEs, i.e.
i.e. progresses towards nodes at lower levels. Observe that E-W progresses towards nodes at lower levels. Observe that E-W
adjacencies are NEVER used in the computation. This enforces the adjacencies are NEVER used in the computation. This enforces the
requirement that a packet traversing in a southbound direction must requirement that a packet traversing in a southbound direction must
never change its direction. never change its direction.
S-SPF uses northbound adjacencies in north node TIEs to verify S-SPF uses northbound adjacencies in node N-TIEs to verify backlink
backlink connectivity. connectivity.
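As a rough sketch of the backlink check during S-SPF, a southbound adjacency is only followed when the lower node reports the matching northbound adjacency. The dictionaries below are hypothetical stand-ins for adjacencies extracted from node S-TIEs and node N-TIEs; the actual TIE encoding is not modeled here.

```python
def backlink_ok(upper, lower, south_adjacencies, north_adjacencies):
    """Check bidirectional connectivity for the link upper -> lower.

    south_adjacencies: node -> set of southbound neighbors (from node S-TIEs).
    north_adjacencies: node -> set of northbound neighbors (from node N-TIEs).
    Both mappings are assumptions made for this illustration.
    """
    # Forward direction: the upper node advertises the lower node southbound.
    forward = lower in south_adjacencies.get(upper, set())
    # Backlink: the lower node advertises the upper node northbound.
    backward = upper in north_adjacencies.get(lower, set())
    return forward and backward
```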
4.2.5.3. East-West Forwarding Within a Level 4.2.5.3. East-West Forwarding Within a Level
Ultimately, it should be observed that in presence of a "ring" of E-W Ultimately, it should be observed that in presence of a "ring" of E-W
links in a level neither SPF will provide a "ring protection" scheme links in a level neither SPF will provide a "ring protection" scheme
since such a computation would have to deal necessarily with breaking since such a computation would have to deal necessarily with breaking
of "loops" in generic Dijkstra sense; an application for which RIFT of "loops" in generic Dijkstra sense; an application for which RIFT
is not intended. It is outside the scope of this document how an is not intended. It is outside the scope of this document how an
underlay can be used to provide a full-mesh connectivity between underlay can be used to provide a full-mesh connectivity between
nodes in the same layer that would allow for N-SPF to provide nodes in the same level that would allow for N-SPF to provide
protection for a single node losing all its northbound adjacencies protection for a single node losing all its northbound adjacencies
(as long as any of the other nodes in the level are northbound (as long as any of the other nodes in the level are northbound
connected). connected).
Using south prefixes over horizontal links is optional and can Using south prefixes over horizontal links is optional and can
protect against pathological fabric partitioning cases that leave protect against pathological fabric partitioning cases that leave
only paths to destinations that would necessitate multiple changes of only paths to destinations that would necessitate multiple changes of
forwarding direction between north and south. forwarding direction between north and south.
4.2.6. Attaching Prefixes 4.2.6. Attaching Prefixes
skipping to change at page 33, line 6 skipping to change at page 33, line 6
route database. The cost of the prefix is set to the cost received route database. The cost of the prefix is set to the cost received
plus the cost of the minimum cost next-hop to that neighbor. Then plus the cost of the minimum cost next-hop to that neighbor. Then
each prefix can be added into the RIFT route database with the each prefix can be added into the RIFT route database with the
next_hop_set; ties are broken based upon type first and then next_hop_set; ties are broken based upon type first and then
distance. RIFT route preferences are normalized by the according distance. RIFT route preferences are normalized by the according
thrift model type. thrift model type.
An exemplary implementation for node X follows: An exemplary implementation for node X follows:
for each S-TIE for each S-TIE
if S-TIE.layer > X.layer if S-TIE.level > X.level
next_hop_set = set of minimum cost links to the S-TIE.originator next_hop_set = set of minimum cost links to the S-TIE.originator
next_hop_cost = minimum cost link to S-TIE.originator next_hop_cost = minimum cost link to S-TIE.originator
end if end if
for each prefix P in the S-TIE for each prefix P in the S-TIE
P.cost = P.cost + next_hop_cost P.cost = P.cost + next_hop_cost
if P not in route_database: if P not in route_database:
add (P, type=DistVector, P.cost, next_hop_set) to route_database add (P, type=DistVector, P.cost, next_hop_set) to route_database
end if end if
if (P in route_database) and if (P in route_database) and
(route_database[P].type is not PolicyGuided): (route_database[P].type is not PolicyGuided):
skipping to change at page 35, line 5 skipping to change at page 35, line 5
the following steps: the following steps:
1. A DAG computation in the southern direction is performed first, 1. A DAG computation in the southern direction is performed first,
i.e. the N-TIEs are used to find all of the prefixes it can reach and i.e. the N-TIEs are used to find all of the prefixes it can reach and
the set of next-hops in the lower level for each. Such a the set of next-hops in the lower level for each. Such a
computation can be easily performed on a fat tree by e.g. setting computation can be easily performed on a fat tree by e.g. setting
all link costs in the southern direction to 1 and all northern all link costs in the southern direction to 1 and all northern
directions to infinity. We term the set of those prefixes |R, and directions to infinity. We term the set of those prefixes |R, and
for each prefix, r, in |R, we define its set of next-hops to for each prefix, r, in |R, we define its set of next-hops to
be |H(r). Observe that policy-guided prefixes are NOT affected be |H(r). Observe that policy-guided prefixes are NOT affected
since their scope is controlled by configuration. since their distribution scope is controlled by configuration.
2. The node uses reflected S-TIEs to find all nodes at the same 2. The node uses reflected S-TIEs to find all nodes at the same
level in the same PoD and the set of southbound adjacencies for level in the same PoD and the set of southbound adjacencies for
each. The set of nodes at the same level is termed |N and for each. The set of nodes at the same level is termed |N and for
each node, n, in |N, we define its set of southbound adjacencies each node, n, in |N, we define its set of southbound adjacencies
to be |A(n). to be |A(n).
3. For a given r, if the intersection of |H(r) and |A(n), for any n, 3. For a given r, if the intersection of |H(r) and |A(n), for any n,
is null then that prefix r must be explicitly advertised by the is null then that prefix r must be explicitly advertised by the
node in an S-TIE. node in an S-TIE.
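Steps 1 to 3 reduce to a set intersection test per prefix. The following is a minimal sketch under the assumption that |H and |A have already been computed and are available as plain dictionaries of sets; the function and variable names are illustrative only.

```python
def prefixes_to_disaggregate(H, A):
    """Return prefixes r for which some same-level node n has no overlap
    between |H(r) and |A(n).

    H: prefix -> set of next-hops in the lower level (|H(r)).
    A: same-level node -> set of its southbound adjacencies (|A(n)).
    Both inputs are hypothetical, pre-computed structures.
    """
    result = []
    for prefix, next_hops in H.items():
        # An empty intersection for any n means n cannot reach the prefix
        # through any next-hop this node uses, so the prefix is advertised
        # explicitly in an S-TIE.
        if any(not (next_hops & adjacencies) for adjacencies in A.values()):
            result.append(prefix)
    return sorted(result)
```

With `H = {"p1": {"L1"}, "p2": {"L1", "L2"}}` and `A = {"S2": {"L2"}}`, only `p1` qualifies, since `S2` has no adjacency to `L1`.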
skipping to change at page 35, line 39 skipping to change at page 35, line 39
a node X needs to determine if it can reach a different set of south a node X needs to determine if it can reach a different set of south
neighbors than other nodes at the same level, which are connected to neighbors than other nodes at the same level, which are connected to
it via at least one common south or East-West neighbor. If it can, it via at least one common south or East-West neighbor. If it can,
then prefix disaggregation may be required. If it can't, then no then prefix disaggregation may be required. If it can't, then no
prefix disaggregation is needed. An example of disaggregation is prefix disaggregation is needed. An example of disaggregation is
provided in Section 5.3. provided in Section 5.3.
A possible algorithm is described last: A possible algorithm is described last:
1. Create partial_neighbors = (empty), a set of neighbors with 1. Create partial_neighbors = (empty), a set of neighbors with
partial connectivity to the node X's layer from X's perspective. partial connectivity to the node X's level from X's perspective.
Each entry consists of a south neighbor of X and a list of nodes Each entry consists of a south neighbor of X and a list of nodes
of X.layer that can't reach that neighbor. of X.level that can't reach that neighbor.
2. A node X determines its set of southbound neighbors 2. A node X determines its set of southbound neighbors
X.south_neighbors. X.south_neighbors.
3. For each S-TIE originated from a node Y that X has which is at 3. For each S-TIE originated from a node Y that X has which is at
X.layer, if Y.south_neighbors is not the same as X.level, if Y.south_neighbors is not the same as
X.south_neighbors but the nodes share at least one southern X.south_neighbors but the nodes share at least one southern
neighbor, for each neighbor N in X.south_neighbors but not in neighbor, for each neighbor N in X.south_neighbors but not in
Y.south_neighbors, add (N, (Y)) to partial_neighbors if N isn't Y.south_neighbors, add (N, (Y)) to partial_neighbors if N isn't
there or add Y to the list for N. there or add Y to the list for N.
4. If partial_neighbors is empty, then node X does not need to 4. If partial_neighbors is empty, then node X does not need to
disaggregate any prefixes. If node X is advertising disaggregate any prefixes. If node X is advertising
disaggregated prefixes in its S-TIE, X SHOULD remove them and re- disaggregated prefixes in its S-TIE, X SHOULD remove them and re-
advertise its according S-TIEs. advertise its according S-TIEs.
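Steps 1 through 3 of the list above could be sketched as follows. The representation is an assumption for illustration: south neighbor sets are passed in directly rather than derived from S-TIEs, and partial_neighbors is modeled as a dictionary from neighbor to a list of nodes.

```python
def compute_partial_neighbors(x_south_neighbors, peers_south_neighbors):
    """Build partial_neighbors from X's perspective.

    x_south_neighbors: set of X's southbound neighbors.
    peers_south_neighbors: node Y at X's level -> set of Y.south_neighbors
        (hypothetical input, assumed to be extracted from Y's S-TIEs).
    Returns: south neighbor N -> list of same-level nodes that can't reach N.
    """
    partial_neighbors = {}
    for y, y_south in sorted(peers_south_neighbors.items()):
        differs = y_south != x_south_neighbors
        shares_one = bool(y_south & x_south_neighbors)
        if differs and shares_one:
            # Every neighbor X has that Y lacks is only partially connected.
            for n in sorted(x_south_neighbors - y_south):
                partial_neighbors.setdefault(n, []).append(y)
    return partial_neighbors
```

An empty result corresponds to step 4: node X has nothing to disaggregate.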
A node X computes its SPF based upon the received N-TIEs. This A node X computes its SPF based upon the received N-TIEs. This
results in a set of routes, each categorized by (prefix, results in a set of routes, each categorized by (prefix,
path_distance, next-hop-set). Alternately, for clarity in the path_distance, next-hop-set). Alternately, for clarity in the
following procedure, these can be organized by next-hop-set as ( following procedure, these can be organized by next-hop-set as (
(next-hops), {(prefix, path_distance)}). If partial_neighbors isn't (next-hops), {(prefix, path_distance)}). If partial_neighbors isn't
empty, then the following procedure describes how to identify empty, then the following procedure describes how to identify
prefixes to disaggregate. prefixes to disaggregate.
disaggregated_prefixes = {empty } disaggregated_prefixes = {empty }
nodes_same_layer = { empty } nodes_same_level = { empty }
for each S-TIE for each S-TIE
if (S-TIE.layer == X.layer and if (S-TIE.level == X.level and
S-TIE.originator shares at least one S-neighbor with X) S-TIE.originator shares at least one S-neighbor with X)
add S-TIE.originator to nodes_same_layer add S-TIE.originator to nodes_same_level
end if end if
end for end for
for each next-hop-set NHS for each next-hop-set NHS
isolated_nodes = nodes_same_layer isolated_nodes = nodes_same_level
for each NH in NHS for each NH in NHS
if NH in partial_neighbors if NH in partial_neighbors
isolated_nodes = intersection(isolated_nodes, isolated_nodes = intersection(isolated_nodes,
partial_neighbors[NH].nodes) partial_neighbors[NH].nodes)
end if end if
end for end for
if isolated_nodes is not empty if isolated_nodes is not empty
for each prefix using NHS for each prefix using NHS
add (prefix, distance) to disaggregated_prefixes add (prefix, distance) to disaggregated_prefixes
skipping to change at page 37, line 28 skipping to change at page 37, line 28
3. all the lower level nodes are flooded the same disaggregated 3. all the lower level nodes are flooded the same disaggregated
prefixes since we don't want to build an S-TIE per node and prefixes since we don't want to build an S-TIE per node and
complicate things unnecessarily. The PoD containing the prefix complicate things unnecessarily. The PoD containing the prefix
will prefer southbound anyway. will prefer southbound anyway.
4. disaggregated prefixes do NOT have to propagate to lower levels. 4. disaggregated prefixes do NOT have to propagate to lower levels.
With that the disturbance in terms of new flooding is contained With that the disturbance in terms of new flooding is contained
to a single level experiencing failures only. to a single level experiencing failures only.
5. disaggregated prefix S-TIEs are not "reflected" by the lower 5. disaggregated prefix S-TIEs are not "reflected" by the lower
layer, i.e. nodes within same level do NOT need to be aware level, i.e. nodes within same level do NOT need to be aware
which node computed the need for disaggregation. which node computed the need for disaggregation.
6. The fabric is still supporting maximum load balancing properties 6. The fabric is still supporting maximum load balancing properties
while not trying to send traffic northbound unless necessary. while not trying to send traffic northbound unless necessary.
Ultimately, complex partitions of superspine on sparsely connected Ultimately, complex partitions of superspine on sparsely connected
fabrics can lead to necessity of transitive disaggregation through fabrics can lead to necessity of transitive disaggregation through
multiple layers. The topic will be described and standardized in multiple levels. The topic will be described and standardized in
later versions of this document. later versions of this document.
4.2.9. Optional Autoconfiguration 4.2.9. Optional Autoconfiguration
Each RIFT node can optionally operate in zero touch provisioning Each RIFT node can optionally operate in zero touch provisioning
(ZTP) mode, i.e. it has no configuration (unless it is a superspine (ZTP) mode, i.e. it has no configuration (unless it is a superspine
at the top of the topology or it must operate in the topology as at the top of the topology or it must operate in the topology as
leaf and/or support leaf-2-leaf procedures) and it will fully leaf and/or support leaf-2-leaf procedures) and it will fully
configure itself after being attached to the topology. Configured configure itself after being attached to the topology. Configured
nodes and nodes operating in ZTP can be mixed and will form a valid nodes and nodes operating in ZTP can be mixed and will form a valid
skipping to change at page 39, line 10 skipping to change at page 39, line 10
from LIEs with `not_a_ztp_offer` being true are not VOLs either. from LIEs with `not_a_ztp_offer` being true are not VOLs either.
Highest Available Level (HAL): Highest defined level value seen from Highest Available Level (HAL): Highest defined level value seen from
all VOLs received. all VOLs received.
Highest Adjacency Three Way (HAT): Highest neighbor level of all the Highest Adjacency Three Way (HAT): Highest neighbor level of all the
formed three way adjacencies for the node. formed three way adjacencies for the node.
SUPERSPINE_FLAG: Configuration flag provided to all superspines. SUPERSPINE_FLAG: Configuration flag provided to all superspines.
LEAF_FLAG and CONFIGURED_LEVEL cannot be defined at the same time LEAF_FLAG and CONFIGURED_LEVEL cannot be defined at the same time
as this flag. It implies CONFIGURED_LEVEL value of 16. In fact, as this flag. It implies a CONFIGURED_LEVEL value. In fact, it
it is basically a shortcut for configuring same level at all is basically a shortcut for configuring same level at all
superspine nodes which is unavoidable since an initial 'seed' is superspine nodes which is unavoidable since an initial 'seed' is
needed for other ZTP nodes to derive their level in the topology. needed for other ZTP nodes to derive their level in the topology.
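Both HAL and HAT, as defined above, are maxima over a set of level values in which undefined levels are ignored. A small sketch, assuming an undefined level is represented as None and that the lists of VOL levels and three-way neighbor levels have already been collected (both are assumptions for the example):

```python
from typing import Iterable, Optional


def highest_defined_level(levels: Iterable[Optional[int]]) -> Optional[int]:
    """Return the highest defined level, or None if no level is defined.

    None stands in for an undefined level; this encoding is an assumption
    of the sketch, not something taken from the schema.
    """
    defined = [level for level in levels if level is not None]
    return max(defined) if defined else None


def compute_hal(vol_levels: Iterable[Optional[int]]) -> Optional[int]:
    # HAL: highest defined level value seen from all VOLs received.
    return highest_defined_level(vol_levels)


def compute_hat(three_way_neighbor_levels: Iterable[Optional[int]]) -> Optional[int]:
    # HAT: highest neighbor level of all formed three way adjacencies.
    return highest_defined_level(three_way_neighbor_levels)
```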
4.2.9.2. Automatic SystemID Selection 4.2.9.2. Automatic SystemID Selection
RIFT identifies each node via a SystemID which is a 64 bits wide RIFT identifies each node via a SystemID which is a 64 bits wide
integer. It is relatively simple to derive a, for all practical integer. It is relatively simple to derive a, for all practical
purposes collision free, value for each node on startup. For that purposes collision free, value for each node on startup. For that
purpose, a node MUST use as system ID EUI-64 MA-L format where the purpose, a node MUST use as system ID EUI-64 MA-L format where the
organizationally governed 24 bits can be used to generate system IDs organizationally governed 24 bits can be used to generate system IDs
skipping to change at page 40, line 37 skipping to change at page 40, line 37
. | | | | | . | | | | |
. +-----------------+ | | . +-----------------+ | |
. | | | | | . | | | | |
. ++-++ ++-++ | . ++-++ ++-++ |
. | X +-----+ Y +-+ . | X +-----+ Y +-+
. |l2l| | l | . |l2l| | l |
. +---+ +---+ . +---+ +---+
Figure 7: Generic ZTP Cabling Considerations Figure 7: Generic ZTP Cabling Considerations
First, we need to anchor the "top" of the cabling and that's what the First, we must anchor the "top" of the cabling and that's what the
SUPERSPINE_FLAG at node A is for. Then things look smooth until we SUPERSPINE_FLAG at node A is for. Then things look smooth until we
have to decide whether node Y is at the same level as I, J (and have to decide whether node Y is at the same level as I, J (and
consequently X is south of it) or at the same level as X. This is consequently X is south of it) or at the same level as X. This is
unresolvable here until we "nail down the bottom" of the topology. unresolvable here until we "nail down the bottom" of the topology.
To achieve that we use the leaf flags. We will see further then To achieve that we choose to use in this example the leaf flags. We
whether Y chooses to form adjacencies to F or I, J successively. will see further then whether Y chooses to form adjacencies to F or
I, J successively.
4.2.9.4. Level Determination Procedure 4.2.9.4. Level Determination Procedure
A node starting up with UNDEFINED_VALUE (i.e. without a A node starting up with UNDEFINED_VALUE (i.e. without a
CONFIGURED_LEVEL or any leaf or superspine flag) MUST follow those CONFIGURED_LEVEL or any leaf or superspine flag) MUST follow those
additional procedures: additional procedures:
1. It advertises its LEVEL_VALUE on all LIEs (observe that this can 1. It advertises its LEVEL_VALUE on all LIEs (observe that this can
be UNDEFINED_LEVEL which in terms of the schema is simply an be UNDEFINED_LEVEL which in terms of the schema is simply an
omitted optional value). omitted optional value).
skipping to change at page 41, line 32 skipping to change at page 41, line 32
5. A node that changed its defined level value MUST readvertise its 5. A node that changed its defined level value MUST readvertise its
own TIEs (since the new `PacketHeader` will contain a different own TIEs (since the new `PacketHeader` will contain a different
level than before). Sequence number of each TIE MUST be level than before). Sequence number of each TIE MUST be
increased. increased.
6. After a level has been derived the node MUST set the 6. After a level has been derived the node MUST set the
`not_a_ztp_offer` on LIEs towards all systems extending a VOL for `not_a_ztp_offer` on LIEs towards all systems extending a VOL for
HAL. HAL.
A node starting with LEVEL_VALUE being 0 (i.e. it assumes a leaf A node starting with LEVEL_VALUE being 0 (i.e. it assumes a leaf
function or has a CONFIGURED_LEVEL of 0) MUST follow those additional function by being configured with the appropriate flags or has a
procedures: CONFIGURED_LEVEL of 0) MUST follow those additional procedures:
1. It computes HAT per procedures above but does NOT use it to 1. It computes HAT per procedures above but does NOT use it to
compute DERIVED_LEVEL. HAT is used to limit adjacency formation compute DERIVED_LEVEL. HAT is used to limit adjacency formation
per Section 4.2.2. per Section 4.2.2.
Precise finite state machines will be provided in later versions of Precise finite state machines will be provided in later versions of
this specification. this specification.
4.2.9.5. Resulting Topologies 4.2.9.5. Resulting Topologies
The procedures defined in Section 4.2.9.4 will lead to the RIFT The procedures defined in Section 4.2.9.4 will lead to the RIFT
topology and levels depicted in Figure 8. topology and levels depicted in Figure 8.
. +---+ . +---+
. | As| . | As|
. | 64| . | 24|
. ++-++ . ++-++
. | | . | |
. +--+ +--+ . +--+ +--+
. | | . | |
. +--++ ++--+ . +--++ ++--+
. | E | | F | . | E | | F |
. | 63+-+ | 63+-----------+ . | 23+-+ | 23+-----------+
. ++--+ | ++-++ | . ++--+ | ++-++ |
. | | | | | . | | | | |
. | +-------+ | | . | +-------+ | |
. | | | | | . | | | | |
. | | +----+ | | . | | +----+ | |
. | | | | | . | | | | |
. ++-++ ++-++ | . ++-++ ++-++ |
. | I +-----+ J | | . | I +-----+ J | |
. | 62| | 62| | . | 22| | 22| |
. ++--+ +--++ | . ++--+ +--++ |
. | | | . | | |
. +---------+ | | . +---------+ | |
. | | | . | | |
. ++-++ +---+ | . ++-++ +---+ |
. | X | | Y +-+ . | X | | Y +-+
. | 0 | | 0 | . | 0 | | 0 |
. +---+ +---+ . +---+ +---+
Figure 8: Generic ZTP Topology Autoconfigured Figure 8: Generic ZTP Topology Autoconfigured
In case we imagine the LEAF_ONLY restriction on Y is removed the In case we imagine the LEAF_ONLY restriction on Y is removed the
outcome would be very different however and result in Figure 9. This outcome would be very different however and result in Figure 9. This
demonstrates basically that auto configuration prevents miscabling demonstrates basically that auto configuration prevents miscabling
detection and with that can lead to undesirable effects when leafs detection and with that can lead to undesirable effects in cases
are not "nailed" and arbitrarily cabled. where leafs are not "nailed" by the accordingly configured flags and
arbitrarily cabled.
. +---+ . +---+
. | As| . | As|
. | 64| . | 24|
. ++-++ . ++-++
. | | . | |
. +--+ +--+ . +--+ +--+
. | | . | |
. +--++ ++--+ . +--++ ++--+
. | E | | F | . | E | | F |
. | 63+-+ | 63+-------+ . | 23+-+ | 23+-------+
. ++--+ | ++-++ | . ++--+ | ++-++ |
. | | | | | . | | | | |
. | +-------+ | | . | +-------+ | |
. | | | | | . | | | | |
. | | +----+ | | . | | +----+ | |
. | | | | | . | | | | |
. ++-++ ++-++ +-+-+ . ++-++ ++-++ +-+-+
. | I +-----+ J +-----+ Y | . | I +-----+ J +-----+ Y |
. | 62| | 62| | 62| . | 22| | 22| | 22|
. ++-++ +--++ ++-++ . ++-++ +--++ ++-++
. | | | | | . | | | | |
. | +-----------------+ | . | +-----------------+ |
. | | | . | | |
. +---------+ | | . +---------+ | |
. | | | . | | |
. ++-++ | . ++-++ |
. | X +--------+ . | X +--------+
. | 0 | . | 0 |
. +---+ . +---+
skipping to change at page 49, line 16 skipping to change at page 49, line 16
independent BFD session or they may share a session. independent BFD session or they may share a session.
In case RIFT changes link identifiers both the hello as well as In case RIFT changes link identifiers both the hello as well as
the BFD sessions SHOULD be brought down and back up again. the BFD sessions SHOULD be brought down and back up again.
Multiple RIFT instances MAY choose to share a single BFD session Multiple RIFT instances MAY choose to share a single BFD session
(in such case it is undefined what discriminators are used albeit (in such case it is undefined what discriminators are used albeit
RIFT CAN advertise the same link ID for the same interface in RIFT CAN advertise the same link ID for the same interface in
multiple instances and with that "share" the discriminators). multiple instances and with that "share" the discriminators).
BFD TTL follows [RFC5082].
4.3.6. Fabric Bandwidth Balancing 4.3.6. Fabric Bandwidth Balancing
A well understood problem in fabrics is that in case of link losses A well understood problem in fabrics is that in case of link losses
it would be ideal to rebalance how much traffic is offered to it would be ideal to rebalance how much traffic is offered to
switches in the next layer based on the ingress and egress bandwidth switches in the next level based on the ingress and egress bandwidth
they have. Current attempts rely mostly on specialized traffic they have. Current attempts rely mostly on specialized traffic
engineering via controller or leafs being aware of complete topology engineering via controller or leafs being aware of complete topology
with according cost and complexity. with according cost and complexity.
RIFT can support a very light weight mechanism that can deal with the RIFT can support a very light weight mechanism that can deal with the
problem in an approximative way based on the fact that RIFT is loop- problem in an approximative way based on the fact that RIFT is loop-
free. free.
4.3.6.1. Northbound Direction 4.3.6.1. Northbound Direction
Every RIFT node SHOULD compute the amount of northbound bandwidth Every RIFT node SHOULD compute the amount of northbound bandwidth
available through neighbors at higher level and modify distance available through neighbors at higher level and modify distance
received on default route from this neighbor. Those different received on default route from this neighbor. Those different
distances SHOULD be used to support weighted ECMP forwarding towards distances SHOULD be used to support weighted ECMP forwarding towards
higher level when using default route. We call such a distance higher level when using default route. We call such a distance
Bandwidth Adjusted Distance or BAD. This is best illustrated by a Bandwidth Adjusted Distance or BAD. This is best illustrated by a
simple example. simple example.
. 100 x 100 100 MBits . 100 x 100 100 MBits
. | x | | . | x | |
. +-+---+-+ +-+---+-+ . +-+---+-+ +-+---+-+
. | | | | . | | | |
. |Node111| |Node112| . |Node111| |Node112|
. +-+---+++ ++----+++ . +-+---+++ ++----+++
. |x || || || . |x || || ||
. || |+---------------+ || . || |+---------------+ ||
. || +---------------+| || . || +---------------+| ||
. || || || || . || || || ||
. || || || || . || || || ||
. -----All Links 10 MBit------- . -----All Links 10 MBit-------
. || || || || . || || || ||
. || || || || . || || || ||
. || +------------+| || || . || +------------+| || ||
. || |+------------+ || || . || |+------------+ || ||
. |x || || || . |x || || ||
. +-+---+++ +--++-+++ . +-+---+++ +--++-+++
. | | | | . | | | |
. |Leaf111| |Leaf112| . |Leaf111| |Leaf112|
. +-------+ +-------+ . +-------+ +-------+
Figure 10: Balancing Bandwidth Figure 10: Balancing Bandwidth
All links from Leafs in Figure 10 are assumed to have 10 MBit/s bandwidth All links from Leafs in Figure 10 are assumed to have 10 MBit/s bandwidth
while the uplinks one level further up are assumed to be 100 MBit/s. while the uplinks one level further up are assumed to be 100 MBit/s.
Further, in Figure 10 we assume that Leaf111 lost one of the parallel Further, in Figure 10 we assume that Leaf111 lost one of the parallel
links to Node 111 and with that wants to possibly push more traffic links to Node 111 and with that wants to possibly push more traffic
onto Node 112. Leaf 112 has equal bandwidth to Node 111 and Node 112 onto Node 112. Leaf 112 has equal bandwidth to Node 111 and Node 112
but Node 111 lost one of its uplinks. but Node 111 lost one of its uplinks.
The local modification of the received default route distance from The local modification of the received default route distance from
upper layer is achieved by running a relatively simple algorithm upper level is achieved by running a relatively simple algorithm
where the bandwidth is weighted exponentially while the distance on where the bandwidth is weighted exponentially while the distance on
the default route represents a multiplier for the bandwidth weight the default route represents a multiplier for the bandwidth weight
for easy operational adjustments. for easy operational adjustments.
On a node L use Node TIEs to compute for each non-overloaded On a node L use Node TIEs to compute for each non-overloaded
northbound neighbor N three values: northbound neighbor N three values:
L_N_u: as sum of the bandwidth available to N L_N_u: as sum of the bandwidth available to N
N_u: as sum of the uplink bandwidth available on N N_u: as sum of the uplink bandwidth available on N
skipping to change at page 51, line 47 skipping to change at page 51, line 47
affected, however, a node MAY choose to compute and use BAD for other affected, however, a node MAY choose to compute and use BAD for other
routes. routes.
Observe further that a change in available bandwidth will only affect Observe further that a change in available bandwidth will only affect
at maximum two levels down in the fabric, i.e. blast radius of at maximum two levels down in the fabric, i.e. blast radius of
bandwidth changes is contained. bandwidth changes is contained.
4.3.6.2. Southbound Direction 4.3.6.2. Southbound Direction
Due to its loop free properties a node could take during S-SPF into Due to its loop free properties a node could take during S-SPF into
account the available bandwidth on the nodes in lower layers and account the available bandwidth on the nodes in lower levels and
modify the amount of traffic offered to next level's "southbound" modify the amount of traffic offered to next level's "southbound"
nodes based as what it sees is the total achievable maximum flow nodes based as what it sees is the total achievable maximum flow
through those nodes. It is worth observing that such computations through those nodes. It is worth observing that such computations
will work better if standardized but do not necessarily have to be. will work better if standardized but do not necessarily have to be.
As long as the packet keeps on heading south it will take one of the As long as the packet keeps on heading south it will take one of the
available paths and arrive at the intended destination. available paths and arrive at the intended destination.
Future versions of this document will fill in more details. Future versions of this document will fill in more details.
skipping to change at page 61, line 9 skipping to change at page 61, line 9
.+-+ | | | .+-+ | | |
. | L0 | | L1 | . | L0 | | L1 |
. +-----+ +-----+ . +-----+ +-----+
Figure 14: Level Shortcut Figure 14: Level Shortcut
Strictly speaking, RIFT is not limited to Clos variations only. The Strictly speaking, RIFT is not limited to Clos variations only. The
protocol preconditions only a sense of 'compass rose direction' protocol preconditions only a sense of 'compass rose direction'
achieved by configuration (or derivation) of levels and other achieved by configuration (or derivation) of levels and other
topologies are possible within this framework. So, conceptually, one topologies are possible within this framework. So, conceptually, one
could include leaf to leaf links and even shortcut between layers but could include leaf to leaf links and even shortcut between levels but
certain requirements in Section 3 will not be met anymore. As an certain requirements in Section 3 will not be met anymore. As an
example, shortcutting levels illustrated in Figure 14 will lead example, shortcutting levels illustrated in Figure 14 will lead
either to suboptimal routing when L0 sends traffic to L1 (since using either to suboptimal routing when L0 sends traffic to L1 (since using
S0's default route will lead to the traffic being sent back to A0 or S0's default route will lead to the traffic being sent back to A0 or
A1) or the leafs need each other's routes installed to understand A1) or the leafs need each other's routes installed to understand
that only A0 and A1 should be used to talk to each other. that only A0 and A1 should be used to talk to each other.
Whether such modifications of topology constraints make sense is Whether such modifications of topology constraints make sense is
dependent on many technology variables, and an exhaustive treatment dependent on many technology variables, and an exhaustive treatment
of the topic is definitely outside the scope of this document. of the topic is definitely outside the scope of this document.
skipping to change at page 62, line 7 skipping to change at page 62, line 7
ID of the discarded TIEs. ID of the discarded TIEs.
Section 4.2.9 presents many attack vectors in untrusted environments, Section 4.2.9 presents many attack vectors in untrusted environments,
starting with nodes that oscillate their level offers to the starting with nodes that oscillate their level offers to the
possibility of a node offering a three-way adjacency with the highest possibility of a node offering a three-way adjacency with the highest
possible level value with a very long holdtime trying to put itself possible level value with a very long holdtime trying to put itself
"on top of the lattice" and with that gaining access to the whole "on top of the lattice" and with that gaining access to the whole
southbound topology. Session authentication mechanisms are necessary southbound topology. Session authentication mechanisms are necessary
in environments where this is possible. in environments where this is possible.
8. Information Elements Schema 8. IANA Considerations
This specification will request at an opportune time multiple
registry points to exchange protocol packets in a standardized way,
amongst them multicast address assignments and standard port numbers.
The schema itself defines many values and codepoints which can be
considered registries themselves.
9. Acknowledgments
Many thanks to Naiming Shen for some of the early discussions around
the topic of using IGPs for routing in topologies related to Clos.
Russ White is to be especially acknowledged for the key conversation on
epistemology that allowed tying current asynchronous distributed
systems theory results to the modern protocol design presented here.
Adrian Farrel, Joel Halpern, Jeffrey Zhang and Krzysztof Szarkowicz
provided thoughtful comments that improved the readability of the
document and found a good number of corners where the light failed to
shine. Kris Price was the first to mention single router, single arm
default considerations. Jeff Tantsura helped out with some initial
thoughts on BFD interactions while Jeff Haas corrected several
misconceptions about BFD's finer points. Artur Makutunowicz pointed
out many possible improvements and acted as a sounding board with regard
to the modern protocol implementation techniques RIFT is exploring.
Barak Gafni was the first to clearly formalize the problem of a
partitioned spine on a (clean) napkin in Singapore.
10. References
10.1. Normative References
[I-D.ietf-6lo-rfc6775-update]
Thubert, P., Nordmark, E., Chakrabarti, S., and C.
Perkins, "Registration Extensions for 6LoWPAN Neighbor
Discovery", draft-ietf-6lo-rfc6775-update-21 (work in
progress), June 2018.
[ISO10589]
ISO "International Organization for Standardization",
"Intermediate system to Intermediate system intra-domain
routeing information exchange protocol for use in
conjunction with the protocol for providing the
connectionless-mode Network Service (ISO 8473), ISO/IEC
10589:2002, Second Edition.", Nov 2002.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>.
[RFC2328] Moy, J., "OSPF Version 2", STD 54, RFC 2328,
DOI 10.17487/RFC2328, April 1998,
<https://www.rfc-editor.org/info/rfc2328>.
[RFC2365] Meyer, D., "Administratively Scoped IP Multicast", BCP 23,
RFC 2365, DOI 10.17487/RFC2365, July 1998,
<https://www.rfc-editor.org/info/rfc2365>.
[RFC3626] Clausen, T., Ed. and P. Jacquet, Ed., "Optimized Link
State Routing Protocol (OLSR)", RFC 3626,
DOI 10.17487/RFC3626, October 2003,
<https://www.rfc-editor.org/info/rfc3626>.
[RFC4271] Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A
Border Gateway Protocol 4 (BGP-4)", RFC 4271,
DOI 10.17487/RFC4271, January 2006,
<https://www.rfc-editor.org/info/rfc4271>.
[RFC4291] Hinden, R. and S. Deering, "IP Version 6 Addressing
Architecture", RFC 4291, DOI 10.17487/RFC4291, February
2006, <https://www.rfc-editor.org/info/rfc4291>.
[RFC4655] Farrel, A., Vasseur, J., and J. Ash, "A Path Computation
Element (PCE)-Based Architecture", RFC 4655,
DOI 10.17487/RFC4655, August 2006,
<https://www.rfc-editor.org/info/rfc4655>.
[RFC5082] Gill, V., Heasley, J., Meyer, D., Savola, P., Ed., and C.
Pignataro, "The Generalized TTL Security Mechanism
(GTSM)", RFC 5082, DOI 10.17487/RFC5082, October 2007,
<https://www.rfc-editor.org/info/rfc5082>.
[RFC5120] Przygienda, T., Shen, N., and N. Sheth, "M-ISIS: Multi
Topology (MT) Routing in Intermediate System to
Intermediate Systems (IS-ISs)", RFC 5120,
DOI 10.17487/RFC5120, February 2008,
<https://www.rfc-editor.org/info/rfc5120>.
[RFC5303] Katz, D., Saluja, R., and D. Eastlake 3rd, "Three-Way
Handshake for IS-IS Point-to-Point Adjacencies", RFC 5303,
DOI 10.17487/RFC5303, October 2008,
<https://www.rfc-editor.org/info/rfc5303>.
[RFC5709] Bhatia, M., Manral, V., Fanto, M., White, R., Barnes, M.,
Li, T., and R. Atkinson, "OSPFv2 HMAC-SHA Cryptographic
Authentication", RFC 5709, DOI 10.17487/RFC5709, October
2009, <https://www.rfc-editor.org/info/rfc5709>.
[RFC5881] Katz, D. and D. Ward, "Bidirectional Forwarding Detection
(BFD) for IPv4 and IPv6 (Single Hop)", RFC 5881,
DOI 10.17487/RFC5881, June 2010,
<https://www.rfc-editor.org/info/rfc5881>.
[RFC5905] Mills, D., Martin, J., Ed., Burbank, J., and W. Kasch,
"Network Time Protocol Version 4: Protocol and Algorithms
Specification", RFC 5905, DOI 10.17487/RFC5905, June 2010,
<https://www.rfc-editor.org/info/rfc5905>.
[RFC6234] Eastlake 3rd, D. and T. Hansen, "US Secure Hash Algorithms
(SHA and SHA-based HMAC and HKDF)", RFC 6234,
DOI 10.17487/RFC6234, May 2011,
<https://www.rfc-editor.org/info/rfc6234>.
[RFC6822] Previdi, S., Ed., Ginsberg, L., Shand, M., Roy, A., and D.
Ward, "IS-IS Multi-Instance", RFC 6822,
DOI 10.17487/RFC6822, December 2012,
<https://www.rfc-editor.org/info/rfc6822>.
[RFC7855] Previdi, S., Ed., Filsfils, C., Ed., Decraene, B.,
Litkowski, S., Horneffer, M., and R. Shakir, "Source
Packet Routing in Networking (SPRING) Problem Statement
and Requirements", RFC 7855, DOI 10.17487/RFC7855, May
2016, <https://www.rfc-editor.org/info/rfc7855>.
[RFC7938] Lapukhov, P., Premji, A., and J. Mitchell, Ed., "Use of
BGP for Routing in Large-Scale Data Centers", RFC 7938,
DOI 10.17487/RFC7938, August 2016,
<https://www.rfc-editor.org/info/rfc7938>.
[RFC7987] Ginsberg, L., Wells, P., Decraene, B., Przygienda, T., and
H. Gredler, "IS-IS Minimum Remaining Lifetime", RFC 7987,
DOI 10.17487/RFC7987, October 2016,
<https://www.rfc-editor.org/info/rfc7987>.
[RFC8200] Deering, S. and R. Hinden, "Internet Protocol, Version 6
(IPv6) Specification", STD 86, RFC 8200,
DOI 10.17487/RFC8200, July 2017,
<https://www.rfc-editor.org/info/rfc8200>.
10.2. Informative References
[CLOS] Yuan, X., "On Nonblocking Folded-Clos Networks in Computer
Communication Environments", IEEE International Parallel &
Distributed Processing Symposium, 2011.
[DIJKSTRA]
Dijkstra, E., "A Note on Two Problems in Connexion with
Graphs", Journal Numer. Math. , 1959.
[DOT] Ellson, J. and L. Koutsofios, "Graphviz: open source graph
drawing tools", Springer-Verlag , 2001.
[DYNAMO] De Candia et al., G., "Dynamo: amazon's highly available
key-value store", ACM SIGOPS symposium on Operating
systems principles (SOSP '07), 2007.
[EPPSTEIN]
Eppstein, D., "Finding the k-Shortest Paths", 1997.
[FATTREE] Leiserson, C., "Fat-Trees: Universal Networks for
Hardware-Efficient Supercomputing", 1985.
[I-D.ietf-spring-segment-routing]
Filsfils, C., Previdi, S., Ginsberg, L., Decraene, B.,
Litkowski, S., and R. Shakir, "Segment Routing
Architecture", draft-ietf-spring-segment-routing-15 (work
in progress), January 2018.
[IEEEstd1588]
IEEE, "IEEE Standard for a Precision Clock Synchronization
Protocol for Networked Measurement and Control Systems",
IEEE Standard 1588,
<https://ieeexplore.ieee.org/document/4579760/>.
[IEEEstd8021AS]
IEEE, "IEEE Standard for Local and Metropolitan Area
Networks - Timing and Synchronization for Time-Sensitive
Applications in Bridged Local Area Networks",
IEEE Standard 802.1AS,
<https://ieeexplore.ieee.org/document/5741898/>.
[ISO10589-Second-Edition]
International Organization for Standardization,
"Intermediate system to Intermediate system intra-domain
routeing information exchange protocol for use in
conjunction with the protocol for providing the
connectionless-mode Network Service (ISO 8473)", Nov 2002.
[MAKSIC2013]
Maksic et al., N., "Improving Utilization of Data Center
Networks", IEEE Communications Magazine, Nov 2013.
[PROTOBUF]
Google, Inc., "Protocol Buffers,
https://developers.google.com/protocol-buffers".
[QUIC] Iyengar et al., J., "QUIC: A UDP-Based Multiplexed and
Secure Transport", 2016.
[RFC0826] Plummer, D., "An Ethernet Address Resolution Protocol: Or
Converting Network Protocol Addresses to 48.bit Ethernet
Address for Transmission on Ethernet Hardware", STD 37,
RFC 826, DOI 10.17487/RFC0826, November 1982,
<https://www.rfc-editor.org/info/rfc826>.
[RFC2131] Droms, R., "Dynamic Host Configuration Protocol",
RFC 2131, DOI 10.17487/RFC2131, March 1997,
<https://www.rfc-editor.org/info/rfc2131>.
[RFC3315] Droms, R., Ed., Bound, J., Volz, B., Lemon, T., Perkins,
C., and M. Carney, "Dynamic Host Configuration Protocol
for IPv6 (DHCPv6)", RFC 3315, DOI 10.17487/RFC3315, July
2003, <https://www.rfc-editor.org/info/rfc3315>.
[RFC4861] Narten, T., Nordmark, E., Simpson, W., and H. Soliman,
"Neighbor Discovery for IP version 6 (IPv6)", RFC 4861,
DOI 10.17487/RFC4861, September 2007,
<https://www.rfc-editor.org/info/rfc4861>.
[RFC4862] Thomson, S., Narten, T., and T. Jinmei, "IPv6 Stateless
Address Autoconfiguration", RFC 4862,
DOI 10.17487/RFC4862, September 2007,
<https://www.rfc-editor.org/info/rfc4862>.
[VAHDAT08]
Al-Fares, M., Loukissas, A., and A. Vahdat, "A Scalable,
Commodity Data Center Network Architecture", SIGCOMM ,
2008.
Appendix A. Information Elements Schema
This section introduces the schema for information elements. This section introduces the schema for information elements.
On schema changes that On schema changes that
1. change field numbers or 1. change field numbers or
2. add new required fields or 2. add new required fields or
3. remove fields or 3. remove fields or
4. change lists into sets, unions into structures or 4. change lists into sets, unions into structures or
5. change multiplicity of fields or 5. change multiplicity of fields or
6. changes name of any field 6. changes name of any field or
7. change datatypes of any field or 7. change datatypes of any field or
8. adds or removes a default value of any field or 8. adds, changes or removes a default value of any field or
9. changes default value of any field 9. removes or changes any defined constant or constant value
major version of the schema MUST increase. All other changes MUST major version of the schema MUST increase. All other changes MUST
increase minor version within the same major. increase minor version within the same major.
Observe however that introducing an optional field of a structure
type without a default does not cause a major version increase even
if the fields inside the structure are optional with defaults.
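The versioning rules above can be sketched as a small decision function. This is an illustrative sketch only, following the rule list as it reads in the -02 column; the names `SchemaChange`, `MAJOR_CHANGES` and `next_version` are hypothetical and not part of the schema, and resetting the minor version to 0 on a major bump is an assumption that the text does not state.

```python
from enum import Enum, auto
from typing import Iterable, Tuple


class SchemaChange(Enum):
    """Kinds of schema change; names are hypothetical labels for the rules."""
    CHANGE_FIELD_NUMBER = auto()             # rule 1
    ADD_REQUIRED_FIELD = auto()              # rule 2
    REMOVE_FIELD = auto()                    # rule 3
    LIST_TO_SET_OR_UNION_TO_STRUCT = auto()  # rule 4
    CHANGE_MULTIPLICITY = auto()             # rule 5
    RENAME_FIELD = auto()                    # rule 6
    CHANGE_DATATYPE = auto()                 # rule 7
    CHANGE_DEFAULT_VALUE = auto()            # rule 8: add, change or remove
    CHANGE_CONSTANT = auto()                 # rule 9: remove or change
    ADD_OPTIONAL_FIELD = auto()              # example of an "other" change


# Every listed rule forces a major bump; anything else is a minor bump.
MAJOR_CHANGES = frozenset(SchemaChange) - {SchemaChange.ADD_OPTIONAL_FIELD}


def next_version(major: int, minor: int,
                 changes: Iterable[SchemaChange]) -> Tuple[int, int]:
    """Return the (major, minor) version required after `changes`."""
    changes = list(changes)
    if not changes:
        return (major, minor)
    if any(change in MAJOR_CHANGES for change in changes):
        # Assumption: the minor version restarts at 0 within a new major.
        return (major + 1, 0)
    return (major, minor + 1)
```

For instance, under these assumptions renaming a field in a 10.0 schema yields 11.0, whereas adding an optional field to an 11.0 schema yields 11.1.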
Thrift serializer/deserializer MUST NOT discard optional, unknown Thrift serializer/deserializer MUST NOT discard optional, unknown
fields but preserve and serialize them again when re-flooding whereas fields but preserve and serialize them again when re-flooding whereas
missing optional fields MAY be replaced with the corresponding default values missing optional fields MAY be replaced with the corresponding default values
if present. if present.
All signed integers, as forced by Thrift support, must be cast for All signed integers, as forced by Thrift support, must be cast for
internal purposes to equivalent unsigned values without discarding internal purposes to equivalent unsigned values without discarding
the signedness bit. An implementation SHOULD try to avoid using the the signedness bit. An implementation SHOULD try to avoid using the
signedness bit when generating values. signedness bit when generating values.
The schema is normative. The schema is normative.
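The cast of Thrift's signed integers to unsigned values without discarding the signedness bit can be sketched as a two's-complement reinterpretation. This is a minimal sketch; the helper names `to_unsigned` and `to_signed` are hypothetical and not part of the schema.

```python
import ipaddress


def to_unsigned(value: int, bits: int) -> int:
    """Reinterpret a signed two's-complement integer of `bits` width as unsigned."""
    if not -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
        raise ValueError(f"{value} does not fit in a signed {bits}-bit integer")
    return value & ((1 << bits) - 1)


def to_signed(value: int, bits: int) -> int:
    """Inverse operation: the signed value Thrift would carry on the wire."""
    if not 0 <= value < (1 << bits):
        raise ValueError(f"{value} does not fit in an unsigned {bits}-bit integer")
    return value - (1 << bits) if value >= (1 << (bits - 1)) else value


# 224.0.1.150 has its highest bit set, so it is negative as a Thrift i32.
mcast_group = int(ipaddress.IPv4Address("224.0.1.150"))
wire_value = to_signed(mcast_group, 32)
```

The last two lines illustrate why the -02 schema declares `default_lie_v4_mcast_group` as i64: the same address would be a negative number if carried in an i32.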
8.1. common.thrift A.1. common.thrift
/** /**
Thrift file with common definitions for RIFT Thrift file with common definitions for RIFT
*/ */
/** @note MUST be interpreted in implementation as unsigned 64 bits. /** @note MUST be interpreted in implementation as unsigned 64 bits.
* The implementation SHOULD NOT use the MSB. * The implementation SHOULD NOT use the MSB.
*/ */
typedef i64 SystemIDType typedef i64 SystemIDType
typedef i32 IPv4Address typedef i32 IPv4Address
/** this has to be of length long enough to accommodate prefix */ /** this has to be of length long enough to accommodate prefix */
typedef binary IPv6Address typedef binary IPv6Address
/** @note MUST be interpreted in implementation as unsigned 16 bits */ /** @note MUST be interpreted in implementation as unsigned 16 bits */
typedef i16 UDPPortType typedef i16 UDPPortType
/** @note MUST be interpreted in implementation as unsigned 32 bits */ /** @note MUST be interpreted in implementation as unsigned 32 bits */
typedef i32 TIENrType typedef i32 TIENrType
/** @note MUST be interpreted in implementation as unsigned 32 bits */ /** @note MUST be interpreted in implementation as unsigned 32 bits */
typedef i32 MTUSizeType typedef i32 MTUSizeType
/** @note MUST be interpreted in implementation as unsigned 32 bits */ /** @note MUST be interpreted in implementation as unsigned 32 bits */
typedef i32 SeqNrType typedef i32 SeqNrType
/** @note MUST be interpreted in implementation as unsigned 32 bits */ /** @note MUST be interpreted in implementation as unsigned 32 bits */
typedef i32 LifeTimeInSecType typedef i32 LifeTimeInSecType
/** @note MUST be interpreted in implementation as unsigned 16 bits */ /** @note MUST be interpreted in implementation as unsigned 16 bits */
typedef i16 LevelType typedef i16 LevelType
/** @note MUST be interpreted in implementation as unsigned 32 bits */ /** @note MUST be interpreted in implementation as unsigned 32 bits */
typedef i32 PodType typedef i32 PodType
/** @note MUST be interpreted in implementation as unsigned 16 bits */ /** @note MUST be interpreted in implementation as unsigned 16 bits */
typedef i16 VersionType typedef i16 VersionType
/** @note MUST be interpreted in implementation as unsigned 32 bits */ /** @note MUST be interpreted in implementation as unsigned 32 bits */
typedef i32 MetricType typedef i32 MetricType
/** @note MUST be interpreted in implementation as unstructured 64 bits */ /** @note MUST be interpreted in implementation as unstructured 64 bits */
typedef i64 RouteTagType typedef i64 RouteTagType
/** @note MUST be interpreted in implementation as unstructured 32 bits label value */ /** @note MUST be interpreted in implementation as unstructured 32 bits
typedef i32 LabelType label value */
typedef i32 LabelType
/** @note MUST be interpreted in implementation as unsigned 32 bits */ /** @note MUST be interpreted in implementation as unsigned 32 bits */
typedef i32 BandwithInMegaBitsType typedef i32 BandwithInMegaBitsType
typedef string KeyIDType typedef string KeyIDType
/** node local, unique identification for a link (interface/tunnel /** node local, unique identification for a link (interface/tunnel
* etc. Basically anything RIFT runs on). This is kept * etc. Basically anything RIFT runs on). This is kept
* at 32 bits so it aligns with BFD [RFC5880] discriminator size. * at 32 bits so it aligns with BFD [RFC5880] discriminator size.
*/ */
typedef i32 LinkIDType typedef i32 LinkIDType
typedef string KeyNameType typedef string KeyNameType
typedef i8 PrefixLenType typedef i8 PrefixLenType
/** timestamp in seconds since the epoch */ /** timestamp in seconds since the epoch */
typedef i64 TimestampInSecsType typedef i64 TimestampInSecsType
/** security nonce */ /** security nonce */
typedef i64 NonceType typedef i64 NonceType
/** adjacency holdtime */ /** adjacency holdtime */
typedef i16 HoldTimeInSecType typedef i16 HoldTimeInSecType
/** Transaction ID type for prefix mobility as specified by RFC6550, value /** Transaction ID type for prefix mobility as specified by RFC6550, value
MUST be interpreted in implementation as unsigned */ MUST be interpreted in implementation as unsigned */
typedef i8 PrefixTransactionIDType typedef i8 PrefixTransactionIDType
/** timestamp per IEEE 802.1AS, values MUST be interpreted in implementation as unsigned */ /** timestamp per IEEE 802.1AS, values MUST be interpreted in
implementation as unsigned */
struct IEEE802_1ASTimeStampType { struct IEEE802_1ASTimeStampType {
1: required i64 AS_sec; 1: required i64 AS_sec;
2: optional i32 AS_nsec; 2: optional i32 AS_nsec;
} }
/** Flags indicating nodes behavior in case of ZTP and support /** Flags indicating nodes behavior in case of ZTP and support
for special optimization procedures. It will force level to `leaf_level` for special optimization procedures. It will force level to `leaf_level`
*/ */
enum LeafIndications { enum LeafIndications {
leaf_only =0, leaf_only =0,
skipping to change at page 64, line 50 skipping to change at page 69, line 47
/** by default LIE levels are ZTP offers */ /** by default LIE levels are ZTP offers */
const bool default_not_a_ztp_offer = false const bool default_not_a_ztp_offer = false
/** by default everyone is repeating flooding */ /** by default everyone is repeating flooding */
const bool default_you_are_not_flood_repeater = false const bool default_you_are_not_flood_repeater = false
/** 0 is illegal for SystemID */ /** 0 is illegal for SystemID */
const SystemIDType IllegalSystemID = 0 const SystemIDType IllegalSystemID = 0
/** empty set of nodes */ /** empty set of nodes */
const set<SystemIDType> empty_set_of_nodeids = {} const set<SystemIDType> empty_set_of_nodeids = {}
/** default UDP port to run LIEs on */ /** default UDP port to run LIEs on */
const UDPPortType default_lie_udp_port = 6949 const UDPPortType default_lie_udp_port = 911
const UDPPortType default_tie_udp_flood_port = 6950 /** default UDP port to receive TIEs on, that can be peer specific */
/** default MTU size to use */ const UDPPortType default_tie_udp_flood_port = 912
/** default MTU link size to use */
const MTUSizeType default_mtu_size = 1400 const MTUSizeType default_mtu_size = 1400
/** default mcast is v4 224.0.1.150, we make it i64 to
* help languages struggling with highest bit */
const i64 default_lie_v4_mcast_group = 3758096790
/** indicates whether the direction is northbound/east-west /** indicates whether the direction is northbound/east-west
* or southbound */ * or southbound */
enum TieDirectionType { enum TieDirectionType {
Illegal = 0, Illegal = 0,
South = 1, South = 1,
North = 2, North = 2,
DirectionMaxValue = 3, DirectionMaxValue = 3,
} }
skipping to change at page 65, line 48 skipping to change at page 70, line 44
1: optional IPv4Address ipv4address; 1: optional IPv4Address ipv4address;
2: optional IPv6Address ipv6address; 2: optional IPv6Address ipv6address;
} }
union IPPrefixType { union IPPrefixType {
1: optional IPv4PrefixType ipv4prefix; 1: optional IPv4PrefixType ipv4prefix;
2: optional IPv6PrefixType ipv6prefix; 2: optional IPv6PrefixType ipv6prefix;
} }
/** @note: Sequence of a prefix. Comparison function: /** @note: Sequence of a prefix. Comparison function:
if diff(timestamps) < 200 milliseconds better transactionid wins if diff(timestamps) < 200msecs better transactionid wins
else better time wins else better time wins
*/ */
struct PrefixSequenceType { struct PrefixSequenceType {
1: required IEEE802_1ASTimeStampType timestamp; 1: required IEEE802_1ASTimeStampType timestamp;
2: optional PrefixTransactionIDType transactionid; 2: optional PrefixTransactionIDType transactionid;
} }
enum TIETypeType { enum TIETypeType {
Illegal = 0, Illegal = 0,
TIETypeMinValue = 1, TIETypeMinValue = 1,
/** first legal value */ /** first legal value */
NodeTIEType = 2, NodeTIEType = 2,
PrefixTIEType = 3, PrefixTIEType = 3,
TransitivePrefixTIEType = 4, TransitivePrefixTIEType = 4,
PGPrefixTIEType = 5, PGPrefixTIEType = 5,
KeyValueTIEType = 6, KeyValueTIEType = 6,
TIETypeMaxValue = 7, TIETypeMaxValue = 7,
} }
/** @note: route types which MUST be ordered on their preference /** @note: route types which MUST be ordered on their preference
* PGP prefixes are most preferred attracting * PGP prefixes are most preferred attracting
* traffic north (towards spine) and then south * traffic north (towards spine) and then south
* normal prefixes are attracting traffic south (towards leafs), * normal prefixes are attracting traffic south (towards leafs),
* i.e. prefix in NORTH PREFIX TIE is preferred over SOUTH PREFIX TIE * i.e. prefix in NORTH PREFIX TIE is preferred over SOUTH PREFIX TIE
*
* @todo: external routes
*/ */
enum RouteType { enum RouteType {
Illegal = 0, Illegal = 0,
RouteTypeMinValue = 1, RouteTypeMinValue = 1,
/** First legal value. */ /** First legal value. */
/** Discard routes are most preferred */ /** Discard routes are most preferred */
Discard = 2, Discard = 2,
/** Local prefixes are directly attached prefixes on the /** Local prefixes are directly attached prefixes on the
* system such as e.g. interface routes. * system such as e.g. interface routes.
skipping to change at page 67, line 4 skipping to change at page 72, line 4
/** advertised in N-TIEs */ /** advertised in N-TIEs */
NorthPGPPrefix = 5, NorthPGPPrefix = 5,
/** advertised in N-TIEs */ /** advertised in N-TIEs */
NorthPrefix = 6, NorthPrefix = 6,
/** advertised in S-TIEs */ /** advertised in S-TIEs */
SouthPrefix = 7, SouthPrefix = 7,
/** transitive southbound are least preferred */ /** transitive southbound are least preferred */
TransitiveSouthPrefix = 8, TransitiveSouthPrefix = 8,
RouteTypeMaxValue = 9 RouteTypeMaxValue = 9
} }
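The comparison rule noted for PrefixSequenceType in the schema above can be sketched as follows. This is only an illustration under stated assumptions: "better" is taken to mean the numerically larger timestamp or transaction ID, wrap-around of the 8-bit transaction ID is ignored, and when either transaction ID is absent the comparison falls back to the timestamp. The names `PrefixSequence` and `newer` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

NSEC_PER_SEC = 1_000_000_000
WINDOW_NSEC = 200_000_000  # the 200 millisecond window from the schema comment


@dataclass(frozen=True)
class PrefixSequence:
    """Mirrors PrefixSequenceType: a timestamp and an optional transaction ID."""
    as_sec: int
    as_nsec: int = 0
    transactionid: Optional[int] = None

    def nanoseconds(self) -> int:
        return self.as_sec * NSEC_PER_SEC + self.as_nsec


def newer(a: PrefixSequence, b: PrefixSequence) -> PrefixSequence:
    """Return the sequence that wins the comparison."""
    close = abs(a.nanoseconds() - b.nanoseconds()) < WINDOW_NSEC
    both_ids = a.transactionid is not None and b.transactionid is not None
    if close and both_ids and a.transactionid != b.transactionid:
        # Assumption: the larger transaction ID is the better one.
        return a if a.transactionid > b.transactionid else b
    # Otherwise the later timestamp wins; ties keep the first argument.
    return a if a.nanoseconds() >= b.nanoseconds() else b
```

Within the window the transaction ID decides even if the other timestamp is slightly later, which is what lets a mobile prefix be ordered despite imperfect clock synchronization.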
8.2. encoding.thrift A.2. encoding.thrift
/** /**
Thrift file for packet encodings for RIFT Thrift file for packet encodings for RIFT
*/ */
include "common.thrift" include "common.thrift"
/** represents protocol encoding schema major version */ /** represents protocol encoding schema major version */
const i32 protocol_major_version = 10 const i32 protocol_major_version = 11
/** represents protocol encoding schema minor version */ /** represents protocol encoding schema minor version */
const i32 protocol_minor_version = 0 const i32 protocol_minor_version = 0
/** common RIFT packet header */ /** common RIFT packet header */
struct PacketHeader { struct PacketHeader {
1: required common.VersionType major_version = protocol_major_version; 1: required common.VersionType major_version = protocol_major_version;
2: required common.VersionType minor_version = protocol_minor_version; 2: required common.VersionType minor_version = protocol_minor_version;
/** this is the node sending the packet, in case of LIE/TIRE/TIDE /** this is the node sending the packet, in case of LIE/TIRE/TIDE
also the originator of it */ also the originator of it */
3: required common.SystemIDType sender; 3: required common.SystemIDType sender;
skipping to change at page 67, line 45 skipping to change at page 72, line 45
} }
/** Neighbor structure */ /** Neighbor structure */
struct Neighbor { struct Neighbor {
1: required common.SystemIDType originator; 1: required common.SystemIDType originator;
2: required common.LinkIDType remote_id; 2: required common.LinkIDType remote_id;
} }
/** Capabilities the node supports */ /** Capabilities the node supports */
struct NodeCapabilities { struct NodeCapabilities {
/** can this node participate in flood reduction, /** can this node participate in flood reduction */
only relevant at level > 0 */
1: optional bool flood_reduction = 1: optional bool flood_reduction =
common.flood_reduction_default; common.flood_reduction_default;
/** does this node restrict itself to be leaf only (in ZTP) and /** does this node restrict itself to be leaf only (in ZTP) and
does it support leaf-2-leaf procedures */ does it support leaf-2-leaf procedures */
2: optional common.LeafIndications leaf_indications; 2: optional common.LeafIndications leaf_indications;
} }
/** RIFT LIE packet /** RIFT LIE packet
@note this node's level is already included on the packet header */ @note this node's level is already included on the packet header */
struct LIEPacket { struct LIEPacket {
/** optional node or adjacency name */ /** optional node or adjacency name */
1: optional string name; 1: optional string name;
/** local link ID */ /** local link ID */
2: required common.LinkIDType local_id; 2: required common.LinkIDType local_id;
/** UDP port to which we can receive flooded TIEs */ /** UDP port to which we can receive flooded TIEs */
3: required common.UDPPortType flood_port = 3: required common.UDPPortType flood_port =
common.default_tie_udp_flood_port; common.default_tie_udp_flood_port;
/** layer 3 MTU */ /** layer 3 MTU, used to discover a mismatch */
4: optional common.MTUSizeType link_mtu_size = 4: optional common.MTUSizeType link_mtu_size =
common.default_mtu_size; common.default_mtu_size;
/** this will reflect the neighbor once received to provide /** this will reflect the neighbor once received to provide
3-way connectivity */ 3-way connectivity */
5: optional Neighbor neighbor; 5: optional Neighbor neighbor;
6: optional common.PodType pod = common.default_pod; 6: optional common.PodType pod = common.default_pod;
/** optional nonce used for security computations */ /** optional nonce used for security computations */
7: optional common.NonceType nonce; 7: optional common.NonceType nonce;
/** optional node capabilities shown in the LIE. The capabilities /** optional node capabilities shown in the LIE. The capabilities
MUST match the capabilities shown in the Node TIEs, otherwise MUST match the capabilities shown in the Node TIEs, otherwise
skipping to change at page 70, line 4 skipping to change at page 74, line 50
/** all FFs mark end */ /** all FFs mark end */
2: required TIEID end_range; 2: required TIEID end_range;
/** _sorted_ list of headers */ /** _sorted_ list of headers */
3: required list<TIEHeader> headers; 3: required list<TIEHeader> headers;
} }
/** A TIRE packet */ /** A TIRE packet */
struct TIREPacket { struct TIREPacket {
1: required set<TIEHeader> headers; 1: required set<TIEHeader> headers;
} }
/** Neighbor of a node */ /** Neighbor of a node */
struct NodeNeighborsTIEElement { struct NodeNeighborsTIEElement {
/** Level of neighbor */ /** Level of neighbor */
2: required common.LevelType level; 1: required common.LevelType level;
/** Cost to neighbor. /** Cost to neighbor.
@note: All parallel links to same node @note: All parallel links to same node
incur same cost, in case the neighbor has multiple incur same cost, in case the neighbor has multiple
parallel links at different cost, the largest distance parallel links at different cost, the largest distance
(highest numerical value) MUST be advertised (highest numerical value) MUST be advertised
@note: any neighbor with cost <= 0 MUST be ignored in computations */ @note: any neighbor with cost <= 0 MUST be ignored in computations */
3: optional common.MetricType cost = common.default_distance; 3: optional common.MetricType cost = common.default_distance;
/** can carry description of multiple parallel links in a TIE */ /** can carry description of multiple parallel links in a TIE */
4: optional set<LinkIDPair> link_ids; 4: optional set<LinkIDPair> link_ids;
/** total bandwidth to neighbor, this will normally be the sum of the /** total bandwidth to neighbor, this will normally be the sum of the
* bandwidths of all the parallel links. bandwidths of all the parallel links. */
**/
5: optional common.BandwithInMegaBitsType bandwidth = 5: optional common.BandwithInMegaBitsType bandwidth =
common.default_bandwidth; common.default_bandwidth;
} }
/** Flags the node sets */ /** Flags the node sets */
struct NodeFlags { struct NodeFlags {
/** node is in overload, do not transit traffic through it */ /** node is in overload, do not transit traffic through it */
1: optional bool overload = common.overload_default; 1: optional bool overload = common.overload_default;
} }
skipping to change at page 71, line 26 skipping to change at page 76, line 24
* adjacencies to higher level nodes that this node doesn't see. * adjacencies to higher level nodes that this node doesn't see.
* This may be used in the computation at higher levels to prevent * This may be used in the computation at higher levels to prevent
* blackholing. Ignored in Node S-TIEs if present. * blackholing. Ignored in Node S-TIEs if present.
* Equivalent to |PUL(N) in spec. */ * Equivalent to |PUL(N) in spec. */
7: optional set<common.SystemIDType> same_level_unknown_north_partitions 7: optional set<common.SystemIDType> same_level_unknown_north_partitions
= common.empty_set_of_nodeids; = common.empty_set_of_nodeids;
} }
struct PrefixAttributes { struct PrefixAttributes {
2: required common.MetricType metric = common.default_distance; 2: required common.MetricType metric = common.default_distance;
/** generic unordered set of route tags, can be redistributed to other protocols or used /** generic unordered set of route tags, can be redistributed to
other protocols or used
within the context of real time analytics */ within the context of real time analytics */
3: optional set<common.RouteTagType> tags; 3: optional set<common.RouteTagType> tags;
/** optional monotonic clock for mobile addresses */ /** optional monotonic clock for mobile addresses */
4: optional common.PrefixSequenceType monotonic_clock; 4: optional common.PrefixSequenceType monotonic_clock;
} }
/** multiple prefixes */ /** multiple prefixes */
struct PrefixTIEElement { struct PrefixTIEElement {
/** prefixes with the associated attributes. /** prefixes with the associated attributes.
if the same prefix repeats in multiple TIEs of same node if the same prefix repeats in multiple TIEs of same node
skipping to change at page 72, line 21 skipping to change at page 77, line 20
/** transitive prefixes (always southbound) which SHOULD be propagated /** transitive prefixes (always southbound) which SHOULD be propagated
* southwards towards lower levels to heal * southwards towards lower levels to heal
* pathological upper level partitioning, otherwise * pathological upper level partitioning, otherwise
* blackholes may occur. MUST NOT be advertised within a North TIE. * blackholes may occur. MUST NOT be advertised within a North TIE.
*/ */
3: optional PrefixTIEElement transitive_prefixes; 3: optional PrefixTIEElement transitive_prefixes;
4: optional KeyValueTIEElement keyvalues; 4: optional KeyValueTIEElement keyvalues;
/** @todo: policy guided prefixes */ /** @todo: policy guided prefixes */
} }
/** @todo: flood header separately in UDP to allow caching to TIEs /** @todo: flood header separately in UDP to allow changing lifetime and SHA
while changing lifetime? without reserialization
*/ */
struct TIEPacket { struct TIEPacket {
1: required TIEHeader header; 1: required TIEHeader header;
2: required TIEElement element; 2: required TIEElement element;
} }
union PacketContent { union PacketContent {
1: optional LIEPacket lie; 1: optional LIEPacket lie;
2: optional TIDEPacket tide; 2: optional TIDEPacket tide;
3: optional TIREPacket tire; 3: optional TIREPacket tire;
4: optional TIEPacket tie; 4: optional TIEPacket tie;
} }
/** protocol packet structure */ /** protocol packet structure */
struct ProtocolPacket { struct ProtocolPacket {
1: required PacketHeader header; 1: required PacketHeader header;
2: required PacketContent content; 2: required PacketContent content;
} }
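As a rough illustration of how the `PacketContent` union above constrains a packet, the following Python sketch mirrors the four members and checks that exactly one is populated. The class layout, the `kind()` helper, and the use of `Any` as a stand-in for the real packet types are assumptions made for illustration; they are not part of the schema.

```python
from dataclasses import dataclass, fields
from typing import Any, Optional


@dataclass
class PacketContent:
    """Illustrative mirror of the Thrift union: one member is populated."""
    lie: Optional[Any] = None
    tide: Optional[Any] = None
    tire: Optional[Any] = None
    tie: Optional[Any] = None

    def kind(self) -> str:
        """Return the name of the single populated member (hypothetical helper)."""
        populated = [f.name for f in fields(self)
                     if getattr(self, f.name) is not None]
        if len(populated) != 1:
            raise ValueError(f"expected exactly one member, got {populated}")
        return populated[0]


@dataclass
class ProtocolPacket:
    header: Any              # stands in for PacketHeader
    content: PacketContent
```

A receiver could dispatch on `packet.content.kind()` to hand the payload to the LIE, TIDE, TIRE, or TIE handler.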
9. IANA Considerations Appendix B. Finite State Machines
This specification will request at an opportune time multiple All FSM figures are provided as [DOT] description due to limitations
registry points to exchange protocol packets in a standardized way, of ASCII art.
amongst them multicast address assignments and standard port numbers.
The schema itself defines many values and codepoints which can be
considered registries themselves.
10. Acknowledgments B.1. LIE
Many thanks to Naiming Shen for some of the early discussions around digraph G791bb566f5cf48b09e26193a727dadfd {
the topic of using IGPs for routing in topologies related to Clos. N91ea7c47496746d880c10a5def7874c2[label="TwoWay"][shape="oval"];
Russ White to be especially acknowledged for the key conversation on Nc5d62000e5dc45a9ac1379c28cfda9b3[label="OneWay"][shape="oval"];
epistomology that allowed to tie current asynchronous distributed Nd7b87acca28f4613a68bbc4ef79a3c50[label="Enter"][style="dashed"]
systems theory results to a modern protocol design presented here. [shape="plain"];
Adrian Farrel, Joel Halpern and Jeffrey Zhang provided thoughtful Ne0fb2564cd334a44ad080f73b07cca86[label="ThreeWay"][shape="oval"];
comments that improved the readability of the document and found good N51443826b9c84d8b83cc252b471047c9[label="Enter"][style="invis"]
amount of corners where the light failed to shine. Kris Price was [shape="plain"];
first to mention single router, single arm default considerations. N19343f3f3a9b41c29f3ac23c8dccc179[label="Exit"][style="invis"]
Jeff Tantsura helped out with some initial thoughts on BFD [shape="plain"];
interactions while Jeff Haas corrected several misconceptions about N91ea7c47496746d880c10a5def7874c2 -> Nc5d62000e5dc45a9ac1379c28cfda9b3
BFD's finer points. Artur Makutunowicz pointed out many possible [label="|LevelChanged|"][color="blue"]
improvements and acted as sounding board in regard to modern protocol [arrowhead="normal" dir="both" arrowtail="none"];
implementation techniques RIFT is exploring. Barak Gafni formalized Ne0fb2564cd334a44ad080f73b07cca86 -> Ne0fb2564cd334a44ad080f73b07cca86
first time clearly the problem of partitioned spine on a (clean) [label="|HALChanged|\n|HATChanged|\n|HALSChanged|\n|UpdateZTPOffer|"]
napkin in Singapore. [color="blue"][arrowhead="normal" dir="both" arrowtail="none"];
Nc5d62000e5dc45a9ac1379c28cfda9b3 -> N91ea7c47496746d880c10a5def7874c2
[label="|NewNeighbor|"][color="black"]
[arrowhead="normal" dir="both" arrowtail="none"];
Ne0fb2564cd334a44ad080f73b07cca86 -> N91ea7c47496746d880c10a5def7874c2
[label="|NeighborDroppedReflection|"][color="red"]
[arrowhead="normal" dir="both" arrowtail="none"];
Nc5d62000e5dc45a9ac1379c28cfda9b3 -> Nc5d62000e5dc45a9ac1379c28cfda9b3
[label="|TimerTick|\n|LieRcvd|\n|UnacceptableHeader|\n|HoldtimeExpired|\n|SendLie|"]
[color="black"][arrowhead="normal" dir="both" arrowtail="none"];
N91ea7c47496746d880c10a5def7874c2 -> Ne0fb2564cd334a44ad080f73b07cca86
[label="|ValidReflection|"][color="red"]
[arrowhead="normal" dir="both" arrowtail="none"];
Nd7b87acca28f4613a68bbc4ef79a3c50 -> Nc5d62000e5dc45a9ac1379c28cfda9b3
[label=""]
[color="black"][arrowhead="normal" dir="both" arrowtail="none"];
N91ea7c47496746d880c10a5def7874c2 -> N91ea7c47496746d880c10a5def7874c2
[label="|HALChanged|\n|HATChanged|\n|HALSChanged|\n|UpdateZTPOffer|"]
[color="blue"][arrowhead="normal" dir="both" arrowtail="none"];
Nc5d62000e5dc45a9ac1379c28cfda9b3 -> Nc5d62000e5dc45a9ac1379c28cfda9b3
[label="|LevelChanged|\n|HALChanged|\n|HATChanged|\n|HALSChanged|\n|UpdateZTPOffer|"]
[color="blue"][arrowhead="normal" dir="both" arrowtail="none"];
N91ea7c47496746d880c10a5def7874c2 -> N91ea7c47496746d880c10a5def7874c2
[label="|TimerTick|\n|LieRcvd|\n|SendLie|"][color="black"]
[arrowhead="normal" dir="both" arrowtail="none"];
Ne0fb2564cd334a44ad080f73b07cca86 -> Nc5d62000e5dc45a9ac1379c28cfda9b3
[label="|NeighborChangedLevel|\n|NeighborChangedAddress|\n|UnacceptableHeader|\n|HoldtimeExpired|\n|MultipleNeighbors|"]
[color="black"][arrowhead="normal" dir="both" arrowtail="none"];
Ne0fb2564cd334a44ad080f73b07cca86 -> Nc5d62000e5dc45a9ac1379c28cfda9b3
[label="|LevelChanged|"][color="blue"]
[arrowhead="normal" dir="both" arrowtail="none"];
Ne0fb2564cd334a44ad080f73b07cca86 -> Ne0fb2564cd334a44ad080f73b07cca86
[label="|TimerTick|\n|LieRcvd|\n|SendLie|"][color="black"]
[arrowhead="normal" dir="both" arrowtail="none"];
N91ea7c47496746d880c10a5def7874c2 -> Nc5d62000e5dc45a9ac1379c28cfda9b3
[label="|NeighborChangedLevel|\n|NeighborChangedAddress|\n|UnacceptableHeader|\n|HoldtimeExpired|\n|MultipleNeighbors|"]
[color="black"][arrowhead="normal" dir="both" arrowtail="none"];
}
LIE FSM DOT
11. References Events
11.1. Normative References o TimerTick: one second timer tick
[I-D.ietf-6lo-rfc6775-update] o LevelChanged: node's level has been changed by ZTP or
Thubert, P., Nordmark, E., Chakrabarti, S., and C. configuration
Perkins, "Registration Extensions for 6LoWPAN Neighbor
Discovery", draft-ietf-6lo-rfc6775-update-19 (work in
progress), April 2018.
[ISO10589] o HALChanged: best HAL computed by ZTP has changed
ISO "International Organization for Standardization",
"Intermediate system to Intermediate system intra-domain
routeing information exchange protocol for use in
conjunction with the protocol for providing the
connectionless-mode Network Service (ISO 8473), ISO/IEC
10589:2002, Second Edition.", Nov 2002.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate o HATChanged: HAT computed by ZTP has changed
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>.
[RFC2328] Moy, J., "OSPF Version 2", STD 54, RFC 2328, o HALSChanged: set of HAL offering systems computed by ZTP has
DOI 10.17487/RFC2328, April 1998, changed
<https://www.rfc-editor.org/info/rfc2328>.
[RFC2365] Meyer, D., "Administratively Scoped IP Multicast", BCP 23, o LieRcvd: received LIE
RFC 2365, DOI 10.17487/RFC2365, July 1998,
<https://www.rfc-editor.org/info/rfc2365>.
[RFC3626] Clausen, T., Ed. and P. Jacquet, Ed., "Optimized Link o NewNeighbor: new neighbor parsed
State Routing Protocol (OLSR)", RFC 3626,
DOI 10.17487/RFC3626, October 2003,
<https://www.rfc-editor.org/info/rfc3626>.
[RFC4271] Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A o ValidReflection: received own reflection from neighbor
Border Gateway Protocol 4 (BGP-4)", RFC 4271,
DOI 10.17487/RFC4271, January 2006,
<https://www.rfc-editor.org/info/rfc4271>.
[RFC4291] Hinden, R. and S. Deering, "IP Version 6 Addressing o NeighborDroppedReflection: lost previous own reflection from
Architecture", RFC 4291, DOI 10.17487/RFC4291, February neighbor
2006, <https://www.rfc-editor.org/info/rfc4291>.
[RFC4655] Farrel, A., Vasseur, J., and J. Ash, "A Path Computation o NeighborChangedLevel: neighbor changed advertised level
Element (PCE)-Based Architecture", RFC 4655,
DOI 10.17487/RFC4655, August 2006,
<https://www.rfc-editor.org/info/rfc4655>.
[RFC5120] Przygienda, T., Shen, N., and N. Sheth, "M-ISIS: Multi o NeighborChangedAddress: neighbor changed IP address
Topology (MT) Routing in Intermediate System to
Intermediate Systems (IS-ISs)", RFC 5120,
DOI 10.17487/RFC5120, February 2008,
<https://www.rfc-editor.org/info/rfc5120>.
[RFC5303] Katz, D., Saluja, R., and D. Eastlake 3rd, "Three-Way o UnacceptableHeader: unacceptable header seen
Handshake for IS-IS Point-to-Point Adjacencies", RFC 5303,
DOI 10.17487/RFC5303, October 2008,
<https://www.rfc-editor.org/info/rfc5303>.
[RFC5709] Bhatia, M., Manral, V., Fanto, M., White, R., Barnes, M., o HoldtimeExpired: adjacency hold down expired
Li, T., and R. Atkinson, "OSPFv2 HMAC-SHA Cryptographic
Authentication", RFC 5709, DOI 10.17487/RFC5709, October
2009, <https://www.rfc-editor.org/info/rfc5709>.
[RFC5881] Katz, D. and D. Ward, "Bidirectional Forwarding Detection o MultipleNeighbors: more than one neighbor seen on interface
(BFD) for IPv4 and IPv6 (Single Hop)", RFC 5881,
DOI 10.17487/RFC5881, June 2010,
<https://www.rfc-editor.org/info/rfc5881>.
[RFC5905] Mills, D., Martin, J., Ed., Burbank, J., and W. Kasch, o LIECorrupt: corrupted LIE seen
"Network Time Protocol Version 4: Protocol and Algorithms
Specification", RFC 5905, DOI 10.17487/RFC5905, June 2010,
<https://www.rfc-editor.org/info/rfc5905>.
[RFC6234] Eastlake 3rd, D. and T. Hansen, "US Secure Hash Algorithms o SendLie: send a LIE out
(SHA and SHA-based HMAC and HKDF)", RFC 6234,
DOI 10.17487/RFC6234, May 2011,
<https://www.rfc-editor.org/info/rfc6234>.
[RFC6822] Previdi, S., Ed., Ginsberg, L., Shand, M., Roy, A., and D. o UpdateZTPOffer: update this node's ZTP offer
Ward, "IS-IS Multi-Instance", RFC 6822,
DOI 10.17487/RFC6822, December 2012,
<https://www.rfc-editor.org/info/rfc6822>.
[RFC7855] Previdi, S., Ed., Filsfils, C., Ed., Decraene, B., Actions
Litkowski, S., Horneffer, M., and R. Shakir, "Source
Packet Routing in Networking (SPRING) Problem Statement
and Requirements", RFC 7855, DOI 10.17487/RFC7855, May
2016, <https://www.rfc-editor.org/info/rfc7855>.
[RFC7938] Lapukhov, P., Premji, A., and J. Mitchell, Ed., "Use of on UpdateZTPOffer in TwoWay finishes in TwoWay: send offer to ZTP
BGP for Routing in Large-Scale Data Centers", RFC 7938, FSM
DOI 10.17487/RFC7938, August 2016,
<https://www.rfc-editor.org/info/rfc7938>.
[RFC7987] Ginsberg, L., Wells, P., Decraene, B., Przygienda, T., and on HALChanged in OneWay finishes in OneWay: store new HAL
H. Gredler, "IS-IS Minimum Remaining Lifetime", RFC 7987, on HALChanged in ThreeWay finishes in ThreeWay: store new HAL
DOI 10.17487/RFC7987, October 2016,
<https://www.rfc-editor.org/info/rfc7987>.
[RFC8200] Deering, S. and R. Hinden, "Internet Protocol, Version 6 on HoldtimeExpired in OneWay finishes in OneWay: no action
(IPv6) Specification", STD 86, RFC 8200,
DOI 10.17487/RFC8200, July 2017,
<https://www.rfc-editor.org/info/rfc8200>.
11.2. Informative References on UpdateZTPOffer in OneWay finishes in OneWay: send offer to ZTP
FSM
[CLOS] Yuan, X., "On Nonblocking Folded-Clos Networks in Computer on LevelChanged in ThreeWay finishes in OneWay: update level with
Communication Environments", IEEE International Parallel & event value
Distributed Processing Symposium, 2011.
[DIJKSTRA] on MultipleNeighbors in TwoWay finishes in OneWay: no action
Dijkstra, E., "A Note on Two Problems in Connexion with
Graphs", Journal Numer. Math. , 1959.
[DYNAMO] De Candia et al., G., "Dynamo: amazon's highly available on NeighborChangedLevel in TwoWay finishes in OneWay: no action
key-value store", ACM SIGOPS symposium on Operating
systems principles (SOSP '07), 2007.
[EPPSTEIN] on HATChanged in OneWay finishes in OneWay: store HAT
Eppstein, D., "Finding the k-Shortest Paths", 1997.
[FATTREE] Leiserson, C., "Fat-Trees: Universal Networks for on HATChanged in ThreeWay finishes in ThreeWay: store HAT
Hardware-Efficient Supercomputing", 1985.
[I-D.ietf-spring-segment-routing] on MultipleNeighbors in ThreeWay finishes in OneWay: no action
Filsfils, C., Previdi, S., Ginsberg, L., Decraene, B.,
Litkowski, S., and R. Shakir, "Segment Routing
Architecture", draft-ietf-spring-segment-routing-15 (work
in progress), January 2018.
[IEEEstd1588] on SendLie in ThreeWay finishes in ThreeWay: SENDLIE
IEEE, "IEEE Standard for a Precision Clock Synchronization
Protocol for Networked Measurement and Control Systems",
IEEE Standard 1588,
<https://ieeexplore.ieee.org/document/4579760/>.
[IEEEstd8021AS] on TimerTick in TwoWay finishes in TwoWay: PUSH SendLie event, if
IEEE, "IEEE Standard for Local and Metropolitan Area holdtime expired PUSH HoldtimeExpired event
Networks - Timing and Synchronization for Time-Sensitive
Applications in Bridged Local Area Networks",
IEEE Standard 802.1AS,
<https://ieeexplore.ieee.org/document/5741898/>.
[ISO10589-Second-Edition] on HALSChanged in OneWay finishes in OneWay: store HALS
International Organization for Standardization,
"Intermediate system to Intermediate system intra-domain
routeing information exchange protocol for use in
conjunction with the protocol for providing the
connectionless-mode Network Service (ISO 8473)", Nov 2002.
[MAKSIC2013] on SendLie in OneWay finishes in OneWay: SENDLIE
Maksic et al., N., "Improving Utilization of Data Center
Networks", IEEE Communications Magazine, Nov 2013.
[PROTOBUF] on LevelChanged in TwoWay finishes in OneWay: update level with
Google, Inc., "Protocol Buffers, event value
https://developers.google.com/protocol-buffers".
[QUIC] Iyengar et al., J., "QUIC: A UDP-Based Multiplexed and on LieRcvd in TwoWay finishes in TwoWay: PROCESS_LIE
Secure Transport", 2016.
[RFC0826] Plummer, D., "An Ethernet Address Resolution Protocol: Or on HALSChanged in ThreeWay finishes in ThreeWay: store HALS
Converting Network Protocol Addresses to 48.bit Ethernet
Address for Transmission on Ethernet Hardware", STD 37,
RFC 826, DOI 10.17487/RFC0826, November 1982,
<https://www.rfc-editor.org/info/rfc826>.
[RFC2131] Droms, R., "Dynamic Host Configuration Protocol", on UpdateZTPOffer in ThreeWay finishes in ThreeWay: send offer to
RFC 2131, DOI 10.17487/RFC2131, March 1997, ZTP FSM
<https://www.rfc-editor.org/info/rfc2131>.
[RFC3315] Droms, R., Ed., Bound, J., Volz, B., Lemon, T., Perkins, on HALSChanged in TwoWay finishes in TwoWay: store HALS
C., and M. Carney, "Dynamic Host Configuration Protocol
for IPv6 (DHCPv6)", RFC 3315, DOI 10.17487/RFC3315, July
2003, <https://www.rfc-editor.org/info/rfc3315>.
[RFC4861] Narten, T., Nordmark, E., Simpson, W., and H. Soliman, on LieRcvd in OneWay finishes in OneWay: PROCESS_LIE
"Neighbor Discovery for IP version 6 (IPv6)", RFC 4861,
DOI 10.17487/RFC4861, September 2007,
<https://www.rfc-editor.org/info/rfc4861>.
[RFC4862] Thomson, S., Narten, T., and T. Jinmei, "IPv6 Stateless on NeighborChangedLevel in ThreeWay finishes in OneWay: no action
Address Autoconfiguration", RFC 4862,
DOI 10.17487/RFC4862, September 2007,
<https://www.rfc-editor.org/info/rfc4862>.
[VAHDAT08] on HoldtimeExpired in TwoWay finishes in OneWay: no action
Al-Fares, M., Loukissas, A., and A. Vahdat, "A Scalable, on TimerTick in ThreeWay finishes in ThreeWay: PUSH SendLie event,
Commodity Data Center Network Architecture", SIGCOMM , if holdtime expired PUSH HoldtimeExpired event
2008.
Authors' Addresses on UnacceptableHeader in ThreeWay finishes in OneWay: no action
on SendLie in TwoWay finishes in TwoWay: SENDLIE
on LevelChanged in OneWay finishes in OneWay: update level with
event value, PUSH SendLie event
on NeighborChangedAddress in ThreeWay finishes in OneWay: no
action
on HALChanged in TwoWay finishes in TwoWay: store new HAL
on NewNeighbor in OneWay finishes in TwoWay: PUSH SendLie event
on ValidReflection in TwoWay finishes in ThreeWay: no action
on UnacceptableHeader in TwoWay finishes in OneWay: no action
on LieRcvd in ThreeWay finishes in ThreeWay: PROCESS_LIE
on NeighborDroppedReflection in ThreeWay finishes in TwoWay: no
action
on NeighborChangedAddress in TwoWay finishes in OneWay: no action
on HoldtimeExpired in ThreeWay finishes in OneWay: no action
on HATChanged in TwoWay finishes in TwoWay: store HAT
on UnacceptableHeader in OneWay finishes in OneWay: no action
on TimerTick in OneWay finishes in OneWay: PUSH SendLie event
on Entry into OneWay: CLEANUP and then process event SendLie
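The action listing above can be read as a transition table plus an event queue. The Python sketch below encodes only the state changes and the PUSH SendLie behaviour; the hold-time check, PROCESS_LIE, and the stores of HAL/HAT/HALS are omitted. The class name, the counter `lies_sent`, and the choice to run the OneWay entry action only on an actual state change are assumptions, not statements from the draft.

```python
from collections import deque

ONE, TWO, THREE = "OneWay", "TwoWay", "ThreeWay"

# Events that keep the FSM in its current state in all three states.
SELF_LOOPS = ["TimerTick", "LieRcvd", "SendLie", "HALChanged",
              "HATChanged", "HALSChanged", "UpdateZTPOffer"]
# Events that drop TwoWay and ThreeWay back to OneWay.
DROP_TO_ONE = ["LevelChanged", "NeighborChangedLevel",
               "NeighborChangedAddress", "UnacceptableHeader",
               "HoldtimeExpired", "MultipleNeighbors"]

TRANSITIONS = {}
for state in (ONE, TWO, THREE):
    for ev in SELF_LOOPS:
        TRANSITIONS[(state, ev)] = state
for state in (TWO, THREE):
    for ev in DROP_TO_ONE:
        TRANSITIONS[(state, ev)] = ONE
for ev in ("LevelChanged", "UnacceptableHeader", "HoldtimeExpired"):
    TRANSITIONS[(ONE, ev)] = ONE
TRANSITIONS[(ONE, "NewNeighbor")] = TWO
TRANSITIONS[(TWO, "ValidReflection")] = THREE
TRANSITIONS[(THREE, "NeighborDroppedReflection")] = TWO


class LieFsm:
    """Minimal sketch of the LIE FSM; not a complete implementation."""

    def __init__(self):
        self.state = ONE
        self.neighbor = None
        self.lies_sent = 0        # hypothetical counter standing in for SENDLIE
        self.queue = deque()

    def push(self, event):
        """PUSH: queue an event to run after the current action finishes."""
        self.queue.append(event)

    def run(self):
        while self.queue:
            event = self.queue.popleft()
            nxt = TRANSITIONS.get((self.state, event))
            if nxt is None:
                continue          # event not listed for this state
            if event in ("TimerTick", "NewNeighbor"):
                self.push("SendLie")
            elif event == "SendLie":
                self.lies_sent += 1
            entering_one_way = nxt == ONE and self.state != ONE
            self.state = nxt
            if entering_one_way:
                self.neighbor = None               # CLEANUP
                self.queue.appendleft("SendLie")   # process SendLie next
```

For example, pushing `NewNeighbor` on a fresh instance and calling `run()` leaves the FSM in TwoWay with one LIE sent.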
The following terms denote well-known procedures:
1. PUSH Event: pushes an event to be executed by the FSM upon exit
of this action
2. CLEANUP: neighbor MUST be reset to unknown
3. SENDLIE: create a new LIE packet
1. reflecting the neighbor if known and valid and
2. setting the necessary `not_a_ztp_offer` variable if level was
derived from last known neighbor on this interface and
3. setting `you_are_not_flood_repeater` to computed value
4. PROCESS_LIE:
1. if lie has wrong major version OR our own system ID or
invalid system ID then CLEANUP else
2. if lie has undefined level OR my level is undefined OR this
node is leaf and remote level lower than HAT OR (lie's level
is not leaf AND its difference is more than one from my
level) then CLEANUP, PUSH UpdateZTPOffer, PUSH
UnacceptableHeader else
3. PUSH UpdateZTPOffer, construct temporary new neighbor
structure with values from lie, if no current neighbor exists
then set neighbor to new neighbor, PUSH NewNeighbor event,
CHECK_THREE_WAY else
1. if current neighbor system ID differs from lie's system
ID then PUSH MultipleNeighbors else
2. if current neighbor stored level differs from lie's level
then PUSH NeighborChangedLevel else
3. if current neighbor stored IPv4/v6 address differs from
lie's address then PUSH NeighborChangedAddress else
4. if any of neighbor's flood address port, name, local
linkid changed then PUSH NeighborChangedMinorFields and
5. CHECK_THREE_WAY
5. CHECK_THREE_WAY: if current state is one-way do nothing else
1. if lie packet does not contain neighbor then if current state
is three-way then PUSH NeighborDroppedReflection else
2. if packet reflects this system's ID and local port and state
is three-way then PUSH event ValidReflection else PUSH event
MultipleNeighbors
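The first two steps of PROCESS_LIE amount to a header acceptance check. The sketch below returns the procedures and events the text calls for in each case. The constants `LEAF_LEVEL = 0`, `SUPPORTED_MAJOR = 1`, and `INVALID_SYSTEM_ID = 0`, as well as the function name and parameters, are assumptions made for illustration; neighbor comparison and CHECK_THREE_WAY are left out.

```python
from typing import List, Optional

LEAF_LEVEL = 0          # assumption: leaf level encoded as 0
SUPPORTED_MAJOR = 1     # hypothetical major version
INVALID_SYSTEM_ID = 0   # hypothetical "invalid" system ID value


def process_lie_header(major: int, sender_id: int, lie_level: Optional[int],
                       *, my_id: int, my_level: Optional[int],
                       hat: Optional[int], i_am_leaf: bool) -> List[str]:
    """Return the procedures/events for steps 1 and 2 of PROCESS_LIE."""
    reject = ["CLEANUP", "PUSH UpdateZTPOffer", "PUSH UnacceptableHeader"]

    # Step 1: wrong major version, our own system ID, or invalid system ID.
    if (major != SUPPORTED_MAJOR or sender_id == my_id
            or sender_id == INVALID_SYSTEM_ID):
        return ["CLEANUP"]

    # Step 2: undefined levels or levels that cannot form an adjacency.
    if lie_level is None or my_level is None:
        return reject
    if i_am_leaf and hat is not None and lie_level < hat:
        return reject
    if lie_level != LEAF_LEVEL and abs(lie_level - my_level) > 1:
        return reject

    # Step 3 begins here: the LIE is acceptable so far.
    return ["PUSH UpdateZTPOffer"]
```

Returning names rather than executing actions keeps the check easy to test in isolation; a real implementation would feed these into the FSM's event queue.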
B.2. ZTP
digraph G04743cd825cc40c5b93de0616ffb851b {
N29e7db3976644f62b6f3b2801bccb854[label="Enter"]
[style="dashed"][shape="plain"];
N33df4993a1664be18a2196001c27a64c[label="HoldingDown"][shape="oval"];
N839f77189e324c82b21b8a709b4b021d[label="ComputeBestOffer"][shape="oval"];
Nc97f2b02808d4751afcc630687bf7421[label="UpdatingClients"][shape="oval"];
N7ad21867360c44709be20a99f33dd1f7[label="Enter"]
[style="dashed"][shape="plain"];
N33df4993a1664be18a2196001c27a64c -> N33df4993a1664be18a2196001c27a64c
[label="|ComputationDone|"][color="green"]
[arrowhead="normal" dir="both" arrowtail="none"];
N29e7db3976644f62b6f3b2801bccb854 -> Nc97f2b02808d4751afcc630687bf7421
[label=""]
[color="black"][arrowhead="normal" dir="both" arrowtail="none"];
N839f77189e324c82b21b8a709b4b021d -> N839f77189e324c82b21b8a709b4b021d
[label="|NeighborOffer|\n|WithdrawNeighborOffer|"]
[color="blue"][arrowhead="normal" dir="both" arrowtail="none"];
N33df4993a1664be18a2196001c27a64c -> N839f77189e324c82b21b8a709b4b021d
[label="|ChangeLocalLeafIndications|\n|ChangeLocalConfiguredLevel|"]
[color="gold"]
[arrowhead="normal" dir="both" arrowtail="none"];
N839f77189e324c82b21b8a709b4b021d -> N839f77189e324c82b21b8a709b4b021d
[label="|BetterHAL|\n|BetterHAT|\n|LostHAT|"]
[color="red"][arrowhead="normal" dir="both" arrowtail="none"];
N33df4993a1664be18a2196001c27a64c -> N33df4993a1664be18a2196001c27a64c
[label="|NeighborOffer|\n|WithdrawNeighborOffer|"][color="blue"]
[arrowhead="normal" dir="both" arrowtail="none"];
Nc97f2b02808d4751afcc630687bf7421 -> N839f77189e324c82b21b8a709b4b021d
[label="|BetterHAL|\n|BetterHAT|\n|LostHAT|"][color="red"]
[arrowhead="normal" dir="both" arrowtail="none"];
N33df4993a1664be18a2196001c27a64c -> N33df4993a1664be18a2196001c27a64c
[label="|ShortTic|"][color="black"][arrowhead="normal" dir="both"
arrowtail="none"];
Nc97f2b02808d4751afcc630687bf7421 -> Nc97f2b02808d4751afcc630687bf7421
[label="|NeighborOffer|\n|WithdrawNeighborOffer|"][color="blue"]
[arrowhead="normal" dir="both" arrowtail="none"];
N33df4993a1664be18a2196001c27a64c -> N33df4993a1664be18a2196001c27a64c
[label="|BetterHAL|\n|BetterHAT|\n|LostHAL|\n|LostHAT|"][color="red"]
[arrowhead="normal" dir="both" arrowtail="none"];
N839f77189e324c82b21b8a709b4b021d -> N33df4993a1664be18a2196001c27a64c
[label="|LostHAL|"][color="red"][arrowhead="normal" dir="both"
arrowtail="none"];
N7ad21867360c44709be20a99f33dd1f7 -> N839f77189e324c82b21b8a709b4b021d
[label=""][color="black"][arrowhead="normal" dir="both" arrowtail="none"];
N839f77189e324c82b21b8a709b4b021d -> Nc97f2b02808d4751afcc630687bf7421
[label="|ComputationDone|"][color="green"][arrowhead="normal" dir="both"
arrowtail="none"];
N839f77189e324c82b21b8a709b4b021d -> N839f77189e324c82b21b8a709b4b021d
[label="|ChangeLocalLeafIndications|\n|ChangeLocalConfiguredLevel|"]
[color="gold"]
[arrowhead="normal" dir="both" arrowtail="none"];
Nc97f2b02808d4751afcc630687bf7421 -> N33df4993a1664be18a2196001c27a64c
[label="|LostHAL|"]
[color="red"][arrowhead="normal" dir="both" arrowtail="none"];
N33df4993a1664be18a2196001c27a64c -> N839f77189e324c82b21b8a709b4b021d
[label="|HoldDownExpired|"][color="green"][arrowhead="normal" dir="both"
arrowtail="none"];
Nc97f2b02808d4751afcc630687bf7421 -> N839f77189e324c82b21b8a709b4b021d
[label="|ChangeLocalLeafIndications|\n|ChangeLocalConfiguredLevel|"]
[color="gold"]
[arrowhead="normal" dir="both" arrowtail="none"];
}
ZTP FSM DOT
Events
o ChangeLocalLeafIndications: node configured with new leaf flags
o ChangeLocalConfiguredLevel: node locally configured with a defined
level
o NeighborOffer: a new neighbor offer with optional level and
neighbor state
o WithdrawNeighborOffer: a neighbor's offer withdrawn
o BetterHAL: better HAL computed internally
o BetterHAT: better HAT computed internally
o LostHAL: lost last HAL in computation
o LostHAT: lost HAT in computation
o ComputationDone: computation performed
o HoldDownExpired: holddown expired
Actions
on LostHAT in ComputeBestOffer finishes in ComputeBestOffer:
LEVEL_COMPUTE
on LostHAT in HoldingDown finishes in HoldingDown: no action
on LostHAL in HoldingDown finishes in HoldingDown:
on ChangeLocalLeafIndications in UpdatingClients finishes in
ComputeBestOffer: store leaf flags
on LostHAT in UpdatingClients finishes in ComputeBestOffer: no
action
on BetterHAT in HoldingDown finishes in HoldingDown: no action
on NeighborOffer in ComputeBestOffer finishes in ComputeBestOffer:
if no level offered REMOVE_OFFER else
if level > leaf then UPDATE_OFFER else REMOVE_OFFER
on BetterHAT in UpdatingClients finishes in ComputeBestOffer: no
action
on ChangeLocalConfiguredLevel in HoldingDown finishes in
ComputeBestOffer: store level
on BetterHAL in ComputeBestOffer finishes in ComputeBestOffer:
LEVEL_COMPUTE
on HoldDownExpired in HoldingDown finishes in ComputeBestOffer:
PURGE_OFFERS
on ShortTic in HoldingDown finishes in HoldingDown: if holddown
timer expired PUSH HoldDownExpired
on ComputationDone in ComputeBestOffer finishes in
UpdatingClients: no action
on LostHAL in UpdatingClients finishes in HoldingDown: if any
southbound adjacencies present update holddown timer to normal
duration else fire holddown timer immediately
on NeighborOffer in UpdatingClients finishes in UpdatingClients:
if no level offered REMOVE_OFFER else
if level > leaf then UPDATE_OFFER else REMOVE_OFFER
on ChangeLocalConfiguredLevel in ComputeBestOffer finishes in
ComputeBestOffer: store level and LEVEL_COMPUTE
on NeighborOffer in HoldingDown finishes in HoldingDown:
if no level offered REMOVE_OFFER else
if level > leaf then UPDATE_OFFER else REMOVE_OFFER
on LostHAL in ComputeBestOffer finishes in HoldingDown: if any
southbound adjacencies present update holddown timer to normal
duration else fire holddown timer immediately
on BetterHAT in ComputeBestOffer finishes in ComputeBestOffer:
LEVEL_COMPUTE
on WithdrawNeighborOffer in ComputeBestOffer finishes in
ComputeBestOffer: REMOVE_OFFER
on ChangeLocalLeafIndications in ComputeBestOffer finishes in
ComputeBestOffer: store leaf flags and LEVEL_COMPUTE
on BetterHAL in HoldingDown finishes in HoldingDown: no action
on WithdrawNeighborOffer in HoldingDown finishes in HoldingDown:
REMOVE_OFFER
on ChangeLocalLeafIndications in HoldingDown finishes in
ComputeBestOffer: store leaf flags
on ChangeLocalConfiguredLevel in UpdatingClients finishes in
ComputeBestOffer: store level
on ComputationDone in HoldingDown finishes in HoldingDown:
on BetterHAL in UpdatingClients finishes in ComputeBestOffer: no
action
on WithdrawNeighborOffer in UpdatingClients finishes in
UpdatingClients: REMOVE_OFFER
on Entry into UpdatingClients: update all LIE FSMs with
computation results
on Entry into ComputeBestOffer: LEVEL_COMPUTE
The following terms denote well-known procedures:
1. PUSH Event: pushes an event to be executed by the FSM upon exit
of this action
2. COMPARE_OFFERS: checks whether based on current offers and held
last results the events BetterHAL/LostHAL/BetterHAT/LostHAT are
necessary and returns them
3. UPDATE_OFFER: store current offer and COMPARE_OFFERS, PUSH
the corresponding events
4. LEVEL_COMPUTE: compute best offered or configured level and HAL/
HAT, if anything changed PUSH ComputationDone
5. REMOVE_OFFER: remove the corresponding offer and COMPARE_OFFERS,
PUSH the corresponding events
6. PURGE_OFFERS: REMOVE_OFFER for all held offers, COMPARE_OFFERS,
PUSH the corresponding events
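The offer-handling procedures above can be sketched as a small table keyed by neighbor. In this Python illustration, HAL is taken to be the highest offered level and HAT the highest level offered over a three-way adjacency; any change to a non-empty value is reported as Better* and a change to "none" as Lost*. Those readings, the class name `OfferTable`, and `LEAF_LEVEL = 0` are assumptions, not definitions taken from this appendix.

```python
LEAF_LEVEL = 0   # assumption: leaf level encoded as 0


class OfferTable:
    """Sketch of UPDATE_OFFER / REMOVE_OFFER / COMPARE_OFFERS."""

    def __init__(self):
        self.offers = {}     # system ID -> (level, is_three_way)
        self.hal = None
        self.hat = None

    def neighbor_offer(self, system_id, level, three_way=False):
        """NeighborOffer action: keep offers above leaf, drop the rest."""
        if level is None or level <= LEAF_LEVEL:
            return self.remove_offer(system_id)
        self.offers[system_id] = (level, three_way)   # UPDATE_OFFER
        return self.compare_offers()

    def remove_offer(self, system_id):
        """REMOVE_OFFER: forget the offer, then re-evaluate."""
        self.offers.pop(system_id, None)
        return self.compare_offers()

    def compare_offers(self):
        """COMPARE_OFFERS: return the events implied by the held offers."""
        new_hal = max((lvl for lvl, _ in self.offers.values()), default=None)
        new_hat = max((lvl for lvl, tw in self.offers.values() if tw),
                      default=None)
        events = (self._diff("HAL", self.hal, new_hal)
                  + self._diff("HAT", self.hat, new_hat))
        self.hal, self.hat = new_hal, new_hat
        return events

    @staticmethod
    def _diff(name, old, new):
        if new == old:
            return []
        return ["Lost" + name] if new is None else ["Better" + name]
```

The returned event names would be pushed into the ZTP FSM, which then decides between ComputeBestOffer and HoldingDown as listed in the actions.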
Appendix C. Constants
C.1. Configurable Protocol Constants
+-----------------+--------------+----------------------------------+
| | Type | Value |
+-----------------+--------------+----------------------------------+
| LIE IPv4 | Default | 224.0.0.120 or all-rift-routers |
| Multicast | Value, | to be assigned in IPv4 Multicast |
| Address | Configurable | Address Space Registry in Local |
| | | Network Control Block |
+-----------------+--------------+----------------------------------+
| LIE IPv6 | Default | FF02::0078 or all-rift-routers |
| Multicast | Value, | to be assigned in IPv6 Multicast |
| Address | Configurable | Address Assignments |
+-----------------+--------------+----------------------------------+
| LIE Destination | Default | 911 |
| Port | Value, | |
| | Configurable | |
+-----------------+--------------+----------------------------------+
| Level value for | Constant | 24 |
| SUPERSPINE_FLAG | | |
+-----------------+--------------+----------------------------------+
Table 5: all_constants
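As a hedged illustration of how an implementation might carry the defaults from the table above, the following Python sketch stores them in a frozen dataclass and sanity-checks configured overrides. The class name, field names, and `validate` function are hypothetical; only the values come from the table.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class RiftDefaults:
    lie_ipv4_group: str = "224.0.0.120"   # default, configurable
    lie_ipv6_group: str = "FF02::0078"    # default, configurable
    lie_port: int = 911                   # default, configurable
    superspine_level: int = 24            # constant per the table


def validate(cfg: RiftDefaults) -> None:
    """Raise ValueError if a configured value cannot be used for LIEs."""
    v4 = ipaddress.ip_address(cfg.lie_ipv4_group)
    v6 = ipaddress.ip_address(cfg.lie_ipv6_group)
    if v4.version != 4 or not v4.is_multicast:
        raise ValueError("LIE IPv4 address must be an IPv4 multicast group")
    if v6.version != 6 or not v6.is_multicast:
        raise ValueError("LIE IPv6 address must be an IPv6 multicast group")
    if not 0 < cfg.lie_port < 65536:
        raise ValueError("LIE destination port out of range")
```

Calling `validate(RiftDefaults())` succeeds with the table's defaults, while a unicast override such as `RiftDefaults(lie_ipv4_group="10.0.0.1")` is rejected.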
Authors' Addresses
Tony Przygienda (editor) Tony Przygienda (editor)
Juniper Networks Juniper Networks
1194 N. Mathilda Ave 1194 N. Mathilda Ave
Sunnyvale, CA 94089 Sunnyvale, CA 94089
US US
Email: prz@juniper.net Email: prz@juniper.net
Alankar Sharma Alankar Sharma
Comcast Comcast
skipping to change at page 78, line 15 skipping to change at page 88, line 31
Cisco Systems, Inc Cisco Systems, Inc
Building D Building D
45 Allee des Ormes - BP1200 45 Allee des Ormes - BP1200
MOUGINS - Sophia Antipolis 06254 MOUGINS - Sophia Antipolis 06254
FRANCE FRANCE
Phone: +33 497 23 26 34 Phone: +33 497 23 26 34
Email: pthubert@cisco.com Email: pthubert@cisco.com
Alia Atlas Alia Atlas
Juniper Networks Individual
10 Technology Park Drive
Westford, MA 01886
US
Email: akatlas@juniper.net Email: akatlas@juniper.net
John Drake John Drake
Juniper Networks Juniper Networks
1194 N. Mathilda Ave 1194 N. Mathilda Ave
Sunnyvale, CA 94089 Sunnyvale, CA 94089
US US
Email: jdrake@juniper.net Email: jdrake@juniper.net
 End of changes. 145 change blocks. 
351 lines changed or deleted 864 lines changed or added
