draft-ietf-ccamp-gr-description-02.txt | draft-ietf-ccamp-gr-description-03.txt | |||
---|---|---|---|---|
Network Working Group Dan Li (Huawei) | Network Working Group Dan Li (Huawei) | |||
Internet Draft Jianhua Gao (Huawei) | Internet Draft Jianhua Gao (Huawei) | |||
Arun Satyanarayana (Cisco) | Arun Satyanarayana (Cisco) | |||
Intended Status: Informational | Intended Status: Informational | |||
Expires: November 5, 2008 May 5, 2008 | Expires: November 19, 2008 May 19, 2008 | |||
Description of the RSVP-TE Graceful Restart Procedures | Description of the RSVP-TE Graceful Restart Procedures | |||
draft-ietf-ccamp-gr-description-02.txt | draft-ietf-ccamp-gr-description-03.txt | |||
Status of this Memo | Status of this Memo | |||
By submitting this Internet-Draft, each author represents that any | By submitting this Internet-Draft, each author represents that any | |||
applicable patent or other IPR claims of which he or she is aware | applicable patent or other IPR claims of which he or she is aware | |||
have been or will be disclosed, and any of which he or she becomes | have been or will be disclosed, and any of which he or she becomes | |||
aware will be disclosed, in accordance with Section 6 of BCP 79. | aware will be disclosed, in accordance with Section 6 of BCP 79. | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF), its areas, and its working groups. Note that | Task Force (IETF), its areas, and its working groups. Note that | |||
skipping to change at page 2, line 26 | skipping to change at page 2, line 26 | |||
can be recovered in different scenarios where the order in which | can be recovered in different scenarios where the order in which | |||
the nodes restart is different. | the nodes restart is different. | |||
This document does not define any new processes or procedures. All | This document does not define any new processes or procedures. All | |||
protocol mechanisms are already defined in the referenced documents. | protocol mechanisms are already defined in the referenced documents. | |||
Table of Contents | Table of Contents | |||
1. Introduction.................................................3 | 1. Introduction.................................................3 | |||
2. Existing Procedures for Single Node Restart..................4 | 2. Existing Procedures for Single Node Restart..................4 | |||
2.1. Procedures defined in [RFC3473]............................4 | 2.1. Procedures Defined in [RFC3473]............................4 | |||
2.2. Procedures defined in [RFC5063]............................5 | 2.2. Procedures Defined in [RFC5063]............................5 | |||
3. Multiple Node Restart Scenarios..............................5 | 3. Multiple Node Restart Scenarios..............................5 | |||
4. RSVP State...................................................7 | 4. RSVP State...................................................7 | |||
5. Procedures for Multiple Node Restart.........................7 | 5. Procedures for Multiple Node Restart.........................7 | |||
5.1. Procedures for the Normal Node.............................7 | 5.1. Procedures for the Normal Node.............................7 | |||
5.2. Procedures for the Restarting Node.........................7 | 5.2. Procedures for the Restarting Node.........................7 | |||
5.2.1. Procedures for Scenario 1................................8 | 5.2.1. Procedures for Scenario 1................................8 | |||
5.2.2. Procedures for Scenario 2................................9 | 5.2.2. Procedures for Scenario 2................................9 | |||
5.2.3. Procedures for scenario 3...............................10 | 5.2.3. Procedures for Scenario 3...............................10 | |||
5.2.4. Procedures for scenario 4...............................11 | 5.2.4. Procedures for Scenario 4...............................11 | |||
5.2.5. Procedures for scenario 5...............................12 | 5.2.5. Procedures for Scenario 5...............................12 | |||
5.3. Consideration of Re-Use of Data Plane Resources...........12 | 5.3. Consideration of Re-Use of Data Plane Resources...........12 | |||
5.4. Consideration of Management Plane Intervention............12 | 5.4. Consideration of Management Plane Intervention............12 | |||
6. Clarification of Restarting Node Procedure..................13 | 6. Clarification of Restarting Node Procedure..................13 | |||
7. Security Considerations.....................................14 | 7. Security Considerations.....................................14 | |||
8. IANA Considerations.........................................14 | 8. IANA Considerations.........................................16 | |||
9. Acknowledgments.............................................15 | 9. Acknowledgments.............................................16 | |||
10. References.................................................15 | 10. References.................................................16 | |||
10.1. Normative References.....................................15 | 10.1. Normative References.....................................16 | |||
10.2. Informative References...................................15 | 10.2. Informative References...................................16 | |||
11. Author's Addresses.........................................16 | 11. Author's Addresses.........................................17 | |||
12. Full Copyright Statement...................................16 | 12. Full Copyright Statement...................................18 | |||
13. Intellectual Property Statement............................17 | 13. Intellectual Property Statement............................18 | |||
1. Introduction | 1. Introduction | |||
The Hello message for the Resource Reservation Protocol (RSVP) has | The Hello message for the Resource Reservation Protocol (RSVP) has | |||
been defined to establish and maintain basic signaling node | been defined to establish and maintain basic signaling node | |||
adjacencies for Label Switching Routers (LSRs) participating in a | adjacencies for Label Switching Routers (LSRs) participating in a | |||
Multiprotocol Label Switching (MPLS) traffic engineered (TE) | Multiprotocol Label Switching (MPLS) traffic engineered (TE) | |||
network [RFC3209]. The Hello message has been extended for use in | network [RFC3209]. The Hello message has been extended for use in | |||
Generalized MPLS (GMPLS) networks for state recovery of control | Generalized MPLS (GMPLS) networks for state recovery of control | |||
channel or nodal faults through the exchange of the Restart | channel or nodal faults through the exchange of the Restart | |||
Capabilities object [RFC3473]. | Capabilities object [RFC3473]. | |||
GMPLS protocol definitions for RSVP [RFC3473] also allow a | GMPLS protocol definitions for RSVP [RFC3473] also allow a | |||
restarting node to learn the label that it previously allocated for | restarting node to learn the label that it previously allocated for | |||
use on a Label Switching Path (LSP) through the Recovery Label | use on a Label Switching Path (LSP) through the RECOVERY_LABEL | |||
object carried on a Path message sent to a restarting node from its | object carried on a Path message sent to a restarting node from its | |||
upstream neighbor. | upstream neighbor. | |||
Further RSVP protocol extensions have been defined [RFC5063] to | Further RSVP protocol extensions have been defined [RFC5063] to | |||
perform graceful restart and to enable a restarting node to recover | perform graceful restart and to enable a restarting node to recover | |||
full control plane state by exchanging RSVP messages with its | full control plane state by exchanging RSVP messages with its | |||
upstream and downstream neighbors. State previously transmitted to | upstream and downstream neighbors. State previously transmitted to | |||
the upstream neighbor (principally the downstream label) is | the upstream neighbor (principally the downstream label) is | |||
recovered from the upstream neighbor on a Path message (using the | recovered from the upstream neighbor on a Path message (using the | |||
Recovery Label object as described in [RFC3473]). State previously | RECOVERY_LABEL object as described in [RFC3473]). State previously | |||
transmitted to the downstream neighbor (including the upstream | transmitted to the downstream neighbor (including the upstream | |||
label, interface identifiers, and the explicit route) is recovered | label, interface identifiers, and the explicit route) is recovered | |||
from the downstream neighbor using a RecoveryPath message. | from the downstream neighbor using a RecoveryPath message. | |||
[RFC5063] also extends the Hello message to exchange information | [RFC5063] also extends the Hello message to exchange information | |||
about the ability to support the RecoveryPath message. | about the ability to support the RecoveryPath message. | |||
The examples and procedures in [RFC3473] and [RFC5063] focus on the | The examples and procedures in [RFC3473] and [RFC5063] focus on the | |||
description of a single node restart when adjacent network nodes | description of a single node restart when adjacent network nodes | |||
are operative. Although the procedures are equally applicable to | are operative. Although the procedures are equally applicable to | |||
multi-node restarts, no detailed explanation is provided. | multi-node restarts, no detailed explanation is provided. | |||
This document provides and informational clarification of the | This document provides an informational clarification of the | |||
control plane procedures for a GMPLS network when there are | control plane procedures for a GMPLS network when there are | |||
multiple node failures, and describes how full control plane state | multiple node failures, and describes how full control plane state | |||
can be recovered in different scenarios where the order in which | can be recovered in different scenarios where the order in which | |||
the nodes restart is different. | the nodes restart is different. | |||
This document does not define any new processes or procedures. All | This document does not define any new processes or procedures. All | |||
protocol mechanisms already defined in [RFC3473] and [RFC5063] are | protocol mechanisms already defined in [RFC3473] and [RFC5063] are | |||
definitive. | definitive. | |||
2. Existing Procedures for Single Node Restart | 2. Existing Procedures for Single Node Restart | |||
This section documents for information the existing procedures | This section documents for information the existing procedures | |||
defined in [RFC3473] and [RFC5063]. Those documents are definitive, | defined in [RFC3473] and [RFC5063]. Those documents are definitive, | |||
and the description here is non-normative. It is provided for | and the description here is non-normative. It is provided for | |||
informational clarification only. | informational clarification only. | |||
2.1. Procedures defined in [RFC3473] | 2.1. Procedures Defined in [RFC3473] | |||
In the case of nodal faults, the procedures for the restarting node | In the case of nodal faults, the procedures for the restarting node | |||
and the procedures for the neighbor of a restarting node are | and the procedures for the neighbor of a restarting node are | |||
applied to the corresponding nodes. These procedures described in | applied to the corresponding nodes. These procedures described in | |||
[RFC3473] are summarized as follows: | [RFC3473] are summarized as follows: | |||
For the Restarting Node: | For the Restarting Node: | |||
1) Tells its neighbors that state recovery is supported using the | 1) Tells its neighbors that state recovery is supported using the | |||
Hello message; | Hello message; | |||
2) Recovers its RSVP state with the help of a Path message received | 2) Recover its RSVP state with the help of a Path message received | |||
from its upstream neighbor carrying the RECOVERY_LABEL object; | from its upstream neighbor carrying the RECOVERY_LABEL object; | |||
3) For bidirectional LSPs, the UPSTREAM_LABEL object on the received | 3) For bidirectional LSPs, the UPSTREAM_LABEL object on the received | |||
Path message is used to recover the corresponding RSVP state; | Path message is used to recover the corresponding RSVP state; | |||
4) If the corresponding forwarding state in data plane is not existed, | 4) If the corresponding forwarding state in the data plane does not | |||
the node treats this as a setup for a new LSP. If the forwarding | exist, the node treats this as a setup for a new LSP. If the | |||
state in data plane is existed, the forwarding state is bound to the | forwarding state in the data plane exists, the forwarding state is | |||
LSP associated with the message, and related forwarding state should | bound to the LSP associated with the message, and related forwarding | |||
be considered as valid and refreshed. In addition, if the node is not | state should be considered as valid and refreshed. In addition, if | |||
the tail-end of the LSP, the corresponding outgoing Path messages is | the node is not the tail-end of the LSP, the incoming label on the | |||
sent with the incoming label from that entry carried in the | downstream interface is retrieved from the forwarding state on the | |||
UPSTREAM_LABEL object. | restarting node and set in the UPSTREAM_LABEL object in the Path | |||
message sent to the downstream neighbor. | ||||
For the Neighbor of a restarting node: | For the Neighbor of a restarting node: | |||
1) Sends the Path message with RECOVERY_LABEL object containing a | 1) Sends a Path message with RECOVERY_LABEL object containing a label | |||
label value corresponding to the label value received in the most | value corresponding to the label value received in the most recently | |||
recently received corresponding Resv message; | received corresponding Resv message; | |||
2) Resumes refreshing Path state with the restarting node; | 2) Resumes refreshing Path state with the restarting node; | |||
3) Resumes refreshing Resv state with the restarting node. | 3) Resumes refreshing Resv state with the restarting node. | |||
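The restarting node's handling of a received Path message (steps 2 to 4 above) might be sketched as follows. This is a minimal, non-normative illustration: the ForwardingEntry type, the process_path_during_recovery function, and the label-keyed table are hypothetical names chosen for the example, and the lookup is simplified to match on the label alone. [RFC3473] remains definitive.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ForwardingEntry:
    """Hypothetical data plane entry that survived the control plane restart."""
    in_label: int                           # label on the upstream interface
    reverse_in_label: Optional[int] = None  # incoming label on the downstream
                                            # interface (bidirectional LSPs)
    bound_lsp: Optional[str] = None         # LSP this entry is bound to


def process_path_during_recovery(
    lsp_id: str,
    recovery_label: Optional[int],
    table: Dict[int, ForwardingEntry],
    is_tail_end: bool,
) -> Dict[str, object]:
    """Return the action taken for a Path message received while recovering."""
    entry = table.get(recovery_label) if recovery_label is not None else None
    if entry is None:
        # No matching forwarding state: handle the message as a new LSP setup.
        return {"action": "new_lsp_setup"}

    # Matching forwarding state exists: bind it to the LSP and keep it valid.
    entry.bound_lsp = lsp_id
    result: Dict[str, object] = {"action": "state_recovered"}
    if not is_tail_end and entry.reverse_in_label is not None:
        # Value to carry in UPSTREAM_LABEL on the Path message sent downstream.
        result["upstream_label_out"] = entry.reverse_in_label
    return result
```

In this sketch, a Path message without a RECOVERY_LABEL object, or with a label that matches no surviving entry, falls through to the new LSP setup branch.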
2.2. Procedures defined in [RFC5063] | 2.2. Procedures Defined in [RFC5063] | |||
A new message is introduced in [RFC5063] which is called the | A new message is introduced in [RFC5063] called the RecoveryPath | |||
RecoveryPath message. The message is sent by the downstream | message. The message is sent by the downstream neighbor of a | |||
neighbor of a restarting node to convey the contents of the last | restarting node to convey the contents of the last received Path | |||
received Path message back to the restarting node. | message back to the restarting node. | |||
The restarting node will receive the Path message with the | The restarting node will receive the Path message with the | |||
RECOVERY_LABEL object from its upstream neighbor, and/or the | RECOVERY_LABEL object from its upstream neighbor, and/or the | |||
RecoveryPath message from its downstream neighbor. The full RSVP | RecoveryPath message from its downstream neighbor. The full RSVP | |||
state of the restarting node can be recovered from these two | state of the restarting node can be recovered from these two | |||
messages. | messages. | |||
From the received Path message the following state can be recovered: | The following state can be recovered from the received Path message: | |||
o Upstream data interface (from RSVP_HOP object) | o Upstream data interface (from RSVP_HOP object) | |||
o Label on the upstream data interface (from RECOVERY_LABEL object) | o Label on the upstream data interface (from RECOVERY_LABEL object) | |||
o Upstream label for bidirectional LSP (from UPSTREAM_LABEL object) | o Upstream label for bidirectional LSP (from UPSTREAM_LABEL object) | |||
From the received RecoveryPath message the following state can be | The following state can be recovered from the received RecoveryPath | |||
recovered: | message: | |||
o Downstream data interface (from RSVP_HOP object) | o Downstream data interface (from RSVP_HOP object) | |||
o Label on the downstream data interface (from RECOVERY_LABEL object) | o Label on the downstream data interface (from RECOVERY_LABEL object) | |||
o Upstream direction label for bidirectional LSP (from | o Upstream direction label for bidirectional LSP (from | |||
UPSTREAM_LABEL object) | UPSTREAM_LABEL object) | |||
The other objects also can be recovered either by regular Path | The other objects also can be recovered either from the regular | |||
message or RecoveryPath message, and Resv message. | Path and Resv messages, or from the RecoveryPath message. | |||
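The mapping from received objects to recovered state listed above can be illustrated with a short sketch. It assumes, purely for illustration, that each message is represented as a dictionary keyed by object name; the recover_state function and the field names of the result are hypothetical and are not taken from [RFC3473] or [RFC5063].

```python
from typing import Dict, Optional

# Assumed simplification: a message is a mapping of object name -> value.
Message = Dict[str, object]


def recover_state(
    path: Optional[Message], recovery_path: Optional[Message]
) -> Dict[str, object]:
    """Collect the state recoverable from whichever messages have arrived."""
    state: Dict[str, object] = {}
    if path is not None:
        # Path message received from the upstream neighbor.
        state["upstream_interface"] = path.get("RSVP_HOP")
        state["upstream_interface_label"] = path.get("RECOVERY_LABEL")
        state["upstream_label"] = path.get("UPSTREAM_LABEL")
    if recovery_path is not None:
        # RecoveryPath message received from the downstream neighbor.
        state["downstream_interface"] = recovery_path.get("RSVP_HOP")
        state["downstream_interface_label"] = recovery_path.get("RECOVERY_LABEL")
        state["upstream_direction_label"] = recovery_path.get("UPSTREAM_LABEL")
    return state
```

Either argument may be None, reflecting the "and/or" in the text: a node whose neighbor on one side has not yet restarted recovers only the half of the state that the available message provides.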
3. Multiple Node Restart Scenarios | 3. Multiple Node Restart Scenarios | |||
We define the following terms for the different node types: | We define the following terms for the different node types: | |||
Restarting - The node has restarted; communication with its | Restarting - The node has restarted; communication with its | |||
neighbor nodes is restored, its RSVP state is under recovery. | neighbor nodes is restored, its RSVP state is under recovery. | |||
Delayed Restarting - The node has restarted, but the communication | Delayed Restarting - The node has restarted, but the communication | |||
with a neighbor node is interrupted (for example, the neighbor node | with a neighbor node is interrupted (for example, the neighbor node | |||
skipping to change at page 6, line 42 | skipping to change at page 6, line 42 | |||
a Delayed Restarting node. Nodes C and D are Normal nodes. | a Delayed Restarting node. Nodes C and D are Normal nodes. | |||
5) A Restarting Egress node with upstream Delayed Restarting node. | 5) A Restarting Egress node with upstream Delayed Restarting node. | |||
For example, in Figure 1, Nodes A and B are Normal nodes, Node C is a | For example, in Figure 1, Nodes A and B are Normal nodes, Node C is a | |||
Delayed Restarting node, and Node D is a Restarting node. | Delayed Restarting node, and Node D is a Restarting node. | |||
If the communication between two nodes is interrupted, the upstream | If the communication between two nodes is interrupted, the upstream | |||
node may think the downstream node is a Delayed Restarting node, or | node may think the downstream node is a Delayed Restarting node, or | |||
vice versa. | vice versa. | |||
Note that if multi nodes which are not neighbors are restarted, the | Note that if multiple nodes which are not neighbors are restarted, | |||
restart Procedures could be applied as multiple separated restart | the restart Procedures could be applied as multiple separated | |||
procedures which are exactly the same as the procedures described | restart procedures which are exactly the same as the procedures | |||
in [RFC3473] and [RFC5063]. Therefore, these scenarios are not | described in [RFC3473] and [RFC5063]. Therefore, these scenarios | |||
described in this document. For example, in Figure 1, Node A and | are not described in this document. For example, in Figure 1, Node | |||
Node C are normal nodes, and Node B and Node D are restarting nodes, | A and Node C are normal nodes, and Node B and Node D are restarting | |||
so Node B could be restarted through Node A and Node C, meanwhile, | nodes, so Node B could be restarted through Node A and Node C, | |||
Node D could be restarted through Node C separately. | meanwhile, Node D could be restarted through Node C separately. | |||
4. RSVP State | 4. RSVP State | |||
For each scenario, the RSVP state that needs to be recovered at the | For each scenario, the RSVP state that needs to be recovered at the | |||
restarting nodes is the Path State Block (PSB) and Resv State Block | restarting nodes is the Path State Block (PSB) and Resv State Block | |||
(RSB), which are created when the node receives the corresponding | (RSB), which are created when the node receives the corresponding | |||
Path message and Resv message. | Path message and Resv message. | |||
According to [RFC2209], how to construct the PSB and RSB is really | According to [RFC2209], how to construct the PSB and RSB is really | |||
an implementation issue. In fact, there is no requirement to | an implementation issue. In fact, there is no requirement to | |||
skipping to change at page 10, line 42 | skipping to change at page 10, line 42 | |||
state. | state. | |||
Note that if Node B restarts after this operation, the Path message | Note that if Node B restarts after this operation, the Path message | |||
that it sends to Node C will not be matched with any state on Node | that it sends to Node C will not be matched with any state on Node | |||
C and will be treated as a new Path message resulting in LSP setup. | C and will be treated as a new Path message resulting in LSP setup. | |||
Node C should use the labels carried in the Path message (in the | Node C should use the labels carried in the Path message (in the | |||
UPSTREAM_LABEL object and in the RECOVERY_LABEL object) to drive | UPSTREAM_LABEL object and in the RECOVERY_LABEL object) to drive | |||
its label allocation, but may use other labels according to normal | its label allocation, but may use other labels according to normal | |||
LSP setup rules. | LSP setup rules. | |||
5.2.3. Procedures for scenario 3 | 5.2.3. Procedures for Scenario 3 | |||
In this example, the Restarting node (Node C) is isolated. Its | In this example, the Restarting node (Node C) is isolated. Its | |||
upstream and downstream neighbors have not restarted. | upstream and downstream neighbors have not restarted. | |||
The Restarting node (Node C) follows the procedures in section 9.3 | The Restarting node (Node C) follows the procedures in section 9.3 | |||
of [RFC3473] and may run a Restart Timer for each of its neighbors | of [RFC3473] and may run a Restart Timer for each of its neighbors | |||
(Nodes B and D). If a neighbor has not restarted before its Restart | (Nodes B and D). If a neighbor has not restarted before its Restart | |||
Timer expires, the corresponding LSPs may be torn down according to | Timer expires, the corresponding LSPs may be torn down according to | |||
local policy [RFC3473]. Note, however, that the Restart Time values | local policy [RFC3473]. Note, however, that the Restart Time values | |||
suggested in [RFC3473] are based on the previous Hello message | suggested in [RFC3473] are based on the previous Hello message | |||
skipping to change at page 11, line 19 | skipping to change at page 11, line 19 | |||
During the Recovery Time, if the upstream Delayed Restarting node | During the Recovery Time, if the upstream Delayed Restarting node | |||
has restarted, the procedure for scenario 1 can be applied. | has restarted, the procedure for scenario 1 can be applied. | |||
During the Recovery Time, if the downstream Delayed Restarting node | During the Recovery Time, if the downstream Delayed Restarting node | |||
has restarted, the procedure for scenario 2 can be applied. | has restarted, the procedure for scenario 2 can be applied. | |||
In the case that neither Delayed Restarting node ever comes back, | In the case that neither Delayed Restarting node ever comes back, | |||
and where a Restart Timer is not used to automatically tear down | and where a Restart Timer is not used to automatically tear down | |||
LSPs, management intervention is required to tidy up the control | LSPs, management intervention is required to tidy up the control | |||
plane and the data plane on the nodes that are waiting for the | plane and the data plane on the node that is waiting for the failed | |||
failed device to restart. | device to restart. | |||
If the downstream Delayed Restarting node restarts after the | If the downstream Delayed Restarting node restarts after the | |||
cleanup of LSPs at Node C, the RecoveryPath message from Node D | cleanup of LSPs at Node C, the RecoveryPath message from Node D | |||
will be answered with a PathTear message. If the upstream Delayed | will be answered with a PathTear message. If the upstream Delayed | |||
Restarting node restarts after the cleanup of LSPs at Node C, the | Restarting node restarts after the cleanup of LSPs at Node C, the | |||
Path message from Node B will be treated as a new LSP setup request, | Path message from Node B will be treated as a new LSP setup request, | |||
but the setup will fail because Node D cannot be reached - Node C | but the setup will fail because Node D cannot be reached - Node C | |||
will respond with a PathErr message. Since this happens to Node B | will respond with a PathErr message. Since this happens to Node B | |||
during its restart processing, it should follow the rules of | during its restart processing, it should follow the rules of | |||
[RFC5063] and tear down the LSP. | [RFC5063] and tear down the LSP. | |||
5.2.4. Procedures for scenario 4 | 5.2.4. Procedures for Scenario 4 | |||
When the Ingress node (Node A) restarts, it does not know which | When the Ingress node (Node A) restarts, it does not know which | |||
LSPs it caused to be created. Usually, however, this information is | LSPs it caused to be created. Usually, however, this information is | |||
retrieved from the management plane or from the configuration | retrieved from the management plane or from the configuration | |||
requests stored in non-volatile form in the node in order to | requests stored in non-volatile form in the node in order to | |||
recover the LSP state. | recover the LSP state. | |||
Furthermore, if the downstream node (Node B) is a Normal node, | Furthermore, if the downstream node (Node B) is a Normal node, | |||
according to the procedures in [RFC5063], the ingress will receive | according to the procedures in [RFC5063], the ingress will receive | |||
a RecoveryPath message and will understand that it was the ingress | a RecoveryPath message and will understand that it was the ingress | |||
of the LSP. | of the LSP. | |||
However, in this scenario, the downstream node is a Delayed | However, in this scenario, the downstream node is a Delayed | |||
Restarting node, so Node A must rely on the information from the | Restarting node, so Node A must rely on the information from the | |||
management plane or stored configuration, or it must wait for Node | management plane or stored configuration, or it must wait for Node | |||
B to restart. | B to restart. | |||
In the event that Node B never restarts, management plane | In the event that Node B never restarts, management plane | |||
intervention may be used at Node A to clean up any LSP state | intervention is needed at Node A to clean up any LSP control plane | |||
restored from the management plane or from local configuration. | state restored from the management plane or from local | |||
configuration, and to release any data plane resources. | ||||
5.2.5. Procedures for scenario 5 | 5.2.5. Procedures for Scenario 5 | |||
In this scenario the Egress node (Node D) restarts, and its | In this scenario the Egress node (Node D) restarts, and its | |||
upstream neighbor (Node C) has not restarted. In this case, the | upstream neighbor (Node C) has not restarted. In this case, the | |||
Egress node is completely unaware of the LSPs. It has no downstream | Egress node may have no control plane state relating to the LSPs. | |||
neighbor to help it, and no management plane or configuration | It has no downstream neighbor to help it, and no management plane | |||
information. The Egress node must simply wait until its upstream | or configuration information, although there will be data plane | |||
neighbor restarts and gives it the information as Path messages | state for the LSP. The Egress node must simply wait until its | |||
carrying RECOVERY_LABEL objects. | upstream neighbor restarts and gives it the information as Path | |||
messages carrying RECOVERY_LABEL objects. | ||||
5.3. Consideration of Re-Use of Data Plane Resources | 5.3. Consideration of Re-Use of Data Plane Resources | |||
Fundamental to the processes described above is an understanding | Fundamental to the processes described above is an understanding | |||
that data plane resources may remain in use (allocated and cross- | that data plane resources may remain in use (allocated and cross- | |||
connected) when control plane state has not been fully | connected) when control plane state has not been fully | |||
resynchronized because some control plane nodes have not restarted. | resynchronized because some control plane nodes have not restarted. | |||
It is assumed that these data plane resources might be carrying | It is assumed that these data plane resources might be carrying | |||
traffic and should not be reconfigured except through application | traffic and should not be reconfigured except through application | |||
skipping to change at page 13, line 41 | skipping to change at page 13, line 41 | |||
|<---------------| | |<---------------| | |||
| Path without | | | Path without | | |||
| recovery label | | | recovery label | | |||
|--------------->| | |--------------->| | |||
| X (resource allocation failed because the | | X (resource allocation failed because the | |||
| | resources are in use) | | | resources are in use) | |||
| PathErr | | | PathErr | | |||
|<---------------| | |<---------------| | |||
| PathTear | | | PathTear | | |||
|--------------->| | |--------------->| | |||
X(CON deletion) X (CON deletion) | X(LSP deletion) X (LSP deletion) | |||
| | | | | | |||
Figure 2 Message flow for accidental LSP deletion | ||||
The sequence diagram above depicts one scenario where the LSP may | The sequence diagram above depicts one scenario where the LSP may | |||
get deleted. | get deleted. | |||
In this sequence N1 did not detect hello failure and continues | In this sequence N1 did not detect Hello failure and continues | |||
sending SRefreshes which may get NACK'ed by N2 once restart | sending SRefreshes which may get NACK'ed by N2 once restart | |||
completes because there is no Path state corresponding to the | completes because there is no Path state corresponding to the | |||
SRefresh message. This NACK causes a Path refresh message to be | SRefresh message. This NACK causes a Path refresh message to be | |||
generated but there is no RECOVERY_LABEL because N1 did not yet | generated but there is no RECOVERY_LABEL because N1 did not yet | |||
detect that N2 has restarted as hello exchanges have not yet | detect that N2 has restarted as Hello exchanges have not yet | |||
started. The Path message is treated as "new" and fails to allocate | started. The Path message is treated as "new" and fails to allocate | |||
the resources because they are still in use. This causes a PathErr | the resources because they are still in use. This causes a PathErr | |||
message to be generated which may lead to the tear down of the LSP. | message to be generated which may lead to the tear down of the LSP. | |||
To resolve the aforementioned problem, the following procedures are | To resolve the aforementioned problem, the following procedures | |||
proposed and are meant to work together with the recovery | which are implicit in [RFC3473] and [RFC5063] should be followed. | |||
procedures documented in [RFC3473]. Here, it is assumed that the | These procedures work together with the recovery procedures | |||
restarting node and the neighboring node(s) support Hello extension | documented in [RFC3473]. Here, it is assumed that the restarting | |||
as documented in [RFC3209] and recovery procedures documented in | node and the neighboring node(s) support Hello extension as | |||
documented in [RFC3209] and recovery procedures documented in | ||||
[RFC3473]. | [RFC3473]. | |||
After a node restarts its control plane, it should ignore and | After a node restarts its control plane, it should ignore and | |||
silently drop all RSVP-TE messages, except hello messages, it | silently drop all RSVP-TE messages, except Hello messages, it | |||
receives from any neighbor to which, no HELLO session has been | receives from any neighbor to which, no HELLO session has been | |||
established. | established. | |||
The restarting node should follow [RFC3209] to establish Hello | The restarting node should follow [RFC3209] to establish Hello | |||
sessions with its neighbors, after its control plane becomes | sessions with its neighbors, after its control plane becomes | |||
operational. | operational. | |||
The restarting node resumes processing of RSVP-TE messages sent | The restarting node resumes processing of RSVP-TE messages sent | |||
from each neighbor to which the Hello session has been established. | from each neighbor to which the Hello session has been established. | |||
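The three steps above can be pictured as a per-neighbor message filter on the restarting node. The following is a minimal sketch, not taken from either draft revision: the names RestartingNode, MsgType, and hello_up are hypothetical, and a single received Hello is assumed to stand in for the complete Hello exchange of [RFC3209].

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class MsgType(Enum):
    # Illustrative subset of RSVP-TE message types (hypothetical enum).
    HELLO = auto()
    PATH = auto()
    RESV = auto()
    SREFRESH = auto()


@dataclass
class RestartingNode:
    # Neighbors with which a Hello session exists since the restart.
    hello_up: set = field(default_factory=set)

    def receive(self, neighbor: str, msg: MsgType) -> str:
        if msg is MsgType.HELLO:
            # Hello messages are never dropped. Assumption: one Hello is
            # enough to mark the session as established in this sketch.
            self.hello_up.add(neighbor)
            return "processed"
        if neighbor not in self.hello_up:
            # No Hello session yet: ignore and silently drop the message,
            # so no NACK or PathErr is generated for it.
            return "dropped"
        # Hello session established: resume normal processing.
        return "processed"
```

In this sketch an SRefresh from N1 that arrives before the Hello session is up is dropped rather than NACK'ed, which is the behavior that avoids the Path refresh without RECOVERY_LABEL described earlier in this section.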
7. Security Considerations | 7. Security Considerations | |||
This document clarifies the procedures to be performed on RSVP | This document clarifies the procedures defined in [RFC3473] and | |||
agents that neighbor one or more restarting RSVP agents. In the | [RFC5063] to be performed on RSVP agents that neighbor one or more | |||
case of the control plane in general, and the RSVP agent in | restarting RSVP agents. It does not introduce any new procedures | |||
and, therefore, does not introduce any new security risks or issues. | ||||
In the case of the control plane in general, and the RSVP agent in | ||||
particular, where one or more nodes carrying one or more LSPs are | particular, where one or more nodes carrying one or more LSPs are | |||
restarted due to external attacks, the procedures defined in | restarted due to external attacks, the procedures defined in | |||
[RFC5063] and described in this document provide the ability for | [RFC5063] and described in this document provide the ability for | |||
the restarting RSVP agents to recover the RSVP state in each | the restarting RSVP agents to recover the RSVP state in each | |||
restarting node corresponding to the LSPs, with the least possible | restarting node corresponding to the LSPs, with the least possible | |||
perturbation to the rest of the network. Ideally, only the | perturbation to the rest of the network. These procedures can be | |||
neighboring RSVP agents should notice the restart and hence need to | considered to provide mechanisms by which the GMPLS network can | |||
perform additional processing. This allows for a network with | recover from physical attacks or from attacks on remotely | |||
active LSPs to recover LSP state gracefully from an external attack, | controlled power supplies. | |||
without perturbing the data/forwarding plane state. | ||||
The procedures described are such that, only the neighboring RSVP | ||||
agents should notice the restart of a node, and hence only they | ||||
need to perform additional processing. This allows for a network | ||||
with active LSPs to recover LSP state gracefully from an external | ||||
attack, without perturbing the data/forwarding plane state, and | ||||
without propagating the error condition in the control or data | ||||
plane. In other words, the effect of the restart (which might be | ||||
the result of an attack) does not spread into the network. | ||||
Note that concern has been expressed about the vulnerability of a | ||||
restarting node to false messages received from its neighbors. For | ||||
example, a restarting node might receive a false Path message with | ||||
a Recovery Label object from an upstream neighbor, or a false | ||||
RecoveryPath message from its downstream neighbor. This situation | ||||
might arise in one of four cases: | ||||
- The message is spoofed and does not come from the neighbor at all. | ||||
- The message has been modified as it was travelling from the | ||||
neighbor. | ||||
- The neighbor is defective and has generated a message in error. | ||||
- The neighbor has been subverted and has a "rogue" RSVP agent. | ||||
The first two cases may be handled using standard RSVP | ||||
authentication and integrity procedures [RFC3209], [RFC3473]. If | ||||
the operator is particularly worried, the control plane may be | ||||
operated using IPsec [RFC4301], [RFC4302], [RFC4835], [RFC4306], | ||||
and [RFC2411]. | ||||
Protection against defective or rogue RSVP implementations is | ||||
generally hard to impossible. Neighbor-to-neighbor authentication | ||||
and integrity validation is, by definition, ineffective in these | ||||
situations. For example, if a neighbor node sends a Resv during | ||||
normal LSP setup, and if that message carries a GENERALIZED_LABEL | ||||
object carrying an incorrect label value, then the receiving LSR | ||||
will use the supplied value and the LSP will be set up incorrectly. | ||||
Alternatively, if a Path message is modified by an upstream LSR to | ||||
change the destination and explicit route, there is no way for the | ||||
downstream LSR to detect this, and the LSP may be set up to the | ||||
wrong destination. Furthermore, the upstream LSR could disguise | ||||
this fact by modifying the recorded route reported in the Resv | ||||
message. Thus, these issues are in no way specific to the restart | ||||
case, do not cause any greater or different problems from the | ||||
normal case, and do not warrant specific security measure | ||||
applicable to restart scenarios. | ||||
Note that the RSVP POLICY_DATA object [RFC2205] provides a scope by | ||||
which secure end-to-end checks could be applied. However, very | ||||
little definition of the use of this object has been made to date. | ||||
See [MPLS-SEC] for a wider discussion of security in MPLS and GMPLS | ||||
networks. | ||||
8. IANA Considerations | 8. IANA Considerations | |||
This document defines no new protocols or extensions and makes no | This document defines no new protocols or extensions and makes no | |||
requests to IANA for registry management. | requests to IANA for registry management. | |||
9. Acknowledgments | 9. Acknowledgments | |||
We would like to thank Adrian Farrel, Dimitri Papadimitriou, and | We would like to thank Adrian Farrel, Dimitri Papadimitriou, and | |||
Lou Berger for their useful comments. | Lou Berger for their useful comments. | |||
skipping to change at page 15, line 31 | skipping to change at page 16, line 43 | |||
[RFC3473] Berger, L., "Generalized Multi-Protocol Label Switching | [RFC3473] Berger, L., "Generalized Multi-Protocol Label Switching | |||
(GMPLS) Signaling Resource ReserVation Protocol-Traffic | (GMPLS) Signaling Resource ReserVation Protocol-Traffic | |||
Engineering (RSVP-TE) Extensions", RFC 3473, January 2003. | Engineering (RSVP-TE) Extensions", RFC 3473, January 2003. | |||
[RFC5063] A. Satyanarayana, R. Rahman, "Extensions to GMPLS RSVP | [RFC5063] A. Satyanarayana, R. Rahman, "Extensions to GMPLS RSVP | |||
Graceful Restart", RFC 5063, September 2007. | Graceful Restart", RFC 5063, September 2007. | |||
10.2. Informative References | 10.2. Informative References | |||
None. | [MPLS-SEC] Fang, L., "Security Framework for MPLS and GMPLS Networks", | |||
draft-ietf-mpls-mpls-and-gmpls-security-framework, work in | ||||
progress. | ||||
[RFC2205] Braden, R. (Ed.), Zhang, L., Berson, S., Herzog, S. and S. | ||||
Jamin, "Resource ReserVation Protocol -- Version 1 | ||||
Functional Specification", RFC 2205, September 1997. | ||||
[RFC2411] R. Thayer, N. Doraswamy, R. Glenn, "IP Security Document | ||||
Roadmap", RFC 2411, November 1998. | ||||
[RFC4301] S. Kent, K. Seo, "Security Architecture for the Internet | ||||
Protocol", RFC 4301, December 2005. | ||||
[RFC4302] S. Kent, "IP Authentication Header", RFC 4302, December | ||||
2005. | ||||
[RFC4306] C. Kaufman, "Internet Key Exchange (IKEv2) Protocol", RFC | ||||
4306, December 2005. | ||||
[RFC4835] V. Manral, "Cryptographic Algorithm Implementation | ||||
Requirements for Encapsulating Security Payload (ESP) and | ||||
Authentication Header (AH)", RFC 4835, April 2007. | ||||
11. Author's Addresses | 11. Author's Addresses | |||
Dan Li | Dan Li | |||
Huawei Technologies Co., Ltd. | Huawei Technologies Co., Ltd. | |||
F3-5-B R&D Center, Huawei Base, | F3-5-B R&D Center, Huawei Base, | |||
Bantian, Longgang District | Bantian, Longgang District | |||
Shenzhen 518129, | Shenzhen 518129, P.R.China | |||
China | ||||
Phone: +86 755 28973237 | Phone: +86 755 28973237 | |||
Email: danli@huawei.com | Email: danli@huawei.com | |||
Jianhua Gao | Jianhua Gao | |||
Huawei Technologies Co., Ltd. | Huawei Technologies Co., Ltd. | |||
F3-5-B R&D Center, Huawei Base, | F3-5-B R&D Center, Huawei Base, | |||
Bantian, Longgang District | Bantian, Longgang District | |||
Shenzhen 518129, | Shenzhen 518129, P.R.China | |||
China | ||||
Phone: +86 755 28972902 | Phone: +86 755 28972902 | |||
Email: gjhhit@huawei.com | Email: gjhhit@huawei.com | |||
Arun Satyanarayana | Arun Satyanarayana | |||
Cisco Systems, Inc. | Cisco Systems, Inc. | |||
170 West Tasman Dr. | 170 West Tasman Dr. | |||
San Jose, CA 95134, | San Jose, CA 95134, USA | |||
USA | ||||
Phone: +1 408 853-3206 | Phone: +1 408 853-3206 | |||
Email: asatyana@cisco.com | Email: asatyana@cisco.com | |||
Snigdho C. Bardalai | Snigdho C. Bardalai | |||
Fujitsu Network Communications, Inc. | Fujitsu Network Communications, Inc. | |||
2801 Telecom Parkway, | 2801 Telecom Parkway, | |||
Richardson, Texas 75082 | Richardson, Texas 75082, USA | |||
USA | ||||
Phone: +1 972 479 2951 | Phone: +1 972 479 2951 | |||
Email: snigdho.bardalai@us.fujitsu.com | Email: snigdho.bardalai@us.fujitsu.com | |||
12. Full Copyright Statement | 12. Full Copyright Statement | |||
Copyright (C) The IETF Trust (2008). | Copyright (C) The IETF Trust (2008). | |||
This document is subject to the rights, licenses and restrictions | This document is subject to the rights, licenses and restrictions | |||
contained in BCP 78, and except as set forth therein, the authors | contained in BCP 78, and except as set forth therein, the authors | |||
retain all their rights. | retain all their rights. | |||
End of changes. 38 change blocks. | ||||
88 lines changed or deleted | 167 lines changed or added | |||