--- 1/draft-ietf-mpls-tp-1ton-protection-00.txt 2013-02-05 00:38:07.259707638 +0100 +++ 2/draft-ietf-mpls-tp-1ton-protection-01.txt 2013-02-05 00:38:07.327707797 +0100 @@ -1,33 +1,32 @@ Network Working Group E. Osborne Internet-Draft Cisco Intended status: Standards Track F. Zhang -Expires: February 7, 2013 ZTE +Expires: August 8, 2013 ZTE Y. Weingarten - August 6, 2012 + February 4, 2013 MPLS-TP 1toN Protection - draft-ietf-mpls-tp-1ton-protection-00.txt + draft-ietf-mpls-tp-1ton-protection-01.txt Abstract As part of the Transport Profile for Multiprotocol Label Switching (MPLS-TP) there is a requirement to support 1:n linear protection for - transport paths. This requirement is elaborated on in the MPLS-TP - Survivability Framework document [SurvivFwk]. The basic protocol for - linear protection was specified in the MPLS-TP Linear Protection - document [LinProt] but is limited to 1+1 and 1:1 protection. This - document extends the protocol defined there to address the additional - functionality necessary to support scenarios of a single protection - path preconfigured to provide protection of multiple transport paths - between two joint endpoints. + transport paths. This requirement is further elaborated in RFC6372 + [SurvivFwk]. The basic protocol for linear protection, specified in + RFC6378 [LinProt], is limited to 1+1 and 1:1 protection. This + document extends that protocol to address the additional + functionality necessary to support scenarios where a single + protection path is preconfigured to provide protection of multiple + transport paths between two joint endpoints. This document is a product of a joint Internet Engineering Task Force (IETF) / International Telecommunications Union Telecommunications Standardization Sector (ITU-T) effort to include an MPLS Transport Profile within the IETF MPLS and PWE3 architectures to support the capabilities and functionalities of a packet transport network as defined by the ITU-T. 
Status of this Memo @@ -37,25 +36,24 @@ Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." - This Internet-Draft will expire on February 7, 2013. + This Internet-Draft will expire on August 8, 2013. Copyright Notice - - Copyright (c) 2012 IETF Trust and the persons identified as the + Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as @@ -109,65 +107,62 @@ 7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 31 8. References . . . . . . . . . . . . . . . . . . . . . . . . . . 31 8.1. Normative References . . . . . . . . . . . . . . . . . . . 31 8.2. Informative References . . . . . . . . . . . . . . . . . . 32 Appendix A. PSC state machine tables . . . . . . . . . . . . . . 32 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 36 1. Introduction The MPLS Transport Profile (MPLS-TP) Requirements document [TPReq] - includes requirements for the necessary survivability tools that are - required for MPLS based transport networks. 
Network survivability is - the ability of a network to recover traffic delivery following - failure, or degradation of network resources. Requirement 67 lists - various types of 1:n protection architectures that are required for - MPLS-TP. The MPLS-TP Survivability Framework [SurvivFwk] is a - framework for survivability in MPLS-TP networks, and describes - recovery elements, types, methods, and topological considerations, - focusing on mechanisms for recovering MPLS-TP Label Switched Paths - (LSPs). + includes requirements for the necessary survivability tools required + for MPLS based transport networks. Network survivability is the + ability of a network to recover traffic delivery following failure, + or degradation of network resources. Requirement 67 lists various + types of 1:n protection architectures that are required for MPLS-TP. + The MPLS-TP Survivability Framework [SurvivFwk] is a framework for + survivability in MPLS-TP networks, and describes recovery elements, + types, methods, and topological considerations, focusing on + mechanisms for recovering MPLS-TP Label Switched Paths (LSPs). Linear protection in mesh networks - networks with arbitrary interconnectivity between nodes - is described in Section 4.7 of [SurvivFwk]. Linear protection provides rapid and simple protection switching. In a mesh network, linear protection provides a very suitable protection mechanism because it can operate between any pair of points within the network. It can protect against a defect in an intermediate node, a span, a transport path segment, or an end-to-end transport path. [LinProt] defines a Protection State Coordination (PSC) protocol that supports the different 1+1 and 1:1 architectures described in [SurvivFwk]. The PSC protocol is a single-phased protocol that allows the two endpoints of the protection domain to coordinate the protection switching operation when a switching condition is detected on the transport paths of the protection domain. 
- This document extends the PSC protocol to allow it to support a - protection domain that includes multiple working transport paths that - are protected by a single protection transport path. All of the - working transport paths and the protection transport path share - common end points. The protection transport path is pre-allocated - with resources to transport the traffic normally carried by any one - of the working transport paths. This is the architecture described - in [SurvivFwk] as 1:n protection, and is the generalization of the - 1:1 protection architecture already supported by PSC. + This document extends the PSC protocol to support a protection domain + that includes multiple working transport paths, between common end + points, protected by a single protection transport path. The + protection transport path is pre-allocated with resources to + transport the traffic normally carried by any one of the working + transport paths. This is the architecture described in [SurvivFwk] + as 1:n protection, and is the generalization of the 1:1 protection + architecture already supported by PSC. 1.1. 1:n Protection architecture Linear protection switching is a fully allocated survivability - mechanism. It is fully allocated in the sense that the route and - bandwidth of the protection path is reserved for a set of working - paths. For 1:n protection the protection path is allocated to - protect any one of n working paths between the two endpoints of the - protection domain. + mechanism in the sense that the route and bandwidth of the protection + path is reserved for a set of working paths. For 1:n protection the + protection path is allocated to protect any one of n working paths + between the two endpoints of the protection domain. 
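As a rough illustration of the 1:n arrangement described above, the following Python sketch models a protection domain in which a single protection path can carry the traffic of at most one of n working paths at a time. The class and method names (`ProtectionDomain`, `protect`, `release`) are hypothetical and do not come from the draft. Priority-based preemption and the PSC message exchange are deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProtectionDomain:
    """Toy model of a 1:n protection domain (illustrative names only)."""

    n: int                           # number of working paths, indexed 1..n
    protected: Optional[int] = None  # working path currently carried on P, if any

    def protect(self, index: int) -> bool:
        """Try to carry working path `index` on the protection path.

        Returns True if the protection path is free or already carries
        this working path, and False if it is occupied by another one.
        """
        if not 1 <= index <= self.n:
            raise ValueError(f"working path index must be in 1..{self.n}")
        if self.protected is None or self.protected == index:
            self.protected = index
            return True
        return False

    def release(self, index: int) -> None:
        """Free the protection path if it currently carries `index`."""
        if self.protected == index:
            self.protected = None


domain = ProtectionDomain(n=3)
print(domain.protect(1))  # protection path is free, so W1 is protected
print(domain.protect(2))  # refused: P already carries W1 (no preemption here)
```

The sketch only captures the resource-sharing constraint. A real implementation would also have to decide, based on path priority, whether a new failure may preempt the path already being protected.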
+-----+ +-----+ | |=============================| | |LER-A| Working Path #1 |LER-Z| | | | | | |=============================| | | | Working Path #2 | | | | | | | |=============================| | | | Working Path #3 | | @@ -200,27 +195,28 @@ The different working paths may be disjoint at the intermediary points on the path between LER-A and LER-Z and may also have different resource requirements. In addition, each of the working paths may be assigned a priority that could be used to decide which working path would be protected in cases of conflict (see more on this topic in Section 1.5). It is usually advised to arrange these protection groups in a way that would minimize any potential conflict situation. 1:n protection in MPLS supports two modes of operation - locking and - non-locking. Locking mirrors the behavior that is used by many - transport protection mechanisms, and is necessary in some cases but - may incur increased latency (and thus packet loss), as a result of - prolonged switching time, in comparison to the non-locking case. - Non-locking 1:n can be used in many MPLS networks and has far less - packet loss as compared to locking, but must be used with care - - since incorrect use of non-locking can lead to misconnectivity. + non-locking. The locking mode mirrors the behavior that is used by + many transport protection mechanisms, and is necessary in some cases + but may incur increased latency (and thus packet loss), as a result + of prolonged switching time, in comparison to the non-locking case. + Non-locking 1:n can be used in many MPLS networks and affords a lower + rate of packet loss as compared to locking mode, but must be used + with care - since incorrect use of non-locking can lead to + misconnectivity. 1.2. Locking operation The high-level functionality of the locking operation mode of 1:n protection would follow the following basic steps: o LER-A detects a unidirectional failure of W1 and stops sending traffic on W1. 
o LER-A transmits a PSC SF message to LER-Z indicating that W1 has @@ -252,45 +248,46 @@ points verify that both are ready to process the W1 traffic that is received on P. More detailed information on this mode of operation will be supplied later in the document when considering different scenarios. 1.3. Non-Locking In non-locking protection operation mode, LER-A switches data traffic onto P immediately upon failure detection. This minimizes traffic loss, but at the cost of temporary asymmetry of packet flow. At a - high level, it looks like this: + high level, it works like this: o LER-A detects the failure of W1 and stops sending traffic on W1. o LER-A immediately begins to transport W1's data traffic over the protection path P. o Simultaneously LER-A transmits a PSC message to LER-Z indicating that W1 has failed and is currently being protected in P. o LER-Z receives the PSC message from LER-A, switches all W1 data traffic to P, and transmits a PSC message to LER-A indicating that W1 is now protected in P. o LER-A receives the PSC message from LER-Z and needs to take no action, as the protection switch had already been completed. In the non-locking case, the packet loss between the endpoints is - minimized. Packet loss in the A->Z direction is only the failure - detection time, which is assumed, for this document, to be - negligible. Packet loss in the Z->A direction is almost entirely the - result of the one-way propagation delay of the PSC message from LER-A - to LER-Z. Assuming the transport path from A->Z has the same delay - as that from Z->A, it can be said that the packet loss in the non- - locking case is roughly half that of the locking case. + minimized. Packet loss may occur in the A->Z direction only for the + duration of the failure detection time, which is assumed, for this + document, to be negligible. Packet loss in the Z->A direction is + almost entirely the result of the one-way propagation delay of the + PSC message from LER-A to LER-Z. 
Assuming the transport path from + A->Z has the same delay as that from Z->A, it can be said that the + packet loss in the non-locking case is roughly half that of the + locking case. 1.4. Path priority As the 1:n architecture requires the ability for one working path to preempt the traffic of another in the event of multiple failures (see Section 1.5), there must be an indication of priority between the different working paths so that an implementation can decide whether a new failure should be allowed to preempt a protection switch already in place. The priority for a given Working path is determined by the value used to represent that path in the FPath @@ -363,42 +360,42 @@ 2.2. Definitions and Terminology The terminology used in this document is based on the terminology defined in [RFC4427] and further adapted for MPLS-TP in [SurvivFwk]. In addition, we use the term LER to refer to an MPLS Network Element, whether it is an LSR, LER, T-PE, or S-PE. 3. Use cases and scenarios - This section will present some use-cases and scenarios that should + This section presents some use-cases and scenarios that should elucidate the use of PSC for 1:n protection. 3.1. Non-locking use case: Per-node label space Non-locking protection can be used when the payload that is received from the protection path is unambiguous and can be properly forwarded without the need to explicitly establish selector and bridge configuration at the time of failure. One example where this applies is when the endpoints of the protection domain are using per-platform label space [RFC3031]. In per-node or per-platform label space, the LIB is established on a node such that it can properly switch any labeled packet regardless of input interface. Consider, as an example, the protection topology as shown in Figure 1 with four working paths - W1, W2, W3, W4 and a single protection - path, P, that connect between LER-A and LER-Z. Each packet that + path, P, that connect between LER-A and LER-Z. 
Each packet transported from LER-A to LER-Z is labelled by LER-A depending upon - the path that it is being transmitted over. From there the packet - will traverse the relevant path and have its label manipulated by the + the path used to transmit the packet. From there the packet will + traverse the relevant path and have its label manipulated by the intermediate LSRs until it arrives at LER-Z, at which point, the LER will pop the label for the path used within the protection domain and process the next label down to determine how to forward the packet payload. The following table gives the label assigned by LER-A and the one expected by LER-Z for each of the transport paths: +------+----------------+-----------------+ | Path | Label at LER-A | Label for LER-Z | +------+----------------+-----------------+ | W1 | 100 | 105 | @@ -764,29 +761,29 @@ There are multiple scenarios of preemption depending on where the failures were detected. In addition to the combinations of failure directionality and preemption, it is also necessary to consider how these combinations behave in both the locking and non-locking modes of operation. First consider, the two flavors of preemption due to multiple unidirectional failures. - The difference between Locking and Non-Locking is that in Non-Locking - a node can continue to send traffic on the P-LSP during the - preemption process. The P-LSP contents may momentarily disagree (A - may send W1 on P, Z may send W2 on P) but in the non-locking case - there is no risk of misconnectivity as explained in the previous - discussion. For this reason, the identity of the path that the - endpoints are selecting incoming traffic from are irrelevant. In a - sense there is no selector; each node is able to properly process - arbitrary data on the P-LSP. + The difference between Locking and Non-Locking modes is that a node + can continue to send traffic on the P-LSP during the preemption + process, when operating in Non-Locking mode. 
The P-LSP contents may + momentarily disagree (A may send W1 on P, Z may send W2 on P) but in + the non-locking case there is no risk of misconnectivity as explained + in the previous discussion. For this reason, the identity of the + path that the endpoints are selecting incoming traffic from are + irrelevant. In a sense there is no selector; each node is able to + properly process arbitrary data on the P-LSP. However, WFA state is still necessary in order to ensure that the endpoints converge on the identity of the working path whose traffic is being transported on the P-LSP. Failure to converge is a problem that should be flagged to the operator. The scenarios start after the two endpoints have converged on protecting a unidirectional SF condition that was detected on W2, when a new SF condition is detected on W1 (with higher priority): @@ -986,33 +983,34 @@ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | TLV Length | Reserved2 | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ~ Optional TLVs ~ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Figure 10: Format of basic PSC packet with a G-ACh header In regards to the G-ACh Header no changes are suggested in the extensions for 1:n protection, i.e., the channel type field will - continue to use the PSC-CT value defined in [LinProt]. The fields - from the PSC payload which are affected by this document are the Ver - field, the Reserved1 field, and the Fpath and Path fields. + continue to use the PSC-CT value defined in [LinProt]. The PSC + payload fields affected by this document are the Ver field, Reserved1 + field, and the Fpath and Path fields. 4.2. Changes to PSC Payload In order to support 1:n protection there is a need to make one small change to the format of the PSC payload (see Figure 11). In particular, we have added a new flag (L), taken from the Reserved1 - space, to whether the protection domain is locking or non-locking. 
- In addition, the semantics of the FPath and Path field are adjusted - to indicate an index of the multiple working paths. The details of - these changes are supplied in the following subsections. + space, that is used to indicate whether the protection domain is + operating in locking or non-locking mode. In addition, the semantics + of the FPath and Path field are adjusted to indicate an index of the + multiple working paths. The details of these changes are supplied in + the following subsections. Due to the significance of these changes, the value of the Ver field (in the PSC payload) for 1:n protection domain MUST be set to 2. 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |Ver|Request|PT |R|L| Reserved1 | FPath | Path | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | TLV Length | Reserved2 | @@ -1020,26 +1018,26 @@ ~ Optional TLVs ~ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Figure 11: Format of 1:n PSC message payload 4.2.1. Locking (L) flag The Locking flag is used to indicate that the end-point is configured for Locking mode (see Section 1.2). - If the value is 1 then the protection-domain is using the locking + If the value is 1 then the protection-domain is operating in locking mode. The Locking flag must be the same on both ends; if the two endpoints of a protection domain have different L-flag settings, this MUST - raise an error to the network operator + raise an error to the network operator. 4.2.2. Fault path (FPath) field The Fpath field indicates which path is identified to be in a fault condition or affected by an administrative command. The following are the possible values: o 0: indicates that the anomaly condition is on the protection path o 1-128: indicates that the anomaly condition is on a working path @@ -1064,22 +1062,22 @@ o 129-255: for future extensions or experimental use. 4.3. 
Changes to PSC Operation In all of the following subsections, assume a protection domain between LER-A and LER-Z, using working paths 1-N and the protection path as shown in Figure 1. A basic premise of this protection architecture is that both - endpoints of the protection domain are configured to associate the - indices of the working paths with the proper LSP identifiers. If + endpoints of the protection domain MUST be configured to associate + the indices of the working paths with the proper LSP identifiers. If this condition is not met then the protection scheme will cause inconsistencies in traffic transmission. 4.3.1. Basic operation Protection of the N working paths is based on the operational principles outlined in [LinProt] and will employ the same basic Protection State Coordination Protocol (PSC) outlined in that document. However, as can be expected, due to certain basic differences in the architecture of the protection domain, a small set @@ -1174,21 +1172,21 @@ of the WFA timer SHOULD be configured to allow protection switching within the normal time constraints. The WFA timer will expire only if no Acknowledge message was received by the LER in WFA state. The WFA Expires local input should have a priority just below that of the WTRExpires signal. 4.3.5. Additional PSC State As described above and demonstrated in the scenarios in Section 3.3, there is a need, in some scenarios, for the endpoint that is - reporting on a trigger for protection-switching to delay the actual + reporting a trigger for protection-switching to delay the actual switchover until an acknowledge is received from the far end LER. In order to facilitate this wait period it is necessary to define a new PSC State - Wait for Acknowledge (WFA) state. WFA is used in both the Locking and Non-Locking cases. It is more essential to the Locking mode of operation, as agreement is the mechanism to establish and release the lock on the protection LSP. 
However, it is necessary for the Non-Locking mode as a persistent disagreement on the contents of the protection LSP indicates an error in the network devices and WFA is the method used to detect this error. @@ -1405,21 +1403,21 @@ [SurvivFwk] Sprecher, N., Farrel, A., and H. Shah, "Multi-protocol Label Switching Transport Profile Survivability Framework", RFC 6372, Feb 2009. [SecureFwk] Fang, L., Niven-Jenkins, B., Mansfield, S., Zhang, R., Bitar, N., Daikoku, M., and L. Wang, "MPLS-TP Security Framework", - ID draft-ietf-mpls-tp-security-framework-02.txt, Feb 2011. + ID draft-ietf-mpls-tp-security-framework-07.txt, Jan 2013. Appendix A. PSC state machine tables Note/Disclaimer: This state machine is not currently in sync with the text of the document and will be updated in a future revision. The full PSC state machine is described in [LinProt], both in textual and tabular form. This appendix highlights the changes to the basic PSC state machine. In the event of a mismatch between these tables and the text either in [LinProt] or in this document, the text is