< draft-boutros-nvo3-mac-move-over-geneve-00.txt   draft-boutros-nvo3-mac-move-over-geneve-01.txt >
INTERNET-DRAFT Sami Boutros INTERNET-DRAFT Sami Boutros
Intended Status: Standard Track Jerome Catrouillet Intended Status: Standard Track Jerome Catrouillet
Ankur Sharma
VMware VMware
Expires: April 30, 2018 October 27, 2017 Expires: January 8, 2020 July 7, 2019
MAC move/flush over Geneve encapsulation MAC move over Geneve encapsulation
draft-boutros-nvo3-mac-move-over-geneve-00 draft-boutros-nvo3-mac-move-over-geneve-01
Abstract Abstract
This document specifies a mechanism to signal Media Access Control This document specifies a mechanism to signal Media Access Control
(MAC) addresses move or flush over a Network Virtualization Overlays (MAC) addresses move over a Network Virtualization Overlays over
over Layer 3 (NVO3) virtual tunnel. Such notification is useful in Layer 3 (NVO3) virtual tunnel. Such notification is useful in
redundancy scenarios when a Layer 2 service that was active on a redundancy scenarios when a Layer 2 service that was active on a
Network Virtualization Edge (NVE) fails over to a standby NVE. This Network Virtualization Edge (NVE) fails over to a standby NVE. This
notification can be used only when data plane mac learning is enabled notification can be used only when data plane mac learning is enabled
over the NVO3 tunnels. over the NVO3 tunnels.
Status of this Memo Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with the This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
skipping to change at page 1, line 46 skipping to change at page 1, line 45
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/1id-abstracts.html http://www.ietf.org/1id-abstracts.html
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html http://www.ietf.org/shadow.html
Copyright and License Notice Copyright and License Notice
Copyright (c) 2017 IETF Trust and the persons identified as the Copyright (c) 2019 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.1 Terminology . . . . . . . . . . . . . . . . . . . . . . . . 3
3. Abbreviations . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.2 Abbreviations . . . . . . . . . . . . . . . . . . . . . . . 3
4.0 MAC Move/Flush Frame Format . . . . . . . . . . . . . . . . 4 2. MAC Move Frame Format . . . . . . . . . . . . . . . . . . . . . 5
5.0 Operation . . . . . . . . . . . . . . . . . . . . . . . . . 5 3. Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
5.1 Operation of Sender . . . . . . . . . . . . . . . . . . . . 5 3.1 Operation of Sender . . . . . . . . . . . . . . . . . . . . 6
5.2 Operation of Receiver . . . . . . . . . . . . . . . . . . . 6 3.2 Operation of Receiver . . . . . . . . . . . . . . . . . . . 7
6. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 7 4. Security Considerations . . . . . . . . . . . . . . . . . . . . 7
7. Security Considerations . . . . . . . . . . . . . . . . . . . . 7 5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 7
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 7 6 References . . . . . . . . . . . . . . . . . . . . . . . . . . 8
9. References . . . . . . . . . . . . . . . . . . . . . . . . . . 7 6.1 Normative References . . . . . . . . . . . . . . . . . . . 8
9.1 Normative References . . . . . . . . . . . . . . . . . . . 7 6.2 Informative References . . . . . . . . . . . . . . . . . . 8
9.2 Informative References . . . . . . . . . . . . . . . . . . 7 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 8
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 7
1. Introduction 1 Introduction
In multi-homing scenarios a Layer 2 service can be multi homed to In multi-homing scenarios a Layer 2 service can be multi-homed to
more than one Network virtualization Edge (NVE). Only one NVE can be more than one Network virtualization Edge (NVE). Only one NVE can be
active for a given Layer 2 service, and a standby NVE can be chosen active for a given Layer 2 service, and a standby NVE can be chosen
to take over the Layer 2 service when the active NVE goes down. The to take over the Layer 2 service when the active NVE goes down. The
mechanisms to elect which NVE will be active or standby to provide mechanisms to elect which NVE will be active or standby to provide
single active redundancy for a given Layer 2 service is outside the single active redundancy for a given Layer 2 service is outside the
scope of this document. scope of this document.
When a standby NVE gets activated, Standby NVE needs to send a MAC When a standby NVE gets activated, Standby NVE needs to send a MAC
Move/Flush message to all remote NVE(s) that spans this L2 service Move message to all remote NVE(s) that spans this L2 service over the
over the Geneve tunnels to Flush/Move all MAC learned in data plane Geneve tunnels to Move all MAC learned in data plane via the old
via the old active NVE. active NVE.
The MAC Move/Flush message will contain the NVE Identifier(s) of the The MAC Move message will contain the NVE Identifier(s) of the old
old Active NVE and the new active NVE. Active NVE and the new active NVE.
MAC Move/Flush can be used to optimize network convergence and reduce MAC Move can be used to optimize network convergence and reduce
blackholes, when an active NVE hosting a logical L2 service fails blackholes, when an active NVE hosting a logical L2 service fails
over to a standby NVE. over to a standby NVE.
The protocol defined in this document addresses possible loss of the The protocol defined in this document addresses possible loss of the
MAC Move/Flush messages due to network congestion, but does not MAC Move messages due to network congestion, but does not guarantee
guarantee delivery. delivery.
In the event that MAC Move/Flush messages do not reach the intended In the event that MAC Move messages do not reach the intended
target, the fallback to MAC re-learning or as a last resort aging out target, the fallback to MAC re-learning or as a last resort aging out
of MAC addresses in the absence of frames from the sources, will of MAC addresses in the absence of frames from the sources, will
resume the traffic via new active NVE. resume the traffic via new active NVE.
2. Terminology 1.1 Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119]. document are to be interpreted as described in RFC 2119 [RFC2119].
3. Abbreviations 1.2 Abbreviations
NVO3 Network Virtualization Overlays over Layer 3 NVO3 Network Virtualization Overlays over Layer 3
OAM Operations, Administration, and Maintenance OAM Operations, Administration, and Maintenance
TLV Type, Length, and Value TLV Type, Length, and Value
VNI Virtual Network Identifier VNI Virtual Network Identifier
NVE Network Virtualization Edge NVE Network Virtualization Edge
NVA Network Virtualization Authority NVA Network Virtualization Authority
NIC Network interface card NIC Network interface card
VTEP Virtual Tunnel End Point VTEP Virtual Tunnel End Point
Transit device Underlay network devices between NVE(s). Transit device Underlay network devices between NVE(s).
4.0 MAC Move/Flush Frame Format 2. MAC Move Frame Format
Geneve Header: Geneve Header:
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Ver| Opt Len |O|C| Rsvd. | Protocol Type | |Ver| Opt Len |O|C| Rsvd. | Protocol Type |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Virtual Network Identifier (VNI) | Reserved | | Virtual Network Identifier (VNI) | Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Geneve Option Header: Geneve Option Header:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Option Class | Type |R|R|R| Length | | Option Class | Type |R|R|R| Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Variable Option Data | | Variable Option Data |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Option Class = To be assigned by IANA (TBA). Option Class = To be assigned by IANA (TBA).
Type = TBA. Type = TBA.
'C' bit set, indicating endpoints must drop the packet if they do not 'C' bit: endpoints must drop the packet if they do not recognize
recognize this option this option
Length = 2 (8 bytes) Length = 2 (8 bytes)
Variable option data: Variable option data:
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version|Flags |A|R| old active VTEP ID | |Version|Flags |A|R| old active VTEP ID |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Reserved (all zeros) | new active VTEP ID | |Reserved | new active VTEP ID |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sequence Number | | Sequence Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Version (4 bits): Initially the Version will be 0. Version (4 bits): Initially the Version will be 0.
A (1 bit): is set by a receiver to acknowledge receipt and processing A (1 bit): Set by a receiver to acknowledge receipt and processing
of a MAC Flush message. of a MAC Move message.
R (1 bit): is set to indicate if the sender is requesting reset of R (1 bit): Set to indicate if the sender is requesting reset of
the sequence numbers. The sender sets this bit when it has no local the sequence numbers. The sender sets this bit when it has no local
record of previous send and expected receive sequence numbers. record of previous send and expected receive sequence numbers.
Flags (6 bits): Reserved and should be set to 0. Flags (6 bits): Reserved and should be set to 0.
VTEP ID (20 bits): Identifies an NVE, for old and new active NVE(s), VTEP ID (20 bits): Identifies old and new active NVE(s).
the new active NVE identifier will be set in case of a MAC move, and
will be 0 for a MAC flush.
Sequence Number (32 bits): For overflow detection a sequence number Sequence Number (32 bits): For overflow detection a sequence number
that exceeds 2,147,483,647 (0x7FFFFFFF) is considered an overflow and that exceeds (0x7FFFFFFF) is considered an overflow and
reset to 1. reset to 1.
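The option data layout above can be sketched in Python as follows. This is an illustration only, not part of either draft revision: the function names are hypothetical, and the sketch follows the diagram (three 32-bit words, 12 bytes), which does not match the stated "Length = 2 (8 bytes)". Bit positions assume the usual RFC convention that bit 0 is the most significant bit.

```python
import struct

VTEP_ID_MAX = 0xFFFFF  # VTEP IDs are 20 bits wide


def pack_mac_move(old_vtep, new_vtep, seq, ack=False, reset=False, version=0):
    """Build the option data as three 32-bit words in network byte order."""
    for vtep in (old_vtep, new_vtep):
        if not 0 <= vtep <= VTEP_ID_MAX:
            raise ValueError("VTEP ID must fit in 20 bits")
    # Word 0: Version(4) | Flags(6, zero) | A(1) | R(1) | old active VTEP ID(20)
    word0 = ((version & 0xF) << 28) | (int(ack) << 21) | (int(reset) << 20) | old_vtep
    # Word 1: Reserved(12, zero) | new active VTEP ID(20)
    word1 = new_vtep
    # Word 2: Sequence Number(32)
    return struct.pack("!III", word0, word1, seq)


def unpack_mac_move(data):
    """Parse 12 bytes of option data back into named fields."""
    word0, word1, seq = struct.unpack("!III", data)
    return {
        "version": word0 >> 28,
        "ack": bool((word0 >> 21) & 1),
        "reset": bool((word0 >> 20) & 1),
        "old_vtep": word0 & VTEP_ID_MAX,
        "new_vtep": word1 & VTEP_ID_MAX,
        "seq": seq,
    }
```

For example, `pack_mac_move(0x12345, 0xABCDE, 2, reset=True)` yields the bytes `00 11 23 45 00 0a bc de 00 00 00 02`.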
5.0 Operation 3. Operation
This section describes how the initial MAC Flush/Move Messages are This section describes how the initial MAC Move Messages are sent and
sent and retransmitted, as well as how the messages are processed and retransmitted, as well as how the messages are processed and
retransmitted messages are identified. The mechanisms described are retransmitted messages are identified. The mechanisms described are
very similar to the one defined in [RFC 7769]. very similar to the one defined in [RFC 7769].
5.1 Operation of Sender 3.1 Operation of Sender
At the NVE, each L2 logical switch identified by a VNI is associated At the NVE, each L2 logical switch identified by a VNI is associated
with a counter to keep track of the sequence number of the with a counter to keep track of the sequence number of the
transmitted MAC Move/Flush messages. Whenever a node sends a MAC transmitted MAC Move messages. Whenever a node sends a MAC Move
Move/Flush message, it increments the transmitted sequence-number message, it increments the transmitted sequence-number counter and
counter and includes the new sequence number in the message. includes the new sequence number in the message.
The transmit sequence number is initialized to 1 at the onset, after The transmit sequence number is initialized to 1 at the onset, after
the wrap and after the sequence number reset request receipt. Hence the wrap and after the sequence number reset request receipt. Hence
the transmit sequence number is set to 2 in the first MAC Flush/Move the transmit sequence number is set to 2 in the first MAC Move
message sent after the sequence number is initialized to 1. message sent after the sequence number is initialized to 1.
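A minimal sketch of the per-VNI transmit counter described above, under one possible reading of the text: the counter starts at 1, is incremented before each new message (so the first message carries 2), and is re-initialized to 1 when the next increment would exceed 0x7FFFFFFF. The class and constant names are hypothetical.

```python
SEQ_INIT = 1          # value at onset, after a wrap, and after a reset request
SEQ_MAX = 0x7FFFFFFF  # values above this are treated as overflow


class TxSequence:
    """Transmit sequence counter for one L2 logical switch (one VNI)."""

    def __init__(self):
        self.value = SEQ_INIT

    def reset(self):
        # Called when a sequence number reset request is received.
        self.value = SEQ_INIT

    def next(self):
        # Re-initialize first if incrementing would overflow, then increment,
        # so the first message after any initialization carries 2.
        if self.value >= SEQ_MAX:
            self.value = SEQ_INIT
        self.value += 1
        return self.value
```

A retransmission reuses the number returned by the most recent `next()` call rather than calling `next()` again.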
The sender expects an ACK from the receiver within a retransmit time The sender expects an ACK from the receiver within a retransmit time
interval, which can be either a default (1 second) or a configured interval, which can be either a default (1 second) or a configured
value. If the ACK is not received within the Retransmit time, the value. If the ACK is not received within the Retransmit time, the
sender retransmits the message with the same sequence number as the sender retransmits the message with the same sequence number as the
original message. The retransmission MUST cease when an ACK is original message. The retransmission MUST cease when an ACK is
received. In order to avoid continuous re-transmissions in the received. In order to avoid continuous re-transmissions in the
absence of acknowledgements, the sender MUST cease retransmission absence of acknowledgements, the sender MUST cease retransmission
after a small number of transmissions, two retries is RECOMMENDED. after a small number of transmissions, two retries is RECOMMENDED.
Alternatively, an increasing backoff delay with a larger number of Alternatively, an increasing backoff delay with a larger number of
retries MAY be implemented to improve scaling issues. retries MAY be implemented to improve scaling issues.
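The retransmission timing can be illustrated with a small helper that lists when retransmissions would fire if no ACK arrives. This is a sketch, not text from the draft: the function name and the multiplicative `backoff` parameter are assumptions. The defaults follow the 1 second interval and the RECOMMENDED two retries.

```python
def retransmit_offsets(interval=1.0, retries=2, backoff=1.0):
    """Return the times (seconds after the original send) at which each
    retransmission fires, assuming no ACK is ever received.

    backoff=1.0 gives a fixed interval; backoff>1.0 gives an increasing delay.
    """
    offsets = []
    elapsed = 0.0
    delay = interval
    for _ in range(retries):
        elapsed += delay
        offsets.append(elapsed)
        delay *= backoff
    return offsets
```

With the defaults the sender retransmits at 1 s and 2 s and then gives up. With `retries=4, backoff=2.0` it retransmits at 1 s, 3 s, 7 s and 15 s. In either case the sender stops as soon as an ACK for that sequence number arrives.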
During the period of retransmission, if a need to send a new MAC During the period of retransmission, if a need to send a new MAC Move
Move/Flush message with updated sequence number arises, then message with updated sequence number arises, then retransmission of
retransmission of the older unacknowledged Move/Flush message MUST be the older unacknowledged Move message MUST be suspended and
suspended and retransmit time for the new sequence number MUST be retransmit time for the new sequence number MUST be initiated. In
initiated. In essence, a sender engages in retransmission logic only essence, a sender engages in retransmission logic only for the most
for the most recently sent Move/Flush message for a given L2 Logical recently sent Move message for a given L2 Logical Switch identified
Switch identified by a VNI. by a VNI.
In the event that the L2 logical switch is deleted and re-added or In the event that the L2 logical switch is deleted and re-added or
the VTEP node is restarted with new configuration, the NVE may lose the VTEP node is restarted with new configuration, the NVE may lose
information about the previously sent sequence number. This becomes information about the previously sent sequence number. This becomes
problematic for the remote peer as it will continue to ignore the problematic for the remote peer as it will continue to ignore the
received MAC Move/Flush messages with lower sequence numbers. In received MAC Move messages with lower sequence numbers. In such
such cases, it is desirable to reset the sequence numbers, the reset cases, it is desirable to reset the sequence numbers, the reset R-bit
R-bit is set in the first MAC Flush to notify the remote peer to is set in the first MAC Move to notify the remote peer to reset the
reset the send and receive sequence numbers. The R-bit must be send and receive sequence numbers. The R-bit must be cleared in
cleared in subsequent MAC Move/Flush messages after the subsequent MAC Move messages after the acknowledgement is received.
acknowledgement is received.
5.2 Operation of Receiver 3.2 Operation of Receiver
Each L2 logical switch identified by a VNI is associated with a Each L2 logical switch identified by a VNI is associated with a
receive sequence number per remote NVE to keep track of the expected receive sequence number per remote NVE to keep track of the expected
sequence number of the MAC Move/Flush message. sequence number of the MAC Move message.
Whenever a MAC Move/Flush message is received, and if the sequence Whenever a MAC Move message is received, and if the sequence number
number on the message is greater than the value in the receive on the message is greater than the value in the receive sequence
sequence number of this remote NVE, the MAC addresses learned from number of this remote NVE, the MAC addresses learned from the NVE
the NVE associated with the NVE identifier in the message are flushed associated with the NVE identifier in the message are moved to be
or moved to be associated with the new active NVE identifier, and the associated with the new active NVE identifier, and the receive
receive sequence number of the remote NVE is updated with the sequence number of the remote NVE is updated with the received
received sequence number. The receiver sends an ACK with the same sequence number. The receiver sends an ACK with the same sequence
sequence number in the received message. number in the received message.
If the sequence number in the received message is smaller than or If the sequence number in the received message is smaller than or
equal to the value in the receive sequence number per remote NVE, the equal to the value in the receive sequence number per remote NVE, the
MAC Move/Flush is not processed. However, an ACK with the received MAC Move is not processed. However, an ACK with the received sequence
sequence number MUST be sent as a response to stop the sender number MUST be sent as a response to stop the sender retransmission.
retransmission.
A MAC Move/Flush message with the R-bit set MUST be processed by
resetting the receive sequence number of the remote NVE, and
Moving/flushing the MACs as described above. The acknowledgement is
sent with the R-bit cleared.
6. Acknowledgements A MAC Move message with the R-bit set MUST be processed by resetting
the receive sequence number of the remote NVE, and Moving the MACs as
described above. The acknowledgement is sent with the R-bit cleared.
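The receiver rules above can be sketched as follows for the MAC Move case. The class name, the dictionary-based MAC table, and the ACK represented as a plain dictionary are all hypothetical. The sketch also assumes that a reset sets the receive sequence number to 1, mirroring the sender's initial value; the text does not state the reset value.

```python
RX_SEQ_INIT = 1  # assumed initial/reset value, mirroring the sender's counter


class MacMoveReceiver:
    def __init__(self):
        self.rx_seq = {}     # (vni, sending NVE) -> last processed sequence number
        self.mac_table = {}  # (vni, MAC address) -> VTEP ID the MAC was learned from

    def handle(self, vni, sender, old_vtep, new_vtep, seq, reset=False):
        """Process one MAC Move message; return (ack, number of MACs moved)."""
        key = (vni, sender)
        if reset:
            # R-bit set: reset the receive sequence number for this remote NVE.
            self.rx_seq[key] = RX_SEQ_INIT
        moved = 0
        if seq > self.rx_seq.get(key, RX_SEQ_INIT):
            # Newer message: re-point MACs learned via the old active NVE.
            for entry, vtep in list(self.mac_table.items()):
                if entry[0] == vni and vtep == old_vtep:
                    self.mac_table[entry] = new_vtep
                    moved += 1
            self.rx_seq[key] = seq
        # Always ACK with the received sequence number and the R-bit cleared,
        # even for stale or duplicate messages, so the sender stops retransmitting.
        ack = {"ack": True, "reset": False, "seq": seq}
        return ack, moved
```

A duplicate of an already processed message therefore moves nothing but is still acknowledged.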
7. Security Considerations 4. Security Considerations
This document does not introduce any additional security constraints. This document does not introduce any additional security constraints.
8. IANA Considerations 5. IANA Considerations
IANA is requested to assign a new option class from the "Geneve
IANA is requested to assign a new option class from the "Geneve Option Class" registry for the Geneve MAC Move option.
Option Class" registry for the Geneve MAC Move/Flush option.
Option Class Description Option Class Description
------------ --------------------- ------------ ---------------------
XXXX Geneve MAC Move/Flush XXXX Geneve MAC Move
9. References 6 References
9.1 Normative References 6.1 Normative References
[KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate [KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997. Requirement Levels", BCP 14, RFC 2119, March 1997.
9.2 Informative References 6.2 Informative References
[Geneve] "Generic Network Virtualization Encapsulation", [I-D.ietf- [Geneve] "Generic Network Virtualization Encapsulation", [I-D.ietf-
nvo3-geneve] [RFC 7769] "MAC Address Withdrawal over Static PW", [RFC nvo3-geneve]
7769] [RFC 7769] "MAC Address Withdrawal over Static PW", [RFC 7769]
Authors' Addresses Authors' Addresses
Sami Boutros Sami Boutros
VMware VMware
Email: sboutros@vmware.com Email: boutross@vmware.com
Jerome Catrouillet Jerome Catrouillet
VMware, Inc. VMware
Email: jcatrouillet@vmware.com Email: jcatrouillet@vmware.com
Ankur Sharma Ankur Sharma
VMware, Inc. VMware
Email: ankursharma@vmware.com Email: ankursharma@vmware.com
 End of changes. 63 change blocks. 
144 lines changed or deleted 133 lines changed or added
