NVO3 working group                                          A. Ghanwani
Internet Draft                                                     Dell
Intended status: Informational                                L. Dunbar
Expires: November 8, 2017                                    M. McBride
                                                                 Huawei
                                                              V. Bannai
                                                                 Google
                                                            R. Krishnan
                                                                   Dell

                                                      February 16, 2017


      A Framework for Multicast in Network Virtualization Overlays
                  draft-ietf-nvo3-mcast-framework-07
Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79. This document may not be modified,
   and derivative works of it may not be created, except to publish it
   as an RFC and to translate it into languages other than English.
   ...
Table of Contents

   1. Introduction
      1.1. Infrastructure multicast
      1.2. Application-specific multicast
      1.3. Terminology clarification
   2. Acronyms
   3. Multicast mechanisms in networks that use NVO3
      3.1. No multicast support
      3.2. Replication at the source NVE
      3.3. Replication at a multicast service node
      3.4. IP multicast in the underlay
      3.5. Other schemes
   4. Simultaneous use of more than one mechanism
   5. Other issues
      5.1. Multicast-agnostic NVEs
      5.2. Multicast membership management for DC with VMs
   6. Summary
   7. Security Considerations
   8. IANA Considerations
   9. References
      9.1. Normative References
      9.2. Informative References
   10. Acknowledgments
1. Introduction

   Network virtualization using Overlays over Layer 3 (NVO3) is a
   technology that is used to address issues that arise in building
   large, multitenant data centers that make extensive use of server
   virtualization [RFC7364].

   This document provides a framework for supporting multicast traffic,
   ...
2. Acronyms

   ASM: Any-Source Multicast

   IGMP: Internet Group Management Protocol

   LISP: Locator/ID Separation Protocol

   MSN: Multicast Service Node

   NVA: Network Virtualization Authority

   NVE: Network Virtualization Edge

   NVGRE: Network Virtualization using GRE

   PIM: Protocol-Independent Multicast

   RLOC: Routing Locator

   SSM: Source-Specific Multicast

   TS: Tenant System

   VM: Virtual Machine

   VN: Virtual Network

   VTEP: VXLAN Tunnel End Point

   VXLAN: Virtual eXtensible LAN
3. Multicast mechanisms in networks that use NVO3

   In NVO3 environments, traffic between NVEs is transported using an
   encapsulation such as Virtual eXtensible Local Area Network (VXLAN)
   [RFC7348, VXLAN-GPE], Network Virtualization Using Generic Routing
   Encapsulation (NVGRE) [RFC7637], Geneve [Geneve], Generic UDP
   Encapsulation (GUE) [GUE], etc.
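
   As a concrete illustration of the encapsulation step (not part of
   any NVO3 specification), the following minimal Python sketch builds
   the 8-byte VXLAN header of [RFC7348] that an NVE would prepend to a
   tenant frame before adding the outer UDP/IP headers; the frame
   contents and VNI value are hypothetical.

      import struct

      VXLAN_PORT = 4789  # IANA-assigned UDP port for VXLAN [RFC7348]

      def vxlan_encapsulate(inner_frame: bytes, vni: int) -> bytes:
          """Prepend the 8-byte VXLAN header: flags byte 0x08 (the I
          bit, indicating a valid VNI), 24 reserved bits, the 24-bit
          VNI, and 8 reserved bits."""
          if not 0 <= vni < 2**24:
              raise ValueError("VNI is a 24-bit value")
          header = struct.pack("!II", 0x08 << 24, vni << 8)
          return header + inner_frame

      # Encapsulate a dummy tenant frame for virtual network 5001; in
      # a real NVE, the result is sent in a UDP datagram addressed to
      # the egress NVE (unicast) or to an underlay multicast group.
      packet = vxlan_encapsulate(b"\x00" * 60, vni=5001)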
   ...
   packet. For the purpose of ARP/ND, this would involve knowing the
   IP addresses of all the NVEs that have TSs in the virtual network
   (VN) of the TS that generated the request. For the support of
   application-specific multicast traffic, a method similar to that of
   receiver-sites registration for a particular multicast group
   described in [LISP-Signal-Free] can be used. The registrations from
   different receiver-sites can be merged at the NVA, which can
   construct a multicast replication-list inclusive of all NVEs to
   which receivers for a particular multicast group are attached. The
   replication-list for each specific multicast group is maintained by
   the NVA. Note: using LISP signal-free multicast does not
   necessarily mean that the head-end (i.e., the NVE) must do the
   replication. If the mapping database (i.e., the NVA) indicates that
   packets are encapsulated to multicast RLOCs, then no replication
   happens at the NVE.
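
   The following is a minimal sketch, assuming a simple in-memory NVA,
   of how such receiver-site registrations could be merged into per-
   group replication-lists; the class and method names are
   hypothetical and only illustrate the bookkeeping described above.

      from collections import defaultdict

      class ReplicationListNVA:
          """Toy NVA that merges receiver-site registrations per
          multicast group, in the style of [LISP-Signal-Free]."""

          def __init__(self):
              # (vn_id, group) -> set of egress NVE addresses (or a
              # multicast RLOC, in which case the ingress NVE need
              # not replicate at all).
              self._lists = defaultdict(set)

          def register(self, vn_id, group, nve_addr):
              """Called when an egress NVE learns, via IGMP/MLD
              snooping, that an attached TS joined 'group'."""
              self._lists[(vn_id, group)].add(nve_addr)

          def withdraw(self, vn_id, group, nve_addr):
              self._lists[(vn_id, group)].discard(nve_addr)

          def replication_list(self, vn_id, group):
              """Returned to the ingress NVE, which sends one
              encapsulated copy per listed address."""
              return set(self._lists[(vn_id, group)])

      nva = ReplicationListNVA()
      nva.register(5001, "239.1.1.1", "192.0.2.11")
      nva.register(5001, "239.1.1.1", "192.0.2.12")
      assert nva.replication_list(5001, "239.1.1.1") == {"192.0.2.11",
                                                         "192.0.2.12"}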
   The receiver-sites registration is achieved by egress NVEs
   performing IGMP/MLD snooping to maintain state for which attached
   TSs have subscribed to a given IP multicast group. When the members
   of a multicast group are outside the NVO3 domain, it is necessary
   for NVO3 gateways to keep track of the remote members of each
   multicast group. The NVEs and NVO3 gateways then communicate the
   multicast groups that are of interest to the NVA. If the membership
   is not communicated to the NVA, and if it is necessary to prevent
   hosts attached to an NVE that have not subscribed to a
   ...
   This mechanism is similar to that used by the Asynchronous Transfer
   Mode (ATM) Forum's LAN Emulation (LANE) specification [LANE]. The
   MSN is similar to the RP in PIM-SM, but differs in that the user
   data traffic is carried by the NVO3 tunnels.
   The following are the possible ways for the MSN to get the
   membership information for each multicast group:
   - The MSN can obtain this membership information from the IGMP/MLD
     report messages sent by TSs in response to IGMP/MLD query
     messages from the MSN. The IGMP/MLD query messages are sent from
     the MSN to the NVEs, which then multicast the query messages to
     the TSs attached to them. An IGMP/MLD query message sent by the
     MSN to an NVE is encapsulated with the MSN address in the outer
     source address field and the address of the NVE in the outer
     destination address field. The encapsulated IGMP/MLD query
     message also has the VNID of the virtual network (VN) to which
     the TSs belong in the outer header and a multicast address in
     the inner destination address field. Upon receiving the
     encapsulated IGMP/MLD query message, the NVE establishes a
     mapping "MSN address" <-> "multicast address", decapsulates the
     message, and multicasts the decapsulated IGMP/MLD query message
     to the TSs that belong to the VN under the NVE. An IGMP/MLD
     report message sent by a TS includes the multicast address and
     the address of the TS. With the proper "MSN address" <->
     "multicast address" mapping, the NVEs can encapsulate all
     multicast data frames sent by TSs to that multicast address with
     the address of the MSN in the outer destination address field
     (see the sketch following this list).
   - The MSN can obtain the membership information from the NVEs that
     have the capability to establish multicast groups by snooping
     native IGMP/MLD messages (note: the communication must be
     specific to the multicast addresses), or by having the NVA
     obtain the information from the NVEs and in turn have the MSN
     communicate with the NVA. This approach requires an additional
     protocol between the MSN and the NVEs.
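
   To make the first option concrete, here is a minimal sketch of the
   NVE-side processing: learning the "MSN address" <-> "multicast
   address" mapping from an encapsulated query and later using it to
   steer tenant multicast data to the MSN. All names are hypothetical,
   and the header handling is reduced to function arguments.

      class MsnAwareNVE:
          """Toy NVE that learns MSN <-> multicast-group mappings
          from encapsulated IGMP/MLD query messages."""

          def __init__(self):
              self.msn_for_group = {}  # (vnid, group) -> MSN address

          def on_encapsulated_query(self, outer_src, vnid, group, query):
              # Learn the mapping, then decapsulate and multicast the
              # query to the TSs of this VN so they answer with reports.
              self.msn_for_group[(vnid, group)] = outer_src
              self.deliver_to_vn(vnid, query)

          def on_tenant_multicast_data(self, vnid, group, frame):
              # Multicast data from a TS is tunneled to the MSN, which
              # replicates it toward the NVEs that have group members.
              msn = self.msn_for_group.get((vnid, group))
              if msn is not None:
                  self.send_tunnel(dst=msn, vnid=vnid, payload=frame)

          def deliver_to_vn(self, vnid, frame):
              pass  # placeholder: hand the frame to local TSs

          def send_tunnel(self, dst, vnid, payload):
              pass  # placeholder: NVO3 encapsulation toward 'dst'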
   Unlike the method described in Section 3.2, there is no performance
   ...
   attached VMs and support proper mapping of IGMP/MLD messages to the
   messages needed by the underlay IP multicast protocols.
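
   A minimal sketch of that mapping, assuming a hypothetical one-to-
   one assignment of tenant (VN, group) pairs to underlay groups drawn
   from an operator-configured pool:

      class UnderlayGroupMapper:
          """Toy mapping from snooped tenant IGMP/MLD state to the
          underlay IP multicast group the NVE should join."""

          def __init__(self, underlay_groups):
              self.pool = list(underlay_groups)  # free underlay groups
              self.assigned = {}  # (vn_id, inner_group) -> underlay group

          def join(self, vn_id, inner_group):
              """Called when a TS's IGMP/MLD report is snooped; the
              NVE then issues an IGMP/MLD join for the returned
              underlay group, which builds or extends the underlay
              multicast tree."""
              key = (vn_id, inner_group)
              if key not in self.assigned:
                  self.assigned[key] = self.pool.pop()
              return self.assigned[key]

      mapper = UnderlayGroupMapper(
          ["239.192.0.%d" % i for i in range(1, 255)])
      g = mapper.join(5001, "239.1.1.1")  # NVE joins 'g' in the underlay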
   With this method, there are none of the issues associated with the
   methods described in Section 3.2.
   With PIM Sparse Mode (PIM-SM), the number of flows required would
   be (n*g), where n is the number of source NVEs that source packets
   for the group, and g is the number of groups. Bidirectional PIM
   (BIDIR-PIM) would offer better scalability, with the number of
   flows required being g. Unfortunately, many vendors still do not
   fully support BIDIR-PIM or have limitations on its implementation.
   [RFC6831] provides a good description of using SSM as an
   alternative to BIDIR-PIM, provided that the VTEP/NVE devices have a
   way to learn of each other's IP addresses so that they can join all
   the SSM shortest-path trees (SPTs) to create and maintain an
   underlay SSM IP multicast tunnel solution.
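
   As a worked example of the state comparison above (with purely
   illustrative numbers):

      def underlay_flows(n: int, g: int, bidir: bool) -> int:
          """Multicast forwarding entries in the underlay: n*g
          source-specific trees with PIM-SM, g shared trees with
          BIDIR-PIM."""
          return g if bidir else n * g

      # For example, 200 source NVEs and 1,000 groups:
      assert underlay_flows(200, 1000, bidir=False) == 200000  # PIM-SM
      assert underlay_flows(200, 1000, bidir=True) == 1000     # BIDIR-PIM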
   In the absence of any additional mechanism (e.g., using an NVA for
   address resolution), for optimal delivery there would have to be a
   separate group for each tenant, plus a separate group for each
   multicast address (used for multicast applications) within a
   tenant.
   Additional considerations are that only the lower 23 bits of the IP
   address (regardless of whether IPv4 or IPv6 is in use) are mapped
   to the outer MAC address, and if there is equipment that prunes
   multicasts at Layer 2, there will be some aliasing.
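
   The IPv4 case of this mapping (from RFC 1112: the fixed OUI
   01:00:5e followed by the low 23 bits of the group address) can be
   sketched as follows; note how two groups that differ only in bits
   above the low 23 alias to the same outer MAC address.

      import ipaddress

      def ipv4_multicast_mac(group: str) -> str:
          """Map an IPv4 multicast group to its Layer 2 address:
          01:00:5e followed by the low 23 bits of the group."""
          low23 = int(ipaddress.IPv4Address(group)) & 0x7FFFFF
          return "01:00:5e:%02x:%02x:%02x" % (
              (low23 >> 16) & 0xFF, (low23 >> 8) & 0xFF, low23 & 0xFF)

      # 224.1.1.1 and 225.1.1.1 differ only above the low 23 bits,
      # so they collide at Layer 2:
      assert ipv4_multicast_mac("224.1.1.1") == \
             ipv4_multicast_mac("225.1.1.1")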
   ...
9.1. Normative References

   [RFC7365] Lasserre, M. et al., "Framework for data center (DC)
             network virtualization", October 2014.

   [RFC7364] Narten, T. et al., "Problem statement: Overlays for
             network virtualization", October 2014.

   [NVO3-ARCH] Black, D. et al., "An Architecture for Data-Center
             Network Virtualization over Layer 3 (NVO3)", RFC 8014,
             December 2016.

   [RFC3376] Cain, B. et al., "Internet Group Management Protocol,
             Version 3", October 2002.

   [RFC6513] Rosen, E. et al., "Multicast in MPLS/BGP IP VPNs",
             February 2012.
9.2. Informative References

   [RFC7348] Mahalingam, M. et al., "Virtual eXtensible Local Area
             Network (VXLAN): A Framework for Overlaying Virtualized
             Layer 2 Networks over Layer 3 Networks", August 2014.
   ...
   [Geneve]  Gross, J. and Ganga, I. (Eds.), "Geneve: Generic Network
             Virtualization Encapsulation", <draft-ietf-nvo3-geneve-
             01>, work in progress, January 2016.

   [GUE]     Herbert, T. et al., "Generic UDP Encapsulation", <draft-
             ietf-nvo3-gue-02>, work in progress, December 2015.

   [BIER-ARCH] Wijnands, IJ. (Ed.) et al., "Multicast using Bit Index
             Explicit Replication", <draft-ietf-bier-architecture-03>,
             January 2016.

   [RFC3819] Karn, P. (Ed.) et al., "Advice for Internet Subnetwork
             Designers", July 2004.

   [RFC6831] Farinacci, D. et al., "The Locator/ID Separation Protocol
             (LISP) for Multicast Environments", January 2013.
10. Acknowledgments

   Many thanks are due to Dino Farinacci, Erik Nordmark, Lucy Yong,
   Nicolas Bouliane, Saumya Dikshit, Joe Touch, Olufemi Komolafe, and
   Matthew Bocci for their valuable comments and suggestions.

   This document was prepared using 2-Word-v2.0.template.dot.

Authors' Addresses