draft-ietf-nvo3-mcast-framework-02.txt   draft-ietf-nvo3-mcast-framework-03.txt 
NVO3 working group A. Ghanwani NVO3 working group A. Ghanwani
Internet Draft Dell Internet Draft Dell
Intended status: Informational L. Dunbar Intended status: Informational L. Dunbar
Expires: August 9, 2016 M. McBride Expires: August 14, 2016 M. McBride
Huawei Huawei
V. Bannai V. Bannai
Google Google
R. Krishnan R. Krishnan
Dell Dell
February 10, 2016 February 15, 2016
A Framework for Multicast in NVO3 A Framework for Multicast in NVO3
draft-ietf-nvo3-mcast-framework-02 draft-ietf-nvo3-mcast-framework-03
Status of this Memo Status of this Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. This document may not be modified, provisions of BCP 78 and BCP 79. This document may not be modified,
and derivative works of it may not be created, except to publish it and derivative works of it may not be created, except to publish it
as an RFC and to translate it into languages other than English. as an RFC and to translate it into languages other than English.
skipping to change at page 1, line 43 skipping to change at page 1, line 43
months and may be updated, replaced, or obsoleted by other documents months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as at any time. It is inappropriate to use Internet-Drafts as
reference material or to cite them other than as "work in progress." reference material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html http://www.ietf.org/shadow.html
This Internet-Draft will expire on August 9 2016. This Internet-Draft will expire on August 14, 2016.
Copyright Notice Copyright Notice
Copyright (c) 2016 IETF Trust and the persons identified as the Copyright (c) 2016 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 3, line 33 skipping to change at page 3, line 33
The reader is assumed to be familiar with the terminology as defined The reader is assumed to be familiar with the terminology as defined
in the NVO3 Framework document [RFC7365] and NVO3 Architecture in the NVO3 Framework document [RFC7365] and NVO3 Architecture
document [NVO3-ARCH]. document [NVO3-ARCH].
1.1. Infrastructure multicast 1.1. Infrastructure multicast
Infrastructure multicast includes protocols such as ARP/ND, DHCP, Infrastructure multicast includes protocols such as ARP/ND, DHCP,
and mDNS. It is possible to provide solutions for these that do not and mDNS. It is possible to provide solutions for these that do not
involve multicast in the underlay network. In the case of ARP/ND, involve multicast in the underlay network. In the case of ARP/ND,
an NVA can be used for distributing the mappings of IP address to an NVA can be used for distributing the mappings of IP address to
MAC address to all NVEs, and the NVEs can respond to ARP messages MAC address to all NVEs. The NVEs can then trap ARP Request/ND
from the TSs that are attached to it in a way that is similar to Neighbor Solicitation messages from the TSs that are attached to it
proxy-ARP. In the case of DHCP, the NVE can be configured to and respond to them, thereby eliminating the need for
forward these messages using a helper function. broadcast/multicast of such messages. In the case of DHCP, the NVE
can be configured to forward these messages using a helper function.
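The ARP/ND optimization described above can be illustrated with a minimal sketch. It assumes the NVA has already pushed IP-to-MAC mappings to the NVE; the class and method names (NveArpProxy, install_mapping, handle_arp_request) are hypothetical and are not defined by the draft or by any NVO3 protocol.

```python
from typing import Dict, Optional


class NveArpProxy:
    """Hypothetical NVE-side table of IP-to-MAC mappings distributed by an NVA."""

    def __init__(self) -> None:
        self._ip_to_mac: Dict[str, str] = {}

    def install_mapping(self, ip: str, mac: str) -> None:
        # Called when the NVA distributes a mapping to this NVE.
        self._ip_to_mac[ip] = mac

    def handle_arp_request(self, target_ip: str) -> Optional[str]:
        # Trap the request from an attached TS and answer locally if the
        # mapping is known; None means the NVE has no answer and must fall
        # back to some other delivery method.
        return self._ip_to_mac.get(target_ip)


proxy = NveArpProxy()
proxy.install_mapping("10.0.0.5", "00:11:22:33:44:55")
reply = proxy.handle_arp_request("10.0.0.5")   # answered locally
miss = proxy.handle_arp_request("10.0.0.9")    # unknown target
```

A locally answered request never leaves the NVE, which is what removes the broadcast/multicast from the underlay.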
Of course it is possible to support all of these infrastructure Of course it is possible to support all of these infrastructure
multicast protocols natively if the underlay provides multicast multicast protocols natively if the underlay provides multicast
transport. However, even in the presence of multicast transport, it transport. However, even in the presence of multicast transport, it
may be beneficial to use the optimizations mentioned above to reduce may be beneficial to use the optimizations mentioned above to reduce
the amount of such traffic in the network. the amount of such traffic in the network.
1.2. Application-specific multicast 1.2. Application-specific multicast
Application-specific multicast traffic, which may be either Source- Application-specific multicast traffic, which may be either Source-
Specific Multicast (SSM) or Any-Source Multicast (ASM)[RFC3569], Specific Multicast (SSM) or Any-Source Multicast (ASM)[RFC 3569],
has the following characteristics: has the following characteristics:
1. Receiver hosts are expected to subscribe to multicast content 1. Receiver hosts are expected to subscribe to multicast content
using protocols such as IGMP [RFC3376] (IPv4) or MLD (IPv6). using protocols such as IGMP [RFC3376] (IPv4) or MLD (IPv6).
Multicast sources and listeners participate in these protocols Multicast sources and listeners participate in these protocols
using addresses that are in the Tenant System address domain. using addresses that are in the Tenant System address domain.
2. The list of multicast listeners for each multicast group is not 2. The list of multicast listeners for each multicast group is not
known in advance. Therefore, it may not be possible for an NVA known in advance. Therefore, it may not be possible for an NVA
to get the list of participants for each multicast group ahead to get the list of participants for each multicast group ahead
skipping to change at page 7, line 10 skipping to change at page 7, line 10
group. When the members of a multicast group are outside the NVO3 group. When the members of a multicast group are outside the NVO3
domain, it is necessary for NVO3 gateways to keep track of the domain, it is necessary for NVO3 gateways to keep track of the
remote members of each multicast group. The NVEs and NVO3 gateways remote members of each multicast group. The NVEs and NVO3 gateways
then communicate the multicast groups that are of interest to the then communicate the multicast groups that are of interest to the
NVA. If the membership is not communicated to the NVA, and if it is NVA. If the membership is not communicated to the NVA, and if it is
necessary to prevent hosts attached to an NVE that have not necessary to prevent hosts attached to an NVE that have not
subscribed to a multicast group from receiving the multicast subscribed to a multicast group from receiving the multicast
traffic, the NVE would need to maintain multicast group membership traffic, the NVE would need to maintain multicast group membership
information. information.
In multi-homing environments, i.e. more than one NVE can reach a In multi-homing environments, i.e. in those where a TS is attached
specific TS, the NVA would be expected to provide all the NVEs that to more than one NVE, the NVA would be expected to provide
can reach the given TS. The ingress NVE can choose any one of the information to all of the NVEs under its control about all of the
egress NVEs for the data frames destined towards the TS. NVEs to which such a TS is attached. The ingress NVE can choose any
one of the egress NVEs for the data frames destined towards the TS.
In the absence of IGMP/MLD snooping, the traffic would be delivered In the absence of IGMP/MLD snooping, the traffic would be delivered
to all hosts that are part of the VNI. to all hosts that are part of the VNI.
This method requires multiple copies of the same packet to all NVEs This method requires multiple copies of the same packet to all NVEs
that participate in the VN. If, for example, a tenant subnet is that participate in the VN. If, for example, a tenant subnet is
spread across 50 NVEs, the packet would have to be replicated 50 spread across 50 NVEs, the packet would have to be replicated 50
times at the source NVE. This also creates an issue with the times at the source NVE. This also creates an issue with the
forwarding performance of the NVE. forwarding performance of the NVE.
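The replication cost mentioned above can be sketched as follows. This is an illustration only: the function name and the NVE identifiers are hypothetical, and the example assumes 50 remote NVEs to match the figure used in the text.

```python
from typing import List, Tuple


def replicate_at_source(payload: bytes, remote_nves: List[str]) -> List[Tuple[str, bytes]]:
    # One unicast-encapsulated copy per remote NVE in the VN; the
    # encapsulation itself is omitted from this sketch.
    return [(nve, payload) for nve in remote_nves]


remote_nves = [f"nve-{i}" for i in range(50)]
copies = replicate_at_source(b"mcast-frame", remote_nves)

# Work at the source NVE and bytes sent into the underlay both grow
# linearly with the number of remote NVEs.
bytes_sent = sum(len(p) for _, p in copies)
```

With a single source the ingress NVE transmits the same payload 50 times, which is the forwarding-performance concern noted in the text.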
Note that this method is similar to what was used in VPLS [RFC4762] Note that this method is similar to what was used in VPLS [RFC4792]
prior to support of MPLS multicast [RFC7117]. While there are some prior to support of MPLS multicast [RFC7117]. While there are some
similarities between MPLS VPN and the NVO3 overlay, there are some similarities between MPLS VPN and the NVO3 overlay, there are some
key differences: key differences:
- The CE-to-PE attachment in VPNs is somewhat static, whereas in a - The CE-to-PE attachment in VPNs is somewhat static, whereas in a
DC that allows VMs to migrate anywhere, the TS attachment to NVE DC that allows VMs to migrate anywhere, the TS attachment to NVE
is much more dynamic. is much more dynamic.
- The number of PEs to which a single VPN customer is attached in - The number of PEs to which a single VPN customer is attached in
an MPLS VPN environment is normally far less than the number of an MPLS VPN environment is normally far less than the number of
NVEs to which a VNI's VMs are attached in a DC. NVEs to which a VNI's VMs are attached in a DC.
When a VPN customer has multiple multicast groups, [RFC6513] When a VPN customer has multiple multicast groups, [RFC6513]
"Multicast VPN" combines all those multicast groups within each "Multicast VPN" combines all those multicast groups within each
VPN client to one single multicast group in the MPLS (or VPN) VPN client to one single multicast group in the MPLS (or VPN)
core. The result is that messages from any of the multicast core. The result is that messages from any of the multicast
groups belonging to one VPN customer will reach all the PE nodes groups belonging to one VPN customer will reach all the PE nodes
of the client. In other words, any messages belonging to any of the client. In other words, any messages belonging to any
multicast groups under customer X will reach all PEs of the multicast groups under customer X will reach all PEs of the
customer X. When the customer X is attached to only a handful of customer X. When the customer X is attached to only a handful of
PEs, the use of this approach does not result in excessive wastage PEs, the use of this approach does not result in excessive wastage
of bandwidth in the provider's network. of bandwidth in the provider's network.
In a DC environment, a typical server/hypervisor based virtual In a DC environment, a typical server/hypervisor based virtual
switch may only support tens of VMs (as of this writing). A subnet switch may only support tens of VMs (as of this writing). A subnet
with N VMs may be, in the worst case, spread across N vSwitches. with N VMs may be, in the worst case, spread across N vSwitches.
Using the "MPLS VPN multicast" approach in such a scenario would Using the "MPLS VPN multicast" approach in such a scenario would
require the creation of a multicast group in the core for this VNI require the creation of a multicast group in the core for this VNI
to reach all N NVEs. If only a small percentage of this client's VMs to reach all N NVEs. If only a small percentage of this client's VMs
participate in application-specific multicast, a great number of participate in application-specific multicast, a great number of
NVEs will receive multicast traffic that is not forwarded to any NVEs will receive multicast traffic that is not forwarded to any
of their attached VMs, resulting in considerable wastage of of their attached VMs, resulting in considerable wastage of
bandwidth. bandwidth.
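The bandwidth concern in the last item can be made concrete with a small sketch. The numbers (100 NVEs in the VNI, 5 with subscribed VMs) and all names are assumptions chosen for illustration, not measurements.

```python
from typing import Dict, Set


def nves_receiving_unwanted_traffic(vni_nves: Set[str],
                                    subscribers_by_nve: Dict[str, int]) -> Set[str]:
    # With one core multicast group per VNI, every NVE in the VNI receives
    # the traffic; those with no subscribed VMs simply discard it.
    return {nve for nve in vni_nves if subscribers_by_nve.get(nve, 0) == 0}


vni_nves = {f"nve-{i}" for i in range(100)}
subscribers_by_nve = {f"nve-{i}": 1 for i in range(5)}  # only 5 NVEs have listeners

wasted = nves_receiving_unwanted_traffic(vni_nves, subscribers_by_nve)
wasted_fraction = len(wasted) / len(vni_nves)
```

Under these assumptions 95 of the 100 NVEs receive traffic they cannot deliver to any attached VM.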
Therefore, the Multicast VPN solution may not scale in a DC Therefore, the Multicast VPN solution may not scale in a DC
environment with dynamic attachment of Virtual Networks to NVEs and environment with dynamic attachment of Virtual Networks to NVEs and
a greater number of NVEs for each virtual network. a greater number of NVEs for each virtual network.
3.3. Replication at a multicast service node 3.3. Replication at a multicast service node
With this method, all multicast packets would be sent using a With this method, all multicast packets would be sent using a
unicast tunnel encapsulation from the ingress NVE to a multicast unicast tunnel encapsulation from the ingress NVE to a multicast
service node (MSN). The MSN, in turn, would create multiple copies service node (MSN). The MSN, in turn, would create multiple copies
skipping to change at page 9, line 39 skipping to change at page 9, line 39
NVE encapsulates the packet with the appropriate IP multicast NVE encapsulates the packet with the appropriate IP multicast
address in the tunnel encapsulation header for delivery to the address in the tunnel encapsulation header for delivery to the
desired set of NVEs. The protocol in the underlay could be any desired set of NVEs. The protocol in the underlay could be any
variant of Protocol Independent Multicast (PIM), or protocol variant of Protocol Independent Multicast (PIM), or protocol
dependent multicast, such as [ISIS-Multicast]. dependent multicast, such as [ISIS-Multicast].
If an NVE connects to its attached TSs via a Layer 2 network, there If an NVE connects to its attached TSs via a Layer 2 network, there
are multiple ways for NVEs to support application-specific are multiple ways for NVEs to support application-specific
multicast: multicast:
- The NVE supports only the basic IGMP/MLD snooping function, leaving - The NVE supports only the basic IGMP/MLD snooping function, leaving
the TS routers to handle application-specific multicast. This the TS routers to handle application-specific multicast. This
scheme doesn't utilize the underlay IP multicast protocols. scheme doesn't utilize the underlay IP multicast protocols.
- The NVE can act as a pseudo multicast router for the directly - The NVE can act as a pseudo multicast router for the directly
attached VMs and support proper mapping of IGMP/MLD messages to attached VMs and support proper mapping of IGMP/MLD messages to
the messages needed by the underlay IP multicast protocols. the messages needed by the underlay IP multicast protocols.
With this method, there are none of the issues with the methods With this method, there are none of the issues with the methods
described in Sections 3.2. described in Sections 3.2.
With PIM Sparse Mode (PIM-SM), the number of flows required would be With PIM Sparse Mode (PIM-SM), the number of flows required would be
(n*g), where n is the number of source NVEs that source packets for (n*g), where n is the number of source NVEs that source packets for
the group, and g is the number of groups. Bidirectional PIM (BIDIR- the group, and g is the number of groups. Bidirectional PIM (BIDIR-
PIM) would offer better scalability with the number of flows PIM) would offer better scalability with the number of flows
skipping to change at page 13, line 27 skipping to change at page 13, line 27
9. References 9. References
9.1. Normative References 9.1. Normative References
[RFC7365] Lasserre, M. et al., "Framework for data center (DC) [RFC7365] Lasserre, M. et al., "Framework for data center (DC)
network virtualization", October 2014. network virtualization", October 2014.
[RFC7364] Narten, T. et al., "Problem statement: Overlays for [RFC7364] Narten, T. et al., "Problem statement: Overlays for
network virtualization", October 2014. network virtualization", October 2014.
[NVO3-ARCH] Narten, T. et al.," An Architecture for Overlay Networks [NVO3-ARCH]
(NVO3)", work in progress, February 2014. Narten, T. et al.," An Architecture for Overlay Networks
(NVO3)", work in progress.
[RFC3376] Cain B. et al., "Internet Group Management Protocol, [RFC3376] Cain B. et al., "Internet Group Management Protocol,
Version 3", October 2002. Version 3", October 2002.
[RFC6513] Rosen, E. et al., "Multicast in MPLS/BGP IP VPNs", [RFC6513] Rosen, E. et al., "Multicast in MPLS/BGP IP VPNs",
February 2012. February 2012.
9.2. Informative References 9.2. Informative References
[RFC7348] Mahalingam, M. et al., " Virtual eXtensible Local Area [RFC7348] Mahalingam, M. et al., " Virtual eXtensible Local Area
Network (VXLAN): A Framework for Overlaying Virtualized Network (VXLAN): A Framework for Overlaying Virtualized
Layer 2 Networks over Layer 3 Networks", August 2014. Layer 2 Networks over Layer 3 Networks", August 2014.
[RFC7637] Garg P. and Wang, Y. (Eds.), "NVGRE: Network [RFC7637] Garg, P. and Wang, Y. (Eds.), "NVGRE: Network
Virtualization using Generic Routing Encapsulation", Virtualization using Generic Routing Encapsulation",
September 2015. September 2015.
[STT] Davie, B. and Gross, J., "A stateless transport tunneling [STT] Davie, B. and Gross, J., "A stateless transport tunneling
protocol for network virtualization," work in progress. protocol for network virtualization," work in progress.
[DC-MC] McBride M., and Lui, H., "Multicast in the data center [DC-MC] McBride, M. and Lui, H., "Multicast in the data center
overview," work in progress. overview," work in progress.
[ISIS-Multicast] [ISIS-Multicast]
L. Yong, et al., "ISIS Protocol Extension for Building Yong, L. et al., "ISIS Protocol Extension for Building
Distribution Trees", work in progress. Distribution Trees", work in progress.
[RFC4762] Lasserre, M., and Kompella, V. (Eds.), "Virtual Private [RFC4792] Lasserre, M., and Kompella, V. (Eds.), "Virtual Private
LAN Service (VPLS) using Label Distribution Protocol (LDP) LAN Service (VPLS) using Label Distribution Protocol (LDP)
signaling," January 2007. signaling," RFC 4762, January 2007.
[RFC7117] Aggarwal, R. et al., "Multicast in VPLS," February 2014. [RFC7117] Aggarwal, R. et al., "Multicast in VPLS," February 2014.
[LANE] "LAN emulation over ATM," The ATM Forum, af-lane-0021.000, [LANE] "LAN emulation over ATM," The ATM Forum, af-lane-0021.000,
January 1995. January 1995.
[EDGE-REP] [EDGE-REP]
Marques P. et al., "Edge multicast replication for BGP IP Marques P. et al., "Edge multicast replication for BGP IP
VPNs," work in progress. VPNs," work in progress.
[RFC3569] S. Bhattacharyya, Ed., "An Overview of Source-Specific [RFC 3569]
S. Bhattacharyya, Ed., "An Overview of Source-Specific
Multicast (SSM)", July 2003. Multicast (SSM)", July 2003.
[LISP-Signal-Free] [LISP-Signal-Free]
Moreno, V. and Farinacci, D., "Signal-Free LISP Moreno, V. and Farinacci, D., "Signal-Free LISP
Multicast", work in progress. Multicast", work in progress.
10. Acknowledgments 10. Acknowledgments
Thanks are due to Dino Farinacci, Erik Nordmark, Lucy Yong, and Thanks are due to Dino Farinacci, Erik Nordmark, Lucy Yong, Nicolas
Nicolas Bouliane, for their comments and suggestions. Bouliane, and Saumya Dikshit for their comments and suggestions.
This document was prepared using 2-Word-v2.0.template.dot. This document was prepared using 2-Word-v2.0.template.dot.
Authors' Addresses Authors' Addresses
Anoop Ghanwani Anoop Ghanwani
Dell Dell
Email: anoop@alumni.duke.edu Email: anoop@alumni.duke.edu
Linda Dunbar Linda Dunbar
Huawei Technologies Huawei Technologies
5340 Legacy Drive, Suite 1750 5340 Legacy Drive, Suite 1750
Plano, TX 75024, USA Plano, TX 75024, USA
Phone: (469) 277 5840 Phone: (469) 277 5840
Email: ldunbar@huawei.com Email: ldunbar@huawei.com
Mike McBride Mike McBride
Huawei Technologies Huawei Technologies
mmcbride7@gmail.com Email: mmcbride7@gmail.com
Vinay Bannai Vinay Bannai
Google Google
Email: vbannai@gmail.com Email: vbannai@gmail.com
Ram Krishnan Ram Krishnan
Dell Dell
Email: ramkri123@gmail.com Email: ramkri123@gmail.com
 End of changes. 25 change blocks. 
55 lines changed or deleted 59 lines changed or added
