BESS                                                         Weiguo Hao
                                                              Lucy Yong
                                                               S. Hares
Internet Draft                                                   Huawei
Intended status: Standards Track                       October 27, 2014
Expires: April 2015



         Inter-AS Option C between NVO3 and BGP/MPLS IP VPN network
               draft-hao-bess-inter-nvo3-vpn-optionc-00.txt


Abstract

   This draft describes an Inter-AS Option-C solution between an NVO3
   network and an MPLS/IP VPN network. BGP labeled route information is
   extended to create a multi-hop forwarding path between the local NVE
   and the remote PE. To ensure that VPNv4 routes are exchanged
   correctly between the local NVE and the remote PE, the VN ID space
   should be partitioned: only the lowest 1M (2^20) VN IDs can be used
   for interconnection with the outer MPLS VPN network using the
   Option-C solution, and the remaining 15M VN IDs can be used only
   within the DC.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with
   the provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on April 27, 2015.





Hao & et,al            Expires March 27, 2015                 [Page 1]


Internet-Draft            Inter-As Option-B             September 2014


Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.

Table of Contents

   1. Introduction
   2. Conventions used in this document
   3. Reference model
   4. Traditional Option-C [RFC4364] Recap
   5. Inter-As Option-C Solution
      5.1. Multi-hop EBGP Connection
      5.2. VPN routes exchange
      5.3. Data forwarding process
         5.3.1. Data flow from TS1 to CE1
         5.3.2. Data flow from CE1 to TS1
   6. BGP Label Routing Extension
   7. NVE-NVA architecture
      7.1. Multi-Hop EBGP connection
      DC to WAN direction:
      WAN to DC direction:
      7.2. VPN route exchange
   8. Security Considerations
   9. IANA Considerations
   10. References
      10.1. Normative References
      10.2. Informative References
   11. Acknowledgments

1. Introduction

   In the cloud computing era, multi-tenancy has become a core
   requirement for data centers. Because NVO3 can satisfy the key
   multi-tenancy requirements, it is being deployed in an increasing
   number of cloud data center networks. NVO3 focuses on the
   construction of overlay networks that operate over an IP (L3)
   underlay transport network. It can provide layer 2 bridging and
   layer 3 IP service for each tenant. VXLAN and NVGRE are two typical
   NVO3 technologies. An NVO3 overlay network can be controlled through
   a centralized NVE-NVA architecture or through a distributed BGP VPN
   protocol.

   NVO3 scales well, from relatively small networks to networks with
   several million tenant systems (TSs) and hundreds of thousands of
   virtual networks within a single administrative domain. In an NVO3
   network, a 24-bit VN ID identifies each virtual network, so in
   theory 16M virtual networks can be supported in a data center. In a
   data center network, each tenant may include one or more layer 2
   virtual networks, and normally each tenant corresponds to one
   routing domain (RD). Normally each layer 2 virtual network
   corresponds to one or more subnets.

   To provide cloud services to clients outside the data center, data
   center networks should be connected to WAN networks. BGP MPLS/IP VPN
   is already widely deployed in WAN networks. Normally the internal
   data center network and the external MPLS/IP VPN network are in
   different Autonomous Systems (ASes). This requires inter-AS
   connections to be set up at the Autonomous System Border Routers
   (ASBRs) between the NVO3 network and the external MPLS/IP VPN
   network.

   Similar to the inter-AS connection methods defined in [RFC4364],
   there are three ways of handling this case: Option-A, Option-B, and
   Option-C, in order of increasing scalability.

   Option-A is a back-to-back VRF solution. With Option-A, one EBGP
   session per VPN is created between the peering ASBRs. In the data
   plane, VLANs are used to separate tenant traffic. It has the lowest
   scalability of the three solutions. Option-B is more scalable than
   Option-A, but with Option-B the ASBRs need to maintain and
   distribute all VPN prefixes, and in the data plane they need to
   perform MPLS VPN Label switching. Because the MPLS VPN Label
   switching table space on ASBRs is limited, Option-B still has a
   scalability limitation for large VPN networks. Option-C is the most
   scalable option: because the exchange of VPNv4 prefixes is separated
   from the exchange of PE prefixes, the ASBRs don't need to maintain
   or distribute the customers' VPN prefixes. The ASBRs are used only
   to exchange the service provider (SP) internal IP addresses.

   This draft proposes an Inter-AS Option-C solution between an NVO3
   network and an external BGP MPLS/IP VPN network. Unlike the
   traditional Option-C solution defined in [RFC4364], it interconnects
   heterogeneous networks, so the control plane and data plane
   procedures in the NVO3 network need to be newly specified.



2. Conventions used in this document

   Network Virtualization Edge (NVE) - An NVE is the network entity that
   sits at the edge of an underlay network and implements network
   virtualization functions.

   Tenant System - A physical or virtual system that can play the role
   of a host, or a forwarding element such as a router, switch,
   firewall, etc. It belongs to a single tenant and connects to one or
   more VNs of that tenant.

   VN - A VN is a logical abstraction of a physical network that
   provides L2 network services to a set of Tenant Systems.

   RD - Route Distinguisher. RDs are used to maintain uniqueness among
   identical routes in different VRFs. The route distinguisher is an
   8-octet field prefixed to the customer's IP address. The resulting
   12-octet field is a unique "VPN-IPv4" address.

   RT - Route Target. RTs are used to control the import and export of
   routes between different VRFs.

3. Reference model

   +---------------------------------------------------+
   |  +----+           AS1                             |
   |  | TS1| -                                         |
   |  +----+  -                                        |
   |            - +----+    +----+                     |
   |            - |NVE1| -- |TOR1|---------------+     |
   |  +----+  -   +----+    +----+               |     |
   |  | TS2|-                                    |     |
   |  +----+                                     |     |
   |                                         +-------+ |
   |                           +------------ | ASBR-d| |
   |  +----+                   |             +-------+ |
   |  | TS3| -                 |                 |     |
   |  +----+  -                |                 |     |
   |            - +----+    +----+               |     |
   |            - |NVE2| -- |TOR2|               |     |
   |  +----+  -   +----+    +----+               |     |
   |  | TS4|-                                    |     |
   |  +----+                                     |     |
   |---------------------------------------------------|
   |                                             |     |
   |  +----+                                     |     |
   |  | CE1| -                                   |     |
   |  +----+  -                                  |     |
   |            - +----+                     +-------+ |
   |            - | PE1| --------------------| ASBR-w| |
   |   +----+  -  +----+                     +-------+ |
   |   | CE2|-                                         |
   |   +----+          AS2                             |
   +---------------------------------------------------+
                         Figure 1 Reference model

   Figure 1 shows an arbitrary multi-AS VPN interconnectivity scenario
   between an NVO3 network and a BGP MPLS/IP VPN network. NVE1, NVE2,
   and ASBR-d form the NVO3 overlay network inside the DC. TS1 and TS2
   connect to NVE1; TS3 and TS4 connect to NVE2. PE1 and ASBR-w form
   the MPLS/IP VPN network outside the DC. CE1 and CE2 connect to PE1.
   The NVO3 network is in AS 1, and the MPLS/IP VPN network is in AS 2.

   There are two tenants in the NVO3 network. TSs in tenant 1 can
   freely communicate with CEs in VPN-Red, and TSs in tenant 2 can
   freely communicate with CEs in VPN-Green. TS1 and TS3 belong to
   tenant 1; TS2 and TS4 belong to tenant 2. CE1 belongs to VPN-Red,
   and CE2 belongs to VPN-Green. VN ID 10 and VN ID 20 identify tenant
   1 and tenant 2 respectively. PE1 assigns MPLS VPN Labels 1000 and
   2000 to the routes from CE1 and CE2 respectively.

4. Traditional Option-C [RFC4364] Recap

   With the traditional Option-C defined in [RFC4364], PE routers in
   different ASes first establish multi-hop EBGP connections to each
   other, and then exchange VPN-IPv4 routes over those connections.
   EBGP is used to distribute labeled IPv4 /32 routes to create a
   label switched path from the ingress PE router to the egress PE
   router. In this procedure, VPN-IPv4 routes are neither maintained
   nor distributed by the ASBRs. An ASBR only needs to maintain labeled
   IPv4 /32 routes to the PE routers within its AS. If the /32 routes
   for the PE routers are NOT made known to the P routers (other than
   the ASBRs), then a packet's ingress PE needs to put a three-label
   stack on it. The bottom label is assigned by the egress PE and
   corresponds to the packet's destination address in a particular
   VRF. The middle label is assigned by the ASBR and corresponds to
   the /32 route to the egress PE. The top label is assigned by the
   ingress PE's IGP Next Hop and corresponds to the /32 route to the
   ASBR.

5. Inter-As Option-C Solution

   Each NVE operates as the default layer 3 gateway for its locally
   attached TSs. VRFs are created on each NVE to isolate IP forwarding
   between different tenants. At least one L3 VN ID is used to identify
   each tenant.

   Similar to the traditional Option-C defined in [RFC4364], multi-hop
   EBGP connections are first established between NVEs and PEs, and
   VPN-IPv4 routes are then exchanged over those connections. EBGP is
   used to distribute labeled IPv4 /32 routes to create a forwarding
   path from NVE to PE. Unlike the BGP label switched path of
   traditional Option-C, this forwarding path has two segments: the
   segment from the NVE to ASBR-d in the NVO3 network is an NVO3
   tunnel, and the segment from ASBR-d to the PE in the WAN network is
   a traditional BGP LSP. The two segments are stitched together at
   ASBR-d. The behavior of ASBR-w and the PEs in the MPLS VPN network
   is the same as that of the ASBR and PEs in a traditional
   [RFC4364]-based MPLS VPN Option-C network.

5.1. Multi-hop EBGP Connection

   In the WAN to DC direction, when ASBR-d receives labeled IPv4 /32
   routes from ASBR-w, it changes the BGP Next Hop to itself, allocates
   a new IP address per Label as the NVO3 tunnel destination IP, and
   then advertises the labeled route to all local NVEs using the BGP
   extension defined in Section 6. The newly allocated IP address is
   called the NVO3 tunnel IP, and it identifies a remote PE. The NVO3
   tunnel IP address pool should be configured beforehand on ASBR-d.
   The mapping between each newly allocated NVO3 tunnel IP and the
   corresponding MPLS Label forms the outgoing forwarding table (the
   outgoing stitching table), which is used to stitch the NVO3 tunnel
   to the BGP LSP for traffic from inside the DC to outside the DC.

   In the DC to WAN direction, ASBR-d announces a labeled IPv4 /32
   route to ASBR-w for each NVE, with one MPLS Label assigned per NVE.
   The mapping between each allocated MPLS Label and the corresponding
   NVE IP forms the incoming forwarding table (the incoming stitching
   table), which is used to stitch the BGP LSP to the NVO3 tunnel for
   traffic from outside the DC to inside the DC.
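
   The two tables kept on ASBR-d can be sketched in Python as follows.
   This is only an illustrative model, not part of the specification:
   the class name StitchingTables, the tunnel IP pool 192.0.2.0/29, the
   starting label value 100, and all addresses and label values in the
   usage lines are assumptions chosen for the example.

```python
import ipaddress
from itertools import count


class StitchingTables:
    """Toy model of the two tables kept on ASBR-d (hypothetical names)."""

    def __init__(self, tunnel_pool, first_label=100):
        # Pool of NVO3 tunnel IPs configured beforehand on ASBR-d.
        self._free_ips = iter(ipaddress.ip_network(tunnel_pool).hosts())
        # Source of locally allocated MPLS labels (start value is arbitrary).
        self._labels = count(first_label)
        self.outgoing = {}  # NVO3 tunnel IP -> MPLS label received from ASBR-w
        self.incoming = {}  # locally allocated MPLS label -> NVE IP

    def learn_from_asbr_w(self, label):
        """WAN to DC: allocate one NVO3 tunnel IP per received label."""
        for tunnel_ip, known in self.outgoing.items():
            if known == label:
                return tunnel_ip  # already allocated for this label
        try:
            tunnel_ip = str(next(self._free_ips))
        except StopIteration:
            raise RuntimeError("NVO3 tunnel IP pool exhausted") from None
        self.outgoing[tunnel_ip] = label
        return tunnel_ip

    def announce_nve(self, nve_ip):
        """DC to WAN: allocate one MPLS label per NVE."""
        for label, known in self.incoming.items():
            if known == nve_ip:
                return label  # already allocated for this NVE
        label = next(self._labels)
        self.incoming[label] = nve_ip
        return label


tables = StitchingTables("192.0.2.0/29")
tunnel_ip = tables.learn_from_asbr_w(3001)   # label learned for PE1's /32
nve_label = tables.announce_nve("10.1.1.1")  # NVE1's address (assumed)
print(tunnel_ip, tables.outgoing[tunnel_ip])
print(nve_label, tables.incoming[nve_label])
```

   Keying the outgoing table by tunnel IP and the incoming table by
   label mirrors the lookups performed in the data plane: ASBR-d sees
   the tunnel destination IP on packets from NVEs and the BGP label on
   packets from ASBR-w.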

5.2. VPN routes exchange

   VPN-IPv4 routes can then be exchanged between the NVE and the
   remote PE as specified in [RFC4364]. A Route Distinguisher (RD) and
   RTs are specified for each VRF on each NVE and PE.

   Each NVE advertises all local VPN routes to the remote PEs, using
   the VN ID that identifies the tenant as the MPLS VPN Label. The
   remote PEs treat the NVE as a regular PE: they match the RT and
   populate these VPN routes into the local VRF. For traffic from a
   remote CE to a local TS, the ingress PE uses the VN ID as the bottom
   label in the MPLS encapsulation. The VN ID field is 24 bits long,
   but an MPLS Label is only 20 bits long. To ensure that NVEs and PEs
   interwork, a VN ID used as an MPLS VPN Label must therefore fit in
   20 bits, i.e., its value must be smaller than 2^20 (1M). In the NVO3
   network, the VN ID space should be partitioned: only the lowest 1M
   VN IDs can be used for interconnection with the outer MPLS VPN
   network, and the remaining 15M VN IDs can be used only within the
   DC.

   Each MPLS VPN PE also advertises all local VPN routes to the NVEs.
   The NVEs match the RT and populate these VPN routes into the local
   VRF. For traffic from a local TS to a remote CE, the ingress NVE
   doesn't support MPLS encapsulation, so it encodes the MPLS VPN
   Label advertised by the remote PE as the VN ID in the NVO3
   encapsulation.
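
   The VN ID partitioning and the two label/VN ID mappings described in
   this section can be illustrated with the following Python sketch.
   The function and constant names are hypothetical; the only facts
   taken from the text are the 24-bit VN ID field and the 20-bit limit
   for values that are used as MPLS VPN Labels.

```python
MPLS_LABEL_BITS = 20   # width of an MPLS label value
VN_ID_BITS = 24        # width of the VN ID field

# VN IDs below this bound can double as MPLS VPN Labels.
INTERCONNECT_VN_IDS = 1 << MPLS_LABEL_BITS
# The remaining VN IDs can be used only inside the DC.
INTRA_DC_ONLY_VN_IDS = (1 << VN_ID_BITS) - INTERCONNECT_VN_IDS


def vn_id_to_vpn_label(vn_id):
    """NVE to PE: advertise the tenant's VN ID as the MPLS VPN Label."""
    if not 0 <= vn_id < (1 << VN_ID_BITS):
        raise ValueError("not a 24-bit VN ID")
    if vn_id >= INTERCONNECT_VN_IDS:
        raise ValueError("VN ID is usable only inside the DC")
    return vn_id


def vpn_label_to_vn_id(label):
    """PE to NVE: carry the PE's MPLS VPN Label in the VN ID field."""
    if not 0 <= label < (1 << MPLS_LABEL_BITS):
        raise ValueError("not a 20-bit MPLS label")
    return label


print(vn_id_to_vpn_label(10))     # tenant 1 in Figure 1
print(vpn_label_to_vn_id(1000))   # label PE1 assigned to CE1's routes
```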

5.3. Data forwarding process

   This section describes the step-by-step data forwarding procedures
   between TS1 and CE1 in Figure 1.

5.3.1. Data flow from TS1 to CE1

   1. TS1 sends a packet to NVE1; the destination IP is CE1's IP.

   2. NVE1 determines the local VRF from the packet's input interface,
      looks up the routing table of that VRF (tenant 1), performs NVO3
      encapsulation, and sends the encapsulated packet to ASBR-d. The
      MPLS VPN Label associated with the packet's destination address
      is encoded in the VN ID field. The NVO3 tunnel destination IP is
      the IP address that ASBR-d allocated for the /32 route of the PE
      router to which the remote CE is attached.

   3. ASBR-d removes the NVO3 encapsulation and then performs MPLS
      encapsulation. Two Labels are pushed: the BGP LSP Label as the
      top Label and the MPLS VPN Label as the bottom Label. The BGP LSP
      Label is obtained by looking up the outgoing stitching table, and
      the MPLS VPN Label is copied from the VN ID.

   4. ASBR-w swaps the BGP LSP Label, pushes an IGP Label, and sends
      the packet to PE1. The MPLS VPN Label remains unchanged.

   5. PE1 pops all MPLS Labels, finds the local VRF from the bottom
      MPLS VPN Label, looks up the local IP forwarding table in that
      VRF, and sends the packet to CE1.
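
   Steps 2 to 5 above can be traced with the following Python sketch.
   It is a simplified model under stated assumptions: the class and
   function names are hypothetical, an exact-match dictionary stands in
   for the VRF route lookup, and all addresses and label values other
   than MPLS VPN Label 1000 (taken from Section 3) are invented for
   illustration.

```python
from dataclasses import dataclass


@dataclass
class Nvo3Packet:
    tunnel_dst: str   # NVO3 tunnel destination IP
    vn_id: int
    payload: str


@dataclass
class MplsPacket:
    labels: list      # label stack, top label first
    payload: str


def nve_encapsulate(vrf_routes, dst_ip, payload):
    """Step 2: VRF lookup, then NVO3 encapsulation toward ASBR-d."""
    tunnel_ip, vpn_label = vrf_routes[dst_ip]
    # The MPLS VPN Label is carried in the VN ID field.
    return Nvo3Packet(tunnel_dst=tunnel_ip, vn_id=vpn_label, payload=payload)


def asbr_d_stitch(outgoing_table, pkt):
    """Step 3: replace the NVO3 encapsulation with a two-label stack."""
    bgp_lsp_label = outgoing_table[pkt.tunnel_dst]
    return MplsPacket(labels=[bgp_lsp_label, pkt.vn_id], payload=pkt.payload)


def asbr_w_forward(pkt, swapped_label, igp_label):
    """Step 4: swap the BGP LSP label and push an IGP label."""
    return MplsPacket(labels=[igp_label, swapped_label] + pkt.labels[1:],
                      payload=pkt.payload)


def pe_deliver(vrf_by_label, pkt):
    """Step 5: pop all labels; the bottom label selects the VRF."""
    return vrf_by_label[pkt.labels[-1]], pkt.payload


# Tenant 1 VRF on NVE1: destination -> (NVO3 tunnel IP, MPLS VPN Label)
vrf_routes = {"172.16.1.1": ("192.0.2.1", 1000)}
# Outgoing stitching table on ASBR-d: NVO3 tunnel IP -> BGP LSP label
outgoing_table = {"192.0.2.1": 3001}

nvo3 = nve_encapsulate(vrf_routes, "172.16.1.1", "data")
at_asbr_d = asbr_d_stitch(outgoing_table, nvo3)
at_asbr_w = asbr_w_forward(at_asbr_d, swapped_label=4001, igp_label=5001)
vrf, payload = pe_deliver({1000: "VPN-Red", 2000: "VPN-Green"}, at_asbr_w)
print(at_asbr_d.labels, at_asbr_w.labels, vrf)
```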

5.3.2. Data flow from CE1 to TS1

   1. CE1 sends a packet to PE1; the destination IP is TS1's IP.

   2. PE1 determines the local VRF from the packet's input interface
      and looks up the routing table of that VRF. It puts a three-label
      stack on the packet. The bottom label is the VN ID of the tenant
      to which TS1 belongs, i.e., VN ID 10. The middle label is
      assigned by ASBR-w and is associated with the /32 route for the
      egress NVE1. The top label is assigned by the ingress PE's IGP
      Next Hop and corresponds to the /32 route to ASBR-w.

   3. ASBR-w pops the top IGP Label, swaps the middle BGP Label, and
      sends the packet to ASBR-d.

   4. ASBR-d removes the MPLS encapsulation, performs NVO3
      encapsulation, and sends the packet to the egress NVE1. The
      egress NVE's IP address is obtained by looking up the incoming
      stitching table, and the VN ID is copied from the bottom MPLS
      Label, i.e., the MPLS VPN Label.

   5. NVE1 removes the NVO3 encapsulation, finds the local VRF from the
      VN ID, looks up the routing table, and sends the packet to TS1.
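
   The reverse direction can be traced in the same way. In the Python
   sketch below, the function names are hypothetical and every label
   value and address except VN ID 10 (taken from Section 3) is an
   assumption made for illustration. Label stacks are plain lists with
   the top label first.

```python
def pe_push(vn_id, bgp_label, igp_label):
    """Step 2: three-label stack built by PE1, top label first."""
    return [igp_label, bgp_label, vn_id]


def asbr_w_forward(stack, swapped_label):
    """Step 3: pop the top IGP label and swap the middle BGP label."""
    return [swapped_label] + stack[2:]


def asbr_d_stitch(incoming_table, stack):
    """Step 4: MPLS decapsulation, then NVO3 encapsulation to the NVE."""
    bgp_label, vpn_label = stack
    # The VN ID is copied from the bottom label (the MPLS VPN Label).
    return {"tunnel_dst": incoming_table[bgp_label], "vn_id": vpn_label}


def nve_deliver(vrf_by_vn_id, nvo3_packet):
    """Step 5: the VN ID selects the local VRF on the egress NVE."""
    return vrf_by_vn_id[nvo3_packet["vn_id"]]


# Incoming stitching table on ASBR-d: allocated MPLS label -> NVE IP
incoming_table = {100: "10.1.1.1"}

stack_at_pe1 = pe_push(vn_id=10, bgp_label=4100, igp_label=5100)
stack_at_asbr_d = asbr_w_forward(stack_at_pe1, swapped_label=100)
nvo3_packet = asbr_d_stitch(incoming_table, stack_at_asbr_d)
vrf = nve_deliver({10: "tenant 1", 20: "tenant 2"}, nvo3_packet)
print(stack_at_pe1, stack_at_asbr_d, nvo3_packet, vrf)
```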




6. BGP Label Routing Extension

   In [RFC3107], BGP is used to carry label mapping information for a
   particular route. In this draft, the multi-hop EBGP connection needs
   to cross both the NVO3 network and the MPLS/IP VPN network. In the
   NVO3 network, the BGP label mapping information should be extended
   to convey the NVO3 Tunnel IP address for a particular route.

   The Label mapping information is carried as part of the Network
   Layer Reachability Information (NLRI) in the Multiprotocol
   Extensions attributes. The AFI indicates, as usual, the address
   family of the associated route. A new SAFI (TBD) is proposed to
   indicate that the NLRI contains an NVO3 tunnel IP address.

   The Network Layer Reachability information is encoded as one or more
   triples of the form <length, NVO3 Tunnel IP, prefix>, whose fields
   are described below:

         +---------------------------+
         |  Length (1 octet)         |
         +---------------------------+
         |  NVO3 Tunnel IP (4 octets)|
         +---------------------------+
         |  Prefix (variable)        |
         +---------------------------+


   The use and the meaning of these fields are as follows:

         a) Length:

            The Length field indicates the length in bits of the
            address prefix plus the NVO3 Tunnel IP.

         b) NVO3 Tunnel IP:

            The NVO3 Tunnel IP is encoded as 4 octets in the IPv4 case.

         c) Prefix:

            The Prefix field contains address prefixes followed by
            enough trailing bits to make the end of the field fall on
            an octet boundary.  Note that the value of trailing bits is
            irrelevant.

   The NVO3 Tunnel IP must be assigned by the ASBR-d located in the
   NVO3 network for each MPLS Label that, as defined in [RFC3107], is
   specified for a particular route (and associated with its address
   prefix). The MPLS Label is carried in the BGP Label mapping
   information received from the peer ASBR-w.

   A BGP speaker can withdraw a previously advertised route (as well as
   the binding between this route and an NVO3 tunnel IP) by either (a)
   advertising a new route (and a label) with the same NLRI as the
   previously advertised route, or (b) listing the NLRI of the
   previously advertised route in the Withdrawn Routes field of an
   Update message.

   The NVO3 tunnel IP information carried (as part of the NLRI) in the
   Withdrawn Routes field should be set to 0xFFFFFFFF.  (Of course,
   terminating the BGP session also withdraws all the previously
   advertised routes.)
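
   The NLRI triple <length, NVO3 Tunnel IP, prefix> described in this
   section can be encoded and decoded as in the following Python
   sketch. It covers the IPv4 case only and handles a single triple;
   the function names and the example addresses are assumptions, and
   the SAFI value is not modeled because it is still TBD.

```python
import ipaddress

WITHDRAWN_TUNNEL_IP = "255.255.255.255"   # 0xFFFFFFFF in Withdrawn Routes


def encode_nlri(tunnel_ip, prefix):
    """Encode one <length, NVO3 Tunnel IP, prefix> triple (IPv4 only)."""
    net = ipaddress.ip_network(prefix)
    prefix_octets = (net.prefixlen + 7) // 8   # pad to an octet boundary
    length_bits = 32 + net.prefixlen           # tunnel IP bits + prefix bits
    return (bytes([length_bits])
            + ipaddress.IPv4Address(tunnel_ip).packed
            + net.network_address.packed[:prefix_octets])


def decode_nlri(data):
    """Decode one triple; returns (tunnel IP, prefix) as strings."""
    length_bits = data[0]
    tunnel_ip = ipaddress.IPv4Address(data[1:5])
    prefix_len = length_bits - 32
    prefix_octets = (prefix_len + 7) // 8
    raw = data[5:5 + prefix_octets].ljust(4, b"\x00")
    # strict=False ignores trailing pad bits, whose value is irrelevant.
    net = ipaddress.ip_network(
        "{}/{}".format(ipaddress.IPv4Address(raw), prefix_len), strict=False)
    return str(tunnel_ip), str(net)


nlri = encode_nlri("192.0.2.1", "10.2.2.1/32")
print(nlri.hex())
print(decode_nlri(nlri))
```

   For a withdrawal, the same encoder can be called with
   WITHDRAWN_TUNNEL_IP in place of the allocated tunnel IP.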

7. NVE-NVA architecture

   In this architecture, no distributed BGP protocol runs on the NVEs
   or on ASBR-d in the NVO3 network; the NVEs and ASBR-d are controlled
   by a centralized NVA. The NVA first runs EBGP with the peer ASBR-w
   to establish the multi-hop EBGP connection, and then exchanges
   VPN-IPv4 routes with the remote PEs.

   The NVA maintains tenant information collected from all tenants.
   This information includes the VN ID that identifies each tenant and
   the corresponding RD and RT. It can be statically configured by
   operators or dynamically provided by cloud management systems.

   The NVA also maintains, for each tenant, the MAC/IP addresses of all
   TSs and the NVE to which each TS is attached.

7.1. Multi-Hop EBGP connection

   DC to WAN direction:

   1. The NVA allocates one BGP MPLS Label per NVE.

   2. The NVA advertises the BGP Label routing information to the peer
      ASBR-w.

   3. The NVA downloads the incoming stitching table <newly allocated
      BGP MPLS Label, NVE IP> to ASBR-d.

   WAN to DC direction:

   1. The NVA receives BGP Label routing information from the peer
      ASBR-w.

   2. The NVA allocates one NVO3 tunnel IP for each Label received from
      ASBR-w.

   3. The NVA downloads the outgoing stitching table <newly allocated
      NVO3 tunnel IP, received MPLS Label> to ASBR-d.

7.2. VPN route exchange

   The NVA advertises all tenant routing information of the internal
   data center to the remote PEs as specified in [RFC4364]. Each route
   includes the RD, RT, IP prefix, and MPLS VPN Label; the VN ID that
   identifies the tenant is used as the MPLS VPN Label.

   Each remote MPLS VPN PE also advertises its local VPN routes to the
   NVA. The NVA finds the NVO3 Tunnel IP allocated for that PE
   (Section 7.1), matches the RT attribute, and populates the VPN
   routes into the local VRF.

   The NVA then downloads the corresponding VPN forwarding table,
   consisting of <destination IP prefix/Mask, NVO3 Tunnel IP, VN ID>
   entries, to each NVE.

                               VPN route exchange
                    -------------------------------------------
                    |                                         |
                   ------     EBGP      --------            -----
                   |NVA | ------------- |ASBR-w |------------|PE |
                   ------               --------            -----
                     .
                     . Southbound interface(Openflow,OVSDB,etc)
      ........................
      .          .           .
      .          .           .
      .          .           .
   ------     ------       -------
   |NVE1|     |NVE2|       |ASBR-d|
   ------     ------       -------
                       Figure 2 NVE-NVA Architecture
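
   The way the NVA could derive the per-tenant VPN forwarding table of
   Section 7.2 from the routes received from remote PEs is sketched
   below in Python. The data layout, the function name, the RT values,
   and the addresses are all assumptions made for illustration; the
   southbound download to the NVEs is not modeled. Following Section
   5.2, the MPLS VPN Label advertised by the PE is placed in the VN ID
   position of each entry.

```python
def build_vpn_forwarding_tables(vpn_routes, tenants, tunnel_ip_by_pe):
    """Return {tenant: [(prefix, NVO3 tunnel IP, VN ID), ...]}."""
    tables = {name: [] for name in tenants}
    for route in vpn_routes:
        # NVO3 tunnel IP that was allocated for the advertising PE.
        tunnel_ip = tunnel_ip_by_pe[route["pe"]]
        for name, tenant in tenants.items():
            # Import the route if any RT matches the tenant's import RTs.
            if tenant["import_rts"] & route["rts"]:
                tables[name].append(
                    (route["prefix"], tunnel_ip, route["label"]))
    return tables


tenants = {
    "tenant1": {"vn_id": 10, "import_rts": {"65002:1"}},
    "tenant2": {"vn_id": 20, "import_rts": {"65002:2"}},
}
vpn_routes = [
    {"pe": "PE1", "prefix": "172.16.1.0/24", "rts": {"65002:1"},
     "label": 1000},
    {"pe": "PE1", "prefix": "172.16.2.0/24", "rts": {"65002:2"},
     "label": 2000},
]
tunnel_ip_by_pe = {"PE1": "192.0.2.1"}

tables = build_vpn_forwarding_tables(vpn_routes, tenants, tunnel_ip_by_pe)
print(tables)
```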

8. Security Considerations

   With Option-C, the internal IP addresses (loopback IPs of PEs and
   NVEs) of one network are advertised to and visible in the other
   network, which is a security risk. Most operators want to prevent
   any external visibility of and access to the IP addresses of their
   internal devices. Option-C is therefore suggested for deployment
   within a single SP or enterprise that operates both the MPLS and the
   NVO3 networks.



9. IANA Considerations

   A new SAFI (TBD) is proposed to indicate that the NLRI contains an
   NVO3 tunnel IP.

10. References

10.1. Normative References

   [1]  [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
        Requirement Levels", BCP 14, RFC 2119, March 1997.

   [2]  [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual
        Private Networks (VPNs)", RFC 4364, February 2006.

   [3]  [RFC3107] Rekhter, Y. and E. Rosen, "Carrying Label Information
        in BGP-4", RFC 3107, May 2001.

10.2. Informative References

   [1]  [NVA] Black, D., et al., "An Architecture for Overlay Networks
        (NVO3)", draft-ietf-nvo3-arch-01, February 14, 2014.

   [2]  [RFC7047] Pfaff, B. and B. Davie, "The Open vSwitch Database
        Management Protocol", RFC 7047, December 2013.

   [3]  [OpenFlow1.3] OpenFlow Switch Specification Version 1.3.0 (Wire
        Protocol 0x04), June 25, 2012.
        (https://www.opennetworking.org/images/stories/downloads/sdn-
        resources/onf-specifications/openflow/openflow-spec-v1.3.0.pdf)

11. Acknowledgments

   The authors would like to thank Shunwan Zhuang for his valuable
   input.

Authors' Addresses

   Weiguo Hao
   Huawei Technologies
   101 Software Avenue,
   Nanjing 210012
   China
   Phone: +86-25-56623144
   Email: haoweiguo@huawei.com




   Lucy Yong
   Huawei Technologies
   Phone: +1-918-808-1918
   Email: lucy.yong@huawei.com


   Susan Hares
   Huawei Technologies
   Phone: +1-734-604-0323
   Email: shares@ndzh.com.