[Docs] [txt|pdf] [Tracker] [Email] [Nits] [IPR]

Versions: 00 01 02 03 04

Network Working Group                                          H. Grover
Internet-Draft                                                    D. Rao
Intended status: Standards Track                            D. Farinacci
Expires: October 15, 2010                                  Cisco Systems
                                                          April 13, 2010


                    Overlay Transport Virtualization
                          draft-hasmit-otv-00

Abstract

   In today's networking environment most enterprise networks span
   multiple physical sites.  Overlay Transport Virtualization (OTV)
   provides a scalable solution for L2/L3 connectivity across different
   sites using the currently deployed service provider and enterprise
   networks.  It is a very cost-effective and simple solution requiring
   deployment of a one or more OTV functional device at each of the
   enterprise sites.  This solution is agnostic to the technology used
   in the service provider network and connectivity between the
   enterprise and the service provider network.  This document provides
   an overview of this technology.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on October 15, 2010.

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of



Grover, et al.          Expires October 15, 2010                [Page 1]

Internet-Draft                     OTV                        April 2010


   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.


Table of Contents

   1.  Overview . . . . . . . . . . . . . . . . . . . . . . . . . . .  3
     1.1.  Terminology  . . . . . . . . . . . . . . . . . . . . . . .  6
   2.  Overlay Control Plane  . . . . . . . . . . . . . . . . . . . .  8
     2.1.  Provider Control Plane . . . . . . . . . . . . . . . . . .  9
     2.2.  Overlay Control Plane  . . . . . . . . . . . . . . . . . .  9
     2.3.  Advertising unicast and multicast information  . . . . . . 10
     2.4.  Selecting next-hops for a MAC address entry  . . . . . . . 10
     2.5.  Edge Device connectivity . . . . . . . . . . . . . . . . . 11
       2.5.1.  Edge Devices are IP hosts and MAC routers  . . . . . . 11
       2.5.2.  Internal Interface Behavior  . . . . . . . . . . . . . 11
       2.5.3.  Overlay Interface Behavior . . . . . . . . . . . . . . 12
   3.  Forwarding Process and Rules . . . . . . . . . . . . . . . . . 12
     3.1.  Forwarding Process . . . . . . . . . . . . . . . . . . . . 12
       3.1.1.  Forwarding between Internal Links  . . . . . . . . . . 12
       3.1.2.  Forwarding from an Internal Link to an Overlay Link  . 12
       3.1.3.  Forwarding from an Overlay Interface to an
               Internal Interface . . . . . . . . . . . . . . . . . . 13
       3.1.4.  Overlay Forwarding and Native Forwarding
               Concurrently . . . . . . . . . . . . . . . . . . . . . 13
     3.2.  STP BPDU Handling  . . . . . . . . . . . . . . . . . . . . 13
   4.  Adjacency Server information . . . . . . . . . . . . . . . . . 14
   5.  Authoritative Edge Device Election . . . . . . . . . . . . . . 14
   6.  Site Identifier  . . . . . . . . . . . . . . . . . . . . . . . 14
   7.  Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 14
   8.  Security Considerations  . . . . . . . . . . . . . . . . . . . 15
   9.  IS-IS Code point Considerations  . . . . . . . . . . . . . . . 15
   10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 15
     10.1. Normative References . . . . . . . . . . . . . . . . . . . 15
     10.2. Informative References . . . . . . . . . . . . . . . . . . 15
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 16











Grover, et al.          Expires October 15, 2010                [Page 2]

Internet-Draft                     OTV                        April 2010


1.  Overview

   OTV is a new "MAC in IP" way of supporting L2 VPNs over an L2/L3
   infrastructure.  OTV provides an "over-the-top" method of doing
   virtualization versus traditional "in-the-network" style mechanisms
   where multiple routing and forwarding tables are maintained in every
   device from a source to a destination.  With "over-the-top" methods
   state is maintained at the network edges, but not at the site or in
   the core.

   OTV can be incrementally deployed and reside in a small number of
   devices at the edge between sites and the core.  We call these
   devices "Edge Devices" which perform typical layer-2 learning and
   forwarding functions on their site facing interfaces (internal
   interfaces) and perform IP-based virtualization functions on their
   core facing interfaces (for which an overlay network is realized).

   Traditional L2VPN technologies rely heavily on tunnels.  Rather than
   creating stateful tunnels, OTV encapsulates layer 2 traffic with an
   IP header ("MAC in IP"), but does not create any fixed tunnels.
   Based on the IP header, traffic is forwarded natively in the core
   over which OTV is being deployed.  This is an important feature as
   the native IP treatment of the encapsulated packet allows optimal
   multi-point connectivity as well as optimal broadcast and multicast
   forwarding, plus any other benefits the routed core may provide to
   native IP traffic.

   Layer 2 traffic, which requires traversing the overlay to reach its
   destination, is pre-pended with an IP header which ensures the packet
   is delivered to the edge boxes that provide connectivity to the Layer
   2 destination in the original MAC header.  As shown in figure 1, if a
   destination is reachable via Edge Device X2 (with a core facing IP
   address of IPB), other edge devices forwarding traffic to such
   destination will add an IP header with a destination IP address of
   IPB and forward the traffic into the core.  The core will forward
   traffic based on IP address IPB, once the traffic makes it to Edge
   Device X2 it will be stripped of the overlay IP header and it will be
   forwarded into the site in the same way a regular bridge would
   forward a packet at layer 2.  Broadcast or multicast traffic are
   encapsulated with a multicast header and follows a similar process.











Grover, et al.          Expires October 15, 2010                [Page 3]

Internet-Draft                     OTV                        April 2010


     +----+                                                       +----+
     | H1 |-------               ------------              -------| H2 |
     +----+       \             /            \            /       +----+
                   \+----+IPA  /   L3 Core    \ IPB+----+/
           ---------| X1 |----<                >---| X2 |--------
                   /+----+     \   Network    /    +----+\
                  /             \            /            \
                                 ------------


                                  +------------+
                                  | DA = IPB   |
                                  +------------+
                                  | SA = IPA   |
         +-----------+            +------------+          +-----------+
         | DMAC = H2 |            | DMAC = H2  |          | DMAC = H2 |
         +-----------+            +------------+          +-----------+
         | SMAC = H1 |            | SMAC = H1  |          | SMAC = H1 |
         +-----------+            +------------+          +-----------+
         | Payload   |            | Payload    |          | Payload   |
         +-----------+            +------------+          +-----------+



   Figure 1.  Traffic flow from H1 to H2 with encapsulation in the core.

   The key piece that OTV adds is the state to map a given destination
   MAC address in the L2 VPN to the IP address of the OTV edge device
   behind which that MAC address is located.  OTV forwarding is a
   function of mapping a destination MAC address in the VPN site to an
   edge device IP address in the overlay network.

   To achieve all this, a control plane is required to exchange the
   reachability information among the different OTV Edge Devices.  We
   will refer to this control plane as the oIGP and oMRP (Overlay IGP
   and Overlay Multicast Routing Protocol).  This is due to the fact
   that OTV does not flood unknown unicasts among Edge Devices and
   therefore precludes data-plane learning on the "overlay interface".
   Data-plane learning continues to happen on the "internal interfaces"
   to provide compatibility and transparency within the layer-2 sites
   connecting to the OTV overlay.  The edge devices appear to each VPN
   site to be providing L2 switched network connectivity amongst those
   sites.

   The required control plane utilizes IS-IS as an IGP capable of
   carrying a mix of MAC unicast and multicast addresses as well as IP
   addresses.  The information carried in IS-IS LSPs will be MAC unicast
   and multicast addresses with their associated VLAN IDs and IP next



Grover, et al.          Expires October 15, 2010                [Page 4]

Internet-Draft                     OTV                        April 2010


   hops.  The MAC addresses are those of the hosts connecting to the
   network and the IP next hops are the addresses of the Edge Devices
   through which these are reachable in the core.  Figure 2 shows what
   the resulting tables would look like in a simple two site example.
   Because MAC address on a site are advertised in IS-IS to all other
   sites, all Edge Devices will have knowledge of all MAC addresses for
   each VLAN in the VPN.

     +----+                                                       +----+
     | H1 |-------               ------------              -------| H2 |
     +----+       \             /            \            /       +----+
                 E1\+----+IPA  /   L3 Core    \ IPB+----+/E1
           ---------| X1 |----<                >---| X2 |--------
                   /+----+     \   Network    /    +----+\
                  /    Overlay1 \            /Overlay1    \
                                 ------------

     At X1                                At X2
     +----------------------------+       +----------------------------+
     | Destination | Interface/NH |       | Destination | Interface/NH |
     |----------------------------|       |----------------------------|
     | H1          | E1           |       | H1          | Overlay1:IPA |
     | H2          | Overlay1:IPB |       | H2          | E1           |
     +----------------------------+       +----------------------------+


   Figure 2.  OTV Forwarding Tables.

   Edge Devices will have an IP address on their core facing interface,
   and these nodes join a configured ASM/Bidir multicast group in the
   core transport network by sending IGMP/MLD reports like any host
   joining a group would.  So the edge boxes are hosts, relative to the
   core, subscribing to multicast groups that are created in the
   "provider network" and which rely on a provider IGP (pIGP) and a
   provider Multicast Routing Protocol (pMRP).

   Note: Only core devices participate in the pIGP/pMRP.  The edge
   devices connect as hosts to the core network and therefore do not
   participate in the pIGP/pMRP.  This is compatible and consistent with
   today's interconnection policies.

   We refer to the multicast group that the Edge Devices join as the
   "Provider Multicast Group (pMG)".  The pMG will be used for Edge
   Devices to become adjacent with each other to exchange their IS-IS
   LSPs, CSNPs, and Hellos.  Thus, by virtue of the pMG, all edge
   devices will see each other as if they were directly connected to the
   same multi-access multicast-capable segment for the purposes of IS-IS
   peering.  The pMG also defines a VPN; thus, when an Edge Device joins



Grover, et al.          Expires October 15, 2010                [Page 5]

Internet-Draft                     OTV                        April 2010


   a pMG the site becomes part of a VPN.  Multiple pMGs can be defined
   to define multiple VPNs.  IS-IS Hello authentication will be used to
   validate Edge Devices can join the adjacency set.

   The pMG could also be used to broadcast data traffic to all edge
   boxes when necessary.  Broadcast transmission will not incur head-end
   replication overhead.  OTV allows the pMRP to efficiently distribute
   broadcast traffic by the provider ASM/Bidir group.

   When forwarding of VPN multicast is required, new multicast state
   will be used in order to tailor the distribution trees to the optimal
   group of receivers, these multicast groups are to be created in the
   provider control plane (pMRP).  That is, each core device will resort
   to using SSM multicast in the core by having the Edge Device IGMPv3/
   MLDv2 join a {source, group} pair.

   Edge Devices must combine data-plane learning on their bridged
   internal interfaces with control-plane learning on their overlay
   interfaces.  The key to this combination is a series of rules through
   which data-plane events can trigger control-plane advertisements
   and/or learning events.

   OTV supports L2 multi-homing for sites where one or more of the
   bridge domains may be connected to multiple Edge Devices.  OTV
   provides loop elimination for multi-homed "sites" and does not
   require the extension of STP across sites.  This means each site can
   run it own STP rather than have to create one large STP domain across
   sites.

   OTV takes a conservative approach by employing an active-backup
   capability for all MAC addresses within a VLAN by having a single
   Edge Device forward data in and out of a site.  However, an active-
   active capability is provided across VLANs thereby allowing use of
   multiple Edge Devices to load balance traffic in and out of a L2
   multi-homed site.

1.1.  Terminology

   The term "Hello" or "Hello PDU" in this document, when not further
   qualified, includes the TRILL IIH PDU, the LAN IIH PDU and the P2P
   IIH PDU.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119.






Grover, et al.          Expires October 15, 2010                [Page 6]

Internet-Draft                     OTV                        April 2010


      Site - A Site is a single or multi-homed connected network which
      is typically under the control of a single organization.  Sites
      are connected together via Edge Devices that operate in an overlay
      network.  The Edge Devices provide layer-2 connectivity among the
      sites.  A site will not be used by IS-IS as a transit network.  A
      layer-2 site is one that is mostly made up of hosts and switches.
      Routers may exist but the majority of the topology to the Edge
      Devices are L2 switched.  The number of MAC addresses advertised
      on the overlay network are all the hosts and routers connected to
      the L2 devices at the site.  A layer-3 site is one that is mostly
      made up of routers connecting to hosts via switches.  The majority
      of the topology to the Edge Devices are L3 routed.  The number of
      MAC addresses advertised on the overlay network are limited to the
      router devices at the site.

      VPN - A VPN is a collection of sites which are controlled by a
      single administration.  The addressing plan, router and switch
      configuration is consistent as it would be if the sites were
      physically at the same location.  Each VPN uses a unique IS-IS
      authentication key and a dedicated ASM/Bidir multicast group (pMG
      - provider Multicast Group) allocated by the core network.  There
      is one overlay network per VPN.

      Edge Device - A modified L2 switch that performs OTV functions.
      It will typically run as a L2 device but can be co-located in a
      device that performs L3 routing on other L3-enabled ports.  When
      OTV functionality is described, this functionality only occurs in
      an Edge Device.

      Internal Interface - These are layer 2 interfaces connected to
      site based switches or site based routers.  The internal interface
      is layer 2 regardless if it connects to a switch or a router.

      Overlay Interface - This is a logical multi-access multicast-
      capable interface.  The overlay interface can replicate broadcast
      and multicast packets efficiently.  The interface takes L2 frames
      and encapsulates them in IP unicast or multicast headers.  The
      overlay interface is realized by one or more physical core facing
      interfaces.  The core facing interfaces are assigned IP addresses
      out of the core provider's address space.

      MAC Table - This is a forwarding table of 48-bit MAC addresses.
      The table can contain unicast or multicast MAC addresses.  The
      table is populated by two sources.  One being traditional data-
      plane learning on internal interfaces and the other by the IS-IS
      protocol at the control-plane on the overlay interface.  A MAC
      table is scoped by VLAN therefore allowing the same MAC address to
      be used in different VLANs, and potentially in different VPNs.



Grover, et al.          Expires October 15, 2010                [Page 7]

Internet-Draft                     OTV                        April 2010


      Authoritative Edge Device (AE) - This is an Edge Device that
      forwards layer 2 frames in and out of a site from and to the
      overlay interface, respectively.  There is one and only one
      authoritative Edge Device for all MAC unicast and multicast
      addresses per VLAN.  For other VLANs, another Edge Device can be
      authoritative.  A Non-Authoritative Edge Device is called a NAE.

      Site-ID - Each Edge Device which resides in an OTV site will
      advertise over the overlay network the same site-id.  Site-ID
      election is dynamically determined by the IS-IS protocol and is
      syntactically the lowest system-id value of an Edge Device within
      the site.

      (VLAN, uMAC) - This is the designation of layer-2 network
      reachability information as encoded in IS-IS and as stored in the
      MAC table.  This notation describes a given unicast MAC address
      within a particular VLAN.

      (VLAN, mMAC, mIP) - This is the designation of layer-2 network
      reachability information as encoded in IS-IS and as stored in the
      MAC table.  This notation describes a given multicast MAC address
      within a particular VLAN.  The 'mIP' part of the 3-tuple is
      provided so the SSM based tree is joined based on the IP group
      address (since 32-to-1 aliasing can happen for IPv4 group address
      to MAC mappings and worse for IPv6).


2.  Overlay Control Plane

   In this section we discuss the control plane hierarchy.  At the very
   base of the hierarchy we find the provider control plane, which
   enables unicast reachability among the edge boxes and also provides
   the multicast group that makes edge boxes adjacent from the overlay
   control plane perspective.  At the next level, the overlay control
   plane conveys client-MAC-address reachability information between the
   edge devices.

   In general, the control planes are independent of each other.
   However, in order to optimize multicasting, multicast control-plane
   events (reports, joins, leaves) that occur in one MRP may initiate
   events in another MRP so that the optimal tree is always being used
   to forward traffic.  Also, events in the overlay control plane are
   triggered by forwarding events in the client data plane (however both
   client and overlay control planes remain independent of each other).







Grover, et al.          Expires October 15, 2010                [Page 8]

Internet-Draft                     OTV                        April 2010


2.1.  Provider Control Plane

   The provider control plane is the set of routing protocols which run
   in the core infrastructure to be able to deliver packets sourced from
   the site networks.  There is no required coordination of routing
   protocols between the site and the core.  That is, no more than
   typically necessary to connect to a core service.  In terms of
   addressing, the Edge Device is allocated an IP address out of the
   core block of addresses.

   For each VPN the edge device is to support, an ASM/Bidir multicast
   group is required.  That is, only one group per VPN.  The multicast
   state created in the client site network will map to the some amount
   of state in the core network.  However, no group address allocation
   is needed for these data groups.  Since SSM is used in the overlay,
   uniqueness of multicast is achieved by the uniqueness of the address
   assigned to the edge device.  The edge device takes a client
   multicast packet and encapsulates it in a core-deliverable multicast
   packet.

2.2.  Overlay Control Plane

   The overlay control plane conveys MAC reachability information
   between Edge Devices.  The MAC addresses that are locally connected
   to an Edge Device are advertised in the overlay IGP to other Edge
   Devices in the VPN.  Thus, MAC learning on the overlay is not based
   on data plane flooding, but is based on explicit advertisements of
   unicast and multicast MAC addresses.  This advertisement of MAC
   addresses is done by the overlay control plane.

   The overlay IGP establishes adjacencies only between Edge Devices
   that are in the same VPN.  As explained in the previous section, Edge
   Devices become part of a VPN when they join a multicast group defined
   in the core (provider-MRP); members of the same group are members of
   the same VPN.  The hellos and updates between overlay-IGP peers
   travel over the multicast group defined in the pMRP.  Thus, edge
   devices peer with each other as if they were directly connected at
   layer 2.  This peering is possible as all the traffic for the oIGP is
   encapsulated with the pMRP group address and sent into the core.
   Thus, all edge devices in a given VPN receive the oIGP multicast
   traffic as if they were all on the same segment.

   Similarly, the overlay-MRP traffic is encapsulated with the pMRP
   group address corresponding to the VPN.  The overlay-MRP is used to
   inform all the Edge Devices that the subscribers to a particular
   group are reachable over the overlay network.  Thus, the Edge Devices
   snoop IGMP/MLD reports and then the oMRP notifies all edge devices in
   the VPN which group has been joined by sending an MCAST PDU with the



Grover, et al.          Expires October 15, 2010                [Page 9]

Internet-Draft                     OTV                        April 2010


   Group MAC address in it.  The information conveyed by the oMRP is
   used solely for the Edge Devices to populate their oif-list at the
   source site.  Edge Devices on the receiving sites will IGMPv3/MLDv2
   join the corresponding (S,G) group in the provider plane (pMRP) when
   they snoop the IGMP/MLD traffic from the site.  Thus, multicast trees
   are built natively in the core, not on the overlay.

2.3.  Advertising unicast and multicast information

   IS-IS is used as the overlay IGP and MRP.  When a MAC address is
   learned by arrival of a data packet on an internal interface, the MAC
   address is placed in an IS-IS LSP if and only if the Edge Device is
   authoritative for the VLAN the MAC resides in.  Likewise, when a
   multicast MAC address is learned by IGMP/MLD snooping, a
   (VLAN,mMAC,mIP) entry is placed in an MCAST PDU by an authoritative
   Edge Device.  When an Edge Device is authoritative, it advertises a
   unicast MAC address as soon as it learns the MAC on an internal
   interface.  The TLV used for this purpose is the Unicast MAC RI TLV
   defined in [IS-IS-Layer2].

   As for multicast MAC addresses, when an IGMP/MLD report is received
   on an internal interface, the authoritative Edge Device will
   advertise the multicast MAC address in an MCAST PDU with metric 1.
   The TLV used for this purpose is the GADDR-TLV and its sub-TLVs as
   defined in [IS-IS-Layer2].

   Essential to OTV is that IS-IS is conveying not just MAC address
   information amongst the edge devices in a given VPN, it is also
   implying that those MAC address may be mapped to the IP addresses of
   the advertising edge devices for the purposes of "MAC in IP"
   forwarding across the overlay.

   To allow scalability of connecting large L2 sites together via the
   overlay, by default, an Edge Device will not report any (VLAN-ID,
   MAC) pairs.  To avoid inadvertent merging of VLANs among sites, Edge
   Devices will be required to configure the VLANs for which Edge
   Devices will advertise MACs for.

2.4.  Selecting next-hops for a MAC address entry

   When a site is multi-homed, a unicast MAC address is advertised by a
   single Edge Device.  That is, the Edge Device that is authoritative
   for the VLAN.  Therefore, remote Edge Devices should never see an
   equal-cost path to a MAC address at a remote site through multiple
   Edge Devices.

   The same can be said for multicast MAC addresses even though next-hop
   calculations are not necessary.  All a remote Edge Device cares about



Grover, et al.          Expires October 15, 2010               [Page 10]

Internet-Draft                     OTV                        April 2010


   is if there is at least one site that is a member of a multicast
   group so it knows to forward packets address to the group on the
   overlay network.  So only the authoritative Edge Device for a
   multicast MAC address will advertise the address in an MCAST PDU.

2.5.  Edge Device connectivity

   In order to successfully connect to the overlay, the Edge Device has
   several functions on its different interfaces.  These are summarized
   in this section.

2.5.1.  Edge Devices are IP hosts and MAC routers

   The Edge Device does not participate in the provider IGP (pIGP) as a
   router, but as a host.  The Edge Device has an IP address which is
   significant in the core/provider addressing space.  The Edge Device
   joins the multicast groups in the core by issuing IGMPv3/MLDv2
   reports, just like a host would.  Thus the Edge Device does not have
   an IGP relationship with the core.

   However, the Edge Device does participate in the overlay-IGP and its
   IP address is used as a router ID and a next hop address for unicast
   traffic by the overlay IGP.  However, the Edge Device does not build
   an IP routing table with the information received from the oIGP, but
   rather builds a hybrid table where MAC address destinations are
   reachable via IP next hop addresses.  If we were to name this
   somehow, we could call it a MAC router because it can route packets
   based on MAC addresses.

   Thus, Edge Devices are IP hosts in the provider plane, MAC routers in
   the overlay plane and bridges in the client bridging plane.

2.5.2.  Internal Interface Behavior

   The internal interfaces on an Edge Device are bridged and are
   indifferent to whether the site itself is L2 or L3.  These interfaces
   behave as regular switch interfaces and learn the source MAC
   addresses of traffic they receive.  Spanning tree BPDUs are received,
   processed and sourced on internal interfaces as they would on a
   regular 802.1d, 802.1s and 802.1w switch.  Additionally, traffic
   received on internal interfaces may trigger oIGP/oMRP advertisements
   and/or pMRP group joins according to the rules described in earlier
   in the section on "Advertising Unicast and Multicast MAC addresses".

   Traffic received on an internal interface will be forwarded according
   to the MAC tables either onto another internal interface (regular
   bridging) or onto the overlay (OTV forwarding).  This is explained in
   detail in Forwarding section.



Grover, et al.          Expires October 15, 2010               [Page 11]

Internet-Draft                     OTV                        April 2010


2.5.3.   Overlay Interface Behavior

   Overlay interfaces are interfaces which have an IP address in the
   provider/core address space.  Traffic out of these interfaces is
   encapsulated with an IP header, and traffic received on these
   interfaces must be de-capsulated to produce a L2 frame.  The detail
   on forwarding is discussed in Forwarding section.

   STP BPDUs are not sourced from overlay interfaces, therefore there
   should not be STP BPDUs in the core, nor do the overlay interfaces
   participate in the spanning tree protocol.

   Even though the overlay interface has an IP address, it does not
   participate in the provider IGP or MRP, it behaves as a host
   connected to the core.  The IP addresses assigned to the overlay
   interfaces are used as next hop addresses by the overlay-IGP,
   therefore the address table for the overlay interface will include a
   remote IP address as the next hop information for remote MAC
   addresses.


3.  Forwarding Process and Rules

3.1.  Forwarding Process

   Most of the interesting forwarding cases happen when a packet comes
   from the Overlay Link to be forwarded to an Internal Link, or vice
   versa.  But for completeness, we will describe how forwarding between
   Internal Links will occur.

3.1.1.  Forwarding between Internal Links

   When an Edge Device has internal links, it operates like a
   traditional L2 switch.  That is it will send unicast packets on a
   port where the MAC was learned, it will send multicast packets on the
   ports it has IGMP/MLD-snooped, and it will send broadcast packets out
   all ports.  This is done on a per VLAN basis.

3.1.2.  Forwarding from an Internal Link to an Overlay Link

   An Edge Device will decide to forward a unicast, multicast, or
   broadcast packet over the overlay interface when oIGP has put the
   logical port of the overlay interface in the MAC table for the
   corresponding unicast or multicast MAC address.  When a packet is
   sent over the overlay interface, it is first prepended with an IP
   header.  The packet as received from the internal interface is not
   touched other than to remove the preamble and FCS from the frame.
   The IP header, outer MAC header, and physical port the packet is to



Grover, et al.          Expires October 15, 2010               [Page 12]

Internet-Draft                     OTV                        April 2010


   go out is all cached in the hardware.  This is so all the information
   to physically forward the packet is all together to easily prepend
   and send at high rate.

   The IP addresses and the outer MAC addresses are all provided and
   stored for the hardware by the control-plane software.  The multi-
   homing of sites imposes additional rules on the forwarding of
   traffic.

3.1.3.  Forwarding from an Overlay Interface to an Internal Interface

   When a packet is received on the overlay interface, it will need to
   be IP de-capsulated to reveal the inner MAC header for forwarding.
   The inner MAC header SA and DA addresses will used for the MAC table
   lookup.  When a unicast or multicast packet is received on the
   overlay interface, an Edge Device must determine if it should forward
   it based on its authoritative status for the VLAN.

3.1.4.  Overlay Forwarding and Native Forwarding Concurrently

   There are cases where a VLAN will have some MACs that will be
   advertised and forwarded over the overlay network and others that
   will have their packets forwarded natively on physical interfaces.
   This can be controlled by policy configuration on the Edge Device.

   By default, a VLAN is not adertised on the overlay network, therefore
   forwarding cannot occur over on the overlay network.  When a VLAN is
   enabled, an Edge Device will begin advertising locally learned MAC
   addresses in IS-IS.  If the MAC needs to be connected through the
   core natively, the network adminstrator must set up a route-filter
   based access-list to deny advertising the MAC.

3.2.  STP BPDU Handling

   Since the Edge Device acts as an L2 switch it does participate in the
   Spanning Tree Protocol if the site has been configured to use it.
   The OTV design does not depend on the Spanning Tree Protocol for any
   of the VPN connectivity functionality.  The following are the rules
   an OTV Edge Device will follow:

   o  When STP is configured at a site, an Edge Device will send and
      receive BPDUs on internal interfaces.  An OTV Edge Device will not
      originate or forward BPDUs on the overlay network.

   o  An OTV Edge Device can become a root of one or more spanning
      trees.





Grover, et al.          Expires October 15, 2010               [Page 13]

Internet-Draft                     OTV                        April 2010


   o  An OTV Edge Device will take the typical action when receiving
      Topology Change Notification (TCNs) messages.

   o  When on OTV Edge Device detects another Edge Device in it's site
      has come up or gone down, it may send a TCN so it can gather new
      state for when its authoritative status changes for a VLAN.

   To allow the L2 switch network to scale to larger number of nodes and
   MAC addresses, it is considered a feature of OTV to maintain and keep
   the spanning trees small and per site.


4.  Adjacency Server information

   In case the provider core does not support ASM/Bidir multicast, there
   is an alternate mechaism to discover the remote Edge Devices which
   are part of a VPN.  In this scenario, an Edge Device is configured as
   an Adjacency Server.  All other Edge Devices inform the Adjacency
   Server regarding their reachability and capability information using
   the Adjacency Server TLV defined in [IS-IS-Layer-2].  Adjacency
   Server is responsible for informing all the other existing Edge
   Devices regarding addition or loss of an Edge Device.  Based on the
   reachability information, the Edge Devices can further communicate
   with one another directly using unicast or multicast data path.


5.  Authoritative Edge Device Election

   Authoritative Edge Device for a VLAN is selected from a list of local
   Edge Devices in an site.  The AE Device selection algorithm tries to
   ensure that the VLANs are evenly spread across all the Edge Devices
   supporting a specific VPN.  The VPN specific information is
   advertised by an Edge Device using Site Group TLV defined in [IS-IS-
   Layer-2]


6.  Site Identifier

   Site-ID information is sent out to all the remote Edge Devices using
   Site Identifier TLV defined in [IS-IS-Layer-2].  This information is
   used by the remote EDs to identify all the edge devices belonging to
   the same site.


7.  Acknowledgements

   The authors would like to thank many for their careful review.  They
   include Venu Nair, Victor Moreno, Ashok Chippa, Sameer Merchant, Tony



Grover, et al.          Expires October 15, 2010               [Page 14]

Internet-Draft                     OTV                        April 2010


   Speakman, Raghava Sivaramu, Nataraj Batchu, Sreenivas Duvvuri, Gaurav
   Badoni, Veena Raghavan, Marc Woolward and Tim Stevenson.

   Many have received individual presentations of OTV and provided
   critical feedback early in the design process.  These reviewers
   include Vince Fuller, Peter Lothberg, Dorian Kim, Peter Schoenmaker,
   Mark Berly, Scott Kirby, Dana Blair, Tom Edsall, Dinesh Dutt,
   Parantap Lahiri, and Jeff Jensen.


8.  Security Considerations

   This document adds no additional security risks to IS-IS, nor does it
   provide any additional security for IS-IS.


9.  IS-IS Code point Considerations

   This document uses the new PDU types, namely the MCAST PDU, MCAST-
   CSNP PDU, the MCAST-PSNP PDU as defined in [IS-IS-Layer-2].  This
   document uses a set of new IS-IS TLVs, the MAC-Reachability TLV (type
   141), the Group Address TLV (type 142) and its sub-TLVs, some sub-
   TLVs of the Port-Capability TLV (type 143), and Group Member Active
   Source TLV (type 146) that are defined in the [IS-IS-Layer-2].


10.  References

10.1.  Normative References

   [IS-IS]    ISO/IEC 10589, "Intermediate System to Intermediate System
              Intra-Domain Routing Exchange Protocol for use in
              Conjunction with the Protocol for Providing the
              Connectionless-mode Network Service (ISO 8473)", 2005.

   [IS-IS-Layer-2]
              Banerjee, A., et al., "Extensions to IS-IS for Layer-2
              Systems", draft-ietf-isis-layer2, work in progress 2010.

10.2.  Informative References

   [IEEE 802.1aq]
              "Standard for Local and Metropolitan Area Networks /
              Virtual Bridged Local Area Networks / Amendment 9:
              Shortest Path Bridging, Draft IEEE P802.1aq/D1.5", 2008.

   [RBRIDGES]
              Perlman, R., Eastlake, D., Dutt, D., Gai, S., and A.



Grover, et al.          Expires October 15, 2010               [Page 15]

Internet-Draft                     OTV                        April 2010


              Ghanwani, "RBridges: Base Protocol Specification", 2010.


Authors' Addresses

   Hasmit Grover
   Cisco Systems
   170 W Tasman Drive
   San Jose, CA  95138
   US

   Email: hasmit@cisco.com


   Dhananjaya Rao
   Cisco Systems
   170 W Tasman Drive
   San Jose, CA  95138
   US

   Email: dhrao@cisco.com


   Dino Farinacci
   Cisco Systems
   170 W Tasman Drive
   San Jose, CA  95138
   US

   Email: dino@cisco.com





















Grover, et al.          Expires October 15, 2010               [Page 16]


Html markup produced by rfcmarkup 1.107, available from http://tools.ietf.org/tools/rfcmarkup/