BESS WG                                                          Y. Wang
Internet-Draft                                                   R. Chen
Intended status: Standards Track                         ZTE Corporation
Expires: 19 June 2021                                   16 December 2020


                          Light Weighted EVPN
            draft-wang-bess-evpn-cmac-overload-reduction-03

Abstract

   When PBB EVPN [RFC7623] is used in Segment Routing networks, it is
   complicated to make use of the SID list to carry a function that
   targets C-MACs.

   The End.DX2 function is defined in
   [I-D.ietf-spring-srv6-network-programming], and it can be used in
   EVPN VPLS.  When it is used in EVPN VPLS, the data-plane learning
   defined for the End.DT2U function can also be activated for the
   End.DX2 function.  On the basis of such an End.DX2 function, SRv6
   EVPN can meet all the requirements of [RFC7623] and bring some other
   benefits.  Such SRv6 EVPN is called light-weighted SRv6 EVPN, and it
   is simpler than PBB EVPN over SRv6.

   It is easy for the light-weighted SRv6 EVPN to carry a SID that
   targets customer Ethernet packets, because there is no other
   Ethernet header between the SID list and the customer Ethernet
   header.  These SIDs may be user-defined functions for the customer
   Ethernet headers.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 19 June 2021.





Wang & Chen               Expires 19 June 2021                  [Page 1]


Internet-Draft                  EVPN-lite                  December 2020


Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Simplified BSD License text
   as described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   3
     1.1.  Background  . . . . . . . . . . . . . . . . . . . . . . .   3
     1.2.  Overview  . . . . . . . . . . . . . . . . . . . . . . . .   4
     1.3.  Terminology . . . . . . . . . . . . . . . . . . . . . . .   5
   2.  Requirements  . . . . . . . . . . . . . . . . . . . . . . . .   7
     2.1.  No C-MAC Awareness in the Backbone  . . . . . . . . . . .   7
     2.2.  EVPN IRB Support  . . . . . . . . . . . . . . . . . . . .   7
     2.3.  Unified Encapsulation per Scenario  . . . . . . . . . . .   7
     2.4.  ESI Features Remain Supported . . . . . . . . . . . . . .   8
     2.5.  Flexible Multi-homing Remains Supported . . . . . . . . .   8
     2.6.  C-MAC Address Learning and Confinement  . . . . . . . . .   8
     2.7.  No C-MAC Flushing for All-Active ESes . . . . . . . . . .   8
     2.8.  Independent C-MAC Flushing for Single-Active ESes . . . .   9
     2.9.  Independent Convergency per <ESI, EVI>  . . . . . . . . .   9
     2.10. Route Aggregation and Default Route in Backbone . . . . .   9
     2.11. ARP Suppression . . . . . . . . . . . . . . . . . . . . .   9
     2.12. ESI Indicator Aggregation . . . . . . . . . . . . . . . .   9
     2.13. Unequal load-balance  . . . . . . . . . . . . . . . . . .  10
     2.14. AC-aware Service Interface  . . . . . . . . . . . . . . .  10
     2.15. ESI-agnostical Core-Routers . . . . . . . . . . . . . . .  10
   3.  Light-Weighted EVPN Overview  . . . . . . . . . . . . . . . .  11
     3.1.  Use Case  . . . . . . . . . . . . . . . . . . . . . . . .  11
     3.2.  Packet Walkthrough  . . . . . . . . . . . . . . . . . . .  11
   4.  Light-Weighted SRv6 EVPN  . . . . . . . . . . . . . . . . . .  13
     4.1.  SRv6 Solution Overview  . . . . . . . . . . . . . . . . .  13
       4.1.1.  Aggregatable End.DX2 SID and End.DX2AGG SID . . . . .  13
       4.1.2.  The Advertisement of ESI-IPs  . . . . . . . . . . . .  14
     4.2.  SRv6-specific EVPN-lite Procedures  . . . . . . . . . . .  15
       4.2.1.  End.DX2AGG Function and Arg.ACI . . . . . . . . . . .  16
   5.  Advanced Considerations . . . . . . . . . . . . . . . . . . .  17
     5.1.  ESI Indicator Advertisement Optimization  . . . . . . . .  17
       5.1.1.  Advertise ESI SIDs in Underlay Network  . . . . . . .  17
       5.1.2.  Advertise ESI SIDs for Overlay Network  . . . . . . .  17
       5.1.3.  Advertise AC SIDs for Overlay Network . . . . . . . .  18
     5.2.  Unequal LB Advertisement  . . . . . . . . . . . . . . . .  18
     5.3.  EVPN Egress Protection  . . . . . . . . . . . . . . . . .  19
       5.3.1.  EVPN Egress Node Protection . . . . . . . . . . . . .  19
       5.3.2.  EVPN Egress Link Protection . . . . . . . . . . . . .  19
     5.4.  C-MAC Flush Notification Procedure  . . . . . . . . . . .  20
     5.5.  E-Tree Support Considerations . . . . . . . . . . . . . .  20
     5.6.  EVPN IRB Support Considerations . . . . . . . . . . . . .  20
     5.7.  Use AC SID in MAC/IP Advertisement Routes . . . . . . . .  20
   6.  Light-Weighted MPLS EVPN  . . . . . . . . . . . . . . . . . .  20
     6.1.  MPLS Solution Overview  . . . . . . . . . . . . . . . . .  20
     6.2.  MPLS-specific EVPN-lite Procedures  . . . . . . . . . . .  22
     6.3.  Hierarchical VPLS in EVPN-lite  . . . . . . . . . . . . .  24
   7.  Comparison with Other Solutions . . . . . . . . . . . . . . .  25
     7.1.  Detailed Comparisons with PBB EVPN over SRv6  . . . . . .  25
     7.2.  Detailed Comparisons with Anycast Node SID  . . . . . . .  26
   8.  Security Considerations . . . . . . . . . . . . . . . . . . .  26
   9.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .  26
     9.1.  End.DX2AGG SID  . . . . . . . . . . . . . . . . . . . . .  26
     9.2.  Global Unique ESI-label in EAD per ES Route . . . . . . .  27
   10. Acknowledgements  . . . . . . . . . . . . . . . . . . . . . .  27
   11. Normative References  . . . . . . . . . . . . . . . . . . . .  27
   12. Informative References  . . . . . . . . . . . . . . . . . . .  28
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  30

1.  Introduction

1.1.  Background

   When there are too many customer MACs (C-MACs), the RRs and/or ASBRs
   will be overloaded by the RT-2 routes that are advertised for these
   MACs according to [RFC7432].  This issue can be solved simply by
   having the remote C-MAC entries learnt via data-plane MAC learning
   (as PBB VPLS has done since [RFC7041]) rather than received from
   RT-2 routes.  This simplified solution works as well as PBB VPLS,
   but it loses many important features that are based on the ESI
   concept, because the ingress ESI cannot be learnt via data-plane MAC
   learning at the egress PE.  When data packets are forwarded
   following these MAC entries, they cannot benefit from the EAD/EVI
   routes of [RFC7432], so the All-Active redundancy mode for an ES
   cannot be supported.  As a result, the simplified solution cannot
   work as well as PBB EVPN ([RFC7623]).

   This document proposes some new extensions to [RFC7432] to achieve
   all-active mode ES redundancy on TPEs and, at the same time, reduce
   the C-MAC load on RRs and ASBRs.  With the help of these extensions,
   the new solution is intended to work better than PBB EVPN,
   especially when no MPLS data plane is deployed.

   Furthermore, it naturally brings the benefits of high scalability,
   faster network convergence, and reduced operational complexity.
   Because of these advantages, we call it light-weighted EVPN.

1.2.  Overview

   In [RFC7432], C-MACs are advertised via RT-2 routes.  This behavior
   is inherited by [RFC8365] and [I-D.ietf-bess-srv6-services].  But in
   order to solve the C-MAC overload problem for RRs and ASBRs, we have
   to return to PBB-like data-plane C-MAC learning procedures.

   Section 2 discusses all the requirements for a light-weighted EVPN
   solution that pushes no C-MAC entries into the backbone network.
   Note that some of these requirements are not well supported by PBB
   EVPN.

   In this document, the light-weighted EVPN solutions are also called
   EVPN-lite for short.  A total of four EVPN-lite solutions have been
   proposed since [Revision-01]: VXLAN over EVPN IP-VRF, light-weighted
   VXLAN EVPN, light-weighted MPLS EVPN, and light-weighted SRv6 EVPN.
   This revision, however, focuses on the SRv6 EVPN and SR-MPLS EVPN
   solutions.

   In order to compare these solutions with [RFC7348] and [RFC7623],
   whose C-MAC entries are also not pushed into the backbone network,
   two terms are introduced in this document so that the comparisons
   can be made in unified terminology.  One term is "Global ESI
   Indicator (GEI)", which corresponds to the B-MAC in PBB EVPN.  The
   other term is "EVI's Global Discriminator (EGD)", which corresponds
   to the I-SID in PBB EVPN.

   Note that the EVI here corresponds to the I-Component of [RFC7623],
   not the B-Component.  In fact, there are no typical B-Components in
   some of the above solutions.

   Note that the GEI and EGD differ considerably between EVPN-lite
   solutions.  The details are described in Section 4.

   On the basis of the GEI concept, we define two route types for
   EVPN-lite.  The first is the GEI/ES route, which corresponds to the
   RT-2 route in PBB EVPN.  The second is the GEI/EVI route, which
   corresponds to the EAD/EVI route in [RFC7432].

   The details of these terms are described in Section 1.3.

1.3.  Terminology

   Most of the terminology used in this document comes from [RFC7432]
   and [I-D.ietf-bess-srv6-services] except for the following:

   *  Light-weighted EVPN: The EVPN solution with high scalability and
      reduced operational complexity.

   *  EVPN-lite: The Light-weighted EVPN is also called EVPN-lite for
      short.

   *  C-MAC: Customer MAC.  It is the same as the C-MAC of PBB EVPN.

   *  ISID: A broadcast domain identifier in the PBB I-Component.

   *  LDV: Local Discriminating Value.  It is similar to the Local
      Discriminating Value of a type 3 ESI.

   *  GDV: Global Discriminating Value.  An identifier with global
      uniqueness.

   *  EGD: EVI-GDV, an EVI's Global Discriminator.  It is the Global
      Discriminating Value (GDV) of an EVPN Instance (EVI) and is used
      to identify that EVI in the data plane.  For example, the EGD of
      [RFC7348] is a global VNI.

   *  ESI Indicator: A global ID for an ESI.  Note that different PEs
      may assign different ESI indicators to the same ESI, especially
      when the ES redundancy mode is single-active.  For example, the
      ESI indicator of [RFC7623] is the B-MAC.

   *  GEI: Global ESI Indicator.  It is the same as the "ESI Indicator"
      except that it emphasizes global uniqueness.  A GEI is used in
      the data plane to identify an ESI, because it is globally unique
      across the service domain of the corresponding EVPN Instance
      (EVI).  However, an ESI may have several GEIs, one per TPE,
      especially in the single-active mode of ES redundancy.  In E-Tree
      scenarios, an ESI may have two GEIs on the same PE, one for Root
      ACs and one for Leaf ACs.  For example, the GEIs for an ESI of
      [RFC8317] are two B-MACs, one for Root ACs and one for Leaf ACs.

   *  GEI/ES: The EVPN route that is used to advertise the relation
      between an ESI and its GEI.  Note that the GEI/ES route is
      advertised on a per-ESI basis by a given PE.  In PBB EVPN, the
      GEI/ES route is the MAC Advertisement Route.  Note that different
      solutions may have different GEI/ES routes, and that a GEI/ES
      route does not have to be an EAD/ES route.

   *  EAD/EVI: An Ethernet A-D route per EVI.

   *  GEI/EVI: The EVPN route that is used to advertise the relation
      between an <ESI/GEI, EVI> pair and its EVPN label and MPLS
      nexthops.  Note that in PBB EVPN, such a route is not used.  Note
      that different solutions may have different GEI/EVI routes, and
      that a GEI/EVI route does not have to be an EAD/EVI route.

   *  ARG.ACI: The argument part of a SID of the End.DX2AGG function is
      called ARG.ACI, because the value of that argument is an AC-ID.

   *  RT-2: MAC/IP Advertisement Route.

   *  MAC Entry: An entry in the EVPN MAC table in the data plane.

   *  ESI SID: An SRv6 SID whose function type is End.DX2AGG.  Note that
      when the ESI is in all-active mode, the ESI SID is the same on all
      PEs of that ES, according to Section 4.1.  In such a case, the ESI
      SID can also be called an ES anycast SID.

   *  ESI IP: An End.DX2AGG SID with its Argument part set to zero.

   *  VXLAN EVPN: EVPN per [RFC8365].

   *  EVPN VXLAN: A broadcast domain per [RFC7348] that uses the IMET
      routes of [RFC8365] to construct VXLAN tunnels.  Note that an EVPN
      VXLAN will not use EAD/EVI routes or MAC/IP Advertisement Routes.

   *  SPE - Stitching PE, a PE that performs the label swapping
      operation for the EVPN labels.  It is similar to the SPE of
      MS-PWs.

   *  TPE - Target PE, a PE that performs EVPN forwarding for the
      overlay network.

   *  PLR - A router at the point of local repair in the underlay
      network.  In egress node protection, it is the penultimate hop
      router on an anycast tunnel.






   *  Anycast ECMP SID - An anycast SID that is load-balanced by the
      underlay network.

   *  Anycast FRR SID - An anycast SID that is fast-rerouted by the
      underlay network.

2.  Requirements

   EVPN C-MAC Reduction should be provided together with the following
   requirements:

2.1.  No C-MAC Awareness in the Backbone

   In typical operation, an EVPN PE sends a BGP MAC Advertisement route
   per C-MAC address.  In certain applications, this poses scalability
   challenges, as is the case in data center interconnect (DCI)
   scenarios where the number of virtual machines (VMs), and hence the
   number of C-MAC addresses, can be in the millions.  This is called
   C-MAC overload of the DC backbone.  In such scenarios, it is
   required to reduce the number of BGP MAC Advertisement routes by
   relying on an 'EVPN-lite' scheme, as is provided by the ESI and its
   equivalents (e.g., Pseudo B-MAC, ESI IP).

2.2.  EVPN IRB Support

   PBB-VPLS/PBB-EVPN is not friendly to the IRB use case because of its
   complicated protocol stack, so up to now it has been used in the
   industry only for pure L2VPN use cases.

   The solution should provide efficient forwarding performance in EVPN
   IRB use cases.

2.3.  Unified Encapsulation per Scenario

   PBB EVPN, especially the MPLS encapsulation of its B-VPLS, is
   typically not used in DC scenarios.  Adopting it would therefore
   bring PBB and MPLS encapsulation into the DC backbone solely because
   of the C-MAC overload problem.  EVPN IRB is widely deployed in DC
   scenarios, but PBB EVPN is not friendly to EVPN IRB use cases, so
   different solutions would have to be used for the EVPN IRB and C-MAC
   reduction use cases.  We believe that if a VXLAN/Geneve data plane
   is chosen, it is preferable to use the same data plane in all use
   cases, e.g., EVPN IRB and C-MAC reduction.  It is therefore
   necessary to make NVO3/MPLS/SRv6 EVPN support Section 2.1 in order
   to provide a unified solution for data center and other scenarios.

2.4.  ESI Features Remain Supported

   Two redundancy modes are defined in [RFC7432].  They are All-Active
   mode and Single-Active mode.

   In All-active mode, the C-MAC movement among the different adjacent
   PE nodes of the same ESI should not be considered as C-MAC mobility.
   In Single-Active mode, such movements can be considered as C-MAC
   mobility.

2.5.  Flexible Multi-homing Remains Supported

   Flexible multi-homing means that different ES instances can have
   different adjacent PEs.  In this document, the set of all adjacent
   PEs of an ES instance is called that ES's location-set, so flexible
   multi-homing means that different ESes can have different
   location-sets.
   For example, ES1's location-set is {PE1}, ES2's location-set is {PE2,
   PE3}, ES3's location-set is {PE1, PE3}, and ES4's location-set is
   {PE2,PE4}.
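
   As a non-normative illustration, the location-sets of the example
   above can be modelled as plain sets.  The sketch below is only one
   possible representation; the names location_sets, is_flexible and
   shared_pes are hypothetical and do not appear in any EVPN
   specification.

```python
# Hypothetical model of the example above: each ES maps to its
# location-set, i.e. the set of PEs that the ES is attached to.
location_sets = {
    "ES1": frozenset({"PE1"}),
    "ES2": frozenset({"PE2", "PE3"}),
    "ES3": frozenset({"PE1", "PE3"}),
    "ES4": frozenset({"PE2", "PE4"}),
}


def is_flexible(sets):
    """Return True if at least two ESes have different location-sets."""
    return len(set(sets.values())) > 1


def shared_pes(sets, es_a, es_b):
    """Return the PEs that are adjacent to both ESes (may be empty)."""
    return sets[es_a] & sets[es_b]
```

   With this model, ES2 and ES3 share only PE3, while ES1 and ES4 share
   no PE at all, which is exactly what flexible multi-homing permits.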

2.6.  C-MAC Address Learning and Confinement

   In EVPN, all the PE nodes participating in the same EVPN instance are
   exposed to all the C-MAC addresses learnt by any one of these PE
   nodes because a C-MAC learnt by one of the PE nodes is advertised in
   BGP to other PE nodes in that EVPN instance.  This is the case even
   if some of the PE nodes for that EVPN instance are not involved in
   forwarding traffic to, or from, these C-MAC addresses.  Even if an
   implementation does not install hardware forwarding entries for C-MAC
   addresses that are not part of active traffic flows on that PE, the
   device memory is still consumed by keeping record of the C-MAC
   addresses in the routing information base (RIB) table.  In network
   applications with millions of C-MAC addresses, this introduces a non-
   trivial waste of PE resources.  As such, it is required to confine
   the scope of visibility of C-MAC addresses to only those PE nodes
   that are actively involved in forwarding traffic to, or from, these
   addresses.

2.7.  No C-MAC Flushing for All-Active ESes

   Just as in [RFC7432], it is required to avoid C-MAC address flushing
   upon link, port, or node failure for remote All-Active multihomed
   segments.








2.8.  Independent C-MAC Flushing for Single-Active ESes

   Just as in [RFC7432], upon a link or port failure of a single-active
   ES, the C-MACs of other single-active ESes on the same PE will not
   be flushed.

2.9.  Independent Convergency per <ESI, EVI>

   When the physical port of an All-Active ES works well but a single
   Ethernet Tag ID (ETI) of that ES fails, the traffic to that ETI of
   that ES will be re-routed to another adjacent PE of the same ES, but
   the traffic to other ETIs of the same ES will not be affected.

   Note that when an AC (ES link) fails but the PE node still works
   well, there should not be steady bypassing traffic either.  The
   steady bypassing problem is discussed in
   [I-D.wang-bess-evpn-egress-protection].

2.10.  Route Aggregation and Default Route in Backbone

   The per-ESI routes can be aggregated in the backbone network.  Even
   a default route should be supported when the B-Component is an EVPN
   IP-VRF (e.g., in VXLAN over IP-VRF solutions).

   In SRv6 EVPN, different sub-interfaces of the same ESI can have
   different ESI indicators in order to achieve Independent Convergency
   per <ESI, EVI>.  But only their common prefix should be advertised
   (both in the underlay network and in the overlay network) before any
   of the sub-interfaces fails.
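
   The common prefix mentioned above can be computed mechanically.  The
   following sketch is illustrative only: common_prefix is a
   hypothetical helper, and the /80 prefixes in the usage note are
   made-up values from the IPv6 documentation range, not values taken
   from this document.

```python
import ipaddress


def common_prefix(prefixes):
    """Return the longest prefix that covers every given prefix."""
    nets = [ipaddress.ip_network(p) for p in prefixes]
    if not nets:
        raise ValueError("at least one prefix is required")
    first = nets[0]
    # Start at the shortest given prefix length and shorten the
    # candidate until it covers all per-sub-interface indicators.
    for length in range(min(n.prefixlen for n in nets), -1, -1):
        candidate = first.supernet(new_prefix=length)
        if all(n.subnet_of(candidate) for n in nets):
            return candidate
    raise ValueError("no common prefix found")
```

   For instance, for the three hypothetical per-sub-interface
   indicators 2001:db8:1:1:1::/80, 2001:db8:1:1:2::/80 and
   2001:db8:1:1:3::/80, the function returns 2001:db8:1:1::/78, which
   is the single prefix a PE would advertise while all sub-interfaces
   are up.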

2.11.  ARP Suppression

   ARP suppression requires <IP,MAC> entries to be steadily held on all
   TPEs, so it conflicts with Section 2.6.  But if the C-MAC
   confinement requirement is not so important in some scenarios, ARP
   suppression can be activated.  This is an option.

2.12.  ESI Indicator Aggregation

   There is an obvious difference between "ESI Route Aggregation" and
   "ESI Indicator Aggregation".  "ESI Route Aggregation" means that some
   ESI indicators are advertised by underlay protocols in an aggregated
   manner, but different ESIs still have different ESI indicators.
   "ESI Indicator Aggregation" means that different ESIs use the same
   ESI indicator.

   Note that "ESI Route Aggregation" is recommended whenever it is
   possible, but "ESI Indicator Aggregation" can only be used under
   certain constraints.

   When two ESes are attached to the same redundancy group of PEs, they
   can share the same ESI indicator.  But this raises some issues too.
   One issue is that they may be attached to different groups of PEs in
   the future.  Another issue is that when only one of the ESes fails,
   the ESI indicator can't be withdrawn by that PE, so steady bypassing
   for that ES arises immediately after its failure on that PE.  If
   these issues are not so important in some scenarios, ESI Indicator
   Aggregation may be activated.  This is an option.

   Note that when ESI Indicator Aggregation is activated, the local-bias
   ES split-horizon procedures or their variations (like what
   [I-D.eastlake-bess-evpn-vxlan-bypass-vtep] does) should be used.

   Note that ESI Indicator Aggregation works well with single-active
   ESIs (see Section 4.2); its steady bypassing problem arises with
   all-active ESIs only.

   Note that the sub-interfaces of an ESI may be assigned different ESI
   indicators, and these ESI indicators can be aggregated into a common
   prefix that is assigned to the ESI.  In such a case, only the common
   prefix should be advertised before any of the sub-interfaces fails.
   This is not considered "ESI Indicator Aggregation"; it is "ESI Route
   Aggregation".
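
   The precondition for ESI Indicator Aggregation stated in this
   section (ESes attached to the same redundancy group of PEs) can be
   checked by grouping ESes by their set of adjacent PEs.  The sketch
   below is a non-normative illustration; aggregation_candidates and
   the sample ES and PE names are hypothetical.

```python
from collections import defaultdict


def aggregation_candidates(attachments):
    """Group ESes that are attached to exactly the same set of PEs.

    `attachments` maps an ES name to an iterable of PE names.  Only
    groups with two or more ESes are returned, since only those could
    share one ESI indicator.
    """
    groups = defaultdict(list)
    for es, pes in attachments.items():
        groups[frozenset(pes)].append(es)
    return sorted(sorted(members) for members in groups.values()
                  if len(members) > 1)
```

   Whether a candidate group should actually share an indicator remains
   an operator decision, because of the issues listed above (future
   re-homing and steady bypassing after a single ES failure).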

2.13.  Unequal load-balance

   The light-weighted EVPNs should support the unequal load-balance
   defined in [I-D.ietf-bess-evpn-unequal-lb].

2.14.  AC-aware Service Interface

   In the AC-aware bundling service interface, an ES may attach two of
   its VLANs to the same broadcast domain.  These two VLANs may be
   assigned to the same sub-interface or to different sub-interfaces.

2.15.  ESI-agnostical Core-Routers

   The core routers should not be made aware of any per-EVI routing
   information of an ESI, because they are just underlay nodes.

   The core routers may also be unaware of any per-ES routing
   information of the ESIs.  In such a case, the anycast ESI SID should
   be hidden in the SRH, and it is the inner SID for the Node SID of
   the egress PE.

3.  Light-Weighted EVPN Overview

3.1.  Use Case

   We assign a Global Discriminator EGD1 to an EVI instance EVI1; EGD1
   is an N-bit number.  We assign an ESI indicator GEI1 to ESI1 on PE1,
   and an ESI indicator GEI2 to ESI1 on PE2.  We call the relationships
   between ESI1 and its two ESI indicators ESI1_GEI1 and ESI1_GEI2
   respectively.  The EGD and GEIs MUST be globally unique in EVI1's
   service domain.

                                    +----------+
                      PE1           |          |
                 +-------------+    |          |
                 | ESI1_GEI1   |    |          |         PE3
                /|             |----|          |   +-------------+
               / |             |    | IP/MPLS  |   |             |
          LAG /  +-------------+    | Backbone |   |   ESI2_GEI3 |---CE2
      CE1=====                      |   with   |   |             |
              \  +-------------+    |   EVPN   |---|             |
               \ |             |    |   RRs    |   +-------------+
                \|             |----|   and    |
                 | ESI1_GEI2   |    |   SPEs   |
                 +-------------+    |          |
                      PE2           |          |
                                    +----------+

                    Figure 1: EVPN MAC Reduction Use Case

   We use IMET routes to build a broadcast-list.  The broadcast-list is
   used to forward BUM traffic.  Data-plane MAC learning on BUM traffic
   produces the first batch of C-MAC entries.  Subsequent C-MAC entries
   can be learnt from unicast and/or BUM traffic.  Note that MAC/IP
   routes are not used to advertise C-MAC entries as usual, so that the
   RRs and/or SPEs are not overloaded by these C-MACs.

3.2.  Packet Walkthrough

   #1  [PE1 Forwards ARP Request to PE2/PE3]

   *  When CE1 sends an ARP Request for CE2, PE1 will receive the ARP
      Request BUM1 from an AC (say AC1) of ESI1.  PE1 will forward the
      ARP Request following the broadcast-list of AC1's EVI instance
      (say EVI1).  The broadcast-list is constructed from the IMET
      routes from PE2/PE3.

      PE1 will forward the ARP Request to PE2/PE3.  The ARP Request is
      encapsulated with GEI1 and EVI1_GDV1 (the GDV of EVI1, i.e.,
      EGD1).  The inner SMAC of the ARP Request is M1, which is CE1's
      MAC address.

   #2  [PE2/PE3's Data-plane MAC Learning]

   *  When PE2/PE3 receive the ARP Request packet BUM1, they do
      data-plane MAC learning independently.  They will learn that M1
      is behind GEI1.

      Note that when PE2 learns that M1 is behind GEI1, it will assume
      that M1 is also behind the local AC whose ESI indicator is GEI1.
      The local AC may have a higher priority than the remote one.

      After the data-plane MAC learning, the ARP Request packet BUM1 is
      broadcast to the local ACs, behind one of which is CE2.

   #3  [PE2 Discards ARP Request to CE1]

   *  On receiving BUM1 from PE1, PE2 uses the ingress GEI information
      in BUM1 to determine its ingress ESI, ESI1.  When ESI1 is in
      all-active mode and PE2 is about to forward the ARP Request to
      CE1, PE2 will find that the ESI of the outgoing AC is also ESI1,
      so PE2 discards the packet to keep the ES loop-free.

      Note that before the ARP Request packet is discarded, its source
      MAC can be learnt, especially in the "AC-aware bundling service
      interface".  The MAC entry is learnt against the GEI, but it will
      take the local sub-interface on that ES as its outgoing
      interface, in order to avoid unknown-unicast flooding.

      Note that in the "AC-aware bundling service interface", the AC-ID
      along with that GEI can help the MAC entry to be installed for the
      correct outgoing interface.  Such a MAC entry is called a synced
      MAC entry.

      When ESI1 is in single-active mode, the outgoing AC may be in
      blocking state; otherwise its corresponding sub-interface on CE1
      will take charge of the packet-drop behavior instead.  So although
      the ESI for the outgoing AC is not the same as ESI1, no loop will
      arise in the Ethernet Segment.

   #4  [PE3 Forwards ARP Reply to PE1/PE2]

   *  When CE2 replies to CE1's ARP Request, PE3 will forward the ARP
      Reply U1 according to the MAC entry M1 learnt previously as
      described above.

      PE3 will forward the ARP Reply U1 to PE1 or PE2 according to
      ESI1's RT-1 per EVI routes and RT-1 per ES routes:

      When ESI1 is in all-active mode, GEI1 may be the same as GEI2; in
      such a case, we call both of them GEI21 instead.  The traffic to
      M1 will be load-balanced between PE1 and PE2, because GEI21 is
      advertised by both PE1 and PE2.

   #5  [PE1 Forwards ARP Reply to CE1]

   *  When PE1 receives the ARP Reply packet U1 from PE3, PE1 first
      matches the packet to its EVI instance EVI1 by U1's EGD
      information.  PE1 will not discard it, because the egress ESI is
      not the same as the ingress ESI, which is determined by U1's GEI
      information.
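
   Steps #2 and #3 of the walkthrough above (data-plane learning
   against the GEI, followed by the ESI split-horizon check) can be
   sketched as follows.  This is a simplified, non-normative model for
   the all-active case only: the Packet and PE classes, their field
   names, and the assumption that each PE already holds a GEI-to-ESI
   mapping are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Packet:
    gei: str    # ingress ESI indicator carried in the encapsulation
    egd: str    # EVI discriminator carried in the encapsulation
    smac: str   # inner source MAC
    dmac: str   # inner destination MAC


@dataclass
class PE:
    name: str
    gei_to_esi: dict   # GEI -> ESI (assumed to be known already)
    local_acs: dict    # local AC name -> ESI of that AC
    mac_table: dict = field(default_factory=dict)  # (EGD, MAC) -> GEI

    def receive_bum(self, pkt):
        """Learn the source MAC, then return the ACs to flood to."""
        # Step #2: data-plane learning, the SMAC is behind the GEI.
        self.mac_table[(pkt.egd, pkt.smac)] = pkt.gei
        ingress_esi = self.gei_to_esi.get(pkt.gei)
        # Step #3: never flood back into the ES the packet came from.
        return sorted(ac for ac, esi in self.local_acs.items()
                      if ingress_esi is None or esi != ingress_esi)

    def lookup(self, egd, dmac):
        """Return the GEI to send towards, or None if unknown."""
        return self.mac_table.get((egd, dmac))
```

   In this model, a PE2 whose only local AC belongs to ESI1 floods BUM1
   to no AC, while a PE3 with an AC on ESI2 floods it there and can
   later look up M1 to find the GEI towards which the ARP Reply is
   sent.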

4.  Light-Weighted SRv6 EVPN

4.1.  SRv6 Solution Overview

4.1.1.  Aggregatable End.DX2 SID and End.DX2AGG SID

   When an Ethernet Segment ES1 is attached to an EVI, the attachment
   circuit AC1 for that <ESI,EVI> is assigned an End.DX2 SID.
   Different ACs of the same ESI are assigned different End.DX2 SIDs,
   which we call AC SIDs in this document.  These different End.DX2
   SIDs must be aggregatable into the same prefix, and this prefix is
   called the ESI Indicator in light-weighted SRv6 EVPNs.  The format
   of aggregatable End.DX2 SIDs is illustrated in the following figure:

       |<---  ESI-Indicator(128-N bits) ---->|<----     N bits     --->|
       +------------+------------+-----------+-------------------------+
       |    Block   |   Node     | ESI.LDV   |          AC-ID          |
       +------------+------------+-----------+-------------------------+
       |<------ Locator -------->|<------------- Function ------------>|

               Figure 2: End.DX2 SID Format for Aggregation

   Note that the ESI.LDV field is the Local Discriminator Value (LDV) of
   the ESI (especially the type 3/4/5 ESI).  The AC-ID field is the
   identifier of the EVI of that End.DX2 SID.  The ESI.LDV field and
   the AC-ID field are integrated into the End.DX2 SID's Function part.
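
   The layout of Figure 2 can be illustrated with a small packing
   routine.  This is a sketch under stated assumptions: the draft does
   not fix the field widths, so the 32/32/32/32 split (N = 32) is
   chosen here purely for illustration, and make_dx2_sid and
   esi_indicator are hypothetical helper names.

```python
import ipaddress

# Assumed field widths (not specified by the draft); they sum to 128.
BLOCK_BITS, NODE_BITS, LDV_BITS, ACID_BITS = 32, 32, 32, 32


def make_dx2_sid(block, node, esi_ldv, ac_id):
    """Pack Block | Node | ESI.LDV | AC-ID into a 128-bit SID."""
    for value, bits in ((block, BLOCK_BITS), (node, NODE_BITS),
                        (esi_ldv, LDV_BITS), (ac_id, ACID_BITS)):
        if not 0 <= value < (1 << bits):
            raise ValueError("field value does not fit its width")
    raw = block
    raw = (raw << NODE_BITS) | node
    raw = (raw << LDV_BITS) | esi_ldv
    raw = (raw << ACID_BITS) | ac_id
    return ipaddress.IPv6Address(raw)


def esi_indicator(sid):
    """Return the (128 - N)-bit prefix shared by all ACs of the ESI."""
    prefix_len = 128 - ACID_BITS
    return ipaddress.ip_network(f"{sid}/{prefix_len}", strict=False)
```

   Two AC SIDs built with the same Block, Node and ESI.LDV values but
   different AC-IDs fall under the same ESI indicator prefix, which is
   the aggregation property that this section relies on.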










   Note that in the "AC-aware bundling service interface" the AC-ID
   field MUST be the same as the Attachment Circuit ID of
   [I-D.sajassi-bess-evpn-ac-aware-bundling].  But in other service
   interfaces the AC-ID field can also be the EGD of that AC's EVPN
   instance.  Note that the EGD has a global meaning like a global VNI
   or a PBB I-SID, while the AC-ID part of an ordinary aggregatable
   End.DX2 SID is typically only a VLAN-ID on that ES.

   We can also consider the prefix aggregated from these End.DX2 SIDs
   as a SID of a new SRv6 function called End.DX2AGG.  The format of
   the End.DX2AGG SID is illustrated in the following figure:

       |<------ Locator -------->|<- FUNC -->|<------ ARG.ACI -------->|
       +------------+------------+-----------+-------------------------+
       |    Block   |   Node     | ESI.LDV   |         AC-ID           |
       +------------+------------+-----------+-------------------------+

                      Figure 3: End.DX2AGG SID Format

   Note that whether these SIDs are treated as many End.DX2 SIDs or as
   a single End.DX2AGG SID with different arguments is a local matter
   of each PE node.  Other PEs of the same EVI are not aware of the
   difference between the two implementations.

   A SID with the End.DX2AGG function is called an "ESI SID" in this
   document.  The ESI's GEI is the locator and function part of its
   corresponding ESI SID.  The argument part of the ESI SID is the AC-ID
   of the corresponding AC.  The AC-ID plus the ESI.LDV works like the
   function part of an End.DX2 SID.  The argument part of an ESI SID is
   called ARG.ACI in this document, where ACI is the abbreviation of
   AC-ID.

   Note that an SRv6 ESI-indicator is a 128-bit ESI SID with a zero
   argument; it is also called an ESI-IP.  An ESI SID may have a
   non-zero argument part, but an ESI-IP always has a zero argument
   part.
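
   The relation between an ESI SID, its ESI-IP and its ARG.ACI can be
   sketched as follows.  The 32-bit argument length and the function
   name split_esi_sid are assumptions for illustration; the actual
   argument length is a matter of the SID structure in use.

```python
import ipaddress


def split_esi_sid(sid: str, arg_bits: int):
    """Return (ESI-IP, ARG.ACI) for an ESI SID with an arg_bits argument."""
    value = int(ipaddress.IPv6Address(sid))
    arg_mask = (1 << arg_bits) - 1
    # The ESI-IP is the same SID with all argument bits set to zero.
    esi_ip = ipaddress.IPv6Address(value & ~arg_mask)
    return esi_ip, value & arg_mask


# Example value only; arg_bits=32 is a hypothetical argument length.
esi_ip, arg_aci = split_esi_sid("2001:db8:0:1:0:a01:0:64", arg_bits=32)
print(esi_ip, arg_aci)
```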

4.1.2.  The Advertisement of ESI-IPs

   The SRv6 SID in the IMET route is an End.DT2M SID with a zero
   argument length.  GEI1 and GEI2 are ESI-IPs of the End.DX2AGG SID
   defined in Figure 3.  IGP protocols can be used to advertise GEI1 and
   GEI2 to PE3 in the SRv6 underlay, so the EAD/ES route and the EAD/EVI
   route are not needed for SRv6 EVPN in this section.

   Note that if ESI1 is in single-active mode, GEI1 is different from
   GEI2, but if ESI1 is in all-active mode, GEI1 is the same as GEI2.

   Note that when the PE1 node fails and the ESI is all-active, the PLR
   node will do underlay anycast FRR switching for GEI1 (= GEI2).  This
   results in fast network convergence.

   Note that when the PE-CE link of GEI1 fails, the IGP route of GEI1
   will be withdrawn, so there will be no steady bypassing for that ES,
   but temporary bypassing can be performed to further improve
   convergence.

   The detailed comparisons between light-weighted SRv6 EVPN and PBB
   EVPN over SRv6 are described in Section 7.

4.2.  SRv6-specific EVPN-lite Procedures

   [6A]  In Step #1, PE1 will forward the ARP Request to PE2/PE3 with
         the following SRv6 BE encapsulation: its underlay Source IP is
         the End.DX2AGG SID on PE1 for ESI1; its underlay Destination
         IP is the End.DT2M SID on PE2/PE3.  The locator and function
         part of the End.DX2AGG SID is GEI1.  The Argument part of the
         End.DX2AGG SID is 0.

         Note that for single-homed ingress ACs, the underlay SIP will
         be the End.DT2U SID, because they don't need an ESI SID.
         Multi-homed ingress ACs with single-active behavior may not be
         assigned a dedicated ESI-indicator either.  In such
         situations, the underlay SIP can be the End.DT2U SID too, and
         the ESI indicators of all single-active ESIs for the same EVI
         are aggregated into the same IPv6 address.

   [6B]  In Step #3, PE2 can compare the ingress-GEI of BUM1 and the GEI
         of the outgoing AC directly; no GEI-to-ESI lookup is needed.

         Note that PE2 can decapsulate the packet following the End.DX2
         function or following the End.DX2AGG function.  It is just a
         local matter.

   [6C]  In Step #4, PE3 will forward the ARP reply to PE1 with the
         following SRv6 BE encapsulation: its underlay Source IP is the
         End.DX2AGG SID on PE3 for ESI2; its underlay Destination IP is
         the End.DX2AGG SID on PE1 for ESI1 according to the MAC entry
         M1.  The ARG.ACI for the End.DX2AGG SID in DIP is the EGD
         configured on PE3.  Note that the EGD for the same EVI is
         configured with the same value on PE1/PE2/PE3.

         When ESI1 is in all-active mode, GEI1 will be the same as GEI2,
         so we call both of them GEI21 instead.  The traffic to M1 will
         be load-balanced between PE1 and PE2 by the underlay network
         on PE3, because GEI21 is advertised by both PE1 and PE2 in the
         underlay IGP.

         Note that if the DIP is the anycast node SID of PE1 and PE2,
         when the PE-CE link of ESI1 fails, the traffic will be steadily
         bypassed until that link recovers.

   [6D]  In Step #5, when PE1 receives the SRv6-encapsulated ARP reply
         packet from PE3, PE1 first matches the packet to the End.DX2AGG
         SID of ESI1 by DIP, then matches the packet to the EVI instance
         EVI1 by ARG.ACI.
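
   The direct GEI comparison of [6B] can be sketched as below.  This is
   only an illustration: the 32-bit argument length, the AC names and
   the helper names gei_of and egress_acs_for_bum are hypothetical, and
   a single-homed AC is modelled as having no GEI.

```python
import ipaddress

ARG_BITS = 32  # hypothetical ARG.ACI length


def gei_of(sid: str) -> int:
    """ESI-IP (locator + function) of an ESI SID, as an integer."""
    return (int(ipaddress.IPv6Address(sid)) >> ARG_BITS) << ARG_BITS


def egress_acs_for_bum(outer_src: str, local_acs: dict) -> list:
    """Return the local ACs that may receive a BUM frame.

    local_acs maps an AC name to the GEI of its ES, or to None for a
    single-homed AC.  An AC is filtered when its GEI equals the
    ingress-GEI taken from the underlay Source IP.
    """
    ingress_gei = gei_of(outer_src)
    return [name for name, gei in local_acs.items()
            if gei is None or gei_of(gei) != ingress_gei]


# PE2 in all-active mode: its AC on ESI1 has the same GEI as PE1's.
acs_on_pe2 = {
    "ac-esi1": "2001:db8:0:1:0:a01::",
    "ac-esi2": "2001:db8:0:1:0:a02::",
    "ac-single-homed": None,
}
print(egress_acs_for_bum("2001:db8:0:1:0:a01::", acs_on_pe2))
```

   No table lookup from GEI to ESI appears in the sketch, because the
   GEI itself is the key that is compared.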

4.2.1.  End.DX2AGG Function and Arg.ACI

   The "Endpoint with decapsulation and ESI-specific L2 table
   forwarding" behavior (End.DX2AGG for short) is a variant of the
   End.DX2 behavior.

   Two of the applications of the End.DX2AGG behavior are the EVPN VPLS
   [RFC7432] and the EVPN ETREE [RFC8317] use-cases.

   Any SID instance of this behavior is associated with an ESI E.  The
   behavior also takes an argument: "Arg.ACI".  This argument provides a
   local mapping to an outgoing interface OIF.  The OIF corresponds to
   <ESI E, EVI V>, and the EVI V's bridge table is L2 Table T.

   The End.DX2AGG SID MUST be the last segment in an SR Policy.

   When N receives a packet whose IPv6 DA is S and S is a local
   End.DX2AGG SID, the processing is identical to the End.DX2 behavior
   except for the Upper-layer header processing which is as follows:

    S01. If (Upper-Layer Header type == 143(Ethernet) ) {
    S02.    Remove the outer IPv6 Header with all its extension headers.
    S03.    Learn the exposed MAC Source Address in L2 Table T.
    S04.    Find out the OIF, and forward the Ethernet frame to the OIF.
    S05. } Else {
    S06.    Process as per Section 4.1.1
                of [I-D.ietf-spring-srv6-network-programming].
    S07. }

   Note that the EVI V is determined by the End.DX2AGG SID's ESI-IP and
   ARG.ACI argument.

   Note that MAC learning should not be applied unless the EVI V is an
   E-LAN service.

   Note that the OIF may be found using the MAC entries in L2 Table T
   when the EVI V is an E-LAN service and the AC-aware bundling service
   interface is used.

   Note that we can use the ARG.ACI to find the OIF on that ES, and
   then the EVI V can be found.
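
   A minimal Python sketch of steps S01 to S07 is given below for one
   local End.DX2AGG SID.  The table shapes, the function name and the
   choice to record the outer IPv6 source address against the learnt
   C-MAC are assumptions for illustration, not a definitive
   implementation; the Else branch (S06) is not modelled.

```python
ETHERNET_NEXT_HEADER = 143  # upper-layer header type tested in S01


def end_dx2agg(acs, l2_tables, arg_aci, next_header, outer_src, src_mac):
    """Sketch of the upper-layer processing for one End.DX2AGG SID.

    acs       : ARG.ACI -> (OIF, EVI name, is_elan) for this ESI
    l2_tables : EVI name -> {C-MAC: outer IPv6 source address}
    Returns the OIF, or None when S06 applies (not modelled here).
    """
    if next_header != ETHERNET_NEXT_HEADER:
        return None
    # The ARG.ACI gives the OIF on this ES, and with it the EVI.
    oif, evi, is_elan = acs[arg_aci]
    if is_elan:
        # S03: data-plane learning, only for E-LAN services.
        l2_tables.setdefault(evi, {})[src_mac] = outer_src
    return oif  # S04: the frame is forwarded to this interface


acs = {100: ("ac1", "EVI1", True)}
tables = {}
print(end_dx2agg(acs, tables, 100, 143,
                 "2001:db8:0:3:0:b01::", "00:00:5e:00:53:01"))
```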

5.  Advanced Considerations

5.1.  ESI Indicator Advertisement Optimization

5.1.1.  Advertise ESI SIDs in Underlay Network

   The End.DX2AGG SIDs can be advertised as IP prefixes in the underlay
   IGP.  Although each ESI SID is the aggregation of many AC SIDs, the
   ESI SIDs may still be too many for the underlay network, and the
   service-agnostic core routers would have to install these prefixes.

   To solve these problems, the ESI SIDs can be advertised via EVPN
   routes in the overlay network.

   Note that when URPF (Unicast Reverse Path Forwarding) is enabled and
   the ESI SIDs are encapsulated as Source IPs, the ESI SIDs should be
   advertised in the underlay network, even if the ESI SIDs won't be
   encapsulated as destination IPs.  Otherwise the source ESI SID should
   be hidden in the SRH too.

5.1.2.  Advertise ESI SIDs for Overlay Network

   When we use EVPN routes to advertise ESI SIDs among the PEs for the
   overlay network, these routes will not be imported by the core
   routers.  In such a case, when the ESI SIDs are used as destination
   IP addresses, they should be hidden behind the node SID of the
   corresponding egress PE router.

   Note that the association between an ESI SID and its corresponding
   Node SID is also advertised by such EVPN routes.

   We can use the EAD/ES route (or the EAD/EVI route) to advertise the
   Global ESI Indicator (GEI) (and the EGD).  These EAD routes are
   called GEI/ES or GEI/EVI routes in this document.  When the GEI/EVI
   route is used to advertise the GEI, the End.DX2AGG SID is advertised
   in its SRv6 L2 Service TLV, not in its nexthop.  The EGD may be
   carried in the ARG.ACI field of the End.DX2AGG SID, or it can be
   determined from its EVI-RTs.

   The GEI/EVI routes (or GEI/ES routes) are advertised and imported
   for the Global Routing Table (GRT), so their Route-Targets (RT) are
   configured for the GRT, because there is no dedicated B-component
   as in PBB VPLS and PBB EVPN.  Note that the GEI/EVI routes can be
   installed as /128 routes and the ARG.ACI part can be set to the
   actual EGD of the corresponding EVI.  In such a case, when a C-MAC is
   learnt over an End.DX2AGG SID (as IPv6 SA) in the data-plane, the
   ARG.ACI field of that SID should be set to the EVI's EGD when the
   C-MAC entry is installed.

   Although GEIs are imported into the GRT, only the PE nodes are aware
   of them.  To reduce FIB consumption, the transit nodes in the
   underlay network are not aware of individual GEIs (they may be aware
   of the common prefix of these GEIs).  We can use the argument length
   in the SRv6 SID Structure Sub-Sub-TLV to check whether the EGD is too
   big for the End.DX2AGG SID.  This avoids corrupting the function part
   of the End.DX2AGG SID and allows a flexible EGD length.
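
   The argument-length check described above might look like the
   following sketch.  The function name set_arg_aci and the example
   values are hypothetical; arg_len stands for the argument length
   learnt from the SRv6 SID Structure Sub-Sub-TLV.

```python
def set_arg_aci(esi_ip: int, egd: int, arg_len: int) -> int:
    """Place an EGD in the argument bits of an End.DX2AGG SID.

    esi_ip  : the ESI-IP as a 128-bit integer (argument bits all zero)
    arg_len : advertised argument length in bits
    """
    if egd >= (1 << arg_len):
        # Writing this EGD would overwrite bits of the function part.
        raise ValueError("EGD does not fit in the argument length")
    if esi_ip & ((1 << arg_len) - 1):
        raise ValueError("ESI-IP must have an all-zero argument")
    return esi_ip | egd


# Example ESI-IP 2001:db8:0:1:0:a01:: written as an integer.
esi_ip = 0x20010DB8_0000_0001_0000_0A01_0000_0000
sid_with_egd = set_arg_aci(esi_ip, 0x123456, arg_len=32)
print(hex(sid_with_egd))
```

   With arg_len=16 the same 24-bit EGD would be rejected instead of
   silently spilling into the ESI.LDV bits.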

5.1.3.  Advertise AC SIDs for Overlay Network

   In order to solve the problem described in Section 2.9, we may have
   to advertise AC SIDs.  But the number of AC SIDs may be hundreds of
   times larger than the number of ESI SIDs, so light-weighted SRv6
   EVPNs need to reduce the advertisement of AC SIDs.

   The AC SID of a specified <ESI,EVI> will not be advertised by its
   PEs until these PEs know that the <ESI,EVI> has failed on at least
   one of them.

   Note that the AC SID for that <ESI,EVI> can be used as the source IP
   of the SRv6 encapsulation before that AC SID is advertised via EVPN
   routes.  This is because, when a MAC is learnt over that AC SID,
   packets for that MAC can still be forwarded according to the IP
   prefix of the corresponding ESI SID, due to the longest-match
   procedure of IP lookup.
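
   The longest-match argument above can be checked with a small Python
   sketch.  The prefix length (/96), the addresses and the next-hop
   strings are example values only.

```python
import ipaddress


def lookup(fib: dict, dst: str):
    """Longest-prefix match of dst against a {prefix: next hop} table."""
    addr = ipaddress.IPv6Address(dst)
    best = None
    for prefix, nexthop in fib.items():
        net = ipaddress.IPv6Network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, nexthop)
    return best[1] if best else None


# Only the ESI SID prefix is known; the AC SID itself is not advertised.
fib = {"2001:db8:0:1:0:a01::/96": "via ESI SID prefix"}
ac_sid = "2001:db8:0:1:0:a01:0:64"
print(lookup(fib, ac_sid))
```

   Once the AC SID is advertised as a /128 route, the same lookup
   prefers it over the ESI SID prefix, without any change to the MAC
   entry that was learnt against the AC SID.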

   The detailed AC-SID advertisement will be added in future versions.

5.2.  Unequal LB Advertisement

   When the ESI SIDs are advertised by EVPN routes for the overlay
   network according to Section 5.1.2, we can advertise the EVPN Link
   Bandwidth extended community (see [I-D.ietf-bess-evpn-unequal-lb]) or
   something else along with the ESI SIDs using such EVPN routes.

   Note that this extra information (which is advertised along with
   the EVPN routes) is known to the PEs only.  The underlay network
   doesn't have to be aware of it.

   Note that when the EVPN Link Bandwidth extended community is
   advertised along with the ESI SID, the nexthop of the GEI/ES route
   should not be set to the anycast ECMP Node SID of the advertising PE
   (egress PE).  On receiving such a GEI/ES route, the ingress PE may
   push this GEI/ES route's nexthop onto the End.DX2AGG/End.DX2 SID when
   constructing the SID stack, if unequal-LB is required.
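
   How an ingress PE might spread flows in proportion to the advertised
   link bandwidths is sketched below.  The hashing scheme and the name
   pick_nexthop are assumptions for illustration (positive integer
   bandwidths are assumed); [I-D.ietf-bess-evpn-unequal-lb] remains the
   reference for the actual procedures.

```python
def pick_nexthop(link_bw: dict, flow_hash: int) -> str:
    """Pick an egress PE nexthop so that shares follow the bandwidths.

    link_bw maps each egress PE nexthop to the bandwidth it advertised
    (any common unit, positive integers).
    """
    total = sum(link_bw.values())
    slot = flow_hash % total
    for pe, bw in sorted(link_bw.items()):
        if slot < bw:
            return pe
        slot -= bw
    raise AssertionError("unreachable for positive bandwidths")


# Example: PE2 advertises three times the bandwidth of PE1.
bw = {"PE1": 10, "PE2": 30}
picks = [pick_nexthop(bw, h) for h in range(40)]
print(picks.count("PE1"), picks.count("PE2"))
```

   The selected nexthop is then the per-PE Node SID pushed in front of
   the ESI SID, which is why an anycast Node SID would defeat the
   weighting.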

5.3.  EVPN Egress Protection

5.3.1.  EVPN Egress Node Protection

   There are two methods to achieve EVPN egress node protection:

   *  The first method: Both the ESI SID and the AC SID are anycast
      SIDs, and they are hidden behind the corresponding egress Node SID
      according to Section 5.1.2.  When the egress node fails, the PLR
      can do "midpoint protection" for that node SID; as a result, the
      destination IP will be rewritten to the ESI SID behind that node
      SID.

      Note that the ESI SID is an anycast SID, so it will be re-routed
      by the underlay network after that failure.

      Note that this method requires no special extensions, so it will
      be suitable for more SRv6 devices than a mirror SID.

   *  The second method: the egress protection procedures per
      [I-D.wang-bess-evpn-egress-protection] (which uses an anycast FRR
      Node SID to achieve underlay anycast FRR protection) can be
      applied to the GEI/ES route's nexthop, in order to apply underlay
      anycast FRR protection.

   Note that the PLR doesn't have unequal load-balance information, so
   neither of these two methods will meet the unequal load-balance
   requirements after that failure.  This is the best achievable result
   unless the unequal load-balance information can be advertised via
   IGP.

5.3.2.  EVPN Egress Link Protection

   The details will be added in future versions, but the procedures
   for the synced MAC entry of [Section 3.2, Paragraph 5, Item 1] will
   be helpful.

5.4.  C-MAC Flush Notification Procedure

   The withdrawal of a GEI advertisement can be used as a C-MAC flush
   notification, similar to what is done in [RFC8317] and
   [I-D.ietf-bess-pbb-evpn-isid-cmacflush].

   Note that even if the GEI/EVI routes of Section 5.1 are not
   advertised, the withdrawal of those GEI/EVI routes can still be used
   as a C-MAC flush notification for their <ESI,EVI>.
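
   On the receiving PE, the flush can be as simple as the following
   sketch, which assumes (for illustration only) that each C-MAC entry
   records the GEI it was learnt against.  The name flush_cmacs is
   hypothetical.

```python
def flush_cmacs(l2_table: dict, withdrawn_gei: str) -> list:
    """Remove and return the C-MACs learnt against a withdrawn GEI.

    l2_table maps a C-MAC to the GEI it was learnt against.
    """
    flushed = [mac for mac, gei in l2_table.items() if gei == withdrawn_gei]
    for mac in flushed:
        del l2_table[mac]
    return flushed


table = {
    "00:00:5e:00:53:01": "2001:db8:0:1:0:a01::",
    "00:00:5e:00:53:02": "2001:db8:0:1:0:a02::",
    "00:00:5e:00:53:03": "2001:db8:0:1:0:a01::",
}
print(flush_cmacs(table, "2001:db8:0:1:0:a01::"))
```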

5.5.  E-Tree Support Considerations

   The E-Tree support extensions are similar to Section 5 of [RFC8317],
   except for the following notable differences: the leaf B-MACs are
   replaced by leaf GEIs, the root B-MACs are replaced by root GEIs,
   the PBB encapsulation is replaced by other encapsulations, the
   B-component is replaced by an IP-VRF or the underlay GRT, and the
   B-MAC Advertisement Route is replaced by the GEI/EVI route or the
   ESI/IP Route.

5.6.  EVPN IRB Support Considerations

   The data plane in this draft is no more complex than that of typical
   SRv6 EVPN, so it will work as efficiently as expected in the SRv6
   EVPN IRB use case.

5.7.  Use AC SID in MAC/IP Advertisement Routes

   The AC SID can be used in the MAC/IP Advertisement route even if
   C-MAC overload is not a real threat.  By doing this, the data-plane
   can be unified among these use cases.

   Note that the AC SID is also a typical End.DX2 SID.

6.  Light-Weighted MPLS EVPN

6.1.  MPLS Solution Overview

   In the MPLS EVPN control plane, we use a 24-bit unsigned number as
   the EGD of EVI1, and it is globally unique in EVI1's service domain.
   In the data plane, we use QinQ tags to carry the EGD.
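
   One possible way to carry a 24-bit EGD in two 12-bit VLAN IDs is
   sketched below.  The split (upper 12 bits in the outer tag, lower
   12 bits in the inner tag), the TPID values and the zero PCP/DEI bits
   are all assumptions of this example; this document does not specify
   the mapping, and a real mapping would also have to deal with the
   VLAN IDs reserved by IEEE 802.1Q (0 and 4095).

```python
import struct

TPID_S_TAG = 0x88A8  # IEEE 802.1ad service tag
TPID_C_TAG = 0x8100  # IEEE 802.1Q customer tag


def egd_to_qinq(egd: int) -> bytes:
    """Encode a 24-bit EGD as two VLAN tags (PCP/DEI left at zero)."""
    if not 0 <= egd < (1 << 24):
        raise ValueError("EGD must fit in 24 bits")
    outer_vid, inner_vid = egd >> 12, egd & 0xFFF
    return struct.pack("!HHHH", TPID_S_TAG, outer_vid,
                       TPID_C_TAG, inner_vid)


def qinq_to_egd(tags: bytes) -> int:
    """Recover the EGD from the two tags produced by egd_to_qinq."""
    _, outer_tci, _, inner_tci = struct.unpack("!HHHH", tags)
    return ((outer_tci & 0xFFF) << 12) | (inner_tci & 0xFFF)


tags = egd_to_qinq(0x123456)
print(tags.hex(), hex(qinq_to_egd(tags)))
```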

   We use a Global Unique Label (GUL) to identify an ESI in EVI1's
   service domain, so the ESI-GUL is also its Global ESI Indicator.
   The ESI-GULs are advertised through RT-1 per ES routes, and they are
   treated as ESI-labels by these routes.  The label in the RT-3
   route's PMSI-Tunnel Attribute (PTA-Label) whose tunnel type is
   ingress replication is called the Ingress Replication Multicast Label
   (IRML) in this document.

   We use the following encapsulation in MPLS-based EVPN-lite:

            Format #1                    Format #2
       +-----------------------+     +----------------------------+
       | PSN Labels            |     | PSN Labels                 |
       +-----------------------+     +----------------------------+
       | IRML (EVI1)           |     | Destination-ESI GUL (ESI1) |
       +-----------------------+     +----------------------------+
       | Source-ESI GUL (ESI1) |     | Source-ESI GUL (ESI2)      |
       +-----------------------+     +----------------------------+
       | Ethernet Header       |     | Ethernet Header (EVI1)     |
       +-----------------------+     +----------------------------+
       | Ethernet Payload      |     | Ethernet Payload           |
       +-----------------------+     +----------------------------+
       | Ethernet FCS          |     | Ethernet FCS               |
       +-----------------------+     +----------------------------+

                 Figure 4: MPLS Encapsulation for EVPN-lite

   Note that the GUL can be a single Label Stack Entry (LSE); in such a
   case, it should be allocated in the DCB label space.  In certain
   scenarios the ESIs and vESIs may be too many to be allocated in the
   DCB.  In such a case, the GUL should be allocated in a few
   context-specific label spaces, each identified by a Context Label
   Space ID (CLS-ID) per [I-D.ietf-bess-mvpn-evpn-aggregation-label].
   The ESI-GUL is then the combination of the ESI-label and its CLS-ID,
   so it occupies two LSEs in the Label Stack.

   Note that the ESI GULs are assigned by a central authority, which may
   be a DC controller or an administrator.

   Note that the ESI-label (ESI-GUL) should be pushed onto the Label
   Stack whether the packet is BUM or not.  The ESI-GUL can't identify
   the EVPN Instance EVI1, so we have to use the EGD in the inner
   ethernet header of "Format #2" to find EVI1.
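
   The two label stacks of Figure 4 can be sketched as follows, using
   the standard 32-bit MPLS label stack entry layout (20-bit label,
   3-bit TC, bottom-of-stack bit, 8-bit TTL).  The sketch assumes that
   each GUL is a single LSE; the label values, the TTL default and the
   function names are example choices only.

```python
import struct


def lse(label: int, bottom: bool, tc: int = 0, ttl: int = 64) -> bytes:
    """Encode one label stack entry: label(20) | TC(3) | S(1) | TTL(8)."""
    if not 0 <= label < (1 << 20):
        raise ValueError("label must fit in 20 bits")
    word = (label << 12) | (tc << 9) | (int(bottom) << 8) | ttl
    return struct.pack("!I", word)


def build_stack(labels) -> bytes:
    """Encode labels top-first; only the last one has the S bit set."""
    last = len(labels) - 1
    return b"".join(lse(l, bottom=(i == last)) for i, l in enumerate(labels))


def format1_stack(psn_labels, irml, source_esi_gul) -> bytes:
    """Format #1: PSN labels, IRML, then the source-ESI GUL."""
    return build_stack(list(psn_labels) + [irml, source_esi_gul])


def format2_stack(psn_labels, destination_esi_gul, source_esi_gul) -> bytes:
    """Format #2: PSN labels, destination-ESI GUL, source-ESI GUL."""
    return build_stack(list(psn_labels) + [destination_esi_gul,
                                           source_esi_gul])


stack = format1_stack([16000], irml=1001, source_esi_gul=2001)
print(stack.hex())
```

   In both formats the source-ESI GUL sits at the bottom of the stack,
   directly in front of the customer Ethernet header.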

   Note that the GUL concept is very different from the
   "upstream-assigned label (UAL)" concept.  When an SPE receives a GUL
   from a remote PE, the GUL is treated as an outgoing label towards
   that remote PE, and it is also treated as an incoming label of the
   current SPE.  The label operation for the GUL is therefore a "swap":
   to be precise, the SPE swaps the GUL to itself and then pushes the
   MPLS label stack towards the advertising PE.  When the same GUL is
   received from different remote PEs, MPLS ECMP or FRR procedures will
   be applied.

   So when the GUL is two LSEs in the label stack, we can say that the
   Context-specific Label Space (CLS) of the ESI-label (inside the GUL)
   takes the role of B-MAC of PBB EVPN, and the CLS-ID label inside the
   GUL takes the role of the B-VPLS label of PBB EVPN.  So no B-VPLS
   instances will be found here.

   Note that the GEI/ES route of MPLS-based EVPN-lite is the RT-1 per ES
   route.

   Note that the light-weighted MPLS EVPN solutions can be used whether
   or not the SR-MPLS LSPs are used in the underlay network.

   The conceptual comparisons between light-weighted MPLS EVPN and
   (Pseudo-) PBB EVPN are illustrated in [Revision-01].

6.2.  MPLS-specific EVPN-lite Procedures

   According to [RFC7432], when the tunnel type of the IMET route's PTA
   is ingress replication, the ESI-label is considered to be
   downstream-assigned too, because nothing in the RT-1 per ES route
   indicates whether the ESI-label is upstream-assigned or not.

   Although the ESI-GUL can be a single LSE or two LSEs in the Label
   Stack, for simplicity we assume in this section that it is a single
   LSE.

   [M1]  In Step #1, "Format #1" of Figure 4 will be used.

         Although the Ingress Replication Multicast Label (IRML) of
         "Format #1" can identify EVI1 by itself, we suppose that its
         ethernet header should also carry the EGD as in [M4].

         Note that there isn't a B-VPLS here, so the IRML identifies
         EVI1 itself.  EVI1 here is the equivalent of the I-VPLS of PBB
         EVPN.

         Note that when that ARP Request packet comes from an SHD
         (single-homed device), the ESI of its AC will be null.  The
         Source-ESI GUL in "Format #1" will be replaced with an MPLS
         label identifying the ingress TPE.  When we assume that the
         underlay network is an SR-MPLS network, that TPE-identifying
         label can be the node SID label of that ingress TPE.  This
         method follows [I-D.wang-bess-evpn-context-label-02], and the
         context of the TPE-identifying label is identified by the
         EVI1's IRML of "Format #1".

         Note that the TPE-identifying labels typically play no role
         for all-active ESes; they are used only for single-homed
         ESes.  But when Section 2.12 is activated, and all ESIs share
         the same ESI indicator, an anycast TPE-identifying label in the
         DCB can be used as that ESI indicator.

   [M2]  In Step #2, "Format #1" of Figure 4 will be received.  PE3
         knows the packet is for EVI1 with the help of the IRML.
         Then PE3 can learn the relation between the ingress-GEI
         (ingress-ESI GUL) and the S-MAC of BUM1 directly; no GEI-to-ESI
         lookup is needed.

   [M3]  In Step #3, PE2 can compare the ingress-GEI (ingress-ESI GUL)
         of BUM1 and the egress-GEI (ESI-GUL of the outgoing AC)
         directly; no GEI-to-ESI lookup is needed.

   [M4]  In Step #4, "Format #2" of Figure 4 will be used.  The
         source-ESI GUL, from which the corresponding MAC entry M1 was
         previously learnt, will be encapsulated as the destination-ESI
         GUL directly.  No GEI-to-ESI lookup is needed, provided that
         we don't care about the requirements of Section 2.9.  Otherwise
         we should refer to the corresponding RT-1 per EVI routes of
         ESI1 to forward the packet.  These RT-1 per EVI routes are
         advertised for EVI1, so the Ethernet Tag ID (ETI) of these
         routes doesn't have to be the EGD.

         Note that when ESI1 is in single-active mode, the ESI-GUL of
         ESI1 will be different on PE1 and PE2.  But the MAC entry M1
         will use the newest one only; the switchover between them is
         called a MAC-move.

   [M5]  In Step #5, when PE1 receives the ARP reply packet from PE3,
         PE1 first matches the packet to ESI1 by the Destination-ESI
         GUL, then matches the packet to the EVI instance EVI1 by the
         QinQ tags of the Ethernet header.

         Note that we suppose that the original tags from the ingress
         AC will be processed following the Raw mode per [RFC4448],
         although the tagged mode can technically be used.  Note that
         the original tags (if they are kept in the packet) will be the
         inner tags of the EGD.

         Note that when RT-1 per EVI routes are used, as specified in
         [M4], there is no need to carry the EGD in unicast data packets
         either.
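
   Steps [M2] and [M4] together amount to learning a C-MAC against the
   source-ESI GUL and later reusing that GUL as the destination-ESI
   GUL.  A minimal sketch, assuming a single-LSE GUL and a MAC table
   keyed by (EVI, C-MAC), both chosen only for this example:

```python
def learn_from_bum(mac_table: dict, evi: str, src_mac: str,
                   source_esi_gul: int) -> None:
    """[M2]: bind the source C-MAC to the ingress-ESI GUL, per EVI.

    A later packet with a different GUL simply overwrites the entry,
    which models the MAC-move of single-active mode.
    """
    mac_table[(evi, src_mac)] = source_esi_gul


def unicast_labels(mac_table: dict, evi: str, dst_mac: str,
                   local_esi_gul: int):
    """[M4]: return (destination-ESI GUL, source-ESI GUL), or None."""
    dst_gul = mac_table.get((evi, dst_mac))
    if dst_gul is None:
        return None  # unknown unicast: handled as BUM, not shown here
    return dst_gul, local_esi_gul


macs = {}
learn_from_bum(macs, "EVI1", "00:00:5e:00:53:01", source_esi_gul=2001)
print(unicast_labels(macs, "EVI1", "00:00:5e:00:53:01", local_esi_gul=2002))
```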

6.3.  Hierarchical VPLS in EVPN-lite

   In a hierarchical topology (as illustrated in the following figure),
   the PEs are separated into two groups, the Target PEs (TPEs) and the
   Superstratum PEs (SPEs).

              ___TPE5___        SPE3       ___TPE4_____
             /AC5       \      /   \      /            \AC4
          CE3            \    /     \    /              >=====CE2
             \___         \  /       \  /          ____/AC2
              ___TPE3----SPE1-------SPE2-------TPE2
             /AC3          /                       \
          CE1             /                         \
             \____TPE1___/                           \___CE6
              AC1


                         Figure 5: EVPN-lite H-VPLS

   The TPEs work like the IB-BEB-PEs in PBB VPLS, and the SPEs work
   like the BCB-PEs in PBB VPLS.  The BCB-PEs in PBB VPLS do BUM
   replication based on the PBB header.  There is no PBB header in
   EVPN-lite solutions, but the SPEs won't learn the C-MACs, which is
   the same as for the BCB-PEs in PBB VPLS.  The forwarding behaviors
   of these EVPN-lite solutions are very different from each other:

   *  In SRv6-based EVPN-lite, the SPEs are typically pure underlay
      nodes; they don't have to be aware of the EVIs.

   *  The SPEs in MPLS-based EVPN-lite don't have to be aware of the
      BUM packets, because, for IMET routes, they work like the ASBRs
      in inter-AS option B.  In such a case, the TPEs do ingress
      replication towards all other TPEs by themselves.

      The SPEs in MPLS-based EVPN-lite may terminate the IMET routes
      that were received from their TPEs.  These IMET routes are
      imported into a corresponding BD, but may not be passed through
      other SPEs, so as not to cause duplicated BUM packets.  In such
      a case, taking SPE1 as an example, there are two split-horizon
      groups: one group is TPE1/TPE3/TPE5, and the other is SPE1/SPE2.
      The BUM packets are replicated between different split-horizon
      groups.  The TPEs do ingress replication towards their directly
      connected TPEs and SPEs, not towards the indirectly connected
      TPEs and SPEs.  But unicast packets will not be forwarded by
      that BD on the SPEs; they will be label-swapped in the
      context-specific label space for the corresponding GULs.

      Note that the BCB-PE in PBB VPLS is typically supported in the
      industry, but it seems that the BCB-PE in PBB EVPN is typically
      not supported in the industry up to now, because the BCB-PE
      function can be replaced in MPLS EVPN by a label-swapping
      operation like that of the inter-AS option B scenarios.

   Note that the BUM packets here are defined based on the destination
   C-MAC addresses.
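
   The replication rule between split-horizon groups on an SPE that
   terminates IMET routes can be sketched as follows.  The group names
   and the membership used in the example are illustrative only; the
   point is that a BUM packet is never copied back into the group it
   arrived from.

```python
def replicate_bum(groups: dict, received_from: str) -> list:
    """Return the peers that get a copy of a BUM packet.

    groups maps a split-horizon-group name to the peers in that group.
    Copies go only to peers in groups other than the source group.
    """
    source_group = next(name for name, peers in groups.items()
                        if received_from in peers)
    return [peer for name, peers in groups.items()
            if name != source_group for peer in peers]


# Hypothetical view from one SPE: an access-side and a core-side group.
groups = {
    "access": ["TPE1", "TPE3", "TPE5"],
    "core": ["SPE2", "SPE3"],
}
print(replicate_bum(groups, "TPE1"))
```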

7.  Comparison with Other Solutions

   We briefly compared light-weighted SRv6 EVPN with PBB-VPLS, PBB-EVPN
   and VXLAN solutions in [Revision-01]; further brief comparisons with
   VTEP Group (and its transplantation into SRv6 networks) were
   described in [Revision-02].  So this revision adds only the detailed
   comparisons between SRv6 EVPN-lite and PBB EVPN over SRv6.

7.1.  Detailed Comparisons with PBB EVPN over SRv6

   I think the "PBB EVPN over SRv6 underlay" solution will be complex
   if we try to address too many things.  Some examples follow:

   *  The upper-layer header for SRv6 is the PBB header for B-MACs, not
      the ethernet header for C-MACs, so the SID list (SR-Path or
      network programming instructions) in the SRH can't be constructed
      for the sake of the I-Component.  For example, when an SRv6 SID
      for MAC-guarding (or something else, just an example) is present
      in the SRH for PBB EVPN over SRv6, I think it means B-MAC
      guarding, not C-MAC guarding.

   *  The B-MACs for the all-active ESIs can't be aggregated, but the
      SRv6 SIDs for ESIs can be aggregated.  The underlay can advertise
      the aggregated prefixes only, so the burden on the underlay
      network may not be increased too much.  When the underlay routes
      are aggregated, the C-MACs can still be learnt against a /128
      source IP; this is an advantage of light-weighted SRv6 EVPN that
      can't be gained from a PBB header.

   *  The B-MACs are for overlay protection (the real overlay is the
      I-VPLS, but the B-VPLS is also an overlay network from the
      viewpoint of the SRv6 network).  But the SRv6 SIDs for ESIs will
      be for underlay protection, which works like egress protection.
      They are two different types of solutions.

   *  Although PBB EVPN can be transplanted into SRv6 networks along
      with the PBB header, it seems to be more complicated to me.  Take
      the EVPN IRB use cases for example: they require seven sequences
      of header processing, like (SRv6/B-MAC/C-MAC)(Inner-IP)(C-MAC/
      B-MAC/SRv6), during the overlay L3 forwarding.  I think it will
      be difficult for some ASICs to implement.  When the processing
      is simplified to (SRv6/C-MAC)(Inner-IP)(C-MAC/SRv6), it sounds
      like a step forward, not backward, IMHO.  We can achieve this goal
      easily inside the EVPN framework, provided that data-plane
      learning can still be considered as an option after PBB EVPN.

   Fortunately, SRv6 is still too young to have a transplantation of PBB
   EVPN, so nothing is wasted if the SRv6 nodes give up the PBB header,
   which has never been used by these SRv6 nodes.  Note that the SRv6
   functions for L2VPNs (End.DT2U and End.DT2M) have already had
   source-IP-based data-plane learning for a long time.

   In the EVPN IRB use case, [I-D.ietf-bess-evpn-irb-extended-mobility]
   defines some optional extensions to support some specific IRB
   use cases.  In these specific IRB use cases, the <MAC,IP> bindings
   will change across VM moves.  These extensions can't be applied to
   PBB EVPNs, and they can't be applied to light-weighted EVPNs either.
   This will not prevent PBB EVPNs and light-weighted EVPNs from
   supporting typical IRB use cases.

7.2.  Detailed Comparisons with Anycast Node SID

   The "Anycast Node SID" solution here is the transplantation of the
   Anycast-VTEP-IP solution into the SRv6 data-plane, where the Anycast
   Node SID is the equivalent of the Anycast VTEP IP address.  Note that
   the SRv6 Anycast Node SID is the ultimate aggregation of ESI
   indicators.  The detailed comparisons will be added in future
   versions.

8.  Security Considerations

   Security considerations will be added in future versions.

9.  IANA Considerations

9.1.  End.DX2AGG SID

   IANA is requested to allocate a new code point for the new SRv6
   Endpoint Behavior defined in this document.

                 +------+-------------+---------------+
                 | Type | Description | Reference     |
                 +------+-------------+---------------+
                 | TBD1 | End.DX2AGG  | This Document |
                 +------+-------------+---------------+

                            Figure 6: End.DX2AGG

9.2.  Global Unique ESI-label in EAD per ES Route

   When we use a Global Unique ESI-label in the EAD per ES route,
   especially in the ingress-replication use case, this should be
   explicitly indicated in the EAD per ES route.  The details will be
   added in future versions.

10.  Acknowledgements

   The authors would like to thank the following for their comments and
   review of this document:

   Ye Shu.

11.  Normative References

   [I-D.ietf-bess-evpn-unequal-lb]
              Malhotra, N., Sajassi, A., Rabadan, J., Drake, J.,
              Lingala, A., and S. Thoria, "Weighted Multi-Path
              Procedures for EVPN All-Active Multi-Homing", Work in
              Progress, Internet-Draft, draft-ietf-bess-evpn-unequal-lb-
              07, 14 October 2020, <https://tools.ietf.org/html/draft-
              ietf-bess-evpn-unequal-lb-07>.

   [I-D.ietf-bess-mvpn-evpn-aggregation-label]
              Zhang, Z., Rosen, E., Lin, W., Li, Z., and I. Wijnands,
              "MVPN/EVPN Tunnel Aggregation with Common Labels", Work in
              Progress, Internet-Draft, draft-ietf-bess-mvpn-evpn-
              aggregation-label-04, 15 November 2020,
              <https://tools.ietf.org/html/draft-ietf-bess-mvpn-evpn-
              aggregation-label-04>.

   [I-D.ietf-bess-srv6-services]
              Dawra, G., Filsfils, C., Talaulikar, K., Raszuk, R.,
              Decraene, B., Zhuang, S., and J. Rabadan, "SRv6 BGP based
              Overlay services", Work in Progress, Internet-Draft,
              draft-ietf-bess-srv6-services-05, 2 November 2020,
              <https://tools.ietf.org/html/draft-ietf-bess-srv6-
              services-05>.

   [I-D.ietf-spring-srv6-network-programming]
              Filsfils, C., Camarillo, P., Leddy, J., Voyer, D.,
              Matsushima, S., and Z. Li, "SRv6 Network Programming",
              Work in Progress, Internet-Draft, draft-ietf-spring-srv6-
              network-programming-27, 10 December 2020,
              <https://tools.ietf.org/html/draft-ietf-spring-srv6-
              network-programming-27>.

   [I-D.sajassi-bess-evpn-ac-aware-bundling]
              Sajassi, A., Mishra, M., Thoria, S., Brissette, P.,
              Rabadan, J., and J. Drake, "AC-Aware Bundling Service
              Interface in EVPN", Work in Progress, Internet-Draft,
              draft-sajassi-bess-evpn-ac-aware-bundling-02, 18 August
              2020, <https://tools.ietf.org/html/draft-sajassi-bess-
              evpn-ac-aware-bundling-02>.

   [RFC4448]  Martini, L., Ed., Rosen, E., El-Aawar, N., and G. Heron,
              "Encapsulation Methods for Transport of Ethernet over MPLS
              Networks", RFC 4448, DOI 10.17487/RFC4448, April 2006,
              <https://www.rfc-editor.org/info/rfc4448>.

   [RFC7348]  Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger,
              L., Sridhar, T., Bursell, M., and C. Wright, "Virtual
              eXtensible Local Area Network (VXLAN): A Framework for
              Overlaying Virtualized Layer 2 Networks over Layer 3
              Networks", RFC 7348, DOI 10.17487/RFC7348, August 2014,
              <https://www.rfc-editor.org/info/rfc7348>.

   [RFC7432]  Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A.,
              Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based
              Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, February
              2015, <https://www.rfc-editor.org/info/rfc7432>.

   [RFC7623]  Sajassi, A., Ed., Salam, S., Bitar, N., Isaac, A., and W.
              Henderickx, "Provider Backbone Bridging Combined with
              Ethernet VPN (PBB-EVPN)", RFC 7623, DOI 10.17487/RFC7623,
              September 2015, <https://www.rfc-editor.org/info/rfc7623>.

   [RFC8317]  Sajassi, A., Ed., Salam, S., Drake, J., Uttaro, J.,
              Boutros, S., and J. Rabadan, "Ethernet-Tree (E-Tree)
              Support in Ethernet VPN (EVPN) and Provider Backbone
              Bridging EVPN (PBB-EVPN)", RFC 8317, DOI 10.17487/RFC8317,
              January 2018, <https://www.rfc-editor.org/info/rfc8317>.

   [RFC8365]  Sajassi, A., Ed., Drake, J., Ed., Bitar, N., Shekhar, R.,
              Uttaro, J., and W. Henderickx, "A Network Virtualization
              Overlay Solution Using Ethernet VPN (EVPN)", RFC 8365,
              DOI 10.17487/RFC8365, March 2018,
              <https://www.rfc-editor.org/info/rfc8365>.

12.  Informative References

   [I-D.eastlake-bess-evpn-vxlan-bypass-vtep]
              Eastlake, D., Li, Z., and S. Zhuang, "EVPN VXLAN Bypass
              VTEP", Work in Progress, Internet-Draft, draft-eastlake-
              bess-evpn-vxlan-bypass-vtep-06, 19 October 2020,
              <https://tools.ietf.org/html/draft-eastlake-bess-evpn-
              vxlan-bypass-vtep-06>.

   [I-D.ietf-bess-evpn-irb-extended-mobility]
              Malhotra, N., Sajassi, A., Pattekar, A., Lingala, A.,
              Rabadan, J., and J. Drake, "Extended Mobility Procedures
              for EVPN-IRB", Work in Progress, Internet-Draft, draft-
              ietf-bess-evpn-irb-extended-mobility-04, 27 October 2020,
              <https://tools.ietf.org/html/draft-ietf-bess-evpn-irb-
              extended-mobility-04>.

   [I-D.ietf-bess-pbb-evpn-isid-cmacflush]
              Rabadan, J., Sathappan, S., Nagaraj, K., Miyake, M., and
              T. Matsuda, "PBB-EVPN ISID-based CMAC-Flush", Work in
              Progress, Internet-Draft, draft-ietf-bess-pbb-evpn-isid-
              cmacflush-01, 30 October 2020,
              <https://tools.ietf.org/html/draft-ietf-bess-pbb-evpn-
              isid-cmacflush-01>.

   [I-D.wang-bess-evpn-context-label-02]
              Wang, Y., "'SR-MPLS signalling for CSL-based Context VC'
              in I-D.wang-bess-evpn-context-label-02", 10 June 2020,
              <https://tools.ietf.org/html/draft-wang-bess-evpn-context-
              label-02#section-4.2>.

   [I-D.wang-bess-evpn-egress-protection]
              Wang, Y. and R. Chen, "EVPN Egress Protection", Work in
              Progress, Internet-Draft, draft-wang-bess-evpn-egress-
              protection-04, 29 October 2020,
              <https://tools.ietf.org/html/draft-wang-bess-evpn-egress-
              protection-04>.

   [Revision-01]
              "Revision-01 of this draft", 1 July 2020,
              <https://tools.ietf.org/html/draft-wang-bess-evpn-cmac-
              overload-reduction-01>.

   [Revision-02]
              "Revision-02 of this draft", 14 November 2020,
              <https://tools.ietf.org/html/draft-wang-bess-evpn-cmac-
              overload-reduction-02>.

   [RFC7041]  Balus, F., Ed., Sajassi, A., Ed., and N. Bitar, Ed.,
              "Extensions to the Virtual Private LAN Service (VPLS)
              Provider Edge (PE) Model for Provider Backbone Bridging",
              RFC 7041, DOI 10.17487/RFC7041, November 2013,
              <https://www.rfc-editor.org/info/rfc7041>.

Authors' Addresses

   Yubao Wang
   ZTE Corporation
   No.68 of Zijinghua Road, Yuhuatai District
   Nanjing
   China

   Email: wang.yubao2@zte.com.cn


   Ran Chen
   ZTE Corporation
   No. 50 Software Ave, Yuhuatai District
   Nanjing
   China

   Email: chen.ran@zte.com.cn