[Docs] [txt|pdf|xml|html] [Tracker] [Email] [Diff1] [Diff2] [Nits]

Versions: 00 01 02 03 draft-ietf-idr-ix-bgp-route-server

GROW Working Group                                           E. Jasinska
Internet-Draft                                        Limelight Networks
Intended status: Standards Track                             N. Hilliard
Expires: April 29, 2011                                             INEX
                                                               R. Raszuk
                                                           Cisco Systems
                                                               N. Bakker
                                                             AMS-IX B.V.
                                                        October 26, 2010

                     Internet Exchange Route Server


   The growing popularity of Internet exchange points (IXPs) brings a
   new set of requirements to interconnect participating networks.
   While bilateral exterior BGP sessions between exchange participants
   were previously the most common means of exchanging reachability
   information, the overhead associated with dense interconnection has
   caused substantial operational scaling problems for Internet exchange
   point participants.

   This document outlines a specification for multilateral
   interconnections at IXPs.  Multilateral interconnection is a method
   of exchanging routing information between three or more BGP speakers
   using a single intermediate broker system, referred to as a route
   server.  Route servers are typically used on shared access media
   networks such as Internet exchange points (IXPs), to facilitate
   simplified interconnection between multiple Internet routers on such
   a network.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

Jasinska, et al.         Expires April 29, 2011                 [Page 1]

Internet-Draft             IX BGP Route Server              October 2010

   The list of current Internet-Drafts can be accessed at

   The list of Internet-Draft Shadow Directories can be accessed at

   This Internet-Draft will expire on April 29, 2011.

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the BSD License.

Jasinska, et al.         Expires April 29, 2011                 [Page 2]

Internet-Draft             IX BGP Route Server              October 2010

Table of Contents

   1.  Introduction to Multilateral Interconnection . . . . . . . . .  4
     1.1.  Specification of Requirements  . . . . . . . . . . . . . .  5
   2.  Bilateral Interconnection  . . . . . . . . . . . . . . . . . .  5
   3.  Multilateral Interconnection . . . . . . . . . . . . . . . . .  6
   4.  Technical Considerations for Route Server Implementations  . .  7
     4.1.  Client UPDATE Messages . . . . . . . . . . . . . . . . . .  7
     4.2.  Attribute Transparency . . . . . . . . . . . . . . . . . .  7
       4.2.1.  NEXT_HOP Attribute . . . . . . . . . . . . . . . . . .  8
       4.2.2.  AS_PATH Attribute  . . . . . . . . . . . . . . . . . .  8
       4.2.3.  MULTI_EXIT_DISC Attribute  . . . . . . . . . . . . . .  8
       4.2.4.  Communities Attributes . . . . . . . . . . . . . . . .  8
     4.3.  Per-Client Prefix Filtering  . . . . . . . . . . . . . . .  9
       4.3.1.  Prefix Hiding on a Route Server  . . . . . . . . . . .  9
       4.3.2.  Mitigation Techniques  . . . . . . . . . . . . . . . . 10  Multiple Route Server RIBs . . . . . . . . . . . . 10  Advertising Multiple Paths . . . . . . . . . . . . 10
   5.  Operational Considerations for Route Server Installations  . . 12
     5.1.  Route Server Scaling . . . . . . . . . . . . . . . . . . . 12
       5.1.1.  Tackling Scaling Issues  . . . . . . . . . . . . . . . 12  View Merging and Decomposition . . . . . . . . . . 12  Destination Splitting  . . . . . . . . . . . . . . 13  NEXT_HOP Resolution  . . . . . . . . . . . . . . . 13
     5.2.  NLRI Leakage Mitigation  . . . . . . . . . . . . . . . . . 13
     5.3.  Route Server Redundancy  . . . . . . . . . . . . . . . . . 13
     5.4.  AS_PATH Consistency Check  . . . . . . . . . . . . . . . . 14
     5.5.  Implementing Routing Policies  . . . . . . . . . . . . . . 14
       5.5.1.  Communities  . . . . . . . . . . . . . . . . . . . . . 14
       5.5.2.  Internet Routing Registry  . . . . . . . . . . . . . . 14
   6.  Security Considerations  . . . . . . . . . . . . . . . . . . . 15
   7.  IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 15
   8.  Acknowledgments  . . . . . . . . . . . . . . . . . . . . . . . 15
   9.  References . . . . . . . . . . . . . . . . . . . . . . . . . . 15
     9.1.  Normative References . . . . . . . . . . . . . . . . . . . 15
     9.2.  Informative References . . . . . . . . . . . . . . . . . . 16
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 17

Jasinska, et al.         Expires April 29, 2011                 [Page 3]

Internet-Draft             IX BGP Route Server              October 2010

1.  Introduction to Multilateral Interconnection

   Internet exchange points (IXPs) provide IP data interconnection
   facilities for their participants, typically using shared Layer-2
   networking media such as Ethernet.  The Border Gateway Protocol (BGP)
   [RFC4271], an inter-Autonomous System routing protocol, is commonly
   used to facilitate exchange of network reachability information over
   such media.

   In the case of bilateral interconnection between two exchange
   participant routers, each router must be configured with a BGP
   session to the other.  At IXPs with many participants who wish to
   implement dense interconnection, this requirement can lead both to
   large router configurations and high administrative overhead.  Given
   the growth in the number of participants at many IXPs, it has become
   operationally troublesome to implement densely meshed
   interconnections at these IXPs.

   Multilateral interconnection is a method of interconnecting BGP
   speaking routers using a third party brokering system, commonly
   referred to as a route server and typically managed by the IXP
   operator.  Each of the multilateral interconnection participants
   (usually referred to as route server clients) announces network
   reachability information to the route server using exterior BGP, and
   the route server in turn forwards this information to each other
   route server client connected to it, according to its configuration.
   Although a route server uses BGP to exchange reachability information
   with each of its clients, it does not forward traffic itself and is
   therefore not a router.

   A route server can be viewed as similar in function to an [RFC4456]
   route reflector, except that it operates using EBGP instead of iBGP.
   Certain adaptions to [RFC4271] are required, to enable an EBGP router
   to operate as a route server, which are outlined in Section 4 of this
   document.  Operational considerations to be taken into account in a
   route server deployment are subject of Section 5.

   The term "route server" is often in a different context used to
   describe a BGP node whose purpose is to accept BGP feeds from
   multiple clients for the purpose of operational analysis and
   troubleshooting.  A system of this form may alternatively be known as
   a "route collector" or a "route-views server".  This document uses
   the term "route server" exclusively to describe multilateral peering
   brokerage systems.

Jasinska, et al.         Expires April 29, 2011                 [Page 4]

Internet-Draft             IX BGP Route Server              October 2010

1.1.  Specification of Requirements

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "OPTIONAL" in this document are to be interpreted as described in

2.  Bilateral Interconnection

   Bilateral interconnection is a method of interconnecting routers
   using individual BGP sessions between each participant router on an
   IXP in order to exchange reachability information.  While
   interconnection policies vary from participant to participant, most
   IXPs have significant numbers of participants who see value in
   interconnecting with as many other exchange participants as possible.
   In order for an IXP participant to implement a dense interconnection
   policy, it is necessary for the participant to liaise with each of
   their intended interconnection partners and if this partner agrees to
   interconnect, then both participants' routers must be configured with
   a BGP session to exchange network reachability information.  If each
   exchange participant interconnects with each other participant, a
   full mesh of BGP sessions is needed, as detailed in Figure 1.

                                ___      ___
                               /   \    /   \
                            ..| AS1 |..| AS2 |..
                           :   \___/____\___/   :
                           :     | \    / |     :
                           :     |  \  /  |     :
                           : IXP |   \/   |     :
                           :     |   /\   |     :
                           :     |  /  \  |     :
                           :    _|_/____\_|_    :
                           :   /   \    /   \   :
                            ..| AS3 |..| AS4 |..
                               \___/    \___/

               Figure 1: Full-Mesh Interconnection at an IXP

   Figure 1 depicts an IXP platform with four connected routers,
   administered by four separate exchange participants, each of them
   with a locally unique autonomous system number: AS1, AS2, AS3 and
   AS4.  Each of these four participants wishes to exchange traffic with
   all other participants; this is accomplished by configuring a full
   mesh of BGP sessions on each router connected to the exchange,
   resulting in 6 BGP sessions across the IXP fabric.

Jasinska, et al.         Expires April 29, 2011                 [Page 5]

Internet-Draft             IX BGP Route Server              October 2010

   The number of BGP sessions at an exchange has an upper bound of
   n*(n-1)/2, where n is the number of routers at the exchange.  As many
   exchanges have relatively large numbers of participating networks,
   the quadratic scaling requirements of dense interconnection tend to
   cause operational and administrative overhead at large IXPs.
   Consequently, new participants to an IXP require significant initial
   resourcing in order to gain value from their IXP connection, while
   existing exchange participants need to commit ongoing resources in
   order to benefit from interconnecting with these new participants.

3.  Multilateral Interconnection

   Multilateral interconnection is implemented using a route server
   configured to use BGP to distribute network layer reachability
   information (NLRI) among all client routers.  The route server
   preserves the BGP NEXT_HOP attribute from all received NLRI UPDATE
   messages, and passes these messages with unchanged NEXT_HOP to its
   route server clients, according to its configured routing policy.
   Using this method of exchanging NLRI messages, an IXP participant
   router can receive an aggregated list of prefixes from all other
   route server clients using a single BGP session to the route server
   instead of depending on BGP sessions with each other router at the
   exchange.  This reduces the overall number of BGP sessions at an
   Internet exchange from n*(n-1)/2 to n, where n is the number of
   routers at the exchange.

   In practical terms, this allows dense interconnection between IXP
   participants with low administrative overhead and significantly
   simpler and smaller router configurations.  In particular, new IXP
   participants benefit from immediate and extensive interconnection,
   while existing route server participants receive reachability
   information from these new participants without necessarily having to
   adapt their configurations.

Jasinska, et al.         Expires April 29, 2011                 [Page 6]

Internet-Draft             IX BGP Route Server              October 2010

                                ___      ___
                               /   \    /   \
                            ..| AS1 |..| AS2 |..
                           :   \___/    \___/   :
                           :      \      /      :
                           :       \    /       :
                           :        \__/        :
                           : IXP   /    \       :
                           :      |  RS  |      :
                           :       \____/       :
                           :        /  \        :
                           :       /    \       :
                           :    __/      \__    :
                           :   /   \    /   \   :
                            ..| AS3 |..| AS4 |..
                               \___/    \___/

           Figure 2: IXP-based Interconnection with Route Server

   As illustrated in Figure 2, each router on the IXP fabric requires
   only a single BGP session to the route server, from which it can
   receive reachability information for all other routers on the IXP
   which also connect to the route server.

4.  Technical Considerations for Route Server Implementations

4.1.  Client UPDATE Messages

   A route server MUST accept all UPDATE messages received from each of
   its clients for inclusion in its Adj-RIB-In.  These UPDATE messages
   MAY by omitted from the route server's Loc-RIB or Loc-RIBs, due to
   filters configured for the purposes of implementing routing policy.
   The route server SHOULD perform one or more BGP Decision Processes to
   select routes for subsequent advertisement to its clients, taking
   into account possible configuration to provide multiple NLRI paths to
   a particular client as described in Section or multiple Loc-
   RIBs as described in Section  The route server SHOULD
   forward UPDATE messages where appropriate from its Loc-RIB or Loc-
   RIBs to its clients.

4.2.  Attribute Transparency

   As a route server primarily performs a brokering service,
   modification of attributes could cause route server clients to alter
   their BGP best-path selection process for received prefix
   reachability information, thereby changing the intended routing
   policies of exchange participants.  Therefore, contrary to what is

Jasinska, et al.         Expires April 29, 2011                 [Page 7]

Internet-Draft             IX BGP Route Server              October 2010

   specified in section 5. of [RFC4271], route servers SHOULD NOT update
   well-known BGP attributes received from route server clients before
   redistributing them to their other route server clients.  Optional
   recognized and unrecognized BGP attributes, whether transitive or
   non-transitive, SHOULD NOT be updated by the route server and SHOULD
   be passed on to other route server clients.

4.2.1.  NEXT_HOP Attribute

   The NEXT_HOP, a well-known mandatory BGP attribute, defines the IP
   address of the router used as the next hop to the destinations listed
   in the Network Layer Reachability Information field of the UPDATE
   message.  As the route server does not participate in the actual
   routing of traffic, the NEXT_HOP attribute MUST be passed unmodified
   to the route server clients, similar to the "third party" next hop
   feature described in section 5.1.3. of [RFC4271].

4.2.2.  AS_PATH Attribute

   AS_PATH is a well-known mandatory attribute which identifies the
   autonomous systems through which routing information carried in the
   UPDATE message has passed.

   As a route server does not participate in the process of forwarding
   data between client routers, and because modification of the AS_PATH
   attribute could affect route server client best-path calculations,
   the route server SHOULD NOT prepend its own AS number to the AS_PATH
   segment nor modify the AS_PATH segment in any other way.

4.2.3.  MULTI_EXIT_DISC Attribute

   MULTI_EXIT_DISC is an optional non-transitive attribute intended to
   be used on external (inter-AS) links to discriminate among multiple
   exit or entry points to the same neighboring AS.  If applied to an
   NLRI UPDATE sent to a route server, the attribute (contrary to
   section 5.1.4 of [RFC4271]) SHOULD be propagated to other route
   server clients and the route server SHOULD NOT modify the value of
   this attribute.

4.2.4.  Communities Attributes

   The BGP COMMUNITIES ([RFC1997]) and Extended Communities ([RFC4360])
   attributes are attributes intended for labeling information carried
   in BGP UPDATE messages.  Transitive as well as non-transitive
   Communities attributes applied to an NLRI UPDATE sent to a route
   server SHOULD NOT be modified, processed or removed.  However, if
   such an attribute is intended for processing by the route server
   itself, it MAY be modified or removed.

Jasinska, et al.         Expires April 29, 2011                 [Page 8]

Internet-Draft             IX BGP Route Server              October 2010

4.3.  Per-Client Prefix Filtering

4.3.1.  Prefix Hiding on a Route Server

   While IXP participants often use route servers with the intention of
   interconnecting with as many other route server participants as
   possible, there are several circumstances where control of prefix
   distribution on a per-client basis is important for ensuring that
   desired interconnection policies are met.

                                ___      ___
                               /   \    /   \
                            ..| AS1 |..| AS2 |..
                           :   \___/    \___/   :
                           :       \    / |     :
                           :        \  /  |     :
                           : IXP     \/   |     :
                           :         /\   |     :
                           :        /  \  |     :
                           :    ___/____\_|_    :
                           :   /   \    /   \   :
                            ..| AS3 |..| AS4 |..
                               \___/    \___/

               Figure 3: Filtered Interconnection at an IXP

   Using the example in Figure 3, AS1 does not directly exchange prefix
   information with either AS2 or AS3 at the IXP, but only interconnects
   with AS4.

   In the traditional bilateral interconnection model, prefix filtering
   to a third party exchange participant is accomplished either by not
   engaging in a bilateral interconnection with that participant or else
   by implementing outbound prefix filtering on the BGP session towards
   that participant.  However, in a multilateral interconnection
   environment, only the route server can perform outbound prefix
   filtering in the direction of the route server client; route server
   clients depend on the route server to perform their filtering for

   If the same prefix is sent to a route server from multiple route
   server clients with different BGP attributes, and traditional best-
   path route selection is performed on that list of prefixes, then the
   route server will select a single best-path prefix for propagation to
   all connected clients.  If, however, the route server has been
   configured to filter the calculated best-path prefix from reaching a
   particular route server client, then that client will receive no

Jasinska, et al.         Expires April 29, 2011                 [Page 9]

Internet-Draft             IX BGP Route Server              October 2010

   reachability information for that prefix from the route server,
   despite the fact that the route server has received alternative
   reachability information for that prefix from other route server
   clients.  This phenomenon is referred to as "prefix hiding".

   For example, in Figure 3, if the same prefix were sent to the route
   server via AS2 and AS4, and the route via AS2 was preferred according
   to BGP's traditional best-path selection, but AS2 was filtered by
   AS1, then AS1 would never receive this prefix, even though the route
   server had previously received a valid alternative path via AS4.
   This happens because the best-path selection is performed only once
   on the route server for all clients.

   It should be noted that prefix hiding will only occur on route
   servers which employ per-client prefix filtering; if an IXP operator
   deploys a route server without prefix filtering, then prefix hiding
   does not occur, as all paths are considered equally valid from the
   point of view of the route server.

   There are several techniques which may be employed to prevent the
   prefix hiding problem from occurring.  Route server implementations
   SHOULD implement at least one method to prevent prefix hiding.

4.3.2.  Mitigation Techniques  Multiple Route Server RIBs

   The most portable means of preventing the route server prefix hiding
   problem is by using a route server BGP implementation which performs
   the per-client best-path calculation for each set of prefixes which
   results after the route server's client filtering policies have been
   taken into consideration.  This can be implemented by using per-
   client Loc-RIBs, with prefix filtering implemented between the Adj-
   RIB-In and the per-client Loc-RIB.  Implementations MAY optimize this
   by maintaining prefixes not subject to filtering policies in a global
   Loc-RIB, with per-client Loc-RIBs stored as deltas.

   This problem mitigation technique is highly portable, as it makes no
   assumptions about the feature capabilities of the route server
   clients.  Advertising Multiple Paths

   The prefix distribution model described above assumes standard BGP
   session encoding where the route server sends a single path to its
   client for any given prefix.  This path is selected using the BGP
   path selection decision process described in [RFC4271].  If, however,
   it were possible for the route server to send more than a single path

Jasinska, et al.         Expires April 29, 2011                [Page 10]

Internet-Draft             IX BGP Route Server              October 2010

   to a route server client, then route server clients would no longer
   depend on receiving a single best path to a particular prefix;
   consequently, the prefix hiding problem described in Section 4.3.1
   would disappear.

   We present two methods which describe how such increased path
   diversity could be implemented.  Diverse BGP Path Approach

   The Diverse BGP Path proposal as defined in
   [I-D.ietf-grow-diverse-bgp-path-dist] is a simple way to distribute
   multiple prefix paths from a route server to a route server client by
   using a separate BGP session from the route server to a client for
   each different path.

   The number of paths which may be distributed to a client is
   constrained by the number of BGP sessions which the server and the
   client are willing to establish with each other.  The distributed
   paths may be established from the global BGP Loc-RIB on the route
   server in addition to any per-client Loc-RIB.  As there may be more
   potential paths to a given prefix than configured BGP sessions, this
   method is not guaranteed to eliminate the prefix hiding problem in
   all situations.  Furthermore, this method may significantly increase
   the number of BGP sessions handled by the route server, which may
   negatively impact its performance.  BGP ADD-PATH Approach

   The [I-D.ietf-idr-add-paths] Internet draft proposes a different
   approach to multiple path propagation, by allowing a BGP speaker to
   forward multiple paths for the same prefix on a single BGP session.
   As [RFC4271] specifies that a BGP listener must implement an implicit
   withdraw when it receives an UPDATE message for a prefix which
   already exists in its Adj-RIB-In, this approach requires explicit
   support for the feature both on the route server and on its clients.

   If the ADD-PATH capability is negotiated bidirectionally between the
   route server and a route server client, and the route server client
   propagates multiple paths for the same prefix to the route server,
   then this could potentially cause the propagation of inactive,
   invalid or suboptimal paths to the route server, thereby causing loss
   of reachability to other route server clients.  For this reason, ADD-
   PATH implementations on a route server SHOULD enforce send-only mode
   with the route server clients, which would result in negotiating
   receive-only mode from the client to the route server.

Jasinska, et al.         Expires April 29, 2011                [Page 11]

Internet-Draft             IX BGP Route Server              October 2010

5.  Operational Considerations for Route Server Installations

5.1.  Route Server Scaling

   While deployment of multiple Loc-RIBs on the route server presents a
   simple way to avoid the prefix hiding problem noted in Section 4.3.1,
   this approach requires significantly more computing resources on the
   route server than where a single Loc-RIB is deployed for all clients.
   As the [RFC4271] Decision Process must be applied to all Loc-RIBs
   deployed on the route server, both CPU and memory requirements on the
   host computer scale approximately according to O(P * N), where P is
   the total number of unique prefixes received by the route server and
   N is the number of route server clients which require a unique Loc-
   RIB.  As this is a super-linear scaling relationship, large route
   servers may derive benefit from deploying per-client Loc-RIBs only
   where they are required.

   Regardless of any Loc-RIB optimization implemented, the route
   server's control plane bandwidth requirements will scale according to
   O(P * N), where P is the total number of unique prefixes received by
   the route server and N is the total number of route server clients.
   In the case where P_avg (the arithmetic mean number of unique
   prefixes received per route server client) remains roughly constant
   even as the number of connected clients increases, this relationship
   can be rewritten as O((P_avg * N) * N) or O(N^2).  This quadratic
   upper bound on the network traffic requirements indicates that the
   route server model will not scale to arbitrarily large sizes.

5.1.1.  Tackling Scaling Issues

   The network traffic scaling issue presents significant difficulties
   with no clear solution - ultimately, each client must receive a
   UPDATE for each unique prefix received by the route server.  However,
   there are several potential methods for dealing with the CPU and
   memory resource requirements of route servers.  View Merging and Decomposition

   View merging and decomposition, outlined in [RS-ARCH], describes a
   method of optimising memory and CPU requirements where multiple route
   server clients are subject to exactly the same routing policies.  In
   this situation, the multiple Loc-RIB views required by each client
   are merged into a single view.

   A variation of this approach may be implemented on route servers by
   ensuring that separate Loc-RIBs are only configured for route server
   clients with unique export peering policies.

Jasinska, et al.         Expires April 29, 2011                [Page 12]

Internet-Draft             IX BGP Route Server              October 2010  Destination Splitting

   Destination splitting, also described in [RS-ARCH], describes a
   method for route server clients to connect to multiple route servers
   and to send non-overlapping sets of prefixes to each route server.
   As each route server computes the best path for its own set of
   prefixes, the quadratic scaling requirement operates on multiple
   smaller sets of prefixes.  This reduces the overall computational and
   memory requirements for managing multiple Loc-RIBs and performing the
   best-path calculation on each.  In order for this method to perform
   well, destination splitting would require significant co-ordination
   between the route server operator and each route server client.  In
   practice, such levels of co-ordination are unlikely to work
   successfully, thereby diminishing the value of this approach.  NEXT_HOP Resolution

   As route servers are usually deployed at IXPs which use flat layer 2
   networks, recursive resolution of the NEXT_HOP attribute is generally
   not required, and can be replaced by a simple check to ensure that
   the NEXT_HOP value for each prefix is a network address on the IXP
   LAN's IP address range.

5.2.  NLRI Leakage Mitigation

   NLRI leakage occurs when a BGP client unintentionally distributes
   NLRI UPDATE messages to one or more neighboring BGP routers.  NLRI
   leakage of this form to a route server can cause connectivity
   problems at an IXP if each route server client is configured to
   accept all prefix UPDATE messages from the route server.  It is
   therefore RECOMMENDED when deploying route servers that, due to the
   potential for collateral damage caused by NLRI leakage, route server
   operators deploy NLRI leakage mitigation measures in order to prevent
   unintentional prefix announcements or else limit the scale of any
   such leak.  Although not foolproof, per-client inbound prefix limits
   can restrict the damage caused by prefix leakage in many cases.  Per-
   client inbound prefix filtering on the route server is a more
   deterministic and usually more reliable means of preventing prefix
   leakage, but requires more administrative resources to maintain

5.3.  Route Server Redundancy

   As the purpose of an IXP route server implementation is to provide a
   reliable reachability brokerage service, it is RECOMMENDED that
   exchange operators who implement route server systems provision
   multiple route servers on each shared Layer-2 domain.  There is no
   requirement to use the same BGP implementation or operating system

Jasinska, et al.         Expires April 29, 2011                [Page 13]

Internet-Draft             IX BGP Route Server              October 2010

   for each route server on the IXP fabric; however, it is RECOMMENDED
   that where an operator provisions more than a single server on the
   same shared Layer-2 domain, each route server implementation be
   configured equivalently and in such a manner that the path
   reachability information from each system is identical.

5.4.  AS_PATH Consistency Check

   As per [RFC4271] every BGP speaker who advertises a route to another
   external BGP speaker prepends its own AS number as the last element
   of the AS_PATH sequence.  Therefore the leftmost AS in an AS_PATH
   attribute is equal to the autonomous system number of the BGP speaker
   that sent an UPDATE message.

   [RFC4271] suggests in section 6.3 that a BGP speaker MAY check the
   AS_PATH attribute of each UPDATE message received for consistency, if
   the leftmost AS in the AS_PATH is in fact the one of the sender.

   Route servers do not modify the AS_PATH attribute (as described in
   Section 4.2.2), since they do not participate in the traffic
   exchange.  Therefore a consistency check on the AS_PATH of an UPDATE
   received by a route server client would fail.  It is therefore
   RECOMMENDED that route server clients disable the AS_PATH consistency
   check towards the route server.

5.5.  Implementing Routing Policies

   Prefix filtering is commonly implemented on route servers to provide
   prefix distribution control mechanisms for route server clients.
   There are a few commonly used strategies available.

5.5.1.  Communities

   Prefixes sent to the route server are tagged with certain COMMUNITIES
   attributes agreed upon beforehand between the operator and all
   participants.  Based on the values, routes are propagated to all
   other participants, a subset of participants, or none.  This allows
   for one-way filtering policies to be implemented on the route server;
   if a participant chooses not to exchange routes with a certain other
   participant, he will have to instruct the route server to not
   announce his own routes and filter incoming routes on his own router.

5.5.2.  Internet Routing Registry

   Filters configured on the route server can be constructed by querying
   an Internet Routing Registry database for RPSL [RFC2622] objects
   placed there by participating operators.  Import and export
   statements for the route server's ASN in an aut-num object define

Jasinska, et al.         Expires April 29, 2011                [Page 14]

Internet-Draft             IX BGP Route Server              October 2010

   their desired policy, from which the configured filters are derived.

6.  Security Considerations

   On route server installations which do not employ prefix-hiding
   mitigation techniques, the prefix hiding problem outlined in section
   Section 4.3.1 can be used in certain circumstances to proactively
   block third party prefix announcements from other route server

7.  IANA Considerations

   The new set of mechanism for route servers does not require any new
   allocations from IANA.

8.  Acknowledgments

   The authors would like to thank Chris Hall, Ryan Bickhart and Steven
   Bakker for their valuable input.

   In addition, the authors would like to acknowledge the developers of
   BIRD, OpenBGPD and Quagga, whose open source BGP implementations
   include route server capabilities which are compliant with this

9.  References

9.1.  Normative References

              Raszuk, R., Fernando, R., Patel, K., McPherson, D., and K.
              Kumaki, "Distribution of diverse BGP paths.",
              draft-ietf-grow-diverse-bgp-path-dist-02 (work in
              progress), July 2010.

              Walton, D., Retana, A., Chen, E., and J. Scudder,
              "Advertisement of Multiple Paths in BGP",
              draft-ietf-idr-add-paths-04 (work in progress),
              August 2010.

   [RFC1997]  Chandrasekeran, R., Traina, P., and T. Li, "BGP
              Communities Attribute", RFC 1997, August 1996.

Jasinska, et al.         Expires April 29, 2011                [Page 15]

Internet-Draft             IX BGP Route Server              October 2010

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2622]  Alaettinoglu, C., Villamizar, C., Gerich, E., Kessens, D.,
              Meyer, D., Bates, T., Karrenberg, D., and M. Terpstra,
              "Routing Policy Specification Language (RPSL)", RFC 2622,
              June 1999.

   [RFC4271]  Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
              Protocol 4 (BGP-4)", RFC 4271, January 2006.

   [RFC4360]  Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended
              Communities Attribute", RFC 4360, February 2006.

   [RFC4456]  Bates, T., Chen, E., and R. Chandra, "BGP Route
              Reflection: An Alternative to Full Mesh Internal BGP
              (IBGP)", RFC 4456, April 2006.

   [RS-ARCH]  Govindan, R., Alaettinoglu, C., Varadhan, K., and D.
              Estrin, "A Route Server Architecture for Inter-Domain
              Routing", 1995,

9.2.  Informative References

   [RFC1863]  Haskin, D., "A BGP/IDRP Route Server alternative to a full
              mesh routing", RFC 1863, October 1995.

   [RFC3418]  Presuhn, R., "Management Information Base (MIB) for the
              Simple Network Management Protocol (SNMP)", STD 62,
              RFC 3418, December 2002.

   [RFC4223]  Savola, P., "Reclassification of RFC 1863 to Historic",
              RFC 4223, October 2005.

   [RFC4760]  Bates, T., Chandra, R., Katz, D., and Y. Rekhter,
              "Multiprotocol Extensions for BGP-4", RFC 4760,
              January 2007.

   [RFC5065]  Traina, P., McPherson, D., and J. Scudder, "Autonomous
              System Confederations for BGP", RFC 5065, August 2007.

   [RFC5226]  Narten, T. and H. Alvestrand, "Guidelines for Writing an
              IANA Considerations Section in RFCs", BCP 26, RFC 5226,
              May 2008.

Jasinska, et al.         Expires April 29, 2011                [Page 16]

Internet-Draft             IX BGP Route Server              October 2010

Authors' Addresses

   Elisa Jasinska
   Limelight Networks
   2220 W 14th St
   Tempe, AZ  85281

   Email: elisa@llnw.com

   Nick Hilliard
   4027 Kingswood Road
   Dublin  24

   Email: nick@inex.ie

   Robert Raszuk
   Cisco Systems
   170 West Tasman Drive
   San Jose, CA  95134

   Email: raszuk@cisco.com

   Niels Bakker
   AMS-IX B.V.
   Westeinde 12
   Amsterdam, NH  1017 ZN

   Email: niels.bakker@ams-ix.net

Jasinska, et al.         Expires April 29, 2011                [Page 17]

Html markup produced by rfcmarkup 1.129d, available from https://tools.ietf.org/tools/rfcmarkup/