Network Working Group Joseph
Internet Draft (Juniper Networks)
Intended Status: Proposed Standard August 19, 2009
Expiration Date: February 2010
Experience with RSVP-TE P2MP based mVPN
draft-joseph-p2mp-mvpn-experience-00.txt
Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that other
groups may also distribute working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Copyright and License Notice
Copyright (c) 2009 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents in effect on the date of
publication of this document (http://trustee.ietf.org/license-info).
Please review these documents carefully, as they describe your rights
and restrictions with respect to this document.
This document may contain material from IETF Documents or IETF
Contributions published or made publicly available before November
10, 2008. The person(s) controlling the copyright in some of this
material may not have granted the IETF Trust the right to allow
modifications of such material outside the IETF Standards Process.
Joseph [Page 1]
Internet Draft draft-joseph-p2mp-mvpn-experience-00.txt August 2009
Without obtaining an adequate license from the person(s) controlling
the copyright in such materials, this document may not be modified
outside the IETF Standards Process, and derivative works of it may
not be created outside the IETF Standards Process, except to format
it for publication as an RFC or to translate it into languages other
than English.
Abstract
Multicast VPNs have been deployed for some time, based on the Draft
Rosen solution. Network deployments are now moving towards Label
Switched Multicast, primarily to gain some of the advantages that a
label switched network can offer. In short, the requirement is to
achieve more optimal multicast replication and, in turn, better
bandwidth savings. This document describes some of the experience
gained from the implementation and deployment of Label Switched
Multicast using the RSVP-TE P2MP Label Switched Path approach, and as
such is informational only. The intent is to translate that experience
into useful practices for the Service Provider and Enterprise
community who intend to deploy this class of mVPNs. Information on
"Hierarchical Multicast Trees" and "Aggregated P Tunnels" has not been
included, and will follow in the next version of this draft.
Specification of Requirements
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
1. Introduction
Draft Rosen based mVPN deployments did not impose any mVPN awareness
on the Provider routers, since they used overlay GRE based PIM
signalling between PE routers in order to build individual, customer
specific Multicast Domains. As a little bit of history, RFC 4364
(originally RFC 2547bis, draft-ietf-l3vpn-rfc2547bis) does not
describe a mechanism to provide multicast signaling and multicast
data delivery through the service provider network for Layer 3 VPN
service. Thus, a number of solutions were discussed, and various
vendor-specific architectures were implemented.
Currently, the Layer 3 VPN working group has two drafts:
draft-ietf-l3vpn-2547bis-mcast-07.txt (we use the term [LSM-MVPN-DP]
to refer to this draft) and draft-ietf-l3vpn-2547bis-mcast-bgp-05.txt
(we use the term [LSM-MVPN-CP] to refer to this draft).
The draft-ietf-l3vpn-2547bis-mcast document is a superset of
previous solutions and provides many approaches as options
without mandating any of them. In other words, it forms a framework
for a more detailed specification of each option. One of the options
available is RSVP-TE. So when we use the term [LSM-MVPN-DP], we
are actually referring to the option of using RSVP-TE.
The draft draft-ietf-l3vpn-2547bis-mcast-bgp-05.txt defines the
control plane signalling using MP-BGP for exchanging Customer
Multicast routes within the Provider network.
In this document, we describe some of the lessons learnt from
implementing and deploying Multicast VPN using the P2MP RSVP-TE
approach. We hope it will benefit service providers and network
operators looking to deploy mVPN services based on this approach.
2. Implementation
At the time of writing, there are two known implementations: IOS-XR
from Cisco and JunOS from Juniper Networks. Contact these vendors
for implementation details beyond what is provided in this draft.
In the following sections, we describe some of the lessons we learnt
during the implementation and early interoperability testing stages.
3. ASM vs. SSM
One of the most common questions asked or assessed while deploying
mVPN, especially while deploying Draft Rosen mVPNs, is the choice
between the ASM and SSM models. In this case, the SSM model offers
two options: "I-PMSI", which replicates the ASM model to a large
extent, and "S-PMSI", which is purely source based. In other words,
the S-PMSI model ensures that traffic for a given mVPN only reaches
PE routers that have interested receivers, which is indicated by the
Receiver PE using a Type 7 multicast route advertisement. The I-PMSI
model is similar to the ASM approach, as mentioned earlier in this
section, in that multicast traffic is sent to all PEs within a given
mVPN. However, Receiver PEs that do not have interested receivers
drop the traffic at that point.
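As a minimal sketch of the difference between the two models, the
following Python fragment computes which PEs of an mVPN receive
traffic from the core under each tunnel type. The function name and
the PE names are hypothetical and used only for illustration; this is
not taken from any implementation.

```python
def pes_receiving_traffic(member_pes, interested_pes, tunnel_type):
    """Return the set of PEs to which the core delivers traffic.

    member_pes:     receiver-side PEs that are members of the mVPN
    interested_pes: PEs that have signalled interested receivers
    tunnel_type:    "I-PMSI" or "S-PMSI"
    """
    members = set(member_pes)
    if tunnel_type == "I-PMSI":
        # Inclusive: every member PE gets the traffic; PEs without
        # interested receivers drop it locally.
        return members
    if tunnel_type == "S-PMSI":
        # Selective: only PEs with interested receivers get the traffic.
        return members & set(interested_pes)
    raise ValueError("unknown tunnel type: %r" % (tunnel_type,))


# Hypothetical mVPN with three member PEs, one of which has receivers.
members = ["PE1", "PE2", "PE3"]
interested = ["PE2"]
for tunnel in ("I-PMSI", "S-PMSI"):
    print(tunnel, sorted(pes_receiving_traffic(members, interested, tunnel)))
```

With I-PMSI all three PEs are reached and two of them discard the
traffic; with S-PMSI only PE2 is reached.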
So in reality the choice here can directly rule out the ASM model, and
instead focus on the two SSM alternatives available. Before we answer
this question, let us first look at certain considerations in the
following sections.
4. RSVP Scalability
One of the points to be considered is RSVP scalability. Consider a
scenario where 1 Source PE and 3 Receiver PEs are physically
connected via a single Provider router.
Based on the BGP A-D routes received from the leaf nodes (Receiver
PEs), the Source PE signals RSVP PATH messages to each leaf node,
which are reciprocated with RSVP RESV messages. At the Source PE,
this amounts to 6 PATH/RESV messages and state for 3 LSPs in total.
The number of messages increases at the Provider router, which is the
point of replication for the sub-LSPs to each leaf node: as it
creates the P2MP LSP, it handles 2 PATH and 2 RESV messages per leaf
node (each message is received once and forwarded once).
Therefore the core router forwards/receives 12 PATH/RESV messages
and maintains state for 3 LSPs in total.
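The arithmetic above can be sketched as follows. This is a rough
model only: it assumes one sub-LSP per leaf, a single transit router
between the Source PE and all leaves, and one PATH/RESV exchange per
sub-LSP (refresh messages are not counted). The function name is
hypothetical.

```python
def rsvp_message_counts(num_leaves):
    """Count PATH/RESV messages handled per node for one P2MP LSP.

    Assumes a single Provider (transit) router between the Source PE
    and every leaf, and one sub-LSP per leaf.
    """
    if num_leaves < 0:
        raise ValueError("num_leaves must be non-negative")
    # Source PE: sends one PATH and receives one RESV per leaf.
    source_pe = 2 * num_leaves
    # Transit router: receives and forwards each PATH and each RESV,
    # so it handles every message twice.
    transit = 2 * source_pe
    return {"source_pe": source_pe, "transit": transit,
            "sub_lsps": num_leaves}


# The scenario from the text: 1 Source PE, 3 Receiver PEs.
print(rsvp_message_counts(3))
```

For 3 leaves this gives 6 messages at the Source PE and 12 at the
Provider router, matching the figures in the text.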
5. BGP Scalability
It is recommended to use BGP Route Reflectors for scalability reasons.
It is very common to see operators deploy dedicated Route Reflectors
per address family to simplify administration and operations, and to
provide scalability. Therefore it may be a good idea to consider
deploying Route Reflectors per address family. Certain hardware
vendors offer the capability of dedicating Routing Engines/Route
Processors within a single chassis to serve each address family. This
could also be an option, since it would offer better consolidation
and provide a cost effective solution.
6. Resiliency and Convergence within the Provider Core
At the time of writing this document, RSVP-TE supports Fast Re-route
Facility backup/Link Protection for sub-second convergence. Therefore
traffic from a failed link can be switched over to a bypass path even
before the Routing Protocol process is aware of the failure.
In addition, we recommend tuning the IGP to achieve fast convergence,
in order for the local PE router to be aware of a remote link failure.
IGP fast convergence only has an impact on the time taken for the
headend router to re-signal a failed LSP over an alternate path, and
therefore does not directly have a bearing on the switchover time
for RSVP-TE. The most common IGP parameters tuned for quick
convergence are the initial delay for generating Link State updates
upon a failure, which should be set to take effect immediately,
and the SPF calculation delay, which should be set to the lowest
value possible as per the hardware vendor's recommendation.
Node failures in certain cases caused recovery times to increase
up to 180 seconds; the use of BFD in the context of an LSP
reduced the convergence times and is recommended.
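To illustrate why BFD helps with detection, the sketch below computes
the failure detection time of a BFD session, which in asynchronous
mode is the negotiated interval multiplied by the detect multiplier.
The timer values shown are hypothetical examples, not values taken
from our testing and not a recommendation.

```python
def bfd_detection_time_ms(interval_ms, detect_multiplier):
    """Return the BFD failure detection time in milliseconds.

    interval_ms:       negotiated packet interval (assumed value)
    detect_multiplier: number of missed packets before declaring down
    """
    if interval_ms <= 0 or detect_multiplier <= 0:
        raise ValueError("interval and multiplier must be positive")
    return interval_ms * detect_multiplier


# Hypothetical timers: 300 ms interval, multiplier of 3.
print(bfd_detection_time_ms(300, 3), "ms")
```

With these example timers a failure is detected in under a second,
compared with the recovery times of up to 180 seconds noted above.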
It is also important to detect a failure quickly. It has been found
that interface dampening can at times result in the physical status
of the link being suppressed. This would have an impact on
convergence. Certain hardware has default timers set for Gigabit
Ethernet interfaces, referred to as "Carrier-Delay". It is
recommended that this value be set to "0", in order to enforce
an immediate link-down trigger in the event of a link failure.
Both Juniper and Cisco support BGP fast convergence features that
avoid long and cumbersome re-writes of the FIB tables in the event
that the path to a given Next-Hop changes. Therefore no special or
additional configuration is required.
7. Resiliency End-to-End
Assume there is more than one Source available for a given group.
Two options have been evaluated. The first is to use Anycast RP, for
two multicast sources for instance, where the address with the
longest prefix match automatically becomes the preferred primary
source. The second approach is to use separate unicast IP addresses.
Let us assume we are using two sources, S1 and S2. In this case, two
separate PIM joins are sent, one to each source. Therefore we may
have (S1,G1) and (S2,G2). It is also possible to have a single source
multi-homed to the ingress PE and achieve link level resiliency at
the sending site (between the CE and PE). In this case, something
similar to the illustration provided below was tested.
Two streams are delivered to the Leaf PE, each using a separate VRF:
+-----------+ +-----------+
| in PE | | LeafPE |
__|__,-----. | ,-------.| ,-----.__|_____
,---. / | ( VRF1 )..( M-VPN1 ... VRF1 ) | \ ,---.
/ \/ | `-----' | `-------'| `-----' | \/ Dest\
(SOURCE | | | | | ( CE )
\ /\___|__,-----. | ,-------.| ,-----. | /\ /
`---' | ( VRF2 )... M-VPN2 ... VRF2 )_|_____/ `---'
| `-----' | `-------'| `-----' |
+-----------+ +-----------+
The former is subject to network convergence, which includes
IGP and BGP, and RSVP in case S-PMSI is used. The reason
here is that [LSM-MVPN-DP] chooses a single UMH
(Upstream Multicast Hop) for forwarding.
The latter adds more burden on the Provider network, since
two copies are simultaneously forwarded in the network
for the same set of groups. In other words, one copy is
redundant. However, failover in this case is much quicker
than with the first option, since the application can
re-converge to the redundant/second copy in the event of
the primary copy being unavailable. Moreover, no
re-synchronization of any control plane protocol is
needed to achieve failover. Many operators can choose option 1
for standard service offerings, and option 2 may be offered
as a premium service, for the simple fact that provider
bandwidth usage is doubled.
8. Recommendations for the Type of Source Trees
In the sections above we discussed some of the details in
regard to convergence and scalability; now let us
address some recommendations based on experience. The
goal of a network designer revolves around achieving an
acceptable and optimal level of multicast forwarding in the
provider core. To be acceptable, firstly we
recommend that as far as possible traffic should only be
forwarded to PEs that have interested receivers for a given
group. Secondly, while in the backbone, no more than one
copy of a packet should traverse any link. Thirdly, if
bandwidth utilization needs to be optimized within the
backbone, then a minimum cost tree should be followed
rather than a shortest path tree. The number of states
in a core network is proportional to the following:
+ Inclusive P-tunnels: number of VRF members of all
MVPNs in the network.
+ Selective P-tunnels: total number of multicast flows
((C-S, C-G)) of all MVPNs.
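The two quantities above can be compared with a short sketch. The
dictionary keys, the function name and the per-MVPN numbers are all
hypothetical; the results are the quantities that core state is
proportional to, not absolute state counts.

```python
def core_state_estimate(mvpns):
    """Return the quantities that core state is proportional to.

    mvpns: list of dicts with hypothetical keys
           "vrf_members" (VRF members of the MVPN) and
           "flows" (number of (C-S, C-G) flows in the MVPN).
    """
    inclusive = sum(m["vrf_members"] for m in mvpns)
    selective = sum(m["flows"] for m in mvpns)
    return {"inclusive": inclusive, "selective": selective}


# Hypothetical network with two MVPNs.
mvpns = [
    {"name": "vpn-a", "vrf_members": 10, "flows": 200},
    {"name": "vpn-b", "vrf_members": 4, "flows": 15},
]
print(core_state_estimate(mvpns))
```

In this example the selective figure (215) is far larger than the
inclusive one (14), which illustrates why using S-PMSI tunnels for
every flow carries a state cost in the core.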
It is found that I-PMSI tunnels work well for most
implementations, and can be tuned with a threshold rate for
multicast traffic flows, which will force traffic to
switch to an S-PMSI tunnel. We have found that if the
ratio between the number of member PEs and the number of
receiver PEs for a given mVPN is low, the I-PMSI tunnel
approach is clearly the recommended one,
since Selective Tunnels do not help in bandwidth savings.
This also saves on state information, since
fewer trees are used in the provider network. We have
seen convergence time being directly proportional to the
amount of state information the network carries.
S-PMSI tunnels can be used within an mVPN on a per group
basis, for individual high bandwidth sessions, if needed.
This could be a typical requirement for an operator for
whom bandwidth savings are more important than scalability.
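The threshold based switchover described in this section can be
sketched as below. This is an illustration only: the function name,
the flow addresses and the threshold value are hypothetical, and
actual implementations configure the threshold in vendor specific
ways.

```python
def select_p_tunnel(flow_rate_kbps, threshold_kbps):
    """Pick the tunnel type for one (C-S, C-G) flow.

    A flow stays on the I-PMSI tunnel until its rate exceeds the
    configured threshold, at which point it moves to an S-PMSI tunnel.
    """
    if flow_rate_kbps > threshold_kbps:
        return "S-PMSI"
    return "I-PMSI"


# Hypothetical flows keyed by (C-S, C-G), with rates in kbps.
flows = {
    ("10.0.0.1", "232.1.1.1"): 8000,  # high bandwidth session
    ("10.0.0.2", "232.1.1.2"): 64,    # low bandwidth session
}
THRESHOLD_KBPS = 1000  # assumed value for illustration
for flow, rate in flows.items():
    print(flow, select_p_tunnel(rate, THRESHOLD_KBPS))
```

Only the high bandwidth session is moved to an S-PMSI tunnel; the low
rate flow remains on the I-PMSI tunnel and adds no extra tree state.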
9. PHP
The egress PE device does not maintain any multicast state
information with the core of the kind used in
Draft Rosen. There is no PIM relationship, nor any GRE tunnels,
between the Provider and Provider Edge devices. The only
binding element is the MPLS multicast tree and its respective
label association (which is based on a combination of incoming
interface and label space). Therefore, the packet needs to arrive
with its labels intact in order to be associated with the respective
multicast tree and egress interface (CE facing), and so PHP is not
used in this case. This is typically the default behavior and no
additional configuration should be required to enable it.
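A minimal sketch of why the label must survive to the egress PE is
given below. As a simplification, it keys the lookup on the incoming
interface and the label; the table contents, interface names and
label value are hypothetical.

```python
# (incoming interface, label) -> (multicast tree, CE-facing interfaces)
# All names and values here are hypothetical.
LABEL_TABLE = {
    ("core-if-1", 300001): ("mvpn-a-tree", ["ce-if-1", "ce-if-2"]),
}


def egress_lookup(incoming_interface, top_label):
    """Map an arriving packet to its multicast tree and egress interfaces.

    top_label is None when the label was already popped upstream (PHP).
    """
    if top_label is None:
        # With PHP the label is gone, so the tree cannot be identified.
        raise LookupError("no label: cannot identify the multicast tree")
    return LABEL_TABLE[(incoming_interface, top_label)]


print(egress_lookup("core-if-1", 300001))
```

If the penultimate hop had popped the label, the egress PE would have
nothing to key the lookup on, which is the reason PHP is not used.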
10. QoS
Classification and queuing are based on MPLS EXP bits.
The recommendation is to ensure that multicast traffic is not subject
to early drops such as WRED/RED, and to provide temporal buffering
within the queue that carries multicast traffic, in order to
ensure that latency/jitter is predictable.
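As an illustration of EXP based classification, the sketch below
extracts the EXP bits from a 32-bit MPLS label stack entry (20-bit
label, 3-bit EXP, 1-bit bottom of stack, 8-bit TTL) and maps them to
a queue. The EXP-to-queue mapping and the queue names are
hypothetical and would be set by operator policy.

```python
def parse_label_stack_entry(entry):
    """Split a 32-bit MPLS label stack entry into its fields."""
    return {
        "label": (entry >> 12) & 0xFFFFF,       # 20-bit label value
        "exp": (entry >> 9) & 0x7,              # 3-bit EXP field
        "bottom_of_stack": (entry >> 8) & 0x1,  # 1-bit S flag
        "ttl": entry & 0xFF,                    # 8-bit TTL
    }


# Hypothetical policy: EXP 5 is used for multicast traffic.
EXP_TO_QUEUE = {5: "multicast"}


def classify(entry):
    """Return the queue name for a packet based on its EXP bits."""
    exp = parse_label_stack_entry(entry)["exp"]
    return EXP_TO_QUEUE.get(exp, "best-effort")


# Build an example entry: label 300001, EXP 5, bottom of stack, TTL 64.
entry = (300001 << 12) | (5 << 9) | (1 << 8) | 64
print(parse_label_stack_entry(entry), classify(entry))
```

The queue selected here for multicast would be the one configured
without WRED/RED and with the buffering described above.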
11. MTU
There are no special requirements for handling Label Switched mVPN
traffic. However, it is recommended to set the largest MTU value
supported across the entire network.
12. IANA Considerations
This document introduces no new IANA Considerations.
13. Security Considerations
This document introduces no new Security Considerations.
14. Acknowledgements
15. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[MVPN] E. Rosen, R. Aggarwal [Editors], "Multicast in MPLS/BGP IP
VPNs", draft-ietf-l3vpn-2547bis-mcast, work in progress.
[MVPN-BGP] R. Aggarwal, E. Rosen, T. Morin, Y. Rekhter, "BGP
Encodings for Multicast in MPLS/BGP IP VPNs",
draft-ietf-l3vpn-2547bis-mcast-bgp, work in progress.
16. Non-normative References
17. Author Information
Joseph
Juniper Networks, Inc.
e-mail: vjoseph@juniper.net