Intarea Working Group                                          R. Bonica
Internet-Draft                                          Juniper Networks
Intended status: Best Current Practice                      C. Pignataro
Expires: December 23, 2013                                 Cisco Systems
                                                           June 21, 2013


    A Fragmentation Strategy for Generic Routing Encapsulation (GRE)
                    draft-bonica-intarea-gre-mtu-01

Abstract

   This memo documents a GRE fragmentation strategy that has been
   implemented by many vendors and deployed in many networks.  It was
   written so that a) implementors will be aware of best common practice
   and b) those who rely on GRE will understand how implementations
   work.  The scope of this memo is limited to point-to-point GRE
   tunnels.  All other tunnel types are beyond the scope of this memo.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on December 23, 2013.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors.  All rights reserved.





Bonica & Pignataro      Expires December 23, 2013               [Page 1]


Internet-Draft              GRE Fragmentation                  June 2013


   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  How To Use This Document
     1.2.  Terminology
   2.  Candidate Strategies and Strategic Overview
     2.1.  Candidate Strategies
     2.2.  Strategic Overview
   3.  Generic Requirements for GRE Ingress Routers
     3.1.  General
     3.2.  Tunnel MTU (TMTU) Estimation and Discovery
   4.  Procedures Affecting the GRE Delivery Header
     4.1.  Tunneling GRE Over IPv4
     4.2.  Tunneling GRE Over IPv6
   5.  Procedures Affecting the GRE Payload
     5.1.  IPv4 Payloads
     5.2.  IPv6 Payloads
     5.3.  MPLS Payloads
   6.  IANA Considerations
   7.  Security Considerations
   8.  Acknowledgements
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Authors' Addresses

1.  Introduction

   Generic Routing Encapsulation (GRE) [RFC2784] can be used to carry
   any network layer protocol over any network layer protocol.  GRE has
   been implemented by many vendors and is widely deployed on the
   Internet.

   [RFC2784], by design, does not describe procedures that affect
   fragmentation.  Lacking guidance from the specification, vendors have
   developed implementation-specific fragmentation strategies.  For the
   most part, devices implementing one fragmentation strategy can
   interoperate with devices that implement another fragmentation
   strategy.  Operational experience has demonstrated the relative
   merits of each strategy.  Section 3 of [RFC4459] describes four
   fragmentation strategies and evaluates the relative merits of each.

   This memo documents a GRE fragmentation strategy that has been
   implemented by many vendors and deployed in many networks.  It was
   written so that a) implementors will be aware of best common practice
   and b) those who rely on GRE will understand how implementations
   work.  The scope of this memo is limited to point-to-point GRE
   tunnels.  All other tunnel types are beyond the scope of this memo.

   This memo specifies requirements beyond those stated in [RFC2784].
   However, it does not update [RFC2784].  Therefore, a GRE
   implementation can be compliant with [RFC2784] without satisfying the
   requirements of this memo.

1.1.  How To Use This Document

   This memo is presented in sections.  Section 2 reviews four
   fragmentation strategies presented in [RFC4459] and provides an
   overview of the strategy described herein.

   Section 3 defines generic requirements for GRE ingress routers.
   These include compliance with the specifications of [RFC2784] and
   Tunnel MTU Estimation and Discovery.

   Section 4 defines procedures affecting generation of the GRE delivery
   header.  It is divided into two subsections.  Section 4.1 is
   applicable when GRE is delivered over IPv4 [RFC0791] and Section 4.2
   is applicable when GRE is delivered over IPv6 [RFC2460].

   Section 5 defines procedures for handling payloads that are so large
   that they cannot be forwarded through the GRE tunnel without
   fragmentation.  Section 5.1 is applicable when the payload is IPv4,
   Section 5.2 is applicable when the payload is IPv6 and Section 5.3 is
   applicable when the payload is MPLS.

   Section 6 discusses IANA considerations and Section 7 discusses
   security considerations.

1.2.  Terminology

   The following terms are specific to GRE and are taken from [RFC2784]:

   o  GRE delivery header - an IPv4 or IPv6 header whose source address
      is that of the GRE ingress and whose destination address is that
      of the GRE egress.  The GRE delivery header encapsulates a GRE
      header.



   o  GRE header - the GRE protocol header.  The GRE header is
      encapsulated in the GRE delivery header and encapsulates GRE
      payload.

   o  GRE payload - a network layer packet that is encapsulated by the
      GRE header.  The GRE payload can be IPv4, IPv6 or MPLS.
      Procedures for encapsulating IPv4 and IPv6 in GRE are described in
      [RFC2784].  Procedures for encapsulating MPLS in GRE are described
      in [RFC4023].  While other protocols may be delivered over GRE,
      they are beyond the scope of this document.

   o  GRE payload header - the IPv4, IPv6 or MPLS header of the GRE
      payload

   o  GRE overhead - the combined size of the GRE delivery header and
      the GRE header, measured in octets

   The following terms are specific to MTU discovery:

   o  link MTU (LMTU) - the maximum transmission unit, i.e., maximum
      packet size in octets, that can be conveyed over a link.  LMTU is
      a unidirectional metric.  A bidirectional link may be
      characterized by one LMTU in the forward direction and another
      LMTU in the reverse direction.

   o  path MTU (PMTU) - the minimum LMTU of all the links in a path
      between a source node and a destination node.  If the source and
      destination node are connected through an equal cost multipath
      (ECMP), the PMTU is equal to the minimum LMTU of all links
      contributing to the multipath.

   o  tunnel MTU (TMTU) - the maximum transmission unit, i.e., maximum
      packet size in octets, that can be conveyed over a GRE tunnel
      without fragmentation.  The TMTU is equal to the PMTU associated
      with the path between the tunnel ingress and the tunnel egress,
      minus the GRE overhead.

   o  Path MTU Discovery (PMTUD) - A procedure for dynamically
      discovering the PMTU between two nodes on the Internet.  PMTUD
      procedures rely on a router's ability to deliver ICMP feedback to
      the host that originated a packet.  PMTUD procedures for IPv4 are
      defined in [RFC1191].  PMTUD procedures for IPv6 are defined in
      [RFC1981].

   o  Packetization Layer MTU Discovery (PLMTUD) - An extension of PMTUD
      that is designed to operate correctly in the absence of ICMP
      feedback from a router to the host that originated a packet.
      PLMTUD procedures are defined in [RFC4821].
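
   As a non-normative illustration, the relationship among LMTU, PMTU,
   GRE overhead and TMTU defined above can be sketched as follows.  The
   function names, the example link MTUs and the header sizes (an IPv4
   header without options, a fixed IPv6 header without extension
   headers, and a GRE header without optional fields) are assumptions
   made for this sketch only.

```python
# Illustrative header sizes in octets (assumptions for this sketch).
IPV4_HEADER = 20   # IPv4 header without options
IPV6_HEADER = 40   # fixed IPv6 header, no extension headers
GRE_HEADER = 4     # GRE header without optional fields


def path_mtu(link_mtus):
    """PMTU: the minimum LMTU over every link in the path (or multipath)."""
    return min(link_mtus)


def gre_overhead(delivery_header, gre_header=GRE_HEADER):
    """GRE overhead: delivery header plus GRE header, in octets."""
    return delivery_header + gre_header


def tunnel_mtu(link_mtus, delivery_header):
    """TMTU: PMTU between ingress and egress, minus the GRE overhead."""
    return path_mtu(link_mtus) - gre_overhead(delivery_header)


# Hypothetical LMTUs in the ingress-to-egress direction.
lmtus = [1500, 1480, 9000]
print(tunnel_mtu(lmtus, IPV4_HEADER))  # 1480 - 24 = 1456
print(tunnel_mtu(lmtus, IPV6_HEADER))  # 1480 - 44 = 1436
```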



   The following terms are introduced by this memo:

   o  fragmentable packet - all IPv4 packets with DF-bit equal to 0

   o  non-fragmentable packet - all IPv4 packets with DF-bit equal to 1.
      Also, for the purposes of this document, all IPv6 packets are
      considered to be non-fragmentable.
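
   The two definitions above amount to a test of the IP version and,
   for IPv4, of the DF-bit.  A minimal sketch follows, assuming the
   packet is available as raw octets beginning with the IP header; the
   function name is hypothetical.

```python
import struct

DF_MASK = 0x4000  # Don't Fragment flag in the 16-bit flags/offset field


def is_fragmentable(packet: bytes) -> bool:
    """Return True only for an IPv4 packet whose DF-bit equals 0."""
    version = packet[0] >> 4
    if version == 6:
        return False  # all IPv6 packets are treated as non-fragmentable
    if version != 4:
        raise ValueError("not an IPv4 or IPv6 packet")
    # The flags and fragment offset occupy octets 6 and 7 of the header.
    (flags_and_offset,) = struct.unpack_from("!H", packet, 6)
    return (flags_and_offset & DF_MASK) == 0
```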

2.  Candidate Strategies and Strategic Overview

2.1.  Candidate Strategies

   Section 3 of [RFC4459] identifies the following tunnel fragmentation
   strategies:

   1.  Fragmentation and Reassembly by the Tunnel Endpoints

   2.  Signalling the Lower MTU to the Sources

   3.  Encapsulate Only When There is Free MTU

   4.  Fragmentation of the Inner Packet

   In Strategy 1, the tunnel ingress router encapsulates the entire
   payload, without fragmentation, into a single GRE-delivery packet.
   It then forwards the GRE-delivery packet in the direction of the
   tunnel egress.  If the GRE-delivery packet exceeds the LMTU of any
   link along the path to the tunnel egress, the router directly
   upstream of that link fragments it.  The tunnel egress router
   reassembles the GRE-delivery packet, de-encapsulates its payload, and
   processes the payload appropriately.

   In Strategy 2, the tunnel ingress router performs PMTUD procedures or
   some variant thereof (e.g., PLMTUD).  When the tunnel ingress router
   receives a non-fragmentable IPv4 packet so large that it cannot be
   forwarded through the tunnel, it discards the packet and sends an
   ICMPv4 [RFC0792] Destination Unreachable message to the packet
   source, with code equal to 4 (fragmentation needed and DF set).  The
   ICMP Destination Unreachable message contains a Next-hop MTU (as
   specified by [RFC1191]) and the next-hop MTU is equal to the TMTU
   associated with the tunnel.  If the ICMPv4 message reaches the packet
   source, and if the packet source executes PMTUD procedures, the
   packet source adjusts its PMTU for the packet destination and emits
   subsequent packets with size less than or equal to the TMTU.

   In Strategy 3, the network is engineered so that all network ingress
   links have LMTU less than the TMTU of any tunnel contained by the
   network.  In this case, all packets entering the network are small
   enough to be forwarded through any tunnel contained by the network,
   without fragmentation.  The entire issue is thus avoided.

   In Strategy 4, the tunnel ingress router performs PMTUD procedures or
   some variant thereof (e.g., PLMTUD).  When the tunnel ingress router
   receives a fragmentable IPv4 packet so large that it cannot be
   forwarded through the tunnel without fragmentation, it fragments the
   payload and encapsulates each payload fragment into a complete,
   separate GRE-delivery packet.  It forwards those complete packets to
   the tunnel egress router, which de-encapsulates them and forwards
   each payload fragment, individually and without reassembly, to the
   payload destination.  The payload destination reassembles the packet.

   Strategy 3 is attractive because it avoids fragmentation.  However,
   networks cannot always be designed to meet the requirements of
   Strategy 3.  When this is the case, Strategies 1, 2 and 4 become
   applicable.

   Strategy 2 is also attractive, because it avoids fragmentation.
   However, Strategy 2 requires the payload source and the tunnel
   ingress to execute PMTUD procedures.  PMTUD procedures require ICMP
   feedback from downstream routers and fail when the network blocks
   required ICMP messages.  Therefore, Strategy 2 can cause blackholing
   in networks that block ICMP.

   Strategy 1 is an attractive alternative to Strategy 2, because it
   does not rely on PMTUD.  However, Strategy 1 may not be feasible in
   many operational environments because it assigns the task of
   reassembly to the tunnel egress router.  When the tunnel supports
   high data rates, reassembly at the tunnel egress is not cost-
   effective.

   Strategy 4 moves the task of packet reassembly from the tunnel egress
   to the payload destination.  However, it is applicable only when the
   payload is fragmentable.  Furthermore, it requires the tunnel ingress
   router to perform PMTUD procedures and fails when the network blocks
   ICMP messages from the tunnel interior to the tunnel ingress.

2.2.  Strategic Overview

   The fragmentation strategy described herein has two modes of
   operation.  The default mode resembles Strategies 2 and 4, above.
   When a GRE ingress router runs in the default mode, and it receives a
   non-fragmentable packet that is too large to forward through the
   tunnel, it behaves as described in Strategy 2, above.  When it
   receives a fragmentable packet that is too large to forward through
   the tunnel, it behaves as described in Strategy 4, above.  In neither
   case will the GRE ingress router fragment the GRE-delivery packet.



   When GRE is delivered over IPv4, the DF-bit on the delivery header is
   always set to 1 (Don't Fragment).

   Default mode operation is desirable when the following conditions are
   true:

   o  the payload source supports PMTUD procedures

   o  the tunnel ingress supports PMTUD procedures

   o  the network does not block ICMP messages required by PMTUD

   Realizing that some devices do not support PMTUD and that some
   networks indiscriminately block ICMP messages, the fragmentation
   strategy described herein includes a non-default mode, which
   incorporates some characteristics of Strategy 1, above.

   When a GRE ingress router runs in the non-default mode, and it
   receives a non-fragmentable packet that is too large to forward
   through the tunnel, it behaves as described in Strategy 2, above.
   When it receives a fragmentable packet that is too large to
   forward through the tunnel, it behaves as described in Strategy 4,
   above.  In neither case will the GRE ingress router fragment the GRE-
   delivery packet.  In this respect, the default and non-default modes
   are identical to one another.

   However, in the non-default mode, if the ingress router delivers a
   fragmentable payload over IPv4, it copies the DF-bit value from the
   payload header to the delivery header.  Therefore, the GRE delivery
   packet may be fragmented by any router between the GRE ingress and
   egress.  When this occurs, the GRE delivery packet is reassembled by
   the GRE egress.

   The non-default mode of operation is desirable in some scenarios
   where networks block ICMP messages required by PMTUD.

3.  Generic Requirements for GRE Ingress Routers

   This section defines procedures that all GRE ingress routers must
   execute.

3.1.  General

   Implementations MUST satisfy all of the requirements stated in
   [RFC2784].

3.2.  Tunnel MTU (TMTU) Estimation and Discovery




   Implementations MUST maintain a running TMTU estimate.  The TMTU
   associated with a tunnel MUST NOT, at any time, be greater than the
   LMTU associated with the next-hop towards the tunnel egress minus the
   GRE overhead.

   Implementations SHOULD execute either PMTUD or PLMTUD procedures to
   further refine their TMTU estimate.  If they do so, they MUST set the
   TMTU to a value that is less than or equal to the discovered PMTU
   minus the GRE overhead.

   However, if an implementation supports PMTUD or PLMTUD for GRE
   tunnels, it MUST include a configuration option that disables those
   procedures.  This configuration option may be required to mitigate
   certain denial of service attacks (see Section 7).  When PMTUD is
   disabled, the TMTU MUST be set to a value that is less than or equal
   to the LMTU associated with the next-hop towards tunnel egress, minus
   the GRE overhead.

   The ingress router's TMTU estimate will not always reflect the actual
   TMTU.  It is only an estimate.  When the TMTU associated with a
   tunnel changes, the tunnel ingress router will not discover that
   change immediately.  Likewise, if the ingress router performs PMTUD
   procedures and tunnel interior routers cannot deliver ICMP feedback
   to the tunnel ingress, TMTU estimates may be inaccurate.
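
   The requirements of this section can be illustrated with a small
   state holder for a single tunnel.  This is a sketch under stated
   assumptions, not a reference implementation: the class and method
   names are hypothetical, and the discovered PMTU is assumed to be
   supplied by whatever PMTUD or PLMTUD machinery the router runs.

```python
class TmtuEstimator:
    """Running TMTU estimate for one GRE tunnel (hypothetical names)."""

    def __init__(self, next_hop_lmtu, gre_overhead, pmtud_enabled=True):
        self.gre_overhead = gre_overhead
        self.pmtud_enabled = pmtud_enabled
        # Upper bound: next-hop LMTU toward the egress minus GRE overhead.
        self.ceiling = next_hop_lmtu - gre_overhead
        self.tmtu = self.ceiling

    def on_pmtu_discovered(self, pmtu):
        """Refine the estimate from a PMTUD or PLMTUD result."""
        if not self.pmtud_enabled:
            return self.tmtu  # discovery disabled by configuration
        # Never exceed the ceiling, whatever the discovery reports.
        self.tmtu = min(self.ceiling, pmtu - self.gre_overhead)
        return self.tmtu


est = TmtuEstimator(next_hop_lmtu=1500, gre_overhead=24)
print(est.tmtu)                      # 1476
print(est.on_pmtu_discovered(1400))  # 1376
```

   Because the estimate is capped at the ceiling, a discovered PMTU
   larger than the next-hop LMTU cannot raise the TMTU above the bound
   required by the first paragraph of this section.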

4.  Procedures Affecting the GRE Delivery Header

   This section defines procedures that GRE ingress routers execute
   while generating the GRE delivery header.

4.1.  Tunneling GRE Over IPv4

   By default, the GRE ingress router MUST set the DF-bit in the
   delivery header to 1 (Don't Fragment).  Also, by default, the GRE
   ingress router MUST NOT emit a delivery header with MF-bit equal to 1
   (More Fragments) or Offset greater than 0.

   However, the GRE ingress router MUST support a configuration option
   that invokes the following behavior:

   o  when the GRE payload is IPv6, the DF-bit on the delivery header is
      set to 1 (Don't Fragment)

   o  when the GRE payload is IPv4, the DF-bit value is copied from the
      payload header to the delivery header

   When the DF-bit on the delivery header is set to 0, the GRE delivery
   packet may be fragmented by any router between the GRE ingress and
   egress and the GRE delivery packet will be reassembled by the GRE
   egress.
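
   The DF-bit selection described in this section reduces to a single
   decision.  The sketch below is illustrative only; the function and
   parameter names are hypothetical, and treating every payload other
   than IPv4 (including MPLS, which this section does not mention) like
   the IPv6 case is an assumption of the sketch.

```python
def delivery_df_bit(payload_kind, payload_df=1, copy_df_option=False):
    """Return the DF-bit (0 or 1) for an IPv4 GRE delivery header.

    payload_kind   -- "ipv4", "ipv6" or another payload type
    payload_df     -- DF-bit of the payload header (IPv4 payloads only)
    copy_df_option -- True when the non-default option is configured
    """
    if copy_df_option and payload_kind == "ipv4":
        return payload_df  # copy the DF-bit from the payload header
    return 1               # default: Don't Fragment


print(delivery_df_bit("ipv4", payload_df=0))                       # 1
print(delivery_df_bit("ipv4", payload_df=0, copy_df_option=True))  # 0
```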

4.2.  Tunneling GRE Over IPv6

   The GRE ingress router MUST NOT emit a delivery header containing a
   fragment header.

5.  Procedures Affecting the GRE Payload

   This section defines procedures that GRE ingress routers execute when
   they receive a packet a) whose next-hop is a GRE tunnel and b) whose
   size is greater than the TMTU associated with that tunnel.

5.1.  IPv4 Payloads

   If the payload is non-fragmentable, the GRE ingress router MUST
   discard the packet and send an ICMPv4 Destination Unreachable message
   to the payload source, with code equal to 4 (fragmentation needed and
   DF set).  The ICMP Destination Unreachable message MUST contain a
   Next-hop MTU (as specified by [RFC1191]) and the next-hop MTU MUST be
   equal to the TMTU associated with the tunnel.
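
   For illustration, the ICMPv4 message described above might be
   assembled as sketched below.  In ICMPv4, Destination Unreachable is
   type 3 and "fragmentation needed and DF set" is code 4; [RFC1191]
   places the Next-hop MTU in the low-order 16 bits of the otherwise
   unused header word.  The function names are hypothetical, and the
   sketch builds only the ICMP message, not the IPv4 header that
   carries it.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit words, as used by ICMPv4."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def frag_needed_message(original_packet: bytes, tmtu: int) -> bytes:
    """Build an ICMPv4 Destination Unreachable (code 4) message."""
    ihl = (original_packet[0] & 0x0F) * 4
    # Quote the original IP header plus the first 8 octets of its data.
    quoted = original_packet[:ihl + 8]
    # Layout: type, code, checksum, unused (16 bits), Next-hop MTU.
    header = struct.pack("!BBHHH", 3, 4, 0, 0, tmtu)
    checksum = internet_checksum(header + quoted)
    return struct.pack("!BBHHH", 3, 4, checksum, 0, tmtu) + quoted
```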

   If the payload is fragmentable, the GRE ingress router MUST fragment
   the payload and submit each fragment to the GRE tunnel.  Therefore,
   the GRE egress router will receive complete, non-fragmented packets,
   containing fragmented payloads.  The GRE egress router will forward
   the payload fragments to their ultimate destination where they will
   be reassembled.
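
   The payload fragmentation step can be sketched as follows.  This is
   a simplified illustration, assuming an IPv4 payload header without
   options; the function names are hypothetical.  Each returned
   fragment is a complete IPv4 packet no larger than the TMTU, ready to
   be encapsulated in its own GRE-delivery packet.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit words (IPv4 header checksum)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def fragment_ipv4(packet: bytes, tmtu: int) -> list:
    """Split a fragmentable IPv4 packet into fragments of size <= tmtu."""
    ihl = (packet[0] & 0x0F) * 4
    if ihl != 20:
        raise ValueError("sketch handles only option-free IPv4 headers")
    (flags_off,) = struct.unpack_from("!H", packet, 6)
    if flags_off & 0x4000:
        raise ValueError("DF-bit is set; packet is non-fragmentable")
    (total_len,) = struct.unpack_from("!H", packet, 2)
    data = packet[ihl:total_len]
    chunk = (tmtu - ihl) // 8 * 8        # offsets count 8-octet units
    if chunk <= 0:
        raise ValueError("TMTU too small to carry any payload data")
    base_offset = flags_off & 0x1FFF     # non-zero if already a fragment
    original_mf = flags_off & 0x2000
    fragments = []
    for start in range(0, len(data), chunk):
        piece = data[start:start + chunk]
        last = start + chunk >= len(data)
        mf = original_mf if last else 0x2000
        header = bytearray(packet[:ihl])
        struct.pack_into("!H", header, 2, ihl + len(piece))
        struct.pack_into("!H", header, 6, mf | (base_offset + start // 8))
        struct.pack_into("!H", header, 10, 0)   # zero before checksumming
        struct.pack_into("!H", header, 10, internet_checksum(bytes(header)))
        fragments.append(bytes(header) + piece)
    return fragments
```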

5.2.  IPv6 Payloads

   The GRE ingress router MUST discard the packet and send an ICMPv6
   [RFC4443] Packet Too Big message to the payload source.  The MTU
   specified in the Packet Too Big message MUST be equal to the TMTU
   associated with the tunnel.
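
   For illustration, the Packet Too Big message might be assembled as
   sketched below.  Per [RFC4443], the message has type 2 and code 0,
   carries a 32-bit MTU field, quotes as much of the invoking packet as
   fits without the ICMPv6 packet exceeding the minimum IPv6 MTU of
   1280 octets, and is checksummed together with an IPv6 pseudo-header.
   The function names and the example addresses used with it are
   hypothetical; the sketch builds only the ICMPv6 message and assumes
   it is carried directly after a fixed 40-octet IPv6 header.

```python
import ipaddress
import struct

IPV6_MIN_MTU = 1280
IPV6_HEADER = 40
ICMPV6 = 58  # Next Header value for ICMPv6


def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def packet_too_big(src: str, dst: str, invoking: bytes, tmtu: int) -> bytes:
    """Build an ICMPv6 Packet Too Big message.

    src -- address of the GRE ingress router (sender of the message)
    dst -- address of the payload source (receiver of the message)
    """
    room = IPV6_MIN_MTU - IPV6_HEADER - 8   # space left for the quote
    quoted = invoking[:room]
    body = struct.pack("!BBHI", 2, 0, 0, tmtu) + quoted
    # Pseudo-header: source, destination, upper-layer length, 3 zero
    # octets, and the Next Header value.
    pseudo = (ipaddress.IPv6Address(src).packed
              + ipaddress.IPv6Address(dst).packed
              + struct.pack("!I3xB", len(body), ICMPV6))
    checksum = internet_checksum(pseudo + body)
    return body[:2] + struct.pack("!H", checksum) + body[4:]
```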

5.3.  MPLS Payloads

   The GRE ingress router MUST discard the packet.  As it is impossible
   to reliably identify the payload source, the GRE ingress router MUST
   NOT attempt to send an ICMPv4 Destination Unreachable message or an
   ICMPv6 Packet Too Big message to the payload source.
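
   Taken together, the procedures of Sections 5.1 through 5.3 can be
   summarized as a single decision.  The sketch below is illustrative
   only; the enumeration, the function name and the string labels for
   payload types are hypothetical.

```python
from enum import Enum


class Action(Enum):
    FORWARD = "encapsulate and forward"
    FRAGMENT_PAYLOAD = "fragment payload, then encapsulate each fragment"
    DROP_SEND_ICMPV4 = "discard; send ICMPv4 Destination Unreachable"
    DROP_SEND_ICMPV6 = "discard; send ICMPv6 Packet Too Big"
    DROP_SILENTLY = "discard; send no ICMP message"


def payload_action(kind, size, tmtu, df_bit=0):
    """Choose the ingress behavior for a payload bound for a GRE tunnel."""
    if size <= tmtu:
        return Action.FORWARD            # Section 5 does not apply
    if kind == "ipv4":
        # DF-bit 1: non-fragmentable; DF-bit 0: fragmentable.
        return Action.DROP_SEND_ICMPV4 if df_bit else Action.FRAGMENT_PAYLOAD
    if kind == "ipv6":
        return Action.DROP_SEND_ICMPV6
    if kind == "mpls":
        return Action.DROP_SILENTLY      # payload source is not known
    raise ValueError("payload type is beyond the scope of this memo")
```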

6.  IANA Considerations

   This document makes no request of IANA.




7.  Security Considerations

   PMTU Discovery is vulnerable to two denial of service attacks (see
   Section 8 of [RFC1191] for details).  Both attacks are based upon a
   malicious party sending forged ICMPv4 Destination Unreachable or
   ICMPv6 Packet Too Big messages to a host.  In the first attack, the
   forged message indicates an inordinately small PMTU.  In the second
   attack, the forged message indicates an inordinately large PMTU.  In
   both cases, throughput is adversely affected.  In order to mitigate
   such attacks, GRE implementations MUST include a configuration option
   to disable PMTU discovery on GRE tunnels.  Also, they MAY include a
   configuration option that conditions the behavior of PMTUD to
   establish a minimum PMTU.
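
   One way the two configuration options above might condition the
   handling of an ICMP-reported MTU is sketched below.  This is an
   illustration under stated assumptions, not a prescribed algorithm:
   the function and parameter names are hypothetical, the default floor
   of 68 octets follows the IPv4 lower bound given in [RFC1191], and
   ignoring reports that would raise the estimate reflects the
   [RFC1191] rule that such messages never increase a PMTU estimate.

```python
IPV4_MIN_PMTU = 68  # lower bound on an IPv4 PMTU estimate per RFC 1191


def filter_pmtu_report(current_pmtu, reported_mtu, pmtud_enabled=True,
                       configured_floor=IPV4_MIN_PMTU):
    """Return the PMTU estimate to use after an ICMP report arrives."""
    if not pmtud_enabled:
        return current_pmtu   # discovery disabled: ignore the report
    if reported_mtu >= current_pmtu:
        return current_pmtu   # a report never raises the estimate
    # Clamp inordinately small values to the configured minimum PMTU,
    # without ever exceeding the current estimate.
    return min(current_pmtu, max(reported_mtu, configured_floor))


print(filter_pmtu_report(1500, 1400))                        # 1400
print(filter_pmtu_report(1500, 296, configured_floor=576))   # 576
```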

8.  Acknowledgements

   The authors would like to thank Jagadish Grandhi, Jeff Haas, John
   Scudder, Mike Sullenberger and Wen Zhang for their constructive
   comments.  The authors also express their gratitude to an anonymous
   donor, without whom this document would not have been written.

9.  References

9.1.  Normative References

   [RFC0791]  Postel, J., "Internet Protocol", STD 5, RFC 791, September
              1981.

   [RFC0792]  Postel, J., "Internet Control Message Protocol", STD 5,
              RFC 792, September 1981.

   [RFC1191]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
              November 1990.

   [RFC1981]  McCann, J., Deering, S., and J. Mogul, "Path MTU Discovery
              for IP version 6", RFC 1981, August 1996.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2460]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
              (IPv6) Specification", RFC 2460, December 1998.

   [RFC2784]  Farinacci, D., Li, T., Hanks, S., Meyer, D., and P.
              Traina, "Generic Routing Encapsulation (GRE)", RFC 2784,
              March 2000.





   [RFC4023]  Worster, T., Rekhter, Y., and E. Rosen, "Encapsulating
              MPLS in IP or Generic Routing Encapsulation (GRE)", RFC
              4023, March 2005.

   [RFC4443]  Conta, A., Deering, S., and M. Gupta, "Internet Control
              Message Protocol (ICMPv6) for the Internet Protocol
              Version 6 (IPv6) Specification", RFC 4443, March 2006.

   [RFC4821]  Mathis, M. and J. Heffner, "Packetization Layer Path MTU
              Discovery", RFC 4821, March 2007.

9.2.  Informative References

   [RFC4459]  Savola, P., "MTU and Fragmentation Issues with In-the-
              Network Tunneling", RFC 4459, April 2006.

Authors' Addresses

   Ron Bonica
   Juniper Networks
   2251 Corporate Park Drive
   Herndon, Virginia  20170
   USA

   Email: rbonica@juniper.net


   Carlos Pignataro
   Cisco Systems
   7200-12 Kit Creek Road
   Research Triangle Park, North Carolina  27709
   USA

   Email: cpignata@cisco.com
















