< draft-ietf-intarea-frag-fragile-10.txt   draft-ietf-intarea-frag-fragile-11.txt >
Internet Area WG R. Bonica Internet Area WG R. Bonica
Internet-Draft Juniper Networks Internet-Draft Juniper Networks
Intended status: Best Current Practice F. Baker Intended status: Best Current Practice F. Baker
Expires: November 15, 2019 Unaffiliated Expires: December 16, 2019 Unaffiliated
G. Huston G. Huston
APNIC APNIC
R. Hinden R. Hinden
Check Point Software Check Point Software
O. Troan O. Troan
Cisco Cisco
F. Gont F. Gont
SI6 Networks SI6 Networks
May 14, 2019 June 14, 2019
IP Fragmentation Considered Fragile IP Fragmentation Considered Fragile
draft-ietf-intarea-frag-fragile-10 draft-ietf-intarea-frag-fragile-11
Abstract Abstract
This document describes IP fragmentation and explains how it This document describes IP fragmentation and explains how it
introduces fragility to Internet communication. introduces fragility to Internet communication.
This document also proposes alternatives to IP fragmentation and This document also proposes alternatives to IP fragmentation and
provides recommendations for developers and network operators. provides recommendations for developers and network operators.
Status of This Memo Status of This Memo
skipping to change at page 1, line 43 skipping to change at page 1, line 43
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on November 15, 2019. This Internet-Draft will expire on December 16, 2019.
Copyright Notice Copyright Notice
Copyright (c) 2019 IETF Trust and the persons identified as the Copyright (c) 2019 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of (https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 2, line 46 skipping to change at page 2, line 46
4.7.4. Persistent Loss Caused By Unidirectional Routing . . 14 4.7.4. Persistent Loss Caused By Unidirectional Routing . . 14
4.8. Blackholing Due To Filtering or Loss . . . . . . . . . . 14 4.8. Blackholing Due To Filtering or Loss . . . . . . . . . . 14
5. Alternatives to IP Fragmentation . . . . . . . . . . . . . . 15 5. Alternatives to IP Fragmentation . . . . . . . . . . . . . . 15
5.1. Transport Layer Solutions . . . . . . . . . . . . . . . . 15 5.1. Transport Layer Solutions . . . . . . . . . . . . . . . . 15
5.2. Application Layer Solutions . . . . . . . . . . . . . . . 16 5.2. Application Layer Solutions . . . . . . . . . . . . . . . 16
6. Applications That Rely on IPv6 Fragmentation . . . . . . . . 17 6. Applications That Rely on IPv6 Fragmentation . . . . . . . . 17
6.1. Domain Name Service (DNS) . . . . . . . . . . . . . . . . 17 6.1. Domain Name Service (DNS) . . . . . . . . . . . . . . . . 17
6.2. Open Shortest Path First (OSPF) . . . . . . . . . . . . . 18 6.2. Open Shortest Path First (OSPF) . . . . . . . . . . . . . 18
6.3. Packet-in-Packet Encapsulations . . . . . . . . . . . . . 18 6.3. Packet-in-Packet Encapsulations . . . . . . . . . . . . . 18
6.4. UDP Applications Enhancing Performance . . . . . . . . . 18 6.4. UDP Applications Enhancing Performance . . . . . . . . . 18
7. Recommendations . . . . . . . . . . . . . . . . . . . . . . . 18 7. Recommendations . . . . . . . . . . . . . . . . . . . . . . . 19
7.1. For Application and Protocol Developers . . . . . . . . . 18 7.1. For Application and Protocol Developers . . . . . . . . . 19
7.2. For System Developers . . . . . . . . . . . . . . . . . . 19 7.2. For System Developers . . . . . . . . . . . . . . . . . . 19
7.3. For Middle Box Developers . . . . . . . . . . . . . . . . 19 7.3. For Middle Box Developers . . . . . . . . . . . . . . . . 19
7.4. For ECMP, LAG and Load-Balancer Developers And Operators 19 7.4. For ECMP, LAG and Load-Balancer Developers And Operators 20
7.5. For Network Operators . . . . . . . . . . . . . . . . . . 20 7.5. For Network Operators . . . . . . . . . . . . . . . . . . 20
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 20 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 21
9. Security Considerations . . . . . . . . . . . . . . . . . . . 20 9. Security Considerations . . . . . . . . . . . . . . . . . . . 21
10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 20 10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 21
11. References . . . . . . . . . . . . . . . . . . . . . . . . . 21 11. References . . . . . . . . . . . . . . . . . . . . . . . . . 21
11.1. Normative References . . . . . . . . . . . . . . . . . . 21 11.1. Normative References . . . . . . . . . . . . . . . . . . 21
11.2. Informative References . . . . . . . . . . . . . . . . . 22 11.2. Informative References . . . . . . . . . . . . . . . . . 23
Appendix A. Contributors' Address . . . . . . . . . . . . . . . 25 Appendix A. Contributors' Address . . . . . . . . . . . . . . . 26
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 25 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 26
1. Introduction 1. Introduction
Operational experience [Kent] [Huston] [RFC7872] reveals that IP Operational experience [Kent] [Huston] [RFC7872] reveals that IP
fragmentation introduces fragility to Internet communication. This fragmentation introduces fragility to Internet communication. This
document describes IP fragmentation and explains how it introduces document describes IP fragmentation and explains the fragility it
fragility to Internet communication. This document also proposes introduces. It also proposes alternatives to IP fragmentation and
alternatives to IP fragmentation and provides recommendations for provides recommendations for developers and network operators.
developers and network operators.
While this document identifies issues associated with IP While this document identifies issues associated with IP
fragmentation, it does not recommend deprecation. Some applications fragmentation, it does not recommend deprecation. Legacy protocols
(see Section 6) require IP fragmentation. Furthermore, fragmentation that depend upon IP fragmentation SHOULD be updated to break that
is expected to work in domains where security and interoperability dependency. However, some applications and environments (see
issues are addressed. Section 6) require IP fragmentation. In these cases, the protocol
will continue to rely on IP fragmentation, but the designer should
be aware that fragmented packets may result in black holes; a design
should include appropriate safeguards (e.g., PLPMTUD).
Rather than deprecating IP Fragmentation, this document recommends Rather than deprecating IP Fragmentation, this document recommends
that upper-layer protocols address the problem of fragmentation at that upper-layer protocols address the problem of fragmentation at
their layer, reducing their reliance on IP fragmentation to the their layer, reducing their reliance on IP fragmentation to the
greatest degree possible. greatest degree possible.
1.1. IP-in-IP Tunnels 1.1. IP-in-IP Tunnels
This document acknowledges that in some cases, packets must be This document acknowledges that in some cases, packets must be
fragmented within IP-in-IP tunnels [I-D.ietf-intarea-tunnels]. fragmented within IP-in-IP tunnels [I-D.ietf-intarea-tunnels].
Therefore, this document makes no recommendations regarding IP-in-IP Therefore, this document makes no additional recommendations
tunnels. regarding IP-in-IP tunnels.
2. IP Fragmentation 2. IP Fragmentation
2.1. Links, Paths, MTU and PMTU 2.1. Links, Paths, MTU and PMTU
An Internet path connects a source node to a destination node. A An Internet path connects a source node to a destination node. A
path can contain links and routers. If a path contains more than one path can contain links and routers. If a path contains more than one
link, the links are connected in series and a router connects each link, the links are connected in series and a router connects each
link to the next. link to the next.
skipping to change at page 4, line 13 skipping to change at page 4, line 15
routers. routers.
Each link is constrained by the number of bytes that it can convey in Each link is constrained by the number of bytes that it can convey in
a single IP packet. This constraint is called the link Maximum a single IP packet. This constraint is called the link Maximum
Transmission Unit (MTU). IPv4 [RFC0791] requires every link to Transmission Unit (MTU). IPv4 [RFC0791] requires every link to
support a specified MTU (see NOTE 1). IPv6 [RFC8200] requires every support a specified MTU (see NOTE 1). IPv6 [RFC8200] requires every
link to support an MTU of 1280 bytes or greater. These are called link to support an MTU of 1280 bytes or greater. These are called
the IPv4 and IPv6 minimum link MTU's. the IPv4 and IPv6 minimum link MTU's.
Likewise, each Internet path is constrained by the number of bytes Likewise, each Internet path is constrained by the number of bytes
that it can convey in a IP single packet. This constraint is called that it can convey in a single IP packet. This constraint is called
the Path MTU (PMTU). For any given path, the PMTU is equal to the the Path MTU (PMTU). For any given path, the PMTU is equal to the
smallest of its link MTU's. Because Internet paths are dynamic, PMTU smallest of its link MTU's. Because Internet paths are dynamic, PMTU
is also dynamic. is also dynamic.
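The relationship between link MTUs and the PMTU can be illustrated with a minimal sketch. The function name and the example path below are hypothetical and are not taken from either draft revision.

```python
def path_mtu(link_mtus):
    """Return the PMTU of a path: the smallest MTU among its links."""
    if not link_mtus:
        raise ValueError("a path needs at least one link")
    return min(link_mtus)

# Hypothetical path: Ethernet, a tunnel with a reduced MTU, Ethernet.
print(path_mtu([1500, 1400, 1500]))
```

If routing changes and the path later crosses a different set of links, the same computation yields a different result, which is why the PMTU is dynamic.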
For reasons described below, source nodes estimate the PMTU between For reasons described below, source nodes estimate the PMTU between
themselves and destination nodes. A source node can produce themselves and destination nodes. A source node can produce
extremely conservative PMTU estimates in which: extremely conservative PMTU estimates in which:
o The estimate for each IPv4 path is equal to the IPv4 minimum link o The estimate for each IPv4 path is equal to the IPv4 minimum link
MTU. MTU.
skipping to change at page 4, line 52 skipping to change at page 5, line 6
of these packets is larger than the actual PMTU, a downstream router of these packets is larger than the actual PMTU, a downstream router
will not be able to forward the packet through the next link along will not be able to forward the packet through the next link along
the path. Therefore, the downstream router drops the packet and the path. Therefore, the downstream router drops the packet and
sends an Internet Control Message Protocol (ICMP) [RFC0792] [RFC4443] sends an Internet Control Message Protocol (ICMP) [RFC0792] [RFC4443]
Packet Too Big (PTB) message to the source node (see NOTE 3). The Packet Too Big (PTB) message to the source node (see NOTE 3). The
ICMP PTB message indicates the MTU of the link through which the ICMP PTB message indicates the MTU of the link through which the
packet could not be forwarded. The source node uses this information packet could not be forwarded. The source node uses this information
to refine its PMTU estimate. to refine its PMTU estimate.
PMTUD produces a running estimate of the PMTU between a source node PMTUD produces a running estimate of the PMTU between a source node
and a destination node. Because PMTU is dynamic, at any given time, and a destination node. Because PMTU is dynamic, the PMTU estimate
the PMTU estimate can differ from the actual PMTU. In order to can be larger than the actual PMTU. In order to detect PMTU
detect PMTU increases, PMTUD occasionally resets the PMTU estimate to increases, PMTUD occasionally resets the PMTU estimate to its initial
its initial value and repeats the procedure described above. value and repeats the procedure described above.
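The PMTUD refinement loop described above can be sketched as a small simulation. This is an illustration only: it assumes the initial estimate is the MTU of the first link, models the path as a hypothetical list of link MTUs, and assumes every ICMP PTB message is delivered. The names send and pmtud are invented for this example.

```python
def send(packet_size, link_mtus):
    """Simulate forwarding a non-fragmentable packet along a path.

    Returns None if the packet is delivered. Otherwise returns the MTU
    carried in the ICMP PTB message from the first router that cannot
    forward the packet through its next link.
    """
    for mtu in link_mtus:
        if packet_size > mtu:
            return mtu
    return None

def pmtud(link_mtus):
    """Refine a PMTU estimate until a packet of that size is delivered."""
    estimate = link_mtus[0]  # assumption: start from the first-link MTU
    while True:
        ptb_mtu = send(estimate, link_mtus)
        if ptb_mtu is None:
            return estimate
        estimate = ptb_mtu  # refine the estimate from the PTB message

print(pmtud([1500, 1400, 1280]))
```

The loop terminates because each PTB message reports an MTU strictly smaller than the packet that triggered it. If send never returned a PTB message for a dropped packet, the loop would have nothing to refine, which corresponds to the failure cases listed next.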
Ideally, PMTUD operates as described above. However, in some Ideally, PMTUD operates as described above. However, in some
scenarios, PMTUD fails. For example: scenarios, PMTUD fails. For example:
o PMTUD relies on the network's ability to deliver ICMP PTB messages o PMTUD relies on the network's ability to deliver ICMP PTB messages
to the source node. If the network cannot deliver ICMP PTB to the source node. If the network cannot deliver ICMP PTB
messages to the source node, PMTUD fails. messages to the source node, PMTUD fails.
o PMTUD is susceptible to attack because ICMP messages are easily o PMTUD is susceptible to attack because ICMP messages are easily
forged [RFC5927] and not authenticated by the receiver. Such forged [RFC5927] and not authenticated by the receiver. Such
attacks can cause PMTUD to produce unnecessarily conservative PMTU attacks can cause PMTUD to produce unnecessarily conservative PMTU
estimates. estimates.
NOTE 1: In IPv4, every host must be capable of receiving a packet NOTE 1: In IPv4, every host must be capable of receiving a packet
whose length is equal to 576 bytes. However, the IPv4 minimum link whose length is equal to 576 bytes. However, the IPv4 minimum link
MTU is not 576. Section 3.2 of RFC 791 explicitly states that the MTU is not 576. Section 3.2 of RFC 791 explicitly states that the
IPv4 minimum link MTU is 68 bytes. But for practical purposes, many IPv4 minimum link MTU is 68 bytes. But for practical purposes, many
network operators consider the IPv4 minimum link MTU to be 576 bytes. network operators consider the IPv4 minimum link MTU to be 576 bytes,
So, for the purposes of this document, we assume that the IPv4 to minimize the requirement for fragmentation en route. So, for the
minimum path MTU is 576 bytes. purposes of this document, we assume that the IPv4 minimum path MTU
is 576 bytes.
NOTE 2: A non-fragmentable packet can be fragmented at its source. NOTE 2: A non-fragmentable packet can be fragmented at its source.
However, it cannot be fragmented by a downstream node. An IPv4 However, it cannot be fragmented by a downstream node. An IPv4
packet whose DF-bit is set to zero is fragmentable. An IPv4 packet packet whose DF-bit is set to zero is fragmentable. An IPv4 packet
whose DF-bit is set to one is non-fragmentable. All IPv6 packets are whose DF-bit is set to one is non-fragmentable. All IPv6 packets are
also non-fragmentable. also non-fragmentable.
NOTE 3: The ICMP PTB message has two instantiations. In ICMPv4 NOTE 3: The ICMP PTB message has two instantiations. In ICMPv4
[RFC0792], the ICMP PTB message is Destination Unreachable message [RFC0792], the ICMP PTB message is a Destination Unreachable message
with Code equal to (4) fragmentation needed and DF set. This message with Code equal to (4) fragmentation needed and DF set. This message
was augmented by [RFC1191] to indicate the MTU of the link through was augmented by [RFC1191] to indicate the MTU of the link through
which the packet could not be forwarded. In ICMPv6 [RFC4443], the which the packet could not be forwarded. In ICMPv6 [RFC4443], the
ICMP PTB message is a Packet Too Big Message with Code equal to (0). ICMP PTB message is a Packet Too Big Message with Code equal to (0).
This message also indicates the MTU of the link through which the This message also indicates the MTU of the link through which the
packet could not be forwarded. packet could not be forwarded.
2.2. Fragmentation Procedures 2.2. Fragmentation Procedures
When an upper-layer protocol submits data to the underlying IP When an upper-layer protocol submits data to the underlying IP
skipping to change at page 7, line 13 skipping to change at page 7, line 15
o Access the estimate that PMTUD produced. o Access the estimate that PMTUD produced.
o Execute PMTUD procedures themselves. o Execute PMTUD procedures themselves.
o Execute Packetization Layer PMTUD (PLPMTUD) [RFC4821] o Execute Packetization Layer PMTUD (PLPMTUD) [RFC4821]
[I-D.ietf-tsvwg-datagram-plpmtud] procedures. [I-D.ietf-tsvwg-datagram-plpmtud] procedures.
According to PLPMTUD procedures, the upper-layer protocol maintains a According to PLPMTUD procedures, the upper-layer protocol maintains a
running PMTU estimate. It does so by sending probe packets of running PMTU estimate. It does so by sending probe packets of
various sizes to its upper-layer peer and receiving acknowledgements. various sizes to its upper-layer peer and receiving acknowledgements.
This strategy differs from PMTUD in that it relies of acknowledgement This strategy differs from PMTUD in that it relies on acknowledgement
of received messages, as opposed to ICMP PTB messages concerning of received messages, as opposed to ICMP PTB messages concerning
dropped messages. Therefore, PLPMTUD does not rely on the network's dropped messages. Therefore, PLPMTUD does not rely on the network's
ability to deliver ICMP PTB messages to the source. ability to deliver ICMP PTB messages to the source.
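The acknowledgement-driven probing described above can be sketched as follows. The binary search shown here is an assumption chosen for brevity and is not a search strategy mandated by [RFC4821]. The callback probe_acked is hypothetical and stands in for sending a probe of a given size and waiting for the peer's acknowledgement.

```python
def plpmtud_search(probe_acked, low, high):
    """Return the largest probe size in [low, high] that is acknowledged.

    Assumes a probe of size `low` is already known to succeed.
    """
    while low < high:
        probe = (low + high + 1) // 2
        if probe_acked(probe):
            low = probe          # probe arrived: raise the estimate
        else:
            high = probe - 1     # no acknowledgement: search smaller sizes
    return low

# Hypothetical path whose actual PMTU is 1400 bytes.
print(plpmtud_search(lambda size: size <= 1400, 1280, 1500))
```

No ICMP message appears anywhere in this loop. A lost probe is simply an unacknowledged probe, so filtering of ICMP PTB messages does not affect the result.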
3. Requirements Language 3. Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in BCP "OPTIONAL" in this document are to be interpreted as described in BCP
14 [RFC2119] [RFC8174] when, and only when, they appear in all 14 [RFC2119] [RFC8174] when, and only when, they appear in all
skipping to change at page 9, line 24 skipping to change at page 9, line 24
computationally expensive and because it is prone to attacks computationally expensive and because it is prone to attacks
(Section 4.6). (Section 4.6).
NOTE 1: Virtual reassembly is a procedure in which a device NOTE 1: Virtual reassembly is a procedure in which a device
reassembles a packet, forwards its fragments, and discards the reassembles a packet, forwards its fragments, and discards the
reassembled copy. In A+P and CGN, virtual reassembly is required in reassembled copy. In A+P and CGN, virtual reassembly is required in
order to correctly translate fragment addresses. order to correctly translate fragment addresses.
4.3. Stateless Firewalls 4.3. Stateless Firewalls
IP fragmentation causes problems for stateless firewalls whose rules As discussed in more detail in Section 4.6, IP fragmentation causes
include TCP and UDP ports. Because port information is not available problems for stateless firewalls whose rules include TCP and UDP
in the trailing fragments the firewall is limited to the following ports. Because port information is not available in the trailing
options: fragments the firewall is limited to the following options:
o Accept all trailing fragments, possibly admitting certain classes o Accept all trailing fragments, possibly admitting certain classes
of attack. of attack.
o Block all trailing fragments, possibly blocking legitimate o Block all trailing fragments, possibly blocking legitimate
traffic. traffic.
Neither option is attractive. Neither option is attractive.
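The following sketch shows why a stateless device cannot apply port-based rules to trailing fragments. It classifies an IPv4 header using only the More Fragments flag and the fragment offset field. The helper make_ipv4_header and the addresses it uses are hypothetical and exist only to build test input.

```python
import struct

MORE_FRAGMENTS = 0x2000   # MF flag within the 16-bit flags/offset field
OFFSET_MASK = 0x1FFF      # 13-bit fragment offset, in 8-byte units

def make_ipv4_header(flags_and_offset):
    """Build a minimal 20-byte IPv4 header (hypothetical test helper)."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 28, 1, flags_and_offset, 64, 17, 0,
        bytes([192, 0, 2, 1]), bytes([192, 0, 2, 2]),
    )

def classify(header):
    """Return 'unfragmented', 'first' or 'trailing' for an IPv4 header."""
    (flags_and_offset,) = struct.unpack("!H", header[6:8])
    offset = flags_and_offset & OFFSET_MASK
    if offset > 0:
        return "trailing"    # payload starts mid-datagram: no port fields
    if flags_and_offset & MORE_FRAGMENTS:
        return "first"       # carries the transport header and its ports
    return "unfragmented"

print(classify(make_ipv4_header(185)))
```

For a packet classified as trailing, the bytes after the IP header are a continuation of the transport payload, so a rule that matches on TCP or UDP ports has nothing to inspect.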
4.4. Equal Cost Multipath, Link Aggregate Groups and Stateless Load- 4.4. Equal Cost Multipath, Link Aggregate Groups and Stateless Load-
skipping to change at page 11, line 15 skipping to change at page 11, line 15
These reassembly issues are not easily reproducible in IPv6 because These reassembly issues are not easily reproducible in IPv6 because
the IPv6 identification field is 32 bits long. the IPv6 identification field is 32 bits long.
4.6. Security Vulnerabilities 4.6. Security Vulnerabilities
Security researchers have documented several attacks that exploit IP Security researchers have documented several attacks that exploit IP
fragmentation. The following are examples: fragmentation. The following are examples:
o Overlapping fragment attacks [RFC1858][RFC3128][RFC5722] o Overlapping fragment attacks [RFC1858][RFC3128][RFC5722]
o Resource exhaustion attacks (such as the Rose Attack) o Resource exhaustion attacks (such as the Rose Attack,
http://www.digital.net/~gandalf/Rose_Frag_Attack_Explained.htm)
o Attacks based on predictable fragment identification values o Attacks based on predictable fragment identification values
[RFC7739] [RFC7739]
o Evasion of Network Intrusion Detection Systems (NIDS) [Ptacek1998] o Evasion of Network Intrusion Detection Systems (NIDS) [Ptacek1998]
In the overlapping fragment attack, an attacker constructs a series In the overlapping fragment attack, an attacker constructs a series
of packet fragments. The first fragment contains an IP header, a of packet fragments. The first fragment contains an IP header, a
transport-layer header, and some transport-layer payload. This transport-layer header, and some transport-layer payload. This
fragment complies with local security policy and is allowed to pass fragment complies with local security policy and is allowed to pass
skipping to change at page 12, line 33 skipping to change at page 12, line 34
transient and persistent loss. transient and persistent loss.
Transient loss of ICMP PTB messages can cause transient PMTU black Transient loss of ICMP PTB messages can cause transient PMTU black
holes. When the conditions contributing to transient loss abate, the holes. When the conditions contributing to transient loss abate, the
network regains its ability to deliver ICMP PTB messages and network regains its ability to deliver ICMP PTB messages and
connectivity between the source and destination nodes is restored. connectivity between the source and destination nodes is restored.
Section 4.7.1 of this document describes conditions that lead to Section 4.7.1 of this document describes conditions that lead to
transient loss of ICMP PTB messages. transient loss of ICMP PTB messages.
Persistent loss of ICMP PTB messages can cause persistent black Persistent loss of ICMP PTB messages can cause persistent black
holes. Section 4.7.2 and Section 4.7.3 of this document describe holes. Section 4.7.2, Section 4.7.3, and Section 4.7.4 of this
conditions that lead to persistent loss of ICMP PTB messages. document describe conditions that lead to persistent loss of ICMP PTB
messages.
The problem described in this section is specific to PMTUD. It does The problem described in this section is specific to PMTUD. It does
not occur when the upper-layer protocol obtains its PMTU estimate not occur when the upper-layer protocol obtains its PMTU estimate
from PLPMTUD or from any other source. from PLPMTUD or from any other source.
4.7.1. Transient Loss 4.7.1. Transient Loss
The following factors can contribute to transient loss of ICMP PTB The following factors can contribute to transient loss of ICMP PTB
messages: messages:
skipping to change at page 15, line 39 skipping to change at page 15, line 43
Manual configuration is always applicable. If the MSS is configured Manual configuration is always applicable. If the MSS is configured
to a sufficiently low value, the IP layer will never produce a packet to a sufficiently low value, the IP layer will never produce a packet
whose length is greater than the protocol minimum link MTU. However, whose length is greater than the protocol minimum link MTU. However,
manual configuration prevents TCP from taking advantage of larger manual configuration prevents TCP from taking advantage of larger
link MTU's. link MTU's.
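As an illustration of a sufficiently low MSS, the sketch below derives the MSS from an MTU. It assumes a 20-byte IPv4 header without options, a 40-byte IPv6 header without extension headers, and a 20-byte TCP header without options. The function name is hypothetical.

```python
IP_HEADER_BYTES = {4: 20, 6: 40}  # assumes no IPv4 options or IPv6 extension headers
TCP_HEADER_BYTES = 20             # assumes no TCP options

def mss_for_mtu(mtu, ip_version):
    """Return the largest TCP segment payload that fits in one packet of size mtu."""
    return mtu - IP_HEADER_BYTES[ip_version] - TCP_HEADER_BYTES

# MSS values matching the minimum MTUs this document assumes.
print(mss_for_mtu(576, 4), mss_for_mtu(1280, 6))
```

Configuring these values avoids fragmentation on any path that honours the assumed minimum MTU, at the cost of smaller segments on paths that could carry more.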
Upper-layer protocols can implement PMTUD in order to discover and Upper-layer protocols can implement PMTUD in order to discover and
take advantage of larger path MTUs. However, as mentioned in take advantage of larger path MTUs. However, as mentioned in
Section 2.1, PMTUD relies upon the network to deliver ICMP PTB Section 2.1, PMTUD relies upon the network to deliver ICMP PTB
messages. Therefore, PMTUD is applicable only in environments where messages. Therefore, PMTUD can only provide an estimate of the PMTU
the risk of ICMP PTB loss is acceptable. in environments where the risk of ICMP PTB loss is acceptable (e.g.,
known to not be filtered).
By contrast, PLPMTUD does not rely upon the network's ability to By contrast, PLPMTUD does not rely upon the network's ability to
deliver ICMP PTB messages. It utilises probe messages sent as TCP deliver ICMP PTB messages. It utilises probe messages sent as TCP
segments to determine if the probed PMTU can be successfully used segments to determine whether the probed PMTU can be successfully
across the network path. In PLPMTUD, probing is separated from used across the network path. In PLPMTUD, probing is separated from
congestion control, so that loss of a TCP probe segment does not congestion control, so that loss of a TCP probe segment does not
cause a reduction of the congestion control window. [RFC4821] cause a reduction of the congestion control window. [RFC4821]
defines PLPMTUD procedures for TCP. defines PLPMTUD procedures for TCP.
While TCP will never cause the underlying IP module to emit a packet While TCP will never knowingly cause the underlying IP module to emit
that is larger than the PMTU estimate, it can cause the underlying IP a packet that is larger than the PMTU estimate, it can cause the
module to emit a packet that is larger than the actual PMTU. If this underlying IP module to emit a packet that is larger than the actual
occurs, the packet is dropped, the PMTU estimate is updated, the PMTU. For example, if routing changes and as a result the PMTU
segment is divided into smaller segments and each smaller segment is becomes smaller, TCP will not know until the ICMP PTB message
submitted to the underlying IP module. arrives. If this occurs, the packet is dropped, the PMTU estimate is
updated, the segment is divided into smaller segments and each
smaller segment is submitted to the underlying IP module.
The Datagram Congestion Control Protocol (DCCP) [RFC4340]. the Stream The Datagram Congestion Control Protocol (DCCP) [RFC4340] and the
Control Protocol (SCP) [RFC4960], and the Stream Control Transport Stream Control Transport Protocol (SCTP) [RFC4960] also can be
Protocol (SCTP) [RFC4960] also can be operated in a mode that does operated in a mode that does not require IP fragmentation. They both
not require IP fragmentation. They both accept data from an accept data from an application and divide that data into segments,
application and divide that data into segments, with no segment with no segment exceeding a maximum size. DCCP offers manual
exceeding a maximum size. Both DCCP and SCP offer manual configuration, PMTUD, and PLPMTUD as mechanisms for managing that
configuration, PMTUD and PLPMTUD as mechanisms for managing that maximum size. Datagram protocols can also implement PLPMTUD to
maximum size. [I-D.ietf-tsvwg-datagram-plpmtud] proposes PLPMTUD estimate the PMTU via [I-D.ietf-tsvwg-datagram-plpmtud]. This
procedures for DCCP and SCP. proposes procedures for performing PLPMTUD with UDP, UDP-Options,
SCTP, QUIC and other datagram protocols.
Currently, User Datagram Protocol (UDP) [RFC0768] lacks a fragmentation Currently, User Datagram Protocol (UDP) [RFC0768] lacks a fragmentation
mechanism of its own and relies on IP fragmentation. However, mechanism of its own and relies on IP fragmentation. However,
[I-D.ietf-tsvwg-udp-options] proposes a fragmentation mechanism for [I-D.ietf-tsvwg-udp-options] proposes a fragmentation mechanism for
UDP. UDP.
5.2. Application Layer Solutions 5.2. Application Layer Solutions
[RFC8085] recognizes that IP fragmentation reduces the reliability of [RFC8085] recognizes that IP fragmentation reduces the reliability of
Internet communication. It also recognizes that UDP lacks a Internet communication. It also recognizes that UDP lacks a
skipping to change at page 16, line 47 skipping to change at page 17, line 8
fragmentation." fragmentation."
RFC 8085 continues: RFC 8085 continues:
"Applications that do not follow the recommendation to do PMTU/ "Applications that do not follow the recommendation to do PMTU/
PLPMTUD discovery SHOULD still avoid sending UDP datagrams that would PLPMTUD discovery SHOULD still avoid sending UDP datagrams that would
result in IP packets that exceed the path MTU. Because the actual result in IP packets that exceed the path MTU. Because the actual
path MTU is unknown, such applications SHOULD fall back to sending path MTU is unknown, such applications SHOULD fall back to sending
messages that are shorter than the default effective MTU for sending messages that are shorter than the default effective MTU for sending
(EMTU_S in [RFC1122]). For IPv4, EMTU_S is the smaller of 576 bytes (EMTU_S in [RFC1122]). For IPv4, EMTU_S is the smaller of 576 bytes
and the first-hop MTU. For IPv6, EMTU_S is 1280 bytes. The and the first-hop MTU. For IPv6, EMTU_S is 1280 bytes [RFC8200].
effective PMTU for a directly connected destination (with no routers The effective PMTU for a directly connected destination (with no
on the path) is the configured interface MTU, which could be less routers on the path) is the configured interface MTU, which could be
than the maximum link payload size. Transmission of minimum-sized less than the maximum link payload size. Transmission of minimum-
UDP datagrams is inefficient over paths that support a larger PMTU, sized UDP datagrams is inefficient over paths that support a larger
which is a second reason to implement PMTU discovery." PMTU, which is a second reason to implement PMTU discovery."
RFC 8085 assumes that for IPv4, an EMTU_S of 576 is sufficiently RFC 8085 assumes that for IPv4, an EMTU_S of 576 is sufficiently
small, even though the IPv4 minimum link MTU is 68 bytes. small to be supported by most current Internet
paths, even though the IPv4 minimum link MTU is 68 bytes.
This advice applies equally to application that run directly over IP. This advice applies equally to any application that runs directly
over IP.
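The fallback sizes quoted from RFC 8085 can be turned into a maximum UDP payload with a short calculation. This sketch assumes an IPv4 header without options and an IPv6 header without extension headers. The function names are hypothetical.

```python
IPV4_HEADER_BYTES = 20  # assumes no IPv4 options
IPV6_HEADER_BYTES = 40  # assumes no extension headers
UDP_HEADER_BYTES = 8

def emtu_s(ip_version, first_hop_mtu):
    """Default effective MTU for sending, per the RFC 8085 text quoted above."""
    if ip_version == 4:
        return min(576, first_hop_mtu)
    if ip_version == 6:
        return 1280
    raise ValueError("ip_version must be 4 or 6")

def max_udp_payload(ip_version, first_hop_mtu):
    """Largest UDP payload that keeps the IP packet within EMTU_S."""
    ip_header = IPV4_HEADER_BYTES if ip_version == 4 else IPV6_HEADER_BYTES
    return emtu_s(ip_version, first_hop_mtu) - ip_header - UDP_HEADER_BYTES

print(max_udp_payload(4, 1500), max_udp_payload(6, 1500))
```

An application that cannot perform PMTUD or PLPMTUD could cap its datagrams at these sizes, accepting the inefficiency that the quoted text notes for paths supporting a larger PMTU.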
6. Applications That Rely on IPv6 Fragmentation 6. Applications That Rely on IPv6 Fragmentation
The following applications rely on IPv6 fragmentation: The following applications rely on IPv6 fragmentation:
o DNS [RFC1035] o DNS [RFC1035]
o OSPFv3 [RFC2328][RFC5340] o OSPFv3 [RFC2328][RFC5340]
o Packet-in-packet encapsulations o Packet-in-packet encapsulations
skipping to change at page 18, line 38 skipping to change at page 18, line 48
See [I-D.ietf-intarea-tunnels] for further discussion. See [I-D.ietf-intarea-tunnels] for further discussion.
6.4. UDP Applications Enhancing Performance 6.4. UDP Applications Enhancing Performance
Some UDP applications rely on IP fragmentation to achieve acceptable Some UDP applications rely on IP fragmentation to achieve acceptable
levels of performance. These applications use UDP datagram sizes levels of performance. These applications use UDP datagram sizes
that are larger than the path MTU so that more data can be conveyed that are larger than the path MTU so that more data can be conveyed
between the application and the kernel in a single system call. between the application and the kernel in a single system call.
For example, the Licklider Transmission Protocol (LTP) [RFC5326] To pick one example, the Licklider Transmission Protocol (LTP)
which is in current use on the International Space Station (ISS) uses [RFC5326], which is in current use on the International Space Station
UDP datagram sizes larger than the path MTU to achieve acceptable (ISS), uses UDP datagram sizes larger than the path MTU to achieve
levels of performance even though this invokes IP fragmentation. acceptable levels of performance even though this invokes IP
fragmentation. More generally, SNMP and video applications may
transmit an application-layer quantum of data, depending on the
network layer to fragment and reassemble as needed.
7. Recommendations 7. Recommendations
7.1. For Application and Protocol Developers 7.1. For Application and Protocol Developers
Developers SHOULD NOT develop new protocols or applications that rely Developers SHOULD NOT develop new protocols or applications that rely
on IP fragmentation. When a new protocol or application is deployed on IP fragmentation. When a new protocol or application is deployed
in an environment that does not fully support IP fragmentation, it in an environment that does not fully support IP fragmentation, it
SHOULD operate correctly, either in its default configuration or in a SHOULD operate correctly, either in its default configuration or in a
specified alternative configuration. specified alternative configuration.
skipping to change at page 19, line 23 skipping to change at page 19, line 36
rely on IP fragmentation but should only be used in environments rely on IP fragmentation but should only be used in environments
where IP fragmentation is known to be supported. where IP fragmentation is known to be supported.
Protocols may be able to avoid IP fragmentation by using a Protocols may be able to avoid IP fragmentation by using a
sufficiently small MTU (e.g., the protocol minimum link MTU), sufficiently small MTU (e.g., the protocol minimum link MTU),
disabling IP fragmentation, and ensuring that the transport protocol disabling IP fragmentation, and ensuring that the transport protocol
in use adapts its segment size to the MTU. Other protocols may in use adapts its segment size to the MTU. Other protocols may
deploy a sufficiently reliable PMTU discovery mechanism deploy a sufficiently reliable PMTU discovery mechanism
(e.g., PLPMTUD). (e.g., PLPMTUD).
UDP applications SHOULD abide by the recommendations state in UDP applications SHOULD abide by the recommendations stated in
Section 3.2 of [RFC8085]. Section 3.2 of [RFC8085].
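The "sufficiently small MTU" approach above can be illustrated with a short sketch. It assumes IPv6 with no extension headers and sizes UDP payloads to fit the 1280-byte IPv6 minimum link MTU. The names max_udp_payload and split_into_datagrams are hypothetical and are not taken from this document or from any library; disabling fragmentation on the socket is platform specific and is not shown.

```python
IPV6_MIN_LINK_MTU = 1280  # minimum IPv6 link MTU in bytes (RFC 8200)
IPV6_HEADER_LEN = 40      # fixed IPv6 header; no extension headers assumed
UDP_HEADER_LEN = 8        # UDP header (RFC 768)


def max_udp_payload(mtu=IPV6_MIN_LINK_MTU, ext_header_len=0):
    """Largest UDP payload that fits in one unfragmented IPv6 packet."""
    payload = mtu - IPV6_HEADER_LEN - ext_header_len - UDP_HEADER_LEN
    if payload <= 0:
        raise ValueError("MTU too small for IPv6 and UDP headers")
    return payload


def split_into_datagrams(data, mtu=IPV6_MIN_LINK_MTU):
    """Split application data into payloads that each fit within the MTU."""
    size = max_udp_payload(mtu)
    return [data[i:i + size] for i in range(0, len(data), size)]


if __name__ == "__main__":
    chunks = split_into_datagrams(bytes(3000))
    print([len(c) for c in chunks])
```

An application following this pattern still needs its own framing so that the receiver can reassemble the payloads, since UDP preserves neither order nor delivery.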
7.2. For System Developers 7.2. For System Developers
Software libraries SHOULD include provision for PLPMTUD for each Software libraries SHOULD include provision for PLPMTUD for each
supported transport protocol. supported transport protocol.
7.3. For Middle Box Developers 7.3. For Middle Box Developers
Middle boxes should process IP fragments in a manner that is Middle boxes should process IP fragments in a manner that is
skipping to change at page 19, line 48 skipping to change at page 20, line 13
operators to deploy stateless middle boxes. These stateless middle operators to deploy stateless middle boxes. These stateless middle
boxes may perform sub-optimally, process IP fragments in a manner boxes may perform sub-optimally, process IP fragments in a manner
that is not compliant with RFC 791 or RFC 8200, or even discard IP that is not compliant with RFC 791 or RFC 8200, or even discard IP
fragments completely. Such behaviors are NOT RECOMMENDED. If a fragments completely. Such behaviors are NOT RECOMMENDED. If a
middle box implements non-standard behavior with respect to IP middle box implements non-standard behavior with respect to IP
fragmentation, then that behavior MUST be clearly documented. fragmentation, then that behavior MUST be clearly documented.
7.4. For ECMP, LAG and Load-Balancer Developers And Operators 7.4. For ECMP, LAG and Load-Balancer Developers And Operators
In their default configuration, when the IPv6 Flow Label is not equal In their default configuration, when the IPv6 Flow Label is not equal
to zero, IPv6 devices that implement ECMP, LAG or other load- to zero, IPv6 devices that implement Equal-Cost Multipath (ECMP)
balancing technologies SHOULD accept only the following fields as Routing as described in OSPF [RFC2328] and other routing
input to their hash algorithm: protocols, Link Aggregation Grouping (LAG) [RFC7424], or other
load-balancing technologies SHOULD accept only the following
fields as input to their hash algorithm:
o IP Source Address. o IP Source Address.
o IP Destination Address. o IP Destination Address.
o Flow Label. o Flow Label.
Operators SHOULD deploy these devices in their default configuration. Operators SHOULD deploy these devices in their default
configuration.
These recommendations are similar to those presented in [RFC6438] and These recommendations are similar to those presented in [RFC6438] and
[RFC7098]. They differ in that they specify a default configuration. [RFC7098]. They differ in that they specify a default configuration.
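As an illustration of the default hash input recommended in Section 7.4, the following sketch selects a next hop from the source address, destination address, and flow label only. Because it does not read transport-layer ports, every fragment of a packet maps to the same path as the rest of its flow. The function name select_next_hop and the use of SHA-256 are assumptions made for the example; real devices typically use simpler hardware hash functions.

```python
import hashlib
import ipaddress


def select_next_hop(src, dst, flow_label, next_hops):
    """Pick a next hop using only source, destination, and flow label."""
    if not 0 <= flow_label < (1 << 20):
        raise ValueError("the IPv6 Flow Label is a 20-bit field")
    if not next_hops:
        raise ValueError("at least one next hop is required")
    # Hash input: 16-byte source, 16-byte destination, 3-byte flow label.
    key = (
        ipaddress.IPv6Address(src).packed
        + ipaddress.IPv6Address(dst).packed
        + flow_label.to_bytes(3, "big")
    )
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(next_hops)
    return next_hops[index]


if __name__ == "__main__":
    paths = ["link-a", "link-b", "link-c"]  # hypothetical member links
    print(select_next_hop("2001:db8::1", "2001:db8::2", 0x12345, paths))
```

Since the inputs are present in every fragment's IPv6 header, the first fragment and all later fragments of a packet yield the same result.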
7.5. For Network Operators 7.5. For Network Operators
Operators MUST ensure proper PMTUD operation in their network, Operators MUST ensure proper PMTUD operation in their network,
including making sure the network generates PTB packets when dropping including making sure the network generates PTB packets when dropping
packets too large compared to outgoing interface MTU. However, packets too large compared to outgoing interface MTU. However,
implementations MAY rate limit ICMP messages as per [RFC1812] and implementations MAY rate limit ICMP messages as per [RFC1812] and
[RFC4443]. [RFC4443].
As per RFC 4890, network operators MUST NOT filter ICMPv6 PTB As per RFC 4890, network operators MUST NOT filter ICMPv6 PTB
messages unless they are known to be forged or otherwise messages unless they are known to be forged or otherwise
illegitimate. As stated in Section 4.7, filtering ICMPv6 PTB packets illegitimate. As stated in Section 4.7, filtering ICMPv6 PTB packets
causes PMTUD to fail. Many upper-layer protocols rely on PMTUD. causes PMTUD to fail. Many upper-layer protocols rely on PMTUD.
As per RFC 8200, network operators MUST NOT deploy IPv6 links whose As per RFC 8200, network operators MUST NOT deploy IPv6 links whose
MTU is less than 1280 bytes. MTU is less than 1280 bytes.
Network operators SHOULD NOT filter IP fragments if they originated Network operators SHOULD NOT filter IP fragments if they are known to
at a domain name server or are destined for a domain name server. have originated at a domain name server or be destined for a domain
This is because domain name services are critical to operation of the name server. This is because domain name services are critical to
Internet. operation of the Internet.
8. IANA Considerations 8. IANA Considerations
This document makes no request of IANA. This document makes no request of IANA.
9. Security Considerations 9. Security Considerations
This document mitigates some of the security considerations This document mitigates some of the security considerations
associated with IP fragmentation by discouraging its use. It does associated with IP fragmentation by discouraging its use. It does
not introduce any new security vulnerabilities, because it does not not introduce any new security vulnerabilities, because it does not
skipping to change at page 21, line 12 skipping to change at page 21, line 31
Jinmei, Jen Linkova, Paolo Lucente, Manoj Nayak, Eric Nygren, Fred Jinmei, Jen Linkova, Paolo Lucente, Manoj Nayak, Eric Nygren, Fred
Templin and Joe Touch for their comments. Templin and Joe Touch for their comments.
11. References 11. References
11.1. Normative References 11.1. Normative References
[I-D.ietf-tsvwg-datagram-plpmtud] [I-D.ietf-tsvwg-datagram-plpmtud]
Fairhurst, G., Jones, T., Tuexen, M., Ruengeler, I., and Fairhurst, G., Jones, T., Tuexen, M., Ruengeler, I., and
T. Voelker, "Packetization Layer Path MTU Discovery for T. Voelker, "Packetization Layer Path MTU Discovery for
Datagram Transports", draft-ietf-tsvwg-datagram-plpmtud-07 Datagram Transports", draft-ietf-tsvwg-datagram-plpmtud-08
(work in progress), February 2019. (work in progress), June 2019.
[RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768, [RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768,
DOI 10.17487/RFC0768, August 1980, DOI 10.17487/RFC0768, August 1980,
<https://www.rfc-editor.org/info/rfc768>. <https://www.rfc-editor.org/info/rfc768>.
[RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791, [RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791,
DOI 10.17487/RFC0791, September 1981, DOI 10.17487/RFC0791, September 1981,
<https://www.rfc-editor.org/info/rfc791>. <https://www.rfc-editor.org/info/rfc791>.
[RFC0792] Postel, J., "Internet Control Message Protocol", STD 5, [RFC0792] Postel, J., "Internet Control Message Protocol", STD 5,
skipping to change at page 24, line 50 skipping to change at page 25, line 22
[RFC5927] Gont, F., "ICMP Attacks against TCP", RFC 5927, [RFC5927] Gont, F., "ICMP Attacks against TCP", RFC 5927,
DOI 10.17487/RFC5927, July 2010, DOI 10.17487/RFC5927, July 2010,
<https://www.rfc-editor.org/info/rfc5927>. <https://www.rfc-editor.org/info/rfc5927>.
[RFC6346] Bush, R., Ed., "The Address plus Port (A+P) Approach to [RFC6346] Bush, R., Ed., "The Address plus Port (A+P) Approach to
the IPv4 Address Shortage", RFC 6346, the IPv4 Address Shortage", RFC 6346,
DOI 10.17487/RFC6346, August 2011, DOI 10.17487/RFC6346, August 2011,
<https://www.rfc-editor.org/info/rfc6346>. <https://www.rfc-editor.org/info/rfc6346>.
[RFC6864] Touch, J., "Updated Specification of the IPv4 ID Field",
RFC 6864, DOI 10.17487/RFC6864, February 2013,
<https://www.rfc-editor.org/info/rfc6864>.
[RFC6888] Perreault, S., Ed., Yamagata, I., Miyakawa, S., Nakagawa, [RFC6888] Perreault, S., Ed., Yamagata, I., Miyakawa, S., Nakagawa,
A., and H. Ashida, "Common Requirements for Carrier-Grade A., and H. Ashida, "Common Requirements for Carrier-Grade
NATs (CGNs)", BCP 127, RFC 6888, DOI 10.17487/RFC6888, NATs (CGNs)", BCP 127, RFC 6888, DOI 10.17487/RFC6888,
April 2013, <https://www.rfc-editor.org/info/rfc6888>. April 2013, <https://www.rfc-editor.org/info/rfc6888>.
[RFC7098] Carpenter, B., Jiang, S., and W. Tarreau, "Using the IPv6 [RFC7098] Carpenter, B., Jiang, S., and W. Tarreau, "Using the IPv6
Flow Label for Load Balancing in Server Farms", RFC 7098, Flow Label for Load Balancing in Server Farms", RFC 7098,
DOI 10.17487/RFC7098, January 2014, DOI 10.17487/RFC7098, January 2014,
<https://www.rfc-editor.org/info/rfc7098>. <https://www.rfc-editor.org/info/rfc7098>.
 End of changes. 33 change blocks. 
86 lines changed or deleted 99 lines changed or added
