draft-ietf-tsvwg-datagram-plpmtud-03.txt   draft-ietf-tsvwg-datagram-plpmtud-04.txt 
Internet Engineering Task Force G. Fairhurst Internet Engineering Task Force G. Fairhurst
Internet-Draft T. Jones Internet-Draft T. Jones
Updates: 4821 (if approved) University of Aberdeen Updates: 4821 (if approved) University of Aberdeen
Intended status: Standards Track M. Tuexen Intended status: Standards Track M. Tuexen
Expires: January 3, 2019 I. Ruengeler Expires: March 9, 2019 I. Ruengeler
Muenster University of Applied Sciences Muenster University of Applied Sciences
July 02, 2018 September 5, 2018
Packetization Layer Path MTU Discovery for Datagram Transports Packetization Layer Path MTU Discovery for Datagram Transports
draft-ietf-tsvwg-datagram-plpmtud-03 draft-ietf-tsvwg-datagram-plpmtud-04
Abstract Abstract
This document describes a robust method for Path MTU Discovery This document describes a robust method for Path MTU Discovery
(PMTUD) for datagram Packetization layers. The document describes an (PMTUD) for datagram Packetization Layers (PLs). The document
extension to RFC 1191 and RFC 8201, which specifies ICMP-based Path describes an extension to RFC 1191 and RFC 8201, which specifies
MTU Discovery for IPv4 and IPv6. The method allows a Packetization ICMP-based Path MTU Discovery for IPv4 and IPv6. The method allows a
Layer (PL), or a datagram application that uses a PL, to discover PL, or a datagram application that uses a PL, to discover whether a
whether a network path can support the current size of datagram. network path can support the current size of datagram. This can be
This can be used to detect and reduce the message size when a sender used to detect and reduce the message size when a sender encounters a
encounters a network black hole (where packets are discarded, and no network black hole (where packets are discarded, and no ICMP message
ICMP message is received). The method can also probe a network path is received). The method can also probe a network path with
with progressively larger packets to find whether the maximum packet progressively larger packets to find whether the maximum packet size
size can be increased. This allows a sender to determine an can be increased. This allows a sender to determine an appropriate
appropriate packet size, providing functionality for datagram packet size, providing functionality for datagram transports that is
transports that is equivalent to the Packetization layer PMTUD equivalent to the Packetization layer PMTUD specification for TCP,
specification for TCP, specified in RFC4821. specified in RFC 4821.
The document also provides implementation notes for incorporating The document also provides implementation notes for incorporating
Datagram PMTUD into IETF Datagram transports or applications that use Datagram PMTUD into IETF datagram transports or applications that use
transports. datagram transports.
When published, this specification updates RFC4821. When published, this specification updates RFC 4821 when used with
datagram transports.
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on January 3, 2019. This Internet-Draft will expire on March 9, 2019.
Copyright Notice Copyright Notice
Copyright (c) 2018 IETF Trust and the persons identified as the Copyright (c) 2018 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of (https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1. Classical Path MTU Discovery . . . . . . . . . . . . . . 3 1.1. Classical Path MTU Discovery . . . . . . . . . . . . . . 4
1.2. Packetization Layer Path MTU Discovery . . . . . . . . . 5 1.2. Packetization Layer Path MTU Discovery . . . . . . . . . 5
1.3. Path MTU Discovery for Datagram Services . . . . . . . . 6 1.3. Path MTU Discovery for Datagram Services . . . . . . . . 6
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 6 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 7
3. Features Required to Provide Datagram PLPMTUD . . . . . . . . 8 3. Features Required to Provide Datagram PLPMTUD . . . . . . . . 9
3.1. PLPMTU Probe Packets . . . . . . . . . . . . . . . . . . 10 4. DPLPMTUD Mechanisms . . . . . . . . . . . . . . . . . . . . . 11
3.2. Validation of Probe Packet Size . . . . . . . . . . . . . 12 4.1. PLPMTU Probe Packets . . . . . . . . . . . . . . . . . . 11
3.3. Reducing the PLPMTU: Confirming Path Characteristics . . 12 4.2. Confirmation of Probed Packet Size . . . . . . . . . . . 13
3.4. Increasing the PLPMTU: Supporting Path Changes . . . . . 13 4.3. Detection of Black Holes . . . . . . . . . . . . . . . . 13
3.5. Robustness to inconsistent Path information . . . . . . . 13 4.4. Response to PTB Messages . . . . . . . . . . . . . . . . 14
4. Datagram Packetization Layer PMTUD . . . . . . . . . . . . . 13 4.4.1. Validation of PTB Messages . . . . . . . . . . . . . 14
4.1. PROBE_SEARCH: Probing for a larger PLPMTU . . . . . . . . 14 4.4.2. Use of PTB Messages . . . . . . . . . . . . . . . . . 15
4.2. The PROBE_DONE state . . . . . . . . . . . . . . . . . . 15 5. Datagram Packetization Layer PMTUD . . . . . . . . . . . . . 16
4.3. Validation and Use of PTB Messages . . . . . . . . . . . 15 5.1. DPLPMTUD Components . . . . . . . . . . . . . . . . . . . 17
4.4. Timers . . . . . . . . . . . . . . . . . . . . . . . . . 16 5.1.1. Timers . . . . . . . . . . . . . . . . . . . . . . . 17
4.5. Constants . . . . . . . . . . . . . . . . . . . . . . . . 16 5.1.2. Constants . . . . . . . . . . . . . . . . . . . . . . 17
4.6. Variables . . . . . . . . . . . . . . . . . . . . . . . . 17 5.1.3. Variables . . . . . . . . . . . . . . . . . . . . . . 18
4.7. Selecting PROBED_SIZE . . . . . . . . . . . . . . . . . . 18 5.2. DPLPMTUD Phases . . . . . . . . . . . . . . . . . . . . . 19
4.8. Simple Black Hole Detection . . . . . . . . . . . . . . . 18 5.2.1. Path Confirmation Phase . . . . . . . . . . . . . . . 20
4.8.1. Simple Black Hole Detection State Machine . . . . . . 19 5.2.2. Search Phase . . . . . . . . . . . . . . . . . . . . 21
4.9. Full State Machine . . . . . . . . . . . . . . . . . . . 20 5.2.2.1. Resilience to inconsistent path information . . . 21
5. Specification of Protocol-Specific Methods . . . . . . . . . 23 5.2.3. Search Complete Phase . . . . . . . . . . . . . . . . 21
5.1. Application support for DPLPMTUD with UDP or UDP-Lite . . 23 5.2.4. PROBE_BASE Phase . . . . . . . . . . . . . . . . . . 22
5.1.1. Application Request . . . . . . . . . . . . . . . . . 24 5.2.5. ERROR Phase . . . . . . . . . . . . . . . . . . . . . 22
5.1.2. Application Response . . . . . . . . . . . . . . . . 24 5.2.5.1. Robustness to inconsistent path . . . . . . . . . 23
5.1.3. Sending Application Probe Packets . . . . . . . . . . 24
5.1.4. Validating the Path . . . . . . . . . . . . . . . . . 24 5.2.6. DISABLED Phase . . . . . . . . . . . . . . . . . . . 23
5.1.5. Handling of PTB Messages . . . . . . . . . . . . . . 24 5.3. State Machine . . . . . . . . . . . . . . . . . . . . . . 23
5.2. DPLPMTUD with UDP Options . . . . . . . . . . . . . . . . 24 5.4. Search to Increase the PLPMTU . . . . . . . . . . . . . . 26
5.2.1. UDP Request Option . . . . . . . . . . . . . . . . . 25 5.4.1. Probing for a larger PLPMTU . . . . . . . . . . . . . 26
5.2.2. UDP Response Option . . . . . . . . . . . . . . . . . 25 5.4.2. Selection of Probe Sizes . . . . . . . . . . . . . . 27
5.3. DPLPMTUD for SCTP . . . . . . . . . . . . . . . . . . . . 26 5.4.3. Resilience to inconsistent Path information . . . . . 28
5.3.1. SCTP/IP4 and SCTP/IPv6 . . . . . . . . . . . . . . . 26 6. Specification of Protocol-Specific Methods . . . . . . . . . 28
5.3.1.1. Sending SCTP Probe Packets . . . . . . . . . . . 26 6.1. Application support for DPLPMTUD with UDP or UDP-Lite . . 28
5.3.1.2. Validating the Path with SCTP . . . . . . . . . . 27 6.1.1. Application Request . . . . . . . . . . . . . . . . . 29
5.3.1.3. PTB Message Handling by SCTP . . . . . . . . . . 27 6.1.2. Application Response . . . . . . . . . . . . . . . . 29
5.3.2. DPLPMTUD for SCTP/UDP . . . . . . . . . . . . . . . . 27 6.1.3. Sending Application Probe Packets . . . . . . . . . . 29
5.3.2.1. Sending SCTP/UDP Probe Packets . . . . . . . . . 27 6.1.4. Validating the Path . . . . . . . . . . . . . . . . . 29
5.3.2.2. Validating the Path with SCTP/UDP . . . . . . . . 27 6.1.5. Handling of PTB Messages . . . . . . . . . . . . . . 29
5.3.2.3. Handling of PTB Messages by SCTP/UDP . . . . . . 27 6.2. DPLPMTUD with UDP Options . . . . . . . . . . . . . . . . 30
5.3.3. DPLPMTUD for SCTP/DTLS . . . . . . . . . . . . . . . 28 6.2.1. UDP Probe Request Option . . . . . . . . . . . . . . 31
5.3.3.1. Sending SCTP/DTLS Probe Packets . . . . . . . . . 28 6.2.2. UDP Probe Response Option . . . . . . . . . . . . . . 31
5.3.3.2. Validating the Path with SCTP/DTLS . . . . . . . 28 6.3. DPLPMTUD for SCTP . . . . . . . . . . . . . . . . . . . . 32
5.3.3.3. Handling of PTB Messages by SCTP/DTLS . . . . . . 28 6.3.1. SCTP/IPv4 and SCTP/IPv6 . . . . . . . . . . . . . . . 32
5.4. DPLPMTUD for QUIC . . . . . . . . . . . . . . . . . . . . 28 6.3.1.1. Sending SCTP Probe Packets . . . . . . . . . . . 32
5.4.1. Sending QUIC Probe Packets . . . . . . . . . . . . . 28 6.3.1.2. Validating the Path with SCTP . . . . . . . . . . 33
5.4.2. Validating the Path with QUIC . . . . . . . . . . . . 29 6.3.1.3. PTB Message Handling by SCTP . . . . . . . . . . 33
5.4.3. Handling of PTB Messages by QUIC . . . . . . . . . . 29 6.3.2. DPLPMTUD for SCTP/UDP . . . . . . . . . . . . . . . . 33
6. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 29 6.3.2.1. Sending SCTP/UDP Probe Packets . . . . . . . . . 33
7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 29 6.3.2.2. Validating the Path with SCTP/UDP . . . . . . . . 33
8. Security Considerations . . . . . . . . . . . . . . . . . . . 30 6.3.2.3. Handling of PTB Messages by SCTP/UDP . . . . . . 33
9. References . . . . . . . . . . . . . . . . . . . . . . . . . 30 6.3.3. DPLPMTUD for SCTP/DTLS . . . . . . . . . . . . . . . 33
9.1. Normative References . . . . . . . . . . . . . . . . . . 30 6.3.3.1. Sending SCTP/DTLS Probe Packets . . . . . . . . . 34
9.2. Informative References . . . . . . . . . . . . . . . . . 32 6.3.3.2. Validating the Path with SCTP/DTLS . . . . . . . 34
Appendix A. Event-driven state changes . . . . . . . . . . . . . 32 6.3.3.3. Handling of PTB Messages by SCTP/DTLS . . . . . . 34
Appendix B. Revision Notes . . . . . . . . . . . . . . . . . . . 35 6.4. DPLPMTUD for QUIC . . . . . . . . . . . . . . . . . . . . 34
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 37 6.4.1. Sending QUIC Probe Packets . . . . . . . . . . . . . 34
6.4.2. Validating the Path with QUIC . . . . . . . . . . . . 35
6.4.3. Handling of PTB Messages by QUIC . . . . . . . . . . 35
7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 35
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 35
9. Security Considerations . . . . . . . . . . . . . . . . . . . 36
10. References . . . . . . . . . . . . . . . . . . . . . . . . . 36
10.1. Normative References . . . . . . . . . . . . . . . . . . 36
10.2. Informative References . . . . . . . . . . . . . . . . . 38
Appendix A. Event-driven state changes . . . . . . . . . . . . . 38
Appendix B. Revision Notes . . . . . . . . . . . . . . . . . . . 41
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 43
1. Introduction 1. Introduction
The IETF has specified datagram transport using UDP, SCTP, and DCCP, The IETF has specified datagram transport using UDP, SCTP, and DCCP,
as well as protocols layered on top of these transports (e.g., SCTP/ as well as protocols layered on top of these transports (e.g., SCTP/
UDP, DCCP/UDP) and directly over the IP network layer. This document UDP, DCCP/UDP, QUIC/UDP), and direct datagram transport over the IP
describes a robust method for Path MTU Discovery (PMTUD) that may be network layer. This document describes a robust method for Path MTU
used with these transport protocols (or the applications that use Discovery (PMTUD) that may be used with these transport protocols (or
their transport service) to discover an appropriate size of packet to the applications that use their transport service) to discover an
use across an Internet path. appropriate size of packet to use across an Internet path.
This specification clarifies the PLPMTUD method for SCTP described in
section 10.2 of [RFC4821] by specifying the procedure in Section 6.3
of this document.
1.1. Classical Path MTU Discovery 1.1. Classical Path MTU Discovery
Classical Path Maximum Transmission Unit Discovery (PMTUD) can be Classical Path Maximum Transmission Unit Discovery (PMTUD) can be
used with any transport that is able to process ICMP Packet Too Big used with any transport that is able to process ICMP Packet Too Big
(PTB) messages (e.g., [RFC1191] and [RFC8201]). The term PTB message (PTB) messages (e.g., [RFC1191] and [RFC8201]). The term PTB message
is applied to both IPv4 ICMP Unreachable messages (type 3) that carry is applied to both IPv4 ICMP Unreachable messages (Type 3) that carry
the error Fragmentation Needed (Type 3, Code 4) and ICMPv6 packet too the error Fragmentation Needed (Type 3, Code 4) and ICMPv6 packet too
big messages (Type 2). When a sender receives a PTB message, it big messages (Type 2). When a sender receives a PTB message, it
reduces the effective MTU to the value reported as the Link MTU in reduces the effective MTU to the value reported in the PTB message
the PTB message, and a method that from time-to-time increases the (in this document called the PTB_SIZE). A method from time-to-time
packet size in an attempt to discover an increase in the supported PMTU. increases the packet size in an attempt to discover an increase in the
The packets sent with a size larger than the current effective PMTU supported PMTU. The packets sent with a size larger than the current
are known as probe packets. effective PMTU are known as probe packets.
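The sender behaviour described above can be sketched as follows. This is a minimal illustration and not part of either draft revision: the class name, the method names, and the `min_pmtu` floor (set here to 1280 bytes, the IPv6 minimum link MTU) are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class ClassicalPmtud:
    """Hypothetical per-path state for a Classical PMTUD sender."""

    effective_pmtu: int
    # Assumption: never accept a reported size below this floor.
    # 1280 is the IPv6 minimum link MTU; an IPv4 sender would use another value.
    min_pmtu: int = 1280

    def on_ptb(self, ptb_size: int) -> int:
        """Reduce the effective PMTU to the size reported in a PTB message."""
        # Only a report that actually lowers the estimate (and is not
        # implausibly small) changes the state in this sketch.
        if self.min_pmtu <= ptb_size < self.effective_pmtu:
            self.effective_pmtu = ptb_size
        return self.effective_pmtu

    def is_probe(self, packet_size: int) -> bool:
        """A packet larger than the current effective PMTU is a probe packet."""
        return packet_size > self.effective_pmtu
```

In this sketch a PTB message can only lower the estimate; raising it again is left to the periodic probing described in the text.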
Packets not intended as probe packets are either fragmented to the Packets not intended as probe packets are either fragmented to the
current effective PMTU, or the attempt to send fails with an error current effective PMTU, or an attempt to send a packet larger than
code. Applications are sometimes provided with a primitive to let current effective PMTU fails with an error code. Applications are
them read the maximum packet size, derived from the current effective sometimes provided with a primitive to let them read the maximum
PMTU. packet size, derived from the current effective PMTU.
Classical PMTUD is subject to protocol failures. One failure arises Classical PMTUD is subject to protocol failures. One failure arises
when traffic using a packet size larger than the actual PMTU is when traffic using a packet size larger than the actual PMTU is black
black-holed (all datagrams sent with this size, or larger, are holed (all datagrams sent with this size, or larger, are silently
silently discarded without the sender receiving ICMP PTB messages). discarded without the sender receiving ICMP PTB messages). This
This could arise when the PTB messages are not delivered back to the could arise when the PTB messages are not delivered back to the
sender for some reason [RFC2923]. For example, ICMP messages are sender for some reason [RFC2923]. For example, ICMP messages are
increasingly filtered by middleboxes (including firewalls) [RFC4890]. increasingly filtered by middleboxes (including firewalls) [RFC4890].
A stateful firewall could be configured with a policy to block A stateful firewall could be configured with a policy to block
incoming ICMP messages, which would prevent reception of PTB messages incoming ICMP messages, which would prevent reception of PTB messages
to endpoints behind this firewall. Other examples include cases to endpoints behind this firewall. Other examples include cases
where PTB messages are not correctly processed/generated by tunnel where PTB messages are not correctly processed/generated by tunnel
endpoints. endpoints.
Another failure could result if a node that is not on the network Another failure could result if a node that is not on the network
path sends a PTB message that attempts to force the sender to change path sends a PTB message that attempts to force the sender to change
skipping to change at page 4, line 45 skipping to change at page 5, line 11
reacting to such messages by utilising the quoted packet within a PTB reacting to such messages by utilising the quoted packet within a PTB
message payload to validate that the received PTB message was message payload to validate that the received PTB message was
generated in response to a packet that had actually originated from generated in response to a packet that had actually originated from
the sender. However, there are situations where a sender would be the sender. However, there are situations where a sender would be
unable to provide this validation. unable to provide this validation.
Examples where validation of the PTB message is not possible include: Examples where validation of the PTB message is not possible include:
o When the router issuing the ICMP message is acting on a tunneled o When the router issuing the ICMP message is acting on a tunneled
packet, the ICMP message will be directed to the tunnel endpoint. packet, the ICMP message will be directed to the tunnel endpoint.
This tunnel endpoint is responsible for forwardiung the ICMP This tunnel endpoint is responsible for forwarding the ICMP
message and also processing the quoted packet within the payload message and also processing the quoted packet within the payload
field to remove the effect of the tunnel, and return a correctly field to remove the effect of the tunnel, and return a correctly
fromatted ICMP message to the sender. Failure to do this results formatted ICMP message to the sender. Failure to do appropriate
in black-holing. processing therefore results in black-holing.
o When a router issuing the ICMP message implements RFC792 o When a router issuing the ICMP message implements RFC 792
[RFC0792], it is only required the to include the first 64 bits of [RFC0792], it is only required to include (quote) the first 64
the IP payload of the packet within the quoted payload.This may be bits of the IP payload of the packet within the ICMP payload.
insufficient to perfom the tunnel processing described in the This could be insufficient to perform the tunnel processing
previous bullet. Even if the decapsulated message is processed by described in the previous bullet. Even if the decapsulated
the tunnel endpoint, there could be insufficient bytes remaining message is processed by the tunnel endpoint, there could be
for the sender to interpret the quoted transport information. insufficient bytes remaining for the sender to interpret the
RFC1812 [RFC1812] requires routers to return the full packet if quoted transport information. RFC 1812 [RFC1812] requires routers
possible, often the case for IPv4 when used the path includes to return the full packet if possible. This can result in black-
tunnels; or where the packet has been encapsulated/tunneled over holing when used the path includes tunnels.
an encrypted transport and it is not possible to determine the
original transport header ). o When a router issuing the ICMP message quotes a packet with an
encrypted transport, it may lack sufficient context to determine
the original transport header.
o Even when the PTB message includes sufficient bytes of the quoted o Even when the PTB message includes sufficient bytes of the quoted
packet, the network layer could lack sufficient context to packet, the network layer could lack sufficient context to
validate the message, because this depends on information about validate the ICMP message, because this depends on information
the active transport flows at an endpoint node (e.g., the socket/ about the active transport flows at an endpoint node (e.g., the
address pairs being used, and other protocol header information). socket/address pairs being used, and other protocol header
information).
1.2. Packetization Layer Path MTU Discovery 1.2. Packetization Layer Path MTU Discovery
The term Packetization Layer (PL) has been introduced to describe the The term Packetization Layer (PL) has been introduced to describe the
layer that is responsible for placing data blocks into the payload of layer that is responsible for placing data blocks into the payload of
IP packets and selecting an appropriate Maximum Packet Size (MPS). IP packets and selecting an appropriate Maximum Packet Size (MPS).
This function is often performed by a transport protocol, but can This function is often performed by a transport protocol, but can
also be performed by other encapsulation methods working above the also be performed by other encapsulation methods working above the
transport. transport layer.
In contrast to PMTUD, Packetization Layer Path MTU Discovery In contrast to PMTUD, Packetization Layer Path MTU Discovery
(PLPMTUD) [RFC4821] does not rely upon reception and validation of (PLPMTUD) [RFC4821] does not rely upon reception and validation of
PTB messages. It is therefore more robust than Classical PMTUD. PTB messages. It is therefore more robust than Classical PMTUD.
This has become the recommended approach for implementing PMTU This has become the recommended approach for implementing PMTU
discovery with TCP. discovery with TCP.
It uses a general strategy where the PL sends probe packet to search It uses a general strategy where the PL sends probe packets to search
for the largest size of unfragmented datagram that can be sent over a for the largest size of unfragmented datagram that can be sent over a
path. The probe packets are sent with a progressively larger packet network path. The probe packets are sent with a progressively larger
size. If a probe packet is successfully delivered (as determined by packet size. If a probe packet is successfully delivered (as
the PL), then the PLPMTU is raised to the size of the successful determined by the PL), then the PLPMTU is raised to the size of the
probe. If no response is received to a probe packet, the method successful probe. If no response is received to a probe packet, the
reduces the probe size. This PLPMTU is used to set the application method reduces the probe size. This PLPMTU is used to set the
MPS. application MPS.
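The general strategy above can be sketched as a search loop. This is an illustrative sketch only: the function name, the `send_probe` callback (assumed to return True when the PL confirms delivery of a probe of the given size), and the use of a binary search to choose probe sizes are all assumptions, not the procedure specified by the draft.

```python
from typing import Callable


def search_plpmtu(send_probe: Callable[[int], bool],
                  base_pmtu: int, max_pmtu: int) -> int:
    """Return the largest probe size confirmed by the PL.

    base_pmtu is assumed to be a size already known to work;
    max_pmtu is the largest size worth probing (hypothetical inputs).
    """
    plpmtu = base_pmtu   # largest size confirmed so far
    upper = max_pmtu     # largest size that might still succeed
    while plpmtu < upper:
        # Probe a size strictly larger than the current PLPMTU.
        probe_size = (plpmtu + upper + 1) // 2
        if send_probe(probe_size):
            # Probe delivered: raise the PLPMTU to the successful size.
            plpmtu = probe_size
        else:
            # No response: reduce the size of later probes.
            upper = probe_size - 1
    return plpmtu
```

A real PL would also need timers and retransmission of lost probes before treating a missing response as a failure; the sketch assumes each call to `send_probe` gives a reliable answer.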
PLPMTUD introduces flexibility in the implementation of PMTU PLPMTUD introduces flexibility in the implementation of PMTU
discovery. At one extreme, it can be configured to only perform PTB discovery. At one extreme, it can be configured to only perform PTB
black hole detection and recovery to increase the robustness of black hole detection and recovery to increase the robustness of
Classical PMTUD, or at the other extreme, all PTB processing can be Classical PMTUD, or at the other extreme, all PTB processing can be
disabled and PLPMTUD can completely replace Classical PMTUD. disabled and PLPMTUD can completely replace Classical PMTUD.
PLPMTUD can also include additional consistency checks without PLPMTUD can also include additional consistency checks without
increasing the risk of increased black-holing. For instance, the increasing the risk of increased black-holing. For instance, the
information available at the PL, or higher layers, makes PTB information available at the PL, or higher layers, makes PTB
validation more straightforward. validation more straightforward.
1.3. Path MTU Discovery for Datagram Services 1.3. Path MTU Discovery for Datagram Services
Section 4 of this document presents a set of algorithms for datagram Section 5 of this document presents a set of algorithms for datagram
protocols to discover the largest size of unfragmented datagram that protocols to discover the largest size of unfragmented datagram that
can be sent over a path. The method described relies on features of can be sent over a network path. The method described relies on
the PL Section 3 and apply to transport protocols operating over IPv4 features of the PL described in Section 3 and applies to transport
and IPv6. It does not require cooperation from the lower layers, protocols operating over IPv4 and IPv6. It does not require
although it can utilise ICMP PTB messages when these received cooperation from the lower layers, although it can utilise ICMP PTB
messages are made available to the PL. messages when these received messages are made available to the PL.
The UDP Usage Guidelines [RFC8085] state "an application SHOULD The UDP Usage Guidelines [RFC8085] state "an application SHOULD
either use the Path MTU information provided by the IP layer or either use the Path MTU information provided by the IP layer or
implement Path MTU Discovery (PMTUD)", but do not provide a implement Path MTU Discovery (PMTUD)", but do not provide a
mechanism for discovering the largest size of unfragmented datagram mechanism for discovering the largest size of unfragmented datagram
than can be used on a path. Prior to this document, PLPMTUD had not that can be used on a network path. Prior to this document, PLPMTUD
been specified for UDP. had not been specified for UDP.
Section 10.2 of [RFC4821] recommends a PLPMTUD probing method for the Section 10.2 of [RFC4821] recommends a PLPMTUD probing method for the
Stream Control Transport Protocol (SCTP). SCTP utilises heartbeat Stream Control Transport Protocol (SCTP). SCTP utilises heartbeat
messages as probe packets, but RFC4821 does not provide a complete messages as probe packets, but RFC4821 does not provide a complete
specification. This document provides the details to complete that specification. The present document provides the details to complete
specification. that specification.
The Datagram Congestion Control Protocol (DCCP) [RFC4340] requires The Datagram Congestion Control Protocol (DCCP) [RFC4340] requires
implementations to support Classical PMTUD and states that a DCCP implementations to support Classical PMTUD and states that a DCCP
sender "MUST maintain the MPS allowed for each active DCCP session". sender "MUST maintain the MPS allowed for each active DCCP session".
It also defines the current congestion control MPS (CCMPS) supported It also defines the current congestion control MPS (CCMPS) supported
by a path. This recommends use of PMTUD, and suggests use of control by a network path. This recommends use of PMTUD, and suggests use of
packets (DCCP-Sync) as path probe packets, because they do not risk control packets (DCCP-Sync) as path probe packets, because they do
application data loss. The method defined in this specification not risk application data loss. The method defined in this
could be used with DCCP. specification could be used with DCCP.
Section 5 specifies the method for a set of transports, and provides Section 6 specifies the method for a set of transports, and provides
information to enables the implementation of PLPMTUD with other information to enable the implementation of PLPMTUD with other
datagram transports and applications that use datagram transports. datagram transports and applications that use datagram transports.
2. Terminology 2. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119]. document are to be interpreted as described in [RFC2119].
Other terminology is directly copied from [RFC4821], and the Other terminology is directly copied from [RFC4821], and the
definitions in [RFC1122]. definitions in [RFC1122].
Black-Holed: When the sender is unaware that packets are not Actual PMTU: The Actual PMTU is the PMTU of a network path between a
delivered to the destination endpoint (e.g., when the sender sender PL and a destination PL, which the DPLPMTUD algorithm seeks
transmits packets of a particular size with a previously known to determine.
effective PMTU (also refered to as the PLPMTU), but is unaware of
a change to the path that resulted in a smaller PLPMTU). Black Holed: Packets are Black holed when the sender is unaware that
packets are not delivered to the destination endpoint (e.g., when
the sender transmits packets of a particular size with a
previously known effective PMTU and they are silently discarded by
the network, but is not made aware of a change to the path that
resulted in a smaller PLPMTU by ICMP messages).
Classical Path MTU Discovery: Classical PMTUD is a process described Classical Path MTU Discovery: Classical PMTUD is a process described
in [RFC1191] and [RFC8201], in which nodes rely on PTB messages to in [RFC1191] and [RFC8201], in which nodes rely on PTB messages to
learn the largest size of unfragmented datagram than can be used learn the largest size of unfragmented datagram that can be used
across a path. across a network path.
Datagram: A datagram is a transport-layer protocol data unit, Datagram: A datagram is a transport-layer protocol data unit,
transmitted in the payload of an IP packet. transmitted in the payload of an IP packet.
Effective PMTU: The current estimated value for PMTU that is used by Effective PMTU: The Effective PMTU is the current estimated value
a PMTUD. This is equivalent to the PLPMTU derived by PLPMTUD. for PMTU that is used by a PMTUD. This is equivalent to the
PLPMTU derived by PLPMTUD.
EMTU_S: The Effective MTU for sending (EMTU_S) is defined in EMTU_S: The Effective MTU for sending (EMTU_S) is defined in
[RFC1122] as "the maximum IP datagram size that may be sent, for a [RFC1122] as "the maximum IP datagram size that may be sent, for a
particular combination of IP source and destination addresses...". particular combination of IP source and destination addresses...".
EMTU_R: The Effective MTU for receiving (EMTU_R) is designated in EMTU_R: The Effective MTU for receiving (EMTU_R) is designated in
[RFC1122] as the largest datagram size that can be reassembled by [RFC1122] as the largest datagram size that can be reassembled by
EMTU_R ("Effective MTU to receive"). EMTU_R ("Effective MTU to receive").
Link: A communication facility or medium over which nodes can Link: A Link is a communication facility or medium over which nodes
communicate at the link layer, i.e., a layer below the IP layer. can communicate at the link layer, i.e., a layer below the IP
Examples are Ethernet LANs and Internet (or higher) layer layer. Examples are Ethernet LANs and Internet (or higher) layer
tunnels. tunnels.
Link MTU: The Maximum Transmission Unit (MTU) is the size in bytes
of the largest IP packet, including the IP header and payload,
that can be transmitted over a link. Note that this could more
properly be called the IP MTU, to be consistent with how other
standards organizations use the acronym MT. This includes the IP
header, but excludes link layer headers and other framing that is
not part of IP or the IP payload. Other standards organizations
generally define link MTU to include the link layer headers.
MPS: The Maximum Packet Size (MPS) is the largest size of Link MTU: The Link Maximum Transmission Unit (MTU) is the size in
application data block that can be sent unfragmented across a bytes of the largest IP packet, including the IP header and
path. In DPLPMTUD this quantity is derived from PLPMTU by taking payload, that can be transmitted over a link. Note that this
into consideration the size of the application and lower protocol could more properly be called the IP MTU, to be consistent with
how other standards organizations use the acronym. This includes
the IP header, but excludes link layer headers and other framing
that is not part of IP or the IP payload. Other standards
organizations generally define the link MTU to include the link
layer headers. layer headers.
Packet: An IP header plus the IP payload. MPS: The Maximum Packet Size (MPS) is the largest size of
application data block that can be sent across a network path. In
DPLPMTUD this quantity is derived from the PLPMTU by taking into
consideration the size of the lower protocol layer headers.
Packetization Layer (PL): The layer of the network stack that places MIN_PMTU: The MIN_PMTU is the smallest size of PLPMTU that DPLPMTUD
data into packets and performs transport protocol functions. will attempt to use.
Path: The set of link and routers traversed by a packet between a Packet: A Packet is the IP header plus the IP payload.
source node and a destination node by a particular flow.
Path MTU (PMTU): The minimum of the Link MTU of all the links Packetization Layer (PL): The Packetization Layer (PL) is the layer
forming a path between a source node and a destination node. of the network stack that places data into packets and performs
transport protocol functions.
PLPMTU: The estimate of the actual PMTU provided by the DPLPMTUD Path: The Path is the set of links and routers traversed by a packet
algorithm. between a source node and a destination node by a particular flow.
PLPMTUD: Packetization Layer Path MTU Discovery, the method Path MTU (PMTU): The Path MTU (PMTU) is the minimum of the Link MTU
described in this document for datagram PLs, which is an extension of all the links forming a network path between a source node and
to Classical PMTU Discovery. a destination node.
Probe packet: A datagram sent with a purposely chosen size PTB_SIZE: The PTB_SIZE is a value reported in a validated PTB
(typically larger than the current PLPMTU) to detect if packets of message that indicates the next hop link MTU of a router along the
this size can be successfully sent end-toend across the network
path. path.
PLPMTU: The Packetization Layer PMTU is an estimate of the actual
PMTU provided by the DPLPMTUD algorithm.
PLPMTUD: Packetization Layer Path MTU Discovery (PLPMTUD), the
method described in this document for datagram PLs, which is an
extension to Classical PMTU Discovery.
Probe packet: A probe packet is a datagram sent with a purposely
chosen size (typically the current PLPMTU or larger) to detect if
packets of this size can be successfully sent end-to-end across
the network path.
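As a rough illustration of the MPS definition above, the following Python sketch derives an MPS from a PLPMTU for a PL carried over UDP. The helper name mps_from_plpmtu and its pl_header parameter are hypothetical. The sketch assumes fixed-size headers only (an IPv4 header without options or an IPv6 header without extension headers, plus the UDP header). It is not part of the specification.

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options
IPV6_HEADER = 40  # bytes, fixed IPv6 header without extension headers
UDP_HEADER = 8    # bytes


def mps_from_plpmtu(plpmtu, ipv6=False, pl_header=0):
    """Hypothetical helper: subtract lower-layer header sizes from the PLPMTU.

    pl_header is an assumed per-datagram overhead added by the PL itself.
    """
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    mps = plpmtu - ip_header - UDP_HEADER - pl_header
    if mps <= 0:
        raise ValueError("PLPMTU too small for the protocol headers")
    return mps
```

With these assumptions, a PLPMTU of 1500 bytes over IPv4 leaves 1472 bytes for the application data block, and a PLPMTU of 1280 bytes over IPv6 leaves 1232 bytes.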
3. Features Required to Provide Datagram PLPMTUD 3. Features Required to Provide Datagram PLPMTUD
TCP PLPMTUD has been defined using standard TCP protocol mechanisms. TCP PLPMTUD has been defined using standard TCP protocol mechanisms.
All of the requirements in [RFC4821] also apply to use of the All of the requirements in [RFC4821] also apply to the use of the
technique with a datagram PL. Unlike TCP, some datagram PLs require technique with a datagram PL. Unlike TCP, some datagram PLs require
additional mechanisms to implement PLPMTUD. additional mechanisms to implement PLPMTUD.
There are eight requirements for performing the datagram PLPMTUD There are eight requirements for performing the datagram PLPMTUD
method described in this specification: method described in this specification:
1. PMTU parameters: A DPLPMTUD sender is RECOMMENDED to provide 1. PMTU parameters: A DPLPMTUD sender is RECOMMENDED to provide
information about the maximum size of packet that can be information about the maximum size of packet that can be
transmitted by the sender on the local link (the local Link MTU). transmitted by the sender on the local link (the local Link MTU).
It MAY utilize similar information about the receiver when this It MAY utilize similar information about the receiver when this
is supplied (note this could be less than EMTU_R). This avoids is supplied (note this could be less than EMTU_R). This avoids
implementations trying to send probe packets that can not be implementations trying to send probe packets that can not be
transmited by the local link. Too high a value may reduce the transmitted by the local link. Too high of a value could reduce
efficiency of the search algorithm. Some applications also have the efficiency of the search algorithm. Some applications also
a maximum transport protocol data unit (PDU) size, in which case have a maximum transport protocol data unit (PDU) size, in which
there is no benefit from probing for a size larger than this case there is no benefit from probing for a size larger than this
(unless a transport allows multiplexing multiple applications (unless a transport allows multiplexing multiple applications
PDUs into the same datagram). PDUs into the same datagram).
2. PLPMTU: A datagram application MUST be able to choose the size of 2. PLPMTU: A datagram application is REQUIRED to be able to choose
datagrams sent to the network, up to the PLPMTU, or a smaller the size of datagrams sent to the network, up to the PLPMTU, or a
value (such as the MPS) derived from this. This value is managed smaller value (such as the MPS) derived from this. This value is
by the DPLPMTUD method. The PLPMTU (specified as the effective managed by the DPLPMTUD method. The PLPMTU (specified as the
PMTU in Section 1 of [RFC1191]) is equivalent to the EMTU_S effective PMTU in Section 1 of [RFC1191]) is equivalent to the
(specified in [RFC1122]). EMTU_S (specified in [RFC1122]).
3. Probe packets: On request, a PLPMTUD sender is REQUIRED to be 3. Probe packets: On request, a DPLPMTUD sender is REQUIRED to be
able to transmit a packet larger than the PLPMTU. This can be able to transmit a packet larger than the PLPMTU. This is used
uses to send a probe packet. In IPv4, a probe packet MUST be to send a probe packet. In IPv4, a probe packet MUST be sent
sent with the Don't Fragment (DF) bit set in the IP header, and with the Don't Fragment (DF) bit set in the IP header, and
without network layer endpoint fragmentation. In IPv6, a probe without network layer endpoint fragmentation. In IPv6, a probe
packet is always sent without source fragmentation (as specified packet is always sent without source fragmentation (as specified
in section 5.4 of [RFC8201]). in section 5.4 of [RFC8201]).
4. Processing PTB messages: A DPLPMTUD sender MAY optionally utilize 4. Processing PTB messages: A DPLPMTUD sender MAY optionally utilize
PTB messages received from the network layer to help identify PTB messages received from the network layer to help identify
when a path does not support the current size of packet probe. when a network path does not support the current size of probe
Any received PTB message MUST be validated before it is used to packet. Any received PTB message MUST be validated before it is
update the PLPMTU discovery information [RFC8201]. This used to update the PLPMTU discovery information [RFC8201]. This
validation confirms that the PTB message was sent in response to validation confirms that the PTB message was sent in response to
a packet originating from the sender, and needs to be performed a packet originating from the sender, and needs to be performed
before the PLPMTU discovery method reacts to the PTB message. before the PLPMTU discovery method reacts to the PTB message.
When the router link MTU is indicated in the PTB message this MAY When the PTB_SIZE is indicated in the PTB message, this MAY be
be used by DPLPMTUD to reduce the probe size but MUST NOT be used used by DPLPMTUD to reduce the probe size but MUST NOT be used to
to increase the PLPMTU ([RFC8201]). This validation SHOULD increase the PLPMTU ([RFC8201]). This validation SHOULD utilise
utilise information that can not be simply determined by an off- information that can not be simply determined by an off-path
path attacker, for example, by checking the value of a protocol attacker, for example, by checking the value of a protocol header
header field known only to the two PL endpoints. (Some datagram field known only to the two PL endpoints. (Some datagram
applications use well-known source and destination ports and applications use well-known source and destination ports and
therefore this check needs to rely on other information.) therefore this check needs to rely on other information.)
5. Reception feedback: The destination PL endpoint is REQUIRED to 5. Reception feedback: The destination PL endpoint is REQUIRED to
provide a feedback method that indicates to the DPLPMTUD sender provide a feedback method that indicates to the DPLPMTUD sender
when a probe packet has been received by the destination PL when a probe packet has been received by the destination PL
endpoint. The local PL endpoint at the sending node is REQUIRED endpoint. The mechanism needs to be robust to the possibility
to pass this feedback to the sender-side DPLPMTUD method. that packets could be significantly delayed along a network path.
The local PL endpoint at the sending node is REQUIRED to pass
this feedback to the sender-side DPLPMTUD method.
6. Probing and congestion control: The isolated loss of a probe 6. Probing and congestion control: The isolated loss of a probe
packet SHOULD NOT be treated as an indication of congestion and packet SHOULD NOT be treated as an indication of congestion and
its loss SHOULD NOT directly trigger a congestion control its loss SHOULD NOT directly trigger a congestion control
reaction [RFC4821]. reaction [RFC4821].
7. Probe loss recovery: If the data block carried by a probe packet 7. Probe loss recovery: If the data block carried by a probe packet
needs to be sent reliably, the PL (or layers above) MUST arrange needs to be sent reliably, the PL (or layers above) are REQUIRED
retransmission/repair of any resulting loss. This method MUST be to arrange any retransmission/repair of any resulting loss. This
robust in the case where probe packets are lost due to other method is REQUIRED to be robust in the case where probe packets
reasons (including link transmission error, congestion). The are lost due to other reasons (including link transmission error,
DPLPMTUD method treats isolated loss of a probe packet (with or congestion). The DPLPMTUD sender treats isolated loss of a probe
without a PTB message) as a potential indication of a PMTU limit packet (with or without a PTB message) as a potential indication
on the path, but not as an indictaion of congestion Paragraph 6. of a PMTU limit for the path, but not as an indication of
congestion, see Paragraph 6.
8. Shared PLPMTU state: The PLPMTU value could also be stored with 8. Shared PLPMTU state: The PLPMTU value could also be stored with
the corresponding entry in the destination cache and used by the corresponding entry in the destination cache and used by
other PL instances. The specification of PLPMTUD [RFC4821] other PL instances. The specification of PLPMTUD [RFC4821]
states: "If PLPMTUD updates the MTU for a particular path, all states: "If PLPMTUD updates the MTU for a particular path, all
Packetization Layer sessions that share the path representation Packetization Layer sessions that share the path representation
(as described in Section 5.2 of [RFC4821]) SHOULD be notified to (as described in Section 5.2 of [RFC4821]) SHOULD be notified to
make use of the new MTU and make the required congestion control make use of the new MTU and make the required congestion control
adjustments". Such methods need to be robust to the wide variety adjustments". Such methods MUST be robust to the wide variety of
of underlying network forwarding behaviours. PLPMTU adjustments underlying network forwarding behaviours. PLPMTU adjustments
based on shared PLPMTU values should be incorporated in the based on shared PLPMTU values should be incorporated in the
search algorithms. Section 5.2 of [RFC8201] provides guidance on search algorithms. Section 5.2 of [RFC8201] provides guidance on
the caching of PMTU information and also the relation to IPv6 the caching of PMTU information and also the relation to IPv6
flow labels. flow labels.
In addition, the following principles are stated for design of a In addition, the following principles are stated for design of a
DPLPMTUD method: DPLPMTUD method:
o MPS: A method MUST signal appropriate MPS to the higher layer o MPS: A method is REQUIRED to signal an appropriate MPS to the
using the PL. This may change following a change to the path. higher layer using the PL. The value of the MPS can change
The method SHOULD avoid forcing an application to use an arbitrarily following a change to the path. It is RECOMMENDED that methods
small MPS (PLPMTU) for transmission while the method is searching avoid forcing an application to use an arbitrarily small MPS
for the currently supported PLPMTU. Datagram PLs do not (PLPMTU) for transmission while the method is searching for the
necessarily support fragmentation of PDUs larger than the PLPMTU. currently supported PLPMTU. Datagram PLs do not necessarily
A reduced MPS can adversely impact the performance of a datagram support fragmentation of PDUs larger than the PLPMTU. A reduced
MPS can adversely impact the performance of a datagram
application. application.
o Path validation: A method MUST be robust to path changes that o Path validation: It is RECOMMENDED that methods are robust to path
could have occurred since the path characteristics were last changes that could have occurred since the path characteristics
confirmed, and to the possibility of inconsistent path information were last confirmed, and to the possibility of inconsistent path
being received. information being received.
o Datagram reordering: A method MUST be robust to the possibility o Datagram reordering: A method is REQUIRED to be robust to the
that a flow encounters reordering, or has the traffic (including possibility that a flow encounters reordering, or the traffic
probe packets) is divided over more than one network path. (including probe packets) is divided over more than one network
path.
o When to probe: A method SHOULD determine whether the path capacity o When to probe: It is RECOMMENDED that methods determine whether
has increased since it last measured the path. This determines the path capacity has increased since it last measured the path.
when the path should again be probed. This determines when the path should again be probed.
3.1. PLPMTU Probe Packets 4. DPLPMTUD Mechanisms
This section lists the protocol mechanisms used in this
specification.
4.1. PLPMTU Probe Packets
The DPLPMTUD method relies upon the PL sender being able to generate The DPLPMTUD method relies upon the PL sender being able to generate
probe packets with a specific size. TCP is able to generate these probe packets with a specific size. TCP is able to generate these
probe packets by choosing to appropriately segment data being sent probe packets by choosing to appropriately segment data being sent
[RFC4821]. [RFC4821]. In contrast, a datagram PL that needs to construct a
probe packet has to either request an application to send a data
In contrast, a datagram PL that needs to construct a probe packet has block that is larger than that generated by an application, or to
to either request an application to send a data block that is larger utilise padding functions to extend a datagram beyond the size of the
than that generated by an application, or to utilise padding application data block. Protocols that permit exchange of control
functions to extend a datagram beyond the size of the application messages (without an application data block) could alternatively
data block. Protocols that permit exchange of control messages prefer to generate a probe packet by extending a control message with
(without an application data block) could alternatively prefer to padding data.
generate a probe packet by extending a control message with padding
data.
When the method fails to validate the PLPMTU, it may be required to
send a probe packet with a size less than the size of the data block
generated by an application. In this case, the PL could provide a
way to fragment a datagram at the PL, or could instead utilise a
control packet with padding.
A receiver needs to be able to distinguish an in-band data block from A receiver needs to be able to distinguish an in-band data block from
any added padding. This is needed to ensure that any added padding any added padding. This is needed to ensure that any added padding
is not passed on to an application at the receiver. is not passed on to an application at the receiver.
This results in three possible ways that a sender can create a probe This results in three possible ways that a sender can create a probe
packet listed in order of preference: packet listed in order of preference:
Probing using padding data: A probe packet that contains only Probing using padding data: A probe packet that contains only
control information together with any padding needed to inflate control information together with any padding, which is needed to
the packet to the size required for the probe packet. Since these be inflated to the size required for the probe packet. Since
probe packets do not carry an application-supplied data block,they these probe packets do not carry an application-supplied data
do not typically require retransmission, although they do still block, they do not typically require retransmission, although they
consume network capacity and incur endpoint processing. do still consume network capacity and incur endpoint processing.
Probing using appication data and padding data: A probe packet that Probing using application data and padding data: A probe packet that
contains a data block supplied by an application that is combined contains a data block supplied by an application that is combined
with padding to inflate the length of the datagram to the size with padding to inflate the length of the datagram to the size
required for the probe packet. If the application/transport needs required for the probe packet. If the application/transport needs
protection from the loss of this probe packet, the application/ protection from the loss of this probe packet, the application/
transport may perform transport-layer retransmission/repair of the transport could perform transport-layer retransmission/repair of
data block (e.g., by retransmission after loss is detected or by the data block (e.g., by retransmission after loss is detected or
duplicating the data block in a datagram without the padding by duplicating the data block in a datagram without the padding
data). data).
Probing using appication data: A probe packet that contains a data Probing using application data: A probe packet that contains a data
block supplied by an application that matches the size required block supplied by an application that matches the size required
for the probe packet. This method requests the application to for the probe packet. This method requests the application to
issue a data block of the desired probe size. If the application/ issue a data block of the desired probe size. If the application/
transport needs protection from the loss of an unsuccessful probe transport needs protection from the loss of an unsuccessful probe
packet, the application/transport needs then to perform transport- packet, the application/transport needs then to perform transport-
layer retransmission/repair of the data block (e.g., by layer retransmission/repair of the data block (e.g., by
retransmission after loss is detected). retransmission after loss is detected).
A PL that uses a probe packet carrying an application data block, A PL that uses a probe packet carrying an application data block,
could need to retransmit this application data block if the probe could need to retransmit this application data block if the probe
fails. This could need the PL to re-fragment the data block to a fails. This could need the PL to re-fragment the data block to a
smaller packet size that is expected to traverse the end-to-end path smaller packet size that is expected to traverse the end-to-end path
(which could utilise network-layer or PL fragmentation when these are (which could utilise endpoint network-layer or PL fragmentation when
available). these are available).
DLPMTUD MAY choose to use only one of these methods to simplify the DPLPMTUD MAY choose to use only one of these methods to simplify the
implementation. implementation.
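The padding-based probe methods above can be sketched as follows. This is a minimal Python illustration that assumes a hypothetical PL framing: a 32-bit probe identifier and a 16-bit data length, followed by the data block and then zero padding. The framing, the names build_probe and parse_probe, and the use of a random identifier are assumptions made for the example. They are not defined by this document. The probe_size argument is the size of the PL payload, not of the IP packet.

```python
import os
import struct

# Hypothetical PL framing: 32-bit probe identifier, 16-bit data length.
HEADER = struct.Struct("!IH")


def build_probe(probe_size, data=b"", probe_id=None):
    """Return (probe_id, datagram) with the datagram padded to probe_size bytes."""
    if probe_size > 0xFFFF:
        raise ValueError("probe size exceeds the 16-bit length field")
    if HEADER.size + len(data) > probe_size:
        raise ValueError("data block does not fit in the requested probe size")
    if probe_id is None:
        # A random identifier lets the sender match feedback to this probe.
        probe_id = int.from_bytes(os.urandom(4), "big")
    padding = bytes(probe_size - HEADER.size - len(data))
    return probe_id, HEADER.pack(probe_id, len(data)) + data + padding


def parse_probe(datagram):
    """Return (probe_id, data); padding is dropped, not passed to the application."""
    probe_id, data_len = HEADER.unpack_from(datagram)
    return probe_id, datagram[HEADER.size:HEADER.size + data_len]
```

An empty data block gives a probe that carries only control information and padding. A non-empty data block gives a probe that combines application data with padding. The explicit length field is one way for a receiver to separate the data block from the padding.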
3.2. Validation of Probe Packet Size Probe messages sent by a PL MUST contain enough information to
uniquely identify the probe within the Maximum Segment Lifetime, while
being robust to reordering and replay of probe response and ICMP PTB
messages.
The PL needs a method to determine when probe packets have been 4.2. Confirmation of Probed Packet Size
successfully received end-to-end across a network path.
The PL needs a method to determine (confirm) when probe packets have
been successfully received end-to-end across a network path.
Transport protocols can include end-to-end methods that detect and Transport protocols can include end-to-end methods that detect and
report reception of specific datagrams that they send (e.g., DCCP and report reception of specific datagrams that they send (e.g., DCCP and
SCTP provide keep-alive/heartbeat features). When supported, this SCTP provide keep-alive/heartbeat features). When supported, this
mechanism SHOULD also be used by DPLPMTUD to acknowledge reception of mechanism SHOULD also be used by DPLPMTUD to acknowledge reception of
a probe packet. a probe packet.
A PL that does not acknowledge data reception (e.g., UDP and UDP- A PL that does not acknowledge data reception (e.g., UDP and UDP-
Lite) is unable to detect when the packets that it sends are Lite) is unable itself to detect when the packets that it sends are
discarded because their size is greater than the actual PMTU. These discarded because their size is greater than the actual PMTU. These
PLs need to either rely on an application protocol to detect this PLs need to either rely on an application protocol to detect this
loss, or make use of an additional transport method such as UDP- loss, or make use of an additional transport method such as UDP-
Options [I-D.ietf-tsvwg-udp-options]. In addition, they might need Options [I-D.ietf-tsvwg-udp-options].
to send reachability probes (e.g., periodically solicit a response
from the destination) to determine whether the last successfully
probed PLPMTU is still supported by the network path.
Section 4 specifies this function for a set of IETF-specified Section 5 specifies this function for a set of IETF-specified
protocols. protocols.
3.3. Reducing the PLPMTU: Confirming Path Characteristics 4.3. Detection of Black Holes
If the DPLPMTUD method detects that a packet with the PLPMTU size is A PL sender needs to reduce the PLPMTU when it discovers the actual
not supported by the network path, then the DPLPMTUD method needs to PMTU supported by a network path is less than the PLPMTU (i.e., to
validate the PLPMTU. This can happen when a validated PTB message is detect that traffic is being black holed). This can be triggered
received, or another event that indicates the network path no longer when a validated PTB message is received, or by another event that
sustains this packet size, such as a loss report from the PL indicates the network path no longer sustains the current packet
size, such as a loss report from the PL or repeated lack of response
to probe packets sent to confirm the PLPMTU. Detection is followed
by a reduction of the PLPMTU.
All implementations of DPLPMTUD are REQUIRED to provide support that Black Hole detection is performed by periodically sending packet
reduces the PLPMTU when the actual PMTU supported by a network path probes of size PLPMTU to verify that a network path still supports
is less than the PLPMTU. the last acknowledged PLPMTU size. There are two ways a DPLPMTUD
sender can detect that the current PLPMTU is not sustained by the path
(i.e., to detect a black hole):
3.4. Increasing the PLPMTU: Supporting Path Changes o A PL can rely upon a mechanism implemented within the PL protocol
to detect excessive loss of data sent with a specific packet size
and then conclude that this excessive loss could be a result of an
invalid PMTU (as in PLPMTUD for TCP [RFC4821]).
An implementation that only reduces the PLPMTU to a suitable size is o A PL can use the probing mechanism to send confirmation probe
sufficient to ensure reliable operation, but may be very inefficient packets of the size of the current PLPMTU and a timer to track
when the actual PMTU changes or when the method (for whatever reason) whether acknowledgments are received (e.g., the number of probe
makes a suboptimal choice for the PLPMTU. packets sent without receiving an acknowledgement, PROBE_COUNT,
becomes greater than the MAX_PROBES). These messages need to be
generated periodically (e.g., using the confirmation timer, see
Section 5.1.1), and should be suppressed when the PL is not
actively sending data. Successive loss of probes is an indication
that the current path no longer supports the PLPMTU.
A full implementation of the DPLPMTUD method is RECOMMENDED to When the method detects the current PLPMTU is not supported (a black
provide a way for the sending PL endpoint to detect when the PLPMTU hole is found), DPLPMTUD sets a lower MPS. The PL then confirms that
is smaller than the actual PMTU size. This allows the sender to the updated PLPMTU can be successfully used across the path. This
increase the PLPMTU following a change in the characteristics of the can need the PL to send a probe packet with a size less than the size
path, such as when a link is reconfigured with a larger MTU, or when of the data block generated by an application. In this case, the PL
there is a change in the set of links traversed by an end-to-end flow could provide a way to fragment a datagram at the PL, or could
(e.g. after a routing or fail-over decision). instead utilise a control packet with padding.
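The second detection method above (confirmation probes tracked with PROBE_COUNT and MAX_PROBES) can be sketched as a small counter. This Python illustration is a sketch only: the class and method names are hypothetical, the value of MAX_PROBES is supplied by the caller because it is not fixed in this section, and timer handling is left to the caller.

```python
class BlackHoleDetector:
    """Hypothetical sketch of PROBE_COUNT / MAX_PROBES tracking."""

    def __init__(self, max_probes):
        self.max_probes = max_probes  # MAX_PROBES, chosen by the caller
        self.probe_count = 0          # PROBE_COUNT: probes sent without an ack

    def on_probe_unacknowledged(self):
        """Call when a confirmation probe times out without acknowledgment.

        Returns True when PROBE_COUNT becomes greater than MAX_PROBES,
        which is treated as a black hole indication.
        """
        self.probe_count += 1
        return self.probe_count > self.max_probes

    def on_probe_acknowledged(self):
        """An acknowledged probe confirms the PLPMTU, so the count restarts."""
        self.probe_count = 0
```

With MAX_PROBES set to 3 (an arbitrary value used only for illustration), the fourth consecutive unacknowledged probe signals a black hole, after which the sender would set a lower MPS and confirm the reduced PLPMTU.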
3.5. Robustness to inconsistent Path information 4.4. Response to PTB Messages
The decision to increase the PLPMTU needs to be robust to the This method requires the DPLPMTUD sender to validate any received PTB
possibility that information learned about the path is inconsistent message before using the PTB information. The response to a PTB
(this could happen when probe packets are lost due to other reasons, message depends on the PTB_SIZE indicated in the PTB message, the
or some of the packets in a flow are forwarded along a portion of the state of the PLPMTUD state machine, and the IP protocol being used.
path that supports a different PMTU).
Frequent path changes could occur due to unexpected "flapping" - Section 4.4.1 first describes validation for both IPv4 ICMP
where some packets from a flow pass along one path, but other packets Unreachable messages (type 3) and ICMPv6 packet too big messages,
follow a different path with different properties. DPLPMTUD can be both of which are referred to as PTB messages in this document.
made robust to these anomalies by introducing hysteresis into the
decision to increase the Maximum Packet Size.
XXX A future revision of this section will include recommend 4.4.1. Validation of PTB Messages
appropriate methods to provide robustness. XXX
4. Datagram Packetization Layer PMTUD A PL that receives a PTB message from a router or middlebox MUST
perform ICMP validation as specified in Section 5.2 of [RFC8085].
This needs the PL to check the protocol information in the quoted
payload to validate the message originated from the sending node.
This check includes determining the appropriate port and IP
information - necessary for the PTB message to be passed to the PL.
In addition, the PL SHOULD validate information from the ICMP payload
to determine that the quoted packet was sent by the PL. These checks
are intended to provide protection from packets that originate from a
node that is not on the network path. PTB messages are discarded if
they fail to pass these checks, or where there is insufficient ICMP
payload to perform the checks.
This section specifies Datagram PLPMTUD (DPLPMTUD). This method can PTB messages that have been validated can be utilised by the DPLPMTUD
be introduced at various points in the IP protocol stack, to discover algorithm. A method that utilises these PTB messages can improve the
the PLPMTU so that the application can use an MPS appropriate to the speed at which the algorithm detects an appropriate PLPMTU,
current network path. compared to one that relies solely on probing.
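The validation checks described above can be sketched for a PL running over UDP. This Python illustration assumes that the ICMP layer has already removed the quoted IP header, and that the PL places a hypothetical 32-bit probe identifier directly after the UDP header. The function name validate_ptb and its parameters are assumptions made for the example. A PTB message with too little quoted payload fails validation, as required above.

```python
import struct

# UDP header: source port, destination port, length, checksum.
UDP_HEADER = struct.Struct("!HHHH")
PROBE_ID = struct.Struct("!I")  # hypothetical PL field known to both endpoints


def validate_ptb(quoted_transport, local_port, remote_port, outstanding_ids):
    """Return True if the quoted packet matches a probe this PL sent.

    quoted_transport holds the bytes of the quoted packet that follow the
    quoted IP header in the ICMP payload.
    """
    if len(quoted_transport) < UDP_HEADER.size + PROBE_ID.size:
        return False  # insufficient ICMP payload to perform the checks
    src, dst, _length, _checksum = UDP_HEADER.unpack_from(quoted_transport)
    # The quoted packet was sent by this node, so its source port is local.
    if (src, dst) != (local_port, remote_port):
        return False
    (probe_id,) = PROBE_ID.unpack_from(quoted_transport, UDP_HEADER.size)
    # A value an off-path attacker cannot easily guess.
    return probe_id in outstanding_ids
```

The port comparison alone is weak when well-known ports are used, which is why the sketch also checks a field that is known only to the two PL endpoints.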
4.4.2. Use of PTB Messages
A set of checks are intended to provide protection from a router that
reports an unexpected PTB_SIZE. The PL needs to check that the
indicated PTB_SIZE is less than the size used by probe packets and
larger than the minimum size accepted.
This section provides an informative summary of how PTB messages can
be utilised.
Validating PTB Messages:
* A simple implementation is permitted to ignore received PTB
messages and therefore the PLPMTU is not updated when a PTB
message is received.
* An implementation that supports PTB messages MUST validate
messages before they are processed.
MIN_PMTU < PTB_SIZE < BASE_PMTU
* A robust PL MAY enter the PROBE_ERROR state for an IPv4 path
when the PTB_SIZE reported in the PTB message is >= 576 bytes and
when this is less than the BASE_PMTU.
* A robust PL MAY enter the PROBE_ERROR state for an IPv6 path
when the PTB_SIZE reported in the PTB message is >= 1280 bytes and
when this is less than the BASE_PMTU.
PTB_SIZE = PLPMTU
* Transition to SEARCH_COMPLETE.
PTB_SIZE > PROBED_SIZE
* A PTB_SIZE > PROBED_SIZE is an inconsistent network signal. These
PTB messages ought to be discarded without further processing
(the PLPMTU is not updated).
* The information could be utilised as an input to trigger
enabling a resilience mode.
BASE_PMTU <= PTB_SIZE < PLPMTU
* Black hole detection is triggered and the PLPMTU ought to be
set to BASE_PMTU.
* The PL could use PTB_SIZE reported in the PTB message to
initialise a search algorithm.
PLPMTU < PTB_SIZE < PROBED_SIZE
* The PLPMTU continues to be valid, but the last PROBED_SIZE
searched was larger than the actual PMTU.
* The PLPMTU is not updated.
* The PL can use the reported PTB_SIZE from the PTB message as
the next search point when it resumes the search algorithm.
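The cases listed above can be sketched as a small classification function. This is an illustrative sketch only, not part of the specification: the names classify_ptb and PtbAction are hypothetical, the PTB message is assumed to have already passed the validation checks, and the per-protocol thresholds (576 bytes for IPv4, 1280 bytes for IPv6) are left out. Values not covered by the list, such as a PTB_SIZE equal to the PROBED_SIZE or not larger than the MIN_PMTU, are simply discarded here, which is an assumption.

```python
from enum import Enum


class PtbAction(Enum):
    DISCARD = "discard"                      # ignore the message, PLPMTU unchanged
    SEARCH_COMPLETE = "search_complete"      # PTB_SIZE = PLPMTU
    PROBE_BASE = "probe_base"                # black hole handling, PLPMTU := BASE_PMTU
    USE_AS_NEXT_PROBE = "use_as_next_probe"  # PLPMTU kept, PTB_SIZE seeds the search
    PROBE_ERROR = "probe_error"              # optional behaviour of a robust PL


def classify_ptb(ptb_size, plpmtu, probed_size, base_pmtu, min_pmtu):
    """Map the PTB_SIZE of a validated PTB message onto one of the listed cases."""
    if ptb_size > probed_size:
        # Inconsistent network signal: discard without further processing.
        return PtbAction.DISCARD
    if ptb_size == plpmtu:
        return PtbAction.SEARCH_COMPLETE
    if plpmtu < ptb_size < probed_size:
        # The last probe was too large; the PLPMTU itself is still valid.
        return PtbAction.USE_AS_NEXT_PROBE
    if base_pmtu <= ptb_size < plpmtu:
        return PtbAction.PROBE_BASE
    if min_pmtu < ptb_size < base_pmtu:
        return PtbAction.PROBE_ERROR
    # Anything else is outside the listed cases (assumption: discard).
    return PtbAction.DISCARD
```

For example, with a PLPMTU of 1300, a PROBED_SIZE of 1500 and a BASE_PMTU of 1200, a reported PTB_SIZE of 1400 falls into the "use as next search point" case, while 1250 triggers the black hole handling.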
5. Datagram Packetization Layer PMTUD
This section specifies Datagram PLPMTUD (DPLPMTUD). The method can
be introduced at various points in the IP protocol stack to discover
the PLPMTU so that an application can utilise an appropriate MPS for
the current network path.
  +----------------------+
  |         APP*         |
  +-+-------+----+---+---+
    |       |    |   |
+---+--+ +--+--+ | +-+---+
| QUIC*| |UDPO*| | |SCTP*|
+---+--+ +--+--+ | ++--+-+
    |       |    |  |  |
    +-------+-+  |  |  |
              |  |  |  |
             ++-+--++  |
             | UDP  |  |
             +---+--+  |
                 |     |
  +--------------+-----+-+
  |  Network Interface   |
  +----------------------+
Figure 1: Examples where DPLPMTUD can be implemented
The central idea of DPLPMTUD is probing by a sender. Probe packets
are sent to find the maximum size of user message that is completely
transferred across the network path from the sender to the
destination.
This section identifies the components needed for implementation, the
phases of operation, the state machine and search algorithm.
There are various functions performed by the algorithm:
4.1. PROBE_SEARCH: Probing for a larger PLPMTU
The DPLPMTUD method utilises probe packets to confirm that a packet
of size PROBED_SIZE can traverse the network path. The PROBE_COUNT
is initialised to zero when a probe packet is first sent with a
particular size.
A timer is used to trigger the generation of probe packets. The
probe_timer is started each time a probe packet is sent to the
destination and is cancelled when receipt of the probe packet is
acknowledged. The PROBED_SIZE is then confirmed, and this value is
assigned to the PLPMTU. The DPLPMTUD method may send subsequent probes
of an increasing size. Increasing probes follow a search strategy as
discussed in Section 4.7.
Each time the probe_timer expires, the PROBE_COUNT is incremented,
the probe_timer is reinitialised, and a probe packet of the same size
is retransmitted.
The maximum number of retransmissions for a PROBED_SIZE is configured
(MAX_PROBES). If the value of the PROBE_COUNT reaches MAX_PROBES,
probing stops and the PL sender enters the PROBE_DONE state.
4.2. The PROBE_DONE state
When the PL sender completes probing for a larger PLPMTU, it enters
the PROBE_DONE state. This starts the PMTU_RAISE_TIMER. While this
timer is running, the PLPMTU remains at the value set in the last
successful probe packet.

If the PL is designed in a way that is unable to validate
reachability to the destination endpoint after probing has completed,
the method uses a REACHABILITY_TIMER to periodically repeat a probe
packet for the current PLPMTU size, while the PMTU_RAISE_TIMER is
running. If the REACHABILITY_TIMER expires, the method exits the
PROBE_DONE state. The done state is also exited when a validated PTB
message is received.

If the PMTU_RAISE_TIMER expires, the PL sender also exits the
PROBE_DONE state, but in this case resumes probing from the size of
the PLPMTU.

4.3. Validation and Use of PTB Messages

This section describes processing for both IPv4 ICMP Unreachable
messages (type 3) and ICMPv6 packet too big messages.

A PL that receives a PTB message from a router or middlebox MUST
validate the PTB message. The PL checks the protocol information in
the quoted payload to validate the message originated from the
sending node. The node also checks that the reported link MTU size
is less than the size used by packet probes. PTB messages are
discarded if they fail to pass these checks, or where there is
insufficient ICMP payload to perform these checks. The checks are
intended to provide protection from packets that originate from a
node that is not on the network path or a node that attempts to
report a larger link MTU than the current probe size.

PTB messages that have been validated can be utilised by the DPLPMTUD
algorithm. A method that utilises these PTB messages can improve the
speed at which the algorithm detects an appropriate PLPMTU
compared to one that relies solely on probing.

4.4. Timers

The method in the previous subsections utilises three timers:

PROBE_TIMER: Configured to expire after a period longer than the
maximum time to receive an acknowledgment to a probe packet. This
value MUST be larger than 1 second, and SHOULD be larger than 15
seconds. Guidance on selection of the timer value is provided in
section 3.1.1 of the UDP Usage Guidelines [RFC8085].

If the PL has an RTT estimate and timely acknowledgements the
PROBE_TIMER can be derived from the PL RTT estimate.

PMTU_RAISE_TIMER: Configured to the period a sender ought to
continue to use the current PLPMTU, after which it re-commences
probing for a higher PMTU. This timer has a period of 600 secs,
as recommended by PLPMTUD [RFC4821].

REACHABILITY_TIMER: Configured to the period a sender ought to wait
before confirming the current PLPMTU is still supported. This is
less than the PMTU_RAISE_TIMER and used to decrease the PLPMTU
(e.g. when a black hole is encountered).

DPLPMTUD ought to suspend reachability probes when no application
data has been sent since the previous probe packet. Guidance on
selection of the timer value is provided in section 3.1.1 of the
UDP Usage Guidelines [RFC8085]. DPLPMTUD ought to be suspended or
only sent in conjunction with out traffic during periods of
dormancy. This PLPMTU validation needs to be frequent enough when
data is flowing that the sending PL does not black hole extensive
amounts of traffic.

An implementation could implement the various timers using a single
timer process.

4.5. Constants

The following constants are defined:

MAX_PROBES: The maximum value of the PROBE_ERROR_COUNTER. The
default value of MAX_PROBES is 10.

MIN_PMTU: The smallest allowed probe packet size. For IPv6, this
value is 1280 bytes, as specified in [RFC2460]. For IPv4, the
minimum value is 68 bytes. (An IPv4 router is required to be able
to forward a datagram of 68 octets without further fragmentation.
This is the combined size of an IPv4 header and the minimum
fragment size of 8 octets.)

BASE_PMTU: The BASE_PMTU is considered a size that ought to work
in most cases. The size is equal to or larger than the minimum
permitted and smaller than the maximum allowed. In the case of
IPv6, this value is 1280 bytes [RFC2460]. When using IPv4, a size
of 1200 bytes is RECOMMENDED.

MAX_PMTU: The MAX_PMTU is the largest size of PLPMTU that is probed.
This has to be less than or equal to the minimum of the local MTU
of the outgoing interface and the destination PMTU for receiving.
An application or PL may reduce this when it knows there is no
need to send packets above a specific size.

4.6. Variables

This method utilises a set of variables:

PROBE_TIMER: Configured to expire after a period longer than the
maximum time to receive an acknowledgment to a probe packet. This
value MUST be larger than 1 second, and SHOULD be larger than 15
seconds. Guidance on selection of the timer value is provided in
section 3.1.1 of the UDP Usage Guidelines [RFC8085].

PL with RTT estimates may use values smaller than 1 second
derived from their RTT estimate to speed up detection of
connectivity issues on the path.

PROBED_SIZE: The PROBED_SIZE is the size of the current probe
packet. This is a tentative value for the PLPMTU, which is
awaiting confirmation by an acknowledgment.

PROBE_COUNT: This is a count of the number of unsuccessful probe
packets that have been sent with size PROBED_SIZE. The value is
initialised to zero when a particular size of PROBED_SIZE is first
attempted.

PTB_SIZE: The PTB_SIZE is the value returned by a validated PTB
message indicating the local MTU size of a router along the path.

4.7. Selecting PROBED_SIZE

Implementations discover the search range by validating the minimum
path MTU and then using the probe method to select a PROBED_SIZE less
than or equal to the maximum PMTU_MAX, where PMTU_MAX is the minimum
of the local link MTU and EMTU_R (learned from the remote endpoint).
The PMTU_MAX MAY be constrained by an application that has a maximum
to the size of datagrams it wishes to send.

Implementations use a search algorithm to choose probe sizes within
the search range.

xxx A future version of this section will detail example methods for
selecting probe size values, but does not plan to mandate a single
method. xxx

Implementations MAY optimise the search procedure by selecting step
sizes from a table of common PMTU sizes.

Implementations SHOULD select probe sizes to maximise the gain in
PLPMTU each search step. Implementations ought to take into
consideration useful probe size steps and a minimum useful gain in
PLPMTU.

5.1. DPLPMTUD Components

This section describes components of DPLPMTUD.

5.1.1. Timers

The method utilises three timers:

PROBE_TIMER: The PROBE_TIMER is configured to expire after a period
longer than the maximum time to receive an acknowledgment to a
probe packet. This value MUST be larger than 1 second, and SHOULD
be larger than 15 seconds. Guidance on selection of the timer
value is provided in section 3.1.1 of the UDP Usage Guidelines
[RFC8085].

If the PL has a path Round Trip Time (RTT) estimate and timely
acknowledgements the PROBE_TIMER can be derived from the PL RTT
estimate.

PMTU_RAISE_TIMER: The PMTU_RAISE_TIMER is configured to the period a
sender will continue to use the current PLPMTU, after which it
re-enters the Search phase. This timer has a period of 600 secs, as
recommended by PLPMTUD [RFC4821].

DPLPMTUD SHOULD inhibit sending probe packets when no application
data has been sent since the previous probe packet.

CONFIRMATION_TIMER: The CONFIRMATION_TIMER is configured to the
period a PL sender waits before confirming the current PLPMTU is
still supported. This is less than the PMTU_RAISE_TIMER and used
to decrease the PLPMTU (e.g., when a black hole is encountered).
Confirmation needs to be frequent enough when data is flowing that
the sending PL does not black hole extensive amounts of traffic.
Guidance on selection of the timer value is provided in section
3.1.1 of the UDP Usage Guidelines [RFC8085].

DPLPMTUD SHOULD inhibit sending probe packets when no application
data has been sent since the previous probe packet.

An implementation could implement the various timers using a single
timer process.

5.1.2. Constants

The following constants are defined:

MAX_PROBES: MAX_PROBES is the maximum value of the
PROBE_ERROR_COUNTER. The default value of MAX_PROBES is 10.

MIN_PMTU: The MIN_PMTU is the smallest allowed probe packet size. For
IPv6, this value is 1280 bytes, as specified in [RFC2460]. For
IPv4, the minimum value is 68 bytes. (An IPv4 router is required
to be able to forward a datagram of 68 octets without further
fragmentation. This is the combined size of an IPv4 header and
the minimum fragment size of 8 octets. In addition, receivers are
required to be able to reassemble fragmented datagrams at least up
to 576 bytes, as stated in section 3.3.3 of [RFC1122].)

MAX_PMTU: The MAX_PMTU is the largest size of PLPMTU. This has to
be less than or equal to the minimum of the local MTU of the
outgoing interface and the destination PMTU for receiving. An
application or PL MAY reduce the MAX_PMTU when there is no need to
send packets larger than a specific size.

BASE_PMTU: The BASE_PMTU is a configured size expected to work for
most paths. The size is equal to or larger than the MIN_PMTU and
smaller than the MAX_PMTU. In the case of IPv6, this value is
1280 bytes [RFC2460]. When using IPv4, a size of 1200 bytes is
RECOMMENDED.

5.1.3. Variables

This method utilises a set of variables:

PROBED_SIZE: The PROBED_SIZE is the size of the current probe
packet. This is a tentative value for the PLPMTU, which is
awaiting confirmation by an acknowledgment.

PROBE_COUNT: The PROBE_COUNT is a count of the number of
unsuccessful probe packets that have been sent with a size of
PROBED_SIZE. The value is initialised to zero when a particular
size of PROBED_SIZE is first attempted.

The figure below illustrates the relationship between the packet size
constants and variables, in this case when the DPLPMTUD algorithm
performs path probing to increase the size of the PLPMTU. The MPS is
less than the PLPMTU. A probe packet has been sent of size
PROBED_SIZE. When this is acknowledged, the PLPMTU will be raised to
PROBED_SIZE allowing the PROBED_SIZE to be increased towards the
actual PMTU.

MIN_PMTU                                            PMTU_MAX
    <------------------------------------------------------>
         |        |        |        |             |
         V        |        |        |             V
     BASE_PMTU    V        |        V        Actual PMTU
                 MPS       |   PROBED_SIZE
                           V
                        PLPMTU

Figure 2: Relationships between probe and packet sizes

5.2. DPLPMTUD Phases

The Datagram PLPMTUD algorithm moves through several phases of
operation.

An implementation that only reduces the PLPMTU to a suitable size
would be sufficient to ensure reliable operation, but can be very
inefficient when the actual PMTU changes or when the method (for
whatever reason) makes a suboptimal choice for the PLPMTU.

A full implementation of DPLPMTUD provides an algorithm enabling the
DPLPMTUD sender to increase the PLPMTU following a change in the
characteristics of the path, such as when a link is reconfigured with
a larger MTU, or when there is a change in the set of links traversed
by an end-to-end flow (e.g., after a routing or path fail-over
decision).

Black hole detection (see Section 4.3) and PTB processing (see
Section 4.4) proceed in parallel with these phases of operation.

                          +-------------------+
                          | Path Confirmation +--  Connectivity
                          +--------+----------+  \----- or BASE_PMTU
                                   |       /\          \/ Confirmation Fails
                  Connectivity and |       |         +-------+
               BASE_PMTU confirmed |        ---------+ Error |
                                   |                 +-------+
                                   |    CONFIRMATION_TIMER
                                   |          Fires
                                   \/
+----------------+          +--------------+
| Search Complete|<---------+    Search    |
+----------------+          +--------------+
               Search Algorithm
                  Completes

Figure 3: DPLPMTUD Phases

Path Confirmation
* Connectivity is confirmed.
* DPLPMTUD confirms the BASE_PMTU is supported across the network
path.
* DPLPMTUD then enters the search phase.
Search
* DPLPMTUD performs probing to increase the PLPMTU.
* DPLPMTUD then enters the search complete or an error phase.
Search Complete
* DPLPMTUD has found a suitable PLPMTU that is supported across
the network path.
* Black hole detection will confirm this PLPMTU continues to be
supported.
* On a longer time-frame, DPLPMTUD will re-enter the search phase
to discover if the PLPMTU can be raised.
Error
* Inconsistent or invalid network signals cause DPLPMTUD to be
unable to progress.
* This causes the algorithm to lower the MPS until the path is
shown to support the BASE_PMTU, or to suspend DPLPMTUD.
5.2.1. Path Confirmation Phase
DPLPMTUD starts in the Path confirmation phase. Path confirmation is
performed in two stages:
1. Connectivity to the remote peer is first confirmed. When a
connection-oriented PL is used, this stage is implicit. It is
performed as part of the normal PL connection handshake. In
contrast, a connectionless PL MUST send an acknowledged probe
packet to confirm that the remote peer is reachable.
2. In the second stage, the PL confirms it can successfully send a
datagram of the BASE_PMTU size across the current path.
A PL that does not wish to support a network path with a PLPMTU less
than BASE_PMTU can simplify the phase into a single step by
performing connectivity checks with probes of the BASE_PMTU size.
A PL MAY respond to PTB messages while in this phase, see
Section 4.4.
Once path confirmation has completed, DPLPMTUD can advertise an MPS
to an upper layer.
If DPLPMTUD fails to complete these tests it enters the
PROBE_DISABLED phase, see Section 5.2.6, and ceases using DPLPMTUD.
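The two stages of path confirmation can be sketched as follows. This is an illustrative sketch under stated assumptions rather than a normative procedure: confirm_path and send_probe are hypothetical names, send_probe(size) is assumed to send one probe packet of the given size and return True once it is acknowledged (False when the PROBE_TIMER expires), and the retry bound reuses MAX_PROBES.

```python
def confirm_path(send_probe, base_pmtu, connectivity_size,
                 max_probes=10, connection_oriented=False):
    """Return "SEARCH" when both stages succeed, otherwise "PROBE_DISABLED".

    send_probe(size) is a hypothetical callable supplied by the PL. It
    returns True if a probe of `size` bytes was acknowledged.
    """
    def acked_within_limit(size):
        # Retransmit the same probe size until acknowledged or MAX_PROBES is hit.
        return any(send_probe(size) for _ in range(max_probes))

    # Stage 1: connectivity. A connection-oriented PL has already confirmed
    # this during its handshake, so the stage is implicit.
    if not connection_oriented and not acked_within_limit(connectivity_size):
        return "PROBE_DISABLED"

    # Stage 2: confirm that a datagram of BASE_PMTU size crosses the path.
    if not acked_within_limit(base_pmtu):
        return "PROBE_DISABLED"

    # An MPS can now be advertised to the upper layer and the search can start.
    return "SEARCH"
```

A PL that does not wish to support a PLPMTU below the BASE_PMTU could collapse the two stages by passing the BASE_PMTU as connectivity_size.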
5.2.2. Search Phase
The search phase utilises a search algorithm in an attempt to increase
the PLPMTU (see Section 5.4.1). The PL sender increases the MPS each
time a packet probe confirms a larger PLPMTU is supported by the
path. The algorithm concludes by entering the SEARCH_COMPLETE phase,
see Section 5.2.3.
A PL MAY respond to PTB messages while in this phase, using the PTB
to advance or terminate the search, see Section 4.4. Similarly, black
hole detection can terminate the search by entering the PROBE_BASE
phase, see Section 5.2.4.
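The search phase can be illustrated with a short sketch. The choice of a binary search is an assumption made for this example only, since the text here does not mandate a particular search algorithm. The names search and send_probe are hypothetical, and send_probe(size) is assumed to return True when a probe of that size is acknowledged before the PROBE_TIMER expires. PTB handling and black hole detection are omitted.

```python
def search(send_probe, plpmtu, max_pmtu, max_probes=10):
    """Raise the PLPMTU towards max_pmtu and return the largest confirmed size."""
    low, high = plpmtu, max_pmtu   # `low` is confirmed; sizes above `high` are ruled out
    while low < high:
        probed_size = (low + high + 1) // 2   # next PROBED_SIZE (assumed strategy)
        probe_count = 0                       # reset for each new probe size
        acked = False
        while probe_count < max_probes:
            if send_probe(probed_size):
                acked = True
                break
            probe_count += 1                  # PROBE_TIMER expired, retransmit
        if acked:
            low = probed_size                 # the PLPMTU (and the MPS) is raised
        else:
            high = probed_size - 1            # give up on this size after MAX_PROBES
    return low                                # search complete
```

With a path that forwards packets of up to 1400 bytes, a starting PLPMTU of 1200 and a MAX_PMTU of 1500, this sketch converges on 1400.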
5.2.2.1. Resilience to inconsistent path information
Sometimes a PL sender is able to detect inconsistent results from the
sequence of PLPMTU probes that it sends or the sequence of PTB
messages that it receives. This could be manifested as excessive
fluctuation of the MPS.
When inconsistent path information is detected, a PL sender can
enable an alternate search mode that clamps the offered MPS to a
smaller value for a period of time. This avoids unnecessary black-
holing of packets.
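One possible way to detect fluctuation and clamp the offered MPS is sketched below. The detection rule (counting changes in a window of recent MPS values), the threshold and the function name offered_mps are all assumptions for illustration. The text above does not define how inconsistency is measured or for how long the clamp applies.

```python
def offered_mps(recent_mps, clamp_mps, max_changes=3):
    """Return the MPS to offer to the upper layer.

    recent_mps is a non-empty list of the most recent MPS values, oldest
    first. When the value changed more than max_changes times within the
    window, the offered MPS is clamped to clamp_mps.
    """
    changes = sum(1 for a, b in zip(recent_mps, recent_mps[1:]) if a != b)
    current = recent_mps[-1]
    if changes > max_changes:
        # Inconsistent path information: prefer a smaller, safer MPS.
        return min(current, clamp_mps)
    return current
```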
5.2.3. Search Complete Phase
On entry to the search complete phase, the DPLPMTUD sender starts the
PMTU_RAISE_TIMER. In this phase, the PLPMTU remains at the value
confirmed by the last successful probe packet.
In this phase, the PL MUST periodically confirm that the PLPMTU is
still supported by the path. If the PL is designed in a way that is
unable to confirm reachability to the destination endpoint after
probing has completed, the method uses a CONFIRMATION_TIMER to
periodically repeat a probe packet for the current PLPMTU size.
If the DPLPMTUD sender is unable to confirm reachability for packets
with a size of the current PLPMTU (e.g., if the CONFIRMATION_TIMER
expires) or the PL signals a lack of reachability, the method exits
the phase and enters the PROBE_BASE phase, see Section 5.2.4.
If the PMTU_RAISE_TIMER expires, the DPLPMTUD sender re-enters the
Search phase, see Section 5.2.2, and resumes probing for a larger
PLPMTU.
4.8. Simple Black Hole Detection Back hole detection can be used in parallel to check that a network
path continues to support a previously confirmed PLPMTU. If a black
hole is detected the algorithm moves to the PROBE_BASE phase, see
Section 5.2.4.
The DPLPMTUD method can be used to provide black hole detection. The phase can also exited when a validated PTB message is received
This enables a reduction of the PLPMTU when a PL sender encounters a (see Section 4.4.1).
path that fails to support the current MPS and also fails to return a
PTB message to the sender.
The simple method starts by setting the PLPMTU to the BASE_PMTU. 5.2.4. PROBE_BASE Phase
When the method detects that communication is not possible with this
size of packet, the PLPMTU is reduced, until an operable message size
is reached or the PLPMTU reaches the BASE_MTU size. The method
enables a sending PL to inform an application of the reduced MPS and
accordingly send smaller packets.
The simple black hole detetction method does not seek to increase the This phase is entered when black hole detection or a PTB message
PLPMTU. This makes it vulneable to transient reductions in the indicates that the PLPMTU is not supported by the path.
actual PLPMTU, which could result in a PLPMTU lower than the actual
PMTU.
The full methiod is specified in Section 4.9. On entry to this phase, the PLPMTU is set to the BASE_PMTU, and a
corresponding reduced MPS is advertised.
4.8.1. Simple Black Hole Detection State Machine PROBED_SIZE is then set to the PLPMTU (i.e., the BASE_PMTU), to
confirm this size is supported across the path. If confirmed,
DPLPMTUD enters the Search Phase to determine whether the PL sender
can use a larger PLPMTU.
The PL sender starts with the PLPMTU and PROBED_SIZE set to the If the path cannot be confirmed to support the BASE_PMTU after
BASE_PMTU. sending MAX_PROBES, DPLPMTUD moves to the Error phase, see
Section 5.2.5.
While a PL has a PLPMTU greater than the BASE_MTU, the PL needs to 5.2.5. ERROR Phase
send probe packets at the PROBED_SIZE to revalidate the PLPMTU.
Black hole detection is also triggered by lack of reachability at the
PL. When the PL sender detects that multiple transmissions of
packets of PROBED_SIZE are no longer being acknowledged (e.g., When
the number of probe packets sent without receiving an acknowledgement
(PROBE_COUNT) becomes greater than the MAX_PROBES), the PL concludes
that it has detected a black hole and reduces PLPMTU.
The connectivity check may be performened by the protocol The ERROR phase is entered when there is conflicting or invalid
implementing the PL (as in PLPMTUD for TCP [RFC4821]). When the PLPMTU information for the path (e.g. a failure to support the
application using the PL does not regularly send packets of size BASE_PMTU). In this phase, the MPS is set to a value less than the
PROBED_SIZE, additional probe packets need to be sent by PL using the BASE_PMTU, but at least the size of the MIN_PMTU.
reachability timer Section 4.4.
If method does reduces the PLPMTU to the MIN_PMTU, the method DPLPMTUD remains in the ERROR phase until a consistent view of the
concludes the path does not support the MIN_PMTU. path can be discovered and it has also been confirmed that the path
supports the BASE_PMTU.
If multihoming is supported, a state machine is needed for each Note: MIN_PMTU may be identical to BASE_PMTU, simplifying the actions
active path. in this phase.
The state machine for a simple black hole detection mechanism is If no acknowledgement is received for PROBE_COUNT probes of size
depicted in Figure 3. MIN_PMTU, the method suspends DPLPMTUD, see Section 5.2.5.
XXX a future version of the simple black hole detection state machine 5.2.5.1. Robustness to inconsistent path
might consider icmp PTB messages XXX
+------------+
| PROBE_START|
+-----+------+
| Connectivity confirmed
| (reachability tests start)
PROBE_COUNT >= V
MAX_PROBES +------------+
+---------------| PROBE_BASE +->-+
| +-----+------+ |
| | ^ | PROBE_COUNT < MAX_PROBES
| | +-----+
| V
| | PROBE_ACK
| PROBE_COUNT |
| = MAX_PROBES +------------+
| (reduce +-<-+ PROBE_DONE +->-+
| PLPMTU) | +------+-----+ |
| | ^ | ^ | PROBE_COUNT < MAX_PROBES
| | | | | | (Contine probing)
| +-----+ | +-----+
V V
+------------+ |
| PROBE_ERROR|<------------+
+------------+
Figure 3: State machine for detecting black holes Robustness to paths unable to sustain the BASE_PMTU. Some paths
could be unable to sustain packets of the BASE_PMTU size. These
paths could use an alternate algorithm to implement the PROBE_ERROR
phase that allows fallback to a smaller than desired PLPMTU, rather
than suffer connectivity failure.
4.9. Full State Machine This could also utilise methods such as endpoint IP fragmentation to
enable the PL sender to communicate using packets smaller than the
BASE_PMTU.
A full state machine for DPLPMTUD is depicted in Figure 4. If 5.2.6. DISABLED Phase
multihoming is supported, a state machine is needed for each active
path.
PROBE_TIMER expiry This phase suspends operation of DPLPMTUD. It disables probing for
(PROBE_COUNT = MAX_PROBES) the PLPMTU until action is taken by the PL or application using the
+-------------+ +--------------+ PL.
+->| PROBE_START +--------------->|PROBE_DISABLED|
PROBE_TIMER expiry | +--+-------+--+ +--------------+
(PROBE_COUNT = | | |
MAX_PROBES) +-----+ | Connectivity confirmed
v
+---------- +------------+ -+ PROBE_TIMER expiry
MAX_PMTU acked or | | PROBE_BASE | | (PROBE_COUNT <
PTB (>= BASE_PMTU)| +----> +--------+---+ <+ MAX_PROBES)
+---------------+ | /\ | |
| | | | | PTB
| PMTU_RAISE_TIMER| | | | (PTB_SIZE < BASE_PMTU)
| or reachability | | | | or
| (PROBE_COUNT | | | | PROBE_TIMER expiry
| = MAX_PROBES) | | | | (PROBE_COUNT = MAX_PROBES)
| +-----------+ | | \
| | PTB | | \
| | (< PROBED_SIZE)| | \
| | | | ---------------+
| | | | |
| | | | Probe |
| | | | acked |
v | | v v
+----------+-+ +----+---------+ Probe +-------------+
| PROBE_DONE |<-------------- | PROBE_SEARCH |<-------| PROBE_ERROR |
+------+-----+ MAX_PMTU acked +------------+-+ acked +-------------+
/\ | or /\ |
| | PROBE_TIMER expiry | |
| |(PROBE_COUNT = MAX_PROBES) | |
| | | |
+----+ +------+
Reachability probe acked PROBE_TIMER expiry
or PROBE_TIMER expiry (PROBE_COUNT < MAX_PROBES)
(PROBE_COUNT < MAX_PROBES) or
Probe acked
Figure 4: State machine for Datagram PLPMTUD 5.3. State Machine
XXX A future version of this document will update the state machine A state machine for DPLPMTUD is depicted in Figure 4. If multihoming
to describe handling of validated PTB messages. XXX is supported, a state machine is needed for each active path.
The following states are defined to reflect the probing process: PROBE_TIMER expiry
(PROBE_COUNT = MAX_PROBES)
+-------------------+ +--------------+
| PROBE_START +------>|PROBE_DISABLED|
+-------------------+ +--------------+
| ^
| Path confirmed |
v |
MAX_PMTU acked or +--------------+-+ (PROBE_COUNT |
PTB (BASE_PMTU <= +---------| PROBE_SEARCH | | < MAX_PROBES) |
PTB_SIZE | +--> +--------------+<+ or Probe acked |
<PROBED_SIZE) | | | ^ |
or | | | | |
(PROBE_COUNT | | | | |
=MAX_PROBES) | | | | |
+---------------+ | | | |
| | | | |
| | | | |
| PMTU_RAISE_TIMER | | | |
| | | | |
| | | | |
| +-----------+ | | |
| | | | |
| | | | |
| | (PTB_SIZE < PLPMTU)| | |
| | or | | BASE_PMTU |
| | Black hole detected | | Probe acked |
v | v | |
+----------+----+ +--------------+ +-------------+
|SEARCH_COMPLETE|----------->| PROBE_BASE |<-------| PROBE_ERROR |
+------+--------+ +--------------+ +-------------+
/\ | Black hole detected ^ | | BASE_PMTU Probe acked: ^
| | or | | | |
| | (PTB_SIZE < PLPMTU) | | | Probe BASE_PMTU: |
| | | | | (PROBE_COUNT = MAX_PROBES)|
| | | | +---------------------------+
+----+ +--+
Confirmation: PROBE_TIMER expiry:
(PROBE_COUNT < MAX_PROBES) (PROBE_COUNT < MAX_PROBES)
or
PLPMTU Probe acked
Figure 4: State machine for Datagram PLPMTUD. Note: Some state
changes are not show to simplify the diagram.
The following states are defined:
PROBE_START: The PROBE_START state is the initial state before PROBE_START: The PROBE_START state is the initial state before
probing has started. PLPMTUD is not performed in this state. The probing has started. The state confirms connectivity to the
state transitions to PROBE_BASE, when a path has been confirmed, remote PL.
i.e. when a sent packet has been acknowledged on this path. Any
transport method may be used to exit PROBE_BASE as long as the
send packet is acknowledge by the other side. The PLPMTU is set
to the BASE_PMTU size. Probing ought to start immediately after
connection setup to prevent the prevent the loss of user data.
PROBE_BASE: The PROBE_BASE state is the starting point for probing The PLPMTU is set to the BASE_PMTU size. Probing ought to start
with datagram PLPMTUD. It is used to confirm whether the immediately after connection setup to prevent the prevent the loss
BASE_PMTU size is supported by the network path. On entry, the of user data. PLPMTUD is not performed in this state. The state
PROBED_SIZE is set to the BASE_PMTU size and the PROBE_COUNT is transitions to PROBE_SEARCH, when a network path has been
set to zero. A probe packet is sent, and the PROBE_TIMER is confirmed, i.e., when a sent packet has been acknowledged on this
started. The state is left when the PROBE_COUNT reaches network path and the BASE_PMTU is confirmed to be supported. If
MAX_PROBES; a PTB message is validated, or a probe packet is the network path cannot be confirmed this state transitions to
acknowledged. PROBE_DISABLED.
PROBE_SEARCH: The PROBE_SEARCH state is the main probing state. PROBE_SEARCH: The PROBE_SEARCH state is the main probing state.
This state is entered either when probing for the BASE_PMTU was This state is entered when probing for the BASE_PMTU was
successful or when there is a successful reachability test in the successful.
PROBE_ERROR state. On entry, the PLPMTU is set to the last
acknowledged PROBED_SIZE.
The PROBE_COUNT is set to zero when the first probe packet is sent The PROBE_COUNT is set to zero when the first probe packet is sent
for each probe size. Each time a probe packet is acknowledged, for each probe size. Each time a probe packet is acknowledged,
the PLPMTU is set to the PROBED_SIZE, and then the PROBED_SIZE is the PLPMTU is set to the PROBED_SIZE, and then the PROBED_SIZE is
increased. increased using the search algorithm.
When a probe packet is sent and not acknowledged within the period When a probe packet is sent and not acknowledged within the period
of the PROBE_TIMER, the PROBE_COUNT is incremented and the probe of the PROBE_TIMER, the PROBE_COUNT is incremented and the probe
packet is retransmitted. The state is exited when the PROBE_COUNT packet is retransmitted. The state is exited when the PROBE_COUNT
reaches MAX_PROBES; a PTB message is validated; or a probe of size reaches MAX_PROBES; a PTB message is validated; a probe of size
PMTU_MAX is acknowledged. PMTU_MAX is acknowledged or black hole detection is triggered.
SEARCH_COMPLETE: The SEARCH_COMPLETE state indicates a successful
end to the PROBE_SEARCH state. DPLPMTUD remains in this state
until either the PMTU_RAISE_TIMER expires; a received PTB message
is validated; or black hole detection is triggered.
When DPLPMTUD uses an unacknowledged PL and is in the
SEARCH_COMPLETE state, a CONFIRMATION_TIMER periodically resets
the PROBE_COUNT and schedules a probe packet with the size of the
PLPMTU. If the probe packet fails to be acknowledged after
MAX_PROBES attempts, the method enters the PROBE_BASE state. When
used with an acknowledged PL (e.g., SCTP), DPLPMTUD SHOULD NOT
continue to generate PLPMTU probes in this state.
PROBE_BASE: The PROBE_BASE state is used to confirm whether the
BASE_PMTU size is supported by the network path and is designed to
allow an application to continue working when there are transient
reductions in the actual PMTU. It also seeks to avoid long
periods where traffic is black holed while searching for a larger
PLPMTU.
On entry, the PROBED_SIZE is set to the BASE_PMTU size and the
PROBE_COUNT is set to zero.
Each time a probe packet is sent, and the PROBE_TIMER is started.
The state is exited when the probe packet is acknowledged, and the
PL sender enters the PROBE_SEARCH state.
The state is also left when the PROBE_COUNT reaches MAX_PROBES; a
PTB message is validated. This causes the PL sender to enter the
PROBE_ERROR state.
PROBE_ERROR: The PROBE_ERROR state represents the case where the PROBE_ERROR: The PROBE_ERROR state represents the case where the
network path is not known to support an PLPMTU of at least the network path is not known to support a PLPMTU of at least the
BASE_PMTU size. It is entered when either a probe of size BASE_PMTU size. It is entered when either a probe of size
BASE_PMTU has not been acknowledged or a validated PTB message BASE_PMTU has not been acknowledged or a validated PTB message
indicates a smaller link MTU than the BASE_PMTU. On entry, the indicates a PTB_SIZE smaller than the BASE_PMTU.
PROBE_COUNT is set to zero and the PROBED_SIZE is set to the
MIN_PMTU size, and the PLPMTU is reset to MIN_PMTU size. In this
state, a probe packet is sent, and the PROBE_TIMER is started.
The state transitions to the PROBE_SEARCH state when a probe
packet is acknowledged.
PROBE_DONE: The PROBE_DONE state indicates a successful end to a On entry, the PROBE_COUNT is set to zero and the PROBED_SIZE is
probing phase. DPLPMTUD remains in this state until either the set to the MIN_PMTU size, and the PLPMTU is reset to MIN_PMTU
PMTU_RAISE_TIMER expires or a received PTB message is validated. size. In this state, a probe packet is sent, and the PROBE_TIMER
is started. The state transitions to the PROBE_SEARCH state when
a probe packet is acknowledged of at least size BASE_PMTU. Robust
implementations may validate the BASE_PMTU several times before
transition to the PROBE_SEARCH.
When PLPMTUD uses an unacknowledged PL and is in the PROBE_DONE Implementations are permitted to enable endpoint fragmentation if
state, a REACHABILITY_TIMER periodically resets the PROBE_COUNT the DPLPMTUD is unable to validate MIN_PMTU within PROBE_COUNT
and schedules a probe packet with the size of the PLPMTU. If the probes. If DPLPMTUD is unable to validate MIN_PMTU the
probe packet fails to be acknowledged after MAX_PROBES attempts, implementation should transition to PROBE_DISABLED.
the method enters the PROBE_BASE state. When used with an
acknowledged PL (e.g., SCTP), DPLPMTUD SHOULD NOT continue to
probe in this state.
PROBE_DISABLED: The PROBE_DISABLED state indicates that connectivity PROBE_DISABLED: The PROBE_DISABLED state indicates that connectivity
could not be established. DPLPMTUD MUST NOT probe in this state. could not be established. DPLPMTUD MUST NOT probe in this state.
Appendix A contains an informative description of key events. Appendix A contains an informative description of key events.
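The transitions stated explicitly in the state descriptions above can be sketched as a lookup table. This is a minimal illustration, not part of the specification: the Event names are hypothetical, only transitions named in the text are encoded, and mapping "unable to validate MIN_PMTU" to a MAX_PROBES_REACHED event in PROBE_ERROR is an assumption.

```python
from enum import Enum, auto


class State(Enum):
    PROBE_SEARCH = auto()
    SEARCH_COMPLETE = auto()
    PROBE_BASE = auto()
    PROBE_ERROR = auto()
    PROBE_DISABLED = auto()


class Event(Enum):  # hypothetical event names, chosen for this sketch
    PROBE_ACKED = auto()
    MAX_PROBES_REACHED = auto()
    PTB_VALIDATED = auto()


# Only transitions that the state descriptions spell out are listed here.
TRANSITIONS = {
    (State.PROBE_BASE, Event.PROBE_ACKED): State.PROBE_SEARCH,
    (State.PROBE_BASE, Event.MAX_PROBES_REACHED): State.PROBE_ERROR,
    (State.PROBE_BASE, Event.PTB_VALIDATED): State.PROBE_ERROR,
    (State.PROBE_ERROR, Event.PROBE_ACKED): State.PROBE_SEARCH,
    # Assumption: failure to validate MIN_PMTU is signalled by this event.
    (State.PROBE_ERROR, Event.MAX_PROBES_REACHED): State.PROBE_DISABLED,
    # Unacknowledged PL: a confirmation probe failed MAX_PROBES times.
    (State.SEARCH_COMPLETE, Event.MAX_PROBES_REACHED): State.PROBE_BASE,
}


def next_state(state: State, event: Event) -> State:
    """Return the next state; stay in the current state if no rule is listed."""
    return TRANSITIONS.get((state, event), state)
```

A table keeps the sketch easy to compare against the prose; a real implementation would also need the timer-driven transitions and the entry actions (resetting PROBE_COUNT, setting PROBED_SIZE) that this sketch omits.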
5. Specification of Protocol-Specific Methods 5.4. Search to Increase the PLPMTU
This section describes the algorithms used by DPLPMTUD to search for
a larger PLPMTU.
5.4.1. Probing for a larger PLPMTU
Implementations use a search algorithm across the search range to
determine whether a larger PLPMTU can be supported across a network
path.
The method discovers the search range by confirming the minimum
PLPMTU and then using the probe method to select a PROBED_SIZE less
than or equal to PMTU_MAX. PMTU_MAX is the minimum of the local MTU
and EMTU_R (learned from the remote endpoint). The PMTU_MAX MAY be
reduced by an application that sets a maximum to the size of
datagrams it will send.
The PROBE_COUNT is initialised to zero when a probe packet is first
sent with a particular size. A timer is used by the search algorithm
to trigger the sending of probe packets of size PROBED_SIZE, larger
than the PLPMTU. Each probe packet successfully sent to the remote
peer is confirmed by acknowledgement at the PL, see Section 4.1.
Each time a probe packet is sent to the destination, the PROBE_TIMER
is started. The timer is cancelled when the PL receives
acknowledgement that the probe packet has been successfully sent
across the path (see Section 4.1). This confirms that the PROBED_SIZE
is supported, and the PROBED_SIZE value is then assigned to the PLPMTU.
The search algorithm can continue to send subsequent probe packets of
an increasing size.
If the timer expires before a probe packet is acknowledged, the probe
has failed to confirm the PROBED_SIZE. Each time the PROBE_TIMER
expires, the PROBE_COUNT is incremented, the PROBE_TIMER is
reinitialised, and a probe packet of the same size is retransmitted
(the replicated probe improves the resilience to loss). The maximum
number of retransmissions for a particular size is configured
(MAX_PROBES). If the value of the PROBE_COUNT reaches MAX_PROBES,
probing will stop, and the PL sender enters the SEARCH_COMPLETE
state.
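The probe accounting described in this subsection can be sketched as follows. This is an illustrative outline only: the class and method names are hypothetical, MAX_PROBES = 3 is an assumed value (the text says only that the limit is configured), and timer scheduling and packet transmission are left to the caller.

```python
from dataclasses import dataclass

MAX_PROBES = 3  # assumed value; the text only says this limit is configured


@dataclass
class ProbeState:
    plpmtu: int                    # largest size confirmed so far
    probed_size: int               # size currently being probed
    probe_count: int = 0           # zero when a size is first probed
    search_complete: bool = False

    def on_ack(self) -> None:
        """The PL confirmed the probe, so PROBED_SIZE becomes the PLPMTU."""
        self.plpmtu = self.probed_size
        self.probe_count = 0

    def on_timer_expiry(self) -> bool:
        """Handle PROBE_TIMER expiry.

        Returns True if a probe of the same size should be retransmitted,
        or False once PROBE_COUNT has reached MAX_PROBES.
        """
        self.probe_count += 1
        if self.probe_count >= MAX_PROBES:
            self.search_complete = True  # enter SEARCH_COMPLETE
            return False
        return True
```

Note that a failed search leaves the PLPMTU at the last confirmed size; only an acknowledgement raises it.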
5.4.2. Selection of Probe Sizes
The search algorithm needs to determine a minimum useful gain in
PLPMTU. It would not be constructive for a PL sender to attempt to
probe for all sizes - this would incur unnecessary load on the path
and has the undesirable effect of slowing the time to reach a more
optimal MPS. Implementations SHOULD select the set of probe packet
sizes to maximise the gain in PLPMTU from each search step.
Implementations could optimize the search procedure by selecting step
sizes from a table of common PMTU sizes. When selecting the
appropriate next size to search, an implementor ought to also
consider that there can be common sizes of MPS that applications seek
to use.
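One way to select step sizes from a table is sketched below. The table contents are illustrative assumptions, not values taken from this document, and the function name is hypothetical. The sketch probes the next table entry above the current PLPMTU and falls back to PMTU_MAX when the next entry would exceed it.

```python
from bisect import bisect_right

# Illustrative table only; an implementation would choose its own entries.
COMMON_SIZES = [1280, 1400, 1492, 1500, 4352, 9000]


def next_probe_size(plpmtu, pmtu_max, sizes=COMMON_SIZES):
    """Return the next PROBED_SIZE, or None if there is nothing left to probe."""
    if plpmtu >= pmtu_max:
        return None
    i = bisect_right(sizes, plpmtu)  # index of first entry larger than plpmtu
    if i < len(sizes) and sizes[i] <= pmtu_max:
        return sizes[i]
    return pmtu_max  # no table entry fits, so probe the upper bound itself
```

For example, with a PLPMTU of 1200 and a PMTU_MAX of 1500, successive successful probes would step through 1280, 1400, 1492 and 1500 rather than every intermediate size.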
xxx Author Note: A future version of this section will detail example
methods for selecting probe size values, but does not plan to mandate
a single method. xxx
5.4.3. Resilience to inconsistent Path information
A decision to increase the PLPMTU needs to be resilient to the
possibility that information learned about the network path is
inconsistent (this could happen when probe packets are lost due to
other reasons, or some of the packets in a flow are forwarded along a
portion of the path that supports a different actual PMTU).
Frequent path changes could occur due to unexpected "flapping" -
where some packets from a flow pass along one path, but other packets
follow a different path with different properties. DPLPMTUD can be
made resilient to these anomalies by introducing hysteresis into the
search decision to increase the MPS.
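A simple form of hysteresis is to require several consecutive confirmations of a larger size before raising the MPS. The sketch below is one possible approach under that assumption; the class name and the default threshold of two confirmations are hypothetical and are not specified by this document.

```python
class RaiseHysteresis:
    """Raise the MPS only after `required` consecutive acknowledged probes."""

    def __init__(self, current_mps, required=2):
        self.mps = current_mps
        self.required = required
        self._candidate = None  # larger size currently being confirmed
        self._streak = 0        # consecutive acknowledgements for candidate

    def record(self, size, acked):
        """Record one probe outcome and return the MPS to use afterwards."""
        if size <= self.mps:
            return self.mps
        if not acked:
            # Any loss at a larger size restarts the confirmation.
            self._candidate, self._streak = None, 0
            return self.mps
        if size == self._candidate:
            self._streak += 1
        else:
            self._candidate, self._streak = size, 1
        if self._streak >= self.required:
            self.mps = size
            self._candidate, self._streak = None, 0
        return self.mps
```

With this rule, a single probe that happens to follow a larger-MTU path during flapping does not raise the MPS on its own.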
6. Specification of Protocol-Specific Methods
This section specifies protocol-specific details for datagram PLPMTUD This section specifies protocol-specific details for datagram PLPMTUD
for IETF-specified transports. for IETF-specified transports.
The first subsection provides guidance on how to implement the The first subsection provides guidance on how to implement the
DPLPMTUD method as a part of an application using UDP or UDP-Lite. DPLPMTUD method as a part of an application using UDP or UDP-Lite.
The guidance also applies to other datagram services that do not The guidance also applies to other datagram services that do not
include a specific transport protocol (such as a tunnel include a specific transport protocol (such as a tunnel
encapsulation). The following subsections describe how DPLPMTUD can encapsulation). The following subsections describe how DPLPMTUD can
be implemented as a part of the transport service, allowing be implemented as a part of the transport service, allowing
applications using the service to benefit from discovery of the applications using the service to benefit from discovery of the
PLPMTU without themselves needing to implement this method. PLPMTU without themselves needing to implement this method.
5.1. Application support for DPLPMTUD with UDP or UDP-Lite 6.1. Application support for DPLPMTUD with UDP or UDP-Lite
The current specifications of UDP [RFC0768] and UDP-Lite [RFC3828] do The current specifications of UDP [RFC0768] and UDP-Lite [RFC3828] do
not define a method in the RFC-series that supports PLPMTUD. In not define a method in the RFC-series that supports PLPMTUD. In
particular, the UDP transport does not provide the transport layer particular, the UDP transport does not provide the transport layer
features needed to implement datagram PLPMTUD. features needed to implement datagram PLPMTUD.
The DPLPMTUD method can be implemented as a part of an application The DPLPMTUD method can be implemented as a part of an application
built directly or indirectly on UDP or UDP-Lite, but relies on built directly or indirectly on UDP or UDP-Lite, but relies on
higher-layer protocol features to implement the method [RFC8085]. higher-layer protocol features to implement the method [RFC8085].
Some primitives used by DPLPMTUD might not be available via the Some primitives used by DPLPMTUD might not be available via the
Datagram API (e.g., the ability to access the PLPMTU cache, or Datagram API (e.g., the ability to access the PLPMTU cache, or
interpret received ICMP PTB messages). interpret received ICMP PTB messages).
In addition, it is desirable that PMTU discovery is not performed by In addition, it is desirable that PMTU discovery is not performed by
multiple protocol layers. An application SHOULD avoid implementing multiple protocol layers. An application SHOULD avoid implementing
DPLPMTUD when the underlying transport system provides this DPLPMTUD when the underlying transport system provides this
capability. Using a common method for manging the PLPMTU has capability. Using a common method for managing the PLPMTU has
benefits, both in the ability to share state between different benefits, both in the ability to share state between different
processes and opportunities to coordinate probing. processes and opportunities to coordinate probing.
5.1.1. Application Request 6.1.1. Application Request
An application needs an application-layer protocol mechanism (such as An application needs an application-layer protocol mechanism (such as
a message acknowledgement method) that solicits a response from a a message acknowledgement method) that solicits a response from a
destination endpoint. The method SHOULD allow the sender to check destination endpoint. The method SHOULD allow the sender to check
the value returned in the response to provide additional protection the value returned in the response to provide additional protection
from off-path insertion of data [RFC8085], suitable methods include a from off-path insertion of data [RFC8085], suitable methods include a
parameter known only to the two endpoints, such as a session ID or parameter known only to the two endpoints, such as a session ID or
initialised sequence number. initialised sequence number.
5.1.2. Application Response 6.1.2. Application Response
An application needs an application-layer protocol mechanism to An application needs an application-layer protocol mechanism to
communicate the response from the destination endpoint. This communicate the response from the destination endpoint. This
response may indicate successful reception of the probe across the response may indicate successful reception of the probe across the
path, but could also indicate that some (or all packets) have failed path, but could also indicate that some (or all packets) have failed
to reach the destination. to reach the destination.
5.1.3. Sending Application Probe Packets 6.1.3. Sending Application Probe Packets
A probe packet may carry an application data block, but the A probe packet may carry an application data block, but the
successful transmission of this data is at risk when used for successful transmission of this data is at risk when used for
probing. Some applications may prefer to use a probe packet that probing. Some applications may prefer to use a probe packet that
does not carry an application data block to avoid disruption to does not carry an application data block to avoid disruption to
normal data transfer. normal data transfer.
5.1.4. Validating the Path 6.1.4. Validating the Path
An application that does not have other higher-layer information An application that does not have other higher-layer information
confirming correct delivery of datagrams SHOULD implement the confirming correct delivery of datagrams SHOULD implement the
REACHABILITY_TIMER to periodically send probe packets while in the CONFIRMATION_TIMER to periodically send probe packets while in the
PROBE_DONE state. SEARCH_COMPLETE state.
5.1.5. Handling of PTB Messages 6.1.5. Handling of PTB Messages
An application that is able and wishes to receive PTB messages MUST An application that is able and wishes to receive PTB messages MUST
perform ICMP validation as specified in Section 5.2 of [RFC8085]. perform ICMP validation as specified in Section 5.2 of [RFC8085].
This requires the application to check each received PTB This requires the application to check each received PTB
message to validate that it is received in response to transmitted message to validate that it is received in response to transmitted
traffic and that the reported link MTU is less than the current probe traffic and that the reported PTB_SIZE is less than the current
size. A validated PTB message MAY be used as input to the DPLPMTUD probed size. A validated PTB message MAY be used as input to the
algorithm, but MUST NOT be used directly to set the PLPMTU. DPLPMTUD algorithm, but MUST NOT be used directly to set the PLPMTU.
5.2. DPLPMTUD with UDP Options 6.2. DPLPMTUD with UDP Options
UDP-Options [I-D.ietf-tsvwg-udp-options] can supply the additional UDP Options [I-D.ietf-tsvwg-udp-options] can supply the additional
functionality required to implement DPLPMTUD within the UDP transport functionality required to implement DPLPMTUD within the UDP transport
service. This avoids the need for applications to implement the service. Implementing DPLPMTUD using UDP Options avoids the need for
DPLPMTUD method. each application to implement the DPLPMTUD method.
This enables padding to be added to UDP datagrams and can be used to Section 5.6 of [I-D.ietf-tsvwg-udp-options] defines the MSS option,
provide feedback acknowledgement of received probe packets. which allows the local sender to indicate the EMTU_R to the peer.
The value received in this option can be used to initialise PMTU_MAX.
The specification also defines two UDP Options to support DPLMTUD. UDP Options enables padding to be added to UDP datagrams that are
used as Probe Packets. Feedback confirming reception of each Probe
Packet is provided by two new UDP Options:
Section 5.6 of [I-D.ietf-tsvwg-udp-options] defines the MSS option o The Probe Request Option (Section 6.2.1) is set by a sending PL to
which allows the local sender to indicate the EMTU_R to the peer. solicit a response from a remote endpoint. A four-byte token
This option can be used to initialise PMTU_MAX. An application identifies each request.
wishing to avoid the effects of MSS-Clamping (where a middlebox
changes the advertised TCP maximum sending size) ought to use a
cryptographic method to encrypt this parameter.
5.2.1. UDP Request Option o The Probe Response Option (Section 6.2.2) is generated by the UDP
Options receiver in response to reception of a previously received
Probe Request Option. Each Probe Response Option echoes a
previously received four-byte token.
The Request Option allows a sending endpoint to solicit a response The token value allows implementations to distinguish between
from a destination endpoint. acknowledgements for initial probe packets and acknowledgements
confirming receipt of subsequent probe packets (e.g., travelling
along alternate paths with a larger RTT). Each probe packet needs to
be uniquely identifiable by the UDP Options sender within the Maximum
Segment Lifetime (MSL). The UDP Options sender therefore needs to
not recycle token values until they have expired or have been
acknowledged. A 4 byte value for the token field provides sufficient
space for multiple unique probes to be made within the MSL.
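The token rules above (unique within the MSL, not recycled until expired or acknowledged) can be sketched as a small tracker. This is an illustration under stated assumptions: the class name is hypothetical, the 120 second default for the MSL is an assumed value, and random token selection is one choice among several.

```python
import secrets
import time
from typing import Optional


class TokenTracker:
    """Tracks outstanding probe tokens so that none is reused too early."""

    def __init__(self, msl_seconds=120.0, clock=time.monotonic):
        self._msl = msl_seconds     # assumed MSL value, configurable
        self._clock = clock
        self._outstanding = {}      # token -> (probe size, expiry time)

    def _expire(self) -> None:
        now = self._clock()
        for token in [t for t, (_, exp) in self._outstanding.items() if exp <= now]:
            del self._outstanding[token]

    def new_token(self, probe_size: int) -> int:
        """Allocate a 4-byte token that is not currently outstanding."""
        self._expire()
        while True:
            token = secrets.randbits(32)
            if token not in self._outstanding:
                break
        self._outstanding[token] = (probe_size, self._clock() + self._msl)
        return token

    def acknowledge(self, token: int) -> Optional[int]:
        """Return the probe size for a valid echoed token, else None."""
        self._expire()
        entry = self._outstanding.pop(token, None)
        return entry[0] if entry else None
```

Because each token maps to one probe, an echoed token identifies which probe size was confirmed even when responses arrive out of order.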
The Request Option carries a four byte token set by the sender. This Implementations ought to only send a probe packet with a Probe
token can be set to a value that is likely to be known only to the Request Option when required by their local state machine, i.e., when
sender (and becomes known to nodes along the end-to-end path). The probing to grow the PLPMTU or to confirm the current PLPMTU. The
sender can then check the value returned in the response to provide procedure to handle the loss of a response packet is the
additional protection from off-path insertion of data [RFC8085]. responsibility of the sender of the request.
+---------+--------+-----------------+ A PL needs to determine that the path can still support the size of
| Kind=9 | Len=6 | Token | datagram that the application is currently sending in the DPLPMTUD
+---------+--------+-----------------+ SEARCH_COMPLETE state (i.e., to detect black-holing of data). One way to
1 byte 1 byte 4 bytes achieve this is to send probe packets of size PLPMTU or to utilise a
higher-layer method that provides explicit feedback indicating any
packet loss. Another possibility is to utilise data packets that
carry a Timestamp Option. Reception of a valid timestamp that was
echoed by the remote endpoint can be used to infer connectivity.
Figure 5: UDP REQ Option Format This can provide useful feedback even over paths with asymmetric
capacity and/or that carry UDP Option flows that have very asymmetric
datagram rates, because an echo of the most recent timestamp still
indicates reception of at least one packet of the transmitted size.
This is sufficient to confirm there is no black hole.
5.2.2. UDP Response Option In contrast, when sending a probe to increase the PLPMTU, a timestamp
may be unable to unambiguously identify that a specific probe packet
has been received. Timestamp mechanisms cannot be used to confirm
the reception of individual probe messages and cannot be used to
stimulate a response from the remote peer.
The Response Option is generated by the PL in response to reception 6.2.1. UDP Probe Request Option
of a previously received Echo Request. The Token field associates
the response with the Token value carried in the most recently-
received Echo Request. The rate of generation of UDP packets
carrying a Response Option MAY be rate-limited.
+---------+--------+-----------------+ The Probe Request Option allows a sending endpoint to solicit a
| Kind=10 | Len=6 | Token | response from a destination endpoint.
+---------+--------+-----------------+
1 byte 1 byte 4 bytes
Figure 6: UDP RES Option Format The Probe Request Option carries a four byte token set by the sender.
This token can be set to a value that is likely to be known only to
the sender (and is sent along the end-to-end path). The sender can
then check the value returned in the UDP Probe Response Option. The
value of the Token field uniquely identifies a probe within the
maximum segment lifetime and can also provide additional protection
from off-path insertion of data [RFC8085].
5.3. DPLPMTUD for SCTP +---------+--------+-----------------+
| Kind=9 | Len=6 | Token |
+---------+--------+-----------------+
1 byte 1 byte 4 bytes
Figure 5: UDP Probe REQ Option Format
6.2.2. UDP Probe Response Option
The Probe Response Option is generated in response to reception of a
previously received Probe Request Option.
The Probe Response Option carries a four byte token field. The Token
field associates the response with the Token value carried in the
most recently received Probe Request Option. The generation of UDP
packets carrying a Probe Response Option MAY be rate-limited.
+---------+--------+-----------------+
| Kind=10 | Len=6 | Token |
+---------+--------+-----------------+
1 byte 1 byte 4 bytes
Figure 6: UDP Probe RES Option Format
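The option layouts in Figures 5 and 6 can be encoded and parsed as sketched below. The Kind values (9 and 10) and the length (6) are taken from the figures; the function names are hypothetical, and carrying the token in network byte order is an assumption of this sketch.

```python
import struct

PROBE_REQ_KIND = 9    # Kind value shown in Figure 5
PROBE_RES_KIND = 10   # Kind value shown in Figure 6
OPTION_LEN = 6        # 1 byte Kind + 1 byte Len + 4 byte Token

_FORMAT = "!BBI"      # assumption: token carried in network byte order


def build_probe_option(kind: int, token: int) -> bytes:
    """Encode a Probe Request or Probe Response Option."""
    if kind not in (PROBE_REQ_KIND, PROBE_RES_KIND):
        raise ValueError("unexpected option kind")
    return struct.pack(_FORMAT, kind, OPTION_LEN, token)


def parse_probe_option(data: bytes):
    """Decode an option and return (kind, token); reject malformed input."""
    if len(data) != OPTION_LEN:
        raise ValueError("option must be exactly 6 bytes")
    kind, length, token = struct.unpack(_FORMAT, data)
    if length != OPTION_LEN or kind not in (PROBE_REQ_KIND, PROBE_RES_KIND):
        raise ValueError("malformed probe option")
    return kind, token


def build_response(request: bytes) -> bytes:
    """Echo the token from a Probe Request in a Probe Response."""
    kind, token = parse_probe_option(request)
    if kind != PROBE_REQ_KIND:
        raise ValueError("not a Probe Request Option")
    return build_probe_option(PROBE_RES_KIND, token)
```

A sender would compare the token in a received response against the tokens of its outstanding probes before treating the probe as acknowledged.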
6.3. DPLPMTUD for SCTP
Section 10.2 of [RFC4821] specifies a recommended PLPMTUD probing Section 10.2 of [RFC4821] specifies a recommended PLPMTUD probing
method for SCTP. It recommends the use of the PAD chunk, defined in method for SCTP. It recommends the use of the PAD chunk, defined in
[RFC4820] to be attached to a minimum length HEARTBEAT chunk to build [RFC4820] to be attached to a minimum length HEARTBEAT chunk to build
a probe packet. This enables probing without affecting the transfer a probe packet. This enables probing without affecting the transfer
of user messages and without interfering with congestion control. of user messages and without interfering with congestion control.
This is preferred to using DATA chunks (with padding as required) as This is preferred to using DATA chunks (with padding as required) as
path probes. path probes.
XXX Future versions of this document might define a parameter XXX Author Note: Future versions of this document might define a
contained in the INIT and INIT ACK chunk to indicate the remote peer parameter contained in the INIT and INIT ACK chunk to indicate the
MTU to the local peer. However, multihoming makes this a bit remote peer MTU to the local peer. However, multihoming makes this a
complex, so it might not be worth doing. XXX bit complex, so it might not be worth doing. XXX
5.3.1. SCTP/IP4 and SCTP/IPv6 6.3.1. SCTP/IPv4 and SCTP/IPv6
The base protocol is specified in [RFC4960]. This provides an The base protocol is specified in [RFC4960]. This provides an
acknowledged PL. A sender can therefore enter the PROBE_BASE state acknowledged PL. A sender can therefore enter the PROBE_BASE state
as soon as connectivity has been confirmed. as soon as connectivity has been confirmed.
5.3.1.1. Sending SCTP Probe Packets 6.3.1.1. Sending SCTP Probe Packets
Probe packets consist of an SCTP common header followed by a Probe packets consist of an SCTP common header followed by a
HEARTBEAT chunk and a PAD chunk. The PAD chunk is used to control HEARTBEAT chunk and a PAD chunk. The PAD chunk is used to control
the length of the probe packet. The HEARTBEAT chunk is used to the length of the probe packet. The HEARTBEAT chunk is used to
trigger the sending of a HEARTBEAT ACK chunk. The reception of the trigger the sending of a HEARTBEAT ACK chunk. The reception of the
HEARTBEAT ACK chunk acknowledges reception of a successful probe. HEARTBEAT ACK chunk acknowledges reception of a successful probe.
The HEARTBEAT chunk carries a Heartbeat Information parameter which The HEARTBEAT chunk carries a Heartbeat Information parameter which
should include, besides the information suggested in [RFC4960], the should include, besides the information suggested in [RFC4960], the
probe size, which is the size of the complete datagram. The size of probe size, which is the size of the complete datagram. The size of
skipping to change at page 27, line 5 skipping to change at page 33, line 5
request and the PAD chunk header. The payload of the PAD chunk request and the PAD chunk header. The payload of the PAD chunk
contains arbitrary data. contains arbitrary data.
To avoid fragmentation of retransmitted data, probing starts right To avoid fragmentation of retransmitted data, probing starts right
after the handshake, before data is sent. Assuming normal behaviour after the handshake, before data is sent. Assuming normal behaviour
(i.e., the PMTU is smaller than or equal to the interface MTU), this (i.e., the PMTU is smaller than or equal to the interface MTU), this
process will take a few round trip time periods depending on the process will take a few round trip time periods depending on the
number of PMTU sizes probed. The Heartbeat timer can be used to number of PMTU sizes probed. The Heartbeat timer can be used to
implement the PROBE_TIMER. implement the PROBE_TIMER.
5.3.1.2. Validating the Path with SCTP 6.3.1.2. Validating the Path with SCTP
Since SCTP provides an acknowledged PL, a sender does MUST NOT Since SCTP provides an acknowledged PL, a sender MUST NOT implement
implement the REACHABILITY_TIMER while in the PROBE_DONE state. the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state.
5.3.1.3. PTB Message Handling by SCTP 6.3.1.3. PTB Message Handling by SCTP
Normal ICMP validation MUST be performed as specified in Appendix C Normal ICMP validation MUST be performed as specified in Appendix C
of [RFC4960]. This requires that the first 8 bytes of the SCTP of [RFC4960]. This requires that the first 8 bytes of the SCTP
common header are quoted in the payload of the PTB message, which can common header are quoted in the payload of the PTB message, which can
be the case for ICMPv4 and is normally the case for ICMPv6. be the case for ICMPv4 and is normally the case for ICMPv6.
When a PTB message has been validated, the router Link MTU indicated When a PTB message has been validated, the PTB_SIZE reported in the
in the PTB message SHOULD be used with the DPLPMTUD algorithm, PTB message SHOULD be used with the DPLPMTUD algorithm, providing
providing that the reported Link MTU is less than the current probe that the reported PTB_SIZE is less than the current probe size.
size.
5.3.2. DPLPMTUD for SCTP/UDP 6.3.2. DPLPMTUD for SCTP/UDP
The UDP encapsulation of SCTP is specified in [RFC6951]. The UDP encapsulation of SCTP is specified in [RFC6951].
5.3.2.1. Sending SCTP/UDP Probe Packets 6.3.2.1. Sending SCTP/UDP Probe Packets
Packet probing can be performed as specified in Section 5.3.1.1. The Packet probing can be performed as specified in Section 6.3.1.1. The
maximum payload is reduced by 8 bytes, which has to be considered maximum payload is reduced by 8 bytes, which has to be considered
when filling the PAD chunk. when filling the PAD chunk.
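The effect of the 8 byte UDP header on the PAD chunk can be sketched as a size calculation. This is an illustration with assumptions: the function and parameter names are hypothetical, the 12 byte SCTP common header and 4 byte chunk header sizes are taken from [RFC4960], and the caller supplies the IP header length and the full (already 4-byte aligned) HEARTBEAT chunk length.

```python
SCTP_COMMON_HEADER = 12  # bytes, per the SCTP base specification
CHUNK_HEADER = 4         # bytes: type, flags and length of the PAD chunk
UDP_HEADER = 8           # bytes added by UDP encapsulation


def pad_payload_len(probe_size, ip_header_len, heartbeat_chunk_len,
                    udp_encaps=False):
    """Return the PAD chunk payload length for a probe of probe_size bytes.

    probe_size is the size of the complete datagram, including IP header.
    The result should be a multiple of 4 for the datagram to match
    probe_size exactly; this sketch does not adjust for chunk alignment.
    """
    overhead = (ip_header_len + SCTP_COMMON_HEADER
                + heartbeat_chunk_len + CHUNK_HEADER)
    if udp_encaps:
        overhead += UDP_HEADER
    payload = probe_size - overhead
    if payload < 0:
        raise ValueError("probe size too small for the required headers")
    return payload
```

For a given probe size, the encapsulated case yields a PAD payload exactly 8 bytes shorter than the native SCTP case.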
5.3.2.2. Validating the Path with SCTP/UDP 6.3.2.2. Validating the Path with SCTP/UDP
Since SCTP provides an acknowledged PL, a sender does MUST NOT Since SCTP provides an acknowledged PL, a sender MUST NOT implement
implement the REACHABILITY_TIMER while in the PROBE_DONE state. the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state.
5.3.2.3. Handling of PTB Messages by SCTP/UDP 6.3.2.3. Handling of PTB Messages by SCTP/UDP
Normal ICMP validation MUST be performed for PTB messages as Normal ICMP validation MUST be performed for PTB messages as
specified in Appendix C of [RFC4960]. This requires that the first 8 specified in Appendix C of [RFC4960]. This requires that the first 8
bytes of the SCTP common header are contained in the PTB message, bytes of the SCTP common header are contained in the PTB message,
which can be the case for ICMPv4 (but note the UDP header also which can be the case for ICMPv4 (but note the UDP header also
consumes a part of the quoted packet header) and is normally the case consumes a part of the quoted packet header) and is normally the case
for ICMPv6. When the validation is completed, the router Link MTU for ICMPv6. When the validation is completed, the PTB_SIZE indicated
size indicated in the PTB message SHOULD be used with the DPLPMTUD in the PTB message SHOULD be used with the DPLPMTUD providing that
providing that the reported link MTU is less than the current probe the reported PTB_SIZE is less than the current probe size.
size.
5.3.3. DPLPMTUD for SCTP/DTLS 6.3.3. DPLPMTUD for SCTP/DTLS
The Datagram Transport Layer Security (DTLS) encapsulation of SCTP is The Datagram Transport Layer Security (DTLS) encapsulation of SCTP is
specified in [RFC8261]. It is used for data channels in WebRTC specified in [RFC8261]. It is used for data channels in WebRTC
implementations. implementations.
5.3.3.1. Sending SCTP/DTLS Probe Packets 6.3.3.1. Sending SCTP/DTLS Probe Packets
Packet probing can be done as specified in Section 5.3.1.1. Packet probing can be done as specified in Section 6.3.1.1.
5.3.3.2. Validating the Path with SCTP/DTLS 6.3.3.2. Validating the Path with SCTP/DTLS
Since SCTP provides an acknowledged PL, a sender does MUST NOT Since SCTP provides an acknowledged PL, a sender MUST NOT implement
implement the REACHABILITY_TIMER while in the PROBE_DONE state. the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state.
5.3.3.3. Handling of PTB Messages by SCTP/DTLS 6.3.3.3. Handling of PTB Messages by SCTP/DTLS
It is not possible to perform normal ICMP validation as specified in It is not possible to perform normal ICMP validation as specified in
[RFC4960], since even if the ICMP message payload contains sufficient [RFC4960], since even if the ICMP message payload contains sufficient
information, the reflected SCTP common header would be encrypted. information, the reflected SCTP common header would be encrypted.
Therefore it is not possible to process PTB messages at the PL. Therefore it is not possible to process PTB messages at the PL.
5.4. DPLPMTUD for QUIC 6.4. DPLPMTUD for QUIC
Quick UDP Internet Connection (QUIC) [I-D.ietf-quic-transport] is a Quick UDP Internet Connection (QUIC) [I-D.ietf-quic-transport] is a
UDP-based transport that provides reception feedback. UDP-based transport that provides reception feedback.
Section 9.2 of [I-D.ietf-quic-transport] describes the path Section 9.2 of [I-D.ietf-quic-transport] describes the path
considerations when sending QUIC packets. It recommends the use of considerations when sending QUIC packets. It recommends the use of
PADDING frames to build the probe packet. This enables probing the PADDING frames to build the probe packet. This enables probing
without affecting the transfer of other QUIC frames. without affecting the transfer of other QUIC frames.
This provides an acknowledged PL. A sender can therefore enter the This provides an acknowledged PL. A sender can therefore enter the
PROBE_BASE state as soon as connectivity has been confirmed. PROBE_BASE state as soon as connectivity has been confirmed.
5.4.1. Sending QUIC Probe Packets 6.4.1. Sending QUIC Probe Packets
A probe packet consists of a QUIC Header and a payload containing A probe packet consists of a QUIC Header and a payload containing
only PADDING Frames. PADDING Frames are a single octet (0x00) and only PADDING Frames. PADDING Frames are a single octet (0x00) and
several of these can be used to create a probe packet of size several of these can be used to create a probe packet of size
PROBED_SIZE. QUIC provides an acknowledged PL. A sender can PROBED_SIZE. QUIC provides an acknowledged PL. A sender can
therefore enter the PROBE_BASE state as soon as connectivity has been therefore enter the PROBE_BASE state as soon as connectivity has been
confirmed. confirmed.
The current specification of QUIC sets the following: The current specification of QUIC sets the following:
o BASE_PMTU: 1200. A QUIC sender needs to pad initial packets to o BASE_PMTU: 1200. A QUIC sender needs to pad initial packets to
1200 bytes to validate the path can support packets of a useful 1200 bytes to confirm the path can support packets of a useful
size. size.
o MIN_PMTU: 1200 bytes. A QUIC sender that determines the PMTU has o MIN_PMTU: 1200 bytes. A QUIC sender that determines the PMTU has
fallen below 1200 bytes MUST immediately stop sending on the fallen below 1200 bytes MUST immediately stop sending on the
affected path. affected path.
5.4.2. Validating the Path with QUIC 6.4.2. Validating the Path with QUIC
QUIC provides an acknowledged PL. A sender therefore MUST NOT QUIC provides an acknowledged PL. A sender therefore MUST NOT
implement the REACHABILITY_TIMER while in the PROBE_DONE state. implement the CONFIRMATION_TIMER while in the SEARCH_COMPLETE state.
5.4.3. Handling of PTB Messages by QUIC 6.4.3. Handling of PTB Messages by QUIC
QUIC operates over the UDP transport, and the guidelines on ICMP QUIC operates over the UDP transport, and the guidelines on ICMP
validation as specified in Section 5.2 of [RFC8085] therefore apply. validation as specified in Section 5.2 of [RFC8085] therefore apply.
Although QUIC does not currently specify a method for validating ICMP Although QUIC does not currently specify a method for validating ICMP
responses, it does provide some guidelines to make it harder for an responses, it does provide some guidelines to make it harder for an
off-path attacker to inject ICMP messages. off-path attacker to inject ICMP messages.
o Set the IPv4 Don't Fragment (DF) bit on a small proportion of o Set the IPv4 Don't Fragment (DF) bit on a small proportion of
packets, so that most invalid ICMP messages arrive when there are packets, so that most invalid ICMP messages arrive when there are
no DF packets outstanding, and can therefore be identified as no DF packets outstanding, and can therefore be identified as
skipping to change at page 29, line 42 skipping to change at page 35, line 34
packets (for example, the IP ID or UDP checksum) to further packets (for example, the IP ID or UDP checksum) to further
authenticate incoming Datagram Too Big messages. authenticate incoming Datagram Too Big messages.
o Any reduction in PMTU due to a report contained in an ICMP packet o Any reduction in PMTU due to a report contained in an ICMP packet
is provisional until QUIC's loss detection algorithm determines is provisional until QUIC's loss detection algorithm determines
that the packet is actually lost. that the packet is actually lost.
XXX The above list was pulled whole from quic-transport - input is XXX The above list was pulled whole from quic-transport - input is
invited from QUIC contributors. XXX invited from QUIC contributors. XXX
6. Acknowledgements 7. Acknowledgements
This work was partially funded by the European Union's Horizon 2020 This work was partially funded by the European Union's Horizon 2020
research and innovation programme under grant agreement No. 644334 research and innovation programme under grant agreement No. 644334
(NEAT). The views expressed are solely those of the author(s). (NEAT). The views expressed are solely those of the author(s).
7. IANA Considerations 8. IANA Considerations
This memo includes no request to IANA. This memo includes no request to IANA.
XXX If new UDP Options are specified in this document, a request to XXX If new UDP Options are specified in this document, a request to
IANA will be included here. XXX IANA will be included here. XXX
If there are no requirements for IANA, the section will be removed If there are no requirements for IANA, the section will be removed
during conversion into an RFC by the RFC Editor. during conversion into an RFC by the RFC Editor.
8. Security Considerations 9. Security Considerations
The security considerations for the use of UDP and SCTP are provided The security considerations for the use of UDP and SCTP are provided
in the referenced RFCs. Security guidance for applications using UDP in the referenced RFCs. Security guidance for applications using UDP
is provided in the UDP Usage Guidelines [RFC8085]. is provided in the UDP Usage Guidelines [RFC8085].
There are cases where PTB messages are not delivered due to policy, There are cases where PTB messages are not delivered due to policy,
configuration or equipment design (see Section 1.1). This method configuration or equipment design (see Section 1.1). This method
therefore does not rely upon PTB messages being received, but is able therefore does not rely upon PTB messages being received, but is able
to utilise these when they are received by the sender. PTB messages to utilise these when they are received by the sender. PTB messages
could potentially be used to cause a node to inappropriately reduce could potentially be used to cause a node to inappropriately reduce
the PLPMTU. A node supporting DPLPMTUD MUST therefore appropriately the PLPMTU. A node supporting DPLPMTUD MUST therefore appropriately
validate the payload of PTB messages to ensure these are received in validate the payload of PTB messages to ensure these are received in
response to transmitted traffic (i.e., a reported error condition response to transmitted traffic (i.e., a reported error condition
that corresponds to a datagram actually sent by the path layer. that corresponds to a datagram actually sent by the path layer).
Parallel forwarding paths may need to be considered. Section 3.5 Parallel forwarding paths may need to be considered. Section 5.2.5.1
identifies the need for robustness in the method when the path identifies the need for robustness in the method when the path
information may be inconsistent. information may be inconsistent.
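The PTB validation requirement above can be illustrated with a minimal sketch. It is not taken from either draft revision: the record type SentDatagram, the function ptb_is_plausible, the choice of UDP ports as the matching key, and the rule that the reported size must be smaller than the quoted datagram are all assumptions made for illustration. A real PL would match on whatever fields of the quoted packet it can authenticate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SentDatagram:
    """Hypothetical record of a datagram that is still outstanding."""
    src_port: int
    dst_port: int
    size: int


def ptb_is_plausible(quoted_src_port, quoted_dst_port, ptb_size, outstanding):
    """Return True only if the PTB quotes a datagram this sender transmitted.

    Assumption: a PTB is also ignored when it reports a size that is not
    smaller than the quoted datagram, since such a report cannot explain
    the drop.
    """
    for datagram in outstanding:
        same_flow = (datagram.src_port, datagram.dst_port) == (
            quoted_src_port,
            quoted_dst_port,
        )
        if same_flow and ptb_size < datagram.size:
            return True
    return False
```

A PTB that fails this check would be discarded without changing the PLPMTU.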
A node performing DPLPMTUD could experience conflicting information A node performing DPLPMTUD could experience conflicting information
about the size of supported probe packets. This could occur when about the size of supported probe packets. This could occur when
multiple paths are concurrently in use and these exhibit a multiple paths are concurrently in use and these exhibit a
different PMTU. If not considered, this could result in data being different PMTU. If not considered, this could result in data being
black holed when the PLPMTU is larger than the smallest PMTU across black holed when the PLPMTU is larger than the smallest PMTU across
the current paths. the current paths.
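One conservative way to avoid the black-holing described above is to bound the PLPMTU by the smallest size confirmed on any path in use. The sketch below assumes a hypothetical mapping from a path identifier to the PLPMTU confirmed on that path; neither the mapping nor the function name comes from the draft.

```python
def safe_plpmtu(confirmed_by_path):
    """Return the largest size known to work on every path in use.

    confirmed_by_path is a hypothetical dict mapping a path identifier
    to the PLPMTU confirmed on that path.
    """
    if not confirmed_by_path:
        raise ValueError("no paths in use")
    return min(confirmed_by_path.values())
```

This trades efficiency on the larger paths for the guarantee that no datagram exceeds the smallest PMTU.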
An on-path attacker could forge PTB messages to drive down the PLPMTU An on-path attacker could forge PTB messages to drive down the PLPMTU
9. References 10. References
9.1. Normative References 10.1. Normative References
[I-D.ietf-quic-transport] [I-D.ietf-quic-transport]
Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed
and Secure Transport", draft-ietf-quic-transport-13 (work and Secure Transport", draft-ietf-quic-transport-14 (work
in progress), June 2018. in progress), August 2018.
[I-D.ietf-tsvwg-udp-options] [I-D.ietf-tsvwg-udp-options]
Touch, J., "Transport Options for UDP", draft-ietf-tsvwg- Touch, J., "Transport Options for UDP", draft-ietf-tsvwg-
udp-options-04 (work in progress), July 2018. udp-options-05 (work in progress), July 2018.
[RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768, [RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768,
DOI 10.17487/RFC0768, August 1980, DOI 10.17487/RFC0768, August 1980,
<https://www.rfc-editor.org/info/rfc768>. <https://www.rfc-editor.org/info/rfc768>.
[RFC0792] Postel, J., "Internet Control Message Protocol", STD 5, [RFC0792] Postel, J., "Internet Control Message Protocol", STD 5,
RFC 792, DOI 10.17487/RFC0792, September 1981, RFC 792, DOI 10.17487/RFC0792, September 1981,
<https://www.rfc-editor.org/info/rfc792>. <https://www.rfc-editor.org/info/rfc792>.
[RFC1122] Braden, R., Ed., "Requirements for Internet Hosts - [RFC1122] Braden, R., Ed., "Requirements for Internet Hosts -
skipping to change at page 32, line 19 skipping to change at page 38, line 15
[RFC8201] McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed., [RFC8201] McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed.,
"Path MTU Discovery for IP version 6", STD 87, RFC 8201, "Path MTU Discovery for IP version 6", STD 87, RFC 8201,
DOI 10.17487/RFC8201, July 2017, DOI 10.17487/RFC8201, July 2017,
<https://www.rfc-editor.org/info/rfc8201>. <https://www.rfc-editor.org/info/rfc8201>.
[RFC8261] Tuexen, M., Stewart, R., Jesup, R., and S. Loreto, [RFC8261] Tuexen, M., Stewart, R., Jesup, R., and S. Loreto,
"Datagram Transport Layer Security (DTLS) Encapsulation of "Datagram Transport Layer Security (DTLS) Encapsulation of
SCTP Packets", RFC 8261, DOI 10.17487/RFC8261, November SCTP Packets", RFC 8261, DOI 10.17487/RFC8261, November
2017, <https://www.rfc-editor.org/info/rfc8261>. 2017, <https://www.rfc-editor.org/info/rfc8261>.
9.2. Informative References 10.2. Informative References
[RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191, [RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
DOI 10.17487/RFC1191, November 1990, DOI 10.17487/RFC1191, November 1990,
<https://www.rfc-editor.org/info/rfc1191>. <https://www.rfc-editor.org/info/rfc1191>.
[RFC2923] Lahey, K., "TCP Problems with Path MTU Discovery", [RFC2923] Lahey, K., "TCP Problems with Path MTU Discovery",
RFC 2923, DOI 10.17487/RFC2923, September 2000, RFC 2923, DOI 10.17487/RFC2923, September 2000,
<https://www.rfc-editor.org/info/rfc2923>. <https://www.rfc-editor.org/info/rfc2923>.
[RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram [RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram
skipping to change at page 32, line 48 skipping to change at page 38, line 44
[RFC4890] Davies, E. and J. Mohacsi, "Recommendations for Filtering [RFC4890] Davies, E. and J. Mohacsi, "Recommendations for Filtering
ICMPv6 Messages in Firewalls", RFC 4890, ICMPv6 Messages in Firewalls", RFC 4890,
DOI 10.17487/RFC4890, May 2007, DOI 10.17487/RFC4890, May 2007,
<https://www.rfc-editor.org/info/rfc4890>. <https://www.rfc-editor.org/info/rfc4890>.
Appendix A. Event-driven state changes Appendix A. Event-driven state changes
This appendix contains an informative description of key events: This appendix contains an informative description of key events:
Path Setup: When a new path is initiated, the state is set to Path Setup: When a new path is initiated, the state is set to
PROBE_START. As soon as the path is confirmed, the state changes PROBE_START. This sends a probe packet with the size of the
to PROBE_BASE and probing for this path is started. The first BASE_PMTU. As soon as the path is confirmed, the state changes to
probe packet is sent with the size of the BASE_PMTU. PROBE_SEARCH.
Arrival of an Acknowledgment: Depending on the probing state, the Arrival of an Acknowledgment: Depending on the probing state, the
reaction differs according to Figure 7, which is a simplification reaction differs according to Figure 7, which is a simplification
of Figure 4 focusing on this event. of Figure 4 focusing on this event.
+--------------+ +----------------+ +--------------+ +----------------+
| PROBE_START | --3------------------------------->| PROBE_DISABLED | | PROBE_START | --3------------------------------> | PROBE_DISABLED |
+--------------+ --4-----------\ +----------------+ +--------------+ --4---------------- ------------> +----------------+
\ \/
+--------------+ \ +--------------+ /\ +--------------+
| PROBE_ERROR | --------------- \ | PROBE_ERROR | -------------------- \ ----------> | PROBE_BASE |
+--------------+ \ \ +--------------+ --4--------------/ \ +--------------+
\ \ \
+--------------+ \ \ +--------------+ +--------------+ --1 -------- \ +--------------+
| PROBE_BASE | --1---------- \ ------------> | PROBE_BASE | | PROBE_BASE | \ --- \ ------> | PROBE_ERROR |
+--------------+ --2----- \ \ +--------------+ +--------------+ --3--------- \ -----/ \ +--------------+
\ \ \ \ \
+--------------+ \ \ ------------> +--------------+ +--------------+ \ -----> +--------------+
| PROBE_SEARCH | --2--- \ -----------------> | PROBE_SEARCH | | PROBE_SEARCH | --2--- -----------------> | PROBE_SEARCH |
+--------------+ --1---\----\---------------------> +--------------+ +--------------+ \ ------------------> +--------------+
\ \ \ ---- /
+--------------+ \ \ +--------------+ +---------------+ / \ +---------------+
| PROBE_DONE | \ -------------------> | PROBE_DONE | |SEARCH_COMPLETE| -1--- \ |SEARCH_COMPLETE|
+--------------+ -----------------------> +--------------+ +---------------+ -5-- -----------------------> +---------------+
\
\ +--------------+
--------------------------> | PROBE_BASE |
+--------------+
Condition 1: The maximum PMTU size has not yet been reached. Condition 1: The maximum PMTU size has not yet been reached.
Condition 2: The maximum PMTU size has been reached. Condition 3: Condition 2: The maximum PMTU size has been reached. Condition 3:
Probe Timer expires and PROBE_COUNT = MAX_PROBEs. Condition 4: Probe Timer expires and PROBE_COUNT = MAX_PROBEs. Condition 4:
PROBE_ACK received. PROBE_ACK received. Condition 5: Black hole detected.
Figure 7: State changes at the arrival of an acknowledgment Figure 7: State changes at the arrival of an acknowledgment
Probing timeout: The PROBE_COUNT is initialised to zero each time Probing timeout: The PROBE_COUNT is initialised to zero each time
the value of PROBED_SIZE is changed and when an acknowledgment the value of PROBED_SIZE is changed and when an acknowledgment
confirming delivery of a probe packet arrives. The PROBE_TIMER is confirming delivery of a probe packet arrives. The PROBE_TIMER is started
started each time a probe packet is sent. It is stopped when an each time a probe packet is sent. It is stopped when an
acknowledgment arrives that confirms delivery of a probe packet of acknowledgment arrives that confirms delivery of a probe packet of
PROBED_SIZE. If the probe packet is not acknowledged before the PROBED_SIZE. If the probe packet is not acknowledged before the
PROBE_TIMER expires, the PROBE_COUNT is incremented. When the PROBE_TIMER expires, the PROBE_COUNT is incremented. When the
PROBE_COUNT equals the value MAX_PROBES, the state is changed, PROBE_COUNT equals the value MAX_PROBES, the state is changed,
otherwise a new probe packet of the same size (PROBED_SIZE) is otherwise a new probe packet of the same size (PROBED_SIZE) is
resent. The state transitions are illustrated in Figure 8. This resent. The state transitions are illustrated in Figure 8. This
shows a simplification of Figure 4 with a focus only on this shows a simplification of Figure 4 with a focus only on this
event. event.
+--------------+ +----------------+ +--------------+ +----------------+
| PROBE_START |----------------------------------->| PROBE_DISABLED | | PROBE_START | --2------------------------------->| PROBE_DISABLED |
+--------------+ +----------------+ +--------------+ +----------------+
+--------------+ +--------------+ +--------------+ +--------------+
| PROBE_ERROR | -----------------> | PROBE_ERROR | | PROBE_ERROR | -----------------> | PROBE_ERROR |
+--------------+ / +--------------+ +--------------+ / +--------------+
/ /
+--------------+ --2----------/ +--------------+ +--------------+ --2----------/ +--------------+
| PROBE_BASE | --1------------------------------> | PROBE_BASE | | PROBE_BASE | --1------------------------------> | PROBE_BASE |
+--------------+ +--------------+ +--------------+ +--------------+
+--------------+ +--------------+ +--------------+ +--------------+
| PROBE_SEARCH | --1------------------------------> | PROBE_SEARCH | | PROBE_SEARCH | --1------------------------------> | PROBE_SEARCH |
+--------------+ --2--------- +--------------+ +--------------+ --2--------- +--------------+
\ \
+--------------+ \ +--------------+ +---------------+ \ +---------------+
| PROBE_DONE | -------------------> | PROBE_DONE | |SEARCH_COMPLETE| -------------------> |SEARCH_COMPLETE|
+--------------+ +--------------+ +---------------+ +---------------+
Condition 1: The maximum number of probe packets has not been Condition 1: The maximum number of probe packets has not been
reached. Condition 2: The maximum number of probe packets has been reached. Condition 2: The maximum number of probe packets has been
reached. reached. XXX This diagram has not been validated.
Figure 8: State changes at the expiration of the probe timer Figure 8: State changes at the expiration of the probe timer
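The probe timer behaviour described in the "Probing timeout" text can be sketched as follows. This is an illustrative model, not the draft's specification: the class ProbeState, its method names, the returned strings, and the value chosen for MAX_PROBES are assumptions. The sketch also does not say which state is entered when the limit is reached, since that depends on the current state as shown in Figure 8.

```python
MAX_PROBES = 3  # assumed value; the excerpt does not fix it


class ProbeState:
    """Hypothetical holder for PROBED_SIZE, PROBE_COUNT and PROBE_TIMER."""

    def __init__(self, probed_size):
        self.probed_size = probed_size
        self.probe_count = 0
        self.timer_running = False

    def send_probe(self):
        # The PROBE_TIMER is started each time a probe packet is sent.
        self.timer_running = True

    def set_probed_size(self, size):
        # PROBE_COUNT is initialised to zero when PROBED_SIZE changes.
        self.probed_size = size
        self.probe_count = 0

    def on_ack(self, acked_size):
        # An acknowledgment for a probe of PROBED_SIZE stops the timer
        # and resets PROBE_COUNT.
        if acked_size == self.probed_size:
            self.timer_running = False
            self.probe_count = 0

    def on_timer_expiry(self):
        # An unacknowledged probe increments PROBE_COUNT. At MAX_PROBES
        # the state is changed; otherwise the same size is probed again.
        self.timer_running = False
        self.probe_count += 1
        if self.probe_count == MAX_PROBES:
            return "change_state"
        self.send_probe()
        return "resend"
```

A caller would map "change_state" onto the transition that Figure 8 gives for the current state.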
PMTU raise timer timeout: The path through the network can change PMTU raise timer timeout: DPLPMTUD periodically sends a probe packet
over time. It is impossible to discover whether a path change has to detect whether a larger PMTU is possible. This probe packet is
increased the actual PMTU by exchanging packets less than or equal generated by the PMTU_RAISE_TIMER.
to the PLPMTU. This requires PLPMTUD to periodically send a probe
packet to detect whether a larger PMTU is possible. This probe
packet is generated by the PMTU_RAISE_TIMER. When the timer
expires, probing is restarted with the BASE_PMTU and the state is
changed to PROBE_BASE.
Arrival of a PTB message: The active probing of the path can be Arrival of a PTB message: The active probing of the path can be
supported by the arrival of a PTB message sent by a router or supported by the arrival of a PTB message indicating the PTB_SIZE.
middleboxes indicating the router's local link MTU. Two cases can Two examples are:
be distinguished:
1. The indicated link MTU in the PTB message is between the 1. The PTB_SIZE is between the PLPMTU and the probe that
already probed and PLPMTU and the probe that triggered the PTB triggered the PTB message.
message.
2. The indicated link MTU in the PTB message is smaller than the 2. The PTB_SIZE is smaller than the PLPMTU.
PLPMTU.
In the first case, the PROBE_BASE state transitions to the PROBE_ERROR In the first case, the PROBE_BASE state transitions to the PROBE_ERROR
state. In the PROBE_SEARCH state, a new probe packet is sent with state. In the PROBE_SEARCH state, a new probe packet is sent with
the size reported by the PTB message. Its result is handled the size reported by the PTB message.
according to the former events.
The second case could be a result of a network re-configuration.
If the reported link MTU in the PTB message is greater than the
BASE_MTU, the probing starts again with a value of PROBE_BASE.
Otherwise, the method enters the state PROBE_ERROR.
Note: Not all routers include the link MTU size when they send a In the second case, the probing starts again with a value of
PTB message. If the PTB message does not indicate the link MTU, PROBE_BASE.
the probe is handled in the same way as condition 2 of Figure 8.
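The two PTB examples, as worded in the -04 text, can be sketched as a small decision function. The function name on_ptb, the tuple it returns, and the use of strict inequalities for "between" are assumptions for illustration. The additional checks in the -03 text (comparison with BASE_MTU, and PTB messages that carry no link MTU) are deliberately not modelled.

```python
def on_ptb(state, ptb_size, plpmtu, probed_size):
    """Return (next_state, next_probe_size) for a validated PTB message.

    next_probe_size is None when the sketch does not choose a size.
    Assumption: "between" is read as strictly between the two values.
    """
    if ptb_size < plpmtu:
        # Second example: probing starts again from PROBE_BASE.
        return ("PROBE_BASE", None)
    if plpmtu < ptb_size < probed_size:
        # First example: the outcome depends on the current state.
        if state == "PROBE_BASE":
            return ("PROBE_ERROR", None)
        if state == "PROBE_SEARCH":
            # Probe again using the size reported by the PTB message.
            return ("PROBE_SEARCH", ptb_size)
    # Any other combination is left unchanged by this sketch.
    return (state, None)
```

The function assumes the PTB message has already been validated against transmitted traffic.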
Appendix B. Revision Notes Appendix B. Revision Notes
Note to RFC-Editor: please remove this entire section prior to Note to RFC-Editor: please remove this entire section prior to
publication. publication.
Individual draft -00: Individual draft -00:
o Comments and corrections are welcome directly to the authors or o Comments and corrections are welcome directly to the authors or
via the IETF TSVWG working group mailing list. via the IETF TSVWG working group mailing list.
skipping to change at page 37, line 4 skipping to change at page 42, line 37
o Added more discussion of implementation within an application. o Added more discussion of implementation within an application.
o Added text on flapping paths. o Added text on flapping paths.
o Replaced 'effective MTU' with new term PLPMTU. o Replaced 'effective MTU' with new term PLPMTU.
Working Group draft -03: Working Group draft -03:
o Updated figures o Updated figures
o Added more discussion on blackhole detection o Added more discussion on blackhole detection
o Added figure describing just blackhole detection o Added figure describing just blackhole detection
o Added figure relating MPS sizes o Added figure relating MPS sizes
o Updated full state machine artwork for clarity Working Group draft -04:
o Changed all text to refer to /packet probes/ /validation/ (rather o Described phases and named these consistently.
than /verification/).
o Corrected transition from confirmation directly to the search
phase (Base has been checked).
o Redrawn state diagrams.
o Renamed BASE_MTU to BASE_PMTU (because it is a base for the PMTU).
o Clarified Error state.
o Clarified suspending DPLPMTUD.
o Verified normative text in requirements section.
o Removed duplicate text.
o Changed all text to refer to /packet probe/probe packet/
/validation/verification/ added term /Probe Confirmation/ and
clarified BlackHole detection.
Authors' Addresses Authors' Addresses
Godred Fairhurst Godred Fairhurst
University of Aberdeen University of Aberdeen
School of Engineering School of Engineering
Fraser Noble Building Fraser Noble Building
Aberdeen AB24 3U Aberdeen AB24 3U
UK UK
 End of changes. 226 change blocks. 
766 lines changed or deleted 1077 lines changed or added
