draft-ietf-intarea-gue-06.txt   draft-ietf-intarea-gue-07.txt 
Internet Area WG T. Herbert Internet Area WG T. Herbert
Internet-Draft Quantonium Internet-Draft Quantonium
Intended status: Standard track L. Yong Intended status: Standard track L. Yong
Expires March 4, 2019 Huawei USA Expires September 8, 2019 Independent
O. Zia O. Zia
Microsoft Microsoft
August 31, 2018 March 7, 2019
Generic UDP Encapsulation Generic UDP Encapsulation
draft-ietf-intarea-gue-06 draft-ietf-intarea-gue-07
Status of this Memo Status of this Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet- other groups may also distribute working documents as Internet-
Drafts. Drafts.
skipping to change at page 1, line 35 skipping to change at page 1, line 35
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html http://www.ietf.org/shadow.html
This Internet-Draft will expire on March 4, 2019. This Internet-Draft will expire on September 8, 2019.
Copyright Notice Copyright Notice
Copyright (c) 2018 IETF Trust and the persons identified as the Copyright (c) 2019 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. to this document.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
skipping to change at page 3, line 21 skipping to change at page 3, line 21
efficient handling of UDP packets can be leveraged. GUE specifies efficient handling of UDP packets can be leveraged. GUE specifies
basic encapsulation methods upon which higher level constructs, such basic encapsulation methods upon which higher level constructs, such
as tunnels and overlay networks for network virtualization, can be as tunnels and overlay networks for network virtualization, can be
constructed. GUE is extensible by allowing optional data fields as constructed. GUE is extensible by allowing optional data fields as
part of the encapsulation, and is generic in that it can encapsulate part of the encapsulation, and is generic in that it can encapsulate
packets of various IP protocols. packets of various IP protocols.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 5 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.1. Terminology and acronyms . . . . . . . . . . . . . . . . . 5 1.1. Applicability . . . . . . . . . . . . . . . . . . . . . . . 5
1.2. Requirements Language . . . . . . . . . . . . . . . . . . 6 1.2. Terminology and acronyms . . . . . . . . . . . . . . . . . 6
2. Base packet format . . . . . . . . . . . . . . . . . . . . . . 7 1.3. Requirements Language . . . . . . . . . . . . . . . . . . . 7
2.1. GUE variant . . . . . . . . . . . . . . . . . . . . . . . . 7 2. Base packet format . . . . . . . . . . . . . . . . . . . . . . 8
3. Variant 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 2.1. GUE variant . . . . . . . . . . . . . . . . . . . . . . . . 8
3.1. Header format . . . . . . . . . . . . . . . . . . . . . . . 8 3. Variant 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.2. Proto/ctype field . . . . . . . . . . . . . . . . . . . . . 9 3.1. Header format . . . . . . . . . . . . . . . . . . . . . . . 9
3.2.1 Proto field . . . . . . . . . . . . . . . . . . . . . . 9 3.2. Proto/ctype field . . . . . . . . . . . . . . . . . . . . . 10
3.2.2 Ctype field . . . . . . . . . . . . . . . . . . . . . . 10 3.2.1. Proto field . . . . . . . . . . . . . . . . . . . . . . 10
3.2.2. Ctype field . . . . . . . . . . . . . . . . . . . . . . 11
3.3. Flags and extension fields . . . . . . . . . . . . . . . . 11 3.3. Flags and extension fields . . . . . . . . . . . . . . . . 11
3.3.1. Requirements . . . . . . . . . . . . . . . . . . . . . 11 3.3.1. Requirements . . . . . . . . . . . . . . . . . . . . . 11
3.3.2. Example GUE header with extension fields . . . . . . . 11 3.3.2. Example GUE header with extension fields . . . . . . . 12
3.4. Private data . . . . . . . . . . . . . . . . . . . . . . . 12 3.4. Private data . . . . . . . . . . . . . . . . . . . . . . . 13
3.5. Message types . . . . . . . . . . . . . . . . . . . . . . . 13 3.5. Message types . . . . . . . . . . . . . . . . . . . . . . . 13
3.5.1. Control messages . . . . . . . . . . . . . . . . . . . 13 3.5.1. Control messages . . . . . . . . . . . . . . . . . . . 13
3.5.2. Data messages . . . . . . . . . . . . . . . . . . . . . 13 3.5.2. Data messages . . . . . . . . . . . . . . . . . . . . . 14
3.6. Hiding the transport layer protocol number . . . . . . . . 13 4. Variant 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
4. Variant 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
4.1. Direct encapsulation of IPv4 . . . . . . . . . . . . . . . 15 4.1. Direct encapsulation of IPv4 . . . . . . . . . . . . . . . 15
4.2. Direct encapsulation of IPv6 . . . . . . . . . . . . . . . 16 4.2. Direct encapsulation of IPv6 . . . . . . . . . . . . . . . 16
5. Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 5. Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
5.1. Network tunnel encapsulation . . . . . . . . . . . . . . . 17 5.1. Network tunnel encapsulation . . . . . . . . . . . . . . . 17
5.2. Transport layer encapsulation . . . . . . . . . . . . . . . 17 5.2. Transport layer encapsulation . . . . . . . . . . . . . . . 17
5.3. Encapsulator operation . . . . . . . . . . . . . . . . . . 18 5.3. Encapsulator operation . . . . . . . . . . . . . . . . . . 18
5.4. Decapsulator operation . . . . . . . . . . . . . . . . . . 18 5.4. Decapsulator operation . . . . . . . . . . . . . . . . . . 18
5.4.1. Processing a received data message . . . . . . . . . . 18 5.4.1. Processing a received data message . . . . . . . . . . 18
5.4.2. Processing a received control message . . . . . . . . . 19 5.4.2. Processing a received control message . . . . . . . . . 19
5.5. Router and switch operation . . . . . . . . . . . . . . . . 19 5.5. Middlebox inspection . . . . . . . . . . . . . . . . . . . 19
5.6. Middlebox interactions . . . . . . . . . . . . . . . . . . 20 5.6. Router and switch operation . . . . . . . . . . . . . . . . 20
5.6.1. Inferring connection semantics . . . . . . . . . . . . 20 5.6.1. Connection semantics . . . . . . . . . . . . . . . . . 20
5.6.2. NAT . . . . . . . . . . . . . . . . . . . . . . . . . . 20 5.6.2. NAT . . . . . . . . . . . . . . . . . . . . . . . . . . 21
5.7. Checksum Handling . . . . . . . . . . . . . . . . . . . . . 20 5.7. Checksum Handling . . . . . . . . . . . . . . . . . . . . . 21
5.7.1. Requirements . . . . . . . . . . . . . . . . . . . . . 21 5.7.1. Requirements . . . . . . . . . . . . . . . . . . . . . 21
5.7.2. UDP Checksum with IPv4 . . . . . . . . . . . . . . . . 21 5.7.2. UDP Checksum with IPv4 . . . . . . . . . . . . . . . . 21
5.7.3. UDP Checksum with IPv6 . . . . . . . . . . . . . . . . 22 5.7.3. UDP Checksum with IPv6 . . . . . . . . . . . . . . . . 22
5.8. MTU and fragmentation . . . . . . . . . . . . . . . . . . . 22 5.8. MTU and fragmentation . . . . . . . . . . . . . . . . . . . 22
5.9. Congestion control . . . . . . . . . . . . . . . . . . . . 22 5.9. Congestion control . . . . . . . . . . . . . . . . . . . . 23
5.10. Multicast . . . . . . . . . . . . . . . . . . . . . . . . 23 5.10. Multicast . . . . . . . . . . . . . . . . . . . . . . . . 23
5.11. Flow entropy for ECMP . . . . . . . . . . . . . . . . . . 23 5.11. Flow entropy for ECMP . . . . . . . . . . . . . . . . . . 23
5.11.1. Flow classification . . . . . . . . . . . . . . . . . 23 5.11.1. Flow classification . . . . . . . . . . . . . . . . . 24
5.11.2. Flow entropy properties . . . . . . . . . . . . . . . 24 5.11.2. Flow entropy properties . . . . . . . . . . . . . . . 24
5.12 Negotiation of acceptable flags and extension fields . . . 25 5.12. Negotiation of acceptable flags and extension fields . . . 25
6. Motivation for GUE . . . . . . . . . . . . . . . . . . . . . . 26 6. Motivation for GUE . . . . . . . . . . . . . . . . . . . . . . 26
6.1. Benefits of GUE . . . . . . . . . . . . . . . . . . . . . . 26 6.1. Benefits of GUE . . . . . . . . . . . . . . . . . . . . . . 26
6.2 Comparison of GUE to other encapsulations . . . . . . . . . 26 6.2. Comparison of GUE to other encapsulations . . . . . . . . . 26
7. Security Considerations . . . . . . . . . . . . . . . . . . . . 28 7. Security Considerations . . . . . . . . . . . . . . . . . . . . 28
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 28 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 28
8.1. UDP source port . . . . . . . . . . . . . . . . . . . . . . 28 8.1. UDP source port . . . . . . . . . . . . . . . . . . . . . . 28
8.2. GUE variant number . . . . . . . . . . . . . . . . . . . . 29 8.2. GUE variant number . . . . . . . . . . . . . . . . . . . . 29
8.3. Control types . . . . . . . . . . . . . . . . . . . . . . . 29 8.3. Control types . . . . . . . . . . . . . . . . . . . . . . . 29
9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 29 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 29
10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 30 10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 30
10.1. Normative References . . . . . . . . . . . . . . . . . . . 30 10.1. Normative References . . . . . . . . . . . . . . . . . . . 30
10.2. Informative References . . . . . . . . . . . . . . . . . . 30 10.2. Informative References . . . . . . . . . . . . . . . . . . 31
Appendix A: NIC processing for GUE . . . . . . . . . . . . . . . . 33 Appendix A: NIC processing for GUE . . . . . . . . . . . . . . . . 34
A.1. Receive multi-queue . . . . . . . . . . . . . . . . . . . . 33 A.1. Receive multi-queue . . . . . . . . . . . . . . . . . . . . 34
A.2. Checksum offload . . . . . . . . . . . . . . . . . . . . . 34 A.2. Checksum offload . . . . . . . . . . . . . . . . . . . . . 34
A.2.1. Transmit checksum offload . . . . . . . . . . . . . . . 34 A.2.1. Transmit checksum offload . . . . . . . . . . . . . . . 35
A.2.2. Receive checksum offload . . . . . . . . . . . . . . . 35 A.2.2. Receive checksum offload . . . . . . . . . . . . . . . 35
A.3. Transmit Segmentation Offload . . . . . . . . . . . . . . . 35 A.3. Transmit Segmentation Offload . . . . . . . . . . . . . . . 36
A.4. Large Receive Offload . . . . . . . . . . . . . . . . . . . 36 A.4. Large Receive Offload . . . . . . . . . . . . . . . . . . . 37
Appendix B: Implementation considerations . . . . . . . . . . . . 36 Appendix B: Implementation considerations . . . . . . . . . . . . 37
B.1. Privileged ports . . . . . . . . . . . . . . . . . . . . . 37 B.1. Privileged ports . . . . . . . . . . . . . . . . . . . . . 37
B.2. Setting flow entropy as a route selector . . . . . . . . . 37 B.2. Setting flow entropy as a route selector . . . . . . . . . 38
B.3. Hardware protocol implementation considerations . . . . . . 37 B.3. Hardware protocol implementation considerations . . . . . . 38
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 38 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 39
1. Introduction 1. Introduction
This specification describes Generic UDP Encapsulation (GUE) which is This specification describes Generic UDP Encapsulation (GUE) which is
a general method for encapsulating packets of arbitrary IP protocols a general method for encapsulating packets of arbitrary IP protocols
within User Datagram Protocol (UDP) [RFC0768] packets. Encapsulating within User Datagram Protocol (UDP) [RFC0768] packets. Encapsulating
packets in UDP facilitates efficient transport across networks. packets in UDP facilitates efficient transport across networks.
Networking devices widely provide protocol specific processing and Networking devices widely provide protocol specific processing and
optimizations for UDP (as well as TCP) packets. Packets for atypical optimizations for UDP (as well as TCP) packets. Packets for atypical
IP protocols (those not usually parsed by networking hardware) can be IP protocols (those not usually parsed by networking hardware) can be
encapsulated in UDP packets to maximize deliverability and to encapsulated in UDP packets to maximize deliverability and to
leverage flow specific mechanisms for routing and packet steering. leverage flow specific mechanisms for routing and packet steering.
GUE provides an extensible header format for including optional data GUE provides an extensible header format for including optional data
in the encapsulation header. This data potentially covers items such in the encapsulation header. This data potentially covers items such
as the virtual networking identifier, security data for validating or as a virtual networking identifier, security data for validating or
authenticating the GUE header, congestion control data, etc. GUE also authenticating the GUE header, congestion control data, etc. GUE also
allows private optional data in the encapsulation header. This allows private optional data in the encapsulation header. This
feature can be used by a site or implementation to define local feature can be used by a site or implementation to define local
custom optional data, and allows experimentation of options that may custom optional data, and allows experimentation of options that may
eventually become standard. eventually become standard.
This document does not define any specific GUE extensions. [GUEEXTEN] This document does not define any specific GUE extensions. [GUEEXTEN]
specifies a set of initial extensions. specifies a set of initial extensions.
The motivation for the GUE protocol is described in section 6. 1.1. Applicability
1.1. Terminology and acronyms GUE is a network encapsulation protocol that encapsulates packets for
various IP protocols. Potential use cases include network tunneling,
multi-tenant network virtualization, tunneling for mobility, and
transport layer encapsulation. GUE is intended for deploying overlay
networks in public or private data center environments, as well as
providing a general tunneling mechanism usable in the Internet.
GUE is a UDP based encapsulation protocol transported over existing
IPv4 and IPv6 networks. Hence, as a UDP based protocol, GUE adheres
to the UDP usage guidelines as specified in [RFC8085]. Applicability
of these guidelines is dependent on the underlay IP network and the
nature of GUE payload protocol (for example TCP/IP or IP/Ethernet).
[RFC8085] outlines two applicability scenarios for UDP applications,
1) general Internet and 2) controlled environment. GUE is intended to
allow deployment in both controlled environments and in the
uncontrolled Internet. The requirements of [RFC8085] pertaining to
deployment of a UDP encapsulation protocol in these environments are
applicable. Section 5 provides the specifics for satisfying
requirements of [RFC8085]. It is the responsibility of the operator
deploying GUE to ensure that the necessary operational requirements
are met for the environment in which GUE is being deployed.
GUE has much of the same applicability and benefits as GRE-in-UDP
[RFC8086] that are afforded by UDP encapsulation protocols. GUE
offers the possibility of good performance for load-balancing
encapsulated IP traffic in transit networks using existing Equal-Cost
Multipath (ECMP) mechanisms that use a hash of the five-tuple of
source IP address, destination IP address, UDP/TCP source port,
UDP/TCP destination port, and protocol number. Encapsulating packets
in UDP enables use of the UDP source port to provide entropy to ECMP
hashing.
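
The following is a minimal sketch of how an encapsulator might derive a UDP source port from the five-tuple of an inner flow so that ECMP hashing in the underlay keeps a flow on one path. The function name, the use of SHA-256 as the hash, and the port range are assumptions made for illustration; they are not taken from this specification, and section 5.11 remains the authority on flow entropy.

```python
import hashlib
import struct
from ipaddress import ip_address

# Assumption: restrict entropy values to a high port range.
# This range is illustrative only and is not mandated by the text above.
PORT_MIN, PORT_MAX = 49152, 65535


def flow_entropy_port(src_ip, dst_ip, proto, src_port=0, dst_port=0):
    """Map an inner flow's five-tuple to a stable UDP source port.

    Packets of the same inner flow always yield the same port, so an
    underlay that hashes on the outer five-tuple sees one flow.
    """
    key = (
        ip_address(src_ip).packed
        + ip_address(dst_ip).packed
        + struct.pack("!BHH", proto, src_port, dst_port)
    )
    # Any well-distributed hash would do; SHA-256 is used here only
    # because it is available in the standard library.
    digest = hashlib.sha256(key).digest()
    value = int.from_bytes(digest[:4], "big")
    return PORT_MIN + value % (PORT_MAX - PORT_MIN + 1)
```

A production encapsulator would more likely reuse a flow hash already computed by the stack or the NIC rather than hashing each packet in software.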
In addition, GUE enables extending the use of atypical IP protocols
(those other than TCP and UDP) across networks that might otherwise
filter packets carrying those protocols. GUE may also be used with
connection oriented UDP semantics in order to facilitate traversal
through stateful firewalls and stateful NAT.
Additional motivation for the GUE protocol is provided in section 6.
1.2. Terminology and acronyms
GUE Generic UDP Encapsulation GUE Generic UDP Encapsulation
GUE Header A variable length protocol header that is composed GUE Header A variable length protocol header that is composed
of a primary four byte header and zero or more four of a primary four byte header and zero or more four
byte words for optional header data byte words of optional header data
GUE packet A UDP/IP packet that contains a GUE header and GUE GUE packet A UDP/IP packet that contains a GUE header and GUE
payload within the UDP payload payload within the UDP payload
GUE variant A version of the GUE protocol or an alternate form GUE variant A version of the GUE protocol or an alternate form
of a version of a version
Encapsulator A network node that encapsulates packets in GUE Encapsulator A network node that encapsulates packets in GUE
Decapsulator A network node that decapsulates and processes Decapsulator A network node that decapsulates and processes
packets encapsulated in GUE packets encapsulated in GUE
Data message An encapsulated packet in the GUE payload that is Data message An encapsulated packet in a GUE payload that is
addressed to the protocol stack for an associated addressed to the protocol stack for an associated
protocol protocol
Control message A formatted message in the GUE payload that is Control message A formatted message in the GUE payload that is
implicitly addressed to the decapsulator to monitor implicitly addressed to the decapsulator to monitor
or control the state or behavior of a tunnel or control the state or behavior of a tunnel
Flags A set of bit flags in the primary GUE header Flags A set of bit flags in the primary GUE header
Extension field Extension field
An optional field in a GUE header whose presence is An optional field in a GUE header whose presence is
indicated by corresponding flag(s) indicated by corresponding flag(s)
C-bit A single bit flag in the primary GUE header that C-bit A single bit flag in the primary GUE header that
indicates whether the GUE packet contains a control indicates whether the GUE packet contains a control
message or data message message or data message
Hlen A field in the primary GUE header that gives the Hlen A field in the primary GUE header that gives the
length of the GUE header length of the GUE header
skipping to change at page 6, line 39 skipping to change at page 7, line 32
Outer IP header Refers to the outer most IP header or packet when Outer IP header Refers to the outer most IP header or packet when
encapsulating a packet over IP encapsulating a packet over IP
Inner IP header Refers to an encapsulated IP header when an IP Inner IP header Refers to an encapsulated IP header when an IP
packet is encapsulated packet is encapsulated
Outer packet Refers to an encapsulating packet Outer packet Refers to an encapsulating packet
Inner packet Refers to a packet that is encapsulated Inner packet Refers to a packet that is encapsulated
1.2. Requirements Language 1.3. Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119]. document are to be interpreted as described in [RFC2119].
2. Base packet format 2. Base packet format
A GUE packet is comprised of a UDP packet whose payload is a GUE A GUE packet is comprised of a UDP packet whose payload is a GUE
header followed by a payload which is either an encapsulated packet header followed by a payload which is either an encapsulated packet
of some IP protocol or a control message such as an OAM (Operations, of some IP protocol or a control message such as an OAM (Operations,
skipping to change at page 7, line 29 skipping to change at page 8, line 29
| GUE Header | | GUE Header |
| | | |
|-------------------------------| |-------------------------------|
| | | |
| Encapsulated packet | | Encapsulated packet |
| or control message | | or control message |
| | | |
+-------------------------------+ +-------------------------------+
The GUE header is variable length as determined by the presence of The GUE header is variable length as determined by the presence of
optional extension fields. optional extension fields and private data.
2.1. GUE variant 2.1. GUE variant
The first two bits of the GUE header contain the GUE protocol variant The first two bits of the GUE header contain the GUE protocol variant
number. The variant number can indicate the version of the GUE number. The variant number can indicate the version of the GUE
protocol as well as alternate forms of a version. protocol as well as alternate forms of a version.
Variants 0 and 1 are described in this specification; variants 2 and Variants 0 and 1 are described in this specification; variants 2 and
3 are reserved. 3 are reserved.
skipping to change at page 8, line 21 skipping to change at page 9, line 21
The header format for variant 0 of GUE in UDP is: The header format for variant 0 of GUE in UDP is:
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+\ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+\
| Source port | Destination port | | | Source port | Destination port | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ UDP +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ UDP
| Length | Checksum | | | Length | Checksum | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+/ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+/
| 0 |C| Hlen | Proto/ctype | Flags | | 0 |C| Hlen | Proto/ctype | Flags |\
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
| | | | |
~ Extensions Fields (optional) ~ ~ Extensions Fields (optional) ~ |
| | | | GUE
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
| | | | |
~ Private data (optional) ~ ~ Private data (optional) ~ |
| | | | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+/
The contents of the UDP header are: The contents of the UDP header are:
o Source port: If connection semantics (section 5.6.1) are applied o Source port: If connection semantics (section 5.6.1) are applied
to an encapsulation, this is set to the local source port for to an encapsulation, this is set to the local source port for
the connection. When connection semantics are not applied, the the connection. When connection semantics are not applied, the
source port is either set to a flow entropy value as described source port is either set to a flow entropy value, as described
in section 5.11, or it should be set to the GUE assigned port in section 5.11, or is set to the GUE assigned port number,
number, 6080. 6080.
o Destination port: If connection semantics (section 5.6.1) are o Destination port: If connection semantics (section 5.6.1) are
applied to an encapsulation, this is set to the destination port applied to an encapsulation, this is set to the destination port
for the tuple. If connection semantics are not applied this is for the tuple. If connection semantics are not applied then the
set to the GUE assigned port number, 6080. destination port is set to the GUE assigned port number, 6080.
o Length: Canonical length of the UDP packet (length of UDP header o Length: Canonical length of the UDP packet (length of UDP header
and payload). and payload).
o Checksum: Standard UDP checksum (handling is described in o Checksum: Standard UDP checksum (handling is described in
section 5.7). section 5.7).
The GUE header consists of: The GUE header consists of:
o Variant: 0 indicates GUE protocol version 0 with a header. o Variant: 0 indicates GUE protocol version 0 with a header.
o C: C-bit: When set indicates a control message, not set o C: C-bit: When set indicates a control message. When not set
indicates a data message. indicates a data message.
o Hlen: Length in 32-bit words of the GUE header, including o Hlen: Length in 32-bit words of the GUE header, including
optional extension fields but not the first four bytes of the optional extension fields but not the first four bytes of the
header. Computed as (header_len - 4) / 4, where header_len is header. Computed as (header_len - 4) / 4, where header_len is
the total header length in bytes. All GUE headers are a multiple the total header length in bytes. All GUE headers are a multiple
of four bytes in length. Maximum header length is 128 bytes. of four bytes in length. Maximum header length is 128 bytes.
o Proto/ctype: When the C-bit is set, this field contains a o Proto/ctype: When the C-bit is set, this field contains a
control message type for the payload (section 3.2.2). When the control message type for the payload (section 3.2.2). When the
C-bit is not set, the field holds the Internet protocol number C-bit is not set, the field holds the Internet protocol number
for the encapsulated packet in the payload (section 3.2.1). The for the encapsulated packet in the payload (section 3.2.1). The
control message or encapsulated packet begins at the offset control message or encapsulated packet begins at the offset
provided by Hlen. provided by Hlen.
o Flags: Header flags that may be allocated for various purposes o Flags: Header flags that may be allocated for various purposes
and may indicate presence of extension fields. Undefined header and may indicate the presence of extension fields. Undefined
flag bits MUST be set to zero on transmission. header flag bits MUST be set to zero on transmission.
o Extension Fields: Optional fields whose presence is indicated by o Extension Fields: Optional fields whose presence is indicated by
corresponding flags. corresponding flags.
o Private data: Optional private data block (see section 3.4). If o Private data: Optional private data block (see section 3.4). If
the private block is present, it immediately follows the last the private block is present, it immediately follows the last
extension field present in the header. The private block is extension field present in the header. The private block is
considered to be part of the GUE header. The length of this data considered to be part of the GUE header. The length of this data
is determined by subtracting the starting offset from the header is determined by subtracting the starting offset of the private
length. data from the header length.
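
The field descriptions above can be sketched as a small encoder and parser for the primary four-byte variant 0 header. This is an illustrative sketch, not a reference implementation: the names GueHeader, build_gue_header, parse_gue_header and private_data_len are hypothetical, and the caller is assumed to already know the combined length of the extension fields selected by the flags.

```python
import struct
from typing import NamedTuple

GUE_MAX_HEADER_LEN = 128  # bytes: 4 + 31 * 4, the largest value Hlen can express


class GueHeader(NamedTuple):
    variant: int
    control: bool      # C-bit
    hlen: int          # 32-bit words following the first four bytes
    proto_ctype: int
    flags: int
    header_len: int    # total GUE header length in bytes


def build_gue_header(proto_ctype, flags=0, control=False, optional=b""):
    """Build a variant 0 header; 'optional' holds extension fields and private data."""
    if len(optional) % 4:
        raise ValueError("optional data must be a multiple of four bytes")
    hlen = len(optional) // 4          # equals (header_len - 4) / 4
    if hlen > 0x1F:
        raise ValueError("GUE header would exceed 128 bytes")
    first = (0 << 6) | (int(control) << 5) | hlen
    return struct.pack("!BBH", first, proto_ctype, flags) + optional


def parse_gue_header(data):
    """Parse the primary header from the start of a UDP payload."""
    if len(data) < 4:
        raise ValueError("truncated GUE header")
    first, proto_ctype, flags = struct.unpack("!BBH", data[:4])
    variant = first >> 6               # first two bits
    if variant != 0:
        raise ValueError("not a variant 0 GUE header")
    control = bool(first & 0x20)       # third bit
    hlen = first & 0x1F                # remaining five bits
    header_len = 4 + 4 * hlen
    if len(data) < header_len:
        raise ValueError("packet shorter than the length given by Hlen")
    return GueHeader(variant, control, hlen, proto_ctype, flags, header_len)


def private_data_len(header_len, ext_len):
    """Header length minus the starting offset of the private data."""
    start = 4 + ext_len
    if start > header_len:
        raise ValueError("extension fields exceed the header length")
    return header_len - start
```

For example, a header built with eight bytes of optional data parses back with Hlen equal to 2 and a total header length of 12 bytes; the encapsulated packet or control message then begins at byte offset 12 of the UDP payload.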
3.2. Proto/ctype field 3.2. Proto/ctype field
The proto/ctype field contains either an Internet protocol number The proto/ctype field contains either an Internet protocol number
(when the C-bit is not set) or GUE control message type (when the C- (when the C-bit is not set) or GUE control message type (when the C-
bit is set). bit is set).
3.2.1 Proto field 3.2.1. Proto field
When the C-bit is not set, the proto/ctype field MUST contain an IANA When the C-bit is not set, the proto/ctype field MUST contain an IANA
Internet Protocol Number. The protocol number is interpreted relative Internet Protocol Number [IANA-PN]. The protocol number is
to the IP protocol that encapsulates the UDP packet (i.e. protocol of interpreted relative to the IP protocol that encapsulates the UDP
the outer IP header). The protocol number serves as an indication of packet (i.e. protocol of the outer IP header). The protocol number
the type of the next protocol header which is contained in the GUE serves as an indication of the type of the next protocol header which
payload at the offset indicated in Hlen. Intermediate devices MAY is contained in the GUE payload at the offset indicated in Hlen.
parse the GUE payload per the number in the proto/ctype field, and
header flags cannot affect the interpretation of the proto/ctype
field.
When the outer IP protocol is IPv4, the proto field MUST be set to a
valid IP protocol number usable with IPv4; it MUST NOT be set to a
number for IPv6 extension headers or ICMPv6 options (number 58). An
exception is that the destination options extension header using the
PadN option MAY be used with IPv4 as described in section 3.6. The
"no next header" protocol number (59) also MAY be used with IPv4 as
described below.
When the outer IP protocol is IPv6, the proto field can be set to any
defined protocol number except that it MUST NOT be set to Hop-by-hop
options (number 0). If a received GUE packet in IPv6 contains a
protocol number that is an extension header (e.g. Destination
Options) then the extension header is processed after the GUE header
is processed as though the GUE header is an extension header.
IP protocol number 59 ("No next header") can be set to indicate that IP protocol number 59 ("No next header") can be set to indicate that
the GUE payload does not begin with the header of an IP protocol. the GUE payload does not begin with the header of an IP protocol.
This would be the case, for instance, if the GUE payload were a This would be the case, for instance, if the GUE payload were a
fragment when performing GUE level fragmentation. The interpretation fragment when performing GUE level fragmentation. The interpretation
of the payload is performed through other means (such as flags and of the payload is performed through other means such as flags and
extension fields), and intermediate devices MUST NOT parse packets extension fields, and nodes MUST NOT parse packets based on the IP
based on the IP protocol number in this case. protocol number in this case.
3.2.2 Ctype field 3.2.2. Ctype field
When the C-bit is set, the proto/ctype field MUST be set to a valid When the C-bit is set, the proto/ctype field MUST be set to a valid
control message type. A value of zero indicates that the GUE payload control message type. A value of zero indicates that the GUE payload
requires further interpretation to deduce the control type. This requires further interpretation to deduce the control type. This
might be the case when the payload is a fragment of a control might be the case when the payload is a fragment of a control
message, where only the reassembled packet can be interpreted as a message, where only the reassembled packet can be interpreted as a
control message. control message.
Control messages will be defined in an IANA registry. Control message Control messages will be defined in an IANA registry. Control message
types 1 through 127 may be defined in standards. Types 128 through types 1 through 127 may be defined in standards. Types 128 through
skipping to change at page 11, line 10 skipping to change at page 11, line 39
message. Instead, it indicates that the GUE payload is a control message. Instead, it indicates that the GUE payload is a control
message, or part of a control message (as might be the case in GUE message, or part of a control message (as might be the case in GUE
fragmentation), that cannot be correctly parsed or interpreted fragmentation), that cannot be correctly parsed or interpreted
without additional context. without additional context.
3.3. Flags and extension fields 3.3. Flags and extension fields
Flags and associated extension fields are the primary mechanism of Flags and associated extension fields are the primary mechanism of
extensibility in GUE. As mentioned in section 3.1, GUE header flags extensibility in GUE. As mentioned in section 3.1, GUE header flags
indicate the presence of optional extension fields in the GUE header. indicate the presence of optional extension fields in the GUE header.
[GUEXTENS] defines an initial set of GUE extensions. [GUEEXTEN] defines an initial set of GUE extensions.
3.3.1. Requirements 3.3.1. Requirements
There are sixteen flag bits in the GUE header. Flags may indicate There are sixteen flag bits in the GUE header. Flags may indicate
presence of an extension fields. The size of an extension field presence of extension fields. The size of an extension field
indicated by a flag MUST be fixed. indicated by a flag MUST be fixed in the specification of the flag.
Flags can be paired together to allow different lengths for an Flags can be paired together to allow different lengths for an
extension field. For example, if two flag bits are paired, a field extension field. For example, if two flag bits are paired, a field
can possibly be three different lengths-- that is bit value of 00 can possibly be three different lengths-- that is bit value of 00
indicates no field present; 01, 10, and 11 indicate three possible indicates no field present; 01, 10, and 11 indicate three possible
lengths for the field. Regardless of how flag bits are paired, the lengths for the field. Regardless of how flag bits are paired, the
lengths and offsets of optional fields corresponding to a set of lengths and offsets of extension fields corresponding to a set of
flags MUST be well defined. flags MUST be well defined and deterministic.
Extension fields are placed in order of the flags. New flags are to Extension fields are placed in order of the flags. New flags are to
be allocated from high to low order bit contiguously without holes. be allocated from high to low order bit contiguously without holes.
Flags allow random access, for instance to inspect the field Flags allow random access, for instance to inspect the field
corresponding to the Nth flag bit, an implementation only considers corresponding to the Nth flag bit, an implementation only considers
the previous N-1 flags to determine the offset. Flags after the Nth the previous N-1 flags to determine the offset. Flags after the Nth
flag are not pertinent in calculating the offset of the field for the flag are not pertinent in calculating the offset of the field for the
Nth flag. Random access of flags and fields permits processing of Nth flag. Random access of flags and fields permits processing of
optional extensions in an order that is independent of their position optional extensions in an order that is independent of their position
in the packet. in the packet.
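The random access property described above can be sketched as follows. This is a minimal illustration, not a normative algorithm: the size table FIELD_SIZES is hypothetical (its values are invented and not taken from any GUE extension specification), it models only single-bit flags rather than paired flags, and flags are indexed from zero at the high order bit.

```python
# Hypothetical size table: bytes of extension data contributed by each flag
# bit, indexed from the high-order bit (index 0). These sizes are invented
# for illustration only.
FIELD_SIZES = [4, 8, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]


def flag_is_set(flags: int, n: int) -> bool:
    """Return True if the nth flag (0 = high-order bit of 16) is set."""
    return bool(flags & (0x8000 >> n))


def field_offset(flags: int, n: int) -> int:
    """Byte offset of the nth flag's field within the extension area.

    Only flags before position n contribute; later flags are ignored.
    """
    if not flag_is_set(flags, n):
        raise ValueError(f"flag {n} is not set, so it has no field")
    return sum(FIELD_SIZES[i] for i in range(n) if flag_is_set(flags, i))
```

Because the sum stops before position n, setting or clearing any later flag leaves the computed offset unchanged.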
skipping to change at page 11, line 50 skipping to change at page 12, line 30
field always holds an IP protocol number as an invariant). field always holds an IP protocol number as an invariant).
The set of available flags can be extended in the future by defining The set of available flags can be extended in the future by defining
a "flag extensions bit" that refers to a field containing a new set a "flag extensions bit" that refers to a field containing a new set
of flags. of flags.
3.3.2. Example GUE header with extension fields 3.3.2. Example GUE header with extension fields
An example GUE header for a data message encapsulating an IPv4 packet An example GUE header for a data message encapsulating an IPv4 packet
and containing the Group Identifier and Security extension fields and containing the Group Identifier and Security extension fields
(both defined in [GUEXTENS]) is shown below: (both defined in [GUEEXTEN]) is shown below:
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| 0 |0| 3 | 94 |1|0 0 1| 0 | | 0 |0| 3 | 4 |1|0 0 1| 0 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Group Identifier | | Group Identifier |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| | | |
+ Security + + Security +
| | | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
In the above example, the first flag bit is set which indicates that In the above example, the first flag bit is set which indicates that
the Group Identifier extension is present which is a 32 bit field. the Group Identifier extension is present which is a 32 bit field.
The second through fourth bits of the flags are paired flags that The second through fourth bits of the flags are paired flags that
indicate the presence of a Security field with seven possible sizes. indicate the presence of a Security field with seven possible sizes.
In this example 001 indicates a sixty-four bit security field. In this example 001 indicates a sixty-four bit security field.
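As a sketch, the first 32 bits of the example header can be packed and unpacked as shown below. The bit widths (2 bit variant, 1 bit C, 5 bit Hlen, 8 bit proto/ctype, 16 bit flags) are read off the figure, and the names GueBase, pack_base, and unpack_base are hypothetical. The proto value 4 is used because the text describes an encapsulated IPv4 packet.

```python
import struct
from typing import NamedTuple


class GueBase(NamedTuple):
    variant: int      # 2 bits
    c_bit: int        # 1 bit, set for control messages
    hlen: int         # 5 bits, header length field
    proto_ctype: int  # 8 bits
    flags: int        # 16 bits


def pack_base(h: GueBase) -> bytes:
    """Encode the first four bytes of a GUE header in network byte order."""
    first = (h.variant << 6) | (h.c_bit << 5) | h.hlen
    return struct.pack("!BBH", first, h.proto_ctype, h.flags)


def unpack_base(data: bytes) -> GueBase:
    """Decode the first four bytes of a GUE header."""
    first, proto_ctype, flags = struct.unpack("!BBH", data[:4])
    return GueBase(first >> 6, (first >> 5) & 1, first & 0x1F,
                   proto_ctype, flags)


# Values from the example: variant 0, C=0, Hlen=3, proto=4 (IPv4),
# first flag bit set, next three flag bits 001, remaining flags zero.
example = GueBase(variant=0, c_bit=0, hlen=3, proto_ctype=4, flags=0x9000)
wire = pack_base(example)
```

Hlen is 3 here because the Group Identifier (4 bytes) and the Security field (8 bytes) together occupy three 32-bit words.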
3.4. Private data 3.4. Private data
An implementation MAY use private data for its own use. The private An implementation MAY use private data for its own use. The private
data immediately follows the last field in the GUE header and is not data immediately follows the last extension field in the GUE header
a fixed length. This data is considered part of the GUE header and and is not a fixed length. This data is considered part of the GUE
MUST be accounted for in header length (Hlen). The length of the header and MUST be accounted for in header length (Hlen). The length
private data MUST be a multiple of four and is determined by of the private data MUST be a multiple of four bytes and is
subtracting the offset of private data in the GUE header from the determined by subtracting the offset of private data in the GUE
header length. Specifically: header from the header length. Specifically:
Private_length = (Hlen * 4) - Length(flags) Private_length = (Hlen * 4) - Length(flags)
where "Length(flags)" returns the sum of lengths of all the extension where "Length(flags)" returns the sum of lengths of all the extension
fields present in the GUE header. When there is no private data fields present in the GUE header. When there is no private data
present, the length of the private data is zero. present, the length of the private data is zero.
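The computation above can be sketched as a small helper. The function name and the choice to raise ValueError are illustrative assumptions; ext_fields_len stands for the value the text calls Length(flags).

```python
def private_length(hlen: int, ext_fields_len: int) -> int:
    """Length in bytes of private data: (Hlen * 4) - Length(flags)."""
    length = hlen * 4 - ext_fields_len
    if length < 0:
        raise ValueError("Hlen is too small for the extension fields present")
    if length % 4 != 0:
        raise ValueError("private data length must be a multiple of four bytes")
    return length
```

For the example header in section 3.3.2 (Hlen of 3, twelve bytes of extension fields) this yields zero, meaning no private data is present.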
The semantics and interpretation of private data are implementation The semantics and interpretation of private data are implementation
specific. The private data may be structured as necessary, for specific. The private data may be structured as necessary, for
instance it might contain its own set of flags and extension fields. instance it might contain its own set of flags and extension fields.
skipping to change at page 13, line 10 skipping to change at page 13, line 41
If a decapsulator receives a GUE packet with private data, it MUST If a decapsulator receives a GUE packet with private data, it MUST
validate the private data appropriately. If a decapsulator does not validate the private data appropriately. If a decapsulator does not
expect private data from an encapsulator, the packet MUST be dropped. expect private data from an encapsulator, the packet MUST be dropped.
If a decapsulator cannot validate the contents of private data per If a decapsulator cannot validate the contents of private data per
the provided semantics, the packet MUST also be dropped. An the provided semantics, the packet MUST also be dropped. An
implementation MAY place security data in GUE private data which if implementation MAY place security data in GUE private data which if
present MUST be verified for packet acceptance. present MUST be verified for packet acceptance.
3.5. Message types 3.5. Message types
There are two message types in GUE variant 0: control messages and
data messages.
3.5.1. Control messages 3.5.1. Control messages
Control messages carry formatted data that are implicitly addressed Control messages carry formatted data that are implicitly addressed
to the decapsulator to monitor or control the state or behavior of a to the decapsulator to monitor or control the state or behavior of a
tunnel (OAM). For instance, an echo request and corresponding echo tunnel (OAM). For instance, an echo request and corresponding echo
reply message can be defined to test for liveness. reply message can be defined to test for liveness.
Control messages are indicated in the GUE header when the C-bit is Control messages are indicated in the GUE header when the C-bit is
set. The payload is interpreted as a control message with type set. The payload is interpreted as a control message with type
specified in the proto/ctype field. The format and contents of the specified in the proto/ctype field. The format and contents of the
control message are indicated by the type and can be variable length. control message are indicated by the type and can be variable length.
Other than interpreting the proto/ctype field as a control message Other than interpreting the proto/ctype field as a control message
type, the meaning and semantics of the rest of the elements in the type, the meaning and semantics of the rest of the elements in the
GUE header are the same as that of data messages. Forwarding and GUE header are the same as that of data messages. Forwarding and
routing of control messages should be the same as that of a data routing of control messages should be the same as that of a data
message with the same outer IP and UDP header and GUE flags; this message with the same outer IP and UDP header; this ensures that
ensures that control messages can be created that follow the same control messages can be created that follow the same path through the
path as data messages. network as data messages.
3.5.2. Data messages 3.5.2. Data messages
Data messages carry encapsulated packets that are addressed to the Data messages carry encapsulated packets that are addressed to the
protocol stack for the associated protocol. Data messages are a protocol stack for the associated protocol. Data messages are a
primary means of encapsulation and can be used to create tunnels for primary means of encapsulation and can be used to create tunnels for
overlay networks. overlay networks.
Data messages are indicated in GUE header when the C-bit is not set. Data messages are indicated in GUE header when the C-bit is not set.
The payload of a data message is interpreted as an encapsulated The payload of a data message is interpreted as an encapsulated
packet of an Internet protocol indicated in the proto/ctype field. packet of an Internet protocol indicated in the proto/ctype field.
The packet immediately follows the GUE header. The encapsulated packet immediately follows the GUE header.
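How the C-bit selects the interpretation of the proto/ctype field might be sketched as below. The function name and the returned strings are hypothetical; the special cases for control type 0 and protocol number 59 follow sections 3.2.1 and 3.2.2.

```python
NO_NEXT_HEADER = 59  # IP protocol number "No next header"


def describe_payload(c_bit: int, proto_ctype: int) -> str:
    """Say how the GUE payload is to be interpreted."""
    if c_bit:
        if proto_ctype == 0:
            # Type cannot be deduced without more context (e.g. a fragment).
            return "control message needing further interpretation"
        return f"control message, type {proto_ctype}"
    if proto_ctype == NO_NEXT_HEADER:
        return "data message, payload is not an IP protocol header"
    return f"data message, IP protocol {proto_ctype}"
```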
3.6. Hiding the transport layer protocol number
The GUE header indicates the Internet protocol of the encapsulated
packet. A protocol number is either contained in the Proto/ctype
field of the primary GUE header or in the Payload Type field of a GUE
Transform extension field (used to encrypt the payload with DTLS,
[GUEEXTEN]). If the transport protocol number needs to be hidden from
the network, then a trivial destination option can be used.
The PadN destination option [RFC2460] can be used to encode the
transport protocol as a next header of an extension header (and
maintain alignment of encapsulated transport headers). The
Proto/ctype field or Payload Type field of the GUE Transform field is
set to 60 to indicate that the first encapsulated header is a
destination options extension header.
The format of the extension header is below:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Next Header | 2 | 1 | 0 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
For IPv4, it is permitted in GUE to use this precise destination
option to contain the obfuscated protocol number. In this case next
header MUST refer to a valid IP protocol for IPv4. No other extension
headers or destination options are permitted with IPv4.
4. Variant 1 4. Variant 1
Variant 1 of GUE allows direct encapsulation of IPv4 and IPv6 in UDP. Variant 1 of GUE allows direct encapsulation of IPv4 and IPv6 in UDP.
In this variant there is no GUE header; a UDP packet carries an IP In this variant there is no GUE header, a UDP packet carries an IP
packet. The first two bits of the UDP payload for GUE are the GUE packet. The first two bits of the UDP payload are the GUE variant
variant and coincide with the first two bits of the version number in field and coincide with the first two bits of the version number in
the IP header. The first two version bits of IPv4 and IPv6 are 01, so the IP header. The first two version bits of IPv4 and IPv6 are 01, so
we use GUE variant 1 for direct IP encapsulation which makes two bits we use GUE variant 1 for direct IP encapsulation which makes the two
of GUE variant to also be 01. bits of GUE variant to also be 01.
This technique is effectively a means to compress out the version 0 This technique is effectively a means to compress out the GUE version
GUE header when encapsulating IPv4 or IPv6 packets and there are no 0 header when encapsulating IPv4 or IPv6 packets and there are no
flags or extension fields present. This method is compatible to use flags, extension fields, or private data present. This method is
on the same port number as packets with the GUE header (GUE variant 0 compatible to use on the same port number as packets with the GUE
packets). This technique saves encapsulation overhead on costly links header (GUE variant 0 packets). This technique saves encapsulation
for the common use of IP encapsulation, and also obviates the need to overhead on costly links for the common use of IP encapsulation, and
allocate a separate port number for IP-over-UDP encapsulation. also obviates the need to allocate a separate UDP port number for IP-
over-UDP encapsulation.
4.1. Direct encapsulation of IPv4 4.1. Direct encapsulation of IPv4
The format for encapsulating IPv4 directly in UDP is: The format for encapsulating IPv4 directly in UDP is:
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+\ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+\
| Source port | Destination port | | | Source port | Destination port | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ UDP +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ UDP
skipping to change at page 15, line 48 skipping to change at page 15, line 30
| Time to Live | Protocol | Header Checksum | | Time to Live | Protocol | Header Checksum |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Source IPv4 Address | | Source IPv4 Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Destination IPv4 Address | | Destination IPv4 Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The UDP fields are set in a similar manner as described in section The UDP fields are set in a similar manner as described in section
3.1. 3.1.
Note that the 0100 value in the first four bits of the the UDP Note that the 0100 value in the first four bits of the UDP payload
payload expresses the GUE variant as 1 (bits 01) and IP version as 4 expresses the GUE variant as 1 (bits 01) and IP version as 4 (bits
(bits 0100). 0100).
4.2. Direct encapsulation of IPv6 4.2. Direct encapsulation of IPv6
The format for encapsulating IPv6 directly in UDP is demonstrated The format for encapsulating IPv6 directly in UDP is demonstrated
below: below:
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+\ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+\
| Source port | Destination port | | | Source port | Destination port | |
skipping to change at page 17, line 13 skipping to change at page 17, line 13
(bits 0110). (bits 0110).
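A receiver sharing one UDP port between variant 0 and variant 1 could distinguish them from the first byte of the UDP payload, roughly as follows. This is a sketch under the bit values stated above; the function name and the returned strings are hypothetical.

```python
def classify_udp_payload(payload: bytes) -> str:
    """Classify a GUE UDP payload from its leading bits."""
    if not payload:
        raise ValueError("empty UDP payload")
    first = payload[0]
    variant = first >> 6          # top two bits
    if variant == 0:
        return "variant 0: GUE header follows"
    if variant == 1:
        version = first >> 4      # top four bits double as the IP version
        if version == 4:
            return "variant 1: directly encapsulated IPv4"
        if version == 6:
            return "variant 1: directly encapsulated IPv6"
        return "variant 1: unrecognized IP version"
    return "unsupported variant"
```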
5. Operation 5. Operation
The figure below illustrates the use of GUE encapsulation between two The figure below illustrates the use of GUE encapsulation between two
hosts. Host 1 is sending packets to Host 2. An encapsulator performs hosts. Host 1 is sending packets to Host 2. An encapsulator performs
encapsulation of packets from Host 1. These encapsulated packets encapsulation of packets from Host 1. These encapsulated packets
traverse the network as UDP packets. At the decapsulator, packets are traverse the network as UDP packets. At the decapsulator, packets are
decapsulated and sent on to Host 2. Packet flow in the reverse decapsulated and sent on to Host 2. Packet flow in the reverse
direction need not be symmetric; for example, the reverse path might direction need not be symmetric; for example, the reverse path might
not use GUE and/or any other form of encapsulation. not use GUE or any other form of encapsulation.
+---------------+ +---------------+ +---------------+ +---------------+
| | | | | | | |
| Host 1 | | Host 2 | | Host 1 | | Host 2 |
| | | | | | | |
+---------------+ +---------------+ +---------------+ +---------------+
| ^ | ^
V | V |
+---------------+ +---------------+ +---------------+ +---------------+ +---------------+ +---------------+
| | | | | | | | | | | |
| Encapsulator |-->| Layer 3 |-->| Decapsulator | | Encapsulator |-->| Layer 3 |-->| Decapsulator |
| | | Network | | | | | | Network | | |
+---------------+ +---------------+ +---------------+ +---------------+ +---------------+ +---------------+
The encapsulator and decapsulator may be co-resident with the The encapsulator and decapsulator may be co-resident with the
corresponding hosts, or may be on separate nodes in the network. corresponding hosts, or may be on separate nodes in the network.
5.1. Network tunnel encapsulation 5.1. Network tunnel encapsulation
Network tunneling can be achieved by encapsulating layer 2 or layer 3 Network tunneling can be achieved by encapsulating layer 2 or layer 3
packets. In this case the encapsulator and decapsulator nodes are the packets. In this case, the encapsulator and decapsulator nodes are
tunnel endpoints. These could be routers that provide network tunnels the tunnel endpoints. These could be routers that provide network
on behalf of communicating hosts. tunnels on behalf of communicating hosts.
5.2. Transport layer encapsulation 5.2. Transport layer encapsulation
When encapsulating layer 4 packets, the encapsulator and decapsulator When encapsulating layer 4 packets, the encapsulator and decapsulator
should be co-resident with the hosts. In this case, the encapsulation should be co-resident with the hosts. In this case, the encapsulation
headers are inserted between the IP header and the transport packet. headers are inserted between the IP header and the transport packet.
The addresses in the IP header refer to both the endpoints of the The addresses in the IP header refer to both the endpoints of the
encapsulation and the endpoints for terminating the transport encapsulation and the endpoints for terminating the encapsulated
protocol. Note that the transport layer ports in the encapsulated transport protocol. Note that the transport layer ports in the
packet are independent of the UDP ports in the outer packet. encapsulated packet are independent of the UDP ports in the outer
packet.
Details about performing transport layer encapsulation are discussed
in [TOU].
5.3. Encapsulator operation 5.3. Encapsulator operation
Encapsulators create GUE data messages, set the fields of the UDP Encapsulators create GUE data messages, set the fields of the UDP
header, set flags and optional extension fields in the GUE header, header, set flags and optional extension fields in the GUE header,
and forward packets to a decapsulator. and forward packets to a decapsulator.
An encapsulator can be an end host originating the packets of a flow, An encapsulator can be an end host originating the packets of a flow,
or can be a network device performing encapsulation on behalf of or can be a network device performing encapsulation on behalf of
hosts (routers implementing tunnels for instance). In either case, hosts (routers implementing tunnels for instance). In either case,
the intended target (decapsulator) is indicated by the outer the intended target (decapsulator) is indicated by the outer
destination IP address and destination port in the UDP header. destination IP address and destination port in the UDP header.
If an encapsulator is tunneling packets -- that is encapsulating If an encapsulator is tunneling packets -- that is encapsulating
packets of layer 2 or layer 3 protocols (e.g. EtherIP, IPIP, ESP packets of layer 2 or layer 3 protocols (e.g. EtherIP, IPIP, ESP
tunnel mode) -- it SHOULD follow standard conventions for tunneling tunnel mode) -- it SHOULD follow standard conventions for tunneling
of one protocol over another. For instance, if an IP packet is being one protocol over another. For instance, if an IP packet is being
encapsualated in GUE then diffserv interaction [RFC2983] and ECN encapsulated in GUE then diffserv interaction [RFC2983] and ECN
propagation for tunnels [RFC6040] SHOULD be followed. propagation for tunnels [RFC6040] SHOULD be followed.
5.4. Decapsulator operation 5.4. Decapsulator operation
A decapsulator performs decapsulation of GUE packets. A decapsulator A decapsulator performs decapsulation of GUE packets. A decapsulator
is addressed by the outer destination IP address of a GUE packet. is addressed by the outer destination IP address and UDP destination
The decapsulator validates packets, including fields of the GUE port of a GUE packet. The decapsulator validates packets, including
header. fields of the GUE header.
If a decapsulator receives a GUE packet with an unsupported variant, If a decapsulator receives a GUE packet with an unsupported variant,
unknown flag, bad header length (too small for included extension unknown flag, bad header length (too small for included extension
fields), unknown control message type, bad protocol number, an fields), unknown control message type, bad protocol number, an
unsupported payload type, or an otherwise malformed header, it MUST unsupported payload type, or an otherwise malformed header, it MUST
drop the packet. Such events MAY be logged subject to configuration drop the packet. Such events MAY be logged subject to configuration
and rate limiting of logging messages. Note that set flags in a GUE and rate limiting of logging messages. Note that set flags in a GUE
header that are unknown to a decapsulator MUST NOT be ignored. If a header that are unknown to a decapsulator MUST NOT be ignored. If a
GUE packet is received by a decapsulator with unknown flags, the GUE packet is received by a decapsulator with unknown flags, the
packet MUST be dropped. packet MUST be dropped.
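A subset of these checks, for a packet already parsed as carrying a GUE header, might look like the sketch below. KNOWN_FLAGS is a hypothetical mask of the flag bits a particular decapsulator implements, and ext_fields_len is the total length in bytes of the extension fields that the set flags imply. Checks on control type, protocol number, and payload type are omitted.

```python
from typing import Optional

KNOWN_FLAGS = 0xF000  # hypothetical: flag bits this decapsulator understands


def drop_reason(variant: int, hlen: int, flags: int,
                ext_fields_len: int) -> Optional[str]:
    """Return why a GUE header must be rejected, or None to accept it."""
    if variant != 0:
        return "unsupported variant"
    if flags & ~KNOWN_FLAGS & 0xFFFF:
        # Unknown set flags MUST NOT be ignored.
        return "unknown flag set"
    if hlen * 4 < ext_fields_len:
        return "header length too small for extension fields"
    return None
```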
5.4.1. Processing a received data message 5.4.1. Processing a received data message
If a valid data message is received, the UDP header and GUE header If a valid data message is received, the UDP header and GUE header
are removed from the packet. The outer IP header remains intact and are (logically) removed from the packet. The outer IP header remains
the next protocol in the IP header is set to the protocol from the intact and the next protocol in the IP header is set to the protocol
proto field in the GUE header. The resulting packet is then from the proto field in the GUE header. The resulting packet is then
resubmitted into the protocol stack to process that packet as though resubmitted into the protocol stack to process the packet as though
it was received with the protocol in the GUE header. it was received with the protocol indicated in the GUE header.
As an example, consider that a data message is received where GUE As an example, consider that a data message is received where GUE
encapsulates an IPv4 packet using GUE variant 0. In this case proto encapsulates an IPv4 packet using GUE variant 0. In this case proto
field in the GUE header is set to 4 for IPv4 encapsulation: field in the GUE header is set to 4 for IPv4 encapsulation:
+-------------------------------------+ +-------------------------------------+
| IP header (next proto = 17,UDP) | | IP header (next proto = 17,UDP) |
|-------------------------------------| |-------------------------------------|
| UDP | | UDP |
|-------------------------------------| |-------------------------------------|
skipping to change at page 19, line 22 skipping to change at page 19, line 22
| IPv4 header and packet | | IPv4 header and packet |
+-------------------------------------+ +-------------------------------------+
The receiver removes the UDP and GUE headers and sets the next The receiver removes the UDP and GUE headers and sets the next
protocol field in the IP packet to 4, which is derived from the GUE protocol field in the IP packet to 4, which is derived from the GUE
proto field. The resultant packet would have the format: proto field. The resultant packet would have the format:
+-------------------------------------+ +-------------------------------------+
| IP header (next proto = 4,IPv4) | | IP header (next proto = 4,IPv4) |
|-------------------------------------| |-------------------------------------|
| IP header and packet | | IPv4 header and packet |
+-------------------------------------+ +-------------------------------------+
This packet is then resubmitted into the protocol stack to be This packet is then resubmitted into the protocol stack to be
processed as an IPv4 encapsulated packet. processed as an IPv4 encapsulated packet.
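The logical transformation can be sketched with a simplified packet model. OuterPacket is a hypothetical stand-in for the outer IP header and its payload; a real implementation would also have to adjust length and checksum fields, which this sketch does not model.

```python
from typing import NamedTuple

UDP_HEADER_LEN = 8


class OuterPacket(NamedTuple):
    next_proto: int   # protocol number in the outer IP header
    payload: bytes    # everything after the outer IP header


def decapsulate_data(pkt: OuterPacket) -> OuterPacket:
    """Strip the UDP and GUE variant 0 headers from a data message."""
    if pkt.next_proto != 17:
        raise ValueError("outer packet is not UDP")
    gue = pkt.payload[UDP_HEADER_LEN:]
    if len(gue) < 4:
        raise ValueError("truncated GUE header")
    if gue[0] >> 6 != 0 or (gue[0] >> 5) & 1:
        raise ValueError("not a variant 0 data message")
    gue_len = 4 + (gue[0] & 0x1F) * 4   # first word plus Hlen words
    if len(gue) < gue_len:
        raise ValueError("truncated GUE header")
    # The outer next protocol becomes the GUE proto field.
    return OuterPacket(next_proto=gue[1], payload=gue[gue_len:])
```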
5.4.2. Processing a received control message 5.4.2. Processing a received control message
If a valid control message is received, the packet MUST be processed If a valid control message is received, the packet MUST be processed
as a control message. The specific processing to be performed depends as a control message. The specific processing to be performed depends
on the value in the ctype field of the GUE header. on the value in the ctype field of the GUE header.
5.5. Router and switch operation 5.5. Middlebox inspection
A middlebox MAY inspect a GUE header. A middlebox MUST NOT modify a
GUE header or UDP payload.
To inspect a GUE header, a middlebox needs to identify GUE packets.
The obvious method is to match the destination UDP port number to be
the GUE port number (i.e. 6080). Per [RFC7605], transport port
numbers only have meaning at the endpoints of communications, so
inferring the type of a UDP payload based on port number may be
incorrect. Middleboxes MUST NOT take any action that would have
harmful side effects if a UDP packet were misinterpreted as being a
GUE packet. In particular, a middlebox MUST NOT modify a UDP payload
based on inferring the payload type from the port number lest the
middlebox cause silent data corruption.
A middlebox MAY interpret some flags and extension fields of the GUE
header for classification purposes, but is not required to understand
any of the flags or extension fields in GUE packets. A middlebox MUST
NOT drop a GUE packet merely because there are flags unknown to it.
Similarly, a middlebox MUST NOT arbitrarily filter packets based on
GUE flags or extension fields that are present or not present. The
header length in the GUE header allows a middlebox to inspect the
payload packet without needing to parse the flags or extension
fields.
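Locating the payload from the header length alone might be sketched as below. The function name is hypothetical, and the sketch assumes a variant 0 packet; it returns None rather than raising, since a middlebox that cannot interpret a packet is expected to leave it untouched.

```python
from typing import Optional


def inner_payload_offset(udp_payload: bytes) -> Optional[int]:
    """Offset of the encapsulated packet, derived from Hlen only.

    Flags and extension fields are never examined. Returns None when the
    bytes cannot be read as a variant 0 GUE header.
    """
    if len(udp_payload) < 4 or udp_payload[0] >> 6 != 0:
        return None
    offset = 4 + (udp_payload[0] & 0x1F) * 4
    if offset > len(udp_payload):
        return None
    return offset
```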
5.6. Router and switch operation
Routers and switches SHOULD forward GUE packets as standard UDP/IP Routers and switches SHOULD forward GUE packets as standard UDP/IP
packets. The outer five-tuple should contain sufficient information packets. The outer five-tuple should contain sufficient information
to perform flow classification corresponding to the flow of the inner to perform flow classification corresponding to the flow of the inner
packet. A router does not normally need to parse a GUE header, and packet. A router does not normally need to parse a GUE header, and
none of the flags or extension fields in the GUE header are expected none of the flags or extension fields in the GUE header are expected
to affect routing. In cases where the outer five-tuple does not to affect routing. In cases where the outer five-tuple does not
provide sufficient entropy for flow classification, for instance UDP provide sufficient entropy for flow classification, for instance UDP
ports are fixed to provide connection semantics (section 5.6.1), then ports are fixed to provide connection semantics (section 5.6.1), then
the encapsulated packet MAY be parsed to determine flow entropy. the encapsulated packet MAY be parsed to determine flow entropy.
A router MUST NOT modify a GUE header when forwarding a packet. It A router MUST NOT modify a GUE header or payload when forwarding a
MAY encapsulate a GUE packet in another GUE packet, for instance to packet. It MAY encapsulate a GUE packet in another GUE packet, for
implement a network tunnel (i.e. by encapsulating an IP packet with a instance to implement a network tunnel (i.e. by encapsulating an IP
GUE payload in another IP packet as a GUE payload). In this case, the packet with a GUE payload in another IP packet as a GUE payload). In
router takes the role of an encapsulator, and the corresponding this case, the router takes the role of an encapsulator, and the
decapsulator is the logical endpoint of the tunnel. When corresponding decapsulator is the logical endpoint of the tunnel.
encapsulating a GUE packet within another GUE packet, there are no When encapsulating a GUE packet within another GUE packet, there are
provisions to automatically copy flags or fields to the outer GUE no provisions to automatically copy flags or fields to the outer GUE
header. Each layer of encapsulation is considered independent. header. Each layer of encapsulation is considered independent.
5.6. Middlebox interactions 5.6.1. Connection semantics
A middlebox MAY interpret some flags and extension fields of the GUE
header for classification purposes, but is not required to understand
any of the flags or extension fields in GUE packets. A middlebox MUST
NOT drop a GUE packet merely because there are flags unknown to it.
The header length in the GUE header allows a middlebox to inspect the
payload packet without needing to parse the flags or extension
fields.
5.6.1. Inferring connection semantics
A middlebox might infer bidirectional connection semantics for a UDP A middlebox might infer bidirectional connection semantics for a UDP
flow. For instance, a stateful firewall might create a five-tuple flow. For instance, a stateful firewall might create a five-tuple
rule to match flows on egress, and a corresponding five-tuple rule rule to match flows on egress, and a corresponding five-tuple rule
for matching ingress packets where the roles of source and for matching ingress packets where the roles of source and
destination are reversed for the IP addresses and UDP port numbers. destination are reversed for the IP addresses and UDP port numbers.
To operate in this environment, a GUE tunnel should be configured to To operate in this environment, a GUE tunnel should be configured to
assume connected semantics defined by the UDP five-tuple and the use assume connected semantics defined by the UDP five-tuple and the use
of GUE encapsulation needs to be symmetric between both endpoints. of GUE encapsulation needs to be symmetric between both endpoints.
The source port set in the UDP header MUST be the destination port The source port set in the UDP header MUST be the destination port
skipping to change at page 20, line 40 skipping to change at page 21, line 8
described in section 5.11. described in section 5.11.
The selection of whether to make the UDP source port fixed or set to The selection of whether to make the UDP source port fixed or set to
a flow entropy value for each packet sent SHOULD be configurable for a flow entropy value for each packet sent SHOULD be configurable for
a tunnel. The default MUST be to set the flow entropy value in the a tunnel. The default MUST be to set the flow entropy value in the
UDP source port. UDP source port.
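
The interaction between a stateful firewall and the choice of UDP source port can be sketched as follows. This is an illustrative model only: the FiveTuple and StatefulFirewall names, the example addresses, and the port 50000 are assumptions made for the sketch and are not defined by this document.

```python
from typing import NamedTuple, Set


class FiveTuple(NamedTuple):
    proto: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int

    def mirror(self) -> "FiveTuple":
        """Swap the roles of source and destination."""
        return FiveTuple(self.proto, self.dst_ip, self.dst_port,
                         self.src_ip, self.src_port)


class StatefulFirewall:
    """Hypothetical firewall that admits only replies to egress flows."""

    def __init__(self) -> None:
        self._ingress_rules: Set[FiveTuple] = set()

    def egress(self, pkt: FiveTuple) -> None:
        # An outbound packet installs the mirrored ingress rule.
        self._ingress_rules.add(pkt.mirror())

    def ingress_allowed(self, pkt: FiveTuple) -> bool:
        return pkt in self._ingress_rules


GUE_PORT = 6080  # UDP port number assigned to GUE

fw = StatefulFirewall()
fw.egress(FiveTuple("udp", "192.0.2.1", 50000, "198.51.100.7", GUE_PORT))

# Reply with mirrored ports (connection semantics on both endpoints).
symmetric_reply = FiveTuple("udp", "198.51.100.7", GUE_PORT,
                            "192.0.2.1", 50000)
# Reply that carries a per-packet flow entropy value in the source port.
entropy_reply = FiveTuple("udp", "198.51.100.7", 51234,
                          "192.0.2.1", GUE_PORT)

print(fw.ingress_allowed(symmetric_reply))
print(fw.ingress_allowed(entropy_reply))
```

In this model only the mirrored reply matches the ingress rule, which is why symmetric use of GUE is needed when such a middlebox is on the path.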
5.6.2. NAT 5.6.2. NAT
IP address and port translation can be performed on the UDP/IP IP address and port translation can be performed on the UDP/IP
headers adhering to the requirements for NAT with UDP [RFC4787]. In headers adhering to the requirements for NAT (Network Address
the case of stateful NAT, connection semantics MUST be applied to a Translation) with UDP [RFC4787]. In the case of stateful NAT,
GUE tunnel as described in section 5.6.1. GUE endpoints MAY also connection semantics MUST be applied to a GUE tunnel as described in
invoke STUN [RFC5389] or ICE [RFC5245] to manage NAT port mappings section 5.6.1. GUE endpoints MAY also invoke STUN [RFC5389] or ICE
for encapsulations. [RFC5245] to manage NAT port mappings for encapsulations.
5.7. Checksum Handling 5.7. Checksum Handling
The potential for mis-delivery of packets due to corruption of IP, The potential for mis-delivery of packets due to corruption of IP,
UDP, or GUE headers needs to be considered. Historically, the UDP UDP, or GUE headers needs to be considered. Historically, the UDP
checksum would be considered sufficient as a check against corruption checksum would be considered sufficient as a check against corruption
of either the UDP header and payload or the IP addresses. of either the UDP header and payload or the IP addresses.
Encapsulation protocols, such as GUE, can be originated or terminated Encapsulation protocols, such as GUE, can be originated or terminated
on devices incapable of computing the UDP checksum for a packet. This on devices incapable of computing the UDP checksum for a packet. This
section discusses the requirements around checksum and alternatives section discusses the requirements around checksum and alternatives
that might be used when an endpoint does not support UDP checksum. that might be used when an endpoint does not support UDP checksum.
5.7.1. Requirements 5.7.1. Requirements
One of the following requirements MUST be met: One of the following requirements MUST be met:
o UDP checksums are enabled (for IPv4 or IPv6). o UDP checksums are enabled (for IPv4 or IPv6).
o The GUE header checksum is used (defined in [GUEEXTEN]). o The GUE header checksum is used (defined in [GUEEXTEN]).
o Use zero UDP checksums. This is always permissible with IPv4; in o Use zero UDP checksums. This is always permissible with IPv4; in
IPv6, they can only be used in accordance with applicable IPv6, they can only be used in accordance with applicable
requirements in [RFC8086], [RFC6935], and [RFC6936]. requirements in [RFC8086], [RFC6935], and [RFC6936].
5.7.2. UDP Checksum with IPv4 5.7.2. UDP Checksum with IPv4
For UDP in IPv4, the UDP checksum MUST be processed as specified in For UDP in IPv4, the UDP checksum MUST be processed as specified in
[RFC768] and [RFC1122] for both transmit and receive. An [RFC0768] and [RFC1122] for both transmit and receive. An
encapsulator MAY set the UDP checksum to zero for performance or encapsulator MAY set the UDP checksum to zero for performance or
implementation considerations. The IPv4 header includes a checksum implementation considerations. The IPv4 header includes a checksum
that protects against mis-delivery of the packet due to corruption that protects against mis-delivery of the packet due to corruption of
of IP addresses. The UDP checksum potentially provides protection IP addresses. The UDP checksum potentially provides protection
against corruption of the UDP header, GUE header, and GUE payload. against corruption of the UDP header, GUE header, and GUE payload.
Enabling or disabling the use of checksums is a deployment Enabling or disabling the use of checksums is a deployment
consideration that should take into account the risk and effects of consideration that should take into account the risk and effects of
packet corruption, and whether the packets in the network are packet corruption, and whether the packets in the network are already
already adequately protected by other, possibly stronger mechanisms, adequately protected by other, possibly stronger mechanisms, such as
such as the Ethernet CRC. If an encapsulator sets a zero UDP the Ethernet CRC. If an encapsulator sets a zero UDP checksum for
checksum for IPv4, it SHOULD use the GUE header checksum as IPv4, it SHOULD use the GUE header checksum as described in
described in [GUEEXTEN] assuming there are no other mechanisms used [GUEEXTEN] if there are no other mechanisms used that would detect
to protect the GUE packet. corruption of GUE packets.
When a decapsulator receives a packet, the UDP checksum field MUST When a decapsulator receives a packet, the UDP checksum field MUST be
be processed. If the UDP checksum is non-zero, the decapsulator MUST processed. If the UDP checksum is non-zero, the decapsulator MUST
verify the checksum before accepting the packet. By default, a verify the checksum before accepting the packet. By default, a
decapsulator SHOULD accept UDP packets with a zero checksum. A node decapsulator SHOULD accept UDP packets with a zero checksum. A node
MAY be configured to disallow zero checksums per [RFC1122]. MAY be configured to disallow zero checksums per [RFC1122].
Configuration of zero checksums can be selective. For instance, zero Configuration of zero checksums can be selective. For instance, zero
checksums might be disallowed from certain hosts that are known to checksums might be disallowed from certain hosts that are known to be
be traversing paths subject to packet corruption. If verification of traversing paths subject to packet corruption. If verification of a
a non-zero checksum fails, a decapsulator lacks the capability to non-zero checksum fails, a decapsulator lacks the capability to
verify a non-zero checksum, or a packet with a zero-checksum was verify a non-zero checksum, or a packet with a zero-checksum was
received and the decapsulator is configured to disallow, then the received and the decapsulator is configured to disallow that, then
packet MUST be dropped. the packet MUST be dropped.
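
The IPv4 receive rules above can be summarized as a small decision function. This is a minimal sketch: the function and parameter names are hypothetical, and checksum computation itself is assumed to happen elsewhere, with its result passed in.

```python
def accept_ipv4(udp_checksum: int,
                checksum_verifies: bool,
                can_verify: bool = True,
                allow_zero: bool = True) -> bool:
    """Return True to accept a received GUE packet, False to drop it.

    udp_checksum      -- value of the UDP checksum field
    checksum_verifies -- result of verifying a non-zero checksum
    can_verify        -- whether this node is able to verify checksums
    allow_zero        -- configuration; zero checksums accepted by default
    """
    if udp_checksum == 0:
        # Zero checksums are accepted unless configuration disallows them.
        return allow_zero
    if not can_verify:
        # A non-zero checksum that cannot be verified means drop.
        return False
    # A non-zero checksum must verify before the packet is accepted.
    return checksum_verifies
```

Selective configuration, for instance per source host, could be modeled by deriving allow_zero from the source address of the packet before calling the function.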
5.7.3. UDP Checksum with IPv6 5.7.3. UDP Checksum with IPv6
In IPv6, there is no checksum in the IPv6 header that protects In IPv6, there is no checksum in the IPv6 header that protects
against mis-delivery due to address corruption. Therefore, when GUE against mis-delivery due to address corruption. Therefore, when GUE
is used over IPv6, either the UDP checksum or the GUE header is used over IPv6, either the UDP checksum or the GUE header checksum
checksum SHOULD be used unless there are alternative mechanisms in SHOULD be used unless there are alternative mechanisms in use that
use that protect against misdelivery. The UDP checksum and GUE protect against misdelivery. The UDP checksum and GUE header checksum
header checksum SHOULD NOT be used at the same time since that would SHOULD NOT be used at the same time since that would be mostly
be mostly redundant. redundant.
If neither the UDP checksum or the GUE header checksum is used, then If neither the UDP checksum nor the GUE header checksum is used, then
the requirements for using zero IPv6 UDP checksums in [RFC6935] and the requirements for using zero IPv6 UDP checksums in [RFC6935] and
[RFC6936] MUST be met. [RFC6936] MUST be met.
When a decapsulator receives a packet, the UDP checksum field MUST When a decapsulator receives a packet, the UDP checksum field MUST be
be processed. If the UDP checksum is non-zero, the decapsulator MUST processed. If the UDP checksum is non-zero, the decapsulator MUST
verify the checksum before accepting the packet. By default a verify the checksum before accepting the packet. By default a
decapsulator MUST only accept UDP packets with a zero checksum if decapsulator MUST only accept UDP packets with a zero checksum if the
the GUE header checksum is used and is verified. If verification of GUE header checksum is used and is verified. If verification of a
a non-zero checksum fails, a decapsulator lacks the capability to non-zero checksum fails or a decapsulator lacks the capability to
verify a non-zero checksum, or a packet with a zero-checksum and no verify a non-zero checksum then the packet MUST be dropped. If a
GUE header checksum was received, the packet MUST be dropped. packet is received with a zero UDP checksum, no GUE header checksum,
and zero UDP checksums are disallowed then the packet MUST be
dropped.
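
The IPv6 receive rules can be sketched in the same style, following the wording of the newer revision shown here. The names are hypothetical. Dropping a packet whose GUE header checksum is present but fails verification is an assumption of this sketch, since that check is defined in [GUEEXTEN] rather than in this section.

```python
from typing import Optional


def accept_ipv6(udp_checksum: int,
                udp_checksum_verifies: bool,
                gue_checksum_verifies: Optional[bool] = None,
                can_verify: bool = True,
                allow_bare_zero: bool = False) -> bool:
    """Return True to accept a received GUE packet, False to drop it.

    gue_checksum_verifies -- None if no GUE header checksum is present,
                             otherwise the verification result
    allow_bare_zero       -- configuration permitting a zero UDP checksum
                             with no GUE header checksum (off by default)
    """
    if udp_checksum != 0:
        # A non-zero checksum must be verifiable and must verify.
        return can_verify and udp_checksum_verifies
    if gue_checksum_verifies is not None:
        # Zero UDP checksum protected by the GUE header checksum.
        # Assumption: a failed GUE header checksum means drop.
        return gue_checksum_verifies
    # Zero UDP checksum and no GUE header checksum.
    return allow_bare_zero
```

Setting allow_bare_zero is only appropriate where the requirements for zero IPv6 UDP checksums in [RFC6935] and [RFC6936] are met.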
5.8. MTU and fragmentation 5.8. MTU and fragmentation
Standard conventions for handling of MTU (Maximum Transmission Unit) Standard conventions for handling of MTU (Maximum Transmission Unit)
and fragmentation in conjunction with networking tunnels and fragmentation in conjunction with networking tunnels
(encapsulation of layer 2 or layer 3 packets) SHOULD be followed. (encapsulation of layer 2 or layer 3 packets) SHOULD be followed.
Details are described in MTU and Fragmentation Issues with In-the- Details are described in MTU and Fragmentation Issues with In-the-
Network Tunneling [RFC4459]. Network Tunneling [RFC4459].
If a packet is fragmented before encapsulation in GUE, all the If a packet is fragmented before encapsulation in GUE, all the
related fragments MUST be encapsulated using the same UDP source related fragments MUST be encapsulated using the same UDP source
port. An operator SHOULD set MTU to account for encapsulation port. An operator SHOULD set MTU to account for encapsulation
overhead and reduce the likelihood of fragmentation. overhead and reduce the likelihood of fragmentation.
As an alternative to IP fragmentation, the GUE fragmentation extension can As an alternative to IP fragmentation, the GUE fragmentation extension can
be used. GUE fragmentation is described in [GUEEXTEN]. be used. GUE fragmentation is described in [GUEEXTEN].
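
As a worked example of accounting for encapsulation overhead, the following sketch computes the largest inner packet that fits in a given path MTU. It assumes the GUE variant with a four-byte base header, an outer IPv4 header without options or an outer IPv6 header without extension headers, and a hypothetical gue_ext_len parameter for any GUE extension fields in use.

```python
IPV4_HEADER = 20      # bytes, outer IPv4 header without options
IPV6_HEADER = 40      # bytes, outer IPv6 header without extension headers
UDP_HEADER = 8        # bytes
GUE_BASE_HEADER = 4   # bytes, GUE header without extension fields


def inner_mtu(path_mtu: int, ipv6: bool = False, gue_ext_len: int = 0) -> int:
    """Largest encapsulated packet that avoids outer fragmentation."""
    outer_ip = IPV6_HEADER if ipv6 else IPV4_HEADER
    return path_mtu - outer_ip - UDP_HEADER - GUE_BASE_HEADER - gue_ext_len


print(inner_mtu(1500))             # outer IPv4
print(inner_mtu(1500, ipv6=True))  # outer IPv6
```

With a 1500-byte path MTU this gives 1468 bytes over IPv4 and 1448 bytes over IPv6 before any extension fields are counted.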
5.9. Congestion control 5.9. Congestion control
Per requirements of [RFC5405], if the IP traffic encapsulated with Per requirements of [RFC8085], if the IP traffic encapsulated with
GUE implements proper congestion control no additional mechanisms GUE implements proper congestion control then no additional
should be required. mechanisms should be required.
In the case that the encapsulated traffic does not implement any or In the case that the encapsulated traffic does not implement any or
sufficient control, or it is not known whether a transmitter will sufficient control, or it is not known whether a transmitter will
consistently implement proper congestion control, then congestion consistently implement proper congestion control, then congestion
control at the encapsulation layer MUST be provided per [RFC5405]. control at the encapsulation layer MUST be provided per [RFC8085].
Note that this case applies to a significant use case in network Note that this case applies to a significant use case in network
virtualization in which guests run third party networking stacks virtualization in which guests run third party networking stacks that
that cannot be implicitly trusted to implement conformant congestion cannot be implicitly trusted to implement conformant congestion
control. control.
Out of band mechanisms such as rate limiting, Managed Circuit Out of band mechanisms such as rate limiting, Managed Circuit Breaker
Breaker [RFC8084], or traffic isolation MAY be used to provide [RFC8084], or traffic isolation MAY be used to provide rudimentary
rudimentary congestion control. For finer-grained congestion control congestion control. For finer-grained congestion control that allows
that allows alternate congestion control algorithms, reaction time alternate congestion control algorithms, reaction time within an RTT,
within an RTT, and interaction with ECN, in-band mechanisms might be and interaction with ECN, in-band mechanisms might be warranted.
warranted.
5.10. Multicast 5.10. Multicast
GUE packets can be multicast to decapsulators using a multicast GUE packets can be multicast to decapsulators using a multicast
destination address in the encapsulating IP headers. Each receiving destination address in the outer IP header. Each receiving host will
host will decapsulate the packet independently following normal decapsulate the packet independently following normal decapsulator
decapsulator operations. The receiving decapsulators need to agree operations. The receiving decapsulators need to agree on the same set
on the same set of GUE parameters and properties; how such an of GUE parameters and properties; how such an agreement is reached is
agreement is reached is outside the scope of this document. outside the scope of this document.
GUE allows encapsulation of unicast, broadcast, or multicast GUE allows encapsulation of unicast, broadcast, or multicast traffic.
traffic. Flow entropy (the value in the UDP source port) can be Flow entropy (the value in the UDP source port) can be generated from
generated from the header of encapsulated unicast or the header of encapsulated unicast or broadcast/multicast packets at
broadcast/multicast packets at an encapsulator. The mapping an encapsulator. The mapping mechanism between the encapsulated
mechanism between the encapsulated multicast traffic and the multicast traffic and the multicast capability in the IP network is
multicast capability in the IP network is transparent and transparent and independent of the encapsulation and is otherwise
independent of the encapsulation and is otherwise outside the scope outside the scope of this document.
of this document.
5.11. Flow entropy for ECMP 5.11. Flow entropy for ECMP
A major objective of using GUE is that a network device can perform
flow classification corresponding to the flow of the inner
encapsulated packet based on the contents of the outer headers.
5.11.1. Flow classification 5.11.1. Flow classification
A major objective of using GUE is that a network device can perform When a packet is encapsulated with GUE and connection semantics are
flow classification corresponding to the flow of the inner not applied, the source port in the outer UDP packet is set to a flow
encapsulated packet based on the contents in the outer headers. entropy value that corresponds to the flow of the inner packet. When
a device computes a five-tuple hash on the outer UDP/IP header of a
Hardware devices commonly perform hash computations on packet GUE packet, the resultant value classifies the packet per its inner
headers to classify packets into flows or flow buckets. Flow flow.
classification is done to support load balancing of flows across a
set of networking resources. Examples of such load balancing
techniques are Equal Cost Multipath routing (ECMP), port selection
in Link Aggregation, and NIC device Receive Side Scaling (RSS).
Hashes are usually either a three-tuple hash of IP protocol, source
address, and destination address; or a five-tuple hash consisting of
IP protocol, source address, destination address, source port, and
destination port. Typically, networking hardware will compute five-
tuple hashes for TCP and UDP, but only three-tuple hashes for other
IP protocols. Since the five-tuple hash provides more granularity,
load balancing can be finer-grained with better distribution. When a
packet is encapsulated with GUE and connection semantics are not
applied, the source port in the outer UDP packet is set to a flow
entropy value that corresponds to the flow of the inner packet. When
a device computes a five-tuple hash on the outer UDP/IP header of a
GUE packet, the resultant value classifies the packet per its inner
flow.
Examples of deriving flow entropy for encapsulation are: Examples of deriving flow entropy for encapsulation are:
o If the encapsulated packet is a layer 4 packet, TCP/IPv4 for o If the encapsulated packet is a layer 4 packet, TCP/IPv4 for
instance, the flow entropy could be based on the canonical five- instance, the flow entropy could be based on the canonical five-
tuple hash of the inner packet. tuple hash of the inner packet.
o If the encapsulated packet is an AH transport mode packet with o If the encapsulated packet is an AH transport mode packet with
TCP as next header, the flow entropy could be a hash over a TCP as next header, the flow entropy could be a hash over a
three-tuple: TCP protocol and TCP ports of the encapsulated three-tuple: TCP protocol and TCP ports of the encapsulated
packet. packet.
o If a node is encrypting a packet using ESP tunnel mode and GUE o If a node is encrypting a packet using ESP tunnel mode and GUE
encapsulation, the flow entropy could be based on the contents encapsulation, the flow entropy could be based on the contents
of the clear-text packet. For instance, a canonical five-tuple of the clear-text packet. For instance, a canonical five-tuple
hash for a TCP/IP packet could be used. hash for a TCP/IP packet could be used.
[RFC6438] discusses methods to compute and set flow entropy value for [RFC6438] discusses methods to compute and set flow entropy value for
IPv6 flow labels. Such methods can also be used to create flow IPv6 flow labels, such methods can also be used to create flow
entropy values for GUE. entropy values for GUE.
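
The effect of the outer source port on load balancing can be illustrated with a simplified model of ECMP path selection. The function name, the example addresses, and the use of CRC-32 as the hash are assumptions of this sketch; real devices use their own hash functions.

```python
import ipaddress
import struct
import zlib


def ecmp_path(src_ip: str, dst_ip: str, proto: int,
              src_port: int, dst_port: int, n_paths: int) -> int:
    """Pick one of n_paths from a five-tuple hash of the outer headers."""
    key = (ipaddress.ip_address(src_ip).packed
           + ipaddress.ip_address(dst_ip).packed
           + struct.pack("!BHH", proto, src_port, dst_port))
    # CRC-32 stands in for whatever hash the device implements.
    return zlib.crc32(key) % n_paths


UDP = 17
GUE_PORT = 6080

# Fixed source port: every inner flow between the two tunnel endpoints
# produces the same outer five-tuple and therefore the same path.
fixed = {ecmp_path("192.0.2.1", "198.51.100.7", UDP, GUE_PORT, GUE_PORT, 4)
         for _ in range(100)}

# Flow entropy in the source port: the path depends on the inner flow.
spread = {ecmp_path("192.0.2.1", "198.51.100.7", UDP, sport, GUE_PORT, 4)
          for sport in range(49152, 49252)}

print(len(fixed), sorted(spread))
```

The fixed-port case always collapses onto a single path, while the entropy case lets the device distribute inner flows without parsing the encapsulated packet.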
5.11.2. Flow entropy properties 5.11.2. Flow entropy properties
The flow entropy is the value set in the UDP source port of a GUE The flow entropy is the value set in the UDP source port of a GUE
packet. Flow entropy in the UDP source port SHOULD adhere to the packet. Flow entropy in the UDP source port SHOULD adhere to the
following properties: following properties:
o The value set in the source port is within the ephemeral port o The value set in the source port is within the ephemeral port
range (49152 to 65535 [RFC6335]). Since the high order two bits range (49152 to 65535 [RFC6335]). Since the high order two bits
skipping to change at page 25, line 16 skipping to change at page 25, line 18
o Decapsulators, or any networking devices, SHOULD NOT attempt to o Decapsulators, or any networking devices, SHOULD NOT attempt to
interpret flow entropy as anything more than an opaque value. interpret flow entropy as anything more than an opaque value.
Neither should they attempt to reproduce the hash calculation Neither should they attempt to reproduce the hash calculation
used by an encapsulator in creating a flow entropy value. They used by an encapsulator in creating a flow entropy value. They
MAY use the value to match further received packets for steering MAY use the value to match further received packets for steering
decisions, but MUST NOT assume that the hash uniquely or decisions, but MUST NOT assume that the hash uniquely or
permanently identifies a flow. permanently identifies a flow.
o Input to the flow entropy calculation is not restricted to ports o Input to the flow entropy calculation is not restricted to ports
and addresses; input could include flow label from an IPv6 and addresses; input could include the flow label from an IPv6
packet, SPI from an ESP packet, or other flow related state in packet, SPI from an ESP packet, or other flow related state in
the encapsulator that is not necessarily conveyed in the packet. the encapsulator that is not necessarily conveyed in the packet.
o The assignment function for flow entropy SHOULD be randomly o The assignment function for flow entropy SHOULD be randomly
seeded to mitigate denial of service attacks. The seed SHOULD be seeded to mitigate denial of service attacks. The seed SHOULD be
changed periodically. changed periodically.
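
One way an encapsulator might derive a flow entropy value with these properties is sketched below. It hashes the inner five-tuple with a randomly generated key and maps the result into the ephemeral port range. The function name, the choice of keyed BLAKE2b, and the 16-byte seed length are assumptions of this sketch, and the five-tuple is hashed as given, without any canonical ordering of the endpoints.

```python
import hashlib
import ipaddress
import os
import struct

EPHEMERAL_BASE = 0xC000  # 49152, lowest port in the ephemeral range
ENTROPY_MASK = 0x3FFF    # remaining low-order bits, 49152 through 65535


def flow_entropy_port(seed: bytes, src_ip: str, dst_ip: str, proto: int,
                      src_port: int, dst_port: int) -> int:
    """Map an inner five-tuple to a UDP source port in 49152..65535."""
    data = (ipaddress.ip_address(src_ip).packed
            + ipaddress.ip_address(dst_ip).packed
            + struct.pack("!BHH", proto, src_port, dst_port))
    # Keyed hash: the seed makes the mapping unpredictable to outsiders.
    digest = hashlib.blake2b(data, digest_size=8, key=seed).digest()
    return EPHEMERAL_BASE | (int.from_bytes(digest, "big") & ENTROPY_MASK)


seed = os.urandom(16)  # random seed, to be replaced periodically
port = flow_entropy_port(seed, "10.0.0.1", "10.0.0.2", 6, 43210, 443)
print(port)
```

Packets of the same inner flow map to the same port for as long as the seed is unchanged. Replacing the seed changes the mapping, which is one reason receivers cannot treat the value as a permanent flow identifier.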
5.12 Negotiation of acceptable flags and extension fields 5.12. Negotiation of acceptable flags and extension fields
An encapsulator and decapsulator need to achieve agreement about GUE An encapsulator and decapsulator need to achieve agreement about GUE
parameters that will be used in communications. Parameters include parameters that will be used in communications. Parameters include
supported GUE variants, flags and extension fields that can be used, supported GUE variants, flags and extension fields that can be used,
security algorithms and keys, supported protocols and control security algorithms and keys, supported protocols and control
messages, etc. This document proposes different general methods to messages, etc. This document proposes different general methods to
accomplish this; however, the details of implementing these are accomplish this; however, the details of implementing these are
considered out of scope. considered out of scope.
General methods for this are: General methods for this are:
o Configuration. The parameters used for a tunnel are configured o Configuration. The parameters used for a tunnel are configured
at each endpoint. at each endpoint.
o Negotiation. A tunnel negotiation can be performed. This could o Negotiation. A tunnel negotiation can be performed. This could
be accomplished in-band of GUE using control messages or private be accomplished in-band of GUE using control messages.
data.
o Via a control plane. Parameters for communicating with a tunnel o Via a control plane. Parameters for communicating with a tunnel
endpoint can be set in a control plane protocol (such as that endpoint can be set in a control plane protocol (such as that
needed for network virtualization). needed for network virtualization).
o Via security negotiation. Use of security typically implies a o Via security negotiation. Use of security typically implies a
key exchange between endpoints. Other GUE parameters may be key exchange between endpoints. Other GUE parameters may be
conveyed as part of that process. conveyed as part of that process.
6. Motivation for GUE 6. Motivation for GUE
This section presents the motivation for GUE with respect to other This section provides the motivation for GUE with respect to other
encapsulation methods. encapsulation methods.
6.1. Benefits of GUE 6.1. Benefits of GUE
* GUE is a generic encapsulation protocol. GUE can encapsulate * GUE is a generic encapsulation protocol. GUE can encapsulate
protocols that are represented by an IP protocol number. This protocols that are represented by an IP protocol number. This
includes layer 2, layer 3, and layer 4 protocols. includes layer 2, layer 3, and layer 4 protocols.
* GUE is an extensible encapsulation protocol. Standardized * GUE is an extensible encapsulation protocol. Standardized
optional data such as security, virtual networking identifiers, and optional data such as security, virtual networking identifiers, and
fragmentation are being defined. fragmentation are defined.
* For extensilbity, GUE uses flag fields as opposed to TLVs as * For extensibility, GUE uses flag fields as opposed to TLVs as
some other encapsulation protocols do. Flag fields are strictly some other encapsulation protocols do. Flag fields are strictly
ordered, allow random access, and are efficient in use of header ordered, allow random access, and are efficient in use of header
space. space.
* GUE allows private data to be sent as part of the encapsulation. * GUE allows private data to be sent as part of the encapsulation.
This permits experimentation or customization in deployment. This permits experimentation or customization in deployment.
* GUE allows sending of control messages such as OAM using the * GUE allows sending of control messages such as OAM using the
same GUE header format (for routing purposes) as normal data same GUE header format (for routing purposes) as normal data
messages. messages.
* GUE maximizes deliverability of non-UDP and non-TCP protocols. * GUE maximizes deliverability of non-UDP and non-TCP protocols.
* GUE provides a means for exposing per flow entropy for ECMP for * GUE provides a means for exposing per flow entropy for ECMP for
atypical protocols such as SCTP, DCCP, ESP, etc. atypical protocols such as SCTP, DCCP, ESP, etc.
6.2 Comparison of GUE to other encapsulations 6.2. Comparison of GUE to other encapsulations
A number of different encapsulation techniques have been proposed for A number of different encapsulation techniques have been proposed for
the encapsulation of one protocol over another. EtherIP [RFC3378] the encapsulation of one protocol over another. EtherIP [RFC3378]
provides layer 2 tunneling of Ethernet frames over IP. GRE [RFC2784], provides layer 2 tunneling of Ethernet frames over IP. GRE [RFC2784],
MPLS [RFC4023], and L2TP [RFC2661] provide methods for tunneling MPLS [RFC4023], and L2TP [RFC2661] provide methods for tunneling
layer 2 and layer 3 packets over IP. NVGRE [RFC7637] and VXLAN layer 2 and layer 3 packets over IP. NVGRE [RFC7637] and VXLAN
[RFC7348] are proposals for encapsulation of layer 2 packets for [RFC7348] are proposals for encapsulation of layer 2 packets for
network virtualization. IPIP [RFC2003] and Generic packet tunneling network virtualization. IPIP [RFC2003] and Generic packet tunneling
in IPv6 [RFC2473] provide methods for tunneling IP packets over IP. in IPv6 [RFC2473] provide methods for tunneling IP packets over IP.
skipping to change at page 27, line 13 skipping to change at page 27, line 13
[RFC8086]. [RFC8086].
GUE has the following discriminating features: GUE has the following discriminating features:
o UDP encapsulation leverages specialized network device o UDP encapsulation leverages specialized network device
processing for efficient transport. The semantics for using the processing for efficient transport. The semantics for using the
UDP source port for flow entropy as input to ECMP are defined in UDP source port for flow entropy as input to ECMP are defined in
section 5.11. section 5.11.
o GUE permits encapsulation of arbitrary IP protocols, which o GUE permits encapsulation of arbitrary IP protocols, which
includes layer 2 3, and 4 protocols. includes layer 2, 3, and 4 protocols.
o Multiple protocols can be multiplexed over a single UDP port o Multiple protocols can be multiplexed over a single UDP port
number. This is in contrast to techniques to encapsulate number. This is in contrast to techniques to encapsulate
protocols over UDP using a protocol specific port number (such protocols over UDP using a protocol specific port number (such
as ESP/UDP, GRE/UDP, SCTP/UDP). GUE provides a uniform and as ESP/UDP, GRE/UDP, SCTP/UDP). GUE provides a uniform and
extensible mechanism for encapsulating all IP protocols in UDP extensible mechanism for encapsulating all IP protocols in UDP
with minimal overhead (four bytes of additional header). with minimal overhead (four bytes of additional header).
o GUE is extensible. New flags and extension fields can be o GUE is extensible. New flags and extension fields can be
defined. defined.
skipping to change at page 27, line 37 skipping to change at page 27, line 37
to parse the full encapsulation header. to parse the full encapsulation header.
o Private data in the encapsulation header allows local o Private data in the encapsulation header allows local
customization and experimentation while being compatible with customization and experimentation while being compatible with
processing in network nodes (routers and middleboxes). processing in network nodes (routers and middleboxes).
o GUE includes both data messages (encapsulation of packets) and o GUE includes both data messages (encapsulation of packets) and
control messages (such as OAM). control messages (such as OAM).
o The flags-field model facilitates efficient implementation of o The flags-field model facilitates efficient implementation of
extensibility in hardware. For instance, a TCAM can be use to extensibility in hardware. For instance, a TCAM can be used to
parse a known set of N flags where the number of entries in the parse a known set of N flags where the number of entries in the
TCAM is 2^N. By comparison, the number of TCAM entries needed to TCAM is 2^N. By comparison, the number of TCAM entries needed to
parse a set of N arbitrarily ordered TLVS is approximately e*N!. parse a set of N arbitrarily ordered TLVs is approximately e*N!.
o GUE includes a variant that encapsulates IPv4 and IPv6 packets o GUE includes a variant that encapsulates IPv4 and IPv6 packets
directly within UDP. directly within UDP.
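
The TCAM comparison above can be checked numerically. The sketch below counts 2^N entries for N flags and, as one plausible reading of the e*N! figure, counts every ordered sequence of distinct TLVs drawn from a set of N. The function names and this interpretation are assumptions of the sketch.

```python
import math


def flag_entries(n: int) -> int:
    """Entries needed to match every combination of n flags."""
    return 2 ** n


def tlv_orderings(n: int) -> int:
    """Ordered sequences of k distinct TLVs out of n, summed over k."""
    return sum(math.perm(n, k) for k in range(n + 1))


for n in (4, 8):
    approx = math.e * math.factorial(n)
    print(n, flag_entries(n), tlv_orderings(n), round(approx, 2))
```

For N = 8 this gives 256 entries for flags against 109601 orderings for TLVs, which agrees with e*8! rounded down.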
7. Security Considerations 7. Security Considerations
There are two important considerations of security with respect to There are two important considerations of security with respect to
GUE. GUE.
o Authentication and integrity of the GUE header. o Authentication and integrity of the GUE header.
o Authentication, integrity, and confidentiality of the GUE o Authentication, integrity, and confidentiality of the GUE
payload. payload.
GUE security is provided by extensions for security defined in GUE security is provided by extensions for security defined in
[GUEEXTEN]. These extensions include methods to authenticate the GUE [GUEEXTEN]. These extensions include methods to authenticate the GUE
header and encrypt the GUE payload. header and encrypt the GUE payload.
The GUE header can be authenticated using a security extension for an The GUE header can be authenticated using a security extension for an
HMAC. Securing the GUE payload can be accomplished by use of the GUE HMAC (Hashed Message Authentication Code). Securing the GUE payload
Payload Transform. This extension can be used to perform DTLS in the can be accomplished by use of the GUE Payload Transform extension. This
payload of a GUE packet to encrypt the payload. extension allows the use of DTLS (Datagram Transport Layer Security)
to encrypt and authenticate the GUE payload.
A hash function for computing flow entropy (section 5.11) SHOULD be A hash function for computing flow entropy (section 5.11) SHOULD be
randomly seeded to mitigate some possible denial of service attacks. randomly seeded to mitigate some possible denial of service attacks.
8. IANA Considerations 8. IANA Considerations
8.1. UDP source port 8.1. UDP source port
A user UDP port number assignment for GUE has been assigned: A user UDP port number assignment for GUE has been assigned:
skipping to change at page 29, line 8 skipping to change at page 29, line 8
Description: Generic UDP Encapsulation Description: Generic UDP Encapsulation
Reference: draft-herbert-gue Reference: draft-herbert-gue
Port Number: 6080 Port Number: 6080
Service Code: N/A Service Code: N/A
Known Unauthorized Uses: N/A Known Unauthorized Uses: N/A
Assignment Notes: N/A Assignment Notes: N/A
8.2. GUE variant number 8.2. GUE variant number
IANA is requested to set up a registry for the GUE variant number. IANA is requested to set up a registry for the GUE variant number.
The GUE variant number is 2 bits containing four possible values. The GUE variant number is two bits containing four possible values.
This document defines version 0 and 1. New values are assigned in This document defines variants 0 and 1. New values are assigned in
accordance with RFC Required policy [RFC5226]. accordance with RFC Required policy [RFC5226].
+----------------+----------------+---------------+ +----------------+----------------+---------------+
| Variant number | Description | Reference | | Variant number | Description | Reference |
+----------------+----------------+---------------+ +----------------+----------------+---------------+
| 0 | GUE Version 0 | This document | | 0 | GUE Version 0 | This document |
| | with header | | | | with header | |
| | | | | | | |
| 1 | GUE Version 0 | This document | | 1 | GUE Version 0 | This document |
| | with direct IP | | | | with direct IP | |
skipping to change at page 29, line 48 skipping to change at page 29, line 48
| | | | | | | |
| 1..127 | Unassigned | | | 1..127 | Unassigned | |
| | | | | | | |
| 128..255 | User defined | This document | | 128..255 | User defined | This document |
+----------------+------------------+---------------+ +----------------+------------------+---------------+
9. Acknowledgements 9. Acknowledgements
The authors would like to thank David Liu, Erik Nordmark, Fred The authors would like to thank David Liu, Erik Nordmark, Fred
Templin, Adrian Farrel, Bob Briscoe, and Murray Kucherawy for Templin, Adrian Farrel, Bob Briscoe, and Murray Kucherawy for
valuable input on this draft. valuable input on this draft. Special thanks to Fred Templin who is
serving as document shepherd.
10. References 10. References
10.1. Normative References 10.1. Normative References
[RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768, DOI [RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768, DOI
10.17487/RFC0768, August 1980, <http://www.rfc- 10.17487/RFC0768, August 1980, <http://www.rfc-
editor.org/info/rfc768>. editor.org/info/rfc768>.
[RFC1122] Braden, R., Ed., "Requirements for Internet Hosts - [RFC8085] Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage
Communication Layers", STD 3, RFC 1122, DOI Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085,
10.17487/RFC1122, October 1989, <http://www.rfc- March 2017, <https://www.rfc-editor.org/info/rfc8085>.
editor.org/info/rfc1122>.
[RFC2434] Narten, T. and H. Alvestrand, "Guidelines for Writing an [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
IANA Considerations Section in RFCs", RFC 2434, DOI Requirement Levels", BCP 14, RFC 2119, DOI
10.17487/RFC2434, October 1998, <http://www.rfc- 10.17487/RFC2119, March 1997, <https://www.rfc-
editor.org/info/rfc2434>. editor.org/info/rfc2119>.
[RFC2983] Black, D., "Differentiated Services and Tunnels", RFC [RFC2983] Black, D., "Differentiated Services and Tunnels", RFC
2983, DOI 10.17487/RFC2983, October 2000, <http://www.rfc- 2983, DOI 10.17487/RFC2983, October 2000, <http://www.rfc-
editor.org/info/rfc2983>. editor.org/info/rfc2983>.
[RFC6040] Briscoe, B., "Tunnelling of Explicit Congestion [RFC6040] Briscoe, B., "Tunnelling of Explicit Congestion
Notification", RFC 6040, DOI 10.17487/RFC6040, November Notification", RFC 6040, DOI 10.17487/RFC6040, November
2010, <http://www.rfc-editor.org/info/rfc6040>. 2010, <http://www.rfc-editor.org/info/rfc6040>.
[RFC6935] Eubanks, M., Chimento, P., and M. Westerlund, "IPv6 and [RFC6935] Eubanks, M., Chimento, P., and M. Westerlund, "IPv6 and
UDP Checksums for Tunneled Packets", RFC 6935, DOI UDP Checksums for Tunneled Packets", RFC 6935, DOI
10.17487/RFC6935, April 2013, <http://www.rfc- 10.17487/RFC6935, April 2013, <http://www.rfc-
editor.org/info/rfc6935>. editor.org/info/rfc6935>.
[RFC6936] Fairhurst, G. and M. Westerlund, "Applicability Statement [RFC6936] Fairhurst, G. and M. Westerlund, "Applicability Statement
for the Use of IPv6 UDP Datagrams with Zero Checksums", for the Use of IPv6 UDP Datagrams with Zero Checksums",
RFC 6936, DOI 10.17487/RFC6936, April 2013, RFC 6936, DOI 10.17487/RFC6936, April 2013,
<http://www.rfc-editor.org/info/rfc6936>. <http://www.rfc-editor.org/info/rfc6936>.
[RFC1122] Braden, R., Ed., "Requirements for Internet Hosts -
Communication Layers", STD 3, RFC 1122, DOI
10.17487/RFC1122, October 1989, <http://www.rfc-
editor.org/info/rfc1122>.
[RFC4459] Savola, P., "MTU and Fragmentation Issues with In-the- [RFC4459] Savola, P., "MTU and Fragmentation Issues with In-the-
Network Tunneling", RFC 4459, DOI 10.17487/RFC4459, April Network Tunneling", RFC 4459, DOI 10.17487/RFC4459, April
2006, <http://www.rfc-editor.org/info/rfc4459>. 2006, <http://www.rfc-editor.org/info/rfc4459>.
10.2. Informative References [RFC6335] Cotton, M., Eggert, L., Touch, J., Westerlund, M., and S.
Cheshire, "Internet Assigned Numbers Authority (IANA)
[RFC3828] Larzon, L-A., Degermark, M., Pink, S., Jonsson, L-E., Ed., Procedures for the Management of the Service Name and
and G. Fairhurst, Ed., "The Lightweight User Datagram Transport Protocol Port Number Registry", BCP 165, RFC
Protocol (UDP-Lite)", RFC 3828, July 2004, 6335, DOI 10.17487/RFC6335, August 2011, <https://www.rfc-
<http://www.rfc-editor.org/info/rfc3828>. editor.org/info/rfc6335>.
[RFC7348] Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger, [RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing an
L., Sridhar, T., Bursell, M., and C. Wright, "Virtual IANA Considerations Section in RFCs", RFC 5226, DOI
eXtensible Local Area Network (VXLAN): A Framework for 10.17487/RFC5226, May 2008, <https://www.rfc-
Overlaying Virtualized Layer 2 Networks over Layer 3 editor.org/info/rfc5226>.
Networks", RFC 7348, August 2014, <http://www.rfc-
editor.org/info/rfc7348>.
[RFC7605] Touch, J., "Recommendations on Using Assigned Transport [GUEEXTEN] Herbert, T., Yong, L., and Templin, F., "Extensions for
Port Numbers", BCP 165, RFC 7605, DOI 10.17487/RFC7605, Generic UDP Encapsulation", draft-herbert-gue-extensions-
August 2015, <http://www.rfc-editor.org/info/rfc7605>. 06
[RFC7637] Garg, P., Ed., and Y. Wang, Ed., "NVGRE: Network 10.2. Informative References
Virtualization Using Generic Routing Encapsulation", RFC
7637, DOI 10.17487/RFC7637, September 2015,
<http://www.rfc-editor.org/info/rfc7637>.
[RFC8086] Yong, L., Ed., Crabbe, E., Xu, X., and T. Herbert, "GRE- [RFC8086] Yong, L., Ed., Crabbe, E., Xu, X., and T. Herbert, "GRE-
in-UDP Encapsulation", RFC 8086, DOI 10.17487/RFC8086, in-UDP Encapsulation", RFC 8086, DOI 10.17487/RFC8086,
March 2017, <http://www.rfc-editor.org/info/rfc8086>. March 2017, <http://www.rfc-editor.org/info/rfc8086>.
[RFC7510] Xu, X., Sheth, N., Yong, L., Callon, R., and D. Black, [RFC7605] Touch, J., "Recommendations on Using Assigned Transport
"Encapsulating MPLS in UDP", RFC 7510, DOI Port Numbers", BCP 165, RFC 7605, DOI 10.17487/RFC7605,
10.17487/RFC7510, April 2015, <http://www.rfc- August 2015, <https://www.rfc-editor.org/info/rfc7605>.
editor.org/info/rfc7510>.
[RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram
Congestion Control Protocol (DCCP)", RFC 4340, DOI
10.17487/RFC4340, March 2006, <http://www.rfc-
editor.org/info/rfc4340>.
[RFC4787] Audet, F., Ed., and C. Jennings, "Network Address [RFC4787] Audet, F., Ed., and C. Jennings, "Network Address
Translation (NAT) Behavioral Requirements for Unicast Translation (NAT) Behavioral Requirements for Unicast
UDP", BCP 127, RFC 4787, DOI 10.17487/RFC4787, January UDP", BCP 127, RFC 4787, DOI 10.17487/RFC4787, January
2007, <http://www.rfc-editor.org/info/rfc4787>. 2007, <http://www.rfc-editor.org/info/rfc4787>.
[RFC5389] Rosenberg, J., Mahy, R., Matthews, P., and D. Wing, [RFC5389] Rosenberg, J., Mahy, R., Matthews, P., and D. Wing,
"Session Traversal Utilities for NAT (STUN)", RFC 5389, "Session Traversal Utilities for NAT (STUN)", RFC 5389,
DOI 10.17487/RFC5389, October 2008, <http://www.rfc- DOI 10.17487/RFC5389, October 2008, <http://www.rfc-
editor.org/info/rfc5389>. editor.org/info/rfc5389>.
[RFC5285] Rosenberg, J., "Interactive Connectivity Establishment [RFC5245] Rosenberg, J., "Interactive Connectivity Establishment
(ICE): A Protocol for Network Address Translator (NAT) (ICE): A Protocol for Network Address Translator (NAT)
Traversal for Offer/Answer Protocols", RFC 5245, DOI Traversal for Offer/Answer Protocols", RFC 5245, DOI
10.17487/RFC5245, April 2010, <http://www.rfc- 10.17487/RFC5245, April 2010, <http://www.rfc-
editor.org/info/rfc5245>. editor.org/info/rfc5245>.
[RFC5405] Eggert, L. and G. Fairhurst, "Unicast UDP Usage Guidelines [RFC8084] Fairhurst, G., "Network Transport Circuit Breakers", BCP
for Application Designers", BCP 145, RFC 5405, DOI 208, RFC 8084, DOI 10.17487/RFC8084, March 2017,
10.17487/RFC5405, November 2008, <http://www.rfc- <https://www.rfc-editor.org/info/rfc8084>.
editor.org/info/rfc5405>.
[RFC6438] Carpenter, B. and S. Amante, "Using the IPv6 Flow Label [RFC6438] Carpenter, B. and S. Amante, "Using the IPv6 Flow Label
for Equal Cost Multipath Routing and Link Aggregation in for Equal Cost Multipath Routing and Link Aggregation in
Tunnels", RFC 6438, DOI 10.17487/RFC6438, November 2011, Tunnels", RFC 6438, DOI 10.17487/RFC6438, November 2011,
<http://www.rfc-editor.org/info/rfc6438>. <http://www.rfc-editor.org/info/rfc6438>.
[RFC2003] Perkins, C., "IP Encapsulation within IP", RFC 2003, DOI
10.17487/RFC2003, October 1996, <http://www.rfc-
editor.org/info/rfc2003>.
[RFC3948] Huttunen, A., Swander, B., Volpe, V., DiBurro, L., and M.
Stenberg, "UDP Encapsulation of IPsec ESP Packets", RFC
3948, DOI 10.17487/RFC3948, January 2005, <http://www.rfc-
editor.org/info/rfc3948>.
[RFC6830] Farinacci, D., Fuller, V., Meyer, D., and D. Lewis, "The
Locator/ID Separation Protocol (LISP)", RFC 6830, DOI
10.17487/RFC6830, January 2013, <http://www.rfc-
editor.org/info/rfc6830>.
[RFC3378] Housley, R. and S. Hollenbeck, "EtherIP: Tunneling [RFC3378] Housley, R. and S. Hollenbeck, "EtherIP: Tunneling
Ethernet Frames in IP Datagrams", RFC 3378, DOI Ethernet Frames in IP Datagrams", RFC 3378, DOI
10.17487/RFC3378, September 2002, <http://www.rfc- 10.17487/RFC3378, September 2002, <http://www.rfc-
editor.org/info/rfc3378>. editor.org/info/rfc3378>.
[RFC2784] Farinacci, D., Li, T., Hanks, S., Meyer, D., and P. [RFC2784] Farinacci, D., Li, T., Hanks, S., Meyer, D., and P.
Traina, "Generic Routing Encapsulation (GRE)", RFC 2784, Traina, "Generic Routing Encapsulation (GRE)", RFC 2784,
DOI 10.17487/RFC2784, March 2000, <http://www.rfc- DOI 10.17487/RFC2784, March 2000, <http://www.rfc-
editor.org/info/rfc2784>. editor.org/info/rfc2784>.
[RFC4023] Worster, T., Rekhter, Y., and E. Rosen, Ed., [RFC4023] Worster, T., Rekhter, Y., and E. Rosen, Ed.,
"Encapsulating MPLS in IP or Generic Routing Encapsulation "Encapsulating MPLS in IP or Generic Routing Encapsulation
(GRE)", RFC 4023, DOI 10.17487/RFC4023, March 2005, (GRE)", RFC 4023, DOI 10.17487/RFC4023, March 2005,
<http://www.rfc-editor.org/info/rfc4023>. <http://www.rfc-editor.org/info/rfc4023>.
[RFC2661] Townsley, W., Valencia, A., Rubens, A., Pall, G., Zorn, [RFC2661] Townsley, W., Valencia, A., Rubens, A., Pall, G., Zorn,
G., and B. Palter, "Layer Two Tunneling Protocol "L2TP"", G., and B. Palter, "Layer Two Tunneling Protocol "L2TP"",
RFC 2661, DOI 10.17487/RFC2661, August 1999, RFC 2661, DOI 10.17487/RFC2661, August 1999,
<http://www.rfc-editor.org/info/rfc2661>. <http://www.rfc-editor.org/info/rfc2661>.
[RFC8084] Fairhurst, G., "Network Transport Circuit Breakers", BCP [RFC7637] Garg, P., Ed., and Y. Wang, Ed., "NVGRE: Network
208, RFC 8084, DOI 10.17487/RFC8084, March 2017, Virtualization Using Generic Routing Encapsulation", RFC
<https://www.rfc-editor.org/info/rfc8084>. 7637, DOI 10.17487/RFC7637, September 2015,
<https://www.rfc-editor.org/info/rfc7637>.
[GUEEXTEN] Herbert, T., Yong, L., and Templin, F., "Extensions for [RFC7348] Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger,
Generic UDP Encapsulation" draft-herbert-gue-extensions-00 L., Sridhar, T., Bursell, M., and C. Wright, "Virtual
eXtensible Local Area Network (VXLAN): A Framework for
Overlaying Virtualized Layer 2 Networks over Layer 3
Networks", RFC 7348, August 2014, <http://www.rfc-
editor.org/info/rfc7348>.
[GUE4NVO3] Yong, L., Herbert, T., Zia, O., "Generic UDP Encapsulation [RFC2003] Perkins, C., "IP Encapsulation within IP", RFC 2003, DOI
(GUE) for Network Virtualization Overlay" draft-hy-nvo3- 10.17487/RFC2003, October 1996, <http://www.rfc-
gue-4-nvo-03 editor.org/info/rfc2003>.
[GUESEC] Yong, L., Herbert, T., "Generic UDP Encapsulation (GUE) [RFC2473] Conta, A. and S. Deering, "Generic Packet Tunneling in
for Secure Transport" draft-hy-gue-4-secure-transport-03 IPv6 Specification", RFC 2473, DOI 10.17487/RFC2473,
December 1998, <https://www.rfc-editor.org/info/rfc2473>.
[RFC3948] Huttunen, A., Swander, B., Volpe, V., DiBurro, L., and M.
Stenberg, "UDP Encapsulation of IPsec ESP Packets", RFC
3948, DOI 10.17487/RFC3948, January 2005, <http://www.rfc-
editor.org/info/rfc3948>.
[RFC6830] Farinacci, D., Fuller, V., Meyer, D., and D. Lewis, "The
Locator/ID Separation Protocol (LISP)", RFC 6830, DOI
10.17487/RFC6830, January 2013, <http://www.rfc-
editor.org/info/rfc6830>.
[RFC7510] Xu, X., Sheth, N., Yong, L., Callon, R., and D. Black,
"Encapsulating MPLS in UDP", RFC 7510, DOI
10.17487/RFC7510, April 2015, <http://www.rfc-
editor.org/info/rfc7510>.
[IANA-PN] IANA, "Protocol Numbers",
<https://www.iana.org/assignments/protocol-numbers>.
[TCPUDP] Cheshire, S., Graessley, J., and McGuire, R., [TCPUDP] Cheshire, S., Graessley, J., and McGuire, R.,
"Encapsulation of TCP and other Transport Protocols over "Encapsulation of TCP and other Transport Protocols over
UDP" draft-cheshire-tcp-over-udp-00 UDP", draft-cheshire-tcp-over-udp-00
[TOU] Herbert, T., "Transport layer protocols over UDP" draft-
herbert-transports-over-udp-00
[GENEVE] Gross, J., Ed., Ganga, I. Ed., and Sridhar, T., "Geneve: [GENEVE] Gross, J., Ed., Ganga, I. Ed., and Sridhar, T., "Geneve:
Generic Network Virtualization Encapsulation", draft-ietf- Generic Network Virtualization Encapsulation", draft-ietf-
nvo3-geneve-05 nvo3-geneve-10
[LCO] Cree, E., https://www.kernel.org/doc/Documentation/ [UDPENCAP] Herbert, T., "UDP Encapsulation in Linux",
networking/checksum-offloads.txt <http://people.netfilter.org/pablo/netdev0.1/papers/UDP-
Encapsulation-in-Linux.pdf>
[MULTIQ] Herbert, T. and de Bruijn, W., "Scaling in the Linux
Networking Stack", <https://www.kernel.org/doc/
Documentation/networking/scaling.txt>
[CSUMOFF] Cree, E., "Checksum Offloads in the Linux Networking
Stack", <https://www.kernel.org/doc/Documentation/
networking/checksum-offloads.txt>
[SEGOFF] Duyck, A., "Segmentation Offloads in the Linux Networking
Stack", <https://www.kernel.org/doc/
Documentation/networking/segmentation-offloads.txt>
Appendix A: NIC processing for GUE Appendix A: NIC processing for GUE
This appendix is informational and does not constitute a normative
part of this document.
This appendix provides some guidelines for Network Interface Cards This appendix provides some guidelines for Network Interface Cards
(NICs) to implement common offloads and accelerations to support GUE. (NICs) to implement common offloads and accelerations to support GUE.
Note that most of this discussion is generally applicable to other Note that most of this discussion is generally applicable to other
methods of UDP based encapsulation. methods of UDP based encapsulation. An overview of UDP based
encapsulation and acceleration is in [UDPENCAP].
A.1. Receive multi-queue A.1. Receive multi-queue
Contemporary NICs support multiple receive descriptor queues (multi- Contemporary NICs support multiple receive descriptor queues (multi-
queue). Multi-queue enables load balancing of network processing for queue) [MULTIQ]. Multi-queue enables load balancing of network
a NIC across multiple CPUs. On packet reception, a NIC selects the processing for a NIC across multiple CPUs. On packet reception, a NIC
appropriate queue for host processing. Receive Side Scaling is a selects an appropriate queue for host processing. Receive Side
common method which uses the flow hash for a packet to index an Scaling (RSS) is a common method which uses the flow hash for a
indirection table where each entry stores a queue number. Flow packet to index an indirection table where each entry stores a queue
Director and Accelerated Receive Flow Steering (aRFS) allow a host to number. Flow Director and Accelerated Receive Flow Steering (aRFS)
program the queue that is used for a given flow which is identified allow a host to program the queue that is used for a given flow which
either by an explicit five-tuple or by the flow's hash. is identified either by an explicit five-tuple or by the flow's hash.
GUE encapsulation is compatible with multi-queue NICs that support GUE encapsulation is compatible with multi-queue NICs that support
five-tuple hash calculation for UDP/IP packets as input to RSS. The five-tuple hash calculation for UDP/IP packets as input to RSS. The
flow entropy in the UDP source port ensures classification of the flow entropy in the UDP source port ensures classification of the
encapsulated flow even in the case that the outer source and encapsulated flow even in the case that the outer source and
destination addresses are the same for all flows (e.g. all flows are destination addresses are the same for all flows (e.g. all flows are
going over a single tunnel). going over a single tunnel).
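The RSS lookup described above can be sketched as follows. This is an illustration only: CRC32 stands in for the flow hash (NICs commonly use a Toeplitz hash instead), and the table size, queue count, addresses, and function names are assumptions rather than anything specified by this document.

```python
import zlib

NUM_QUEUES = 4      # assumed number of receive queues
TABLE_SIZE = 128    # assumed indirection table size

# Indirection table: each entry stores a receive queue number.
indirection_table = [i % NUM_QUEUES for i in range(TABLE_SIZE)]


def flow_hash(src_ip, dst_ip, proto, src_port, dst_port):
    """Stand-in five-tuple hash; CRC32 is used only for illustration."""
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    return zlib.crc32(key)


def select_queue(src_ip, dst_ip, proto, src_port, dst_port):
    """Index the indirection table with the flow hash to pick a queue."""
    h = flow_hash(src_ip, dst_ip, proto, src_port, dst_port)
    return indirection_table[h % TABLE_SIZE]


# All flows share one tunnel: same outer addresses, destination port 6080.
# Only the UDP source port, which carries the flow entropy, differs.
queues = {
    select_queue("192.0.2.1", "198.51.100.2", 17, sport, 6080)
    for sport in range(49152, 49152 + 1000)
}
```

Because the source port is part of the hash input, flows inside a single tunnel can still be spread over more than one queue, while every packet of a given flow maps to the same queue.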
By default, UDP RSS support is often disabled in NICs to avoid out- By default, UDP RSS support is often disabled in NICs to avoid out-
of-order reception that can occur when UDP packets are fragmented. As of-order reception that can occur when UDP packets are fragmented. As
discussed above, fragmentation of GUE packets is mostly avoided by discussed in section 5.8, fragmentation of GUE packets is mostly
fragmenting packets before entering a tunnel, GUE fragmentation, path avoided by fragmenting packets before entering a tunnel, GUE
MTU discovery in higher layer protocols, or operator adjusting MTUs. fragmentation, path MTU discovery in higher layer protocols, or
Other UDP traffic might not implement such procedures to avoid operator adjusting MTUs. Other UDP traffic might not implement such
fragmentation, so enabling UDP RSS support in the NIC might be a procedures to avoid fragmentation, so enabling UDP RSS support in the
considered tradeoff during configuration. NIC might be a considered tradeoff during configuration.
A.2. Checksum offload A.2. Checksum offload
Many NICs provide capabilities to calculate standard ones complement Many NICs provide capabilities to calculate the standard ones
payload checksum for packets in transmit or receive. When using GUE complement checksum for packets in transmit or receive [CSUMOFF].
encapsulation, there are at least two checksums that are of interest: When using GUE encapsulation, there are at least two checksums that
the encapsulated packet's transport checksum, and the UDP checksum in are of interest: the encapsulated packet's transport checksum, and
the outer header. the UDP checksum in the outer header.
A.2.1. Transmit checksum offload A.2.1. Transmit checksum offload
NICs can provide a protocol agnostic method to offload transmit NICs can provide a protocol agnostic method to offload the transmit
checksum (NETIF_F_HW_CSUM in Linux parlance) that can be used with checksum (NETIF_F_HW_CSUM in Linux parlance) that can be used with
GUE. In this method, the host provides checksum related parameters in GUE. In this method, the host provides checksum related parameters in
a transmit descriptor for a packet. These parameters include the a transmit descriptor for a packet. These parameters include the
starting offset of data to checksum, the length of data to checksum, starting offset of data to checksum, the length of data to checksum,
and the offset in the packet where the computed checksum is to be and the offset in the packet where the computed checksum is to be
written. The host initializes the checksum field to pseudo header written. The host initializes the checksum field to a pseudo header
checksum. checksum.
In the case of GUE, the checksum for an encapsulated transport layer In the case of GUE, the checksum for an encapsulated transport layer
packet, a TCP packet for instance, can be offloaded by setting the packet, a TCP packet for instance, can be offloaded by setting the
appropriate checksum parameters. appropriate checksum parameters.
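A minimal software emulation of this protocol agnostic transmit offload is sketched below. The helper names, the 32 bytes of placeholder outer headers, and the use of an inner UDP segment (rather than TCP, to keep the header short) are assumptions made for illustration; a real NIC takes these parameters from the transmit descriptor.

```python
import struct


def csum_add(data, initial=0):
    """16-bit ones complement sum of data, folded to 16 bits."""
    if len(data) % 2:
        data += b"\x00"
    total = initial
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def nic_tx_checksum(packet, csum_start, csum_len, csum_offset):
    """Emulate the NIC: sum the indicated range, write the complement."""
    pkt = bytearray(packet)
    csum = ~csum_add(bytes(pkt[csum_start:csum_start + csum_len])) & 0xFFFF
    struct.pack_into("!H", pkt, csum_offset, csum)
    return bytes(pkt)


# Host side: build an encapsulated UDP segment whose checksum field is
# initialized to the pseudo header checksum.
src, dst = bytes([192, 0, 2, 1]), bytes([198, 51, 100, 2])
payload = b"hello GUE"
udp_len = 8 + len(payload)
pseudo = src + dst + struct.pack("!BBH", 0, 17, udp_len)
segment = struct.pack("!HHHH", 40000, 40001, udp_len, csum_add(pseudo)) + payload

outer = bytes(32)  # placeholder for outer headers, contents not relevant here
packet = outer + segment

# The three descriptor parameters: start, length, and where to write.
sent = nic_tx_checksum(packet, csum_start=32, csum_len=udp_len,
                       csum_offset=32 + 6)
```

A receiver validating the inner segment sums the pseudo header and the segment; a result of 0xFFFF indicates a correct checksum.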
NICs typically can offload only one transmit checksum per packet, so NICs typically can offload only one transmit checksum per packet, so
simultaneously offloading both an inner transport packet's checksum simultaneously offloading both an inner transport packet's checksum
and the outer UDP checksum is likely not possible. and the outer UDP checksum is likely not possible.
If an encapsulator is co-resident with a host, then checksum offload If an encapsulator is co-resident with a host, then checksum offload
may be performed using remote checksum offload (described in may be performed using remote checksum offload (RCO)[GUEEXTEN].
[GUEEXTEN]). Remote checksum offload relies on NIC offload of the Remote checksum offload relies on NIC offload of the simple UDP/IP
simple UDP/IP checksum which is commonly supported even in legacy checksum which is commonly supported even in legacy devices. In
devices. In remote checksum offload, the outer UDP checksum is set remote checksum offload, the outer UDP checksum is set and the GUE
and the GUE header includes an option indicating the start and offset header includes an option indicating the start and offset of the
of the inner "offloaded" checksum. The inner checksum is initialized inner "offloaded" checksum. The inner checksum is initialized to the
to the pseudo header checksum. When a decapsulator receives a GUE pseudo header checksum. When a decapsulator receives a GUE packet
packet with the remote checksum offload option, it completes the with the remote checksum offload option, it completes the offload
offload operation by determining the packet checksum from the operation by determining the packet checksum from the indicated start
indicated start point to the end of the packet, and then adds this point to the end of the packet, and then adds this into the checksum
into the checksum field at the offset given in the option. Computing field at the offset given in the option. Computing the checksum from
the checksum from the start to end of packet is efficient if the start to end of packet is efficient if checksum-complete is
checksum-complete is provided on the receiver. provided on the receiver.
Another alternative when an encapsulator is co-resident with a host Another alternative when an encapsulator is co-resident with a host
is to perform Local Checksum Offload [LCO]. In this method, the inner is to perform Local Checksum Offload (LCO) [CSUMOFF]. In this method,
transport layer checksum is offloaded and the outer UDP checksum can the inner transport layer checksum is offloaded and the outer UDP
be deduced based on the fact that the portion of the packet covered checksum can be deduced based on the fact that the portion of the
by the inner transport checksum will sum to zero (or at least the bit packet covered by the inner transport checksum will sum to zero or at
wise "not" of the inner pseudo header). least the bitwise "not" of the inner pseudo header.
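The deduction used by Local Checksum Offload can be illustrated with the sketch below. The packet layout, addresses, ports, and helper names are assumptions; arbitrary bytes stand in for the GUE header and inner IP header, and the NIC's completion of the inner checksum is emulated in software.

```python
import struct


def csum_add(data, initial=0):
    """16-bit ones complement sum of data, folded to 16 bits."""
    if len(data) % 2:
        data += b"\x00"
    total = initial
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_sum(src, dst, proto, length):
    """Ones complement sum of an IPv4 pseudo header."""
    return csum_add(src + dst + struct.pack("!BBH", 0, proto, length))


# Inner UDP segment: the checksum field holds the inner pseudo header sum,
# awaiting completion by NIC transmit checksum offload.
in_src, in_dst = bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2])
payload = b"inner payload"
in_len = 8 + len(payload)
p_inner = pseudo_sum(in_src, in_dst, 17, in_len)
inner = struct.pack("!HHHH", 1234, 5678, in_len, p_inner) + payload

# Bytes between the outer UDP header and the inner transport header
# (4-byte GUE header plus 20-byte inner IP header); arbitrary stand-ins.
middle = bytes(range(24))

out_src, out_dst = bytes([192, 0, 2, 1]), bytes([198, 51, 100, 2])
out_len = 8 + len(middle) + len(inner)
p_outer = pseudo_sum(out_src, out_dst, 17, out_len)
outer_hdr = struct.pack("!HHHH", 49152, 6080, out_len, 0)

# LCO: once completed, the inner region sums to the bitwise "not" of
# p_inner, so the outer checksum is computed without reading the payload.
total = csum_add(outer_hdr + middle, p_outer + (~p_inner & 0xFFFF))
outer_csum = ~total & 0xFFFF
outer_hdr = struct.pack("!HHHH", 49152, 6080, out_len, outer_csum)

# Emulate the NIC completing the inner checksum afterwards.
inner_csum = ~csum_add(inner) & 0xFFFF
inner_done = inner[:6] + struct.pack("!H", inner_csum) + inner[8:]
packet = outer_hdr + middle + inner_done
```

Both checksums verify on the finished packet even though the host never summed the payload for the outer checksum.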
A.2.2. Receive checksum offload A.2.2. Receive checksum offload
GUE is compatible with NICs that perform a protocol agnostic receive GUE is compatible with NICs that perform a protocol agnostic receive
checksum (CHECKSUM_COMPLETE in Linux parlance). In this technique, a checksum (CHECKSUM_COMPLETE in Linux parlance). In this technique, a
NIC computes a ones complement checksum over all (or some predefined NIC computes a ones complement checksum over all (or some predefined
portion) of a packet. The computed value is provided to the host portion) of a packet. The computed value is provided to the host
stack in the packet's receive descriptor. The host driver can use stack in the packet's receive descriptor. The host driver can use
this checksum to "patch up" and validate any inner packet transport this checksum to "patch up" and validate any inner packet transport
checksum, as well as the outer UDP checksum if it is non-zero. checksums, as well as the outer UDP checksum if it is non-zero.
Many legacy NICs don't provide checksum-complete but instead provide Many legacy NICs don't provide checksum-complete but instead provide
an indication that a checksum has been verified (CHECKSUM_UNNECESSARY an indication that a checksum has been verified (CHECKSUM_UNNECESSARY
in Linux). Usually, such validation is only done for simple TCP/IP or in Linux). Usually, such validation is only done for simple TCP/IP or
UDP/IP packets. If a NIC indicates that a UDP checksum is valid, the UDP/IP packets. If a NIC indicates that a UDP checksum is valid, the
checksum-complete value for the UDP packet is the "not" of the pseudo checksum-complete value for the UDP packet is the bitwise "not" of
header checksum. In this way, checksum-unnecessary can be converted the pseudo header checksum. In this way, checksum-unnecessary can be
to checksum-complete. So, if the NIC provides checksum-unnecessary converted to checksum-complete. So, if the NIC provides checksum-
for the outer UDP header in an encapsulation, checksum conversion can unnecessary for the outer UDP header in an encapsulation, checksum
be done so that the checksum-complete value is derived and can be conversion can be done so that the checksum-complete value is derived
used by the stack to validate checksums in the encapsulated packet. and can be used by the stack to validate checksums in the
encapsulated packet.
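The conversion from checksum-unnecessary to checksum-complete can be sketched as below, assuming the NIC has reported the outer UDP checksum as valid. The packet is built in software with valid checksums so that the derived value can be checked; the layout, addresses, and helper names are illustrative assumptions.

```python
import struct


def csum_add(data, initial=0):
    """16-bit ones complement sum of data, folded to 16 bits."""
    if len(data) % 2:
        data += b"\x00"
    total = initial
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_sum(src, dst, length):
    """Ones complement sum of an IPv4 pseudo header for UDP."""
    return csum_add(src + dst + struct.pack("!BBH", 0, 17, length))


def udp_segment(src, dst, sport, dport, payload):
    """Build a UDP header and payload with a valid checksum."""
    length = 8 + len(payload)
    hdr0 = struct.pack("!HHHH", sport, dport, length, 0)
    csum = ~csum_add(hdr0 + payload, pseudo_sum(src, dst, length)) & 0xFFFF
    return struct.pack("!HHHH", sport, dport, length, csum) + payload


in_src, in_dst = bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2])
out_src, out_dst = bytes([192, 0, 2, 1]), bytes([198, 51, 100, 2])
inner = udp_segment(in_src, in_dst, 1234, 5678, b"inner payload")
middle = bytes(range(24))  # stand-in for GUE header and inner IP header
outer = udp_segment(out_src, out_dst, 49152, 6080, middle + inner)

# NIC reported the outer UDP checksum as valid, so the checksum-complete
# value over the outer UDP segment is the bitwise "not" of the pseudo
# header checksum.
csum_complete = ~pseudo_sum(out_src, out_dst, len(outer)) & 0xFFFF

# Subtract (add the complement of) the bytes preceding the inner transport
# header to obtain the sum over the inner segment alone.
skipped = csum_add(outer[:8 + len(middle)])
inner_sum = csum_add(b"", csum_complete + (~skipped & 0xFFFF))

# Validate the inner checksum using only the derived value.
inner_ok = csum_add(b"", inner_sum + pseudo_sum(in_src, in_dst, len(inner))) == 0xFFFF
```

The inner checksum is validated without summing the inner payload again, which is the benefit the stack gains from the conversion.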
A.3. Transmit Segmentation Offload A.3. Transmit Segmentation Offload
Transmit Segmentation Offload (TSO) is a NIC feature where a host Transmit Segmentation Offload (TSO) [SEGOFF] is a NIC feature where a
provides a large (>MTU size) TCP packet to the NIC, which in turn host provides a large (>MTU size) TCP packet to the NIC, which in
splits the packet into separate segments and transmits each one. This turn splits the packet into separate segments and transmits each one.
is useful to reduce CPU load on the host. This is useful to reduce CPU load on the host.
The process of TSO can be generalized as: The process of TSO can be generalized as:
- Split the TCP payload into segments which allow packets with - Split the TCP payload into segments of size less than or equal
size less than or equal to MTU. to MTU.
- For each created segment: - For each created segment:
1. Replicate the TCP header and all preceding headers of the 1. Replicate the TCP header and all preceding headers of the
original packet. original packet.
2. Set payload length fields in any headers to reflect the 2. Set payload length fields in any headers to reflect the
length of the segment. length of the segment.
3. Set TCP sequence number to correctly reflect the offset of 3. Set TCP sequence number to correctly reflect the offset of
skipping to change at page 36, line 31 skipping to change at page 37, line 16
To facilitate TSO with GUE, it is recommended that extension fields To facilitate TSO with GUE, it is recommended that extension fields
do not contain values that need to be updated on a per segment basis. do not contain values that need to be updated on a per segment basis.
For example, extension fields should not include checksums, lengths, For example, extension fields should not include checksums, lengths,
or sequence numbers that refer to the payload. If the GUE header does or sequence numbers that refer to the payload. If the GUE header does
not contain such fields then the TSO engine only needs to copy the not contain such fields then the TSO engine only needs to copy the
bits in the GUE header when creating each segment and does not need bits in the GUE header when creating each segment and does not need
to parse the GUE header. to parse the GUE header.
A.4. Large Receive Offload A.4. Large Receive Offload
Large Receive Offload (LRO) is a NIC feature where packets of a TCP Large Receive Offload (LRO) [SEGOFF] is a NIC feature where packets
connection are reassembled, or coalesced, in the NIC and delivered to of a TCP connection are reassembled, or coalesced, in the NIC and
the host as one large packet. This feature can reduce CPU utilization delivered to the host as one large packet. This feature can reduce
in the host. CPU utilization in the host.
LRO requires significant protocol awareness to be implemented LRO requires significant protocol awareness to be implemented
correctly and is difficult to generalize. Packets in the same flow correctly and is difficult to generalize. Packets in the same flow
need to be unambiguously identified. In the presence of tunnels or need to be unambiguously identified. In the presence of tunnels or
network virtualization, this may require more than a five-tuple match network virtualization, this may require more than a five-tuple match
(for instance packets for flows in two different virtual networks may (for instance packets for flows in two different virtual networks may
have identical five-tuples). Additionally, a NIC needs to perform have identical five-tuples). Additionally, a NIC needs to perform
validation over packets that are being coalesced, and needs to validation over packets that are being coalesced, and needs to
fabricate a single meaningful header from all the coalesced packets. fabricate a single meaningful header from all the coalesced packets.
skipping to change at page 37, line 35 skipping to change at page 38, line 21
Assuming that networking switches perform ECMP based on the flow Assuming that networking switches perform ECMP based on the flow
hash, a sender can affect the path by altering the flow entropy. For hash, a sender can affect the path by altering the flow entropy. For
instance, a host can store a flow hash in its protocol control block instance, a host can store a flow hash in its protocol control block
(PCB) for an inner flow, and might alter the value upon detecting (PCB) for an inner flow, and might alter the value upon detecting
that packets are traversing a lossy path. Changing the flow entropy that packets are traversing a lossy path. Changing the flow entropy
for a flow SHOULD be subject to hysteresis (at most once every thirty for a flow SHOULD be subject to hysteresis (at most once every thirty
seconds) to limit the number of out of order packets. seconds) to limit the number of out of order packets.
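One possible way to apply this hysteresis is sketched below. The class name, the 16-bit value range, and the caller-supplied loss indication are assumptions for illustration; only the thirty second minimum interval comes from the text above.

```python
import random


class FlowEntropy:
    """Per-flow entropy value, as might be stored in a PCB (hypothetical)."""

    MIN_INTERVAL = 30.0  # seconds between changes, per the text above

    def __init__(self, rng=None):
        self._rng = rng or random.Random()
        self.value = self._rng.getrandbits(16)
        self.last_change = None

    def maybe_change(self, now, path_is_lossy):
        """Pick a new value if the path looks lossy and 30 s have elapsed."""
        if not path_is_lossy:
            return False
        if (self.last_change is not None
                and now - self.last_change < self.MIN_INTERVAL):
            return False
        self.value = self._rng.getrandbits(16)
        self.last_change = now
        return True
```

Passing the current time in as an argument keeps the sketch deterministic; a real stack would presumably read a monotonic clock and decide for itself when a path counts as lossy.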
B.3. Hardware protocol implementation considerations B.3. Hardware protocol implementation considerations
Low level data path protocols, such is GUE, are often supported in Low level data path protocols, such as GUE, are often supported in
high speed network device hardware. Variable length header (VLH) high speed network device hardware. Variable length header (VLH)
protocols like GUE are often considered difficult to efficiently protocols like GUE are sometimes considered difficult to efficiently
implement in hardware. In order to retain the important implement in hardware. In order to retain the important
characteristics of an extensible and robust protocol, hardware characteristics of an extensible and robust protocol, hardware
vendors may practice "constrained flexibility". In this model, only vendors may practice "constrained flexibility". In this model, only
certain combinations or protocol header parameterizations are certain combinations or protocol header parameterizations are
implemented in hardware fast path. Each such parameterization is implemented in the hardware fast path. Each such parameterization is
fixed length so that the particular instance can be optimized as a fixed length so that the particular instance can be optimized as a
fixed length protocol. In the case of GUE this constitutes specific fixed length protocol. In the case of GUE, this constitutes specific
combinations of GUE flags, fields, and next protocol. The selected combinations of GUE flags, fields, and next protocol. The selected
combinations would naturally be the most common cases which form the combinations would naturally be the most common cases which form the
"fast path", and other combinations are assumed to take the "slow "fast path", and other combinations are assumed to take the "slow
path". path".
In time, needs and requirements of the protocol may change which may In time, the needs and requirements of a protocol may change which
manifest themselves as new parameterizations to be supported in the may manifest themselves as new parameterizations to be supported in
fast path. To allow this extensibility, a device practicing the fast path. To allow this extensibility, a device practicing
constrained flexibility should allow the fast path parameterizations constrained flexibility should allow fast path parameterizations to
to be programmable. be programmable.
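As a rough illustration of constrained flexibility with a programmable fast path, the following sketch models the fast path as a lookup table keyed by a fixed parameterization. It is not taken from the draft: FastPathTable, program, and classify are hypothetical names, the key of (flags, header length, next protocol) is an assumed simplification of "GUE flags, fields, and next protocol", and the handlers are placeholders.

```python
class FastPathTable:
    """Hypothetical table of fixed-length GUE parameterizations
    that a device handles in its fast path."""

    def __init__(self):
        self._entries = {}

    def program(self, flags, hlen_words, next_proto, handler):
        # Install (or replace) a fast path entry for one exact
        # combination of flags, header length, and next protocol.
        self._entries[(flags, hlen_words, next_proto)] = handler

    def classify(self, flags, hlen_words, next_proto):
        # An exact match takes the fast path; anything else is
        # assumed to fall back to the slow path.
        handler = self._entries.get((flags, hlen_words, next_proto))
        if handler is not None:
            return ("fast", handler)
        return ("slow", None)


# Usage: no flags, no extension fields, IPv4 payload (IP protocol 4).
table = FastPathTable()
table.program(0x0000, 0, 4, "ipv4_plain")
print(table.classify(0x0000, 0, 4))
print(table.classify(0x0000, 0, 41))

# Requirements change later: IPv6 payload (IP protocol 41) is added
# to the fast path without redesigning the device.
table.program(0x0000, 0, 41, "ipv6_plain")
print(table.classify(0x0000, 0, 41))
```

Because each key fully fixes the header layout, every entry can be treated as a fixed length protocol, while the table itself remains programmable.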
Authors' Addresses Authors' Addresses
Tom Herbert Tom Herbert
Quantonium Quantonium
4701 Patrick Henry 4701 Patrick Henry
Santa Clara, CA 95054 Santa Clara, CA 95054
US US
Email: tom@herbertland.com Email: tom@herbertland.com
Lucy Yong Lucy Yong
Huawei USA Independent
5340 Legacy Dr. Austin, TX
Plano, TX 75024
US US
Email: lucy.yong@huawei.com
Osama Zia Osama Zia
Microsoft Microsoft
1 Microsoft Way 1 Microsoft Way
Redmond, WA 98029 Redmond, WA 98029
US US
Email: osamaz@microsoft.com Email: osamaz@microsoft.com
 End of changes. 132 change blocks. 
477 lines changed or deleted 490 lines changed or added
