draft-ietf-intarea-gue-00.txt | draft-ietf-intarea-gue-01.txt | |||
---|---|---|---|---|
Internet Area WG T. Herbert | Internet Area WG T. Herbert | |||
Internet-Draft Facebook | Internet-Draft Quantonium | |||
Intended status: Standard track L. Yong | Intended status: Standard track L. Yong | |||
Expires May 4, 2017 Huawei USA | Expires September 14, 2017 Huawei USA | |||
O. Zia | O. Zia | |||
Microsoft | Microsoft | |||
October 31, 2016 | March 13, 2017 | |||
Generic UDP Encapsulation | Generic UDP Encapsulation | |||
draft-ietf-intarea-gue-00 | draft-ietf-intarea-gue-01 | |||
Status of this Memo | Status of this Memo | |||
This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF), its areas, and its working groups. Note that | Task Force (IETF), its areas, and its working groups. Note that | |||
other groups may also distribute working documents as Internet- | other groups may also distribute working documents as Internet- | |||
Drafts. | Drafts. | |||
skipping to change at page 1, line 35 ¶ | skipping to change at page 1, line 35 ¶ | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
http://www.ietf.org/ietf/1id-abstracts.txt | http://www.ietf.org/ietf/1id-abstracts.txt | |||
The list of Internet-Draft Shadow Directories can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
http://www.ietf.org/shadow.html | http://www.ietf.org/shadow.html | |||
This Internet-Draft will expire on May 4, 2017. | This Internet-Draft will expire on September 14, 2017. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2016 IETF Trust and the persons identified as the | Copyright (c) 2017 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
(http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | publication of this document. Please review these documents | |||
carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
to this document. | to this document. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
skipping to change at page 3, line 13 ¶ | skipping to change at page 3, line 13 ¶ | |||
described in the Simplified BSD License. | described in the Simplified BSD License. | |||
Abstract | Abstract | |||
This specification describes Generic UDP Encapsulation (GUE), which | This specification describes Generic UDP Encapsulation (GUE), which | |||
is a scheme for using UDP to encapsulate packets of different IP | is a scheme for using UDP to encapsulate packets of different IP | |||
protocols for transport across layer 3 networks. By encapsulating | protocols for transport across layer 3 networks. By encapsulating | |||
packets in UDP, specialized capabilities in networking hardware for | packets in UDP, specialized capabilities in networking hardware for | |||
efficient handling of UDP packets can be leveraged. GUE specifies | efficient handling of UDP packets can be leveraged. GUE specifies | |||
basic encapsulation methods upon which higher level constructs, such | basic encapsulation methods upon which higher level constructs, such | |||
tunnels and overlay networks for network virtualization, can be | as tunnels and overlay networks for network virtualization, can be | |||
constructed. GUE is extensible by allowing optional data fields as | constructed. GUE is extensible by allowing optional data fields as | |||
part of the encapsulation, and is generic in that it can encapsulate | part of the encapsulation, and is generic in that it can encapsulate | |||
packets of various IP protocols. | packets of various IP protocols. | |||
Table of Contents | Table of Contents | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 5 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 5 | |||
1.1 Terminology . . . . . . . . . . . . . . . . . . . . . . . . 5 | 1.1. Terminology and acronyms . . . . . . . . . . . . . . . . . 5 | |||
1.2. Requirements Language . . . . . . . . . . . . . . . . . . 6 | ||||
2. Base packet format . . . . . . . . . . . . . . . . . . . . . . 7 | 2. Base packet format . . . . . . . . . . . . . . . . . . . . . . 7 | |||
2.1. GUE version . . . . . . . . . . . . . . . . . . . . . . . . 7 | 2.1. GUE version . . . . . . . . . . . . . . . . . . . . . . . . 7 | |||
3. Version 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 | 3. Version 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 | |||
3.1. Header format . . . . . . . . . . . . . . . . . . . . . . . 8 | 3.1. Header format . . . . . . . . . . . . . . . . . . . . . . . 8 | |||
3.2. Proto/ctype field . . . . . . . . . . . . . . . . . . . . . 9 | 3.2. Proto/ctype field . . . . . . . . . . . . . . . . . . . . . 9 | |||
3.2.1 Proto field . . . . . . . . . . . . . . . . . . . . . . 9 | 3.2.1 Proto field . . . . . . . . . . . . . . . . . . . . . . 9 | |||
3.2.2 Ctype field . . . . . . . . . . . . . . . . . . . . . . 10 | 3.2.2 Ctype field . . . . . . . . . . . . . . . . . . . . . . 10 | |||
3.3. Flags and extension fields . . . . . . . . . . . . . . . . 10 | 3.3. Flags and extension fields . . . . . . . . . . . . . . . . 10 | |||
3.3.1. Requirements . . . . . . . . . . . . . . . . . . . . . 10 | 3.3.1. Requirements . . . . . . . . . . . . . . . . . . . . . 10 | |||
3.3.2. Example GUE header with extension fields . . . . . . . 11 | 3.3.2. Example GUE header with extension fields . . . . . . . 11 | |||
skipping to change at page 3, line 49 ¶ | skipping to change at page 3, line 50 ¶ | |||
4.2. Direct encapsulation of IPv6 . . . . . . . . . . . . . . . 15 | 4.2. Direct encapsulation of IPv6 . . . . . . . . . . . . . . . 15 | |||
5. Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 | 5. Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 | |||
5.1. Network tunnel encapsulation . . . . . . . . . . . . . . . 16 | 5.1. Network tunnel encapsulation . . . . . . . . . . . . . . . 16 | |||
5.2. Transport layer encapsulation . . . . . . . . . . . . . . . 16 | 5.2. Transport layer encapsulation . . . . . . . . . . . . . . . 16 | |||
5.3. Encapsulator operation . . . . . . . . . . . . . . . . . . 16 | 5.3. Encapsulator operation . . . . . . . . . . . . . . . . . . 16 | |||
5.4. Decapsulator operation . . . . . . . . . . . . . . . . . . 17 | 5.4. Decapsulator operation . . . . . . . . . . . . . . . . . . 17 | |||
5.4.1. Processing a received data message . . . . . . . . . . 17 | 5.4.1. Processing a received data message . . . . . . . . . . 17 | |||
5.4.2. Processing a received control message . . . . . . . . . 18 | 5.4.2. Processing a received control message . . . . . . . . . 18 | |||
5.5. Router and switch operation . . . . . . . . . . . . . . . . 18 | 5.5. Router and switch operation . . . . . . . . . . . . . . . . 18 | |||
5.6. Middlebox interactions . . . . . . . . . . . . . . . . . . 18 | 5.6. Middlebox interactions . . . . . . . . . . . . . . . . . . 18 | |||
5.6.1. Connection semantics . . . . . . . . . . . . . . . . . 19 | 5.6.1. Inferring connection semantics . . . . . . . . . . . . 19 | |||
5.6.2. NAT . . . . . . . . . . . . . . . . . . . . . . . . . . 19 | 5.6.2. NAT . . . . . . . . . . . . . . . . . . . . . . . . . . 19 | |||
5.7. Checksum Handling . . . . . . . . . . . . . . . . . . . . . 19 | 5.7. Checksum Handling . . . . . . . . . . . . . . . . . . . . . 19 | |||
5.7.1. Requirements . . . . . . . . . . . . . . . . . . . . . 19 | 5.7.1. Requirements . . . . . . . . . . . . . . . . . . . . . 19 | |||
5.7.2. UDP Checksum with IPv4 . . . . . . . . . . . . . . . . 20 | 5.7.2. UDP Checksum with IPv4 . . . . . . . . . . . . . . . . 20 | |||
5.7.3. UDP Checksum with IPv6 . . . . . . . . . . . . . . . . 20 | 5.7.3. UDP Checksum with IPv6 . . . . . . . . . . . . . . . . 20 | |||
5.8. MTU and fragmentation . . . . . . . . . . . . . . . . . . . 21 | 5.8. MTU and fragmentation . . . . . . . . . . . . . . . . . . . 21 | |||
5.9. Congestion control . . . . . . . . . . . . . . . . . . . . 21 | 5.9. Congestion control . . . . . . . . . . . . . . . . . . . . 21 | |||
5.10. Multicast . . . . . . . . . . . . . . . . . . . . . . . . 21 | 5.10. Multicast . . . . . . . . . . . . . . . . . . . . . . . . 21 | |||
5.11. Flow entropy for ECMP . . . . . . . . . . . . . . . . . . 22 | 5.11. Flow entropy for ECMP . . . . . . . . . . . . . . . . . . 22 | |||
5.11.1. Flow classification . . . . . . . . . . . . . . . . . 22 | 5.11.1. Flow classification . . . . . . . . . . . . . . . . . 22 | |||
5.11.2. Flow entropy properties . . . . . . . . . . . . . . . 23 | 5.11.2. Flow entropy properties . . . . . . . . . . . . . . . 23 | |||
5.12. Negotiation of acceptable flags and extension fields . . . 24 | 5.12 Negotiation of acceptable flags and extension fields . . . 24 | |||
6. Motivation for GUE . . . . . . . . . . . . . . . . . . . . . . 24 | 6. Motivation for GUE . . . . . . . . . . . . . . . . . . . . . . 24 | |||
6.1. Benefits of GUE . . . . . . . . . . . . . . . . . . . . . . 24 | 6.1. Benefits of GUE . . . . . . . . . . . . . . . . . . . . . . 24 | |||
6.2. Comparison of GUE to other encapsulations . . . . . . . . . 25 | 6.2 Comparison of GUE to other encapsulations . . . . . . . . . 25 | |||
7. Security Considerations . . . . . . . . . . . . . . . . . . . . 26 | 7. Security Considerations . . . . . . . . . . . . . . . . . . . . 26 | |||
8. IANA Consideration . . . . . . . . . . . . . . . . . . . . . . 27 | 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 26 | |||
8.1. UDP source port . . . . . . . . . . . . . . . . . . . . . . 27 | 8.1. UDP source port . . . . . . . . . . . . . . . . . . . . . . 26 | |||
8.2. GUE version number . . . . . . . . . . . . . . . . . . . . 27 | 8.2. GUE version number . . . . . . . . . . . . . . . . . . . . 28 | |||
8.3. Control types . . . . . . . . . . . . . . . . . . . . . . . 27 | 8.3. Control types . . . . . . . . . . . . . . . . . . . . . . . 28 | |||
8.4. Flag-fields . . . . . . . . . . . . . . . . . . . . . . . . 28 | 8.4. Flag-fields . . . . . . . . . . . . . . . . . . . . . . . . 28 | |||
9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 29 | 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 29 | |||
10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 29 | 10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 29 | |||
10.1. Normative References . . . . . . . . . . . . . . . . . . . 29 | 10.1. Normative References . . . . . . . . . . . . . . . . . . . 29 | |||
10.2. Informative References . . . . . . . . . . . . . . . . . . 30 | 10.2. Informative References . . . . . . . . . . . . . . . . . . 30 | |||
Appendix A: NIC processing for GUE . . . . . . . . . . . . . . . . 32 | Appendix A: NIC processing for GUE . . . . . . . . . . . . . . . . 33 | |||
A.1. Receive multi-queue . . . . . . . . . . . . . . . . . . . . 32 | A.1. Receive multi-queue . . . . . . . . . . . . . . . . . . . . 33 | |||
A.2. Checksum offload . . . . . . . . . . . . . . . . . . . . . 33 | A.2. Checksum offload . . . . . . . . . . . . . . . . . . . . . 33 | |||
A.2.1. Transmit checksum offload . . . . . . . . . . . . . . . 33 | A.2.1. Transmit checksum offload . . . . . . . . . . . . . . . 34 | |||
A.2.2. Receive checksum offload . . . . . . . . . . . . . . . 34 | A.2.2. Receive checksum offload . . . . . . . . . . . . . . . 34 | |||
A.3. Transmit Segmentation Offload . . . . . . . . . . . . . . . 34 | A.3. Transmit Segmentation Offload . . . . . . . . . . . . . . . 35 | |||
A.4. Large Receive Offload . . . . . . . . . . . . . . . . . . . 35 | A.4. Large Receive Offload . . . . . . . . . . . . . . . . . . . 36 | |||
Appendix B: Implementation considerations . . . . . . . . . . . . 36 | Appendix B: Implementation considerations . . . . . . . . . . . . 36 | |||
B.1. Privileged ports . . . . . . . . . . . . . . . . . . . . . 36 | B.1. Privileged ports . . . . . . . . . . . . . . . . . . . . . 36 | |||
B.2. Setting flow entropy as a route selector . . . . . . . . . 36 | B.2. Setting flow entropy as a route selector . . . . . . . . . 37 | |||
B.3. Hardware protocol implementation considerations . . . . . . 36 | B.3. Hardware protocol implementation considerations . . . . . . 37 | |||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 37 | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 37 | |||
1. Introduction | 1. Introduction | |||
This specification describes Generic UDP Encapsulation (GUE) which is | This specification describes Generic UDP Encapsulation (GUE) which is | |||
a general method for encapsulating packets of arbitrary IP protocols | a general method for encapsulating packets of arbitrary IP protocols | |||
within User Datagram Protocol (UDP) [RFC0768] packets. Encapsulating | within User Datagram Protocol (UDP) [RFC0768] packets. Encapsulating | |||
packets in UDP facilitates efficient transport across networks. | packets in UDP facilitates efficient transport across networks. | |||
Networking devices widely provide protocol specific processing and | Networking devices widely provide protocol specific processing and | |||
optimizations for UDP (as well as TCP) packets. Packets for atypical | optimizations for UDP (as well as TCP) packets. Packets for atypical | |||
IP protocols (those not usually parsed by networking hardware) can be | IP protocols (those not usually parsed by networking hardware) can be | |||
encapsulated in UDP packets to maximize deliverability and to | encapsulated in UDP packets to maximize deliverability and to | |||
leverage flow specific mechanisms for routing and packet steering. | leverage flow specific mechanisms for routing and packet steering. | |||
GUE provides an extensible header format for including optional data | GUE provides an extensible header format for including optional data | |||
in the encapsulation header. This data potentially covers items such | in the encapsulation header. This data potentially covers items such | |||
as virtual networking identifier, security data for validating or | as the virtual networking identifier, security data for validating or | |||
authenticating the GUE header, congestion control data, etc. GUE also | authenticating the GUE header, congestion control data, etc. GUE also | |||
allows private optional data in the encapsulation header. This | allows private optional data in the encapsulation header. This | |||
feature can be used by a site or implementation to define local | feature can be used by a site or implementation to define local | |||
custom optional data, and allows experimentation of options that may | custom optional data, and allows experimentation of options that may | |||
eventually become standard. | eventually become standard. | |||
This document does not define any specific GUE extensions. | This document does not define any specific GUE extensions. | |||
[GUEEXTENS] specifies a set of core extensions and [GUE4NVO3] defines | [GUEEXTENS] specifies a set of core extensions and [GUE4NVO3] defines | |||
an extension for using GUE with network virtualization. | an extension for using GUE with network virtualization. | |||
The motivation for the GUE protocol is described in section 6. | The motivation for the GUE protocol is described in section 6. | |||
1.1 Terminology | 1.1. Terminology and acronyms | |||
GUE Generic UDP Encapsulation | GUE Generic UDP Encapsulation | |||
GUE Header A variable length protocol header that is composed | GUE Header A variable length protocol header that is composed | |||
of a primary four byte header and zero or more four | of a primary four byte header and zero or more four | |||
byte words for optional header data | byte words for optional header data | |||
GUE packet A UDP/IP packet that contains a GUE header and GUE | GUE packet A UDP/IP packet that contains a GUE header and GUE | |||
payload within the UDP payload | payload within the UDP payload | |||
Encapsulator A network node that encapsulates a packet in GUE | Encapsulator A network node that encapsulates a packet in GUE | |||
Decapsulator A network node that decapsulates and processes | Decapsulator A network node that decapsulates and processes | |||
packets encapsulated in GUE | packets encapsulated in GUE | |||
Data message An encapsulated packet in the GUE payload that is | Data message An encapsulated packet in the GUE payload that is | |||
addressed to the protocol stack for an associated | addressed to the protocol stack for an associated | |||
protocol | protocol | |||
Control message A formatted message in the GUE payload that is | Control message A formatted message in the GUE payload that is | |||
implicitly addressed to a decapsulator to monitor or | implicitly addressed to the decapsulator to monitor | |||
control the state or behavior of a tunnel | or control the state or behavior of a tunnel | |||
Flags A set of bit flags in the primary GUE header | Flags A set of bit flags in the primary GUE header | |||
Extension field | Extension field | |||
An optional field in a GUE header whose presence is | An optional field in a GUE header whose presence is | |||
indicated by corresponding flag(s) | indicated by corresponding flag(s) | |||
C-bit A single bit flag in the primary GUE header that | C-bit A single bit flag in the primary GUE header that | |||
indicates whether the GUE packet contains a control | indicates whether the GUE packet contains a control | |||
message or not. | message or data message | |||
Hlen A field in the primary GUE header that gives the | Hlen A field in the primary GUE header that gives the | |||
length of the GUE header | length of the GUE header | |||
Proto/ctype A field in the GUE header that holds either the IP | Proto/ctype A field in the GUE header that holds either the IP | |||
protocol number for a data message or a type for a | protocol number for a data message or a type for a | |||
control message | control message | |||
Private data Optional data in the GUE header that may be used for | Private data Optional data in the GUE header that can be used for | |||
private purposes | private purposes | |||
Outer IP header Refers to the outermost IP header of a packet when | Outer IP header Refers to the outermost IP header or packet when | |||
encapsulating a packet over IP | encapsulating a packet over IP | |||
Inner IP header Refers to an encapsulated IP header when an IP | Inner IP header Refers to an encapsulated IP header when an IP | |||
packets is encapsulated | packet is encapsulated | |||
Outer packet Refers to an encapsulating packet | Outer packet Refers to an encapsulating packet | |||
Inner packet Refers to a packet that is encapsulated | Inner packet Refers to a packet that is encapsulated | |||
Tunnel An abstraction of a path across a network that ships | 1.2. Requirements Language | |||
packets or protocols across a network that normally | ||||
wouldn't support them. Tunnels provide communication | ||||
paths between two endpoints. Encapsulation is one | ||||
common technique used to actualize tunnels | ||||
Overlay network A computer network that is built on top of another | ||||
network | ||||
Underlay network | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
A network over which an overlay network is built | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | |||
document are to be interpreted as described in [RFC2119]. | ||||
2. Base packet format | 2. Base packet format | |||
A GUE packet is comprised of a UDP packet whose payload is a GUE | A GUE packet is comprised of a UDP packet whose payload is a GUE | |||
header followed by a payload which is either an encapsulated packet | header followed by a payload which is either an encapsulated packet | |||
of some IP protocol or a control message (like an OAM message). A GUE | of some IP protocol or a control message such as an OAM (Operations, | |||
packet has the general format: | Administration, and Management) message. A GUE packet has the general | |||
format: | ||||
+-------------------------------+ | +-------------------------------+ | |||
| | | | | | |||
| UDP/IP header | | | UDP/IP header | | |||
| | | | | | |||
|-------------------------------| | |-------------------------------| | |||
| | | | | | |||
| GUE Header | | | GUE Header | | |||
| | | | | | |||
|-------------------------------| | |-------------------------------| | |||
skipping to change at page 8, line 30 ¶ | skipping to change at page 8, line 30 ¶ | |||
| | | | | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | | | | | |||
~ Private data (optional) ~ | ~ Private data (optional) ~ | |||
| | | | | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
The contents of the UDP header are: | The contents of the UDP header are: | |||
o Source port: If connection semantics (section 5.6.1) are applied | o Source port: If connection semantics (section 5.6.1) are applied | |||
to an encapsulation, this is set to the source port in the local | to an encapsulation, this is set to the local source port for | |||
tuple. When connection semantics are not applied this should be | the connection. When connection semantics are not applied, this | |||
set to a flow entropy value for use with ECMP; the properties of | is set to a flow entropy value for use with ECMP (Equal-Cost | |||
flow entropy are described in section 5.11. | Multi-Path [RFC2992]). The properties of flow entropy are | |||
described in section 5.11. | ||||
o Destination port: If connection semantics (section 5.6.1) are | o Destination port: If connection semantics (section 5.6.1) are | |||
applied to an encapsulation, this is set to the destination port | applied to an encapsulation, this is set to the destination port | |||
for the tuple. If connection semantics are not applied this is | for the tuple. If connection semantics are not applied this is | |||
set to the GUE assigned port number, 6080. | set to the GUE assigned port number, 6080. | |||
o Length: Canonical length of the UDP packet (length of UDP header | o Length: Canonical length of the UDP packet (length of UDP header | |||
and payload). | and payload). | |||
o Checksum: Standard UDP checksum (handling is described in | o Checksum: Standard UDP checksum (handling is described in | |||
section 5.7). | section 5.7). | |||
The GUE header consists of: | The GUE header consists of: | |||
o Ver: GUE protocol version (0). | o Ver: GUE protocol version (0). | |||
o C: C-bit. When set indicates a control message, not set | o C: C-bit: When set indicates a control message, not set | |||
indicates a data message. | indicates a data message. | |||
o Hlen: Length in 32-bit words of the GUE header, including | o Hlen: Length in 32-bit words of the GUE header, including | |||
optional extension fields but not the first four bytes of the | optional extension fields but not the first four bytes of the | |||
header. Computed as (header_len - 4) / 4. All GUE headers are a | header. Computed as (header_len - 4) / 4 where header_len is the | |||
multiple of four bytes in length. Maximum header length is 128 | total header length in bytes. All GUE headers are a multiple of | |||
bytes. | four bytes in length. Maximum header length is 128 bytes. | |||
o Proto/ctype: When the C-bit is set this field contains a control | o Proto/ctype: When the C-bit is set, this field contains a | |||
message type for the payload (section 3.2.2). When C-bit is not | control message type for the payload (section 3.2.2). When C-bit | |||
set, the field holds the Internet protocol number for the | is not set, the field holds the Internet protocol number for the | |||
encapsulated packet in the payload (section 3.2.1). The control | encapsulated packet in the payload (section 3.2.1). The control | |||
message or encapsulated packet begins at the offset provided by | message or encapsulated packet begins at the offset provided by | |||
Hlen. | Hlen. | |||
o Flags. Header flags that may be allocated for various purposes | o Flags: Header flags that may be allocated for various purposes | |||
and may indicate presence of extension fields. Undefined header | and may indicate presence of extension fields. Undefined header | |||
flag bits MUST be set to zero on transmission. | flag bits MUST be set to zero on transmission. | |||
o Extension Fields: Optional fields whose presence is indicated by | o Extension Fields: Optional fields whose presence is indicated by | |||
corresponding flags. | corresponding flags. | |||
o Private data: Optional private data (see section 3.4). If | o Private data: Optional private data block (see section 3.4). If | |||
private data is present it immediately follows the last | the private block is present, it immediately follows the last | |||
extension field present in the header. The length of this data | extension field present in the header. The private block is | |||
considered to be part of the GUE header. The length of this data | ||||
is determined by subtracting the starting offset from the header | is determined by subtracting the starting offset from the header | |||
length. | length. | |||
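The Hlen arithmetic described above can be sketched as follows. This is a minimal illustration and not part of either draft revision. It converts between the total header length in bytes and the Hlen value, and derives the private data length by subtraction. The function names and the extension_len parameter are assumptions introduced here for illustration.

```python
# Illustrative sketch only; names are hypothetical, not from the draft.
GUE_PRIMARY_HEADER_LEN = 4   # the first four bytes are not counted in Hlen
GUE_MAX_HEADER_LEN = 128     # maximum header length in bytes, per the text


def hlen_from_header_len(header_len: int) -> int:
    """Return the Hlen value for a GUE header of header_len bytes."""
    if header_len < GUE_PRIMARY_HEADER_LEN or header_len % 4 != 0:
        raise ValueError("header length must be a multiple of 4, at least 4")
    if header_len > GUE_MAX_HEADER_LEN:
        raise ValueError("header length exceeds 128 bytes")
    return (header_len - GUE_PRIMARY_HEADER_LEN) // 4


def header_len_from_hlen(hlen: int) -> int:
    """Return the total header length in bytes for a given Hlen value."""
    if hlen < 0:
        raise ValueError("Hlen must be non-negative")
    return GUE_PRIMARY_HEADER_LEN + 4 * hlen


def private_data_len(header_len: int, extension_len: int) -> int:
    """Return the private data length: header length minus starting offset.

    extension_len is the combined length in bytes of the extension
    fields present; private data starts right after the last of them.
    """
    start = GUE_PRIMARY_HEADER_LEN + extension_len
    if start > header_len:
        raise ValueError("extension fields extend past the header")
    return header_len - start
```

For example, a header with no extension fields and no private data is 4 bytes long and has Hlen 0, while the 128-byte maximum corresponds to Hlen 31.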
3.2. Proto/ctype field | 3.2. Proto/ctype field | |||
The proto/ctype field contains the type of the GUE payload. This can | The proto/ctype field contains either an Internet protocol number | |||
either be an IP protocol number or a control message type number. | (when the C-bit is not set) or GUE control message type (when the C- | |||
Intermediate devices may parse the GUE payload per the number in the | bit is set). | |||
proto/ctype field, and header flags cannot affect the interpretation | ||||
of the proto/ctype field. | ||||
3.2.1 Proto field | 3.2.1 Proto field | |||
When the C-bit is not set the proto/ctype field contains an IANA | When the C-bit is not set, the proto/ctype field MUST contain an IANA | |||
Internet Protocol Number. The protocol number is interpreted relative | Internet Protocol Number. The protocol number is interpreted relative | |||
to the IP protocol that encapsulates the UDP packet (i.e. protocol of | to the IP protocol that encapsulates the UDP packet (i.e. protocol of | |||
the outer IP header). | the outer IP header). The protocol number serves as an indication of | |||
the type of the next protocol header which is contained in the GUE | ||||
payload at the offset indicated in Hlen. Intermediate devices may | ||||
parse the GUE payload per the number in the proto/ctype field, and | ||||
header flags cannot affect the interpretation of the proto/ctype | ||||
field. | ||||
When the outer IP protocol is IPv4 the proto field may be set to any | When the outer IP protocol is IPv4, the proto field MUST be set to a | |||
number except for those that refer to IPv6 extension headers or | valid IP protocol number usable with IPv4; it MUST NOT be set to a | |||
ICMPv6 options (number 58). An exception is that the destination | number for IPv6 extension headers or ICMPv6 options (number 58). An | |||
options extension header using the PadN option may be used with IPv4 | exception is that the destination options extension header using the | |||
as described in section 3.6. The "no next header" protocol number | PadN option MAY be used with IPv4 as described in section 3.6. The | |||
(59) may be used with IPv4 as described below. | "no next header" protocol number (59) also MAY be used with IPv4 as | |||
described below. | ||||
When the outer IP protocol is IPv6 the proto field may be set to any | When the outer IP protocol is IPv6, the proto field can be set to any | |||
defined protocol number except Hop-by-hop options (number 0). If a | defined protocol number except that it MUST NOT be set to Hop-by-hop | |||
received GUE packet in IPv6 contains a protocol number that is an | options (number 0). If a received GUE packet in IPv6 contains a | |||
extension header (e.g. Destination Options) then the extension header | protocol number that is an extension header (e.g. Destination | |||
is processed after the GUE header as though the GUE header itself | Options) then the extension header is processed after the GUE header | |||
were an extension header. | is processed as though the GUE header is an extension header. | |||
IP protocol number 59 ("No next header") may be set to indicate that | IP protocol number 59 ("No next header") can be set to indicate that | |||
the GUE payload does not begin with the header of an IP protocol. | the GUE payload does not begin with the header of an IP protocol. | |||
This would be the case, for instance, if the GUE payload were a | This would be the case, for instance, if the GUE payload were a | |||
fragment when performing GUE level fragmentation. The interpretation | fragment when performing GUE level fragmentation. The interpretation | |||
of the payload is performed through other means (such as flags and | of the payload is performed through other means (such as flags and | |||
extension fields), and intermediate devices must not parse packets | extension fields), and intermediate devices MUST NOT parse packets | |||
based on the IP protocol number in this case. | based on the IP protocol number in this case. | |||
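The proto field rules above can be sketched as a validity check. This is a rough illustration and not a definitive reading of the draft. The protocol numbers are the IANA-assigned values; the IPV6_ONLY set is an assumption about which numbers count as IPv6-specific; the padn_only parameter is a hypothetical way for the caller to assert that a destination options header carries only PadN. The check does not verify that a number is actually assigned by IANA.

```python
# Illustrative sketch only; function and parameter names are hypothetical.
HOP_BY_HOP = 0        # IPv6 Hop-by-Hop Options
IPV6_ROUTE = 43       # IPv6 Routing header
IPV6_FRAG = 44        # IPv6 Fragment header
ICMPV6 = 58           # ICMP for IPv6
NO_NEXT_HEADER = 59   # "No next header"
IPV6_DEST_OPTS = 60   # IPv6 Destination Options

# Assumption: numbers treated as IPv6-only when the outer header is IPv4.
# Destination Options (60) is handled separately because of the PadN case.
IPV6_ONLY = {HOP_BY_HOP, IPV6_ROUTE, IPV6_FRAG, ICMPV6}


def proto_allowed(outer_ip_version: int, proto: int,
                  padn_only: bool = False) -> bool:
    """Return True if proto may appear in a GUE data message.

    outer_ip_version is 4 or 6 (version of the outer IP header).
    padn_only states that a destination options header contains only
    PadN, which is the exception permitted with an IPv4 outer header.
    """
    if not 0 <= proto <= 255:
        return False
    if outer_ip_version == 6:
        # Any protocol number except Hop-by-hop options.
        return proto != HOP_BY_HOP
    if outer_ip_version == 4:
        if proto == IPV6_DEST_OPTS:
            return padn_only
        # "No next header" (59) is not in IPV6_ONLY, so it is accepted.
        return proto not in IPV6_ONLY
    raise ValueError("outer_ip_version must be 4 or 6")
```

An intermediate device that uses such a check would still need to avoid parsing the payload when the value is 59, since in that case the payload does not begin with an IP protocol header.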
3.2.2 Ctype field | 3.2.2 Ctype field | |||
When the C-bit is set, the proto/ctype field must be set to a valid | When the C-bit is set, the proto/ctype field MUST be set to a valid | |||
control message type. A value of zero indicates that the GUE payload | control message type. A value of zero indicates that the GUE payload | |||
requires further interpretation to deduce the control type. This | requires further interpretation to deduce the control type. This | |||
might be the case when the payload is a fragment of a control | might be the case when the payload is a fragment of a control | |||
message, where only the reassembled packet can be interpreted as a | message, where only the reassembled packet can be interpreted as a | |||
control message. | control message. | |||
Control message types 1 through 127 may be defined in standards. | Control messages will be defined in an IANA registry. Control message | |||
Types 128 through 255 are reserved to be user defined for | types 1 through 127 may be defined by RFCs. Types 128 through 255 | |||
experimentation or private control messages. | are reserved to be user defined for experimentation or private | |||
control messages. | ||||
This document does not specify any standard control message types | This document does not specify any standard control message types | |||
other than type 0. | other than type 0. | |||
3.3. Flags and extension fields | 3.3. Flags and extension fields | |||
Flags and associated extension fields are the primary mechanism of | Flags and associated extension fields are the primary mechanism of | |||
extensibility in GUE. As mentioned in section 3.1 GUE header flags | extensibility in GUE. As mentioned in section 3.1, GUE header flags | |||
may indicate the presence of optional extension fields in the GUE | indicate the presence of optional extension fields in the GUE header. | |||
header. [GUEXTENS] defines a basic set of GUE extensions. | [GUEXTENS] defines a basic set of GUE extensions. | |||
3.3.1. Requirements | 3.3.1. Requirements | |||
There are sixteen flag bits in the GUE header. A flag may indicate | There are sixteen flag bits in the GUE header. Some flags indicate | |||
presence of an extension field. The size of an extension field | the presence of an extension field. The size of an extension field | |||
indicated by a flag must be fixed. | indicated by a flag MUST be fixed. | |||
Flags may be paired together to allow different lengths for an | Flags can be paired together to allow different lengths for an | |||
extension field. For example, if two flag bits are paired, a field | extension field. For example, if two flag bits are paired, a field | |||
may possibly be three different lengths. Regardless of how flag bits | can possibly be three different lengths: a bit value of 00 | |||
may be paired, the lengths and offsets of optional fields | indicates no field present; 01, 10, and 11 indicate three possible | |||
corresponding to a set of flags must be well defined. | lengths for the field. Regardless of how flag bits are paired, the | |||
lengths and offsets of optional fields corresponding to a set of | ||||
flags MUST be well defined. | ||||
Extension fields are placed in order of the flags. New flags are to | Extension fields are placed in order of the flags. New flags are to | |||
be allocated from high to low order bit contiguously without holes. | be allocated from high to low order bit contiguously without holes. | |||
Flags allow random access, for instance to inspect the field | Flags allow random access, for instance to inspect the field | |||
corresponding to the Nth flag bit, an implementation only considers | corresponding to the Nth flag bit, an implementation only considers | |||
the previous N-1 flags to determine the offset. Flags after the Nth | the previous N-1 flags to determine the offset. Flags after the Nth | |||
flag are not pertinent in calculating the offset of an extension | flag are not pertinent in calculating the offset of the Nth flag. | |||
field indicated by the Nth flag. Random access of flags and fields | Random access of flags and fields permits processing of optional | |||
permits processing of optional extensions in an order that is | extensions in an order that is independent of their position in the | |||
independent of their position in the packet. The processing order of | packet. The processing order of extensions defined in [GUEEXTENS] | |||
extensions defined in [GUEEXTENS] demonstrates this property. | demonstrates this property. | |||
Flags (or paired flags) are idempotent such that new flags must not | Flags (or paired flags) are idempotent such that new flags MUST NOT | |||
cause reinterpretation of old flags. Also, new flags should not alter | cause reinterpretation of old flags. Also, new flags MUST NOT alter | |||
interpretation of other elements in the GUE header nor how the | interpretation of other elements in the GUE header nor how the | |||
message is parsed (for instance, in a data message the proto/ctype | message is parsed (for instance, in a data message the proto/ctype | |||
field always holds an IP protocol number as an invariant). | field always holds an IP protocol number as an invariant). | |||
The set of available flags may be extended in the future by defining | The set of available flags can be extended in the future by defining | |||
a "flag extensions bit" that refers to a field containing a new set | a "flag extensions bit" that refers to a field containing an | |||
of flags. | additional set of flags. | |||
3.3.2. Example GUE header with extension fields | 3.3.2. Example GUE header with extension fields | |||
An example GUE header for a data message encapsulating an IPv4 packet | An example GUE header for a data message encapsulating an IPv4 packet | |||
and containing the VNID and Security extension fields (both defined | and containing the VNID and Security extension fields (both defined | |||
in [GUEXTENS]) is shown below: | in [GUEXTENS]) is shown below: | |||
0 1 2 3 | 0 1 2 3 | |||
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
skipping to change at page 11, line 44 ¶ | skipping to change at page 12, line 4 ¶ | |||
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| 0 |0| 3 | 94 |1|0 0 1| 0 | | | 0 |0| 3 | 94 |1|0 0 1| 0 | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| VNID | | | VNID | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | | | | | |||
+ Security + | + Security + | |||
| | | | | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
In the above example, the first flag bit is set which indicates that | In the above example, the first flag bit is set which indicates that | |||
the VNID extension is present; this is a 32 bit field. The second | the VNID extension is present; this is a 32 bit field. The second | |||
through fourth bits of the flags are paired flags that indicate the | through fourth bits of the flags are paired flags that indicate the | |||
presence of a security field with seven possible sizes. In this | presence of a security field with eight possible sizes. In this | |||
example 001 indicates a sixty-four bit security field. | example 001 indicates a sixty-four bit security field. | |||
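The example header above can be decoded with a short sketch. It is illustrative only and rests on assumptions not stated in this section: the bit widths of the first word (2-bit version, 1-bit C, 5-bit Hlen, 8-bit proto/ctype, 16-bit flags) are taken from the diagram and from section 3.1, the names `EXAMPLE_LAYOUT`, `parse_base_header` and `extension_offsets` are hypothetical, and only the two field lengths named in the example (VNID of 4 bytes, security value 001 of 8 bytes) are listed. Offsets are counted from the end of the first four bytes of the GUE header.

```python
import struct

# Assumed layout for this sketch only: (field name, number of paired flag
# bits, {flag value: field length in bytes}), in flag order from the high
# order bit down. Only the two cases stated in the example are listed.
EXAMPLE_LAYOUT = [
    ("vnid", 1, {0b1: 4}),
    ("security", 3, {0b001: 8}),
]


def parse_base_header(header: bytes) -> dict:
    """Split the first 32-bit word of a GUE version 0 header into fields."""
    (word,) = struct.unpack("!I", header[:4])
    return {
        "version": word >> 30,
        "control": bool((word >> 29) & 1),
        "hlen": (word >> 24) & 0x1F,
        "proto_ctype": (word >> 16) & 0xFF,
        "flags": word & 0xFFFF,
    }


def extension_offsets(flags: int, layout=EXAMPLE_LAYOUT) -> dict:
    """Return {name: (offset, length)} for each extension field present.

    The offset of a field depends only on the flags that precede it.
    Unknown flag values raise ValueError, since a decapsulator must not
    ignore flags it does not understand.
    """
    offsets = {}
    offset = 0
    shift = 16
    for name, nbits, sizes in layout:
        shift -= nbits
        value = (flags >> shift) & ((1 << nbits) - 1)
        if value == 0:
            continue  # field not present
        if value not in sizes:
            raise ValueError(f"unknown value {value:#b} for {name} flags")
        offsets[name] = (offset, sizes[value])
        offset += sizes[value]
    if flags & ((1 << shift) - 1):
        raise ValueError("unknown flag bits set")
    return offsets
```

With the example's first word (bytes 03 5e 90 00), `parse_base_header` yields Hlen 3 and flags 0x9000, and `extension_offsets(0x9000)` places the VNID at offset 0 and the security field at offset 4, for a total of 12 bytes, which matches Hlen = 3.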
3.4. Private data | 3.4. Private data | |||
An implementation may use private data for its own use. The private | An implementation MAY use private data for its own use. The private | |||
data immediately follows the last extension field in the GUE header | data immediately follows the last field in the GUE header and is not | |||
and is not a fixed length. This data is considered part of the GUE | a fixed length. This data is considered part of the GUE header and | |||
header and must be accounted for in header length (Hlen). The length | MUST be accounted for in header length (Hlen). The length of the | |||
of the private data must be a multiple of four and is determined by | private data MUST be a multiple of four and is determined by | |||
subtracting the offset of private data in the GUE header from the | subtracting the offset of private data in the GUE header from the | |||
header length. Specifically: | header length. Specifically: | |||
Private_length = (Hlen * 4) - Length(flags) | Private_length = (Hlen * 4) - Length(flags) | |||
Where "Length(flags)" returns the sum of lengths of all the extension | where "Length(flags)" returns the sum of lengths of all the extension | |||
fields present in the GUE header. When there is no private data | fields present in the GUE header. When there is no private data | |||
present, the length of the private data is zero. | present, the length of the private data is zero. | |||
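The formula above can be written as a small helper. This is a sketch under the assumption that the caller has already summed the lengths of the extension fields indicated by the flags; the function name `private_data_length` is hypothetical.

```python
def private_data_length(hlen: int, extension_bytes: int) -> int:
    """Private_length = (Hlen * 4) - Length(flags), in bytes.

    hlen            -- the Hlen value from the GUE header (32-bit words)
    extension_bytes -- sum of the lengths of all extension fields present
    """
    private = hlen * 4 - extension_bytes
    if private < 0:
        # Header length is too small for the extension fields it claims.
        raise ValueError("bad header length")
    if private % 4:
        # Private data must be a multiple of four bytes.
        raise ValueError("private data length is not a multiple of four")
    return private
```

For the example header in section 3.3.2 (Hlen 3, 12 bytes of extension fields) the result is zero, meaning no private data is present.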
The semantics and interpretation of private data are implementation | The semantics and interpretation of private data are implementation | |||
specific. The private data may be structured as necessary, for | specific. The private data may be structured as necessary, for | |||
instance it might contain its own set of flags and extension fields. | instance it might contain its own set of flags and extension fields. | |||
An encapsulator and decapsulator MUST agree on the meaning of private | An encapsulator and decapsulator MUST agree on the meaning of private | |||
data before using it. The mechanism to achieve this agreement is | data before using it. The mechanism to achieve this agreement is | |||
outside the scope of this document but could include implementation- | outside the scope of this document but could include implementation- | |||
defined behavior, coordinated configuration, in-band communication | defined behavior, coordinated configuration, in-band communication | |||
using GUE control messages, or out-of-band messages. | using GUE control messages, or out-of-band messages. | |||
If a decapsulator receives a GUE packet with private data, it MUST | If a decapsulator receives a GUE packet with private data, it MUST | |||
validate the private data appropriately. If a decapsulator does not | validate the private data appropriately. If a decapsulator does not | |||
expect private data from an encapsulator the packet MUST be dropped. | expect private data from an encapsulator, the packet MUST be dropped. | |||
If a decapsulator cannot validate the contents of private data per | If a decapsulator cannot validate the contents of private data per | |||
the provided semantics the packet MUST also be dropped. An | the provided semantics, the packet MUST also be dropped. An | |||
implementation may place security data in GUE private data which must | implementation MAY place security data in GUE private data which if | |||
be verified for packet acceptance. | present MUST be verified for packet acceptance. | |||
3.5. Message types | 3.5. Message types | |||
3.5.1. Control messages | 3.5.1. Control messages | |||
Control messages carry formatted message that are implicitly | Control messages carry formatted data that are implicitly addressed | |||
addressed to the decapsulator to monitor or control the state or | to the decapsulator to monitor or control the state or behavior of a | |||
behavior of a tunnel (OAM). For instance, an echo request and | tunnel (OAM). For instance, an echo request and corresponding echo | |||
corresponding echo reply message may be defined to test for liveness. | reply message can be defined to test for liveness. | |||
Control messages are indicated in the GUE header when the C-bit is | Control messages are indicated in the GUE header when the C-bit is | |||
set. The payload is interpreted as a control message with type | set. The payload is interpreted as a control message with type | |||
specified in the proto/ctype field. The format and contents of the | specified in the proto/ctype field. The format and contents of the | |||
control message are indicated by the type and can be variable length. | control message are indicated by the type and can be variable length. | |||
Other than interpreting the proto/ctype field as a control message | Other than interpreting the proto/ctype field as a control message | |||
type, the meaning and semantics of the rest of the elements in the | type, the meaning and semantics of the rest of the elements in the | |||
GUE header are the same as that of data messages. Forwarding and | GUE header are the same as that of data messages. Forwarding and | |||
routing of control messages should be the same as that of a data | routing of control messages should be the same as that of a data | |||
message with the same outer IP and UDP header and GUE flags-- this | message with the same outer IP and UDP header and GUE flags; this | |||
ensures that control messages can be created that follow the same | ensures that control messages can be created that follow the same | |||
path as data messages. | path as data messages. | |||
3.5.2. Data messages | 3.5.2. Data messages | |||
Data messages carry encapsulated packets that are addressed to the | Data messages carry encapsulated packets that are addressed to the | |||
protocol stack for the associated protocol. Data messages are a | protocol stack for the associated protocol. Data messages are a | |||
primary means of encapsulation and can be used to create tunnels for | primary means of encapsulation and can be used to create tunnels for | |||
overlay networks. | overlay networks. | |||
Data messages are indicated in GUE header when the C-bit is not set. | Data messages are indicated in GUE header when the C-bit is not set. | |||
The payload of a data message is interpreted as an encapsulated | The payload of a data message is interpreted as an encapsulated | |||
packet of an Internet protocol indicated in the proto/ctype field. | packet of an Internet protocol indicated in the proto/ctype field. | |||
The encapsulated packet immediately follows the GUE header. | The encapsulated packet immediately follows the GUE header. | |||
3.6. Hiding the transport layer protocol number | 3.6. Hiding the transport layer protocol number | |||
The GUE header indicates the Internet protocol of the encapsulated | The GUE header indicates the Internet protocol of an encapsulated | |||
packet. This is either contained in the Proto/ctype field of the | packet. A protocol number is either contained in the Proto/ctype | |||
primary GUE header, or is contained in the Payload Type field of a | field of the primary GUE header or in the Payload Type field of a GUE | |||
GUE Transform Field (used to encrypt the payload with DTLS, | Transform extension field (used to encrypt the payload with DTLS, | |||
[GUESEC]). If the protocol number must be obfuscated, that is the | [GUEEXTENS]). If the transport protocol number needs to be hidden from | |||
transport protocol in use must be hidden from the network, then a | the network, then a trivial destination options can be used. | |||
trivial destination options can be used at the beginning of the | ||||
payload. | ||||
The PadN destination option can be used to encode the transport | The PadN destination option [RFC2460] can be used to encode the | |||
protocol as a next header of an extension header (and maintain | transport protocol as a next header of an extension header (and | |||
alignment of encapsulated transport headers). The Proto/ctype field | maintain alignment of encapsulated transport headers). The | |||
or Payload Type field of the GUE Transform field is set to 60 to | Proto/ctype field or Payload Type field of the GUE Transform field is | |||
indicate that the first encapsulated header is a Destination Options | set to 60 to indicate that the first encapsulated header is a | |||
extension header. | destination options extension header. | |||
The format of the extension header is below: | The format of the extension header is below: | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| Next Header | 2 | 1 | 0 | | | Next Header | 2 | 1 | 0 | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
For IPv4, it is permitted in GUE to use this precise destination | For IPv4, it is permitted in GUE to use this precise destination | |||
option to contain the obfuscated protocol number. In this case next | option to contain the obfuscated protocol number. In this case next | |||
header must refer to a valid IP protocol for IPv4. No other extension | header MUST refer to a valid IP protocol for IPv4. No other extension | |||
headers or destination options are permitted with IPv4. | headers or destination options are permitted with IPv4. | |||
4. Version 1 | 4. Version 1 | |||
Version 1 of GUE allows direct encapsulation of IPv4 and IPv6 in UDP. | Version 1 of GUE allows direct encapsulation of IPv4 and IPv6 in UDP. | |||
In this version there is no GUE header; a UDP packet encapsulates an | In this version there is no GUE header; a UDP packet carries an IP | |||
IP packet. The first two bits of the UDP payload for GUE are the GUE | packet. The first two bits of the UDP payload for GUE are the GUE | |||
version and coincide with the first two bits of the version number in | version and coincide with the first two bits of the version number in | |||
the IP header. The first two version bits of IPv4 and IPv6 are 01, so | the IP header. The first two version bits of IPv4 and IPv6 are 01, so | |||
GUE version 1 is used for direct IP encapsulation, which makes the two | GUE version 1 is used for direct IP encapsulation, which makes the two | |||
GUE version bits 01 as well. | GUE version bits 01 as well. | |||
This technique is effectively a means to compress out the GUE header | This technique is effectively a means to compress out the GUE header | |||
when encapsulating IPv4 or IPv6 packets and there are no flags or | when encapsulating IPv4 or IPv6 packets and there are no flags or | |||
extension fields present. This method is compatible to use on the | extension fields present. This method is compatible to use on the | |||
same port number as packets with the GUE header (GUE version 0 | same port number as packets with the GUE header (GUE version 0 | |||
packets). This technique saves encapsulation overhead on costly links | packets). This technique saves encapsulation overhead on costly links | |||
for the common use of IP encapsulation, and also obviates the need to | for the common use case of IP encapsulation, and also obviates the | |||
allocate a separate port number for IP-over-UDP encapsulation. | need to allocate a separate port number for IP-over-UDP | |||
encapsulation. | ||||
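Because version 0 and version 1 packets can share a port, a receiver can tell them apart from the first byte of the UDP payload. The following is a minimal sketch of that check, assuming the receiver supports only GUE versions 0 and 1; the function name `classify_gue_payload` and the returned labels are hypothetical.

```python
def classify_gue_payload(udp_payload: bytes) -> str:
    """Classify a GUE UDP payload by its first two (version) bits.

    Returns "gue-v0" for a packet with a GUE header, or "ipv4" / "ipv6"
    for a directly encapsulated IP packet (GUE version 1).
    """
    if not udp_payload:
        raise ValueError("empty UDP payload")
    first = udp_payload[0]
    gue_version = first >> 6          # top two bits
    if gue_version == 0:
        return "gue-v0"
    if gue_version == 1:
        ip_version = first >> 4       # top four bits, shared with the IP header
        if ip_version == 4:           # 0100
            return "ipv4"
        if ip_version == 6:           # 0110
            return "ipv6"
        raise ValueError(f"unexpected IP version {ip_version}")
    raise ValueError(f"unsupported GUE version {gue_version}")
```

A payload starting with 0x45 is classified as IPv4, one starting with 0x60 as IPv6, and one starting with 0x03 as a version 0 packet with a GUE header.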
4.1. Direct encapsulation of IPv4 | 4.1. Direct encapsulation of IPv4 | |||
The format for encapsulating IPv4 directly in UDP is demonstrated | The format for encapsulating IPv4 directly in UDP is: | |||
below: | ||||
0 1 2 3 | 0 1 2 3 | |||
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+\ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+\ | |||
| Source port | Destination port | | | | Source port | Destination port | | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ UDP | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ UDP | |||
| Length | Checksum | | | | Length | Checksum | | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+/ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+/ | |||
|0|1|0|0| IHL |Type of Service| Total Length | | |0|1|0|0| IHL |Type of Service| Total Length | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| Identification |Flags| Fragment Offset | | | Identification |Flags| Fragment Offset | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| Time to Live | Protocol | Header Checksum | | | Time to Live | Protocol | Header Checksum | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| Source IPv4 Address | | | Source IPv4 Address | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| Destination IPv4 Address | | | Destination IPv4 Address | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
Note that 0100 value IP version field expresses the GUE version as 1 | Note that 0100 value IP version field expresses the GUE version as 1 | |||
(bits 01) and IP version as 4 (bits 0100). | (bits 01) and IP version as 4 (bits 0100). | |||
4.2. Direct encapsulation of IPv6 | 4.2. Direct encapsulation of IPv6 | |||
The format for encapsulating IPv4 directly in UDP is demonstrated | The format for encapsulating IPv6 directly in UDP is demonstrated | |||
below: | below: | |||
0 1 2 3 | 0 1 2 3 | |||
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+\ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+\ | |||
| Source port | Destination port | | | | Source port | Destination port | | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ UDP | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ UDP | |||
| Length | Checksum | | | | Length | Checksum | | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+/ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+/ | |||
|0|1|1|0| Traffic Class | Flow Label | | |0|1|1|0| Traffic Class | Flow Label | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| Payload Length | NextHdr | Hop Limit | | | Payload Length | NextHdr | Hop Limit | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | | | | | |||
+ + | + + | |||
| | | | | | |||
+ Outer Source IPv6 Address + | + Source IPv6 Address + | |||
| | | | | | |||
+ + | + + | |||
| | | | | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
| | | | | | |||
+ + | + + | |||
| | | | | | |||
+ Outer Destination IPv6 Address + | + Destination IPv6 Address + | |||
| | | | | | |||
+ + | + + | |||
| | | | | | |||
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | |||
Note that 0110 value IP version field expresses the GUE version as 1 | Note that 0110 value IP version field expresses the GUE version as 1 | |||
(bits 01) and IP version as 6 (bits 0110). | (bits 01) and IP version as 6 (bits 0110). | |||
5. Operation | 5. Operation | |||
The figure below illustrates the use of GUE encapsulation between two | The figure below illustrates the use of GUE encapsulation between two | |||
hosts. Sever 1 is sending packets to host 2. An encapsulator performs | hosts. Host 1 is sending packets to Host 2. An encapsulator performs | |||
encapsulation of packets from host 1. These encapsulated packets | encapsulation of packets from Host 1. These encapsulated packets | |||
traverse the network as UDP packets. At the decapsulator, packets are | traverse the network as UDP packets. At the decapsulator, packets are | |||
decapsulated and sent on to host 2. Packet flow in the reverse | decapsulated and sent on to Host 2. Packet flow in the reverse | |||
direction need not be symmetric; GUE encapsulation is not required in | direction need not be symmetric; GUE encapsulation is not required in | |||
the reverse path. | the reverse path. | |||
+---------------+ +---------------+ | +---------------+ +---------------+ | |||
| | | | | | | | | | |||
| Host 1 | | Host 2 | | | Host 1 | | Host 2 | | |||
| | | | | | | | | | |||
+---------------+ +---------------+ | +---------------+ +---------------+ | |||
| ^ | | ^ | |||
V | | V | | |||
skipping to change at page 16, line 34 ¶ | skipping to change at page 16, line 34 ¶ | |||
packets. In this case the encapsulator and decapsulator nodes are the | packets. In this case the encapsulator and decapsulator nodes are the | |||
tunnel endpoints. These could be routers that provide network tunnels | tunnel endpoints. These could be routers that provide network tunnels | |||
on behalf of communicating hosts. | on behalf of communicating hosts. | |||
5.2. Transport layer encapsulation | 5.2. Transport layer encapsulation | |||
When encapsulating layer 4 packets, the encapsulator and decapsulator | When encapsulating layer 4 packets, the encapsulator and decapsulator | |||
should be co-resident with the hosts. In this case, the encapsulation | should be co-resident with the hosts. In this case, the encapsulation | |||
headers are inserted between the IP header and the transport packet. | headers are inserted between the IP header and the transport packet. | |||
The addresses in the IP header refer to both the endpoints of the | The addresses in the IP header refer to both the endpoints of the | |||
encapsulation and the endpoints for terminating the the transport | encapsulation and the endpoints for terminating the transport | |||
protocol. Note that the transport layer ports in the encapsulated | protocol. Note that the transport layer ports in the encapsulated | |||
packet are independent of the UDP ports in the outer packet. | packet are independent of the UDP ports in the outer packet. | |||
Details about performing transport layer encapsulation are discussed | Details about performing transport layer encapsulation are discussed | |||
in [TOU]. | in [TOU]. | |||
5.3. Encapsulator operation | 5.3. Encapsulator operation | |||
Encapsulators create GUE data messages, set the fields of the UDP | Encapsulators create GUE data messages, set the fields of the UDP | |||
header, set flags and optional extension fields in the GUE header, | header, set flags and optional extension fields in the GUE header, | |||
and forward packets to a decapsulator. | and forward packets to a decapsulator. | |||
An encapsulator may be an end host originating the packets of a flow, | An encapsulator can be an end host originating the packets of a flow, | |||
or may be a network device performing encapsulation on behalf of | or can be a network device performing encapsulation on behalf of | |||
hosts (routers implementing tunnels for instance). In either case, | hosts (routers implementing tunnels for instance). In either case, | |||
the intended target (decapsulator) is indicated by the outer | the intended target (decapsulator) is indicated by the outer | |||
destination IP address and destination port in the UDP header. | destination IP address and destination port in the UDP header. | |||
If an encapsulator is tunneling packets, that is encapsulating | If an encapsulator is tunneling packets -- that is encapsulating | |||
packets of layer 2 or layer 3 protocols (e.g. EtherIP, IPIP, ESP | packets of layer 2 or layer 3 protocols (e.g. EtherIP, IPIP, or ESP | |||
tunnel mode), it should follow standard conventions for tunneling of | tunnel mode) -- it SHOULD follow standard conventions for tunneling | |||
one protocol over another. For instance, if an IP packet is being | of one protocol over another. For instance, if an IP packet is being | |||
encapsulated in GUE then diffserv interaction [RFC2983] and ECN | encapsulated in GUE then diffserv interaction [RFC2983] and ECN | |||
propagation for tunnels [RFC6040] should be followed. | propagation for tunnels [RFC6040] SHOULD be followed. | |||
5.4. Decapsulator operation | 5.4. Decapsulator operation | |||
A decapsulator performs decapsulation of GUE packets. A decapsulator | A decapsulator performs decapsulation of GUE packets. A decapsulator | |||
is addressed by the outer destination IP address of a GUE packet. | is addressed by the outer destination IP address of a GUE packet. | |||
The decapsulator validates packets, including fields of the GUE | The decapsulator validates packets, including fields of the GUE | |||
header. | header. | |||
If a decapsulator receives a GUE packet with an unsupported version, | If a decapsulator receives a GUE packet with an unsupported version, | |||
unknown flag, bad header length (too small for included extension | unknown flag, bad header length (too small for included extension | |||
fields), unknown control message type, bad protocol number, an | fields), unknown control message type, bad protocol number, an | |||
unsupported Proto/ctype, or an otherwise malformed header, it MUST | unsupported payload type, or an otherwise malformed header, it MUST | |||
drop the packet. Such events may be logged subject to configuration | drop the packet. Such events MAY be logged subject to configuration | |||
and rate limiting of logging messages. No error message is returned | and rate limiting of logging messages. No error message is returned | |||
back to the encapsulator. Note that set flags in GUE that are unknown | back to the encapsulator. Note that set flags in a GUE header that | |||
to a decapsulator MUST NOT be ignored. If a GUE packet is received by | are unknown to a decapsulator MUST NOT be ignored. If a GUE packet is | |||
a decapsulator with unknown flags, the packet MUST be dropped. | received by a decapsulator with unknown flags, the packet MUST be | |||
dropped. | ||||
5.4.1. Processing a received data message | 5.4.1. Processing a received data message | |||
If a valid data message is received the UDP and GUE headers are | If a valid data message is received, the UDP and GUE headers are | |||
removed from the packet. The outer IP header remains in tact and the | (logically) removed from the packet. The outer IP header remains | |||
next protocol in the header is set to the protocol from the proto | intact and the next protocol in the IP header is set to the protocol | |||
field in the GUE header. The resulting packet is then resubmitted | from the proto field in the GUE header. The resulting packet is then | |||
into the protocol stack to process that packet as though it was | resubmitted into the protocol stack to process that packet as though | |||
received with the protocol in the GUE header. | it was received with the protocol in the GUE header. | |||
As an example, consider that a data message is received where GUE | As an example, consider that a data message is received where GUE | |||
encapsulates an IP packet. In this case the proto field in the GUE header | encapsulates an IP packet. In this case the proto field in the GUE header | |||
is set to 94 for IPIP: | is set to 94 for IPIP: | |||
+-------------------------------------+ | +-------------------------------------+ | |||
| IP header (next proto = 17,UDP) | | | IP header (next proto = 17,UDP) | | |||
|-------------------------------------| | |-------------------------------------| | |||
| UDP | | | UDP | | |||
|-------------------------------------| | |-------------------------------------| | |||
skipping to change at page 17, line 51 ¶ | skipping to change at page 18, line 4 ¶ | |||
+-------------------------------------+ | +-------------------------------------+ | |||
| IP header (next proto = 17,UDP) | | | IP header (next proto = 17,UDP) | | |||
|-------------------------------------| | |-------------------------------------| | |||
| UDP | | | UDP | | |||
|-------------------------------------| | |-------------------------------------| | |||
| GUE (proto = 94,IPIP) | | | GUE (proto = 94,IPIP) | | |||
|-------------------------------------| | |-------------------------------------| | |||
| IP header and packet | | | IP header and packet | | |||
+-------------------------------------+ | +-------------------------------------+ | |||
The receiver removes the UDP and GUE headers and sets the next | The receiver removes the UDP and GUE headers and sets the next | |||
protocol field in the IP packet to IPIP which is derived from the GUE | protocol field in the IP packet to IPIP, which is derived from the | |||
proto field. The resultant packet would have the format: | GUE proto field. The resultant packet would have the format: | |||
+-------------------------------------+ | +-------------------------------------+ | |||
| IP header (next proto = 94,IPIP) | | | IP header (next proto = 94,IPIP) | | |||
|-------------------------------------| | |-------------------------------------| | |||
| IP header and packet | | | IP header and packet | | |||
+-------------------------------------+ | +-------------------------------------+ | |||
This packet is then resubmitted into the protocol stack to be | This packet is then resubmitted into the protocol stack to be | |||
processed as an IPIP packet. | processed as an IPIP packet. | |||
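The decapsulation step above can be sketched as follows. This is a minimal illustration, not a definitive implementation. It assumes an IPv4 outer header and assumes the GUE header layout defined elsewhere in the draft: a 2-bit version, a 1-bit control flag, a 5-bit Hlen counted in 32-bit words beyond the first four bytes, and an 8-bit proto field. The function names are hypothetical. Updating the IPv4 total length and header checksum is an assumption about what a real stack would need to do and is not stated in the text above.

```python
import struct

UDP_LEN = 8        # size of the UDP header in bytes
GUE_BASE_LEN = 4   # assumed size of the fixed part of the GUE header


def ipv4_checksum(header: bytes) -> int:
    """One's-complement checksum over the 16-bit words of an IPv4 header."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def decap_data_message(packet: bytes) -> bytes:
    """Strip the UDP and GUE headers and rewrite the outer IPv4 header."""
    ihl = (packet[0] & 0x0F) * 4
    outer_ip = bytearray(packet[:ihl])
    if outer_ip[9] != 17:
        raise ValueError("outer packet is not UDP")

    gue = packet[ihl + UDP_LEN:]
    version = gue[0] >> 6
    is_control = (gue[0] >> 5) & 1
    hlen_words = gue[0] & 0x1F
    proto = gue[1]
    if version != 0 or is_control:
        raise ValueError("not a version 0 GUE data message")

    payload = gue[GUE_BASE_LEN + hlen_words * 4:]

    # The next protocol field is derived from the GUE proto field.
    outer_ip[9] = proto
    # Assumption: total length and header checksum are refreshed so the
    # resulting packet is well formed when resubmitted to the stack.
    struct.pack_into("!H", outer_ip, 2, ihl + len(payload))
    struct.pack_into("!H", outer_ip, 10, 0)
    struct.pack_into("!H", outer_ip, 10, ipv4_checksum(bytes(outer_ip)))
    return bytes(outer_ip) + payload
```

The returned bytes correspond to the second diagram: an IP header whose next protocol is the GUE proto value, followed directly by the encapsulated IP header and packet.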
5.4.2. Processing a received control message | 5.4.2. Processing a received control message | |||
If a valid control message is received the packet must be processed | If a valid control message is received, the packet MUST be processed | |||
as a control message. The specific processing to be performed depends | as a control message. The specific processing to be performed depends | |||
on the ctype in the GUE header. | on the ctype in the GUE header. | |||
5.5. Router and switch operation | 5.5. Router and switch operation | |||
Routers and switches should forward GUE packets as standard UDP/IP | Routers and switches SHOULD forward GUE packets as standard UDP/IP | |||
packets. The outer five-tuple should contain sufficient information | packets. The outer five-tuple should contain sufficient information | |||
to perform flow classification corresponding to the flow of the inner | to perform flow classification corresponding to the flow of the inner | |||
packet. A switch should not normally need to parse a GUE header, and | packet. A switch does not normally need to parse a GUE header, and | |||
none of the flags or extension fields in the GUE header should affect | none of the flags or extension fields in the GUE header are expected | |||
routing. | to affect routing. | |||
An intermediate node SHOULD NOT modify a GUE header or GUE payload | A router MUST NOT modify a GUE header when forwarding a packet. It | |||
when forwarding packets since correctly identifying GUE packets in | MAY encapsulate a GUE packet in another GUE packet, for instance to | |||
the network based on port numbers is not robust (see [RFC7605]). An | implement a network tunnel (i.e. by encapsulating an IP packet with a | |||
intermediate node may encapsulate a GUE packet in another GUE packet, | GUE payload in another IP packet as a GUE payload). In this case, the | |||
for instance to implement a network tunnel (i.e. by encapsulating an | router takes the role of an encapsulator, and the corresponding | |||
IP packet with a GUE payload in another IP packet as a GUE payload). | decapsulator is the logical endpoint of the tunnel. When | |||
In this case the router takes the role of an encapsulator, and the | encapsulating a GUE packet within another GUE packet, there are no | |||
corresponding decapsulator is the logical endpoint of the tunnel. | specified provisions to automatically copy GUE flags or fields to the | |||
When encapsulating a GUE packet within another GUE packet, there are | ||||
no provisions to automatically copy flags or extension fields to the | ||||
outer GUE header. Each layer of encapsulation is considered | outer GUE header. Each layer of encapsulation is considered | |||
independent. | independent. | |||
5.6. Middlebox interactions | 5.6. Middlebox interactions | |||
A middlebox may interpret some flags and extension fields of the GUE | A middlebox MAY interpret some flags and extension fields of the GUE | |||
header for classification purposes, but is not required to understand | header for classification purposes, but is not required to understand | |||
any of the flags or extension fields in GUE packets. A middlebox | any of the flags or extension fields in GUE packets. A middlebox | |||
must not drop a GUE packet because there are flags unknown to it. The | MUST NOT drop a GUE packet merely because there are flags unknown to | |||
header length in the GUE header allows a middlebox to inspect the | it. The header length in the GUE header allows a middlebox to inspect | |||
payload packet without needing to parse the flags or extension | the payload packet without needing to parse the flags or extension | |||
fields. | fields. | |||
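As a rough illustration of how the header length lets a middlebox skip to the payload, the sketch below computes the payload offset without looking at any flags. It assumes the Hlen field occupies the low five bits of the first GUE byte and counts 32-bit words beyond the first four bytes of the header, as defined elsewhere in the draft. The function name is hypothetical.

```python
GUE_BASE_LEN = 4  # assumed size in bytes of the fixed part of the GUE header


def gue_payload_offset(gue_header: bytes) -> int:
    """Return the byte offset of the payload relative to the GUE header start."""
    if len(gue_header) < GUE_BASE_LEN:
        raise ValueError("truncated GUE header")
    hlen_words = gue_header[0] & 0x1F  # flags and extensions are not parsed
    return GUE_BASE_LEN + hlen_words * 4
```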
5.6.1. Connection semantics | 5.6.1. Inferring connection semantics | |||
A middlebox may infer bidirectional connection semantics for a UDP | A middlebox might infer bidirectional connection semantics for a UDP | |||
flow. For instance a stateful firewall may create a five-tuple rule | flow. For instance, a stateful firewall might create a five-tuple | |||
to match flows on egress, and a corresponding five-tuple rule for | rule to match flows on egress, and a corresponding five-tuple rule | |||
matching ingress packets where the roles of source and destination | for matching ingress packets where the roles of source and | |||
are reversed for the IP addresses and UDP port numbers. To operate in | destination are reversed for the IP addresses and UDP port numbers. | |||
this environment, a GUE tunnel must assume connected semantics | To operate in this environment, a GUE tunnel SHOULD be configured to | |||
defined by the UDP five-tuple and the use of GUE encapsulation must | assume connected semantics defined by the UDP five-tuple and the use | |||
be symmetric between both endpoints. The source port set in the UDP | of GUE encapsulation needs to be symmetric between both endpoints. | |||
header must be the destination port the peer would set for replies. | The source port set in the UDP header MUST be the destination port | |||
In this case the UDP source port for a tunnel would be a fixed value | the peer would set for replies. In this case the UDP source port for | |||
for a tunnel and not set to be flow entropy as described in section | a tunnel would be a fixed value and not set to be flow entropy as | |||
5.11. | described in section 5.11. | |||
The selection of whether to make the UDP source port fixed or set to | The selection of whether to make the UDP source port fixed or set to | |||
a flow entropy value for each packet sent should be configurable for | a flow entropy value for each packet sent SHOULD be configurable for | |||
a tunnel. | a tunnel. | |||
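The per-tunnel choice between a fixed source port and a flow entropy value might be modeled as in the sketch below. The TunnelConfig type, its field names, and select_source_port are hypothetical and only illustrate the configuration knob described above.

```python
from dataclasses import dataclass


@dataclass
class TunnelConfig:
    # True when the tunnel assumes connected (symmetric five-tuple) semantics.
    connected: bool
    # Port the peer uses as its destination port for replies (hypothetical field).
    fixed_source_port: int = 0


def select_source_port(cfg: TunnelConfig, flow_entropy_port: int) -> int:
    """Pick the UDP source port for a packet sent on this tunnel."""
    if cfg.connected:
        # Connected semantics: same port for every packet on the tunnel.
        return cfg.fixed_source_port
    # Otherwise the source port carries per-flow entropy.
    return flow_entropy_port
```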
5.6.2. NAT | 5.6.2. NAT | |||
IP address and port translation can be performed on the UDP/IP | IP address and port translation can be performed on the UDP/IP | |||
headers adhering to the requirements for NAT with UDP [RFC4787]. In | headers adhering to the requirements for NAT with UDP [RFC4787]. In | |||
the case of stateful NAT, connection semantics must be applied to a | the case of stateful NAT, connection semantics MUST be applied to a | |||
GUE tunnel as described in section 5.6.1. GUE endpoints may also | GUE tunnel as described in section 5.6.1. GUE endpoints MAY also | |||
invoke STUN [RFC5389] or ICE [RFC5245] to manage NAT port mappings | invoke STUN [RFC5389] or ICE [RFC5245] to manage NAT port mappings | |||
for encapsulations. | for encapsulations. | |||
5.7. Checksum Handling | 5.7. Checksum Handling | |||
The potential for mis-delivery of packets due to corruption of IP, | The potential for mis-delivery of packets due to corruption of IP, | |||
UDP, or GUE headers must be considered. Historically, the UDP | UDP, or GUE headers needs to be considered. Historically, the UDP | |||
checksum would be considered sufficient as a check against corruption | checksum would be considered sufficient as a check against corruption | |||
of either the UDP header and payload or the IP addresses. | of either the UDP header and payload or the IP addresses. | |||
Encapsulation protocols, such as GUE, may be originated or terminated | Encapsulation protocols, such as GUE, can be originated or terminated | |||
on devices incapable of computing the UDP checksum for a packet. This | on devices incapable of computing the UDP checksum for a packet. This | |||
section discusses the requirements around checksum and alternatives | section discusses the requirements around checksum and alternatives | |||
that might be used when an endpoint does not support UDP checksum. | that might be used when an endpoint does not support UDP checksum. | |||
5.7.1. Requirements | 5.7.1. Requirements | |||
One of the following requirements must be met: | One of the following requirements MUST be met: | |||
o UDP checksums are enabled (for IPv4 or IPv6). | o UDP checksums are enabled (for IPv4 or IPv6). | |||
o The GUE header checksum is used (defined in [GUEEXTENS]). | o The GUE header checksum is used (defined in [GUEEXTENS]). | |||
o Use zero UDP checksums. This is always permissable with IPv4, in | o Use zero UDP checksums. This is always permissible with IPv4; in | |||
IPv6 they may only be used in accordance with applicable | IPv6, they can only be used in accordance with applicable | |||
requirements in [GREUDP], [RFC6935], and [RFC6936]. | requirements in [RFC8086], [RFC6935], and [RFC6936]. | |||
5.7.2. UDP Checksum with IPv4 | 5.7.2. UDP Checksum with IPv4 | |||
For UDP in IPv4, the UDP checksum MUST be processed as specified in | For UDP in IPv4, the UDP checksum MUST be processed as specified in | |||
[RFC768] and [RFC1122] for both transmit and receive. An encapsulator | [RFC768] and [RFC1122] for both transmit and receive. An | |||
MAY set the UDP checksum to zero for performance or implementation | encapsulator MAY set the UDP checksum to zero for performance or | |||
considerations. The IPv4 header includes a checksum that protects | implementation considerations. The IPv4 header includes a checksum | |||
against mis-delivery of the packet due to corruption of IP addresses. | that protects against mis-delivery of the packet due to corruption | |||
The UDP checksum potentially provides protection against corruption | of IP addresses. The UDP checksum potentially provides protection | |||
of the UDP header, GUE header, and GUE payload. Enabling or disabling | against corruption of the UDP header, GUE header, and GUE payload. | |||
the use of checksums is a deployment consideration that should take | Enabling or disabling the use of checksums is a deployment | |||
into account the risk and effects of packet corruption, and whether | consideration that should take into account the risk and effects of | |||
the packets in the network are already adequately protected by other, | packet corruption, and whether the packets in the network are | |||
possibly stronger mechanisms such as the Ethernet CRC. If an | already adequately protected by other, possibly stronger mechanisms | |||
encapsulator sets a zero UDP checksum for IPv4 it SHOULD use the GUE | such as the Ethernet CRC. If an encapsulator sets a zero UDP | |||
header checksum as described in [GUEEXTENS]. | checksum for IPv4, it SHOULD use the GUE header checksum as | |||
described in [GUEEXTENS]. | ||||
When a decapsulator receives a packet, the UDP checksum field MUST be | When a decapsulator receives a packet, the UDP checksum field MUST | |||
processed. If the UDP checksum is non-zero, the decapsulator MUST | be processed. If the UDP checksum is non-zero, the decapsulator MUST | |||
verify the checksum before accepting the packet. By default a | verify the checksum before accepting the packet. By default, a | |||
decapsulator SHOULD accept UDP packets with a zero checksum. A node | decapsulator SHOULD accept UDP packets with a zero checksum. A node | |||
MAY be configured to disallow zero checksums per [RFC1122]; this may | MAY be configured to disallow zero checksums per [RFC1122]. | |||
be done selectively, for instance disallowing zero checksums from | Configuration of zero checksums can be selective. For instance, zero | |||
certain hosts that are known to be sending over paths subject to | checksums might be disallowed from certain hosts that are known to | |||
packet corruption. If verification of a non-zero checksum fails, a | be sending over paths subject to packet corruption. If verification | |||
decapsulator lacks the capability to verify a non-zero checksum, or a | of a non-zero checksum fails, a decapsulator lacks the capability to | |||
packet with a zero-checksum was received and the decapsulator is | verify a non-zero checksum, or a packet with a zero-checksum was | |||
configured to disallow zero checksums, the packet MUST be dropped. | received and the decapsulator is configured to disallow zero checksums, the packet | |||
MUST be dropped. | ||||
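The IPv4 receive rules above reduce to a small decision function. The following is a sketch under stated assumptions: the caller has already attempted verification and passes the result, with None meaning the decapsulator cannot verify checksums. The function and parameter names are hypothetical.

```python
from typing import Optional


def accept_ipv4_gue(udp_checksum: int,
                    checksum_valid: Optional[bool],
                    allow_zero_checksum: bool = True) -> bool:
    """Return True if a GUE packet received over IPv4 may be accepted."""
    if udp_checksum != 0:
        # A non-zero checksum must verify; failure or inability to verify
        # both lead to a drop.
        return checksum_valid is True
    # A zero checksum is accepted by default but may be disallowed by
    # configuration.
    return allow_zero_checksum
```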
5.7.3. UDP Checksum with IPv6 | 5.7.3. UDP Checksum with IPv6 | |||
In IPv6 there is no checksum in the IPv6 header that protects against | In IPv6, there is no checksum in the IPv6 header that protects | |||
mis-delivery due to address corruption. Therefore, when GUE is used | against mis-delivery due to address corruption. Therefore, when GUE | |||
over IPv6, either the UDP checksum must be enabled, the GUE header | is used over IPv6, either the UDP checksum or the GUE header | |||
checksum must be used, or a zero UDP checksum is used if applicable | checksum SHOULD be used. The UDP checksum and GUE header checksum | |||
requirements are met. Setting a zero checksum may be desirable for | SHOULD NOT be used at the same time since that would be mostly | |||
performance or implementation reasons, in which case the GUE header | redundant. | |||
checksum MUST be used or requirements for using zero UDP checksums in | ||||
[RFC6935] and [RFC6936] MUST be met. If the UDP checksum is enabled, | ||||
then the GUE header checksum should not be used since it is mostly | ||||
redundant. | ||||
When a decapsulator receives a packet, the UDP checksum field MUST be | If neither the UDP checksum nor the GUE header checksum is used, then | |||
processed. If the UDP checksum is non-zero, the decapsulator MUST | the requirements for using zero IPv6 UDP checksums in [RFC6935] and | |||
verify the checksum before accepting the packet. By default a | [RFC6936] MUST be met. | |||
decapsulator MUST only accept UDP packets with a zero checksum if the | ||||
GUE header checksum is used and is verified. If verification of a | When a decapsulator receives a packet, the UDP checksum field MUST | |||
non-zero checksum fails, a decapsulator lacks the capability to | be processed. If the UDP checksum is non-zero, the decapsulator MUST | |||
verify a non-zero checksum, or a packet with a zero-checksum and no | verify the checksum before accepting the packet. By default a | |||
GUE header checksum was received, the packet MUST be dropped. | decapsulator MUST only accept UDP packets with a zero checksum if | |||
the GUE header checksum is used and is verified. If verification of | ||||
a non-zero checksum fails, a decapsulator lacks the capability to | ||||
verify a non-zero checksum, or a packet with a zero-checksum and no | ||||
GUE header checksum was received, the packet MUST be dropped. | ||||
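The default IPv6 receive behavior can be sketched in the same way. This illustration covers only the default described above. It does not model a deployment that has opted into zero UDP checksums under [RFC6935] and [RFC6936]. The names are hypothetical, and None again means the decapsulator cannot verify UDP checksums.

```python
from typing import Optional


def accept_ipv6_gue(udp_checksum: int,
                    checksum_valid: Optional[bool],
                    gue_checksum_verified: bool = False) -> bool:
    """Return True if a GUE packet received over IPv6 may be accepted."""
    if udp_checksum != 0:
        # A non-zero UDP checksum must be verified before acceptance.
        return checksum_valid is True
    # Zero UDP checksum: by default accept only when the GUE header
    # checksum is present and has been verified.
    return gue_checksum_verified
```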
5.8. MTU and fragmentation | 5.8. MTU and fragmentation | |||
Standard conventions for handling of MTU (Maximum Transmission Unit) | Standard conventions for handling of MTU (Maximum Transmission Unit) | |||
and fragmentation in conjunction with networking tunnels | and fragmentation in conjunction with networking tunnels | |||
(encapsulation of layer 2 or layer 3 packets) should be followed. | (encapsulation of layer 2 or layer 3 packets) SHOULD be followed. | |||
Details are described in MTU and Fragmentation Issues with In-the- | Details are described in MTU and Fragmentation Issues with In-the- | |||
Network Tunneling [RFC4459] | Network Tunneling [RFC4459]. | |||
If a packet is fragmented before encapsulation in GUE, all the | If a packet is fragmented before encapsulation in GUE, all the | |||
related fragments must be encapsulated using the same UDP source | related fragments MUST be encapsulated using the same UDP source | |||
port. An operator should set MTU to account for encapsulation | port. An operator SHOULD set MTU to account for encapsulation | |||
overhead and reduce the likelihood of fragmentation. | overhead and reduce the likelihood of fragmentation. | |||
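As a worked illustration of accounting for encapsulation overhead, the sketch below subtracts the outer headers from a link MTU. It assumes an IPv4 header without options (20 bytes), a fixed IPv6 header with no extension headers (40 bytes), an 8-byte UDP header, and a 4-byte base GUE header plus any extension fields. The function name and parameters are hypothetical.

```python
IPV4_HDR = 20  # IPv4 header without options
IPV6_HDR = 40  # fixed IPv6 header, no extension headers
UDP_HDR = 8
GUE_BASE = 4   # assumed fixed part of the GUE header


def inner_mtu(link_mtu: int, ipv6: bool = False, gue_ext_len: int = 0) -> int:
    """Largest inner packet that fits in link_mtu after GUE encapsulation."""
    if gue_ext_len % 4:
        raise ValueError("GUE extension fields are assumed to be 32-bit aligned")
    ip_hdr = IPV6_HDR if ipv6 else IPV4_HDR
    return link_mtu - ip_hdr - UDP_HDR - GUE_BASE - gue_ext_len
```

For a 1500-byte link MTU this gives 1468 bytes over IPv4 and 1448 bytes over IPv6 when no extension fields are present.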
As an alternative to IP fragmentation, the GUE fragmentation extension can | As an alternative to IP fragmentation, the GUE fragmentation extension | |||
be used. GUE fragmentation is described in [GUEEXTENS]. | can be used. GUE fragmentation is described in [GUEEXTENS]. | |||
5.9. Congestion control | 5.9. Congestion control | |||
Per requirements of [RFC5405], if the IP traffic encapsulated with | Per requirements of [RFC5405], if the IP traffic encapsulated with | |||
GUE implements proper congestion control, no additional mechanisms | GUE implements proper congestion control, no additional mechanisms | |||
should be required. | should be required. | |||
In the case that the encapsulated traffic does not implement any or | In the case that the encapsulated traffic does not implement any or | |||
sufficient control, or it is not known whether a transmitter will | sufficient control, or it is not known whether a transmitter will | |||
consistently implement proper congestion control, then congestion | consistently implement proper congestion control, then congestion | |||
control at the encapsulation layer MUST be provided per RFC5405. Note | control at the encapsulation layer MUST be provided per [RFC5405]. | |||
this case applies to a significant use case in network virtualization | Note that this case applies to a significant use case in network | |||
in which guests run third party networking stacks that cannot be | virtualization in which guests run third party networking stacks | |||
implicitly trusted to implement conformant congestion control. | that cannot be implicitly trusted to implement conformant congestion | |||
control. | ||||
Out of band mechanisms such as rate limiting, Managed Circuit Breaker | Out of band mechanisms such as rate limiting, Managed Circuit | |||
[CIRCBRK], or traffic isolation may be used to provide rudimentary | Breaker [CIRCBRK], or traffic isolation MAY be used to provide | |||
congestion control. For finer grained congestion control that allows | rudimentary congestion control. For finer-grained congestion control | |||
alternate congestion control algorithms, reaction time within an RTT, | that allows alternate congestion control algorithms, reaction time | |||
and interaction with ECN, in-band mechanisms may be warranted. | within an RTT, and interaction with ECN, in-band mechanisms might be | |||
warranted. | ||||
5.10. Multicast | 5.10. Multicast | |||
GUE packets may be multicast to decapsulators using a multicast | GUE packets can be multicast to decapsulators using a multicast | |||
destination address in the encapsulating IP headers. Each receiving | destination address in the encapsulating IP headers. Each receiving | |||
host will decapsulate the packet independently following normal | host will decapsulate the packet independently following normal | |||
decapsulator operations. The receiving decapsulators should agree on | decapsulator operations. The receiving decapsulators need to agree | |||
the same set of GUE parameters and properties; how such an agreement | on the same set of GUE parameters and properties; how such an | |||
is reached is outside the scope of this document. | agreement is reached is outside the scope of this document. | |||
GUE allows encapsulation of unicast, broadcast, or multicast traffic. | GUE allows encapsulation of unicast, broadcast, or multicast | |||
Flow entropy (the value in the UDP source port) may be generated from | traffic. Flow entropy (the value in the UDP source port) can be | |||
the header of encapsulated unicast or broadcast/multicast packets at | generated from the header of encapsulated unicast or | |||
an encapsulator. The mapping mechanism between the encapsulated | broadcast/multicast packets at an encapsulator. The mapping | |||
multicast traffic and the multicast capability in the IP network is | mechanism between the encapsulated multicast traffic and the | |||
transparent and independent of the encapsulation and is otherwise | multicast capability in the IP network is transparent and | |||
outside the scope of this document. | independent of the encapsulation and is otherwise outside the scope | |||
of this document. | ||||
5.11. Flow entropy for ECMP | 5.11. Flow entropy for ECMP | |||
5.11.1. Flow classification | 5.11.1. Flow classification | |||
A major objective of using GUE is that a network device can perform | A major objective of using GUE is that a network device can perform | |||
flow classification corresponding to the flow of the inner | flow classification corresponding to the flow of the inner | |||
encapsulated packet based on the contents in the outer headers. | encapsulated packet based on the contents in the outer headers. | |||
Hardware devices commonly perform hash computations on packet headers | Hardware devices commonly perform hash computations on packet | |||
to classify packets into flows or flow buckets. Flow classification | headers to classify packets into flows or flow buckets. Flow | |||
is done to support load balancing of flows across a set of networking | classification is done to support load balancing of flows across a | |||
resources. Examples of such load balancing techniques are Equal Cost | set of networking resources. Examples of such load balancing | |||
Multipath routing (ECMP), port selection in Link Aggregation, and NIC | techniques are Equal Cost Multipath routing (ECMP), port selection | |||
device Receive Side Scaling (RSS). Hashes are usually either a | in Link Aggregation, and NIC device Receive Side Scaling (RSS). | |||
three-tuple hash of IP protocol, source address, and destination | Hashes are usually either a three-tuple hash of IP protocol, source | |||
address; or a five-tuple hash consisting of IP protocol, source | address, and destination address; or a five-tuple hash consisting of | |||
address, destination address, source port, and destination port. | IP protocol, source address, destination address, source port, and | |||
Typically, networking hardware will compute five-tuple hashes for TCP | destination port. Typically, networking hardware will compute five- | |||
and UDP, but only three-tuple hashes for other IP protocols. Since | tuple hashes for TCP and UDP, but only three-tuple hashes for other | |||
the five-tuple hash provides more granularity, load balancing can be | IP protocols. Since the five-tuple hash provides more granularity, | |||
finer grained with better distribution. When a packet is encapsulated | load balancing can be finer-grained with better distribution. When a | |||
with GUE and connection semantics are not applied, the source port in | packet is encapsulated with GUE and connection semantics are not | |||
the outer UDP packet is set to a flow entropy value that corresponds | applied, the source port in the outer UDP packet is set to a flow | |||
to the flow of the inner packet. When a device computes a five-tuple | entropy value that corresponds to the flow of the inner packet. When | |||
hash on the outer UDP/IP header of a GUE packet, the resultant value | a device computes a five-tuple hash on the outer UDP/IP header of a | |||
classifies the packet per its inner flow. | GUE packet, the resultant value classifies the packet per its inner | |||
flow. | ||||
Examples of deriving flow entropy for encapsulation are: | Examples of deriving flow entropy for encapsulation are: | |||
o If the encapsulated packet is a layer 4 packet, TCP/IPv4 for | o If the encapsulated packet is a layer 4 packet, TCP/IPv4 for | |||
instance, the flow entropy could be based on the canonical five- | instance, the flow entropy could be based on the canonical five- | |||
tuple hash of the inner packet. | tuple hash of the inner packet. | |||
o If the encapsulated packet is an AH transport mode packet with | o If the encapsulated packet is an AH transport mode packet with | |||
TCP as next header, the flow entropy could be a hash over a | TCP as next header, the flow entropy could be a hash over a | |||
three-tuple: TCP protocol and TCP ports of the encapsulated | three-tuple: TCP protocol and TCP ports of the encapsulated | |||
packet. | packet. | |||
o If a node is encrypting a packet using ESP tunnel mode and GUE | o If a node is encrypting a packet using ESP tunnel mode and GUE | |||
encapsulation, the flow entropy could be based on the contents | encapsulation, the flow entropy could be based on the contents | |||
of clear-text packet. For instance, a canonical five-tuple hash | of the clear-text packet. For instance, a canonical five-tuple | |||
for a TCP/IP packet could be used. | hash for a TCP/IP packet could be used. | |||
[RFC6438] discusses methods to compute and flow entropy value for | [RFC6438] discusses methods to compute and set flow entropy value for | |||
IPv6 flow labels, those methods can also be used to create flow | IPv6 flow labels. Such methods can also be used to create flow | |||
entropy values for GUE. | entropy values for GUE. | |||
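The first two examples above can be sketched as hash functions over selected inner header fields. This is only an illustration: the choice of BLAKE2b as the hash, the 32-bit output width, and the function names are assumptions, not something specified by the text.

```python
import hashlib
import ipaddress
import struct


def _hash32(data: bytes) -> int:
    """Hash arbitrary bytes to a 32-bit integer (hash choice is an assumption)."""
    return int.from_bytes(hashlib.blake2b(data, digest_size=4).digest(), "big")


def five_tuple_entropy(proto: int, src: str, dst: str,
                       sport: int, dport: int) -> int:
    """Entropy from the inner five-tuple, e.g. for a TCP/IPv4 packet."""
    key = (bytes([proto])
           + ipaddress.ip_address(src).packed
           + ipaddress.ip_address(dst).packed
           + struct.pack("!HH", sport, dport))
    return _hash32(key)


def three_tuple_entropy(proto: int, sport: int, dport: int) -> int:
    """Entropy from protocol and ports, e.g. TCP carried in AH transport mode."""
    return _hash32(struct.pack("!BHH", proto, sport, dport))
```

The resulting value still has to be mapped into the UDP source port before it is used as flow entropy.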
5.11.2. Flow entropy properties | 5.11.2. Flow entropy properties | |||
The flow entropy is the value set in the UDP source port of a GUE | The flow entropy is the value set in the UDP source port of a GUE | |||
packet. Flow entropy in the UDP source port should adhere to the | packet. Flow entropy in the UDP source port SHOULD adhere to the | |||
following properties: | following properties: | |||
o The value set in the source port should be within the ephemeral | o The value set in the source port is within the ephemeral port | |||
port range (49152 to 65535 [RFC6335]). Since the high order two | range (49152 to 65535 [RFC6335]). Since the high order two bits | |||
bits of the port are set to one this provides fourteen bits of | of the port are set to one, this provides fourteen bits of | |||
entropy for the value. | entropy for the value. | |||
o The flow entropy should have a uniform distribution across | o The flow entropy has a uniform distribution across encapsulated | |||
encapsulated flows. | flows. | |||
o An encapsulator may occasionally change the flow entropy used | o An encapsulator MAY occasionally change the flow entropy used | |||
for an inner flow per its discretion (for security, route | for an inner flow per its discretion (for security, route | |||
selection, etc). To avoid thrashing or flapping the value, the | selection, etc). To avoid thrashing or flapping the value, the | |||
flow entropy used for a flow should not change more than once | flow entropy used for a flow SHOULD NOT change more than once | |||
every thirty seconds (or a configurable value). | every thirty seconds (or a configurable value). | |||
o Decapsulators, or any networking devices, should not attempt to | o Decapsulators, or any networking devices, SHOULD NOT attempt to | |||
interpret flow entropy as anything more than an opaque value. | interpret flow entropy as anything more than an opaque value. | |||
Neither should they attempt to reproduce the hash calculation | Neither should they attempt to reproduce the hash calculation | |||
used by an encapsulator in creating a flow entropy value. They | used by an encapsulator in creating a flow entropy value. They | |||
may use the value to match further receive packets for steering | MAY use the value to match further receive packets for steering | |||
decisions, but cannot assume that the hash uniquely or | decisions, but MUST NOT assume that the hash uniquely or | |||
permanently identifies a flow. | permanently identifies a flow. | |||
o Input to the flow entropy calculation is not restricted to ports | o Input to the flow entropy calculation is not restricted to ports | |||
and addresses; input could include flow label from an IPv6 | and addresses; input could include flow label from an IPv6 | |||
packet, SPI from an ESP packet, or other flow related state in | packet, SPI from an ESP packet, or other flow related state in | |||
the encapsulator that is not necessarily conveyed in the packet. | the encapsulator that is not necessarily conveyed in the packet. | |||
o The assignment function for flow entropy should be randomly | o The assignment function for flow entropy SHOULD be randomly | |||
seeded to mitigate denial of service attacks. The seed may be | seeded to mitigate denial of service attacks. The seed may be | |||
changed periodically. | changed periodically. | |||
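Two of the properties above, the ephemeral port range and the random seeding, can be illustrated with a keyed hash. This is a sketch under stated assumptions: the FlowEntropy class, its method names, and the use of keyed BLAKE2b are hypothetical choices. The limit on how often the value may change for a flow is not modeled.

```python
import hashlib
import os
from typing import Optional

EPHEMERAL_BASE = 0xC000  # 49152: the two high-order port bits set to one
ENTROPY_MASK = 0x3FFF    # the remaining fourteen bits carry the entropy


class FlowEntropy:
    def __init__(self, seed: Optional[bytes] = None) -> None:
        # A random seed makes the mapping unpredictable to outside parties.
        self._seed = seed if seed is not None else os.urandom(16)

    def reseed(self) -> None:
        """Replace the seed; this changes the port assigned to every flow."""
        self._seed = os.urandom(16)

    def source_port(self, flow_key: bytes) -> int:
        """Map an opaque flow key to a source port in 49152..65535."""
        digest = hashlib.blake2b(flow_key, key=self._seed,
                                 digest_size=2).digest()
        return EPHEMERAL_BASE | (int.from_bytes(digest, "big") & ENTROPY_MASK)
```

The flow key is treated as opaque bytes, so it can include ports and addresses, an IPv6 flow label, an ESP SPI, or other encapsulator state.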
5.12. Negotiation of acceptable flags and extension fields | 5.12. Negotiation of acceptable flags and extension fields | |||
An encapsulator and decapsulator must achieve agreement about GUE | An encapsulator and decapsulator need to achieve agreement about GUE | |||
parameters that will be used in communications. Parameters include | parameters that will be used in communications. Parameters include GUE | |||
GUE versions, flags and optional extension fields that can be used, | version, flags and extension fields that can be used, security | |||
security algorithms and keys, supported protocols and control | algorithms and keys, supported protocols and control messages, etc. | |||
messages, etc. This document proposes different general methods to | This document proposes different general methods to accomplish this, | |||
accomplish this, the details of implementing these are considered out | however the details of implementing these are considered out of | |||
of scope. | scope. | |||
General methods for this are: | Possible negotiation methods are: | |||
o Configuration. The parameters used for a tunnel are configured | o Configuration. The parameters used for a tunnel are configured | |||
at each endpoint. | at each endpoint. | |||
o Negotiation. A tunnel negotiation can be performed. This could | o Negotiation. A tunnel negotiation can be performed. This could | |||
be accomplished in-band of GUE using control messages or private | be accomplished in-band of GUE using control messages or private | |||
data. | data. | |||
o Via a control plane. Parameters for communicating with a tunnel | o Via a control plane. Parameters for communicating with a tunnel | |||
endpoint can be set in a control plane protocol (such as that | endpoint can be set in a control plane protocol (such as that | |||
needed for nvo3). | needed for nvo3). | |||
o Via security negotiation. If security is used that would | o Via security negotiation. Use of security typically implies a | |||
typically imply a key exchange between endpoints. Other GUE | key exchange between endpoints. Other GUE parameters may be | |||
parameters may be conveyed as part of that process. | conveyed as part of that process. | |||
6. Motivation for GUE | 6. Motivation for GUE | |||
This section presents the motivation for GUE with respect to other | This section presents the motivation for GUE with respect to other | |||
encapsulation methods. | encapsulation methods. | |||
6.1. Benefits of GUE | 6.1. Benefits of GUE | |||
* GUE is a generic encapsulation protocol. GUE can encapsulate | * GUE is a generic encapsulation protocol. GUE can encapsulate | |||
protocols that are represented by an IP protocol number. This | protocols that are represented by an IP protocol number. This | |||
includes layer 2, layer 3, and layer 4 protocols. | includes layer 2, layer 3, and layer 4 protocols. | |||
* GUE is an extensible encapsulation protocol. Standardized | * GUE is an extensible encapsulation protocol. Standard optional | |||
optional data such as security, virtual networking identifiers, | data such as security, virtual networking identifiers, | |||
and fragmentation are being defined. | and fragmentation are being defined. | |||
* For extensibility, GUE uses flag fields as opposed to TLVs as | ||||
some other encapsulation protocols do. Flag fields are strictly | ||||
ordered, allow random access, and make efficient use of header | ||||
space. | ||||
* GUE allows private data to be sent as part of the encapsulation. | * GUE allows private data to be sent as part of the encapsulation. | |||
This permits experimentation or customization in deployment. | This permits experimentation or customization in deployment. | |||
* GUE allows sending of control messages such as OAM using the | * GUE allows sending of control messages such as OAM using the | |||
same GUE header format (for routing purposes) as normal data | same GUE header format (for routing purposes) as normal data | |||
messages. | messages. | |||
* GUE maximizes deliverability of non-UDP and non-TCP protocols. | * GUE maximizes deliverability of non-UDP and non-TCP protocols. | |||
* GUE provides a means for exposing per flow entropy for ECMP for | * GUE provides a means for exposing per flow entropy for ECMP for | |||
atypical protocols such as SCTP, DCCP, ESP, etc. | atypical protocols such as SCTP, DCCP, ESP, etc. | |||
6.2. Comparison of GUE to other encapsulations | 6.2 Comparison of GUE to other encapsulations | |||
A number of different encapsulation techniques have been proposed for | A number of different encapsulation techniques have been proposed for | |||
the encapsulation of one protocol over another. EtherIP [RFC3378] | the encapsulation of one protocol over another. EtherIP [RFC3378] | |||
provides layer 2 tunneling of Ethernet frames over IP. GRE [RFC2784], | provides layer 2 tunneling of Ethernet frames over IP. GRE [RFC2784], | |||
MPLS [RFC4023], and L2TP [RFC2661] provide methods for tunneling | MPLS [RFC4023], and L2TP [RFC2661] provide methods for tunneling | |||
layer 2 and layer 3 packets over IP. NVGRE [RFC7637] and VXLAN | layer 2 and layer 3 packets over IP. NVGRE [RFC7637] and VXLAN | |||
[RFC7348] are proposals for encapsulation of layer 2 packets for | [RFC7348] are proposals for encapsulation of layer 2 packets for | |||
network virtualization. IPIP [RFC2003] and Generic packet tunneling | network virtualization. IPIP [RFC2003] and Generic packet tunneling | |||
in IPv6 [RFC2473] provide methods for tunneling IP packets over IP. | in IPv6 [RFC2473] provide methods for tunneling IP packets over IP. | |||
Several proposals exist for encapsulating packets over UDP including | Several proposals exist for encapsulating packets over UDP including | |||
ESP over UDP [RFC3948], TCP directly over UDP [TCPUDP], VXLAN | ESP over UDP [RFC3948], TCP directly over UDP [TCPUDP], VXLAN | |||
[RFC7348], LISP [RFC6830] which encapsulates layer 3 packets, | [RFC7348], LISP [RFC6830] which encapsulates layer 3 packets, | |||
MPLS/UDP [7510], and Generic UDP Encapsulation for IP Tunneling (GRE | MPLS/UDP [RFC7510], and Generic UDP Encapsulation for IP Tunneling | |||
over UDP)[GREUDP]. Generic UDP tunneling [GUT] is a proposal similar | (GRE over UDP)[RFC8086]. Generic UDP tunneling [GUT] is a proposal | |||
to GUE in that it aims to tunnel packets of IP protocols over UDP. | similar to GUE in that it aims to tunnel packets of IP protocols over | |||
UDP. | ||||
GUE has the following discriminating features: | GUE has the following discriminating features: | |||
o UDP encapsulation leverages specialized network device | o UDP encapsulation leverages specialized network device | |||
processing for efficient transport. The semantics for using the | processing for efficient transport. The semantics for using the | |||
UDP source port for flow entropy as input to ECMP are defined in | UDP source port for flow entropy as input to ECMP are defined in | |||
section 5.11. | section 5.11. | |||
o GUE permits encapsulation of arbitrary IP protocols, which | o GUE permits encapsulation of arbitrary IP protocols, which | |||
includes layer 2, 3, and 4 protocols. | includes layer 2, 3, and 4 protocols. | |||
o Multiple protocols can be multiplexed over a single UDP port | o Multiple protocols can be multiplexed over a single UDP port | |||
number. This is in contrast to techniques to encapsulate | number. This is in contrast to techniques to encapsulate | |||
protocols over UDP using a protocol specific port number (such | protocols over UDP using a protocol specific port number (such | |||
as ESP/UDP, GRE/UDP, SCTP/UDP). GUE provides a uniform and | as ESP/UDP, GRE/UDP, SCTP/UDP). GUE provides a uniform and | |||
extensible mechanism for encapsulating various IP protocols in | extensible mechanism for encapsulating all IP protocols in UDP | |||
UDP with minimal overhead (four bytes of additional header). | with minimal overhead (four bytes of additional header). | |||
o GUE is extensible. New flags and extension fields can be | o GUE is extensible. New flags and extension fields can be | |||
defined. | defined. | |||
o The GUE header includes a header length field. This allows a | o The GUE header includes a header length field. This allows a | |||
network node to inspect an encapsulated packet without needing | network node to inspect an encapsulated packet without needing | |||
to parse the full encapsulation header. | to parse the full encapsulation header. | |||
o Private data in the encapsulation header allows local | o Private data in the encapsulation header allows local | |||
customization and experimentation while being compatible with | customization and experimentation while being compatible with | |||
processing in network nodes (routers and middleboxes). | processing in network nodes (routers and middleboxes). | |||
o GUE includes both data messages (encapsulation of packets) and | o GUE includes both data messages (encapsulation of packets) and | |||
control messages (such as OAM). | control messages (such as OAM). | |||
o The flags-field model facilitates efficient implementation of | o The flags-field model facilitates efficient implementation of | |||
extensibility in hardware. | extensibility in hardware. For example, a TCAM can be used to | |||
parse a known set of N flags where the number of entries in the | ||||
For instance, a TCAM can be used to parse a known set of N flags | TCAM is 2^N. By contrast, the number of TCAM entries needed to | |||
where the number of entries in the TCAM is 2^N. | parse a set of N arbitrarily ordered TLVs is approximately e*N!. | |||
By comparison, the number of TCAM entries needed to parse a set | ||||
of N arbitrarily ordered TLVs is: | ||||
N! + C(N,N-1)(N-1)! + C(N,N-2)(N-2)! + ... + C(N,2)2! + C(N,1)1! | ||||
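As a rough check of the TCAM comparison, the following Python sketch (not part of the draft) counts entries for N flags as 2^N and, for N arbitrarily ordered TLVs, reads the expression as the sum over k of C(N,k) * k!, that is, every ordered selection of 1..N distinct TLVs. That reading and the function names are assumptions made for illustration.

```python
import math

def flag_tcam_entries(n: int) -> int:
    """Entries needed to match every combination of n flag bits."""
    return 2 ** n

def tlv_tcam_entries(n: int) -> int:
    """Ordered selections of 1..n distinct TLVs: sum of C(n, k) * k!."""
    return sum(math.comb(n, k) * math.factorial(k) for k in range(1, n + 1))

if __name__ == "__main__":
    for n in (2, 4, 8):
        approx = math.e * math.factorial(n)  # the e*N! approximation
        print(n, flag_tcam_entries(n), tlv_tcam_entries(n), round(approx))
```

The sum equals N! * (1/0! + 1/1! + ... + 1/(N-1)!), which is why it approaches e*N! as N grows.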
7. Security Considerations | 7. Security Considerations | |||
There are two important considerations of security with respect to | There are two important considerations of security with respect to | |||
GUE. | GUE. | |||
o Authentication and integrity of the GUE header | o Authentication and integrity of the GUE header. | |||
o Authentication, integrity, and confidentiality of the GUE | o Authentication, integrity, and confidentiality of the GUE | |||
payload. | payload. | |||
Security is integrated into GUE by the use of GUE security related | GUE security is provided by extensions for security defined in | |||
extensions; these are defined in [GUEEXTENS]. These extensions | [GUEEXTENS]. These extensions include methods to authenticate the GUE | |||
include methods to authenticate the GUE header and encrypt the GUE | header and encrypt the GUE payload. | |||
payload. | ||||
IPsec in transport mode may be used to authenticate or encrypt GUE | The GUE header can be authenticated using a security extension for an | |||
packets (GUE header and payload). Existing network security | HMAC. Securing the GUE payload can be accomplished by use of the GUE | |||
mechanisms, such as address spoofing detection, DDOS mitigation, and | Payload Transform that can provide DTLS [RFC6347] in the payload of a | |||
transparent encrypted tunnels can be applied to GUE packets. | GUE packet to encrypt the payload. | |||
A hash function for computing flow entropy (section 5.11) should be | A hash function for computing flow entropy (section 5.11) SHOULD be | |||
randomly seeded to mitigate some possible denial of service attacks. | randomly seeded to mitigate some possible denial of service attacks. | |||
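A minimal Python sketch of a randomly seeded flow hash follows. It is illustrative only: the choice of keyed BLAKE2, the 16-byte seed, the inner five-tuple as hash input, and the 49152-65535 source port range are all assumptions, not requirements taken from this document.

```python
import hashlib
import ipaddress
import secrets
import struct

# Random seed chosen once at startup (assumed policy). Without knowing it,
# an attacker cannot predict which flows map to the same entropy value.
SEED = secrets.token_bytes(16)

def flow_entropy_port(src, dst, proto, sport, dport,
                      seed=SEED, lo=49152, hi=65535):
    """Map an inner five-tuple to a UDP source port using a keyed hash."""
    data = (ipaddress.ip_address(src).packed
            + ipaddress.ip_address(dst).packed
            + struct.pack("!BHH", proto, sport, dport))
    digest = hashlib.blake2b(data, key=seed, digest_size=4).digest()
    return lo + int.from_bytes(digest, "big") % (hi - lo + 1)

if __name__ == "__main__":
    print(flow_entropy_port("192.0.2.1", "198.51.100.7", 6, 40000, 443))
```

The same flow always yields the same port for a given seed, so packets of one flow stay on one ECMP path, while the mapping changes whenever the seed is regenerated.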
8. IANA Consideration | 8. IANA Considerations | |||
8.1. UDP source port | 8.1. UDP source port | |||
A user UDP port number assignment for GUE has been assigned: | A user UDP port number assignment for GUE has been assigned: | |||
Service Name: gue | Service Name: gue | |||
Transport Protocol(s): UDP | Transport Protocol(s): UDP | |||
Assignee: Tom Herbert <therbert@google.com> | Assignee: Tom Herbert <therbert@google.com> | |||
Contact: Tom Herbert <therbert@google.com> | Contact: Tom Herbert <therbert@google.com> | |||
Description: Generic UDP Encapsulation | Description: Generic UDP Encapsulation | |||
Reference: draft-herbert-gue | Reference: draft-herbert-gue | |||
Port Number: 6080 | Port Number: 6080 | |||
Service Code: N/A | Service Code: N/A | |||
Known Unauthorized Uses: N/A | Known Unauthorized Uses: N/A | |||
Assignment Notes: N/A | Assignment Notes: N/A | |||
8.2. GUE version number | 8.2. GUE version number | |||
IANA is requested to set up a registry for the GUE version number. | IANA is requested to set up a registry for the GUE version number. | |||
The GUE version number is 2 bits containing four possible values. | The GUE version number is 2 bits containing four possible values. | |||
This document defines version 0 and 1. New values are assigned via | This document defines version 0 and 1. New values are assigned in | |||
Standards Action [RFC5226]. | accordance with RFC Required policy [RFC5226]. | |||
+----------------+-------------+---------------+ | +----------------+-------------+---------------+ | |||
| Version number | Description | Reference | | | Version number | Description | Reference | | |||
+----------------+-------------+---------------+ | +----------------+-------------+---------------+ | |||
| 0 | Version 0 | This document | | | 0 | Version 0 | This document | | |||
| | | | | | | | | | |||
| 1 | Version 1 | This document | | | 1 | Version 1 | This document | | |||
| | | | | | | | | | |||
| 2..3 | Unassigned | | | | 2..3 | Unassigned | | | |||
+----------------+-------------+---------------+ | +----------------+-------------+---------------+ | |||
8.3. Control types | 8.3. Control types | |||
IANA is requested to set up a registry for the GUE control types. | IANA is requested to set up a registry for the GUE control types. | |||
Control types are 8 bit values. New values for control types 1-127 | Control types are 8 bit values. New values for control types 1-127 | |||
are assigned via Standards Action [RFC5226]. | are assigned in accordance with RFC Required policy [RFC5226]. | |||
+----------------+------------------+---------------+ | +----------------+------------------+---------------+ | |||
| Control type | Description | Reference | | | Control type | Description | Reference | | |||
+----------------+------------------+---------------+ | +----------------+------------------+---------------+ | |||
| 0 | Need further | This document | | | 0 | Need further | This document | | |||
| | interpretation | | | | | interpretation | | | |||
| | | | | | | | | | |||
| 1..127 | Unassigned | | | | 1..127 | Unassigned | | | |||
| | | | | | | | | | |||
| 128..255 | User defined | This document | | | 128..255 | User defined | This document | | |||
+----------------+------------------+---------------+ | +----------------+------------------+---------------+ | |||
8.4. Flag-fields | 8.4. Flag-fields | |||
IANA is requested to create a "GUE flag-fields" registry to allocate | IANA is requested to create a "GUE flag-fields" registry to allocate | |||
flags and extension fields used with GUE. This shall be a registry of | flags and extension fields used with GUE. This shall be a registry of | |||
bit assignments for flags, length of extension fields for | bit assignments for flags, length of extension fields for | |||
corresponding flags, and descriptive strings. There are sixteen bits | corresponding flags, and descriptive strings. There are sixteen bits | |||
for primary GUE header flags (bit number 0-15). New values are | for primary GUE header flags (bit number 0-15). New values are | |||
assigned via Standards Action [RFC5226]. | assigned in accordance with RFC Required policy [RFC5226]. | |||
+-------------+--------------+-------------+--------------------+ | +-------------+--------------+-------------+--------------------+ | |||
| Flags bits | Field size | Description | Reference | | | Flags bits | Field size | Description | Reference | | |||
+-------------+--------------+-------------+--------------------+ | +-------------+--------------+-------------+--------------------+ | |||
| Bit 0 | 4 bytes | VNID | [GUE4NVO3] | | | Bit 0 | 4 bytes | VNID | [GUE4NVO3] | | |||
| | | | | | | | | | | | |||
| Bit 1..3 | 001->8 bytes | Security | [GUEEXTENS] | | | Bit 1..3 | 001->8 bytes | Security | [GUEEXTENS] | | |||
| | 010->16 bytes| | | | | | 010->16 bytes| | | | |||
| | 011->32 bytes| | | | | | 011->32 bytes| | | | |||
| | | | | | | | | | | | |||
skipping to change at page 29, line 8 ¶ | skipping to change at page 29, line 35 ¶ | |||
| | | | | | | | | | | | |||
| Bit 8..15 | | Unassigned | | | | Bit 8..15 | | Unassigned | | | |||
+-------------+--------------+-------------+--------------------+ | +-------------+--------------+-------------+--------------------+ | |||
New flags are to be allocated from high to low order bit contiguously | New flags are to be allocated from high to low order bit contiguously | |||
without holes. | without holes. | |||
9. Acknowledgements | 9. Acknowledgements | |||
The authors would like to thank David Liu, Erik Nordmark, Fred | The authors would like to thank David Liu, Erik Nordmark, Fred | |||
Templin, Adrian Farrel, and Bob Briscoe for valuable input on this | Templin, Adrian Farrel, Bob Briscoe, and Murray Kucherawy for | |||
draft. | valuable input on this draft. | |||
10. References | 10. References | |||
10.1. Normative References | 10.1. Normative References | |||
[RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768, DOI | [RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768, DOI | |||
10.17487/RFC0768, August 1980, <http://www.rfc- | 10.17487/RFC0768, August 1980, <http://www.rfc- | |||
editor.org/info/rfc768>. | editor.org/info/rfc768>. | |||
[RFC1122] Braden, R., Ed., "Requirements for Internet Hosts - | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
Communication Layers", STD 3, RFC 1122, DOI | Requirement Levels", BCP 14, RFC 2119, DOI | |||
10.17487/RFC1122, October 1989, <http://www.rfc- | 10.17487/RFC2119, March 1997, <http://www.rfc- | |||
editor.org/info/rfc1122>. | editor.org/info/rfc2119>. | |||
[RFC2434] Narten, T. and H. Alvestrand, "Guidelines for Writing an | [RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 | |||
IANA Considerations Section in RFCs", RFC 2434, DOI | (IPv6) Specification", RFC 2460, DOI 10.17487/RFC2460, | |||
10.17487/RFC2434, October 1998, <http://www.rfc- | December 1998, <http://www.rfc-editor.org/info/rfc2460>. | |||
editor.org/info/rfc2434>. | ||||
[RFC2983] Black, D., "Differentiated Services and Tunnels", RFC | [RFC2983] Black, D., "Differentiated Services and Tunnels", RFC | |||
2983, DOI 10.17487/RFC2983, October 2000, <http://www.rfc- | 2983, DOI 10.17487/RFC2983, October 2000, <http://www.rfc- | |||
editor.org/info/rfc2983>. | editor.org/info/rfc2983>. | |||
[RFC6040] Briscoe, B., "Tunnelling of Explicit Congestion | [RFC6040] Briscoe, B., "Tunnelling of Explicit Congestion | |||
Notification", RFC 6040, DOI 10.17487/RFC6040, November | Notification", RFC 6040, DOI 10.17487/RFC6040, November | |||
2010, <http://www.rfc-editor.org/info/rfc6040>. | 2010, <http://www.rfc-editor.org/info/rfc6040>. | |||
[RFC6935] Eubanks, M., Chimento, P., and M. Westerlund, "IPv6 and | [RFC6935] Eubanks, M., Chimento, P., and M. Westerlund, "IPv6 and | |||
skipping to change at page 30, line 5 ¶ | skipping to change at page 30, line 31 ¶ | |||
[RFC6936] Fairhurst, G. and M. Westerlund, "Applicability Statement | [RFC6936] Fairhurst, G. and M. Westerlund, "Applicability Statement | |||
for the Use of IPv6 UDP Datagrams with Zero Checksums", | for the Use of IPv6 UDP Datagrams with Zero Checksums", | |||
RFC 6936, DOI 10.17487/RFC6936, April 2013, | RFC 6936, DOI 10.17487/RFC6936, April 2013, | |||
<http://www.rfc-editor.org/info/rfc6936>. | <http://www.rfc-editor.org/info/rfc6936>. | |||
[RFC4459] Savola, P., "MTU and Fragmentation Issues with In-the- | [RFC4459] Savola, P., "MTU and Fragmentation Issues with In-the- | |||
Network Tunneling", RFC 4459, DOI 10.17487/RFC4459, April | Network Tunneling", RFC 4459, DOI 10.17487/RFC4459, April | |||
2006, <http://www.rfc-editor.org/info/rfc4459>. | 2006, <http://www.rfc-editor.org/info/rfc4459>. | |||
10.2. Informative References | [RFC1122] Braden, R., Ed., "Requirements for Internet Hosts - | |||
Communication Layers", STD 3, RFC 1122, DOI | ||||
[RFC3828] Larzon, L-A., Degermark, M., Pink, S., Jonsson, L-E., Ed., | 10.17487/RFC1122, October 1989, <http://www.rfc- | |||
and G. Fairhurst, Ed., "The Lightweight User Datagram | editor.org/info/rfc1122>. | |||
Protocol (UDP-Lite)", RFC 3828, July 2004, | ||||
<http://www.rfc-editor.org/info/rfc3828>. | ||||
[RFC7348] Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger, | ||||
L., Sridhar, T., Bursell, M., and C. Wright, "Virtual | ||||
eXtensible Local Area Network (VXLAN): A Framework for | ||||
Overlaying Virtualized Layer 2 Networks over Layer 3 | ||||
Networks", RFC 7348, August 2014, <http://www.rfc- | ||||
editor.org/info/rfc7348>. | ||||
[RFC7605] Touch, J., "Recommendations on Using Assigned Transport | ||||
Port Numbers", BCP 165, RFC 7605, DOI 10.17487/RFC7605, | ||||
August 2015, <http://www.rfc-editor.org/info/rfc7605>. | ||||
[RFC7637] Garg, P., Ed., and Y. Wang, Ed., "NVGRE: Network | [RFC6335] Cotton, M., Eggert, L., Touch, J., Westerlund, M., and S. | |||
Virtualization Using Generic Routing Encapsulation", RFC | Cheshire, "Internet Assigned Numbers Authority (IANA) | |||
7637, DOI 10.17487/RFC7637, September 2015, | Procedures for the Management of the Service Name and | |||
<http://www.rfc-editor.org/info/rfc7637>. | Transport Protocol Port Number Registry", BCP 165, RFC | |||
6335, DOI 10.17487/RFC6335, August 2011, <http://www.rfc- | ||||
editor.org/info/rfc6335>. | ||||
[RFC7510] Xu, X., Sheth, N., Yong, L., Callon, R., and D. Black, | 10.2. Informative References | |||
"Encapsulating MPLS in UDP", RFC 7510, DOI | ||||
10.17487/RFC7510, April 2015, <http://www.rfc- | ||||
editor.org/info/rfc7510>. | ||||
[RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram | [RFC2992] Hopps, C., "Analysis of an Equal-Cost Multi-Path | |||
Congestion Control Protocol (DCCP)", RFC 4340, DOI | Algorithm", RFC 2992, DOI 10.17487/RFC2992, November 2000, | |||
10.17487/RFC4340, March 2006, <http://www.rfc- | <http://www.rfc-editor.org/info/rfc2992>. | |||
editor.org/info/rfc4340>. | ||||
[RFC4787] Audet, F., Ed., and C. Jennings, "Network Address | [RFC4787] Audet, F., Ed., and C. Jennings, "Network Address | |||
Translation (NAT) Behavioral Requirements for Unicast | Translation (NAT) Behavioral Requirements for Unicast | |||
UDP", BCP 127, RFC 4787, DOI 10.17487/RFC4787, January | UDP", BCP 127, RFC 4787, DOI 10.17487/RFC4787, January | |||
2007, <http://www.rfc-editor.org/info/rfc4787>. | 2007, <http://www.rfc-editor.org/info/rfc4787>. | |||
[RFC5389] Rosenberg, J., Mahy, R., Matthews, P., and D. Wing, | [RFC5389] Rosenberg, J., Mahy, R., Matthews, P., and D. Wing, | |||
"Session Traversal Utilities for NAT (STUN)", RFC 5389, | "Session Traversal Utilities for NAT (STUN)", RFC 5389, | |||
DOI 10.17487/RFC5389, October 2008, <http://www.rfc- | DOI 10.17487/RFC5389, October 2008, <http://www.rfc- | |||
editor.org/info/rfc5389>. | editor.org/info/rfc5389>. | |||
[RFC5285] Rosenberg, J., "Interactive Connectivity Establishment | [RFC5245] Rosenberg, J., "Interactive Connectivity Establishment | |||
(ICE): A Protocol for Network Address Translator (NAT) | (ICE): A Protocol for Network Address Translator (NAT) | |||
Traversal for Offer/Answer Protocols", RFC 5245, DOI | Traversal for Offer/Answer Protocols", RFC 5245, DOI | |||
10.17487/RFC5245, April 2010, <http://www.rfc- | 10.17487/RFC5245, April 2010, <http://www.rfc- | |||
editor.org/info/rfc5245>. | editor.org/info/rfc5245>. | |||
[RFC8086] Yong, L., Ed., Crabbe, E., Xu, X., and T. Herbert, "GRE- | ||||
in-UDP Encapsulation", RFC 8086, DOI 10.17487/RFC8086, | ||||
March 2017, <http://www.rfc-editor.org/info/rfc8086>. | ||||
[RFC5405] Eggert, L. and G. Fairhurst, "Unicast UDP Usage Guidelines | [RFC5405] Eggert, L. and G. Fairhurst, "Unicast UDP Usage Guidelines | |||
for Application Designers", BCP 145, RFC 5405, DOI | for Application Designers", BCP 145, RFC 5405, DOI | |||
10.17487/RFC5405, November 2008, <http://www.rfc- | 10.17487/RFC5405, November 2008, <http://www.rfc- | |||
editor.org/info/rfc5405>. | editor.org/info/rfc5405>. | |||
[RFC7605] Touch, J., "Recommendations on Using Assigned Transport | ||||
Port Numbers", BCP 165, RFC 7605, DOI 10.17487/RFC7605, | ||||
August 2015, <http://www.rfc-editor.org/info/rfc7605>. | ||||
[RFC6438] Carpenter, B. and S. Amante, "Using the IPv6 Flow Label | [RFC6438] Carpenter, B. and S. Amante, "Using the IPv6 Flow Label | |||
for Equal Cost Multipath Routing and Link Aggregation in | for Equal Cost Multipath Routing and Link Aggregation in | |||
Tunnels", RFC 6438, DOI 10.17487/RFC6438, November 2011, | Tunnels", RFC 6438, DOI 10.17487/RFC6438, November 2011, | |||
<http://www.rfc-editor.org/info/rfc6438>. | <http://www.rfc-editor.org/info/rfc6438>. | |||
[RFC2003] Perkins, C., "IP Encapsulation within IP", RFC 2003, DOI | ||||
10.17487/RFC2003, October 1996, <http://www.rfc- | ||||
editor.org/info/rfc2003>. | ||||
[RFC3948] Huttunen, A., Swander, B., Volpe, V., DiBurro, L., and M. | ||||
Stenberg, "UDP Encapsulation of IPsec ESP Packets", RFC | ||||
3948, DOI 10.17487/RFC3948, January 2005, <http://www.rfc- | ||||
editor.org/info/rfc3948>. | ||||
[RFC6830] Farinacci, D., Fuller, V., Meyer, D., and D. Lewis, "The | ||||
Locator/ID Separation Protocol (LISP)", RFC 6830, DOI | ||||
10.17487/RFC6830, January 2013, <http://www.rfc- | ||||
editor.org/info/rfc6830>. | ||||
[RFC3378] Housley, R. and S. Hollenbeck, "EtherIP: Tunneling | [RFC3378] Housley, R. and S. Hollenbeck, "EtherIP: Tunneling | |||
Ethernet Frames in IP Datagrams", RFC 3378, DOI | Ethernet Frames in IP Datagrams", RFC 3378, DOI | |||
10.17487/RFC3378, September 2002, <http://www.rfc- | 10.17487/RFC3378, September 2002, <http://www.rfc- | |||
editor.org/info/rfc3378>. | editor.org/info/rfc3378>. | |||
[RFC2784] Farinacci, D., Li, T., Hanks, S., Meyer, D., and P. | [RFC2784] Farinacci, D., Li, T., Hanks, S., Meyer, D., and P. | |||
Traina, "Generic Routing Encapsulation (GRE)", RFC 2784, | Traina, "Generic Routing Encapsulation (GRE)", RFC 2784, | |||
DOI 10.17487/RFC2784, March 2000, <http://www.rfc- | DOI 10.17487/RFC2784, March 2000, <http://www.rfc- | |||
editor.org/info/rfc2784>. | editor.org/info/rfc2784>. | |||
[RFC4023] Worster, T., Rekhter, Y., and E. Rosen, Ed., | [RFC4023] Worster, T., Rekhter, Y., and E. Rosen, Ed., | |||
"Encapsulating MPLS in IP or Generic Routing Encapsulation | "Encapsulating MPLS in IP or Generic Routing Encapsulation | |||
(GRE)", RFC 4023, DOI 10.17487/RFC4023, March 2005, | (GRE)", RFC 4023, DOI 10.17487/RFC4023, March 2005, | |||
<http://www.rfc-editor.org/info/rfc4023>. | <http://www.rfc-editor.org/info/rfc4023>. | |||
[RFC2661] Townsley, W., Valencia, A., Rubens, A., Pall, G., Zorn, | [RFC2661] Townsley, W., Valencia, A., Rubens, A., Pall, G., Zorn, | |||
G., and B. Palter, "Layer Two Tunneling Protocol "L2TP"", | G., and B. Palter, "Layer Two Tunneling Protocol "L2TP"", | |||
RFC 2661, DOI 10.17487/RFC2661, August 1999, | RFC 2661, DOI 10.17487/RFC2661, August 1999, | |||
<http://www.rfc-editor.org/info/rfc2661>. | <http://www.rfc-editor.org/info/rfc2661>. | |||
[RFC7637] Garg, P., Ed., and Y. Wang, Ed., "NVGRE: Network | ||||
Virtualization Using Generic Routing Encapsulation", RFC | ||||
7637, DOI 10.17487/RFC7637, September 2015, | ||||
<http://www.rfc-editor.org/info/rfc7637>. | ||||
[RFC7348] Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger, | ||||
L., Sridhar, T., Bursell, M., and C. Wright, "Virtual | ||||
eXtensible Local Area Network (VXLAN): A Framework for | ||||
Overlaying Virtualized Layer 2 Networks over Layer 3 | ||||
Networks", RFC 7348, August 2014, <http://www.rfc- | ||||
editor.org/info/rfc7348>. | ||||
[RFC2003] Perkins, C., "IP Encapsulation within IP", RFC 2003, DOI | ||||
10.17487/RFC2003, October 1996, <http://www.rfc- | ||||
editor.org/info/rfc2003>. | ||||
[RFC2473] Conta, A. and S. Deering, "Generic Packet Tunneling in | ||||
IPv6 Specification", RFC 2473, DOI 10.17487/RFC2473, | ||||
December 1998, <http://www.rfc-editor.org/info/rfc2473>. | ||||
[RFC3948] Huttunen, A., Swander, B., Volpe, V., DiBurro, L., and M. | ||||
Stenberg, "UDP Encapsulation of IPsec ESP Packets", RFC | ||||
3948, DOI 10.17487/RFC3948, January 2005, <http://www.rfc- | ||||
editor.org/info/rfc3948>. | ||||
[RFC6830] Farinacci, D., Fuller, V., Meyer, D., and D. Lewis, "The | ||||
Locator/ID Separation Protocol (LISP)", RFC 6830, DOI | ||||
10.17487/RFC6830, January 2013, <http://www.rfc- | ||||
editor.org/info/rfc6830>. | ||||
[RFC7510] Xu, X., Sheth, N., Yong, L., Callon, R., and D. Black, | ||||
"Encapsulating MPLS in UDP", RFC 7510, DOI | ||||
10.17487/RFC7510, April 2015, <http://www.rfc- | ||||
editor.org/info/rfc7510>. | ||||
[RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram | ||||
Congestion Control Protocol (DCCP)", RFC 4340, DOI | ||||
10.17487/RFC4340, March 2006, <http://www.rfc- | ||||
editor.org/info/rfc4340>. | ||||
[GUEEXTENS] Herbert, T., Yong, L., and Templin, F., "Extensions for | [GUEEXTENS] Herbert, T., Yong, L., and Templin, F., "Extensions for | |||
Generic UDP Encapsulation" draft-herbert-gue-extensions-00 | Generic UDP Encapsulation" draft-herbert-gue-extensions-00 | |||
[GUE4NVO3] Yong, L., Herbert, T., Zia, O., "Generic UDP | [GUE4NVO3] Yong, L., Herbert, T., Zia, O., "Generic UDP | |||
Encapsulation (GUE) for Network Virtualization Overlay" | Encapsulation (GUE) for Network Virtualization Overlay" | |||
draft-hy-nvo3-gue-4-nvo-03 | draft-hy-nvo3-gue-4-nvo-03 | |||
[GUESEC] Yong, L., Herbert, T., "Generic UDP Encapsulation (GUE) for | [TOU] Herbert, T., "Transport layer protocols over UDP" draft- | |||
Secure Transport" draft-hy-gue-4-secure-transport-03 | herbert-transports-over-udp-00 | |||
[CIRCBRK] Fairhurst, G., "Network Transport Circuit Breakers", | ||||
[TCPUDP] Cheshire, S., Graessley, J., and McGuire, R., | [TCPUDP] Cheshire, S., Graessley, J., and McGuire, R., | |||
"Encapsulation of TCP and other Transport Protocols over | "Encapsulation of TCP and other Transport Protocols over | |||
UDP" draft-cheshire-tcp-over-udp-00 | UDP" draft-cheshire-tcp-over-udp-00 | |||
[TOU] Herbert, T., "Transport layer protocols over UDP" draft- | ||||
herbert-transports-over-udp-00 | ||||
[GREUDP] Crabbe, E., Yong, L., Xu, X., and Herbert, T., "Generic | ||||
UDP Encapsulation for IP Tunneling" draft-ietf-tsvwg-gre- | ||||
in-udp-encap-19 | ||||
[GUT] Manner, J., Varia, N., and Briscoe, B., "Generic UDP | [GUT] Manner, J., Varia, N., and Briscoe, B., "Generic UDP | |||
Tunnelling (GUT)" draft-manner-tsvwg-gut-02.txt | Tunnelling (GUT)" draft-manner-tsvwg-gut-02.txt | |||
[CIRCBRK] Fairhurst, G., "Network Transport Circuit Breakers", | ||||
draft-ietf-tsvwg-circuit-breaker-15 | ||||
[LCO] Cree, E., https://www.kernel.org/doc/Documentation/ | [LCO] Cree, E., https://www.kernel.org/doc/Documentation/ | |||
networking/checksum-offloads.txt | networking/checksum-offloads.txt | |||
Appendix A: NIC processing for GUE | Appendix A: NIC processing for GUE | |||
This appendix provides some guidelines for Network Interface Cards | This appendix provides some guidelines for Network Interface Cards | |||
(NICs) to implement common offloads and accelerations to support GUE. | (NICs) to implement common offloads and accelerations to support GUE. | |||
Note that most of this discussion is generally applicable to other | Note that most of this discussion is generally applicable to other | |||
methods of UDP based encapsulation. | methods of UDP based encapsulation. | |||
This appendix is informational and does not constitute a normative | ||||
part of this document. | ||||
A.1. Receive multi-queue | A.1. Receive multi-queue | |||
Contemporary NICs support multiple receive descriptor queues (multi- | Contemporary NICs support multiple receive descriptor queues (multi- | |||
queue). Multi-queue enables load balancing of network processing for | queue). Multi-queue enables load balancing of network processing for | |||
a NIC across multiple CPUs. On packet reception, a NIC must select | a NIC across multiple CPUs. On packet reception, a NIC selects the | |||
the appropriate queue for host processing. Receive Side Scaling is a | appropriate queue for host processing. Receive Side Scaling is a | |||
common method which uses the flow hash for a packet to index an | common method which uses the flow hash for a packet to index an | |||
indirection table where each entry stores a queue number. Flow | indirection table where each entry stores a queue number. Flow | |||
Director and Accelerated Receive Flow Steering (aRFS) allow a host to | Director and Accelerated Receive Flow Steering (aRFS) allow a host to | |||
program the queue that is used for a given flow which is identified | program the queue that is used for a given flow which is identified | |||
either by an explicit five-tuple or by the flow's hash. | either by an explicit five-tuple or by the flow's hash. | |||
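The queue selection described above can be sketched in Python as follows. This is a simplified model, not a description of any particular NIC: the 128-entry table, the four queues, and the use of CRC32 in place of a hardware hash (often a keyed Toeplitz hash) are assumptions for illustration.

```python
import zlib

NUM_QUEUES = 4
# Indirection table: each entry stores a queue number (assumed 128 entries).
INDIRECTION_TABLE = [i % NUM_QUEUES for i in range(128)]

def flow_hash(src_ip, dst_ip, proto, sport, dport):
    """Stand-in five-tuple hash; CRC32 is used only to keep the sketch short."""
    data = f"{src_ip}|{dst_ip}|{proto}|{sport}|{dport}".encode()
    return zlib.crc32(data)

def select_queue(h, table=INDIRECTION_TABLE):
    """Index the indirection table with the low-order bits of the hash."""
    return table[h % len(table)]

if __name__ == "__main__":
    # Same outer addresses and destination port (one tunnel); only the UDP
    # source port, which carries the flow entropy, differs between flows.
    for sport in range(49152, 49160):
        h = flow_hash("192.0.2.1", "192.0.2.2", 17, sport, 6080)
        print(sport, select_queue(h))
```

Because the UDP source port is part of the hashed five-tuple, flows that share a single tunnel can still be spread across queues.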
GUE encapsulation should be compatible with multi-queue NICs that | GUE encapsulation is compatible with multi-queue NICs that support | |||
support five-tuple hash calculation for UDP/IP packets as input to | five-tuple hash calculation for UDP/IP packets as input to RSS. The | |||
RSS. The flow entropy in the UDP source port ensures classification | flow entropy in the UDP source port ensures classification of the | |||
of the encapsulated flow even in the case that the outer source and | encapsulated flow even in the case that the outer source and | |||
destination addresses are the same for all flows (e.g. all flows are | destination addresses are the same for all flows (e.g. all flows are | |||
going over a single tunnel). | going over a single tunnel). | |||
By default, UDP RSS support is often disabled in NICs to avoid out of | By default, UDP RSS support is often disabled in NICs to avoid out- | |||
order reception that can occur when UDP packets are fragmented. As | of-order reception that can occur when UDP packets are fragmented. As | |||
discussed above, fragmentation of GUE packets should be mitigated by | discussed above, fragmentation of GUE packets is mostly avoided by | |||
fragmenting packets before entering a tunnel, GUE fragmentation, path | fragmenting packets before entering a tunnel, GUE fragmentation, path | |||
MTU discovery in higher layer protocols, or operator adjusting MTUs. | MTU discovery in higher layer protocols, or operator adjusting MTUs. | |||
Other UDP traffic may not implement such procedures to avoid | Other UDP traffic might not implement such procedures to avoid | |||
fragmentation, so enabling UDP RSS support in the NIC should be a | fragmentation, so enabling UDP RSS support in the NIC might be a | |||
considered tradeoff during configuration. | considered tradeoff during configuration. | |||
A.2. Checksum offload | A.2. Checksum offload | |||
Many NICs provide capabilities to calculate standard ones complement | Many NICs provide capabilities to calculate standard ones complement | |||
payload checksum for packets in transmit or receive. When using GUE | payload checksum for packets in transmit or receive. When using GUE | |||
encapsulation there are at least two checksums that may be of | encapsulation, there are at least two checksums that are of interest: | |||
interest: the encapsulated packet's transport checksum, and the UDP | the encapsulated packet's transport checksum, and the UDP checksum in | |||
checksum in the outer header. | the outer header. | |||
A.2.1. Transmit checksum offload | A.2.1. Transmit checksum offload | |||
NICs may provide a protocol agnostic method to offload transmit | NICs can provide a protocol agnostic method to offload transmit | |||
checksum (NETIF_F_HW_CSUM in Linux parlance) that can be used with | checksum (NETIF_F_HW_CSUM in Linux parlance) that can be used with | |||
GUE. In this method the host provides checksum related parameters in | GUE. In this method, the host provides checksum related parameters in | |||
a transmit descriptor for a packet. These parameters include the | a transmit descriptor for a packet. These parameters include the | |||
starting offset of data to checksum, the length of data to checksum, | starting offset of data to checksum, the length of data to checksum, | |||
and the offset in the packet where the computed checksum is to be | and the offset in the packet where the computed checksum is to be | |||
written. The host initializes the checksum field to pseudo header | written. The host initializes the checksum field to pseudo header | |||
checksum. | checksum. | |||
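The descriptor-driven method described above can be emulated in software. The sketch below is an assumption-laden model, not a driver interface: `nic_tx_checksum` and its parameter names are hypothetical, and the write offset is taken as an absolute offset in the buffer, following the wording of the text.

```python
def ones_complement_sum(data: bytes, initial: int = 0) -> int:
    """16-bit ones complement sum of data, folded to 16 bits."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = initial
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def nic_tx_checksum(packet: bytearray, csum_start: int, csum_len: int,
                    csum_offset: int) -> None:
    """Emulate the NIC side of protocol agnostic transmit checksum offload.

    The host is assumed to have already stored the pseudo header checksum
    in the checksum field, so it is included in the sum computed here.
    """
    region = bytes(packet[csum_start:csum_start + csum_len])
    csum = (~ones_complement_sum(region)) & 0xFFFF
    packet[csum_offset:csum_offset + 2] = csum.to_bytes(2, "big")
```

The NIC never parses the packet in this model; it only needs the three descriptor values, which is what makes the method usable for an inner transport checksum inside GUE.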
In the case of GUE, the checksum for an encapsulated transport layer | In the case of GUE, the checksum for an encapsulated transport layer | |||
packet, a TCP packet for instance, can be offloaded by setting the | packet, a TCP packet for instance, can be offloaded by setting the | |||
appropriate checksum parameters. | appropriate checksum parameters. | |||
NICs typically can offload only one transmit checksum per packet, so | NICs typically can offload only one transmit checksum per packet, so | |||
simultaneously offloading both an inner transport packet's checksum | simultaneously offloading both an inner transport packet's checksum | |||
and the outer UDP checksum is likely not possible. | and the outer UDP checksum is likely not possible. | |||
If an encapsulator is co-resident with a host, then checksum offload | If an encapsulator is co-resident with a host, then checksum offload | |||
may be performed using remote checksum offload (described in | may be performed using remote checksum offload (described in | |||
[GUEEXTENS]). Remote checksum offload relies on NIC offload of the | [GUEEXTENS]). Remote checksum offload relies on NIC offload of the | |||
simple UDP/IP checksum which is commonly supported even in legacy | simple UDP/IP checksum which is commonly supported even in legacy | |||
devices. In remote checksum offload the outer UDP checksum is set and | devices. In remote checksum offload, the outer UDP checksum is set | |||
the GUE header includes an option indicating the start and offset of | and the GUE header includes an option indicating the start and offset | |||
the inner "offloaded" checksum. The inner checksum is initialized to | of the inner "offloaded" checksum. The inner checksum is initialized | |||
the pseudo header checksum. When a decapsulator receives a GUE packet | to the pseudo header checksum. When a decapsulator receives a GUE | |||
with the remote checksum offload option, it completes the offload | packet with the remote checksum offload option, it completes the | |||
operation by determining the packet checksum from the indicated start | offload operation by determining the packet checksum from the | |||
point to the end of the packet, and then adds this into the checksum | indicated start point to the end of the packet, and then adds this | |||
field at the offset given in the option. Computing the checksum from | into the checksum field at the offset given in the option. Computing | |||
the start to end of packet is efficient if checksum-complete is | the checksum from the start to end of packet is efficient if | |||
provided on the receiver. | checksum-complete is provided on the receiver. | |||
Another alternative when an encapsulator is co-resident with a host | Another alternative when an encapsulator is co-resident with a host | |||
is to perform Local Checksum Offload [LCO]. In this method the inner | is to perform Local Checksum Offload [LCO]. In this method, the inner | |||
transport layer checksum is offloaded and the outer UDP checksum can | transport layer checksum is offloaded and the outer UDP checksum can | |||
be deduced based on the fact that the portion of the packet cover by | be deduced based on the fact that the portion of the packet covered | |||
the inner transport checksum will sum to zero (or at least the bit | by the inner transport checksum will sum to zero (or at least the bit | |||
wise not of the inner pseudo header). | wise "not" of the inner pseudo header). | |||
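The deduction used by Local Checksum Offload can be sketched as follows. This is a simplified model under stated assumptions: the function name and parameters are hypothetical, the buffer starts at the outer UDP header with the outer checksum field set to zero, and the inner checksum start lies at an even offset so that 16-bit words stay aligned.

```python
def ones_sum(data: bytes, initial: int = 0) -> int:
    """16-bit ones complement sum of data, folded to 16 bits."""
    if len(data) % 2:
        data += b"\x00"
    total = initial
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def lco_outer_checksum(udp_datagram: bytes, inner_csum_start: int,
                       inner_pseudo_sum: int, outer_pseudo_sum: int) -> int:
    """Compute the outer UDP checksum without reading the inner payload.

    Once the NIC fills in the inner checksum, the bytes from
    inner_csum_start to the end sum to the "not" of the inner pseudo
    header sum, so that value stands in for the whole covered region.
    """
    head = udp_datagram[:inner_csum_start]
    covered = (~inner_pseudo_sum) & 0xFFFF
    total = ones_sum(head, outer_pseudo_sum + covered)
    return (~total) & 0xFFFF
```

Only the headers in front of the inner transport header are summed by the host, so the cost does not grow with payload size.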
A.2.2. Receive checksum offload | A.2.2. Receive checksum offload | |||
GUE is compatible with NICs that perform a protocol agnostic receive | GUE is compatible with NICs that perform a protocol agnostic receive | |||
checksum (CHECKSUM_COMPLETE in Linux parlance). In this technique, a | checksum (CHECKSUM_COMPLETE in Linux parlance). In this technique, a | |||
NIC computes a ones complement checksum over all (or some predefined | NIC computes a ones complement checksum over all (or some predefined | |||
portion) of a packet. The computed value is provided to the host | portion) of a packet. The computed value is provided to the host | |||
stack in the packet's receive descriptor. The host driver can use | stack in the packet's receive descriptor. The host driver can use | |||
this checksum to "patch up" and validate any inner packet transport | this checksum to "patch up" and validate any inner packet transport | |||
checksum, as well as the outer UDP checksum if it is non-zero. | checksum, as well as the outer UDP checksum if it is non-zero. | |||
Many legacy NICs don't provide checksum-complete but instead provide | Many legacy NICs don't provide checksum-complete but instead provide | |||
an indication that a checksum has been verified (CHECKSUM_UNNECESSARY | an indication that a checksum has been verified (CHECKSUM_UNNECESSARY | |||
in Linux). Usually, such validation is only done for simple TCP/IP or | in Linux). Usually, such validation is only done for simple TCP/IP or | |||
UDP/IP packets. If a NIC indicates that a UDP checksum is valid, the | UDP/IP packets. If a NIC indicates that a UDP checksum is valid, the | |||
checksum-complete value for the UDP packet is the "not" of the pseudo | checksum-complete value for the UDP packet is the "not" of the pseudo | |||
header checksum. In this way, checksum-unnecessary can be converted | header checksum. In this way, checksum-unnecessary can be converted | |||
to checksum-complete. So if the NIC provides checksum-unnecessary for | to checksum-complete. So, if the NIC provides checksum-unnecessary | |||
the outer UDP header in an encapsulation, checksum conversion can be | for the outer UDP header in an encapsulation, checksum conversion can | |||
done so that the checksum-complete value is derived and can be used | be done so that the checksum-complete value is derived and can be | |||
by the stack to validate checksums in the encapsulated packet. | used by the stack to validate checksums in the encapsulated packet. | |||
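The checksum conversion described above can be modeled in a few lines. This is a sketch under assumptions: the helper names are hypothetical, the buffer starts at the outer UDP header, the NIC has already reported the outer UDP checksum as valid, and the inner transport header begins at an even offset.

```python
def ones_sum(data: bytes, initial: int = 0) -> int:
    """16-bit ones complement sum of data, folded to 16 bits."""
    if len(data) % 2:
        data += b"\x00"
    total = initial
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def inner_checksum_ok(udp_datagram: bytes, outer_pseudo_sum: int,
                      inner_start: int, inner_pseudo_sum: int) -> bool:
    """Validate an inner transport checksum using only a verified outer one."""
    # Conversion step: a valid outer UDP checksum implies the sum over the
    # UDP datagram equals the "not" of the outer pseudo header sum.
    complete = (~outer_pseudo_sum) & 0xFFFF
    # Subtract the bytes in front of the inner transport header
    # (ones complement subtraction is addition of the complement).
    head = ones_sum(udp_datagram[:inner_start])
    inner_sum = ones_sum(b"", complete + ((~head) & 0xFFFF))
    # A valid inner checksum makes pseudo header plus segment sum to 0xFFFF.
    return ones_sum(b"", inner_sum + inner_pseudo_sum) == 0xFFFF
```

Note that the payload bytes are never touched by the host in this model; only the outer headers up to the inner transport header are summed.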
A.3. Transmit Segmentation Offload | A.3. Transmit Segmentation Offload | |||
Transmit Segmentation Offload (TSO) is a NIC feature where a host | Transmit Segmentation Offload (TSO) is a NIC feature where a host | |||
provides a large (>MTU size) TCP packet to the NIC, which in turn | provides a large (greater than MTU size) TCP packet to the NIC, which | |||
splits the packet into separate segments and transmits each one. This | in turn splits the packet into separate segments and transmits each | |||
is useful to reduce CPU load on the host. | one. This is useful to reduce CPU load on the host. | |||
The process of TSO can be generalized as: | The process of TSO can be generalized as: | |||
- Split the TCP payload into segments which allow packets with | - Split the TCP payload into segments which allow packets with | |||
size less than or equal to MTU. | size less than or equal to MTU. | |||
- For each created segment: | - For each created segment: | |||
1. Replicate the TCP header and all preceding headers of the | 1. Replicate the TCP header and all preceding headers of the | |||
original packet. | original packet. | |||
skipping to change at page 35, line 27 ¶ | skipping to change at page 36, line 4 ¶ | |||
3. Set TCP sequence number to correctly reflect the offset of | 3. Set TCP sequence number to correctly reflect the offset of | |||
the TCP data in the stream. | the TCP data in the stream. | |||
4. Recompute and set any checksums that either cover the payload | 4. Recompute and set any checksums that either cover the payload | |||
of the packet or cover header which was changed by setting a | of the packet or cover header which was changed by setting a | |||
payload length. | payload length. | |||
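The general TSO process can be sketched as below. This is a simplified model, not NIC firmware: the `Segment` type and `tso_split` function are hypothetical names, headers are treated as an opaque byte string that is copied verbatim, and the rewriting of length fields and checksums inside the headers is left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    headers: bytes  # replicated TCP header and all preceding headers
    seq: int        # TCP sequence number for this segment
    payload: bytes  # at most mss bytes of the original TCP payload


def tso_split(headers: bytes, first_seq: int, payload: bytes,
              mss: int) -> List[Segment]:
    """Split one large TCP packet into MTU-sized segments."""
    if mss <= 0:
        raise ValueError("mss must be positive")
    segments = []
    for offset in range(0, len(payload), mss):
        chunk = payload[offset:offset + mss]
        # The sequence number reflects the offset of this chunk in the
        # stream; TCP sequence numbers wrap at 2**32.
        seq = (first_seq + offset) % 2**32
        segments.append(Segment(headers, seq, chunk))
    return segments
```

A real engine would additionally patch the length fields and recompute the checksums in each replicated header, as the steps above require.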
Following this general process, TSO can be extended to support TCP | Following this general process, TSO can be extended to support TCP | |||
encapsulation in GUE. For each segment the Ethernet, outer IP, UDP | encapsulation in GUE. For each segment the Ethernet, outer IP, UDP | |||
header, GUE header, inner IP header if tunneling, and TCP headers are | header, GUE header, inner IP header (if tunneling), and TCP headers | |||
replicated. Any packet length header fields need to be set properly | are replicated. Any packet length header fields need to be set | |||
(including the length in the outer UDP header), and checksums need to | properly (including the length in the outer UDP header), and | |||
be set correctly (including the outer UDP checksum if being used). | checksums need to be set correctly (including the outer UDP checksum | |||
if being used). | ||||
To facilitate TSO with GUE it is recommended that extension fields | To facilitate TSO with GUE, it is recommended that extension fields | |||
should not contain values that must be updated on a per segment | do not contain values that need to be updated on a per segment basis. | |||
basis-- for example, extension fields should not include checksums, | For example, extension fields should not include checksums, lengths, | |||
lengths, or sequence numbers that refer to the payload. If the GUE | or sequence numbers that refer to the payload. If the GUE header does | |||
header does not contain such fields then the TSO engine only needs to | not contain such fields then the TSO engine only needs to copy the | |||
copy the bits in the GUE header when creating each segment and does | bits in the GUE header when creating each segment and does not need | |||
not need to parse the GUE header. | to parse the GUE header. | |||
A.4. Large Receive Offload | A.4. Large Receive Offload | |||
Large Receive Offload (LRO) is a NIC feature where packets of a TCP | Large Receive Offload (LRO) is a NIC feature where packets of a TCP | |||
connection are reassembled, or coalesced, in the NIC and delivered to | connection are reassembled, or coalesced, in the NIC and delivered to | |||
the host as one large packet. This feature can reduce CPU utilization | the host as one large packet. This feature can reduce CPU utilization | |||
in the host. | in the host. | |||
LRO requires significant protocol awareness to be implemented | LRO requires significant protocol awareness to be implemented | |||
correctly and is difficult to generalize. Packets in the same flow | correctly and is difficult to generalize. Packets in the same flow | |||
skipping to change at page 36, line 15 ¶ | skipping to change at page 36, line 42 ¶ | |||
fabricate a single meaningful header from all the coalesced packets. | fabricate a single meaningful header from all the coalesced packets. | |||
The conservative approach to supporting LRO for GUE would be to | The conservative approach to supporting LRO for GUE would be to | |||
assign packets to the same flow only if they have identical five- | assign packets to the same flow only if they have identical five- | |||
tuple and were encapsulated the same way. That is the outer IP | tuple and were encapsulated the same way. That is the outer IP | |||
addresses, the outer UDP ports, GUE protocol, GUE flags and fields, | addresses, the outer UDP ports, GUE protocol, GUE flags and fields, | |||
and inner five tuple are all identical. | and inner five tuple are all identical. | |||
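The conservative matching rule can be expressed as equality on a composite key. The sketch below is illustrative only: the `GueFlowKey` type, its field names, and the choice to carry the GUE flags and fields as raw bytes are assumptions made for this example.

```python
from typing import NamedTuple, Tuple


class GueFlowKey(NamedTuple):
    outer_src: str
    outer_dst: str
    outer_sport: int
    outer_dport: int
    gue_proto: int
    gue_flags_and_fields: bytes  # raw GUE flags plus any optional fields
    inner_five_tuple: Tuple[str, str, int, int, int]


def can_coalesce(a: GueFlowKey, b: GueFlowKey) -> bool:
    """Coalesce only when every component of the key is identical."""
    return a == b
```

Comparing the GUE flags and fields byte for byte is stricter than necessary in some cases, but it avoids parsing individual extension fields in the coalescing path.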
Appendix B: Implementation considerations | Appendix B: Implementation considerations | |||
This appendix is informational and does not constitute a normative | ||||
part of this document. | ||||
B.1. Privileged ports | B.1. Privileged ports | |||
Using the source port to contain a flow entropy value disallows the | Using the source port to contain a flow entropy value disallows the | |||
security method of a receiver enforcing that the source port be a | security method of a receiver enforcing that the source port be a | |||
privileged port. Privileged ports are defined by some operating | privileged port. Privileged ports are defined by some operating | |||
systems to restrict source port binding. Unix, for instance, | systems to restrict source port binding. Unix, for instance, | |||
considers port numbers less than 1024 to be privileged. | considers port numbers less than 1024 to be privileged. | |||
Enforcing that packets are sent from a privileged port is widely | Enforcing that packets are sent from a privileged port is widely | |||
considered an inadequate security mechanism and has been mostly | considered an inadequate security mechanism and has been mostly | |||
deprecated. To approximate this behavior, an implementation could | deprecated. To approximate this behavior, an implementation could | |||
restrict a user from sending a packet destined to the GUE port | restrict a user from sending a packet destined to the GUE port | |||
without proper credentials. | without proper credentials. | |||
B.2. Setting flow entropy as a route selector | B.2. Setting flow entropy as a route selector | |||
An encapsulator generating flow entropy in the UDP source port may | An encapsulator generating flow entropy in the UDP source port could | |||
modulate the value to perform a type of multipath source routing. | modulate the value to perform a type of multipath source routing. | |||
Assuming that networking switches perform ECMP based on the flow | Assuming that networking switches perform ECMP based on the flow | |||
hash, a sender can affect the path by altering the flow entropy. For | hash, a sender can affect the path by altering the flow entropy. For | |||
instance, a host may store a flow hash in its PCB for an inner flow, | instance, a host can store a flow hash in its PCB for an inner flow, | |||
and may alter the value upon detecting that packets are traversing a | and might alter the value upon detecting that packets are traversing | |||
lossy path. Changing the flow entropy for a flow should be subject to | a lossy path. Changing the flow entropy for a flow SHOULD be subject | |||
hysteresis (at most once every thirty seconds) to limit the number of | to hysteresis (at most once every thirty seconds) to limit the number | |||
out of order packets. | of out of order packets. | |||
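The hysteresis on flow entropy changes can be sketched as a small rate limiter. The class and method names are hypothetical; only the thirty second interval comes from the text above, and how a lossy path is detected is outside the scope of this example.

```python
import time


class FlowEntropy:
    """Flow entropy value that can change at most once per interval."""

    MIN_INTERVAL = 30.0  # seconds, per the hysteresis guidance above

    def __init__(self, value: int):
        self.value = value & 0xFFFF
        self._last_change = None  # no change has happened yet

    def maybe_change(self, new_value: int, now: float = None) -> bool:
        """Apply new_value unless a change happened too recently.

        Returns True if the entropy was changed, False if suppressed.
        """
        if now is None:
            now = time.monotonic()
        if (self._last_change is not None
                and now - self._last_change < self.MIN_INTERVAL):
            return False
        self.value = new_value & 0xFFFF
        self._last_change = now
        return True
```

A caller would invoke `maybe_change` when it suspects the current path is lossy; suppressed requests leave the flow on its existing path, which bounds the amount of reordering.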
B.3. Hardware protocol implementation considerations | B.3. Hardware protocol implementation considerations | |||
A low level protocol, such as GUE, is likely to be of interest for | Low level data path protocols, such as GUE, are often supported in | |||
support by high speed network devices. Variable length header (VLH) | high speed network device hardware. Variable length header (VLH) | |||
protocols like GUE are often considered difficult to efficiently | protocols like GUE are often considered difficult to efficiently | |||
implement in hardware. In order to retain the important | implement in hardware. In order to retain the important | |||
characteristics of an extensible and robust protocol, hardware | characteristics of an extensible and robust protocol, hardware | |||
vendors may practice "constrained flexibility". In this model, only | vendors may practice "constrained flexibility". In this model, only | |||
certain combinations or protocol header parameterizations are | certain combinations or protocol header parameterizations are | |||
implemented in hardware fast path. Each such parameterization is | implemented in hardware fast path. Each such parameterization is | |||
fixed length so that the particular instance can be optimized as a | fixed length so that the particular instance can be optimized as a | |||
fixed length protocol. In the case of GUE this constitutes specific | fixed length protocol. In the case of GUE, this constitutes specific | |||
combinations of GUE flags, fields, and next protocol. The selected | combinations of GUE flags, fields, and next protocol. The selected | |||
combinations would naturally be the most common cases which form the | combinations would naturally be the most common cases which form the | |||
"fast path", and other combinations are assumed to take the "slow | "fast path", and other combinations are assumed to take the "slow | |||
path". | path". | |||
In time, needs and requirements of the protocol may change which may | In time, needs and requirements of the protocol may change which may | |||
manifest themselves as new parameterizations to be supported in the | manifest themselves as new parameterizations to be supported in the | |||
fast path. To allow this extensibility, a device practicing | fast path. To allow this extensibility, a device practicing | |||
constrained flexibility should allow the fast path parameterizations | constrained flexibility should allow the fast path parameterizations | |||
to be programmable. | to be programmable. | |||
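One way to picture constrained flexibility is a programmable table of fixed-length parameterizations. The sketch below is purely illustrative: `FastPathTable`, its methods, and the example flag and length values are assumptions, not anything defined by GUE or by a particular device.

```python
class FastPathTable:
    """Programmable set of GUE parameterizations handled in the fast path."""

    def __init__(self):
        # Maps (flags, next protocol) to the fixed header length in bytes
        # that the fast path expects for that parameterization.
        self._entries = {}

    def program(self, flags: int, proto: int, header_len: int) -> None:
        """Add or update a fast path parameterization."""
        self._entries[(flags, proto)] = header_len

    def classify(self, flags: int, proto: int, header_len: int) -> str:
        """Return "fast" for a programmed combination, else "slow"."""
        if self._entries.get((flags, proto)) == header_len:
            return "fast"
        return "slow"
```

Because each programmed entry implies a single header length, the fast path can treat that entry as a fixed-length protocol, while anything unprogrammed falls back to the slow path until the table is updated.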
Authors' Addresses | Authors' Addresses | |||
Tom Herbert | Tom Herbert | |||
Quantonium | ||||
1 Hacker Way | 4701 Patrick Henry | |||
Menlo Park, CA 94052 | Santa Clara, CA 95054 | |||
US | US | |||
Email: tom@herbertland.com | Email: tom@herbertland.com | |||
Lucy Yong | Lucy Yong | |||
Huawei USA | Huawei USA | |||
5340 Legacy Dr. | 5340 Legacy Dr. | |||
Plano, TX 75024 | Plano, TX 75024 | |||
US | US | |||
Email: lucy.yong@huawei.com | Email: lucy.yong@huawei.com | |||
Osama Zia | Osama Zia | |||
Microsoft | Microsoft | |||
End of changes. 179 change blocks. | ||||
518 lines changed or deleted | 517 lines changed or added | |||