--- 1/draft-ietf-intarea-gue-08.txt 2019-10-26 11:13:08.614041615 -0700 +++ 2/draft-ietf-intarea-gue-09.txt 2019-10-26 11:13:08.670042408 -0700 @@ -1,21 +1,21 @@ Internet Area WG T. Herbert Internet-Draft Quantonium Intended status: Standard track L. Yong -Expires April 6, 2020 Independent +Expires April 28, 2020 Independent O. Zia Microsoft - October 4, 2019 + October 26, 2019 Generic UDP Encapsulation - draft-ietf-intarea-gue-08 + draft-ietf-intarea-gue-09 Status of this Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet- Drafts. @@ -24,21 +24,21 @@ and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html - This Internet-Draft will expire on April 6, 2020. + This Internet-Draft will expire on April 28, 2020. Copyright Notice Copyright (c) 2019 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents @@ -74,77 +74,78 @@ 1.1. Applicability . . . . . . . . . . . . . . . . . . . . . . . 5 1.2. Terminology and acronyms . . . . . . . . . . . . . . . . . 6 1.3. Requirements Language . . . . . . . . . . . . . . . . . . . 7 2. Base packet format . . . . . . . . . . . . . . . . . . . . . . 8 2.1. GUE variant . . . . . . . . . . . . . . . . . . . . . . . . 8 3. 
Variant 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 3.1. Header format . . . . . . . . . . . . . . . . . . . . . . . 9 3.2. Proto/ctype field . . . . . . . . . . . . . . . . . . . . . 10 3.2.1. Proto field . . . . . . . . . . . . . . . . . . . . . . 10 3.2.2. Ctype field . . . . . . . . . . . . . . . . . . . . . . 10 - 3.3. Flags and extension fields . . . . . . . . . . . . . . . . 11 - 3.3.1. Requirements . . . . . . . . . . . . . . . . . . . . . 11 + 3.3. Flags and extension fields . . . . . . . . . . . . . . . . 12 + 3.3.1. Requirements . . . . . . . . . . . . . . . . . . . . . 12 3.3.2. Example GUE header with extension fields . . . . . . . 12 - 3.4. Surplus space . . . . . . . . . . . . . . . . . . . . . . . 12 + 3.4. Surplus space . . . . . . . . . . . . . . . . . . . . . . . 13 3.5. Message types . . . . . . . . . . . . . . . . . . . . . . . 13 3.5.1. Control messages . . . . . . . . . . . . . . . . . . . 13 - 3.5.2. Data messages . . . . . . . . . . . . . . . . . . . . . 13 - 4. Variant 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 - 4.1. Direct encapsulation of IPv4 . . . . . . . . . . . . . . . 14 - 4.2. Direct encapsulation of IPv6 . . . . . . . . . . . . . . . 15 - 5. Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 - 5.1. Network tunnel encapsulation . . . . . . . . . . . . . . . 16 - 5.2. Transport layer encapsulation . . . . . . . . . . . . . . . 16 - 5.3. Encapsulator operation . . . . . . . . . . . . . . . . . . 17 - 5.4. Decapsulator operation . . . . . . . . . . . . . . . . . . 17 - 5.4.1. Processing a received data message . . . . . . . . . . 17 - 5.4.2. Processing a received control message . . . . . . . . . 18 - 5.5. Middlebox inspection . . . . . . . . . . . . . . . . . . . 18 - 5.6. Router and switch operation . . . . . . . . . . . . . . . . 19 - 5.6.1. Connection semantics . . . . . . . . . . . . . . . . . 19 - 5.6.2. NAT . . . . . . . . . . . . . . . . . . . . . . . . . . 20 - 5.7. 
MTU and fragmentation . . . . . . . . . . . . . . . . . . . 20 - 5.8. UDP Checksum Handling . . . . . . . . . . . . . . . . . . . 20 - 5.8.1. UDP Checksum with IPv4 . . . . . . . . . . . . . . . . 20 - 5.8.2. UDP Checksum with IPv6 . . . . . . . . . . . . . . . . 21 - 5.9. Congestion Considerations . . . . . . . . . . . . . . . . . 24 - 5.9.1. GUE tunnels . . . . . . . . . . . . . . . . . . . . . . 24 - 5.9.2 Transport layer encapsulation . . . . . . . . . . . . . 25 - 5.10. Multicast . . . . . . . . . . . . . . . . . . . . . . . . 25 - 5.11. Flow entropy for ECMP . . . . . . . . . . . . . . . . . . 25 - 5.11.1. Flow classification . . . . . . . . . . . . . . . . . 25 - 5.11.2. Flow entropy properties . . . . . . . . . . . . . . . 26 - 5.12. Negotiation of acceptable flags and extension fields . . . 27 - 6. Motivation for GUE . . . . . . . . . . . . . . . . . . . . . . 27 - 6.1. Benefits of GUE . . . . . . . . . . . . . . . . . . . . . . 27 - 6.2. Comparison of GUE to other encapsulations . . . . . . . . . 28 - 7. Security Considerations . . . . . . . . . . . . . . . . . . . . 30 - 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 30 - 8.1. UDP source port . . . . . . . . . . . . . . . . . . . . . . 30 - 8.2. GUE variant number . . . . . . . . . . . . . . . . . . . . 31 - 8.3. Control types . . . . . . . . . . . . . . . . . . . . . . . 31 - 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 31 - 10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 32 - 10.1. Normative References . . . . . . . . . . . . . . . . . . . 32 - 10.2. Informative References . . . . . . . . . . . . . . . . . . 33 - Appendix A: NIC processing for GUE . . . . . . . . . . . . . . . . 36 - A.1. Receive multi-queue . . . . . . . . . . . . . . . . . . . . 36 - A.2. Checksum offload . . . . . . . . . . . . . . . . . . . . . 36 - A.2.1. Transmit checksum offload . . . . . . . . . . . . . . . 37 - A.2.2. Receive checksum offload . . . . . . . . . . . . . 
. . 37 - A.3. Transmit Segmentation Offload . . . . . . . . . . . . . . . 38 - A.4. Large Receive Offload . . . . . . . . . . . . . . . . . . . 39 - Appendix B: Implementation considerations . . . . . . . . . . . . 39 - B.1. Priveleged ports . . . . . . . . . . . . . . . . . . . . . 39 - B.2. Setting flow entropy as a route selector . . . . . . . . . 40 - B.3. Hardware protocol implementation considerations . . . . . . 40 - Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 41 + 3.5.2. Data messages . . . . . . . . . . . . . . . . . . . . . 14 + 4. Variant 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 + 4.1. Direct encapsulation of IPv4 . . . . . . . . . . . . . . . 15 + 4.2. Direct encapsulation of IPv6 . . . . . . . . . . . . . . . 16 + 5. Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 + 5.1. Network tunnel encapsulation . . . . . . . . . . . . . . . 17 + 5.2. Transport layer encapsulation . . . . . . . . . . . . . . . 17 + 5.3. Encapsulator operation . . . . . . . . . . . . . . . . . . 18 + 5.4. Decapsulator operation . . . . . . . . . . . . . . . . . . 18 + 5.4.1. Processing a received data message . . . . . . . . . . 18 + 5.4.2. Processing a received control message . . . . . . . . . 19 + 5.5. Middlebox inspection . . . . . . . . . . . . . . . . . . . 19 + 5.6. Router and switch operation . . . . . . . . . . . . . . . . 20 + 5.6.1. Connection semantics . . . . . . . . . . . . . . . . . 20 + 5.6.2. NAT . . . . . . . . . . . . . . . . . . . . . . . . . . 21 + 5.7. MTU and fragmentation . . . . . . . . . . . . . . . . . . . 21 + 5.8. UDP Checksum Handling . . . . . . . . . . . . . . . . . . . 21 + 5.8.1. UDP Checksum with IPv4 . . . . . . . . . . . . . . . . 21 + 5.8.2. UDP Checksum with IPv6 . . . . . . . . . . . . . . . . 22 + 5.9. Congestion Considerations . . . . . . . . . . . . . . . . . 25 + 5.9.1. GUE tunnels . . . . . . . . . . . . . . . . . . . . . . 25 + 5.9.2 Transport layer encapsulation . . . . . . . . 
. . . . . 26 + 5.10. Multicast . . . . . . . . . . . . . . . . . . . . . . . . 26 + 5.11. Flow entropy for ECMP . . . . . . . . . . . . . . . . . . 26 + 5.11.1. Flow classification . . . . . . . . . . . . . . . . . 26 + 5.11.2. Flow entropy properties . . . . . . . . . . . . . . . 27 + 5.12. Negotiation of acceptable flags and extension fields . . . 28 + 6. Motivation for GUE . . . . . . . . . . . . . . . . . . . . . . 28 + 6.1. Benefits of GUE . . . . . . . . . . . . . . . . . . . . . . 28 + 6.2. Comparison of GUE to other encapsulations . . . . . . . . . 29 + 7. Security Considerations . . . . . . . . . . . . . . . . . . . . 31 + 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 31 + 8.1. UDP source port . . . . . . . . . . . . . . . . . . . . . . 31 + 8.2. GUE variant number . . . . . . . . . . . . . . . . . . . . 32 + 8.3. Control types . . . . . . . . . . . . . . . . . . . . . . . 32 + 8.4 Control Type Experimental Identifiers . . . . . . . . . . . 32 + 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 33 + 10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 34 + 10.1. Normative References . . . . . . . . . . . . . . . . . . . 34 + 10.2. Informative References . . . . . . . . . . . . . . . . . . 35 + Appendix A: NIC processing for GUE . . . . . . . . . . . . . . . . 38 + A.1. Receive multi-queue . . . . . . . . . . . . . . . . . . . . 38 + A.2. Checksum offload . . . . . . . . . . . . . . . . . . . . . 38 + A.2.1. Transmit checksum offload . . . . . . . . . . . . . . . 39 + A.2.2. Receive checksum offload . . . . . . . . . . . . . . . 39 + A.3. Transmit Segmentation Offload . . . . . . . . . . . . . . . 40 + A.4. Large Receive Offload . . . . . . . . . . . . . . . . . . . 41 + Appendix B: Implementation considerations . . . . . . . . . . . . 41 + B.1. Priveleged ports . . . . . . . . . . . . . . . . . . . . . 41 + B.2. Setting flow entropy as a route selector . . . . . . . . . 42 + B.3. 
Hardware protocol implementation considerations . . . . . . 42 + Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 43 1. Introduction This specification describes Generic UDP Encapsulation (GUE) which is a general method for encapsulating packets of arbitrary IP protocols within User Datagram Protocol (UDP) [RFC0768] packets. Encapsulating packets in UDP facilitates efficient transport across networks. Networking devices widely provide protocol specific processing and optimizations for UDP (as well as TCP) packets. Packets for atypical IP protocols (those not usually parsed by networking hardware) can be @@ -393,35 +394,71 @@ the GUE payload does not begin with the header of an IP protocol. This would be the case, for instance, if the GUE payload were a fragment when performing GUE level fragmentation. The interpretation of the payload is performed through other means such as flags and extension fields, and nodes MUST NOT parse packets based on the IP protocol number in this case. 3.2.2. Ctype field When the C-bit is set, the proto/ctype field MUST be set to a valid - control message type. A value of zero indicates that the GUE payload - requires further interpretation to deduce the control type. This - might be the case when the payload is a fragment of a control - message, where only the reassembled packet can be interpreted as a - control message. + control message type. Control messages will be defined in an IANA + registry. Type 0 and type 255 are specified in this document, type 1 + through 254 are reserved and may be defined in standards. - Control messages will be defined in an IANA registry. Control message - types 1 through 127 may be defined in standards. Types 128 through - 255 are reserved to be user defined for experimentation. + Type 0 indicates that the GUE payload is a control message, or part + of a control message that cannot be correctly parsed or interpreted + without additional context. 
This might be the case when the payload + is a fragment of a control message, where only the reassembled packet + can be interpreted as a control message. - This document does not specify any standard control message types - other than type 0. Type 0 indicates that the GUE payload is a control - message, or part of a control message (as might be the case in GUE - fragmentation) that cannot be correctly parsed or interpreted without - additional context. + Type 255 is reserved for experimentation. When this control type is + set, the first four bytes of the GUE payload (control message) are an + experiment identifier (ExID). The ExID is used to differentiate + experiments (similar to the experimental identifier defined for TCP + options in [RFC6994]). A control message of type 255 MUST include an + ExID. + + The format of a GUE control message with the experimental control + message type is: + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+\ + | Source port | Destination port | | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ UDP + | Length | Checksum | | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+/ + | 0 |1| Hlen | 255 | Flags |\ + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | + | | GUE + ~ Extensions Fields (optional) ~ | + | | | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+/ + | ExID | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | | + ~ Control message ~ + | | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Note that the ExID is not part of the GUE header; it is in the + payload. In particular, the ExID is not accounted for in the GUE + Hlen. + + ExIDs are selected at design time, when the protocol designer first + implements or specifies the experimental control message. An ExID is + thirty-two bits.
The value is stored in the header in network- + standard (big-endian) byte order. + + ExIDs are registered with IANA using "first come, first served" + (FCFS) priority. ExIDs MUST be unique. 3.3. Flags and extension fields Flags and associated extension fields are the primary mechanism of extensibility in GUE. As mentioned in section 3.1, GUE header flags indicate the presence of optional extension fields in the GUE header. [GUEEXTEN] defines an initial set of GUE extensions. 3.3.1. Requirements @@ -735,20 +772,31 @@ This packet is then resubmitted into the protocol stack to be processed as an IPv4 encapsulated packet. 5.4.2. Processing a received control message If a valid control message is received, the packet MUST be processed as a control message. The specific processing to be performed depends on the value in the ctype field of the GUE header. + If an experimental control message is received (ctype is 255) then + the ExID MUST be processed. The ExID is used to identify the + particular experimental control message. + + If a receiver does not recognize a control message type, or an + experimental identifier in an experimental control message, then the + packet MUST be dropped and an error message MAY be logged. If a GUE + control message is received with control type 255 and the length of + the GUE payload is less than four bytes, the size of the ExID, then the + packet MUST be dropped and an error message MAY be logged. + 5.5. Middlebox inspection A middlebox MAY inspect a GUE header. A middlebox MUST NOT modify a GUE header or UDP payload. To inspect a GUE header, a middlebox needs to identify GUE packets. The obvious method is to match the destination UDP port number to be the GUE port number (i.e. 6080). Per [RFC7605], transport port numbers only have meaning at the endpoints of communications, so inferring the type of a UDP payload based on port number may be @@ -1320,36 +1369,73 @@ are assigned in accordance with RFC Required policy [RFC5226].
+----------------+------------------+---------------+ | Control type | Description | Reference | +----------------+------------------+---------------+ | 0 | Control payload | This document | | | needs more | | | | context for | | | | interpretation | | | | | | - | 1..127 | Unassigned | | + | 1..254 | Unassigned | | | | | | - | 128..255 | Experimental | This document | + | 255 | Experimental | This document | +----------------+------------------+---------------+ +8.4 Control Type Experimental Identifiers + + IANA is requested to create a "GUE Control Type Experimental + Identifiers (GUE Control ExIDs)" registry. The registry records 32- + bit ExIDs, as well as a reference (description, document pointer, + assignee name, and e-mail contact) for each entry. + + Entries are assigned on a First Come, First Served (FCFS) basis + [RFC5226]. The registry operates FCFS on the entire ExID (in network- + standard order). + + IANA will advise applicants of duplicate entries to select an + alternate value, as per typical FCFS processing. + + IANA will record known duplicate uses to assist the community in both + debugging assigned uses as well as correcting unauthorized duplicate + uses. + + IANA should impose no requirements on making a registration other + than indicating the desired codepoint and providing a point of + contact. A short description or acronym for the use is desired but + should not be required. + + Initial assignments are: + + +----------------+----------------+---------------+ + | ExID | Description | Reference | + +----------------+----------------+---------------+ + | 1..0xffffffff | Unassigned | | + +----------------+----------------+---------------+ + 9. Acknowledgements The authors would like to thank David Liu, Erik Nordmark, Fred Templin, Adrian Farrel, Bob Briscoe, Murray Kucherawy, Mirja - Kuhlewind, and David Black for valuable input on this draft. Special - thanks to Fred Templin who is serving as document shepherd.
+ Kuhlewind, David Black, Joe Touch, and Greg Mirsky for valuable input + on this draft. Special thanks to Fred Templin who is serving as + document shepherd. 10. References 10.1. Normative References + [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, DOI + 10.17487/RFC2119, March 1997. + [RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768, DOI 10.17487/RFC0768, August 1980. [RFC8085] Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085, March 2017. [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI @@ -1390,20 +1476,24 @@ 6335, DOI 10.17487/RFC6335, August 2011. [RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA Considerations Section in RFCs", RFC 5226, DOI 10.17487/RFC5226, May 2008. 10.2. Informative References + [RFC6994] Touch, J., "Shared Use of Experimental TCP Options", RFC + 6994, DOI 10.17487/RFC6994, August 2013. + [RFC8086] Yong, L., Ed., Crabbe, E., Xu, X., and T. Herbert, "GRE- in-UDP Encapsulation", RFC 8086, DOI 10.17487/RFC8086, March 2017. [RFC7605] Touch, J., "Recommendations on Using Assigned Transport Port Numbers", BCP 165, RFC 7605, DOI 10.17487/RFC7605, August 2015. [RFC4787] Audet, F., Ed., and C. Jennings, "Network Address Translation (NAT) Behavioral Requirements for Unicast