DNSEXT Working Group                                            M. Graff
Internet-Draft                                                  P. Vixie
Obsoletes: 2671, 2673                        Internet Systems Consortium
(if approved)                                             March 25, 2010
Intended status: Standards Track
Expires: September 26, 2010

                  Extension Mechanisms for DNS (EDNS0)
                 draft-ietf-dnsext-rfc2671bis-edns0-03

Abstract

   The Domain Name System's wire protocol includes a number of fixed
   fields whose range has been or soon will be exhausted and does not
   allow requestors to advertise their capabilities to responders.  This
   document describes backward compatible mechanisms for allowing the
   protocol to grow.

   This document updates the EDNS0 specification (RFC2671) based on 10
   years of deployment experience.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on September 26, 2010.

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  EDNS Support Requirement
   4.  Affected Protocol Elements
     4.1.  Message Header
     4.2.  Label Types
     4.3.  UDP Message Size
   5.  Extended Label Types
   6.  OPT pseudo-RR
     6.1.  OPT Record Definition
     6.2.  OPT Record Format
     6.3.  Caching behavior
     6.4.  Fallback
     6.5.  Requestor's Payload Size
     6.6.  Responder's Payload Size
     6.7.  Payload Size Selection
     6.8.  Middleware Boxes
     6.9.  OPT Record TTL Field Use
     6.10. Flags
     6.11. OPT Options Code Allocation Procedure
   7.  Transport Considerations
   8.  Security Considerations
   9.  IANA Considerations
   Acknowledgements
   Appendix A.   Document Editing History
   Appendix A.1. Changes since RFC2671
   Appendix A.2. Changes since -02
   10. References
     10.1. Normative References
     10.2. Informative References
   Authors' Addresses

1.  Introduction

   DNS [RFC1035] specifies a Message Format and within such messages
   there are standard formats for encoding options, errors, and name
   compression.  The maximum allowable size of a DNS Message is fixed.
   Many of DNS's protocol limits are too small for uses which are or
   which are desired to become common.  There is no way for
   implementations to advertise their capabilities.

   Unextended agents will not know how to interpret the protocol
   extensions detailed here.  In practice, these clients will be
   upgraded when they have need of a new feature, and only new features
   will make use of the extensions.  Extended agents must be prepared
   for the behavior of unextended clients in the face of new protocol
   elements, and fall back gracefully to unextended DNS.  [RFC2671]
   originally proposed extensions to the basic DNS protocol to overcome
   these deficiencies.  This memo refines that specification and
   obsoletes [RFC2671].

   [RFC2671] specified extended label types.  The only one ever proposed
   was in RFC2673 for a label type called "Bitstring Labels."  For
   various reasons introducing a new label type was found to be
   extremely difficult, and RFC2673 was moved to Experimental.  This
   document obsoletes Extended Labels.

2.  Terminology

   "Requestor" is the side which sends a request.  "Responder" is an
   authoritative server, recursive resolver, or other DNS component
   which responds to questions.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

3.  EDNS Support Requirement

   EDNS support is mandatory in a modern world.  DNSSEC requires EDNS
   support, and many other features can be requested or advertised only
   through EDNS.  Many organizations are beginning to require DNSSEC.
   Without common interoperability, DNSSEC cannot be as easily deployed.

   DNS publishers want to put more data in answers.  DNSSEC
   DNSKEY records, negative answers, and many other DNSSEC queries cause
   larger answers to be returned.  In order to support this, DNS
   servers, middleware, and stub resolvers MUST support larger packet
   sizes advertised via EDNS0.

4.  Affected Protocol Elements

4.1.  Message Header

   The second full 16-bit word of the DNS Message Header (see [RFC1035],
   section 4.1.1) is divided into a 4-bit OPCODE, a 4-bit RCODE, and a
   number of 1-bit flags.  The original reserved Z bits have been
   allocated to various purposes, and most of the RCODE values are now
   in use.  More flags and more possible RCODEs are needed.  The OPT
   pseudo-RR specified below contains subfields that carry a bit field
   extension of the RCODE field and additional flag bits, respectively.

4.2.  Label Types

   The first two bits of a wire format domain label are used to denote
   the type of the label.  [RFC1035], section 4.1.4, allocates two of
   the four possible types and reserves the other two.  More label types
   were defined in [RFC2671].

4.3.  UDP Message Size

   Traditional DNS Messages are limited to 512 octets in size when sent
   over UDP ([RFC1035]).  Today, many organizations wish to return many
   records in a single reply, and special tricks are needed to make the
   responses fit in this 512-octet limit.  Additionally, DNSSEC
   signatures can easily generate a much larger response than a
   512-octet message can hold.

   EDNS0 is intended to address these larger packet sizes and continue
   to use UDP.  It specifies a way to advertise additional features such
   as larger response size capability, which is intended to help avoid
   truncated UDP responses which then cause retry over TCP.

5.  Extended Label Types

   The first octet in the on-the-wire representation of a DNS label
   specifies the label type; the basic DNS specification [RFC1035]
   dedicates the two most significant bits of that octet for this
   purpose.

   [RFC2671] defined DNS label type 0b01 for use as an indication for
   Extended Label Types.  A specific Extended Label Type is selected by
   the 6 least significant bits of the first octet.  Thus, Extended
   Label Types are indicated by the values 64-127 (0b01xxxxxx) in the
   first octet of the label.
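
   As a non-normative illustration, the classification by the two most
   significant bits might be sketched in Python as follows.  The
   function names and the returned strings are assumptions made for
   this example only; they are not defined by this document.

```python
def label_type(first_octet: int) -> str:
    """Classify a label by the two most significant bits of its first octet."""
    if not 0 <= first_octet <= 0xFF:
        raise ValueError("first octet must be in 0..255")
    top_bits = first_octet >> 6
    if top_bits == 0b00:
        return "normal"        # RFC 1035: low 6 bits are the label length
    if top_bits == 0b11:
        return "compression"   # RFC 1035: compression pointer
    if top_bits == 0b01:
        return "extended"      # RFC 2671: Extended Label Type indicator
    return "reserved"          # 0b10 remains unallocated


def extended_label_selector(first_octet: int) -> int:
    """Return the 6 least significant bits that select the extended type."""
    if label_type(first_octet) != "extended":
        raise ValueError("not an Extended Label Type octet")
    return first_octet & 0x3F
```

   Under these assumptions, every first octet in the range 64-127 is
   reported as "extended", matching the 0b01xxxxxx pattern.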

   This document does not describe any specific Extended Label Type.

   In practice, Extended Label Types are difficult to use due to limited
   support in clients and intermediate gateways.  They cause
   interoperability problems, and at present no defined label types are
   in use.  Therefore, the registry of Extended Label Types is requested
   to be closed.

   Bitstring labels were originally created to solve problems with IPv6
   reverse zones.  Due to the problems of introducing a new label type,
   they were moved to experimental.  This document moves them from
   experimental to historical, making them obsolete.

6.  OPT pseudo-RR

6.1.  OPT Record Definition

   An OPT pseudo-RR (sometimes called a meta-RR) MAY be added to the
   additional data section of a request.

   The OPT RR has been assigned RR type 41.

   If an OPT record is present in a request, compliant responders MUST
   include an OPT record in non-truncated responses, and SHOULD attempt
   to include one in all responses.

   An OPT record does not carry any DNS data.  It is used only to
   contain control information pertaining to the question and answer
   sequence of a specific transaction.  OPT RRs MUST NOT be cached,
   forwarded, or stored in or loaded from master files.

   The OPT RR MAY be placed anywhere within the additional data section.
   Only one OPT RR MAY be included within any DNS message.  If a message
   with more than one OPT RR is received, a FORMERR MUST be returned.
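
   The single-OPT rule can be illustrated with a short, non-normative
   Python sketch.  It assumes the additional data section has already
   been parsed into a list of RR type codes; that representation and the
   function name are hypothetical.  FORMERR is RCODE 1 per [RFC1035].

```python
OPT_RR_TYPE = 41   # RR type assigned to OPT
NOERROR = 0
FORMERR = 1        # RCODE value from RFC 1035


def check_opt_count(additional_rr_types):
    """Return FORMERR if more than one OPT RR is present, else NOERROR."""
    opt_count = sum(1 for rr_type in additional_rr_types
                    if rr_type == OPT_RR_TYPE)
    return FORMERR if opt_count > 1 else NOERROR
```

   Because the OPT RR may appear anywhere in the additional data
   section, the sketch counts occurrences rather than inspecting a fixed
   position.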

6.2.  OPT Record Format

   An OPT RR has a fixed part and a variable set of options expressed as
   {attribute, value} pairs.  The fixed part holds some DNS meta data
   and also a small collection of basic extension elements which we
   expect to be so popular that it would be a waste of wire space to
   encode them as {attribute, value} pairs.

   The fixed part of an OPT RR is structured as follows:

       +------------+--------------+------------------------------+
       | Field Name | Field Type   | Description                  |
       +------------+--------------+------------------------------+
       | NAME       | domain name  | empty (root domain)          |
       | TYPE       | u_int16_t    | OPT                          |
       | CLASS      | u_int16_t    | requestor's UDP payload size |
       | TTL        | u_int32_t    | extended RCODE and flags     |
       | RDLEN      | u_int16_t    | describes RDATA              |
       | RDATA      | octet stream | {attribute,value} pairs      |
       +------------+--------------+------------------------------+

                               OPT RR Format

   The variable part of an OPT RR is encoded in its RDATA and is
   structured as zero or more of the following:

                  +0 (MSB)                            +1 (LSB)
       +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
    0: |                          OPTION-CODE                          |
       +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
    2: |                         OPTION-LENGTH                         |
       +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
    4: |                                                               |
       /                          OPTION-DATA                          /
       /                                                               /
       +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

   OPTION-CODE
         Assigned by Expert Review.

   OPTION-LENGTH
         Size (in octets) of OPTION-DATA.

   OPTION-DATA
         Varies per OPTION-CODE.
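
   The following non-normative Python sketch encodes the fixed and
   variable parts described above into wire format.  The function names
   are assumptions made for this example, and the option code used in
   the usage note is a placeholder rather than an assigned value.

```python
import struct

OPT_RR_TYPE = 41  # RR type assigned to OPT


def encode_option(code: int, data: bytes) -> bytes:
    """Encode one option: OPTION-CODE, OPTION-LENGTH, OPTION-DATA."""
    if not 0 <= code <= 0xFFFF or len(data) > 0xFFFF:
        raise ValueError("option code or length does not fit in 16 bits")
    return struct.pack("!HH", code, len(data)) + data


def encode_opt_rr(udp_payload_size: int, ttl: int = 0, options=()) -> bytes:
    """Encode a complete OPT RR; options is an iterable of (code, data)."""
    rdata = b"".join(encode_option(code, data) for code, data in options)
    # TYPE, CLASS (UDP payload size), TTL (extended RCODE and flags), RDLEN
    fixed = struct.pack("!HHIH", OPT_RR_TYPE, udp_payload_size, ttl,
                        len(rdata))
    return b"\x00" + fixed + rdata  # NAME is the root domain: one zero octet
```

   For example, encode_opt_rr(4096) yields an 11-octet record with an
   empty RDATA, and encode_opt_rr(1280, options=[(65001, b"ab")]) appends
   one 6-octet option under the placeholder code 65001.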

   The order of appearance of option tuples is not guaranteed.  If one
   option modifies the behavior of another or multiple options are
   related to one another in some way, they have the same effect
   regardless of ordering in the RDATA wire encoding.

   Any OPTION-CODE values not understood by a responder or requestor
   MUST be ignored.  Specifications of such options might wish to
   include some kind of signaled acknowledgement.  For example, an
   option specification might say that if a responder sees option XYZ,
   it MUST include option XYZ in its response.
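
   A non-normative Python sketch of walking the RDATA and discarding
   options that are not understood is shown below.  The function names
   and the set of "understood" codes are hypothetical.  Raising
   ValueError on malformed RDATA is a choice made for this example; a
   responder would presumably map it to the error handling its
   implementation uses for a badly formatted OPT record.

```python
import struct


def parse_options(rdata: bytes):
    """Split OPT RDATA into a list of (option_code, option_data) tuples."""
    options = []
    offset = 0
    while offset < len(rdata):
        if len(rdata) - offset < 4:
            raise ValueError("truncated option header")
        code, length = struct.unpack_from("!HH", rdata, offset)
        offset += 4
        if len(rdata) - offset < length:
            raise ValueError("OPTION-LENGTH exceeds remaining RDATA")
        options.append((code, rdata[offset:offset + length]))
        offset += length
    return options


def understood_options(options, understood_codes):
    """Keep only options whose code is understood; ignore all others."""
    return [(code, data) for code, data in options
            if code in understood_codes]
```

   Filtering after parsing keeps the walk over RDATA independent of
   which options a given implementation supports, so the result does not
   depend on the order in which options appear.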

6.3.  Caching behavior

   The OPT record MUST NOT be cached.

6.4.  Fallback

   If a requestor detects that the remote end does not support EDNS0, it
   MAY issue queries without an OPT record.  It MAY cache this knowledge
   for a brief time in order to avoid fallback delays in the future.
   However, if DNSSEC is required, no fallback should be performed as
   DNSSEC is only signaled through EDNS0.

6.5.  Requestor's Payload Size

   The requestor's UDP payload size (which OPT stores in the RR CLASS
   field) is the number of octets of the largest UDP payload that can be
   reassembled and delivered in the requestor's network stack.  Note
   that path MTU, with or without fragmentation, may be smaller than
   this.  Values lower than 512 MUST be treated as equal to 512.
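
   The lower bound on the advertised size can be expressed in a single
   line.  The following non-normative Python sketch uses a hypothetical
   function name.

```python
MIN_UDP_PAYLOAD = 512  # floor applied to any advertised value


def effective_payload_size(advertised: int) -> int:
    """Treat advertised sizes lower than 512 as equal to 512."""
    return max(advertised, MIN_UDP_PAYLOAD)
```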

   Requestors SHOULD place a value in this field that they can actually
   receive.  For example, if a requestor sits behind a firewall which
   will block fragmented IP packets, a requestor SHOULD NOT choose a
   value which will cause fragmentation.  Doing so will prevent large
   responses from being received, and can cause fallback to occur.

   Note that a 512-octet UDP payload requires a 576-octet IP reassembly
   buffer.  Choosing 1280 for IPv4 over Ethernet would be reasonable.
   Choosing a very large value will guarantee fragmentation at the IP
   layer, and may prevent answers from being received due to a single
   fragment loss or misconfigured firewalls.

   The requestor's maximum payload size can change over time.  It MUST
   NOT be cached for use beyond the transaction in which it is
   advertised.

6.6.  Responder's Payload Size

   The responder's maximum payload size can change over time, but can be
   reasonably expected to remain constant between two sequential
   transactions; for example, a meaningless QUERY to discover a
   responder's maximum UDP payload size, followed immediately by an
   UPDATE which takes advantage of this size.  This is considered
   preferable to the outright use of TCP for oversized requests, if
   there is any reason to suspect that the responder implements EDNS,
   and if a request will not fit in the default 512-octet payload size
   limit.

6.7.  Payload Size Selection

   Due to transaction overhead, it is unwise to advertise an
   architectural limit as a maximum UDP payload size.  Just because your
   stack can reassemble 64KB datagrams, don't assume that you want to
   spend more than about 4KB of state memory per ongoing transaction.

   A requestor MAY choose to implement a fallback to smaller advertised
   sizes to work around firewall or other network limitations.  A
   requestor SHOULD choose to use a fallback mechanism which begins with
   a large size, such as 4096.  If that fails, a fallback around the
   1280 byte range SHOULD be tried, as it has a reasonable chance to fit
   within a single Ethernet frame.  Failing that, a requestor MAY choose
   a 512 byte packet, which with large answers may cause a TCP retry.
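
   The fallback sequence might be sketched as follows.  This is a
   non-normative Python illustration: the send_query callback, which is
   assumed to return None when no usable response arrives for the given
   advertised size, is hypothetical, as is the exact list of sizes.

```python
from typing import Callable, Optional, Tuple

# Sizes to advertise, largest first (an assumption for this sketch).
FALLBACK_SIZES = (4096, 1280, 512)


def query_with_fallback(
    send_query: Callable[[int], Optional[bytes]],
    sizes=FALLBACK_SIZES,
) -> Optional[Tuple[int, bytes]]:
    """Try each advertised size in turn; return (size, response) or None."""
    for size in sizes:
        response = send_query(size)
        if response is not None:
            return size, response
    return None  # every advertised size failed
```

   A caller would supply its own transport function as send_query and
   may retry over TCP if the final 512-octet attempt yields a truncated
   answer.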

6.8.  Middleware Boxes

   Middleware boxes MUST NOT limit DNS messages over UDP to 512 bytes.

   Middleware boxes which simply forward requests to a recursive
   resolver MUST NOT modify the OPT record contents in either direction.

   Middleware boxes which have additional functionality, such as
   answering certain queries or acting like an intelligent forwarder,
   MUST understand the OPT record.  These boxes MUST consider the
   incoming request and any outgoing requests as separate transactions
   if the characteristics of the messages are different.

6.9.  OPT Record TTL Field Use

   The extended RCODE and flags (which OPT stores in the RR TTL field)
   are structured as follows:

                  +0 (MSB)                            +1 (LSB)
       +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
    0: |         EXTENDED-RCODE        |            VERSION            |
       +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
    2: | DO|                           Z                               |
       +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

   EXTENDED-RCODE
         Forms upper 8 bits of extended 12-bit RCODE.  Note that
         EXTENDED-RCODE value 0 indicates that an unextended RCODE is in
         use (values 0 through 15).

   VERSION
         Indicates the implementation level of whoever sets it.  Full
         conformance with this specification is indicated by version
         "0".  Requestors are encouraged to set this to the lowest
         implemented level capable of expressing a transaction, to
         minimize the responder and network load of discovering the
         greatest common implementation level between requestor and
         responder.  A requestor's version numbering strategy MAY
         ideally be a run time configuration option.

         If a responder does not implement the VERSION level of the
         request, then it answers with RCODE=BADVERS.  All responses
         MUST be limited in format to the VERSION level of the request,
         but the VERSION of each response SHOULD be the highest
         implementation level of the responder.  In this way a requestor
         will learn the implementation level of a responder as a side
         effect of every response, including error responses and
         including RCODE=BADVERS.

6.10.  Flags

   DO
         DNSSEC OK bit as defined by [RFC3225].

   Z
         Set to zero by senders and ignored by receivers, unless
         modified in a subsequent specification.
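
   The layout of the TTL field and the formation of the 12-bit RCODE
   can be illustrated with the non-normative Python sketch below.  The
   function names are assumptions made for this example.  The sketch
   treats the TTL as a 32-bit integer whose upper 16 bits hold
   EXTENDED-RCODE and VERSION and whose lower 16 bits hold DO and Z.

```python
DO_BIT = 0x8000  # most significant bit of the 16-bit flags word


def pack_opt_ttl(extended_rcode: int, version: int,
                 do: bool = False, z: int = 0) -> int:
    """Build the 32-bit TTL value from its four subfields."""
    if not (0 <= extended_rcode <= 0xFF and 0 <= version <= 0xFF
            and 0 <= z <= 0x7FFF):
        raise ValueError("field out of range")
    flags = (DO_BIT if do else 0) | z
    return (extended_rcode << 24) | (version << 16) | flags


def unpack_opt_ttl(ttl: int):
    """Return (extended_rcode, version, do, z) from a 32-bit TTL value."""
    flags = ttl & 0xFFFF
    return ((ttl >> 24) & 0xFF, (ttl >> 16) & 0xFF,
            bool(flags & DO_BIT), flags & 0x7FFF)


def full_rcode(header_rcode: int, extended_rcode: int) -> int:
    """Combine the 4-bit header RCODE with the upper 8 bits from OPT."""
    return (extended_rcode << 4) | (header_rcode & 0x0F)
```

   With these definitions, an EXTENDED-RCODE of 1 combined with a header
   RCODE of 0 yields 16, the value this document assigns to BADVERS.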

6.11.  OPT Options Code Allocation Procedure

   Option Codes are assigned by Expert Review.  Assignment of Option
   Codes should be liberal, but duplicate functionality is to be
   avoided.

7.  Transport Considerations

   The presence of an OPT pseudo-RR in a request should be taken as an
   indication that the requestor fully implements the given version of
   EDNS, and can correctly understand any response that conforms to that
   feature's specification.

   The absence of an OPT record in a request MUST be taken as an
   indication that the requestor does not implement any part of this
   specification and that the responder MUST NOT include an OPT record
   in its response.

   Responders who do not implement these protocol extensions MUST
   respond with FORMERR messages without any OPT record.

   If there is a problem with processing the OPT record itself, such as
   an option value that is badly formatted or includes out of range
   values, a FORMERR MUST be returned.  If this occurs the response MUST
   include an OPT record.  This is intended to allow the requestor to
   distinguish between servers which do not implement EDNS and format
   errors within EDNS.

   The minimal response must be the DNS header, question section, and an
   OPT record.  This must also occur when a truncated response (using
   the DNS header's TC bit) is returned.
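
   From the requestor's side, the distinction between the two FORMERR
   cases might be sketched as follows.  This Python fragment is
   non-normative; the function name and the returned strings are
   hypothetical, and the response is assumed to have been parsed
   already.

```python
FORMERR = 1  # RCODE value from RFC 1035


def classify_formerr(rcode: int, has_opt: bool) -> str:
    """Interpret a response RCODE together with the presence of an OPT RR."""
    if rcode != FORMERR:
        return "not-formerr"
    if has_opt:
        # The responder implements EDNS but rejected the OPT contents.
        return "edns-format-error"
    # No OPT record: the responder does not implement EDNS.
    return "edns-unsupported"
```

   A requestor might use the "edns-unsupported" result to trigger the
   fallback behavior of Section 6.4.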

8.  Security Considerations

   Requestor-side specification of the maximum buffer size may open a
   new DNS denial of service attack if responders can be made to send
   messages which are too large for intermediate gateways to forward,
   thus leading to potential ICMP storms between gateways and
   responders.

   Announcing very large UDP buffer sizes may result in dropping by
   firewalls.  This could cause retransmissions with no hope of success.
   Some devices reject fragmented UDP packets.

   Announcing too small UDP buffer sizes may result in fallback to TCP.
   This is especially important with DNSSEC, where answers are much
   larger.

9.  IANA Considerations

   The IANA has assigned RR type code 41 for OPT.

   [RFC2671] specified a number of IANA sub-registries within "DOMAIN
   NAME SYSTEM PARAMETERS:"

   o  EDNS Extended Label Type

   o  EDNS Option Codes

   o  EDNS Version Numbers

   o  Domain System Response Code

   IANA is advised to re-parent these sub-registries to this document.

   [RFC2671] created the "EDNS Extended Label Type Registry".  We
   request that this registry be closed.

   This document assigns extended label type 0bxx111111 as "Reserved for
   future extended label types."  We request that IANA record this
   assignment.

   This document assigns option code 65535 in the "EDNS Option Codes"
   registry to "Reserved for future expansion."

   [RFC2671] expanded the RCODE space from 4 bits to 12 bits.  This
   allows more than the 16 distinct RCODE values allowed in [RFC1035].
   IETF Standards Action is required to add a new RCODE.  Adding new
   RCODEs should be avoided due to the difficulty in upgrading the
   installed base.

   This document assigns EDNS Extended RCODE 16 to "BADVERS".

   IETF Standards Action is required for assignments of new EDNS0 flags.
   Flags SHOULD be used only when necessary for DNS resolution to
   function.  For many uses, an EDNS Option Code may be preferred.

   IETF Standards Action is required to create new entries in the EDNS
   Version Number registry.  Expert Review is required for allocation of
   an EDNS Option Code.

Acknowledgements

   Paul Mockapetris, Mark Andrews, Robert Elz, Don Lewis, Bob Halley,
   Donald Eastlake, Rob Austein, Matt Crawford, Randy Bush, and Thomas
   Narten were each instrumental in creating and refining this
   specification.

Appendix A.  Document Editing History

   Following is a list of high-level changes made to the original
   RFC2671.

Appendix A.1.  Changes since RFC2671

   o  Support for the OPT record is now mandatory.

   o  Extended label types obsoleted and the registry is closed.

   o  The bitstring label type, which was already moved from draft to
      experimental, is requested to be moved to historical.

   o  Changes in how EDNS buffer sizes are selected, with
      recommendations on how to select them.

   o  Front material (IPR notice and such) was updated to current
      requirements.

Appendix A.2.  Changes since -02

   o  Specified the method for allocation of constants.

   o  Cleaned up a lot of wording, along with quite a bit of document
      structure changes.

10.  References

10.1.  Normative References

   [RFC1035]  Mockapetris, P., "Domain names - implementation and
              specification", STD 13, RFC 1035, November 1987.

   [RFC2671]  Vixie, P., "Extension Mechanisms for DNS (EDNS0)",
              RFC 2671, August 1999.

   [RFC3225]  Conrad, D., "Indicating Resolver Support of DNSSEC",
              RFC 3225, December 2001.

10.2.  Informative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

Authors' Addresses

   Michael Graff
   Internet Systems Consortium
   950 Charter Street
   Redwood City, California  94063
   US

   Phone: +1 650.423.1304
   Email: mgraff@isc.org

   Paul Vixie
   Internet Systems Consortium
   950 Charter Street
   Redwood City, California  94063
   US

   Phone: +1 650.423.1301
   Email: vixie@isc.org