DNSEXT Working Group                                Paul Vixie, ISC
INTERNET-DRAFT
<draft-ietf-dnsext-rfc2671bis-edns0-01.txt>          March 17, 2008

Intended Status: Standards Track                                            M. Graff
Internet-Draft                                                  P. Vixie
Obsoletes: 2671 (if approved)

              Revised extension mechanisms                Internet Systems Consortium
Intended status: Standards Track                           July 28, 2009
Expires: January 29, 2010

                  Extension Mechanisms for DNS (EDNS0)
                 draft-ietf-dnsext-rfc2671bis-edns0-02

Status of this Memo
   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she

   This Internet-Draft is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, submitted to IETF in accordance full conformance with Section 6 the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on January 29, 2010.

Copyright Notice

   Copyright (C) The (c) 2009 IETF Trust (2007). and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.

Abstract

   The Domain Name System's wire protocol includes a number of fixed
   fields whose range has been or soon will be exhausted and does not
   allow clients requestors to advertise their capabilities to servers. responders.  This
   document describes backward compatible mechanisms for allowing the
   protocol to grow.

1 -

   This document updates the EDNS0 specification based on 10 years of
   operational experience.

Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
   2.  Requirements Language  . . . . . . . . . . . . . . . . . . . .  3
   3.  EDNS Support Requirement . . . . . . . . . . . . . . . . . . .  3
   4.  Affected Protocol Elements . . . . . . . . . . . . . . . . . .  3
     4.1.  Message Header . . . . . . . . . . . . . . . . . . . . . .  3
     4.2.  Label Types  . . . . . . . . . . . . . . . . . . . . . . .  4
     4.3.  UDP Message Size . . . . . . . . . . . . . . . . . . . . .  4
   5.  Extended Label Types . . . . . . . . . . . . . . . . . . . . .  4
   6.  OPT pseudo-RR  . . . . . . . . . . . . . . . . . . . . . . . .  4
     6.1.  OPT Record Behavior  . . . . . . . . . . . . . . . . . . .  4
     6.2.  OPT Record Format  . . . . . . . . . . . . . . . . . . . .  5
     6.3.  Requestor's Payload Size . . . . . . . . . . . . . . . . .  6
     6.4.  Responder's Payload Size . . . . . . . . . . . . . . . . .  6
     6.5.  Payload Size Selection . . . . . . . . . . . . . . . . . .  7
     6.6.  Middleware Boxes . . . . . . . . . . . . . . . . . . . . .  7
     6.7.  Extended RCODE . . . . . . . . . . . . . . . . . . . . . .  7
     6.8.  OPT Options Type Allocation Procedure  . . . . . . . . . .  8
   7.  Transport Considerations . . . . . . . . . . . . . . . . . . .  8
   8.  Security Considerations  . . . . . . . . . . . . . . . . . . .  9
   9.  IANA Considerations  . . . . . . . . . . . . . . . . . . . . .  9
   10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 10
   11. References . . . . . . . . . . . . . . . . . . . . . . . . . . 10
     11.1. Normative References . . . . . . . . . . . . . . . . . . . 10
     11.2. Informative References . . . . . . . . . . . . . . . . . . 10
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 10

1.  Introduction

1.1.

   DNS (see [RFC1035]) [RFC1035] specifies a Message Format and within such messages
   there are standard formats for encoding options, errors, and name
   compression.  The maximum allowable size of a DNS Message is fixed.
   Many of DNS's protocol limits are too small for uses which are or
   which are desired to become common.  There is no way for
   implementations to advertise their capabilities.

1.2.

   Unextended agents will not know how to interpret the protocol
   extensions detailed here.  In practice, these clients will be
   upgraded when they have need of a new feature, and only new features
   will make use of the extensions.  Extended agents must be prepared
   for behaviour of unextended clients in the face of new protocol
   elements, and fall back gracefully to unextended DNS.  RFC 2671  [RFC2671]
   originally has proposed extensions to the basic DNS protocol to overcome
   these deficiencies.  This memo refines that specification and
   obsoletes RFC 2671.

1.3. [RFC2671].

2.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2 -

3.  EDNS Support Requirement

   EDNS support is manditory in a modern world.  DNSSEC requires EDNS
   support, and many other featres are made possible only by EDNS
   support to request or advertise them.

4.  Affected Protocol Elements

2.1.

4.1.  Message Header

   The DNS Message Header's (see [RFC1035 4.1.1]) , section 4.1.1 [RFC1035]) second full
   16-bit word is divided into a 4-bit OPCODE, a 4-bit RCODE, and a
   number of 1-bit flags.  The original reserved Z bits have been
   allocated to various purposes, and most of the RCODE values are now
   in use.  More flags and more possible RCODEs are needed.  The OPT
   pseudo-RR specified
in Section 4 below contains subfields that carry a bit field
   extension of the RCODE field and additional flag bits, respectively; for details see
Section 4.6 below.

2.2. respectively.

4.2.  Label Types

   The first two bits of a wire format domain label are used to denote
   the type of the label.  [RFC1035 4.1.4] ,section 4.1.4 [RFC1035] allocates two of the
   four possible types and reserves the other two.  Proposals for use of the
remaining types far outnumber those available.  More label types
   were
needed, and an extension mechanism was proposed in RFC 2671 [RFC2671
Section 3].  Section 3 of this document reserves [RFC2671] section 3.

4.3.  UDP Message Size

   DNS labels with a first
octet in the range of 64-127 decimal (label type 01) for future
standardization of Extended DNS Labels.

2.3. DNS Messages are limited to 512 octets Messages are limited to 512 octets in size when sent over UDP.
   While the minimum maximum reassembly buffer size still allows a limit
   of 512 octets of UDP payload, most of the hosts now connected to the
   Internet are able to reassemble larger datagrams.  Some mechanism
   must be created to allow requestors to advertise larger buffer sizes
   to responders.  To this end, the OPT pseudo-RR specified in Section 4 below
   contains a maximum payload size field; for details see Section 4.5
below.

3 - field.

5.  Extended Label Types

   The first octet in the on-the-wire representation of a DNS label
   specifies the label type; the basic DNS specification [RFC1035]
   dedicates the two most significant bits of that octet for this
   purpose.

   This document reserves DNS label type 0b01 for use as an indication
   for Extended Label Types.  A specific extended label type is selected
   by the 6 least significant bits of the first octet.  Thus, Extended
   Label Types are indicated by the values 64-127 (0b01xxxxxx) in the
   first octet of the label.

Allocations from this range are to be made for IETF documents fully
describing the syntax and semantics as well as the applicability of the
particular Extended Label Type.

   This document does not describe any specific Extended Label Type.

4 -

   In practice, Extended Label Types are difficult to use due to support
   in clients and intermediate gateways.  Therefore, the registry of
   Extended Label Types is requested to be closed.  They cause
   interoperability problems and at present no defined label types are
   in use.

6.  OPT pseudo-RR

4.1.

6.1.  OPT Record Behavior

   One OPT pseudo-RR (RR type 41) MAY be added to the additional data
   section of a request, request.  If present in requests, compliant responders
   which implement EDNS MUST include an OPT record in non-truncated
   responses, and SHOULD attempt to responses to such requests. include them in all responses.  An
   OPT is called a pseudo-RR because it pertains to a particular
   transport level message and not to any actual DNS data.  OPT RRs MUST
   NOT be cached, forwarded, or stored in or loaded from master files.
   The quantity of OPT pseudo-RRs per message MUST be either zero or
   one, but not greater.

4.2.

6.2.  OPT Record Format

   An OPT RR has a fixed part and a variable set of options expressed as
   {attribute, value} pairs.  The fixed part holds some DNS meta data
   and also a small collection of new protocol basic extension elements which we
   expect to be so popular that it would be a waste of wire space to
   encode them as {attribute, value} pairs.

4.3.

   The fixed part of an OPT RR is structured as follows:

       +------------+--------------+------------------------------+
       | Field Name | Field Type   | Description
------------------------------------------------------                  |
       +------------+--------------+------------------------------+
       | NAME       | domain name  | empty (root domain)          |
       | TYPE       | u_int16_t    | OPT (41)                          |
       | CLASS      | u_int16_t      sender's    | requestor's UDP payload size |
       | TTL        | u_int32_t    | extended RCODE and flags     |
       | RDLEN      | u_int16_t    | describes RDATA              |
       | RDATA      | octet stream | {attribute,value} pairs

4.4.      |
       +------------+--------------+------------------------------+

                               OPT RR Format

   The variable part of an OPT RR is encoded in its RDATA and is
   structured as zero or more of the following:

      :

                  +0 (MSB)             :                            +1 (LSB)         :
       +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
    0: |                          OPTION-CODE                          |
       +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
    2: |                         OPTION-LENGTH                         |
       +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
    4: |                                                               |
       /                          OPTION-DATA                          /
       /                                                               /
       +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
   OPTION-CODE    (Assigned
         Assigned by IANA.) Expert Review.

   OPTION-LENGTH
         Size (in octets) of OPTION-DATA.

   OPTION-DATA
         Varies per OPTION-CODE.

4.4.1.

   Order of appearance of option tuples is never relevant.  Any option
   whose meaning is affected by other options is so affected no matter
   which one comes first in the OPT RDATA.

4.4.2.

   Any OPTION-CODE values not understood by a responder or requestor
   MUST be ignored.  So, specifications  Specifications of such options might wish to
   include some kind of signalled acknowledgement.  For example, an
   option specification might say that if a responder sees option XYZ,
   it SHOULD include option XYZ in its response.

4.5.

6.3.  Requestor's Payload Size

   The sender's requestor's UDP payload size (which OPT stores in the RR CLASS
   field) is the number of octets of the largest UDP payload that can be
   reassembled and delivered in the sender's requestor's network stack.  Note
   that path MTU, with or without fragmentation, may be smaller than
   this.  Values lower than 512 are undefined, and may be treated as format errors, or
may MUST be treated as equal to 512, at the implementor's discretion.

4.5.1. 512.

   Note that a 512-octet UDP payload requires a 576-octet IP reassembly
   buffer.  Choosing 1280 on an for IPv4 over Ethernet connected requestor would be reasonable.
   The consequence of choosing too large a value may be an ICMP message
   from an intermediate gateway, or even a silent drop of the response
   message.

4.5.2. Both requestors and responders are advised to take account of the
path's discovered MTU (if already known) when considering message sizes.

4.5.3.

   The requestor's maximum payload size can change over time, and
therefore MUST NOT
   therefore not be cached for use beyond the transaction in which it is
   advertised.

4.5.4.

6.4.  Responder's Payload Size

   The responder's maximum payload size can change over time, but can be
   reasonably expected to remain constant between two sequential
   transactions; for example, a meaningless QUERY to discover a
   responder's maximum UDP payload size, followed immediately by an
   UPDATE which takes advantage of this size.  (This is considered
   preferrable to the outright use of TCP for oversized requests, if
   there is any reason to suspect that the responder implements EDNS,
   and if a request will not fit in the default 512 payload size limit.)

4.5.5.

6.5.  Payload Size Selection

   Due to transaction overhead, it is unwise to advertise an
   architectural limit as a maximum UDP payload size.  Just because your
   stack can reassemble 64KB datagrams, don't assume that you want to
   spend more than about 4KB of state memory per ongoing transaction.

4.6. The extended RCODE and flags (which OPT stores in the RR TTL field)
are structured

   A requestor MAY choose to implement a fallback to smaller advertised
   sizes to work around firewall or other network limitations.  A
   requestor SHOULD choose to use a fallback mechanism which begins with
   a large size, such as follows:

      :          +0 (MSB)             :              +1 (LSB)         :
      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
   0: |         EXTENDED-RCODE        |            VERSION            |
      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
   2: | DO|                           Z                               |
      +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

EXTENDED-RCODE  Forms upper 8 bits of extended 12-bit RCODE.  Note 4096.  If that
                EXTENDED-RCODE value zero (0) indicates fails, a fallback around the
   1220 byte range SHOULD be tried, as it has a reasonable chance to fit
   within a single Ethernet frame.  Failing that, a requestor MAY choose
   a 512 byte packet, which with large answers may cause a TCP retry.

6.6.  Middleware Boxes

   Middleware boxes MUST NOT limit DNS messages over UDP to 512 bytes.

   Middleware boxes which simply forward requests to a recursive
   resolver MUST NOT modify the OPT record contents in either direction.

6.7.  Extended RCODE

   The extended RCODE and flags (which OPT stores in the RR TTL field)
   are structured as follows:

                  +0 (MSB)                            +1 (LSB)
       +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
    0: |         EXTENDED-RCODE        |            VERSION            |
       +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
    2: | DO|                           Z                               |
       +---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

   EXTENDED-RCODE
         Forms upper 8 bits of extended 12-bit RCODE.  Note that
         EXTENDED-RCODE value "0" indicates that an unextended RCODE is
         in use (values zero (0) "0" through
                fifteen (15)). "15").

   VERSION
         Indicates the implementation level of whoever sets it.  Full
         conformance with this specification is indicated by version zero (0).
         ``0.''  Requestors are encouraged to set this to the lowest
         implemented level capable of expressing a transaction, to
         minimize the responder and network load of discovering the
         greatest common implementation level between requestor and
         responder.  A requestor's version numbering strategy should MAY
         ideally be a run time configuration option.

         If a responder does not implement the VERSION level of the
         request, then it answers with RCODE=BADVERS.  All responses
         MUST be limited in format to the VERSION level of the request,
         but the VERSION of each response MUST SHOULD be the highest
         implementation level of the responder.  In this way a requestor
         will learn the implementation level of a responder as a side
         effect of every response, including error responses, responses and
         including RCODE=BADVERS.

   DO
         DNSSEC OK bit as defined by [RFC3225].

   Z
         Set to zero by senders and ignored by receivers, unless
         modified in a subsequent specification [IANAFLAGS].

5 - specification.

6.8.  OPT Options Type Allocation Procedure

   Allocations assigned by expert review.  TBD

7.  Transport Considerations

5.1.

   The presence of an OPT pseudo-RR in a request is should be taken as an
   indication that the requestor fully implements the given version of
   EDNS, and can correctly understand any response that conforms to that
   feature's specification.

5.2.

   Lack of use presence of these features an OPT record in a request is MUST be taken as an
   indication that the requestor does not implement any part of this
   specification and that the responder SHOULD MUST NOT use any protocol
   extension described here in its response.

5.3.

   Responders who do not understand implement these protocol extensions are
expected to send a response MUST
   respond with RCODE NOTIMPL, FORMERR, or SERVFAIL, or
to appear to "time out" due to inappropriate action by FORMERR messages without any OPT record.

   If there is a "middle box" problem with processing the OPT record itself, such as a NAT,
   an option value that is badly formatted or to ignore extensions and respond only to unextended

protocol elements.  Therefore use includes out of extensions SHOULD be "probed" such
that range
   values, a responder who isn't known to support them FORMERR MAY be allowed a retry with
no extensions if it responds with such an RCODE, or does not respond. retured.  If a responder's capability level is cached by a requestor, a new probe
SHOULD this occurs the response MUST
   include an OPT record.  This MAY be sent periodically to test for changes used to responder capability.

5.4. distinguish between
   servers whcih do not implement EDNS and format errors within EDNS.

   If EDNS is used in a request, and the response arrives with TC set
   and with no EDNS OPT RR, a requestor should SHOULD assume that truncation
   prevented the OPT RR from being appended by the responder, and
   further, that EDNS is not used in the response.  Correspondingly, an
   EDNS responder who cannot fit all necessary elements (including an
   OPT RR) into a response, should SHOULD respond with a normal (unextended)
   DNS response, possibly setting TC if the response will not fit in the
   unextended response message's 512-octet size.

6 -

8.  Security Considerations

   Requestor-side specification of the maximum buffer size may open a
   new DNS denial of service attack if responders can be made to send
   messages which are too large for intermediate gateways to forward,
   thus leading to potential ICMP storms between gateways and
   responders.

7 - IANA Considerations

IANA has allocated RR type code 41 for OPT.

This document controls the following

   Announcing very large UDP buffer sizes may result in dropping by
   firewalls.  This could cause retransmissions with no hope of success.
   Some devices reject fragmented UDP packets.

   Announcing too small UDP buffer sizes may result in fallback to TCP.
   This is especially important with DNSSEC, where answers are much
   larger.

9.  IANA Considerations

   The IANA has assigned RR type code 41 for OPT.

   [RFC2671] specified a number of IANA sub-registries in registry within "DOMAIN
   NAME SYSTEM PARAMETERS": PARAMETERS:" "EDNS Extended Label Type" Type", "EDNS Option Codes"
   Codes", "EDNS Version Numbers" Numbers", and "Domain System Response Code" Code."
   IANA is advised to re-parent these subregistries to this document.

   RFC 2671 created an extended label type registry.  We request that
   this registry be closed.

   This document assigns extended label type 0b01xxxxxx 0bxx111111 as "EDNS Extended Label
Type." "Reserved for
   future extended label types."  We request that IANA record this
   assignment.

   This document assigns option code 65535 to "Reserved for future
   expansion."

   This document expands the RCODE space from 4 bits to 12 bits.  This
   will allow IANA to assign more than the 16 distinct RCODE values
   allowed in RFC 1035 [RFC1035].

   This document assigns EDNS Extended RCODE "16" to "BADVERS".

   IESG approval is should be required to create new entries in the EDNS
   Extended Label Type or EDNS Version Number registries, while any
   published RFC (including Informational, Experimental, or BCP) is should
   be grounds for allocation of an EDNS Option Code.

8 -

10.  Acknowledgements

   Paul Mockapetris, Mark Andrews, Robert Elz, Don Lewis, Bob Halley,
   Donald Eastlake, Rob Austein, Matt Crawford, Randy Bush, Thomas Narten,
Alfred Hoenes and Markku Savela Thomas
   Narten were each instrumental in creating and refining this
   specification.

9 -

11.  References

11.1.  Normative References

   [RFC1035]    P.  Mockapetris, P., "Domain Names names - Implementation implementation and
             Specification,"
              specification", STD 13, RFC 1035, USC/Information Sciences
             Institute, November 1987.

[RFC2119]    S. Bradner, "Key words for use in RFCs to Indicate
             Requirement Levels," RFC 2119, Harvard University, March
             1997.

   [RFC2671]    P.  Vixie, P., "Extension mechanisms Mechanisms for DNS (EDNS0)," (EDNS0)",
              RFC 2671,
             Internet Software Consortium, August 1999.

   [RFC3225]    D.  Conrad, D., "Indicating Resolver Support of DNSSEC," DNSSEC",
              RFC 3225, Nominum Inc., December 2001.

[IANAFLAGS]  IANA, "DNS Header Flags and EDNS Header Flags," web site
             http://www.iana.org/assignments/dns-header-flags, as of
             June 2005 or later.

10 - Author's Address

11.2.  Informative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

Authors' Addresses

   Michael Graff
   Internet Systems Consortium
   950 Charter Street
   Redwood City, California  94063
   US

   Phone: +1 650.423.1304
   Email: mgraff@isc.org
   Paul Vixie
   Internet Systems Consortium
   950 Charter Street
   Redwood City, CA California  94063
   US

   Phone: +1 650 423 1301
   EMail: 650.423.1301
   Email: vixie@isc.org

Full Copyright Statement

Copyright (C) IETF Trust (2007).

This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors retain
all their rights.

This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR
IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND THE
INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in this
document or the extent to which any license under such rights might or
might not be available; nor does it represent that it has made any
independent effort to identify any such rights.  Information on the
procedures with respect to rights in RFC documents can be found in BCP
78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an attempt
made to obtain a general license or permission for the use of such
proprietary rights by implementers or users of this specification can be
obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary rights
that may cover technology that may be required to implement this
standard.  Please address the information to the IETF at
ietf-ipr@ietf.org.

Acknowledgement

Funding for the RFC Editor function is provided by the IETF
Administrative Support Activity (IASA).