DNSOP Working Group                                     Paul Vixie, ISC (Ed.)
   INTERNET-DRAFT                                         Akira Kato, WIDE
   <draft-ietf-dnsop-respsize-01.txt>                           July, 2004
   <draft-ietf-dnsop-respsize-02.txt>                            July 2005

                           DNS Response Size Issues

   Status of this Memo
      This document is an Internet-Draft and is subject to all provisions
      of section 3 of RFC 3667.
      By submitting this Internet-Draft, each author represents that any
      applicable patent or other IPR claims of which we are he or she is aware
      have been or will be disclosed, and any of which
      we become he or she becomes
      aware will be disclosed, in accordance with RFC 3668. Section 6 of BCP 79.

      Internet-Drafts are working documents of the Internet Engineering
      Task Force (IETF), its areas, and its working groups.  Note that
      other groups may also distribute working documents as Internet-

      Internet-Drafts are draft documents valid for a maximum of six months
      and may be updated, replaced, or obsoleted by other documents at any
      time.  It is inappropriate to use Internet-Drafts as reference
      material or to cite them other than as "work in progress."

      The list of current Internet-Drafts can be accessed at

      The list of Internet-Draft Shadow Directories can be accessed at

   Copyright Notice

      Copyright (C) The Internet Society (2003-2004). (2005).  All Rights Reserved.


      With a mandated default minimum maximum message size of 512 octets,
      the DNS protocol presents some special problems for zones wishing to
      expose a moderate or high number of authority servers (NS RRs).  This
      document explains the operational issues caused by, or related to
      this response size limit.

   INTERNET-DRAFT                  June 2003                  July 2005                       RESPSIZE

   1 - Introduction and Overview

   1.1. The DNS standard (see [RFC1035 4.2.1]) limits message size to 512
   octets.  Even though this limitation was due to the required minimum UDP
   reassembly limit for IPv4, it is a hard DNS protocol limit and is not
   implicitly relaxed by changes in transport, for example to IPv6.

   1.2. The EDNS0 standard (see [RFC2671 2.3, 4.5]) permits larger
   responses by mutual agreement of the requestor and responder.  However,
   deployment of EDNS0 cannot be expected to reach every Internet resolver
   in the short or medium term.  The 512 octet message size limit remains
   in practical effect at this time.

   1.3. Since DNS responses include a copy of the request, the space
   available for response data is somewhat less than the full 512 octets.
   For negative responses, there is rarely a space constraint.  For
   positive and delegation responses, though, every octet must be carefully
   and sparingly allocated.  This document specifically addresses
   delegation response sizes.

   2 - Delegation Details

   2.1. A delegation response will include the following elements:

      Header Section: fixed length (12 octets)
      Question Section: original query (name, class, type)
      Answer Section: (empty)
      Authority Section: NS RRset (nameserver names)
      Additional Section: A and AAAA RRsets (nameserver addresses)

   2.2. If the total response size would exceed 512 octets, and if the data
   that would not fit was belonged in the question, answer, or authority
   section, then the TC bit will be set (indicating truncation) which may
   cause the requestor to retry using TCP, depending on what information
   was present desired and what information was omitted.  If a retry using TCP is
   needed, the total cost of the transaction is much higher.  (See [RFC1123] for details on the protocol requirement that UDP be attempted
   before falling back to TCP.)

   2.3. RRsets are never sent partially, so if partially unless truncation occurs, entire
   RRsets are omitted.  Note that in which
   case the authority final apparent RRset in the final nonempty section consists of a
   single RRset.  It is absolutely essential that truncation not occur must be
   considered "possibly damaged".  With or without truncation, the glue
   present in the authority section. additional data section should be considered "possibly
   incomplete", and requestors should be prepared to re-query for any
   damaged or missing RRsets.  For multi-transport name or mail services,
   INTERNET-DRAFT                  June 2003                  July 2005                       RESPSIZE

   this can mean querying for an IPv6 (AAAA) RRset even when an IPv4 (A)
   RRset is present.

   2.4. DNS label compression allows a domain name to be instantiated only
   once per DNS message, and then referenced with a two-octet "pointer"
   from other locations in that same DNS message.  If all nameserver names
   in a message are similar (for example, all ending in ".ROOT-
   SERVERS.NET"), then more space will be available for uncompressable data
   (such as nameserver addresses).

   2.5. The query name can be as long as 255 characters of presentation
   data, which can be up to 256 octets of network data.  In this worst case
   scenario, the question section will be 260 octets in size, which would
   leave only 240 octets for the authority and additional sections (after
   deducting 12 octets for the fixed length header.)

   2.6. Average and maximum question section sizes can be predicted by the
   zone owner, since they will know what names actually exist, and can
   measure which ones are queried for most often.  For cost and performance
   reasons, the majority of requests should be satisfied without truncation
   or TCP retry.

   2.7. Requestors who deliberately send large queries to force truncation
   are only increasing their own costs, and cannot effectively attack the
   resources of an authority server since the requestor would have to retry
   using TCP to complete the attack.  An attack that always used TCP would
   have a lower cost.

   2.8. The minimum useful number of address records is two, since with
   only one address, the probability that it would refer to an unreachable
   server is too high.  Truncation which occurs after two address records
   have been added to the additional data section is therefore less
   operationally significant than truncation which occurs earlier.

   2.9. The best case is no truncation.  (This  This is because many requestors
   will retry using TCP by reflex, or will automatically re-query for
   RRsets that are "possibly truncated", without considering whether the
   omitted data was actually necessary.) necessary.

   2.10. Each added NS RR for a zone will add a minimum of between 16 and
   44 octets to every untruncated referral or negative response from the
   zone's authority servers (16 octets for an NS RR, 16 octets for an A RR,
   and 28 octets for an AAAA RR), in addition to whatever space is taken by
   the nameserver name (NS NSDNAME and A/AAAA owner name).

   INTERNET-DRAFT                  June 2003                  July 2005                       RESPSIZE

   3 - Analysis

   3.1. An instrumented protocol trace of a best case delegation response
   follows.  Note that 13 servers are named, and 13 addresses are given.
   This query was artificially designed to exactly reach the 512 octet

      ;; flags: qr rd; QUERY: 1, ANS: 0, AUTH: 13, ADDIT: 13
      ;;  [23456789.123456789.123456789.\
           123456789.123456789.123456789.com A IN]        ;; @80

      com.                 86400 NS  E.GTLD-SERVERS.NET.  ;; @112
      com.                 86400 NS  F.GTLD-SERVERS.NET.  ;; @128
      com.                 86400 NS  G.GTLD-SERVERS.NET.  ;; @144
      com.                 86400 NS  H.GTLD-SERVERS.NET.  ;; @160
      com.                 86400 NS  I.GTLD-SERVERS.NET.  ;; @176
      com.                 86400 NS  J.GTLD-SERVERS.NET.  ;; @192
      com.                 86400 NS  K.GTLD-SERVERS.NET.  ;; @208
      com.                 86400 NS  L.GTLD-SERVERS.NET.  ;; @224
      com.                 86400 NS  M.GTLD-SERVERS.NET.  ;; @240
      com.                 86400 NS  A.GTLD-SERVERS.NET.  ;; @256
      com.                 86400 NS  B.GTLD-SERVERS.NET.  ;; @272
      com.                 86400 NS  C.GTLD-SERVERS.NET.  ;; @288
      com.                 86400 NS  D.GTLD-SERVERS.NET.  ;; @304

      A.GTLD-SERVERS.NET.  86400 A           ;; @320
      B.GTLD-SERVERS.NET.  86400 A         ;; @336
      C.GTLD-SERVERS.NET.  86400 A         ;; @352
      D.GTLD-SERVERS.NET.  86400 A         ;; @368
      E.GTLD-SERVERS.NET.  86400 A         ;; @384
      F.GTLD-SERVERS.NET.  86400 A         ;; @400
      G.GTLD-SERVERS.NET.  86400 A         ;; @416
      H.GTLD-SERVERS.NET.  86400 A        ;; @432
      I.GTLD-SERVERS.NET.  86400 A        ;; @448
      J.GTLD-SERVERS.NET.  86400 A         ;; @464
      K.GTLD-SERVERS.NET.  86400 A        ;; @480
      L.GTLD-SERVERS.NET.  86400 A        ;; @496
      M.GTLD-SERVERS.NET.  86400 A         ;; @512

      ;; MSG SIZE  sent: 80  rcvd: 512
   INTERNET-DRAFT                  June 2003                  July 2005                       RESPSIZE

   3.2. For longer query names, the number of address records supplied will
   be lower.  Furthermore, it is only by using a common parent name (which
   is GTLD-SERVERS.NET in this example) that all 13 addresses are able to
   fit.  The following output from a response simulator demonstrates these

      % perl respsize.pl 13 13 0
        common name, average case: msg:303    nsaddr#13 a.dns.br b.dns.br c.dns.br d.dns.br
      a.dns.br requires 10 bytes
      b.dns.br requires 4 bytes
      c.dns.br requires 4 bytes
      d.dns.br requires 4 bytes
      # of NS: 4
      For maximum size query (255 byte):
              if only A is considered:     # of A is 4 (green)
        common name,   worst case: msg:495    nsaddr# 1 (red)
      uncommon name, average case: msg:457    nsaddr#
              if A and AAAA are condered:  # of A+AAAA is 3 (orange)
      uncommon name,   worst case: msg:649(*) nsaddr# 0 (red)
      % perl respsize.pl 13 13 2
        common name, average case: msg:303    nsaddr#11 (orange)
        common name,   worst case: msg:495    nsaddr# 1 (red)
      uncommon name, (yellow)
              if prefer_glue A is assumed: # of A is 4, # of AAAA is 3 (yellow)
      For average case: msg:457    nsaddr# 2 (orange)
      uncommon name,   worst case: msg:649(*) nsaddr# 0 (red)

   (Note: The response simulator program size query (64 byte):
              if only A is shown in Section 5.)

   Here we use the term "green" considered:     # of A is 4 (green)
              if all address records could fit, or
   "orange" A and AAAA are condered:  # of A+AAAA is 4 (green)
              if two prefer_glue A is assumed: # of A is 4, # of AAAA is 4 (green)

      % perl respsize.pl ns-ext.isc.org ns.psg.com ns.ripe.net ns.eu.int
      ns-ext.isc.org requires 16 bytes
      ns.psg.com requires 12 bytes
      ns.ripe.net requires 13 bytes
      ns.eu.int requires 11 bytes
      # of NS: 4
      For maximum size query (255 byte):
              if only A is considered:     # of A is 4 (green)
              if A and AAAA are condered:  # of A+AAAA is 3 (yellow)
              if prefer_glue A is assumed: # of A is 4, # of AAAA is 2 (yellow)
      For average size query (64 byte):
              if only A is considered:     # of A is 4 (green)
              if A and AAAA are condered:  # of A+AAAA is 4 (green)
              if prefer_glue A is assumed: # of A is 4, # of AAAA is 4 (green)

   (Note: The response simulator program is shown in Section 5.)

   Here we use the term "green" if all address records could fit, or
   "orange" if two or more could fit, or "red" if fewer than two could fit.
   It's clear that without a common parent for nameserver names, much space
   would be lost.  For these examples we use an average/common name size of
   15 octets, befitting our assumption of GTLD-SERVERS.NET as our common
   parent name.

   INTERNET-DRAFT                  July 2005                       RESPSIZE

   We're assuming an average query name size of 64 since that is the
   typical average maximum size seen in trace data at the time of this
   writing.  If Internationalized Domain Name (IDN) or any other technology
   which results in larger query names be deployed significantly in advance
   of EDNS, then more new measurements and new estimates will have to be made.

   4 - Conclusions

   4.1. The current practice of giving all nameserver names a common parent
   (such as GTLD-SERVERS.NET or ROOT-SERVERS.NET) saves space in DNS
   responses and allows for more nameservers to be enumerated than would
   otherwise be possible.  (Note that in this case it is wise to serve the
   common parent domain's zone from the same servers that are named within
   it, in order to limit external dependencies when all your eggs are in a
   single basket.)

   4.2. Thirteen (13) seems to be the effective maximum number of
   nameserver names usable traditional (non-extended) DNS, assuming a
   common parent domain name, and assuming given that additional-data response truncation is
   undesirable in the as an average case.

   INTERNET-DRAFT                  June 2003                       RESPSIZE case, and assuming mostly IPv4-only
   reachability (only A RRs exist, not AAAA RRs).

   4.3. Adding two to five IPv6 nameserver address records (AAAA RRs) to a
   prototypical delegation that currently contains thirteen (13) IPv4
   nameserver addresses (A RRs) for thirteen (13) nameserver names under a
   common parent, would not have a significant negative operational impact
   on the domain name system.

   5 - Source Code

   #!/usr/bin/perl -w

   $asize = 2+2+2+4+2+4;
   $aaaasize = 2+2+2+4+2+16;
   ($nns, $na, $naaaa) = @ARGV;
   test("common", "average", common_name_average($nns),
        $na, $naaaa);
   test("common", "worst", common_name_worst($nns),
        $na, $naaaa);
   test("uncommon", "average", uncommon_name_average($nns),
        $na, $naaaa);
   test("uncommon", "worst", uncommon_name_worst($nns),
        $na, $naaaa);
   exit 0;

   sub test { my ($namekind, $casekind, $msg, $na, $naaaa) = @_;
   #    repsize.pl [ -z zone ] fqdn_ns1 fqdn_ns2 ...
   #        if all queries are assumed to have zone suffux, such as "jp" in
   #     JP TLD servers, specify it in -z option
   use strict;
   use Getopt::Std;
   my $nglue ($sz_msg) = numglue($msg, $na, $naaaa);
        printf "%8s name, %7s case: msg:%3d%s nsaddr#%2d (%s)\n",
             $namekind, $casekind,
             $msg, ($msg > 512) ? "(*)" : "   ",
             $nglue, ($nglue == $na + $naaaa) ? "green"
                  : ($nglue >= 2) ? "orange"
                       : "red";

   sub pnum { (512);
   my ($num, $tot) ($sz_header, $sz_ptr, $sz_rr_a, $sz_rr_aaaa) = @_;
        return sprintf "%3d%s",

   sub numglue { (12, 2, 16, 28);
   my ($msg, $na, $naaaa) ($sz_type, $sz_class, $sz_ttl, $sz_rdlen) = @_; (2, 2, 4, 2);
   my $space = ($msg > 512) ? 0 : (512 - $msg); (%namedb, $name, $nssect, %opts, $optz);
   my $num $n_ns = 0;

        while ($space && ($na || $naaaa )) {
             if ($na) {
                  if ($space >= $asize) {
                       $space -= $asize;
   INTERNET-DRAFT                  June 2003                  July 2005                       RESPSIZE

             if ($naaaa) {

   getopt('z', opts);
   if ($space >= $aaaasize) (defined($opts{'z'})) {
                       $space -= $aaaasize;
        return $num;
        server_name_len($opts{'z'}); # just register it

   sub msgsize

   foreach $name (@ARGV) {
        my ($qname, $nns, $nsns) $len;
        $len = @_;
        return  12 server_name_len($name);
           print "$name requires $len bytes\n";
        $nssect += $sz_ptr +                            # header
                $qname+2+2 $sz_type +                    # query
                0 $sz_class +                             # answer
                $nns * (4+2+2+4+2+$nsns);       # authority $sz_ttl + $sz_rdlen + $len;
   print "# of NS: $n_ns\n";
   arsect(255, $nssect, $n_ns, "maximum");
   arsect(64, $nssect, $n_ns, "average");

   sub average_case server_name_len {
       my ($nns, $nsns) ($name) = @_;
        return msgsize(64, $nns, $nsns);

   sub worst_case {
       my ($nns, $nsns) (@labels, $len, $n, $suffix);

       $name =~ tr/A-Z/a-z/;
       @labels = @_;
        return msgsize(256, $nns, $nsns);

   sub common_name_average split(/./, $name);
       $len = length(join('.', @labels)) + 2;
       for ($n = 0; $#labels >= 0; $n++, shift @labels) { my ($nns)
           $suffix = @_; join('.', @labels);
           return 15 length($name) - length($suffix) + average_case($nns, 2); $sz_ptr
               if (defined($namedb{$suffix}));
           $namedb{$suffix} = 1;
       return $len;

   sub common_name_worst arsect {
       my ($nns) ($sz_query, $nssect, $n_ns, $cond) = @_;
        return 15
       my ($space, $n_a, $n_a_aaaa, $n_p_aaaa, $ansect);
       $ansect = $sz_query + worst_case($nns, 2); 1 + $sz_type + $sz_class;
       $space = $sz_msg - $sz_header - $ansect - $nssect;
       $n_a = atmost(int($space / $sz_rr_a), $n_ns);
       $n_a_aaaa = atmost(int($space / ($sz_rr_a + $sz_rr_aaaa)), $n_ns);
       $n_p_aaaa = atmost(int(($space - $sz_rr_a * $n_ns) / $sz_rr_aaaa), $n_ns);
       printf "For %s size query (%d byte):\n", $cond, $sz_query;
       printf "if only A is considered:     ";
       printf "# of A is %d (%s)\n", $n_a, &judge($n_a, $n_ns);
       printf "if A and AAAA are condered:  ";
       printf "# of A+AAAA is %d (%s)\n", $n_a_aaaa, &judge($n_a_aaaa, $n_ns);
   INTERNET-DRAFT                  July 2005                       RESPSIZE

       printf "if prefer_glue A is assumed: ";
       printf "# of A is %d, # of AAAA is %d (%s)\n",
           $n_a, $n_p_aaaa, &judge($n_p_aaaa, $n_ns);

   sub uncommon_name_average judge {
       my ($nns) ($n, $n_ns) = @_;
       return average_case($nns, 15); "green" if ($n >= $n_ns);
       return "yellow" if ($n >= 2);
       return "orange" if ($n == 1);
       return "red";

   sub uncommon_name_worst atmost {
       my ($nns) ($a, $b) = @_;
       return worst_case($nns, 15); 0 if ($a < 0);
       return $b if ($a > $b);
       return $a;

   INTERNET-DRAFT                  June 2003                       RESPSIZE

   Security Considerations

   The recommendations contained in this document have no known security

   IANA Considerations

   This document does not call for changes or additions to any IANA

   IPR Statement

   Copyright (C) The Internet Society (2003-2004). (2005).  This document is subject to
   the rights, licenses and restrictions contained in BCP 78, and except as
   set forth therein, the authors retain all their rights.

   This document and the information contained herein are provided on an

   INTERNET-DRAFT                  July 2005                       RESPSIZE

   Authors' Addresses

   Paul Vixie
      950 Charter Street
      Redwood City, CA 94063
      +1 650 423 1301

   Akira Kato
      University of Tokyo, Information Technology Center
      2-11-16 Yayoi Bunkyo
      Tokyo 113-8658, JAPAN
      +81 3 5841 2750