draft-ietf-idn-requirements-05.txt   draft-ietf-idn-requirements-06.txt 
IETF IDN Working Group Editors Zita Wenzel, James Seng IETF IDN Working Group Editors Zita Wenzel, James Seng
Internet Draft draft-ietf-idn-requirements-05.txt Internet Draft draft-ietf-idn-requirements-06.txt
24 April 2001 Expires 24 October 2001 8 May 2001 Expires 8 November 2001
Requirements of Internationalized Domain Names Requirements of Internationalized Domain Names
Status of this Memo Status of this Memo
This document is an Internet-Draft and is in full conformance with This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026. all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
skipping to change at line 53 skipping to change at line 53
developing protocols for internationalized domain names. developing protocols for internationalized domain names.
1. Introduction 1. Introduction
At present, the encoding of Internet domain names is restricted to a At present, the encoding of Internet domain names is restricted to a
subset of 7-bit ASCII (ISO/IEC 646). HTML, XML, IMAP, FTP, and many subset of 7-bit ASCII (ISO/IEC 646). HTML, XML, IMAP, FTP, and many
other text based items on the Internet have already been at least other text based items on the Internet have already been at least
partially internationalized. It is important for domain names to be partially internationalized. It is important for domain names to be
similarly internationalized or for an equivalent solution to be found. similarly internationalized or for an equivalent solution to be found.
This document assumes that the most effective solution involves putting This document assumes that the most effective solution involves putting
non-ASCII names inside some parts of the overall DNS system. non-ASCII names inside some parts of the overall DNS system altho such
assumption may not be the consensus of the IETF community.
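To illustrate the ASCII restriction described in the Introduction, here is a minimal sketch of a label check. It assumes the traditional letter-digit-hyphen hostname convention (RFC 952, RFC 1123) and the 63-octet label limit of RFC 1035, which this draft does not spell out. The function name is hypothetical.

```python
import string

# Characters allowed under the assumed letter-digit-hyphen (LDH) convention.
LDH = set(string.ascii_letters + string.digits + "-")


def is_ldh_label(label: str) -> bool:
    """Return True if a single label fits the assumed ASCII hostname rules."""
    if not 1 <= len(label) <= 63:          # label length limit from RFC 1035
        return False
    if label[0] == "-" or label[-1] == "-":  # no leading or trailing hyphen
        return False
    return all(ch in LDH for ch in label)
```

A label such as "example" passes this check, while a label containing a non-ASCII letter does not, which is the gap the requirements in this document are meant to address.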
This document is being discussed on the "idn" mailing list. To join the This document is being discussed on the "idn" mailing list. To join the
list, send a message to <majordomo@ops.ietf.org> with the words list, send a message to <majordomo@ops.ietf.org> with the words
"subscribe idn" in the body of the message. Archives of the mailing "subscribe idn" in the body of the message. Archives of the mailing
list can also be found at ftp://ops.ietf.org/pub/lists/idn*. list can also be found at ftp://ops.ietf.org/pub/lists/idn*.
1.1 Definitions and Conventions 1.1 Definitions and Conventions
A language is a way that humans interact. In computerised form, a text A language is a way that humans interact. In computerised form, a text
in a written language can be expressed as a string of characters. in a written language can be expressed as a string of characters.
The same set of characters can often be used for many written languages, The same set of characters can often be used for many written languages,
and many written languages can be expressed using different scripts. and many written languages can be expressed using different scripts.
The same characters are often shown with somewhat different glyphs The same characters are often shown with somewhat different glyphs
(shapes) for display of a text depending on the font used, the (shapes) for display of a text depending on the font used, the
automatic shaping applied, or the automatic formation of ligatures. In automatic shaping applied, or the automatic formation of ligatures. In
addition, the same characters can be shown with somewhat different addition, the same characters can be shown with somewhat different
glyphs (shapes) for display of a text depending on the language being glyphs (shapes) for display of a text depending on the language being
used, even within the same font or trough automatic font change. used, even within the same font or through automatic font change.
A character is a member of a set of elements used for organization, A character is a member of a set of elements used for organization,
control, or representation of textual data. control, or representation of textual data.
A graphic character is a character, other than a control function, A graphic character is a character, other than a control function,
that has a visual representation normally handwritten, printed, or that has a visual representation normally handwritten, printed, or
displayed. displayed.
Characters mentioned in this document are identified by their position Characters mentioned in this document are identified by their position
in the Unicode [UNICODE] character set. This character set is also in the Unicode [UNICODE] character set. This character set is also
skipping to change at line 180 skipping to change at line 181
components of resource naming on the Internet (e.g., URI, URN); to make components of resource naming on the Internet (e.g., URI, URN); to make
certain that the set of terms used in this document are well-defined and certain that the set of terms used in this document are well-defined and
non-ambiguous, the definitions are given here. non-ambiguous, the definitions are given here.
A master server for a zone holds the main copy of that zone. This copy A master server for a zone holds the main copy of that zone. This copy
is sometimes stored in a zone file. A slave server for a zone holds a is sometimes stored in a zone file. A slave server for a zone holds a
complete copy of the records for that zone. Slave servers MAY be either complete copy of the records for that zone. Slave servers MAY be either
authorized by the zone owner (secondary servers) or unauthorized authorized by the zone owner (secondary servers) or unauthorized
(so-called "stealth secondaries"). Master and authorized slave servers (so-called "stealth secondaries"). Master and authorized slave servers
are listed in the NS records for the zone, and are termed are listed in the NS records for the zone, and are termed
"authoritative" servers. In many contexts, outside this document the "authoritative" servers. In many contexts outside this document, the
term "primary" is used interchangeably with "master" and "secondary" is term "primary" is used interchangeably with "master" and "secondary" is
used interchangeably with "slave". used interchangeably with "slave".
A caching server holds temporary copies of DNS records; it uses records A caching server holds temporary copies of DNS records; it uses records
to answer queries about domain names. Further explanation of these terms to answer queries about domain names. Further explanation of these terms
can be found in [RFC1034] and [RFC1996]. can be found in [RFC1034] and [RFC1996].
DNS names can be represented in multiple forms, with different DNS names can be represented in multiple forms, with different
properties for internationalization. The most important ones are: properties for internationalization. The most important ones are:
skipping to change at line 243 skipping to change at line 244
The DNS can be seen as a multilayer function: The DNS can be seen as a multilayer function:
- The bottom layer is where the packets are passed across the Internet - The bottom layer is where the packets are passed across the Internet
in a DNS query and a DNS response. At this level, what matters is in a DNS query and a DNS response. At this level, what matters is
the format and meaning of bits and octets in a DNS packet. the format and meaning of bits and octets in a DNS packet.
- Above that is the "DNS service", created by an infrastructure of DNS - Above that is the "DNS service", created by an infrastructure of DNS
servers, NS records that point to those DNS servers, that is servers, NS records that point to those DNS servers, that is
pointed to by the root servers (listed in the "root cache file" on pointed to by the root servers (listed in the "root cache file" on
each DNS server, often called "named.cache". It is at this level each DNS server often called "named.cache"). It is at this level
that the statement "the DNS has a single root" [RFC2826] makes that the statement "the DNS has a single root" [RFC2826] makes
sense, but still, what are being transferred are octets, not sense, but still, what are being transferred are octets, not
characters. characters.
- Interfacing to the user is a service layer, often called "the resolver - Interfacing to the user is a service layer, often called "the resolver
library", and often embedded in the operating system or system library", and often embedded in the operating system or system
libraries of the client machines. It is at the top of this layer that libraries of the client machines. It is at the top of this layer that
the API calls commonly known as "gethostbyname" and "gethostbyaddress" the API calls commonly known as "gethostbyname" and "gethostbyaddress"
reside. These calls are modified to support IPv6 [RFC2553]. A reside. These calls are modified to support IPv6 [RFC2553]. A
conceptually similar layer exists in authoritative DNS servers, conceptually similar layer exists in authoritative DNS servers,
skipping to change at line 306 skipping to change at line 307
The most used ones in the current DNS are: The most used ones in the current DNS are:
- Hostname-to-address service (A, AAAA, A6): Enter a hostname, and get - Hostname-to-address service (A, AAAA, A6): Enter a hostname, and get
back an IPv4 or IPv6 address. back an IPv4 or IPv6 address.
- Hostname-to-Mail server service (MX): As above, but the expected - Hostname-to-Mail server service (MX): As above, but the expected
return value is a hostname and a priority for SMTP servers. return value is a hostname and a priority for SMTP servers.
- Address-to-hostname service (PTR): Enter an IPv4 or IPv6 address (in - Address-to-hostname service (PTR): Enter an IPv4 or IPv6 address (in
in-addr.arpa or ip6.int form respectively) and get back a hostname. in-addr.arpa or ip6.arpa form respectively) and get back a hostname.
- Domain delegation service (NS). Enter a domain name and get back - Domain delegation service (NS). Enter a domain name and get back
nameserver records (designated hosts who provides authoritive nameserver records (designated hosts which provide authoritive
nameservice) for the domain. nameservice) for the domain.
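The address-to-hostname service listed above queries a name derived from the address itself. The following is a minimal sketch of that derivation using the Python standard library's ipaddress module. The helper name is hypothetical and the addresses are taken from documentation ranges purely for illustration. Note that the module produces the ip6.arpa form for IPv6.

```python
import ipaddress


def ptr_query_name(address: str) -> str:
    """Return the domain name that a PTR lookup for this address would query."""
    # reverse_pointer reverses the octets (IPv4) or nibbles (IPv6) and
    # appends in-addr.arpa or ip6.arpa respectively.
    return ipaddress.ip_address(address).reverse_pointer


print(ptr_query_name("192.0.2.1"))
print(ptr_query_name("2001:db8::1"))
```

The result is an ordinary ASCII domain name in both cases; only the hostname returned by the PTR record is a candidate for internationalization.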
New services are being defined, either as entirely new services (IPv6 to New services are being defined, either as entirely new services (IPv6 to
hostname mapping using binary labels) or as embellishments to other hostname mapping using binary labels) or as embellishments to other
services (DNSSEC returning information about whether a given DNS service services (DNSSEC returning information about whether a given DNS service
is performed securely or not). is performed securely or not).
These services exist, conceptually, at the Application/Resolver These services exist, conceptually, at the Application/Resolver
interface, NOT at the DNS-service interface. This document attempts to interface, NOT at the DNS-service interface. This document attempts to
set requirements for an equivalent of the "used services" given above, set requirements for an equivalent of the "used services" given above,
skipping to change at line 342 skipping to change at line 343
In the requirements, we attempt to use the term "service" whenever a In the requirements, we attempt to use the term "service" whenever a
requirement concerns the service, and "protocol" whenever a requirement requirement concerns the service, and "protocol" whenever a requirement
is believed to constrain the possible implementation. is believed to constrain the possible implementation.
2.1 Compatibility and Interoperability 2.1 Compatibility and Interoperability
[1] The DNS is essential to the entire Internet. Therefore, the service [1] The DNS is essential to the entire Internet. Therefore, the service
MUST NOT damage present DNS protocol interoperability. It MUST make the MUST NOT damage present DNS protocol interoperability. It MUST make the
minimum number of changes to existing protocols on all layers of the minimum number of changes to existing protocols on all layers of the
stack. It MUST continue to allow any system anywhere to resolve any stack. It MUST continue to allow any system anywhere that implements
internationalized domain name. the IDN specification to resolve any internationalized domain name.
[2] The service MUST preserve the basic concept and facilities of domain [2] The service MUST preserve the basic concept and facilities of domain
names as described in [RFC1034]. It MUST maintain a single, global, names as described in [RFC1034]. It MUST maintain a single, global,
universal, and consistent hierarchical namespace. universal, and consistent hierarchical namespace.
[3] The DNS protocol (the packet formats that go on the wire) MUST [3] The DNS protocol (the packet formats that go on the wire) MUST
NOT limit the codepoints that can be used. A service defined on top of NOT limit the codepoints that can be used. A service defined on top of
the DNS, for instance the IDN-to-address function, MAY limit the the DNS, for instance the IDN-to-address function, MAY limit the
codepoints that can be used. The service descriptions MUST describe codepoints that can be used. The service descriptions MUST describe
what limitations are imposed. what limitations are imposed.
skipping to change at line 408 skipping to change at line 409
used in DNS names and records. The protocol MUST specify what charset is used in DNS names and records. The protocol MUST specify what charset is
used when resolving domain names and how characters are encoded in DNS used when resolving domain names and how characters are encoded in DNS
records. records.
[13] Codepoints SHOULD be from the Universal Set as defined in [13] Codepoints SHOULD be from the Universal Set as defined in
ISO-10646 or Unicode. The specifics of versions MUST be defined in the ISO-10646 or Unicode. The specifics of versions MUST be defined in the
proposed solution. If multiple charsets are allowed, each charset MUST proposed solution. If multiple charsets are allowed, each charset MUST
be tagged and conform to [RFC2277]. be tagged and conform to [RFC2277].
[14] The protocol MUST NOT reject any non-IDN characters (to be [14] The protocol MUST NOT reject any non-IDN characters (to be
defined) in any queries or responses. defined) in any DNS queries or responses.
[15] The protocol SHOULD NOT invent a new CCS for the purpose of IDN [15] The protocol SHOULD NOT invent a new CCS for the purpose of IDN
only and SHOULD use existing CES. The charset(s) chosen SHOULD also be only and SHOULD use existing CES. The charset(s) chosen SHOULD also be
non-ambiguous. non-ambiguous.
[16] The protocol SHOULD NOT make any assumptions about the location [16] The protocol SHOULD NOT make any assumptions about the location
in a domain name where internationalization might appear. In other in a domain name where internationalization might appear. In other
words, it SHOULD NOT differentiate between any part of a domain name words, it SHOULD NOT differentiate between any part of a domain name
because this MAY impose restrictions on future internationalization because this MAY impose restrictions on future internationalization
efforts. For example, the TLDs can be internationalized. efforts. For example, the TLDs can be internationalized.
skipping to change at line 431 skipping to change at line 432
protocol. For example, an IDN implementation which only allows domain protocol. For example, an IDN implementation which only allows domain
names to use a single local script would immediately restrict names to use a single local script would immediately restrict
multinational organization. multinational organization.
[18] While there are a wide range of devices that use the DNS and a wide [18] While there are a wide range of devices that use the DNS and a wide
range of characteristics of international scripts and methods of range of characteristics of international scripts and methods of
domain name input and display, IDN is only concerned with the domain name input and display, IDN is only concerned with the
protocol. Therefore, there MUST be a single way of encoding an protocol. Therefore, there MUST be a single way of encoding an
internationalized domain name within the DNS. internationalized domain name within the DNS.
2.4 Canonicalization 2.3 Canonicalization
Matching rules are a complicated process for IDN. Canonicalization Matching rules are a complicated process for IDN. Canonicalization
of characters MUST follow precise and predictable rules to ensure of characters MUST follow precise and predictable rules to ensure
consistency. [CHARREQ] is RECOMMENDED as a guide on canonicalization. consistency. [CHARREQ] is RECOMMENDED as a guide on canonicalization.
The DNS has to match a host name in a request with a host name held The DNS has to match a host name in a request with a host name held
in one or more zones. It also needs to sort names into order. It is in one or more zones. It also needs to sort names into order. It is
expected that some sort of canonicalization algorithm will be used as expected that some sort of canonicalization algorithm will be used as
the first step of this process. This section discusses some of the the first step of this process. This section discusses some of the
properties which will be REQUIRED of that algorithm. properties which will be REQUIRED of that algorithm.
skipping to change at line 481 skipping to change at line 482
[24] Any conversion (case, ligature folding, punctuation folding, etc) [24] Any conversion (case, ligature folding, punctuation folding, etc)
from what the user enters into a client to what the client asks for from what the user enters into a client to what the client asks for
resolution MUST be done identically on any request from any client. resolution MUST be done identically on any request from any client.
[25] If the charset can be normalized, then it SHOULD be normalized [25] If the charset can be normalized, then it SHOULD be normalized
before it is used in IDN. Normalization SHOULD follow [UTR15]. before it is used in IDN. Normalization SHOULD follow [UTR15].
[26] The protocol SHOULD avoid inventing a new normalization form [26] The protocol SHOULD avoid inventing a new normalization form
provided a technically sufficient one is available. provided a technically sufficient one is available.
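Requirements [24] through [26] can be illustrated with a minimal sketch of a deterministic canonicalization step. The draft cites [UTR15] but does not select a normalization form or a case-folding rule, so the use of NFKC and of Python's str.casefold() here is an assumption, and the function name is hypothetical.

```python
import unicodedata


def canonicalize(name: str) -> str:
    """Map a user-entered name to one canonical form (assumed: NFKC + case fold)."""
    # Normalize, fold case, then normalize again because case folding
    # can produce sequences that are no longer in normalized form.
    normalized = unicodedata.normalize("NFKC", name)
    folded = normalized.casefold()
    return unicodedata.normalize("NFKC", folded)


# Two different inputs for the same name: decomposed with capitals,
# and precomposed in lower case.
a = canonicalize("Cafe\u0301.Example")
b = canonicalize("caf\u00e9.example")
print(a == b)
```

Because the function depends only on its input, every client applying it yields the same query name for the same user input, which is the property requirement [24] asks for.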
2.5 Operational Issues 2.4 Operational Issues
[27] Zone files SHOULD remain easily editable. [27] Zone files SHOULD remain easily editable.
[28] An IDN-capable resolver or server SHALL NOT generate more traffic [28] An IDN-capable resolver or server SHALL NOT generate more traffic
than a non-IDN-capable resolver or server would when resolving an than a non-IDN-capable resolver or server would when resolving an
ASCII-only domain name. The amount of traffic generated when resolving ASCII-only domain name. The amount of traffic generated when resolving
an IDN SHALL be similar to that generated when resolving an ASCII-only an IDN SHALL be similar to that generated when resolving an ASCII-only
name. name.
[29] The service SHOULD NOT add new centralized administration for the [29] The service SHOULD NOT add new centralized administration for the
DNS. A domain administrator SHOULD be able to create internationalized DNS. A domain administrator SHOULD be able to create internationalized
names as easily as adding current domain names. names as easily as adding current domain names.
[30] Within a single zone, the zone manager MUST be able to define [30] Within a single zone, the zone manager MAY be able to define
equivalence rules that suit the purpose of the zone, such as, but not equivalence rules that suit the purpose of the zone, such as, but not
limited to, and not necessarily, non-ASCII case folding, Unicode limited to, and not necessarily, non-ASCII case folding, Unicode
normalizations (if Unicode is chosen), Cyrillic/Greek/Latin folding, or normalizations (if Unicode is chosen), Cyrillic/Greek/Latin folding, or
traditional/simplified Chinese equivalence. Such defined equivalences traditional/simplified Chinese equivalence. Such defined equivalences
MUST NOT remove equivalences that are assumed by (old or MUST NOT remove equivalences that are assumed by (old or
local-rule-ignorant) caches. local-rule-ignorant) caches.
[31] The protocol MUST work with DNSSEC. The protocol MAY break [31] The protocol MUST work with DNSSEC. The protocol MAY break
language sort order. language sort order.
 End of changes. 12 change blocks. 
14 lines changed or deleted 15 lines changed or added
