draft-ietf-idn-idna-08.txt draft-ietf-idn-idna-09.txt
Internet Draft Patrik Faltstrom Internet Draft Patrik Faltstrom
draft-ietf-idn-idna-08.txt Cisco draft-ietf-idn-idna-09.txt Cisco
May 22, 2002 Paul Hoffman May 24, 2002 Paul Hoffman
Expires in six months IMC & VPNC Expires in six months IMC & VPNC
Adam M. Costello Adam M. Costello
UC Berkeley UC Berkeley
Internationalizing Domain Names in Applications (IDNA) Internationalizing Domain Names in Applications (IDNA)
Status of this Memo Status of this Memo
This document is an Internet-Draft and is in full conformance with all This document is an Internet-Draft and is in full conformance with all
provisions of Section 10 of RFC2026. provisions of Section 10 of RFC2026.
skipping to change at line 184 skipping to change at line 184
would alter. For every internationalized label that cannot be directly would alter. For every internationalized label that cannot be directly
represented in ASCII, there is an equivalent ACE label. An ACE label represented in ASCII, there is an equivalent ACE label. An ACE label
always begins with the ACE prefix defined in section 5. The conversion always begins with the ACE prefix defined in section 5. The conversion
of labels to and from the ACE form is specified in section 4. of labels to and from the ACE form is specified in section 4.
The "ACE prefix" is defined in this document to be a string of ASCII The "ACE prefix" is defined in this document to be a string of ASCII
characters that appears at the beginning of every ACE label. It is characters that appears at the beginning of every ACE label. It is
specified in section 5. specified in section 5.
A "domain name slot" is defined in this document to be a protocol A "domain name slot" is defined in this document to be a protocol
element or a operation argument or a return value (and so on) explicitly element or a function argument or a return value (and so on) explicitly
designated for carrying a domain name. Examples of domain name slots designated for carrying a domain name. Examples of domain name slots
include: the QNAME field of a DNS query; the name argument of the include: the QNAME field of a DNS query; the name argument of the
gethostbyname() library function; the part of an email address following gethostbyname() library function; the part of an email address following
the at-sign (@) in the From: field of an email message header; and the the at-sign (@) in the From: field of an email message header; and the
host portion of the URI in the src attribute of an HTML <IMG> tag. host portion of the URI in the src attribute of an HTML <IMG> tag.
General text that just happens to contain a domain name is not a domain General text that just happens to contain a domain name is not a domain
name slot; for example, a domain name appearing in the plain text body name slot; for example, a domain name appearing in the plain text body
of an email message is not occupying a domain name slot. of an email message is not occupying a domain name slot.
An "IDN-aware domain name slot" is defined in this document to be a An "IDN-aware domain name slot" is defined in this document to be a
skipping to change at line 215 skipping to change at line 215
3. Requirements 3. Requirements
IDNA conformance means adherence to the following four requirements: IDNA conformance means adherence to the following four requirements:
1) Whenever dots are used as label separators, the following characters 1) Whenever dots are used as label separators, the following characters
MUST be recognized as dots: U+002E (full stop), U+3002 (ideographic full MUST be recognized as dots: U+002E (full stop), U+3002 (ideographic full
stop), U+FF0E (fullwidth full stop), U+FF61 (halfwidth ideographic full stop), U+FF0E (fullwidth full stop), U+FF61 (halfwidth ideographic full
stop). stop).
2) Whenever a domain name is put into an IDN-unaware domain name slot 2) Whenever a domain name is put into an IDN-unaware domain name slot
(see section 2), it MUST contain only ASCII characters, and, if dots are (see section 2), it MUST contain only ASCII characters.
used as label separators, changing all the label separators to U+002E.
Given an internationalized domain name (IDN), an equivalent domain name Given an internationalized domain name (IDN), an equivalent domain name
satisfying this requirement can be obtained by applying the ToASCII satisfying this requirement can be obtained by applying the ToASCII
operation (see section 4) to each label. operation (see section 4) to each label and, if dots are
used as label separators, changing all the label separators to U+002E.
3) ACE labels obtained from domain name slots SHOULD be hidden from 3) ACE labels obtained from domain name slots SHOULD be hidden from
users except when the use of the non-ASCII form would cause problems or users except when the use of the non-ASCII form would cause problems or
when the ACE form is explicitly requested. Given an internationalized when the ACE form is explicitly requested. Given an internationalized
domain name, an equivalent domain name containing no ACE labels can be domain name, an equivalent domain name containing no ACE labels can be
obtained by applying the ToUnicode operation (see section 4) to each obtained by applying the ToUnicode operation (see section 4) to each
label. When requirements 2 and 3 both apply, requirement 1 takes label. When requirements 2 and 3 both apply, requirement 2 takes
precedence. precedence.
4) Whenever two labels are compared, they MUST be considered to match if 4) Whenever two labels are compared, they MUST be considered to match if
and only if they are equivalent, that is, their ASCII forms (obtained by and only if they are equivalent, that is, their ASCII forms (obtained by
applying ToASCII) match using a case-insensitive ASCII comparison. applying ToASCII) match using a case-insensitive ASCII comparison.
Whenever two names are compared, they MUST be considered to match if and Whenever two names are compared, they MUST be considered to match if and
only if their corresponding labels match, regardless of whether the only if their corresponding labels match, regardless of whether the
names use the same forms of label separators. names use the same forms of label separators.
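As a non-normative illustration of requirements 1 and 4, the following Python sketch splits a name on the four separator characters and compares two names label by label. It assumes Python's standard encodings.idna module, which implements the later published RFC 3490 rather than this draft (its ACE prefix is "xn--"). The helper names split_labels, labels_match, and names_match are hypothetical and do not appear in this document.

```python
import re
from encodings.idna import ToASCII  # stdlib; follows RFC 3490, not this draft

# The four characters that requirement 1 says must be recognized as dots.
SEPARATORS = re.compile("[\u002E\u3002\uFF0E\uFF61]")


def split_labels(name):
    """Split a domain name into labels on any recognized separator."""
    return SEPARATORS.split(name)


def labels_match(a, b):
    """Requirement 4: compare the ASCII forms case-insensitively."""
    return ToASCII(a).lower() == ToASCII(b).lower()


def names_match(a, b):
    """Names match if and only if their corresponding labels match."""
    la, lb = split_labels(a), split_labels(b)
    if len(la) != len(lb):
        return False
    return all(labels_match(x, y) for x, y in zip(la, lb))
```

In this sketch an empty label (for example, one produced by a trailing dot) makes ToASCII raise UnicodeError; an application would need to decide how to treat such names.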
4. Conversion operations 4. Conversion operations
skipping to change at line 263 skipping to change at line 263
"queries" rule from [STRINGPREP], set the flag called "AllowUnassigned". "queries" rule from [STRINGPREP], set the flag called "AllowUnassigned".
2) Split the domain name into individual labels as described in section 2) Split the domain name into individual labels as described in section
3. The labels do not include the separator. 3. The labels do not include the separator.
3) Decide whether or not to enforce the restrictions on ASCII characters 3) Decide whether or not to enforce the restrictions on ASCII characters
in host names [STD3]. If the restrictions are to be enforced, set the in host names [STD3]. If the restrictions are to be enforced, set the
flag called "UseSTD3ASCIIRules". flag called "UseSTD3ASCIIRules".
4) Process each label with either the ToASCII or the ToUnicode 4) Process each label with either the ToASCII or the ToUnicode
operation. Use the ToASCII operation/function if you are about to put operation. Use the ToASCII operation if you are about to put
the name into an IDN-unaware slot. Use the ToUnicode operation if you the name into an IDN-unaware slot. Use the ToUnicode operation if you
are displaying the name to a user. are displaying the name to a user.
5) If ToASCII was applied in step 4 and dots are used as label 5) If ToASCII was applied in step 4 and dots are used as label
separators, change all the label separators to U+002E (full stop). separators, change all the label separators to U+002E (full stop).
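The ToASCII direction of steps 2, 4, and 5 above might be sketched as follows. This is a minimal, non-normative sketch that assumes Python's standard encodings.idna module (an implementation of the later RFC 3490, not of this draft). That module's ToASCII function takes only the label, so the AllowUnassigned and UseSTD3ASCIIRules flags from steps 1 and 3 are not modeled here. The function name name_to_ascii is hypothetical.

```python
import re
from encodings.idna import ToASCII  # stdlib; follows RFC 3490, not this draft

# Label separators that must be recognized as dots (section 3).
SEPARATORS = re.compile("[\u002E\u3002\uFF0E\uFF61]")


def name_to_ascii(name):
    """Prepare a domain name for an IDN-unaware slot.

    Raises UnicodeError if ToASCII fails on any label.
    """
    # Step 2: split into labels; the labels do not include the separator.
    labels = SEPARATORS.split(name)
    # Step 4: apply ToASCII to each label (it returns bytes in Python).
    ascii_labels = [ToASCII(label).decode("ascii") for label in labels]
    # Step 5: rejoin using U+002E (full stop) as the only separator.
    return ".".join(ascii_labels)
```

A caller that wants to display the name instead would apply ToUnicode to each label in step 4 and would not need step 5.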
The following two subsections define the ToASCII and ToUnicode The following two subsections define the ToASCII and ToUnicode
operations that are used in step 4. operations that are used in step 4.
4.1 ToASCII 4.1 ToASCII
skipping to change at line 285 skipping to change at line 285
The ToASCII operation takes a sequence of Unicode code points that make The ToASCII operation takes a sequence of Unicode code points that make
up one label and transforms it into a sequence of code points in the up one label and transforms it into a sequence of code points in the
ASCII range (0..7F). If ToASCII succeeds, the original sequence and the ASCII range (0..7F). If ToASCII succeeds, the original sequence and the
resulting sequence are equivalent labels. resulting sequence are equivalent labels.
It is important to note that the ToASCII operation can fail. If the It is important to note that the ToASCII operation can fail. If the
ToASCII operation fails on any label in a domain name, that domain name ToASCII operation fails on any label in a domain name, that domain name
MUST NOT be used as an internationalized domain name. The application MUST NOT be used as an internationalized domain name. The application
needs to have some method of dealing with this failure. needs to have some method of dealing with this failure.
The inputs to ToASCII are a sequence of code points; the AllowUnassigned The inputs to ToASCII are a sequence of code points, the AllowUnassigned
flag; and the UseSTD3ASCIIRules flag. The output of ToASCII is either a flag, and the UseSTD3ASCIIRules flag. The output of ToASCII is either a
sequence of ASCII code points or a failure condition. sequence of ASCII code points or a failure condition.
ToASCII never alters a sequence of code points that are all in the ASCII ToASCII never alters a sequence of code points that are all in the ASCII
range to begin with (although it could fail). Applying the ToASCII range to begin with (although it could fail). Applying the ToASCII
operation multiple times has exactly the same effect as applying it just operation multiple times has exactly the same effect as applying it just
once. once.
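The properties stated above (all-ASCII input is never altered, repeated application has no further effect, and failure is possible) can be observed with a short non-normative sketch. It assumes Python's standard encodings.idna module, which follows the later RFC 3490 and uses the ACE prefix "xn--"; in that implementation one way an all-ASCII label fails is by exceeding 63 octets. The helpers to_ascii_str and to_ascii_fails are hypothetical names.

```python
from encodings.idna import ToASCII  # stdlib; follows RFC 3490, not this draft


def to_ascii_str(label):
    """Apply ToASCII and return the result as a str instead of bytes."""
    return ToASCII(label).decode("ascii")


def to_ascii_fails(label):
    """Return True if ToASCII reports a failure condition for the label."""
    try:
        ToASCII(label)
    except UnicodeError:
        return True
    return False


once = to_ascii_str("b\u00fccher")   # non-ASCII label is converted to ACE form
twice = to_ascii_str(once)           # applying ToASCII again changes nothing
unchanged = to_ascii_str("Example")  # all-ASCII label is returned unaltered
```

An application would typically call something like to_ascii_fails, or catch the error directly, before putting the name into a domain name slot.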
ToASCII consists of the following steps: ToASCII consists of the following steps:
1. If all code points in the sequence are in the ASCII range (0..7F) 1. If all code points in the sequence are in the ASCII range (0..7F)
skipping to change at line 335 skipping to change at line 335
The ToUnicode operation takes a sequence of Unicode code points that The ToUnicode operation takes a sequence of Unicode code points that
make up one label and returns a sequence of Unicode code points. If the make up one label and returns a sequence of Unicode code points. If the
input sequence is a label in ACE form, then the result is an equivalent input sequence is a label in ACE form, then the result is an equivalent
internationalized label that is not in ACE form, otherwise the original internationalized label that is not in ACE form, otherwise the original
sequence is returned unaltered. sequence is returned unaltered.
ToUnicode never fails. If any step fails, then the original input ToUnicode never fails. If any step fails, then the original input
sequence is returned immediately in that step. sequence is returned immediately in that step.
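The "never fails" behavior can be sketched as a wrapper that falls back to the original input. This non-normative example assumes Python's standard encodings.idna module (an implementation of the later RFC 3490, with ACE prefix "xn--"), whose ToUnicode function raises UnicodeError where this document says the original sequence is returned. The wrapper name to_unicode_or_original is hypothetical.

```python
from encodings.idna import ToUnicode  # stdlib; follows RFC 3490, not this draft


def to_unicode_or_original(label):
    """Return the non-ACE form of a label, or the input unaltered.

    Any failure inside the library call is absorbed, so the caller
    always receives a sequence of Unicode code points.
    """
    try:
        return ToUnicode(label)
    except UnicodeError:
        # A step failed: return the original input sequence.
        return label
```

With this wrapper an ACE label is decoded, while a label that is not in ACE form, or one the library rejects, comes back as it was given.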
The inputs to ToUnicode are a sequence of code points; the The inputs to ToUnicode are a sequence of code points, the
AllowUnassigned flag; and the UseSTD3ASCIIRules flag. The output of AllowUnassigned flag, and the UseSTD3ASCIIRules flag. The output of
ToUnicode is always a sequence of Unicode code points. ToUnicode is always a sequence of Unicode code points.
1. If all code points in the sequence are in the ASCII range (0..7F) 1. If all code points in the sequence are in the ASCII range (0..7F)
then skip to step 3. then skip to step 3.
2. Perform the steps specified in [NAMEPREP] and fail if there is an 2. Perform the steps specified in [NAMEPREP] and fail if there is an
error. (If step 3 of ToASCII is also performed here, it will not error. (If step 3 of ToASCII is also performed here, it will not
affect the overall behavior of ToUnicode, but it is not affect the overall behavior of ToUnicode, but it is not
necessary.) The AllowUnassigned flag is used in [NAMEPREP]. necessary.) The AllowUnassigned flag is used in [NAMEPREP].
skipping to change at line 506 skipping to change at line 506
It is expected that new versions of the resolver libraries in the future It is expected that new versions of the resolver libraries in the future
will be able to accept domain names in other formats than ASCII, and will be able to accept domain names in other formats than ASCII, and
application developers might one day pass not only domain names in application developers might one day pass not only domain names in
Unicode, but also in local script to a new API for the resolver Unicode, but also in local script to a new API for the resolver
libraries in the operating system. Thus the ToASCII and ToUnicode libraries in the operating system. Thus the ToASCII and ToUnicode
operations might be performed inside these new versions of the resolver operations might be performed inside these new versions of the resolver
libraries. libraries.
Domain names stored in zones follow the rules for "stored strings" from Domain names stored in zones follow the rules for "stored strings" from
[STRINGPREP]. DNS requests follow the rules for "queries" from [STRINGPREP]. Domain names passed to resolvers or put into the question
section of DNS requests follow the rules for "queries" from
[STRINGPREP]. [STRINGPREP].
6.3 DNS servers 6.3 DNS servers
An operating system might have a set of libraries for performing the An operating system might have a set of libraries for performing the
ToASCII operation. The input to such a library might be in one or more ToASCII operation. The input to such a library might be in one or more
charsets that are used in applications (UTF-8 and UTF-16 are likely charsets that are used in applications (UTF-8 and UTF-16 are likely
candidates for almost any operating system, and script-specific charsets candidates for almost any operating system, and script-specific charsets
are likely for localized operating systems). are likely for localized operating systems).
 End of changes. 9 change blocks. 
13 lines changed or deleted 14 lines changed or added
