draft-ietf-idn-nameprep-07.txt   draft-ietf-idn-nameprep-08.txt 
Internet Draft Paul Hoffman Internet Draft Paul Hoffman
draft-ietf-idn-nameprep-07.txt IMC & VPNC draft-ietf-idn-nameprep-08.txt IMC & VPNC
January 9, 2001 Marc Blanchet February 24, 2002 Marc Blanchet
Expires in six months ViaGenie Expires in six months ViaGenie
Stringprep Profile for Internationalized Host Names Nameprep: A Stringprep Profile for Internationalized Domain Names
Status of this memo Status of this memo
This document is an Internet-Draft and is in full conformance with all This document is an Internet-Draft and is in full conformance with all
provisions of Section 10 of RFC2026. provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering Task Internet-Drafts are working documents of the Internet Engineering Task
Force (IETF), its areas, and its working groups. Note that other groups Force (IETF), its areas, and its working groups. Note that other groups
may also distribute working documents as Internet-Drafts. may also distribute working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference material time. It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress." or to cite them other than as "work in progress."
To view the list of Internet-Draft Shadow Directories, see To view the list of Internet-Draft Shadow Directories, see
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
Abstract Abstract
This document describes how to prepare internationalized host name parts This document describes how to prepare internationalized domain name
in order to increase the likelihood that name input and name comparison labels in order to increase the likelihood that name input and name
work in ways that make sense for typical users throughout the world. This comparison work in ways that make sense for typical users throughout the
profile of the stringprep protocol is used as part of a suite of on-the-wire world. This profile of the stringprep protocol is used as part of a
protocols for internationalizing the DNS. suite of on-the-wire protocols for internationalizing the DNS.
1. Introduction 1. Introduction
This document specifies processing rules that will allow users to enter This document specifies processing rules that will allow users to enter
internationalized host name parts in applications and have the highest internationalized domain name labels in applications and have the highest
chance of getting the content of the strings correct. It is a profile of chance of getting the content of the strings correct. It is a profile of
stringprep [STRINGPREP]. stringprep [STRINGPREP]. These processing rules are only intended for
internationalized domain names, not for arbitrary text.
This document was previously called "nameprep" before splitting the
structure of the protocol off into the stringprep document.
This profile defines the following, as required by [STRINGPREP] This profile defines the following, as required by [STRINGPREP]
- The intended applicability of the profile: internationalized - The intended applicability of the profile: internationalized
host name parts domain name labels
- The character repertoire that is the input and output to stringprep: - The character repertoire that is the input and output to stringprep:
defined in Section 2 defined in Section 2
- The list of unassigned code points for the repertoire: defined - The list of unassigned code points for the repertoire: defined
in Appendix F. in Appendix F.
- The mappings used: defined in Section 3. - The mappings used: defined in Section 3.
- The Unicode normalization used: defined in Section 4 - The Unicode normalization used: defined in Section 4
- The characters that are prohibited as output: Defined in section 5 - The characters that are prohibited as output: Defined in section 5
1.1 Interaction of protocol parts
Nameprep is used by the IDNA [IDNA] protocol for preparing domain names;
it is not designed for any other purpose. It is explicitly not designed
for processing arbitrary free text and SHOULD NOT be used for that
purpose. Nameprep is a profile of Stringprep [STRINGPREP].
Implementations of Nameprep MUST fully implement Stringprep.
1.2 Terminology 1.2 Terminology
The key words "MUST", "SHALL", "REQUIRED", "SHOULD", "RECOMMENDED", and The key words "MUST", "SHOULD", and "MAY" in this document are to be
"MAY" in this document are to be interpreted as described in RFC 2119 interpreted as described in RFC 2119 [RFC2119].
[RFC2119].
Examples in this document use the notation for code points and names Examples in this document use the notation for code points and names
from the Unicode Standard [Unicode3.1] and ISO/IEC 10646 [ISO10646]. For from the Unicode Standard [Unicode3.1] and ISO/IEC 10646 [ISO10646]. For
example, the letter "a" may be represented as either "U+0061" or "LATIN example, the letter "a" may be represented as either "U+0061" or "LATIN
SMALL LETTER A". In the lists of prohibited characters, the "U+" is left SMALL LETTER A". In the lists of prohibited characters, the "U+" is left
off to make the lists easier to read. The comments for character ranges off to make the lists easier to read. The comments for character ranges
are shown in square brackets (such as "[SYMBOLS]") and do not come from are shown in square brackets (such as "[CONTROL CHARACTERS]") and do not
the standards. come from the standards.
2. Character Repertoire 2. Character Repertoire
Unicode 3.1 [Unicode3.1] is the repertoire used in this profile. Unicode 3.1 [Unicode3.1] is the repertoire used in this profile.
The reason Unicode 3.1 was chosen instead of a version of The reason Unicode 3.1 was chosen instead of a version of
ISO/IEC 10646 is that ISO/IEC 10646 is expected to be updated soon after ISO/IEC 10646 is that ISO/IEC 10646 is expected to be updated soon after
this document becomes an RFC. Unicode 3.1 has the exact repertoire that this document becomes an RFC. Unicode 3.1 has the exact repertoire that
is expected in the next version of ISO/IEC 10646, and is therefore used is expected in the next version of ISO/IEC 10646, and is therefore used
here. here.
3. Mapping 3. Mapping
This profile specifies stringprep mapping using the mapping table This profile specifies stringprep mapping using the mapping table
in Appendix D. That table includes all the steps described in this in Appendix D. That table includes all the steps described in this
section. section.
Note that text in this section describe how Appendix D was formed. It is Note that text in this section describes how Appendix D was formed. It is
there for people who want to understand more, but it should be ignored here for people who want to understand more, but it should be ignored
by implementors. Implementations of this profile MUST map based on by implementors. Implementations of this profile MUST map based on
Appendix D, not based on the descriptions in this section of how Appendix D, not based on the descriptions in this section of how
Appendix D was created. Appendix D was created.
3.1 Mapped out 3.1 Mapped out
The following characters are simply deleted from the input (that is, The following characters are simply deleted from the input (that is,
they are mapped to nothing) because their presence or absence should not they are mapped to nothing) because their presence or absence should not
make two strings different. make two strings different.
skipping to change at line 120 skipping to change at line 125
do not bear semantics. do not bear semantics.
180B; MONGOLIAN FREE VARIATION SELECTOR ONE 180B; MONGOLIAN FREE VARIATION SELECTOR ONE
180C; MONGOLIAN FREE VARIATION SELECTOR TWO 180C; MONGOLIAN FREE VARIATION SELECTOR TWO
180D; MONGOLIAN FREE VARIATION SELECTOR THREE 180D; MONGOLIAN FREE VARIATION SELECTOR THREE
200C; ZERO WIDTH NON-JOINER 200C; ZERO WIDTH NON-JOINER
200D; ZERO WIDTH JOINER 200D; ZERO WIDTH JOINER
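The "mapped to nothing" step can be illustrated with a short sketch. It is not the normative procedure (implementations map from Appendix D), and the set below holds only the five code points visible in this excerpt; the name map_out is a hypothetical helper.

```python
# Partial set: only the Section 3.1 code points visible in this excerpt.
# The complete list lives in the Appendix D mapping table.
MAPPED_TO_NOTHING = frozenset({0x180B, 0x180C, 0x180D, 0x200C, 0x200D})


def map_out(label: str) -> str:
    """Delete every code point that is mapped to nothing."""
    return "".join(ch for ch in label if ord(ch) not in MAPPED_TO_NOTHING)


if __name__ == "__main__":
    # A ZERO WIDTH NON-JOINER inside a label disappears after mapping.
    print(map_out("ab\u200ccd"))
```

Because the characters are removed rather than rejected, two labels that differ only by these code points compare equal after mapping.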
3.2 Case mapping 3.2 Case mapping
This profile folds case in domain names where possible because the
current DNS has case-insensitive matching for domain names. If this
profile did not do that for the additional characters being added, it
would lead to even greater user confusion. For example, "Abc" matches
"abc", but "<Uppercase-A-with-accent>bc" would not match
"<lowercase-a-with-accent>bc".
The input string is case folded according to [UTR21]. For most The input string is case folded according to [UTR21]. For most
characters, this is the same as changing the input character to a characters, this is the same as changing the input character to a
lowercase character. For some characters, however, more complex lowercase character. For some characters, however, more complex
transformations occur. The "CaseFolding.txt" file from the Unicode transformations occur. The "CaseFolding.txt" file from the Unicode
database was used to prepare the mapping table. database was used to prepare the mapping table.
There are some characters that do not have mappings in [UTR21] but still There are some characters that do not have mappings in [UTR21] but still
need processing. These characters include a few Greek characters and need processing. These characters include a few Greek characters and
many symbols that contain Latin characters. The list of characters to many symbols that contain Latin characters. The list of characters to
add to the mapping table were determined by the following algorithm: add to the mapping table were determined by the following algorithm:
b = NormalizeWithKC(Fold(a)); b = NormalizeWithKC(Fold(a));
c = NormalizeWithKC(Fold(b)); c = NormalizeWithKC(Fold(b));
if c is not the same as b, add a mapping for "a to c". if c is not the same as b, add a mapping for "a to c".
Because NormalizeWithKC(Fold(c)) always equals c, the table is stable Because NormalizeWithKC(Fold(c)) always equals c, the table is stable
from that point on. The "DerivedNormalizationProperties.txt" file from from that point on. The "DerivedNormalizationProperties.txt" file from
the Unicode database was used to prepare Appendix D. This mapping was the Unicode database was used to prepare Appendix D. These mappings were
added to reduce the number of processing steps, that is, to avoid doing added to reduce the number of processing steps, that is, to avoid doing
case mapping and normalization twice. case mapping and normalization twice.
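The derivation of the "Additional folding" entries can be sketched as follows. This is an approximation under stated assumptions: Python's str.casefold() and unicodedata.normalize() stand in for Fold and NormalizeWithKC, and they use the Unicode version bundled with the interpreter rather than Unicode 3.1, so the output can differ from Appendix D. The function names are hypothetical.

```python
import unicodedata


def fold_and_normalize(text: str) -> str:
    """NormalizeWithKC(Fold(text)), using Python's bundled Unicode data."""
    return unicodedata.normalize("NFKC", text.casefold())


def additional_folding(a: str):
    """Return c when an extra "a to c" mapping is needed, else None."""
    b = fold_and_normalize(a)
    c = fold_and_normalize(b)
    # If a second pass changes the result, one pass over "a" was not
    # enough, so the table needs a direct mapping from a to c.
    return c if c != b else None


if __name__ == "__main__":
    # U+2121 TELEPHONE SIGN has no case folding, but normalization
    # turns it into uppercase Latin letters that still need folding.
    print(additional_folding("\u2121"))
    print(additional_folding("A"))
```

Characters such as "A" need no extra entry because a single fold-then-normalize pass already reaches a stable result.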
4. Normalization 4. Normalization
This profile specifies using Unicode normalization form KC, as described This profile specifies using Unicode normalization form KC, as described
in [UAX15]. in [UAX15].
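A minimal illustration of normalization form KC, assuming Python's unicodedata module is an acceptable stand-in. It normalizes with the Unicode version shipped with the interpreter, not Unicode 3.1, so it is a sketch rather than a conforming implementation; normalize_kc is a hypothetical name.

```python
import unicodedata


def normalize_kc(label: str) -> str:
    """Apply Unicode normalization form KC to a label."""
    return unicodedata.normalize("NFKC", label)


if __name__ == "__main__":
    # Compatibility decomposition: the "fi" ligature becomes two letters.
    print(normalize_kc("\ufb01"))
    # Canonical composition: "e" + COMBINING ACUTE ACCENT becomes U+00E9.
    print(normalize_kc("e\u0301") == "\u00e9")
```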
5. Prohibited Output 5. Prohibited Output
This profile specifies using the prohibition table in Appendix E. This profile specifies using the prohibition table in Appendix E.
Note that the subsections below describe how Appendix E was formed. They Note that the subsections below describe how Appendix E was formed. They
are there for people who want to understand more, but they should be are here for people who want to understand more, but they should be
ignored by implementors. Implementations of this profile MUST map based ignored by implementors. Implementations of this profile MUST map based
on Appendix E, not based on the descriptions in this section of how on Appendix E, not based on the descriptions in this section of how
Appendix E was created. Appendix E was created.
The collected lists of prohibited code points can be found in Appendix E The collected list of code points prohibited by this profile can be
of this document. The lists in Appendix E MUST be used by implementations found in Appendix E of this document; note that IDNA prohibits
of this specification. If there are any discrepancies between the lists additional characters. The lists in Appendix E MUST be used by
in Appendix E and subsections below, the lists in Appendix E always takes implementations of this specification. If there are any discrepancies
precedence. between the lists in Appendix E and subsections below, the lists in
Appendix E always takes precedence.
Some code points listed in one section would also appear in other Some code points listed in one section would also appear in other
sections. Each code point is only listed once in the tables in Appendix sections. Each code point is only listed once in the tables in Appendix
E. E.
IMPORTANT NOTE: This profile MUST be used with the IDNA protocol. The
IDNA protocol has additional prohibitions that are checked outside of
this profile.
5.1 Space characters 5.1 Space characters
Space characters would make visual transcription of URLs nearly Space characters would make visual transcription of domain names nearly
impossible and could lead to user entry errors in many ways. impossible and could lead to user entry errors in many ways. Note that
an additional space character (U+0020) is prohibited in IDNA.
0020; SPACE
00A0; NO-BREAK SPACE 00A0; NO-BREAK SPACE
1680; OGHAM SPACE MARK 1680; OGHAM SPACE MARK
2000; EN QUAD 2000; EN QUAD
2001; EM QUAD 2001; EM QUAD
2002; EN SPACE 2002; EN SPACE
2003; EM SPACE 2003; EM SPACE
2004; THREE-PER-EM SPACE 2004; THREE-PER-EM SPACE
2005; FOUR-PER-EM SPACE 2005; FOUR-PER-EM SPACE
2006; SIX-PER-EM SPACE 2006; SIX-PER-EM SPACE
2007; FIGURE SPACE 2007; FIGURE SPACE
2008; PUNCTUATION SPACE 2008; PUNCTUATION SPACE
2009; THIN SPACE 2009; THIN SPACE
200A; HAIR SPACE 200A; HAIR SPACE
202F; NARROW NO-BREAK SPACE 202F; NARROW NO-BREAK SPACE
3000; IDEOGRAPHIC SPACE 3000; IDEOGRAPHIC SPACE
5.2 Control characters 5.2 Control characters
Control characters (or characters with control function) cannot be seen Control characters (or characters with control function) cannot be seen
and can cause unpredictable results when displayed. and can cause unpredictable results when displayed. Note that additional
control characters (U+0000 through U+001F, and U+007F) are prohibited in
IDNA.
0000-001F; [CONTROL CHARACTERS]
007F; DELETE
0080-009F; [CONTROL CHARACTERS] 0080-009F; [CONTROL CHARACTERS]
070F; SYRIAC ABBREVIATION MARK 070F; SYRIAC ABBREVIATION MARK
180E; MONGOLIAN VOWEL SEPARATOR 180E; MONGOLIAN VOWEL SEPARATOR
2028; LINE SEPARATOR 2028; LINE SEPARATOR
2029; PARAGRAPH SEPARATOR 2029; PARAGRAPH SEPARATOR
206A-206F; [CONTROL CHARACTERS] 206A-206F; [CONTROL CHARACTERS]
FFF9-FFFC; [CONTROL CHARACTERS] FFF9-FFFC; [CONTROL CHARACTERS]
1D173-1D17A; [MUSICAL CONTROL CHARACTERS] 1D173-1D17A; [MUSICAL CONTROL CHARACTERS]
5.3 Private use and replacement characters 5.3 Private use and replacement characters
skipping to change at line 259 skipping to change at line 276
5.5 Surrogate codes 5.5 Surrogate codes
The following code points are permanently reserved for use as surrogate The following code points are permanently reserved for use as surrogate
code values in the UTF-16 encoding, will never be assigned to code values in the UTF-16 encoding, will never be assigned to
characters, and are therefore prohibited: characters, and are therefore prohibited:
D800-DFFF; [SURROGATE CODES] D800-DFFF; [SURROGATE CODES]
5.6 Inappropriate for plain text 5.6 Inappropriate for plain text
The following characters should not appear in regular text. The following characters do not appear in regular text.
FFF9; INTERLINEAR ANNOTATION ANCHOR FFF9; INTERLINEAR ANNOTATION ANCHOR
FFFA; INTERLINEAR ANNOTATION SEPARATOR FFFA; INTERLINEAR ANNOTATION SEPARATOR
FFFB; INTERLINEAR ANNOTATION TERMINATOR FFFB; INTERLINEAR ANNOTATION TERMINATOR
FFFC; OBJECT REPLACEMENT CHARACTER FFFC; OBJECT REPLACEMENT CHARACTER
5.7 Inappropriate for canonical representation 5.7 Inappropriate for canonical representation
The ideographic description characters allow different sequences of The ideographic description characters allow different sequences of
characters to be rendered the same way, which makes them inappropriate characters to be rendered the same way, which makes them inappropriate
for host names that must have a single canonical representation. for domain names that have to have a single canonical representation.
2FF0-2FFB; [IDEOGRAPHIC DESCRIPTION CHARACTERS] 2FF0-2FFB; [IDEOGRAPHIC DESCRIPTION CHARACTERS]
5.8 Change display properties 5.8 Change display properties
The following characters, some of which are deprecated in ISO/IEC 10646, The following characters, some of which are deprecated in ISO/IEC 10646,
can cause changes in display or the order in which characters appear can cause changes in display or the order in which characters appear
when rendered. when rendered.
200E; LEFT-TO-RIGHT MARK 200E; LEFT-TO-RIGHT MARK
skipping to change at line 299 skipping to change at line 316
206C; INHIBIT ARABIC FORM SHAPING 206C; INHIBIT ARABIC FORM SHAPING
206D; ACTIVATE ARABIC FORM SHAPING 206D; ACTIVATE ARABIC FORM SHAPING
206E; NATIONAL DIGIT SHAPES 206E; NATIONAL DIGIT SHAPES
206F; NOMINAL DIGIT SHAPES 206F; NOMINAL DIGIT SHAPES
5.9 Inappropriate characters from common input mechanisms 5.9 Inappropriate characters from common input mechanisms
U+3002 is used as if it were U+002E in many input mechanisms, U+3002 is used as if it were U+002E in many input mechanisms,
particularly in Asia. This prohibition allows input mechanisms to safely particularly in Asia. This prohibition allows input mechanisms to safely
map U+3002 to U+002E before doing stringprep without worrying about map U+3002 to U+002E before doing stringprep without worrying about
preventing users from accessing legitimate host name parts. preventing users from accessing legitimate domain name labels.
3002; IDEOGRAPHIC FULL STOP 3002; IDEOGRAPHIC FULL STOP
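The input-side mapping that this prohibition makes safe might look like the sketch below. It assumes the application splits a name into labels on U+002E before preparing each label; the function name split_labels is hypothetical and is not defined by this profile.

```python
IDEOGRAPHIC_FULL_STOP = "\u3002"


def split_labels(name: str):
    """Treat U+3002 as U+002E, then split the name into labels."""
    return name.replace(IDEOGRAPHIC_FULL_STOP, ".").split(".")


if __name__ == "__main__":
    # A name typed with an ideographic full stop yields the same
    # labels as one typed with an ASCII full stop.
    print(split_labels("example\u3002com"))
```

Since U+3002 can never appear in a prepared label, this replacement cannot hide a legitimate label from the user.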
5.10 Tagging characters 5.10 Tagging characters
The following characters are used for tagging text and are invisible. The following characters are used for tagging text and are invisible.
E0001; LANGUAGE TAG E0001; LANGUAGE TAG
E0020-E007F; [TAGGING CHARACTERS] E0020-E007F; [TAGGING CHARACTERS]
6. Unassigned Code Points in Internationalized Host Names 6. Unassigned Code Points in Internationalized Domain Names
This profile lists the unassigned code points for Unicode 3.1 in This profile lists the unassigned code points in the
range 0 to 10FFFF for Unicode 3.1 in
Appendix F. The list in Appendix F MUST be used by implementations of Appendix F. The list in Appendix F MUST be used by implementations of
this specification. If there are any discrepancies between the list in this specification. If there are any discrepancies between the list in
Appendix F and the Unicode 3.1 specification, the list Appendix F always Appendix F and the Unicode 3.1 specification, the list in Appendix F
takes precedence. always takes precedence.
7. Security Considerations 7. Security Considerations
ISO/IEC 10646 has many characters that look similar. In many cases, The Unicode and ISO/IEC 10646 repertoires have many characters that look
users of security protocols might do visual matching, such as when similar. In many cases, users of security protocols might do visual
comparing the names of trusted third parties. This profile does nothing matching, such as when comparing the names of trusted third parties.
to map similar-looking characters together. This profile does nothing to map similar-looking characters together nor
to prohibit some characters because they look like others.
Much of the security of the Internet relies on the DNS. Thus, any change Security on the Internet partly relies on the DNS. Thus, any change
to the characteristics of the DNS can change the security of much of the to the characteristics of the DNS can change the security of much of the
Internet. Internet.
Host names are used by users to connect to Internet servers. The Domain names are used by users to connect to Internet servers. The
security of the Internet would be compromised if a user entering a security of the Internet would be compromised if a user entering a
single internationalized name could be connected to different servers single internationalized name could be connected to different servers
based on different interpretations of the internationalized host name. based on different interpretations of the internationalized domain name.
Current applications may assume that the characters allowed in host Current applications might assume that the characters allowed in domain
names will always be the same as they are in [STD13]. This document names will always be the same as they are in [STD13]. This document
vastly increases the number of characters available in host names. Every vastly increases the number of characters available in domain names.
program that uses "special" characters in conjunction with host names Every program that uses "special" characters in conjunction with domain
may be vulnerable to attack based on the new characters allowed by this names may be vulnerable to attack based on the new characters allowed by
specification. this specification.
8. References 8. References
[CharModel] Unicode Technical Report;17, Character Encoding Model.
<http://www.unicode.org/unicode/reports/tr17/>.
[Glossary] Unicode Glossary, <http://www.unicode.org/glossary/>. [Glossary] Unicode Glossary, <http://www.unicode.org/glossary/>.
[IDNA] Patrik Faltstrom, Paul Hoffman, and Adam M. Costello,
"Internationalizing Domain Names in Applications (IDNA)",
draft-ietf-idn-idna, work-in-progress.
[ISO10646] ISO/IEC 10646-1:2000. International Standard -- Information [ISO10646] ISO/IEC 10646-1:2000. International Standard -- Information
technology -- Universal Multiple-Octet Coded Character Set (UCS) -- Part technology -- Universal Multiple-Octet Coded Character Set (UCS) -- Part
1: Architecture and Basic Multilingual Plane. 1: Architecture and Basic Multilingual Plane.
[RFC2119] Scott Bradner, "Key words for use in RFCs to Indicate [RFC2119] Scott Bradner, "Key words for use in RFCs to Indicate
Requirement Levels", March 1997, RFC 2119. Requirement Levels", March 1997, RFC 2119.
[STD13] Paul Mockapetris, "Domain names - concepts and facilities" (RFC [STD13] Paul Mockapetris, "Domain names - concepts and facilities" (RFC
1034) and "Domain names - implementation and specification" (RFC 1035), 1034) and "Domain names - implementation and specification" (RFC 1035),
STD 13, November 1987. STD 13, November 1987.
[STRINGPREP] Paul Hoffman and Marc Blanchet, "Preparation of [STRINGPREP] Paul Hoffman and Marc Blanchet, "Preparation of
Internationalized Strings ("stringprep")", draft-hoffman-stringprep, Internationalized Strings ("stringprep")", draft-hoffman-stringprep,
work in progress work in progress.
[Unicode3.1] The Unicode Standard, Version 3.1.0: The Unicode [Unicode3.1] The Unicode Standard, Version 3.1.0: The Unicode
Consortium. The Unicode Standard, Version 3.0. Reading, MA, Consortium. The Unicode Standard, Version 3.0. Reading, MA,
Addison-Wesley Developers Press, 2000. ISBN 0-201-61633-5, as amended Addison-Wesley Developers Press, 2000. ISBN 0-201-61633-5, as amended
by: Unicode Standard Annex #27: Unicode 3.1 by: Unicode Standard Annex #27: Unicode 3.1
<http://www.unicode.org/unicode/reports/tr27/tr27-4.html>. <http://www.unicode.org/unicode/reports/tr27/tr27-4.html>.
[URI] For example: Roy Fielding et al., "Uniform Resource Identifiers:
Generic Syntax", August 1998, RFC 2396; Robert Hinden et. al, "IPv6
Literal Addresses in URL's", December 1999, RFC 2732. Note that
there are many other RFCs that define additional URI schemes.
[UAX15] Mark Davis and Martin Duerst. Unicode Standard Annex #15: [UAX15] Mark Davis and Martin Duerst. Unicode Standard Annex #15:
Unicode Normalization Forms, Version 3.1.0. Unicode Normalization Forms, Version 3.1.0.
<http://www.unicode.org/unicode/reports/tr15/tr15-21.html> <http://www.unicode.org/unicode/reports/tr15/tr15-21.html>.
[UTR21] Mark Davis. Case Mappings. Unicode Technical Report;21. [UTR21] Mark Davis. Case Mappings. Unicode Technical Report;21.
<http://www.unicode.org/unicode/reports/tr21/>. <http://www.unicode.org/unicode/reports/tr21/>.
9. Differences Between -06 and -07 Drafts
5: Removed 5.1 (currently-used ASCII characters) and renumbered
the entire section.
E: Removed the characters that appeared in the old 5.1.
A. Acknowledgements A. Acknowledgements
Many people from the IETF IDN Working Group and the Unicode Technical Many people from the IETF IDN Working Group and the Unicode Technical
Committee contributed ideas that went into the first draft of this Committee contributed ideas that went into the first draft of this
document. document.
The IDN nameprep design team made many useful changes to the first The IDN nameprep design team made many useful changes to the first
draft. That team and its advisors include: draft. That team and its advisors include:
Asmus Freytag Asmus Freytag
skipping to change at line 437 skipping to change at line 445
Marc Blanchet Marc Blanchet
Viagenie inc. Viagenie inc.
2875 boul. Laurier, bur. 300 2875 boul. Laurier, bur. 300
Ste-Foy, Quebec, Canada, G1V 2M2 Ste-Foy, Quebec, Canada, G1V 2M2
Marc.Blanchet@viagenie.qc.ca Marc.Blanchet@viagenie.qc.ca
D. Mapping Tables D. Mapping Tables
The following is the mapping table from Section 3. The table has three The following is the mapping table from Section 3. The table has three
columns: columns:
- the character that is mapped from - the code point that is mapped from
- the zero or more characters that it is mapped to - the zero or more code points that it is mapped to
- the reason for the mapping - the reason for the mapping
The columns are separated by semicolons. Note that the second column may The columns are separated by semicolons. Note that the second column may
be empty, or it may have one character, or it may have more than one be empty, or it may have one code point, or it may have more than one
character, with each character separated by a space. code point, with each code point separated by a space.
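A reader for the three-column format just described could be sketched as below. The helper names are hypothetical, every data row is assumed to have exactly three fields, and the sample rows for U+200C and U+00DF are illustrative stand-ins for the "empty" and "more than one code point" cases rather than rows quoted from this excerpt.

```python
def parse_mapping_table(lines):
    """Build {source code point: replacement string} from table rows."""
    table = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("-----"):
            continue  # skip blank lines and the start/end markers
        source, targets, _reason = (f.strip() for f in line.split(";", 2))
        # The second column holds zero or more space-separated code points.
        table[int(source, 16)] = "".join(
            chr(int(cp, 16)) for cp in targets.split()
        )
    return table


def apply_mapping(label, table):
    """Replace each code point that has a table entry; keep the rest."""
    return "".join(table.get(ord(ch), ch) for ch in label)


SAMPLE = [
    "----- Start Mapping Table -----",
    "0041; 0061; Case map",
    "1D7BB; 03C3; Additional folding",
    "200C; ; Map out",            # illustrative: empty second column
    "00DF; 0073 0073; Case map",  # illustrative: two target code points
    "----- End Mapping Table -----",
]

if __name__ == "__main__":
    table = parse_mapping_table(SAMPLE)
    print(apply_mapping("A\u200c\u00df", table))
```

Keeping the replacement as a string lets one lookup cover deletion, one-to-one mapping, and one-to-many mapping.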
----- Start Mapping Table ----- ----- Start Mapping Table -----
0041; 0061; Case map 0041; 0061; Case map
0042; 0062; Case map 0042; 0062; Case map
0043; 0063; Case map 0043; 0063; Case map
0044; 0064; Case map 0044; 0064; Case map
0045; 0065; Case map 0045; 0065; Case map
0046; 0066; Case map 0046; 0066; Case map
0047; 0067; Case map 0047; 0067; Case map
0048; 0068; Case map 0048; 0068; Case map
skipping to change at line 1814 skipping to change at line 1822
1D7A5; 03C6; Additional folding 1D7A5; 03C6; Additional folding
1D7A6; 03C7; Additional folding 1D7A6; 03C7; Additional folding
1D7A7; 03C8; Additional folding 1D7A7; 03C8; Additional folding
1D7A8; 03C9; Additional folding 1D7A8; 03C9; Additional folding
1D7BB; 03C3; Additional folding 1D7BB; 03C3; Additional folding
----- End Mapping Table ----- ----- End Mapping Table -----
E. Prohibited Code Point List E. Prohibited Code Point List
----- Start Prohibited Table ----- ----- Start Prohibited Table -----
0000-0020 0080-00A0
007F
0080-009F
00A0
070F 070F
1680 1680
180E 180E
2000 2000-200A
2001 200E-200F
2002 2028-2029
2003 202A-202F
2004
2005
2006
2007
2008
2009
200A
200E
200F
2028
2029
202A
202B
202C
202D
202E
202F
206A-206F 206A-206F
2FF0-2FFB 2FF0-2FFB
3000 3000
3002 3002
D800-DFFF D800-F8FF
E000-F8FF
FDD0-FDEF FDD0-FDEF
FFF9-FFFC FFF9-FFFF
FFFD
FFFE-FFFF
1D173-1D17A 1D173-1D17A
1FFFE-1FFFF 1FFFE-1FFFF
2FFFE-2FFFF 2FFFE-2FFFF
3FFFE-3FFFF 3FFFE-3FFFF
4FFFE-4FFFF 4FFFE-4FFFF
5FFFE-5FFFF 5FFFE-5FFFF
6FFFE-6FFFF 6FFFE-6FFFF
7FFFE-7FFFF 7FFFE-7FFFF
8FFFE-8FFFF 8FFFE-8FFFF
9FFFE-9FFFF 9FFFE-9FFFF
AFFFE-AFFFF AFFFE-AFFFF
BFFFE-BFFFF BFFFE-BFFFF
CFFFE-CFFFF CFFFE-CFFFF
DFFFE-DFFFF DFFFE-DFFFF
E0001 E0001
E0020-E007F E0020-E007F
EFFFE-EFFFF EFFFE-EFFFF
F0000-FFFFD F0000-FFFFF
FFFFE-FFFFF 100000-10FFFF
100000-10FFFD
10FFFE-10FFFF
----- End Prohibited Table ----- ----- End Prohibited Table -----
NOTE WELL: Software that follows this specification that will be used to NOTE WELL: Software that follows this specification that will be used to
check names before they are put in authoritative name servers MUST add check names before they are put in authoritative name servers MUST add
all unassigned code pints to the list of characters that are prohibited. all unassigned code points to the list of characters that are prohibited.
See Section 6 of [STRINGPREP] for more details. See Section 6 of [STRINGPREP] for more details.
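Checking a prepared label against the prohibited table can be sketched as below. The helper names are hypothetical, and the sample holds only a few entries copied from the -08 column of the table above, so it is not the complete list.

```python
def parse_prohibited_table(lines):
    """Turn "XXXX" and "XXXX-YYYY" entries into inclusive ranges."""
    ranges = []
    for line in lines:
        entry = line.strip()
        if not entry or entry.startswith("-----"):
            continue  # skip blank lines and the start/end markers
        first, _, last = entry.partition("-")
        low = int(first, 16)
        high = int(last, 16) if last else low
        ranges.append((low, high))
    return ranges


def first_prohibited(label, ranges):
    """Return the first prohibited code point in label, or None."""
    for ch in label:
        cp = ord(ch)
        if any(low <= cp <= high for low, high in ranges):
            return cp
    return None


SAMPLE = ["0080-00A0", "2000-200A", "3002", "D800-F8FF"]

if __name__ == "__main__":
    ranges = parse_prohibited_table(SAMPLE)
    hit = first_prohibited("ab\u3002c", ranges)
    print(None if hit is None else f"{hit:04X}")
```

An implementation that checks names for authoritative servers would, per the note above, also add the unassigned code points from Appendix F to these ranges.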
F. Unassigned Code Point List F. Unassigned Code Point List
----- Start Unassigned Table ----- ----- Start Unassigned Table -----
0220-0221 0220-0221
0234-024F 0234-024F
02AE-02AF 02AE-02AF
02EF-02FF 02EF-02FF
034F-035F 034F-035F
 End of changes. 46 change blocks. 
110 lines changed or deleted 92 lines changed or added
