draft-ietf-eai-rfc5335bis-13.txt   rfc6532.txt 
Email Address Internationalization A. Yang Internet Engineering Task Force (IETF) A. Yang
(EAI) TWNIC Request for Comments: 6532 TWNIC
Internet-Draft S. Steele Obsoletes: 5335 S. Steele
Obsoletes: 5335 (if approved) Microsoft Updates: 2045 Microsoft
Updates: 2045 (if approved) N. Freed Category: Standards Track N. Freed
Intended status: Standards Track Oracle ISSN: 2070-1721 Oracle
Expires: April 23, 2012 October 21, 2011 February 2012
Internationalized Email Headers Internationalized Email Headers
draft-ietf-eai-rfc5335bis-13
Abstract Abstract
Internet mail was originally limited to 7-bit ASCII. MIME added Internet mail was originally limited to 7-bit ASCII. MIME added
support for the use of 8-bit character sets in body parts, and also support for the use of 8-bit character sets in body parts, and also
defined an encoded-word construct so other character sets could be defined an encoded-word construct so other character sets could be
used in certain header field values. But full internationalization used in certain header field values. However, full
of electronic mail requires additional enhancements to allow the use internationalization of electronic mail requires additional
of Unicode, including characters outside the ASCII repertoire, in enhancements to allow the use of Unicode, including characters
mail addresses as well as direct use of Unicode in header fields like outside the ASCII repertoire, in mail addresses as well as direct use
From:, To:, and Subject:, without requiring the use of complex of Unicode in header fields like "From:", "To:", and "Subject:",
encoded-word constructs. This document specifies an enhancement to without requiring the use of complex encoded-word constructs. This
the Internet Message Format and to MIME that allows use of Unicode in document specifies an enhancement to the Internet Message Format and
mail addresses and most header field content. to MIME that allows use of Unicode in mail addresses and most header
field content.
This specification replaces RFC 5335. This specification also This specification updates Section 6.4 of RFC 2045 to eliminate the
updates Section 6.4 of RFC 2045 to eliminate the restriction restriction prohibiting the use of non-identity content-transfer-
prohibiting the use of non-identity content-transfer-encodings on encodings on subtypes of "message/".
subtypes of "message/".
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This is an Internet Standards Track document.
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months This document is a product of the Internet Engineering Task Force
and may be updated, replaced, or obsoleted by other documents at any (IETF). It represents the consensus of the IETF community. It has
time. It is inappropriate to use Internet-Drafts as reference received public review and has been approved for publication by the
material or to cite them other than as "work in progress." Internet Engineering Steering Group (IESG). Further information on
Internet Standards is available in Section 2 of RFC 5741.
This Internet-Draft will expire on April 23, 2012. Information about the current status of this document, any errata,
and how to provide feedback on it may be obtained at
http://www.rfc-editor.org/info/rfc6532.
Copyright Notice Copyright Notice
Copyright (c) 2011 IETF Trust and the persons identified as the Copyright (c) 2012 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Terminology Used In This Specification . . . . . . . . . . . . 3 2. Terminology Used in This Specification . . . . . . . . . . . . 3
3. Changes to Message Header Fields . . . . . . . . . . . . . . . 4 3. Changes to Message Header Fields . . . . . . . . . . . . . . . 4
3.1. UTF-8 Syntax and Normalization . . . . . . . . . . . . . . 4 3.1. UTF-8 Syntax and Normalization . . . . . . . . . . . . . . 4
3.2. Syntax Extensions to RFC 5322 . . . . . . . . . . . . . . 5 3.2. Syntax Extensions to RFC 5322 . . . . . . . . . . . . . . 5
3.3. Use of 8-bit UTF-8 in Message-Ids . . . . . . . . . . . . 5 3.3. Use of 8-bit UTF-8 in Message-IDs . . . . . . . . . . . . 5
3.4. Effects on Line Length Limits . . . . . . . . . . . . . . 6 3.4. Effects on Line Length Limits . . . . . . . . . . . . . . 5
3.5. Changes to MIME Message Type Encoding Restrictions . . . . 6 3.5. Changes to MIME Message Type Encoding Restrictions . . . . 6
3.6. Use of MIME Encoded-Words . . . . . . . . . . . . . . . . 6 3.6. Use of MIME Encoded-Words . . . . . . . . . . . . . . . . 6
3.7. The Message/global Media Type . . . . . . . . . . . . . . 7 3.7. The message/global Media Type . . . . . . . . . . . . . . 7
4. Security Considerations . . . . . . . . . . . . . . . . . . . 8 4. Security Considerations . . . . . . . . . . . . . . . . . . . 8
5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 9 5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 9
6. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 9 6. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 9
7. Edit history . . . . . . . . . . . . . . . . . . . . . . . . . 9 7. References . . . . . . . . . . . . . . . . . . . . . . . . . . 10
7.1. draft-ietf-eai-rfc5335bis-00 . . . . . . . . . . . . . . . 9 7.1. Normative References . . . . . . . . . . . . . . . . . . . 10
7.2. draft-ietf-eai-rfc5335bis-01 . . . . . . . . . . . . . . . 10 7.2. Informative References . . . . . . . . . . . . . . . . . . 10
7.3. draft-ietf-eai-rfc5335bis-02 . . . . . . . . . . . . . . . 10
7.4. draft-ietf-eai-rfc5335bis-03 . . . . . . . . . . . . . . . 10
7.5. draft-ietf-eai-rfc5335bis-04 . . . . . . . . . . . . . . . 10
7.6. draft-ietf-eai-rfc5335bis-05 . . . . . . . . . . . . . . . 10
7.7. draft-ietf-eai-rfc5335bis-06 . . . . . . . . . . . . . . . 10
7.8. draft-ietf-eai-rfc5335bis-07 . . . . . . . . . . . . . . . 10
7.9. draft-ietf-eai-rfc5335bis-09 . . . . . . . . . . . . . . . 11
7.10. draft-ietf-eai-rfc5335bis-10 . . . . . . . . . . . . . . . 11
7.11. draft-ietf-eai-rfc5335bis-11 . . . . . . . . . . . . . . . 11
7.12. draft-ietf-eai-rfc5335bis-12 . . . . . . . . . . . . . . . 11
7.13. draft-ietf-eai-rfc5335bis-13 . . . . . . . . . . . . . . . 11
8. References . . . . . . . . . . . . . . . . . . . . . . . . . . 12
8.1. Normative References . . . . . . . . . . . . . . . . . . . 12
8.2. Informative References . . . . . . . . . . . . . . . . . . 13
1. Introduction 1. Introduction
Internet mail distinguishes a message from its transport and further Internet mail distinguishes a message from its transport and further
divides a message between a header and a body [RFC5322]. Internet divides a message between a header and a body [RFC5322]. Internet
mail header field values contain a variety of strings that are mail header field values contain a variety of strings that are
intended to be user-visible. The range of supported characters for intended to be user-visible. The range of supported characters for
these strings was originally limited to 7-bit [ASCII]. MIME these strings was originally limited to [ASCII] in 7-bit form. MIME
[RFC2045] [RFC2046] [RFC2047] provides the ability to use additional [RFC2045] [RFC2046] [RFC2047] provides the ability to use additional
character sets, but this support is limited to body part data and to character sets, but this support is limited to body part data and to
special encoded-word constructs that were only allowed in a limited special encoded-word constructs that were only allowed in a limited
number of places in header field values. number of places in header field values.
Globalization of the Internet requires support of the much larger set Globalization of the Internet requires support of the much larger set
of characters provided by Unicode [RFC5198] in both mail addresses of characters provided by Unicode [RFC5198] in both mail addresses
and most header field values. Additionally, complex encoding schemes and most header field values. Additionally, complex encoding schemes
like encoded-words introduce inefficiencies as well as significant like encoded-words introduce inefficiencies as well as significant
opportunities for processing errors. And finally, native support for opportunities for processing errors. And finally, native support for
the UTF-8 charset is now available on most systems. Hence it is the UTF-8 charset is now available on most systems. Hence, it is
strongly desirable for Internet mail to support UTF-8 [RFC3629] strongly desirable for Internet mail to support UTF-8 [RFC3629]
directly. directly.
This document specifies an enhancement to the Internet Message Format This document specifies an enhancement to the Internet Message Format
[RFC5322] and to MIME that permits the direct use of UTF-8, rather [RFC5322] and to MIME that permits the direct use of UTF-8, rather
than only ASCII, in header field values, including mail addresses. A than only ASCII, in header field values, including mail addresses. A
new media type, message/global, is defined for messages that use this new media type, message/global, is defined for messages that use this
extended format. This specification also lifts the MIME restriction extended format. This specification also lifts the MIME restriction
on having non-identity content-transfer-encodings on any subtype of on having non-identity content-transfer-encodings on any subtype of
the message top-level type so that message/global parts can be safely the message top-level type so that message/global parts can be safely
transmitted across existing mail infrastructure. transmitted across existing mail infrastructure.
This specification is based on a model of native, end-to-end support This specification is based on a model of native, end-to-end support
for UTF-8, which depends on having an "8-bit clean" environment for UTF-8, which depends on having an "8-bit-clean" environment
assured by the transport system. Support for carriage across legacy, assured by the transport system. Support for carriage across legacy,
7-bit infrastructure and for processing by 7-bit receivers requires 7-bit infrastructure and for processing by 7-bit receivers requires
additional mechanisms that are not provided by these specifications. additional mechanisms that are not provided by these specifications.
This specification is a revision of and replacement for [RFC5335]. This specification is a revision of and replacement for [RFC5335].
Section 6 of [I-D.ietf-eai-frmwrk-4952bis] describes the change in Section 6 of [RFC6530] describes the change in approach between this
approach between this specification and the previous version. specification and the previous version.
2. Terminology Used In This Specification 2. Terminology Used in This Specification
A plain ASCII string is fully compatible with [RFC5321] and A plain ASCII string is fully compatible with [RFC5321] and
[RFC5322]. In this document, non-ASCII strings are UTF-8 strings if [RFC5322]. In this document, non-ASCII strings are UTF-8 strings if
they are in header field values which contain at least one <UTF8-non- they are in header field values that contain at least one
ascii> (see Section 3.1). <UTF8-non-ascii> (see Section 3.1).
Unless otherwise noted, all terms used here are defined in [RFC5321], Unless otherwise noted, all terms used here are defined in [RFC5321],
[RFC5322], [RFC6530], or [RFC6531].
[RFC5322], [I-D.ietf-eai-frmwrk-4952bis], or
[I-D.ietf-eai-rfc5336bis].
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119]. document are to be interpreted as described in [RFC2119].
The term "8-bit" means octets are present in the data with values The term "8-bit" means octets are present in the data with values
above 0x7F. above 0x7F.
3. Changes to Message Header Fields 3. Changes to Message Header Fields
To permit Unicode characters in field values, the header definition To permit non-ASCII Unicode characters in field values, the header
in [RFC5322] is extended to support the new format. The following definition in [RFC5322] is extended to support the new format. The
sections specify the necessary changes to RFC 5322's ABNF. following sections specify the necessary changes to RFC 5322's ABNF.
The syntax rules not mentioned below remain defined as in [RFC5322]. The syntax rules not mentioned below remain defined as in [RFC5322].
Note that this protocol does not change RFC 5322 rules for defining Note that this protocol does not change rules in RFC 5322 for
header field names. The bodies of header fields are allowed to defining header field names. The bodies of header fields are allowed
contain Unicode characters, but the header field names themselves to contain Unicode characters, but the header field names themselves
must contain only ASCII characters. must consist of ASCII characters only.
Also note that messages in this format require the use of the Also note that messages in this format require the use of the
UTF8SMTPbis extension [I-D.ietf-eai-rfc5336bis] to be transferred via SMTPUTF8 extension [RFC6531] to be transferred via SMTP.
SMTP.
3.1. UTF-8 Syntax and Normalization 3.1. UTF-8 Syntax and Normalization
UTF-8 characters can be defined in terms of octets using the UTF-8 characters can be defined in terms of octets using the
following ABNF [RFC5234], taken from [RFC3629]: following ABNF [RFC5234], taken from [RFC3629]:
UTF8-non-ascii = UTF8-2 / UTF8-3 / UTF8-4 UTF8-non-ascii = UTF8-2 / UTF8-3 / UTF8-4
UTF8-2 = <Defined in Section 4 of RFC3629> UTF8-2 = <Defined in Section 4 of RFC3629>
UTF8-3 = <Defined in Section 4 of RFC3629> UTF8-3 = <Defined in Section 4 of RFC3629>
UTF8-4 = <Defined in Section 4 of RFC3629> UTF8-4 = <Defined in Section 4 of RFC3629>
See [RFC5198] for a discussion of Unicode normalization; See [RFC5198] for a discussion of Unicode normalization;
normalization form NFC [UNF] SHOULD be used. Actually, if one is normalization form NFC [UNF] SHOULD be used. Actually, if one is
going to do internationalization properly, one of the most often- going to do internationalization properly, one of the most often
cited goals is to permit people to spell their names correctly. cited goals is to permit people to spell their names correctly.
Since many mailbox local parts reflect personal names, that principle Since many mailbox local parts reflect personal names, that principle
applies to mailboxes as well. The NFKC normalization form [UNF] applies to mailboxes as well. The NFKC normalization form [UNF]
SHOULD NOT be used because it may lose information that is needed to SHOULD NOT be used because it may lose information that is needed to
correctly spell some names in some unusual circumstances. correctly spell some names in some unusual circumstances.
3.2. Syntax Extensions to RFC 5322 3.2. Syntax Extensions to RFC 5322
The following rules extend the ABNF syntax defined in [RFC5322] and The following rules extend the ABNF syntax defined in [RFC5322] and
[RFC5234] in order to allow UTF-8 content. [RFC5234] in order to allow UTF-8 content.
VCHAR =/ UTF8-non-ascii VCHAR =/ UTF8-non-ascii
ctext =/ UTF8-non-ascii ctext =/ UTF8-non-ascii
atext =/ UTF8-non-ascii atext =/ UTF8-non-ascii
qtext =/ UTF8-non-ascii qtext =/ UTF8-non-ascii
text =/ UTF8-non-ascii text =/ UTF8-non-ascii
; note that this upgrades the body to UTF-8 ; note that this upgrades the body to UTF-8
dtext =/ UTF8-non-ascii dtext =/ UTF8-non-ascii
The preceding changes mean that the following constructs now allow The preceding changes mean that the following constructs now allow
UTF-8: UTF-8:
1. Unstructured text, used in header fields like Subject: or 1. Unstructured text, used in header fields like "Subject:" or
Content-description:. "Content-description:".
2. Any construct that uses atoms, including but not limited to the 2. Any construct that uses atoms, including but not limited to the
local parts of addresses and message-ids. This includes local parts of addresses and Message-IDs. This includes
addresses in the "for" clauses of Received: header fields. addresses in the "for" clauses of "Received:" header fields.
3. Quoted strings. 3. Quoted strings.
4. Domains. 4. Domains.
Note that header field names are not on this list; these are still Note that header field names are not on this list; these are still
restricted to ASCII. restricted to ASCII.
3.3. Use of 8-bit UTF-8 in Message-Ids 3.3. Use of 8-bit UTF-8 in Message-IDs
Implementers of message-id generation algorithms MAY prefer to Implementers of Message-ID generation algorithms MAY prefer to
restrain their output to ASCII since that has some advantages, such restrain their output to ASCII since that has some advantages, such
as when constructing In-reply-to: and References: header fields in as when constructing "In-reply-to:" and "References:" header fields
mailing-list threads where some senders use EAI and others not. in mailing-list threads where some senders use internationalized
addresses and others do not.
3.4. Effects on Line Length Limits 3.4. Effects on Line Length Limits
Section 2.1.1 of [RFC5322] limits lines to 998 characters and Section 2.1.1 of [RFC5322] limits lines to 998 characters and
recommends that the lines be restricted to only 78 characters. This recommends that the lines be restricted to only 78 characters. This
specification changes the former limit to 998 octets. (Note that in specification changes the former limit to 998 octets. (Note that, in
ASCII octets and characters are effectively the same but this is not ASCII, octets and characters are effectively the same, but this is
true in UTF-8.) The 78 character limit remains defined in terms of not true in UTF-8.) The 78-character limit remains defined in terms
characters, not octets, since it is intended to address display width of characters, not octets, since it is intended to address display
issues, not line length issues. width issues, not line-length issues.
3.5. Changes to MIME Message Type Encoding Restrictions 3.5. Changes to MIME Message Type Encoding Restrictions
This specification updates Section 6.4 of [RFC2045]. [RFC2045] This specification updates Section 6.4 of [RFC2045]. [RFC2045]
prohibits applying a content-transfer-encoding to any subtypes of prohibits applying a content-transfer-encoding to any subtypes of
"message/". This specification relaxes that rule -- it allows newly "message/". This specification relaxes that rule -- it allows newly
defined MIME types to permit content-transfer-encoding, and it allows defined MIME types to permit content-transfer-encoding, and it allows
content-transfer-encoding for message/global (see Section 3.7). content-transfer-encoding for message/global (see Section 3.7).
Background: Normally, transfer of message/global will be done in Background: Normally, transfer of message/global will be done in
8-bit-clean channels, and body parts will have "identity" encodings, 8-bit-clean channels, and body parts will have "identity" encodings,
that is, no decoding is necessary. that is, no decoding is necessary.
But in the case where a message containing a message/global is But in the case where a message containing a message/global is
downgraded from 8-bit to 7-bit as described in [RFC6152], an encoding downgraded from 8-bit to 7-bit as described in [RFC6152], an encoding
might have to be applied to the message; if the message travels might have to be applied to the message. If the message travels
multiple times between a 7-bit environment and an environment multiple times between a 7-bit environment and an environment
implementing these extensions, multiple levels of encoding may occur. implementing these extensions, multiple levels of encoding may occur.
This is expected to be rarely seen in practice, and the potential This is expected to be rarely seen in practice, and the potential
complexity of other ways of dealing with the issue is thought to be complexity of other ways of dealing with the issue is thought to be
larger than the complexity of allowing nested encodings where larger than the complexity of allowing nested encodings where
necessary. necessary.
3.6. Use of MIME Encoded-Words 3.6. Use of MIME Encoded-Words
The MIME encoded-words facility [RFC2047] provides the ability to The MIME encoded-words facility [RFC2047] provides the ability to
skipping to change at page 7, line 5 skipping to change at page 7, line 5
encoded-words SHOULD NOT be used when generating header fields for encoded-words SHOULD NOT be used when generating header fields for
messages employing this extension. Agents MAY, when incorporating messages employing this extension. Agents MAY, when incorporating
material from another message, convert encoded-word use to direct use material from another message, convert encoded-word use to direct use
of UTF-8. of UTF-8.
Note that care must be taken when decoding encoded-words because the Note that care must be taken when decoding encoded-words because the
results after replacing an encoded-word with its decoded equivalent results after replacing an encoded-word with its decoded equivalent
in UTF-8 may be syntactically invalid. Processors that elect to in UTF-8 may be syntactically invalid. Processors that elect to
decode encoded-words MUST NOT generate syntactically invalid fields. decode encoded-words MUST NOT generate syntactically invalid fields.
3.7. The Message/global Media Type 3.7. The message/global Media Type
Internationalized messages in this format MUST only be transmitted as Internationalized messages in this format MUST only be transmitted as
authorized by [I-D.ietf-eai-rfc5336bis] or within a non-SMTP authorized by [RFC6531] or within a non-SMTP environment that
environment that supports these messages. A message is a "message/ supports these messages. A message is a "message/global message" if:
global message" if:
o it contains 8-bit UTF-8 header values as specified in this o it contains 8-bit UTF-8 header values as specified in this
document, or document, or
o it contains 8-bit UTF-8 values in the header fields of body parts. o it contains 8-bit UTF-8 values in the header fields of body parts.
The content of a message/global part is otherwise identical to that The content of a message/global part is otherwise identical to that
of a message/rfc822 part. of a message/rfc822 part.
If an object of this type is sent to a 7-bit-only system, it MUST If an object of this type is sent to a 7-bit-only system, it MUST
have an appropriate content-transfer-encoding applied. (Note that a have an appropriate content-transfer-encoding applied. (Note that a
system compliant with MIME that doesn't recognize message/global is system compliant with MIME that doesn't recognize message/global is
supposed to treat it as "application/octet-stream" as described in supposed to treat it as "application/octet-stream" as described in
Section 5.2.4 of [RFC2046].) Section 5.2.4 of [RFC2046].)
The registration is as follows:
Type name: message Type name: message
Subtype name: global Subtype name: global
Required parameters: none Required parameters: none
Optional parameters: none Optional parameters: none
Encoding considerations: Any content-transfer-encoding is permitted. Encoding considerations: Any content-transfer-encoding is permitted.
The 8-bit or binary content-transfer-encodings are recommended The 8-bit or binary content-transfer-encodings are recommended
where permitted. where permitted.
Security considerations: See Section 4. Security considerations: See Section 4.
Interoperability considerations: This media type provides Interoperability considerations: This media type provides
functionality similar to the message/rfc822 content type for email functionality similar to the message/rfc822 content type for email
messages with internationalized email headers. When there is a messages with internationalized email headers. When there is a
need to embed or return such content in another message, there is need to embed or return such content in another message, there is
generally an option to use this media type and leave the content generally an option to use this media type and leave the content
unchanged or down-convert the content to message/rfc822. Both of unchanged or down-convert the content to message/rfc822. Each of
these choices will interoperate with the installed base, but with these choices will interoperate with the installed base, but with
different properties. Systems unaware of internationalized different properties. Systems unaware of internationalized
headers will typically treat a message/global body part as an headers will typically treat a message/global body part as an
unknown attachment, while they will understand the structure of a unknown attachment, while they will understand the structure of a
message/rfc822. However, systems that understand message/global message/rfc822. However, systems that understand message/global
will provide functionality superior to the result of a down- will provide functionality superior to the result of a down-
conversion to message/rfc822. The most interoperable choice conversion to message/rfc822. The most interoperable choice
depends on the deployed software. depends on the deployed software.
Published specification: RFC XXXX Published specification: RFC 6532
Applications that use this media type: SMTP servers and email Applications that use this media type: SMTP servers and email
clients that support multipart/report generation or parsing. clients that support multipart/report generation or parsing.
Email clients that forward messages with internationalized headers Email clients that forward messages with internationalized headers
as attachments. as attachments.
Additional information: Additional information:
Magic number(s): none Magic number(s): none
File extension(s): The extension ".u8msg" is suggested. File extension(s): The extension ".u8msg" is suggested.
Macintosh file type code(s): A uniform type identifier (UTI) of Macintosh file type code(s): A uniform type identifier (UTI) of
"public.utf8-email-message" is suggested. This conforms to "public.utf8-email-message" is suggested. This conforms to
"public.message" and "public.composite-content", but does not "public.message" and "public.composite-content", but does not
necessarily conform to "public.utf8-plain-text". necessarily conform to "public.utf8-plain-text".
Person & email address to contact for further information: See the Person & email address to contact for further information: See the
Author's Address section of this document. Authors' Addresses section of this document.
Intended usage: COMMON Intended usage: COMMON
Restrictions on usage: This is a structured media type that embeds Restrictions on usage: This is a structured media type that embeds
other MIME media types. An 8-bit or binary content-transfer- other MIME media types. An 8-bit or binary content-transfer-
encoding SHOULD be used unless this media type is sent over a encoding SHOULD be used unless this media type is sent over a
7-bit-only transport. 7-bit-only transport.
Author: See the Author's Address section of this document. Author: See the Authors' Addresses section of this document.
Change controller: IETF Standards Process Change controller: IETF Standards Process
4. Security Considerations 4. Security Considerations
Because UTF-8 often requires several octets to encode a single Because UTF-8 often requires several octets to encode a single
character, internationalization may cause header field values in character, internationalization may cause header field values (in
general and mail addresses in particular to become longer. As general) and mail addresses (in particular) to become longer. As
specified in [RFC5322], each line of characters MUST be no more than specified in [RFC5322], each line of characters MUST be no more than
998 octets, excluding the CRLF. On the other hand, MDA (Mail 998 octets, excluding the CRLF. On the other hand, MDA (Mail
Delivery Agent) processes that parse, store, or handle email Delivery Agent) processes that parse, store, or handle email
addresses or local parts must take extra care not to overflow addresses or local parts must take extra care not to overflow
buffers, truncate addresses, or exceed storage allotments. Also, buffers, truncate addresses, or exceed storage allotments. Also,
they must take care, when comparing, to use the entire lengths of the they must take care, when comparing, to use the entire lengths of the
addresses. addresses.
There are lots of ways to use UTF-8 to represent something equivalent There are lots of ways to use UTF-8 to represent something equivalent
or similar to a particular displayed character or group of or similar to a particular displayed character or group of
characters; see the security considerations in [RFC3629] for details characters; see the security considerations in [RFC3629] for details
on the problems this can cause. The normalization process described on the problems this can cause. The normalization process described
in Section 3.1 is recommended to minimize these issues. in Section 3.1 is recommended to minimize these issues.
The security impact of UTF-8 headers on email signature systems such The security impact of UTF-8 headers on email signature systems such
as Domain Keys Identified Mail (DKIM), S/MIME, and OpenPGP is as Domain Keys Identified Mail (DKIM), S/MIME, and OpenPGP is
discussed in [I-D.ietf-eai-frmwrk-4952bis], Section 14. discussed in Section 14 of [RFC6530].
If a user has a non-ASCII mailbox address and an ASCII mailbox If a user has a non-ASCII mailbox address and an ASCII mailbox
address, a digital certificate that identifies that user might have address, a digital certificate that identifies that user might have
both addresses in the identity. Having multiple email addresses as both addresses in the identity. Having multiple email addresses as
identities in a single certificate is already supported in PKIX identities in a single certificate is already supported in PKIX
(Public Key Infrastructure for X.509 Certificates) [RFC5280] and (Public Key Infrastructure using X.509) [RFC5280] and OpenPGP
OpenPGP [RFC3156], but there may be user interface issues associated [RFC3156], but there may be user-interface issues associated with the
with the introduction of UTF-8 into addresses in this context. introduction of UTF-8 into addresses in this context.
5. IANA Considerations 5. IANA Considerations
IANA is requested to update the registration of the message/global IANA has updated the registration of the message/global MIME type
MIME type using the registration form contained in Section 3.7. using the registration form contained in Section 3.7.
6. Acknowledgements 6. Acknowledgements
This document incorporates many ideas first described in Internet- This document incorporates many ideas first described in a draft
Draft form by Paul Hoffman, although many details have changed from document by Paul Hoffman, although many details have changed from
that earlier work. that earlier work.
The author especially thanks Jeff Yeh for his efforts and The authors especially thank Jeff Yeh for his efforts and
contributions on editing previous versions. contributions on editing previous versions.
Most of the content of this document was provided by John C Klensin Most of the content of this document was provided by John C Klensin
and Dave Crocker. Significant comments and suggestions were received and Dave Crocker. Significant comments and suggestions were received
from Martin Duerst, Julien Elie, Arnt Gulbrandsen, Kristin Hubner, from Martin Duerst, Julien Elie, Arnt Gulbrandsen, Kristin Hubner,
Kari Hurtta, Yangwoo Ko, Charles H. Lindsey, Alexey Melnikov, Chris Kari Hurtta, Yangwoo Ko, Charles H. Lindsey, Alexey Melnikov, Chris
Newman, Pete Resnick, Yoshiro Yoneya, and additional members of the Newman, Pete Resnick, Yoshiro Yoneya, and additional members of the
JET team (Joint Engineering Team) and were incorporated into the Joint Engineering Team (JET) and were incorporated into the document.
document. The editors wish to sincerely thank them all for their The authors wish to sincerely thank them all for their contributions.
contributions.
7. Edit history
[[RFC Editor: Please remove this section before publishing.]]
7.1. draft-ietf-eai-rfc5335bis-00
1. Applied Errata suggested by Alfred Hoenes.
2. Adjust [RFC2821] and [RFC2822] to [RFC5321] and [RFC5322].
3. Abrogate <alt-address> in ABNF of <angle-addr>.
4. Revoke [RFC5504] from this document.
5. Upgrade some references from I-Ds to RFC.
7.2. draft-ietf-eai-rfc5335bis-01
1. Author name revised.
7.3. draft-ietf-eai-rfc5335bis-02
1. ABNF revised.
7.4. draft-ietf-eai-rfc5335bis-03
1. Fix typos
2. ABNF revised
3. Improve sentence
7.5. draft-ietf-eai-rfc5335bis-04
1. improve sentences and ABNF revised based on AD and Co-chairs
7.6. draft-ietf-eai-rfc5335bis-05
1. ABNF revised based on AD comments
7.7. draft-ietf-eai-rfc5335bis-06
1. ABNF revised
2. improve Section 5
7.8. draft-ietf-eai-rfc5335bis-07
1. Minor ABNF revised in Section 3.2
2. improve Section 5
7.9. draft-ietf-eai-rfc5335bis-09
Version -08 was posted in error and withdrawn. Version 09 is is
identical to version 07 except for a date change, addition of this
note, and some vertical spacing compression on this page.
7.10. draft-ietf-eai-rfc5335bis-10
1. Add appendix and overview of changes
2. Replace polls result in Abstract and Section 1
3. Minor Sentence modification
7.11. draft-ietf-eai-rfc5335bis-11
1. Major rewrite of entire document to incorporate Dave Crocker's
simplified ABNF.
2. The document has intentionally been refocused on implementors
wishing to adapt their software to support EAI, so much of the
explanatory and historical text has been removed. (Some of it
may be reintroduced later as an appendix.
7.12. draft-ietf-eai-rfc5335bis-12
1. Added a section on the handling of MIME encoded-words.
2. Updated the security considerations to refer to the more complete
discussion in RFC 3629.
3. Added a section on the effects on line length limits.
4. Removed the syntax restriction on the use of 8-bit UTF-8 in
message-ids.
5. Added text recommending that 8-bit UTF-8 be avoided in message-
ids.
7.13. draft-ietf-eai-rfc5335bis-13
1. Updated and alphabetized the contributor list.
2. Corrected various typos, reworded some sections to make them
clearer.
3. Replaced the reference to RFC 5598 with a reference to RFC 5322.
4. Removed the Updates: RFC 5322. RFC 5322 is extended by this
document, not updated.
5. Added some text to the Introduction referring to the framework
document for information about changes between this specification
and RFC 5335.
6. Added text to the Abstract to say that this document replaces RFC
5335 and that RFC 2045 is updated.
8. References 7. References
8.1. Normative References 7.1. Normative References
[ASCII] "Coded Character Set -- 7-bit American [ASCII] "Coded Character Set -- 7-bit American Standard Code for
Standard Code for Information Information Interchange", ANSI X3.4, 1986.
Interchange", ANSI X3.4, 1986.
[I-D.ietf-eai-frmwrk-4952bis] Klensin, J. and Y. Ko, "Overview and [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Framework for Internationalized Requirement Levels", BCP 14, RFC 2119, March 1997.
Email",
draft-ietf-eai-frmwrk-4952bis-10 (work
in progress), September 2010.
[I-D.ietf-eai-rfc5336bis] Yao, J. and W. MAO, "SMTP Extension [RFC3629] Yergeau, F., "UTF-8, a transformation format of ISO
for Internationalized Email Address", 10646", STD 63, RFC 3629, November 2003.
draft-ietf-eai-rfc5336bis-07 (work in
progress), December 2010.
[RFC2119] Bradner, S., "Key words for use in [RFC5198] Klensin, J. and M. Padlipsky, "Unicode Format for Network
RFCs to Indicate Requirement Levels", Interchange", RFC 5198, March 2008.
BCP 14, RFC 2119, March 1997.
[RFC3629] Yergeau, F., "UTF-8, a transformation [RFC5234] Crocker, D. and P. Overell, "Augmented BNF for Syntax
format of ISO 10646", STD 63, Specifications: ABNF", STD 68, RFC 5234, January 2008.
RFC 3629, November 2003.
[RFC5198] Klensin, J. and M. Padlipsky, "Unicode [RFC5321] Klensin, J., "Simple Mail Transfer Protocol", RFC 5321,
Format for Network Interchange", October 2008.
RFC 5198, March 2008.
[RFC5234] Crocker, D. and P. Overell, "Augmented [RFC5322] Resnick, P., Ed., "Internet Message Format", RFC 5322,
BNF for Syntax Specifications: ABNF", October 2008.
STD 68, RFC 5234, January 2008.
[RFC5321] Klensin, J., "Simple Mail Transfer [RFC6530] Klensin, J. and Y. Ko, "Overview and Framework for
Protocol", RFC 5321, October 2008. Internationalized Email", RFC 6530, February 2012.
[RFC5322] Resnick, P., Ed., "Internet Message [RFC6531] Yao, J. and W. Mao, "SMTP Extension for Internationalized
Format", RFC 5322, October 2008. Email", RFC 6531, February 2012.
[UNF] Davis, M. and K. Whistler, "Unicode [UNF] Davis, M. and K. Whistler, "Unicode Standard Annex #15:
Standard Annex #15: Unicode Unicode Normalization Forms", September 2010,
Normalization Forms", September 2010, <http://www.unicode.org/reports/tr15/>.
<http://www.unicode.org/reports/
tr15/>.
8.2. Informative References 7.2. Informative References
[RFC2045] Freed, N. and N. Borenstein, [RFC2045] Freed, N. and N. Borenstein, "Multipurpose Internet Mail
"Multipurpose Internet Mail Extensions Extensions (MIME) Part One: Format of Internet Message
(MIME) Part One: Format of Internet Bodies", RFC 2045, November 1996.
Message Bodies", RFC 2045,
November 1996.
[RFC2046] Freed, N. and N. Borenstein, [RFC2046] Freed, N. and N. Borenstein, "Multipurpose Internet Mail
"Multipurpose Internet Mail Extensions Extensions (MIME) Part Two: Media Types", RFC 2046,
(MIME) Part Two: Media Types", November 1996.
RFC 2046, November 1996.
[RFC2047] Moore, K., "MIME (Multipurpose [RFC2047] Moore, K., "MIME (Multipurpose Internet Mail Extensions)
Internet Mail Extensions) Part Three: Part Three: Message Header Extensions for Non-ASCII Text",
Message Header Extensions for Non- RFC 2047, November 1996.
ASCII Text", RFC 2047, November 1996.
[RFC3156] Elkins, M., Del Torto, D., Levien, R., [RFC3156] Elkins, M., Del Torto, D., Levien, R., and T. Roessler,
and T. Roessler, "MIME Security with "MIME Security with OpenPGP", RFC 3156, August 2001.
OpenPGP", RFC 3156, August 2001.
[RFC5280] Cooper, D., Santesson, S., Farrell, [RFC5280] Cooper, D., Santesson, S., Farrell, S., Boeyen, S.,
S., Boeyen, S., Housley, R., and W. Housley, R., and W. Polk, "Internet X.509 Public Key
Polk, "Internet X.509 Public Key Infrastructure Certificate and Certificate Revocation List
Infrastructure Certificate and (CRL) Profile", RFC 5280, May 2008.
Certificate Revocation List (CRL)
Profile", RFC 5280, May 2008.
[RFC5335] Abel, Y., "Internationalized Email [RFC5335] Yang, A., "Internationalized Email Headers", RFC 5335,
Headers", RFC 5335, September 2008. September 2008.
[RFC6152] Klensin, J., Freed, N., Rose, M., and [RFC6152] Klensin, J., Freed, N., Rose, M., and D. Crocker, "SMTP
D. Crocker, "SMTP Service Extension Service Extension for 8-bit MIME Transport", STD 71,
for 8-bit MIME Transport", STD 71, RFC 6152, March 2011.
RFC 6152, March 2011.
Authors' Addresses Authors' Addresses
Abel Yang Abel Yang
TWNIC TWNIC
4F-2, No. 9, Sec 2, Roosevelt Rd. 4F-2, No. 9, Sec 2, Roosevelt Rd.
Taipei, 100 Taipei 100
Taiwan Taiwan
Phone: +886 2 23411313 ext 505 Phone: +886 2 23411313 ext 505
EMail: abelyang@twnic.net.tw EMail: abelyang@twnic.net.tw
Shawn Steele Shawn Steele
Microsoft Microsoft
EMail: Shawn.Steele@microsoft.com EMail: Shawn.Steele@microsoft.com
 End of changes. 69 change blocks. 
292 lines changed or deleted 143 lines changed or added

This html diff was produced by rfcdiff 1.41. The latest version is available from http://tools.ietf.org/tools/rfcdiff/