draft-ietf-appsawg-xml-mediatypes-09.txt | draft-ietf-appsawg-xml-mediatypes-10.txt | |||
---|---|---|---|---|
Network Working Group H. Thompson | Network Working Group H. Thompson | |||
Internet-Draft University of Edinburgh | Internet-Draft University of Edinburgh | |||
Obsoletes: 3023 (if approved) C. Lilley | Obsoletes: 3023 (if approved) C. Lilley | |||
Updates: 6839 (if approved) W3C | Updates: 6839 (if approved) W3C | |||
Intended status: Standards Track March 02, 2014 | Intended status: Standards Track April 07, 2014 | |||
Expires: September 3, 2014 | Expires: October 9, 2014 | |||
XML Media Types | XML Media Types | |||
draft-ietf-appsawg-xml-mediatypes-09 | draft-ietf-appsawg-xml-mediatypes-10 | |||
Abstract | Abstract | |||
This specification standardizes three media types -- application/xml, | This specification standardizes three media types -- application/xml, | |||
application/xml-external-parsed-entity, and application/xml-dtd -- | application/xml-external-parsed-entity, and application/xml-dtd -- | |||
for use in exchanging network entities that are related to the | for use in exchanging network entities that are related to the | |||
Extensible Markup Language (XML) while defining text/xml and text/ | Extensible Markup Language (XML) while defining text/xml and text/ | |||
xml-external-parsed-entity as aliases for the respective application/ | xml-external-parsed-entity as aliases for the respective application/ | |||
types. This specification also standardizes the '+xml' suffix for | types. This specification also standardizes the '+xml' suffix for | |||
naming media types outside of these five types when those media types | naming media types outside of these five types when those media types | |||
skipping to change at page 1, line 39 | skipping to change at page 1, line 39 | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
Drafts is at http://datatracker.ietf.org/drafts/current/. | Drafts is at http://datatracker.ietf.org/drafts/current/. | |||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
This Internet-Draft will expire on September 3, 2014. | This Internet-Draft will expire on October 9, 2014. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2014 IETF Trust and the persons identified as the | Copyright (c) 2014 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
(http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | publication of this document. Please review these documents | |||
skipping to change at page 2, line 34 | skipping to change at page 2, line 34 | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 | |||
2. Notational Conventions . . . . . . . . . . . . . . . . . . . 4 | 2. Notational Conventions . . . . . . . . . . . . . . . . . . . 4 | |||
2.1. Conformance Keywords . . . . . . . . . . . . . . . . . . 4 | 2.1. Conformance Keywords . . . . . . . . . . . . . . . . . . 4 | |||
2.2. Characters, Encodings, Charsets . . . . . . . . . . . . . 4 | 2.2. Characters, Encodings, Charsets . . . . . . . . . . . . . 4 | |||
2.3. MIME Entities, XML Entities . . . . . . . . . . . . . . . 4 | 2.3. MIME Entities, XML Entities . . . . . . . . . . . . . . . 4 | |||
3. Encoding Considerations . . . . . . . . . . . . . . . . . . . 5 | 3. Encoding Considerations . . . . . . . . . . . . . . . . . . . 5 | |||
3.1. XML MIME producers . . . . . . . . . . . . . . . . . . . 6 | 3.1. XML MIME producers . . . . . . . . . . . . . . . . . . . 6 | |||
3.2. XML MIME consumers . . . . . . . . . . . . . . . . . . . 6 | 3.2. XML MIME consumers . . . . . . . . . . . . . . . . . . . 6 | |||
3.3. The Byte Order Mark (BOM) and Encoding Conversions . . . 7 | 3.3. The Byte Order Mark (BOM) and Encoding Conversions . . . 7 | |||
4. XML Media Types . . . . . . . . . . . . . . . . . . . . . . . 8 | 4. XML Media Types . . . . . . . . . . . . . . . . . . . . . . . 8 | |||
4.1. XML MIME Entities . . . . . . . . . . . . . . . . . . . . 8 | 4.1. XML MIME Entities . . . . . . . . . . . . . . . . . . . . 9 | |||
4.2. Using '+xml' when Registering XML-based Media Types . . . 10 | 4.2. Using '+xml' when Registering XML-based Media Types . . . 10 | |||
4.3. Registration Guidelines for XML-based Media Types Not | 4.3. Registration Guidelines for XML-based Media Types Not | |||
Using '+xml' . . . . . . . . . . . . . . . . . . . . . . 11 | Using '+xml' . . . . . . . . . . . . . . . . . . . . . . 12 | |||
5. Fragment Identifiers . . . . . . . . . . . . . . . . . . . . 12 | 5. Fragment Identifiers . . . . . . . . . . . . . . . . . . . . 12 | |||
6. The Base URI . . . . . . . . . . . . . . . . . . . . . . . . 13 | 6. The Base URI . . . . . . . . . . . . . . . . . . . . . . . . 13 | |||
7. XML Versions . . . . . . . . . . . . . . . . . . . . . . . . 13 | 7. XML Versions . . . . . . . . . . . . . . . . . . . . . . . . 13 | |||
8. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 14 | 8. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 14 | |||
8.1. UTF-8 Charset . . . . . . . . . . . . . . . . . . . . . . 14 | 8.1. UTF-8 Charset . . . . . . . . . . . . . . . . . . . . . . 14 | |||
8.2. UTF-16 Charset . . . . . . . . . . . . . . . . . . . . . 15 | 8.2. UTF-16 Charset . . . . . . . . . . . . . . . . . . . . . 15 | |||
8.3. Omitted Charset and 8-bit MIME Entity . . . . . . . . . . 15 | 8.3. Omitted Charset and 8-bit MIME Entity . . . . . . . . . . 15 | |||
8.4. Omitted Charset and 16-bit MIME Entity . . . . . . . . . 15 | 8.4. Omitted Charset and 16-bit MIME Entity . . . . . . . . . 16 | |||
8.5. Omitted Charset, no Internal Encoding Declaration . . . . 16 | 8.5. Omitted Charset, no Internal Encoding Declaration . . . . 16 | |||
8.6. UTF-16BE Charset . . . . . . . . . . . . . . . . . . . . 16 | 8.6. UTF-16BE Charset . . . . . . . . . . . . . . . . . . . . 17 | |||
8.7. Non-UTF Charset . . . . . . . . . . . . . . . . . . . . . 17 | 8.7. Non-UTF Charset . . . . . . . . . . . . . . . . . . . . . 17 | |||
8.8. INCONSISTENT EXAMPLE: Conflicting Charset and Internal | 8.8. INCONSISTENT EXAMPLE: Conflicting Charset and Internal | |||
Encoding Declaration . . . . . . . . . . . . . . . . . . 17 | Encoding Declaration . . . . . . . . . . . . . . . . . . 17 | |||
8.9. INCONSISTENT EXAMPLE: Conflicting Charset and BOM . . . . 17 | 8.9. INCONSISTENT EXAMPLE: Conflicting Charset and BOM . . . . 18 | |||
9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 18 | 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 18 | |||
9.1. Application/xml Registration . . . . . . . . . . . . . . 18 | 9.1. Application/xml Registration . . . . . . . . . . . . . . 18 | |||
9.2. Text/xml Registration . . . . . . . . . . . . . . . . . . 19 | 9.2. Text/xml Registration . . . . . . . . . . . . . . . . . . 20 | |||
9.3. Application/xml-external-parsed-entity Registration . . . 20 | 9.3. Application/xml-external-parsed-entity Registration . . . 20 | |||
9.4. Text/xml-external-parsed-entity Registration . . . . . . 21 | 9.4. Text/xml-external-parsed-entity Registration . . . . . . 21 | |||
9.5. Application/xml-dtd Registration . . . . . . . . . . . . 21 | 9.5. Application/xml-dtd Registration . . . . . . . . . . . . 21 | |||
9.6. The '+xml' Naming Convention for XML-Based Media Types . 22 | 9.6. The '+xml' Naming Convention for XML-Based Media Types . 22 | |||
9.6.1. +xml Structured Syntax Suffix Registration . . . . . 22 | 9.6.1. +xml Structured Syntax Suffix Registration . . . . . 22 | |||
10. Security Considerations . . . . . . . . . . . . . . . . . . . 23 | 10. Security Considerations . . . . . . . . . . . . . . . . . . . 24 | |||
11. References . . . . . . . . . . . . . . . . . . . . . . . . . 25 | 11. References . . . . . . . . . . . . . . . . . . . . . . . . . 25 | |||
11.1. Normative References . . . . . . . . . . . . . . . . . . 25 | 11.1. Normative References . . . . . . . . . . . . . . . . . . 26 | |||
11.2. Informative References . . . . . . . . . . . . . . . . . 27 | 11.2. Informative References . . . . . . . . . . . . . . . . . 28 | |||
Appendix A. Why Use the '+xml' Suffix for XML-Based MIME Types? 30 | Appendix A. Why Use the '+xml' Suffix for XML-Based MIME Types? 30 | |||
Appendix B. Core XML specifications . . . . . . . . . . . . . . 30 | Appendix B. Core XML specifications . . . . . . . . . . . . . . 30 | |||
Appendix C. Changes from RFC 3023 . . . . . . . . . . . . . . . 30 | Appendix C. Operational considerations . . . . . . . . . . . . . 31 | |||
Appendix D. Acknowledgements . . . . . . . . . . . . . . . . . . 31 | C.1. General considerations . . . . . . . . . . . . . . . . . 31 | |||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 31 | C.2. Considerations for producers . . . . . . . . . . . . . . 31 | |||
C.3. Considerations for consumers . . . . . . . . . . . . . . 32 | ||||
Appendix D. Changes from RFC 3023 . . . . . . . . . . . . . . . 32 | ||||
Appendix E. Acknowledgements . . . . . . . . . . . . . . . . . . 33 | ||||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 33 | ||||
1. Introduction | 1. Introduction | |||
The World Wide Web Consortium has issued the Extensible Markup | The World Wide Web Consortium has issued the Extensible Markup | |||
Language (XML) 1.0 [XML] and Extensible Markup Language (XML) 1.1 | Language (XML) 1.0 [XML] and Extensible Markup Language (XML) 1.1 | |||
[XML1.1] specifications. To enable the exchange of XML network | [XML1.1] specifications. To enable the exchange of XML network | |||
entities, this specification standardizes three media types -- | entities, this specification standardizes three media types -- | |||
application/xml, application/xml-external-parsed-entity, and | application/xml, application/xml-external-parsed-entity, and | |||
application/xml-dtd and two aliases -- text/xml and text/xml- | application/xml-dtd and two aliases -- text/xml and text/xml- | |||
external-parsed-entity, as well as a naming convention for | external-parsed-entity, as well as a naming convention for | |||
skipping to change at page 4, line 10 | skipping to change at page 4, line 10 | |||
with application/xml and application/xml-external-parsed-entity | with application/xml and application/xml-external-parsed-entity | |||
respectively, the addition of XPointer and XML Base as fragment | respectively, the addition of XPointer and XML Base as fragment | |||
identifiers and base URIs, respectively, integration of the XPointer | identifiers and base URIs, respectively, integration of the XPointer | |||
Registry and updating of many references. | Registry and updating of many references. | |||
2. Notational Conventions | 2. Notational Conventions | |||
2.1. Conformance Keywords | 2.1. Conformance Keywords | |||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | |||
specification are to be interpreted as described in [RFC2119]. | "OPTIONAL" in this specification are to be interpreted as described | |||
in [RFC2119]. | ||||
2.2. Characters, Encodings, Charsets | 2.2. Characters, Encodings, Charsets | |||
Both XML (in an XML or Text declaration using the encoding pseudo- | Both XML (in an XML or Text declaration using the encoding pseudo- | |||
attribute) and MIME (in a Content-Type header field using the charset | attribute) and MIME (in a Content-Type header field using the charset | |||
parameter) use a common set of labels [IANA-charsets] to identify the | parameter) use a common set of labels [IANA-charsets] to identify the | |||
MIME charset (mapping from byte stream to character sequence | MIME charset (mapping from byte stream to character sequence | |||
[RFC2978]). | [RFC2978]). | |||
In this specification we will use the phrases "charset parameter" and | In this specification we will use the phrases "charset parameter" and | |||
skipping to change at page 4, line 34 | skipping to change at page 4, line 35 | |||
We reserve the phrase "character encoding" (or, when the context | We reserve the phrase "character encoding" (or, when the context | |||
makes the intention clear, simply "encoding") for the MIME charset | makes the intention clear, simply "encoding") for the MIME charset | |||
actually used in a particular XML MIME entity. | actually used in a particular XML MIME entity. | |||
[UNICODE] defines three "encoding forms", namely UTF-8, UTF-16, and | [UNICODE] defines three "encoding forms", namely UTF-8, UTF-16, and | |||
UTF-32. As UTF-8 can only be serialized in one way, the only | UTF-32. As UTF-8 can only be serialized in one way, the only | |||
possible label for UTF-8-encoded documents when serialised into MIME | possible label for UTF-8-encoded documents when serialised into MIME | |||
entities is "utf-8". UTF-16 XML documents, however, can be | entities is "utf-8". UTF-16 XML documents, however, can be | |||
serialised into MIME entities in one of two ways: either big-endian, | serialised into MIME entities in one of two ways: either big-endian, | |||
labelled (optionally) "utf-16" or "utf-16be", or little-endian, | labelled (optionally) "utf-16" or "utf-16be", or little-endian, | |||
labelled (optionally) "utf-16" or "utf-16le". | labelled (optionally) "utf-16" or "utf-16le". See Section 3.3 below | |||
for how a Byte Order Mark (BOM) is required when the "utf-16" | ||||
serialization is used. | ||||
UTF-32 has four potential serializations, of which only two (UTF-32BE | UTF-32 has four potential serializations, of which only two (UTF-32BE | |||
and UTF-32LE) are given names in in [UNICODE]. Support for the | and UTF-32LE) are given names in [UNICODE]. Support for the various | |||
various serializations varies widely, and security concerns about | serializations varies widely, and security concerns about their use | |||
their use have been raised. The use of UTF-32 is NOT RECOMMENDED for | have been raised (see for example [Sivonen]). The use of UTF-32 is | |||
XML MIME entities. | NOT RECOMMENDED for XML MIME entities. | |||
2.3. MIME Entities, XML Entities | 2.3. MIME Entities, XML Entities | |||
As sometimes happens between two communities, both MIME and XML have | As sometimes happens between two communities, both MIME and XML have | |||
defined the term entity, with different meanings. Section 2.4 of | defined the term entity, with different meanings. Section 2.4 of | |||
[RFC2045] says: | [RFC2045] says: | |||
"The term 'entity' refers specifically to the MIME-defined header | "The term 'entity' refers specifically to the MIME-defined header | |||
fields and contents of either a message or one of the parts in the | fields and contents of either a message or one of the parts in the | |||
body of a multipart entity." | body of a multipart entity." | |||
skipping to change at page 6, line 11 | skipping to change at page 6, line 17 | |||
recommendations given below are intended to maximise interoperability | recommendations given below are intended to maximise interoperability | |||
in the face of this, by on the one hand mandating consistent | in the face of this, by on the one hand mandating consistent | |||
production and encouraging maximally robust forms of production, and | production and encouraging maximally robust forms of production, and | |||
on the other specifying recovery strategies to maximize the | on the other specifying recovery strategies to maximize the | |||
interoperability of consumers when the production rules are broken. | interoperability of consumers when the production rules are broken. | |||
3.1. XML MIME producers | 3.1. XML MIME producers | |||
XML-aware MIME producers SHOULD supply a charset parameter and/or an | XML-aware MIME producers SHOULD supply a charset parameter and/or an | |||
appropriate BOM with non-UTF-8-encoded XML MIME entities which lack | appropriate BOM with non-UTF-8-encoded XML MIME entities which lack | |||
an encoding declaration, and SHOULD remove or correct an encoding | an encoding declaration. Such producers SHOULD remove or correct an | |||
declaration which is known to be incorrect (for example, as a result | encoding declaration which is known to be incorrect (for example, as | |||
of transcoding). | a result of transcoding). | |||
XML-aware MIME producers MUST supply an XML text declaration at the | XML-aware MIME producers MUST supply an XML text declaration at the | |||
beginning of non-UNICODE XML external parsed entities which would | beginning of non-UNICODE XML external parsed entities which would | |||
otherwise begin with the hexadecimal octet sequences 0xFE 0xFF, 0xFF | otherwise begin with the hexadecimal octet sequences 0xFE 0xFF, 0xFF | |||
0xFE or 0xEF 0xBB 0xBF, in order to avoid the mistaken detection of a | 0xFE or 0xEF 0xBB 0xBF, in order to avoid the mistaken detection of a | |||
BOM. | BOM. | |||
XML-unaware MIME producers MUST NOT supply a charset parameter with | XML-unaware MIME producers MUST NOT supply a charset parameter with | |||
an XML MIME entity unless the entity's character encoding is reliably | an XML MIME entity unless the entity's character encoding is reliably | |||
known. Note that this is particularly relevant for central | known. Note that this is particularly relevant for central | |||
skipping to change at page 7, line 47 | skipping to change at page 7, line 49 | |||
If an XML MIME entity is received where the charset parameter is | If an XML MIME entity is received where the charset parameter is | |||
omitted, no information is being provided about the character | omitted, no information is being provided about the character | |||
encoding by the MIME Content-Type header. XML-aware consumers MUST | encoding by the MIME Content-Type header. XML-aware consumers MUST | |||
follow the requirements in section 4.3.3 of [XML] that directly | follow the requirements in section 4.3.3 of [XML] that directly | |||
address this case. XML-unaware MIME consumers SHOULD NOT assume a | address this case. XML-unaware MIME consumers SHOULD NOT assume a | |||
default encoding in this case. | default encoding in this case. | |||
3.3. The Byte Order Mark (BOM) and Encoding Conversions | 3.3. The Byte Order Mark (BOM) and Encoding Conversions | |||
Section 4.3.3 of [XML] specifies that UTF-16 XML MIME entities not | Section 4.3.3 of [XML] specifies that UTF-16 XML MIME entities not | |||
labelled as "utf-16le" or "utf16-be" MUST begin with a byte order | labelled as "utf-16le" or "utf-16be" MUST begin with a byte order | |||
mark (BOM), U+FEFF, which appears as the hexadecimal octet sequence | mark (BOM), U+FEFF, which appears as the hexadecimal octet sequence | |||
0xFE 0xFF (big-endian) or 0xFF 0xFE (little-endian). [XML] further | 0xFE 0xFF (big-endian) or 0xFF 0xFE (little-endian). [XML] further | |||
states that the BOM is an encoding signature, and is not part of | states that the BOM is an encoding signature, and is not part of | |||
either the markup or the character data of the XML document. | either the markup or the character data of the XML document. | |||
Due to the presence of the BOM, applications that convert XML from | Due to the presence of the BOM, applications that convert XML from | |||
UTF-16 to an encoding other than UTF-8 MUST strip the BOM before | UTF-16 to an encoding other than UTF-8 MUST strip the BOM before | |||
conversion. Similarly, when converting from another encoding into | conversion. Similarly, when converting from another encoding into | |||
UTF-16, either without a charset parameter, or labelled "utf-16", the | UTF-16, either without a charset parameter, or labelled "utf-16", the | |||
BOM MUST be added unless the original encoding was UTF-8 and a BOM | BOM MUST be added unless the original encoding was UTF-8 and a BOM | |||
skipping to change at page 8, line 24 | skipping to change at page 8, line 26 | |||
begin with a BOM, which appears as the hexadecimal octet sequence | begin with a BOM, which appears as the hexadecimal octet sequence | |||
0xEF 0xBB 0xBF. This is likewise defined to be an encoding | 0xEF 0xBB 0xBF. This is likewise defined to be an encoding | |||
signature, and not part of either the markup or the character data of | signature, and not part of either the markup or the character data of | |||
the XML document. | the XML document. | |||
Applications that convert XML from UTF-8 to an encoding other than | Applications that convert XML from UTF-8 to an encoding other than | |||
UTF-16 MUST strip the BOM, if present, before conversion. | UTF-16 MUST strip the BOM, if present, before conversion. | |||
Applications which convert XML into UTF-8 MAY add a BOM. | Applications which convert XML into UTF-8 MAY add a BOM. | |||
In addition to the MIME charset "utf-16", [RFC2781] introduces "utf- | In addition to the MIME charset "utf-16", [RFC2781] introduces "utf- | |||
16le" (little endian) and "utf-16be" (big endian). The BOM is | 16le" (little endian) and "utf-16be" (big endian). When an XML MIME | |||
prohibited in MIME entities with these labels. When an XML MIME | ||||
entity is encoded in "utf-16le" or "utf-16be", it MUST NOT begin with | entity is encoded in "utf-16le" or "utf-16be", it MUST NOT begin with | |||
the BOM but SHOULD contain an in-band XML encoding declaration. | the BOM but SHOULD contain an in-band XML encoding declaration. | |||
Conversion from UTF-8 or UTF-16 (unlabelled, or labelled with | Conversion from UTF-8 or UTF-16 (unlabelled, or labelled with | |||
"utf-16") to "utf-16be" or "utf-16le" MUST strip a BOM if present, | "utf-16") to "utf-16be" or "utf-16le" MUST strip a BOM if present. | |||
and conversion in the other direction MUST (for UTF-16) or MAY (for | Conversion from UTF-16 labelled "utf-16le" or "utf-16be" to UTF-16 | |||
UTF-8) add the appropriate BOM. | without a label or labelled "utf-16" MUST add the appropriate BOM. | |||
Conversion from UTF-16 labelled "utf-16le" or "utf-16be" to UTF-8 MAY | ||||
add a UTF-8 BOM, but this is NOT RECOMMENDED. | ||||
Appendix F of [XML] also implies that a UTF-32 BOM may be used in | Appendix F of [XML] also implies that a UTF-32 BOM may be used in | |||
conjunction with UTF-32-encoded documents. As noted above, this | conjunction with UTF-32-encoded documents. As noted above, this | |||
specification recommends against the use of UTF-32, but if it is | specification recommends against the use of UTF-32, but if it is | |||
used, the same considerations apply with respect to its being a | used, the same considerations apply with respect to its being a | |||
signature, not part of the document, with respect to transcoding into | signature, not part of the document, with respect to transcoding into | |||
or out of it and with respect to the MIME charsets "utf-32le" and | or out of it and with respect to the MIME charsets "utf-32le" and | |||
"utf-32be", as for UTF-16. Consumers which do not support UTF-32 | "utf-32be", as for UTF-16. Consumers which do not support UTF-32 | |||
SHOULD none-the-less recognise UTF-32 signatures in order to give | SHOULD none-the-less recognise UTF-32 signatures in order to give | |||
helpful error messages (instead of treating them as invalid UTF-16). | helpful error messages (instead of treating them as invalid UTF-16). | |||
skipping to change at page 11, line 4 | skipping to change at page 11, line 12 | |||
type. This convention will allow applications that can process XML | type. This convention will allow applications that can process XML | |||
generically to detect that the MIME entity is supposed to be an XML | generically to detect that the MIME entity is supposed to be an XML | |||
document, verify this assumption by invoking some XML processor, and | document, verify this assumption by invoking some XML processor, and | |||
then process the XML document accordingly. Applications may check | then process the XML document accordingly. Applications may check | |||
for types that represent XML MIME entities by comparing the last four | for types that represent XML MIME entities by comparing the last four | |||
characters of the subtype to the string '+xml'. (However note that 4 | characters of the subtype to the string '+xml'. (However note that 4 | |||
of the 5 media types defined in this specification -- text/xml, | of the 5 media types defined in this specification -- text/xml, | |||
application/xml, text/xml-external-parsed-entity, and application/ | application/xml, text/xml-external-parsed-entity, and application/ | |||
xml-external-parsed-entity -- also represent XML MIME entities while | xml-external-parsed-entity -- also represent XML MIME entities while | |||
not ending with '+xml'.) | not ending with '+xml'.) | |||
NOTE: Section 5.3.2 of HTTPbis [HTTPbis] does not support any form | NOTE: Section 5.3.2 of HTTPbis [HTTPbis] does not support any form | |||
of Accept header which will match only '+xml' types. In | of Accept header which will match only '+xml' types. In | |||
particular, Accept headers of the form "Accept: */*+xml" are not | particular, Accept headers of the form "Accept: */*+xml" are not | |||
allowed, and so this header MUST NOT be used for this purpose. | allowed, and will not work for this purpose. | |||
Media types following the naming convention '+xml' SHOULD introduce | Media types following the naming convention '+xml' SHOULD define the | |||
the charset parameter for consistency, since XML-generic processing | charset parameter for consistency, since XML-generic processing by | |||
applies the same program for any such media type. However, there are | definition treats all XML MIME entities uniformly as regards | |||
some cases that the charset parameter need not be introduced. For | character encoding information. However, there are some cases that | |||
example: | the charset parameter need not be defined. For example: | |||
When an XML-based media type is restricted to UTF-8, it is not | When an XML-based media type is restricted to UTF-8, it is not | |||
necessary to introduce the charset parameter. UTF-8 is the | necessary to define the charset parameter. UTF-8 is the default | |||
default for XML. | for XML. | |||
When an XML-based media type is restricted to UTF-8 and UTF-16, it | When an XML-based media type is restricted to UTF-8 and UTF-16, it | |||
might not be unreasonable to omit the charset parameter. Neither | might not be unreasonable to omit the charset parameter. Neither | |||
UTF-8 nor UTF-16 require XML encoding declarations. | UTF-8 nor UTF-16 require XML encoding declarations. | |||
XML generic processing is not always appropriate for XML-based media | XML generic processing is not always appropriate for XML-based media | |||
types. For example, authors of some such media types may wish that | types. For example, authors of some such media types may wish that | |||
the types remain entirely opaque except to applications that are | the types remain entirely opaque except to applications that are | |||
specifically designed to deal with that media type. By NOT following | specifically designed to deal with that media type. By NOT following | |||
the naming convention '+xml', such media types can avoid XML-generic | the naming convention '+xml', such media types can avoid XML-generic | |||
processing. Since generic processing will be useful in many cases, | processing. Since generic processing will be useful in many cases, | |||
however -- including in some situations that are difficult to predict | however -- including in some situations that are difficult to predict | |||
ahead of time -- the '+xml' convention is to be preferred unless | ahead of time -- the '+xml' convention is to be preferred unless | |||
there is some particularly compelling reason not to. | there is some particularly compelling reason not to. | |||
The registration process for specific '+xml' media types is described | The registration process for specific '+xml' media types is described | |||
in [RFC6838]. The registrar for the IETF tree will encourage new | in [RFC6838]. New XML-based media type registrations in the IETF | |||
XML-based media type registrations in the IETF tree to follow this | must follow these guidelines. When other organisations register XML- | |||
guideline. Registrars for other trees SHOULD follow this convention | based media types via the "Specification Required" IANA registration | |||
in order to ensure maximum interoperability of their XML-based | policy, the relevant Media Reviewer should ensure that they use the | |||
documents. Only media subtypes that represent XML MIME entities are | '+xml' convention, in order to ensure maximum interoperability of | |||
allowed to register with a '+xml' suffix. | their XML-based documents. Only media subtypes that represent XML | |||
MIME entities are allowed to register with a '+xml' suffix. | ||||
In addition to the changes described above, the change controller has | In addition to the changes described above, the change controller has | |||
been changed to be the World Wide Web Consortium (W3C). | been changed to be the World Wide Web Consortium (W3C). | |||
4.3. Registration Guidelines for XML-based Media Types Not Using '+xml' | 4.3. Registration Guidelines for XML-based Media Types Not Using '+xml' | |||
Registrations for new XML-based media types which do _not_ use the | Registrations for new XML-based media types which do _not_ use the | |||
'+xml' suffix SHOULD, in specifying the charset parameter and | '+xml' suffix SHOULD, in specifying the charset parameter and | |||
encoding considerations, define them as: "Same as [charset parameter | encoding considerations, define them as: "Same as [charset parameter | |||
/ encoding considerations] of application/xml as specified in RFC | / encoding considerations] of application/xml as specified in RFC | |||
XXXX." | XXXX." | |||
Enabling the charset parameter is RECOMMENDED, since this information | Defining the charset parameter is RECOMMENDED, since this information | |||
can be used by XML processors to determine authoritatively the | can be used by XML processors to determine authoritatively the | |||
character encoding of the XML MIME entity in the absence of a BOM. | character encoding of the XML MIME entity in the absence of a BOM. | |||
If there are some reasons not to follow this advice, they SHOULD be | If there are some reasons not to follow this advice, they SHOULD be | |||
included as part of the registration. As shown above, two such | included as part of the registration. As shown above, two such | |||
reasons are "UTF-8 only" or "UTF-8 or UTF-16 only". | reasons are "UTF-8 only" or "UTF-8 or UTF-16 only". | |||
These registrations SHOULD specify that the XML-based media type | These registrations SHOULD specify that the XML-based media type | |||
being registered has all of the security considerations described in | being registered has all of the security considerations described in | |||
RFC XXXX plus any additional considerations specific to that media | RFC XXXX plus any additional considerations specific to that media | |||
type. | type. | |||
skipping to change at page 12, line 49 | skipping to change at page 13, line 12 | |||
specified by the [XPointerFramework] together with any other | specified by the [XPointerFramework] together with any other | |||
specifications governing the XPointer schemes used in those | specifications governing the XPointer schemes used in those | |||
identifiers which the applications support. Conforming applications | identifiers which the applications support. Conforming applications | |||
MUST support the 'element' scheme as defined in [XPointerElement], | MUST support the 'element' scheme as defined in [XPointerElement], | |||
but need not support other schemes. | but need not support other schemes. | |||
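
As a rough illustration of the mandatory 'element' scheme, the sketch below resolves only the child-sequence form (for example "element(/1/2/2)") against a parsed document. The function name resolve_child_sequence is hypothetical, and the NCName forms of the scheme and shorthand pointers are deliberately left out; this is not a complete XPointer processor.

```python
import re
import xml.etree.ElementTree as ET

# Accepts only the child-sequence form, e.g. "element(/1/3/2)".
_CHILD_SEQ = re.compile(r"element\((?:/[1-9][0-9]*)+\)")


def resolve_child_sequence(root, fragment):
    """Return the element selected by a child sequence, or None (hypothetical helper)."""
    if not _CHILD_SEQ.fullmatch(fragment):
        return None  # NCName forms and other schemes are out of scope for this sketch
    steps = [int(s) for s in fragment[len("element("):-1].split("/")[1:]]
    if steps[0] != 1:
        return None  # "/1" is the document element; there is never a second one
    node = root
    for step in steps[1:]:
        children = list(node)  # child elements only; text nodes are not counted
        if step > len(children):
            return None
        node = children[step - 1]
    return node


doc = ET.fromstring("<a><b/><c><d/><e/></c></a>")
print(resolve_child_sequence(doc, "element(/1/2/2)").tag)
```

Returning None here stands in for "no interpretation"; a real processor would distinguish XPointer errors from pointers that simply identify nothing.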
If an XPointer error is reported in the attempt to process the part, | If an XPointer error is reported in the attempt to process the part, | |||
this specification does not define an interpretation for the part. | this specification does not define an interpretation for the part. | |||
A registry of XPointer schemes [XPtrReg] is maintained at the W3C. | A registry of XPointer schemes [XPtrReg] is maintained at the W3C. | |||
Document authors SHOULD NOT use unregistered schemes. Scheme authors | Generic processors of XML MIME entities SHOULD NOT implement | |||
SHOULD register their schemes ([XPtrRegPolicy] describes requirements | unregistered XPointer schemes ([XPtrRegPolicy] describes requirements | |||
and procedures for doing so). | and procedures for registering schemes). | |||
See Section 4.2 for additional requirements which apply when an XML- | See Section 4.2 for additional requirements which apply when an XML- | |||
based media type follows the naming convention '+xml'. | based media type follows the naming convention '+xml'. | |||
If [XPointerFramework] and [XPointerElement] are inappropriate for | If [XPointerFramework] and [XPointerElement] are inappropriate for | |||
some XML-based media type, it SHOULD NOT follow the naming convention | some XML-based media type, it SHOULD NOT follow the naming convention | |||
'+xml'. | '+xml'. | |||
When a URI has a fragment identifier, it is encoded by a limited | When a URI has a fragment identifier, it is encoded by a limited | |||
subset of the repertoire of US-ASCII characters, see | subset of the repertoire of US-ASCII characters, see | |||
skipping to change at page 29, line 5 | skipping to change at page 29, line 25 | |||
[RFC3977] Feather, B., "Network News Transfer Protocol", RFC 3977, | [RFC3977] Feather, B., "Network News Transfer Protocol", RFC 3977, | |||
October 2006. | October 2006. | |||
[RFC5321] Klensin, J., "Simple Mail Transfer Protocol", RFC 5321, | [RFC5321] Klensin, J., "Simple Mail Transfer Protocol", RFC 5321, | |||
October 2008. | October 2008. | |||
[RFC6152] Klensin, J., Freed, N., Rose, M., and D. Crocker, "SMTP | [RFC6152] Klensin, J., Freed, N., Rose, M., and D. Crocker, "SMTP | |||
Service Extension for 8-bit MIME Transport", RFC 6152, | Service Extension for 8-bit MIME Transport", RFC 6152, | |||
March 2011. | March 2011. | |||
[Sivonen] Sivonen, H. and others, "Mozilla bug: Remove support for | ||||
UTF-32 per HTML5 spec", October 2011, <https:// | ||||
bugzilla.mozilla.org/show_bug.cgi?id=604317#c6>. | ||||
[TAGMIME] Bray, T., Ed., "Internet Media Type registration, | [TAGMIME] Bray, T., Ed., "Internet Media Type registration, | |||
consistency of use", April 2004, | consistency of use", April 2004, | |||
<http://www.w3.org/2001/tag/2004/0430-mime>. | <http://www.w3.org/2001/tag/2004/0430-mime>. | |||
[XHTML] Pemberton, S. and et al, "XHTML 1.0: The Extensible | [XHTML] Pemberton, S. and et al, "XHTML 1.0: The Extensible | |||
HyperText Markup Language", W3C Recommendation xhtml1, | HyperText Markup Language", W3C Recommendation xhtml1, | |||
December 1999, | December 1999, | |||
<http://www.w3.org/TR/2000/REC-xhtml1-20000126/>. | <http://www.w3.org/TR/2000/REC-xhtml1-20000126/>. | |||
Latest version available at [7]. | Latest version available at [7]. | |||
skipping to change at page 30, line 42 | skipping to change at page 31, line 15 | |||
The W3C Technical Architecture group has produced two documents which | The W3C Technical Architecture group has produced two documents which | |||
are also relevant: | are also relevant: | |||
The Self-Describing Web [FYN] discusses the overall principles of | The Self-Describing Web [FYN] discusses the overall principles of | |||
how document semantics are determined on the Web. | how document semantics are determined on the Web. | |||
Architecture of the World Wide Web, Volume One [AWWW], section | Architecture of the World Wide Web, Volume One [AWWW], section | |||
4.5.4, discusses the specific role of XML Namespace documents in | 4.5.4, discusses the specific role of XML Namespace documents in | |||
this process. | this process. | |||
Appendix C. Changes from RFC 3023 | Appendix C. Operational considerations | |||
This section provides an informal summary of the major operational | ||||
considerations which arise when exchanging XML MIME entities over a | ||||
network. | ||||
C.1. General considerations | ||||
The existence of both XML-aware and XML-unaware agents handling XML | ||||
MIME entities can compromise interoperability. Generic transcoding | ||||
proxies pose a particular risk in this regard. Detailed advice about | ||||
the handling of BOMs when transcoding can be found in Section 3.3. | ||||
This specification requires XML consumers to treat BOMs as | ||||
authoritative: this is in principle a backwards-incompatibility. In | ||||
practice serious interoperability issues already exist when BOMs are | ||||
used. Making BOMs authoritative, in conjunction with the deprecation | ||||
of the UTF-32 encoding form and the requirement to include an XML | ||||
encoding declaration in certain cases (Section 3.1), is intended to | ||||
improve in-practice interoperability as much as possible over time. | ||||
This specification establishes Section 5 as the basis for | ||||
interpreting URIs for XML MIME entities which include fragment | ||||
identifiers, mandates support only for shorthand ("simple name") and | ||||
'element'-scheme fragments and deprecates support for unregistered | ||||
XPointer schemes by XML MIME entity processors. Accordingly, URIs | ||||
will interoperate best if they use only simple names and | ||||
'element'-scheme fragment identifiers, with registered schemes | ||||
varying widely in the degree of support to be found in generic tools. | ||||
XPointer scheme authors can only expect generic tool support if they | ||||
register their schemes. | ||||
C.2. Considerations for producers | ||||
Interoperability for all XML MIME entities is maximized by the use of | ||||
UTF-8, without a BOM. When UTF-8 is _not_ used, a charset parameter | ||||
and/or a BOM improve interoperability, particularly when XML-unaware | ||||
consumers may be involved. | ||||
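
A minimal producer-side sketch of the advice above: UTF-8 is emitted without a BOM, and the one non-UTF-8 case shown (big-endian UTF-16) carries both a BOM and a charset parameter. The function name serialize, the choice to omit the charset parameter for UTF-8, and the restriction to two encodings are assumptions made for illustration.

```python
import codecs


def serialize(xml_text, encoding="utf-8"):
    """Return (content_type, body) for XML text that has no declaration of its own."""
    enc = encoding.lower()
    if enc == "utf-8":
        # Preferred case: UTF-8 with no BOM.
        return "application/xml", xml_text.encode("utf-8")
    if enc == "utf-16":
        # Not UTF-8: add an encoding declaration, a BOM and a charset parameter.
        decl = '<?xml version="1.0" encoding="UTF-16"?>'
        body = codecs.BOM_UTF16_BE + (decl + xml_text).encode("utf-16-be")
        return "application/xml; charset=utf-16", body
    raise ValueError("this sketch handles only utf-8 and utf-16")


content_type, body = serialize("<a/>", "utf-16")
print(content_type, body[:2])
```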
In the very rare case where the substantive content of a non-UNICODE | ||||
XML external parsed entity begins with the hexadecimal octet | ||||
sequences 0xFE 0xFF, 0xFF 0xFE or 0xEF 0xBB 0xBF, including an XML | ||||
text declaration will forestall the mistaken detection of a BOM. | ||||
The use of UTF-32 for XML MIME entities puts interoperability at very | ||||
high risk. | ||||
Web-server configurations which supply default charset parameters | ||||
risk misrepresenting XML MIME entities. Allowing users to control | ||||
the value of charset parameters improves interoperability. | ||||
Supplying a mistaken charset parameter is worse than supplying none | ||||
at all. In particular, generic processors such as transcoders, when | ||||
processing based on a mistaken charset parameter, if they do not fail | ||||
altogether, are likely to produce arbitrarily bogus results from which | ||||
the original is not recoverable. | ||||
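
The loss described above can be reproduced with a toy transcoder. The function transcode is hypothetical and assumes a lenient processor that substitutes a replacement character for undecodable octets instead of failing.

```python
def transcode(body, declared_charset, target="utf-8"):
    """Re-encode body, trusting the declared charset (lenient: bad octets are replaced)."""
    return body.decode(declared_charset, errors="replace").encode(target)


original = "<p>caf\u00e9</p>".encode("utf-8")

# A correct charset parameter round-trips the entity unchanged.
assert transcode(original, "utf-8") == original

# A mistaken charset parameter: each octet of the two-octet UTF-8 sequence
# becomes U+FFFD, so the original character cannot be reconstructed.
damaged = transcode(original, "us-ascii")
print(damaged)
```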
C.3. Considerations for consumers | ||||
Consumers of XML MIME entities can maximize interoperability by | ||||
1. Taking a BOM as authoritative if it is present in an XML MIME | ||||
entity; | ||||
2. In the absence of a BOM, taking a charset parameter as | ||||
authoritative if it is present. | ||||
Assuming a default character encoding in the absence of a charset | ||||
parameter harms interoperability. | ||||
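
The precedence above can be sketched as a small function. The name external_encoding and the decision to return None when neither signal is present are assumptions; only UTF-8 and UTF-16 BOMs are recognized in this sketch.

```python
import codecs

_BOMS = (
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
)


def external_encoding(body, charset=None):
    """Return the encoding given by a BOM, else by the charset parameter, else None."""
    for bom, name in _BOMS:
        if body.startswith(bom):
            return name  # 1. a BOM is authoritative, even over a charset parameter
    if charset:
        return charset.lower()  # 2. no BOM: the charset parameter is authoritative
    return None  # no default is assumed; the caller decides what to do next


print(external_encoding(b"\xef\xbb\xbf<a/>", "iso-8859-1"))
```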
Although support for UTF-32 is not required by [XML] itself, and this | ||||
specification deprecates its use, consumers which check for UTF-32 | ||||
BOMs can thereby avoid mistakenly processing UTF-32 entities as | ||||
(invalid) UTF-16 entities. | ||||
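
The hazard arises because the little-endian UTF-32 BOM (0xFF 0xFE 0x00 0x00) begins with the little-endian UTF-16 BOM (0xFF 0xFE). A sketch of a check that tests the longer signatures first follows; the function name sniff_bom is hypothetical.

```python
import codecs

# Order matters: the UTF-32 signatures are tested before the UTF-16 ones.
_SIGNATURES = (
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
)


def sniff_bom(body):
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for bom, name in _SIGNATURES:
        if body.startswith(bom):
            return name
    return None


entity = codecs.BOM_UTF32_LE + "<a/>".encode("utf-32-le")
print(sniff_bom(entity))
```

A consumer that does not support UTF-32 can use the result to reject such an entity cleanly instead of attempting to parse it as UTF-16.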
Appendix D. Changes from RFC 3023 | ||||
There are numerous and significant differences between this | There are numerous and significant differences between this | |||
specification and [RFC3023], which it obsoletes. This appendix | specification and [RFC3023], which it obsoletes. This appendix | |||
summarizes the major differences only. | summarizes the major differences only. | |||
XPointer ([XPointerFramework] and [XPointerElement]) has been | XPointer ([XPointerFramework] and [XPointerElement]) has been | |||
added as fragment identifier syntax for all the XML media types, | added as fragment identifier syntax for all the XML media types, | |||
and the XPointer Registry ([XPtrReg]) mentioned | and the XPointer Registry ([XPtrReg]) mentioned | |||
[XMLBase] has been added as a mechanism for specifying base URIs | [XMLBase] has been added as a mechanism for specifying base URIs | |||
skipping to change at page 31, line 17 | skipping to change at page 33, line 17 | |||
Priority is now given to a Byte Order Mark (BOM) if present | Priority is now given to a Byte Order Mark (BOM) if present | |||
Many references are updated, and the existence of XML 1.1 and | Many references are updated, and the existence of XML 1.1 and | |||
relevance of this specification to it acknowledged | relevance of this specification to it acknowledged | |||
A number of justifications and contextualizations which were | A number of justifications and contextualizations which were | |||
appropriate when XML was new have been removed, including the | appropriate when XML was new have been removed, including the | |||
whole of the original Appendix A | whole of the original Appendix A | |||
Making BOMs authoritative is in principle a backwards- | Appendix E. Acknowledgements | |||
incompatibility. In practice serious interoperability issues already | ||||
exist when BOMs are used. Making BOMs authoritative, in conjunction | ||||
with the deprecation of the UTF-32 encoding form and the requirement | ||||
to include an XML encoding declaration in certain cases | ||||
(Section 3.1), is intended to improve in-practice interoperability as | ||||
much as possible. | ||||
Appendix D. Acknowledgements | ||||
MURATA Makoto (FAMILY Given) and Alexey Melnikov made early and | MURATA Makoto (FAMILY Given) and Alexey Melnikov made early and | |||
important contributions to the effort to revise [RFC3023]. | important contributions to the effort to revise [RFC3023]. | |||
This specification reflects the input of numerous participants to the | This specification reflects the input of numerous participants to the | |||
ietf-xml-mime@imc.org, xml-mime@ietf.org and apps-discuss@ietf.org | ietf-xml-mime@imc.org, xml-mime@ietf.org and apps-discuss@ietf.org | |||
mailing lists, though any errors are the responsibility of the | mailing lists, though any errors are the responsibility of the | |||
authors. Special thanks to: | authors. Special thanks to: | |||
Mark Baker, James Clark, Dan Connolly, Martin Duerst, Ned Freed, | Mark Baker, James Clark, Dan Connolly, Martin Duerst, Ned Freed, | |||
Yaron Goland, Bjoern Hoehrmann, Rick Jelliffe, Murray S. Kucherawy, | Yaron Goland, Bjoern Hoehrmann, Rick Jelliffe, Murray S. Kucherawy, | |||
Larry Masinter, David Megginson, S. Moonesamy, Keith Moore, Chris | Larry Masinter, David Megginson, S. Moonesamy, Keith Moore, Chris | |||
Newman, Gavin Nicol, Julian Reschke, Marshall Rose, Jim Whitehead, | Newman, Gavin Nicol, Julian Reschke, Marshall Rose, Jim Whitehead, | |||
Erik Wilde and participants of the XML activity and the TAG at the | Erik Wilde and participants of the XML activity and the TAG at the | |||
W3C. | W3C. | |||
Jim Whitehead and Simon St.Laurent were editors of [RFC2376] and | Jim Whitehead and Simon St. Laurent were editors of [RFC2376] and | |||
[RFC3023], respectively. | [RFC3023], respectively. | |||
Authors' Addresses | Authors' Addresses | |||
Henry S. Thompson | Henry S. Thompson | |||
University of Edinburgh | University of Edinburgh | |||
Email: ht@inf.ed.ac.uk | Email: ht@inf.ed.ac.uk | |||
URI: http://www.ltg.ed.ac.uk/~ht/ | URI: http://www.ltg.ed.ac.uk/~ht/ | |||
Chris Lilley | Chris Lilley | |||
End of changes. 30 change blocks. | ||||
61 lines changed or deleted | 143 lines changed or added | |||