draft-ietf-appsawg-xml-mediatypes-07.txt   draft-ietf-appsawg-xml-mediatypes-08.txt 
Network Working Group H. S. Thompson Network Working Group H. Thompson
Internet-Draft University of Edinburgh Internet-Draft University of Edinburgh
Obsoletes: 3023 (if approved) C. Lilley Obsoletes: 3023 (if approved) C. Lilley
Updates: 6839 (if approved) W3C Updates: 6839 (if approved) W3C
Intended status: Standards Track February 06, 2014 Intended status: Standards Track February 23, 2014
Expires: August 10, 2014 Expires: August 27, 2014
XML Media Types XML Media Types
draft-ietf-appsawg-xml-mediatypes-07 draft-ietf-appsawg-xml-mediatypes-08
Abstract Abstract
This specification standardizes three media types -- application/xml, This specification standardizes three media types -- application/xml,
application/xml-external-parsed-entity, and application/xml-dtd -- application/xml-external-parsed-entity, and application/xml-dtd --
for use in exchanging network entities that are related to the for use in exchanging network entities that are related to the
Extensible Markup Language (XML) while defining text/xml and text/ Extensible Markup Language (XML) while defining text/xml and text/
xml-external-parsed-entity as aliases for the respective application/ xml-external-parsed-entity as aliases for the respective application/
types. This specification also standardizes the '+xml' suffix for types. This specification also standardizes the '+xml' suffix for
naming media types outside of these five types when those media types naming media types outside of these five types when those media types
skipping to change at page 1, line 39 skipping to change at page 1, line 39
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on August 10, 2014. This Internet-Draft will expire on August 27, 2014.
Copyright Notice Copyright Notice
Copyright (c) 2014 IETF Trust and the persons identified as the Copyright (c) 2014 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 2, line 25 skipping to change at page 2, line 25
Without obtaining an adequate license from the person(s) controlling Without obtaining an adequate license from the person(s) controlling
the copyright in such materials, this document may not be modified the copyright in such materials, this document may not be modified
outside the IETF Standards Process, and derivative works of it may outside the IETF Standards Process, and derivative works of it may
not be created outside the IETF Standards Process, except to format not be created outside the IETF Standards Process, except to format
it for publication as an RFC or to translate it into languages other it for publication as an RFC or to translate it into languages other
than English. than English.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Notational Conventions . . . . . . . . . . . . . . . . . . . 3 2. Notational Conventions . . . . . . . . . . . . . . . . . . . 4
2.1. Conformance Keywords . . . . . . . . . . . . . . . . . . 3 2.1. Conformance Keywords . . . . . . . . . . . . . . . . . . 4
2.2. Characters, Encodings, Charsets . . . . . . . . . . . . . 4 2.2. Characters, Encodings, Charsets . . . . . . . . . . . . . 4
2.3. MIME Entities, XML Entities . . . . . . . . . . . . . . . 4 2.3. MIME Entities, XML Entities . . . . . . . . . . . . . . . 4
3. Encoding Considerations . . . . . . . . . . . . . . . . . . . 5 3. Encoding Considerations . . . . . . . . . . . . . . . . . . . 5
3.1. XML MIME producers . . . . . . . . . . . . . . . . . . . 5 3.1. XML MIME producers . . . . . . . . . . . . . . . . . . . 6
3.2. XML MIME consumers . . . . . . . . . . . . . . . . . . . 6 3.2. XML MIME consumers . . . . . . . . . . . . . . . . . . . 6
3.3. The Byte Order Mark (BOM) and Encoding Conversions . . . 7 3.3. The Byte Order Mark (BOM) and Encoding Conversions . . . 7
4. XML Media Types . . . . . . . . . . . . . . . . . . . . . . . 8 4. XML Media Types . . . . . . . . . . . . . . . . . . . . . . . 8
4.1. XML MIME Entities . . . . . . . . . . . . . . . . . . . . 8 4.1. XML MIME Entities . . . . . . . . . . . . . . . . . . . . 9
4.2. Application/xml Registration . . . . . . . . . . . . . . 10 4.2. Application/xml Registration . . . . . . . . . . . . . . 10
4.3. Text/xml Registration . . . . . . . . . . . . . . . . . . 11 4.3. Text/xml Registration . . . . . . . . . . . . . . . . . . 12
4.4. Application/xml-external-parsed-entity Registration . . . 11 4.4. Application/xml-external-parsed-entity Registration . . . 12
4.5. Text/xml-external-parsed-entity Registration . . . . . . 12 4.5. Text/xml-external-parsed-entity Registration . . . . . . 13
4.6. Application/xml-dtd Registration . . . . . . . . . . . . 12 4.6. Application/xml-dtd Registration . . . . . . . . . . . . 13
5. Fragment Identifiers . . . . . . . . . . . . . . . . . . . . 13 5. Fragment Identifiers . . . . . . . . . . . . . . . . . . . . 14
6. The Base URI . . . . . . . . . . . . . . . . . . . . . . . . 14 6. The Base URI . . . . . . . . . . . . . . . . . . . . . . . . 15
7. XML Versions . . . . . . . . . . . . . . . . . . . . . . . . 14 7. XML Versions . . . . . . . . . . . . . . . . . . . . . . . . 15
8. The '+xml' Naming Convention for XML-Based Media Types . . . 15 8. The '+xml' Naming Convention for XML-Based Media Types . . . 16
8.1. XML-based Media Types . . . . . . . . . . . . . . . . . . 15 8.1. +xml Structured Syntax Suffix Registration . . . . . . . 16
8.2. +xml Structured Syntax Suffix Registration . . . . . . . 16 9. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 17
8.3. Registration guidelines for XML-based media types not 9.1. UTF-8 Charset . . . . . . . . . . . . . . . . . . . . . . 18
using '+xml' . . . . . . . . . . . . . . . . . . . . . . 18 9.2. UTF-16 Charset . . . . . . . . . . . . . . . . . . . . . 18
9. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 18 9.3. Omitted Charset and 8-bit MIME Entity . . . . . . . . . . 19
9.1. UTF-8 Charset . . . . . . . . . . . . . . . . . . . . . . 19 9.4. Omitted Charset and 16-bit MIME Entity . . . . . . . . . 19
9.2. UTF-16 Charset . . . . . . . . . . . . . . . . . . . . . 19 9.5. Omitted Charset, no Internal Encoding Declaration . . . . 20
9.3. Omitted Charset and 8-bit MIME Entity . . . . . . . . . . 20 9.6. UTF-16BE Charset . . . . . . . . . . . . . . . . . . . . 20
9.4. Omitted Charset and 16-bit MIME Entity . . . . . . . . . 20 9.7. Non-UTF Charset . . . . . . . . . . . . . . . . . . . . . 20
9.5. Omitted Charset, no Internal Encoding Declaration . . . . 21
9.6. UTF-16BE Charset . . . . . . . . . . . . . . . . . . . . 21
9.7. Non-UTF Charset . . . . . . . . . . . . . . . . . . . . . 21
9.8. INCONSISTENT EXAMPLE: Conflicting Charset and Internal 9.8. INCONSISTENT EXAMPLE: Conflicting Charset and Internal
Encoding Declaration . . . . . . . . . . . . . . . . . . 22 Encoding Declaration . . . . . . . . . . . . . . . . . . 21
9.9. INCONSISTENT EXAMPLE: Conflicting Charset and BOM . . . . 22 9.9. INCONSISTENT EXAMPLE: Conflicting Charset and BOM . . . . 21
10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 23 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 22
11. Security Considerations . . . . . . . . . . . . . . . . . . . 23 10.1. Using '+xml' when Registering XML-based Media Types . . 22
10.2. Registration Guidelines for XML-based Media Types Not
Using '+xml' . . . . . . . . . . . . . . . . . . . . . . 23
11. Security Considerations . . . . . . . . . . . . . . . . . . . 24
12. References . . . . . . . . . . . . . . . . . . . . . . . . . 25 12. References . . . . . . . . . . . . . . . . . . . . . . . . . 25
12.1. Normative References . . . . . . . . . . . . . . . . . . 25 12.1. Normative References . . . . . . . . . . . . . . . . . . 25
12.2. Informative References . . . . . . . . . . . . . . . . . 27 12.2. Informative References . . . . . . . . . . . . . . . . . 28
Appendix A. Why Use the '+xml' Suffix for XML-Based MIME Types? 29 Appendix A. Why Use the '+xml' Suffix for XML-Based MIME Types? 30
Appendix B. Core XML specifications . . . . . . . . . . . . . . 29 Appendix B. Core XML specifications . . . . . . . . . . . . . . 30
Appendix C. Changes from RFC 3023 . . . . . . . . . . . . . . . 30 Appendix C. Changes from RFC 3023 . . . . . . . . . . . . . . . 31
Appendix D. Acknowledgements . . . . . . . . . . . . . . . . . . 30 Appendix D. Acknowledgements . . . . . . . . . . . . . . . . . . 31
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 31 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 32
1. Introduction 1. Introduction
The World Wide Web Consortium has issued the Extensible Markup The World Wide Web Consortium has issued the Extensible Markup
Language (XML) 1.0 [XML] and Extensible Markup Language (XML) 1.1 Language (XML) 1.0 [XML] and Extensible Markup Language (XML) 1.1
[XML1.1] specifications. To enable the exchange of XML network [XML1.1] specifications. To enable the exchange of XML network
entities, this specification standardizes three media types -- entities, this specification standardizes three media types --
application/xml, application/xml-external-parsed-entity, and application/xml, application/xml-external-parsed-entity, and
application/xml-dtd and two aliases -- text/xml and text/xml- application/xml-dtd and two aliases -- text/xml and text/xml-
external-parsed-entity, as well as a naming convention for external-parsed-entity, as well as a naming convention for
skipping to change at page 4, line 23 skipping to change at page 4, line 28
MIME charset (mapping from byte stream to character sequence MIME charset (mapping from byte stream to character sequence
[RFC2978]). [RFC2978]).
In this specification we will use the phrases "charset parameter" and In this specification we will use the phrases "charset parameter" and
"encoding declaration" to refer to whatever MIME charset is specified "encoding declaration" to refer to whatever MIME charset is specified
by a MIME charset parameter or XML encoding declaration respectively. by a MIME charset parameter or XML encoding declaration respectively.
We reserve the phrase "character encoding" (or, when the context We reserve the phrase "character encoding" (or, when the context
makes the intention clear, simply "encoding") for the MIME charset makes the intention clear, simply "encoding") for the MIME charset
actually used in a particular XML MIME entity. actually used in a particular XML MIME entity.
[UNICODE] defines three "encoding forms", which are independent of [UNICODE] defines three "encoding forms", namely UTF-8, UTF-16, and
serialization, namely UTF-8, UTF-16 and UTF-32. This specification UTF-32. As UTF-8 can only be serialized in one way, the only
follows this precedent. Furthermore, note that UTF-16 XML documents possible label for UTF-8-encoded documents when serialised into MIME
may be serialised into MIME entities in one of two ways: either big- entities is "utf-8". UTF-16 XML documents, however, can be
endian, labelled (optionally) "utf-16" or "utf-16be", or little- serialised into MIME entities in one of two ways: either big-endian,
endian, labelled (optionally) "utf-16" or "utf-16le". As UTF-8 can labelled (optionally) "utf-16" or "utf-16be", or little-endian,
only be serialized in one way, the only possible label for labelled (optionally) "utf-16" or "utf-16le".
UTF-8-encoded documents when serialised into MIME entities is
"utf-8". UTF-32 has four potential serializations, of which only two (UTF-32BE
and UTF-32LE) are given names in [UNICODE]. Support for the
various serializations varies widely, and security concerns about
their use have been raised. The use of UTF-32 is NOT RECOMMENDED for
XML MIME entities.
2.3. MIME Entities, XML Entities 2.3. MIME Entities, XML Entities
As sometimes happens between two communities, both MIME and XML have As sometimes happens between two communities, both MIME and XML have
defined the term entity, with different meanings. Section 2.4 of defined the term entity, with different meanings. Section 2.4 of
[RFC2045] says: [RFC2045] says:
"The term 'entity' refers specifically to the MIME-defined header "The term 'entity' refers specifically to the MIME-defined header
fields and contents of either a message or one of the parts in the fields and contents of either a message or one of the parts in the
body of a multipart entity." body of a multipart entity."
skipping to change at page 5, line 28 skipping to change at page 5, line 36
declaration (see Section 4.3.3 of [XML]). Ensuring consistency among declaration (see Section 4.3.3 of [XML]). Ensuring consistency among
these sources requires coordination between entity authors and MIME these sources requires coordination between entity authors and MIME
agents (that is, processes which package, transfer, deliver and/or agents (that is, processes which package, transfer, deliver and/or
receive MIME entities). receive MIME entities).
The use of UTF-8, without a BOM, is RECOMMENDED for all XML MIME The use of UTF-8, without a BOM, is RECOMMENDED for all XML MIME
entities. entities.
Some MIME agents will be what we will call "XML-aware", that is, Some MIME agents will be what we will call "XML-aware", that is,
capable of processing XML MIME entities as XML and detecting the XML capable of processing XML MIME entities as XML and detecting the XML
encoding declaration (or its absence). Others, while conforming to encoding declaration (or its absence). All three sources of
this and other media type registrations, will not be XML-aware, and information about encoding are available to them, and they can be
thus cannot know anything about the XML encoding declaration. Some expected to be aware of this spec.
MIME agents, such as proxies and transcoders, both consume and
Other MIME agents will not be XML-aware, and thus cannot know
anything about the XML encoding declaration. Not only do they lack
one of the three sources of information about encoding, they are also
less likely to be aware of or responsive to this spec.
Some MIME agents, such as proxies and transcoders, both consume and
produce MIME entities. produce MIME entities.
This mixture of two kinds of agents handling XML MIME entities
increases the complexity of the coordination task. The
recommendations given below are intended to maximise interoperability
in the face of this, by on the one hand mandating consistent
production and encouraging maximally robust forms of production, and
on the other specifying recovery strategies to maximize the
interoperability of consumers when the production rules are broken.
3.1. XML MIME producers 3.1. XML MIME producers
XML-aware MIME producers SHOULD supply a charset parameter and/or an XML-aware MIME producers SHOULD supply a charset parameter and/or an
appropriate BOM with non-UTF-8-encoded XML MIME entities which lack appropriate BOM with non-UTF-8-encoded XML MIME entities which lack
an encoding declaration, and SHOULD remove or correct an encoding an encoding declaration, and SHOULD remove or correct an encoding
declaration which is known to be incorrect (for example, as a result declaration which is known to be incorrect (for example, as a result
of transcoding). of transcoding).
XML-aware MIME producers MUST supply an XML text declaration at the XML-aware MIME producers MUST supply an XML text declaration at the
beginning of non-UNICODE XML external parsed entities which would beginning of non-UNICODE XML external parsed entities which would
otherwise begin with the hexadecimal octet sequences 0xFE 0xFF, 0xFF otherwise begin with the hexadecimal octet sequences 0xFE 0xFF, 0xFF
0xFE or 0xEF 0xBB 0xBF, in order to avoid the mistaken detection of a 0xFE or 0xEF 0xBB 0xBF, in order to avoid the mistaken detection of a
BOM. BOM.
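As an illustration of the check implied by the requirement above, the following Python sketch tests whether an already-encoded entity body begins with one of the three octet sequences listed. The names BOM_LIKE_PREFIXES and starts_with_bom_like_octets are hypothetical and are not defined by this specification; how a producer then adds the text declaration is left open.

```python
# The three octet sequences named in the text: 0xFE 0xFF, 0xFF 0xFE,
# and 0xEF 0xBB 0xBF.
BOM_LIKE_PREFIXES = (b"\xfe\xff", b"\xff\xfe", b"\xef\xbb\xbf")


def starts_with_bom_like_octets(encoded_body: bytes) -> bool:
    """Return True if the encoded entity begins with octets that a
    consumer could mistake for a BOM.

    Assumption: the caller already knows the entity is in a non-Unicode
    encoding; this function only inspects the leading octets.
    """
    return encoded_body.startswith(BOM_LIKE_PREFIXES)
```

A producer could run such a check on a non-Unicode external parsed entity and, when it returns True, prepend an XML text declaration so that the entity no longer begins with those octets.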
XML-unaware MIME producers MUST NOT supply a charset parameter with XML-unaware MIME producers MUST NOT supply a charset parameter with
an XML MIME entity unless the entity's character encoding is reliably an XML MIME entity unless the entity's character encoding is reliably
known. known. Note that this is particularly relevant for central
configuration of web servers, where configuring a default for the
XML MIME producers are RECOMMENDED to provide means for XML MIME charset parameter will almost certainly violate this requirement.
entity authors to determine what value, if any, is given to charset
parameters for their entities, for example by enabling user-level
configuration of filename-to-Content-Type-header mappings on a file-
by-file or suffix basis.
The use of UTF-32 is NOT RECOMMENDED for XML MIME entities. XML MIME producers are RECOMMENDED to provide means for users to
control what value, if any, is given to charset parameters for XML
MIME entities, for example by giving users control of the
configuration of Web server filename-to-Content-Type-header mappings
on a file-by-file or suffix basis.
3.2. XML MIME consumers 3.2. XML MIME consumers
For XML MIME consumers, the question of priority arises in cases when For XML MIME consumers, the question of priority arises in cases when
the available character encoding information is not consistent. the available character encoding information is not consistent.
Again, we must distinguish between XML-aware and XML-unaware agents. Again, we must distinguish between XML-aware and XML-unaware agents.
When a charset parameter is specified for an XML MIME entity, the When a charset parameter is specified for an XML MIME entity, the
normative component of the [XML] specification leaves the question normative component of the [XML] specification leaves the question
open as to how to determine the encoding with which to attempt to open as to how to determine the encoding with which to attempt to
skipping to change at page 6, line 38 skipping to change at page 7, line 13
Appendix F it defers to this specification: Appendix F it defers to this specification:
[T]he preferred method of handling conflict should be specified as [T]he preferred method of handling conflict should be specified as
part of the higher-level protocol used to deliver XML. In part of the higher-level protocol used to deliver XML. In
particular, please refer to [IETF RFC 3023] or its successor... particular, please refer to [IETF RFC 3023] or its successor...
Accordingly, to conform with deployed processors and content and to Accordingly, to conform with deployed processors and content and to
avoid conflicting with this or other normative specifications, this avoid conflicting with this or other normative specifications, this
specification sets the priority as follows: specification sets the priority as follows:
All consumers SHOULD treat a BOM (Section 3.3) as authoritative if A BOM (Section 3.3) is authoritative if it is present in an XML
it is present in an XML MIME entity. In the absence of a BOM MIME entity;
(Section 3.3), all consumers SHOULD treat the charset parameter as
authoritative if it is present. For XML-aware consumers, note In the absence of a BOM (Section 3.3), the charset parameter is
that Section 4.3.3 of [XML] does _not_ make it an error for the authoritative if it is present.
charset parameter and the XML encoding declaration (or the UTF-8
default in the absence of encoding declaration and BOM) to be Whenever the above determines a source of encoding information as
inconsistent, although such consumers might choose to issue a authoritative, consumers SHOULD process XML MIME entities based on
warning in this case. that information.
When MIME producers conform to the requirements stated above When MIME producers conform to the requirements stated above
(Section 3.1, Section 3) such inconsistencies will not arise---this (Section 3.1, Section 3) inconsistencies will not arise---the above
statement of priorities only has practical impact in the case of non- statement of priorities only has practical impact in the case of non-
conforming XML MIME entities. conforming XML MIME entities. In the face of inconsistencies, no
uniform strategy can deliver the 'right' answer every time: the
purpose of specifying one here is to encourage convergence over time,
first on the part of consumers, then on the part of producers.
For XML-aware consumers, note that Section 4.3.3 of [XML] does _not_
make it an error for the charset parameter and the XML encoding
declaration (or the UTF-8 default in the absence of encoding
declaration and BOM) to be inconsistent, although such consumers
might choose to issue a warning in this case.
If an XML MIME entity is received where the charset parameter is If an XML MIME entity is received where the charset parameter is
omitted, no information is being provided about the character omitted, no information is being provided about the character
encoding by the MIME Content-Type header. XML-aware consumers MUST encoding by the MIME Content-Type header. XML-aware consumers MUST
follow the requirements in section 4.3.3 of [XML] that directly follow the requirements in section 4.3.3 of [XML] that directly
address this case. XML-unaware MIME consumers SHOULD NOT assume a address this case. XML-unaware MIME consumers SHOULD NOT assume a
default encoding in this case. default encoding in this case.
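The priority order described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a normative algorithm: the function name authoritative_encoding and its return convention are hypothetical, and only the UTF-8 and UTF-16 BOMs are recognised (UTF-32 serializations are ignored here).

```python
import codecs

# BOM octet sequences for UTF-8 and the two UTF-16 byte orders.
# Assumption: UTF-32 BOMs are deliberately not handled in this sketch.
_BOMS = (
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16be"),
    (codecs.BOM_UTF16_LE, "utf-16le"),
)


def authoritative_encoding(body: bytes, charset_param=None):
    """Return (encoding, source) following the stated priority.

    1. A BOM, if present, is authoritative.
    2. Otherwise the charset parameter, if present, is authoritative.
    3. Otherwise nothing is authoritative at the MIME level.
    """
    for bom, name in _BOMS:
        if body.startswith(bom):
            return name, "bom"
    if charset_param is not None and charset_param.strip():
        return charset_param.strip().lower(), "charset"
    # No BOM and no charset parameter: an XML-aware consumer would now
    # apply section 4.3.3 of [XML]; an XML-unaware consumer should not
    # assume a default encoding.
    return None, "none"
```

For example, a body that begins with the UTF-8 BOM but is labelled charset=iso-8859-1 yields ("utf-8", "bom"), since the BOM takes priority over the charset parameter.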
3.3. The Byte Order Mark (BOM) and Encoding Conversions 3.3. The Byte Order Mark (BOM) and Encoding Conversions
Section 4.3.3 of [XML] specifies thatUTF-16 XML MIME entitiesnot Section 4.3.3 of [XML] specifies that UTF-16 XML MIME entities not
labelled as "utf-16le" or "utf-16be" MUST begin with a byte order labelled as "utf-16le" or "utf-16be" MUST begin with a byte order
mark (BOM), U+FEFF, which appears as the hexadecimal octet sequence mark (BOM), U+FEFF, which appears as the hexadecimal octet sequence
0xFE 0xFF (big-endian) or 0xFF 0xFE (little-endian). [XML] further 0xFE 0xFF (big-endian) or 0xFF 0xFE (little-endian). [XML] further
states that the BOM is an encoding signature, and is not part of states that the BOM is an encoding signature, and is not part of
either the markup or the character data of the XML document. either the markup or the character data of the XML document.
Due to the presence of the BOM, applications that convert XML from Due to the presence of the BOM, applications that convert XML from
UTF-16 to an encoding other than UTF-8 MUST strip the BOM before UTF-16 to an encoding other than UTF-8 MUST strip the BOM before
conversion. Similarly, when converting from another encoding into conversion. Similarly, when converting from another encoding into
UTF-16, either without a charset parameter, or labelled "utf-16", the UTF-16, either without a charset parameter, or labelled "utf-16", the
skipping to change at page 10, line 14 skipping to change at page 11, line 4
4.2. Application/xml Registration 4.2. Application/xml Registration
Type name: application Type name: application
Subtype name: xml Subtype name: xml
Required parameters: none Required parameters: none
Optional parameters: charset Optional parameters: charset
See Section 3. See Section 3.
Encoding considerations: Depending on the character encoding used, Encoding considerations: Depending on the character encoding used,
XML MIME entities can consist of 7bit, 8bit or binary data XML MIME entities can consist of 7bit, 8bit or binary data
[RFC6838]. For 7-bit transports, 7bit data, for example US-ASCII- [RFC6838]. For 7-bit transports, 7bit data, for example US-ASCII-
encoded data, does not require content-transfer-encoding, but 8bit encoded data, does not require content-transfer-encoding, but 8bit
or binary data, for example UTF-8 or UTF-16 data, MUST be content- or binary data, for example UTF-8 or UTF-16 data, MUST be content-
transfer-encoded in quoted-printable or base64. For 8-bit clean transfer-encoded in quoted-printable or base64. For 8-bit clean
transport (e.g. 8BITMIME ESMTP [RFC6152] or NNTP [RFC3977]), 7bit transport (e.g. 8BITMIME ESMTP [RFC6152] or NNTP [RFC3977]), 7bit
or 8bit data, for example US-ASCII or UTF-8 data, does not require or 8bit data, for example US-ASCII or UTF-8 data, does not require
content-transfer-encoding, but binary data, for example data with content-transfer-encoding, but binary data, for example data with
a UTF-16 encoding, MUST be content-transfer-encoded in base64. a UTF-16 encoding, MUST be content-transfer-encoded in base64.
For binary clean transports (e.g. BINARY ESMTP [RFC3030] or HTTP For binary clean transports (e.g. BINARY ESMTP [RFC3030] or HTTP
[HTTPbis]), no content-transfer-encoding is necessary (or even [HTTPbis]), no content-transfer-encoding is necessary (or even
possible, in the case of HTTP) for 7bit, 8bit or binary data. possible, in the case of HTTP) for 7bit, 8bit or binary data.
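The transport rules in the encoding considerations above can be summarised by a small Python sketch. The function name allowed_transfer_encodings and the string labels used for data and transport classes are hypothetical conveniences for illustration, not part of the registration.

```python
# Ordering of the three MIME data classes, from most to least restricted.
_ORDER = {"7bit": 0, "8bit": 1, "binary": 2}


def allowed_transfer_encodings(data_class: str, transport: str) -> list:
    """Return the content-transfer-encodings the text permits.

    data_class: "7bit", "8bit" or "binary" (class of the XML MIME entity).
    transport:  "7bit", "8bit" or "binary" (what the transport can carry).
    An empty list means no content-transfer-encoding is required.
    """
    if _ORDER[data_class] <= _ORDER[transport]:
        return []  # the transport can carry the data unchanged
    if transport == "8bit":
        # Only binary data reaches this branch; the text requires base64.
        return ["base64"]
    # 7-bit transport carrying 8bit or binary data.
    return ["quoted-printable", "base64"]
```

Under these assumptions, UTF-16 data (binary) on an 8-bit clean transport yields ["base64"], while any data on a binary clean transport such as HTTP yields an empty list.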
Security considerations: See Section 11. Security considerations: See Section 11.
Interoperability considerations: XML has proven to be interoperable Interoperability considerations: XML has proven to be interoperable
across both generic and task-specific applications and for import across both generic and task-specific applications and for import
and export from multiple XML authoring and editing tools. and export from multiple XML authoring and editing tools.
Validating processors provide maximum interoperability. Although Validating processors provide maximum interoperability. Although
non-validating processors may be more efficient, they are not non-validating processors may be more efficient, they are not
required to handle all features of XML. For further information, required to handle all features of XML. For further information,
see sub-section 2.9 "Standalone Document Declaration" and section see sub-section 2.9 "Standalone Document Declaration" and section
5 "Conformance" of [XML]. 5 "Conformance" of [XML].
In practice, character set issues have proved to be the biggest
source of interoperability problems. The use of UTF-8, and
careful attention to the guidelines set out in Section 3, are the
best way to avoid such problems.
Published specification: Extensible Markup Language (XML) 1.0 (Fifth Published specification: Extensible Markup Language (XML) 1.0 (Fifth
Edition) [XML] or subsequent editions or versions thereof. Edition) [XML] or subsequent editions or versions thereof.
Applications that use this media type: XML is device-, platform-, Applications that use this media type: XML is device-, platform-,
and vendor-neutral and is supported by generic and task-specific and vendor-neutral and is supported by generic and task-specific
applications and a wide range of generic XML tools (editors, applications and a wide range of generic XML tools (editors,
parsers, Web agents, ...). parsers, Web agents, ...).
Additional information: Additional information:
skipping to change at page 14, line 23 skipping to change at page 15, line 19
but need not support other schemes. but need not support other schemes.
If an XPointer error is reported in the attempt to process the part, If an XPointer error is reported in the attempt to process the part,
this specification does not define an interpretation for the part. this specification does not define an interpretation for the part.
A registry of XPointer schemes [XPtrReg] is maintained at the W3C. A registry of XPointer schemes [XPtrReg] is maintained at the W3C.
Document authors SHOULD NOT use unregistered schemes. Scheme authors Document authors SHOULD NOT use unregistered schemes. Scheme authors
SHOULD register their schemes ([XPtrRegPolicy] describes requirements SHOULD register their schemes ([XPtrRegPolicy] describes requirements
and procedures for doing so). and procedures for doing so).
See Section 8.3 for additional requirements which apply when an XML- See Section 10.2 for additional requirements which apply when an XML-
based media type follows the naming convention '+xml'. based media type follows the naming convention '+xml'.
If [XPointerFramework] and [XPointerElement] are inappropriate for If [XPointerFramework] and [XPointerElement] are inappropriate for
some XML-based media type, it SHOULD NOT follow the naming convention some XML-based media type, it SHOULD NOT follow the naming convention
'+xml'. '+xml'.
When a URI has a fragment identifier, it is encoded by a limited When a URI has a fragment identifier, it is encoded by a limited
subset of the repertoire of US-ASCII characters; see subset of the repertoire of US-ASCII characters; see
[XPointerFramework] for details. [XPointerFramework] for details.
skipping to change at page 15, line 35 skipping to change at page 16, line 30
This section supersedes the earlier registration of the '+xml' suffix This section supersedes the earlier registration of the '+xml' suffix
[RFC6839]. [RFC6839].
This specification recommends the use of the '+xml' naming convention This specification recommends the use of the '+xml' naming convention
for identifying XML-based media types, in line with the recognition for identifying XML-based media types, in line with the recognition
in [RFC6838] of structured syntax name suffixes. This allows the use in [RFC6838] of structured syntax name suffixes. This allows the use
of generic XML processors and technologies on a wide variety of of generic XML processors and technologies on a wide variety of
different XML document types at a minimum cost, using existing different XML document types at a minimum cost, using existing
frameworks for media type registration. frameworks for media type registration.
8.1. XML-based Media Types See Section 10 for guidance on when and how to register a
'+xml'-based media subtype, and on registering a media subtype for
When a new media type is introduced for an XML-based format, the name XML but _not_ using '+xml'.
of the media type SHOULD end with '+xml' unless generic XML
processing is in some way inappropriate for documents of the new
type. This convention will allow applications that can process XML
generically to detect that the MIME entity is supposed to be an XML
document, verify this assumption by invoking some XML processor, and
then process the XML document accordingly. Applications may check
for types that represent XML MIME entities by comparing the last four
characters of the subtype to the string '+xml'. (However note that 4
of the 5 media types defined in this specification -- text/xml,
application/xml, text/xml-external-parsed-entity, and application/
xml-external-parsed-entity -- also represent XML MIME entities while
not ending with '+xml'.)
NOTE: Section 5.3.2 of HTTPbis [HTTPbis] does not support any form of
Accept header which will match only '+xml' types. In particular,
Accept headers of the form "Accept: */*+xml" are not allowed, and
so this header MUST NOT be used for this purpose.
Media types following the naming convention '+xml' SHOULD introduce
the charset parameter for consistency, since XML-generic processing
applies the same program for any such media type. However, there are
some cases that the charset parameter need not be introduced. For
example:
When an XML-based media type is restricted to UTF-8, it is not
necessary to introduce the charset parameter. UTF-8 is the
default for XML.
When an XML-based media type is restricted to UTF-8 and UTF-16, it
might not be unreasonable to omit the charset parameter. Neither
UTF-8 nor UTF-16 require XML encoding declarations.
XML generic processing is not always appropriate for XML-based media
types. For example, authors of some such media types may wish that
the types remain entirely opaque except to applications that are
specifically designed to deal with that media type. By NOT following
the naming convention '+xml', such media types can avoid XML-generic
processing. Since generic processing will be useful in many cases,
however -- including in some situations that are difficult to predict
ahead of time -- the '+xml' convention is to be preferred unless
there is some particularly compelling reason not to.
The registration process for specific '+xml' media types is described
in [RFC6838]. The registrar for the IETF tree will encourage new
XML-based media type registrations in the IETF tree to follow this
guideline. Registrars for other trees SHOULD follow this convention
in order to ensure maximum interoperability of their XML-based
documents. Media subtypes that do not represent XML MIME entities
MUST NOT be allowed to register with a '+xml' suffix.
In addition to the changes described above, the change controller has
been changed to be the World Wide Web Consortium (W3C).
8.2. +xml Structured Syntax Suffix Registration 8.1. +xml Structured Syntax Suffix Registration
Name: Extensible Markup Language (XML) Name: Extensible Markup Language (XML)
+suffix: +xml +suffix: +xml
Reference: This specification Reference: This specification
Encoding considerations: Same as Section 4.2. Encoding considerations: Same as Section 4.2.
Fragment identifier considerations: Registrations which use this Fragment identifier considerations: Registrations which use this
'+xml' convention MUST also make reference to RFC XXXX, '+xml' convention MUST also make reference to RFC XXXX,
specifically Section 5, in specifying fragment identifier syntax specifically Section 5, in specifying fragment identifier syntax
and semantics, and they MAY restrict the syntax to a specified and semantics, and they MAY restrict the syntax to a specified
subset of schemes, except that they MUST NOT disallow barenames or subset of schemes, except that they MUST NOT disallow barenames or
'element' scheme pointers. They MAY further require support for 'element' scheme pointers. They MAY further require support for
other registered schemes. They also MAY add additional syntax other registered schemes. They also MAY add additional syntax
(which MUST NOT overlap with [XPointerFramework] syntax) together (which MUST NOT overlap with [XPointerFramework] syntax) together
skipping to change at page 18, line 9 skipping to change at page 17, line 42
Security considerations: See Section 11. Security considerations: See Section 11.
Contact: See Authors' Addresses section. Contact: See Authors' Addresses section.
Author: See Authors' Addresses section. Author: See Authors' Addresses section.
Change controller: The XML specification is a work product of the Change controller: The XML specification is a work product of the
World Wide Web Consortium's XML Core Working Group. The W3C has World Wide Web Consortium's XML Core Working Group. The W3C has
change control over this specification. change control over this specification.
8.3. Registration guidelines for XML-based media types not using '+xml'
Registrations for new XML-based media types which do _not_ use the
'+xml' suffix SHOULD, in specifying the charset parameter and
encoding considerations, define them as: "Same as [charset parameter
/ encoding considerations] of application/xml as specified in RFC
XXXX."
Enabling the charset parameter is RECOMMENDED, since this information
can be used by XML processors to determine authoritatively the
character encoding of the XML MIME entity in the absence of a BOM.
If there are some reasons not to follow this advice, they SHOULD be
included as part of the registration. As shown above, two such
reasons are "UTF-8 only" or "UTF-8 or UTF-16 only".
These registrations SHOULD specify that the XML-based media type
being registered has all of the security considerations described in
RFC XXXX plus any additional considerations specific to that media
type.
These registrations SHOULD also make reference to RFC XXXX in
specifying magic numbers, base URIs, and use of the BOM.
These registrations MAY reference the application/xml registration in
RFC XXXX in specifying interoperability and fragment identifier
considerations, if these considerations are not overridden by issues
specific to that media type.
9. Examples 9. Examples
This section is non-normative. In particular, note that all This section is non-normative. In particular, note that all
[RFC2119] language herein reproduces or summarizes the consequences [RFC2119] language herein reproduces or summarizes the consequences
of normative statements already made above, and has no independent of normative statements already made above, and has no independent
normative force, and accordingly does not appear in uppercase. normative force, and accordingly does not appear in uppercase.
The examples below give the MIME Content-Type header, including the The examples below give the MIME Content-Type header, including the
charset parameter, if present, and the XML declaration or Text charset parameter, if present, and the XML declaration or Text
declaration (which includes the encoding declaration) inside the XML declaration (which includes the encoding declaration) inside the XML
skipping to change at page 19, line 37 skipping to change at page 18, line 37
Content-Type: application/xml; charset=utf-8 Content-Type: application/xml; charset=utf-8
<?xml version="1.0" encoding="utf-8"?> <?xml version="1.0" encoding="utf-8"?>
or or
<?xml version="1.0"?> <?xml version="1.0"?>
UTF-8 is the recommended encoding for use with all the media types UTF-8 is the recommended encoding for use with all the media types
defined in this specification. Since the charset parameter is defined in this specification. Since the charset parameter is
provided and there is no overriding BOM, both MIME and XML processors provided and there is no overriding BOM, conformant MIME and XML
must treat the enclosed entity as UTF-8 encoded. processors must treat the enclosed entity as UTF-8 encoded.
If sent using a 7-bit transport (e.g. SMTP [RFC5321]), in general, a If sent using a 7-bit transport (e.g. SMTP [RFC5321]), in general, a
UTF-8 XML MIME entity must use a content-transfer-encoding of either UTF-8 XML MIME entity must use a content-transfer-encoding of either
quoted-printable or base64. For an 8-bit clean transport (e.g. quoted-printable or base64. For an 8-bit clean transport (e.g.
8BITMIME ESMTP or NNTP), or a binary clean transport (e.g. BINARY 8BITMIME ESMTP or NNTP), or a binary clean transport (e.g. BINARY
ESMTP or HTTP), no content-transfer-encoding is necessary (or even ESMTP or HTTP), no content-transfer-encoding is necessary (or even
possible, in the case of HTTP). possible, in the case of HTTP).
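As a non-normative illustration of the 7-bit transport case above, the following Python sketch wraps a UTF-8 XML entity as application/xml with a base64 content-transfer-encoding, using the standard library's email package. The function name wrap_for_7bit and the sample document are hypothetical, and choosing base64 rather than quoted-printable is an assumption of the sketch; either would satisfy the text above.

```python
from email.mime.application import MIMEApplication


def wrap_for_7bit(xml_bytes: bytes) -> MIMEApplication:
    """Wrap UTF-8 XML bytes as application/xml for a 7-bit transport.

    Hypothetical helper for illustration. MIMEApplication applies base64
    by default, so the serialized part contains only 7-bit octets.
    """
    # Extra keyword arguments become Content-Type parameters.
    return MIMEApplication(xml_bytes, "xml", charset="utf-8")


# Sample entity containing a non-ASCII character (two octets in UTF-8).
sample = '<?xml version="1.0" encoding="utf-8"?><p>caf\u00e9</p>'.encode("utf-8")
part = wrap_for_7bit(sample)
wire = part.as_bytes()
```

Over an 8-bit clean or binary clean transport this wrapping would be unnecessary, as the paragraph above notes.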
9.2. UTF-16 Charset 9.2. UTF-16 Charset
Content-Type: application/xml; charset=utf-16 Content-Type: application/xml; charset=utf-16
{BOM}<?xml version="1.0" encoding="utf-16"?> {BOM}<?xml version="1.0" encoding="utf-16"?>
or or
{BOM}<?xml version="1.0"?> {BOM}<?xml version="1.0"?>
For the three application/ media types defined above, if sent using a For the three application/ media types defined above, if sent using a
7-bit transport (e.g. SMTP) or an 8-bit clean transport (e.g. 7-bit transport (e.g. SMTP) or an 8-bit clean transport (e.g.
8BITMIME ESMTP or NNTP), the XML MIME entity must be encoded in 8BITMIME ESMTP or NNTP), the XML MIME entity must be encoded in
quoted-printable or base64; for a binary clean transport (e.g. quoted-printable or base64; for a binary clean transport (e.g. BINARY
BINARY ESMTP or HTTP), no content-transfer-encoding is necessary (or ESMTP or HTTP), no content-transfer-encoding is necessary (or even
even possible, in the case of HTTP). possible, in the case of HTTP).
As described in [RFC2781], the UTF-16 family must not be used with As described in [RFC2781], the UTF-16 family must not be used with
media types under the top-level type "text" except over HTTP or HTTPS media types under the top-level type "text" except over HTTP or HTTPS
(see section A.2 of HTTP [HTTPbis] for details). Hence one of the (see section A.2 of HTTP [HTTPbis] for details). Hence one of the
two text/ media types defined above can be used with this example only two text/ media types defined above can be used with this example only
when the XML MIME entity is transmitted via HTTP or HTTPS, which use when the XML MIME entity is transmitted via HTTP or HTTPS, which use
a MIME-like mechanism and are binary-clean protocols, hence do not a MIME-like mechanism and are binary-clean protocols, hence do not
perform CR and LF transformations and allow NUL octets. Since HTTP perform CR and LF transformations and allow NUL octets. Since HTTP
is binary clean, no content-transfer-encoding is necessary (or even is binary clean, no content-transfer-encoding is necessary (or even
possible). possible).
9.3. Omitted Charset and 8-bit MIME Entity 9.3. Omitted Charset and 8-bit MIME Entity
Content-Type: application/xml Content-Type: application/xml
<?xml version="1.0" encoding="iso-8859-1"?> <?xml version="1.0" encoding="iso-8859-1"?>
Since the charset parameter is not provided in the Content-Type Since the charset parameter is not provided in the Content-Type
header and there is no overriding BOM, XML processors must treat the header and there is no overriding BOM, conformant XML processors must
"iso-8859-1" encoding as authoritative. XML-unaware MIME processors treat the "iso-8859-1" encoding as authoritative. Conformant XML-
should make no assumptions about the character encoding of the XML unaware MIME processors should make no assumptions about the
MIME entity. character encoding of the XML MIME entity.
9.4. Omitted Charset and 16-bit MIME Entity 9.4. Omitted Charset and 16-bit MIME Entity
Content-Type: application/xml Content-Type: application/xml
{BOM}<?xml version="1.0" encoding="utf-16"?> {BOM}<?xml version="1.0" encoding="utf-16"?>
or or
{BOM}<?xml version="1.0"?> {BOM}<?xml version="1.0"?>
This example shows a 16-bit MIME entity with no charset parameter. This example shows a 16-bit MIME entity with no charset parameter.
However since there is a BOM all processors must treat the entity as However since there is a BOM conformant processors must treat the
UTF-16-encoded. entity as UTF-16-encoded.
Omitting the charset parameter is not recommended in conjunction with Omitting the charset parameter is not recommended in conjunction with
media types under the top-level type "application" when used with media types under the top-level type "application" when used with
transports other than HTTP or HTTPS. Media types under the top-level transports other than HTTP or HTTPS. Media types under the top-level
type "text" should not be used for 16-bit MIME with transports other type "text" should not be used for 16-bit MIME with transports other
than HTTP or HTTPS (see discussion above (Section 9.2, Paragraph 7)). than HTTP or HTTPS (see discussion above (Section 9.2, Paragraph 7)).
9.5. Omitted Charset, no Internal Encoding Declaration 9.5. Omitted Charset, no Internal Encoding Declaration
Content-Type: application/xml Content-Type: application/xml
skipping to change at page 21, line 26 skipping to change at page 20, line 26
In this example, the charset parameter has been omitted, there is no In this example, the charset parameter has been omitted, there is no
internal encoding declaration, and there is no BOM. Since there is internal encoding declaration, and there is no BOM. Since there is
no BOM or charset parameter, the XML processor follows the no BOM or charset parameter, the XML processor follows the
requirements in section 4.3.3, and optionally applies the mechanism requirements in section 4.3.3, and optionally applies the mechanism
described in Appendix F (which is non-normative) of [XML] to described in Appendix F (which is non-normative) of [XML] to
determine an encoding of UTF-8. Although the XML MIME entity does determine an encoding of UTF-8. Although the XML MIME entity does
not contain an encoding declaration, provided the encoding actually not contain an encoding declaration, provided the encoding actually
_is_ UTF-8, this is a conforming XML MIME entity. _is_ UTF-8, this is a conforming XML MIME entity.
An XML-unaware MIME processor should make no assumptions about the A conformant XML-unaware MIME processor should make no assumptions
character encoding of the XML MIME entity. about the character encoding of the XML MIME entity.
See Section 9.1 for transport-related issues for UTF-8 XML MIME See Section 9.1 for transport-related issues for UTF-8 XML MIME
entities. entities.
9.6. UTF-16BE Charset 9.6. UTF-16BE Charset
Content-Type: application/xml; charset=utf-16be Content-Type: application/xml; charset=utf-16be
<?xml version='1.0' encoding='utf-16be'?> <?xml version='1.0' encoding='utf-16be'?>
Observe that, as required for this encoding, there is no BOM. Since Observe that, as required for this encoding, there is no BOM. Since
the charset parameter is provided and there is no overriding BOM, the charset parameter is provided and there is no overriding BOM,
MIME and XML processors must treat the enclosed entity as UTF-16BE conformant MIME and XML processors must treat the enclosed entity as
encoded. UTF-16BE encoded.
See also the additional considerations in the UTF-16 example See also the additional considerations in the UTF-16 example
(Section 9.2) above. (Section 9.2) above.
9.7. Non-UTF Charset 9.7. Non-UTF Charset
Content-Type: application/xml; charset=iso-2022-kr Content-Type: application/xml; charset=iso-2022-kr
<?xml version="1.0" encoding="iso-2022-kr"?> <?xml version="1.0" encoding="iso-2022-kr"?>
This example shows the use of a non-UTF character encoding (in this This example shows the use of a non-UTF character encoding (in this
case Hangul, but this example is intended to cover all non-UTF-family case Hangul, but this example is intended to cover all non-UTF-family
character encodings). Since the charset parameter is provided and character encodings). Since the charset parameter is provided and
there is no overriding BOM, all processors must treat the enclosed there is no overriding BOM, conformant processors must treat the
entity as encoded per RFC 1557. enclosed entity as encoded per RFC 1557.
Since ISO-2022-KR [RFC1557] has been defined to use only 7 bits of Since ISO-2022-KR [RFC1557] has been defined to use only 7 bits of
data, no content-transfer-encoding is necessary with any transport: data, no content-transfer-encoding is necessary with any transport:
for character sets needing 8 or more bits, considerations such as for character sets needing 8 or more bits, considerations such as
those discussed above (Section 9.1, Section 9.2) would apply. those discussed above (Section 9.1, Section 9.2) would apply.
9.8. INCONSISTENT EXAMPLE: Conflicting Charset and Internal Encoding 9.8. INCONSISTENT EXAMPLE: Conflicting Charset and Internal Encoding
Declaration Declaration
Content-Type: application/xml; charset=iso-8859-1 Content-Type: application/xml; charset=iso-8859-1
<?xml version="1.0" encoding="utf-8"?> <?xml version="1.0" encoding="utf-8"?>
Although the charset parameter is provided in the Content-Type header Although the charset parameter is provided in the Content-Type header
and there is no BOM and the charset parameter differs from the XML and there is no BOM and the charset parameter differs from the XML
encoding declaration, MIME and XML processors will interoperate. encoding declaration, conformant MIME and XML processors will
Since the charset parameter is authoritative in the absence of a BOM, interoperate. Since the charset parameter is authoritative in the
all processors will treat the enclosed entity as iso-8859-1 encoded. absence of a BOM, conformant processors will treat the enclosed
That is, the "UTF-8" encoding declaration will be ignored. entity as iso-8859-1 encoded. That is, the "UTF-8" encoding
declaration will be ignored.
Processors generating XML MIME entities must not label conflicting Conformant processors generating XML MIME entities must not label
character encoding information between the MIME Content-Type and the conflicting character encoding information between the MIME Content-
XML declaration unless they have definitive information about the Type and the XML declaration unless they have definitive information
actual encoding, for example as a result of systematic transcoding. about the actual encoding, for example as a result of systematic
In particular, the addition by servers of an explicit, site-wide transcoding. In particular, the addition by servers of an explicit,
charset parameter default has frequently led to interoperability site-wide charset parameter default has frequently led to
problems for XML documents. interoperability problems for XML documents.
9.9. INCONSISTENT EXAMPLE: Conflicting Charset and BOM 9.9. INCONSISTENT EXAMPLE: Conflicting Charset and BOM
Content-Type: application/xml; charset=iso-8859-1 Content-Type: application/xml; charset=iso-8859-1
{BOM}<?xml version="1.0"?> {BOM}<?xml version="1.0"?>
Although the charset parameter is provided in the Content-Type Although the charset parameter is provided in the Content-Type
header, there is a BOM, so MIME and XML processors may not header, there is a BOM, so MIME and XML processors may not
interoperate. Since the BOM is authoritative for XML interoperate. Since the BOM is authoritative for
processors, they will treat the enclosed entity as UTF-16-encoded. conformant XML processors, they will treat the enclosed entity as
That is, the "iso-8859-1" charset parameter will be ignored. XML- UTF-16-encoded. That is, the "iso-8859-1" charset parameter will be
unaware MIME processors on the other hand may be unaware of the BOM ignored. XML-unaware MIME processors on the other hand may be
and so treat the entity as encoded in iso-8859-1. unaware of the BOM and so treat the entity as encoded in iso-8859-1.
Processors generating XML MIME entities must not label conflicting Conformant processors generating XML MIME entities must not label
character encoding information between the MIME Content-Type and an conflicting character encoding information between the MIME Content-
entity-initial BOM. Type and an entity-initial BOM.
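The precedence that the examples in this section illustrate (a BOM overrides everything, otherwise the charset parameter is authoritative, otherwise the encoding declaration applies, otherwise UTF-8) can be sketched roughly as below. This is a non-normative simplification, not a conformant processor: the name choose_encoding is hypothetical, only UTF-8 and UTF-16 BOMs are recognized, and the encoding declaration is read with a regular expression that assumes an ASCII-compatible encoding.

```python
import codecs
import re
from typing import Optional

# Matches an encoding declaration at the very start of the entity.
# Assumption of this sketch: the entity is in an ASCII-compatible encoding.
_ENCODING_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def choose_encoding(charset: Optional[str], body: bytes) -> str:
    """Pick the encoding an XML processor would use (hypothetical helper)."""
    # 1. An entity-initial BOM is authoritative, even over a charset parameter.
    if body.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if body.startswith((codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)):
        return "utf-16"
    # 2. Without a BOM, the charset parameter is authoritative.
    if charset:
        return charset.lower()
    # 3. Otherwise fall back to the internal encoding declaration.
    match = _ENCODING_DECL.match(body)
    if match:
        return match.group(1).decode("ascii").lower()
    # 4. No BOM, no charset, no declaration: assume UTF-8.
    return "utf-8"
```

For instance, the inputs of Section 9.8 (charset=iso-8859-1 with a "utf-8" declaration and no BOM) yield "iso-8859-1", while those of Section 9.9 (the same charset with a UTF-16 BOM) yield "utf-16".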
10. IANA Considerations 10. IANA Considerations
As described in Section 8, this specification updates the [RFC6839] As described in Section 8, this specification updates the [RFC6839]
registration for XML-based MIME types (the "+xml" types). registration for XML-based MIME types (the '+xml' types).
10.1. Using '+xml' when Registering XML-based Media Types
When a new media type is introduced for an XML-based format, the name
of the media type SHOULD end with '+xml' unless generic XML
processing is in some way inappropriate for documents of the new
type. This convention will allow applications that can process XML
generically to detect that the MIME entity is supposed to be an XML
document, verify this assumption by invoking some XML processor, and
then process the XML document accordingly. Applications may check
for types that represent XML MIME entities by comparing the last four
characters of the subtype to the string '+xml'. (However note that 4
of the 5 media types defined in this specification -- text/xml,
application/xml, text/xml-external-parsed-entity, and application/
xml-external-parsed-entity -- also represent XML MIME entities while
not ending with '+xml'.)
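As an illustration only, the check described above might be sketched in Python as follows. The function name is_xml_media_type and the set GENERIC_XML_TYPES are hypothetical, introduced for this example; dropping parameters and lower-casing the type before the comparison are assumptions of the sketch rather than requirements stated here.

```python
# The four media types from this specification that represent XML MIME
# entities without ending in '+xml' (application/xml-dtd is not one of them).
GENERIC_XML_TYPES = frozenset({
    "text/xml",
    "application/xml",
    "text/xml-external-parsed-entity",
    "application/xml-external-parsed-entity",
})


def is_xml_media_type(media_type: str) -> bool:
    """Return True if the media type is expected to label an XML MIME entity."""
    # Drop parameters such as "; charset=utf-8" and normalize case.
    essence = media_type.split(";", 1)[0].strip().lower()
    if "/" not in essence:
        return False
    subtype = essence.split("/", 1)[1]
    # Compare the last four characters of the subtype to '+xml'.
    return essence in GENERIC_XML_TYPES or subtype[-4:] == "+xml"
```

A caller would still need to verify the assumption by invoking an XML processor, since the suffix only signals that the entity is supposed to be XML.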
NOTE: Section 5.3.2 of HTTPbis [HTTPbis] does not support any form
of Accept header which will match only '+xml' types. In
particular, Accept headers of the form "Accept: */*+xml" are not
allowed, and so this header MUST NOT be used for this purpose.
Media types following the naming convention '+xml' SHOULD introduce
the charset parameter for consistency, since XML-generic processing
applies the same program for any such media type. However, there are
some cases that the charset parameter need not be introduced. For
example:
When an XML-based media type is restricted to UTF-8, it is not
necessary to introduce the charset parameter. UTF-8 is the
default for XML.
When an XML-based media type is restricted to UTF-8 and UTF-16, it
might not be unreasonable to omit the charset parameter. Neither
UTF-8 nor UTF-16 require XML encoding declarations.
XML generic processing is not always appropriate for XML-based media
types. For example, authors of some such media types may wish that
the types remain entirely opaque except to applications that are
specifically designed to deal with that media type. By NOT following
the naming convention '+xml', such media types can avoid XML-generic
processing. Since generic processing will be useful in many cases,
however -- including in some situations that are difficult to predict
ahead of time -- the '+xml' convention is to be preferred unless
there is some particularly compelling reason not to.
The registration process for specific '+xml' media types is described
in [RFC6838]. The registrar for the IETF tree will encourage new
XML-based media type registrations in the IETF tree to follow this
guideline. Registrars for other trees SHOULD follow this convention
in order to ensure maximum interoperability of their XML-based
documents. Only media subtypes that represent XML MIME entities are
allowed to register with a '+xml' suffix.
In addition to the changes described above, the change controller has
been changed to be the World Wide Web Consortium (W3C).
10.2. Registration Guidelines for XML-based Media Types Not Using
'+xml'
Registrations for new XML-based media types which do _not_ use the
'+xml' suffix SHOULD, in specifying the charset parameter and
encoding considerations, define them as: "Same as [charset parameter
/ encoding considerations] of application/xml as specified in RFC
XXXX."
Enabling the charset parameter is RECOMMENDED, since this information
can be used by XML processors to determine authoritatively the
character encoding of the XML MIME entity in the absence of a BOM.
If there are some reasons not to follow this advice, they SHOULD be
included as part of the registration. As shown above, two such
reasons are "UTF-8 only" or "UTF-8 or UTF-16 only".
These registrations SHOULD specify that the XML-based media type
being registered has all of the security considerations described in
RFC XXXX plus any additional considerations specific to that media
type.
These registrations SHOULD also make reference to RFC XXXX in
specifying magic numbers, base URIs, and use of the BOM.
These registrations MAY reference the application/xml registration in
RFC XXXX in specifying interoperability and fragment identifier
considerations, if these considerations are not overridden by issues
specific to that media type.
11. Security Considerations 11. Security Considerations
XML MIME entities contain information which may be parsed and further XML MIME entities contain information which may be parsed and further
processed by the recipient. These entities may contain, and processed by the recipient. These entities may contain, and
recipients may permit, explicit system level commands to be executed recipients may permit, explicit system level commands to be executed
while processing the data. To the extent that a recipient while processing the data. To the extent that a recipient
application executes arbitrary command strings from within XML MIME application executes arbitrary command strings from within XML MIME
entities, they may be at risk. entities, they may be at risk.
skipping to change at page 23, line 49 skipping to change at page 24, line 40
The simplest attack involves adding declarations that break The simplest attack involves adding declarations that break
validation. Adding extraneous declarations to a list of character validation. Adding extraneous declarations to a list of character
XML-entities can effectively "break the contract" used by documents. XML-entities can effectively "break the contract" used by documents.
A tiny change that produces a fatal error in a DTD could halt XML A tiny change that produces a fatal error in a DTD could halt XML
processing on a large scale. Extraneous declarations are fairly processing on a large scale. Extraneous declarations are fairly
obvious, but more sophisticated tricks, like changing attributes from obvious, but more sophisticated tricks, like changing attributes from
being optional to required, can be difficult to track down. Perhaps being optional to required, can be difficult to track down. Perhaps
the most dangerous option available to attackers, when external DTD the most dangerous option available to attackers, when external DTD
subsets or external parameter entities or other externally-specified subsets or external parameter entities or other externally-specified
defaulting is involved, is redefining default values for attributes: defaulting is involved, is redefining default values for attributes:
e.g. if developers have relied on defaulted attributes for security, e.g. if developers have relied on defaulted attributes for security,
a relatively small change might expose enormous quantities of a relatively small change might expose enormous quantities of
information. information.
Apart from the structural possibilities, another option, "XML-entity Apart from the structural possibilities, another option, "XML-entity
spoofing," can be used to insert text into documents, vandalizing and spoofing," can be used to insert text into documents, vandalizing and
perhaps conveying an unintended message. Because XML permits perhaps conveying an unintended message. Because XML permits
multiple XML-entity declarations, and the first declaration takes multiple XML-entity declarations, and the first declaration takes
precedence, it is possible to insert malicious content where an XML- precedence, it is possible to insert malicious content where an XML-
entity reference is used, such as by inserting the full text of entity reference is used, such as by inserting the full text of
Winnie the Pooh in place of every occurrence of &mdash;. Winnie the Pooh in place of every occurrence of &mdash;.
skipping to change at page 25, line 9 skipping to change at page 25, line 48
recursive expansions may cause problems with the finite computing recursive expansions may cause problems with the finite computing
resources of computers, if they are performed many times. For resources of computers, if they are performed many times. For
example, consider the case where XML-entity A consists of 100 copies example, consider the case where XML-entity A consists of 100 copies
of XML-entity B, which in turn consists of 100 copies of XML-entity of XML-entity B, which in turn consists of 100 copies of XML-entity
C, and so on. C, and so on.
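The growth in the example above is exponential in the nesting depth, which a short non-normative calculation makes concrete. The function name expanded_copies is hypothetical; the fan-out of 100 is the figure used in the text.

```python
def expanded_copies(fanout: int, depth: int) -> int:
    """Copies of the innermost XML-entity after `depth` levels of nesting,
    where each level consists of `fanout` references to the next one."""
    return fanout ** depth


# A -> 100 x B -> 100 x C: two levels already give ten thousand copies of C.
two_levels = expanded_copies(100, 2)
# Five levels give ten billion copies of the innermost XML-entity.
five_levels = expanded_copies(100, 5)
```

Even a few levels of nesting can therefore exceed the memory of a recipient that expands XML-entities without a limit.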
12. References 12. References
12.1. Normative References 12.1. Normative References
[HTTPbis] Fielding, R., Ed. and J. F. Reschke, Ed., "Hypertext [HTTPbis] Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer
Transfer Protocol (HTTP/1.1): Message Syntax and Routing", Protocol (HTTP/1.1): Message Syntax and Routing", draft-
draft-ietf-httpbis-p1-messaging-25 (work in progress), ietf-httpbis-p1-messaging-25 (work in progress), November
November 2013. 2013.
[IANA-charsets] [IANA-charsets]
IANA, "Character Sets Registry", 2013, <http:// IANA, "Character Sets Registry", 2013,
www.iana.org/assignments/character-sets/character- <http://www.iana.org/assignments/character-sets/
sets.xhtml>. character-sets.xhtml>.
[RFC2045] Freed, N. and N. Borenstein, "Multipurpose Internet Mail [RFC2045] Freed, N. and N. Borenstein, "Multipurpose Internet Mail
Extensions (MIME) Part One: Format of Internet Message Extensions (MIME) Part One: Format of Internet Message
Bodies", RFC 2045, November 1996. Bodies", RFC 2045, November 1996.
[RFC2046] Freed, N. and N. Borenstein, "Multipurpose Internet Mail [RFC2046] Freed, N. and N. Borenstein, "Multipurpose Internet Mail
Extensions (MIME) Part Two: Media Types", RFC 2046, Extensions (MIME) Part Two: Media Types", RFC 2046,
November 1996. November 1996.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
skipping to change at page 26, line 16 skipping to change at page 27, line 5
Structured Syntax Suffixes", RFC 6839, January 2013. Structured Syntax Suffixes", RFC 6839, January 2013.
[UNICODE] The Unicode Consortium, "The Unicode Standard, Version [UNICODE] The Unicode Consortium, "The Unicode Standard, Version
6.3.0", 2013, 6.3.0", 2013,
<http://www.unicode.org/versions/Unicode6.3.0/>. <http://www.unicode.org/versions/Unicode6.3.0/>.
Defined by: The Unicode Standard, Version 6.3 (Mountain Defined by: The Unicode Standard, Version 6.3 (Mountain
View, CA: The Unicode Consortium, 2013. ISBN View, CA: The Unicode Consortium, 2013. ISBN
978-1-936213-08-5) 978-1-936213-08-5)
[XML1.1] Bray, T., Paoli, J., Sperberg-McQueen, C.M., Maler, E., [XML1.1] Bray, T., Paoli, J., Sperberg-McQueen, C., Maler, E.,
Yergeau, F., and J. Cowan, "Extensible Markup Language Yergeau, F., and J. Cowan, "Extensible Markup Language
(XML) 1.1 (Second Edition)", W3C Recommendation REC-xml, (XML) 1.1 (Second Edition)", W3C Recommendation REC-xml,
September 2006, September 2006,
<http://www.w3.org/TR/2006/REC-xml11-20060816/>. <http://www.w3.org/TR/2006/REC-xml11-20060816/>.
Latest version available at Latest version available at [2].
[XMLBase] Marsh, J. and R. Tobin, "XML Base (Second Edition)", W3C [XMLBase] Marsh, J. and R. Tobin, "XML Base (Second Edition)", W3C
Recommendation REC-xmlbase-20090128, January 2009, Recommendation REC-xmlbase-20090128, January 2009,
<http://www.w3.org/TR/2009/REC-xmlbase-20090128/>. <http://www.w3.org/TR/2009/REC-xmlbase-20090128/>.
Latest version available at Latest version available at [3].
[XML] Bray, T., Paoli, J., Sperberg-McQueen, C.M., Maler, E., [XML] Bray, T., Paoli, J., Sperberg-McQueen, C., Maler, E., and
and F. Yergeau, "Extensible Markup Language (XML) 1.0 F. Yergeau, "Extensible Markup Language (XML) 1.0 (Fifth
(Fifth Edition)", W3C Recommendation REC-xml, November Edition)", W3C Recommendation REC-xml, November 2008,
2008, <http://www.w3.org/TR/2008/REC-xml-20081126/>. <http://www.w3.org/TR/2008/REC-xml-20081126/>.
Latest version available at Latest version available at [1].
[XPointerElement] [XPointerElement]
Grosso, P., Maler, E., Marsh, J., and N. Walsh, "XPointer Grosso, P., Maler, E., Marsh, J., and N. Walsh, "XPointer
element() Scheme", W3C Recommendation REC-XPointer- element() Scheme", W3C Recommendation REC-XPointer-
Element, March 2003, Element, March 2003,
<http://www.w3.org/TR/2003/REC-xptr-element-20030325/>. <http://www.w3.org/TR/2003/REC-xptr-element-20030325/>.
Latest version available at Latest version available at [4].
[XPointerFramework] [XPointerFramework]
Grosso, P., Maler, E., Marsh, J., and N. Walsh, "XPointer Grosso, P., Maler, E., Marsh, J., and N. Walsh, "XPointer
Framework", W3C Recommendation REC-XPointer-Framework, Framework", W3C Recommendation REC-XPointer-Framework,
March 2003, March 2003,
<http://www.w3.org/TR/2003/REC-xptr-framework-20030325/>. <http://www.w3.org/TR/2003/REC-xptr-framework-20030325/>.
Latest version available at Latest version available at [5].
[XPtrRegPolicy] [XPtrRegPolicy]
Hazael-Massieux, D., "XPointer Scheme Name Registry Hazael-Massieux, D., "XPointer Scheme Name Registry
Policy", 2005, Policy", 2005,
<http://www.w3.org/2005/04/xpointer-policy.html>. <http://www.w3.org/2005/04/xpointer-policy.html>.
[XPtrReg] Hazael-Massieux, D., "XPointer Registry", 2005, [XPtrReg] Hazael-Massieux, D., "XPointer Registry", 2005,
<http://www.w3.org/2005/04/xpointer-schemes/>. <http://www.w3.org/2005/04/xpointer-schemes/>.
12.2. Informative References 12.2. Informative References
[ASCII] American National Standards Institute, "Coded Character [ASCII] American National Standards Institute, "Coded Character
Set -- 7-bit American Standard Code for Information Set -- 7-bit American Standard Code for Information
Interchange", ANSI X3.4, 1986. Interchange", ANSI X3.4, 1986.
[AWWW] Jacobs, I. and N. Walsh, "Architecture of the World Wide [AWWW] Jacobs, I. and N. Walsh, "Architecture of the World Wide
Web, Volume One", W3C Recommendation REC-webarch-20041215, Web, Volume One", W3C Recommendation REC-webarch-20041215,
December 2004, December 2004,
<http://www.w3.org/TR/2004/REC-webarch-20041215/>. <http://www.w3.org/TR/2004/REC-webarch-20041215/>.
Latest version available at Latest version available at [8].
[FYN] Mendelsohn, N., "The Self-Describing Web", W3C TAG Finding [FYN] Mendelsohn, N., "The Self-Describing Web", W3C TAG Finding
selfDescribingDocuments-2009-02-07, February 2009, <http:/ selfDescribingDocuments-2009-02-07, February 2009,
/www.w3.org/2001/tag/doc/ <http://www.w3.org/2001/tag/doc/
selfDescribingDocuments-2009-02-07.html>. selfDescribingDocuments-2009-02-07.html>.
Latest version available at Latest version available at [9].
[Infoset] Cowan, J. and R. Tobin, "XML Information Set (Second [Infoset] Cowan, J. and R. Tobin, "XML Information Set (Second
Edition)", W3C Recommendation REC-xml-infoset-20040204, Edition)", W3C Recommendation REC-xml-infoset-20040204,
February 2004, February 2004,
<http://www.w3.org/TR/2004/REC-xml-infoset-20040204/>. <http://www.w3.org/TR/2004/REC-xml-infoset-20040204/>.
Latest version available at Latest version available at [11].
[MediaFrags] [MediaFrags]
Troncy, R., Mannens, E., Pfeiffer, S., and D. Van Deursen, Troncy, R., Mannens, E., Pfeiffer, S., and D. Van Deursen,
"Media Fragments URI 1.0 (basic)", W3C Recommendation "Media Fragments URI 1.0 (basic)", W3C Recommendation
media-frags, September 2012, media-frags, September 2012,
<http://www.w3.org/TR/2012/REC-media-frags-20120925/>. <http://www.w3.org/TR/2012/REC-media-frags-20120925/>.
Latest version available at Latest version available at [6].
[RFC1557] Choi, U., Chon, K., and H. Park, "Korean Character [RFC1557] Choi, U., Chon, K., and H. Park, "Korean Character
Encoding for Internet Messages", RFC 1557, December 1993. Encoding for Internet Messages", RFC 1557, December 1993.
[RFC2376] Whitehead, E. and M. Murata, "XML Media Types", RFC 2376, [RFC2376] Whitehead, E. and M. Murata, "XML Media Types", RFC 2376,
July 1998. July 1998.
[RFC2616] Fielding, R., Gettys, J., Mogul, J., Nielsen, H., [RFC2616] Fielding, R., Gettys, J., Mogul, J., Nielsen, H.,
Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext
Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999. Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.
skipping to change at page 28, line 34 skipping to change at page 29, line 27
[TAGMIME] Bray, T., Ed., "Internet Media Type registration, [TAGMIME] Bray, T., Ed., "Internet Media Type registration,
consistency of use", April 2004, consistency of use", April 2004,
<http://www.w3.org/2001/tag/2004/0430-mime>. <http://www.w3.org/2001/tag/2004/0430-mime>.
[XHTML] Pemberton, S., et al., "XHTML 1.0: The Extensible [XHTML] Pemberton, S., et al., "XHTML 1.0: The Extensible
HyperText Markup Language", W3C Recommendation xhtml1, HyperText Markup Language", W3C Recommendation xhtml1,
January 2000, January 2000,
<http://www.w3.org/TR/2000/REC-xhtml1-20000126/>. <http://www.w3.org/TR/2000/REC-xhtml1-20000126/>.
Latest version available at Latest version available at [7].
[XMLModel] [XMLModel]
Grosso, P. and J. Kosek, "Associating Schemas with XML Grosso, P. and J. Kosek, "Associating Schemas with XML
documents 1.0 (Third Edition)", W3C Group Note NOTE-xml- documents 1.0 (Third Edition)", W3C Group Note NOTE-xml-
model-20121009, October 2012, model-20121009, October 2012,
<http://www.w3.org/TR/2012/NOTE-xml-model-20121009/>. <http://www.w3.org/TR/2012/NOTE-xml-model-20121009/>.
Latest version available at Latest version available at [13].
[XMLNS10] Bray, T., Hollander, D., Layman, A., Tobin, R., and H. [XMLNS10] Bray, T., Hollander, D., Layman, A., Tobin, R., and H.
Thompson, "Namespaces in XML 1.0 (Third Edition)", W3C Thompson, "Namespaces in XML 1.0 (Third Edition)", W3C
Recommendation REC-xml-names-20091208, December 2009, Recommendation REC-xml-names-20091208, December 2009,
<http://www.w3.org/TR/2009/REC-xml-names-20091208/>. <http://www.w3.org/TR/2009/REC-xml-names-20091208/>.
Latest version available at Latest version available at [12].
[XMLNS11] Bray, T., Hollander, D., Layman, A., and R. Tobin, [XMLNS11] Bray, T., Hollander, D., Layman, A., and R. Tobin,
"Namespaces in XML 1.1 (Second Edition)", W3C "Namespaces in XML 1.1 (Second Edition)", W3C
Recommendation REC-xml-names11-20060816, August 2006, Recommendation REC-xml-names11-20060816, August 2006,
<http://www.w3.org/TR/2006/REC-xml-names11-20060816/>. <http://www.w3.org/TR/2006/REC-xml-names11-20060816/>.
Latest version available at Latest version available at [14].
[XMLSS] Clark, J., Pieters, S., and H. Thompson, "Associating [XMLSS] Clark, J., Pieters, S., and H. Thompson, "Associating
Style Sheets with XML documents 1.0 (Second Edition)", W3C Style Sheets with XML documents 1.0 (Second Edition)", W3C
Recommendation REC-xml-stylesheet-20101028, October 2010, Recommendation REC-xml-stylesheet-20101028, October 2010,
<http://www.w3.org/TR/2010/REC-xml-stylesheet-20101028/>. <http://www.w3.org/TR/2010/REC-xml-stylesheet-20101028/>.
Latest version available at Latest version available at [15].
[XMLid] Marsh, J., Veillard, D., and N. Walsh, "xml:id Version [XMLid] Marsh, J., Veillard, D., and N. Walsh, "xml:id Version
1.0", W3C Recommendation REC-xml-id-20050909, September 1.0", W3C Recommendation REC-xml-id-20050909, September
2005, <http://www.w3.org/TR/2005/REC-xml-id-20050909/>. 2005, <http://www.w3.org/TR/2005/REC-xml-id-20050909/>.
Latest version available at Latest version available at [10].
Appendix A. Why Use the '+xml' Suffix for XML-Based MIME Types? Appendix A. Why Use the '+xml' Suffix for XML-Based MIME Types?
[RFC3023] contains a detailed discussion of the (at the time) novel [RFC3023] contains a detailed discussion of the (at the time) novel
use of a suffix, a practice which has since become widespread. use of a suffix, a practice which has since become widespread.
Interested parties are referred to [RFC3023], Appendix A. Interested parties are referred to [RFC3023], Appendix A.
The registration process for new '+xml' media types is described in The registration process for new '+xml' media types is described in
[RFC6838]. [RFC6838].
skipping to change at page 31, line 12 skipping to change at page 31, line 50
MURATA Makoto (FAMILY Given) and Alexey Melnikov made early and MURATA Makoto (FAMILY Given) and Alexey Melnikov made early and
important contributions to the effort to revise [RFC3023]. important contributions to the effort to revise [RFC3023].
This specification reflects the input of numerous participants to the This specification reflects the input of numerous participants to the
ietf-xml-mime@imc.org, xml-mime@ietf.org and apps-discuss@ietf.org ietf-xml-mime@imc.org, xml-mime@ietf.org and apps-discuss@ietf.org
mailing lists, though any errors are the responsibility of the mailing lists, though any errors are the responsibility of the
authors. Special thanks to: authors. Special thanks to:
Mark Baker, James Clark, Dan Connolly, Martin Duerst, Ned Freed, Mark Baker, James Clark, Dan Connolly, Martin Duerst, Ned Freed,
Yaron Goland, Bjoern Hoehrmann, Rick Jelliffe, Murray S. Kucherawy, Yaron Goland, Bjoern Hoehrmann, Rick Jelliffe, Murray S. Kucherawy,
Larry Masinter, David Megginson, S. Moonesamy, Keith Moore, Chris Larry Masinter, David Megginson, S. Moonesamy, Keith Moore, Chris
Newman, Gavin Nicol, Julian Reschke, Marshall Rose, Jim Whitehead, Newman, Gavin Nicol, Julian Reschke, Marshall Rose, Jim Whitehead,
Erik Wilde and participants of the XML activity and the TAG at the Erik Wilde and participants of the XML activity and the TAG at the
W3C. W3C.
Jim Whitehead and Simon St.Laurent were editors of [RFC2376] and Jim Whitehead and Simon St.Laurent were editors of [RFC2376] and
[RFC3023], respectively. [RFC3023], respectively.
Authors' Addresses Authors' Addresses
Henry S. Thompson Henry S. Thompson
 End of changes. 64 change blocks. 
229 lines changed or deleted 268 lines changed or added
