[Docs] [txt|pdf|xml|html] [Tracker] [Email] [Diff1] [Diff2] [Nits]
Versions: 00 01 02 03 04 05 06 07 08 09 RFC 6129
Network Working Group L. Romary
Internet-Draft TEI Consortium and INRIA
Intended status: Standards Track S. Lundberg
Expires: April 22, 2011 The Royal Library, Copenhagen
October 19, 2010
The 'application/tei+xml' mediatype
draft-lundberg-app-tei-xml-04
Abstract
This document defines the 'application/tei+xml' media type for markup
languages defined in accordance with the Text Encoding and
Interchange guidelines
Status of this Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on April 22, 2011.
Copyright Notice
Copyright (c) 2010 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Romary & Lundberg Expires April 22, 2011 [Page 1]
Internet-Draft The 'application/tei+xml' mediatype October 2010
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Recognizing TEI files . . . . . . . . . . . . . . . . . . . . 4
3. Fragment identifier . . . . . . . . . . . . . . . . . . . . . 6
4. Security considerations . . . . . . . . . . . . . . . . . . . 7
4.1. Harmful content . . . . . . . . . . . . . . . . . . . . . 7
4.2. Intellectual Property Rights . . . . . . . . . . . . . . . 7
4.3. Authenticity and confidentiality . . . . . . . . . . . . . 7
5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 9
5.1. Registration of MIME type 'application/tei+xml' . . . . . 9
6. References . . . . . . . . . . . . . . . . . . . . . . . . . . 11
6.1. Normative References . . . . . . . . . . . . . . . . . . . 11
6.2. Informative References . . . . . . . . . . . . . . . . . . 11
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 12
Romary & Lundberg Expires April 22, 2011 [Page 2]
Internet-Draft The 'application/tei+xml' mediatype October 2010
1. Introduction
Text Encoding and Interchange (TEI) is an international and
interdisciplinary standard that is widely used by libraries, museums,
publishers, and individual scholars to represent all kinds of textual
material for online research and teaching.[TEI]
This document defines the 'application/tei+xml' media type in
accordance with [RFC3023] in order enable generic XML processing of
TEI documents on the Internet.
Romary & Lundberg Expires April 22, 2011 [Page 3]
Internet-Draft The 'application/tei+xml' mediatype October 2010
2. Recognizing TEI files
TEI files are XML documents or fragments having the root element in a
TEI namespace. TEI namespace URIs start with
http://www.tei-c.org/ns/ followed by the version number of the
namespace. The current namespace is http://www.tei-c.org/ns/1.0
In general, a [TEI] file usually contains either of the strings
<TEI
<teiCorpus
near the beginning, and usually after an XML declaration on the first
line, i.e., a string like "<?xml ... ?>".
The teiCorpus documents give the possibility to bundle multiple
documents into a single file.
Examples:
A document beginning with <TEI
<?xml version="1.0" encoding="UTF-8" ?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
<teiHeader>
...
</teiHeader>
<text>
...
</text>
</TEI>
A document beginning with <teiCorpus
Romary & Lundberg Expires April 22, 2011 [Page 4]
Internet-Draft The 'application/tei+xml' mediatype October 2010
<?xml version="1.0" encoding="UTF-8" ?>
<teiCorpus xmlns="http://www.tei-c.org/ns/1.0">
<teiHeader>
...
</teiHeader>
<TEI>
<teiHeader>
...
</teiHeader>
<text>
...
</text>
</TEI>
<TEI>
... second document ...
</TEI>
<TEI>
... third document ...
</TEI>
</teiCorpus>
Romary & Lundberg Expires April 22, 2011 [Page 5]
Internet-Draft The 'application/tei+xml' mediatype October 2010
3. Fragment identifier
Documents having the media type 'application/tei+xml', use the
fragment identifier notation as specified in [RFC3023] for the media
type 'application/xml'.
Romary & Lundberg Expires April 22, 2011 [Page 6]
Internet-Draft The 'application/tei+xml' mediatype October 2010
4. Security considerations
An XML resource does not in itself compromise data security. When
being available on a network simply through the dereferencing of an
IRI [RFC3987] or a URI, care must be taken to properly interpret the
data to prevent unintended access. Hence the security issues of
[RFC3986], section 7, apply. In addition, as this media type uses
the "+xml" convention, it shares the same security considerations as
described in RFC 3023 [RFC3023], section 10.
4.1. Harmful content
Any application accepting submitted or retrieving TEI XML for
processing has to be aware of risks connected with injection of
harmful scripts and executable XML. Even common XML inclusion or the
use of external entities, could potentially be used to reveal aspects
of a service that may compromise its security. Any vulnerability of
these kinds are, however, application specific. The TEI namespaces
do not contain such elements.
4.2. Intellectual Property Rights
TEI documents often arise in digitization of cultural heritage
materials. Texts made accessible in TEI format may be unrestricted
in the sense that there are no restrictions to their distribution
through Digital Rights Management[DRM] or Intellectual Property
Rights[IPR] constraints. TEI documents are heterogeneous objects.
Some parts of a document may be unrestricted, whereas other parts,
such as editorial text and annotations may be written much later may
hence be subject to DRM restrictions.
The TEI format provides means for highly granular attribution, down
to the content of individual XML elements. Software agents
participating in the exchange or processing TEI must honour markup of
this kind. Even when there are no IPR constraints, intellectual
property attribution alone requires that document users are able to
tell the difference between content from different sources.
4.3. Authenticity and confidentiality
Historical archival records are often encoded in TEI and legal
document may be binding centuries after they were written.
Digitization and encoding of legal texts may require technologies for
assuring authenticity, such as cryptographic check sums and
electronic signatures.
Similarly, historical documents may in part or in their entirety be
confidential. This may be required by law or by the terms and
Romary & Lundberg Expires April 22, 2011 [Page 7]
Internet-Draft The 'application/tei+xml' mediatype October 2010
conditions such as in the case of donated or deposited text from
private sources. A text archive may need content filtering or
cryptographic technologies to meet such requirements.
Romary & Lundberg Expires April 22, 2011 [Page 8]
Internet-Draft The 'application/tei+xml' mediatype October 2010
5. IANA Considerations
5.1. Registration of MIME type 'application/tei+xml'
MIME media type name: application
MIME subtype name: tei+xml
Required parameters: None
Optional parameters: charset
the parameter has identical semantics to the charset parameter
of the "application/xml" media type as specified in RFC 3023
[RFC3023].
Encoding considerations:
Identical to those for 'application/xml'. See RFC 3023
[RFC3023], Section 3.2.
Security considerations:
See Security considerations (Section 4) in this specification.
Interoperability considerations:
TEI documents often have the extension '.xml', which is not
uncommon for other XML document formats.
Published specification:
This media type registration is for TEI documents[TEI] as
described in this document.
Applications which use this media type:
There are currently no known applications using the media type
'application/tei+xml'.
Additional information:
Magic number(s):
There is no single initial octet sequence that is always
present in TEI documents.
Romary & Lundberg Expires April 22, 2011 [Page 9]
Internet-Draft The 'application/tei+xml' mediatype October 2010
file extension(s):
Common extensions are '.tei', '.teiCorpus' and '.odd'.
Macintosh File Type Code(s)
TEXT
Object Identifier(s) or OID(s)
Not applicable
Romary & Lundberg Expires April 22, 2011 [Page 10]
Internet-Draft The 'application/tei+xml' mediatype October 2010
6. References
6.1. Normative References
[RFC3023] Murata, M., St. Laurent, S., and D. Kohn, "XML Media
Types", RFC 3023, January 2001.
[RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
Resource Identifier (URI): Generic Syntax", STD 66,
RFC 3986, January 2005.
[TEI] "TEI Guidelines", <http://www.tei-c.org/Vault/P5/current/
doc/tei-p5-doc/en/html/>.
6.2. Informative References
[DRM] "Digital rights management",
<http://en.wikipedia.org/wiki/Digital_rights_management>.
[IPR] "Intellectual property", <http://en.wikipedia.org/wiki/
Intellectual_Property_Rights>.
[RFC3987] Duerst, M. and M. Suignard, "Internationalized Resource
Identifiers (IRIs)", RFC 3987, January 2005.
Romary & Lundberg Expires April 22, 2011 [Page 11]
Internet-Draft The 'application/tei+xml' mediatype October 2010
Authors' Addresses
Laurent Romary
TEI Consortium and INRIA
Email: laurent.romary@inria.fr
URI: http://www.tei-c.org/
Sigfrid Lundberg
The Royal Library, Copenhagen
Postbox 2149
1016 Koebenhavn K
Denmark
Email: slu@kb.dk
URI: http://sigfrid-lundberg.se/
Romary & Lundberg Expires April 22, 2011 [Page 12]
Html markup produced by rfcmarkup 1.129d, available from
https://tools.ietf.org/tools/rfcmarkup/