[Docs] [txt|pdf|xml|html] [Tracker] [Email] [Diff1] [Diff2] [Nits]

Versions: 00 01 02 03 04 05 06 07 08 09 RFC 6129

Network Working Group                                          L. Romary
Internet-Draft                                  TEI Consortium and INRIA
Intended status: Informational                               S. Lundberg
Expires: April 23, 2011                    The Royal Library, Copenhagen
                                                        October 20, 2010


                  The 'application/tei+xml' mediatype
                     draft-lundberg-app-tei-xml-06

Abstract

   This document defines the 'application/tei+xml' media type for markup
   languages defined in accordance with the Text Encoding and
   Interchange guidelines

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 23, 2011.

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.




Romary & Lundberg        Expires April 23, 2011                 [Page 1]


Internet-Draft     The 'application/tei+xml' mediatype      October 2010


Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
   2.  Recognizing TEI files  . . . . . . . . . . . . . . . . . . . .  4
   3.  Fragment identifier  . . . . . . . . . . . . . . . . . . . . .  6
   4.  Security considerations  . . . . . . . . . . . . . . . . . . .  7
     4.1.  Harmful content  . . . . . . . . . . . . . . . . . . . . .  7
     4.2.  Intellectual Property Rights . . . . . . . . . . . . . . .  7
     4.3.  Authenticity and confidentiality . . . . . . . . . . . . .  7
   5.  IANA Considerations  . . . . . . . . . . . . . . . . . . . . .  9
     5.1.  Registration of MIME type 'application/tei+xml'  . . . . .  9
   6.  References . . . . . . . . . . . . . . . . . . . . . . . . . . 11
     6.1.  Normative References . . . . . . . . . . . . . . . . . . . 11
     6.2.  Informative References . . . . . . . . . . . . . . . . . . 11
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 12




































Romary & Lundberg        Expires April 23, 2011                 [Page 2]


Internet-Draft     The 'application/tei+xml' mediatype      October 2010


1.  Introduction

   Text Encoding and Interchange (TEI) is an international and
   interdisciplinary standard that is widely used by libraries, museums,
   publishers, and individual scholars to represent all kinds of textual
   material for online research and teaching.[TEI]

   This document defines the 'application/tei+xml' media type in
   accordance with [RFC3023] in order enable generic XML processing of
   TEI documents on the Internet.









































Romary & Lundberg        Expires April 23, 2011                 [Page 3]


Internet-Draft     The 'application/tei+xml' mediatype      October 2010


2.  Recognizing TEI files

   TEI files are XML documents or fragments having the root element in a
   TEI namespace.  TEI namespace URIs start with
   http://www.tei-c.org/ns/ followed by the version number of the
   namespace.  The current namespace is http://www.tei-c.org/ns/1.0

   In general, a [TEI] file usually contains either of the strings

      <TEI

      <teiCorpus

   near the beginning, and usually after an XML declaration on the first
   line, i.e., a string like "<?xml ... ?>".

   The teiCorpus documents give the possibility to bundle multiple
   documents into a single file.

   Examples:

      A document beginning with <TEI

               <?xml version="1.0" encoding="UTF-8" ?>
               <TEI xmlns="http://www.tei-c.org/ns/1.0">
               <teiHeader>
               ...
               </teiHeader>
               <text>
               ...
               </text>
               </TEI>

      A document beginning with <teiCorpus

















Romary & Lundberg        Expires April 23, 2011                 [Page 4]


Internet-Draft     The 'application/tei+xml' mediatype      October 2010


               <?xml version="1.0" encoding="UTF-8" ?>
               <teiCorpus  xmlns="http://www.tei-c.org/ns/1.0">
               <teiHeader>
               ...
               </teiHeader>
               <TEI>
               <teiHeader>
               ...
               </teiHeader>
               <text>
               ...
               </text>
               </TEI>
               <TEI>
               ... second document ...
               </TEI>
               <TEI>
               ... third document  ...
               </TEI>
               </teiCorpus>

   TEI and teiCorpus files are often given the extensions .tei and
   .teiCorpus, respectively.  There is a third type of file, which often
   is given the suffix .odd.  ODD ('One Document Does it All') is a TEI
   XML document which include schema fragments, prose documentation, and
   reference documentation.  It is used for the definition and
   documentation of XML based languages, and primarily for the TEI
   Guidelines.[ODD] In other words, ODD files does not differ from other
   TEI file in syntax, only in function.






















Romary & Lundberg        Expires April 23, 2011                 [Page 5]


Internet-Draft     The 'application/tei+xml' mediatype      October 2010


3.  Fragment identifier

   Documents having the media type 'application/tei+xml', use the
   fragment identifier notation as specified in [RFC3023] for the media
   type 'application/xml'.














































Romary & Lundberg        Expires April 23, 2011                 [Page 6]


Internet-Draft     The 'application/tei+xml' mediatype      October 2010


4.  Security considerations

   An XML resource does not in itself compromise data security.  When
   being available on a network simply through the dereferencing of an
   IRI [RFC3987] or a URI, care must be taken to properly interpret the
   data to prevent unintended access.  Hence the security issues of
   [RFC3986], section 7, apply.  In addition, as this media type uses
   the "+xml" convention, it shares the same security considerations as
   described in RFC 3023 [RFC3023], section 10.

4.1.  Harmful content

   Any application accepting submitted or retrieving TEI XML for
   processing has to be aware of risks connected with injection of
   harmful scripts and executable XML.  Even common XML inclusion or the
   use of external entities, could potentially be used to reveal aspects
   of a service that may compromise its security.  Any vulnerability of
   these kinds are, however, application specific.  The TEI namespaces
   do not contain such elements.

4.2.  Intellectual Property Rights

   TEI documents often arise in digitization of cultural heritage
   materials.  Texts made accessible in TEI format may be unrestricted
   in the sense that there are no restrictions to their distribution
   through Digital Rights Management[DRM] or Intellectual Property
   Rights[IPR] constraints.  TEI documents are heterogeneous objects.
   Some parts of a document may be unrestricted, whereas other parts,
   such as editorial text and annotations may be written much later may
   hence be subject to DRM restrictions.

   The TEI format provides means for highly granular attribution, down
   to the content of individual XML elements.  Software agents
   participating in the exchange or processing TEI must honour markup of
   this kind.  Even when there are no IPR constraints, intellectual
   property attribution alone requires that document users are able to
   tell the difference between content from different sources.

4.3.  Authenticity and confidentiality

   Historical archival records are often encoded in TEI and legal
   document may be binding centuries after they were written.
   Digitization and encoding of legal texts may require technologies for
   assuring authenticity, such as cryptographic check sums and
   electronic signatures.

   Similarly, historical documents may in part or in their entirety be
   confidential.  This may be required by law or by the terms and



Romary & Lundberg        Expires April 23, 2011                 [Page 7]


Internet-Draft     The 'application/tei+xml' mediatype      October 2010


   conditions such as in the case of donated or deposited text from
   private sources.  A text archive may need content filtering or
   cryptographic technologies to meet such requirements.
















































Romary & Lundberg        Expires April 23, 2011                 [Page 8]


Internet-Draft     The 'application/tei+xml' mediatype      October 2010


5.  IANA Considerations

5.1.  Registration of MIME type 'application/tei+xml'

      MIME media type name: application

      MIME subtype name: tei+xml

      Required parameters: None

      Optional parameters: charset

         the parameter has identical semantics to the charset parameter
         of the "application/xml" media type as specified in RFC 3023
         [RFC3023].

      Encoding considerations:

         Identical to those for 'application/xml'.  See RFC 3023
         [RFC3023], Section 3.2.

      Security considerations:

         See Security considerations (Section 4) in this specification.

      Interoperability considerations:

         TEI documents are often given the extension '.xml', which is
         not uncommon for other XML document formats.

      Published specification:

         This media type registration is for TEI documents[TEI] as
         described in this document.

      Applications which use this media type:

         There are currently no known applications using the media type
         'application/tei+xml'.

      Additional information:

         Magic number(s):

            There is no single initial octet sequence that is always
            present in TEI documents.





Romary & Lundberg        Expires April 23, 2011                 [Page 9]


Internet-Draft     The 'application/tei+xml' mediatype      October 2010


         file extension(s):

            Common extensions are '.tei', '.teiCorpus' and '.odd'.  See
            Recognizing TEI files (Section 2) in this specification.

         Macintosh File Type Code(s)

            TEXT

         Object Identifier(s) or OID(s)

            Not applicable







































Romary & Lundberg        Expires April 23, 2011                [Page 10]


Internet-Draft     The 'application/tei+xml' mediatype      October 2010


6.  References

6.1.  Normative References

   [RFC3023]  Murata, M., St. Laurent, S., and D. Kohn, "XML Media
              Types", RFC 3023, January 2001.

   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
              Resource Identifier (URI): Generic Syntax", STD 66,
              RFC 3986, January 2005.

   [TEI]      "TEI Guidelines", <http://www.tei-c.org/Vault/P5/current/
              doc/tei-p5-doc/en/html/>.

6.2.  Informative References

   [DRM]      "Digital rights management",
              <http://en.wikipedia.org/wiki/Digital_rights_management>.

   [IPR]      "Intellectual property", <http://en.wikipedia.org/wiki/
              Intellectual_Property_Rights>.

   [ODD]      "Getting Started with P5 ODDs",
              <http://www.tei-c.org/Guidelines/Customization/odds.xml>.

   [RFC3987]  Duerst, M. and M. Suignard, "Internationalized Resource
              Identifiers (IRIs)", RFC 3987, January 2005.
























Romary & Lundberg        Expires April 23, 2011                [Page 11]


Internet-Draft     The 'application/tei+xml' mediatype      October 2010


Authors' Addresses

   Laurent Romary
   TEI Consortium and INRIA


   Email: laurent.romary@inria.fr
   URI:   http://www.tei-c.org/


   Sigfrid Lundberg
   The Royal Library, Copenhagen
   Postbox 2149
   1016 Koebenhavn K
   Denmark

   Email: slu@kb.dk
   URI:   http://sigfrid-lundberg.se/

































Romary & Lundberg        Expires April 23, 2011                [Page 12]


Html markup produced by rfcmarkup 1.129c, available from https://tools.ietf.org/tools/rfcmarkup/