Network Working Group                                        T. Kindberg
Internet-Draft                               Hewlett-Packard Corporation
Expires: April 18, 2005                                         S. Hawke
                                               World Wide Web Consortium
                                                        October 18, 2004


                          The 'tag' URI scheme
                       draft-kindberg-tag-uri-06

Status of this Memo

   This document is an Internet-Draft and is subject to all provisions
   of section 3 of RFC 3667.  By submitting this Internet-Draft, each
   author represents that any applicable patent or other IPR claims of
   which he or she is aware have been or will be disclosed, and any of
   which he or she become aware will be disclosed, in accordance with
   RFC 3668.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on April 18, 2005.

Copyright Notice

   Copyright (C) The Internet Society (2004).

Abstract

   This document describes the "tag" Uniform Resource Identifier (URI)
   scheme.  Tag URIs (also known as "tags") are designed to be unique
   across space and time while being tractable to humans.  They are
   distinct from most other URIs in that there is no authoritative
   resolution mechanism.  A tag may be used purely as an entity
   identifier.  Furthermore, using tags has some advantages over the



Kindberg & Hawke         Expires April 18, 2005                 [Page 1]

Internet-Draft                  Tag URIs                    October 2004


   common practice of using "http" URIs as identifiers for
   non-HTTP-accessible resources.

Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119.

Disclaimer

   The views and opinions of authors expressed herein do not necessarily
   state or reflect those of the World Wide Web Consortium, and may not
   be used for advertising or product endorsement purposes.  This
   proposal has not undergone technical review within the Consortium and
   must not be construed as a Consortium recommendation.

Further Information and Discussion of this Document

   Information about the tag URI scheme additional to this document --
   motivation, genesis and discussion -- can be obtained from
   http://www.taguri.org.

   Earlier drafts of this document have been discussed on uri@w3.org.
   The authors welcome further discussion and comments.

Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  4
   2.  Tag Syntax and Rules . . . . . . . . . . . . . . . . . . . . .  5
     2.1   Tag Syntax and Examples  . . . . . . . . . . . . . . . . .  5
     2.2   Rules for Minting Tags . . . . . . . . . . . . . . . . . .  6
     2.3   Resolution of Tags . . . . . . . . . . . . . . . . . . . .  8
     2.4   Equality of Tags . . . . . . . . . . . . . . . . . . . . .  8
   3.  Internationalisation . . . . . . . . . . . . . . . . . . . . .  8
   4.  Security Considerations  . . . . . . . . . . . . . . . . . . .  9
   5.  References . . . . . . . . . . . . . . . . . . . . . . . . . . 10
     5.1   Normative References . . . . . . . . . . . . . . . . . . . 10
     5.2   Informative References . . . . . . . . . . . . . . . . . . 11
       Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . 11
       Intellectual Property and Copyright Statements . . . . . . . . 13

1.  Introduction

   A tag is a type of Uniform Resource Identifier (URI) [1] designed to
   meet the following requirements:

   1.  Identifiers are likely to be unique across space and time, and
       come from a practically inexhaustible supply.
   2.  Identifiers are relatively convenient for humans to mint
       (create), read, type, remember etc.
   3.  No central registration is necessary, at least for holders of
       domain names or email addresses; and there is negligible cost to
       mint each new identifier.
   4.  The identifiers are independent of any particular resolution
       scheme.

   For example, the above requirements may apply in the case of a user
   who wants to place identifiers on their documents:

   a.  The user wants to be reasonably sure that the identifier is
       unique.  Global uniqueness is valuable because it prevents
       identifiers from becoming unintentionally ambiguous.
   b.  The identifiers should be tractable to the user, who should, for
       example, be able to mint new identifiers conveniently, to
       memorise them, and to type them into emails and forms.
   c.  The user does not want to have to communicate with anyone else in
       order to mint identifiers for their documents.
   d.  The user wants to avoid identifiers that might be taken to imply
       the existence of an electronic resource accessible via a default
       resolution mechanism, when no such electronic resource exists.

   Existing identification schemes satisfy some but not all of the
   requirements above.  For example:

   UUIDs [9], [10] are hard for humans to read.

   OIDs [11], [12] and Digital Object Identifiers [13] require entities
   to register as naming authorities, even in cases where the entity
   already holds a domain name registration.

   URLs (in particular, "http" URLs) are sometimes used as identifiers
   that satisfy most of the above requirements.  Many users and
   organisations have already registered a domain name, and the use of
   the domain name to mint identifiers comes at no additional cost.  But
   there are drawbacks to URLs-as-identifiers:

   o  An attempt may be made to resolve a URL-as-identifier, even though
      there is no resource accessible at the "location".
   o  Domain names change hands and the new assignee of a domain name
      can't be sure that they are minting new names.  For example, if
      example.org is assigned first to a user Smith and then to a user
      Jones, there is no systematic way for Jones to tell whether Smith
      has already used a particular identifier such as
      http://example.org/9999.
   o  Entities could rely on purl.org or a similar service as a
      (first-come, first-served) assigner of unique URIs; but a solution
      without reliance upon another entity such as the Online Computer
      Library Center (OCLC, which runs purl.org) may be preferable.

   Lastly, many entities -- especially individuals -- are assignees of
   email addresses but not domain names.  It would be preferable to
   enable those entities to mint unique identifiers.

2.  Tag Syntax and Rules

   This section first specifies the syntax of tag URIs and gives
   examples.  It then describes a set of rules for minting tags designed
   to make them unique.  Finally, it discusses the resolution and
   comparison of tags.

2.1  Tag Syntax and Examples

   The general syntax of a tag URI, in ABNF [2], is:

      tagURI        = "tag:" taggingEntity ":" specific

   Where:

      taggingEntity = authorityName "," date
      authorityName = DNSname / emailAddress
      date          = year ["-" month ["-" day ]]
                      ; see ISO8601 [3] & RFC3339 [14]
      year          = 4DIGIT
      month         = 2DIGIT
      day           = 2DIGIT
      DNSname       = DNScomp *( "." DNScomp )  ; see RFC1035 [4]
      DNScomp       = dnsChar [*(dnsChar /"-") dnsChar]
      dnsChar       = alphaNum / pct-encoded
                      ; pct-encoded from RFCXXXX [1]
      emailAddress  = 1*(alphaNum /"-"/"."/"_"/pct-encoded) "@" DNSname
      alphaNum      = DIGIT / ALPHA
      specific      = *( pchar / "/" / "?" ) ; pchar from RFCXXXX [1]

   The component "taggingEntity" is the name space part of the URI.  To
   avoid ambiguity, the domain name in "authorityName" (whether an email
   address or a simple domain name) MUST be fully qualified.  It is
   RECOMMENDED that the domain name should be in lowercase form.

   Alternative formulations of the same authority name will be counted
   as distinct and hence tags containing them will be unequal (see
   Section 2.4).  For example, tags beginning "tag:EXAMPLE.com,2000:"
   are never equal to those beginning "tag:example.com,2000:", even
   though they refer to the same domain name.

   Authority names could, in principle, be drawn from any syntactically
   distinct namespace whose names are each assigned to at most one
   entity at a time.  Such namespaces include, for example, certain IP
   addresses, certain MAC addresses, and telephone numbers.  However, to
   simplify the tag scheme, we restrict authority names to domain names
   and email addresses.  Future standards efforts may allow use of other
   authority names following syntax that is disjoint from this syntax.
   To allow for such developments, software that processes tags MUST NOT
   reject them on the grounds that they are outside the syntax for
   authorityName defined above.

   The component "specific" is the name-space-specific part of the URI:
   it is a string of URI characters (see restrictions in syntax
   specification) chosen by the minter of the URI.  It is RECOMMENDED
   that specific identifiers should be human-friendly.

   In the interests of tractability to humans, tags SHOULD NOT be minted
   with percent-encoded parts.  However, the tag syntax does allow
   percent-encoded characters, in the "pct-encoded" and "pchar" elements
   (both defined in RFCXXXX [1]), for two purposes.  Those purposes are:
   (a) to verify the syntactic conformance of tags minted as
   Internationalized Resource Identifiers (IRIs) [5]; and (b) to
   represent tag IRIs in systems that cannot handle IRIs.  See Section 3
   for further discussion of internationalisation.

   Examples of tag URIs are:

      tag:timothy@hpl.hp.com,2001:web/externalHome
      tag:sandro@w3.org,2004-05:Sandro
      tag:my-ids.com,2001-09-15:TimKindberg:presentations:UBath2004-05-19
      tag:blogger.com,1999:blog-555
      tag:yaml.org,2002:int
      tag:herv%C3%A9.example.org,2004:r%C3%A9sum%C3%A9
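
   As an illustration only, the grammar above can be approximated with
   regular expressions and checked against the examples.  The sketch
   below is not part of the scheme: the function name parse_tag is
   hypothetical, and the expansion of "pchar" as unreserved /
   pct-encoded / sub-delims / ":" / "@" is taken on assumption from the
   generic URI syntax [1].

```python
import re

# Regular-expression rendering of the ABNF in this section (illustrative).
# ALPHA and DIGIT are ASCII; pct-encoded is "%" followed by two hex digits.
ALNUM = r"[A-Za-z0-9]"
PCT = r"%[0-9A-Fa-f]{2}"
DNS_CHAR = rf"(?:{ALNUM}|{PCT})"
# DNScomp = dnsChar [*(dnsChar / "-") dnsChar]
DNS_COMP = rf"{DNS_CHAR}(?:(?:{DNS_CHAR}|-)*{DNS_CHAR})?"
DNS_NAME = rf"{DNS_COMP}(?:\.{DNS_COMP})*"
EMAIL = rf"(?:{ALNUM}|[-._]|{PCT})+@{DNS_NAME}"
DATE = r"[0-9]{4}(?:-[0-9]{2}(?:-[0-9]{2})?)?"
# Assumption: pchar = unreserved / pct-encoded / sub-delims / ":" / "@".
SPECIFIC = rf"(?:[A-Za-z0-9._~!$&'()*+,;=:@/?-]|{PCT})*"

TAG_RE = re.compile(
    rf"tag:(?P<authority>{EMAIL}|{DNS_NAME}),"
    rf"(?P<date>{DATE}):(?P<specific>{SPECIFIC})"
)


def parse_tag(tag):
    """Return (authorityName, date, specific), or None if the string
    does not match the syntax of this section."""
    m = TAG_RE.fullmatch(tag)
    if m is None:
        return None
    return (m["authority"], m["date"], m["specific"])


print(parse_tag("tag:timothy@hpl.hp.com,2001:web/externalHome"))
print(parse_tag("tag:example.com:no-date-here"))
```

   A None result means only that the string falls outside this
   section's syntax.  Such a check suits a minter verifying its own
   output; software that merely processes tags is required not to
   reject them on the grounds of an unfamiliar authorityName.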

2.2  Rules for Minting Tags

   As Section 2.1 has specified, each tag consists of a "tagging entity"
   followed, optionally, by a specific identifier.  The tagging entity
   is designated by an "authority name" -- a fully qualified domain name
   or an email address containing a fully qualified domain name --
   followed by a date.  The date is chosen to make the tagging entity
   globally unique, exploiting the fact that domain names and email
   addresses are assigned to at most one entity at a time.  That entity
   then ensures that it mints unique identifiers.

   The date specifies, according to the Gregorian calendar and UTC, any
   particular day on which the authority name was assigned to the
   tagging entity at 00:00 UTC (the start of the day).  The date MAY be
   a past or present date on which the authority name was assigned at
   that moment.  The date is specified using one of the "YYYY",
   "YYYY-MM" and "YYYY-MM-DD" formats allowed by the ISO 8601 standard
   [3].  The tag specification permits no other formats.  Tagging
   entities MUST ascertain the date with sufficient accuracy
   to avoid accidentally using a date on which the authority name was
   not in fact assigned (many computers and mobile devices have poorly
   synchronised clocks).  The date MUST be reckoned from UTC -- which
   may differ from the date in the tagging entity's local timezone at
   00:00 UTC.  That distinction can generally be safely ignored in
   practice, but not on the day of the authority name's assignment.  In
   principle it would otherwise be possible on that day for the previous
   assignee and the new assignee to use the same date and thus mint the
   same tags.

   In the interests of brevity, the month and day default to "01".  A
   day value of "01" MAY be omitted; a month value of "01" MAY be
   omitted unless it is followed by a day value other than "01".  For
   example, "2001-07" is the date 2001-07-01 and "2000" is the date
   2000-01-01.  All date formulations specify a moment (00:00 UTC) of a
   single day, and not a period of a day or more such as "the whole of
   July 2001" or "the whole of 2000".  Assignment at that moment is all
   that is required to use a given date.
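
   The defaulting rule can be sketched as follows.  The helper name
   tag_date_to_moment is hypothetical; it expands any of the three
   permitted formats to the single moment (00:00 UTC) that the date
   denotes.

```python
import re
from datetime import datetime, timezone

_DATE_RE = re.compile(r"[0-9]{4}(-[0-9]{2}(-[0-9]{2})?)?")


def tag_date_to_moment(date):
    """Expand "YYYY", "YYYY-MM" or "YYYY-MM-DD" to 00:00 UTC on that day.

    Omitted month and day fields default to 1.
    """
    if _DATE_RE.fullmatch(date) is None:
        raise ValueError("not a permitted tag date format: %r" % date)
    parts = [int(p) for p in date.split("-")]
    year, month, day = parts + [1] * (3 - len(parts))
    # datetime() itself raises ValueError for impossible dates.
    return datetime(year, month, day, tzinfo=timezone.utc)


print(tag_date_to_moment("2001-07"))  # 2001-07-01 00:00:00+00:00
print(tag_date_to_moment("2000"))     # 2000-01-01 00:00:00+00:00
```

   Note that the expansion is useful for reasoning about assignment
   only.  It must not be applied before comparing tags, since "2000"
   and "2000-01-01" denote the same moment yet yield unequal tags.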

   Tagging entities should be aware that alternative formulations of the
   same date will be counted as distinct and hence tags containing them
   will be unequal.  For example, tags beginning "tag:example.com,2000:"
   are never equal to those beginning "tag:example.com,2000-01-01:",
   even though they refer to the same date (see Section 2.4).

   An entity MUST NOT mint tags under an authority name that was
   assigned to a different entity at 00:00 UTC on the given date, and it
   MUST NOT mint tags under a future date.

   An entity that acquires an authority name immediately after a period
   during which the name was unassigned MAY mint tags as if the entity
   was assigned the name during the unassigned period.  This practice
   has considerable potential for error and MUST NOT be used unless the
   entity has substantial evidence that the name was unassigned during
   that period.  The authors are currently unaware of any mechanism that
   would count as evidence, other than daily polling of the "whois"
   registry.

   For example, Hewlett-Packard holds the domain registration for hp.com
   and may mint any tags rooted at that name with a current or past date
   when it held the registration.  It must not mint tags such as
   "tag:champignon.net,2001:" under domain names not registered to it.
   It must not mint tags dated in the future, such as
   "tag:hp.com,2999:".  If it obtains assignment of
   "extremelyunlikelytobeassigned.org" on 2001-05-01, then it must not
   mint tags under "extremelyunlikelytobeassigned.org,2001-04-01" unless
   it has evidence proving that that name was continuously unassigned
   between 2001-04-01 and 2001-05-01.
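
   A minimal sketch of the two prohibitions follows, assuming the entity
   knows the exact interval over which it has held the authority name.
   The function may_mint and its parameters are hypothetical; the sketch
   deliberately ignores the unassigned-period exception and makes no
   allowance for clock skew.

```python
from datetime import datetime, timezone

UTC = timezone.utc


def may_mint(tag_moment, assigned_from, assigned_until, now):
    """Return True if a tag dated tag_moment (00:00 UTC on the tag's
    date) may be minted by an entity holding the authority name from
    assigned_from until assigned_until (None if still held)."""
    if tag_moment > now:
        return False  # future dates are prohibited
    if tag_moment < assigned_from:
        return False  # name not yet assigned to this entity
    return assigned_until is None or tag_moment < assigned_until


# Hypothetical case: the name is obtained at 09:30 UTC on 2001-05-01.
acquired = datetime(2001, 5, 1, 9, 30, tzinfo=UTC)
now = datetime(2004, 10, 18, tzinfo=UTC)

print(may_mint(datetime(2001, 4, 1, tzinfo=UTC), acquired, None, now))  # False
print(may_mint(datetime(2001, 5, 1, tzinfo=UTC), acquired, None, now))  # False
print(may_mint(datetime(2001, 5, 2, tzinfo=UTC), acquired, None, now))  # True
print(may_mint(datetime(2999, 1, 1, tzinfo=UTC), acquired, None, now))  # False
```

   The second case shows why the day of assignment needs care: the name
   was not yet held at 00:00 UTC on 2001-05-01, so that date is not
   available to the new assignee.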

   A tagging entity mints specific identifiers that are unique within
   its context, in accordance with any internal scheme that uses only
   URI characters.  Tagging entities SHOULD use record-keeping
   procedures to achieve uniqueness.  Some tagging entities (e.g.
   corporations, mailing lists) consist of many people, in which case
   group decision-making SHOULD also be used to achieve uniqueness.  The
   outcome of such decision-making could be to delegate control over
   parts of the namespace.  For example, the assignees of example.com
   could delegate control over all tags with the prefixes
   tag:example.com,2004:fred: and tag:example.com,2004:bill:
   respectively to the individuals with internal names "fred" and "bill"
   on 2004-01-01.
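
   One possible record-keeping procedure is sketched below.  The class
   TagMinter is hypothetical and keeps its record in memory only; a
   real tagging entity would presumably need durable, shared storage.
   Delegated minters share the parent's record, so a collision between
   a delegate and its parent is still detected.

```python
class TagMinter:
    """Mints tags under a fixed prefix and records every tag issued."""

    def __init__(self, prefix, issued=None):
        self.prefix = prefix
        # All minters derived from one root share a single record.
        self._issued = set() if issued is None else issued

    def mint(self, specific):
        tag = self.prefix + specific
        if tag in self._issued:
            raise ValueError("already minted: " + tag)
        self._issued.add(tag)
        return tag

    def delegate(self, name):
        """Hand control of the sub-namespace "<prefix><name>:" to
        another party, keeping the shared record."""
        return TagMinter(self.prefix + name + ":", self._issued)


root = TagMinter("tag:example.com,2004:")
fred = root.delegate("fred")
print(fred.mint("report"))  # tag:example.com,2004:fred:report
```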

2.3  Resolution of Tags

   There is no authoritative resolution mechanism for tags.  Unlike most
   other URIs, tags can only be used as identifiers, and are not
   designed to support resolution.  If authoritative resolution is a
   desired feature, a different URI scheme should be used.

2.4  Equality of Tags

   Tags are simply strings of characters and are considered equal if and
   only if they are completely indistinguishable in their machine
   representations when using the same character encoding.  That is, one
   can compare tags for equality by comparing the numeric codes of their
   characters, in sequence, for numeric equality.  This criterion for
   equality allows for simplification of tag-handling software, which
   does not have to transform tags in any way to compare them.
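
   In a language whose strings are sequences of character codes, the
   criterion reduces to plain string comparison, as the following
   sketch shows.  The function name tags_equal is hypothetical.  No
   case folding, date expansion or percent-decoding is applied.

```python
def tags_equal(a, b):
    """Compare two tags character by character, with no normalisation."""
    return a == b


# Same domain name, different case: unequal.
print(tags_equal("tag:EXAMPLE.com,2000:", "tag:example.com,2000:"))        # False
# Same date, different formulation: unequal.
print(tags_equal("tag:example.com,2000:", "tag:example.com,2000-01-01:"))  # False
print(tags_equal("tag:yaml.org,2002:int", "tag:yaml.org,2002:int"))        # True
```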

3.  Internationalisation

   So far, we have considered tags as URIs, which are represented in a
   subset of US-ASCII characters.  As befits our requirement for
   identifiers to be tractable to humans, tags can also be minted as
   Internationalized Resource Identifiers (IRIs) [5].  That is, they can
   be minted using any characters from the Universal Character Set, as
   long as they obey the tag URI syntax after appropriate character
   encoding (the general mapping from an IRI to a URI is defined in
   [5]).  This section defines the constraints on the resultant
   percent-encoded parts.  The purpose of the encoding is either (a) to
   verify conformity of tag IRIs or (b) to enable tags to be used within
   systems that are unable to handle IRIs.  As stated above, tags
   (whether IRIs or URIs) SHOULD NOT be minted with pct-encoded parts.

   The pct-encoded parts in the tag URI syntax are used to encode
   characters using UTF-8.  Octets that are pct-encoded must be above
   %7F, and must only occur in valid UTF-8 octet sequences.  The
   following additional restrictions apply.

   The component "DNSname" MUST, after decoding of percent-encoded
   characters and interpretation of the resulting octet sequence as
   UTF-8, be an Internationalized Domain Name (IDN) [6] represented
   according to the rules of 'nameprep' [7].

   Percent-encoding is also allowed on the left-hand side (before the
   "@") in emailAddress for future compatibility.  However, it is only
   to be used if and when it conforms to a standard for expressing email
   addresses in international form.

   The component "specific" MUST, after decoding of percent-encoded
   characters and interpretation of the resulting octet sequence as
   UTF-8, appear in at least Normalized Form C (NFC) and SHOULD appear
   in Normalized Form KC (NFKC) [8].
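
   The constraints on the "specific" component can be checked roughly as
   follows.  This is a sketch under assumptions: the function name
   check_specific and its message strings are hypothetical, and the
   normalisation tables are whatever version ships with the Python
   runtime rather than the version cited in [8].

```python
import re
import unicodedata
from urllib.parse import unquote_to_bytes

_PCT_OCTET = re.compile(r"%([0-9A-Fa-f]{2})")


def check_specific(specific):
    """Return a list of problems found in a "specific" component."""
    problems = []
    # Pct-encoded octets must be above %7F.
    if any(int(h, 16) <= 0x7F for h in _PCT_OCTET.findall(specific)):
        problems.append("pct-encoded octet not above %7F")
    # The decoded octets must form valid UTF-8.
    try:
        decoded = unquote_to_bytes(specific).decode("utf-8")
    except UnicodeDecodeError:
        problems.append("pct-encoded octets are not valid UTF-8")
        return problems
    # NFC is required; NFKC is recommended.
    if unicodedata.normalize("NFC", decoded) != decoded:
        problems.append("not in NFC (MUST)")
    elif unicodedata.normalize("NFKC", decoded) != decoded:
        problems.append("not in NFKC (SHOULD)")
    return problems


print(check_specific("r%C3%A9sum%C3%A9"))  # []
print(check_specific("re%CC%81sume"))      # ['not in NFC (MUST)']
```

   The second input encodes "e" followed by a combining acute accent,
   which NFC would compose into a single character.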

   Two tag IRIs are equal if and only if they are identical as character
   sequences; that is, if their machine representations are identical
   when using the same character encoding.  A tag IRI is not
   equivalent to the tag URI resulting after mapping the IRI to a URI
   according to [5].  To reduce problems that may result from that, tags
   should be used mainly with systems that can handle IRIs (such as
   RDF).  If tag IRIs are converted to URIs because they have to be
   passed to a system that cannot handle IRIs, then they should be
   converted back to IRIs when they are received back from that system.
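
   The round trip can be sketched as below.  The function names are
   hypothetical, and the conversion shown is a simplification of the
   mapping defined in [5]: it percent-encodes every non-ASCII character
   as UTF-8, including those in the domain name, and leaves ASCII
   characters untouched.

```python
from urllib.parse import quote, unquote

# ASCII punctuation that may appear in a tag and must pass through
# unchanged, including "%" so existing escapes are not double-encoded.
_KEEP = ":/?@!$&'()*+,;=%~"


def tag_iri_to_uri(iri):
    """Percent-encode non-ASCII characters of a tag IRI as UTF-8."""
    return quote(iri, safe=_KEEP)


def tag_uri_to_iri(uri):
    """Decode percent-encoded UTF-8 back into characters.

    Assumes every pct-encoded octet is above %7F, as required for tags.
    """
    return unquote(uri, errors="strict")


iri = "tag:herv\u00e9.example.org,2004:r\u00e9sum\u00e9"
uri = tag_iri_to_uri(iri)
print(uri)                         # tag:herv%C3%A9.example.org,2004:r%C3%A9sum%C3%A9
print(uri == iri)                  # False: the two forms are not equal as tags
print(tag_uri_to_iri(uri) == iri)  # True
```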

4.  Security Considerations

   Minting a tag, by itself, is an operation internal to the tagging
   entity with no external consequences.  The consequences of using an
   improperly minted tag (due to malice or error) in an application
   depend on the application, and must be considered in the design of
   any application that uses tags.

   There is a significant possibility of minting errors by people who
   fail to apply the rules governing dates, or who use a shared
   (organizational) authority-name without prior organization-wide
   agreement.  Tag-aware software MAY help catch and warn against these
   errors.  As stated in Section 2, however, to allow for future
   expansion, software MUST NOT reject tags which do not conform to the
   syntax specified in Section 2.

   A malicious party could make it appear that the same domain name or
   email address was assigned to each of two or more entities.  Tagging
   entities SHOULD use reputable assigning authorities, and verify
   assignment wherever possible.

   Entities SHOULD also avoid the potential for malicious exploitation
   of clock skew, by using authority names that were assigned
   continuously from well before to well after 00:00 UTC on the date
   chosen for the tagging entity -- preferably by intervals of the order
   of days.

5.  References

5.1  Normative References

   [1]  Berners-Lee, T., Fielding, R. and L. Masinter, "Uniform Resource
        Identifier (URI): Generic Syntax (Note to the RFC Editor: Please
        update this reference with the RFC resulting from
        draft-fielding-uri-rfc2396bis-xx.txt, and remove this Note)",
        draft-fielding-uri-rfc2396bis-07 (work in progress), September
        2004.

   [2]  Crocker, D. and P. Overell, "Augmented BNF for Syntax
        Specifications: ABNF", RFC 2234, November 1997.

   [3]  "Data elements and interchange formats -- Information
        interchange -- Representation of dates and times", ISO
        (International Organization for Standardization) ISO 8601:1988,
        1988.

   [4]  Mockapetris, P., "Domain names - implementation and
        specification", STD 13, RFC 1035, November 1987.

   [5]  Duerst, M. and M. Suignard, "Internationalized Resource
        Identifiers (IRIs)", draft-duerst-iri-10 (work in progress),
        September 2004.

   [6]  Faltstrom, P., Hoffman, P. and A. Costello, "Internationalizing
        Domain Names in Applications (IDNA)", RFC 3490, March 2003.

   [7]  Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep Profile for
        Internationalized Domain Names (IDN)", RFC 3491, March 2003.

   [8]  Duerst, M. and M. Davis, "Unicode Normalization Forms", Unicode
        Standard Annex #15
        http://www.unicode.org/unicode/reports/tr15/tr15-23.html, April
        2003.

5.2  Informative References

   [9]   Leach, P. and R. Salz, "UUIDs and GUIDs", draft-leach-uuids-01
         (work in progress), 1997.

   [10]  "Information technology - Open Systems Interconnection - Remote
         Procedure Call (RPC)", ISO (International Organization for
         Standardization) ISO/IEC 11578:1996, 1996.

   [11]  "Specification of abstract syntax notation one (ASN.1)", ITU-T
         recommendation X.208 (see also RFC 1778), 1988.

   [12]  Mealling, M., "A URN Namespace of Object Identifiers", RFC
         3061, February 2001.

   [13]  Paskin, N., "Information Identifiers", Learned Publishing Vol.
         10, No. 2, pp. 135-156 (see also www.doi.org), April 1997.

   [14]  Klyne, G. and C. Newman, "Date and Time on the Internet:
         Timestamps", RFC 3339, July 2002.


Authors' Addresses

   Tim Kindberg
   Hewlett-Packard Corporation
   Hewlett-Packard Laboratories
   Filton Road
   Stoke Gifford
   Bristol  BS34 8QZ
   UK

   Phone: +44 117 312 9920
   EMail: timothy@hpl.hp.com


   Sandro Hawke
   World Wide Web Consortium
   32 Vassar Street
   Building 32-G508
   Cambridge, MA  02139
   USA

   Phone: +1 617 253-7288
   EMail: sandro@w3.org


Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.


Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


Copyright Statement

   Copyright (C) The Internet Society (2004).  This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.


Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.