Uniform Resource Names (urnbis)                               J. Klensin
Internet-Draft                                              June 8, 2016
Updates: 3986 (if approved)
Intended status: Standards Track
Expires: December 10, 2016

                      URN Semantics Clarification


   Experience has shown that identifiers associated with persistent
   names have properties and requirements that may be somewhat different
   from identifiers associated with the locations of objects.  This is
   especially true when such names are expected to be stable for a very
   long time or when they identify large and complex entities.  In order
   to allow Uniform Resource Names (URNs) to evolve to meet the needs of
   the Library, Museum, Publisher, and Information Science communities
   and other users, this specification separates URNs from the semantic
   constraints that many people believe are part of the specification
   for Uniform Resource Identifiers (URIs) in RFC 3986, updating that
   document accordingly.  The syntax of URNs is still constrained to
   that of RFC 3986, so generic URI parsers are unaffected by this
   update.

Table of Contents

   1.  Introduction
   2.  Pragmatic Goals
   3.  The role of queries and fragments in URNs
   4.  Changes to RFC 3986
   5.  Actions Occurring in Parallel with this Specification
   6.  Acknowledgments
   7.  Contributors
   8.  IANA Considerations
   9.  Security Considerations
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Appendix A.  Background on the URN - URI relationship
   Appendix B.  Change Log
     B.1.  Changes from draft-ietf-urnbis-urns-are-not-uris-00
           (2014-04-07) to -01 (2014-07-03)
     B.2.  Changes from draft-ietf-urnbis-urns-are-not-uris-01
           to draft-ietf-urnbis-semantics-clarif-00 (2014-08-25)
     B.3.  Changes from draft-ietf-urnbis-semantics-clarif-00
           (2014-08-25) to -01
     B.4.  Changes from draft-ietf-urnbis-semantics-clarif-01
           (2015-02-14) to -02
     B.5.  Changes from draft-ietf-urnbis-semantics-clarif-02
           (2015-08-10) to -03
     B.6.  Changes from draft-ietf-urnbis-semantics-clarif-03
           (2016-02-07) to -04
   Author's Address

1.  Introduction

   The Generic URI Syntax specification [RFC3986] covers both locators
   and names and mixtures of the two (see its Section 1.1.3) and
   describes Uniform Resource Locators (URLs) -- first documented in the
   IETF in RFC 1738 [RFC1738] -- as an embodiment of the locator concept
   and Uniform Resource Names (URNs), specifically those using the "urn"
   scheme [RFC2141], as an embodiment of the names that do not directly
   provide for resource location.  This specification is concerned only
   with URNs of the variety described in RFC 2141 [RFC2141] and its
   successors [RFC2141bis] (i.e., those that use the "urn" scheme).
   URLs, other types of names, and any URI types that may not fall into
   one of the above categories are out of its scope and unaffected by
   it.

   Experience with URNs since the publication of RFC 3986 has identified
   several ways in which their inclusion under the 3986 scope has
   hampered understanding, adoption, and especially extension
   (specifically types of extensions that were anticipated, but not
   defined, in RFC 2141).  The need for extensions to the URN concept is
   now being felt in some communities, especially those that include
   libraries, museums, publishers, and other information scientists.

   In particular, the Generic URI Syntax specification goes beyond
   syntax to specify the meaning and interpretation of various fields,
   especially the "query" and "fragment" ones and the various syntax
   forms and interpretations it allows for <hier-part>.  There is
   disagreement in the community about whether some of the statements in
   RFC 3986 are normative requirements or discussion of possible options
   and, if the latter, whether the options given are an exclusive list.
   Sequences of statements in that document that can be read in
   different ways reinforce those disagreements.  As one example, the
   3986 discussion of fragments (see Section 3.5, especially the first
   two paragraphs) has been read as leaving the interpretation of
   strings in fragment syntax that are not associated with retrievable
   objects and media types as undefined and unconstrained and hence
   available for other uses.  Others have read the second paragraph as
   prohibiting any interpretation or use of fragments on a per-scheme
   basis, essentially prohibiting them when the URI does not resolve to
   an object with a media type.

   This document does not attempt to resolve those disagreements.  Doing
   so does not seem to be necessary, would be far out of scope for the
   WG that produced it, and would mire URN work in controversies that
   might never be resolved.  Instead, it provides what might best
   be thought of as an interpretation rule: if someone reads a statement
   about the meaning or interpretation of a particular field, or non-
   syntactic restrictions on it, as inconsistent between RFC 3986 and
   this document and/or [RFC2141bis], these URN-specific documents
   prevail (again, only for the "urn:" scheme; any extension to other
   types of names would be the subject of other work).

   In other words, this specification excludes URNs from the RFC 3986
   definitions of meaning and interpretation so that RFC 3986 applies,
   for URNs, to their syntax only.  The meaning of those fields for
   URNs, and any more specific syntax rules, are now defined in a
   URN-specific document [RFC2141bis].  URNs remain members of the URI
   family and parsers for generic URI syntax are not affected by this
   specification although parsers that make assumptions based on other
   URI schemes obviously might be.
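
   As an illustration of what "syntax only" can mean in practice, the
   following minimal sketch (not part of this specification) runs a
   generic URI parser, Python's urllib.parse.urlsplit, over a URN.  The
   URN is a made-up example and the helper name generic_components is
   hypothetical.  The parser separates the string into generic
   components but attaches no meaning to any of them.

```python
from urllib.parse import urlsplit


def generic_components(uri: str) -> dict:
    """Split a URI into generic syntactic components only.

    No scheme-specific meaning is attached to any component here.
    """
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }


# Made-up URN, used purely for illustration.
print(generic_components("urn:example:a123,z456?lang=en#part1"))
```

   What the "query" and "fragment" strings mean for a URN is left to
   the URN-specific documents, not to the generic parser.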

   Neither this specification nor the successor to RFC 2141 [RFC2141bis]
   discusses DDDS [RFC3401] resolution or conversion to (and
   interpretation of) URCs [RFC2483] or, with the exception of providing
   some syntax to cover some specific cases, URN "resolution" more
   generally.  Any of those topics that do need to be addressed should
   be covered in other documents.  The document also does not discuss
   alternatives to URNs, either those that might use a different scheme
   name within the RFC 3986 URI framework or those that might use a
   different framework entirely.  In particular, some externally-defined
   content or object identification systems could be represented either
   by a URN namespace or through separate URI schemes.  This
   specification does not offer advice on that choice other than to
   suggest that the two options not be confused (or both used in a way
   that would be confusing).

   In the long term, as the expanded syntax and uses of URNs become
   commonplace and RFC 3986 is updated, this specification is likely to
   become of historical interest only, providing an extended rationale
   for decisions made and adjustment of the boundary between URN
   specifications and generic URI ones, especially those that are used
   as locators rather than names.

2.  Pragmatic Goals

   Despite the important background and rationale in other sections of
   this document, the change made (or clarification provided) by this
   specification is driven by a desire to avoid philosophical debates
   about terminology, ultimate truths, or even different interpretations
   of RFC 3986.  Instead, it is motivated by three very pragmatic
   principles and goals:

   1.  Accommodate the communities who think URNs are necessary, i.e.,
       that they can and should be usefully distinguished from other
       URIs, at least location-oriented ones (including URI schemes
       defined prior to the time work started on this document in August
       2014).  In particular, provide a foundation for extensions to the
       URN syntax (as allowed by and partially defined in RFC 2141) to
       support requirements encountered by some of those communities.

   2.  Provide a path to avoid getting bogged down in declarative
       statements about definitions and debates about what is and is not
       abstractly correct.

   3.  Avoid a fork in the standard that would be likely to lead to
       multiple, conflicting, definitions or criteria for URNs.

   In addition, this document is intended to move past debates about
   whether or not URNs are intended to be parsed at all (i.e., whether a
   "urn"-scheme URI is simply opaque to a URI parser once the scheme
   name is identified) and, if not, how much of it is actually expected
   to be understood and broken into identifiable parts by such a parser.
   It establishes a principle that, for the "urn" scheme, parsing into
   the components identified in RFC 3986 may be performed but that any
   meanings or interpretation assigned to those components (including
   application of the normal English meanings of such terms as "query"
   or "fragment") are a matter for URN-specific specifications.  That
   principle and its application provide a foundation for the
   distinguishing terms "q-component", "r-component", and "f-component"
   that are developed in the accompanying URN definition specification
   [RFC2141bis].
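
   The layering described above might be sketched as follows: a generic
   parse first, then a URN-specific step that breaks the remainder into
   the namespace identifier (NID) and namespace-specific string (NSS)
   of RFC 2141.  This is an illustrative sketch only; the function name
   split_nid_nss and the pattern name NID_PATTERN are hypothetical, and
   the NID check reflects the RFC 2141 rule of a letter or digit
   followed by up to 31 letters, digits, or hyphens.

```python
import re
from typing import Tuple
from urllib.parse import urlsplit

# NID shape from RFC 2141: a letter or digit, then up to 31 letters,
# digits, or hyphens.
NID_PATTERN = re.compile(r"[A-Za-z0-9][A-Za-z0-9-]{0,31}")


def split_nid_nss(urn: str) -> Tuple[str, str]:
    """Return (NID, NSS) for a "urn"-scheme URI, or raise ValueError."""
    parts = urlsplit(urn)              # generic, scheme-independent step
    if parts.scheme != "urn":          # urlsplit lowercases the scheme
        raise ValueError("not a 'urn'-scheme URI")
    # URN-specific step: the generic "path" holds "<NID>:<NSS>".
    nid, sep, nss = parts.path.partition(":")
    if not sep or not nss or not NID_PATTERN.fullmatch(nid):
        raise ValueError("malformed URN")
    return nid.lower(), nss            # the NID is case-insensitive


print(split_nid_nss("URN:ISBN:0451450523"))
```

   Keeping the two steps separate lets the generic parser stay unaware
   of URN rules, while everything after the scheme name is interpreted
   by URN-specific code.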

3.  The role of queries and fragments in URNs

   Part of the concern that led to this document was a desire to
   accommodate URN components that would be analogous to the query and
   fragment components of generalized URIs but that might have different
   properties.  For many cases, the analogy cannot be exact.  For
   example, RFC 3986 ties the interpretation of fragments to media
   types.  Since media type is a function of specific content, URNs that
   are never resolved cannot have an associated media type, nor can URNs
   that resolve to, for example, other URIs that may then not be
   resolved further.  Similarly, while the RFC 3986 syntax for queries
   (and fragments) may be entirely appropriate for URN use, terminology
   like "Service Request" (see Appendix B of the predecessor "URNs are
   not..." draft [ServiceRequests] for additional discussion) may be
   more suitable to the URN context than "query" (if, indeed, the
   portion of the URN that is syntactically equivalent to a URI query is
   where those requests belong).

4.  Changes to RFC 3986

   The interpretation rule discussed in Section 1 notwithstanding, this
   document alters ("updates") RFC 3986 itself only by specifying that
   the interpretation of URNs of the "urn:" scheme may vary from that
   for other types of URIs.  That might be implemented by, for example,
   inserting text at the end of Section 1.1.3 (of RFC 3986) similar to:

      Nonetheless, URIs classified as names, particularly those of the
      "urn:" scheme, may require different interpretations of, or even
      deviations from, the interpretations of various fields or rules of
      this document that are more obviously applicable to locators.
      Those differences are motivated by differences in the
      relationships to retrievable objects and other resources between
      locators and more abstract names.  For the "urn:" scheme, the
      issues are discussed in [ThisRFC] and specific definitions are
      supplied in [RFC2141bis].

   The effect of the above is to remove URN semantics from the scope of
   RFC 3986.  It makes no changes to the generic URI syntax, nor to the
   semantics of any other URI scheme.  The 3986 syntax still applies to
   URNs as well as to other URI types.  Even with regard to semantics, it
   has no practical effect for URNs defined in strict conformance to the
   prior URN specification [RFC2141] or the associated registration
   specification [RFC3406].

   In particular (but without altering RFC 3986 in any way), the generic
   URI syntax for "queries" (strings starting with "?" and continuing to
   the end of the URI or to a "#"), and for "fragments" (strings
   starting with "#" and continuing to the end of the URI) is unchanged.
   For URNs, additional syntax is introduced to divide the URI "query"
   into two parts, referred to as "q-components" and "r-components".
   The syntax and general semantics of "fragments" (specified in RFC
   3986 as scheme-independent) are unchanged, but a somewhat liberal
   interpretation may be needed in the context of URNs, so a fragment is
   referred to as an "f-component" as a term of convenience to highlight
   that distinction [RFC2141bis].
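
   The division of the generic "query" into two parts can be sketched
   as below.  This document does not state the delimiters; the sketch
   assumes those of RFC 8141, the published successor to RFC 2141, in
   which "?+" introduces the r-component and "?=" introduces the
   q-component.  The function name split_urn_components and the example
   URN are hypothetical, and the code is an illustration rather than a
   conforming validator.

```python
from typing import Optional
from urllib.parse import urlsplit


def split_urn_components(urn: str) -> dict:
    """Split a URN into assigned name, r-, q-, and f-components.

    Assumes the RFC 8141 delimiters: "?+" (r-component), "?="
    (q-component), and "#" (f-component).
    """
    parts = urlsplit(urn)              # generic parse: query and fragment
    if parts.scheme != "urn":
        raise ValueError("not a 'urn'-scheme URI")

    query = parts.query                # text after the first "?"
    r_component: Optional[str] = None
    q_component: Optional[str] = None
    if query.startswith("+"):          # "?+" opens the r-component
        r_component, sep, rest = query[1:].partition("?=")
        if sep:                        # "?=" then opens the q-component
            q_component = rest
    elif query.startswith("="):        # q-component only
        q_component = query[1:]
    elif query:
        raise ValueError("query part uses neither '?+' nor '?='")

    return {
        "assigned_name": "urn:" + parts.path,
        "r_component": r_component,
        "q_component": q_component,
        "f_component": parts.fragment or None,
    }


print(split_urn_components("urn:example:foo?+resolver=a?=op=map#sec2"))
```

   The generic parser still sees a single query string here; only the
   URN-specific step distinguishes the two parts.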

5.  Actions Occurring in Parallel with this Specification

   The basic URN syntax specification [RFC2141] was published well
   before RFC 3986 and therefore does not depend on it.  The successor
   to that specification [RFC2141bis] fully spells out, or references
   documents that spell out, the semantics and any required within-field
   syntax of URNs.  It takes great care with generic or implicit
   references to any URI specification and delegates further details to
   specific namespaces.

6.  Acknowledgments

   This specification was inspired by a search in the IETF URNBIS WG for
   an approach that would both satisfy the needs of persistent name-type
   identifiers and still fully conform to various readings and
   understandings of the specifications and intent of RFC 3986.  That
   search lasted several years and considered many alternatives.
   Discussions with Leslie Daigle, Juha Hakala, Barry Leiba, Keith
   Moore, Andrew Newton, and Peter Saint-Andre during the last quarter
   of 2013 and the first quarter of 2014 were particularly helpful in
   arriving at the conclusion that a conceptual separation of notions of
   location-based identifiers (e.g., URLs) and the types of persistent
   identifiers represented by URNs was necessary.  Juha Hakala provided
   useful explanations and significant working text about the needs of
   the library community and their perception of identifiers and
   consequent implications for URN structure.  Peter Saint-Andre
   provided significant text in a pre-publication review.  The author
   also appreciates the efforts of several people, notably Tim
   Berners-Lee, Leslie Daigle, Juha Hakala, Sean Leonard, Larry
   Masinter, Keith Moore, Julian Reschke, Lars Svensson, Henry S.
   Thompson, and Dale Worley, to challenge text and ideas and demand
   answers to hard questions.  Whether they agree with the results or
   not, their insights have contributed significantly to whatever
   clarity and precision appear in the present document.

   The specification was changed considerably and its focus narrowed
   after an extended discussion at the WG meeting during IETF 90 in July
   2014 [IETF90-URNBISWG] and subsequent comments and clarifications on
   the mailing list [URNBIS-MailingList].  The contributions of all of
   the participants in those discussions, only some of whose names
   appear above, are gratefully acknowledged.

7.  Contributors

   Juha Hakala contributed considerable text, some of which was removed
   from later versions of the document to streamline it.

      Contact Information:
      Juha Hakala
      The National Library of Finland
      P.O. Box 15, Helsinki University
      Helsinki, MA FIN-00014
      Email: juha.hakala@helsinki.fi

8.  IANA Considerations

   This memo is not believed to require any action on IANA's part.

   There is an existing (i.e., prior to the publication of this document)
   registry for "Uniform Resource Identifier (URI) Schemes" that already
   includes the "urn" scheme itself and a separate existing URN
   Namespace registry.  None of the registrations that predate this
   specification have any specific dependencies on generic URI
   specifications.  More information on this subject appears in
   [RFC2141bis] and documents referenced from it.

9.  Security Considerations

   As discussed in Section 1 above, this document is largely
   precautionary, providing an interpretation rule for the URI
   definition [RFC3986] when URNs are concerned.  Some members of the
   community believe that rule (and hence this document) is
   unnecessary, at most reinforcing provisions already in that
   definition.  Others believe that it restores the original URN
   definition [RFC2141], produced before RFC 3986 was adopted and not
   updated by it.  Still others see this specification as making a
   necessary change to allow the semantics of URNs to be self-contained
   (as specified in other documents), relying on the generic URI syntax
   specification for syntax only.

   Independent of which of those models is applicable, the specification
   should have no effect on Internet security unless the use of a
   definition, syntax, and semantics that are more clear reduces the
   potential for confusion and consequent vulnerabilities.

10.  References

10.1.  Normative References

   [RFC2141]  Moats, R., "URN Syntax", RFC 2141, DOI 10.17487/RFC2141,
              May 1997, <http://www.rfc-editor.org/info/rfc2141>.

   [RFC2141bis]
              Saint-Andre, P. and J. Klensin, "Uniform Resource Name
              (URN) Syntax", February 2016,

   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
              Resource Identifier (URI): Generic Syntax", STD 66,
              RFC 3986, DOI 10.17487/RFC3986, January 2005,

10.2.  Informative References

              Mazahir, O., Thaler, D., and G. Montenegro, "Deterministic
              URI Encoding", February 2014, <http://www.ietf.org/id/

   [IETF90-URNBISWG]
              IETF, "URN BIS Working Group Minutes", July 2014,

   [RFC1738]  Berners-Lee, T., Masinter, L., and M. McCahill, "Uniform
              Resource Locators (URL)", RFC 1738, DOI 10.17487/RFC1738,
              December 1994, <http://www.rfc-editor.org/info/rfc1738>.

   [RFC2483]  Mealling, M. and R. Daniel, "URI Resolution Services
              Necessary for URN Resolution", RFC 2483,
              DOI 10.17487/RFC2483, January 1999,

   [RFC3401]  Mealling, M., "Dynamic Delegation Discovery System (DDDS)
              Part One: The Comprehensive DDDS", RFC 3401,
              DOI 10.17487/RFC3401, October 2002,

   [RFC3406]  Daigle, L., van Gulik, D., Iannella, R., and P. Faltstrom,
              "Uniform Resource Names (URN) Namespace Definition
              Mechanisms", BCP 66, RFC 3406, DOI 10.17487/RFC3406,
              October 2002, <http://www.rfc-editor.org/info/rfc3406>.

   [ServiceRequests]
              Klensin, J., "Names are Not Locators and URNs are Not
              URIs, Appendix B", July 2014, <http://www.ietf.org/id/

              Klensin, J. and J. Hakala, "Uniform Resource Name (URN)
              Namespace Registration Transition", February 2016,

   [URNBIS-MailingList]
              IETF, "IETF URN Mailing list", 2014,

Appendix A.  Background on the URN - URI relationship

   The Internet community now has many years of experience with both
   name-type identifiers and location-based identifiers (or "references"
   for those who are sensitive to the term "identifier", a group that
   includes many members of the library and information science
   communities).  The primary examples of these two categories are
   Uniform Resource Names (URNs) [RFC2141] [RFC2141bis] and Uniform
   Resource Locators (URLs) [RFC1738].  That experience leads to the
   conclusion that it is impractical to constrain URNs to the high-level
   semantics of URLs.  The generic syntax for URIs [RFC3986] is
   adequately flexible to accommodate the perceived needs of URNs, but
   the specific semantics associated with the URI syntax definition --
   what particular constructions "mean" and how and where they are
   constrained or interpreted -- appear not to be.  Generalization from
   URLs to generic Uniform Resource Identifiers (URIs) [RFC3986],
   especially to name-based, high-stability, long-persistence,
   identifiers such as many URN namespaces, has failed because the
   assumed similarities do not adequately extend to all forms of, and
   requirements for, URNs.

   Ultimately, locators, which typically depend on particular accessing
   protocols (protocols that are typically linked to the particular URI
   scheme) and a specification relative to some physical space or
   network topology, are simply different creatures from long-
   persistence, location-independent, object identifiers or abstract
   designators.  Many of the constraints and interpretation rules that
   are appropriate for locators are either irrelevant to or interfere
   with the needs of resource names (at least of the "urn:" scheme) as a
   class.  That was tolerable as long as the URN system didn't need
   additional capabilities (over those specified in RFC 2141) but
   experience since RFC 2141 was published has shown that they are, in
   fact, needed.

Author's Address

   John C Klensin
   1770 Massachusetts Ave, Ste 322
   Cambridge, MA  02140

   Phone: +1 617 245 1457
   Email: john-ietf@jck.com

Klensin                 Expires December 10, 2016              [Page 14]

