draft-ietf-core-href-02.txt   draft-ietf-core-href-03.txt 
CoRE Working Group K. Hartke CoRE Working Group K. Hartke
Internet-Draft Ericsson Internet-Draft Ericsson
Intended status: Standards Track 8 January 2020 Intended status: Standards Track 9 March 2020
Expires: 11 July 2020 Expires: 10 September 2020
Constrained Resource Identifiers Constrained Resource Identifiers
draft-ietf-core-href-02 draft-ietf-core-href-03
Abstract Abstract
Constrained Resource Identifiers (CoRIs) are an alternate The Constrained Resource Identifier (CRI) is a complement to the
serialization of Uniform Resource Identifiers (URIs) that encodes the Uniform Resource Identifier (URI) that serializes the URI components
URI components in Concise Binary Object Representation (CBOR) instead in Concise Binary Object Representation (CBOR) instead of a sequence
of a string of characters. This simplifies parsing, reference of characters. This simplifies parsing, comparison and reference
resolution, and comparison of URIs in environments with severe resolution in environments with severe limitations on processing
limitations on processing power, code size, and memory size. power, code size, and memory size.
Note to Readers Note to Readers
This note is to be removed before publishing as an RFC. This note is to be removed before publishing as an RFC.
The issues list for this Internet-Draft can be found at The issues list for this Internet-Draft can be found at
<https://github.com/core-wg/coral/labels/href>. <https://github.com/core-wg/coral/labels/href>.
A reference implementation and a set of test vectors can be found at A reference implementation and a set of test vectors can be found at
<https://github.com/core-wg/coral/tree/master/binary/python>. <https://github.com/core-wg/coral/tree/master/binary/python>.
skipping to change at line 44 skipping to change at line 44
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on 11 July 2020. This Internet-Draft will expire on 10 September 2020.
Copyright Notice Copyright Notice
Copyright (c) 2020 IETF Trust and the persons identified as the Copyright (c) 2020 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/ Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document. license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights Please review these documents carefully, as they describe your rights
and restrictions with respect to this document. Code Components and restrictions with respect to this document. Code Components
extracted from this document must include Simplified BSD License text extracted from this document must include Simplified BSD License text
as described in Section 4.e of the Trust Legal Provisions and are as described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Simplified BSD License. provided without warranty as described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction 1. Introduction
1.1. Notational Conventions 1.1. Notational Conventions
2. Data Model 2. Constraints
2.1. Options 3. Creation and Normalization
2.2. Option Sequences 4. Comparison
3. CBOR 5. CRI References
4. Python 5.1. CBOR Serialization
4.1. Reference Resolution 5.2. Reference Resolution
4.2. URI Recomposition 6. Relationship between CRIs, URIs and IRIs
4.3. CoAP Encoding 6.1. Converting CRIs to URIs
5. Security Considerations 7. Security Considerations
6. IANA Considerations 8. IANA Considerations
7. References 9. References
7.1. Normative References 9.1. Normative References
7.2. Informative References 9.2. Informative References
Appendix A. Change Log Appendix A. Change Log
Acknowledgements Acknowledgements
Author's Address Author's Address
1. Introduction 1. Introduction
Uniform Resource Identifier (URI) references [RFC3986] are the The Uniform Resource Identifier (URI) [RFC3986] and its most common
standard way to link to resources in hypertext formats such as HTML usage, the URI reference, are the Internet standard for linking to
[W3C.REC-html52-20171214] or the HTTP "Link" header field [RFC8288]. resources in hypertext formats such as HTML [W3C.REC-html52-20171214]
A URI reference is either a URI or a relative reference that must be and the HTTP "Link" header field [RFC8288].
resolved against a base URI.
URI references are strings of characters chosen from the repertoire A URI reference is a sequence of characters chosen from the
of US-ASCII characters. The individual components of a URI reference repertoire of US-ASCII characters. The individual components of a
are delimited by a number of reserved characters, which necessitates URI reference are delimited by a number of reserved characters, which
the use of percent-encoding when these reserved characters are used necessitates the use of an escape mechanism ("percent-encoding") when
in a non-delimiting function. One component can also contain special these reserved characters are used in a non-delimiting function. The
dot-segments that affect how the component is to be interpreted. The resolution of URI references involves parsing a character sequence
resolution of URI references involves parsing the character string
into its components, combining those components with the components into its components, combining those components with the components
of a base URI, merging path components, removing dot-segments, and of a base URI, merging path components, removing dot-segments, and
recomposing the result back into a character string. recomposing the result back into a character sequence.
Overall, the proper processing of URIs is quite complicated. This Overall, the proper handling of URI references is relatively
can be a problem in particular in constrained environments [RFC7228], intricate. This can be a problem, especially in constrained
where devices often have severe code size limitations. As a result, environments [RFC7228] where nodes often have severe code size and
many implementations in these environments choose to support only an memory size limitations. As a result, many implementations in such
ad-hoc, informally-specified, bug-ridden, non-interoperable subset of environments support only an ad-hoc, informally-specified, bug-
half of the URI standard. ridden, non-interoperable subset of half of RFC 3986.
This document introduces Constrained Resource Identifier (CoRI) This document defines the Constrained Resource Identifier (CRI) by
references, an alternate serialization of URI references that encodes constraining URIs to a simplified subset and serializing their
the URI components in Concise Binary Object Representation (CBOR) components in Concise Binary Object Representation (CBOR)
[RFC7049] instead of a string of characters. Assuming an [RFC7049bis] instead of a sequence of characters. This allows
implementation of CBOR is already present on a device, typical typical operations on URI references such as parsing, comparison and
operations on URI references such as parsing, reference resolution, reference resolution to be implemented (including all corner cases)
and comparison can be implemented more easily than for character in a comparatively small amount of code.
strings. A full implementation that covers all corner cases is
intended to be implementable in a relatively small amount of code.
As a result of the simplification, CoRI references are not capable of As a result of simplification, however, CRIs are not capable of
expressing all URI references permitted by the syntax of RFC 3986. expressing all URIs permitted by the generic syntax of RFC 3986
(Hence the "constrained" in "Constrained Resource Identifiers".) The (hence the "constrained" in "Constrained Resource Identifier"). The
supported subset includes all Constrained Application Protocol (CoAP) supported subset includes all URIs of the Constrained Application
URIs [RFC7252], most Hypertext Transfer Protocol (HTTP) URIs Protocol (CoAP) [RFC7252], most URIs of the Hypertext Transfer
[RFC7230], and many other URIs that function as resource locators. Protocol (HTTP) [RFC7230], and other URIs that are similar. The
exact constraints are defined in Section 2.
1.1. Notational Conventions 1.1. Notational Conventions
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in BCP "OPTIONAL" in this document are to be interpreted as described in BCP
14 [RFC2119] [RFC8174] when, and only when, they appear in all 14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here. capitals, as shown here.
Terms defined in this document appear in _cursive_ where they are Terms defined in this document appear in _cursive_ where they are
introduced. introduced (rendered in plain text as the new term surrounded by
underscores).
2. Data Model
The data model for CoRI references is very similar to the
serialization of the request URI in CoAP messages [RFC7252]: The
components of a URI reference are encoded as a sequence of _options_,
where each path segment and query parameter becomes its own option.
Every option consists of an _option number_ identifying the type of
option (scheme, host name, path segment, etc.) and an _option value_.
2.1. Options
The following types of options are defined:
scheme 2. Constraints
Specifies the URI scheme. The option value can be any Unicode
string matching the "scheme" rule described in Section 3.1 of RFC
3986 [RFC3986], excluding uppercase letters.
host.name A Constrained Resource Identifier consists of the same five
Specifies the host of the URI authority as a registered name. The components as a URI: scheme, authority, path, query, and fragment.
option value can be any Unicode string matching the specifications The components are subject to the following constraints:
of the URI scheme.
host.ip C1. The scheme name can be any Unicode string (see Definition D80
Specifies the host of the URI authority as an IPv4 address or an in [Unicode]) that matches the syntax defined in Section 3.1 of
IPv6 address. The option value is a byte string with a length of [RFC3986] and is lowercase (see Definition D139 in [Unicode]).
either 4 or 16 bytes, respectively.
port C2. An authority is always a host identified by an IP address or
Specifies the port number of the URI authority. The option value registered name, along with optional port information. User
is an integer in the range from 0 to 65535. information is not supported.
path.type C3. An IP address can be either an IPv4 address or an IPv6 address.
Specifies the type of the URI path for reference resolution. The IPv6 scoped addressing zone identifiers and future versions of
option value is an integer in the range from 0 to 127, named as IP are not supported.
follows:
0 absolute-path C4. A registered name can be any Unicode string that is lowercase
1 append-relation and in Unicode Normalization Form C (NFC) (see Definition D120
2 append-path in [Unicode]). (The syntax may be further restricted by the
3 relative-path scheme.)
4 relative-path-1up
5 relative-path-2up
6 relative-path-3up
7 relative-path-4up
...
127 relative-path-124up
path C5. A port is always an integer in the range from 0 to 65535.
Specifies one segment of the URI path. The option value can be Empty ports or ports outside this range are not supported.
any Unicode string with the exception of "." and "..". This
option can occur more than once.
query C6. The port is omitted if and only if the port would be the same
Specifies one argument of the URI query. The option value can be as the scheme's default port (provided the scheme defines
any Unicode string. This option can occur more than once. such a default port) or the scheme does not use ports.
fragment C7. A path consists of zero or more path segments. A path must not
Specifies the fragment identifier. The option value can be any consist of a single zero-length path segment, which is
Unicode string. considered equivalent to a path of zero path segments.
No percent-encoding is performed in option values. C8. A path segment can be any Unicode string that is in NFC
(including the zero-length string) with the exception of the
special "." and ".." complete path segments. No special
constraints are placed on the first path segment.
2.2. Option Sequences C9. A query always consists of one or more query parameters. A
query parameter can be any Unicode string that is in NFC. It
is often in the form of a "key=value" pair. When converting a
CRI to a URI, query parameters are separated by an ampersand
("&") character. (This matches the structure and encoding of
the query in CoAP URIs.)
_ host.name _ C10. A fragment identifier can be any Unicode string that is in NFC.
____ scheme __/ \___ port _
\ \________/ \__ host.ip __/ / \
\__________________________/ ________/
\ / ________ _________
\ / / \ / \
\__________ path.type __\_\_ path _/__\_ query _/__ fragment __
\___________/ \________/ \_________/ \__________/
Figure 1: Structure of a Well-Formed Sequence of Options C11. The syntax of registered names, path segments, query
parameters, and fragment identifiers may be further restricted
and sub-structured by the scheme. There is no support,
however, for escaping sub-delimiters that are not intended to
be used in a delimiting function.
A sequence of options is considered _well-formed_ if: C12. When converting a CRI to a URI, any character that is outside
the allowed character range or a delimiter in the URI syntax is
percent-encoded. Percent-encoding always uses the UTF-8
encoding form (see Definition D92 in [Unicode]) to convert the
character to a sequence of one or more octets.
* the sequence of options is empty or starts with a "scheme", 3. Creation and Normalization
"host.name", "host.ip", "port", "path.type", "path", "query", or
"fragment" option;
* any "scheme" option is followed by either a "host.name" or a Resource identifiers are generally created on the initial creation of
"host.ip" option; a resource with a certain resource identifier, or the initial
exposition of a resource under a particular resource identifier.
* any "host.name" option is followed by a "port" option; A Constrained Resource Identifier SHOULD be created by the naming
authority that governs the namespace of the resource identifier. For
example, for the resources of an HTTP origin server, that server is
responsible for creating the CRIs for those resources.
* any "host.ip" option is followed by a "port" option; The creator MUST ensure that any CRI created satisfies the
constraints defined in Section 2. The creation of a CRI fails if the
CRI cannot be validated to satisfy all of the constraints.
* any "port" option is followed by a "path", "query", or "fragment" If a creator creates a CRI from user input, it MAY apply the
option or is at the end of the sequence; following (and only the following) normalizations to make the CRI more
likely to validate: map the scheme name to lowercase (C1.); map the
registered name to NFC (C4.); elide the port if it's the default port
for the scheme (C6.); elide a single zero-length path segment (C7.);
map path segments, query parameters and the fragment identifier to
NFC (C8., C9., C10.).
* any "path.type" option is followed by a "path", "query", or Once a CRI has been created, it can be used and transferred without
"fragment" option or is at the end of the sequence; further normalization. All operations that operate on a CRI SHOULD
rely on the assumption that the CRI is appropriately pre-normalized.
(This does not contradict the requirement that when CRIs are
transferred, recipients must operate on as-good-as untrusted input
and fail gracefully in the face of malicious inputs.)
* any "path" option is followed by a "path", "query", or "fragment" 4. Comparison
option or is at the end of the sequence;
* any "query" option is followed by a "query" or "fragment" option One of the most common operations on CRIs is comparison: determining
or is at the end of the sequence; and whether two CRIs are equivalent, without using the CRIs to access
their respective resource(s).
* any "fragment" option is at the end of the sequence. Determination of equivalence or difference of CRIs is based on simple
component-wise comparison. If two CRIs are identical component-by-
component (using code-point-by-code-point comparison for components
that are Unicode strings) then it is safe to conclude that they are
equivalent.
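The component-wise comparison described for CRIs can be sketched in
Python as follows. This is a minimal illustration, not part of either
draft revision: it assumes a CRI is held as a flat list of alternating
option numbers and option values (the -03 CBOR layout), and the names
`cri_options`, `cri_equivalent` and the `ignore_fragment` flag are
hypothetical. Python compares `str` values code point by code point,
so no normalization takes place during the comparison.

```python
FRAGMENT = 7  # option number of "fragment" in the -03 CDDL


def cri_options(cri):
    """Split a flat [number, value, number, value, ...] list into pairs."""
    return list(zip(cri[0::2], cri[1::2]))


def cri_equivalent(a, b, ignore_fragment=False):
    """Return True if two CRIs are identical component by component.

    ignore_fragment is an assumption of this sketch; it drops any
    fragment option before comparing.
    """
    pairs_a, pairs_b = cri_options(a), cri_options(b)
    if ignore_fragment:
        pairs_a = [(n, v) for n, v in pairs_a if n != FRAGMENT]
        pairs_b = [(n, v) for n, v in pairs_b if n != FRAGMENT]
    # str == str compares code points; bytes == bytes compares octets.
    return pairs_a == pairs_b
```

Because the comparison is purely code-point based, two strings that
differ only in Unicode normalization form (for example "\u00e9" and
"e\u0301") compare as different, which is one way a CRI that does not
satisfy the constraints produces a false negative.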
A well-formed sequence of options is considered _absolute_ if the This comparison mechanism is designed to minimize false negatives
sequence of options starts with a "scheme" option. while strictly avoiding false positives. The constraints defined in
Section 2 imply the most common forms of syntax- and scheme-based
normalizations in URIs, but do not comprise protocol-based
normalizations that require accessing the resources or detailed
knowledge of the scheme's dereference algorithm. False negatives can
be caused by resource aliases and CRIs that do not fully satisfy the
constraints.
A well-formed sequence of options is considered _relative_ if the When CRIs are compared to select (or avoid) a network action, such as
sequence of options is empty or starts with an option other than a retrieval of a representation, fragment components (if any) should be
"scheme" option. excluded from the comparison.
An absolute sequence of options is considered _normalized_ if the 5. CRI References
result of resolving the sequence of options against any base is equal
to the input. (It doesn't matter what base it is resolved against,
since it is already absolute.)
The following operations can be performed on a sequence of options: The most common usage of a Constrained Resource Identifier is to
embed it in resource representations, e.g., to express a hyperlink
between the represented resource and the resource identified by the
CRI.
resolve(href, base) This section defines the serialization of CRIs in Concise Binary
Resolves a well-formed sequence of options `href` against an Object Representation (CBOR) [RFC7049bis]. To reduce representation
absolute sequence of options `base`. This operation MUST be size, CRIs are not serialized directly. Instead, CRIs are indirectly
performed by applying any algorithm that is functionally referenced through _CRI references_ that take advantage of
equivalent to the reference implementation in Section 4.1 of this hierarchical locality. The CBOR serialization of CRI references is
document. specified in Section 5.1.
relative(href, base) The only operation defined on a CRI reference is _reference
Makes an absolute sequence of options `href` relative to an resolution_: the act of transforming a CRI reference into a CRI. An
absolute sequence of options `base`. This operation MUST be application MUST implement this operation by applying the algorithm
performed by applying any algorithm that returns a sequence of specified in Section 5.2 or any algorithm that is functionally
options such that `resolve(relative(h, b), b)` is equal to `h` equivalent to it.
given the same `b`.
recompose(href) The method of transforming a CRI into a CRI reference is unspecified;
Recomposes a URI from an absolute sequence of options `href`. This implementations are free to use any algorithm as long as reference
operation MUST be performed by applying any algorithm that is resolution of the resulting CRI reference yields the original CRI.
functionally equivalent to the reference implementation in
Section 4.2 of this document.
To reduce variability, it is RECOMMENDED to uppercase the letters When testing for equivalence or difference, applications SHOULD NOT
in the hexadecimal notation when percent-encoding octets [RFC3986] directly compare CRI references; the references should be resolved to
and to follow the recommendations of Section 4 of RFC 5952 for the their respective CRI before comparison.
text representation of IPv6 addresses [RFC5952].
decompose(str) 5.1. CBOR Serialization
Decomposes a URI `str` into a sequence of options. This operation
MUST be performed by applying any algorithm that returns a
sequence of options such that `recompose(decompose(x))` is
equivalent to `x`.
coap(href) A CRI reference is encoded as a CBOR array [RFC7049bis] that contains
Constructs CoAP options from an absolute, normalized sequence of a sequence of zero or more options. Each option consists of an
options. This operation MUST be performed by recomposing the option number followed by an option value, holding one component or
sequence of options to a URI (as described above) and decomposing sub-component of the CRI reference. To reduce size, both option
the URI into CoAP options (as specified in Section 6.4 of RFC numbers and option values are immediate elements of the CBOR array
7252). A concise implementation of this algorithm is illustrated and appear in alternating order.
in Section 4.3 of this document.
3. CBOR Not all possible sequences of options denote a well-formed CRI
reference. The structure can be described in the Concise Data
Definition Language (CDDL) [RFC8610] as follows:
In Concise Binary Object Representation (CBOR) [RFC7049], a sequence CRI-Reference = [
of options is encoded as an array that contains the option numbers (?scheme, ?((host.name // host.ip), ?port) // path.type),
and option values in alternating order. *path,
*query,
?fragment
]
The structure can be described in the Concise Data Definition scheme = (0, text .regexp "[a-z][a-z0-9+.-]*")
Language (CDDL) [RFC8610] as follows: host.name = (1, text)
host.ip = (2, bytes .size 4 / bytes .size 16)
port = (3, 0..65535)
path.type = (4, 0..127)
path = (5, text)
query = (6, text)
fragment = (7, text)
CoRI = [?(scheme: 1, text .regexp "[a-z][a-z0-9+.-]*"), The options correspond to the (sub-)components of a CRI, as described
?(host.name: 2, text // in Section 2, with the addition of the "path.type" option. The
host.ip: 3, bytes .size 4 / bytes .size 16), "path.type" option can be used to express path prefixes like "/",
?(port: 4, 0..65535), "./", "../", "../../", etc. The exact semantics of the option values
?(path.type: 5, 0..127), are defined by Section 5.2. A sequence of options that is empty or
*(path: 6, text), starts with a "path" option is equivalent to the same sequence prefixed
*(query: 7, text), by a "path.type" option with value 2.
?(fragment: 8, text)]
Examples: Examples:
[1, "coap", [0, "coap",
3, h'C6336401', 2, h'C6336401',
4, 5683, 3, 61616,
6, ".well-known", 5, ".well-known",
6, "core"] 5, "core"]
[5, 0,
6, ".well-known",
6, "core",
7, "rt=temperature-c"]
4. Python [4, 0,
5, ".well-known",
5, "core",
6, "rt=temperature-c"]
In Python, a sequence of options is encoded as a list of tuples, A CRI reference is considered _absolute_ if the sequence of options
where each tuple contains one option number and one option value. starts with a "scheme" option.
The following Python 3.6 code illustrates how to check a sequence of A CRI reference is considered _relative_ if the sequence of options
options for being well-formed, absolute, and relative. is empty or starts with an option other than a "scheme" option.
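For the -03 serialization, the well-formed, absolute and relative
checks can be sketched as below. This is an adaptation under stated
assumptions rather than text from the draft: the option numbers (0 to
7) and value ranges are taken from the -03 CDDL, a CRI reference is
assumed to be a flat Python list of alternating option numbers and
values, and the transition table `_NEXT` and the `END` marker are
hypothetical helpers derived by reading the CDDL group structure.

```python
import re

SCHEME, HOST_NAME, HOST_IP, PORT, PATH_TYPE, PATH, QUERY, FRAGMENT = range(8)
END = 8  # hypothetical marker for "end of sequence"

# Which option may follow which, read off the -03 CDDL.
_NEXT = {
    None: {SCHEME, HOST_NAME, HOST_IP, PATH_TYPE, PATH, QUERY, FRAGMENT, END},
    SCHEME: {HOST_NAME, HOST_IP, PATH, QUERY, FRAGMENT, END},
    HOST_NAME: {PORT, PATH, QUERY, FRAGMENT, END},
    HOST_IP: {PORT, PATH, QUERY, FRAGMENT, END},
    PORT: {PATH, QUERY, FRAGMENT, END},
    PATH_TYPE: {PATH, QUERY, FRAGMENT, END},
    PATH: {PATH, QUERY, FRAGMENT, END},
    QUERY: {QUERY, FRAGMENT, END},
    FRAGMENT: {END},
}


def _value_ok(number, value):
    """Check the option value against the type and range in the CDDL."""
    if number == SCHEME:
        return (isinstance(value, str)
                and re.fullmatch(r"[a-z][a-z0-9+.-]*", value) is not None)
    if number == HOST_IP:
        return isinstance(value, bytes) and len(value) in (4, 16)
    if number == PORT:
        return type(value) is int and 0 <= value <= 65535
    if number == PATH_TYPE:
        return type(value) is int and 0 <= value <= 127
    return isinstance(value, str)  # host.name, path, query, fragment


def is_well_formed(ref):
    if len(ref) % 2 != 0:
        return False
    previous = None
    for number, value in zip(ref[0::2], ref[1::2]):
        if type(number) is not int or number not in _NEXT[previous]:
            return False
        if not _value_ok(number, value):
            return False
        previous = number
    return END in _NEXT[previous]


def is_absolute(ref):
    return is_well_formed(ref) and len(ref) != 0 and ref[0] == SCHEME


def is_relative(ref):
    return is_well_formed(ref) and (len(ref) == 0 or ref[0] != SCHEME)
```

With these definitions, the first -03 example above (starting with
option 0, "coap") is absolute, the second (starting with option 4) is
relative, and a sequence that starts with a "port" option is rejected.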
<CODE BEGINS> 5.2. Reference Resolution
import enum
class Option(enum.IntEnum): The term "relative" implies that a "base CRI" exists against which
_BEGIN = 0 the relative reference is applied. Aside from fragment-only
SCHEME = 1 references, relative references are only usable when a base CRI is
HOST_NAME = 2 known.
HOST_IP = 3
PORT = 4
PATH_TYPE = 5
PATH = 6
QUERY = 7
FRAGMENT = 8
_END = 9
class PathType(enum.IntEnum): The following steps define the process of resolving any CRI reference
ABSOLUTE_PATH = 0 against a base CRI so that the result is a CRI in the form of an
APPEND_RELATION = 1 absolute CRI reference:
APPEND_PATH = 2
RELATIVE_PATH = 3
RELATIVE_PATH_1UP = 4
RELATIVE_PATH_2UP = 5
RELATIVE_PATH_3UP = 6
RELATIVE_PATH_4UP = 7
_TRANSITIONS = ([Option.SCHEME, Option.HOST_NAME, Option.HOST_IP, 1. Establish the base CRI of the CRI reference and express it in the
Option.PORT, Option.PATH_TYPE, Option.PATH, Option.QUERY, form of an absolute CRI reference. The base CRI can be
Option.FRAGMENT, Option._END], established in a number of ways; see Section 5.1 of [RFC3986].
[Option.HOST_NAME, Option.HOST_IP],
[Option.PORT],
[Option.PORT],
[Option.PATH, Option.QUERY, Option.FRAGMENT, Option._END],
[Option.PATH, Option.QUERY, Option.FRAGMENT, Option._END],
[Option.PATH, Option.QUERY, Option.FRAGMENT, Option._END],
[Option.QUERY, Option.FRAGMENT, Option._END],
[Option._END])
def is_well_formed(href): 2. Determine the values of two variables, T and E, depending on the
previous = Option._BEGIN first option of the CRI reference to be resolved, according to
for option, _ in href: Table 1.
if option not in _TRANSITIONS[previous]:
return False
previous = option
if Option._END not in _TRANSITIONS[previous]:
return False
return True
def is_absolute(href): +---------------------+------------------+------------------------+
return is_well_formed(href) and \ | First Option Number | T | E |
(len(href) != 0 and href[0][0] == Option.SCHEME) +=====================+==================+========================+
| 0 (scheme) | 0 | 0 |
+---------------------+------------------+------------------------+
| 1 (host.name) | 0 | 1 |
+---------------------+------------------+------------------------+
| 2 (host.ip) | 0 | 1 |
+---------------------+------------------+------------------------+
| 3 (port) | (invalid sequence of options) |
+---------------------+------------------+------------------------+
| 4 (path.type) | option value - 1 | if T < 0 then 5 else 6 |
+---------------------+------------------+------------------------+
| 5 (path) | 1 | 6 |
+---------------------+------------------+------------------------+
| 6 (query) | 0 | 6 |
+---------------------+------------------+------------------------+
| 7 (fragment) | 0 | 7 |
+---------------------+------------------+------------------------+
| none/empty sequence | 0 | 7 |
+---------------------+------------------+------------------------+
def is_relative(href): Table 1: Values of the Variables T and E
return is_well_formed(href) and \
(len(href) == 0 or href[0][0] != Option.SCHEME)
<CODE ENDS>
Examples: 3. Initialize a buffer with all the options from the base CRI where
the option number is less than the value of E.
[(Option.SCHEME, 'coap'), 4. If the value of T is greater than 0, remove the last T-many
(Option.HOST_IP, b'\xC6\x33\x64\x01'), "path" options from the end of the buffer (up to the number of
(Option.PORT, 5683), "path" options in the buffer).
(Option.PATH, '.well-known'),
(Option.PATH, 'core')]
[(Option.PATH_TYPE, PathType.ABSOLUTE_PATH), 5. Append all the options from the CRI reference to the buffer,
(Option.PATH, '.well-known'), except for any "path.type" option.
(Option.PATH, 'core'),
(Option.QUERY, 'rt=temperature-c')]
4.1. Reference Resolution 6. If the buffer contains a single "path" option and the value of
that option is the zero-length string, remove that option from
the buffer.
The following Python 3.6 code defines how to resolve a sequence of 7. Return the sequence of options in the buffer.
options that might be relative to a given base.
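Steps 2 to 7 of the -03 resolution procedure, together with Table 1,
can be sketched in Python as follows. This is an illustrative reading
of the steps and not the reference implementation: it assumes that
both the base CRI and the CRI reference are well-formed flat lists of
alternating option numbers and values, that step 1 (establishing the
base) has already been done by the caller, and the helper name
`_pairs` is hypothetical.

```python
SCHEME, HOST_NAME, HOST_IP, PORT, PATH_TYPE, PATH, QUERY, FRAGMENT = range(8)

# Table 1 for first options whose T and E do not depend on the value.
_TABLE_1 = {
    SCHEME: (0, 0), HOST_NAME: (0, 1), HOST_IP: (0, 1),
    PATH: (1, 6), QUERY: (0, 6), FRAGMENT: (0, 7),
}


def _pairs(seq):
    return list(zip(seq[0::2], seq[1::2]))


def resolve(base, ref):
    """Resolve CRI reference `ref` against absolute CRI reference `base`."""
    base_opts, ref_opts = _pairs(base), _pairs(ref)

    # Step 2: determine T and E from the first option (Table 1).
    if not ref_opts:
        t, e = 0, 7
    else:
        first, value = ref_opts[0]
        if first == PORT:
            raise ValueError("invalid sequence of options")
        if first == PATH_TYPE:
            t = value - 1
            e = 5 if t < 0 else 6
        else:
            t, e = _TABLE_1[first]

    # Step 3: copy base options whose option number is less than E.
    buffer = [(n, v) for n, v in base_opts if n < e]

    # Step 4: remove up to T trailing "path" options.
    while t > 0 and buffer and buffer[-1][0] == PATH:
        buffer.pop()
        t -= 1

    # Step 5: append the reference's options, except "path.type".
    buffer.extend((n, v) for n, v in ref_opts if n != PATH_TYPE)

    # Step 6: drop a lone zero-length "path" option.
    paths = [i for i, (n, _) in enumerate(buffer) if n == PATH]
    if len(paths) == 1 and buffer[paths[0]][1] == "":
        del buffer[paths[0]]

    # Step 7: return the buffer as a flat sequence again.
    return [item for pair in buffer for item in pair]
```

For example, resolving `[5, "c"]` against
`[0, "coap", 1, "example.com", 5, "a", 5, "b", 6, "q=1"]` yields
`[0, "coap", 1, "example.com", 5, "a", 5, "c"]`: the last base path
segment and the base query are dropped before the new segment is
appended.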
<CODE BEGINS> 6. Relationship between CRIs, URIs and IRIs
def resolve(base, href, relation=0):
if not is_absolute(base) or not is_well_formed(href):
return None
result = []
option = Option.FRAGMENT
if len(href) != 0:
option = href[0][0]
if option == Option.HOST_IP:
option = Option.HOST_NAME
elif option == Option.PATH_TYPE:
type = href[0][1]
href = href[1:]
elif option == Option.PATH:
type = PathType.RELATIVE_PATH
option = Option.PATH_TYPE
if option != Option.PATH_TYPE or type == PathType.ABSOLUTE_PATH:
_copy_until(base, result, option)
else:
_copy_until(base, result, Option.QUERY)
if type == PathType.APPEND_RELATION:
_append_and_normalize(result, Option.PATH, str(relation))
while type > PathType.APPEND_PATH:
if len(result) == 0 or result[-1][0] != Option.PATH:
break
del result[-1]
type -= 1
_copy_until(href, result, Option._END)
_append_and_normalize(result, Option._END, None)
return result
def _copy_until(input, output, end): CRIs are meant to replace both Uniform Resource Identifiers (URIs)
for option, value in input: [RFC3986] and Internationalized Resource Identifiers (IRIs) [RFC3987]
if option >= end: in constrained environments [RFC7228]. Applications in these
break environments may never need to use URIs and IRIs directly, especially
_append_and_normalize(output, option, value) when the resource identifier is used simply for identification
purposes or when the CRI can be directly converted into a CoAP
request.
def _append_and_normalize(output, option, value): However, it may be necessary in other environments to determine the
if option > Option.PATH: associated URI or IRI of a CRI, and vice versa. Applications can
if len(output) >= 2 and \ perform these conversions as follows:
output[-1] == (Option.PATH, '') and (
output[-2][0] < Option.PATH_TYPE or (
output[-2][0] == Option.PATH_TYPE and
output[-2][1] == PathType.ABSOLUTE_PATH)):
del output[-1]
if option > Option.FRAGMENT:
return
output.append((option, value))
<CODE ENDS>
4.2. URI Recomposition CRI to URI
A CRI is converted to a URI as specified in Section 6.1.
The following Python 3.6 code defines how to recompose a URI from an URI to CRI
absolute sequence of options. The method of converting a URI to a CRI is unspecified;
implementations are free to use any algorithm as long as
converting the resulting CRI back to a URI yields an equivalent
URI.
<CODE BEGINS> CRI to IRI
def recompose(href): A CRI can be converted to an IRI by first converting it to a URI,
if not is_absolute(href): and then converting the URI to an IRI as described in Section 3.2
return None of [RFC3987].
result = ''
no_path = True
first_query = True
for option, value in href:
if option == Option.SCHEME:
result += value + ':'
elif option == Option.HOST_NAME:
result += '//' + _encode_reg_name(value)
elif option == Option.HOST_IP:
result += '//' + _encode_ip_address(value)
elif option == Option.PORT:
result += ':' + _encode_port(value)
elif option == Option.PATH:
result += '/' + _encode_path_segment(value)
no_path = False
elif option == Option.QUERY:
if no_path:
result += '/'
no_path = False
result += '?' if first_query else '&'
result += _encode_query_argument(value)
first_query = False
elif option == Option.FRAGMENT:
if no_path:
result += '/'
no_path = False
result += '#' + _encode_fragment(value)
if no_path:
result += '/'
no_path = False
return result
def _encode_reg_name(s): IRI to CRI
return ''.join(c if _is_reg_name_char(c) An IRI can be converted to a CRI by first converting it to a URI
else _encode_pct(c) for c in s) as described in Section 3.1 of [RFC3987], and then converting the
URI to a CRI.
def _encode_ip_address(b): Everything in this section also applies to CRI references, URI
if len(b) == 4: references and IRI references.
return '.'.join(str(c) for c in b)
elif len(b) == 16:
return '[' + ... + ']' # see RFC 5952
def _encode_port(p): 6.1. Converting CRIs to URIs
return str(p)
def _encode_path_segment(s): Applications MUST convert a CRI reference to a URI reference by
return ''.join(c if _is_segment_char(c) determining the components of the URI reference according to the
else _encode_pct(c) for c in s) following steps and then recomposing the components to a URI
reference string as specified in Section 5.3 of [RFC3986].
def _encode_query_argument(s): scheme
return ''.join(c if _is_query_char(c) and c not in '&' If the CRI reference contains a "scheme" option, the scheme
else _encode_pct(c) for c in s) component of the URI reference consists of the value of that
option. Otherwise, the scheme component is undefined.
def _encode_fragment(s): authority
return ''.join(c if _is_fragment_char(c) If the CRI reference contains a "host.name" or "host.ip" option,
else _encode_pct(c) for c in s) the authority component consists of the host subcomponent,
optionally followed by a colon (":") character and the port
subcomponent. Otherwise, the authority component is undefined.
def _encode_pct(s): The host subcomponent consists of the value of the "host.name" or
return ''.join('%{0:0>2X}'.format(c) for c in s.encode('utf-8')) "host.ip" option.
def _is_reg_name_char(c): Any character in the value of a "host.name" option that is not in
return _is_unreserved(c) or _is_sub_delim(c) the set of unreserved characters (Section 2.3 of [RFC3986]) or
"sub-delims" (Section 2.2 of [RFC3986]) MUST be percent-encoded.
def _is_segment_char(c): The value of a "host.ip" option MUST be represented as a string
return _is_pchar(c) that matches the "IPv4address" or "IP-literal" rule (Section 3.2.2
of [RFC3986]).
def _is_query_char(c): If the CRI reference contains a "port" option, the port
return _is_pchar(c) or c in '/?' subcomponent consists of the value of that option in decimal
notation. Otherwise, the colon (":") character and the port
subcomponent are both omitted.
def _is_fragment_char(c): path
return _is_pchar(c) or c in '/?' If the CRI reference is an empty sequence of options or starts
with a "port" option, a "path" option, or a "path.type" option
where the value is not 0, the conversion fails.
def _is_pchar(c): If the CRI reference contains a "host.name" option, a "host.ip"
return _is_unreserved(c) or _is_sub_delim(c) or c in ':@' option or a "path.type" option where the value is not 0, the path
component of the URI reference is prefixed by a slash ("/")
character. Otherwise, the path component is prefixed by the empty
string.
def _is_unreserved(c): If the CRI reference contains one or more "path" options, the
return _is_alpha(c) or _is_digit(c) or c in '-._~' prefix is followed by the value of each option, separated by a
slash ("/") character.
def _is_alpha(c): Any character in the value of a "path" option that is not in the
return c in 'ABCDEFGHIJKLMNOPQRSTUVWXYZ' + \ set of unreserved characters or "sub-delims" or a colon (":") or
'abcdefghijklmnopqrstuvwxyz' commercial at ("@") character MUST be percent-encoded.
def _is_digit(c): If the authority component is defined and the path component does
return c in '0123456789' not match the "path-abempty" rule (Section 3.3 of [RFC3986]), the
conversion fails.
def _is_sub_delim(c): If the authority component is undefined and the scheme component
return c in '!$&\'()*+,;=' is defined and the path component does not match the "path-
<CODE ENDS> absolute", "path-rootless" or "path-empty" rule (Section 3.3 of
[RFC3986]), the conversion fails.
4.3. CoAP Encoding If the authority component is undefined and the scheme component
is undefined and the path component does not match the "path-
absolute", "path-noscheme" or "path-empty" rule (Section 3.3 of
[RFC3986]), the conversion fails.
The following Python 3.6 code illustrates how to construct CoAP query
options from an absolute sequence of options. For simplicity, the If the CRI reference contains one or more "query" options, the
code does not omit CoAP options with their default value. query component of the URI reference consists of the value of each
option, separated by an ampersand ("&") character. Otherwise, the
query component is undefined.
<CODE BEGINS> Any character in the value of a "query" option that is not in the
def coap(href, to_proxy=False): set of unreserved characters or "sub-delims" or a colon (":"),
if not is_absolute(href): commercial at ("@"), slash ("/") or question mark ("?") character
return None MUST be percent-encoded. Additionally, any ampersand character
result = b'' ("&") in the option value MUST be percent-encoded.
previous = 0
for option, value in href:
if option == Option.SCHEME:
pass
elif option == Option.HOST_NAME:
opt = 3 # Uri-Host
val = value.encode('utf-8')
result += _encode_coap_option(opt - previous, val)
previous = opt
elif option == Option.HOST_IP:
opt = 3 # Uri-Host
if len(value) == 4:
val = '.'.join(str(c) for c in value).encode('utf-8')
elif len(value) == 16:
val = b'[' + ... + b']' # see RFC 5952
result += _encode_coap_option(opt - previous, val)
previous = opt
elif option == Option.PORT:
opt = 7 # Uri-Port
val = value.to_bytes((value.bit_length() + 7) // 8, 'big')
result += _encode_coap_option(opt - previous, val)
previous = opt
elif option == Option.PATH:
opt = 11 # Uri-Path
val = value.encode('utf-8')
result += _encode_coap_option(opt - previous, val)
previous = opt
elif option == Option.QUERY:
opt = 15 # Uri-Query
val = value.encode('utf-8')
result += _encode_coap_option(opt - previous, val)
previous = opt
elif option == Option.FRAGMENT:
pass
if to_proxy:
(option, value) = href[0]
opt = 39 # Proxy-Scheme
val = value.encode('utf-8')
result += _encode_coap_option(opt - previous, val)
previous = opt
return result
def _encode_coap_option(delta, value): fragment
length = len(value) If the CRI reference contains a fragment option, the fragment
delta_nibble = _encode_coap_option_nibble(delta) component of the URI reference consists of the value of that
length_nibble = _encode_coap_option_nibble(length) option. Otherwise, the fragment component is undefined.
result = bytes([delta_nibble << 4 | length_nibble])
if delta_nibble == 13:
delta -= 13
result += bytes([delta])
elif delta_nibble == 14:
delta -= 256 + 13
result += bytes([delta >> 8, delta & 255])
if length_nibble == 13:
length -= 13
result += bytes([length])
elif length_nibble == 14:
length -= 256 + 13
result += bytes([length >> 8, length & 255])
result += value
return result
def _encode_coap_option_nibble(n): Any character in the value of a "fragment" option that is not in
if n < 13: the set of unreserved characters or "sub-delims" or a colon (":"),
return n commercial at ("@"), slash ("/") or question mark ("?") character
elif n < 256 + 13: MUST be percent-encoded.
return 13
elif n < 65536 + 256 + 13:
return 14
<CODE ENDS>
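The percent-encoding and recomposition rules that the -03 text gives for the "host.name", "port", "path", "query" and "fragment" options can be sketched in Python as follows. This is an illustrative sketch only, not the reference implementation: the function names, the representation of a CRI reference as a list of (option name, value) pairs, and the character-set constants are assumptions made for this example. The sketch deliberately omits the "host.ip" and "path.type" options and the conversion-failure checks.

```python
# Illustrative sketch; names and data layout are assumptions, not from the draft.
UNRESERVED = set("ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                 "abcdefghijklmnopqrstuvwxyz"
                 "0123456789-._~")
SUB_DELIMS = set("!$&'()*+,;=")

REG_NAME_CHARS = UNRESERVED | SUB_DELIMS            # allowed in "host.name"
PATH_CHARS = REG_NAME_CHARS | set(":@")             # allowed in "path"
QUERY_CHARS = (PATH_CHARS | set("/?")) - {"&"}      # "&" separates query options
FRAGMENT_CHARS = PATH_CHARS | set("/?")             # allowed in "fragment"


def pct_encode(value, allowed):
    """Percent-encode the UTF-8 bytes of every character not in `allowed`."""
    out = []
    for ch in value:
        if ch in allowed:
            out.append(ch)
        else:
            out.extend("%{:02X}".format(b) for b in ch.encode("utf-8"))
    return "".join(out)


def to_uri_reference(options):
    """Recompose a URI reference from a list of (name, value) pairs.

    Handles only "scheme", "host.name", "port", "path", "query" and
    "fragment"; any other option is ignored in this sketch.
    """
    scheme = host = port = fragment = None
    segments, queries = [], []
    for name, value in options:
        if name == "scheme":
            scheme = value
        elif name == "host.name":
            host = pct_encode(value, REG_NAME_CHARS)
        elif name == "port":
            port = value
        elif name == "path":
            segments.append(pct_encode(value, PATH_CHARS))
        elif name == "query":
            queries.append(pct_encode(value, QUERY_CHARS))
        elif name == "fragment":
            fragment = pct_encode(value, FRAGMENT_CHARS)

    result = ""
    if scheme is not None:
        result += scheme + ":"
    if host is not None:
        # The authority is the host, optionally followed by ":" and the port.
        result += "//" + host
        if port is not None:
            result += ":" + str(port)
        result += "/"  # path prefix when an authority is present
    result += "/".join(segments)
    if queries:
        result += "?" + "&".join(queries)
    if fragment is not None:
        result += "#" + fragment
    return result
```

For example, under these assumptions, the options ("scheme", "coap"), ("host.name", "example.com"), ("path", "a b") and ("query", "x=1&y") would recompose to a reference in which the space is encoded as "%20" and the ampersand inside the query value as "%26".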
5. Security Considerations 7. Security Considerations
Parsers must operate on input that is assumed to be untrusted. This Parsers of CRI references must operate on input that is assumed to be
means that parsers MUST fail gracefully in the face of malicious untrusted. This means that parsers MUST fail gracefully in the face
inputs. Additionally, parsers MUST be prepared to deal with resource of malicious inputs. Additionally, parsers MUST be prepared to deal
exhaustion (e.g., resulting from the allocation of big data items) or with resource exhaustion (e.g., resulting from the allocation of big
exhaustion of the call stack (stack overflow). See Section 8 of RFC data items) or exhaustion of the call stack (stack overflow). See
7049 [RFC7049] for security considerations relating to CBOR. Section 10 of [RFC7049bis] for additional security considerations
relating to CBOR.
The security considerations discussed in Section 7 of RFC 3986 The security considerations discussed in Section 7 of [RFC3986] and
[RFC3986] also apply to Constrained Resource Identifiers. Section 8 of [RFC3987] for URIs and IRIs also apply to CRIs.
6. IANA Considerations 8. IANA Considerations
This document has no IANA actions. This document has no IANA actions.
7. References 9. References
7.1. Normative References 9.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997, DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>. <https://www.rfc-editor.org/info/rfc2119>.
[RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform [RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
Resource Identifier (URI): Generic Syntax", STD 66, Resource Identifier (URI): Generic Syntax", STD 66,
RFC 3986, DOI 10.17487/RFC3986, January 2005, RFC 3986, DOI 10.17487/RFC3986, January 2005,
<https://www.rfc-editor.org/info/rfc3986>. <https://www.rfc-editor.org/info/rfc3986>.
[RFC7049] Bormann, C. and P. Hoffman, "Concise Binary Object [RFC3987] Duerst, M. and M. Suignard, "Internationalized Resource
Representation (CBOR)", RFC 7049, DOI 10.17487/RFC7049, Identifiers (IRIs)", RFC 3987, DOI 10.17487/RFC3987,
October 2013, <https://www.rfc-editor.org/info/rfc7049>. January 2005, <https://www.rfc-editor.org/info/rfc3987>.
[RFC7049bis]
Bormann, C. and P. Hoffman, "Concise Binary Object
Representation (CBOR)", Work in Progress, Internet-Draft,
draft-ietf-cbor-7049bis-13, 8 March 2020,
<https://tools.ietf.org/html/draft-ietf-cbor-7049bis-13>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/info/rfc8174>. May 2017, <https://www.rfc-editor.org/info/rfc8174>.
[RFC8610] Birkholz, H., Vigano, C., and C. Bormann, "Concise Data [RFC8610] Birkholz, H., Vigano, C., and C. Bormann, "Concise Data
Definition Language (CDDL): A Notational Convention to Definition Language (CDDL): A Notational Convention to
Express Concise Binary Object Representation (CBOR) and Express Concise Binary Object Representation (CBOR) and
JSON Data Structures", RFC 8610, DOI 10.17487/RFC8610, JSON Data Structures", RFC 8610, DOI 10.17487/RFC8610,
June 2019, <https://www.rfc-editor.org/info/rfc8610>. June 2019, <https://www.rfc-editor.org/info/rfc8610>.
7.2. Informative References [Unicode] The Unicode Consortium, "The Unicode Standard, Version
12.1.0", ISBN 978-1-936213-25-2, May 2019,
<http://www.unicode.org/versions/Unicode12.1.0/>.
[RFC5952] Kawamura, S. and M. Kawashima, "A Recommendation for IPv6 9.2. Informative References
Address Text Representation", RFC 5952,
DOI 10.17487/RFC5952, August 2010,
<https://www.rfc-editor.org/info/rfc5952>.
[RFC7228] Bormann, C., Ersue, M., and A. Keranen, "Terminology for [RFC7228] Bormann, C., Ersue, M., and A. Keranen, "Terminology for
Constrained-Node Networks", RFC 7228, Constrained-Node Networks", RFC 7228,
DOI 10.17487/RFC7228, May 2014, DOI 10.17487/RFC7228, May 2014,
<https://www.rfc-editor.org/info/rfc7228>. <https://www.rfc-editor.org/info/rfc7228>.
[RFC7230] Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer [RFC7230] Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer
Protocol (HTTP/1.1): Message Syntax and Routing", Protocol (HTTP/1.1): Message Syntax and Routing",
RFC 7230, DOI 10.17487/RFC7230, June 2014, RFC 7230, DOI 10.17487/RFC7230, June 2014,
<https://www.rfc-editor.org/info/rfc7230>. <https://www.rfc-editor.org/info/rfc7230>.
skipping to change at line 714 skipping to change at line 602
[W3C.REC-html52-20171214] [W3C.REC-html52-20171214]
Faulkner, S., Eicholz, A., Leithead, T., Danilo, A., and Faulkner, S., Eicholz, A., Leithead, T., Danilo, A., and
S. Moon, "HTML 5.2", World Wide Web Consortium S. Moon, "HTML 5.2", World Wide Web Consortium
Recommendation REC-html52-20171214, 14 December 2017, Recommendation REC-html52-20171214, 14 December 2017,
<https://www.w3.org/TR/2017/REC-html52-20171214>. <https://www.w3.org/TR/2017/REC-html52-20171214>.
Appendix A. Change Log Appendix A. Change Log
This section is to be removed before publishing as an RFC. This section is to be removed before publishing as an RFC.
Changes from -01 to -02: Changes from -02 to -03:
* Changed the syntax of schemes to exclude upper case characters. * Expanded the set of supported schemes (#3).
* Specified creation, normalization and comparison (#9).
* Clarified the default value of the "path.type" option (#33).
* Removed the "append-relation" path type (#41).
* Renumbered the remaining path types.
* Renumbered the option numbers.
* Restructured the document.
* Minor editorial improvements. * Minor editorial improvements.
Changes from -01 to -02:
* Changed the syntax of schemes to exclude upper case characters
(#13).
* Minor editorial improvements (#34 #37).
Changes from -00 to -01: Changes from -00 to -01:
* None. * None.
Acknowledgements Acknowledgements
Thanks to Christian Amsuess, Ari Keranen, Jim Schaad, and Dave Thaler Thanks to Christian Amsuess, Carsten Bormann, Ari Keranen, Jim Schaad
for helpful comments and discussions that have shaped the document. and Dave Thaler for helpful comments and discussions that have shaped
the document.
Author's Address Author's Address
Klaus Hartke Klaus Hartke
Ericsson Ericsson
Torshamnsgatan 23 Torshamnsgatan 23
SE-16483 Stockholm SE-16483 Stockholm
Sweden Sweden
Email: klaus.hartke@ericsson.com Email: klaus.hartke@ericsson.com
 End of changes. 105 change blocks. 
499 lines changed or deleted 407 lines changed or added
