draft-ietf-cbor-cddl-08.txt   rfc8610.txt 
CBOR H. Birkholz Internet Engineering Task Force (IETF) H. Birkholz
Internet-Draft Fraunhofer SIT Request for Comments: 8610 Fraunhofer SIT
Intended status: Standards Track C. Vigano Category: Standards Track C. Vigano
Expires: September 25, 2019 Universitaet Bremen ISSN: 2070-1721 Universitaet Bremen
C. Bormann C. Bormann
Universitaet Bremen TZI Universitaet Bremen TZI
March 24, 2019 June 2019
Concise data definition language (CDDL): a notational convention to Concise Data Definition Language (CDDL): A Notational Convention
express CBOR and JSON data structures to Express Concise Binary Object Representation (CBOR)
draft-ietf-cbor-cddl-08 and JSON Data Structures
Abstract Abstract
This document proposes a notational convention to express CBOR data This document proposes a notational convention to express Concise
structures (RFC 7049, Concise Binary Object Representation). Its Binary Object Representation (CBOR) data structures (RFC 7049). Its
main goal is to provide an easy and unambiguous way to express main goal is to provide an easy and unambiguous way to express
structures for protocol messages and data formats that use CBOR or structures for protocol messages and data formats that use CBOR or
JSON. JSON.
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This is an Internet Standards Track document.
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months This document is a product of the Internet Engineering Task Force
and may be updated, replaced, or obsoleted by other documents at any (IETF). It represents the consensus of the IETF community. It has
time. It is inappropriate to use Internet-Drafts as reference received public review and has been approved for publication by the
material or to cite them other than as "work in progress." Internet Engineering Steering Group (IESG). Further information on
Internet Standards is available in Section 2 of RFC 7841.
This Internet-Draft will expire on September 25, 2019. Information about the current status of this document, any errata,
and how to provide feedback on it may be obtained at
https://www.rfc-editor.org/info/rfc8610.
Copyright Notice Copyright Notice
Copyright (c) 2019 IETF Trust and the persons identified as the Copyright (c) 2019 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of (https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Introduction ....................................................4
1.1. Requirements notation . . . . . . . . . . . . . . . . . . 4 1.1. Requirements Notation ......................................5
1.2. Terminology . . . . . . . . . . . . . . . . . . . . . . . 4 1.2. Terminology ................................................5
2. The Style of Data Structure Specification . . . . . . . . . . 4 2. The Style of Data Structure Specification .......................5
2.1. Groups and Composition in CDDL . . . . . . . . . . . . . 6 2.1. Groups and Composition in CDDL .............................7
2.1.1. Usage . . . . . . . . . . . . . . . . . . . . . . . . 9 2.1.1. Usage ..............................................10
2.1.2. Syntax . . . . . . . . . . . . . . . . . . . . . . . 9 2.1.2. Syntax .............................................10
2.2. Types . . . . . . . . . . . . . . . . . . . . . . . . . . 9 2.2. Types .....................................................11
2.2.1. Values . . . . . . . . . . . . . . . . . . . . . . . 10 2.2.1. Values .............................................11
2.2.2. Choices . . . . . . . . . . . . . . . . . . . . . . . 10 2.2.2. Choices ............................................11
2.2.3. Representation Types . . . . . . . . . . . . . . . . 12 2.2.3. Representation Types ...............................13
2.2.4. Root type . . . . . . . . . . . . . . . . . . . . . . 13 2.2.4. Root Type ..........................................14
3. Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 3. Syntax .........................................................15
3.1. General conventions . . . . . . . . . . . . . . . . . . . 13 3.1. General Conventions .......................................15
3.2. Occurrence . . . . . . . . . . . . . . . . . . . . . . . 15 3.2. Occurrence ................................................16
3.3. Predefined names for types . . . . . . . . . . . . . . . 16 3.3. Predefined Names for Types ................................17
3.4. Arrays . . . . . . . . . . . . . . . . . . . . . . . . . 17 3.4. Arrays ....................................................18
3.5. Maps . . . . . . . . . . . . . . . . . . . . . . . . . . 17 3.5. Maps ......................................................19
3.5.1. Structs . . . . . . . . . . . . . . . . . . . . . . . 18 3.5.1. Structs ............................................19
3.5.2. Tables . . . . . . . . . . . . . . . . . . . . . . . 20 3.5.2. Tables .............................................22
3.5.3. Non-deterministic order . . . . . . . . . . . . . . . 21 3.5.3. Non-deterministic Order ............................23
3.5.4. Cuts in Maps . . . . . . . . . . . . . . . . . . . . 22 3.5.4. Cuts in Maps .......................................24
3.6. Tags . . . . . . . . . . . . . . . . . . . . . . . . . . 23 3.6. Tags ......................................................25
3.7. Unwrapping . . . . . . . . . . . . . . . . . . . . . . . 24 3.7. Unwrapping ................................................26
3.8. Controls . . . . . . . . . . . . . . . . . . . . . . . . 25 3.8. Controls ..................................................27
3.8.1. Control operator .size . . . . . . . . . . . . . . . 25 3.8.1. Control Operator .size .............................27
3.8.2. Control operator .bits . . . . . . . . . . . . . . . 26 3.8.2. Control Operator .bits .............................28
3.8.3. Control operator .regexp . . . . . . . . . . . . . . 26 3.8.3. Control Operator .regexp ...........................29
3.8.4. Control operators .cbor and .cborseq . . . . . . . . 28 3.8.4. Control Operators .cbor and .cborseq ...............30
3.8.5. Control operators .within and .and . . . . . . . . . 28 3.8.5. Control Operators .within and .and .................30
3.8.6. Control operators .lt, .le, .gt, .ge, .eq, .ne, and 3.8.6. Control Operators .lt, .le, .gt, .ge, .eq,
.default . . . . . . . . . . . . . . . . . . . . . . 29 .ne, and .default ..................................31
3.9. Socket/Plug . . . . . . . . . . . . . . . . . . . . . . . 30 3.9. Socket/Plug ...............................................32
3.10. Generics . . . . . . . . . . . . . . . . . . . . . . . . 31 3.10. Generics .................................................33
3.11. Operator Precedence . . . . . . . . . . . . . . . . . . . 32 3.11. Operator Precedence ......................................34
4. Making Use of CDDL . . . . . . . . . . . . . . . . . . . . . 33 4. Making Use of CDDL .............................................36
4.1. As a guide to a human user . . . . . . . . . . . . . . . 33 4.1. As a Guide for a Human User ...............................36
4.2. For automated checking of CBOR data structure . . . . . . 34 4.2. For Automated Checking of CBOR Data Structures ............36
4.3. For data analysis tools . . . . . . . . . . . . . . . . . 34 4.3. For Data Analysis Tools ...................................37
5. Security considerations . . . . . . . . . . . . . . . . . . . 34 5. Security Considerations ........................................37
6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 35 6. IANA Considerations ............................................38
6.1. CDDL control operator registry . . . . . . . . . . . . . 35 6.1. CDDL Control Operators Registry ...........................38
7. References . . . . . . . . . . . . . . . . . . . . . . . . . 36 7. References .....................................................40
7.1. Normative References . . . . . . . . . . . . . . . . . . 36 7.1. Normative References ......................................40
7.2. Informative References . . . . . . . . . . . . . . . . . 37 7.2. Informative References ....................................41
Appendix A. Parsing Expression Grammars (PEG) . . . . . . . . . 39 Appendix A. Parsing Expression Grammars (PEGs) ....................43
Appendix B. ABNF grammar . . . . . . . . . . . . . . . . . . . . 41 Appendix B. ABNF Grammar ..........................................45
Appendix C. Matching rules . . . . . . . . . . . . . . . . . . . 43 Appendix C. Matching Rules ........................................47
Appendix D. Standard Prelude . . . . . . . . . . . . . . . . . . 47 Appendix D. Standard Prelude ......................................52
Appendix E. Use with JSON . . . . . . . . . . . . . . . . . . . 49 Appendix E. Use with JSON .........................................53
Appendix F. A CDDL tool . . . . . . . . . . . . . . . . . . . . 51 Appendix F. A CDDL Tool ...........................................56
Appendix G. Extended Diagnostic Notation . . . . . . . . . . . . 52 Appendix G. Extended Diagnostic Notation ..........................56
G.1. White space in byte string notation . . . . . . . . . . . 52 G.1. Whitespace in Byte String Notation .........................57
G.2. Text in byte string notation . . . . . . . . . . . . . . 52 G.2. Text in Byte String Notation ...............................57
G.3. Embedded CBOR and CBOR sequences in byte strings . . . . 53 G.3. Embedded CBOR and CBOR Sequences in Byte Strings ...........57
G.4. Concatenated Strings . . . . . . . . . . . . . . . . . . 53 G.4. Concatenated Strings .......................................58
G.5. Hexadecimal, octal, and binary numbers . . . . . . . . . 54 G.5. Hexadecimal, Octal, and Binary Numbers .....................59
G.6. Comments . . . . . . . . . . . . . . . . . . . . . . . . 54 G.6. Comments ...................................................59
Appendix H. Examples . . . . . . . . . . . . . . . . . . . . . . 55 Appendix H. Examples ..............................................60
H.1. RFC 7071 . . . . . . . . . . . . . . . . . . . . . . . . 55 Acknowledgements ..................................................63
H.2. Examples from JSON Content Rules . . . . . . . . . . . . 58 Contributors ......................................................63
Contributors . . . . . . . . . . . . . . . . . . . . . . . . . . 61 Authors' Addresses ................................................64
Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . 61
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 61
1. Introduction 1. Introduction
In this document, a notational convention to express CBOR [RFC7049] In this document, a notational convention to express Concise Binary
data structures is defined. Object Representation (CBOR) data structures [RFC7049] is defined.
The main goal for the convention is to provide a unified notation The main goal for the convention is to provide a unified notation
that can be used when defining protocols that use CBOR. We term the that can be used when defining protocols that use CBOR. We term the
convention "Concise data definition language", or CDDL. convention "Concise Data Definition Language", or CDDL.
The CBOR notational convention has the following goals: The CBOR notational convention has the following goals:
(G1) Provide an unambiguous description of the overall structure of (G1) Provide an unambiguous description of the overall structure of
a CBOR data item. a CBOR data item.
(G2) Be flexible in expressing the multiple ways in which data can (G2) Be flexible in expressing the multiple ways in which data can
be represented in the CBOR data format. be represented in the CBOR data format.
(G3) Be able to express common CBOR datatypes and structures. (G3) Be able to express common CBOR datatypes and structures.
(G4) Provide a single format that is both readable and editable for (G4) Provide a single format that is both readable and editable for
humans and processable by machine. humans and processable by a machine.
(G5) Enable automatic checking of CBOR data items for data format (G5) Enable automatic checking of CBOR data items for data format
compliance. compliance.
(G6) Enable extraction of specific elements from CBOR data for (G6) Enable extraction of specific elements from CBOR data for
further processing. further processing.
Not an original goal per se, but a convenient side effect of the JSON Not an original goal per se, but a convenient side effect of the JSON
generic data model being a subset of the CBOR generic data model, is generic data model being a subset of the CBOR generic data model, is
the fact that CDDL can also be used for describing JSON data the fact that CDDL can also be used for describing JSON data
structures (see Appendix E). structures (see Appendix E).
This document has the following structure: This document has the following structure:
The syntax of CDDL is defined in Section 3. Examples of CDDL and The syntax of CDDL is defined in Section 3. Examples of CDDL and a
related CBOR data items ("instances", which all happen to be in JSON related CBOR data item ("instance"), some of which use the JSON form,
form) are given in Appendix H. Section 4 discusses usage of CDDL. are described in Appendix H. Section 4 discusses usage of CDDL.
Examples are provided early in the text to better illustrate concept Examples are provided throughout the text to better illustrate
definitions. A formal definition of CDDL using ABNF grammar is concept definitions. A formal definition of CDDL using ABNF grammar
provided in Appendix B. Finally, a _prelude_ of standard CDDL [RFC5234] is provided in Appendix B. Finally, a _prelude_ of
definitions that is automatically prepended to and thus available in standard CDDL definitions that is automatically prepended to, and
every CBOR specification is listed in Appendix D. thus available in, every CDDL specification is listed in Appendix D.
1.1. Requirements notation 1.1. Requirements Notation
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in "OPTIONAL" in this document are to be interpreted as described in
BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here. capitals, as shown here.
1.2. Terminology 1.2. Terminology
New terms are introduced in _cursive_, which is rendered in plain New terms are introduced in _cursive_, which is rendered in plain
text as the new term surrounded by underscores. CDDL text in the text as the new term surrounded by underscores. CDDL text in the
running text is in "typewriter", which is rendered in plain text as running text is in "typewriter", which is rendered in plain text as
the CDDL text in double quotes (double quotes are also used in the the CDDL text in double quotes (double quotes are also used in the
usual English sense; the reader is expected to disambiguate this by usual English sense; the reader is expected to disambiguate this by
context). context).
In this specification, the term "byte" is used in its now customary In this specification, the term "byte" is used in its now-customary
sense as a synonym for "octet". sense as a synonym for "octet".
2. The Style of Data Structure Specification 2. The Style of Data Structure Specification
CDDL focuses on styles of specification that are in use in the CDDL focuses on styles of specification that are in use in the
community employing the data model as pioneered by JSON and now community employing the data model as pioneered by JSON and now
refined in CBOR. refined in CBOR.
There are a number of more or less atomic elements of a CBOR data There are a number of more or less atomic elements of a CBOR data
model, such as numbers, simple values (false, true, nil), text and model, such as numbers, simple values (false, true, nil), text
byte strings; CDDL does not focus on specifying their structure. strings, and byte strings; CDDL does not focus on specifying their
CDDL of course also allows adding a CBOR tag to a data item. structure. CDDL of course also allows adding a CBOR tag to a
data item.
Beyond those atomic elements, further components of a data structure Beyond those atomic elements, further components of a data structure
definition language are the data types used for composition: arrays definition language are the datatypes used for composition: arrays
and maps in CBOR (called arrays and objects in JSON). While these and maps in CBOR (called "arrays" and "objects" in JSON). While
are only two representation formats, they are used to specify four these are only two representation formats, they are used to specify
loosely distinguishable styles of composition: four loosely distinguishable styles of composition:
o A _vector_, an array of elements that are mostly of the same o A _vector_: an array of elements that are mostly of the same
semantics. The set of signatures associated with a signed data semantics. The set of signatures associated with a signed data
item is a typical application of a vector. item is a typical application of a vector.
o A _record_, an array the elements of which have different, o A _record_: an array the elements of which have different,
positionally defined semantics, as detailed in the data structure positionally defined semantics, as detailed in the data structure
definition. A 2D point, specified as an array of an x coordinate definition. A 2D point, specified as an array of an x coordinate
(which comes first) and a y coordinate (coming second) is an (which comes first) and a y coordinate (coming second), is an
example of a record, as is the pair of exponent (first) and example of a record, as is the pair of exponent (first) and
mantissa (second) in a CBOR decimal fraction. mantissa (second) in a CBOR decimal fraction.
o A _table_, a map from a domain of map keys to a domain of map o A _table_: a map from a domain of map keys to a domain of map
values, that are mostly of the same semantics. A set of language values, that are mostly of the same semantics. A set of language
tags, each mapped to a text string translated to that specific tags, each mapped to a text string translated to that specific
language, is an example of a table. The key domain is usually not language, is an example of a table. The key domain is usually not
limited to a specific set by the specification, but open for the limited to a specific set by the specification but is open for the
application, e.g., in a table mapping IP addresses to MAC application, e.g., in a table mapping IP addresses to Media Access
addresses, the specification does not attempt to foresee all Control (MAC) addresses, the specification does not attempt to
possible IP addresses. In a language such as JavaScript, a "Map" foresee all possible IP addresses. In a language such as
(as opposed to a plain "Object") would often be employed to JavaScript, a "Map" (as opposed to a plain "Object") would often
achieve the generality of the key domain. be employed to achieve the generality of the key domain.
o A _struct_, a map from a domain of map keys as defined by the o A _struct_: a map from a domain of map keys as defined by the
specification to a domain of map values the semantics of each of specification to a domain of map values the semantics of each of
which is bound to a specific map key. This is what many people which is bound to a specific map key. This is what many people
have in mind when they think about JSON objects; CBOR adds the have in mind when they think about JSON objects; CBOR adds the
ability to use map keys that are not just text strings. Structs ability to use map keys that are not just text strings. Structs
can be used to solve similar problems as records; the use of can be used to solve problems similar to those records are used
explicit map keys facilitates optionality and extensibility. for; the use of explicit map keys facilitates optionality and
extensibility.
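The four styles of composition listed above can be illustrated with plain Python values standing in for CBOR arrays and maps. This is only a sketch: the variable names and all concrete values are made up for illustration and do not come from the specification.

```python
# Illustrative instances only; every value below is invented for this sketch.

# Vector: an array whose elements mostly share the same semantics.
signatures = [b"\x01\x02", b"\x03\x04", b"\x05\x06"]

# Record: an array with positionally defined semantics (x first, y second).
point = [3, 4]
x, y = point

# Table: a map with an open key domain and values of uniform semantics.
greetings = {"en": "hello", "de": "hallo", "fr": "bonjour"}

# Struct: a map whose keys are fixed by the specification, each key
# carrying its own meaning.
person = {"age": 42, "name": "Alice", "employer": "Example Corp"}
```

Vectors and records share one representation (the array), as do tables and structs (the map); only the intended semantics differ.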
Two important concepts provide the foundation for CDDL: Two important concepts provide the foundation for CDDL:
1. Instead of defining all four types of composition in CDDL 1. Instead of defining all four types of composition in CDDL
separately, or even defining one kind for arrays (vectors and separately, or even defining one kind for arrays (vectors and
records) and one kind for maps (tables and structs), there is records) and one kind for maps (tables and structs), there is
only one kind of composition in CDDL: the _group_ (Section 2.1). only one kind of composition in CDDL: the _group_ (Section 2.1).
2. The other important concept is that of a _type_. The entire CDDL 2. The other important concept is that of a _type_. The entire CDDL
specification defines a type (the one defined by its first specification defines a type (the one defined by its first
_rule_), which formally is the set of CBOR data items that are _rule_), which formally is the set of CBOR data items that are
acceptable as "instances" for this specification. CDDL acceptable as "instances" for this specification. CDDL
predefines a number of basic types such as "uint" (unsigned predefines a number of basic types such as "uint" (unsigned
integer) or "tstr" (text string), often making use of a simple integer) or "tstr" (text string), often making use of a simple
formal notation for CBOR data items. Each value that can be formal notation for CBOR data items. Each value that can be
expressed as a CBOR data item also is a type in its own right, expressed as a CBOR data item is also a type in its own right,
e.g. "1". A type can be built as a _choice_ of other types, e.g., "1". A type can be built as a _choice_ of other types,
e.g., an "int" is either a "uint" or a "nint" (negative integer). e.g., an "int" is either a "uint" or a "nint" (negative integer).
Finally, a type can be built as an array or a map from a group. Finally, a type can be built as an array or a map from a group.
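One way to picture "a type is a set of acceptable data items" is to model each type as a membership predicate. The following is a minimal sketch under that assumption; the function names are hypothetical, Python values stand in for CBOR data items, and the integer range limits of CBOR are ignored.

```python
# Each type is modelled as a predicate over Python values that stand in
# for CBOR data items. All names here are hypothetical.

def uint(item):
    # bool is a subclass of int in Python, so it is excluded explicitly.
    return isinstance(item, int) and not isinstance(item, bool) and item >= 0

def nint(item):
    return isinstance(item, int) and not isinstance(item, bool) and item < 0

def tstr(item):
    return isinstance(item, str)

def value(expected):
    """A single value used as a type: it matches only that value."""
    return lambda item: type(item) is type(expected) and item == expected

def choice(*types):
    """A choice matches if any of the alternative types matches."""
    return lambda item: any(t(item) for t in types)

int_type = choice(uint, nint)  # "int" is either a "uint" or a "nint"
one = value(1)                 # the value "1" as a type in its own right
```

In this model, a choice is simply the union of the sets described by its alternatives.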
The rest of this section introduces a number of basic concepts of The rest of this section introduces a number of basic concepts of
CDDL, and Section 3 defines additional syntax. Appendix C gives a CDDL, and Section 3 defines additional syntax. Appendix C gives a
concise summary of the semantics of CDDL. concise summary of the semantics of CDDL.
2.1. Groups and Composition in CDDL 2.1. Groups and Composition in CDDL
CDDL Groups are lists of group _entries_, each of which can be a CDDL groups are lists of group _entries_, each of which can be a
name/value pair or a more complex group expression (which then in name/value pair or a more complex group expression (which then in
turn stands for a sequence of name/value pairs). A CDDL group is a turn stands for a sequence of name/value pairs). A CDDL group is a
production in a grammar that matches certain sequences of name/value production in a grammar that matches certain sequences of name/value
pairs but not others. The grammar is based on the concepts of pairs but not others. The grammar is based on the concepts of
Parsing Expression Grammars (see Appendix A). Parsing Expression Grammars (PEGs) (see Appendix A).
In an array context, only the value of the name/value pair is In an array context, only the value of the name/value pair is
represented; the name is annotation only (and can be left off from represented; the name is annotation only (and can be left off from
the group specification if not needed). In a map context, the names the group specification if not needed). In a map context, the names
become the map keys ("member keys"). become the map keys ("member keys").
In an array context, the actual sequence of elements in the group is In an array context, the actual sequence of elements in the group is
important, as that sequence is the information that allows important, as that sequence is the information that allows
associating actual array elements with entries in the group. In a associating actual array elements with entries in the group. In a
map context, the sequence of entries in a group is not relevant (but map context, the sequence of entries in a group is not relevant (but
skipping to change at page 7, line 11 skipping to change at page 7, line 42
is not covered by the group. is not covered by the group.
A simple example of using a group directly in a map definition is: A simple example of using a group directly in a map definition is:
person = { person = {
age: int, age: int,
name: tstr, name: tstr,
employer: tstr, employer: tstr,
} }
Figure 1: Using a group directly in a map Figure 1: Using a Group Directly in a Map
The three entries of the group are written between the curly braces The three entries of the group are written between the curly braces
that create the map: Here, "age", "name", and "employer" are the that create the map: here, "age", "name", and "employer" are the
names that turn into the map key text strings, and "int" and "tstr" names that turn into the map key text strings, and "int" and "tstr"
(text string) are the types of the map values under these keys. (text string) are the types of the map values under these keys.
A group by itself (without creating a map around it) can be placed in A group by itself (without creating a map around it) can be placed in
(round) parentheses, and given a name by using it in a rule: (round) parentheses and given a name by using it in a rule:
pii = ( pii = (
age: int, age: int,
name: tstr, name: tstr,
employer: tstr, employer: tstr,
) )
Figure 2: A basic group Figure 2: A Basic Group
This separate, named group definition allows us to rephrase Figure 1 This separate, named group definition allows us to rephrase
as: Figure 1 as:
person = { person = {
pii pii
} }
Figure 3: Using a group by name Figure 3: Using a Group by Name
Note that the (curly) braces signify the creation of a map; the Note that the (curly) braces signify the creation of a map; the
groups themselves are neutral as to whether they will be used in a groups themselves are neutral as to whether they will be used in a
map or an array. map or an array.
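The neutrality of a group can be sketched by applying one group definition in both contexts. The code below is an illustration only, assuming Python dicts and lists as stand-ins for CBOR maps and arrays; the helper names are hypothetical, and only required entries with text-string names are handled.

```python
def is_int(v):
    # bool is a subclass of int in Python, so it is excluded explicitly.
    return isinstance(v, int) and not isinstance(v, bool)

def is_tstr(v):
    return isinstance(v, str)

# The "pii" group as an ordered list of (name, type predicate) entries.
pii = [("age", is_int), ("name", is_tstr), ("employer", is_tstr)]

def matches_map(group, item):
    """Map context: names become the keys; entry order is irrelevant."""
    if not isinstance(item, dict) or set(item) != {name for name, _ in group}:
        return False
    return all(check(item[name]) for name, check in group)

def matches_array(group, item):
    """Array context: only the values appear, in the order of the group."""
    if not isinstance(item, list) or len(item) != len(group):
        return False
    return all(check(v) for (_, check), v in zip(group, item))
```

For example, `matches_map(pii, {"name": "Alice", "employer": "Example Corp", "age": 42})` and `matches_array(pii, [42, "Alice", "Example Corp"])` both succeed. Swapping the first two array elements makes the array match fail, because position is the only link between an element and its entry.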
As shown in Figure 1, the parentheses for groups are optional when As shown in Figure 1, the parentheses for groups are optional when
there is some other set of brackets present. Note that they can there is some other set of brackets present. Note that they can
still be used, leading to the not so realistic, but perfectly valid still be used, leading to this not-so-realistic, but perfectly valid,
example: example:
person = {( person = {(
age: int, age: int,
name: tstr, name: tstr,
employer: tstr, employer: tstr,
)} )}
Figure 4: Using a parenthesized group in a map Figure 4: Using a Parenthesized Group in a Map
Groups can be used to factor out common parts of structs, e.g., Groups can be used to factor out common parts of structs, e.g.,
instead of writing copy/paste style specifications such as in instead of writing specifications in copy/paste style, such as in
Figure 5, one can factor out the common subgroup, choose a name for Figure 5, one can factor out the common subgroup, choose a name for
it, and write only the specific parts into the individual maps it, and write only the specific parts into the individual maps
(Figure 6). (Figure 6).
person = { person = {
age: int, age: int,
name: tstr, name: tstr,
employer: tstr, employer: tstr,
} }
dog = { dog = {
age: int, age: int,
name: tstr, name: tstr,
leash-length: float, leash-length: float,
} }
Figure 5: Maps with copy/paste Figure 5: Maps with Copy/Paste
person = { person = {
identity, identity,
employer: tstr, employer: tstr,
} }
dog = { dog = {
identity, identity,
leash-length: float, leash-length: float,
} }
identity = ( identity = (
age: int, age: int,
name: tstr, name: tstr,
) )
Figure 6: Using a group for factorization Figure 6: Using a Group for Factorization
Note that the lists inside the braces in the above definitions Note that the lists inside the braces in the above definitions
constitute (anonymous) groups, while "identity" is a named group, constitute (anonymous) groups, while "identity" is a named group,
which can then be included as part of other groups (anonymous as in which can then be included as part of other groups (anonymous as in
the example, or themselves named). the example, or themselves named).
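The factoring shown in Figure 6 can be mimicked by splicing a shared list of entries into larger ones. This is a rough sketch, not part of CDDL itself: groups are modelled as Python lists of (name, predicate) pairs, and all helper names are hypothetical.

```python
def is_int(v):
    # bool is a subclass of int in Python, so it is excluded explicitly.
    return isinstance(v, int) and not isinstance(v, bool)

def is_tstr(v):
    return isinstance(v, str)

def is_float(v):
    return isinstance(v, float)

# Named group, defined once and reused below.
identity = [("age", is_int), ("name", is_tstr)]

# Anonymous groups: include "identity", then add the specific entries.
person = [*identity, ("employer", is_tstr)]
dog = [*identity, ("leash-length", is_float)]

def matches_map(group, item):
    """Map context: names become the keys; entry order is irrelevant."""
    if not isinstance(item, dict) or set(item) != {name for name, _ in group}:
        return False
    return all(check(item[name]) for name, check in group)
```

Including a named group behaves as if its entries had been written out in place, which is why the factored definitions accept the same instances as the copy/paste versions in Figure 5.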
2.1.1. Usage 2.1.1. Usage
Groups are the instrument used in composing data structures with Groups are the instrument used in composing data structures with
CDDL. It is a matter of style in defining those structures whether CDDL. It is a matter of style in defining those structures whether
to define groups (anonymously) right in their contexts or whether to to define groups (anonymously) right in their contexts or whether to
define them in a separate rule and to reference them with their define them in a separate rule and to reference them with their
respective name (possibly more than once). respective name (possibly more than once).
With this, one is allowed to define all small parts of their data With this, one is allowed to define all small parts of their data
structures and compose bigger protocol units with those or to have structures and compose bigger protocol data units with those or to
only one big protocol data unit that has all definitions ad hoc where have only one big protocol data unit that has all definitions ad hoc
needed. where needed.
2.1.2. Syntax 2.1.2. Syntax
The composition syntax is intended to be concise and easy to read: The composition syntax is intended to be concise and easy to read:
o The start and end of a group can be marked by '(' and ')' o The start and end of a group can be marked by "(" and ")".
o Definitions of entries inside of a group are noted as follows: o Definitions of entries inside of a group are noted as follows:
_keytype => valuetype,_ (read "keytype maps to valuetype"). The _keytype => valuetype,_ (read "keytype maps to valuetype"). The
comma is actually optional (not just in the final entry), but it comma is actually optional (not just in the final entry), but it
is considered good style to set it. The double arrow can be is considered good style to set it. The double arrow can be
replaced by a colon in the common case of directly using a text replaced by a colon in the common case of directly using a text
string or integer literal as a key (see Section 3.5.1; this is string or integer literal as a key; see Section 3.5.1. This is
also the common way of naming elements of an array just for also the common way of naming elements of an array just for
documentation, see Section 3.4). documentation; see Section 3.4.
A basic entry consists of a _keytype_ and a _valuetype_, both of A basic entry consists of a _keytype_ and a _valuetype_, both of
which are types (Section 2.2); this entry matches any name-value pair which are types (Section 2.2); this entry matches any name/value pair
the name of which is in the keytype and the value of which is in the the name of which is in the keytype and the value of which is in the
valuetype. valuetype.
A group defined as a sequence of group entries matches any sequence A group defined as a sequence of group entries matches any sequence
of name-value pairs that is composed by concatenation in order of of name/value pairs that is composed by concatenation in order of
what the entries match. what the entries match.
A group definition can also contain choices between groups, see A group definition can also contain choices between groups; see
Section 2.2.2. Section 2.2.2.
2.2. Types 2.2. Types
2.2.1. Values 2.2.1. Values
Values such as numbers and strings can be used in place of a type. Values such as numbers and strings can be used in place of a type.
(For instance, this is a very common thing to do for a keytype, (For instance, this is a very common thing to do for a key type,
common enough that CDDL provides additional convenience syntax for common enough that CDDL provides additional convenience syntax
this.) for this.)
The value notation is based on the C language, but does not offer all The value notation is based on the C language, but does not offer all
the syntactic variations (see Appendix B for details). The value the syntactic variations (see Appendix B for details). The value
notation for numbers inherits from C the distinction between integer notation for numbers inherits from C the distinction between integer
values (no fractional part or exponent given -- NR1 [ISO6093]) and values (no fractional part or exponent given -- NR1 [ISO6093];
floating point values (where a fractional part and/or an exponent is "NR" stands for "numerical representation") and floating-point values
present -- NR2 or NR3), so the type "1" does not include any floating (where a fractional part, an exponent, or both are present -- NR2 or
point numbers while the types "1e3" and "1.5" are both floating point NR3), so the type "1" does not include any floating-point numbers
numbers and do not include any integer numbers. while the types "1e3" and "1.5" are both floating-point numbers and
do not include any integer numbers.
2.2.2. Choices 2.2.2. Choices
Many places that allow a type also allow a choice between types, Many places that allow a type also allow a choice between types,
delimited by a "/" (slash). The entire choice construct can be put delimited by a "/" (slash). The entire choice construct can be put
into parentheses if this is required to make the construction into parentheses if this is required to make the construction
unambiguous (please see Appendix B for the details). unambiguous (please see Appendix B for details of the CDDL grammar).
Choices of values can be used to express enumerations: Choices of values can be used to express enumerations:
attire = "bow tie" / "necktie" / "Internet attire" attire = "bow tie" / "necktie" / "Internet attire"
protocol = 6 / 17 protocol = 6 / 17
Similarly as for types, CDDL also allows choices between groups, Analogous to types, CDDL also allows choices between groups,
delimited by a "//" (double slash). Note that the "//" operator delimited by a "//" (double slash). Note that the "//" operator
binds much more weakly than the other CDDL operators, so each line binds much more weakly than the other CDDL operators, so each line
within "delivery" in the following example is its own alternative in within "delivery" in the following example is its own alternative in
the group choice: the group choice:
address = { delivery } address = { delivery }
delivery = ( delivery = (
street: tstr, ? number: uint, city // street: tstr, ? number: uint, city //
po-box: uint, city // po-box: uint, city //
per-pickup: true ) per-pickup: true )
city = ( city = (
name: tstr, zip-code: uint name: tstr, zip-code: uint
) )
A group choice matches the union of the sets of name-value pair A group choice matches the union of the sets of name/value pair
sequences that the alternatives in the choice can. sequences that the alternatives in the choice can.
Both for type choices and for group choices, additional alternatives For both type choices and group choices, additional alternatives can
can be added to a rule later in separate rules by using "/=" and be added to a rule later in separate rules by using "/=" and "//=",
"//=", respectively, instead of "=": respectively, instead of "=":
attire /= "swimwear" attire /= "swimwear"
delivery //= ( delivery //= (
lat: float, long: float, drone-type: tstr lat: float, long: float, drone-type: tstr
) )
It is not an error if a name is first used with a "/=" or "//=" It is not an error if a name is first used with a "/=" or "//="
(there is no need to "create it" with "="). (there is no need to "create it" with "=").
2.2.2.1. Ranges 2.2.2.1. Ranges
Instead of naming all the values that make up a choice, CDDL allows Instead of naming all the values that make up a choice, CDDL allows
building a _range_ out of two values that are in an ordering building a _range_ out of two values that are in an ordering
relationship: A lower bound (first value) and an upper bound (second relationship: a lower bound (first value) and an upper bound (second
value). A range can be inclusive of both bounds given (denoted by value). A range can be inclusive of both bounds given (denoted by
joining two values by ".."), or include the lower bound and exclude joining two values by ".."), or it can include the lower bound and
the upper bound (denoted by instead using "..."). If the lower bound exclude the upper bound (denoted by instead using "..."). If the
exceeds the upper bound, the resulting type is the empty set (this lower bound exceeds the upper bound, the resulting type is the empty
behavior can be desirable when generics, Section 3.10, are being set (this behavior can be desirable when generics (Section 3.10) are
used). being used).
device-address = byte device-address = byte
max-byte = 255 max-byte = 255
byte = 0..max-byte ; inclusive range byte = 0..max-byte ; inclusive range
first-non-byte = 256 first-non-byte = 256
byte1 = 0...first-non-byte ; byte1 is equivalent to byte byte1 = 0...first-non-byte ; byte1 is equivalent to byte
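The two range operators can be read as plain interval checks. The following Python sketch is illustrative only: the helper name in_range is hypothetical and not part of CDDL or of any CDDL tool, and it models integer ranges only.

```python
def in_range(value, lower, upper, inclusive=True):
    """Model CDDL 'lower..upper' (inclusive) or 'lower...upper' (upper bound excluded)."""
    if inclusive:
        return lower <= value <= upper
    return lower <= value < upper


MAX_BYTE = 255
FIRST_NON_BYTE = 256

# byte = 0..max-byte and byte1 = 0...first-non-byte accept the same integers.
same = all(
    in_range(v, 0, MAX_BYTE) == in_range(v, 0, FIRST_NON_BYTE, inclusive=False)
    for v in range(-2, 300)
)

# A lower bound that exceeds the upper bound yields the empty set.
empty = not any(in_range(v, 10, 0) for v in range(-20, 20))
```

Under these assumptions, both "same" and "empty" evaluate to True, which mirrors the statements that byte1 is equivalent to byte and that a reversed range matches nothing.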
CDDL currently only allows ranges between integers (matching integer CDDL currently only allows ranges between integers (matching integer
values) or between floating point values (matching floating point values) or between floating-point values (matching floating-point
values). If both are needed in a type, a type choice between the two values). If both are needed in a type, a type choice between the two
kinds of ranges can be (clumsily) used: kinds of ranges can be (clumsily) used:
int-range = 0..10 ; only integers match int-range = 0..10 ; only integers match
float-range = 0.0..10.0 ; only floats match float-range = 0.0..10.0 ; only floats match
BAD-range1 = 0..10.0 ; NOT DEFINED BAD-range1 = 0..10.0 ; NOT DEFINED
BAD-range2 = 0.0..10 ; NOT DEFINED BAD-range2 = 0.0..10 ; NOT DEFINED
numeric-range = int-range / float-range numeric-range = int-range / float-range
(See also the control operators .lt/.ge and .le/.gt in (See also the control operators .lt/.ge and .le/.gt in
Section 3.8.6.) Section 3.8.6.)
Note that the dot is a valid name continuation character in CDDL, so Note that the dot is a valid name continuation character in CDDL, so
min..max min..max
is not a range expression but a single name. When using a name as is not a range expression but a single name. When using a name as
the left hand side of a range operator, use spacing as in the left-hand side of a range operator, use spacing as in
min .. max min .. max
to separate off the range operator. to separate off the range operator.
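Why "min..max" is read as one name can be checked with a small Python sketch. The regular expression below is an assumed transcription of the name rule summarized in Section 3.1 (Appendix B remains authoritative), and is_single_name is a hypothetical helper.

```python
import re

# Assumed transcription of the name rule: a name starts with a letter,
# "@", "_", or "$"; "-" and "." may appear inside a name but not at its end.
NAME = re.compile(r"[A-Za-z@_$](?:[-.]*[A-Za-z0-9@_$])*")


def is_single_name(text):
    """Return True if the whole text is one CDDL name under the assumed rule."""
    return NAME.fullmatch(text) is not None


unspaced = is_single_name("min..max")    # dots are name continuation characters
spaced = is_single_name("min .. max")    # whitespace ends the name before ".."
```

With this rule, the unspaced form is consumed entirely as a name, while the spaced form is not, which leaves the ".." free to be recognized as the range operator.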
2.2.2.2. Turning a group into a choice 2.2.2.2. Turning a Group into a Choice
Some choices are built out of large numbers of values, often Some choices are built out of large numbers of values, often
integers, each of which is best given a semantic name in the integers, each of which is best given a semantic name in the
specification. Instead of naming each of these integers and then specification. Instead of naming each of these integers and then
accumulating these into a choice, CDDL allows building a choice from accumulating them into a choice, CDDL allows building a choice from a
a group by prefixing it with a "&" character: group by prefixing it with an "&" character:
terminal-color = &basecolors terminal-color = &basecolors
basecolors = ( basecolors = (
black: 0, red: 1, green: 2, yellow: 3, black: 0, red: 1, green: 2, yellow: 3,
blue: 4, magenta: 5, cyan: 6, white: 7, blue: 4, magenta: 5, cyan: 6, white: 7,
) )
extended-color = &( extended-color = &(
basecolors, basecolors,
orange: 8, pink: 9, purple: 10, brown: 11, orange: 8, pink: 9, purple: 10, brown: 11,
) )
As with the use of groups in arrays (Section 3.4), the member names As with the use of groups in arrays (Section 3.4), the member names
have only documentary value (in particular, they might be used by a have only documentary value (in particular, they might be used by a
tool when displaying integers that are taken from that choice). tool when displaying integers that are taken from that choice).
2.2.3. Representation Types 2.2.3. Representation Types
CDDL allows the specification of a data item type by referring to the CDDL allows the specification of a data item type by referring to the
CBOR representation (major types and additional information, CBOR representation (specifically, to major types and additional
Section 2 of [RFC7049]). How this is used should be evident from the information; see Section 2 of [RFC7049]). How this is used should be
prelude (Appendix D): a hash mark ("#") optionally followed by a evident from the prelude (Appendix D): a hash mark ("#") optionally
number from 0 to 7 identifying the major type, which then can be followed by a number from 0 to 7 identifying the major type, which
followed by a dot and a number specifying the additional information. then can be followed by a dot and a number specifying the additional
This construction specifies the set of values that can be serialized information. This construction specifies the set of values that can
in CBOR (i.e., "any"), by the given major type if one is given, or by be serialized in CBOR (i.e., "any"), by the given major type if one
the given major type with the additional information if both are is given, or by the given major type with the additional information
given. Where a major type of 6 (Tag) is used, the type of the tagged if both are given. Where a major type of 6 (Tag) is used, the type
item can be specified by appending it in parentheses. of the tagged item can be specified by appending it in parentheses.
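The two numbers in the "#major.info" notation correspond to the layout of the initial byte of a CBOR data item (Section 2 of [RFC7049]): the high-order 3 bits carry the major type and the low-order 5 bits carry the additional information. A minimal Python sketch, with hypothetical helper names, shows where the numbers come from:

```python
def split_initial_byte(initial_byte):
    """Split a CBOR initial byte into (major type, additional information)."""
    major_type = initial_byte >> 5          # high-order 3 bits
    additional_info = initial_byte & 0x1F   # low-order 5 bits
    return major_type, additional_info


def representation_type(initial_byte):
    """Format the split result in the '#major.info' style used by CDDL."""
    major_type, additional_info = split_initial_byte(initial_byte)
    return f"#{major_type}.{additional_info}"


half_float = representation_type(0xF9)   # initial byte of a half-precision float
true_value = representation_type(0xF5)   # initial byte of the simple value true
```

This only illustrates the origin of the numbers; it is not a CBOR decoder, and it says nothing about how a matching value has to be serialized.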
Note that although this notation is based on the CBOR serialization, Note that although this notation is based on the CBOR serialization,
it is about a set of values at the data model level, e.g. "#7.25" it is about a set of values at the data model level, e.g., "#7.25"
specifies the set of values that can be represented as half-precision specifies the set of values that can be represented as half-precision
floats; it does not mandate that these values also do have to be floats; it does not mandate that these values also do have to be
serialized as half-precision floats: CDDL does not provide any serialized as half-precision floats: CDDL does not provide any
language means to restrict the choice of serialization variants. language means to restrict the choice of serialization variants.
This also enables the use of CDDL with JSON, which uses a This also enables the use of CDDL with JSON, which uses a
fundamentally different way of serializing (some of) the same values. fundamentally different way of serializing (some of) the same values.
It may be necessary to make use of representation types outside the It may be necessary to make use of representation types outside the
prelude, e.g., a specification could start by making use of an prelude, e.g., a specification could start by making use of an
existing tag in a more specific way, or define a new tag not defined existing tag in a more specific way or could define a new tag not
in the prelude: defined in the prelude:
my_breakfast = #6.55799(breakfast) ; cbor-any is too general! my_breakfast = #6.55799(breakfast) ; cbor-any is too general!
breakfast = cereal / porridge breakfast = cereal / porridge
cereal = #6.998(tstr) cereal = #6.998(tstr)
porridge = #6.999([liquid, solid]) porridge = #6.999([liquid, solid])
liquid = milk / water liquid = milk / water
milk = 0 milk = 0
water = 1 water = 1
solid = tstr solid = tstr
2.2.4. Root type 2.2.4. Root Type
There is no special syntax to identify the root of a CDDL data There is no special syntax to identify the root of a CDDL data
structure definition: that role is simply taken by the first rule structure definition: that role is simply taken by the first rule
defined in the file. defined in the file.
This is motivated by the usual top-down approach for defining data This is motivated by the usual top-down approach for defining data
structures, decomposing a big data structure unit into smaller parts; structures, decomposing a big data structure unit into smaller parts;
however, except for the root type, there is no need to strictly however, except for the root type, there is no need to strictly
follow this sequence. follow this sequence.
(Note that there is no way to use a group as a root - it must be a (Note that there is no way to use a group as a root -- it must be
type.) a type.)
3. Syntax 3. Syntax
In this section, the overall syntax of CDDL is shown, alongside some In this section, the overall syntax of CDDL is shown, alongside some
examples just illustrating syntax. (The definition will not attempt examples just illustrating syntax. (The definition does not attempt
to be overly formal; refer to Appendix B for the details.) to be overly formal; refer to Appendix B for details.)
3.1. General conventions 3.1. General Conventions
The basic syntax is inspired by ABNF [RFC5234], with The basic syntax is inspired by ABNF [RFC5234], with the following:
o rules, whether they define groups or types, are defined with a o Rules, whether they define groups or types, are defined with a
name, followed by an equals sign "=" and the actual definition name, followed by an equals sign "=" and the actual definition
according to the respective syntactic rules of that definition. according to the respective syntactic rules of that definition.
o A name can consist of any of the characters from the set {'A' to o A name can consist of any of the characters from the set {"A" to
'Z', 'a' to 'z', '0' to '9', '_', '-', '@', '.', '$'}, starting "Z", "a" to "z", "0" to "9", "_", "-", "@", ".", "$"}, starting
with an alphabetic character (including '@', '_', '$') and ending with an alphabetic character (including "@", "_", "$") and ending
in such a character or or a digit. in such a character or a digit.
* Names are case sensitive. * Names are case sensitive.
* It is preferred style to start a name with a lower case letter. * It is preferred style to start a name with a lowercase letter.
* The hyphen is preferred over the underscore (except in a * The hyphen is preferred over the underscore (except in a
"bareword" (Section 3.5.1), where the semantics may actually "bareword" (Section 3.5.1), where the semantics may actually
require an underscore). require an underscore).
* The period may be useful for larger specifications, to express * The period may be useful for larger specifications, to express
some module structure (as in "tcp.throughput" vs. some module structure (as in "tcp.throughput" vs.
"udp.throughput"). "udp.throughput").
* A number of names are predefined in the CDDL prelude, as listed * A number of names are predefined in the CDDL prelude, as listed
in Appendix D. in Appendix D.
* Rule names (types or groups) do not appear in the actual CBOR * Rule names (types or groups) do not appear in the actual CBOR
encoding, but names used as "barewords" in member keys do. encoding, but names used as "barewords" in member keys do.
o Comments are started by a ';' (semicolon) character and finish at o Comments are started by a ";" (semicolon) character and finish at
the end of a line (LF or CRLF). the end of a line (LF or CRLF).
o outside strings, whitespace (spaces, newlines, and comments) is o Except within strings, whitespace (spaces, newlines, and comments)
used to separate syntactic elements for readability (and to is used to separate syntactic elements for readability (and to
separate identifiers, range operators, or numbers that follow each separate identifiers, range operators, or numbers that follow each
other); it is otherwise completely optional. other); it is otherwise completely optional.
o Hexadecimal numbers are preceded by '0x' (without quotes, lower o Hexadecimal numbers are preceded by "0x" (without quotes) and are
case x), and are case insensitive. Similarly, binary numbers are case insensitive. Similarly, binary numbers are preceded by "0b".
preceded by '0b'.
o Text strings are enclosed by double quotation '"' characters. o Text strings are enclosed by double quotation '"' characters.
They follow the conventions for strings as defined in section 7 of They follow the conventions for strings as defined in Section 7 of
[RFC8259]. (ABNF users may want to note that there is no support [RFC8259]. (ABNF users may want to note that there is no support
in CDDL for the concept of case insensitivity in text strings; if in CDDL for the concept of case insensitivity in text strings; if
necessary, regular expressions can be used (Section 3.8.3).) necessary, regular expressions can be used (Section 3.8.3).)
o Byte strings are enclosed by single quotation "'" characters and o Byte strings are enclosed by single quotation "'" characters and
may be prefixed by "h" or "b64". If unprefixed, the string is may be prefixed by "h" or "b64". If unprefixed, the string is
interpreted as with a text string, except that single quotes must interpreted as with a text string, except that single quotes must
be escaped and that the UTF-8 bytes resulting are marked as a byte be escaped and that the resulting UTF-8 bytes are marked as a byte
string (major type 2). If prefixed as "h" or "b64", the string is string (major type 2). If prefixed as "h" or "b64", the string is
interpreted as a sequence of pairs of hex digits (base16, interpreted as a sequence of pairs of hex digits (base16; see
Section 8 of [RFC4648]) or a base64(url) string (Sections 4 or 5 Section 8 of [RFC4648]) or a base64(url) string (Section 4 or
of [RFC4648]), respectively (as with the diagnostic notation in Section 5 of [RFC4648]), respectively (as with the diagnostic
section 6 of [RFC7049]; cf. Appendix G.2); any white space present notation in Section 6 of [RFC7049]; cf. Appendix G.2); any
within the string (including comments) is ignored in the prefixed whitespace present within the string (including comments) is
case. ignored in the prefixed case.
o CDDL uses UTF-8 [RFC3629] for its encoding. Processing of CDDL o CDDL uses UTF-8 [RFC3629] for its encoding. Processing of CDDL
does not involve Unicode normalization processes. does not involve Unicode normalization processes.
Example: Example:
; This is a comment ; This is a comment
person = { g } person = { g }
g = ( g = (
"name": tstr, "name": tstr,
age: int, ; "age" is a bareword age: int, ; "age" is a bareword
) )
3.2. Occurrence 3.2. Occurrence
An optional _occurrence_ indicator can be given in front of a group An optional _occurrence_ indicator can be given in front of a group
entry. It is either one of the characters '?' (optional), '*' (zero entry. It is either (1) one of the characters "?" (optional), "*"
or more), or '+' (one or more), or is of the form n*m, where n and m (zero or more), or "+" (one or more) or (2) of the form n*m, where n
are optional unsigned integers and n is the lower limit (default 0) and m are optional unsigned integers and n is the lower limit
and m is the upper limit (default no limit) of occurrences. (default 0) and m is the upper limit (default no limit) of
occurrences.
If no occurrence indicator is specified, the group entry is to occur If no occurrence indicator is specified, the group entry is to occur
exactly once (as if 1*1 were specified). A group entry with an exactly once (as if 1*1 were specified). A group entry with an
occurrence indicator matches sequences of name-value pairs that are occurrence indicator matches sequences of name/value pairs that are
composed by concatenating a number of sequences that the basic group composed by concatenating a number of sequences that the basic group
entry matches, where the number needs to be allowed by the occurrence entry matches, where the number needs to be allowed by the occurrence
indicator. indicator.
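The occurrence rules above can be summarized as a pair of bounds. The Python sketch below is an illustration under simplifying assumptions: the function names are hypothetical, and only decimal digits are accepted for n and m.

```python
import re


def occurrence_bounds(indicator):
    """Return (lower, upper) for an occurrence indicator; upper None means no limit."""
    if indicator == "":
        return (1, 1)       # no indicator: exactly once, as if 1*1
    if indicator == "?":
        return (0, 1)       # optional
    if indicator == "+":
        return (1, None)    # one or more
    match = re.fullmatch(r"(\d*)\*(\d*)", indicator)  # n*m, both parts optional
    if match is None:
        raise ValueError(f"not an occurrence indicator: {indicator!r}")
    lower = int(match.group(1)) if match.group(1) else 0      # default 0
    upper = int(match.group(2)) if match.group(2) else None   # default no limit
    return (lower, upper)


def count_allowed(indicator, count):
    """Check whether a number of repetitions is allowed by the indicator."""
    lower, upper = occurrence_bounds(indicator)
    return count >= lower and (upper is None or count <= upper)
```

For instance, occurrence_bounds("*") gives (0, None) and occurrence_bounds("2*5") gives (2, 5); the plain "*" falls out of the n*m form with both limits omitted.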
Note that CDDL, outside any directives/annotations that could Note that CDDL, outside any directives/annotations that could
possibly be defined, does not make any prescription as to whether possibly be defined, does not make any prescription as to whether
arrays or maps use the definite length or indefinite length encoding. arrays or maps use definite-length or indefinite-length encoding.
I.e., there is no correlation between leaving the size of an array That is, there is no correlation between leaving the size of an array
"open" in the spec and the fact that it is then interchanged with "open" in the spec and the fact that it is then interchanged with
definite or indefinite length. definite or indefinite length.
Please also note that CDDL can describe flexibility that the data Please also note that CDDL can describe flexibility that the data
model of the target representation does not have. This is rather model of the target representation does not have. This is rather
obvious for JSON, but also is relevant for CBOR: obvious for JSON but is also relevant for CBOR:
apartment = {
kitchen: size,
* bedroom: size,
}
size = float ; in m2
apartment = {
kitchen: size,
* bedroom: size,
}
size = float ; in m2
The previous specification does not mean that CBOR is changed to The previous specification does not mean that CBOR is changed to
allow to use the key "bedroom" more than once. In other words, due allow using the key "bedroom" more than once. In other words, due to
to the restrictions imposed by the data model, the third line pretty the restrictions imposed by the data model, the third line pretty
much turns into: much turns into:
? bedroom: size, ? bedroom: size,
(Occurrence indicators beyond one still are useful in maps for groups (Occurrence indicators beyond one are still useful in maps for groups
that allow a variety of keys.) that allow a variety of keys.)
3.3. Predefined names for types 3.3. Predefined Names for Types
CDDL predefines a number of names. This subsection summarizes these CDDL predefines a number of names. This subsection summarizes these
names, but please see Appendix D for the exact definitions. names, but please see Appendix D for the exact definitions.
The following keywords for primitive datatypes are defined: The following keywords for primitive datatypes are defined:
"bool" Boolean value (major type 7, additional information 20 or "bool" Boolean value (major type 7, additional information 20
21). or 21).
"uint" An unsigned integer (major type 0). "uint" An unsigned integer (major type 0).
"nint" A negative integer (major type 1). "nint" A negative integer (major type 1).
"int" An unsigned integer or a negative integer. "int" An unsigned integer or a negative integer.
"float16" A number representable as an IEEE 754 half-precision float "float16" A number representable as a half-precision float [IEEE754]
(major type 7, additional information 25). (major type 7, additional information 25).
"float32" A number representable as an IEEE 754 single-precision "float32" A number representable as a single-precision float
float (major type 7, additional information 26). [IEEE754] (major type 7, additional information 26).
"float64" A number representable as an IEEE 754 double-precision "float64" A number representable as a double-precision float
float (major type 7, additional information 27). [IEEE754] (major type 7, additional information 27).
"float" One of float16, float32, or float64. "float" One of float16, float32, or float64.
"bstr" or "bytes" A byte string (major type 2). "bstr" or "bytes" A byte string (major type 2).
"tstr" or "text" Text string (major type 3) "tstr" or "text" Text string (major type 3).
(Note that there are no predefined names for arrays or maps; these (Note that there are no predefined names for arrays or maps; these
are defined with the syntax given below.) are defined with the syntax given below.)
In addition, a number of types are defined in the prelude that are In addition, a number of types are defined in the prelude that are
associated with CBOR tags, such as "tdate", "bigint", "regexp" etc. associated with CBOR tags, such as "tdate", "bigint", "regexp", etc.
3.4. Arrays 3.4. Arrays
Array definitions surround a group with square brackets. Array definitions surround a group with square brackets.
For each entry, an occurrence indicator as specified in Section 3.2 For each entry, an occurrence indicator as specified in Section 3.2
is permitted. is permitted.
For example: For example:
skipping to change at page 17, line 41 skipping to change at page 19, line 14
3.5. Maps 3.5. Maps
The syntax for specifying maps merits special attention, as well as a The syntax for specifying maps merits special attention, as well as a
number of optimizations and conveniences, as it is likely to be the number of optimizations and conveniences, as it is likely to be the
focal point of many specifications employing CDDL. While the syntax focal point of many specifications employing CDDL. While the syntax
does not strictly distinguish struct and table usage of maps, it does not strictly distinguish struct and table usage of maps, it
caters specifically to each of them. caters specifically to each of them.
But first, let's reiterate a feature of CBOR that it has inherited But first, let's reiterate a feature of CBOR that it has inherited
from JSON: The key/value pairs in CBOR maps have no fixed ordering. from JSON: the key/value pairs in CBOR maps have no fixed ordering.
(One could imagine situations where fixing the ordering may be of (One could imagine situations where fixing the ordering may be of
use. For example, a decoder could look for values related with use. For example, a decoder could look for values related with
integer keys 1, 3 and 7. If the order were fixed and the decoder integer keys 1, 3, and 7. If the order were fixed and the decoder
encounters the key 4 without having encountered key 3, it could encounters the key 4 without having encountered key 3, it could
conclude that key 3 is not available without doing more complicated conclude that key 3 is not available without doing more complicated
bookkeeping. Unfortunately, neither JSON nor CBOR support this, so bookkeeping. Unfortunately, neither JSON nor CBOR supports this, so
no attempt was made to support this in CDDL either.) no attempt was made to support this in CDDL either.)
3.5.1. Structs 3.5.1. Structs
The "struct" usage of maps is similar to the way JSON objects are The "struct" usage of maps is similar to the way JSON objects are
used in many JSON applications. used in many JSON applications.
A map is defined in the same way as defining an array (see A map is defined in the same way as that for defining an array (see
Section 3.4), except for using curly braces "{}" instead of square Section 3.4), except for using curly braces "{}" instead of square
brackets "[]". brackets "[]".
An occurrence indicator as specified in Section 3.2 is permitted for An occurrence indicator as specified in Section 3.2 is permitted for
each group entry. each group entry.
The following is an example of a record with a structure enbedded: The following is an example of a record with a structure embedded:
Geography = [ Geography = [
city : tstr, city : tstr,
gpsCoordinates : GpsCoordinates, gpsCoordinates : GpsCoordinates,
] ]
GpsCoordinates = { GpsCoordinates = {
longitude : uint, ; degrees, scaled by 10^7 longitude : uint, ; degrees, scaled by 10^7
latitude : uint, ; degreed, scaled by 10^7 latitude : uint, ; degrees, scaled by 10^7
} }
When encoding, the Geography record is encoded using a CBOR array When encoding, the Geography record is encoded using a CBOR array
with two members (the keys for the group entries are ignored), with two members (the keys for the group entries are ignored),
whereas the GpsCoordinates structure is encoded as a CBOR map with whereas the GpsCoordinates structure is encoded as a CBOR map with
two key/value pairs. two key/value pairs.
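At the data model level, an instance matching this example could look as follows. This Python sketch uses invented placeholder values and shows the JSON form only, since the standard library has no CBOR encoder.

```python
import json

# Placeholder values, invented for illustration; coordinates are degrees
# scaled by 10^7, as stated in the comments of the CDDL example.
gps_coordinates = {"longitude": 10000000, "latitude": 20000000}

# The Geography record is an array: the names "city" and "gpsCoordinates"
# from the group entries do not appear in the data.
geography = ["Example City", gps_coordinates]

as_json = json.dumps(geography)
```

The outer value is a two-element array, and only the inner map carries its keys ("longitude" and "latitude") in the encoded data.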
Types used in a structure can be defined in separate rules or just in Types used in a structure can be defined in separate rules or just in
place (potentially placed inside parentheses, such as for choices). place (potentially placed inside parentheses, such as for choices).
E.g.: For example:
located-samples = { located-samples = {
sample-point: int, sample-point: int,
samples: [+ float], samples: [+ float],
} }
where "located-samples" is the datatype to be used when referring to where "located-samples" is the datatype to be used when referring to
the struct, and "sample-point" and "samples" are the keys to be used. the struct, and "sample-point" and "samples" are the keys to be used.
This is actually a complete example: an identifier that is followed This is actually a complete example: an identifier that is followed
by a colon can be directly used as the text string for a member key by a colon can be directly used as the text string for a member key
(we speak of a "bareword" member key), as can a double-quoted string (we speak of a "bareword" member key), as can a double-quoted string
or a number. (When other types, in particular ones that contain more or a number. (When other types -- in particular, types that contain
than one value, are used as the types of keys, they are followed by a more than one value -- are used as the types of keys, they are
double arrow, see below.) followed by a double arrow; see below.)
If a text string key does not match the syntax for an identifier (or If a text string key does not match the syntax for an identifier (or
if the specifier just happens to prefer using double quotes), the if the specifier just happens to prefer using double quotes), the
text string syntax can also be used in the member key position, text string syntax can also be used in the member key position,
followed by a colon. The above example could therefore have been followed by a colon. The above example could therefore have been
written with quoted strings in the member key positions. written with quoted strings in the member key positions.
More generally, types specified in other ways than the cases More generally, types specified in ways other than those listed for
described above can be used in a keytype position by following them the cases described above can be used in a key-type position by
with a double arrow -- in particular, the double arrow is necessary following them with a double arrow -- in particular, the double arrow
if a type is named by an identifier (which, when followed by a colon, is necessary if a type is named by an identifier (which, when
would be interpreted as a "bareword" and turned into a text string). followed by a colon, would be interpreted as a "bareword" and turned
A literal text string also gives rise to a type (which contains a into a text string). A literal text string also gives rise to a type
single value only -- the given string), so another form for this (which contains a single value only -- the given string), so another
example is: form for this example is:
located-samples = { located-samples = {
"sample-point" => int, "sample-point" => int,
"samples" => [+ float], "samples" => [+ float],
} }
See Section 3.5.4 below for how the colon shortcut described here See Section 3.5.4 below for how the colon (":") shortcut described
also adds some implied semantics. here also adds some implied semantics.
A better way to demonstrate the double-arrow use may be: A better way to demonstrate the use of the double arrow may be:
located-samples = { located-samples = {
sample-point: int, sample-point: int,
samples: [+ float], samples: [+ float],
* equipment-type => equipment-tolerances, * equipment-type => equipment-tolerances,
} }
equipment-type = [name: tstr, manufacturer: tstr] equipment-type = [name: tstr, manufacturer: tstr]
equipment-tolerances = [+ [float, float]] equipment-tolerances = [+ [float, float]]
The example below defines a struct with optional entries: display The example below defines a struct with optional entries: display
skipping to change at page 20, line 28 skipping to change at page 22, line 20
NameComponents, NameComponents,
? age: uint, ? age: uint,
* tstr => any * tstr => any
} }
NameComponents = ( NameComponents = (
? firstName: tstr, ? firstName: tstr,
? familyName: tstr, ? familyName: tstr,
) )
Figure 7: Personal Data: Example for extensibility Figure 7: Personal Data: Example for Extensibility
The CDDL tool reported on in Appendix F generated as one acceptable The CDDL tool described in Appendix F generated the following as one
instance for this specification: acceptable instance for this specification:
{"familyName": "agust", "antiforeignism": "pretzel", {"familyName": "agust", "antiforeignism": "pretzel",
"springbuck": "illuminatingly", "exuviae": "ephemeris", "springbuck": "illuminatingly", "exuviae": "ephemeris",
"kilometrage": "frogfish"} "kilometrage": "frogfish"}
(See Section 3.9 for one way to explicitly identify an extension (See Section 3.9 for one way to explicitly identify an extension
point.) point.)
3.5.2. Tables 3.5.2. Tables
A table can be specified by defining a map with entries where the A table can be specified by defining a map with entries where the
keytype allows more than just a single value, e.g.: key type allows more than just a single value; for example:
square-roots = {* x => y} square-roots = {* x => y}
x = int x = int
y = float y = float
Here, the key in each key/value pair has datatype x (defined as int), Here, the key in each key/value pair has datatype x (defined as int),
and the value has datatype y (defined as float). and the value has datatype y (defined as float).
If the specification does not need to restrict one of x or y (i.e., If the specification does not need to restrict one of x or y (i.e.,
the application is free to choose per entry), it can be replaced by the application is free to choose per entry), it can be replaced by
the predefined name "any". the predefined name "any".
As another example, the following could be used as a conversion table As another example, the following could be used as a conversion table
converting from an integer or float to a string: converting from an integer or float to a string:
tostring = {* mynumber => tstr} tostring = {* mynumber => tstr}
mynumber = int / float mynumber = int / float
3.5.3. Non-deterministic order 3.5.3. Non-deterministic Order
While the way arrays are matched is fully determined by the Parsing While the way arrays are matched is fully determined by the PEG
Expression Grammar (PEG) formalism (see Appendix A), matching is more formalism (see Appendix A), matching is more complicated for maps, as
complicated for maps, as maps do not have an inherent order. For maps do not have an inherent order. For each candidate name/value
each candidate name/value pair that the PEG algorithm would try, a pair that the PEG algorithm would try, a matching member is picked
matching member is picked out of the entire map. For certain group out of the entire map. For certain group expressions, more than one
expressions, more than one member in the map may match. Most often, member in the map may match. Most often, this is inconsequential, as
this is inconsequential, as the group expression tends to consume all the group expression tends to consume all matches:
matches:
labeled-values = { labeled-values = {
? fritz: number, ? fritz: number,
* label => value * label => value
} }
label = text label = text
value = number value = number
Here, if any member with the key "fritz" is present, this will be Here, if any member with the key "fritz" is present, this will be
picked by the first entry of the group; all remaining text/number picked by the first entry of the group; all remaining text/number
member will be picked by the second entry (and if anything remains members will be picked by the second entry (and if anything remains
unpicked, the map does not match). unpicked, the map does not match).
However, it is possible to construct group expressions where what is However, it is possible to construct group expressions where what is
actually picked is indeterminate, and does matter: actually picked is indeterminate, but does matter:
do-not-do-this = { do-not-do-this = {
int => int, int => int,
int => 6, int => 6,
} }
When this expression is matched against "{3: 5, 4: 6}", the first When this expression is matched against "{3: 5, 4: 6}", the first
group entry might pick off the "3: 5", leaving "4: 6" for matching group entry might pick off the "3: 5", leaving "4: 6" for matching
the second one. Or it might pick off "4: 6", leaving nothing for the the second one. Or it might pick off "4: 6", leaving nothing for the
second entry. This pathological non-determinism is caused by second entry. This pathological non-determinism is caused by
specifying more general before more specific, and by having a general specifying "more general" before "more specific" and by having a
rule that only consumes a subset of the map key/value pairs that it general rule that only consumes a subset of the map key/value pairs
is able to match -- both tend not to occur in real-world that it is able to match -- both tend not to occur in real-world
specifications of maps. At the time of writing, CDDL tools cannot specifications of maps. At the time of writing, CDDL tools cannot
detect such cases automatically, and for the present version of the detect such cases automatically, and for the present version of the
CDDL specification, the specification writer is simply urged to not CDDL specification, the specification writer is simply urged to not
write pathologically non-deterministic specifications. write pathologically non-deterministic specifications.
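The non-determinism described above can be made concrete with a small Python sketch. It is not the matching algorithm of any CDDL tool; the names "is_int", "entries", and "match_with_picks" are hypothetical, and each group entry is modeled simply as a (key predicate, value predicate) pair that must pick exactly one member of the map.

```python
from itertools import permutations

def is_int(v):
    # bool is a subclass of int in Python; exclude it for clarity
    return isinstance(v, int) and not isinstance(v, bool)

# do-not-do-this = { int => int, int => 6 }
entries = [
    (is_int, is_int),             # int => int
    (is_int, lambda v: v == 6),   # int => 6
]

def match_with_picks(members, picks):
    """Entry i picks members[picks[i]]; succeed only if every pick
    matches and no member of the map is left unpicked."""
    if len(picks) != len(members):
        return False
    return all(key_ok(members[i][0]) and value_ok(members[i][1])
               for (key_ok, value_ok), i in zip(entries, picks))

members = [(3, 5), (4, 6)]        # the map {3: 5, 4: 6}
for picks in permutations(range(len(members))):
    first = members[picks[0]]
    outcome = "match" if match_with_picks(members, picks) else "no match"
    print(f"first entry picks {first[0]}: {first[1]} -> {outcome}")
```

Under these assumptions, the same map matches or fails depending only on which member the first entry happens to pick, which is the behavior the text urges specification writers to avoid.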
(The astute reader will be reminded of what was called "ambiguous (The astute reader will be reminded of what was called "ambiguous
content models" in SGML and "non-deterministic content models" in content models" in the Standard Generalized Markup Language (SGML)
XML. That problem is related to the one described here, but the and "non-deterministic content models" in XML. That problem is
problem here is specifically caused by the lack of order in maps, related to the one described here, but the problem here is
something that the XML schema languages do not have to contend with. specifically caused by the lack of order in maps, something that the
Note that Relax-NG's "interleave" pattern handles lack of order XML schema languages do not have to contend with. Note that
explicitly on the specification side, while the instances in XML RELAX NG's "interleave" pattern handles lack of order explicitly on
always have determinate order.) the specification side, while the instances in XML always have
determinate order.)
3.5.4. Cuts in Maps 3.5.4. Cuts in Maps
The extensibility idiom discussed above for structs has one problem: The extensibility idiom discussed above for structs has one problem:
extensible-map-example = { extensible-map-example = {
? "optional-key" => int, ? "optional-key" => int,
* tstr => any * tstr => any
} }
In this example, there is one optional key "optional-key", which, In this example, there is one optional key "optional-key", which,
when present, maps to an integer. There is also a wild card for any when present, maps to an integer. There is also a wildcard for any
future additions. future additions.
Unfortunately, the data item Unfortunately, the data item
{ "optional-key": "nonsense" } { "optional-key": "nonsense" }
does match this specification: While the first entry of the group does match this specification: while the first entry of the group
does not match, the second one (the wildcard) does. This may be very does not match, the second one (the wildcard) does. This may very
well desirable (e.g., if a future extension is to be allowed to well be desirable (e.g., if a future extension is to be allowed to
extend the type of "optional-key"), but in many cases isn't. extend the type of "optional-key"), but in many cases it isn't.
In anticipation of a more general potential feature called "cuts", In anticipation of a more general potential feature called "cuts",
CDDL allows inserting a cut "^" into the definition of the map entry: CDDL allows inserting a cut "^" into the definition of the map entry:
extensible-map-example = { extensible-map-example = {
? "optional-key" ^ => int, ? "optional-key" ^ => int,
* tstr => any * tstr => any
} }
A cut in this position means that once the member key matches the A cut in this position means that once the member key matches the
name part of an entry that carries a cut, other potential matches for name part of an entry that carries a cut, other potential matches for
the key of the member that occur in later entries in the group of the the key of the member that occur in later entries in the group of the
map are no longer allowed. In other words, when a group entry would map are no longer allowed. In other words, when a group entry would
pick a key/value pair based on just a matching key, it "locks in" the pick a key/value pair based on just a matching key, it "locks in" the
pick -- this rule applies independent of whether the value matches as pick -- this rule applies, independently of whether the value matches
well, so when it does not, the entire map fails to match. In as well, so when it does not, the entire map fails to match. In
summary, the example above no longer matches the specification as summary, the example above no longer matches the specification as
modified with the cut. modified with the cut.
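As an illustration only, the effect of the cut can be sketched in Python for the specific shape used in this example (one optional entry followed by a "* tstr => any" wildcard). The function "matches" and its parameters are assumptions made for this sketch, not part of CDDL or of any CDDL tool.

```python
def matches(data, key, value_ok, cut):
    """Model of  { ? key (^) => value, * tstr => any }  for a dict `data`.

    `key` is the literal text key, `value_ok` is a predicate for the
    value type, and `cut` says whether the entry carries a cut.
    """
    remaining = dict(data)
    if key in remaining:
        if value_ok(remaining[key]):
            del remaining[key]    # picked by the first entry
        elif cut:
            return False          # key matched, so the pick is locked in
        # without a cut, the member falls through to the wildcard below
    # * tstr => any : every remaining member needs a text string key
    return all(isinstance(k, str) for k in remaining)

def is_int(v):
    return isinstance(v, int) and not isinstance(v, bool)

item = {"optional-key": "nonsense"}
print(matches(item, "optional-key", is_int, cut=False))
print(matches(item, "optional-key", is_int, cut=True))
print(matches({"optional-key": 1}, "optional-key", is_int, cut=True))
```

With "cut=False" the ill-typed member is absorbed by the wildcard; with "cut=True" the matching key locks in the pick, and the map as a whole fails to match.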
Since the desire for this kind of exclusive matching is so frequent, Since the desire for this kind of exclusive matching is so frequent,
the ":" shortcut is actually defined to include the cut semantics. the ":" shortcut is actually defined to include the cut semantics.
So the preceding example (including the cut) can be written more So, the preceding example (including the cut) can be written more
simply as: simply as:
extensible-map-example = { extensible-map-example = {
? "optional-key": int, ? "optional-key": int,
* tstr => any * tstr => any
} }
or even shorter, using a bareword for the key: or even shorter, using a bareword for the key:
extensible-map-example = { extensible-map-example = {
skipping to change at page 23, line 35 skipping to change at page 25, line 33
} }
3.6. Tags 3.6. Tags
A type can make use of a CBOR tag (major type 6) by using the A type can make use of a CBOR tag (major type 6) by using the
representation type notation, giving #6.nnn(type) where nnn is an representation type notation, giving #6.nnn(type) where nnn is an
unsigned integer giving the tag number and "type" is the type of the unsigned integer giving the tag number and "type" is the type of the
data item being tagged. data item being tagged.
For example, the following line from the CDDL prelude (Appendix D) For example, the following line from the CDDL prelude (Appendix D)
defines "biguint" as a type name for a positive bignum N: defines "biguint" as a type name for an unsigned bignum N:
biguint = #6.2(bstr) biguint = #6.2(bstr)
The tags defined by [RFC7049] are included in the prelude. The tags defined by [RFC7049] are included in the prelude.
Additional tags since registered need to be added to a CDDL Additional tags registered since [RFC7049] was written need to be
specification as needed; e.g., a binary UUID tag could be referenced added to a CDDL specification as needed; e.g., a binary Universally
as "buuid" in a specification after defining Unique Identifier (UUID) tag could be referenced as "buuid" in a
specification after defining
buuid = #6.37(bstr) buuid = #6.37(bstr)
In the following example, usage of the tag 32 for URIs is optional: In the following example, usage of tag 32 for URIs is optional:
my_uri = #6.32(tstr) / tstr my_uri = #6.32(tstr) / tstr
3.7. Unwrapping 3.7. Unwrapping
The group that is used to define a map or an array can often be The group that is used to define a map or an array can often be
reused in the definition of another map or array. Similarly, a type reused in the definition of another map or array. Similarly, a type
defined as a tag carries an internal data item that one would like to defined as a tag carries an internal data item that one would like to
refer to. In these cases, it is expedient to simply use the name of refer to. In these cases, it is expedient to simply use the name of
the map, array, or tag type as a handle for the group or type defined the map, array, or tag type as a handle for the group or type defined
inside it. inside it.
The "unwrap" operator (written by preceding a name by a tilde The "unwrap" operator (written by preceding a name by a tilde
character "~") can be used to strip the type defined for a name by character "~") can be used to strip the type defined for a name by
one layer, exposing the underlying group (for maps and arrays) or one layer, exposing the underlying group (for maps and arrays) or
type (for tags). type (for tags).
For example, an application might want to define a basic and an For example, an application might want to define a basic header and
advanced header. Without unwrapping, this might be done as follows: an advanced header. Without unwrapping, this might be done as
follows:
basic-header-group = ( basic-header-group = (
field1: int, field1: int,
field2: text, field2: text,
) )
basic-header = [ basic-header-group ] basic-header = [ basic-header-group ]
advanced-header = [ advanced-header = [
basic-header-group, basic-header-group,
field3: bytes, field3: bytes,
field4: number, ; as in the tagged type "time" field4: number, ; as in the tagged type "time"
] ]
Unwrapping simplifies this to: Unwrapping simplifies this to:
basic-header = [ basic-header = [
field1: int, field1: int,
field2: text, field2: text,
] ]
advanced-header = [ advanced-header = [
~basic-header, ~basic-header,
field3: bytes, field3: bytes,
field4: ~time, field4: ~time,
] ]
(Note that leaving out the first unwrap operator in the latter (Note that leaving out the first unwrap operator in the latter
example would lead to nesting the basic-header in its own array example would lead to nesting the basic-header in its own array
inside the advanced-header, while, with the unwrapped basic-header, inside the advanced-header, while, with the unwrapped basic-header,
the definition of the group inside basic-header is essentially the definition of the group inside basic-header is essentially
repeated inside advanced-header, leading to a single array. This can repeated inside advanced-header, leading to a single array. This can
be used for various applications often solved by inheritance in be used for various applications often solved by inheritance in
programming languages. The effect of unwrapping can also be programming languages. The effect of unwrapping can also be
described as "threading in" the group or type inside the referenced described as "threading in" the group or type inside the referenced
type, which suggested the thread-like "~" character.) type, which suggested the thread-like "~" character.)
3.8. Controls 3.8. Controls
A _control_ allows to relate a _target_ type with a _controller_ type A _control_ allows relating a _target_ type with a _controller_ type
via a _control operator_. via a _control operator_.
The syntax for a control type is "target .control-operator The syntax for a control type is "target .control-operator
controller", where control operators are special identifiers prefixed controller", where control operators are special identifiers prefixed
by a dot. (Note that _target_ or _controller_ might need to be by a dot. (Note that _target_ or _controller_ might need to be
parenthesized.) parenthesized.)
A number of control operators are defined at this point. Further A number of control operators are defined at this point. Further
control operators may be defined by new versions of this control operators may be defined by new versions of this
specification or by registering them according to the procedures in specification or by registering them according to the procedures in
Section 6.1. Section 6.1.
3.8.1. Control operator .size 3.8.1. Control Operator .size
A ".size" control controls the size of the target in bytes by the A ".size" control controls the size of the target in bytes by the
control type. The control is defined for text and byte strings, control type. The control is defined for text and byte strings,
where it directly controls the number of bytes in the string. It is where it directly controls the number of bytes in the string. It is
also defined for unsigned integers (see below). Figure 8 shows also defined for unsigned integers (see below). Figure 8 shows
example usage for byte strings. example usage for byte strings.
full-address = [[+ label], ip4, ip6] full-address = [[+ label], ip4, ip6]
ip4 = bstr .size 4 ip4 = bstr .size 4
ip6 = bstr .size 16 ip6 = bstr .size 16
label = bstr .size (1..63) label = bstr .size (1..63)
Figure 8: Control for size in bytes Figure 8: Control for Size in Bytes
When applied to an unsigned integer, the ".size" control restricts When applied to an unsigned integer, the ".size" control restricts
the range of that integer by giving a maximum number of bytes that the range of that integer by giving a maximum number of bytes that
should be needed in a computer representation of that unsigned should be needed in a computer representation of that unsigned
integer. In other words, "uint .size N" is equivalent to integer. In other words, "uint .size N" is equivalent to
"0...BYTES_N", where BYTES_N == 256**N. "0...BYTES_N", where BYTES_N == 256**N.
audio_sample = uint .size 3 ; 24-bit, equivalent to 0...16777216 audio_sample = uint .size 3 ; 24-bit, equivalent to 0...16777216
Figure 9: Control for integer size in bytes Figure 9: Control for Integer Size in Bytes
Note that, as with value restrictions in CDDL, this control is not a Note that, as with value restrictions in CDDL, this control is not a
representation constraint; a number that fits into fewer bytes can representation constraint; a number that fits into fewer bytes can
still be represented in that form, and an inefficient implementation still be represented in that form, and an inefficient implementation
could use a longer form (unless that is restricted by some format could use a longer form (unless that is restricted by some format
constraints outside of CDDL, such as the rules in Section 3.9 of constraints outside of CDDL, such as the rules in Section 3.9 of
[RFC7049]). [RFC7049]).
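The two uses of ".size" can be restated as simple checks. The following Python sketch is illustrative only; the function names are hypothetical, and it tests values after decoding, in line with the note above that ".size" is not a representation constraint.

```python
def uint_size_ok(value, n):
    """True if `value` matches `uint .size n`, i.e., 0 <= value < 256**n."""
    return (isinstance(value, int) and not isinstance(value, bool)
            and 0 <= value < 256 ** n)

def bstr_size_ok(value, lo, hi=None):
    """True if byte string `value` matches `bstr .size lo` or,
    when `hi` is given, `bstr .size (lo..hi)` (inclusive range)."""
    hi = lo if hi is None else hi
    return isinstance(value, bytes) and lo <= len(value) <= hi

print(uint_size_ok(16777215, 3))    # largest value of audio_sample
print(uint_size_ok(16777216, 3))    # 256**3 itself is excluded
print(bstr_size_ok(bytes(4), 4))    # ip4 = bstr .size 4
print(bstr_size_ok(b"", 1, 63))     # label = bstr .size (1..63)
```

A small value such as 5 also satisfies "uint .size 3": only the range is restricted, not the number of bytes used to encode the value.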
3.8.2. Control operator .bits 3.8.2. Control Operator .bits
A ".bits" control on a byte string indicates that, in the target, A ".bits" control on a byte string indicates that, in the target,
only the bits numbered by a number in the control type are allowed to only the bits numbered by a number in the control type are allowed to
be set. (Bits are counted the usual way, bit number "n" being set in be set. (Bits are counted the usual way, bit number "n" being set in
"str" meaning that "(str[n >> 3] & (1 << (n & 7))) != 0".) "str" meaning that "(str[n >> 3] & (1 << (n & 7))) != 0".)
Similarly, a ".bits" control on an unsigned integer "i" indicates Similarly, a ".bits" control on an unsigned integer "i" indicates
that for all unsigned integers "n" where "(i & (1 << n)) != 0", "n" that for all unsigned integers "n" where "(i & (1 << n)) != 0", "n"
must be in the control type. must be in the control type.
tcpflagbytes = bstr .bits flags tcpflagbytes = bstr .bits flags
skipping to change at page 26, line 34 skipping to change at page 28, line 38
ack: 12, ack: 12,
urg: 13, urg: 13,
ece: 14, ece: 14,
cwr: 15, cwr: 15,
ns: 0, ns: 0,
) / (4..7) ; data offset bits ) / (4..7) ; data offset bits
rwxbits = uint .bits rwx rwxbits = uint .bits rwx
rwx = &(r: 2, w: 1, x: 0) rwx = &(r: 2, w: 1, x: 0)
Figure 10: Control for what bits can be set Figure 10: Control for What Bits Can Be Set
The CDDL tool reported on in Appendix F generates the following ten The CDDL tool described in Appendix F generates the following ten
example instances for "tcpflagbytes": example instances for "tcpflagbytes":
h'906d' h'01fc' h'8145' h'01b7' h'013d' h'409f' h'018e' h'c05f' h'906d' h'01fc' h'8145' h'01b7' h'013d' h'409f' h'018e' h'c05f'
h'01fa' h'01fe' h'01fa' h'01fe'
These examples do not illustrate that the above CDDL specification These examples do not illustrate that the above CDDL specification
does not explicitly specify a size of two bytes: A valid all clear does not explicitly specify a size of two bytes: a valid all-clear
instance of flag bytes could be "h''" or "h'00'" or even "h'000000'" instance of flag bytes could be "h''" or "h'00'" or even "h'000000'"
as well. as well.
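The bit-numbering rules given above for ".bits" can be written out directly. This Python sketch is only an illustration; the function names are hypothetical, and the allowed bit numbers are taken from the "rwx" definition in Figure 10.

```python
def bstr_bits_ok(s, allowed):
    """`bstr .bits allowed`: every set bit n of s must be in `allowed`,
    where bit n is set when (s[n >> 3] & (1 << (n & 7))) != 0."""
    return all(n in allowed
               for n in range(8 * len(s))
               if s[n >> 3] & (1 << (n & 7)))

def uint_bits_ok(i, allowed):
    """`uint .bits allowed`: every n with (i & (1 << n)) != 0
    must be in `allowed`."""
    return i >= 0 and all(n in allowed
                          for n in range(i.bit_length())
                          if i & (1 << n))

rwx = {2, 1, 0}                      # rwx = &(r: 2, w: 1, x: 0)
print(uint_bits_ok(0b101, rwx))      # r and x set
print(uint_bits_ok(0b1000, rwx))     # bit 3 is not allowed
print(bstr_bits_ok(b"", rwx), bstr_bits_ok(bytes(3), rwx))
```

The last line shows the point made above about all-clear instances: a byte string with no bits set satisfies the control whatever its length, so a fixed length needs a separate constraint.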
3.8.3. Control operator .regexp 3.8.3. Control Operator .regexp
A ".regexp" control indicates that the text string given as a target A ".regexp" control indicates that the text string given as a target
needs to match the XSD regular expression given as a value in the needs to match the XML Schema Definition (XSD) regular expression
control type. XSD regular expressions are defined in Appendix F of given as a value in the control type. XSD regular expressions are
[W3C.REC-xmlschema-2-20041028]. defined in Appendix F of [W3C.REC-xmlschema-2-20041028].
nai = tstr .regexp "[A-Za-z0-9]+@[A-Za-z0-9]+(\\.[A-Za-z0-9]+)+" nai = tstr .regexp "[A-Za-z0-9]+@[A-Za-z0-9]+(\\.[A-Za-z0-9]+)+"
Figure 11: Control with an XSD regexp Figure 11: Control with an XSD regexp
An example matching this regular expression: An example matching this regular expression:
"N1@CH57HF.4Znqe0.dYJRN.igjf" "N1@CH57HF.4Znqe0.dYJRN.igjf"
3.8.3.1. Usage considerations 3.8.3.1. Usage Considerations
Note that XSD regular expressions do not support the usual \x or \u Note that XSD regular expressions do not support the usual \x or \u
escapes for hexadecimal expression of bytes or unicode code points. escapes for hexadecimal expression of bytes or Unicode code points.
However, in CDDL the XSD regular expressions are contained in text However, in CDDL the XSD regular expressions are contained in text
strings, the literal notation for which provides \u escapes; this strings, the literal notation for which provides \u escapes; this
should suffice for most applications that use regular expressions for should suffice for most applications that use regular expressions for
text strings. (Note that this also means that there is one level of text strings. (Note that this also means that there is one level of
string escaping before the XSD escaping rules are applied.) string escaping before the XSD escaping rules are applied.)
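The two levels of escaping can be seen with the "nai" example from Figure 11. The Python sketch below is an approximation under stated assumptions: Python's "re" module is not an XSD regular expression engine, but this particular pattern uses only constructs that have the same meaning in both, and "re.fullmatch" stands in for the XSD rule that a pattern must match the entire string. The names "cddl_literal", "pattern", and "nai_ok" are hypothetical.

```python
import re

# The content of the CDDL text string literal, exactly as written
# in the specification (a raw string keeps both backslashes).
cddl_literal = r"[A-Za-z0-9]+@[A-Za-z0-9]+(\\.[A-Za-z0-9]+)+"

# One level of string escaping is undone first: "\\" in the literal
# stands for a single backslash in the regular expression itself.
pattern = cddl_literal.replace("\\\\", "\\")

def nai_ok(text):
    # fullmatch: the regular expression must match the whole string
    return re.fullmatch(pattern, text) is not None

print(pattern)
print(nai_ok("N1@CH57HF.4Znqe0.dYJRN.igjf"))
print(nai_ok("N1@CH57HF"))
```

After unescaping, the regular expression contains "\." (a literal dot), which is why a realm without any dot, as in the last call, does not match.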
XSD regular expressions support character class subtraction, a XSD regular expressions support character class subtraction, a
feature often not found in regular expression libraries; feature often not found in regular expression libraries;
specification writers may want to use this feature sparingly. specification writers may want to use this feature sparingly.
Similar considerations apply to Unicode character classes; where Similar considerations apply to Unicode character classes; where
these are used, the specification that employs CDDL SHOULD identify these are used, the specification that employs CDDL SHOULD identify
which Unicode versions are addressed. which Unicode versions are addressed.
Other surprises for infrequent users of XSD regular expressions may Other surprises for infrequent users of XSD regular expressions may
include: include the following:
o No direct support for case insensitivity. While case o No direct support for case insensitivity. While case
insensitivity has gone mostly out of fashion in protocol design, insensitivity has gone mostly out of fashion in protocol design,
it is sometimes needed and then needs to be expressed manually as it is sometimes needed and then needs to be expressed manually as
in "[Cc][Aa][Ss][Ee]". in "[Cc][Aa][Ss][Ee]".
o The support for popular character classes such as \w and \d is o The support for popular character classes such as \w and \d is
based on Unicode character properties, which is often not what is based on Unicode character properties; this is often not what is
desired in an ASCII-based protocol and thus might lead to desired in an ASCII-based protocol and thus might lead to
surprises. (\s and \S do have their more conventional meanings, surprises. (\s and \S do have their more conventional meanings,
and "." matches any character but the line ending characters \r or and "." matches any character but the line-ending characters \r
\n.) or \n.)
3.8.3.2. Discussion 3.8.3.2. Discussion
There are many flavors of regular expression in use in the There are many flavors of regular expression in use in the
programming community. For instance, perl-compatible regular programming community. For instance, Perl-Compatible Regular
expressions (PCRE) are widely used and probably are more useful than Expressions (PCREs) are widely used and probably are more useful than
XSD regular expressions. However, there is no normative reference XSD regular expressions. However, there is no normative reference
for PCRE that could be used in the present document. Instead, we opt for PCREs that could be used in the present document. Instead, we
for XSD regular expressions for now. There is precedent for that opt for XSD regular expressions for now. There is precedent for that
choice in the IETF, e.g., in YANG [RFC7950]. choice in the IETF, e.g., in YANG [RFC7950].
Note that CDDL uses controls as its main extension point. This Note that CDDL uses controls as its main extension point. This
creates the opportunity to add further regular expression formats in creates the opportunity to add further regular expression formats in
addition to the one referenced here if desired. As an example, a addition to the one referenced here, if desired. As an example, a
control ".pcre" is defined in [I-D.bormann-cbor-cddl-freezer]. proposal for a ".pcre" control is defined in [CDDL-Freezer].
3.8.4. Control operators .cbor and .cborseq 3.8.4. Control Operators .cbor and .cborseq
A ".cbor" control on a byte string indicates that the byte string A ".cbor" control on a byte string indicates that the byte string
carries a CBOR encoded data item. Decoded, the data item matches the carries a CBOR-encoded data item. Decoded, the data item matches the
type given as the right-hand side argument (type1 in the following type given as the right-hand-side argument (type1 in the following
example). example).
"bytes .cbor type1" "bytes .cbor type1"
Similarly, a ".cborseq" control on a byte string indicates that the Similarly, a ".cborseq" control on a byte string indicates that the
byte string carries a sequence of CBOR encoded data items. When the byte string carries a sequence of CBOR-encoded data items. When the
data items are taken as an array, the array matches the type given as data items are taken as an array, the array matches the type given as
the right-hand side argument (type2 in the following example). the right-hand-side argument (type2 in the following example).
"bytes .cborseq type2" "bytes .cborseq type2"
(The conversion of the encoded sequence to an array can be effected (The conversion of the encoded sequence to an array can be effected,
for instance by wrapping the byte string between the two bytes 0x9f for instance, by wrapping the byte string between the two bytes 0x9f
and 0xff and decoding the wrapped byte string as a CBOR encoded data and 0xff and decoding the wrapped byte string as a CBOR-encoded
item.) data item.)
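The wrapping technique mentioned above can be sketched as follows. To stay self-contained, the sketch uses a deliberately tiny hand-written decoder that understands only unsigned integers 0..255 and indefinite-length arrays; a real implementation would use a complete CBOR decoder instead. The function names are hypothetical.

```python
def decode_small(data, pos=0):
    """Decode one item from a tiny CBOR subset: unsigned integers
    0..255 and indefinite-length arrays. Returns (value, next_pos)."""
    head = data[pos]
    if head < 0x18:                      # major type 0, value in the head
        return head, pos + 1
    if head == 0x18:                     # major type 0, one-byte argument
        return data[pos + 1], pos + 2
    if head == 0x9f:                     # indefinite-length array
        items, pos = [], pos + 1
        while data[pos] != 0xff:         # 0xff is the "break" stop code
            item, pos = decode_small(data, pos)
            items.append(item)
        return items, pos + 1
    raise ValueError(f"unsupported initial byte 0x{head:02x}")

def cborseq_to_array(seq):
    """Wrap an encoded CBOR sequence between 0x9f and 0xff, and decode
    the result as a single array."""
    value, end = decode_small(b"\x9f" + seq + b"\xff")
    if end != len(seq) + 2:
        raise ValueError("trailing data after the array")
    return value

print(cborseq_to_array(bytes.fromhex("01021864")))
```

The length check in "cborseq_to_array" guards against input that is not a well-formed sequence, such as a stray 0xff byte that would close the wrapping array early. The resulting array is what would then be matched against type2.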
3.8.5. Control operators .within and .and 3.8.5. Control Operators .within and .and
A ".and" control on a type indicates that the data item matches both A ".and" control on a type indicates that the data item matches both
that left hand side type and the type given as the right hand side. the left-hand-side type and the type given as the right-hand side.
(Formally, the resulting type is the intersection of the two types (Formally, the resulting type is the intersection of the two types
given.) given.)
"type1 .and type2" "type1 .and type2"
A variant of the ".and" control is the ".within" control, which A variant of the ".and" control is the ".within" control, which
expresses an additional intent: the left hand side type is meant to expresses an additional intent: the left-hand-side type is meant to
be a subset of the right-hand-side type. be a subset of the right-hand-side type.
"type1 .within type2" "type1 .within type2"
While both forms have the identical formal semantics (intersection), While both forms have the identical formal semantics (intersection),
the intention of the ".within" form is that the right hand side gives the intention of the ".within" form is that the right-hand side gives
guidance to the types allowed on the left hand side, which typically guidance to the types allowed on the left-hand side, which typically
is a socket (Section 3.9): is a socket (Section 3.9):
message = $message .within message-structure message = $message .within message-structure
message-structure = [message_type, *message_option] message-structure = [message_type, *message_option]
message_type = 0..255 message_type = 0..255
message_option = any message_option = any
$message /= [3, dough: text, topping: [* text]] $message /= [3, dough: text, topping: [* text]]
$message /= [4, noodles: text, sauce: text, parmesan: bool] $message /= [4, noodles: text, sauce: text, parmesan: bool]
For ".within", a tool might flag an error if type1 allows data items For ".within", a tool might flag an error if type1 allows data items
that are not allowed by type2. In contrast, for ".and", there is no that are not allowed by type2. In contrast, for ".and", there is no
expectation that type1 already is a subset of type2. expectation that type1 is already a subset of type2.
3.8.6. Control operators .lt, .le, .gt, .ge, .eq, .ne, and .default 3.8.6. Control Operators .lt, .le, .gt, .ge, .eq, .ne, and .default
The controls .lt, .le, .gt, .ge, .eq, .ne specify a constraint on the The controls .lt, .le, .gt, .ge, .eq, and .ne specify a constraint
left hand side type to be a value less than, less than or equal, on the left-hand-side type to be a value less than, less than or
greater than, greater than or equal, equal, or not equal, to a value equal to, greater than, greater than or equal to, equal to, or not
given as a right hand side type (containing just that single value). equal to a value given as a right-hand-side type (containing just
In the present specification, the first four controls (.lt, .le, .gt, that single value). In the present specification, the first four
.ge) are defined only for numeric types, as these have a natural controls (.lt, .le, .gt, and .ge) are defined only for numeric types,
ordering relationship. as these have a natural ordering relationship.
speed = number .ge 0 ; unit: m/s speed = number .ge 0 ; unit: m/s
.ne and .eq are defined both for numeric values and values of other .ne and .eq are defined for both numeric values and values of other
types. If one of the values is not of a numeric type, equality is types. If one of the values is not of a numeric type, equality is
determined as follows: Text strings are equal (satisfy .eq/do not determined as follows: text strings are equal (satisfy .eq / do not
satisfy .ne) if they are byte-wise identical; the same applies for satisfy .ne) if they are bytewise identical; the same applies for
byte strings. Arrays are equal if they have the same number of byte strings. Arrays are equal if they have the same number of
elements, all of which are equal pairwise in order between the elements, all of which are equal pairwise in order between the
arrays. Maps are equal if they have the same number of key/value arrays. Maps are equal if they have the same number of key/value
pairs, and there is pairwise equality between the key/value pairs pairs, and there is pairwise equality between the key/value pairs
between the two maps. Tagged values are equal if they both have the between the two maps. Tagged values are equal if they both have the
same tag and the values are equal. Values of simple types match if same tag and the values are equal. Values of simple types match if
they are the same values. Numeric types that occur within arrays, they are the same values. Numeric types that occur within arrays,
maps, or tagged values are equal if their numeric value is equal and maps, or tagged values are equal if their numeric value is equal and
they are both integers or both floating point values. All other they are both integers or both floating-point values. All other
cases are not equal (e.g., comparing a text string with a byte cases are not equal (e.g., comparing a text string with a byte
string). string).
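The equality rules above can be sketched as a recursive comparison over decoded data items. This is a minimal illustration under stated assumptions, not a reference implementation: Python str, bytes, list, and dict stand in for text strings, byte strings, arrays, and maps; bool and None stand in for simple values; the Tagged class and the function names are hypothetical. The sketch also assumes that two numbers compared directly (outside any array, map, or tag) are compared by numeric value alone, which the text implies but does not state.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Tagged:
    """Hypothetical stand-in for a CBOR tagged value."""
    tag: int
    value: Any


def _is_number(x: Any) -> bool:
    # bool is a subclass of int in Python but models a simple value here.
    return isinstance(x, (int, float)) and not isinstance(x, bool)


def cddl_equal(a: Any, b: Any, nested: bool = False) -> bool:
    """Return True if a and b are equal in the sense used by .eq and .ne."""
    if _is_number(a) and _is_number(b):
        # Inside arrays, maps, or tags: both integers or both floats.
        if nested and isinstance(a, float) != isinstance(b, float):
            return False
        return a == b
    if isinstance(a, str) and isinstance(b, str):
        return a == b   # code-point comparison, no normalization applied
    if isinstance(a, bytes) and isinstance(b, bytes):
        return a == b
    if isinstance(a, list) and isinstance(b, list):
        return len(a) == len(b) and all(
            cddl_equal(x, y, True) for x, y in zip(a, b))
    if isinstance(a, dict) and isinstance(b, dict):
        if len(a) != len(b):
            return False
        remaining = list(b.items())
        for ka, va in a.items():
            for i, (kb, vb) in enumerate(remaining):
                if cddl_equal(ka, kb, True) and cddl_equal(va, vb, True):
                    del remaining[i]   # each pair of b is matched only once
                    break
            else:
                return False           # no pair in b matches this pair of a
        return True
    if isinstance(a, Tagged) and isinstance(b, Tagged):
        return a.tag == b.tag and cddl_equal(a.value, b.value, True)
    if isinstance(a, bool) or a is None:
        return type(a) is type(b) and a == b   # simple values
    return False   # all other cases, e.g., text string vs. byte string
```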
A variant of the ".ne" control is the ".default" control, which A variant of the ".ne" control is the ".default" control, which
expresses an additional intent: the value specified by the right- expresses an additional intent: the value specified by the
hand-side type is intended as a default value for the left hand side right-hand-side type is intended as a default value for the
type given, and the implied .ne control is there to prevent this left-hand-side type given, and the implied .ne control is there to
value from being sent over the wire. This control is only meaningful prevent this value from being sent over the wire. This control is
when the control type is used in an optional context; otherwise there only meaningful when the control type is used in an optional context;
would be no way to make use of the default value. otherwise, there would be no way to make use of the default value.
timer = { timer = {
time: uint, time: uint,
? displayed-step: (number .gt 0) .default 1 ? displayed-step: (number .gt 0) .default 1
} }
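One way to read the timer example is sketched below: a checker rejects a "displayed-step" member that carries the default value itself (the implied .ne), and an application falls back to the default when the member is absent. The function names and the use of a Python dict for the decoded map are assumptions made for this illustration. How an application makes use of the default value is up to that application.

```python
DEFAULT_STEP = 1  # the value given by ".default 1" in the timer example


def check_timer(timer: dict) -> bool:
    """Check a decoded map against the timer rule (sketch).

    "displayed-step" is optional, must be a number greater than 0, and,
    because of the implied .ne, must not carry the default value itself.
    """
    time = timer.get("time")
    if isinstance(time, bool) or not isinstance(time, int) or time < 0:
        return False                       # time: uint is mandatory
    if set(timer) - {"time", "displayed-step"}:
        return False                       # no other members are allowed
    if "displayed-step" in timer:
        step = timer["displayed-step"]
        if isinstance(step, bool) or not isinstance(step, (int, float)):
            return False
        return step > 0 and step != DEFAULT_STEP
    return True


def displayed_step(timer: dict):
    """Value an application would use: the member if present, else the default."""
    return timer.get("displayed-step", DEFAULT_STEP)
```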
3.9. Socket/Plug 3.9. Socket/Plug
Both for type choices and group choices, a mechanism is defined that For both type choices and group choices, a mechanism is defined that
facilitates starting out with empty choices and assembling them facilitates starting out with empty choices and assembling them
later, potentially in separate files that are concatenated to build later, potentially in separate files that are concatenated to build
the full specification. the full specification.
Per convention, CDDL extension points are marked with a leading Per convention, CDDL extension points are marked with a leading
dollar sign (types) or two leading dollar signs (groups). Tools dollar sign (types) or two leading dollar signs (groups). Tools
honor that convention by not raising an error if such a type or group honor that convention by not raising an error if such a type or group
is not defined at all; the symbol is then taken to be an empty type is not defined at all; the symbol is then taken to be an empty type
choice (group choice), i.e., no choice is available. choice (group choice), i.e., no choice is available.
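The convention can be pictured with a small sketch in which each socket name maps to the list of alternatives plugged in so far, and a socket that was never defined yields an empty choice instead of an error. The names `sockets`, `plug`, and `choices` are hypothetical and do not correspond to any particular CDDL tool.

```python
sockets = {}   # socket name -> list of alternatives added so far


def plug(name: str, alternative: str) -> None:
    """Model of "$name /= alternative" or "$$name //= alternative"."""
    sockets.setdefault(name, []).append(alternative)


def choices(name: str) -> list:
    """Alternatives available for a socket; empty if it was never defined."""
    return list(sockets.get(name, []))
```

Because `plug` only appends, the definitions can be spread over several files and processed in any grouping, which mirrors the concatenation of separate files described above.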
skipping to change at page 31, line 30 skipping to change at page 33, line 40
$$personaldata-extensions //= ( $$personaldata-extensions //= (
favorite-salsa: tstr, favorite-salsa: tstr,
) )
; and again, somewhere else: ; and again, somewhere else:
$$personaldata-extensions //= ( $$personaldata-extensions //= (
shoesize: uint, shoesize: uint,
) )
Figure 12: Personal Data example: Using socket/plug extensibility Figure 12: Personal Data Example: Using Socket/Plug Extensibility
3.10. Generics 3.10. Generics
Using angle brackets, the left hand side of a rule can add formal Using angle brackets, the left-hand side of a rule can add formal
parameters after the name being defined, as in: parameters after the name being defined, as in:
messages = message<"reboot", "now"> / message<"sleep", 1..100> messages = message<"reboot", "now"> / message<"sleep", 1..100>
message<t, v> = {type: t, value: v} message<t, v> = {type: t, value: v}
When using a generic rule, the formal parameters are bound to the When using a generic rule, the formal parameters are bound to the
actual arguments supplied (also using angle brackets), within the actual arguments supplied (also using angle brackets), within the
scope of the generic rule (as if there were a rule of the form scope of the generic rule (as if there were a rule of the form
parameter = argument). parameter = argument).
Generic rules can be used for establishing names for both types and Generic rules can be used for establishing names for both types and
groups. groups.
(There are some limitations to nesting of generics in the tool (At this time, there are some limitations to the nesting of generics
described in Appendix F at this time.) in the CDDL tool described in Appendix F.)
3.11. Operator Precedence 3.11. Operator Precedence
As with any language that has multiple syntactic features such as As with any language that has multiple syntactic features such as
prefix and infix operators, CDDL has operators that bind more tightly prefix and infix operators, CDDL has operators that bind more tightly
than others. This is becoming more complicated than, say, in ABNF, than others. This is becoming more complicated than, say, in ABNF,
as CDDL has both types and groups, with operators that are specific as CDDL has both types and groups, with operators that are specific
to these concepts. Type operators (such as "/" for type choice) to these concepts. Type operators (such as "/" for type choice)
operate on types, while group operators (such as "//" for group operate on types, while group operators (such as "//" for group
choice) operate on groups. Types can simply be used in groups, but choice) operate on groups. Types can simply be used in groups, but
groups need to be bracketed (as arrays or maps) to become types. So, groups need to be bracketed (as arrays or maps) to become types. So,
type operators naturally bind closer than group operators. type operators naturally bind closer than group operators.
For instance, in For instance, in
t = [group1] t = [group1]
group1 = (a / b // c / d) group1 = (a / b // c / d)
a = 1 b = 2 c = 3 d = 4 a = 1 b = 2 c = 3 d = 4
group1 is a group choice between the type choice of a and b and the group1 is a group choice between the type choice of a and b and the
type choice of c and d. This becomes more relevant once member keys type choice of c and d. This becomes more relevant once member keys
and/or occurrences are added in: and/or occurrences are added in:
t = {group2} t = {group2}
group2 = (? ab: a / b // cd: c / d) group2 = (? ab: a / b // cd: c / d)
a = 1 b = 2 c = 3 d = 4 a = 1 b = 2 c = 3 d = 4
is a group choice between the optional member "ab" of type a or b and is a group choice between the optional member "ab" of type a or b and
the member "cd" of type c or d. Note that the optionality is the member "cd" of type c or d. Note that the optionality is
attached to the first choice ("ab"), not to the second choice. attached to the first choice ("ab"), not to the second choice.
Similarly, in Similarly, in
t = [group3] t = [group3]
group3 = (+ a / b / c) group3 = (+ a / b / c)
a = 1 b = 2 c = 3 a = 1 b = 2 c = 3
group3 is a repetition of a type choice between a, b, and c; if just group3 is a repetition of a type choice between a, b, and c; if just
a is to be repeatable, a group choice is needed to focus the a is to be repeatable, a group choice is needed to focus the
occurrence: occurrence:
(A comment has been that this could be counter-intuitive. The t = [group4]
specification writer is encouraged to use parentheses liberally to group4 = (+ a // b / c)
guide readers that are not familiar with CDDL precedence rules.) a = 1 b = 2 c = 3
t = [group4]
group4 = (+ a // b / c)
a = 1 b = 2 c = 3
group4 is a group choice between a repeatable a and a single b or c. group4 is a group choice between a repeatable a and a single b or c.
In general, as with many other languages with operator precedence A comment has been that the semantics of group3 could be
rules, it is best not to rely on them, but to insert parentheses for counterintuitive. In general, as with many other languages with
readability: operator precedence rules, the specification writer is encouraged not
to rely on them, but to insert parentheses liberally to guide readers
that are not familiar with CDDL precedence rules:
t = [group4a] t = [group4a]
group4a = ((+ a) // (b / c)) group4a = ((+ a) // (b / c))
a = 1 b = 2 c = 3 a = 1 b = 2 c = 3
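The grouping in these examples can be reproduced with a deliberately simplified sketch that splits a group on the loosest-binding operator first. It handles only group choice ("//"), a single-character occurrence prefix ("?", "*", or "+"), and type choice ("/"); member keys, commas, and nested parentheses are not supported. The function name is an assumption made for this illustration, and the sketch is not a CDDL parser.

```python
def parse_group(text: str) -> list:
    """Split a keyless group into (occurrence, [type alternatives]) entries."""
    alternatives = []
    for part in text.split("//"):          # loosest binding: group choice
        part = part.strip()
        occurrence = ""
        if part[:1] in ("?", "*", "+"):    # next: occurrence prefix
            occurrence, part = part[0], part[1:]
        types = [t.strip() for t in part.split("/")]   # tightest: type choice
        alternatives.append((occurrence, types))
    return alternatives
```

Applied to the bodies of group3 and group4, the sketch yields one repeatable entry covering a, b, and c for "+ a / b / c", and two entries (a repeatable a, and a single b or c) for "+ a // b / c".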
The operator precedences, in sequence of loose to tight binding, are The operator precedences, in sequence of loose to tight binding, are
defined in Appendix B and summarized in Table 1. (Arities given are defined in Appendix B and summarized in Table 1. (Arities given are
1 for unary prefix operators and 2 for binary infix operators.) 1 for unary prefix operators and 2 for binary infix operators.)
+----------+-------+---------------------------+------------+
| Operator | Arity | Operates on | Precedence |
+----------+-------+---------------------------+------------+
| = | 2 | name = type, name = group | 1 |
| /= | 2 | name /= type | 1 |
| //= | 2 | name //= group | 1 |
| // | 2 | group // group | 2 |
| , | 2 | group, group | 3 |
| * | 1 | * group | 4 |
| n*m | 1 | n*m group | 4 |
| + | 1 | + group | 4 |
| ? | 1 | ? group | 4 |
| => | 2 | type => type | 5 |
| : | 2 | name: type | 5 |
| / | 2 | type / type | 6 |
| .. | 2 | type..type | 7 |
| ... | 2 | type...type | 7 |
| .ctrl | 2 | type .ctrl type | 7 |
| & | 1 | &group | 8 |
| ~ | 1 | ~type | 8 |
+----------+-------+---------------------------+------------+
+----------+----+---------------------------+------+ Table 1: Summary of Operator Precedences
| Operator | Ar | Operates on | Prec |
+----------+----+---------------------------+------+
| = | 2 | name = type, name = group | 1 |
| /= | 2 | name /= type | 1 |
| //= | 2 | name //= group | 1 |
| // | 2 | group // group | 2 |
| , | 2 | group, group | 3 |
| * | 1 | * group | 4 |
| N*M | 1 | N*M group | 4 |
| + | 1 | + group | 4 |
| ? | 1 | ? group | 4 |
| => | 2 | type => type | 5 |
| : | 2 | name: type | 5 |
| / | 2 | type / type | 6 |
| .. | 2 | type..type | 7 |
| ... | 2 | type...type | 7 |
| .ctrl | 2 | type .ctrl type | 7 |
| & | 1 | &group | 8 |
| ~ | 1 | ~type | 8 |
+----------+----+---------------------------+------+
Table 1: Summary of operator precedences
4. Making Use of CDDL 4. Making Use of CDDL
In this section, we discuss several potential ways to employ CDDL. In this section, we discuss several potential ways to employ CDDL.
4.1. As a guide to a human user 4.1. As a Guide for a Human User
CDDL can be used to efficiently define the layout of CBOR data, such CDDL can be used to efficiently define the layout of CBOR data, such
that a human implementer can easily see how data is supposed to be that a human implementer can easily see how data is supposed to be
encoded. encoded.
Since CDDL maps parts of the CBOR data to human readable names, tools Since CDDL maps parts of the CBOR data to human-readable names, tools
could be built that use CDDL to provide a human friendly could be built that use CDDL to provide a human-friendly
representation of the CBOR data, and allow them to edit such data representation of the CBOR data and allow them to edit such data
while remaining compliant to its CDDL definition. while remaining compliant with its CDDL definition.
4.2. For automated checking of CBOR data structure 4.2. For Automated Checking of CBOR Data Structures
CDDL has been specified such that a machine can handle the CDDL CDDL has been specified such that a machine can handle the CDDL
definition and related CBOR data (and, thus, also JSON data). For definition and related CBOR data (and, thus, also JSON data). For
example, a machine could use CDDL to check whether or not CBOR data example, a machine could use CDDL to check whether or not CBOR data
is compliant to its definition. is compliant with its definition.
The need for thoroughness of such compliance checking depends on the The need for thoroughness of such compliance checking depends on the
application. For example, an application may decide not to check the application. For example, an application may decide not to check the
data structure at all, and use the CDDL definition solely as a means data structure at all and use the CDDL definition solely as a means
to indicate the structure of the data to the programmer. to indicate the structure of the data to the programmer.
On the other end, the application may also implement a checking On the other hand, the application may also implement a checking
mechanism that goes as far as checking that all mandatory map members mechanism that goes as far as checking that all mandatory map members
are available. are available.
The matter in how far the data description must be enforced by an The matter of how far the data description must be enforced by an
application is left to the designers and implementers of that application is left to the designers and implementers of that
application, keeping in mind related security considerations. application, keeping in mind related security considerations.
In no case the intention is that a CDDL tool would be "writing code" In no case is it intended that a CDDL tool would be "writing code"
for an implementation. for an implementation.
4.3. For data analysis tools 4.3. For Data Analysis Tools
In the long run, it can be expected that more and more data will be In the long run, it can be expected that more and more data will be
stored using the CBOR data format. stored using the CBOR data format.
Where there is data, there is data analysis and the need to process Where there is data, there is data analysis and the need to process
such data automatically. CDDL can be used for such automated data such data automatically. CDDL can be used for such automated data
processing, allowing tools to verify data, clean it, and extract processing, allowing tools to verify data, clean it, and extract
particular parts of interest from it. particular parts of interest from it.
Since CBOR is designed with constrained devices in mind, a likely use Since CBOR is designed with constrained devices in mind, a likely use
of it would be small sensors. An interesting use would thus be of it would be small sensors. An interesting use would thus be
automated analysis of sensor data. automated analysis of sensor data.
5. Security considerations 5. Security Considerations
This document presents a content rules language for expressing CBOR This document presents a content rules language for expressing CBOR
data structures. As such, it does not bring any security issues on data structures. As such, it does not bring any security issues on
itself, although specifications of protocols that use CBOR naturally itself, although specifications of protocols that use CBOR naturally
need security analyses when defined. General guidelines for writing need security analyses when defined. General guidelines for writing
security considerations are defined in security considerations are defined in [RFC3552] (BCP 72).
Security Considerations Guidelines [RFC3552] (BCP 72).
Specifications using CDDL to define CBOR structures in protocols need Specifications using CDDL to define CBOR structures in protocols need
to follow those guidelines. Additional topics that could be to follow those guidelines. Additional topics that could be
considered in a security considerations section for a specification considered in a security considerations section for a specification
that uses CDDL to define CBOR structures include the following: that uses CDDL to define CBOR structures include the following:
o Where could the language maybe cause confusion in a way that will o Where could the language maybe cause confusion in a way that will
enable security issues? enable security issues?
o Where a CDDL matcher is part of the implementation of a system, o Where a CDDL matcher is part of the implementation of a system,
the security of the system ought not depend on the correctness of the security of the system ought not depend on the correctness of
the CDDL specification or CDDL implementation without any further the CDDL specification or CDDL implementation without any further
defenses in place. defenses in place.
o Where the CDDL includes extension points, the impact of extensions o Where the CDDL specification includes extension points, the impact
on the security of the system needs to be carefully considered. of extensions on the security of the system needs to be carefully
considered.
Writers of CDDL specifications are strongly encouraged to value Writers of CDDL specifications are strongly encouraged to value
clarity and transparency of the specification over its elegance. clarity and transparency of the specification over its elegance.
Keep it as simple as possible while still expressing the needed data Keep it as simple as possible while still expressing the needed data
model. model.
A related observation about formal description techniques in general A related observation about formal description techniques in general
that is strongly recommended to be kept in mind by writers of CDDL that is strongly recommended to be kept in mind by writers of CDDL
specifications: Just because CDDL makes it easier to handle specifications: just because CDDL makes it easier to handle
complexity in a specification, that does not make that complexity complexity in a specification, that does not make that complexity
somehow less bad (except maybe on the level of the humans having to somehow less bad (except maybe on the level of the humans having to
grasp the complex structure while reading the spec). grasp the complex structure while reading the spec).
6. IANA Considerations 6. IANA Considerations
6.1. CDDL control operator registry 6.1. CDDL Control Operators Registry
IANA is requested to create a registry for control operators IANA has created a registry for control operators (Section 3.8). The
Section 3.8. The name of this registry is "CDDL Control Operators". "CDDL Control Operators" registry has been created within the
"Concise Data Definition Language (CDDL)" registry.
Each entry in the subregistry must include the name of the control Each entry in the subregistry must include the name of the control
operator (by convention given with the leading dot) and a reference operator (by convention given with the leading dot) and a reference
to its documentation. Names must be composed of the leading dot to its documentation. Names must be composed of the leading dot
followed by a text string conforming to the production "id" in followed by a text string conforming to the production "id" in
Appendix B. Appendix B.
Initial entries in this registry are as follows: Initial entries in this registry are as follows:
+----------+---------------+ +----------+---------------+
| name | documentation | | Name | Documentation |
+----------+---------------+ +----------+---------------+
| .size | [RFCthis] | | .size | RFC 8610 |
| .bits | [RFCthis] | | .bits | RFC 8610 |
| .regexp | [RFCthis] | | .regexp | RFC 8610 |
| .cbor | [RFCthis] | | .cbor | RFC 8610 |
| .cborseq | [RFCthis] | | .cborseq | RFC 8610 |
| .within | [RFCthis] | | .within | RFC 8610 |
| .and | [RFCthis] | | .and | RFC 8610 |
| .lt | [RFCthis] | | .lt | RFC 8610 |
| .le | [RFCthis] | | .le | RFC 8610 |
| .gt | [RFCthis] | | .gt | RFC 8610 |
| .ge | [RFCthis] | | .ge | RFC 8610 |
| .eq | [RFCthis] | | .eq | RFC 8610 |
| .ne | [RFCthis] | | .ne | RFC 8610 |
| .default | [RFCthis] | | .default | RFC 8610 |
+----------+---------------+ +----------+---------------+
All other control operator names are Unassigned. All other control operator names are Unassigned.
The IANA policy for additions to this registry is "Specification The IANA policy for additions to this registry is "Specification
Required" as defined in [RFC8126] (which involves an Expert Review) Required" as defined in [RFC8126] (which involves an Expert Review)
for names that do not include an internal dot, and "IETF Review" for for names that do not include an internal dot and "IETF Review" for
names that do include an internal dot. The Expert is specifically names that do include an internal dot. The expert reviewer is
instructed that other Standards Development Organizations (SDOs) may specifically instructed that other Standards Development
want to define control operators that are specific to their fields Organizations (SDOs) may want to define control operators that are
(e.g., based on a binary syntax already in use at the SDO); the specific to their fields (e.g., based on a binary syntax already in
review process should strive to facilitate such an undertaking. use at the SDO); the review process should strive to facilitate such
an undertaking.
7. References 7. References
7.1. Normative References 7.1. Normative References
[ISO6093] ISO, "Information processing -- Representation of [ISO6093] ISO, "Information processing -- Representation of
numerical values in character strings for information numerical values in character strings for information
interchange", ISO 6093, 1985. interchange", ISO 6093, 1985.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997, DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>. <https://www.rfc-editor.org/info/rfc2119>.
[RFC3552] Rescorla, E. and B. Korver, "Guidelines for Writing RFC [RFC3552] Rescorla, E. and B. Korver, "Guidelines for Writing RFC
Text on Security Considerations", BCP 72, RFC 3552, Text on Security Considerations", BCP 72, RFC 3552,
DOI 10.17487/RFC3552, July 2003, DOI 10.17487/RFC3552, July 2003,
<https://www.rfc-editor.org/info/rfc3552>. <https://www.rfc-editor.org/info/rfc3552>.
[RFC3629] Yergeau, F., "UTF-8, a transformation format of ISO [RFC3629] Yergeau, F., "UTF-8, a transformation format of
10646", STD 63, RFC 3629, DOI 10.17487/RFC3629, November ISO 10646", STD 63, RFC 3629, DOI 10.17487/RFC3629,
2003, <https://www.rfc-editor.org/info/rfc3629>. November 2003, <https://www.rfc-editor.org/info/rfc3629>.
[RFC4648] Josefsson, S., "The Base16, Base32, and Base64 Data [RFC4648] Josefsson, S., "The Base16, Base32, and Base64 Data
Encodings", RFC 4648, DOI 10.17487/RFC4648, October 2006, Encodings", RFC 4648, DOI 10.17487/RFC4648, October 2006,
<https://www.rfc-editor.org/info/rfc4648>. <https://www.rfc-editor.org/info/rfc4648>.
[RFC5234] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax [RFC5234] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
Specifications: ABNF", STD 68, RFC 5234, Specifications: ABNF", STD 68, RFC 5234,
DOI 10.17487/RFC5234, January 2008, DOI 10.17487/RFC5234, January 2008,
<https://www.rfc-editor.org/info/rfc5234>. <https://www.rfc-editor.org/info/rfc5234>.
skipping to change at page 37, line 31 skipping to change at page 40, line 49
[RFC7493] Bray, T., Ed., "The I-JSON Message Format", RFC 7493, [RFC7493] Bray, T., Ed., "The I-JSON Message Format", RFC 7493,
DOI 10.17487/RFC7493, March 2015, DOI 10.17487/RFC7493, March 2015,
<https://www.rfc-editor.org/info/rfc7493>. <https://www.rfc-editor.org/info/rfc7493>.
[RFC8126] Cotton, M., Leiba, B., and T. Narten, "Guidelines for [RFC8126] Cotton, M., Leiba, B., and T. Narten, "Guidelines for
Writing an IANA Considerations Section in RFCs", BCP 26, Writing an IANA Considerations Section in RFCs", BCP 26,
RFC 8126, DOI 10.17487/RFC8126, June 2017, RFC 8126, DOI 10.17487/RFC8126, June 2017,
<https://www.rfc-editor.org/info/rfc8126>. <https://www.rfc-editor.org/info/rfc8126>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, RFC 2119 Key Words", BCP 14, RFC 8174,
May 2017, <https://www.rfc-editor.org/info/rfc8174>. DOI 10.17487/RFC8174, May 2017,
<https://www.rfc-editor.org/info/rfc8174>.
[RFC8259] Bray, T., Ed., "The JavaScript Object Notation (JSON) Data [RFC8259] Bray, T., Ed., "The JavaScript Object Notation (JSON) Data
Interchange Format", STD 90, RFC 8259, Interchange Format", STD 90, RFC 8259,
DOI 10.17487/RFC8259, December 2017, DOI 10.17487/RFC8259, December 2017,
<https://www.rfc-editor.org/info/rfc8259>. <https://www.rfc-editor.org/info/rfc8259>.
[W3C.REC-xmlschema-2-20041028] [W3C.REC-xmlschema-2-20041028]
Biron, P. and A. Malhotra, "XML Schema Part 2: Datatypes Biron, P. and A. Malhotra, "XML Schema Part 2: Datatypes
Second Edition", World Wide Web Consortium Recommendation Second Edition", World Wide Web Consortium Recommendation
REC-xmlschema-2-20041028, October 2004, REC-xmlschema-2-20041028, October 2004,
<http://www.w3.org/TR/2004/REC-xmlschema-2-20041028>. <https://www.w3.org/TR/2004/REC-xmlschema-2-20041028>.
7.2. Informative References 7.2. Informative References
[I-D.bormann-cbor-cddl-freezer] [CDDL-Freezer]
Bormann, C., "A feature freezer for the Concise Data Bormann, C., "A feature freezer for the Concise Data
Definition Language (CDDL)", draft-bormann-cbor-cddl- Definition Language (CDDL)", Work in Progress,
freezer-01 (work in progress), August 2018. draft-bormann-cbor-cddl-freezer-01, August 2018.
[I-D.ietf-anima-grasp] [GRASP] Bormann, C., Carpenter, B., Ed., and B. Liu, Ed., "A
Bormann, C., Carpenter, B., and B. Liu, "A Generic Generic Autonomic Signaling Protocol (GRASP)", Work in
Autonomic Signaling Protocol (GRASP)", draft-ietf-anima- Progress, draft-ietf-anima-grasp-15, July 2017.
grasp-15 (work in progress), July 2017.
[I-D.newton-json-content-rules] [IEEE754] IEEE, "IEEE Standard for Floating-Point Arithmetic", IEEE
Newton, A. and P. Cordell, "A Language for Rules Std 754-2008.
Describing JSON Content", draft-newton-json-content-
rules-09 (work in progress), September 2017.
[PEG] Ford, B., "Parsing expression grammars", Proceedings of [JCR] Newton, A. and P. Cordell, "A Language for Rules
the 31st ACM SIGPLAN-SIGACT symposium on Principles of Describing JSON Content", Work in Progress,
programming languages - POPL '04, draft-newton-json-content-rules-09, September 2017.
DOI 10.1145/964001.964011, 2004.
[PEG] Ford, B., "Parsing expression grammars: a recognition-
based syntactic foundation", Proceedings of the 31st ACM
SIGPLAN-SIGACT symposium on Principles of programming
languages - POPL '04, DOI 10.1145/964001.964011,
January 2004.
[RELAXNG] ISO/IEC, "Information technology -- Document Schema [RELAXNG] ISO/IEC, "Information technology -- Document Schema
Definition Language (DSDL) -- Part 2: Regular-grammar- Definition Language (DSDL) -- Part 2: Regular-grammar-
based validation -- RELAX NG", ISO/IEC 19757-2, December based validation -- RELAX NG", ISO/IEC 19757-2,
2008. December 2008.
[RFC7071] Borenstein, N. and M. Kucherawy, "A Media Type for [RFC7071] Borenstein, N. and M. Kucherawy, "A Media Type for
Reputation Interchange", RFC 7071, DOI 10.17487/RFC7071, Reputation Interchange", RFC 7071, DOI 10.17487/RFC7071,
November 2013, <https://www.rfc-editor.org/info/rfc7071>. November 2013, <https://www.rfc-editor.org/info/rfc7071>.
[RFC7950] Bjorklund, M., Ed., "The YANG 1.1 Data Modeling Language", [RFC7950] Bjorklund, M., Ed., "The YANG 1.1 Data Modeling Language",
RFC 7950, DOI 10.17487/RFC7950, August 2016, RFC 7950, DOI 10.17487/RFC7950, August 2016,
<https://www.rfc-editor.org/info/rfc7950>. <https://www.rfc-editor.org/info/rfc7950>.
[RFC8007] Murray, R. and B. Niven-Jenkins, "Content Delivery Network [RFC8007] Murray, R. and B. Niven-Jenkins, "Content Delivery Network
skipping to change at page 38, line 47 skipping to change at page 42, line 19
[RFC8152] Schaad, J., "CBOR Object Signing and Encryption (COSE)", [RFC8152] Schaad, J., "CBOR Object Signing and Encryption (COSE)",
RFC 8152, DOI 10.17487/RFC8152, July 2017, RFC 8152, DOI 10.17487/RFC8152, July 2017,
<https://www.rfc-editor.org/info/rfc8152>. <https://www.rfc-editor.org/info/rfc8152>.
[RFC8428] Jennings, C., Shelby, Z., Arkko, J., Keranen, A., and C. [RFC8428] Jennings, C., Shelby, Z., Arkko, J., Keranen, A., and C.
Bormann, "Sensor Measurement Lists (SenML)", RFC 8428, Bormann, "Sensor Measurement Lists (SenML)", RFC 8428,
DOI 10.17487/RFC8428, August 2018, DOI 10.17487/RFC8428, August 2018,
<https://www.rfc-editor.org/info/rfc8428>. <https://www.rfc-editor.org/info/rfc8428>.
7.3. URIs [YAML] Ben-Kiki, O., Evans, C., and I. Net, "YAML Ain't Markup
Language (YAML[TM]) Version 1.2", 3rd Edition,
[1] https://github.com/cabo/cbor-diag October 2009, <https://yaml.org/spec/1.2/spec.html>.
Appendix A. Parsing Expression Grammars (PEG) Appendix A. Parsing Expression Grammars (PEGs)
This appendix is normative. This appendix is normative.
Since the 1950s, many grammar notations are based on Backus-Naur Form Since the 1950s, many grammar notations are based on Backus-Naur Form
(BNF), a notation for context-free grammars (CFGs) within Chomsky's (BNF), a notation for context-free grammars (CFGs) within Chomsky's
generative system of grammars. ABNF [RFC5234], the Augmented Backus- generative system of grammars. The Augmented Backus-Naur Form (ABNF)
Naur Form widely used in IETF specifications and also inspiring the [RFC5234], widely used in IETF specifications and also inspiring the
syntax of CDDL, is an example of this. syntax of CDDL, is an example of this.
Generative grammars can express ambiguity well, but this very Generative grammars can express ambiguity well, but this very
property may make them hard to use in recognition systems, spawning a property may make them hard to use in recognition systems, spawning a
number of subdialects that pose constraints on generative grammars to number of subdialects that pose constraints on generative grammars to
be used with parser generators, which may be hard to manage for the be used with parser generators; this scenario may be hard for the
specification writer. specification writer to manage.
Parsing Expression Grammars [PEG] provide an alternative formal PEGs [PEG] provide an alternative formal foundation for describing
foundation for describing grammars that emphasizes recognition over grammars that emphasizes recognition over generation and resolves
generation, and resolves what would have been ambiguity in generative what would have been ambiguity in generative systems by introducing
systems by introducing the concept of "prioritized choice". the concept of "prioritized choice".
The notation for Parsing Expression Grammars is quite close to BNF, The notation for PEGs is quite close to BNF, with the usual "Extended
with the usual "Extended BNF" features such as repetition added. BNF" features, such as repetition, added. However, where BNF uses
However, where BNF uses the unordered (symmetrical) choice operator the unordered (symmetrical) choice operator "|" (incidentally notated
"|" (incidentally notated as "/" in ABNF), PEG provides a prioritized as "/" in ABNF), PEG provides a prioritized choice operator "/". The
choice operator "/". The two alternatives listed are to be tested in two alternatives listed are to be tested in left-to-right order,
left-to-right order, locking in the first successful match and locking in the first successful match and disregarding any further
disregarding any further potential matches within the choice (but not potential matches within the choice (but not disabling alternatives
disabling alternatives in choices containing this choice, as a "cut" in choices containing this choice, as a cut (Section 3.5.4) would).
would - Section 3.5.4}.
For example, the ABNF expressions For example, the ABNF expressions
A = "a" "b" / "a" (1) A = "a" "b" / "a" (1)
and and
A = "a" / "a" "b" (2) A = "a" / "a" "b" (2)
are equivalent in ABNF's original generative framework, but very are equivalent in ABNF's original generative framework but are very
different in PEG: In (2), the second alternative will never match, as different in PEG: in (2), the second alternative will never match, as
any input string starting with an "a" will already succeed in the any input string starting with an "a" will already succeed in the
first alternative, locking in the match. first alternative, locking in the match.
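The difference between (1) and (2) under PEG semantics can be sketched in a few lines of Python. This is an illustrative sketch only, not part of the specification: the helper names "lit", "seq", "choice", and "consumed" are hypothetical, and the sketch ignores the case insensitivity of ABNF literals.

```python
# Minimal PEG-style matchers. Each matcher takes (text, pos) and returns
# the new position on success or None on failure. All names here are
# illustrative assumptions, not part of CDDL or ABNF.

def lit(s):
    def match(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return match

def seq(*parts):
    def match(text, pos):
        for part in parts:
            pos = part(text, pos)
            if pos is None:
                return None
        return pos
    return match

def choice(*alternatives):
    # Prioritized choice: try the alternatives left to right and lock in
    # the first one that succeeds; later alternatives are never tried.
    def match(text, pos):
        for alt in alternatives:
            end = alt(text, pos)
            if end is not None:
                return end
        return None
    return match

rule_1 = choice(seq(lit("a"), lit("b")), lit("a"))   # A = "a" "b" / "a"
rule_2 = choice(lit("a"), seq(lit("a"), lit("b")))   # A = "a" / "a" "b"

def consumed(rule, text):
    """Return how many characters of text the rule consumes, or None."""
    return rule(text, 0)
```

With the input "ab", rule_1 consumes both characters, while rule_2 stops after the first "a" because its first alternative already succeeded.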
Similarly, the occurrence indicators ("?", "*", "+") are "greedy" in Similarly, the occurrence indicators ("?", "*", "+") are "greedy" in
PEG, i.e., they consume as much input as they match (and, as a PEG, i.e., they consume as much input as they match (and, as a
consequence, "a* a" in PEG notation or "*a a" in CDDL syntax never consequence, "a* a" in PEG notation or "*a a" in CDDL syntax never
can match anything as all input matching "a" is already consumed by can match anything, as all input matching "a" is already consumed by
the initial "a*", leaving nothing to match the second "a"). the initial "a*", leaving nothing to match the second "a").
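The greedy behavior described above can be illustrated with a small Python sketch. The helper names "lit", "star", "seq", and "matches" are hypothetical and only serve this illustration; a real PEG engine would be more elaborate.

```python
# Sketch of greedy repetition without backtracking into the repetition.
# Matchers take (text, pos) and return the new position or None.

def lit(s):
    def match(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return match

def star(inner):
    # Greedy "zero or more": consume as long as the inner matcher
    # succeeds, and never give any of that input back afterwards.
    def match(text, pos):
        while True:
            nxt = inner(text, pos)
            if nxt is None or nxt == pos:
                return pos
            pos = nxt
    return match

def seq(*parts):
    def match(text, pos):
        for part in parts:
            pos = part(text, pos)
            if pos is None:
                return None
        return pos
    return match

# "a* a" in PEG notation ("*a a" in CDDL syntax)
a_star_a = seq(star(lit("a")), lit("a"))

def matches(rule, text):
    """True if the rule succeeds starting at the beginning of text."""
    return rule(text, 0) is not None
```

Whatever the input, the repetition has already consumed every leading "a" by the time the final literal is tried, so the sequence as a whole fails.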
Incidentally, the grammar of the CDDL language itself, as written in Incidentally, the grammar of CDDL itself, as written in ABNF in
ABNF in Appendix B, can be interpreted both in the generative Appendix B, can be interpreted both (1) in the generative framework
framework on which RFC 5234 is based, and as a PEG. This was made on which RFC 5234 is based and (2) as a PEG. This was made possible
possible by ordering the choices in the grammar such that a by ordering the choices in the grammar such that a successful match
successful match made on the left hand side of a "/" operator is made on the left-hand side of a "/" operator is always the intended
always the intended match, instead of relying on the power of match, instead of relying on the power of symmetrical choices (for
symmetrical choices (for example, note the sequence of alternatives example, note the sequence of alternatives in the rule for "uint",
in the rule for "uint", where the lone zero is behind the longer where the lone zero is behind the longer match alternatives that
match alternatives that start with a zero). start with a zero).
The syntax used for expressing the PEG component of CDDL is based on The syntax used for expressing the PEG component of CDDL is based on
ABNF, interpreted in the obvious way with PEG semantics. The ABNF ABNF, interpreted in the obvious way with PEG semantics. The ABNF
convention of notating occurrence indicators before the controlled convention of notating occurrence indicators before the controlled
primary, and of allowing numeric values for minimum and maximum primary, and of allowing numeric values for minimum and maximum
occurrence around a "*" sign, is copied. While PEG is only about occurrence around a "*" sign, is copied. While PEG is only about
characters, CDDL has a richer set of elements, such as types and characters, CDDL has a richer set of elements, such as types and
groups. Specifically, the following constructs map: groups. Specifically, the following constructs map:
+-------+-------+-------------------------------------------+ +-------+-------+-------------------------------------------+
skipping to change at page 40, line 39 skipping to change at page 44, line 37
| "//" | "/" | prioritized choice | | "//" | "/" | prioritized choice |
| "/" | "/" | prioritized choice, limited to types only | | "/" | "/" | prioritized choice, limited to types only |
| "?" P | P "?" | zero or one | | "?" P | P "?" | zero or one |
| "*" P | P "*" | zero or more | | "*" P | P "*" | zero or more |
| "+" P | P "+" | one or more | | "+" P | P "+" | one or more |
| A B | A B | sequence | | A B | A B | sequence |
| A, B | A B | sequence, comma is decoration only | | A, B | A B | sequence, comma is decoration only |
+-------+-------+-------------------------------------------+ +-------+-------+-------------------------------------------+
The literal notation and the use of square brackets, curly braces, The literal notation and the use of square brackets, curly braces,
tildes, ampersands, and hash marks is specific to CDDL and unrelated tildes, ampersands, and hash marks are specific to CDDL and unrelated
to the conventional PEG notation. The DOT (".") is replaced by the to the conventional PEG notation. The DOT (".") from PEG is replaced
unadorned "#" or its alias "any". Also, CDDL does not provide the by the unadorned "#" or its alias "any". Also, CDDL does not provide
syntactic predicate operators NOT ("!") or AND ("&") from PEG, the syntactic predicate operators NOT ("!") or AND ("&") from PEG,
reducing expressiveness as well as complexity. reducing expressiveness as well as complexity.
For more details about PEG's theoretical foundation and interesting For more details about PEG's theoretical foundation and interesting
properties of the operators such as associativity and distributivity, properties of the operators such as associativity and distributivity,
the reader is referred to [PEG]. the reader is referred to [PEG].
Appendix B. ABNF grammar Appendix B. ABNF Grammar
This appendix is normative. This appendix is normative.
The following is a formal definition of the CDDL syntax in Augmented The following is a formal definition of the CDDL syntax in ABNF
Backus-Naur Form (ABNF, [RFC5234]). Note that, as is defined in [RFC5234]. Note that, as is defined in ABNF, the quote-delimited
ABNF, the quote-delimited strings below are case-insensitive (while strings below are case insensitive (while string values and names are
string values and names are case-sensitive in CDDL). case sensitive in CDDL).
cddl = S 1*(rule S) cddl = S 1*(rule S)
rule = typename [genericparm] S assignt S type rule = typename [genericparm] S assignt S type
/ groupname [genericparm] S assigng S grpent / groupname [genericparm] S assigng S grpent
typename = id typename = id
groupname = id groupname = id
assignt = "=" / "/=" assignt = "=" / "/="
assigng = "=" / "//=" assigng = "=" / "//="
skipping to change at page 42, line 32 skipping to change at page 46, line 33
/ "0" / "0"
value = number value = number
/ text / text
/ bytes / bytes
int = ["-"] uint int = ["-"] uint
; This is a float if it has fraction or exponent; int otherwise ; This is a float if it has fraction or exponent; int otherwise
number = hexfloat / (int ["." fraction] ["e" exponent ]) number = hexfloat / (int ["." fraction] ["e" exponent ])
hexfloat = "0x" 1*HEXDIG ["." 1*HEXDIG] "p" exponent hexfloat = ["-"] "0x" 1*HEXDIG ["." 1*HEXDIG] "p" exponent
fraction = 1*DIGIT fraction = 1*DIGIT
exponent = ["+"/"-"] 1*DIGIT exponent = ["+"/"-"] 1*DIGIT
text = %x22 *SCHAR %x22 text = %x22 *SCHAR %x22
SCHAR = %x20-21 / %x23-5B / %x5D-7E / %x80-10FFFD / SESC SCHAR = %x20-21 / %x23-5B / %x5D-7E / %x80-10FFFD / SESC
SESC = "\" (%x20-7E / %x80-10FFFD) SESC = "\" (%x20-7E / %x80-10FFFD)
bytes = [bsqual] %x27 *BCHAR %x27 bytes = [bsqual] %x27 *BCHAR %x27
BCHAR = %x20-26 / %x28-5B / %x5D-10FFFD / SESC / CRLF BCHAR = %x20-26 / %x28-5B / %x5D-10FFFD / SESC / CRLF
bsqual = "h" / "b64" bsqual = "h" / "b64"
skipping to change at page 43, line 17 skipping to change at page 47, line 25
NL = COMMENT / CRLF NL = COMMENT / CRLF
COMMENT = ";" *PCHAR CRLF COMMENT = ";" *PCHAR CRLF
PCHAR = %x20-7E / %x80-10FFFD PCHAR = %x20-7E / %x80-10FFFD
CRLF = %x0A / %x0D.0A CRLF = %x0A / %x0D.0A
Figure 13: CDDL ABNF Figure 13: CDDL ABNF
Note that this ABNF does not attempt to reflect the detailed rules of Note that this ABNF does not attempt to reflect the detailed rules of
what can be in a prefixed byte string. what can be in a prefixed byte string.
Appendix C. Matching rules Appendix C. Matching Rules
This appendix is normative. This appendix is normative.
In this appendix, we go through the ABNF syntax rules defined in In this appendix, we go through the ABNF syntax rules defined in
Appendix B and briefly describe the matching semantics of each Appendix B and briefly describe the matching semantics of each
syntactic feature. In this context, an instance (data item) syntactic feature. In this context, an instance (data item)
"matches" a CDDL specification if it is allowed by the CDDL "matches" a CDDL specification if it is allowed by the CDDL
specification; this is then broken down to parts of specifications specification; this is then broken down into parts of specifications
(type and group expressions) and parts of instances (data items). (type and group expressions) and parts of instances (data items).
cddl = S 1*(rule S) cddl = S 1*(rule S)
A CDDL specification is a sequence of one or more rules. Each rule A CDDL specification is a sequence of one or more rules. Each rule
gives a name to a right hand side expression, either a CDDL type or a gives a name to a right-hand-side expression, either a CDDL type or a
CDDL group. Rule names can be used in the rule itself and/or other CDDL group. Rule names can be used in the rule itself and/or other
rules (and tools can output warnings if that is not the case). The rules (and tools can output warnings if that is not the case). The
order of the rules is significant only in two cases: order of the rules is significant only in two cases:
1. The first rule defines the semantics of the entire specification; 1. The first rule defines the semantics of the entire specification;
hence, there is no need to give that root rule a special name or hence, there is no need to give that root rule a special name or
special syntax in the language (as, e.g., with "start" in Relax- special syntax in the language (as, for example, with "start" in
NG); its name can be therefore chosen to be descriptive. (As RELAX NG); its name can therefore be chosen to be descriptive.
with all other rule names, the name of the initial rule may be (As with all other rule names, the name of the initial rule may
used in itself or in other rules). be used in itself or in other rules.)
2. Where a rule contributes to a type or group choice (using "/=" or 2. Where a rule contributes to a type or group choice (using "/=" or
"//="), that choice is populated in the order the rules are "//="), that choice is populated in the order the rules are
given; see below. given; see below.
rule = typename [genericparm] S assignt S type rule = typename [genericparm] S assignt S type
/ groupname [genericparm] S assigng S grpent / groupname [genericparm] S assigng S grpent
typename = id typename = id
groupname = id groupname = id
skipping to change at page 44, line 15 skipping to change at page 48, line 26
for a group expression (production "grpent"), with the intention that for a group expression (production "grpent"), with the intention that
the semantics does not change when the name is replaced by its the semantics does not change when the name is replaced by its
(parenthesized if needed) definition. Note that whether the name (parenthesized if needed) definition. Note that whether the name
defined by a rule stands for a type or a group isn't always defined by a rule stands for a type or a group isn't always
determined by syntax alone: e.g., "a = b" can make "a" a type if "b" determined by syntax alone: e.g., "a = b" can make "a" a type if "b"
is a type, or a group if "b" is a group. More subtly, in "a = (b)", is a type, or a group if "b" is a group. More subtly, in "a = (b)",
"a" may be used as a type if "b" is a type, or as a group both when "a" may be used as a type if "b" is a type, or as a group both when
"b" is a group and when "b" is a type (a good convention to make the "b" is a group and when "b" is a type (a good convention to make the
latter case stand out to the human reader is to write "a = (b,)"). latter case stand out to the human reader is to write "a = (b,)").
(Note that the same dual meaning of parentheses applies within an (Note that the same dual meaning of parentheses applies within an
expression, but often can be resolved by the context of the expression but often can be resolved by the context of the
parenthesized expression. On the more general point, it may not be parenthesized expression. On the more general point, it may not be
clear immediately either whether "b" stands for a group or a type -- clear immediately either whether "b" stands for a group or a type --
this semantic processing may need to span several levels of rule this semantic processing may need to span several levels of rule
definitions before a determination can be made.) definitions before a determination can be made.)
assignt = "=" / "/=" assignt = "=" / "/="
assigng = "=" / "//=" assigng = "=" / "//="
A plain equals sign defines the rule name as the equivalent of the A plain equals sign defines the rule name as the equivalent of the
expression to the right; it is an error if the name already was expression to the right; it is an error if the name was already
defined with a different expression. A "/=" or "//=" extends a named defined with a different expression. A "/=" or "//=" extends a named
type or a group by additional choices; a number of these could be type or a group by additional choices; a number of these could be
replaced by collecting all the right hand sides and creating a single replaced by collecting all the right-hand sides and creating a single
rule with a type choice or a group choice built from the right hand rule with a type choice or a group choice built from the right-hand
sides in the order of the rules given. (It is not an error to extend sides in the order of the rules given. (It is not an error to extend
a rule name that has not yet been defined; this makes the right hand a rule name that has not yet been defined; this makes the right-hand
side the first entry in the choice being created.) side the first entry in the choice being created.)
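One way to picture how "/=" and "//=" populate a choice in rule order is the following Python sketch. It is an assumption-laden illustration: rules are modeled as hypothetical (name, operator, right-hand side) tuples with the right-hand side kept as an opaque string, and the error check for "=" is a simplified reading of the text above.

```python
def collect_choices(rules):
    """Fold rules, given in source order, into a dict that maps each
    rule name to the ordered list of alternatives for that name.

    rules: iterable of (name, op, rhs), with op one of "=", "/=", "//=".
    """
    defined = {}
    for name, op, rhs in rules:
        if op == "=":
            # Assumption: redefining a name is an error unless the
            # expression is identical to the one already recorded.
            if name in defined and defined[name] != [rhs]:
                raise ValueError("conflicting definition for " + name)
            defined[name] = [rhs]
        elif op in ("/=", "//="):
            # Extending an undefined name makes rhs the first entry.
            defined.setdefault(name, []).append(rhs)
        else:
            raise ValueError("unknown assignment operator " + op)
    return defined

example = collect_choices([
    ("color", "/=", '"red"'),
    ("size", "=", "uint"),
    ("color", "/=", '"green"'),
])
```

Because the alternatives are kept in rule order and choices use PEG semantics, the order in which the extending rules appear can affect matching.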
genericparm = "<" S id S *("," S id S ) ">" genericparm = "<" S id S *("," S id S ) ">"
genericarg = "<" S type1 S *("," S type1 S ) ">" genericarg = "<" S type1 S *("," S type1 S ) ">"
Rule names can have generic parameters, which cause temporary Rule names can have generic parameters, which cause temporary
assignments within the right hand sides to the parameter names from assignments within the right-hand sides to the parameter names from
the arguments given when citing the rule name. the arguments given when citing the rule name.
type = type1 *(S "/" S type1) type = type1 *(S "/" S type1)
A type can be given as a choice between one or more types. The A type can be given as a choice between one or more types. The
choice matches a data item if the data item matches any one of the choice matches a data item if the data item matches any one of the
types given in the choice. The choice uses Parsing Expression types given in the choice. The choice uses PEG semantics as
Grammar semantics as discussed in Appendix A: The first choice that discussed in Appendix A: the first choice that matches wins. (As a
matches wins. (As a result, the order of rules that contribute to a result, the order of rules that contribute to a single rule name can
single rule name can very well matter.) very well matter.)
type1 = type2 [S (rangeop / ctlop) S type2] type1 = type2 [S (rangeop / ctlop) S type2]
Two types can be combined with a range operator (which see below) or
a control operator (see Section 3.8). Two types can be combined with a range operator (see below) or a
control operator (see Section 3.8).
type2 = value type2 = value
A type can be just a single value (such as 1 or "icecream" or A type can be just a single value (such as 1 or "icecream" or
h'0815'), which matches only a data item with that specific value (no h'0815'), which matches only a data item with that specific value (no
conversions defined), conversions defined),
/ typename [genericarg] / typename [genericarg]
or be defined by a rule giving a meaning to a name (possibly after or be defined by a rule giving a meaning to a name (possibly after
supplying generic arguments as required by the generic parameters), supplying generic arguments as required by the generic parameters),
/ "(" S type S ")" / "(" S type S ")"
or be defined in a parenthesized type expression (parentheses may be or be defined in a parenthesized type expression (parentheses may be
necessary to override some operator precedence), or necessary to override some operator precedence), or
/ "{" S group S "}" / "{" S group S "}"
a map expression, which matches a valid CBOR map the key/value pairs a map expression, which matches a valid CBOR map the key/value pairs
of which can be ordered in such a way that the resulting sequence of which can be ordered in such a way that the resulting sequence
matches the group expression, or matches the group expression, or
/ "[" S group S "]" / "[" S group S "]"
an array expression, which matches a CBOR array the elements of an array expression, which matches a CBOR array the elements of which
which, when taken as values and complemented by a wildcard (matches -- when taken as values and complemented by a wildcard (matches
anything) key each, match the group, or anything) key each -- match the group, or
/ "~" S typename [genericarg] / "~" S typename [genericarg]
an "unwrapped" group (see Section 3.7), which matches the group an "unwrapped" group (see Section 3.7), which matches the group
inside a type defined as a map or an array by wrapping the group, or inside a type defined as a map or an array by wrapping the group, or
/ "&" S "(" S group S ")" / "&" S "(" S group S ")"
/ "&" S groupname [genericarg] / "&" S groupname [genericarg]
an enumeration expression, which matches any a value that is within an enumeration expression, which matches any value that is within the
the set of values that the values of the group given can take, or set of values that the values of the group given can take, or
/ "#" "6" ["." uint] "(" S type S ")" / "#" "6" ["." uint] "(" S type S ")"
a tagged data item, tagged with the "uint" given and containing the a tagged data item, tagged with the "uint" given and containing the
type given as the tagged value, or type given as the tagged value, or
/ "#" DIGIT ["." uint] ; major/ai / "#" DIGIT ["." uint] ; major/ai
a data item of a major type (given by the DIGIT), optionally a data item of a major type (given by the DIGIT), optionally
constrained to the additional information given by the uint, or constrained to the additional information given by the uint, or
/ "#" ; any / "#" ; any
any data item. any data item.
rangeop = "..." / ".." rangeop = "..." / ".."
A range operator can be used to join two type expressions that stand A range operator can be used to join two type expressions that stand
for either two integer values or two floating point values; it for either two integer values or two floating-point values; it
matches any value that is between the two values, where the first matches any value that is between the two values, where the first
value is always included in the matching set and the second value is value is always included in the matching set and the second value is
included for ".." and excluded for "...". included for ".." and excluded for "...".
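The inclusive and exclusive forms of the range operator can be sketched as follows. The function name "in_range" is hypothetical, and the sketch assumes the two bounds have already been resolved to plain numbers.

```python
def in_range(value, low, high, op):
    """Check value against a CDDL-style range.

    ".."  includes both bounds; "..." excludes the upper bound.
    The lower bound is always included.
    """
    if op == "..":
        return low <= value <= high
    if op == "...":
        return low <= value < high
    raise ValueError("unknown range operator " + op)
```

For example, the value 10 is inside "0..10" but outside "0...10".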
ctlop = "." id ctlop = "." id
A control operator ties a _target_ type to a _controller_ type as A control operator ties a _target_ type to a _controller_ type as
defined in Section 3.8. Note that control operators are an extension defined in Section 3.8. Note that control operators are an extension
point for CDDL; additional documents may want to define additional point for CDDL; additional documents may want to define additional
control operators. control operators.
group = grpchoice *(S "//" S grpchoice) group = grpchoice *(S "//" S grpchoice)
A group matches any sequence of key/value pairs that matches any of A group matches any sequence of key/value pairs that matches any of
the choices given (again using Parsing Expression Grammar semantics). the choices given (again using PEG semantics).
grpchoice = *(grpent optcom) grpchoice = *(grpent optcom)
Each of the component groups is given as a sequence of group entries. Each of the component groups is given as a sequence of group entries.
For a match, the sequence of key/value pairs given needs to match the For a match, the sequence of key/value pairs given needs to match the
sequence of group entries in the sequence given. sequence of group entries in the sequence given.
grpent = [occur S] [memberkey S] type grpent = [occur S] [memberkey S] type
A group entry can be given by a value type, which needs to be matched A group entry can be given by a value type, which needs to be matched
by the value part of a single element, and optionally a memberkey by the value part of a single element; and, optionally, a memberkey
type, which needs to be matched by the key part of the element, if type, which needs to be matched by the key part of the element, if
the memberkey is given. If the memberkey is not given, the entry can the memberkey is given. If the memberkey is not given, the entry can
only be used for matching arrays, not for maps. (See below how that only be used for matching arrays, not for maps. (See below for how
is modified by the occurrence indicator.) that is modified by the occurrence indicator.)
/ [occur S] groupname [genericarg] ; preempted by above / [occur S] groupname [genericarg] ; preempted by above
A group entry can be built from a named group, or A group entry can be built from a named group, or
/ [occur S] "(" S group S ")" / [occur S] "(" S group S ")"
from a parenthesized group, again with a possible occurrence from a parenthesized group, again with a possible occurrence
indicator. indicator.
memberkey = type1 S ["^" S] "=>" memberkey = type1 S ["^" S] "=>"
/ bareword S ":" / bareword S ":"
/ value S ":" / value S ":"
Key types can be given by a type expression, a bareword (which stands Key types can be given by a type expression, a bareword (which stands
for a type that just contains a string value created from this for a type that just contains a string value created from this
bareword), or a value (which stands for a type that just contains bareword), or a value (which stands for a type that just contains
this value). A key value matches its key type if the key value is a this value). A key value matches its key type if the key value is a
member of the key type, unless a cut preceding it in the group member of the key type, unless a cut preceding it in the group
applies (see Section 3.5.4 how map matching is influenced by the applies (see Section 3.5.4 for how map matching is influenced by the
presence of the cuts denoted by "^" or ":" in previous entries). presence of the cuts denoted by "^" or ":" in previous entries).
bareword = id bareword = id
A bareword is an alternative way to write a type with a single text A bareword is an alternative way to write a type with a single text
string value; it can only be used in the syntactic context given string value; it can only be used in the syntactic context given
above. above.
optcom = S ["," S] optcom = S ["," S]
(Optional commas do not influence the matching.) (Optional commas do not influence the matching.)
occur = [uint] "*" [uint] occur = [uint] "*" [uint]
/ "+" / "+"
/ "?" / "?"
An occurrence indicator modifies the group given to its right by An occurrence indicator modifies the group given to its right by
requiring the group to match the sequence to be matched exactly for a requiring the group to match the sequence to be matched exactly for a
certain number of times (see Section 3.2) in sequence, i.e. it acts certain number of times (see Section 3.2) in sequence, i.e., it acts
as a (possibly infinite) group choice that contains choices with the as a (possibly infinite) group choice that contains choices with the
group repeated each of the occurrences times. group repeated each of the occurrences times.
The rest of the ABNF describes syntax for value notation that should The rest of the ABNF describes syntax for value notation that should
be familiar from programming languages, with the possible exception be familiar to readers from programming languages, with the possible
of h'..' and b64'..' for byte strings, as well as syntactic elements exception of h'..' and b64'..' for byte strings, as well as syntactic
such as comments and line ends. elements such as comments and line ends.
Appendix D. Standard Prelude Appendix D. Standard Prelude
This appendix is normative. This appendix is normative.
The following prelude is automatically added to each CDDL file. The following prelude is automatically added to each CDDL file.
(Note that technically, it is a postlude, as it does not disturb the (Note that technically, it is a postlude, as it does not disturb the
selection of the first rule as the root of the definition.) selection of the first rule as the root of the definition.)
any = # any = #
uint = #0 uint = #0
nint = #1 nint = #1
int = uint / nint int = uint / nint
bstr = #2 bstr = #2
bytes = bstr bytes = bstr
tstr = #3 tstr = #3
text = tstr text = tstr
skipping to change at page 49, line 6 skipping to change at page 53, line 21
false = #7.20 false = #7.20
true = #7.21 true = #7.21
bool = false / true bool = false / true
nil = #7.22 nil = #7.22
null = nil null = nil
undefined = #7.23 undefined = #7.23
Figure 14: CDDL Prelude Figure 14: CDDL Prelude
Note that the prelude is deemed to be fixed. This means, for Note that the prelude is deemed to be fixed. This means, for
instance, that additional tags beyond [RFC7049], as registered, need instance, that additional tags beyond those defined in [RFC7049], as
to be defined in each CDDL file that is using them. registered, need to be defined in each CDDL file that is using them.
A common stumbling point is that the prelude does not define a type A common stumbling point is that the prelude does not define a type
"string". CBOR has byte strings ("bytes" in the prelude) and text "string". CBOR has byte strings ("bytes" in the prelude) and text
strings ("text"), so a type that is simply called "string" would be strings ("text"), so a type that is simply called "string" would be
ambiguous. ambiguous.
Appendix E. Use with JSON Appendix E. Use with JSON
This appendix is normative. This appendix is normative.
The JSON generic data model (implicit in [RFC8259]) is a subset of The JSON generic data model (implicit in [RFC8259]) is a subset of
the generic data model of CBOR. So one can use CDDL with JSON by the generic data model of CBOR. So, one can use CDDL with JSON by
limiting oneself to what can be represented in JSON. Roughly limiting oneself to what can be represented in JSON. Roughly
speaking, this means leaving out byte strings, tags, and simple speaking, this means leaving out byte strings, tags, and simple
values other than "false", "true", and "null", leading to the values other than "false", "true", and "null", leading to the
following limited prelude: following limited prelude:
any = # any = #
uint = #0 uint = #0
nint = #1 nint = #1
int = uint / nint int = uint / nint
skipping to change at page 49, line 49 skipping to change at page 54, line 29
float16-32 = float16 / float32 float16-32 = float16 / float32
float32-64 = float32 / float64 float32-64 = float32 / float64
float = float16-32 / float64 float = float16-32 / float64
false = #7.20 false = #7.20
true = #7.21 true = #7.21
bool = false / true bool = false / true
nil = #7.22 nil = #7.22
null = nil null = nil
Figure 15: JSON compatible subset of CDDL Prelude Figure 15: JSON-Compatible Subset of CDDL Prelude
(The major types given here do not have a direct meaning in JSON, but (The major types given here do not have a direct meaning in JSON, but
they can be interpreted as CBOR major types translated through they can be interpreted as CBOR major types translated through
Section 4 of [RFC7049].) Section 4 of [RFC7049].)
There are a few fine points in using CDDL with JSON. First, JSON There are a few fine points in using CDDL with JSON. First, JSON
does not distinguish between integers and floating point numbers; does not distinguish between integers and floating-point numbers;
there is only one kind of number (which may happen to be integral). there is only one kind of number (which may happen to be integral).
In this context, specifying a type as "uint", "nint" or "int" then In this context, specifying a type as "uint", "nint", or "int" then
becomes a predicate that the number be integral. As an example, this becomes a predicate that the number be integral. As an example, this
means that the following JSON numbers are all matching "uint": means that the following JSON numbers are all matching "uint":
10 10.0 1e1 1.0e1 100e-1 10 10.0 1e1 1.0e1 100e-1
(The fact that these are all integers may be surprising to users (The fact that these are all integers may be surprising to users
accustomed to the long tradition in programming languages of using accustomed to the long tradition in programming languages of using
decimal points or exponents in a number to indicate a floating point decimal points or exponents in a number to indicate a floating-point
literal.) literal.)
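The integrality predicate can be sketched in Python using exact decimal arithmetic. This is an illustration under stated assumptions: the function name is hypothetical, the input is assumed to be a syntactically valid JSON number literal, and the upper bound of 2**64 - 1 reflects the range of CBOR major type 0.

```python
from decimal import Decimal

UINT_MAX = 2**64 - 1  # largest value representable in CBOR major type 0

def json_number_matches_uint(literal):
    """Return True if a JSON number literal denotes a non-negative
    integer within the uint range, regardless of how it is written."""
    value = Decimal(literal)  # exact; no binary floating-point rounding
    is_integral = value == value.to_integral_value()
    return is_integral and 0 <= value <= UINT_MAX
```

All five literals listed above denote the integer ten, so each of them satisfies the predicate, whereas a literal such as 10.5 does not.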
CDDL distinguishes the various CBOR number types, but there is only CDDL distinguishes the various CBOR number types, but there is only
one number type in JSON. The effect of specifying a floating point one number type in JSON. The effect of specifying a floating-point
precision (float16/float32/float64) is only to restrict the set of precision (float16/float32/float64) is only to restrict the set of
permissible values to those expressible with binary16/binary32/ permissible values to those expressible with binary16/binary32/
binary64; this is unlikely to be very useful when using CDDL for binary64; this is unlikely to be very useful when using CDDL for
specifying JSON data structures. specifying JSON data structures.
Fundamentally, the number system of JSON itself is based on decimal Fundamentally, the number system of JSON itself is based on decimal
numbers and decimal fractions and does not have limits to its numbers and decimal fractions and does not have limits to its
precision or range. In practice, JSON numbers are often parsed into precision or range. In practice, JSON numbers are often parsed into
a number type that is called float64 here, creating a number of a number type that is called "float64" here, creating a number of
limitations to the generic data model [RFC7493]. In particular, this limitations to the generic data model [RFC7493]. In particular, this
means that integers can only be expressed with interoperable means that integers can only be expressed with interoperable
exactness when they lie in the range [-(2**53)+1, (2**53)-1] -- a exactness when they lie in the range [-(2**53)+1, (2**53)-1] -- a
smaller range than that covered by CDDL "int". smaller range than that covered by CDDL "int".
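The limit quoted above can be observed directly in any environment that parses numbers into binary64. The following Python sketch is illustrative only; the helper names `survives_float64` and `in_ijson_range` are hypothetical.

```python
# Largest magnitude that I-JSON treats as interoperably exact.
IJSON_MAX = 2**53 - 1  # 9007199254740991

def survives_float64(n):
    """True if the integer n is unchanged by a round trip through binary64."""
    return int(float(n)) == n

def in_ijson_range(n):
    """True if n lies in [-(2**53)+1, (2**53)-1]."""
    return -IJSON_MAX <= n <= IJSON_MAX

# 2**53 and 2**53 + 1 collapse to the same binary64 value, so a receiver
# cannot tell which of the two integers was sent.
print(float(2**53) == float(2**53 + 1))
print(survives_float64(2**53 + 1))
```

Note that 2**53 itself round-trips, but it lies outside the range because its neighbor 2**53 + 1 is mapped onto it.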
JSON applications that want to stay compatible with I-JSON JSON applications that want to stay compatible with I-JSON ("Internet
([RFC7493], "Internet JSON") therefore may want to define integer JSON"; see [RFC7493]) may therefore want to define integer types with
types with more limited ranges, such as in Figure 16. Note that the more limited ranges, such as in Figure 16. Note that the types given
types given here are not part of the prelude; they need to be copied here are not part of the prelude; they need to be copied into the
into the CDDL specification if needed. CDDL specification if needed.
ij-uint = 0..9007199254740991 ij-uint = 0..9007199254740991
ij-nint = -9007199254740991..-1 ij-nint = -9007199254740991..-1
ij-int = -9007199254740991..9007199254740991 ij-int = -9007199254740991..9007199254740991
Figure 16: I-JSON types for CDDL (not part of prelude) Figure 16: I-JSON Types for CDDL (Not Part of Prelude)
JSON applications that do not need to stay compatible with I-JSON and JSON applications that do not need to stay compatible with I-JSON and
that actually may need to go beyond the 64-bit unsigned and negative that actually may need to go beyond the 64-bit unsigned and negative
integers supported by "int" (= "uint"/"nint") may want to use the integers supported by "int" (= "uint"/"nint") may want to use the
following additional types from the standard prelude, which are following additional types from the standard prelude, which are
expressed in terms of tags but can straightforwardly be mapped into expressed in terms of tags but can straightforwardly be mapped into
JSON (but not I-JSON) numbers: JSON (but not I-JSON) numbers:
biguint = #6.2(bstr) biguint = #6.2(bstr)
bignint = #6.3(bstr) bignint = #6.3(bstr)
bigint = biguint / bignint bigint = biguint / bignint
integer = int / bigint integer = int / bigint
unsigned = uint / biguint unsigned = uint / biguint
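As a rough sketch of how an arbitrary-size integer relates to these tagged types, the following Python function computes the tag number and byte string content that RFC 7049 defines for bignums (tag 2 carries the magnitude n, tag 3 carries -1 - n, both as big-endian byte strings). The function name `bignum_tag_content` is hypothetical, and the sketch does not produce a complete CBOR encoding.

```python
def bignum_tag_content(n):
    """Return (tag, content) for integer n as a CBOR bignum.

    Tag 2 (biguint) holds n; tag 3 (bignint) holds -1 - n.
    The content is the unsigned big-endian byte string of that value.
    """
    if n >= 0:
        tag, magnitude = 2, n
    else:
        tag, magnitude = 3, -1 - n
    length = (magnitude.bit_length() + 7) // 8
    return tag, magnitude.to_bytes(length, "big")

tag, content = bignum_tag_content(2**64)
print(tag, content.hex())
```

A JSON serialization would simply write the integer as a decimal number, which is why these types map to JSON but not to I-JSON.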
CDDL at this point does not have a way to express the unlimited CDDL at this point does not have a way to express the unlimited
floating point precision that is theoretically possible with JSON; at floating-point precision that is theoretically possible with JSON; at
the time of writing, this is rarely used in protocols in practice. the time of writing, this is rarely used in protocols in practice.
Note that a data model described in CDDL is always restricted by what Note that a data model described in CDDL is always restricted by what
can be expressed in the serialization; e.g., floating point values can be expressed in the serialization; e.g., floating-point values
such as NaN (not a number) and the infinities cannot be represented such as NaN (not a number) and the infinities cannot be represented
in JSON even if they are allowed in the CDDL generic data model. in JSON even if they are allowed in the CDDL generic data model.
Appendix F. A CDDL tool Appendix F. A CDDL Tool
This appendix is for information only. This appendix is for information only.
A rough CDDL tool is available. For CDDL specifications, it can A rough CDDL tool is available. For CDDL specifications, it can
check the syntax, generate one or more instances (expressed in CBOR check the syntax, generate one or more instances (expressed in CBOR
diagnostic notation or in pretty-printed JSON), and validate an diagnostic notation or in pretty-printed JSON), and validate an
existing instance against the specification: existing instance against the specification:
Usage: Usage:
cddl spec.cddl generate [n] cddl spec.cddl generate [n]
cddl spec.cddl json-generate [n] cddl spec.cddl json-generate [n]
cddl spec.cddl validate instance.cbor cddl spec.cddl validate instance.cbor
cddl spec.cddl validate instance.json cddl spec.cddl validate instance.json
Figure 17: CDDL tool usage Figure 17: CDDL Tool Usage
Install on a system with a modern Ruby via: Install on a system with a modern Ruby via:
gem install cddl gem install cddl
Figure 18: CDDL tool installation Figure 18: CDDL Tool Installation
The accompanying CBOR diagnostic tools (which are automatically The accompanying CBOR diagnostic tools (which are automatically
installed by the above) are described in https://github.com/cabo/ installed by the above) are described in <https://github.com/cabo/
cbor-diag [1]; they can be used to convert between binary CBOR, a cbor-diag>; they can be used to convert between binary CBOR, a
pretty-printed form of that, CBOR diagnostic notation, JSON, and pretty-printed hexadecimal form of binary CBOR, CBOR diagnostic
YAML. notation, JSON, and YAML [YAML].
Appendix G. Extended Diagnostic Notation Appendix G. Extended Diagnostic Notation
This appendix is normative. This appendix is normative.
Section 6 of [RFC7049] defines a "diagnostic notation" in order to be Section 6 of [RFC7049] defines a "diagnostic notation" in order to be
able to converse about CBOR data items without having to resort to able to converse about CBOR data items without having to resort to
binary data. Diagnostic notation is based on JSON, with extensions binary data. Diagnostic notation is based on JSON, with extensions
for representing CBOR constructs such as binary data and tags. for representing CBOR constructs such as binary data and tags.
(Standardizing this together with the actual interchange format does (Standardizing this together with the actual interchange format does
not serve to create another interchange format, but enables the use not serve to create another interchange format but enables the use of
of a shared diagnostic notation in tools for and documents about a shared diagnostic notation in tools for and documents about CBOR.)
CBOR.)
This section discusses a few extensions to the diagnostic notation This appendix discusses a few extensions to the diagnostic notation
that have turned out to be useful since RFC 7049 was written. We that have turned out to be useful since RFC 7049 was written. We
refer to the result as extended diagnostic notation (EDN). refer to the result as Extended Diagnostic Notation (EDN).
G.1. White space in byte string notation G.1. Whitespace in Byte String Notation
Examples often benefit from some white space (spaces, line breaks) in Examples often benefit from some whitespace (spaces, line breaks) in
byte strings. In extended diagnostic notation, white space is byte strings. In EDN, whitespace is ignored in prefixed byte
ignored in prefixed byte strings; for instance, the following are strings; for instance, the following are equivalent:
equivalent:
h'48656c6c6f20776f726c64' h'48656c6c6f20776f726c64'
h'48 65 6c 6c 6f 20 77 6f 72 6c 64' h'48 65 6c 6c 6f 20 77 6f 72 6c 64'
h'4 86 56c 6c6f h'4 86 56c 6c6f
20776 f726c64' 20776 f726c64'
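A minimal Python sketch of this rule follows. It assumes the caller has already extracted the text between the quotes of an h'...' literal, and the name `decode_hex_bstr` is hypothetical. All whitespace is removed before decoding, because whitespace may fall in the middle of a hex digit pair.

```python
def decode_hex_bstr(body):
    """Decode the body of an h'...' literal, ignoring all whitespace."""
    # str.split() with no argument splits on any run of whitespace,
    # including line breaks, so joining the pieces removes it entirely.
    return bytes.fromhex("".join(body.split()))

variants = [
    "48656c6c6f20776f726c64",
    "48 65 6c 6c 6f 20 77 6f 72 6c 64",
    "4 86 56c 6c6f\n20776 f726c64",
]
for body in variants:
    print(decode_hex_bstr(body))
```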
G.2. Text in byte string notation G.2. Text in Byte String Notation
Diagnostic notation notates Byte strings in one of the [RFC4648] base Diagnostic notation notates byte strings in one of the base encodings
encodings,, enclosed in single quotes, prefixed by >h< for base16, per [RFC4648], enclosed in single quotes, prefixed by >h< for base16,
>b32< for base32, >h32< for base32hex, >b64< for base64 or base64url. >b32< for base32, >h32< for base32hex, or >b64< for base64 or
Quite often, byte strings carry bytes that are meaningfully base64url. Quite often, byte strings carry bytes that are
interpreted as UTF-8 text. Extended Diagnostic Notation allows the meaningfully interpreted as UTF-8 text. EDN allows the use of single
use of single quotes without a prefix to express byte strings with quotes without a prefix to express byte strings with UTF-8 text; for
UTF-8 text; for instance, the following are equivalent: instance, the following are equivalent:
'hello world' 'hello world'
h'68656c6c6f20776f726c64' h'68656c6c6f20776f726c64'
The escaping rules of JSON strings are applied equivalently for text- The escaping rules of JSON strings are applied equivalently for
based byte strings, e.g., \\ stands for a single backslash and \' text-based byte strings, e.g., "\\" stands for a single backslash and
stands for a single quote. White space is included literally, i.e., "\'" stands for a single quote. Whitespace is included literally,
the previous section does not apply to text-based byte strings. i.e., the previous section does not apply to text-based byte strings.
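The equivalence between the unprefixed form and the base16 form can be checked with a short Python sketch. The helper name `text_bstr_to_hex` is hypothetical, and escape processing is left out; the sketch takes the already-unescaped text.

```python
def text_bstr_to_hex(text):
    """Return the base16 digits for the UTF-8 bytes of a text-based byte string."""
    # Whitespace is significant here: it is encoded like any other character.
    return text.encode("utf-8").hex()

print("h'" + text_bstr_to_hex("hello world") + "'")
```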
G.3. Embedded CBOR and CBOR sequences in byte strings G.3. Embedded CBOR and CBOR Sequences in Byte Strings
Where a byte string is to carry an embedded CBOR-encoded item, or Where a byte string is to carry an embedded CBOR-encoded item, or
more generally a sequence of zero or more such items, the diagnostic more generally a sequence of zero or more such items, the diagnostic
notation for these zero or more CBOR data items, separated by notation for these zero or more CBOR data items, separated by commas,
commata, can be enclosed in << and >> to notate the byte string can be enclosed in << and >> to notate the byte string resulting from
resulting from encoding the data items and concatenating the result. encoding the data items and concatenating the result. For instance,
For instance, each pair of columns in the following are equivalent: each pair of columns in the following are equivalent:
<<1>> h'01' <<1>> h'01'
<<1, 2>> h'0102' <<1, 2>> h'0102'
<<"foo", null>> h'63666F6FF6' <<"foo", null>> h'63666F6FF6'
<<>> h'' <<>> h''
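The pairs above can be reproduced with a deliberately tiny Python sketch. It is not a general CBOR encoder: it assumes only the item kinds used in these examples (unsigned integers below 24, text strings shorter than 24 bytes, and null), and the names `encode_item` and `embedded` are hypothetical.

```python
def encode_item(item):
    """Encode one data item as CBOR (tiny subset, see assumptions above)."""
    if item is None:
        return b"\xf6"                      # null
    if isinstance(item, int) and not isinstance(item, bool) and 0 <= item < 24:
        return bytes([item])                # major type 0, value in initial byte
    if isinstance(item, str):
        data = item.encode("utf-8")
        if len(data) < 24:
            return bytes([0x60 | len(data)]) + data  # major type 3, short length
    raise ValueError("item not covered by this sketch")

def embedded(*items):
    """Byte string denoted by <<item, item, ...>>: the concatenated encodings."""
    return b"".join(encode_item(item) for item in items)

print(embedded("foo", None).hex())
```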
G.4. Concatenated Strings G.4. Concatenated Strings
While the ability to include white space enables line-breaking of While the ability to include whitespace enables line-breaking of
encoded byte strings, a mechanism is needed to be able to include encoded byte strings, a mechanism is needed to be able to include
text strings as well as byte strings in direct UTF-8 representation text strings as well as byte strings in direct UTF-8 representation
into line-based documents (such as RFCs and source code). into line-based documents (such as RFCs and source code).
We extend the diagnostic notation by allowing multiple text strings We extend the diagnostic notation by allowing multiple text strings
or multiple byte strings to be notated separated by white space, or multiple byte strings to be notated separated by whitespace; these
these are then concatenated into a single text or byte string, are then concatenated into a single text or byte string,
respectively. Text strings and byte strings do not mix within such a respectively. Text strings and byte strings do not mix within such a
concatenation, except that byte string notation can be used inside a concatenation, except that byte string notation can be used inside a
sequence of concatenated text string notation to encode characters sequence of concatenated text string notation to encode characters
that may be better represented in an encoded way. The following four that may be better represented in an encoded way. The following four
values are equivalent: values are equivalent:
"Hello world" "Hello world"
"Hello " "world" "Hello " "world"
"Hello" h'20' "world" "Hello" h'20' "world"
"" h'48656c6c6f20776f726c64' "" "" h'48656c6c6f20776f726c64' ""
Similarly, the following byte string values are equivalent Similarly, the following byte string values are equivalent:
'Hello world' 'Hello world'
'Hello ' 'world' 'Hello ' 'world'
'Hello ' h'776f726c64' 'Hello ' h'776f726c64'
'Hello' h'20' 'world' 'Hello' h'20' 'world'
'' h'48656c6c6f20776f726c64' '' b64'' '' h'48656c6c6f20776f726c64' '' b64''
h'4 86 56c 6c6f' h' 20776 f726c64' h'4 86 56c 6c6f' h' 20776 f726c64'
(Note that the approach of separating by whitespace, while familiar (Note that the approach of separating by whitespace, while familiar
from the C language, requires some attention - a single comma makes a from the C language, requires some attention -- a single comma makes
big difference here.) a big difference here.)
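The concatenation rule can be modeled in Python as follows. This sketch assumes the chunks have already been parsed: a Python `str` stands for a chunk in text string notation and a Python `bytes` object stands for a chunk in byte string notation. The names `concat_text` and `concat_bytes` are hypothetical.

```python
def _raw(chunk):
    """UTF-8 bytes of a text chunk, or the bytes of a byte string chunk as-is."""
    return chunk.encode("utf-8") if isinstance(chunk, str) else chunk

def concat_text(*chunks):
    """Concatenate into one text string; byte chunks must form valid UTF-8."""
    return b"".join(_raw(c) for c in chunks).decode("utf-8")

def concat_bytes(*chunks):
    """Concatenate byte string chunks into one byte string."""
    return b"".join(chunks)

print(concat_text("Hello", bytes.fromhex("20"), "world"))
print(concat_bytes(b"Hello ", bytes.fromhex("776f726c64")))
```

In this model, the comma hazard mentioned above corresponds to passing two separate values where one concatenated value was intended.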
G.5. Hexadecimal, octal, and binary numbers G.5. Hexadecimal, Octal, and Binary Numbers
In addition to JSON's decimal numbers, EDN provides hexadecimal, In addition to JSON's decimal numbers, EDN provides hexadecimal,
octal and binary numbers in the usual C-language notation (octal with octal, and binary numbers in the usual C-language notation (octal
0o prefix present only). with 0o prefix present only).
The following are equivalent: The following are equivalent:
4711 4711
0x1267 0x1267
0o11147 0o11147
0b1001001100111 0b1001001100111
As are: As are:
1.5 1.5
0x1.8p0 0x1.8p0
0x18p-4 0x18p-4
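These equivalences can be checked with Python's built-in number parsing, which happens to accept the same prefixes. The function name `parse_edn_number` is hypothetical, and the sketch is somewhat more permissive than EDN (for example, Python also accepts underscores between digits).

```python
def parse_edn_number(text):
    """Parse a decimal, 0x, 0o, or 0b integer, or a decimal or hex float."""
    try:
        return int(text, 0)          # base 0: honor the 0x / 0o / 0b prefix
    except ValueError:
        pass
    if text.lower().lstrip("+-").startswith("0x"):
        return float.fromhex(text)   # hexadecimal float such as 0x1.8p0
    return float(text)

for literal in ["4711", "0x1267", "0o11147", "0b1001001100111"]:
    print(literal, parse_edn_number(literal))
for literal in ["1.5", "0x1.8p0", "0x18p-4"]:
    print(literal, parse_edn_number(literal))
```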
G.6. Comments G.6. Comments
Longer pieces of diagnostic notation may benefit from comments. JSON Longer pieces of diagnostic notation may benefit from comments. JSON
famously does not provide for comments, and basic RFC 7049 diagnostic famously does not provide for comments, and basic diagnostic notation
notation inherits this property. per RFC 7049 inherits this property.
In extended diagnostic notation, comments can be included, delimited In EDN, comments can be included, delimited by slashes ("/"). Any
by slashes ("/"). Any text within and including a pair of slashes is text within and including a pair of slashes is considered a comment.
considered a comment.
Comments are considered white space. Hence, they are allowed in Comments are considered whitespace. Hence, they are allowed in
prefixed byte strings; for instance, the following are equivalent: prefixed byte strings; for instance, the following are equivalent:
h'68656c6c6f20776f726c64' h'68656c6c6f20776f726c64'
h'68 65 6c /doubled l!/ 6c 6f /hello/ h'68 65 6c /doubled l!/ 6c 6f /hello/
20 /space/ 20 /space/
77 6f 72 6c 64' /world/ 77 6f 72 6c 64' /world/
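Treating comments as whitespace inside a prefixed byte string can be sketched in Python as below. The sketch assumes the body between the quotes of an h'...' literal has already been extracted, and the name `decode_commented_hex` is hypothetical.

```python
import re

def decode_commented_hex(body):
    """Decode an h'...' body, dropping /comments/ and all whitespace."""
    # A comment is a pair of slashes and everything between them.
    without_comments = re.sub(r"/[^/]*/", "", body)
    return bytes.fromhex("".join(without_comments.split()))

body = """68 65 6c /doubled l!/ 6c 6f /hello/
20 /space/
77 6f 72 6c 64"""
print(decode_commented_hex(body))
```

This works inside hex literals because "/" is not a hex digit; applying the same regular expression to a whole EDN document would be wrong, since text strings may contain slashes.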
This can be used to annotate a CBOR structure as in: This can be used to annotate a CBOR structure as in:
/grasp-message/ [/M_DISCOVERY/ 1, /session-id/ 10584416, /grasp-message/ [/M_DISCOVERY/ 1, /session-id/ 10584416,
/objective/ [/objective-name/ "opsonize", /objective/ [/objective-name/ "opsonize",
/D, N, S/ 7, /loop-count/ 105]] /D, N, S/ 7, /loop-count/ 105]]
(There are currently no end-of-line comments. If we want to add (There are currently no end-of-line comments. If we want to add
them, "//" sounds like a reasonable delimiter given that we already them, "//" sounds like a reasonable delimiter given that we already
use slashes for comments, but we also could go e.g. for "#".) use slashes for comments, but we could also go, for example,
for "#".)
Appendix H. Examples Appendix H. Examples
This appendix is for information only. This appendix is for information only.
This section contains a few examples of structures defined using This appendix contains a few examples of structures defined
CDDL. using CDDL. The theme for the examples is taken from [RFC7071],
which defines certain JSON structures in English. For a similar
The theme for the first example is taken from [RFC7071], which example, it may also be of interest to examine Appendix A of
defines certain JSON structures in English. For a similar example, [RFC8007], which contains a CDDL definition for a JSON structure
it may also be of interest to examine Appendix A of [RFC8007], which defined in the main body of that RFC.
contains a CDDL definition for a JSON structure defined in the main
body of the RFC.
The second subsection in this appendix translates examples from
[I-D.newton-json-content-rules] into CDDL.
These examples all happen to describe data that is interchanged in These examples all happen to describe data that is interchanged in
JSON. Examples for CDDL definitions of data that is interchanged in JSON. Examples for CDDL definitions of data that is interchanged in
CBOR can be found in [RFC8152], [I-D.ietf-anima-grasp], or [RFC8428]. CBOR can be found in [RFC8152], [GRASP], and [RFC8428].
H.1. RFC 7071
[RFC7071] defines the Reputon structure for JSON using somewhat [RFC7071] defines the "reputon" structure for JSON using somewhat
formalized English text. Here is a (somewhat verbose) equivalent formalized English text. Here is a (somewhat verbose) equivalent
definition using the same terms, but notated in CDDL: definition using the same terms, but notated in CDDL:
reputation-object = { reputation-object = {
reputation-context, reputation-context,
reputon-list reputon-list
} }
reputation-context = ( reputation-context = (
application: text application: text
skipping to change at page 57, line 24 skipping to change at page 61, line 36
rating: float16 rating: float16
? confidence: float16 ? confidence: float16
? normal-rating: float16 ? normal-rating: float16
? sample-size: uint ? sample-size: uint
? generated: uint ? generated: uint
? expires: uint ? expires: uint
* text => any * text => any
} }
Note how this rather clearly delineates the structure somewhat Note how this rather clearly delineates the structure somewhat
shrouded by so many words in section 6.2.2. of [RFC7071]. Also, this shrouded by so many words in Section 6.2.2 of [RFC7071]. Also, this
definition makes it clear that several ext-values are allowed (by definition makes it clear that several ext-values are allowed (by
definition with different member names); RFC 7071 could be read to definition with different member names); RFC 7071 could be read to
forbid the repetition of ext-value ("A specific reputon-element MUST forbid the repetition of ext-value ("A specific reputon-element
NOT appear more than once" is ambiguous.) MUST NOT appear more than once" is ambiguous).
The CDDL tool reported on in Appendix F generates as one example:
{
"application": "conchometry",
"reputons": [
{
"rater": "Ephthianura",
"assertion": "codding",
"rated": "sphaerolitic",
"rating": 0.34133473256800795,
"confidence": 0.9481983064298332,
"expires": 1568,
"unplaster": "grassy"
},
{
"rater": "nonchargeable",
"assertion": "raglan",
"rated": "alienage",
"rating": 0.5724646875815566,
"sample-size": 3514,
"Aldebaran": "unchurched",
"puruloid": "impersonable",
"uninfracted": "pericarpoidal",
"schorl": "Caro"
},
{
"rater": "precollectable",
"assertion": "Merat",
"rated": "thermonatrite",
"rating": 0.19164006323936977,
"confidence": 0.6065252103391268,
"normal-rating": 0.5187773690879303,
"generated": 899,
"speedy": "solidungular",
"noviceship": "medicine",
"checkrow": "epidictic"
}
]
}
H.2. Examples from JSON Content Rules
Although JSON Content Rules [I-D.newton-json-content-rules] seems to
address a more general problem than CDDL, it is still a worthwhile
resource to explore for examples (beyond all the inspiration the
format itself has had for CDDL).
Figure 2 of the JCR I-D looks very similar, if slightly less noisy,
in CDDL:
root = [2*2 {
precision: text,
Latitude: float,
Longitude: float,
Address: text,
City: text,
State: text,
Zip: text,
Country: text
}]
Figure 19: JCR, Figure 2, in CDDL
Apart from the lack of a need to quote the member names, text strings
are called "text" or "tstr" in CDDL ("string" would be ambiguous as
CBOR also provides byte strings).
The CDDL tool reported on in Appendix F creates the below example
instance for this:
[{"precision": "pyrosphere", "Latitude": 0.5399712314350172,
"Longitude": 0.5157523963028087, "Address": "resow",
"City": "problemwise", "State": "martyrlike", "Zip": "preprove",
"Country": "Pace"},
{"precision": "unrigging", "Latitude": 0.10422704368372193,
"Longitude": 0.6279808663725834, "Address": "picturedom",
"City": "decipherability", "State": "autometry", "Zip": "pout",
"Country": "wimple"}]
Figure 4 of the JCR I-D in CDDL:
root = { image }
image = (
Image: {
size,
Title: text,
thumbnail,
IDs: [* int]
}
)
size = (
Width: 0..1280
Height: 0..1024
)
thumbnail = (
Thumbnail: {
size,
Url: ~uri
}
)
This shows how the group concept can be used to keep related elements The CDDL tool described in Appendix F generates as one example:
(here: width, height) together, and to emulate the JCR style of
specification. (It also shows referencing a type by unwrapping a tag
from the prelude, "uri" - this could be done differently.) The more
compact form of Figure 5 of the JCR I-D could be emulated like this:
root = { {
Image: { "application": "conchometry",
size, Title: text, "reputons": [
Thumbnail: { size, Url: ~uri }, {
IDs: [* int] "rater": "Ephthianura",
"assertion": "codding",
"rated": "sphaerolitic",
"rating": 0.34133473256800795,
"confidence": 0.9481983064298332,
"expires": 1568,
"unplaster": "grassy"
},
{
"rater": "nonchargeable",
"assertion": "raglan",
"rated": "alienage",
"rating": 0.5724646875815566,
"sample-size": 3514,
"Aldebaran": "unchurched",
"puruloid": "impersonable",
"uninfracted": "pericarpoidal",
"schorl": "Caro"
},
{
"rater": "precollectable",
"assertion": "Merat",
"rated": "thermonatrite",
"rating": 0.19164006323936977,
"confidence": 0.6065252103391268,
"normal-rating": 0.5187773690879303,
"generated": 899,
"speedy": "solidungular",
"noviceship": "medicine",
"checkrow": "epidictic"
} }
} ]
}
size = (
Width: 0..1280,
Height: 0..1024,
)
The CDDL tool reported on in Appendix F creates the below example
instance for this:
{"Image": {"Width": 566, "Height": 516, "Title": "leisterer",
"Thumbnail": {"Width": 1111, "Height": 176, "Url": 32("scrog")},
"IDs": []}}
Contributors
CDDL was originally conceived by Bert Greevenbosch, who also wrote
the original five versions of this document.
Acknowledgements Acknowledgements
Inspiration was taken from the C and Pascal languages, MPEG's Inspiration was taken from the C and Pascal languages, MPEG's
conventions for describing structures in the ISO base media file conventions for describing structures in the ISO base media file
format, Relax-NG and its compact syntax [RELAXNG], and in particular format, RELAX NG and its compact syntax [RELAXNG], and, in
from Andrew Lee Newton's "JSON Content Rules" particular, Andrew Lee Newton's early proposals on JSON Content Rules
[I-D.newton-json-content-rules]. (JCR) as found in draft version four (-04) of [JCR].
Lots of highly useful feedback came from members of the IETF CBOR WG, Lots of highly useful feedback came from members of the IETF CBOR WG
in particular Ari Keraenen, Brian Carpenter, Burt Harris, Jeffrey -- in particular, Ari Keranen, Brian Carpenter, Burt Harris, Jeffrey
Yasskin, Jim Hague, Jim Schaad, Joe Hildebrand, Max Pritikin, Michael Yasskin, Jim Hague, Jim Schaad, Joe Hildebrand, Max Pritikin, Michael
Richardson, Pete Cordell, Sean Leonard, and Yaron Sheffer. Also, Richardson, Pete Cordell, Sean Leonard, and Yaron Sheffer. Also,
Francesca Palombini and Joe volunteered to chair the WG when it was Francesca Palombini and Joe volunteered to chair the WG when it was
created, providing the framework for generating and processing this created, providing the framework for generating and processing this
feedback; with Barry Leiba having taken over from Joe since. Chris feedback, with Barry Leiba having taken over from Joe since then.
Lonvick and Ines Robles provided additional reviews during IESG Chris Lonvick and Ines Robles provided additional reviews during IESG
processing, and Alexey Melnikov steered the process as the processing, and Alexey Melnikov steered the process as the
responsible area director. responsible Area Director.
The CDDL tool reported on in Appendix F was written by Carsten The CDDL tool described in Appendix F was written by Carsten Bormann,
Bormann, building on previous work by Troy Heninger and Tom Lord. building on previous work by Troy Heninger and Tom Lord.
Contributors
CDDL was originally conceived by Bert Greevenbosch, who also wrote
the original five draft versions of this document.
Authors' Addresses Authors' Addresses
Henk Birkholz Henk Birkholz
Fraunhofer SIT Fraunhofer SIT
Rheinstrasse 75 Rheinstrasse 75
Darmstadt 64295 Darmstadt 64295
Germany Germany
Email: henk.birkholz@sit.fraunhofer.de Email: henk.birkholz@sit.fraunhofer.de
 End of changes. 312 change blocks. 
872 lines changed or deleted 777 lines changed or added
