< draft-ietf-cbor-array-tags-04.txt   draft-ietf-cbor-array-tags-05.txt >
Network Working Group C. Bormann, Ed. Network Working Group C. Bormann, Ed.
Internet-Draft Universitaet Bremen TZI Internet-Draft Universitaet Bremen TZI
Intended status: Informational May 22, 2019 Intended status: Informational June 20, 2019
Expires: November 23, 2019 Expires: December 22, 2019
Concise Binary Object Representation (CBOR) Tags for Typed Arrays Concise Binary Object Representation (CBOR) Tags for Typed Arrays
draft-ietf-cbor-array-tags-04 draft-ietf-cbor-array-tags-05
Abstract Abstract
The Concise Binary Object Representation (CBOR, RFC 7049) is a data The Concise Binary Object Representation (CBOR, RFC 7049) is a data
format whose design goals include the possibility of extremely small format whose design goals include the possibility of extremely small
code size, fairly small message size, and extensibility without the code size, fairly small message size, and extensibility without the
need for version negotiation. need for version negotiation.
The present document makes use of this extensibility to define a The present document makes use of this extensibility to define a
number of CBOR tags for typed arrays of numeric data, as well as two number of CBOR tags for typed arrays of numeric data, as well as two
skipping to change at page 1, line 39 skipping to change at page 1, line 39
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on November 23, 2019. This Internet-Draft will expire on December 22, 2019.
Copyright Notice Copyright Notice
Copyright (c) 2019 IETF Trust and the persons identified as the Copyright (c) 2019 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of (https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 3, line 32 skipping to change at page 3, line 32
"OPTIONAL" in this document are to be interpreted as described in "OPTIONAL" in this document are to be interpreted as described in
BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here. capitals, as shown here.
The term "byte" is used in its now customary sense as a synonym for The term "byte" is used in its now customary sense as a synonym for
"octet". Where bit arithmetic is explained, this document uses the "octet". Where bit arithmetic is explained, this document uses the
notation familiar from the programming language C (including C++14's notation familiar from the programming language C (including C++14's
0bnnn binary literals), except that the operator "**" stands for 0bnnn binary literals), except that the operator "**" stands for
exponentiation. exponentiation.
The term "array" is used in a general sense in this document, unless
further specified. The term "classical CBOR array" describes an
array represented with CBOR major type 4. A "homogeneous array" is
an array of elements that are all of the same type (the term is
neutral whether that is a representation type or an application data
model type).
2. Typed Arrays 2. Typed Arrays
Typed arrays are homogeneous arrays of numbers, all of which are Typed arrays are homogeneous arrays of numbers, all of which are
encoded in a single form of binary representation. The concatenation encoded in a single form of binary representation. The concatenation
of these representations is encoded as a single CBOR byte string of these representations is encoded as a single CBOR byte string
(major type 2), enclosed by a single tag indicating the type and (major type 2), enclosed by a single tag indicating the type and
encoding of all the numbers represented in the byte string. encoding of all the numbers represented in the byte string.
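As an illustration of the layout just described, the following C++ sketch wraps a sequence of 16-bit unsigned integers in tag 65 (uint16, big endian). The function name `encode_uint16be_array` is hypothetical and not part of the draft, and the sketch assumes payloads of at most 255 bytes so that only the two shortest CBOR length forms are needed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: encodes `values` as a Typed Array with tag 65
// (uint16, big endian). Returns false when the payload needs a length
// form this sketch does not implement (more than 255 bytes).
inline bool encode_uint16be_array(const std::vector<std::uint16_t>& values,
                                  std::vector<std::uint8_t>& out) {
    const std::size_t payload = values.size() * 2;
    if (payload > 0xFF) {
        return false;  // longer length forms are omitted in this sketch
    }
    out.clear();
    out.push_back(0xD8);  // major type 6 (tag), one-byte tag number follows
    out.push_back(65);    // tag 65: uint16, big endian
    if (payload < 24) {
        // major type 2 (byte string), length carried in the initial byte
        out.push_back(static_cast<std::uint8_t>(0x40 | payload));
    } else {
        out.push_back(0x58);  // major type 2, one-byte length follows
        out.push_back(static_cast<std::uint8_t>(payload));
    }
    for (std::uint16_t v : values) {
        out.push_back(static_cast<std::uint8_t>(v >> 8));    // high byte first
        out.push_back(static_cast<std::uint8_t>(v & 0xFF));  // then low byte
    }
    return true;
}
```

Under these assumptions, the two values 1 and 2 would be encoded as D8 41 44 00 01 00 02: the tag, a byte string head of length 4, and the concatenated big endian representations.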
2.1. Types of numbers 2.1. Types of numbers
skipping to change at page 4, line 19 skipping to change at page 4, line 21
| 1 | uint16 | sint16 | binary32 | | 1 | uint16 | sint16 | binary32 |
| 2 | uint32 | sint32 | binary64 | | 2 | uint32 | sint32 | binary64 |
| 3 | uint64 | sint64 | binary128 | | 3 | uint64 | sint64 | binary128 |
+-----------+--------+--------+-----------+ +-----------+--------+--------+-----------+
Table 1: Length values Table 1: Length values
Here, sintN stands for a signed integer of exactly N bits (for Here, sintN stands for a signed integer of exactly N bits (for
instance, sint16), and uintN stands for an unsigned integer of instance, sint16), and uintN stands for an unsigned integer of
exactly N bits (for instance, uint32). The name binaryN stands for exactly N bits (for instance, uint32). The name binaryN stands for
the number form of the same name defined in IEEE 754. the number form of the same name defined in IEEE 754 [IEEE754].
Since one objective of these tags is to be able to directly ship the Since one objective of these tags is to be able to directly ship the
ArrayBuffers underlying the Typed Arrays without re-encoding them, ArrayBuffers underlying the Typed Arrays without re-encoding them,
and these may be either in big endian (network byte order) or in and these may be either in big endian (network byte order) or in
little endian form, we need to define tags for both variants. little endian form, we need to define tags for both variants.
In total, this leads to 24 variants. In the tag, we need to express In total, this leads to 24 variants. In the tag, we need to express
the choice between integer and floating point, the signedness (for the choice between integer and floating point, the signedness (for
integers), the endianness, and one of the four length values. integers), the endianness, and one of the four length values.
skipping to change at page 5, line 4 skipping to change at page 5, line 9
| e | 0 for big endian, 1 for little endian | | e | 0 for big endian, 1 for little endian |
| ll | A number for the length (Table 1). | | ll | A number for the length (Table 1). |
+-------+-------------------------------------------------------+ +-------+-------------------------------------------------------+
Table 2: Bit fields in the low 8 bits of the tag Table 2: Bit fields in the low 8 bits of the tag
The number of bytes in each array element can then be calculated by The number of bytes in each array element can then be calculated by
"2**(f + ll)" (or "1 << (f + ll)" in a typical programming language). "2**(f + ll)" (or "1 << (f + ll)" in a typical programming language).
(Notice that 0f and ll are the two least significant bits, (Notice that 0f and ll are the two least significant bits,
respectively, of each nibble (4 bits) in the byte.) respectively, of each nibble (4 bits) in the byte.)
In the CBOR representation, the total number of elements in the array In the CBOR representation, the total number of elements in the array
is not expressed explicitly, but implied from the length of the byte is not expressed explicitly, but implied from the length of the byte
string and the length of each representation. It can be computed string and the length of each representation. It can be computed
inversely to the previous formula from the length of the byte string inversely to the previous formula from the length of the byte string
in bytes: "bytelength >> (f + ll)". in bytes: "bytelength >> (f + ll)".
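The two formulas can be sketched in C++ as follows. The bit positions used here (f in bit 4, s in bit 3, e in bit 2, ll in bits 1 and 0 of the tag number) are an assumption inferred from the tag numbers listed in the CDDL section, for instance 64 for uint8, 68 for the clamped variant, and 72 for sint8. The struct and function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical holder for the bit fields of a Typed Array tag number.
struct TagFields {
    unsigned f;   // 0: integer, 1: floating point
    unsigned s;   // 0: unsigned, 1: signed (integers only)
    unsigned e;   // 0: big endian, 1: little endian
    unsigned ll;  // length value (Table 1)
};

// Splits a tag number in the range 64..95 into its bit fields.
// Assumed layout of the low 8 bits: 0b010_f_s_e_ll.
inline TagFields split_tag(std::uint64_t tag) {
    assert(tag >= 64 && tag <= 95);
    TagFields t;
    t.f = static_cast<unsigned>((tag >> 4) & 1);
    t.s = static_cast<unsigned>((tag >> 3) & 1);
    t.e = static_cast<unsigned>((tag >> 2) & 1);
    t.ll = static_cast<unsigned>(tag & 3);
    return t;
}

// Bytes per array element: 1 << (f + ll).
inline std::size_t element_bytes(std::uint64_t tag) {
    const TagFields t = split_tag(tag);
    return static_cast<std::size_t>(1) << (t.f + t.ll);
}

// Number of elements implied by the byte string length: bytelength >> (f + ll).
inline std::size_t element_count(std::uint64_t tag, std::size_t bytelength) {
    const TagFields t = split_tag(tag);
    return bytelength >> (t.f + t.ll);
}
```

For example, tag 65 (uint16, big endian) has f = 0 and ll = 1, so each element occupies 2 bytes and a 6-byte string holds 3 elements.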
For the uint8/sint8 values, the endianness is redundant. Only the For the uint8/sint8 values, the endianness is redundant. Only the
big endian variant is used. The Tag that would signify the little tag for the big endian variant is used and assigned as such. The Tag
endian variant of sint8 MUST NOT be used, its tag number is marked as that would signify the little endian variant of sint8 MUST NOT be
reserved. As a special case, the Tag that would signify the little used, its tag number is marked as reserved. As a special case, the
endian variant of uint8 is instead assigned to signify that the Tag that would signify the little endian variant of uint8 is instead
numbers in the array are using clamped conversion from integers, as assigned to signify that the numbers in the array are using clamped
described in more detail in Section 7.1.11 ("ToUint8Clamp") of the conversion from integers, as described in more detail in
ES6 JavaScript specification [TypedArrayES6]; the assumption here is Section 7.1.11 ("ToUint8Clamp") of the ES6 JavaScript specification
that a program-internal representation of this array after decoding [TypedArrayES6]; the assumption here is that a program-internal
would be marked this way for further processing, providing representation of this array after decoding would be marked this way
"roundtripping" of JavaScript typed arrays through CBOR. for further processing, providing "roundtripping" of JavaScript typed
arrays through CBOR.
IEEE 754 binary floating numbers are always signed. Therefore, for IEEE 754 binary floating numbers are always signed. Therefore, for
the float variants ("f" == 1), there is no need to distinguish the float variants ("f" == 1), there is no need to distinguish
between signed and unsigned variants; the "s" bit is always zero. between signed and unsigned variants; the "s" bit is always zero.
The Tag numbers where "s" would be one (which would have Tag values
88 to 95) remain free to use by other specifications.
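The special cases described above can be summarized in a small classifier. This is a sketch only: the enumeration and function names are hypothetical, and the bit layout (f in bit 4, s in bit 3, e in bit 2, ll in the two low bits) is assumed from the tag numbers given in this document.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical classification of tag numbers around the Typed Array range.
enum class TagKind {
    NotTypedArray,  // outside 64..95
    TypedArray,     // regular typed array tag
    Uint8Clamped,   // "little endian uint8" slot, reassigned to clamped uint8
    Reserved,       // "little endian sint8" slot, MUST NOT be used
    Unassigned      // float tags with s == 1 (88..95), left to other specifications
};

inline TagKind classify_tag(std::uint64_t tag) {
    if (tag < 64 || tag > 95) {
        return TagKind::NotTypedArray;
    }
    const unsigned f = static_cast<unsigned>((tag >> 4) & 1);
    const unsigned s = static_cast<unsigned>((tag >> 3) & 1);
    const unsigned e = static_cast<unsigned>((tag >> 2) & 1);
    const unsigned ll = static_cast<unsigned>(tag & 3);
    if (f == 1 && s == 1) {
        return TagKind::Unassigned;  // floats are always signed; s stays zero
    }
    if (f == 0 && ll == 0 && e == 1) {
        // 8-bit integers have no endianness; the e == 1 slots are special
        return s == 1 ? TagKind::Reserved : TagKind::Uint8Clamped;
    }
    return TagKind::TypedArray;
}
```

With this layout, tag 68 is classified as the clamped uint8 variant, tag 76 as reserved, and tags 88 to 95 as unassigned.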
3. Additional Array Tags 3. Additional Array Tags
This specification defines three additional array tags. The Multi- This specification defines three additional array tags. The Multi-
dimensional Array tags can be combined with classical CBOR arrays as dimensional Array tags can be combined with classical CBOR arrays as
well as with Typed Arrays in order to build multi-dimensional arrays well as with Typed Arrays in order to build multi-dimensional arrays
with constant numbers of elements in the sub-arrays. The Homogeneous with constant numbers of elements in the sub-arrays. The Homogeneous
Array tag can be used to facilitate the ingestion of homogeneous Array tag can be used as a signal by an application to identify a
classical CBOR arrays, providing performance advantages even when a classical CBOR array as a homogeneous array, even when a Typed Array
Typed Array does not apply. does not apply.
3.1. Multi-dimensional Array 3.1. Multi-dimensional Array
A multi-dimensional array is represented as a tagged array that A multi-dimensional array is represented as a tagged array that
contains two (one-dimensional) arrays. The first array defines the contains two (one-dimensional) arrays. The first array defines the
dimensions of the multi-dimensional array (in the sequence of outer dimensions of the multi-dimensional array (in the sequence of outer
dimensions towards inner dimensions) while the second array dimensions towards inner dimensions) while the second array
represents the contents of the multi-dimensional array. If the represents the contents of the multi-dimensional array. If the
second array is itself tagged as a Typed Array then the element type second array is itself tagged as a Typed Array then the element type
of the multi-dimensional array is known to be the same type as that of the multi-dimensional array is known to be the same type as that
skipping to change at page 8, line 11 skipping to change at page 8, line 11
Figure 3: Multi-dimensional array using basic CBOR array, column Figure 3: Multi-dimensional array using basic CBOR array, column
major order major order
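To make the difference between the two orders concrete, the sketch below computes the position of an element in the flat contents array from its multi-dimensional index. The function name `flat_index` is hypothetical, and the sketch assumes the usual conventions: in row-major order the last index varies fastest, in column-major order the first index varies fastest. The dimensions are given from outer to inner, as in the first array of the tagged pair.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: maps a multi-dimensional index to an offset in the
// flat contents array. `dims` lists the dimensions from outer to inner.
inline std::size_t flat_index(const std::vector<std::size_t>& dims,
                              const std::vector<std::size_t>& idx,
                              bool row_major) {
    assert(dims.size() == idx.size());
    std::size_t offset = 0;
    if (row_major) {
        // Last index varies fastest: fold from the outermost dimension inwards.
        for (std::size_t k = 0; k < dims.size(); ++k) {
            offset = offset * dims[k] + idx[k];
        }
    } else {
        // First index varies fastest: fold from the innermost dimension outwards.
        for (std::size_t k = dims.size(); k-- > 0;) {
            offset = offset * dims[k] + idx[k];
        }
    }
    return offset;
}
```

For a 2 by 3 array, the element at index [1][0] would sit at offset 3 in row-major order and at offset 1 in column-major order.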
3.2. Homogeneous Array 3.2. Homogeneous Array
Tag: 41 Tag: 41
Data Item: array (major type 4) Data Item: array (major type 4)
This tag provides a hint to decoders that the CBOR array (major type This tag identifies the classical CBOR array (a one-dimensional
4, a one-dimensional array) tagged by it has elements that are all of array) tagged by it as a homogeneous array, that is, it has elements
the same application type. The element type of the array is thus that are all of the same application model data type. The element
determined by the application type of the first array element. This type of the array is thus determined by the application model data
can be used by implementations in strongly typed languages while type of the first array element.
decoding to create native homogeneous arrays of specific types
instead of ordered lists.
Which CBOR data items constitute elements of the same application This can be used in application data models that apply specific
type is specific to the application. However, type systems of semantics to homogeneous arrays. Also, in certain cases,
programming languages have enough commonality that an application implementations in strongly typed languages may be able to create
should be able to create portable homogeneous arrays. native homogeneous arrays of specific types instead of ordered lists
while decoding. Which CBOR data items constitute elements of the
same application type is specific to the application.
Figure 4 shows an example for a homogeneous array of booleans in C++ Figure 4 shows an example for a homogeneous array of booleans in C++
and CBOR. and CBOR.
bool boolArray[2] = { true, false }; bool boolArray[2] = { true, false };
<Tag 41> # Homogeneous Array Tag <Tag 41> # Homogeneous Array Tag
82 #array(2) 82 #array(2)
F5 # true F5 # true
F4 # false F4 # false
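The byte sequence in this example could be produced by an encoder along the following lines. This is a minimal sketch: the function name `encode_homogeneous_bools` is hypothetical, tag 41 is written in its two-byte form D8 29, and arrays of 24 or more elements are rejected so that the array length always fits in the initial byte.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: encodes a short array of booleans as a classical
// CBOR array wrapped in the Homogeneous Array tag (41).
inline bool encode_homogeneous_bools(const std::vector<bool>& values,
                                     std::vector<std::uint8_t>& out) {
    if (values.size() >= 24) {
        return false;  // longer array heads are omitted in this sketch
    }
    out.clear();
    out.push_back(0xD8);  // major type 6 (tag), one-byte tag number follows
    out.push_back(41);    // Homogeneous Array tag
    // major type 4 (array), element count carried in the initial byte
    out.push_back(static_cast<std::uint8_t>(0x80 | values.size()));
    for (bool v : values) {
        out.push_back(v ? 0xF5 : 0xF4);  // CBOR true / false
    }
    return true;
}
```

For the two-element array shown above, the sketch yields D8 29 82 F5 F4.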
skipping to change at page 9, line 39 skipping to change at page 9, line 39
This specification allocates a sizable chunk out of the single-byte This specification allocates a sizable chunk out of the single-byte
tag space. This use of code point space is justified by the wide use tag space. This use of code point space is justified by the wide use
of typed arrays in data interchange. of typed arrays in data interchange.
Providing a column-major order variant of the multi-dimensional array Providing a column-major order variant of the multi-dimensional array
may seem superfluous to some, and useful to others. It is cheap to may seem superfluous to some, and useful to others. It is cheap to
define the additional tag so it is available when actually needed. define the additional tag so it is available when actually needed.
Allocating it out of a different number space makes the preference Allocating it out of a different number space makes the preference
for row-major evident. for row-major evident.
Applying a Homogeneous Array tag to a Typed Array would be redundant Applying a Homogeneous Array tag to a Typed Array would usually be
and is therefore not provided by the present specification. redundant and is therefore not provided by the present specification.
5. CDDL typenames 5. CDDL typenames
For the use with CDDL [I-D.ietf-cbor-cddl], the typenames defined in For the use with CDDL [RFC8610], the typenames defined in Figure 6
Figure 6 are recommended: are recommended:
ta-uint8 = #6.64(bstr) ta-uint8 = #6.64(bstr)
ta-uint16be = #6.65(bstr) ta-uint16be = #6.65(bstr)
ta-uint32be = #6.66(bstr) ta-uint32be = #6.66(bstr)
ta-uint64be = #6.67(bstr) ta-uint64be = #6.67(bstr)
ta-uint8-clamped = #6.68(bstr) ta-uint8-clamped = #6.68(bstr)
ta-uint16le = #6.69(bstr) ta-uint16le = #6.69(bstr)
ta-uint32le = #6.70(bstr) ta-uint32le = #6.70(bstr)
ta-uint64le = #6.71(bstr) ta-uint64le = #6.71(bstr)
ta-sint8 = #6.72(bstr) ta-sint8 = #6.72(bstr)
skipping to change at page 14, line 9 skipping to change at page 14, line 9
item that a maliciously constructed CBOR input can then choose to item that a maliciously constructed CBOR input can then choose to
ignore. As always, the decoder therefore has to ensure that it is ignore. As always, the decoder therefore has to ensure that it is
not driven into an undefined state by array elements that do not not driven into an undefined state by array elements that do not
fulfill the promise and that it does continue to fulfill its API fulfill the promise and that it does continue to fulfill its API
contract in this case as well. contract in this case as well.
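One way a decoder might guard against a broken homogeneity promise is sketched below for the simple case of a boolean array under tag 41. The function name `decode_homogeneous_bools` is hypothetical, and the sketch assumes a short definite-length array as its only input form. Every element is checked, and the output is left empty when any element is not a boolean, so the caller never sees a partially filled result.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: decodes a tag-41 array of booleans and rejects input
// whose elements do not keep the homogeneity promise.
inline bool decode_homogeneous_bools(const std::vector<std::uint8_t>& in,
                                     std::vector<bool>& out) {
    out.clear();
    if (in.size() < 3 || in[0] != 0xD8 || in[1] != 41) {
        return false;  // not tag 41 in its two-byte form
    }
    const std::uint8_t head = in[2];
    if ((head >> 5) != 4 || (head & 0x1F) >= 24) {
        return false;  // only short definite-length arrays are handled
    }
    const std::size_t n = head & 0x1F;
    if (in.size() != 3 + n) {
        return false;  // truncated input or trailing bytes
    }
    std::vector<bool> decoded;
    for (std::size_t i = 0; i < n; ++i) {
        const std::uint8_t b = in[3 + i];
        if (b == 0xF5) {
            decoded.push_back(true);
        } else if (b == 0xF4) {
            decoded.push_back(false);
        } else {
            return false;  // element breaks the promise made by the tag
        }
    }
    out = decoded;
    return true;
}
```

Collecting into a temporary vector and assigning only on success is one possible design choice for keeping the API contract intact on malformed input.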
8. References 8. References
8.1. Normative References 8.1. Normative References
[I-D.ietf-cbor-cddl] [IEEE754] IEEE, "IEEE Standard for Floating-Point Arithmetic", IEEE
Birkholz, H., Vigano, C., and C. Bormann, "Concise data Std 754-2008.
definition language (CDDL): a notational convention to
express CBOR and JSON data structures", draft-ietf-cbor-
cddl-08 (work in progress), March 2019.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997, DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>. <https://www.rfc-editor.org/info/rfc2119>.
[RFC7049] Bormann, C. and P. Hoffman, "Concise Binary Object [RFC7049] Bormann, C. and P. Hoffman, "Concise Binary Object
Representation (CBOR)", RFC 7049, DOI 10.17487/RFC7049, Representation (CBOR)", RFC 7049, DOI 10.17487/RFC7049,
October 2013, <https://www.rfc-editor.org/info/rfc7049>. October 2013, <https://www.rfc-editor.org/info/rfc7049>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/info/rfc8174>. May 2017, <https://www.rfc-editor.org/info/rfc8174>.
[RFC8610] Birkholz, H., Vigano, C., and C. Bormann, "Concise Data
Definition Language (CDDL): A Notational Convention to
Express Concise Binary Object Representation (CBOR) and
JSON Data Structures", RFC 8610, DOI 10.17487/RFC8610,
June 2019, <https://www.rfc-editor.org/info/rfc8610>.
8.2. Informative References 8.2. Informative References
[ArrayBuffer] [ArrayBuffer]
Mozilla Developer Network, "JavaScript typed arrays", Mozilla Developer Network, "JavaScript typed arrays",
2013, <https://developer.mozilla.org/en- 2013, <https://developer.mozilla.org/en-
US/docs/Web/JavaScript/Typed_arrays>. US/docs/Web/JavaScript/Typed_arrays>.
[TypedArray] [TypedArray]
Vukicevic, V. and K. Russell, "Typed Array Specification", Vukicevic, V. and K. Russell, "Typed Array Specification",
February 2011. February 2011.
skipping to change at page 15, line 7 skipping to change at page 15, line 10
The initial draft for this specification was written by Johnathan The initial draft for this specification was written by Johnathan
Roatch (roatch@gmail.com). Many thanks for getting this ball Roatch (roatch@gmail.com). Many thanks for getting this ball
rolling. rolling.
Glenn Engel suggested the tags for multi-dimensional arrays and Glenn Engel suggested the tags for multi-dimensional arrays and
homogeneous arrays. homogeneous arrays.
Acknowledgements Acknowledgements
Jim Schaad reminded us that column-major order still is in use. IANA Jim Schaad provided helpful comments and reminded us that column-
helped correct an error in a previous version. major order still is in use. Jeffrey Yasskin helped improve the
definition of homogeneous arrays. IANA helped correct an error in a
previous version.
Author's Address Author's Address
Carsten Bormann (editor) Carsten Bormann (editor)
Universitaet Bremen TZI Universitaet Bremen TZI
Postfach 330440 Postfach 330440
Bremen D-28359 Bremen D-28359
Germany Germany
Phone: +49-421-218-63921 Phone: +49-421-218-63921
 End of changes. 16 change blocks. 
40 lines changed or deleted 56 lines changed or added
