draft-ietf-avtcore-rtp-vvc-01.txt   draft-ietf-avtcore-rtp-vvc-02.txt 
avtcore S. Zhao avtcore S. Zhao
Internet-Draft S. Wenger Internet-Draft S. Wenger
Intended status: Standards Track Tencent Intended status: Standards Track Tencent
Expires: October 1, 2020 Y. Sanchez Expires: January 12, 2021 Y. Sanchez
Fraunhofer HHI Fraunhofer HHI
March 30, 2020 July 11, 2020
RTP Payload Format for Versatile Video Coding (VVC) RTP Payload Format for Versatile Video Coding (VVC)
draft-ietf-avtcore-rtp-vvc-01 draft-ietf-avtcore-rtp-vvc-02
Abstract Abstract
This memo describes an RTP payload format for the video coding This memo describes an RTP payload format for the video coding
standard ITU-T Recommendation [H.266] and ISO/IEC International standard ITU-T Recommendation [H.266] and ISO/IEC International
Standard [ISO23090-3], both also known as Versatile Video Coding Standard [ISO23090-3], both also known as Versatile Video Coding
(VVC) and developed by the Joint Video Experts Team (JVET). The RTP (VVC) and developed by the Joint Video Experts Team (JVET). The RTP
payload format allows for packetization of one or more Network payload format allows for packetization of one or more Network
Abstraction Layer (NAL) units in each RTP packet payload as well as Abstraction Layer (NAL) units in each RTP packet payload as well as
fragmentation of a NAL unit into multiple RTP packets. The payload fragmentation of a NAL unit into multiple RTP packets. The payload
skipping to change at page 1, line 41 skipping to change at page 1, line 41
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on October 1, 2020. This Internet-Draft will expire on January 12, 2021.
Copyright Notice Copyright Notice
Copyright (c) 2020 IETF Trust and the persons identified as the Copyright (c) 2020 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of (https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1. Overview of the VVC Codec . . . . . . . . . . . . . . . . 3 1.1. Overview of the VVC Codec . . . . . . . . . . . . . . . . 3
1.1.1. Coding-Tool Features (informative) . . . . . . . . . 3 1.1.1. Coding-Tool Features (informative) . . . . . . . . . 4
1.1.2. Systems and Transport Interfaces . . . . . . . . . . 6 1.1.2. Systems and Transport Interfaces . . . . . . . . . . 6
1.1.3. Parallel Processing Support (informative) . . . . . . 10 1.1.3. Parallel Processing Support (informative) . . . . . . 10
1.1.4. NAL Unit Header . . . . . . . . . . . . . . . . . . . 10 1.1.4. NAL Unit Header . . . . . . . . . . . . . . . . . . . 11
1.2. Overview of the Payload Format . . . . . . . . . . . . . 12 1.2. Overview of the Payload Format . . . . . . . . . . . . . 12
2. Conventions . . . . . . . . . . . . . . . . . . . . . . . . . 12 2. Conventions . . . . . . . . . . . . . . . . . . . . . . . . . 12
3. Definitions and Abbreviations . . . . . . . . . . . . . . . . 12 3. Definitions and Abbreviations . . . . . . . . . . . . . . . . 12
3.1. Definitions . . . . . . . . . . . . . . . . . . . . . . . 12 3.1. Definitions . . . . . . . . . . . . . . . . . . . . . . . 12
3.1.1. Definitions from the VVC Specification . . . . . . . 13 3.1.1. Definitions from the VVC Specification . . . . . . . 13
3.1.2. Definitions Specific to This Memo . . . . . . . . . . 16 3.1.2. Definitions Specific to This Memo . . . . . . . . . . 16
3.2. Abbreviations . . . . . . . . . . . . . . . . . . . . . . 16 3.2. Abbreviations . . . . . . . . . . . . . . . . . . . . . . 16
4. RTP Payload Format . . . . . . . . . . . . . . . . . . . . . 17 4. RTP Payload Format . . . . . . . . . . . . . . . . . . . . . 17
4.1. RTP Header Usage . . . . . . . . . . . . . . . . . . . . 18 4.1. RTP Header Usage . . . . . . . . . . . . . . . . . . . . 18
4.2. Payload Header Usage . . . . . . . . . . . . . . . . . . 19 4.2. Payload Header Usage . . . . . . . . . . . . . . . . . . 19
4.3. Payload Structures . . . . . . . . . . . . . . . . . . . 20 4.3. Payload Structures . . . . . . . . . . . . . . . . . . . 20
4.3.1. Single NAL Unit Packets . . . . . . . . . . . . . . . 20 4.3.1. Single NAL Unit Packets . . . . . . . . . . . . . . . 20
4.3.2. Aggregation Packets (APs) . . . . . . . . . . . . . . 21 4.3.2. Aggregation Packets (APs) . . . . . . . . . . . . . . 21
4.3.3. Fragmentation Units . . . . . . . . . . . . . . . . . 25 4.3.3. Fragmentation Units . . . . . . . . . . . . . . . . . 25
4.4. Decoding Order Number . . . . . . . . . . . . . . . . . . 28 4.4. Decoding Order Number . . . . . . . . . . . . . . . . . . 28
5. Packetization Rules . . . . . . . . . . . . . . . . . . . . . 29 5. Packetization Rules . . . . . . . . . . . . . . . . . . . . . 29
6. De-packetization Process . . . . . . . . . . . . . . . . . . 30 6. De-packetization Process . . . . . . . . . . . . . . . . . . 30
7. Payload Format Parameters . . . . . . . . . . . . . . . . . . 32 7. Payload Format Parameters . . . . . . . . . . . . . . . . . . 32
8. Use with Feedback Messages . . . . . . . . . . . . . . . . . 32 7.1. Media Type Registration . . . . . . . . . . . . . . . . . 32
8.1. Picture Loss Indication (PLI) . . . . . . . . . . . . . . 32 7.2. SDP Parameters . . . . . . . . . . . . . . . . . . . . . 32
8.2. Slice Loss Indication (SLI) . . . . . . . . . . . . . . . 32 7.2.1. Mapping of Payload Type Parameters to SDP . . . . . . 32
7.2.2. Usage with SDP Offer/Answer Model . . . . . . . . . . 33
8. Use with Feedback Messages . . . . . . . . . . . . . . . . . 33
8.1. Picture Loss Indication (PLI) . . . . . . . . . . . . . . 33
8.2. Slice Loss Indication (SLI) . . . . . . . . . . . . . . . 33
8.3. Reference Picture Selection Indication (RPSI) . . . . . . 33 8.3. Reference Picture Selection Indication (RPSI) . . . . . . 33
8.4. Full Intra Request (FIR) . . . . . . . . . . . . . . . . 33 8.4. Full Intra Request (FIR) . . . . . . . . . . . . . . . . 34
9. Frame marking . . . . . . . . . . . . . . . . . . . . . . . . 33 9. Frame Marking . . . . . . . . . . . . . . . . . . . . . . . . 34
10. Security Considerations . . . . . . . . . . . . . . . . . . . 33 9.1. Frame Marking Short Extension . . . . . . . . . . . . . . 35
11. Congestion Control . . . . . . . . . . . . . . . . . . . . . 35 9.2. Frame Marking Long Extension . . . . . . . . . . . . . . 36
12. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 36 10. Security Considerations . . . . . . . . . . . . . . . . . . . 37
13. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 36 11. Congestion Control . . . . . . . . . . . . . . . . . . . . . 38
14. References . . . . . . . . . . . . . . . . . . . . . . . . . 36 12. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 39
14.1. Normative References . . . . . . . . . . . . . . . . . . 36 13. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 39
14.2. Informative References . . . . . . . . . . . . . . . . . 38 14. References . . . . . . . . . . . . . . . . . . . . . . . . . 39
Appendix A. Change History . . . . . . . . . . . . . . . . . . . 39 14.1. Normative References . . . . . . . . . . . . . . . . . . 39
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 39 14.2. Informative References . . . . . . . . . . . . . . . . . 41
Appendix A. Change History . . . . . . . . . . . . . . . . . . . 42
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 43
1. Introduction 1. Introduction
The Versatile Video Coding [VVC] specification, formally published as The Versatile Video Coding [VVC] specification, formally published as
both ITU-T Recommendation H.266 and ISO/IEC International Standard both ITU-T Recommendation H.266 and ISO/IEC International Standard
23090-3 [ISO23090-3], is currently in the ISO/IEC approval process 23090-3 [ISO23090-3], is currently in the ISO/IEC approval process
and is planned for ratification in mid 2020. H.266 is reported to and is planned for ratification in mid 2020. H.266 is reported to
provide significant coding efficiency gains over H.265 and earlier provide significant coding efficiency gains over H.265 and earlier
video codec formats. video codec formats.
skipping to change at page 20, line 44 skipping to change at page 20, line 44
o Aggregation Packet (AP): Contains more than one NAL unit within o Aggregation Packet (AP): Contains more than one NAL unit within
one access unit. This payload structure is specified in one access unit. This payload structure is specified in
Section 4.3.2. Section 4.3.2.
o Fragmentation Unit (FU): Contains a subset of a single NAL unit. o Fragmentation Unit (FU): Contains a subset of a single NAL unit.
This payload structure is specified in Section 4.3.3. This payload structure is specified in Section 4.3.3.
4.3.1. Single NAL Unit Packets 4.3.1. Single NAL Unit Packets
Editor notes: it is better to add a section to describe DONL and Editor notes: it is better to add a section to describe DONL and
sprop-max_don_diff sprop-max_don_diff. sprop-max_don_diff is used here but not yet
defined, because the parameters in Section 7 are not yet specified. A value of
sprop-max_don_diff greater than 0 indicates that the transmission
order may not correspond to the decoding order and that the DON is
included in the payload header.
A single NAL unit packet contains exactly one NAL unit, and consists A single NAL unit packet contains exactly one NAL unit, and consists
of a payload header (denoted as PayloadHdr), a conditional 16-bit of a payload header (denoted as PayloadHdr), a conditional 16-bit
DONL field (in network byte order), and the NAL unit payload data DONL field (in network byte order), and the NAL unit payload data
(the NAL unit excluding its NAL unit header) of the contained NAL (the NAL unit excluding its NAL unit header) of the contained NAL
unit, as shown in Figure 3. unit, as shown in Figure 3.
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
skipping to change at page 21, line 23 skipping to change at page 21, line 26
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| :...OPTIONAL RTP padding | | :...OPTIONAL RTP padding |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The Structure of a Single NAL Unit Packet The Structure of a Single NAL Unit Packet
Figure 3 Figure 3
The DONL field, when present, specifies the value of the 16 least The DONL field, when present, specifies the value of the 16 least
significant bits of the decoding order number of the contained NAL significant bits of the decoding order number of the contained NAL
unit. If sprop-max-don-diff is greater than 0 for any of the RTP unit. If sprop-max-don-diff is greater than 0, the DONL field MUST
streams, the DONL field MUST be present, and the variable DON for the be present, and the variable DON for the contained NAL unit is
contained NAL unit is derived as equal to the value of the DONL derived as equal to the value of the DONL field. Otherwise (sprop-
field. Otherwise (sprop-max-don-diff is equal to 0 for all the RTP max-don-diff is equal to 0), the DONL field MUST NOT be present.
streams), the DONL field MUST NOT be present.
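The DONL rule above can be illustrated with a minimal parsing sketch. The function name and its arguments are hypothetical, and the sketch assumes a 2-byte PayloadHdr (the same size as the VVC NAL unit header) and that any RTP padding has already been removed.

```python
import struct


def parse_single_nal_unit_packet(payload: bytes, sprop_max_don_diff: int):
    """Split an RTP payload into (payload_hdr, donl, nal_payload).

    Assumption: PayloadHdr occupies the first 2 bytes of the payload.
    """
    if len(payload) < 2:
        raise ValueError("payload too short for PayloadHdr")
    payload_hdr = payload[:2]
    offset = 2
    donl = None
    # DONL is present only when sprop-max-don-diff is greater than 0.
    if sprop_max_don_diff > 0:
        if len(payload) < offset + 2:
            raise ValueError("payload too short for DONL")
        (donl,) = struct.unpack_from("!H", payload, offset)  # network byte order
        offset += 2
    return payload_hdr, donl, payload[offset:]
```

With sprop_max_don_diff equal to 0, the bytes immediately after PayloadHdr are treated as NAL unit payload data rather than as a DONL field.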
4.3.2. Aggregation Packets (APs) 4.3.2. Aggregation Packets (APs)
Aggregation Packets (APs) can reduce packetization overhead for Aggregation Packets (APs) can reduce packetization overhead for
small NAL units, such as most of the non-VCL NAL units, which are small NAL units, such as most of the non-VCL NAL units, which are
often only a few octets in size. often only a few octets in size.
An AP aggregates NAL units of one access unit. Each NAL unit to be An AP aggregates NAL units of one access unit. Each NAL unit to be
carried in an AP is encapsulated in an aggregation unit. NAL units carried in an AP is encapsulated in an aggregation unit. NAL units
aggregated in one AP are included in NAL unit decoding order. aggregated in one AP are included in NAL unit decoding order.
skipping to change at page 23, line 25 skipping to change at page 23, line 25
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The Structure of the First Aggregation Unit in an AP The Structure of the First Aggregation Unit in an AP
Figure 5 Figure 5
The DONL field, when present, specifies the value of the 16 least The DONL field, when present, specifies the value of the 16 least
significant bits of the decoding order number of the aggregated NAL significant bits of the decoding order number of the aggregated NAL
unit. unit.
If sprop-max-don-diff is greater than 0 for any of the RTP streams, If sprop-max-don-diff is greater than 0, the DONL field MUST be
the DONL field MUST be present in an aggregation unit that is the present in an aggregation unit that is the first aggregation unit in
first aggregation unit in an AP, and the variable DON for the an AP, and the variable DON for the aggregated NAL unit is derived as
aggregated NAL unit is derived as equal to the value of the DONL equal to the value of the DONL field. Otherwise (sprop-max-don-diff
field. Otherwise (sprop-max-don-diff is equal to 0 for all the RTP is equal to 0), the DONL field MUST NOT be present in an aggregation
streams), the DONL field MUST NOT be present in an aggregation unit unit that is the first aggregation unit in an AP.
that is the first aggregation unit in an AP.
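As a rough illustration of how a receiver might walk the aggregation units of an AP, the following sketch is offered. The function name is hypothetical. It assumes that the input starts right after PayloadHdr, that RTP padding has been removed, and that the first aggregation unit carries the conditional DONL followed by a 16-bit NAL unit size, as in the corresponding HEVC payload format; the figure defining that layout is not reproduced here.

```python
import struct


def parse_aggregation_units(body: bytes, sprop_max_don_diff: int):
    """Return (donl, [nal_unit, ...]) from the bytes following PayloadHdr.

    Assumption: each aggregation unit is a 16-bit size (network byte order)
    followed by the NAL unit, and only the first unit may carry a DONL.
    """
    offset = 0
    donl = None
    if sprop_max_don_diff > 0:
        if len(body) < 2:
            raise ValueError("truncated DONL")
        (donl,) = struct.unpack_from("!H", body, 0)
        offset = 2
    nal_units = []
    while offset < len(body):
        if offset + 2 > len(body):
            raise ValueError("truncated NAL unit size")
        (size,) = struct.unpack_from("!H", body, offset)
        offset += 2
        if offset + size > len(body):
            raise ValueError("truncated NAL unit")
        # The size covers the NAL unit including its NAL unit header.
        nal_units.append(body[offset:offset + size])
        offset += size
    return donl, nal_units
```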
An aggregation unit that is not the first aggregation unit in an AP An aggregation unit that is not the first aggregation unit in an AP
will be followed immediately by a 16-bit unsigned size information will be followed immediately by a 16-bit unsigned size information
(in network byte order) that indicates the size of the NAL unit in (in network byte order) that indicates the size of the NAL unit in
bytes (excluding these two octets, but including the NAL unit bytes (excluding these two octets, but including the NAL unit
header), followed by the NAL unit itself, including its NAL unit header), followed by the NAL unit itself, including its NAL unit
header, as shown in Figure 6. header, as shown in Figure 6.
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
skipping to change at page 27, line 22 skipping to change at page 27, line 22
FuType: 5 bits FuType: 5 bits
The field FuType MUST be equal to the field Type of the fragmented The field FuType MUST be equal to the field Type of the fragmented
NAL unit. NAL unit.
The DONL field, when present, specifies the value of the 16 least The DONL field, when present, specifies the value of the 16 least
significant bits of the decoding order number of the fragmented NAL significant bits of the decoding order number of the fragmented NAL
unit. unit.
If sprop-max-don-diff is greater than 0 for any of the RTP streams, If sprop-max-don-diff is greater than 0, and the S bit is equal to 1,
and the S bit is equal to 1, the DONL field MUST be present in the the DONL field MUST be present in the FU, and the variable DON for
FU, and the variable DON for the fragmented NAL unit is derived as the fragmented NAL unit is derived as equal to the value of the DONL
equal to the value of the DONL field. Otherwise (sprop-max-don-diff field. Otherwise (sprop-max-don-diff is equal to 0, or the S bit is
is equal to 0 for all the RTP streams, or the S bit is equal to 0), equal to 0), the DONL field MUST NOT be present in the FU.
the DONL field MUST NOT be present in the FU.
A non-fragmented NAL unit MUST NOT be transmitted in one FU; i.e., A non-fragmented NAL unit MUST NOT be transmitted in one FU; i.e.,
the Start bit and End bit must not both be set to 1 in the same FU the Start bit and End bit must not both be set to 1 in the same FU
header. header.
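The two FU rules above (DONL presence and the S/E restriction) can be sketched as follows. The function name is hypothetical, and the bit positions are an assumption: the sketch takes S as the most significant bit of the FU header octet and E as the next bit, with FuType in the 5 least significant bits.

```python
def fu_donl_present(fu_header: int, sprop_max_don_diff: int) -> bool:
    """Report whether a DONL field follows the given FU header octet.

    Assumption: S is bit 7 and E is bit 6 of the FU header octet.
    """
    s_bit = (fu_header >> 7) & 1
    e_bit = (fu_header >> 6) & 1
    # A non-fragmented NAL unit must not be carried in a single FU.
    if s_bit and e_bit:
        raise ValueError("S and E bits must not both be set in one FU")
    # DONL is present only in the first fragment, and only when
    # sprop-max-don-diff is greater than 0.
    return sprop_max_don_diff > 0 and s_bit == 1
```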
The FU payload consists of fragments of the payload of the fragmented The FU payload consists of fragments of the payload of the fragmented
NAL unit so that if the FU payloads of consecutive FUs, starting with NAL unit so that if the FU payloads of consecutive FUs, starting with
an FU with the S bit equal to 1 and ending with an FU with the E bit an FU with the S bit equal to 1 and ending with an FU with the E bit
equal to 1, are sequentially concatenated, the payload of the equal to 1, are sequentially concatenated, the payload of the
fragmented NAL unit can be reconstructed. The NAL unit header of the fragmented NAL unit can be reconstructed. The NAL unit header of the
skipping to change at page 28, line 16 skipping to change at page 28, line 16
4.4. Decoding Order Number 4.4. Decoding Order Number
For each NAL unit, the variable AbsDon is derived, representing the For each NAL unit, the variable AbsDon is derived, representing the
decoding order number that is indicative of the NAL unit decoding decoding order number that is indicative of the NAL unit decoding
order. order.
Let NAL unit n be the n-th NAL unit in transmission order within an Let NAL unit n be the n-th NAL unit in transmission order within an
RTP stream. RTP stream.
If sprop-max-don-diff is equal to 0 for all the RTP streams carrying If sprop-max-don-diff is equal to 0, AbsDon[n], the value of AbsDon
the [VVC] bitstream, AbsDon[n], the value of AbsDon for NAL unit n, for NAL unit n, is derived as equal to n.
is derived as equal to n.
Otherwise (sprop-max-don-diff is greater than 0 for any of the RTP Otherwise (sprop-max-don-diff is greater than 0), AbsDon[n] is
streams), AbsDon[n] is derived as follows, where DON[n] is the value derived as follows, where DON[n] is the value of the variable DON for
of the variable DON for NAL unit n: NAL unit n:
o If n is equal to 0 (i.e., NAL unit n is the very first NAL unit in o If n is equal to 0 (i.e., NAL unit n is the very first NAL unit in
transmission order), AbsDon[0] is set equal to DON[0]. transmission order), AbsDon[0] is set equal to DON[0].
o Otherwise (n is greater than 0), the following applies for o Otherwise (n is greater than 0), the following applies for
derivation of AbsDon[n]: derivation of AbsDon[n]:
If DON[n] == DON[n-1], If DON[n] == DON[n-1],
AbsDon[n] = AbsDon[n-1] AbsDon[n] = AbsDon[n-1]
skipping to change at page 29, line 11 skipping to change at page 29, line 8
o AbsDon[n] greater than AbsDon[m] indicates that NAL unit n follows o AbsDon[n] greater than AbsDon[m] indicates that NAL unit n follows
NAL unit m in NAL unit decoding order. NAL unit m in NAL unit decoding order.
o When AbsDon[n] is equal to AbsDon[m], the NAL unit decoding order o When AbsDon[n] is equal to AbsDon[m], the NAL unit decoding order
of the two NAL units can be in either order. of the two NAL units can be in either order.
o AbsDon[n] less than AbsDon[m] indicates that NAL unit n precedes o AbsDon[n] less than AbsDon[m] indicates that NAL unit n precedes
NAL unit m in decoding order. NAL unit m in decoding order.
Informative note: When two consecutive NAL units in the NAL Informative note: When two consecutive NAL units in the NAL unit
unit decoding order have different values of AbsDon, the decoding order have different values of AbsDon, the absolute
absolute difference between the two AbsDon values may be difference between the two AbsDon values may be greater than or
greater than or equal to 1. equal to 1.
Informative note: There are multiple reasons to allow for the Informative note: There are multiple reasons to allow for the
absolute difference of the values of AbsDon for two consecutive absolute difference of the values of AbsDon for two consecutive
NAL units in the NAL unit decoding order to be greater than NAL units in the NAL unit decoding order to be greater than one.
one. An increment by one is not required, as at the time of An increment by one is not required, as at the time of associating
associating values of AbsDon to NAL units, it may not be known values of AbsDon to NAL units, it may not be known whether all NAL
whether all NAL units are to be delivered to the receiver. For units are to be delivered to the receiver. For example, a gateway
example, a gateway might not forward VCL NAL units of higher might not forward VCL NAL units of higher sub-layers or some SEI
sub-layers or some SEI NAL units when there is congestion in NAL units when there is congestion in the network.
the network. In another example, the first intra-coded picture In another example, the first intra-coded picture of a pre-encoded
of a pre-encoded clip is transmitted in advance to ensure that clip is transmitted in advance to ensure that it is readily
it is readily available in the receiver, and when transmitting available in the receiver, and when transmitting the first intra-
the first intra-coded picture, the originator does not exactly coded picture, the originator does not exactly know how many NAL
know how many NAL units will be encoded before the first intra- units will be encoded before the first intra-coded picture of the
coded picture of the pre-encoded clip follows in decoding pre-encoded clip follows in decoding order. Thus, the values of
order. Thus, the values of AbsDon for the NAL units of the AbsDon for the NAL units of the first intra-coded picture of the
first intra-coded picture of the pre-encoded clip have to be pre-encoded clip have to be estimated when they are transmitted,
estimated when they are transmitted, and gaps in values of and gaps in values of AbsDon may occur.
AbsDon may occur.
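The AbsDon derivation for sprop-max-don-diff greater than 0 can be sketched as below. Only the first case of the derivation is visible in this comparison; the remaining wraparound cases in the sketch are an assumption taken from the equivalent derivation in the HEVC payload format [RFC7798], and the function name is hypothetical.

```python
def derive_abs_don(dons):
    """Map 16-bit DON values (transmission order) to AbsDon values.

    Assumption: the wraparound cases follow the HEVC payload format.
    """
    abs_dons = []
    for n, don in enumerate(dons):
        if n == 0:
            abs_dons.append(don)  # AbsDon[0] = DON[0]
            continue
        prev_don, prev_abs = dons[n - 1], abs_dons[n - 1]
        if don == prev_don:
            abs_don = prev_abs
        elif don > prev_don and don - prev_don < 32768:
            abs_don = prev_abs + don - prev_don
        elif don < prev_don and prev_don - don >= 32768:
            # DON wrapped forward past 65535.
            abs_don = prev_abs + 65536 - prev_don + don
        elif don > prev_don and don - prev_don >= 32768:
            # DON wrapped backward past 0.
            abs_don = prev_abs - (prev_don + 65536 - don)
        else:
            # don < prev_don and prev_don - don < 32768
            abs_don = prev_abs - (prev_don - don)
        abs_dons.append(abs_don)
    return abs_dons
```

For example, the DON sequence 65534, 65535, 0, 1 yields AbsDon values that keep increasing across the 16-bit wraparound. When sprop-max-don-diff is equal to 0, no such derivation is needed because AbsDon[n] is simply n.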
5. Packetization Rules 5. Packetization Rules
The following packetization rules apply: The following packetization rules apply:
o If sprop-max-don-diff is greater than 0 for any of the RTP o If sprop-max-don-diff is greater than 0, the transmission order of
streams, the transmission order of NAL units carried in the RTP NAL units carried in the RTP stream MAY be different than the NAL
stream MAY be different than the NAL unit decoding order and the unit decoding order and the NAL unit output order.
NAL unit output order.
o A NAL unit of a small size SHOULD be encapsulated in an o A NAL unit of a small size SHOULD be encapsulated in an
aggregation packet together with one or more other NAL units in order aggregation packet together with one or more other NAL units in order
to avoid the unnecessary packetization overhead for small NAL to avoid the unnecessary packetization overhead for small NAL
units. For example, non-VCL NAL units such as access unit units. For example, non-VCL NAL units such as access unit
delimiters, parameter sets, or SEI NAL units are typically small delimiters, parameter sets, or SEI NAL units are typically small
and can often be aggregated with VCL NAL units without violating and can often be aggregated with VCL NAL units without violating
MTU size constraints. MTU size constraints.
o Each non-VCL NAL unit SHOULD, when possible from an MTU size match o Each non-VCL NAL unit SHOULD, when possible from an MTU size match
skipping to change at page 31, line 11 skipping to change at page 31, line 5
compensation of transmission delay jitter, the receiver buffer is compensation of transmission delay jitter, the receiver buffer is
hereafter called the de-packetization buffer in this section. hereafter called the de-packetization buffer in this section.
Receivers should also prepare for transmission delay jitter; that is, Receivers should also prepare for transmission delay jitter; that is,
either reserve separate buffers for transmission delay jitter either reserve separate buffers for transmission delay jitter
buffering and de-packetization buffering or use a receiver buffer for buffering and de-packetization buffering or use a receiver buffer for
both transmission delay jitter and de-packetization. Moreover, both transmission delay jitter and de-packetization. Moreover,
receivers should take transmission delay jitter into account in the receivers should take transmission delay jitter into account in the
buffering operation, e.g., by additional initial buffering before buffering operation, e.g., by additional initial buffering before
starting of decoding and playback. starting of decoding and playback.
When sprop-max-don-diff is equal to 0 for all the received RTP When sprop-max-don-diff is equal to 0, the de-packetization buffer
streams, the de-packetization buffer size is zero bytes, and the size is zero bytes, and the process described in the remainder of
process described in the remainder of this paragraph applies. this paragraph applies.
The NAL units carried in the single RTP stream are directly passed to The NAL units carried in the single RTP stream are directly passed to
the decoder in their transmission order, which is identical to their the decoder in their transmission order, which is identical to their
decoding order. When there are several NAL units of the same RTP decoding order. When there are several NAL units of the same RTP
stream with the same NTP timestamp, the order to pass them to the stream with the same NTP timestamp, the order to pass them to the
decoder is their transmission order. decoder is their transmission order.
Informative note: The mapping between RTP and NTP timestamps is Informative note: The mapping between RTP and NTP timestamps is
conveyed in RTCP SR packets. In addition, the mechanisms for conveyed in RTCP SR packets. In addition, the mechanisms for
faster media timestamp synchronization discussed in [RFC6051] may faster media timestamp synchronization discussed in [RFC6051] may
be used to speed up the acquisition of the RTP-to-wall-clock be used to speed up the acquisition of the RTP-to-wall-clock
mapping. mapping.
When sprop-max-don-diff is greater than 0 for any the received RTP When sprop-max-don-diff is greater than 0, the process described in
streams, the process described in the remainder of this section the remainder of this section applies.
applies.
There are two buffering states in the receiver: initial buffering and There are two buffering states in the receiver: initial buffering and
buffering while playing. Initial buffering starts when the reception buffering while playing. Initial buffering starts when the reception
is initialized. After initial buffering, decoding and playback are is initialized. After initial buffering, decoding and playback are
started, and the buffering-while-playing mode is used. started, and the buffering-while-playing mode is used.
Regardless of the buffering state, the receiver stores incoming NAL Regardless of the buffering state, the receiver stores incoming NAL
units, in reception order, into the de-packetization buffer. NAL units, in reception order, into the de-packetization buffer. NAL
units carried in RTP packets are stored in the de-packetization units carried in RTP packets are stored in the de-packetization
buffer individually, and the value of AbsDon is calculated and stored buffer individually, and the value of AbsDon is calculated and stored
skipping to change at page 32, line 16 skipping to change at page 32, line 7
value of AbsDon is removed from the de-packetization buffer and value of AbsDon is removed from the de-packetization buffer and
passed to the decoder. passed to the decoder.
When no more NAL units are flowing into the de-packetization buffer, When no more NAL units are flowing into the de-packetization buffer,
all NAL units remaining in the de-packetization buffer are removed all NAL units remaining in the de-packetization buffer are removed
from the buffer and passed to the decoder in the order of increasing from the buffer and passed to the decoder in the order of increasing
AbsDon values. AbsDon values.
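One possible way to hold NAL units so that they can be handed to the decoder in increasing AbsDon order is a min-heap, sketched below. The class and method names are hypothetical. The conditions that trigger removal during initial buffering and buffering while playing are not shown in this comparison, so the sketch only covers storing, removing the smallest AbsDon, and the final flush.

```python
import heapq
import itertools


class DepacketizationBuffer:
    """Illustrative de-packetization buffer ordered by AbsDon."""

    def __init__(self):
        self._heap = []
        # Reception counter: breaks ties between equal AbsDon values.
        self._arrival = itertools.count()

    def store(self, abs_don: int, nal_unit: bytes) -> None:
        """Store one NAL unit together with its AbsDon value."""
        heapq.heappush(self._heap, (abs_don, next(self._arrival), nal_unit))

    def pop_smallest(self):
        """Remove and return the NAL unit with the smallest AbsDon."""
        abs_don, _, nal_unit = heapq.heappop(self._heap)
        return abs_don, nal_unit

    def flush(self):
        """Yield all remaining NAL units in increasing AbsDon order."""
        while self._heap:
            yield self.pop_smallest()
```

NAL units with equal AbsDon values come out in reception order here; that is a choice of the sketch, since either order is acceptable for equal values.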
7. Payload Format Parameters 7. Payload Format Parameters
Placeholder This section specifies the optional parameters. A mapping of the
parameters to the Session Description Protocol (SDP) [RFC4566] is also
provided for applications that use SDP.
7.1. Media Type Registration
The receiver MUST ignore any parameter unspecified in this memo.
Type name: Video
Subtype name: H266
Required parameters: none
Optional parameters:
Editor's notes: To be added
7.2. SDP Parameters
The receiver MUST ignore any parameter unspecified in this memo.
7.2.1. Mapping of Payload Type Parameters to SDP
The media type video/H266 string is mapped to fields in the Session
Description Protocol (SDP) [RFC4566] as follows:
o The media name in the "m=" line of SDP MUST be video.
o The encoding name in the "a=rtpmap" line of SDP MUST be H266 (the
media subtype).
o The clock rate in the "a=rtpmap" line MUST be 90000.
o OPTIONAL PARAMETERS:
Editor's notes: To be discussed here
7.2.1.1. SDP Example
An example of media representation in SDP is as follows:
m=video 49170 RTP/AVP 98
a=rtpmap:98 H266/90000
a=fmtp:98 profile-id=1; sprop-vps=<video parameter sets data>
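The three mapping rules of Section 7.2.1 (media name, encoding name, and clock rate) could be checked against an SDP description roughly as follows. The function name is hypothetical, and the sketch is not a full SDP parser: it inspects only the first "m=" line and the "a=rtpmap" lines, and compares the encoding name exactly as written.

```python
def check_vvc_sdp(sdp: str) -> bool:
    """Return True if the SDP text maps a payload type to video/H266 at 90000 Hz."""
    lines = [line.strip() for line in sdp.splitlines() if line.strip()]
    m_lines = [line for line in lines if line.startswith("m=")]
    if not m_lines:
        return False
    # m=<media> <port> <proto> <fmt> ...
    fields = m_lines[0][2:].split()
    if len(fields) < 4 or fields[0] != "video":
        return False
    payload_types = fields[3:]
    prefix = "a=rtpmap:"
    for line in lines:
        if not line.startswith(prefix):
            continue
        # a=rtpmap:<payload type> <encoding name>/<clock rate>
        pt, _, encoding = line[len(prefix):].partition(" ")
        name, _, rest = encoding.strip().partition("/")
        clock_rate = rest.split("/")[0]
        if pt in payload_types and name == "H266" and clock_rate == "90000":
            return True
    return False
```

Applied to the example above, the check succeeds because payload type 98 appears in the "m=" line and is mapped to H266/90000.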
7.2.2. Usage with SDP Offer/Answer Model
When [VVC] is offered over RTP using SDP in an offer/answer model
[RFC3264] for negotiation for unicast usage, the following
limitations and rules apply:
Placeholder: To add limitations and considerations.
8. Use with Feedback Messages 8. Use with Feedback Messages
The following subsections define the use of the Picture Loss The following subsections define the use of the Picture Loss
Indication (PLI), Slice Loss Indication (SLI), Reference Picture Indication (PLI), Slice Loss Indication (SLI), Reference Picture
Selection Indication (RPSI), and Full Intra Request (FIR) feedback Selection Indication (RPSI), and Full Intra Request (FIR) feedback
messages with [VVC]. The PLI, SLI, and RPSI messages are defined in messages with [VVC]. The PLI, SLI, and RPSI messages are defined in
[RFC4585], and the FIR message is defined in [RFC5104]. [RFC4585], and the FIR message is defined in [RFC5104].
8.1. Picture Loss Indication (PLI) 8.1. Picture Loss Indication (PLI)
skipping to change at page 33, line 44 skipping to change at page 34, line 36
observing applicable congestion-control-related constraints, such as observing applicable congestion-control-related constraints, such as
those set out in [RFC8082]). those set out in [RFC8082]).
Upon reception of a FIR, a sender MUST send an IDR picture. Upon reception of a FIR, a sender MUST send an IDR picture.
Parameter sets MUST also be sent, except when there is a priori Parameter sets MUST also be sent, except when there is a priori
knowledge that the parameter sets have been correctly established. A knowledge that the parameter sets have been correctly established. A
typical example for that is an understanding between sender and typical example for that is an understanding between sender and
receiver, established by means outside this document, that parameter receiver, established by means outside this document, that parameter
sets are exclusively sent out-of-band. sets are exclusively sent out-of-band.
9. Frame marking 9. Frame Marking
placeholder [FrameMarking] provides an extension mechanism for RTP. The codec-
agnostic meta-data in the [FrameMarking] header provides valuable
video frame information. Its usage with [VVC] is defined in this
section. Refer to [FrameMarking] for any unspecified fields. Two
header extensions are RECOMMENDED:
o The short extension for non-scalable streams.
o The long extension for scalable streams.
9.1. Frame Marking Short Extension
The fields for the short extension, as shown in Figure 11, are used
as described in the following.
0 1
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| ID | L=0 |S|E|I|D|0 0 0 0|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Short Frame Marking RTP Extension for [VVC]
Figure 11
The I bit MUST be 1 when the NAL unit type is 7-9 (inclusive),
otherwise it MUST be 0.
The D bit MUST be 1 when the syntax element ph_non_ref_pic_flag for a
picture is equal to 1, otherwise it MUST be 0.
The S bit MUST be set to 1 if any of the following conditions is true
and MUST be set to 0 otherwise:
o The RTP packet is a single NAL unit packet and it is the first VCL
NAL unit, in decoding order, of a picture.
o The RTP packet is an AP, and the NAL unit in the first contained
aggregation unit is the first VCL NAL unit, in decoding order, of
a picture.
o The RTP packet is a FU with its S bit equal to 1 and the FU
payload contains a fragment of the first VCL NAL unit, in decoding
order, of a picture.
The E bit MUST be set to 1 if any of the following conditions is true
and MUST be set to 0 otherwise:
o The RTP packet is a single NAL unit packet and it is the last VCL
NAL unit, in decoding order, of a picture.
o The RTP packet is an AP and the NAL unit in the last contained
aggregation unit is the last VCL NAL unit, in decoding order, of a
picture.
o The RTP packet is a FU with its E bit equal to 1 and the FU
payload contains a fragment of the last VCL NAL unit, in decoding
order, of a picture.
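The bit rules above can be sketched in Python as follows. This is a non-normative illustration under stated assumptions: the function names (i_bit, boundary_bit, pack_short_frame_marking) and the packet-kind strings are hypothetical, and the extension ID is assumed to be a one-byte header extension ID in the range 1..14 negotiated out of band. The D bit is passed in by the caller because it depends on ph_non_ref_pic_flag, which this sketch does not parse.

```python
def i_bit(nal_unit_type):
    """I bit: 1 when the NAL unit type is 7-9 (inclusive), otherwise 0."""
    return 1 if 7 <= nal_unit_type <= 9 else 0


def boundary_bit(packet_kind, nal_is_boundary, fu_flag=False):
    """Derive the S bit (or, symmetrically, the E bit) of the extension.

    packet_kind     -- "single", "ap", or "fu" (hypothetical labels)
    nal_is_boundary -- for S: the relevant NAL unit is the first VCL NAL
                       unit of a picture (for an AP, the NAL unit in the
                       first aggregation unit); for E: the last VCL NAL
                       unit (for an AP, the one in the last aggregation unit)
    fu_flag         -- the S bit (or E bit) of the FU header; FU packets only
    """
    if packet_kind in ("single", "ap"):
        return int(bool(nal_is_boundary))
    if packet_kind == "fu":
        return int(bool(fu_flag) and bool(nal_is_boundary))
    raise ValueError(f"unknown packet kind: {packet_kind!r}")


def pack_short_frame_marking(ext_id, s, e, i, d):
    """Pack the 2-byte short extension element shown in Figure 11."""
    if not 1 <= ext_id <= 14:
        raise ValueError("one-byte header extension IDs are 1..14")
    header = (ext_id << 4) | 0  # L=0: one byte of extension data follows
    flags = (int(bool(s)) << 7) | (int(bool(e)) << 6) \
        | (int(bool(i)) << 5) | (int(bool(d)) << 4)  # low 4 bits stay 0
    return bytes([header, flags])
```

For example, a single NAL unit packet that carries the first VCL NAL unit of a picture whose NAL unit type is 8 would be marked with pack_short_frame_marking(3, boundary_bit("single", True), 0, i_bit(8), 0), assuming extension ID 3.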
9.2. Frame Marking Long Extension
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| ID | L=2 |S|E|I|D|B| TID |0|0| LayerID | TL0PICIDX |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Long Frame Marking RTP Extension for [VVC]
Figure 12
The fields for the long extension for scalable streams, as shown in
Figure 12, are used as described in the following.
The LayerID (6 bits) and TID (3 bits) from the NAL unit header
(Section 1.1.4) are mapped to the generic LID and TID fields in
[FrameMarking] as shown in Figure 12.
The I bit MUST be 1 when the NAL unit type is 7-9 (inclusive),
otherwise it MUST be 0.
The D bit MUST be 1 when the syntax element ph_non_ref_pic_flag for a
picture is equal to 1, otherwise it MUST be 0.
The S bit MUST be set to 1 if any of the following conditions is true
and MUST be set to 0 otherwise:
o The RTP packet is a single NAL unit packet and it is the first VCL
NAL unit, in decoding order, of a picture.
o The RTP packet is an AP, and the NAL unit in the first contained
aggregation unit is the first VCL NAL unit, in decoding order, of
a picture.
o The RTP packet is a FU with its S bit equal to 1 and the FU
payload contains a fragment of the first VCL NAL unit, in decoding
order, of a picture.
The E bit MUST be set to 1 if any of the following conditions is true
and MUST be set to 0 otherwise:
o The RTP packet is a single NAL unit packet and it is the last VCL
NAL unit, in decoding order, of a picture.
o The RTP packet is an AP and the NAL unit in the last contained
aggregation unit is the last VCL NAL unit, in decoding order, of a
picture.
o The RTP packet is a FU with its E bit equal to 1 and the FU
payload contains a fragment of the last VCL NAL unit, in decoding
order, of a picture.
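The layout in Figure 12 can be illustrated with a short Python sketch that packs the four bytes of the long extension element. The function name pack_long_frame_marking is hypothetical. The sketch assumes a one-byte header extension ID in the range 1..14, and it takes TID and LayerID as already-extracted integer values; how they are read from the NAL unit header is outside its scope. The meaning of the B bit and TL0PICIDX is defined in [FrameMarking], so both are simply passed through.

```python
def pack_long_frame_marking(ext_id, s, e, i, d, b, tid, layer_id, tl0picidx):
    """Pack the 4-byte long extension element shown in Figure 12."""
    if not 1 <= ext_id <= 14:
        raise ValueError("one-byte header extension IDs are 1..14")
    if not 0 <= tid <= 7:
        raise ValueError("TID is a 3-bit field")
    if not 0 <= layer_id <= 63:
        raise ValueError("LayerID is a 6-bit field")
    if not 0 <= tl0picidx <= 255:
        raise ValueError("TL0PICIDX is an 8-bit field")

    header = (ext_id << 4) | 2  # L=2: three bytes of extension data follow
    flags = (int(bool(s)) << 7) | (int(bool(e)) << 6) | (int(bool(i)) << 5) \
        | (int(bool(d)) << 4) | (int(bool(b)) << 3) | tid
    lid = layer_id  # the two most significant bits of this byte stay 0
    return bytes([header, flags, lid, tl0picidx])


if __name__ == "__main__":
    element = pack_long_frame_marking(3, 1, 1, 0, 1, 0,
                                      tid=2, layer_id=5, tl0picidx=200)
    print(element.hex())
```

Range checks are applied before packing so that an out-of-range TID or LayerID cannot spill into the neighboring flag or reserved bits.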
10. Security Considerations 10. Security Considerations
The scope of this Security Considerations section is limited to the The scope of this Security Considerations section is limited to the
payload format itself and to one feature of [VVC] that may pose a payload format itself and to one feature of [VVC] that may pose a
particularly serious security risk if implemented naively. The particularly serious security risk if implemented naively. The
payload format, in isolation, does not form a complete system. payload format, in isolation, does not form a complete system.
Implementers are advised to read and understand relevant security- Implementers are advised to read and understand relevant security-
related documents, especially those pertaining to RTP (see the related documents, especially those pertaining to RTP (see the
Security Considerations section in [RFC3550]), and the security of Security Considerations section in [RFC3550]), and the security of
skipping to change at page 37, line 5 skipping to change at page 40, line 16
"ISO/IEC DIS Information technology --- Coded "ISO/IEC DIS Information technology --- Coded
representation of immersive media --- Part 3 Versatile representation of immersive media --- Part 3 Versatile
video codings", n.d., video codings", n.d.,
<https://www.iso.org/standard/73022.html>. <https://www.iso.org/standard/73022.html>.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997, DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>. <https://www.rfc-editor.org/info/rfc2119>.
[RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
with Session Description Protocol (SDP)", RFC 3264,
DOI 10.17487/RFC3264, June 2002,
<https://www.rfc-editor.org/info/rfc3264>.
[RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V.
Jacobson, "RTP: A Transport Protocol for Real-Time Jacobson, "RTP: A Transport Protocol for Real-Time
Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550, Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550,
July 2003, <https://www.rfc-editor.org/info/rfc3550>. July 2003, <https://www.rfc-editor.org/info/rfc3550>.
[RFC3551] Schulzrinne, H. and S. Casner, "RTP Profile for Audio and [RFC3551] Schulzrinne, H. and S. Casner, "RTP Profile for Audio and
Video Conferences with Minimal Control", STD 65, RFC 3551, Video Conferences with Minimal Control", STD 65, RFC 3551,
DOI 10.17487/RFC3551, July 2003, DOI 10.17487/RFC3551, July 2003,
<https://www.rfc-editor.org/info/rfc3551>. <https://www.rfc-editor.org/info/rfc3551>.
[RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K. [RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
Norrman, "The Secure Real-time Transport Protocol (SRTP)", Norrman, "The Secure Real-time Transport Protocol (SRTP)",
RFC 3711, DOI 10.17487/RFC3711, March 2004, RFC 3711, DOI 10.17487/RFC3711, March 2004,
<https://www.rfc-editor.org/info/rfc3711>. <https://www.rfc-editor.org/info/rfc3711>.
[RFC4556] Zhu, L. and B. Tung, "Public Key Cryptography for Initial
Authentication in Kerberos (PKINIT)", RFC 4556,
DOI 10.17487/RFC4556, June 2006,
<https://www.rfc-editor.org/info/rfc4556>.
[RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session [RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
Description Protocol", RFC 4566, DOI 10.17487/RFC4566, Description Protocol", RFC 4566, DOI 10.17487/RFC4566,
July 2006, <https://www.rfc-editor.org/info/rfc4566>. July 2006, <https://www.rfc-editor.org/info/rfc4566>.
[RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey, [RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey,
"Extended RTP Profile for Real-time Transport Control "Extended RTP Profile for Real-time Transport Control
Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585, Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585,
DOI 10.17487/RFC4585, July 2006, DOI 10.17487/RFC4585, July 2006,
<https://www.rfc-editor.org/info/rfc4585>. <https://www.rfc-editor.org/info/rfc4585>.
skipping to change at page 38, line 9 skipping to change at page 41, line 31
[RFC8082] Wenger, S., Lennox, J., Burman, B., and M. Westerlund, [RFC8082] Wenger, S., Lennox, J., Burman, B., and M. Westerlund,
"Using Codec Control Messages in the RTP Audio-Visual "Using Codec Control Messages in the RTP Audio-Visual
Profile with Feedback with Layered Codecs", RFC 8082, Profile with Feedback with Layered Codecs", RFC 8082,
DOI 10.17487/RFC8082, March 2017, DOI 10.17487/RFC8082, March 2017,
<https://www.rfc-editor.org/info/rfc8082>. <https://www.rfc-editor.org/info/rfc8082>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/info/rfc8174>. May 2017, <https://www.rfc-editor.org/info/rfc8174>.
[VVC] "Versatile Video Coding (Draft 8), Joint Video Experts [VVC] "Versatile Video Coding (Draft 10), Joint Video Experts
Team (JVET)", January 2020. Team (JVET)", July 2020.
14.2. Informative References 14.2. Informative References
[CABAC] Sole, J., et al., "Transform coefficient coding in [CABAC] Sole, J., et al., "Transform coefficient coding in
HEVC, IEEE Transactions on Circuits and Systems for Video HEVC, IEEE Transactions on Circuits and Systems for Video
Technology", DOI 10.1109/TCSVT.2012.2223055, December Technology", DOI 10.1109/TCSVT.2012.2223055, December
2012. 2012.
[FrameMarking] [FrameMarking]
Berger, E., Nandakumar, S., and M. Zanaty, "Frame Berger, E., Nandakumar, S., and M. Zanaty, "Frame
skipping to change at page 39, line 27 skipping to change at page 42, line 49
Communications and Image Processing 2005 (VCIP 2005) , Communications and Image Processing 2005 (VCIP 2005) ,
July 2005. July 2005.
Appendix A. Change History Appendix A. Change History
draft-zhao-payload-rtp-vvc-00 ........ initial version draft-zhao-payload-rtp-vvc-00 ........ initial version
draft-zhao-payload-rtp-vvc-01 ........ editorial clarifications and draft-zhao-payload-rtp-vvc-01 ........ editorial clarifications and
corrections corrections
draft-ietf-payload-rtp-vvc-00 ........ initial WG draft
draft-ietf-payload-rtp-vvc-01 ........ VVC specification update
Authors' Addresses Authors' Addresses
Shuai Zhao Shuai Zhao
Tencent Tencent
2747 Park Blvd 2747 Park Blvd
Palo Alto 94588 Palo Alto 94588
USA USA
Email: shuai.zhao@ieee.org Email: shuai.zhao@ieee.org
 End of changes. 26 change blocks. 
82 lines changed or deleted 261 lines changed or added
