--- 1/draft-ietf-cellar-ffv1-v4-09.txt 2020-04-28 12:15:54.540055620 -0700 +++ 2/draft-ietf-cellar-ffv1-v4-10.txt 2020-04-28 12:15:54.604056502 -0700 @@ -1,20 +1,20 @@ cellar M. Niedermayer Internet-Draft Intended status: Standards Track D. Rice -Expires: 30 July 2020 +Expires: 30 October 2020 J. Martinez - 27 January 2020 + 28 April 2020 FFV1 Video Coding Format Version 4 - draft-ietf-cellar-ffv1-v4-09 + draft-ietf-cellar-ffv1-v4-10 Abstract This document defines FFV1, a lossless intra-frame video encoding format. FFV1 is designed to efficiently compress video data in a variety of pixel formats. Compared to uncompressed video, FFV1 offers storage compression, frame fixity, and self-description, which makes FFV1 useful as a preservation or intermediate video format. Status of This Memo @@ -25,21 +25,21 @@ Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at https://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." - This Internet-Draft will expire on 30 July 2020. + This Internet-Draft will expire on 30 October 2020. Copyright Notice Copyright (c) 2020 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/ license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights @@ -56,98 +56,97 @@ 2.2. Conventions . . . . . . . . . . . . . . . . . . . . . . . 5 2.2.1. Pseudo-code . . . . . . . . . . . . . . . . . . . . . 5 2.2.2. 
Arithmetic Operators . . . . . . . . . . . . . . . . 5 2.2.3. Assignment Operators . . . . . . . . . . . . . . . . 6 2.2.4. Comparison Operators . . . . . . . . . . . . . . . . 6 2.2.5. Mathematical Functions . . . . . . . . . . . . . . . 7 2.2.6. Order of Operation Precedence . . . . . . . . . . . . 7 2.2.7. Range . . . . . . . . . . . . . . . . . . . . . . . . 8 2.2.8. NumBytes . . . . . . . . . . . . . . . . . . . . . . 8 2.2.9. Bitstream Functions . . . . . . . . . . . . . . . . . 8 - 3. Sample Coding . . . . . . . . . . . . . . . . . . . . . . . . 8 + 3. Sample Coding . . . . . . . . . . . . . . . . . . . . . . . . 9 3.1. Border . . . . . . . . . . . . . . . . . . . . . . . . . 9 - 3.2. Samples . . . . . . . . . . . . . . . . . . . . . . . . . 9 + 3.2. Samples . . . . . . . . . . . . . . . . . . . . . . . . . 10 3.3. Median Predictor . . . . . . . . . . . . . . . . . . . . 10 3.4. Context . . . . . . . . . . . . . . . . . . . . . . . . . 11 - 3.5. Quantization Table Sets . . . . . . . . . . . . . . . . . 11 - 3.6. Quantization Table Set Indexes . . . . . . . . . . . . . 11 + 3.5. Quantization Table Sets . . . . . . . . . . . . . . . . . 12 + 3.6. Quantization Table Set Indexes . . . . . . . . . . . . . 12 3.7. Color spaces . . . . . . . . . . . . . . . . . . . . . . 12 - 3.7.1. YCbCr . . . . . . . . . . . . . . . . . . . . . . . . 12 + 3.7.1. YCbCr . . . . . . . . . . . . . . . . . . . . . . . . 13 3.7.2. RGB . . . . . . . . . . . . . . . . . . . . . . . . . 13 - 3.8. Coding of the Sample Difference . . . . . . . . . . . . . 14 - 3.8.1. Range Coding Mode . . . . . . . . . . . . . . . . . . 14 - 3.8.2. Golomb Rice Mode . . . . . . . . . . . . . . . . . . 19 - 4. Bitstream . . . . . . . . . . . . . . . . . . . . . . . . . . 24 - 4.1. Parameters . . . . . . . . . . . . . . . . . . . . . . . 25 - 4.1.1. version . . . . . . . . . . . . . . . . . . . . . . . 27 - 4.1.2. micro_version . . . . . . . . . . . . . . . . . . . . 27 - 4.1.3. coder_type . . . . . . . . . . 
. . . . . . . . . . . 28 - 4.1.4. state_transition_delta . . . . . . . . . . . . . . . 29 - 4.1.5. colorspace_type . . . . . . . . . . . . . . . . . . . 29 - 4.1.6. chroma_planes . . . . . . . . . . . . . . . . . . . . 30 - 4.1.7. bits_per_raw_sample . . . . . . . . . . . . . . . . . 30 - 4.1.8. log2_h_chroma_subsample . . . . . . . . . . . . . . . 30 - 4.1.9. log2_v_chroma_subsample . . . . . . . . . . . . . . . 31 - 4.1.10. extra_plane . . . . . . . . . . . . . . . . . . . . . 31 - 4.1.11. num_h_slices . . . . . . . . . . . . . . . . . . . . 31 - 4.1.12. num_v_slices . . . . . . . . . . . . . . . . . . . . 31 - 4.1.13. quant_table_set_count . . . . . . . . . . . . . . . . 31 - 4.1.14. states_coded . . . . . . . . . . . . . . . . . . . . 31 - 4.1.15. initial_state_delta . . . . . . . . . . . . . . . . . 32 - 4.1.16. ec . . . . . . . . . . . . . . . . . . . . . . . . . 32 - 4.1.17. intra . . . . . . . . . . . . . . . . . . . . . . . . 32 - 4.2. Configuration Record . . . . . . . . . . . . . . . . . . 33 - 4.2.1. reserved_for_future_use . . . . . . . . . . . . . . . 33 - 4.2.2. configuration_record_crc_parity . . . . . . . . . . . 33 - 4.2.3. Mapping FFV1 into Containers . . . . . . . . . . . . 34 - 4.3. Frame . . . . . . . . . . . . . . . . . . . . . . . . . . 35 - 4.4. Slice . . . . . . . . . . . . . . . . . . . . . . . . . . 36 - 4.5. Slice Header . . . . . . . . . . . . . . . . . . . . . . 37 - 4.5.1. slice_x . . . . . . . . . . . . . . . . . . . . . . . 38 - 4.5.2. slice_y . . . . . . . . . . . . . . . . . . . . . . . 38 - 4.5.3. slice_width . . . . . . . . . . . . . . . . . . . . . 38 - 4.5.4. slice_height . . . . . . . . . . . . . . . . . . . . 38 - 4.5.5. quant_table_set_index_count . . . . . . . . . . . . . 39 - 4.5.6. quant_table_set_index . . . . . . . . . . . . . . . . 39 - 4.5.7. picture_structure . . . . . . . . . . . . . . . . . . 39 - 4.5.8. sar_num . . . . . . . . . . . . . . . . . . . . . . . 39 - 4.5.9. sar_den . . . . . . . . . . . . . . . . . . 
. . . . . 40 - 4.5.10. reset_contexts . . . . . . . . . . . . . . . . . . . 40 - 4.5.11. slice_coding_mode . . . . . . . . . . . . . . . . . . 40 - 4.6. Slice Content . . . . . . . . . . . . . . . . . . . . . . 40 - 4.6.1. primary_color_count . . . . . . . . . . . . . . . . . 41 - 4.6.2. plane_pixel_height . . . . . . . . . . . . . . . . . 41 - 4.6.3. slice_pixel_height . . . . . . . . . . . . . . . . . 41 - 4.6.4. slice_pixel_y . . . . . . . . . . . . . . . . . . . . 41 - 4.7. Line . . . . . . . . . . . . . . . . . . . . . . . . . . 42 - 4.7.1. plane_pixel_width . . . . . . . . . . . . . . . . . . 42 - 4.7.2. slice_pixel_width . . . . . . . . . . . . . . . . . . 42 - 4.7.3. slice_pixel_x . . . . . . . . . . . . . . . . . . . . 42 - 4.7.4. sample_difference . . . . . . . . . . . . . . . . . . 43 - 4.8. Slice Footer . . . . . . . . . . . . . . . . . . . . . . 43 - 4.8.1. slice_size . . . . . . . . . . . . . . . . . . . . . 43 - 4.8.2. error_status . . . . . . . . . . . . . . . . . . . . 43 - 4.8.3. slice_crc_parity . . . . . . . . . . . . . . . . . . 44 - 4.9. Quantization Table Set . . . . . . . . . . . . . . . . . 44 - 4.9.1. quant_tables . . . . . . . . . . . . . . . . . . . . 45 - 4.9.2. context_count . . . . . . . . . . . . . . . . . . . . 45 - 5. Restrictions . . . . . . . . . . . . . . . . . . . . . . . . 46 - 6. Security Considerations . . . . . . . . . . . . . . . . . . . 46 - 7. Media Type Definition . . . . . . . . . . . . . . . . . . . . 47 - 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 48 - 9. Appendixes . . . . . . . . . . . . . . . . . . . . . . . . . 49 - 9.1. Decoder implementation suggestions . . . . . . . . . . . 49 - 9.1.1. Multi-threading Support and Independence of Slices . 49 - 10. Changelog . . . . . . . . . . . . . . . . . . . . . . . . . . 49 - 11. Normative References . . . . . . . . . . . . . . . . . . . . 49 - 12. Informative References . . . . . . . . . . . . . . . . . . . 50 - Authors' Addresses . . . . . . . . . . 
. . . . . . . . . . . . . 51 + 3.8. Coding of the Sample Difference . . . . . . . . . . . . . 15 + 3.8.1. Range Coding Mode . . . . . . . . . . . . . . . . . . 15 + 3.8.2. Golomb Rice Mode . . . . . . . . . . . . . . . . . . 20 + 4. Bitstream . . . . . . . . . . . . . . . . . . . . . . . . . . 25 + 4.1. Parameters . . . . . . . . . . . . . . . . . . . . . . . 26 + 4.1.1. version . . . . . . . . . . . . . . . . . . . . . . . 28 + 4.1.2. micro_version . . . . . . . . . . . . . . . . . . . . 28 + 4.1.3. coder_type . . . . . . . . . . . . . . . . . . . . . 29 + 4.1.4. state_transition_delta . . . . . . . . . . . . . . . 30 + 4.1.5. colorspace_type . . . . . . . . . . . . . . . . . . . 30 + 4.1.6. chroma_planes . . . . . . . . . . . . . . . . . . . . 31 + 4.1.7. bits_per_raw_sample . . . . . . . . . . . . . . . . . 31 + 4.1.8. log2_h_chroma_subsample . . . . . . . . . . . . . . . 32 + 4.1.9. log2_v_chroma_subsample . . . . . . . . . . . . . . . 32 + 4.1.10. "extra_plane" . . . . . . . . . . . . . . . . . . . 32 + 4.1.11. num_h_slices . . . . . . . . . . . . . . . . . . . . 32 + 4.1.12. num_v_slices . . . . . . . . . . . . . . . . . . . . 33 + 4.1.13. quant_table_set_count . . . . . . . . . . . . . . . . 33 + 4.1.14. states_coded . . . . . . . . . . . . . . . . . . . . 33 + 4.1.15. initial_state_delta . . . . . . . . . . . . . . . . . 33 + 4.1.16. ec . . . . . . . . . . . . . . . . . . . . . . . . . 34 + 4.1.17. intra . . . . . . . . . . . . . . . . . . . . . . . . 34 + 4.2. Configuration Record . . . . . . . . . . . . . . . . . . 34 + 4.2.1. reserved_for_future_use . . . . . . . . . . . . . . . 35 + 4.2.2. configuration_record_crc_parity . . . . . . . . . . . 35 + 4.2.3. Mapping FFV1 into Containers . . . . . . . . . . . . 35 + 4.3. Frame . . . . . . . . . . . . . . . . . . . . . . . . . . 36 + 4.4. Slice . . . . . . . . . . . . . . . . . . . . . . . . . . 38 + 4.5. Slice Header . . . . . . . . . . . . . . . . . . . . . . 39 + 4.5.1. slice_x . . . . . . . . . . . . . . 
. . . . . . . . . 39 + 4.5.2. slice_y . . . . . . . . . . . . . . . . . . . . . . . 39 + 4.5.3. slice_width . . . . . . . . . . . . . . . . . . . . . 40 + 4.5.4. slice_height . . . . . . . . . . . . . . . . . . . . 40 + 4.5.5. quant_table_set_index_count . . . . . . . . . . . . . 40 + 4.5.6. quant_table_set_index . . . . . . . . . . . . . . . . 40 + 4.5.7. picture_structure . . . . . . . . . . . . . . . . . . 40 + 4.5.8. sar_num . . . . . . . . . . . . . . . . . . . . . . . 41 + 4.5.9. sar_den . . . . . . . . . . . . . . . . . . . . . . . 41 + 4.5.10. reset_contexts . . . . . . . . . . . . . . . . . . . 41 + 4.5.11. slice_coding_mode . . . . . . . . . . . . . . . . . . 42 + 4.6. Slice Content . . . . . . . . . . . . . . . . . . . . . . 42 + 4.6.1. primary_color_count . . . . . . . . . . . . . . . . . 42 + 4.6.2. plane_pixel_height . . . . . . . . . . . . . . . . . 43 + 4.6.3. slice_pixel_height . . . . . . . . . . . . . . . . . 43 + 4.6.4. slice_pixel_y . . . . . . . . . . . . . . . . . . . . 43 + 4.7. Line . . . . . . . . . . . . . . . . . . . . . . . . . . 43 + 4.7.1. plane_pixel_width . . . . . . . . . . . . . . . . . . 44 + 4.7.2. slice_pixel_width . . . . . . . . . . . . . . . . . . 44 + 4.7.3. slice_pixel_x . . . . . . . . . . . . . . . . . . . . 44 + 4.7.4. sample_difference . . . . . . . . . . . . . . . . . . 44 + 4.8. Slice Footer . . . . . . . . . . . . . . . . . . . . . . 44 + 4.8.1. slice_size . . . . . . . . . . . . . . . . . . . . . 45 + 4.8.2. error_status . . . . . . . . . . . . . . . . . . . . 45 + 4.8.3. slice_crc_parity . . . . . . . . . . . . . . . . . . 45 + 4.9. Quantization Table Set . . . . . . . . . . . . . . . . . 46 + 4.9.1. quant_tables . . . . . . . . . . . . . . . . . . . . 47 + 4.9.2. context_count . . . . . . . . . . . . . . . . . . . . 47 + 5. Restrictions . . . . . . . . . . . . . . . . . . . . . . . . 47 + 6. Security Considerations . . . . . . . . . . . . . . . . . . . 48 + 7. Media Type Definition . . . . . . . . . . . . . . . . 
. . . . 49 + 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 50 + 9. Appendix A: Multi-threaded decoder implementation + suggestions . . . . . . . . . . . . . . . . . . . . . . . 50 + 10. Changelog . . . . . . . . . . . . . . . . . . . . . . . . . . 51 + 11. Normative References . . . . . . . . . . . . . . . . . . . . 51 + 12. Informative References . . . . . . . . . . . . . . . . . . . 52 + Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 53 1. Introduction This document describes FFV1, a lossless video encoding format. The design of FFV1 considers the storage of image characteristics, data fixity, and the optimized use of encoding time and storage requirements. FFV1 is designed to support a wide range of lossless video applications such as long-term audiovisual preservation, scientific imaging, screen recording, and other video encoding scenarios that seek to avoid the generational loss of lossy video @@ -220,40 +219,43 @@ The FFV1 bitstream is described in this document using pseudo-code. Note that the pseudo-code is used for clarity in order to illustrate the structure of FFV1 and is not intended to specify any particular implementation. The pseudo-code used is based upon the C programming language [ISO.9899.1990] and uses its "if/else", "while" and "for" keywords as well as functions defined within this document. 2.2.2. Arithmetic Operators Note: the operators and the order of precedence are the same as used - in the C programming language [ISO.9899.2018]. + in the C programming language [ISO.9899.2018], with the exception of + the ">>" (removal of implementation-defined behavior) and "^" (power + instead of XOR) operators, which are redefined within this section. "a + b" means a plus b. "a - b" means a minus b. "-a" means negation of a. "a * b" means a multiplied by b. "a / b" means a divided by b. "a ^ b" means a raised to the b-th power. "a & b" means bit-wise "and" of a and b. "a | b" means bit-wise "or" of a and b. 
"a >> b" means arithmetic right shift of two's complement integer - representation of a by b binary digits. + representation of a by b binary digits. This is equivalent to + dividing a by 2, b times, with rounding toward negative infinity. "a << b" means arithmetic left shift of two's complement integer representation of a by b binary digits. 2.2.3. Assignment Operators "a = b" means a is assigned b. "a++" is equivalent to a is assigned a + 1. @@ -300,23 +302,23 @@ log2(a) the base-two logarithm of a min(a,b) the smaller of two values a and b max(a,b) the larger of two values a and b median(a,b,c) the numerical middle value in a data set of a, b, and c, i.e. a+b+c-min(a,b,c)-max(a,b,c) - a_(b) the b-th value of a sequence of a + A <== B B implies A - a~b,c. the 'b,c'-th value of a sequence of a + A <==> B A <== B , B <== A 2.2.6. Order of Operation Precedence When order of precedence is not indicated explicitly by use of parentheses, operations are evaluated in the following order (from top to bottom, operations of same precedence being evaluated from left to right). This order of operations is based on the order of operations used in Standard C. 
a++, a-- @@ -584,43 +587,49 @@ Cb=b-g Cr=r-g Y=g+(Cb+Cr)>>2 g=Y-(Cb+Cr)>>2 r=Cr+g b=Cb+g Figure 5 - Exception for the JPEG2000-RCT conversion: if bits_per_raw_sample is - between 9 and 15 inclusive and extra_plane is 0, the following + Exception for the JPEG2000-RCT conversion: if "bits_per_raw_sample" + is between 9 and 15 inclusive and "extra_plane" is 0, the following formulae for reversible conversions between YCbCr and RGB MUST be used instead of the ones above: Cb=g-b Cr=r-b Y=b+(Cb+Cr)>>2 b=Y-(Cb+Cr)>>2 r=Cr+b g=Cb+b Figure 6 Background: At the time of this writing, in all known implementations - of FFV1 bitstream, when bits_per_raw_sample was between 9 and 15 - inclusive and extra_plane is 0, GBR "Planes" were used as BGR + of FFV1 bitstream, when "bits_per_raw_sample" was between 9 and 15 + inclusive and "extra_plane" was 0, GBR "Planes" were used as BGR "Planes" during both encoding and decoding. Meanwhile, 16-bit JPEG2000-RCT was implemented without this issue in one implementation and validated by one conformance checker. Methods to address this exception for the transform are under consideration for the next version of the FFV1 bitstream. + Cb and Cr are positively offset by "1 << bits_per_raw_sample" after + the conversion from RGB to the modified YCbCr and are negatively + offset by the same value before the conversion from the modified + YCbCr to RGB, in order to have only non-negative values after the + conversion. + When FFV1 uses the JPEG2000-RCT, the horizontal "Lines" are interleaved to improve caching efficiency since it is most likely that the JPEG2000-RCT will immediately be converted to RGB during decoding. The interleaved coding order is also Y, then Cb, then Cr, and then, if used, transparency. 
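The reversible transform of Figure 5 and the Cb/Cr offset described above can be sketched in C as follows. This is a minimal illustration, not the reference implementation. The names "rgb_px", "rct_px", "floor_shift", "rgb_to_rct" and "rct_to_rgb" are hypothetical. The sketch assumes that the shift applies to "(Cb+Cr)" only, as in JPEG2000-RCT, and it does not handle the Figure 6 exception.

```c
#include <stdint.h>

/* Hypothetical pixel types, introduced only for this sketch. */
typedef struct { int32_t r, g, b; } rgb_px;
typedef struct { int32_t y, cb, cr; } rct_px;

/* ">>" as the draft defines it: divide by 2, b times, rounding toward
 * negative infinity. Written with division so that it does not depend on
 * C's implementation-defined right shift of negative values. */
static int32_t floor_shift(int32_t a, unsigned b)
{
    int32_t d = (int32_t)1 << b;
    int32_t q = a / d;              /* C division truncates toward zero */
    if (a < 0 && (a % d) != 0)
        q--;                        /* step down to the floor */
    return q;
}

/* RGB to the modified YCbCr (Figure 5), then offset Cb and Cr. */
static rct_px rgb_to_rct(rgb_px p, unsigned bits_per_raw_sample)
{
    int32_t offset = (int32_t)1 << bits_per_raw_sample;
    int32_t cb = p.b - p.g;
    int32_t cr = p.r - p.g;
    rct_px out;
    /* ASSUMPTION: the shift covers (Cb+Cr) only, as in JPEG2000-RCT. */
    out.y  = p.g + floor_shift(cb + cr, 2);
    out.cb = cb + offset;           /* keeps Cb non-negative */
    out.cr = cr + offset;           /* keeps Cr non-negative */
    return out;
}

/* Remove the offset, then convert the modified YCbCr back to RGB. */
static rgb_px rct_to_rgb(rct_px c, unsigned bits_per_raw_sample)
{
    int32_t offset = (int32_t)1 << bits_per_raw_sample;
    int32_t cb = c.cb - offset;
    int32_t cr = c.cr - offset;
    rgb_px out;
    out.g = c.y - floor_shift(cb + cr, 2);
    out.r = cr + out.g;
    out.b = cb + out.g;
    return out;
}
```

For an 8-bit "Pixel" with r=255, g=0, b=0, this sketch yields Y=63, Cb=256, Cr=511, and "rct_to_rgb" recovers the original values.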
As an example, a "Frame" that is two "Pixels" wide and two "Pixels" high could comprise the following structure: +------------------------+------------------------+ @@ -637,43 +645,43 @@ Y(1,1) Y(2,1) Cb(1,1) Cb(2,1) Cr(1,1) Cr(2,1) Y(1,2) Y(2,2) Cb(1,2) Cb(2,2) Cr(1,2) Cr(2,2) 3.8. Coding of the Sample Difference Instead of coding the n+1 bits of the Sample Difference with Huffman or Range coding (or n+2 bits, in the case of JPEG2000-RCT), only the n (or n+1, in the case of JPEG2000-RCT) least significant bits are used, since this is sufficient to recover the original "Sample". In - the equation below, the term "bits" represents bits_per_raw_sample+1 - for JPEG2000-RCT or bits_per_raw_sample otherwise: + the equation below, the term "bits" represents "bits_per_raw_sample + + 1" for JPEG2000-RCT or "bits_per_raw_sample" otherwise: coder_input = [(sample_difference + 2^(bits-1)) & (2^bits - 1)] - 2^(bits-1) Figure 7 3.8.1. Range Coding Mode Early experimental versions of FFV1 used the CABAC Arithmetic coder from H.264 as defined in [ISO.14496-10.2014] but due to the uncertain patent/royalty situation, as well as its slightly worse performance, CABAC was replaced by a Range coder based on an algorithm defined by G. Nigel N. Martin in 1979 [range-coding]. 3.8.1.1. Range Binary Values - To encode binary digits efficiently a Range coder is used. "C~i~" is - the i-th Context. "B~i~" is the i-th byte of the bytestream. "b~i~" - is the i-th Range coded binary value, "S~0,i~" is the i-th initial + To encode binary digits efficiently a Range coder is used. "C(i)" is + the i-th Context. "B(i)" is the i-th byte of the bytestream. "b(i)" + is the i-th Range coded binary value, "S(0,i)" is the i-th initial state. The length of the bytestream encoding n binary symbols is - "j~n~" bytes. + "j(n)" bytes. 
r_{i} = floor( ( R_{i} * S_{i,C_{i}} ) / 2^8 ) Figure 8 S_{i+1,C_{i}} = zero_state_{S_{i,C_{i}}} AND l_i = L_i AND t_i = R_i - r_i <== b_i = 0 <==> L_i < R_i - r_i @@ -840,21 +848,21 @@ 210,211,212,213,215,215,216,217,218,219,220,220,222,223,224,225, 226,227,227,229,229,230,231,232,234,234,235,236,237,238,239,240, 241,242,243,244,245,246,247,248,248, 0, 0, 0, 0, 0, 0, 0, 3.8.1.6. Alternative State Transition Table The alternative state transition table has been built using iterative minimization of frame sizes and generally performs better than the - default. To use it, the coder_type (see Section 4.1.3) MUST be set + default. To use it, the "coder_type" (see Section 4.1.3) MUST be set to 2 and the difference to the default MUST be stored in the "Parameters", see Section 4.1. The reference implementation of FFV1 in FFmpeg uses Figure 18 by default at the time of this writing when Range coding is used. 0, 10, 10, 10, 10, 16, 16, 16, 28, 16, 16, 29, 42, 49, 20, 49, 59, 25, 26, 26, 27, 31, 33, 33, 33, 34, 34, 37, 67, 38, 39, 39, 40, 40, 41, 79, 43, 44, 45, 45, 48, 48, 64, 50, 51, 52, 88, 52, @@ -920,20 +928,24 @@ +----------------+-------+ | bits | value | +================+=======+ | 1 | 0 | +----------------+-------+ | 01 | 1 | +----------------+-------+ | ... | ... | +----------------+-------+ + | 0000 0000 01 | 9 | + +----------------+-------+ + | 0000 0000 001 | 10 | + +----------------+-------+ | 0000 0000 0001 | 11 | +----------------+-------+ | 0000 0000 0000 | ESC | +----------------+-------+ Table 1 3.8.2.1.2. Suffix +---------+--------------------------------------------------+ @@ -968,21 +979,21 @@ 3.8.2.2. Run Mode Run mode is entered when the context is 0 and left as soon as a non-0 difference is found. The level is identical to the predicted one. The run and the first different level are coded. 3.8.2.2.1. 
Run Length Coding The run value is encoded in 2 parts, the prefix part stores the more - significant part of the run as well as adjusting the run_index that + significant part of the run as well as adjusting the "run_index" that determines the number of bits in the less significant part of the run. The 2nd part of the value stores the less significant part of the run as it is. The run_index is reset for each "Plane" and slice to 0. pseudo-code | type --------------------------------------------------------------|----- log2_run[41]={ | 0, 0, 0, 0, 1, 1, 1, 1, | 2, 2, 2, 2, 3, 3, 3, 3, | @@ -1004,21 +1015,21 @@ } else { | run_count = 0; | } | if (run_index) { | run_index--; | } | run_mode = 2; | } | } | - The log2_run function is also used within [ISO.14495-1.1999]. + The "log2_run" function is also used within [ISO.14495-1.1999]. 3.8.2.2.2. Level Coding Level coding is identical to the normal difference coding with the exception that the 0 value is removed as it cannot occur: diff = get_vlc_symbol(context_state); if (diff >= 0) { diff++; } @@ -1224,53 +1235,53 @@ "micro_version" specifies the micro-version of the FFV1 bitstream. After a version is considered stable (a micro-version value is assigned to be the first stable variant of a specific version), each new micro-version after this first stable variant is compatible with the previous micro-version: decoders SHOULD NOT reject a file due to an unknown micro-version equal or above the micro-version considered as stable. - Meaning of micro_version for version 3: + Meaning of "micro_version" for version 3: +-------+-------------------------+ | value | micro_version | +=======+=========================+ | 0...3 | reserved* | +-------+-------------------------+ | 4 | first stable variant | +-------+-------------------------+ | Other | reserved for future use | +-------+-------------------------+ Table 6: The definitions for - micro_version values. + "micro_version" values. 
* development versions may be incompatible with the stable variants. - Meaning of micro_version for version 4 (note: at the time of writing - of this specification, version 4 is not considered stable so the - first stable version value is to be announced in the future): + Meaning of "micro_version" for version 4 (note: at the time of + writing of this specification, version 4 is not considered stable so + the first stable version value is to be announced in the future): +---------+-------------------------+ | value | micro_version | +=========+=========================+ | 0...TBA | reserved* | +---------+-------------------------+ | TBA | first stable variant | +---------+-------------------------+ | Other | reserved for future use | +---------+-------------------------+ Table 7: The definitions for - micro_version values for FFV1 + "micro_version" values for FFV1 version 4. * development versions may be incompatible with the stable variants. 4.1.3. coder_type "coder_type" specifies the coder used. +-------+-------------------------------------------------+ @@ -1280,26 +1291,35 @@ +-------+-------------------------------------------------+ | 1 | Range Coder with default state transition table | +-------+-------------------------------------------------+ | 2 | Range Coder with custom state transition table | +-------+-------------------------------------------------+ | Other | reserved for future use | +-------+-------------------------------------------------+ Table 8 + Restrictions: + + If "coder_type" is 0, then "bits_per_raw_sample" SHOULD NOT be > 8. + + Background: At the time of this writing, there are no known + implementations of FFV1 bitstream supporting the Golomb Rice algorithm + with "bits_per_raw_sample" greater than 8, and Range Coder is + preferred. + 4.1.4. state_transition_delta "state_transition_delta" specifies the Range coder custom state transition table. 
- If state_transition_delta is not present in the FFV1 bitstream, all + If "state_transition_delta" is not present in the FFV1 bitstream, all Range coder custom state transition table elements are assumed to be 0. 4.1.5. colorspace_type "colorspace_type" specifies the color space encoded, the pixel transformation used by the encoder, the extra plane content, as well as the interleave method. +-------+-------------+----------------+--------------+-------------+ @@ -1349,36 +1369,36 @@ +-------+-----------------------------------+ | value | bits for each sample | +=======+===================================+ | 0 | reserved* | +-------+-----------------------------------+ | Other | the actual bits for each "Sample" | +-------+-----------------------------------+ Table 11 - * Encoders MUST NOT store bits_per_raw_sample = 0 Decoders SHOULD - accept and interpret bits_per_raw_sample = 0 as 8. + * Encoders MUST NOT store "bits_per_raw_sample" = 0. Decoders SHOULD + accept and interpret "bits_per_raw_sample" = 0 as 8. 4.1.8. log2_h_chroma_subsample "log2_h_chroma_subsample" indicates the subsample factor, stored in powers to which the number 2 must be raised, between luma and chroma width ("chroma_width = 2^(-log2_h_chroma_subsample) * luma_width"). 4.1.9. log2_v_chroma_subsample "log2_v_chroma_subsample" indicates the subsample factor, stored in powers to which the number 2 must be raised, between luma and chroma height ("chroma_height = 2^(-log2_v_chroma_subsample) * luma_height"). -4.1.10. extra_plane +4.1.10. "extra_plane" "extra_plane" indicates if an extra "Plane" is present. 
+-------+------------------------------+ | value | presence | +=======+==============================+ | 0 | extra "Plane" is not present | +-------+------------------------------+ | 1 | extra "Plane" is present | +-------+------------------------------+ @@ -1451,35 +1471,34 @@ +-------+--------------------------------------------+ | 1 | 32-bit CRC per slice and the global header | +-------+--------------------------------------------+ | Other | reserved for future use | +-------+--------------------------------------------+ Table 14 4.1.17. intra - "intra" indicates the relationship between the instances of "Frame". + "intra" indicates the constraint on "keyframe" in each instance of + "Frame". Inferred to be 0 if not present. - +-------+-------------------------------------+ + +-------+-------------------------------------------------------+ | value | relationship | - +=======+=====================================+ - | 0 | Frames are independent or dependent | - | | (keyframes and non keyframes) | - +-------+-------------------------------------+ - | 1 | Frames are independent (keyframes | - | | only) | - +-------+-------------------------------------+ + +=======+=======================================================+ + | 0 | "keyframe" can be 0 or 1 (non keyframes or keyframes) | + +-------+-------------------------------------------------------+ + | 1 | "keyframe" MUST be 1 (keyframes only) | + +-------+-------------------------------------------------------+ | Other | reserved for future use | - +-------+-------------------------------------+ + +-------+-------------------------------------------------------+ Table 15 4.2. Configuration Record In the case of a FFV1 bitstream with "version >= 3", a "Configuration Record" is stored in the underlying "Container", at the track header level. It contains the "Parameters" used for all instances of "Frame". The size of the "Configuration Record", "NumBytes", is supplied by the underlying "Container". 
@@ -1541,47 +1560,47 @@ The "Configuration Record" extends the sample description box ("moov", "trak", "mdia", "minf", "stbl", "stsd") with a "glbl" box that contains the ConfigurationRecord bitstream. See [ISO.14496-12.2015] for more information about boxes. "NumBytes" is defined as the size, in bytes, of the "glbl" box indicated in the box header minus the size of the box header. 4.2.3.3. NUT File Format - The codec_specific_data element (in "stream_header" packet) contains - the ConfigurationRecord bitstream. See [NUT] for more information - about elements. + The "codec_specific_data" element (in the "stream_header" packet) + contains the ConfigurationRecord bitstream. See [NUT] for more + information about elements. "NumBytes" is defined as the size, in bytes, of the - codec_specific_data element as indicated in the "length" field of - codec_specific_data + "codec_specific_data" element as indicated in the "length" field of + "codec_specific_data". 4.2.3.4. Matroska File Format FFV1 SHOULD use "V_FFV1" as the Matroska "Codec ID". For FFV1 versions 2 or less, the Matroska "CodecPrivate" Element SHOULD NOT be used. For FFV1 versions 3 or greater, the Matroska "CodecPrivate" Element MUST contain the FFV1 "Configuration Record" structure and no other data. See [Matroska] for more information about elements. "NumBytes" is defined as the "Element Data Size" of the "CodecPrivate" Element. 4.3. Frame A "Frame" is an encoded representation of a complete static image. The whole "Frame" is provided by the underlying container. - A "Frame" consists of the keyframe field, "Parameters" (if version - <=1), and a sequence of independent slices. The pseudo-code below - describes the contents of a "Frame". + A "Frame" consists of the "keyframe" field, "Parameters" (if + "version" <=1), and a sequence of independent slices. The pseudo- + code below describes the contents of a "Frame". 
pseudo-code | type --------------------------------------------------------------|----- Frame( NumBytes ) { | keyframe | br if (keyframe && !ConfigurationRecordIsPresent) { | Parameters( ) | } | while (remaining_bits_in_bitstream( NumBytes )) { | Slice( ) | @@ -1661,22 +1680,22 @@ "reserved" specifies a bit without any significance in this revision of the specification and may have a significance in a later revision of this specification. Encoders SHOULD NOT fill these bits. Decoders SHOULD ignore these bits. Note in case these bits are used in a later revision of this specification: any revision of this specification SHOULD care about - avoiding to add 40 bits of content after "SliceContent" for version 0 - and 1 of the bitstream. Background: Due to some non-conforming + avoiding to add 40 bits of content after "SliceContent" for "version" + 0 and 1 of the bitstream. Background: Due to some non-conforming encoders, some bitstreams were found with 40 extra bits corresponding to "error_status" and "slice_crc_parity". As a result, a decoder conforming to the revised specification could not distinguish between a revised bitstream and a buggy bitstream. 4.5. Slice Header A "Slice Header" provides information about the decoding configuration of the "Slice", such as its spatial position, size, and aspect ratio. The pseudo-code below describes the contents of the @@ -1725,21 +1744,21 @@ 4.5.4. slice_height "slice_height" indicates the height on the slice raster formed by num_v_slices. Inferred to be 1 if not present. 4.5.5. quant_table_set_index_count "quant_table_set_index_count" is defined as "1 + ( ( chroma_planes || - version <= 3 ) ? 1 : 0 ) + ( extra_plane ? 1 : 0 )". + version <= 3 ) ? 1 : 0 ) + ("extra_plane"? 1 : 0 )". 4.5.6. quant_table_set_index "quant_table_set_index" indicates the Quantization Table Set index to select the Quantization Table Set and the initial states for the slice. Inferred to be 0 if not present. 4.5.7. 
picture_structure @@ -1836,21 +1855,21 @@ for (p = 0; p < primary_color_count; p++) { | Line( p, y ) | } | } | } | } | 4.6.1. primary_color_count "primary_color_count" is defined as "1 + ( chroma_planes ? 2 : 0 ) + - ( extra_plane ? 1 : 0 )". + ("extra_plane"? 1 : 0 )". 4.6.2. plane_pixel_height "plane_pixel_height[ p ]" is the height in pixels of plane p of the slice. "plane_pixel_height[ 0 ]" and "plane_pixel_height[ 1 + ( chroma_planes ? 2 : 0 ) ]" value is "slice_pixel_height". If "chroma_planes" is set to 1, "plane_pixel_height[ 1 ]" and @@ -2035,33 +2055,33 @@ 4.9.2. context_count "context_count[ i ]" indicates the count of contexts for Quantization Table Set "i". "context_count[ i ]" MUST be less than or equal to 32768. 5. Restrictions To ensure that fast multithreaded decoding is possible, starting with - version 3 and if "frame_pixel_width * frame_pixel_height" is more + "version" 3 and if "frame_pixel_width * frame_pixel_height" is more than 101376, "slice_width * slice_height" MUST be less than or equal to "num_h_slices * num_v_slices / 4". Note: 101376 is the frame size in "Pixels" of a 352x288 frame also known as CIF ("Common Intermediate Format") frame size format. For each "Frame", each position in the slice raster MUST be filled by one and only one slice of the "Frame" (no missing slice position, no slice overlapping). For each "Frame" with keyframe value of 0, each slice MUST have the - same value of "slice_x, slice_y, slice_width, slice_height" as a - slice in the previous "Frame", except if "reset_contexts" is 1. + For each "Frame" with "keyframe" value of 0, each slice MUST have the + same value of "slice_x", "slice_y", "slice_width", "slice_height" as + a slice in the previous "Frame", except if "reset_contexts" is 1. 6. Security Considerations Like any other codec (such as [RFC6716]), FFV1 should not be used with insecure ciphers or cipher-modes that are vulnerable to known plaintext attacks. 
  Some of the header bits as well as the padding are easily
  predictable.

  Implementations of the FFV1 codec need to take appropriate security
  considerations into account, as outlined in [RFC4732]. It is
@@ -2097,68 +2117,68 @@
  * Sending the decoder random packets that are not FFV1.

  In all of the conditions above, the decoder and encoder was run
  inside the [VALGRIND] memory debugger as well as clangs address
  sanitizer [Address-Sanitizer], which track reads and writes to
  invalid memory regions as well as the use of uninitialized memory.
  There were no errors reported on any of the tested conditions.

 7. Media Type Definition

- This registration is done using the template defined in [RFC6838] and
- following [RFC4855].
+ This section completes the media type registration template defined
+ in [RFC6838] and following [RFC4855].

  Type name: video

  Subtype name: FFV1

  Required parameters: None.

  Optional parameters:

  This parameter is used to signal the capabilities of a receiver
  implementation. This parameter MUST NOT be used for any other
  purpose.

- version: The version of the FFV1 encoding as defined by
+ "version": The version of the FFV1 encoding as defined by
  Section 4.1.1.

- micro_version: The micro_version of the FFV1 encoding as defined by
- Section 4.1.2.
+ "micro_version": The "micro_version" of the FFV1 encoding as defined
+ by Section 4.1.2.

- coder_type: The coder_type of the FFV1 encoding as defined by
+ "coder_type": The "coder_type" of the FFV1 encoding as defined by
  Section 4.1.3.

- colorspace_type: The colorspace_type of the FFV1 encoding as defined
- by Section 4.1.5.
+ "colorspace_type": The "colorspace_type" of the FFV1 encoding as
+ defined by Section 4.1.5.

- bits_per_raw_sample: The bits_per_raw_sample of the FFV1 encoding as
- defined by Section 4.1.7.
+ "bits_per_raw_sample": The "bits_per_raw_sample" of the FFV1 encoding
+ as defined by Section 4.1.7.
- max-slices: The value of max-slices is an integer indicating the
+ "max_slices": The value of "max_slices" is an integer indicating the
  maximum count of slices with a frames of the FFV1 encoding.

  Encoding considerations:

  This media type is defined for encapsulation in several audiovisual
  container formats and contains binary data; see Section 4.2.3. This
- media type is framed binary data Section 4.8 of [RFC6838].
+ media type is framed binary data; see Section 4.8 of [RFC6838].

  Security considerations:

  See Section 6 of this document.

  Interoperability considerations: None.

  Published specification:

- [I-D.ietf-cellar-ffv1] and RFC XXXX.
+ RFC XXXX.

  [RFC Editor: Upon publication as an RFC, please replace "XXXX" with
  the number assigned to this document and remove this note.]

  Applications which use this media type:

  Any application that requires the transport of lossless video can use
  this media type. Some examples are, but not limited to screen
  recording, scientific imaging, and digital video preservation.

@@ -2176,146 +2196,145 @@
  Author: Dave Rice dave@dericed.com (mailto:dave@dericed.com)

  Change controller: IETF cellar working group delegated from the IESG.

 8. IANA Considerations

  The IANA is requested to register the following values:

  * Media type registration as described in Section 7.

-9. Appendixes
-
-9.1. Decoder implementation suggestions
-
-9.1.1. Multi-threading Support and Independence of Slices
+9. Appendix A: Multi-theaded decoder implementation suggestions

  The FFV1 bitstream is parsable in two ways: in sequential order as
  described in this document or with the pre-analysis of the footer of
- each slice. Each slice footer contains a slice_size field so the
+ each slice. Each slice footer contains a "slice_size" field so the
  boundary of each slice is computable without having to parse the
  slice content.
  That allows multi-threading as well as independence of slice content
  (a bitstream error in a slice header or slice content has no impact
  on the decoding of the other slices).

- After having checked keyframe field, a decoder SHOULD parse
- slice_size fields, from slice_size of the last slice at the end of
- the "Frame" up to slice_size of the first slice at the beginning of
- the "Frame", before parsing slices, in order to have slices
+ After having checked "keyframe" field, a decoder SHOULD parse
+ "slice_size" fields, from "slice_size" of the last slice at the end
+ of the "Frame" up to "slice_size" of the first slice at the beginning
+ of the "Frame", before parsing slices, in order to have slices
  boundaries. A decoder MAY fallback on sequential order e.g. in case
- of a corrupted "Frame" (frame size unknown, slice_size of slices not
- coherent...) or if there is no possibility of seeking into the
+ of a corrupted "Frame" (frame size unknown, "slice_size" of slices
+ not coherent...) or if there is no possibility of seeking into the
  stream.

 10. Changelog

  See https://github.com/FFmpeg/FFV1/commits/master
  (https://github.com/FFmpeg/FFV1/commits/master)

+ [RFC Editor: Please remove this Changelog section prior to
+ publication.]
+
 11. Normative References

- [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
-           2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
-           May 2017, .
+ [ISO.9899.2018]
+           International Organization for Standardization,
+           "Programming languages - C", 2018.

- [Matroska] IETF, "Matroska", 2019, .
+ [RFC4732] Handley, M., Ed., Rescorla, E., Ed., and IAB, "Internet
+           Denial-of-Service Considerations", RFC 4732,
+           DOI 10.17487/RFC4732, December 2006,
+           .

  [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
            Requirement Levels", BCP 14, RFC 2119,
            DOI 10.17487/RFC2119, March 1997, .
- [RFC4732] Handley, M., Ed., Rescorla, E., Ed., and IAB, "Internet
-           Denial-of-Service Considerations", RFC 4732,
-           DOI 10.17487/RFC4732, December 2006,
-           .
-
  [RFC6838] Freed, N., Klensin, J., and T. Hansen, "Media Type
            Specifications and Registration Procedures", BCP 13,
            RFC 6838, DOI 10.17487/RFC6838, January 2013, .

- [ISO.9899.2018]
-           International Organization for Standardization,
-           "Programming languages - C", 2018.
-
- [ISO.15444-1.2016]
-           International Organization for Standardization,
-           "Information technology -- JPEG 2000 image coding system:
-           Core coding system", October 2016.
-
  [ISO.9899.1990]
            International Organization for Standardization,
            "Programming languages - C", 1990.

+ [Matroska] IETF, "Matroska", 2019, .
+
  [RFC4855] Casner, S., "Media Type Registration of RTP Payload
            Formats", RFC 4855, DOI 10.17487/RFC4855, February 2007, .

- [I-D.ietf-cellar-ffv1]
-           Niedermayer, M., Rice, D., and J. Martinez, "FFV1 Video
-           Coding Format Version 0, 1, and 3", Work in Progress,
-           Internet-Draft, draft-ietf-cellar-ffv1-11, 23 October
-           2019,
-           .
+ [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
+           2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
+           May 2017, .

  [RFC6716] Valin, JM., Vos, K., and T. Terriberry, "Definition of the
            Opus Audio Codec", RFC 6716, DOI 10.17487/RFC6716,
            September 2012, .

-12. Informative References
-
- [ISO.14495-1.1999]
+ [ISO.15444-1.2016]
            International Organization for Standardization,
-           "Information technology -- Lossless and near-lossless
-           compression of continuous-tone still images: Baseline",
-           December 1999.
+           "Information technology -- JPEG 2000 image coding system:
+           Core coding system", October 2016.

- [AVI]     Microsoft, "AVI RIFF File Reference", undated,
-           .
+12. Informative References
+
+ [HuffYUV] Rudiak-Gould, B., "HuffYUV", December 2003,
+           .
  [ISO.14496-12.2015]
            International Organization for Standardization,
            "Information technology -- Coding of audio-visual objects
            -- Part 12: ISO base media file format", December 2015.

- [REFIMPL] Niedermayer, M., "The reference FFV1 implementation / the
-           FFV1 codec in FFmpeg", undated, .
-
- [YCbCr]   Wikipedia, "YCbCr", undated,
-           .
-
- [HuffYUV] Rudiak-Gould, B., "HuffYUV", December 2003,
-           .
-
  [VALGRIND] Valgrind Developers, "Valgrind website", undated, .

+ [AVI]     Microsoft, "AVI RIFF File Reference", undated,
+           .
+
+ [REFIMPL] Niedermayer, M., "The reference FFV1 implementation / the
+           FFV1 codec in FFmpeg", undated, .
+
  [Address-Sanitizer]
            The Clang Team, "ASAN AddressSanitizer website", undated, .

- [NUT]     Niedermayer, M., "NUT Open Container Format", December
-           2013, .
+ [YCbCr]   Wikipedia, "YCbCr", undated,
+           .
+
+ [ISO.14495-1.1999]
+           International Organization for Standardization,
+           "Information technology -- Lossless and near-lossless
+           compression of continuous-tone still images: Baseline",
+           December 1999.

  [range-coding]
            Nigel, G. and N. Martin, "Range encoding: an algorithm for
            removing redundancy from a digitised message.", July 1979.

  [ISO.14496-10.2014]
            International Organization for Standardization,
            "Information technology -- Coding of audio-visual objects
            -- Part 10: Advanced Video Coding", September 2014.

+ [I-D.ietf-cellar-ffv1]
+           Niedermayer, M., Rice, D., and J. Martinez, "FFV1 Video
+           Coding Format Version 0, 1, and 3", Work in Progress,
+           Internet-Draft, draft-ietf-cellar-ffv1-12, 28 January
+           2020,
+           .
+
+ [NUT]     Niedermayer, M., "NUT Open Container Format", December
+           2013, .
+
 Authors' Addresses

  Michael Niedermayer

  Email: michael@niedermayer.cc

  Dave Rice

  Email: dave@dericed.com