--- 1/draft-ietf-cellar-ffv1-v4-10.txt 2020-05-26 12:13:26.031087186 -0700 +++ 2/draft-ietf-cellar-ffv1-v4-11.txt 2020-05-26 12:13:26.271093297 -0700 @@ -1,20 +1,20 @@ cellar M. Niedermayer Internet-Draft Intended status: Standards Track D. Rice -Expires: 30 October 2020 +Expires: 27 November 2020 J. Martinez - 28 April 2020 + 26 May 2020 FFV1 Video Coding Format Version 4 - draft-ietf-cellar-ffv1-v4-10 + draft-ietf-cellar-ffv1-v4-11 Abstract This document defines FFV1, a lossless intra-frame video encoding format. FFV1 is designed to efficiently compress video data in a variety of pixel formats. Compared to uncompressed video, FFV1 offers storage compression, frame fixity, and self-description, which makes FFV1 useful as a preservation or intermediate video format. Status of This Memo @@ -25,21 +25,21 @@ Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at https://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." - This Internet-Draft will expire on 30 October 2020. + This Internet-Draft will expire on 27 November 2020. Copyright Notice Copyright (c) 2020 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/ license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights @@ -48,25 +48,25 @@ as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. 
Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 4 2. Notation and Conventions . . . . . . . . . . . . . . . . . . 4 2.1. Definitions . . . . . . . . . . . . . . . . . . . . . . . 4 2.2. Conventions . . . . . . . . . . . . . . . . . . . . . . . 5 2.2.1. Pseudo-code . . . . . . . . . . . . . . . . . . . . . 5 - 2.2.2. Arithmetic Operators . . . . . . . . . . . . . . . . 5 + 2.2.2. Arithmetic Operators . . . . . . . . . . . . . . . . 6 2.2.3. Assignment Operators . . . . . . . . . . . . . . . . 6 2.2.4. Comparison Operators . . . . . . . . . . . . . . . . 6 2.2.5. Mathematical Functions . . . . . . . . . . . . . . . 7 - 2.2.6. Order of Operation Precedence . . . . . . . . . . . . 7 + 2.2.6. Order of Operation Precedence . . . . . . . . . . . . 8 2.2.7. Range . . . . . . . . . . . . . . . . . . . . . . . . 8 2.2.8. NumBytes . . . . . . . . . . . . . . . . . . . . . . 8 2.2.9. Bitstream Functions . . . . . . . . . . . . . . . . . 8 3. Sample Coding . . . . . . . . . . . . . . . . . . . . . . . . 9 3.1. Border . . . . . . . . . . . . . . . . . . . . . . . . . 9 3.2. Samples . . . . . . . . . . . . . . . . . . . . . . . . . 10 3.3. Median Predictor . . . . . . . . . . . . . . . . . . . . 10 3.4. Context . . . . . . . . . . . . . . . . . . . . . . . . . 11 3.5. Quantization Table Sets . . . . . . . . . . . . . . . . . 12 3.6. Quantization Table Set Indexes . . . . . . . . . . . . . 12 @@ -80,92 +80,100 @@ 4.1. Parameters . . . . . . . . . . . . . . . . . . . . . . . 26 4.1.1. version . . . . . . . . . . . . . . . . . . . . . . . 28 4.1.2. micro_version . . . . . . . . . . . . . . . . . . . . 28 4.1.3. coder_type . . . . . . . . . . . . . . . . . . . . . 29 4.1.4. state_transition_delta . . . . . . . . . . . . . . . 30 4.1.5. colorspace_type . . . . . . . . . . . . . . . . . . . 30 4.1.6. chroma_planes . . . . . . . . . . . . . . . . . . . . 31 4.1.7. bits_per_raw_sample . . . . . . . . . . . . . . . . . 31 4.1.8. log2_h_chroma_subsample . 
. . . . . . . . . . . . . . 32 4.1.9. log2_v_chroma_subsample . . . . . . . . . . . . . . . 32 - 4.1.10. "extra\_plane" . . . . . . . . . . . . . . . . . . . 32 + 4.1.10. extra_plane . . . . . . . . . . . . . . . . . . . . . 32 4.1.11. num_h_slices . . . . . . . . . . . . . . . . . . . . 32 4.1.12. num_v_slices . . . . . . . . . . . . . . . . . . . . 33 4.1.13. quant_table_set_count . . . . . . . . . . . . . . . . 33 4.1.14. states_coded . . . . . . . . . . . . . . . . . . . . 33 4.1.15. initial_state_delta . . . . . . . . . . . . . . . . . 33 4.1.16. ec . . . . . . . . . . . . . . . . . . . . . . . . . 34 4.1.17. intra . . . . . . . . . . . . . . . . . . . . . . . . 34 4.2. Configuration Record . . . . . . . . . . . . . . . . . . 34 4.2.1. reserved_for_future_use . . . . . . . . . . . . . . . 35 4.2.2. configuration_record_crc_parity . . . . . . . . . . . 35 4.2.3. Mapping FFV1 into Containers . . . . . . . . . . . . 35 4.3. Frame . . . . . . . . . . . . . . . . . . . . . . . . . . 36 4.4. Slice . . . . . . . . . . . . . . . . . . . . . . . . . . 38 4.5. Slice Header . . . . . . . . . . . . . . . . . . . . . . 39 4.5.1. slice_x . . . . . . . . . . . . . . . . . . . . . . . 39 4.5.2. slice_y . . . . . . . . . . . . . . . . . . . . . . . 39 - 4.5.3. slice_width . . . . . . . . . . . . . . . . . . . . . 40 + 4.5.3. slice_width . . . . . . . . . . . . . . . . . . . . . 39 4.5.4. slice_height . . . . . . . . . . . . . . . . . . . . 40 4.5.5. quant_table_set_index_count . . . . . . . . . . . . . 40 4.5.6. quant_table_set_index . . . . . . . . . . . . . . . . 40 4.5.7. picture_structure . . . . . . . . . . . . . . . . . . 40 4.5.8. sar_num . . . . . . . . . . . . . . . . . . . . . . . 41 4.5.9. sar_den . . . . . . . . . . . . . . . . . . . . . . . 41 4.5.10. reset_contexts . . . . . . . . . . . . . . . . . . . 41 - 4.5.11. slice_coding_mode . . . . . . . . . . . . . . . . . . 42 + 4.5.11. slice_coding_mode . . . . . . . . . . . . . . . . . . 41 4.6. Slice Content . . . . 
. . . . . . . . . . . . . . . . . . 42 4.6.1. primary_color_count . . . . . . . . . . . . . . . . . 42 - 4.6.2. plane_pixel_height . . . . . . . . . . . . . . . . . 43 + 4.6.2. plane_pixel_height . . . . . . . . . . . . . . . . . 42 4.6.3. slice_pixel_height . . . . . . . . . . . . . . . . . 43 4.6.4. slice_pixel_y . . . . . . . . . . . . . . . . . . . . 43 4.7. Line . . . . . . . . . . . . . . . . . . . . . . . . . . 43 - 4.7.1. plane_pixel_width . . . . . . . . . . . . . . . . . . 44 + 4.7.1. plane_pixel_width . . . . . . . . . . . . . . . . . . 43 4.7.2. slice_pixel_width . . . . . . . . . . . . . . . . . . 44 4.7.3. slice_pixel_x . . . . . . . . . . . . . . . . . . . . 44 4.7.4. sample_difference . . . . . . . . . . . . . . . . . . 44 4.8. Slice Footer . . . . . . . . . . . . . . . . . . . . . . 44 - 4.8.1. slice_size . . . . . . . . . . . . . . . . . . . . . 45 + 4.8.1. slice_size . . . . . . . . . . . . . . . . . . . . . 44 4.8.2. error_status . . . . . . . . . . . . . . . . . . . . 45 4.8.3. slice_crc_parity . . . . . . . . . . . . . . . . . . 45 - 4.9. Quantization Table Set . . . . . . . . . . . . . . . . . 46 - 4.9.1. quant_tables . . . . . . . . . . . . . . . . . . . . 47 - 4.9.2. context_count . . . . . . . . . . . . . . . . . . . . 47 + 4.9. Quantization Table Set . . . . . . . . . . . . . . . . . 45 + 4.9.1. quant_tables . . . . . . . . . . . . . . . . . . . . 46 + 4.9.2. context_count . . . . . . . . . . . . . . . . . . . . 46 5. Restrictions . . . . . . . . . . . . . . . . . . . . . . . . 47 - 6. Security Considerations . . . . . . . . . . . . . . . . . . . 48 - 7. Media Type Definition . . . . . . . . . . . . . . . . . . . . 49 - 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 50 - 9. Appendix A: Multi-theaded decoder implementation - suggestions . . . . . . . . . . . . . . . . . . . . . . . 50 - 10. Changelog . . . . . . . . . . . . . . . . . . . . . . . . . . 51 - 11. Normative References . . . . . . . . . . . . . . . . . . . . 
51 - 12. Informative References . . . . . . . . . . . . . . . . . . . 52 - Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 53 + 6. Security Considerations . . . . . . . . . . . . . . . . . . . 47 + 7. Media Type Definition . . . . . . . . . . . . . . . . . . . . 48 + 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 49 + 9. Changelog . . . . . . . . . . . . . . . . . . . . . . . . . . 50 + 10. Normative References . . . . . . . . . . . . . . . . . . . . 50 + 11. Informative References . . . . . . . . . . . . . . . . . . . 51 + Appendix A. Multi-theaded decoder implementation suggestions . . 52 + Appendix B. Future handling of some streams created by non + conforming encoders . . . . . . . . . . . . . . . . . . . 52 + Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 52 1. Introduction This document describes FFV1, a lossless video encoding format. The design of FFV1 considers the storage of image characteristics, data fixity, and the optimized use of encoding time and storage requirements. FFV1 is designed to support a wide range of lossless video applications such as long-term audiovisual preservation, scientific imaging, screen recording, and other video encoding scenarios that seek to avoid the generational loss of lossy video encodings. This document defines a version 4 of FFV1. Prior versions of FFV1 are defined within [I-D.ietf-cellar-ffv1]. This document assumes familiarity with mathematical and coding concepts such as Range coding [range-coding] and YCbCr color spaces [YCbCr]. + This specification describes the valid bitstream and how to decode + such valid bitstream. Bitstreams not conforming to this + specification or how they are handled is outside this specification. + A decoder could reject every invalid bitstream or attempt to perform + error concealment or re-download or use a redundant copy of the + invalid part or any other action it deems appropriate. + 2. 
Notation and Conventions The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here. 2.1. Definitions @@ -175,33 +183,29 @@ "Sample": The smallest addressable representation of a color component or a luma component in a "Frame". Examples of "Sample" are Luma, Blue Chrominance, Red Chrominance, Transparency, Red, Green, and Blue. "Plane": A discrete component of a static image comprised of "Samples" that represent a specific quantification of "Samples" of that image. "Pixel": The smallest addressable representation of a color in a - "Frame". It is composed of 1 or more "Samples". + "Frame". It is composed of one or more "Samples". "ESC": An ESCape symbol to indicate that the symbol to be stored is too large for normal storage and that an alternate storage method is used. "MSB": Most Significant Bit, the bit that can cause the largest change in magnitude of the symbol. - "RCT": Reversible Color Transform, a near linear, exactly reversible - integer transform that converts between RGB and YCbCr representations - of a "Pixel". - "VLC": Variable Length Code, a code that maps source symbols to a variable number of bits. "RGB": A reference to the method of storing the value of a "Pixel" by using three numeric values that represent Red, Green, and Blue. "YCbCr": A reference to the method of storing the value of a "Pixel" by using three numeric values that represent the luma of the "Pixel" (Y) and the chrominance of the "Pixel" (Cb and Cr). YCbCr word is used for historical reasons and currently references any color space @@ -219,43 +223,43 @@ The FFV1 bitstream is described in this document using pseudo-code. 
Note that the pseudo-code is used for clarity in order to illustrate the structure of FFV1 and not intended to specify any particular implementation. The pseudo-code used is based upon the C programming language [ISO.9899.1990] and uses its "if/else", "while" and "for" keywords as well as functions defined within this document. 2.2.2. Arithmetic Operators Note: the operators and the order of precedence are the same as used - in the C programming language [ISO.9899.2018]. With the exception of + in the C programming language [ISO.9899.2018], with the exception of ">>" (removal of implementation defined behavior) and "^" (power instead of XOR) operators which are re-defined within this section. "a + b" means a plus b. "a - b" means a minus b. "-a" means negation of a. "a * b" means a multiplied by b. "a / b" means a divided by b. "a ^ b" means a raised to the b-th power. "a & b" means bit-wise "and" of a and b. "a | b" means bit-wise "or" of a and b. "a >> b" means arithmetic right shift of two's complement integer - representation of a by b binary digits. This is equivalent to, b - times dividing a by 2 with rounding toward negative infinity. + representation of a by b binary digits. This is equivalent to + dividing a by 2, b times, with rounding toward negative infinity. "a << b" means arithmetic left shift of two's complement integer representation of a by b binary digits. 2.2.3. Assignment Operators "a = b" means a is assigned b. "a++" is equivalent to a is assigned a + 1. @@ -284,41 +288,42 @@ "a && b" means Boolean logical "and" of a and b. "a || b" means Boolean logical "or" of a and b. "!a" means Boolean logical "not" of a. "a ? b : c" if a is true, then b, otherwise c. 2.2.5. Mathematical Functions - floor(a) the largest integer less than or equal to a + "floor(a)" means the largest integer less than or equal to a. - ceil(a) the smallest integer greater than or equal to a + "ceil(a)" means the smallest integer greater than or equal to a. 
- sign(a) extracts the sign of a number, i.e. if a < 0 then -1, else if - a > 0 then 1, else 0 + "sign(a)" extracts the sign of a number, i.e. if a < 0 then -1, else + if a > 0 then 1, else 0. - abs(a) the absolute value of a, i.e. abs(a) = sign(a)*a + "abs(a)" means the absolute value of a, i.e. "abs(a)" = "sign(a) * + a". - log2(a) the base-two logarithm of a + "log2(a)" means the base-two logarithm of a. - min(a,b) the smallest of two values a and b + "min(a,b)" means the smallest of two values a and b. - max(a,b) the largest of two values a and b + "max(a,b)" means the largest of two values a and b. - median(a,b,c) the numerical middle value in a data set of a, b, and - c, i.e. a+b+c-min(a,b,c)-max(a,b,c) + "median(a,b,c)" means the numerical middle value in a data set of a, + b, and c, i.e. a+b+c-min(a,b,c)-max(a,b,c). - A <== B B implies A + "A <== B" means B implies A. - A <==> B A <== B , B <== A + "A <==> B" means A <== B , B <== A. 2.2.6. Order of Operation Precedence When order of precedence is not indicated explicitly by use of parentheses, operations are evaluated in the following order (from top to bottom, operations of same precedence being evaluated from left to right). This order of operations is based on the order of operations used in Standard C. a++, a-- @@ -442,42 +447,40 @@ The labels for these relative "Samples" are made of the first letters of the words Top, Left and Right. 3.3. Median Predictor The prediction for any "Sample" value at position "X" may be computed based upon the relative neighboring values of "l", "t", and "tl" via this equation: - "median(l, t, l + t - tl)". - + median(l, t, l + t - tl) Note, this prediction template is also used in [ISO.14495-1.1999] and [HuffYUV]. 
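The median prediction quoted above can be sketched in C as follows. This is only an illustrative sketch, not part of either draft revision: the helper names min2, max2, median3 and predict_median are hypothetical, and plain int arithmetic is assumed to be wide enough for the "Sample" values involved. It covers only the general formula median(l, t, l + t - tl), using the median(a,b,c) definition from the Mathematical Functions list, and does not model the 16-bit exception.

```c
/* Smallest and largest of two values, mirroring min(a,b) and max(a,b). */
static int min2(int a, int b) { return a < b ? a : b; }
static int max2(int a, int b) { return a > b ? a : b; }

/* median(a,b,c) = a + b + c - min(a,b,c) - max(a,b,c),
 * as given in the Mathematical Functions list. */
static int median3(int a, int b, int c)
{
    int lo = min2(a, min2(b, c));
    int hi = max2(a, max2(b, c));
    return a + b + c - lo - hi;
}

/* Hypothetical helper (name is an assumption): prediction for the
 * Sample at position X from its left (l), top (t) and top-left (tl)
 * neighbours, i.e. median(l, t, l + t - tl). */
static int predict_median(int l, int t, int tl)
{
    return median3(l, t, l + t - tl);
}
```

For example, with l = 10, t = 20 and tl = 15, the gradient term l + t - tl is 15, so predict_median(10, 20, 15) evaluates to 15. When tl is larger than both neighbours, the gradient term falls below them and the prediction becomes the smaller of l and t.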
Exception for the median predictor: if "colorspace_type == 0 && bits_per_raw_sample == 16 && ( coder_type == 1 || coder_type == 2 )", the following median predictor MUST be used: - "median(left16s, top16s, left16s + top16s - diag16s)" + median(left16s, top16s, left16s + top16s - diag16s) where: - left16s = l >= 32768 ? ( l - 65536 ) : l - top16s = t >= 32768 ? ( t - 65536 ) : t - diag16s = tl >= 32768 ? ( tl - 65536 ) : tl + left16s = l >= 32768 ? ( l - 65536 ) : l top16s = t >= 32768 ? ( t - + 65536 ) : t diag16s = tl >= 32768 ? ( tl - 65536 ) : tl Background: a two's complement signed 16-bit signed integer was used for storing "Sample" values in all known implementations of FFV1 bitstream. So in some circumstances, the most significant bit was wrongly interpreted (used as a sign bit instead of the 16th bit of an - unsigned integer). Note that when the issue is discovered, the only + unsigned integer). Note that when the issue was discovered, the only configuration of all known implementations being impacted is 16-bit YCbCr with no Pixel transformation with Range Coder coder, as other potentially impacted configurations (e.g. 15/16-bit JPEG2000-RCT with Range Coder coder, or 16-bit content with Golomb Rice coder) were implemented nowhere [ISO.15444-1.2016]. In the meanwhile, 16-bit JPEG2000-RCT with Range Coder coder was implemented without this issue in one implementation and validated by one conformance checker. It is expected (to be confirmed) to remove this exception for the median predictor in the next version of the FFV1 bitstream. @@ -494,28 +497,28 @@ Figure 3 If "context >= 0" then "context" is used and the difference between the "Sample" and its predicted value is encoded as is, else "-context" is used and the difference between the "Sample" and its predicted value is encoded with a flipped sign. 3.5. Quantization Table Sets - The FFV1 bitstream contains 1 or more Quantization Table Sets. 
Each - Quantization Table Set contains exactly 5 Quantization Tables with - each Quantization Table corresponding to 1 of the 5 Quantized Sample - Differences. For each Quantization Table, both the number of - quantization steps and their distribution are stored in the FFV1 - bitstream; each Quantization Table has exactly 256 entries, and the 8 - least significant bits of the Quantized Sample Difference are used as - index: + The FFV1 bitstream contains one or more Quantization Table Sets. + Each Quantization Table Set contains exactly 5 Quantization Tables + with each Quantization Table corresponding to one of the five + Quantized Sample Differences. For each Quantization Table, both the + number of quantization steps and their distribution are stored in the + FFV1 bitstream; each Quantization Table has exactly 256 entries, and + the 8 least significant bits of the Quantized Sample Difference are + used as index: Q_{j}[k] = quant_tables[i][j][k&255] Figure 4 In this formula, "i" is the Quantization Table Set index, "j" is the Quantized Table index, "k" the Quantized Sample Difference. 3.6. Quantization Table Set Indexes @@ -524,21 +527,21 @@ * For Y "Plane", "quant_table_set_index[ 0 ]" index is used * For Cb and Cr "Planes", "quant_table_set_index[ 1 ]" index is used * For extra "Plane", "quant_table_set_index[ (version <= 3 || chroma_planes) ? 2 : 1 ]" index is used Background: in first implementations of FFV1 bitstream, the index for Cb and Cr "Planes" was stored even if it is not used (chroma_planes - set to 0), this index is kept for version <= 3 in order to keep + set to 0), this index is kept for "version" <= 3 in order to keep compatibility with FFV1 bitstreams in the wild. 3.7. Color spaces FFV1 supports several color spaces. The count of allowed coded planes and the meaning of the extra "Plane" are determined by the selected color space. The FFV1 bitstream interleaves data in an order determined by the color space. 
In YCbCr for each "Plane", each "Line" is coded from @@ -610,21 +613,21 @@ Background: At the time of this writing, in all known implementations of FFV1 bitstream, when "bits_per_raw_sample" was between 9 and 15 inclusive and "extra_plane" is 0, GBR "Planes" were used as BGR "Planes" during both encoding and decoding. In the meanwhile, 16-bit JPEG2000-RCT was implemented without this issue in one implementation and validated by one conformance checker. Methods to address this exception for the transform are under consideration for the next version of the FFV1 bitstream. - Cb and Cr are positively offseted by "1 << bits_per_raw_sample" after + Cb and Cr are positively offset by "1 << bits_per_raw_sample" after the conversion from RGB to the modified YCbCr and are negatively offseted by the same value before the conversion from the modified YCbCr to RGB, in order to have only non-negative values after the conversion. When FFV1 uses the JPEG2000-RCT, the horizontal "Lines" are interleaved to improve caching efficiency since it is most likely that the JPEG2000-RCT will immediately be converted to RGB during decoding. The interleaved coding order is also Y, then Cb, then Cr, and then if used transparency. @@ -717,55 +720,55 @@ L_{0} = 2^8 * B_{0} + B_{1} Figure 13 j_{0} = 2 Figure 14 3.8.1.1.1. Termination - The range coder can be used in 3 modes. + The range coder can be used in three modes. * In "Open mode" when decoding, every symbol the reader attempts to read is available. In this mode arbitrary data can have been appended without affecting the range coder output. This mode is not used in FFV1. * In "Closed mode" the length in bytes of the bytestream is provided to the range decoder. Bytes beyond the length are read as 0 by - the range decoder. This is generally 1 byte shorter than the open - mode. + the range decoder. This is generally one byte shorter than the + open mode. 
* In "Sentinel mode" the exact length in bytes is not known and thus the range decoder MAY read into the data that follows the range coded bytestream by one byte. In "Sentinel mode", the end of the range coded bytestream is a binary symbol with state 129, which value SHALL be discarded. After reading this symbol, the range decoder will have read one byte beyond the end of the range coded bytestream. This way the byte position of the end can be determined. Bytestreams written in "Sentinel mode" can be read in "Closed mode" if the length can be determined, in this case the last (sentinel) symbol will be read non-corrupted and be of value 0. - Above describes the range decoding, encoding is defined as any + Above describes the range decoding. Encoding is defined as any process which produces a decodable bytestream. - There are 3 places where range coder termination is needed in FFV1. - First is in the "Configuration Record", in this case the size of the - range coded bytestream is known and handled as "Closed mode". Second - is the switch from the "Slice Header" which is range coded to Golomb - coded slices as "Sentinel mode". Third is the end of range coded - Slices which need to terminate before the CRC at their end. This can - be handled as "Sentinel mode" or as "Closed mode" if the CRC position - has been determined. + There are three places where range coder termination is needed in + FFV1. First is in the "Configuration Record", in this case the size + of the range coded bytestream is known and handled as "Closed mode". + Second is the switch from the "Slice Header" which is range coded to + Golomb coded slices as "Sentinel mode". Third is the end of range + coded Slices which need to terminate before the CRC at their end. + This can be handled as "Sentinel mode" or as "Closed mode" if the CRC + position has been determined. 3.8.1.2. 
Range Non Binary Values To encode scalar integers, it would be possible to encode each bit separately and use the past bits as context. However that would mean 255 contexts per 8-bit symbol that is not only a waste of memory but also requires more past data to reach a reasonably good estimate of the probabilities. Alternatively assuming a Laplacian distribution and only dealing with its variance and mean (as in Huffman coding) would also be possible, however, for maximum flexibility and @@ -895,22 +898,22 @@ Figure 18: Alternative state transition table for Range coding. 3.8.2. Golomb Rice Mode The end of the bitstream of the "Frame" is filled with 0-bits until that the bitstream contains a multiple of 8 bits. 3.8.2.1. Signed Golomb Rice Codes - This coding mode uses Golomb Rice codes. The VLC is split into 2 - parts, the prefix stores the most significant bits and the suffix + This coding mode uses Golomb Rice codes. The VLC is split into two + parts. The prefix stores the most significant bits and the suffix stores the k least significant bits or stores the whole number in the ESC case. pseudo-code | type --------------------------------------------------------------|----- int get_ur_golomb(k) { | for (prefix = 0; prefix < 12; prefix++) { | if (get_bits(1)) { | return get_bits(k) + (prefix << k) | } | @@ -941,29 +944,31 @@ +----------------+-------+ | 0000 0000 0001 | 11 | +----------------+-------+ | 0000 0000 0000 | ESC | +----------------+-------+ Table 1 3.8.2.1.2. 
Suffix - +---------+--------------------------------------------------+ - +=========+==================================================+ + +---------+----------------------------------------+ + +=========+========================================+ | non ESC | the k least significant bits MSB first | - +---------+--------------------------------------------------+ - | ESC | the value - 11, in MSB first order, ESC may only | - | | be used if the value cannot be coded as non ESC | - +---------+--------------------------------------------------+ + +---------+----------------------------------------+ + | ESC | the value - 11, in MSB first order | + +---------+----------------------------------------+ + Table 2 + "ESC" MUST NOT be used if the value can be coded as "non ESC". + 3.8.2.1.3. Examples +-----+-------------------------+-------+ | k | bits | value | +=====+=========================+=======+ | 0 | "1" | 0 | +-----+-------------------------+-------+ | 0 | "001" | 2 | +-----+-------------------------+-------+ | 2 | "1 00" | 0 | @@ -978,26 +983,26 @@ Table 3 3.8.2.2. Run Mode Run mode is entered when the context is 0 and left as soon as a non-0 difference is found. The level is identical to the predicted one. The run and the first different level are coded. 3.8.2.2.1. Run Length Coding - The run value is encoded in 2 parts, the prefix part stores the more - significant part of the run as well as adjusting the "run_index" that - determines the number of bits in the less significant part of the - run. The 2nd part of the value stores the less significant part of - the run as it is. The run_index is reset for each "Plane" and slice - to 0. + The run value is encoded in two parts. The prefix part stores the + more significant part of the run as well as adjusting the "run_index" + that determines the number of bits in the less significant part of + the run. The second part of the value stores the less significant + part of the run as it is. 
The run_index is reset for each "Plane" + and slice to 0. pseudo-code | type --------------------------------------------------------------|----- log2_run[41]={ | 0, 0, 0, 0, 1, 1, 1, 1, | 2, 2, 2, 2, 3, 3, 3, 3, | 4, 4, 5, 5, 6, 6, 7, 7, | 8, 9,10,11,12,13,14,15, | 16,17,18,19,20,21,22,23, | 24, | @@ -1015,36 +1020,21 @@ } else { | run_count = 0; | } | if (run_index) { | run_index--; | } | run_mode = 2; | } | } | - The "log2_run" function is also used within [ISO.14495-1.1999]. - -3.8.2.2.2. Level Coding - - Level coding is identical to the normal difference coding with the - exception that the 0 value is removed as it cannot occur: - - diff = get_vlc_symbol(context_state); - if (diff >= 0) { - diff++; - } - - Note, this is different from JPEG-LS, which doesn't use prediction in - run mode and uses a different encoding and context model for the last - difference On a small set of test "Samples" the use of prediction - slightly improved the compression rate. + The "log2_run" array is also used within [ISO.14495-1.1999]. 3.8.2.3. Scalar Mode Each difference is coded with the per context mean prediction removed and a per context value for k. get_vlc_symbol(state) { i = state->count; k = 0; while (i < state->error_sum) { @@ -1076,33 +1066,48 @@ -state->count + 1); } else if (state->drift > 0) { state->bias = min(state->bias + 1, 127); state->drift = min(state->drift - state->count, 0); } return ret; } +3.8.2.3.1. Level Coding + + Level coding is identical to the normal difference coding with the + exception that the 0 value is removed as it cannot occur: + + diff = get_vlc_symbol(context_state); + if (diff >= 0) { + diff++; + } + + Note, this is different from JPEG-LS, which doesn't use prediction in + run mode and uses a different encoding and context model for the last + difference. On a small set of test "Samples" the use of prediction + slightly improved the compression rate. + 3.8.2.4. 
Initial Values for the VLC context state At keyframes all coder state variables are set to their initial state. drift = 0; error_sum = 4; bias = 0; count = 1; 4. Bitstream - An FFV1 bitstream is composed of a series of 1 or more "Frames" and + An FFV1 bitstream is composed of a series of one or more "Frames" and (when required) a "Configuration Record". Within the following sub-sections, pseudo-code is used to explain the structure of each FFV1 bitstream component, as described in Section 2.2.1. Table 4 lists symbols used to annotate that pseudo- code in order to define the storage of the data referenced in that line of pseudo-code. +--------+----------------------------------------------+ | Symbol | Definition | @@ -1114,20 +1119,23 @@ +--------+----------------------------------------------+ | br | Range coded Boolean (1-bit) symbol with the | | | method described in Section 3.8.1.1 | +--------+----------------------------------------------+ | ur | Range coded unsigned scalar symbol coded | | | with the method described in Section 3.8.1.2 | +--------+----------------------------------------------+ | sr | Range coded signed scalar symbol coded with | | | the method described in Section 3.8.1.2 | +--------+----------------------------------------------+ + | sd | Sample difference coded with the method | + | | described in Section 3.8 | + +--------+----------------------------------------------+ Table 4: Definition of pseudo-code symbols for this document. The same context that is initialized to 128 is used for all fields in the header. The following MUST be provided by external means during initialization of the decoder: @@ -1194,81 +1202,81 @@ Figure 19: A pseudo-code description of the bitstream contents. CONTEXT_SIZE is 32. 4.1.1. version "version" specifies the version of the FFV1 bitstream. Each version is incompatible with other versions: decoders SHOULD - reject a file due to an unknown version. + reject FFV1 bitstreams due to an unknown version. 
- Decoders SHOULD reject a file with version <= 1 && + Decoders SHOULD reject FFV1 bitstreams with version <= 1 && ConfigurationRecordIsPresent == 1. - Decoders SHOULD reject a file with version >= 3 && + Decoders SHOULD reject FFV1 bitstreams with version >= 3 && ConfigurationRecordIsPresent == 0. +-------+-------------------------+ | value | version | +=======+=========================+ | 0 | FFV1 version 0 | +-------+-------------------------+ | 1 | FFV1 version 1 | +-------+-------------------------+ | 2 | reserved* | +-------+-------------------------+ | 3 | FFV1 version 3 | +-------+-------------------------+ | 4 | FFV1 version 4 | +-------+-------------------------+ | Other | reserved for future use | +-------+-------------------------+ Table 5 - * Version 2 was never enabled in the encoder thus version 2 files - SHOULD NOT exist, and this document does not describe them to keep - the text simpler. + * Version 2 was experimental and this document does not describe it. 4.1.2. micro_version "micro_version" specifies the micro-version of the FFV1 bitstream. After a version is considered stable (a micro-version value is assigned to be the first stable variant of a specific version), each new micro-version after this first stable variant is compatible with - the previous micro-version: decoders SHOULD NOT reject a file due to - an unknown micro-version equal or above the micro-version considered - as stable. + the previous micro-version: decoders SHOULD NOT reject FFV1 + bitstreams due to an unknown micro-version equal or above the micro- + version considered as stable. 
- Meaning of "micro_version" for version 3: + Meaning of "micro_version" for "version" 3: +-------+-------------------------+ | value | micro_version | +=======+=========================+ | 0...3 | reserved* | +-------+-------------------------+ | 4 | first stable variant | +-------+-------------------------+ | Other | reserved for future use | +-------+-------------------------+ Table 6: The definitions for - "micro_version" values. + "micro_version" values for FFV1 + version 3. * development versions may be incompatible with the stable variants. - Meaning of "micro_version" for version 4 (note: at the time of + Meaning of "micro_version" for "version" 4 (note: at the time of writing of this specification, version 4 is not considered stable so - the first stable version value is to be announced in the future): + the first stable "micro_version" value is to be announced in the + future): +---------+-------------------------+ | value | micro_version | +=========+=========================+ | 0...TBA | reserved* | +---------+-------------------------+ | TBA | first stable variant | +---------+-------------------------+ | Other | reserved for future use | +---------+-------------------------+ @@ -1296,21 +1304,21 @@ | Other | reserved for future use | +-------+-------------------------------------------------+ Table 8 Restrictions: If "coder_type" is 0, then "bits_per_raw_sample" SHOULD NOT be > 8. Background: At the time of this writing, there is no known - implementations of FFV1 bitstream supporting Golomb Rice algorithm + implementation of FFV1 bitstream supporting Golomb Rice algorithm with "bits_per_raw_sample" greater than 8, and Range Coder is preferred. 4.1.4. state_transition_delta "state_transition_delta" specifies the Range coder custom state transition table.
If "state_transition_delta" is not present in the FFV1 bitstream, all Range coder custom state transition table elements are assumed to be @@ -1334,25 +1342,23 @@ | | | | | then | | | | | | "Plane" | +-------+-------------+----------------+--------------+-------------+ | Other | reserved | reserved for | reserved for | reserved | | | for future | future use | future use | for future | | | use | | | use | +-------+-------------+----------------+--------------+-------------+ Table 9 - Restrictions: - - If "colorspace_type" is 1, then "chroma_planes" MUST be 1, - "log2_h_chroma_subsample" MUST be 0, and "log2_v_chroma_subsample" - MUST be 0. + FFV1 bitstreams with "colorspace_type" == 1 && ("chroma_planes" != + 1 || "log2_h_chroma_subsample" != 0 || "log2_v_chroma_subsample" != + 0) are not part of this specification. 4.1.6. chroma_planes "chroma_planes" indicates if chroma (color) "Planes" are present. +-------+---------------------------------+ | value | presence | +=======+=================================+ | 0 | chroma "Planes" are not present | +-------+---------------------------------+ @@ -1369,36 +1375,37 @@ +-------+-----------------------------------+ | value | bits for each sample | +=======+===================================+ | 0 | reserved* | +-------+-----------------------------------+ | Other | the actual bits for each "Sample" | +-------+-----------------------------------+ Table 11 - * Encoders MUST NOT store "bits_per_raw_sample" = 0 Decoders SHOULD + * Encoders MUST NOT store "bits_per_raw_sample" = 0. Decoders SHOULD accept and interpret "bits_per_raw_sample" = 0 as 8. 4.1.8. log2_h_chroma_subsample "log2_h_chroma_subsample" indicates the subsample factor, stored in powers to which the number 2 must be raised, between luma and chroma - width ("chroma_width = 2^-log2_h_chroma_subsample^ * luma_width"). + width ("chroma_width = 2 ^ -log2_h_chroma_subsample * luma_width"). 4.1.9. 
log2_v_chroma_subsample "log2_v_chroma_subsample" indicates the subsample factor, stored in powers to which the number 2 must be raised, between luma and chroma - height ("chroma_height=2^-log2_v_chroma_subsample^ * luma_height"). + height ("chroma_height = 2 ^ -log2_v_chroma_subsample * + luma_height"). -4.1.10. "extra\_plane" +4.1.10. extra_plane "extra_plane" indicates if an extra "Plane" is present. +-------+------------------------------+ | value | presence | +=======+==============================+ | 0 | extra "Plane" is not present | +-------+------------------------------+ | 1 | extra "Plane" is present | +-------+------------------------------+ @@ -1444,21 +1451,21 @@ | 1 | initial states are present | +-------+--------------------------------+ Table 13 4.1.15. initial_state_delta "initial_state_delta[ i ][ j ][ k ]" indicates the initial Range coder state, it is encoded using "k" as context index and - pred = j ? initial_states[ i ][j - 1][ k ] + pred = j ? initial_states[ i ][j - 1][ k ] : 128 Figure 20 initial_state[ i ][ j ][ k ] = ( pred + initial_state_delta[ i ][ j ][ k ] ) & 255 Figure 21 4.1.16. ec @@ -1521,26 +1528,25 @@ Encoders conforming to this version of this specification SHALL NOT write this value. Decoders conforming to this version of this specification SHALL ignore its value. 4.2.2. configuration_record_crc_parity "configuration_record_crc_parity" 32 bits that are chosen so that the - "Configuration Record" as a whole has a crc remainder of 0. + "Configuration Record" as a whole has a CRC remainder of 0. - This is equivalent to storing the crc remainder in the 32-bit parity. + This is equivalent to storing the CRC remainder in the 32-bit parity. - The CRC generator polynomial used is the standard IEEE CRC polynomial - (0x104C11DB7) with initial value 0. + The CRC generator polynomial used is described in Section 4.8.3. 4.2.3. 
Mapping FFV1 into Containers This "Configuration Record" can be placed in any file format supporting "Configuration Records", fitting as much as possible with how the file format uses to store "Configuration Records". The "Configuration Record" storage place and "NumBytes" are currently defined and supported by this version of this specification for the following formats: @@ -1678,29 +1684,20 @@ byte alignment. MUST be 0. "reserved" specifies a bit without any significance in this revision of the specification and may have a significance in a later revision of this specification. Encoders SHOULD NOT fill these bits. Decoders SHOULD ignore these bits. - Note in case these bits are used in a later revision of this - specification: any revision of this specification SHOULD care about - avoiding to add 40 bits of content after "SliceContent" for "version" - 0 and 1 of the bitstream. Background: Due to some non-conforming - encoders, some bitstreams were found with 40 extra bits corresponding - to "error_status" and "slice_crc_parity". As a result, a decoder - conforming to the revised specification could not distinguish between - a revised bitstream and a buggy bitstream. - 4.5. Slice Header A "Slice Header" provides information about the decoding configuration of the "Slice", such as its spatial position, size, and aspect ratio. The pseudo-code below describes the contents of the "Slice Header". pseudo-code | type --------------------------------------------------------------|----- SliceHeader( ) { | @@ -1743,22 +1740,24 @@ 4.5.4. slice_height "slice_height" indicates the height on the slice raster formed by num_v_slices. Inferred to be 1 if not present. 4.5.5. quant_table_set_index_count - "quant_table_set_index_count" is defined as "1 + ( ( chroma_planes || - version <= 3 ) ? 1 : 0 ) + ("extra_plane"? 1 : 0 )". + "quant_table_set_index_count" is defined as: + + 1 + ( ( chroma_planes || version <= 3 ) ? 1 : 0 ) + ( extra_plane ? 1 + : 0 ) 4.5.6. 
quant_table_set_index "quant_table_set_index" indicates the Quantization Table Set index to select the Quantization Table Set and the initial states for the slice. Inferred to be 0 if not present. 4.5.7. picture_structure @@ -1854,92 +1853,89 @@ for (y = 0; y < slice_pixel_height; y++) { | for (p = 0; p < primary_color_count; p++) { | Line( p, y ) | } | } | } | } | 4.6.1. primary_color_count - "primary_color_count" is defined as "1 + ( chroma_planes ? 2 : 0 ) + - ("extra_plane"? 1 : 0 )". + "primary_color_count" is defined as: -4.6.2. plane_pixel_height + 1 + ( chroma_planes ? 2 : 0 ) + ( extra_plane ? 1 : 0 ) - "plane_pixel_height[ p ]" is the height in pixels of plane p of the - slice. +4.6.2. plane_pixel_height - "plane_pixel_height[ 0 ]" and "plane_pixel_height[ 1 + ( - chroma_planes ? 2 : 0 ) ]" value is "slice_pixel_height". + "plane_pixel_height[ p ]" is the height in "Pixels" of "Plane" p of + the "Slice". It is defined as: - If "chroma_planes" is set to 1, "plane_pixel_height[ 1 ]" and - "plane_pixel_height[ 2 ]" value is "ceil( slice_pixel_height / (1 << - log2_v_chroma_subsample) )". + (chroma_planes == 1 && (p == 1 || p == 2)) ? ceil(slice_pixel_height + / (1 << log2_v_chroma_subsample)) : slice_pixel_height 4.6.3. slice_pixel_height - "slice_pixel_height" is the height in pixels of the slice. + "slice_pixel_height" is the height in pixels of the slice. It is + defined as: - Its value is "floor( ( slice_y + slice_height ) * slice_pixel_height - / num_v_slices ) - slice_pixel_y". + floor( ( slice_y + slice_height ) * frame_pixel_height / num_v_slices + ) - slice_pixel_y. 4.6.4. slice_pixel_y - "slice_pixel_y" is the slice vertical position in pixels. + "slice_pixel_y" is the slice vertical position in pixels. It is + defined as: - Its value is "floor( slice_y * frame_pixel_height / num_v_slices )". + floor( slice_y * frame_pixel_height / num_v_slices ) 4.7.
Line A "Line" is a list of the sample differences (relative to the predictor) of primary color components. The pseudo-code below describes the contents of the "Line". pseudo-code | type --------------------------------------------------------------|----- Line( p, y ) { | if (colorspace_type == 0) { | for (x = 0; x < plane_pixel_width[ p ]; x++) { | - sample_difference[ p ][ y ][ x ] | + sample_difference[ p ][ y ][ x ] | sd } | } else if (colorspace_type == 1) { | for (x = 0; x < slice_pixel_width; x++) { | - sample_difference[ p ][ y ][ x ] | + sample_difference[ p ][ y ][ x ] | sd } | } | } | 4.7.1. plane_pixel_width "plane_pixel_width[ p ]" is the width in "Pixels" of "Plane" p of the - slice. - - "plane_pixel_width[ 0 ]" and "plane_pixel_width[ 1 + ( chroma_planes - ? 2 : 0 ) ]" value is "slice_pixel_width". + "Slice". It is defined as: - If "chroma_planes" is set to 1, "plane_pixel_width[ 1 ]" and - "plane_pixel_width[ 2 ]" value is "ceil( slice_pixel_width / (1 << - log2_h_chroma_subsample) )". + (chroma_planes == 1 && (p == 1 || p == 2)) ? ceil( slice_pixel_width + / (1 << log2_h_chroma_subsample) ) : slice_pixel_width. 4.7.2. slice_pixel_width - "slice_pixel_width" is the width in "Pixels" of the slice. + "slice_pixel_width" is the width in "Pixels" of the slice. It is + defined as: - Its value is "floor( ( slice_x + slice_width ) * slice_pixel_width / - num_h_slices ) - slice_pixel_x". + floor( ( slice_x + slice_width ) * frame_pixel_width / num_h_slices ) + - slice_pixel_x 4.7.3. slice_pixel_x - "slice_pixel_x" is the slice horizontal position in "Pixels". + "slice_pixel_x" is the slice horizontal position in "Pixels". It is + defined as: - Its value is "floor( slice_x * frame_pixel_width / num_h_slices )". + floor( slice_x * frame_pixel_width / num_h_slices ) 4.7.4. sample_difference "sample_difference[ p ][ y ][ x ]" is the sample difference for "Sample" at "Plane" "p", y position "y", and x position "x".
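The width-related definitions of Sections 4.7.1 through 4.7.3 can be sketched in C as follows. This is a non-normative illustration under stated assumptions: the "SliceGeometry" structure is hypothetical and only groups values named in this specification, and the sketch multiplies by "frame_pixel_width" when computing "slice_pixel_width", mirroring the definition of "slice_pixel_x", because a slice width cannot be computed from itself. The height-related values of Section 4.6 would follow the same pattern with the vertical counterparts.

```c
#include <stdint.h>

/* Hypothetical grouping of the inputs; field names follow the identifiers
 * used in the specification. */
typedef struct {
    uint32_t frame_pixel_width;
    uint32_t num_h_slices;            /* assumed to be non-zero */
    uint32_t slice_x;
    uint32_t slice_width;
    int      chroma_planes;
    uint32_t log2_h_chroma_subsample; /* assumed to be less than 32 */
} SliceGeometry;

/* floor( slice_x * frame_pixel_width / num_h_slices ) */
uint32_t slice_pixel_x(const SliceGeometry *g)
{
    return (uint32_t)((uint64_t)g->slice_x * g->frame_pixel_width
                      / g->num_h_slices);
}

/* Right edge of the slice minus its left edge. Assumption: the
 * multiplicand is frame_pixel_width, as in slice_pixel_x. */
uint32_t slice_pixel_width(const SliceGeometry *g)
{
    uint64_t right = ((uint64_t)g->slice_x + g->slice_width)
                     * g->frame_pixel_width / g->num_h_slices;
    return (uint32_t)right - slice_pixel_x(g);
}

/* Chroma "Planes" (p == 1 or p == 2) are subsampled with a ceiling
 * division; every other "Plane" uses the full slice width. */
uint32_t plane_pixel_width(const SliceGeometry *g, int p)
{
    uint32_t w = slice_pixel_width(g);
    if (g->chroma_planes == 1 && (p == 1 || p == 2)) {
        uint32_t div = UINT32_C(1) << g->log2_h_chroma_subsample;
        return (w + div - 1) / div; /* integer form of ceil( w / div ) */
    }
    return w;
}
```

With a 101-pixel-wide frame split into 4 horizontal slices of "slice_width" 1, the four slices get widths 25, 25, 25 and 26, which sum to the frame width. This illustrates why the right edge is computed before subtracting "slice_pixel_x": every pixel column is then covered by exactly one slice.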
The "Sample" value is computed based on median predictor and context described in Section 3.2. 4.8. Slice Footer @@ -2017,21 +2012,21 @@ --------------------------------------------------------------|----- QuantizationTableSet( i ) { | scale = 1 | for (j = 0; j < MAX_CONTEXT_INPUTS; j++) { | QuantizationTable( i, j, scale ) | scale *= 2 * len_count[ i ][ j ] - 1 | } | context_count[ i ] = ceil( scale / 2 ) | } | - MAX_CONTEXT_INPUTS is 5. + "MAX_CONTEXT_INPUTS" is 5. pseudo-code | type --------------------------------------------------------------|----- QuantizationTable(i, j, scale) { | v = 0 | for (k = 0; k < 128;) { | len - 1 | ur for (a = 0; a < len; a++) { | quant_tables[ i ][ j ][ k ] = scale * v | k++ | @@ -2055,21 +2050,21 @@ 4.9.2. context_count "context_count[ i ]" indicates the count of contexts for Quantization Table Set "i". "context_count[ i ]" MUST be less than or equal to 32768. 5. Restrictions To ensure that fast multithreaded decoding is possible, starting with - "version" 3 and if "frame_pixel_width * frame_pixel_height" is more + version 3 and if "frame_pixel_width * frame_pixel_height" is more than 101376, "slice_width * slice_height" MUST be less or equal to "num_h_slices * num_v_slices / 4". Note: 101376 is the frame size in "Pixels" of a 352x288 frame also known as CIF ("Common Intermediate Format") frame size format. For each "Frame", each position in the slice raster MUST be filled by one and only one slice of the "Frame" (no missing slice position, no slice overlapping). For each "Frame" with "keyframe" value of 0, each slice MUST have the @@ -2132,21 +2127,21 @@ Subtype name: FFV1 Required parameters: None. Optional parameters: This parameter is used to signal the capabilities of a receiver implementation. This parameter MUST NOT be used for any other purpose. - "version": The version of the FFV1 encoding as defined by + "version": The "version" of the FFV1 encoding as defined by Section 4.1.1. 
"micro_version": The "micro_version" of the FFV1 encoding as defined by Section 4.1.2. "coder_type": The "coder_type" of the FFV1 encoding as defined by Section 4.1.3. "colorspace_type": The "colorspace_type" of the FFV1 encoding as defined by Section 4.1.5. @@ -2196,148 +2191,161 @@ Author: Dave Rice dave@dericed.com (mailto:dave@dericed.com) Change controller: IETF cellar working group delegated from the IESG. 8. IANA Considerations The IANA is requested to register the following values: * Media type registration as described in Section 7. -9. Appendix A: Multi-theaded decoder implementation suggestions - - The FFV1 bitstream is parsable in two ways: in sequential order as - described in this document or with the pre-analysis of the footer of - each slice. Each slice footer contains a "slice_size" field so the - boundary of each slice is computable without having to parse the - slice content. That allows multi-threading as well as independence - of slice content (a bitstream error in a slice header or slice - content has no impact on the decoding of the other slices). - - After having checked "keyframe" field, a decoder SHOULD parse - "slice_size" fields, from "slice_size" of the last slice at the end - of the "Frame" up to "slice_size" of the first slice at the beginning - of the "Frame", before parsing slices, in order to have slices - boundaries. A decoder MAY fallback on sequential order e.g. in case - of a corrupted "Frame" (frame size unknown, "slice_size" of slices - not coherent...) or if there is no possibility of seeking into the - stream. - -10. Changelog +9. Changelog See https://github.com/FFmpeg/FFV1/commits/master (https://github.com/FFmpeg/FFV1/commits/master) [RFC Editor: Please remove this Changelog section prior to publication.] -11. Normative References +10. Normative References - [ISO.9899.2018] - International Organization for Standardization, - "Programming languages - C", 2018. + [Matroska] IETF, "Matroska", 2019, . 
[RFC4732] Handley, M., Ed., Rescorla, E., Ed., and IAB, "Internet Denial-of-Service Considerations", RFC 4732, DOI 10.17487/RFC4732, December 2006, . - [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate - Requirement Levels", BCP 14, RFC 2119, - DOI 10.17487/RFC2119, March 1997, - . - - [RFC6838] Freed, N., Klensin, J., and T. Hansen, "Media Type - Specifications and Registration Procedures", BCP 13, - RFC 6838, DOI 10.17487/RFC6838, January 2013, - . + [ISO.9899.2018] + International Organization for Standardization, + "Programming languages - C", 2018. [ISO.9899.1990] International Organization for Standardization, "Programming languages - C", 1990. - [Matroska] IETF, "Matroska", 2019, . - [RFC4855] Casner, S., "Media Type Registration of RTP Payload Formats", RFC 4855, DOI 10.17487/RFC4855, February 2007, . - [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC - 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, - May 2017, . + [ISO.15444-1.2016] + International Organization for Standardization, + "Information technology -- JPEG 2000 image coding system: + Core coding system", October 2016. [RFC6716] Valin, JM., Vos, K., and T. Terriberry, "Definition of the Opus Audio Codec", RFC 6716, DOI 10.17487/RFC6716, September 2012, . - [ISO.15444-1.2016] - International Organization for Standardization, - "Information technology -- JPEG 2000 image coding system: - Core coding system", October 2016. + [RFC6838] Freed, N., Klensin, J., and T. Hansen, "Media Type + Specifications and Registration Procedures", BCP 13, + RFC 6838, DOI 10.17487/RFC6838, January 2013, + . -12. Informative References + [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, + DOI 10.17487/RFC2119, March 1997, + . - [HuffYUV] Rudiak-Gould, B., "HuffYUV", December 2003, - . + [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC + 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, + May 2017, . + +11. 
Informative References [ISO.14496-12.2015] International Organization for Standardization, "Information technology -- Coding of audio-visual objects -- Part 12: ISO base media file format", December 2015. [VALGRIND] Valgrind Developers, "Valgrind website", undated, . - [AVI] Microsoft, "AVI RIFF File Reference", undated, - . - - [REFIMPL] Niedermayer, M., "The reference FFV1 implementation / the - FFV1 codec in FFmpeg", undated, . - - [Address-Sanitizer] - The Clang Team, "ASAN AddressSanitizer website", undated, - . + [I-D.ietf-cellar-ffv1] + Niedermayer, M., Rice, D., and J. Martinez, "FFV1 Video + Coding Format Version 0, 1, and 3", Work in Progress, + Internet-Draft, draft-ietf-cellar-ffv1-13, 28 April 2020, + . - [YCbCr] Wikipedia, "YCbCr", undated, - . + [range-coding] + Nigel, G. and N. Martin, "Range encoding: an algorithm for + removing redundancy from a digitised message.", July 1979. [ISO.14495-1.1999] International Organization for Standardization, "Information technology -- Lossless and near-lossless compression of continuous-tone still images: Baseline", December 1999. - [range-coding] - Nigel, G. and N. Martin, "Range encoding: an algorithm for - removing redundancy from a digitised message.", July 1979. - [ISO.14496-10.2014] International Organization for Standardization, "Information technology -- Coding of audio-visual objects -- Part 10: Advanced Video Coding", September 2014. - [I-D.ietf-cellar-ffv1] - Niedermayer, M., Rice, D., and J. Martinez, "FFV1 Video - Coding Format Version 0, 1, and 3", Work in Progress, - Internet-Draft, draft-ietf-cellar-ffv1-12, 28 January - 2020, - . + [YCbCr] Wikipedia, "YCbCr", undated, + . + + [AVI] Microsoft, "AVI RIFF File Reference", undated, + . + + [Address-Sanitizer] + The Clang Team, "ASAN AddressSanitizer website", undated, + . [NUT] Niedermayer, M., "NUT Open Container Format", December 2013, . + [REFIMPL] Niedermayer, M., "The reference FFV1 implementation / the + FFV1 codec in FFmpeg", undated, . 
+ + [HuffYUV] Rudiak-Gould, B., "HuffYUV", December 2003, + . + +Appendix A. Multi-threaded decoder implementation suggestions + + This appendix is informative. + + The FFV1 bitstream is parsable in two ways: in sequential order as + described in this document or with the pre-analysis of the footer of + each slice. Each slice footer contains a "slice_size" field so the + boundary of each slice is computable without having to parse the + slice content. That allows multi-threading as well as independence + of slice content (a bitstream error in a slice header or slice + content has no impact on the decoding of the other slices). + + After having checked the "keyframe" field, a decoder SHOULD parse + "slice_size" fields, from "slice_size" of the last slice at the end + of the "Frame" up to "slice_size" of the first slice at the beginning + of the "Frame", before parsing slices, in order to have slice + boundaries. A decoder MAY fall back to sequential order, e.g., in case + of a corrupted "Frame" (frame size unknown, "slice_size" of slices + not coherent...) or if there is no possibility of seeking into the + stream. + +Appendix B. Future handling of some streams created by non-conforming + encoders + + This appendix is informative. + + Some bitstreams were found with 40 extra bits corresponding to + "error_status" and "slice_crc_parity" in the "reserved" bits of + "Slice()". Any revision of this specification SHOULD avoid + adding 40 bits of content after "SliceContent" if "version" + == 0 or "version" == 1. Otherwise, a decoder conforming to the revised + specification could not distinguish between a revised bitstream and + such a buggy bitstream in the wild. + Authors' Addresses Michael Niedermayer Email: michael@niedermayer.cc - Dave Rice Email: dave@dericed.com Jerome Martinez Email: jerome@mediaarea.net