Audio/Video Transport WG                                   T. Kristensen
Internet-Draft                                                  P. Luthi
Intended status: Standards Track                                TANDBERG
Expires: April 5, 2010                                   October 2, 2009


                RTP Payload Format for H.264 RCDO Video
                    draft-ietf-avt-rtp-h264-rcdo-03

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on April 5, 2010.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.

Abstract

   This document describes an RTP Payload format for the Reduced-
   Complexity Decoding Operation (RCDO) for H.264 Baseline profile
   bitstreams, as specified in ITU-T Recommendation H.241.  RCDO reduces



Kristensen & Luthi        Expires April 5, 2010                 [Page 1]

Internet-Draft           H.264 RCDO RTP Payload             October 2009


   the decoding cost and resource consumption of the video processing.
   The RTP Payload format is based on the description in RFC 3984.


Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
   2.  Conventions, Definitions and Acronyms  . . . . . . . . . . . .  3
   3.  Media Format Background  . . . . . . . . . . . . . . . . . . .  3
   4.  Payload Format . . . . . . . . . . . . . . . . . . . . . . . .  4
   5.  Congestion Control Considerations  . . . . . . . . . . . . . .  4
   6.  Payload Format Parameters  . . . . . . . . . . . . . . . . . .  4
     6.1.  Media Type Definition  . . . . . . . . . . . . . . . . . .  4
   7.  Mapping to SDP . . . . . . . . . . . . . . . . . . . . . . . . 15
     7.1.  Offer/Answer Considerations  . . . . . . . . . . . . . . . 15
     7.2.  Declarative SDP Considerations . . . . . . . . . . . . . . 15
   8.  IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 16
   9.  Security Considerations  . . . . . . . . . . . . . . . . . . . 16
   10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 16
   11. References . . . . . . . . . . . . . . . . . . . . . . . . . . 16
     11.1. Normative References . . . . . . . . . . . . . . . . . . . 16
     11.2. Informative references . . . . . . . . . . . . . . . . . . 17
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 18


1.  Introduction

   ITU-T Recommendation H.241 [5] specifies a reduced-complexity
   decoding operation (RCDO) for use with H.264 [4] Baseline profile
   bitstreams.  It also specifies a bitstream constraint associated with
   RCDO and a mechanism for signalling, within the bitstream, that the
   bitstream conforms to this constraint and that the decoder applies
   the RCDO decoding process to it.

   RCDO for H.264 offers a solution to support higher resolutions at the
   same high framerates used in current implementations.  This is
   achieved by reducing the processing requirements and thus the
   decoding cost/resource consumption of the video processing.


2.  Conventions, Definitions and Acronyms

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [1].

   RFC-editor note: RFC XXXX is to be replaced by the RFC number this
   specification receives when published.


3.  Media Format Background

   The Reduced-Complexity Decoding Operation (RCDO) for H.264 Baseline
   profile bitstreams is specified in Annex B of H.241 [5].  RCDO is
   specified as a separate H.264 mode, and is distinct from any profile
   defined in H.264.  An RCDO bitstream obeys all the constraints of
   the Baseline profile.

   The media format is based on the H.264 RTP Payload format as
   specified in RFC 3984 [3].  Therefore, RFC 3984 constitutes the basis
   for this document and is referred to several times.

   In order to signal H.264 additional modes, Table 9f of H.241 [5]
   specifies an AdditionalModesSupported parameter.  Currently, the only
   additional mode defined is RCDO.

      Informative note: Other additional modes may be defined in the
      future.  H.264 additional modes may or may not be distinct from
      the Profiles in H.264.

   A separate media subtype, named H264-RCDO, is defined to ensure
   backward compatibility with deployed implementations of H.264.


4.  Payload Format

   The payload format defined in Section 5 of RFC 3984 [3] SHALL be
   used.  This includes the RTP header usage and the payload format in
   RFC 3984.  Examples of typical RTP packets can be found in RFC 3984.


5.  Congestion Control Considerations

   Congestion control for RTP SHALL be used in accordance with RFC 3550
   [6], and with any applicable RTP profile; e.g., RFC 3551 [7].  If
   best-effort service is being used, users of this payload format SHALL
   monitor packet loss to ensure that the packet loss rate is within
   acceptable parameters.


6.  Payload Format Parameters

   This RTP payload format is identified using the H264-RCDO media type
   which is registered in accordance with RFC 4855 [8] and using the
   template of RFC 4288 [10].

6.1.  Media Type Definition

      Informative note: The media type definition for H264-RCDO is based
      on the definition for the H264 media subtype as specified in
      Section 8.1 of RFC 3984 [3].  Except for the profile-level-id
      parameter where new semantics are specified below, the optional
      media type parameters are copied verbatim from RFC 3984 [3] for
      completeness in the IANA registration.

   The media subtype for RCDO for H.264 is allocated from the IETF tree.

   The receiver MUST ignore any unspecified parameter.

   Type name: video

   Subtype name: H264-RCDO

   Required parameters:

   rate:  Indicates the RTP timestamp clock rate.  The rate value MUST
      be 90000.

   Optional parameters:

   profile-level-id:  A base16 RFC 3548 [9] (hexadecimal) representation
      of the following three bytes in the sequence parameter set NAL
      unit specified in H.264 [4]: 1) profile_idc, 2) a byte herein
      referred to as profile-iop, composed of the values of
      constraint_set0_flag, constraint_set1_flag, constraint_set2_flag,
      and reserved_zero_5bits in bit-significance order, starting from
      the most significant bit, and 3) level_idc.

      RCDO is distinct from any profile.  This implies that the profile
      value is 0 (no profile) and that the profile_idc byte of the
      profile-level-id parameter is equal to 0.  An RCDO bitstream MUST
      obey all the constraints of the Baseline profile.  Therefore, only
      constraint_set0_flag is equal to 1 in the profile-iop part of the
      profile-level-id parameter; the remaining bits are set to 0.

      If the profile-level-id parameter is used to indicate properties
      of a NAL unit stream, it indicates the level that a decoder has to
      support in order to comply with H.264 [4] when it decodes the
      stream.  If the profile-level-id parameter is used for capability
      exchange or session setup procedure, it indicates the highest
      level supported for the signaled profile.

      For example, if a codec supports level 1.3, the profile-level-id
      becomes 00800d, in which 00 indicates the "no profile" value, 80
      indicates the constraints of the Baseline profile and 0d indicates
      level 1.3.  When level 2.1 is supported, the profile-level-id
      becomes 008015.

      If no profile-level-id is present, level 1 MUST be implied, i.e.
      equivalent to profile-level-id 00800a.
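
         Informative note: The byte layout described above can be
         sketched in Python as follows.  This is an illustration only
         and not part of the format.  The function names are
         hypothetical, and the sketch assumes that level_idc is ten
         times the level number (13 for level 1.3, 21 for level 2.1),
         as in the examples above.

```python
RCDO_PROFILE_IDC = 0x00   # "no profile" value used by RCDO
RCDO_PROFILE_IOP = 0x80   # only constraint_set0_flag is set
DEFAULT_LEVEL_IDC = 0x0A  # level 1, implied when the parameter is absent


def build_profile_level_id(level_idc=DEFAULT_LEVEL_IDC):
    """Return the six hex digits of an RCDO profile-level-id."""
    if not 0 <= level_idc <= 0xFF:
        raise ValueError("level_idc must fit in one byte")
    return "%02x%02x%02x" % (RCDO_PROFILE_IDC, RCDO_PROFILE_IOP, level_idc)


def parse_profile_level_id(value=None):
    """Split a profile-level-id into (profile_idc, profile_iop, level_idc).

    A missing parameter (None) is treated as 00800a, i.e. level 1.
    """
    if value is None:
        return (RCDO_PROFILE_IDC, RCDO_PROFILE_IOP, DEFAULT_LEVEL_IDC)
    if len(value) != 6:
        raise ValueError("profile-level-id must be exactly 6 hex digits")
    raw = bytes.fromhex(value)  # raises ValueError on non-hex input
    return (raw[0], raw[1], raw[2])
```

         With these definitions, build_profile_level_id(0x15) yields
         008015, the level 2.1 value given above.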

   max-mbps, max-fs, max-cpb, max-dpb, and max-br:  These parameters MAY
      be used to signal the capabilities of a receiver implementation.
      These parameters MUST NOT be used for any other purpose.  The
      profile-level-id parameter MUST be present in the same receiver
      capability description that contains any of these parameters.  The
      level conveyed in the value of the profile-level-id parameter MUST
      be one that the receiver is fully capable of supporting.  The
      parameters max-mbps, max-fs, max-cpb, max-dpb, and max-br MAY be
      used to indicate capabilities of the receiver that extend the
      required capabilities of the signaled level, as specified below.

      When more than one parameter from the set (max-mbps, max-fs, max-
      cpb, max-dpb, max-br) is present, the receiver MUST support all
      signaled capabilities simultaneously.  For example, if both max-
      mbps and max-br are present, the signaled level with the extension
      of both the frame rate and bit rate is supported.  That is, the
      receiver is able to decode NAL unit streams in which the
      macroblock processing rate is up to max-mbps (inclusive), the bit
      rate is up to max-br (inclusive), the coded picture buffer size is
      derived as specified in the semantics of the max-br parameter
      below, and other properties comply with the level specified in the
      value of the profile-level-id parameter.

      A receiver MUST NOT signal values of max-mbps, max-fs, max-cpb,
      max-dpb, and max-br that meet the requirements of a higher level,
      referred to as level A herein, compared to the level specified in
      the value of the profile-level-id parameter, if the receiver can
      support all the properties of level A.

         Informative note: When the OPTIONAL MIME type parameters are
         used to signal the properties of a NAL unit stream, max-mbps,
         max-fs, max-cpb, max-dpb, and max-br are not present, and the
         value of profile-level-id must always be such that the NAL
         unit stream complies fully with the specified profile and
         level.

   max-mbps:  The value of max-mbps is an integer indicating the maximum
      macroblock processing rate in units of macroblocks per second.
      The max-mbps parameter signals that the receiver is capable of
      decoding video at a higher rate than is required by the signaled
      level conveyed in the value of the profile-level-id parameter.
      When max-mbps is signaled, the receiver MUST be able to decode NAL
      unit streams that conform to the signaled level, with the
      exception that the MaxMBPS value in Table A-1 of H.264 [4] for the
      signaled level is replaced with the value of max-mbps.  The value
      of max-mbps MUST be greater than or equal to the value of MaxMBPS
      for the level given in Table A-1 of H.264 [4].  Senders MAY use
      this knowledge to send pictures of a given size at a higher
      picture rate than is indicated in the signaled level.
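
         Informative note: The effect of max-mbps on the achievable
         picture rate can be sketched in Python as follows.  This is an
         illustration only.  The function names are hypothetical, and
         the MaxMBPS value of the signaled level is assumed to be
         supplied by the caller from Table A-1 of H.264 [4].

```python
from fractions import Fraction


def check_max_mbps(max_mbps, level_max_mbps):
    """Enforce that max-mbps is not below MaxMBPS of the signaled level."""
    if max_mbps < level_max_mbps:
        raise ValueError("max-mbps must be >= MaxMBPS of the signaled level")
    return max_mbps


def max_picture_rate(max_mbps, frame_size_mbs):
    """Upper bound on pictures per second for pictures of frame_size_mbs
    macroblocks, given a macroblock processing rate of max_mbps."""
    return Fraction(max_mbps, frame_size_mbs)
```

         For instance, with max-mbps equal to 11880 and pictures of
         396 macroblocks (22 x 18), the bound is 30 pictures per
         second.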

   max-fs:  The value of max-fs is an integer indicating the maximum
      frame size in units of macroblocks.  The max-fs parameter signals
      that the receiver is capable of decoding larger picture sizes than
      are required by the signaled level conveyed in the value of the
      profile-level-id parameter.  When max-fs is signaled, the receiver
      MUST be able to decode NAL unit streams that conform to the
      signaled level, with the exception that the MaxFS value in Table
      A-1 of H.264 [4] for the signaled level is replaced with the value
      of max-fs.  The value of max-fs MUST be greater than or equal to
      the value of MaxFS for the level given in Table A-1 of H.264 [4].
      Senders MAY use this knowledge to send larger pictures at a
      proportionally lower frame rate than is indicated in the signaled
      level.

   max-cpb:  The value of max-cpb is an integer indicating the maximum
      coded picture buffer size in units of 1000 bits for the VCL HRD
      parameters (see A.3.1 item i of H.264 [4]) and in units of 1200
      bits for the NAL HRD parameters (see A.3.1 item j of H.264 [4]).
      The max-cpb parameter signals that the receiver has more memory
      than the minimum amount of coded picture buffer memory required by
      the signaled level conveyed in the value of the profile-level-id
      parameter.  When max-cpb is signaled, the receiver MUST be able to
      decode NAL unit streams that conform to the signaled level, with
      the exception that the MaxCPB value in Table A-1 of H.264 [4] for
      the signaled level is replaced with the value of max-cpb.  The
      value of max-cpb MUST be greater than or equal to the value of
      MaxCPB for the level given in Table A-1 of H.264 [4].  Senders MAY
      use this knowledge to construct coded video streams with greater
      variation of bit rate than can be achieved with the MaxCPB value
      in Table A-1 of H.264 [4].

         Informative note: The coded picture buffer is used in the
         hypothetical reference decoder (Annex C) of H.264.  The use of
         the hypothetical reference decoder is recommended in H.264
         encoders to verify that the produced bitstream conforms to the
         standard and to control the output bitrate.  Thus, the coded
         picture buffer is conceptually independent of any other
         potential buffers in the receiver, including de-interleaving
         and de-jitter buffers.  The coded picture buffer need not be
         implemented in decoders as specified in Annex C of H.264, but
         rather standard-compliant decoders can have any buffering
         arrangements provided that they can decode standard-compliant
         bitstreams.  Thus, in practice, the input buffer for the video
         decoder can be integrated with the de-interleaving and
         de-jitter buffers of the receiver.

   max-dpb:  The value of max-dpb is an integer indicating the maximum
      decoded picture buffer size in units of 1024 bytes.  The max-dpb
      parameter signals that the receiver has more memory than the
      minimum amount of decoded picture buffer memory required by the
      signaled level conveyed in the value of the profile-level-id
      parameter.  When max-dpb is signaled, the receiver MUST be able to
      decode NAL unit streams that conform to the signaled level, with
      the exception that the MaxDPB value in Table A-1 of H.264 [4] for
      the signaled level is replaced with the value of max-dpb.
      Consequently, a receiver that signals max-dpb MUST be capable of
      storing the following number of decoded frames, complementary
      field pairs, and non-paired fields in its decoded picture buffer:

      Min(1024 * max-dpb / ( PicWidthInMbs * FrameHeightInMbs * 256 *
      ChromaFormatFactor ), 16)

      PicWidthInMbs, FrameHeightInMbs, and ChromaFormatFactor are
      defined in H.264 [4].
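
         Informative note: The formula above can be evaluated as in the
         following Python sketch.  This is an illustration only.  The
         function name is hypothetical; the default ChromaFormatFactor
         of 1.5 assumes 4:2:0 video, and rounding the quotient down to
         a whole number of frames is an assumption of this sketch.

```python
from fractions import Fraction


def dpb_frame_capacity(max_dpb, pic_width_in_mbs, frame_height_in_mbs,
                       chroma_format_factor=Fraction(3, 2)):
    """Number of frames a receiver signaling max_dpb (in units of
    1024 bytes) must be able to store, capped at 16."""
    # Bytes per decoded frame: 256 luma samples per macroblock, scaled
    # by the chroma format factor.
    frame_bytes = (pic_width_in_mbs * frame_height_in_mbs * 256
                   * Fraction(chroma_format_factor))
    # Assumption: fractional results are rounded down.
    return min(int(Fraction(1024 * max_dpb) / frame_bytes), 16)
```

         For instance, max-dpb equal to 891 with pictures of 22 x 18
         macroblocks gives 6 frames.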

      The value of max-dpb MUST be greater than or equal to the value of
      MaxDPB for the level given in Table A-1 of H.264 [4].  Senders MAY
      use this knowledge to construct coded video streams with improved
      compression.

         Informative note: This parameter was added primarily to
         complement a similar codepoint in the ITU-T Recommendation
         H.245, so as to facilitate signaling gateway designs.  The
         decoded picture buffer stores reconstructed samples and is a
         property of the video decoder only.  There is no relationship
         between the size of the decoded picture buffer and the buffers
         used in RTP, especially de-interleaving and de-jitter buffers.

   max-br:  The value of max-br is an integer indicating the maximum
      video bit rate in units of 1000 bits per second for the VCL HRD
      parameters (see A.3.1 item i of H.264 [4]) and in units of 1200
      bits per second for the NAL HRD parameters (see A.3.1 item j of
      H.264 [4]).

      The max-br parameter signals that the video decoder of the
      receiver is capable of decoding video at a higher bit rate than is
      required by the signaled level conveyed in the value of the
      profile-level-id parameter.  The value of max-br MUST be greater
      than or equal to the value of MaxBR for the level given in Table
      A-1 of H.264 [4].

      When max-br is signaled, the video codec of the receiver MUST be
      able to decode NAL unit streams that conform to the signaled
      level, conveyed in the profile-level-id parameter, with the
      following exceptions in the limits specified by the level:

      o  The value of max-br replaces the MaxBR value of the signaled
         level (in Table A-1 of H.264 [4]).

      o  When the max-cpb parameter is not present, the result of the
         following formula replaces the value of MaxCPB in Table A-1 of
         H.264 [4]: (MaxCPB of the signaled level) * max-br / (MaxBR of
         the signaled level).

      For example, if a receiver signals capability for Level 1.2 with
      max-br equal to 1550, this indicates a maximum video bitrate of
      1550 kbits/sec for VCL HRD parameters, a maximum video bitrate of
      1860 kbits/sec for NAL HRD parameters, and a CPB size of 4036458
      bits (1550000 / 384000 * 1000 * 1000).

      The value of max-br MUST be greater than or equal to the value
      MaxBR for the signaled level given in Table A-1 of H.264 [4].

      Senders MAY use this knowledge to send higher bitrate video as
      allowed in the level definition of Annex A of H.264, to achieve
      improved video quality.
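
         Informative note: The derivation used in the Level 1.2 example
         above can be sketched in Python as follows.  This is an
         illustration only.  The function name and the dictionary keys
         are hypothetical, the MaxBR and MaxCPB values of the signaled
         level are assumed to be supplied by the caller from Table A-1
         of H.264 [4], and rounding the CPB size down to whole bits is
         an assumption of this sketch.

```python
from fractions import Fraction


def max_br_limits(max_br, level_max_br, level_max_cpb, max_cpb=None):
    """Derive bit rate and CPB limits from max-br.

    max_br and level_max_br are in units of 1000 bit/s (VCL) or
    1200 bit/s (NAL); level_max_cpb and max_cpb are in units of
    1000 bits (VCL) or 1200 bits (NAL).
    """
    if max_br < level_max_br:
        raise ValueError("max-br must be >= MaxBR of the signaled level")
    if max_cpb is None:
        # No max-cpb signaled: scale MaxCPB of the level by max-br / MaxBR.
        cpb_units = Fraction(level_max_cpb * max_br, level_max_br)
    else:
        cpb_units = Fraction(max_cpb)
    return {
        "vcl_bit_rate": 1000 * max_br,
        "nal_bit_rate": 1200 * max_br,
        "vcl_cpb_bits": int(1000 * cpb_units),  # assumption: round down
        "nal_cpb_bits": int(1200 * cpb_units),  # assumption: round down
    }
```

         Called as max_br_limits(1550, 384, 1000), the sketch gives a
         VCL bit rate of 1550000 bits/sec, a NAL bit rate of 1860000
         bits/sec, and a VCL CPB size of 4036458 bits, matching the
         example above.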

         Informative note: This parameter was added primarily to
         complement a similar codepoint in the ITU-T Recommendation
         H.245, so as to facilitate signaling gateway designs.  No
         assumption can be made from the value of this parameter that
         the network is capable of handling such bit rates at any given
         time.  In particular, no conclusion can be drawn that the
         signaled bit rate is possible under congestion control
         constraints.

   redundant-pic-cap:  This parameter signals the capabilities of a
      receiver implementation.  When equal to 0, the parameter indicates
      that the receiver makes no attempt to use redundant coded pictures
      to correct incorrectly decoded primary coded pictures.  When equal
      to 0, the receiver is not capable of using redundant slices;
      therefore, a sender SHOULD avoid sending redundant slices to save
      bandwidth.  When equal to 1, the receiver is capable of decoding
      any such redundant slice that covers a corrupted area in a primary
      decoded picture (at least partly), and therefore a sender MAY send
      redundant slices.  When the parameter is not present, then a value
      of 0 MUST be used for redundant-pic-cap.  When present, the value
      of redundant-pic-cap MUST be either 0 or 1.

      When the profile-level-id parameter is present in the same
      capability signaling as the redundant-pic-cap parameter, and the
      profile indicated in profile-level-id is such that it disallows
      the use of redundant coded pictures (e.g., Main Profile), the
      value of redundant-pic-cap MUST be equal to 0.  When a receiver
      indicates redundant-pic-cap equal to 0, the received stream SHOULD
      NOT contain redundant coded pictures.

         Informative note: Even if redundant-pic-cap is equal to 0, the
         decoder is able to ignore redundant coded pictures provided
         that the decoder supports such a profile (Baseline, Extended)
         in which redundant coded pictures are allowed.

         Informative note: Even if redundant-pic-cap is equal to 1, the
         receiver may also choose other error concealment strategies to
         replace or complement decoding of redundant slices.

   sprop-parameter-sets:  This parameter MAY be used to convey any
      sequence and picture parameter set NAL units (herein referred to
      as the initial parameter set NAL units) that MUST precede any
      other NAL units in decoding order.  The parameter MUST NOT be used
      to indicate codec capability in any capability exchange procedure.
      The value of the parameter is the base64 RFC 3548 [9]
      representation of the initial parameter set NAL units as specified
      in sections 7.3.2.1 and 7.3.2.2 of H.264 [4].  The parameter sets
      are conveyed in decoding order, and no framing of the parameter
      set NAL units takes place.  A comma is used to separate any pair
      of parameter sets in the list.  Note that the number of bytes in a
      parameter set NAL unit is typically less than 10, but a picture
      parameter set NAL unit can contain several hundreds of bytes.
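
         Informative note: The comma-separated base64 encoding described
         above can be sketched in Python as follows.  This is an
         illustration only, and the function names are hypothetical.
         The sketch relies on nal_unit_type being carried in the five
         least significant bits of the first NAL unit byte, with type 7
         denoting a sequence parameter set and type 8 a picture
         parameter set.

```python
import base64


def encode_sprop_parameter_sets(nal_units):
    """Join parameter set NAL units (bytes objects, in decoding order)
    into a sprop-parameter-sets value."""
    return ",".join(base64.b64encode(nal).decode("ascii")
                    for nal in nal_units)


def decode_sprop_parameter_sets(value):
    """Split a sprop-parameter-sets value back into raw NAL units."""
    return [base64.b64decode(item) for item in value.split(",") if item]


def nal_unit_type(nal):
    """Return nal_unit_type, the low 5 bits of the first NAL unit byte."""
    return nal[0] & 0x1F
```

         A receiver could use nal_unit_type() on each decoded element
         to tell sequence parameter sets from picture parameter sets
         before handing them to the decoder.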

         Informative note: When several payload types are offered in the
         SDP Offer/Answer model, each with its own sprop-parameter-sets
         parameter, then the receiver cannot assume that those parameter
         sets do not use conflicting storage locations (i.e., identical
         values of parameter set identifiers).  Therefore, a receiver
         should double-buffer all sprop-parameter-sets and make them
         available to the decoder instance that decodes a certain
         payload type.

   parameter-add:  This parameter MAY be used to signal whether the
      receiver of this parameter is allowed to add parameter sets in its
      signaling response using the sprop-parameter-sets MIME parameter.
      The value of this parameter is either 0 or 1.  A value of 0 means
      false; i.e., it is not allowed to add parameter sets.  A value of
      1 means true; i.e., it is allowed to add parameter sets.  If the
      parameter is not present, its value MUST be 1.

   packetization-mode:  This parameter signals the properties of an RTP
      payload type or the capabilities of a receiver implementation.
      Only a single configuration point can be indicated; thus, when
      capabilities to support more than one packetization-mode are
      declared, multiple configuration points (RTP payload types) must
      be used.

      When the value of packetization-mode is equal to 0 or
      packetization-mode is not present, the single NAL mode, as defined
      in section 6.2 of RFC 3984, MUST be used.  This mode is in use in
      standards using ITU-T Recommendation H.241 [5] (see section 12.1).
      When the value of packetization-mode is equal to 1, the non-
      interleaved mode, as defined in section 6.3 of RFC 3984, MUST be
      used.  When the value of packetization-mode is equal to 2, the
      interleaved mode, as defined in section 6.4 of RFC 3984, MUST be
      used.  The value of packetization-mode MUST be an integer in the
      range of 0 to 2, inclusive.

   sprop-interleaving-depth:  This parameter MUST NOT be present when
      packetization-mode is not present or the value of packetization-
      mode is equal to 0 or 1.  This parameter MUST be present when the
      value of packetization-mode is equal to 2.

      This parameter signals the properties of a NAL unit stream.  It
      specifies the maximum number of VCL NAL units that precede any VCL
      NAL unit in the NAL unit stream in transmission order and follow
      the VCL NAL unit in decoding order.  Consequently, it is
      guaranteed that receivers can reconstruct NAL unit decoding order
      when the buffer size for NAL unit decoding order recovery is at
      least the value of sprop-interleaving-depth + 1 in terms of VCL
      NAL units.

      The value of sprop-interleaving-depth MUST be an integer in the
      range of 0 to 32767, inclusive.

   sprop-deint-buf-req:  This parameter MUST NOT be present when
      packetization-mode is not present or the value of packetization-
      mode is equal to 0 or 1.  It MUST be present when the value of
      packetization-mode is equal to 2.

      sprop-deint-buf-req signals the required size of the
      deinterleaving buffer for the NAL unit stream.  The value of the
      parameter MUST be greater than or equal to the maximum buffer
      occupancy (in units of bytes) required in such a deinterleaving
      buffer that is specified in section 7.2 of RFC 3984.  It is
      guaranteed that receivers can perform the deinterleaving of
      interleaved NAL units into NAL unit decoding order, when the
      deinterleaving buffer size is at least the value of sprop-deint-
      buf-req in terms of bytes.

      The value of sprop-deint-buf-req MUST be an integer in the range
      of 0 to 4294967295, inclusive.

         Informative note: sprop-deint-buf-req indicates the required
         size of the deinterleaving buffer only.  When network jitter
         can occur, an appropriately sized jitter buffer has to be
         provisioned for as well.
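
         Informative note: The presence and range rules stated above for
         packetization-mode, sprop-interleaving-depth, and sprop-deint-
         buf-req can be checked as in the following Python sketch.
         This is an illustration only.  The function name is
         hypothetical, and the parameters are assumed to have already
         been parsed into a dictionary of integers.

```python
def check_interleaving_params(params):
    """Validate packetization-mode and the two parameters tied to mode 2.

    params maps parameter names to integer values; parameters that were
    not signaled are simply absent.  Returns the effective mode.
    """
    # An absent packetization-mode means single NAL unit mode (0).
    mode = params.get("packetization-mode", 0)
    if mode not in (0, 1, 2):
        raise ValueError("packetization-mode must be 0, 1, or 2")
    upper_limits = {
        "sprop-interleaving-depth": 32767,
        "sprop-deint-buf-req": 4294967295,
    }
    for name, upper in upper_limits.items():
        present = name in params
        if mode == 2 and not present:
            raise ValueError(name + " MUST be present in mode 2")
        if mode != 2 and present:
            raise ValueError(name + " MUST NOT be present in mode 0 or 1")
        if present and not 0 <= params[name] <= upper:
            raise ValueError(name + " is out of range")
    return mode
```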

   deint-buf-cap:  This parameter signals the capabilities of a receiver
      implementation and indicates the amount of deinterleaving buffer
      space in units of bytes that the receiver has available for
      reconstructing the NAL unit decoding order.  A receiver is able to
      handle any stream for which the value of the sprop-deint-buf-req
      parameter is smaller than or equal to this parameter.

      If the parameter is not present, then a value of 0 MUST be used
      for deint-buf-cap.  The value of deint-buf-cap MUST be an integer
      in the range of 0 to 4294967295, inclusive.

         Informative note: deint-buf-cap indicates the maximum possible
         size of the deinterleaving buffer of the receiver only.  When
         network jitter can occur, an appropriately sized jitter buffer
         has to be provisioned for as well.

   sprop-init-buf-time:  This parameter MAY be used to signal the
      properties of a NAL unit stream.  The parameter MUST NOT be
      present, if the value of packetization-mode is equal to 0 or 1.

      The parameter signals the initial buffering time that a receiver
      MUST buffer before starting decoding to recover the NAL unit
      decoding order from the transmission order.  The parameter is the
      maximum value of (transmission time of a NAL unit - decoding time
      of the NAL unit), assuming reliable and instantaneous
      transmission, the same timeline for transmission and decoding, and
      that decoding starts when the first packet arrives.

      An example of specifying the value of sprop-init-buf-time
      follows.  A NAL unit stream is sent in the following interleaved
      order, in which the value corresponds to the decoding time and the
      transmission order is from left to right:

      0 2 1 3 5 4 6 8 7 ...

      Assuming a steady transmission rate of NAL units, the transmission
      times are:

      0 1 2 3 4 5 6 7 8 ...

      Subtracting the decoding time from the transmission time column-
      wise results in the following series:

      0 -1 1 0 -1 1 0 -1 1 ...

      Thus, in terms of intervals of NAL unit transmission times, the
      value of sprop-init-buf-time in this example is 1.
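
         Informative note: The calculation in the example above can be
         sketched in Python as follows.  This is an illustration only.
         The function name is hypothetical, and, as in the example, the
         NAL unit at index i in transmission order is assumed to be sent
         at time i.

```python
def init_buf_intervals(decoding_times):
    """Maximum of (transmission time - decoding time), expressed in
    intervals of NAL unit transmission times.

    decoding_times lists the decoding time of each NAL unit in
    transmission order; the unit at index i is sent at time i.
    """
    return max(tx - dec for tx, dec in enumerate(decoding_times))
```

         For the interleaved order 0 2 1 3 5 4 6 8 7, the function
         returns 1.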

      The parameter is coded as a non-negative base10 integer
      representation in clock ticks of a 90-kHz clock.  If the
      parameter is not present, then no initial buffering time value is
      defined.  Otherwise the value of sprop-init-buf-time MUST be an
      integer in the range of 0 to 4294967295, inclusive.

      In addition to the signaled sprop-init-buf-time, receivers SHOULD
      take into account the transmission delay jitter buffering,
      including buffering for the delay jitter caused by mixers,
      translators, gateways, proxies, traffic-shapers, and other network
      elements.

   sprop-max-don-diff:  This parameter MAY be used to signal the
      properties of a NAL unit stream.  It MUST NOT be used to signal
      transmitter or receiver or codec capabilities.  The parameter MUST
      NOT be present if the value of packetization-mode is equal to 0 or
      1.  sprop-max-don-diff is an integer in the range of 0 to 32767,
      inclusive.  If sprop-max-don-diff is not present, the value of the
      parameter is unspecified.  sprop-max-don-diff is calculated as
      follows:

      sprop-max-don-diff = max{AbsDON(i) - AbsDON(j)},
      for any i and any j>i,

      where i and j indicate the index of the NAL unit in the
      transmission order and AbsDON denotes a decoding order number of
      the NAL unit that does not wrap around to 0 after 65535.  In other
      words, AbsDON is calculated as follows: Let m and n be consecutive
      NAL units in transmission order.  For the very first NAL unit in
      transmission order (whose index is 0), AbsDON(0) = DON(0).  For
      other NAL units, AbsDON is calculated as follows:

      If DON(m) == DON(n), AbsDON(n) = AbsDON(m)

      If (DON(m) < DON(n) and DON(n) - DON(m) < 32768),
      AbsDON(n) = AbsDON(m) + DON(n) - DON(m)

      If (DON(m) > DON(n) and DON(m) - DON(n) >= 32768),
      AbsDON(n) = AbsDON(m) + 65536 - DON(m) + DON(n)

      If (DON(m) < DON(n) and DON(n) - DON(m) >= 32768),
      AbsDON(n) = AbsDON(m) - (DON(m) + 65536 - DON(n))

      If (DON(m) > DON(n) and DON(m) - DON(n) < 32768),
      AbsDON(n) = AbsDON(m) - (DON(m) - DON(n))

      where DON(i) is the decoding order number of the NAL unit having
      index i in the transmission order.  The decoding order number is
      specified in section 5.5 of RFC 3984.

         Informative note: Receivers may use sprop-max-don-diff to
         determine which NAL units in the receiver buffer can be passed
         to the decoder.
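
      Informative example: The AbsDON recurrence and the definition of
      sprop-max-don-diff given above can be sketched in Python as shown
      below.  This is a minimal illustration, not part of the payload
      format.  The function names abs_don_sequence and
      sprop_max_don_diff are hypothetical, and returning 0 when no NAL
      unit is followed by one with a smaller AbsDON is an assumption
      based on the stated range of 0 to 32767.

```python
from typing import Iterable, List

DON_MODULUS = 65536  # DON is a 16-bit value that wraps around after 65535
HALF_RANGE = 32768   # threshold used above to detect a wrap-around


def abs_don_sequence(dons: Iterable[int]) -> List[int]:
    """Map DON values, given in transmission order, to AbsDON values."""
    result: List[int] = []
    prev = None  # DON(m), the previous NAL unit in transmission order
    for don in dons:
        if not 0 <= don < DON_MODULUS:
            raise ValueError("DON must be in the range 0..65535")
        if prev is None:
            result.append(don)  # AbsDON(0) = DON(0)
        else:
            last = result[-1]  # AbsDON(m)
            if don == prev:
                result.append(last)
            elif prev < don and don - prev < HALF_RANGE:
                result.append(last + don - prev)
            elif prev > don and prev - don >= HALF_RANGE:
                # DON wrapped forward past 65535
                result.append(last + DON_MODULUS - prev + don)
            elif prev < don and don - prev >= HALF_RANGE:
                # DON stepped backward across the wrap point
                result.append(last - (prev + DON_MODULUS - don))
            else:
                # prev > don and prev - don < HALF_RANGE
                result.append(last - (prev - don))
        prev = don
    return result


def sprop_max_don_diff(dons: Iterable[int]) -> int:
    """Compute max{AbsDON(i) - AbsDON(j)} over all i and all j > i."""
    best = 0  # assumption: the result is never reported below 0
    highest_so_far = None  # largest AbsDON(i) seen before index j
    for value in abs_don_sequence(dons):
        if highest_so_far is not None:
            best = max(best, highest_so_far - value)
        if highest_so_far is None or value > highest_so_far:
            highest_so_far = value
    return best
```

      For the transmission order DON = 2, 0, 1, 5, 3, 4, the sketch
      yields a value of 2, because the NAL units with DON 2 and DON 5
      are each sent ahead of a NAL unit whose DON is smaller by 2.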

   max-rcmd-nalu-size:  This parameter MAY be used to signal the
      capabilities of a receiver.  The parameter MUST NOT be used for
      any other purposes.  The value of the parameter indicates the
      largest NALU size in bytes that the receiver can handle
      efficiently.  The parameter value is a recommendation, not a
      strict upper boundary.  The sender MAY create larger NALUs but
      must be aware that the handling of these may come at a higher cost
      than NALUs conforming to the limitation.

      The value of max-rcmd-nalu-size MUST be an integer in the range of
      0 to 4294967295, inclusive.  If this parameter is not specified,
      no known limitation to the NALU size exists.  Senders still have
      to consider the MTU size available between the sender and the
      receiver and SHOULD run MTU discovery for this purpose.

      This parameter is motivated by, for example, an IP to H.223 video
      telephony gateway, where NALUs smaller than the H.223 transport
      data unit will be more efficient.  A gateway may terminate IP;
      thus, MTU discovery will normally not work beyond the gateway.

         Informative note: Setting this parameter to a lower than
         necessary value may have a negative impact.

   Encoding considerations:  This type is only defined for transfer via
      RTP (RFC 3550) and is framed and binary; see Section 4.8 of
      RFC 4288.

   Security considerations:  See section X of RFC XXXX.

   Interoperability considerations:  None

   Published specification:  RFC XXXX and its reference section.

   Applications that use this media type:  None

   Additional information:  None

      Magic number(s):

      File extension(s):

      Macintosh file type code(s):

   Person & email address to contact for further information:
      Tom Kristensen <tom.kristensen@tandberg.com>, <tomkri@ifi.uio.no>

   Intended usage:  COMMON

   Restrictions on usage:  This media type depends on RTP framing and
      hence is only defined for transfer via RTP (RFC 3550).  Transport
      within other framing protocols is not defined at this time.

   Author:  Tom Kristensen

   Change controller:  IETF Audio/Video Transport working group
      delegated from the IESG.


7.  Mapping to SDP

   The mapping of the above defined payload format media type and its
   parameters SHALL be done according to Section 3 of RFC 4855 [8].

   An example of media representation of a level 2 bitstream is as
   follows:

      m=video 54321 RTP/AVP 101
      a=rtpmap:101 H264-RCDO/90000
      a=fmtp:101 profile-level-id=008014;max-mbps=60000
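
   Informative example: A receiver of such a description might split
   the fmtp line into its parameters and decode profile-level-id as
   sketched below in Python.  This is an illustration under stated
   assumptions, not a normative parser.  The function names are
   hypothetical, and the three-byte layout of profile-level-id
   (profile_idc, a middle byte called profile-iop in RFC 3984, and
   level_idc) is assumed to follow RFC 3984 [3].

```python
from typing import Dict, Tuple


def parse_fmtp(line: str) -> Tuple[int, Dict[str, str]]:
    """Split 'a=fmtp:<pt> name=value;name=value' into (pt, parameters)."""
    prefix = "a=fmtp:"
    if not line.startswith(prefix):
        raise ValueError("not an fmtp attribute line")
    pt_text, _, param_text = line[len(prefix):].strip().partition(" ")
    params: Dict[str, str] = {}
    for item in param_text.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing ';'
        name, _, value = item.partition("=")
        params[name.strip()] = value.strip()
    return int(pt_text), params


def split_profile_level_id(value: str) -> Tuple[int, int, int]:
    """Decode six hex digits into (profile_idc, middle byte, level_idc)."""
    if len(value) != 6:
        raise ValueError("profile-level-id must be six hexadecimal digits")
    raw = bytes.fromhex(value)  # raises ValueError on non-hex input
    return raw[0], raw[1], raw[2]


def check_rcdo_profile_level_id(value: str) -> int:
    """Return level_idc after checking that profile_idc is 0 (no profile)."""
    profile_idc, _middle, level_idc = split_profile_level_id(value)
    if profile_idc != 0:
        raise ValueError("H264-RCDO requires profile_idc equal to 0")
    return level_idc
```

   Applied to the fmtp line of the example above, parse_fmtp returns
   payload type 101 with the parameters profile-level-id=008014 and
   max-mbps=60000, and check_rcdo_profile_level_id returns a level_idc
   of 20 (hexadecimal 14), which corresponds to level 2.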

7.1.  Offer/Answer Considerations

   When H264-RCDO is offered over RTP using SDP in an Offer/Answer model
   [2] for unicast and multicast usage, the limitations and rules
   described in Section 8.2.2 of RFC 3984 [3] apply.  Note that the
   profile_idc byte of the H264-RCDO profile-level-id parameter can only
   take the value of 0 (no profile).

7.2.  Declarative SDP Considerations

   When H264-RCDO over RTP is offered with SDP in a declarative style,
   as in RTSP [14] or SAP [15], the considerations in Section 8.2.3 of
   RFC 3984 [3] apply.  Note that the profile_idc byte of the H264-RCDO
   profile-level-id parameter can only take the value of 0 (no profile).


8.  IANA Considerations

   This document requests that IANA register H264-RCDO as specified in
   Section 6.1.  The media type is also requested to be added to the
   IANA registry for "RTP Payload Format MIME types"
   (http://www.iana.org/assignments/rtp-parameters).


9.  Security Considerations

   RTP packets using the payload format defined in this specification
   are subject to the security considerations discussed in the RTP
   specification [6], and in any applicable RTP profile.  The main
   security considerations for the RTP packet carrying the RTP payload
   format defined within this document are confidentiality, integrity
   and source authenticity.  Confidentiality is achieved by encryption
   of the RTP payload.  Integrity of the RTP packets is achieved through
   a suitable cryptographic integrity protection mechanism.  Such a
   cryptographic system may also allow the authentication of the source
   of the payload.  A suitable security mechanism for this RTP payload
   format should provide confidentiality, integrity protection, and at
   least source authentication capable of determining whether an RTP
   packet is from a member of the RTP session.

   Note that the appropriate mechanism to provide security to RTP and
   payloads following this document may vary.  It is dependent on the
   application, the transport, and the signalling protocol employed.
   Therefore, a single mechanism is not sufficient, although the use of
   SRTP [11] is recommended where suitable.  Other mechanisms that may
   be used are IPsec [12] and TLS [13] (RTP over TCP); other
   alternatives may also exist.

   Refer also to section 9 of RFC 3984 [3], as no reasons for separate
   considerations are introduced in this document.


10.  Acknowledgements

   The authors would like to acknowledge Gisle Bjoentegaard for his
   technical contribution and review of the specification.


11.  References

11.1.  Normative References

   [1]   Bradner, S., "Key words for use in RFCs to Indicate Requirement
         Levels", BCP 14, RFC 2119, March 1997.

   [2]   Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with
         Session Description Protocol (SDP)", RFC 3264, June 2002.

   [3]   Wenger, S., Hannuksela, M., Stockhammer, T., Westerlund, M.,
         and D. Singer, "RTP Payload Format for H.264 Video", RFC 3984,
         February 2005.

   [4]   International Telecommunications Union, "Advanced video coding
         for generic audiovisual services", ITU-T Recommendation H.264,
         March 2005.

   [5]   International Telecommunications Union, "Extended video
         procedures and control signals for H.300-series terminals",
         ITU-T Recommendation H.241, May 2006.

   [6]   Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson,
         "RTP: A Transport Protocol for Real-Time Applications", STD 64,
         RFC 3550, July 2003.

   [7]   Schulzrinne, H. and S. Casner, "RTP Profile for Audio and Video
         Conferences with Minimal Control", STD 65, RFC 3551, July 2003.

   [8]   Casner, S., "Media Type Registration of RTP Payload Formats",
         RFC 4855, February 2007.

   [9]   Josefsson, S., "The Base16, Base32, and Base64 Data Encodings",
         RFC 3548, July 2003.

11.2.  Informative references

   [10]  Freed, N. and J. Klensin, "Media Type Specifications and
         Registration Procedures", BCP 13, RFC 4288, December 2005.

   [11]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
         Norrman, "The Secure Real-time Transport Protocol (SRTP)",
         RFC 3711, March 2004.

   [12]  Kent, S. and K. Seo, "Security Architecture for the Internet
         Protocol", RFC 4301, December 2005.

   [13]  Dierks, T. and E. Rescorla, "The Transport Layer Security (TLS)
         Protocol Version 1.2", RFC 5246, August 2008.

   [14]  Schulzrinne, H., Rao, A., and R. Lanphier, "Real Time Streaming
         Protocol (RTSP)", RFC 2326, April 1998.

   [15]  Handley, M., Perkins, C., and E. Whelan, "Session Announcement
         Protocol", RFC 2974, October 2000.


Authors' Addresses

   Tom Kristensen
   TANDBERG
   Philip Pedersens vei 22
   N-1366 Lysaker
   Norway

   Phone: +47 67125125
   Email: tom.kristensen@tandberg.com, tomkri@ifi.uio.no
   URI:   http://www.tandberg.com


   Patrick Luthi
   TANDBERG
   Philip Pedersens vei 22
   N-1366 Lysaker
   Norway

   Email: patrick.luthi@tandberg.com
   URI:   http://www.tandberg.com