Network Working Group                                      M. Westerlund
Internet-Draft                                              I. Johansson
Intended status: Standards Track                             Ericsson AB
Expires: May 21, 2009                                       Nov 17, 2008
RTP Payload format for G.719
draft-ietf-avt-rtp-g719-04
Status of this Memo
By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on May 21, 2009.
Abstract
This document specifies the payload format for packetization of the
G.719 full-band codec encoded audio signals into the Real-time
Transport Protocol (RTP). The payload format supports transmission
of multiple channels, multiple frames per payload, and interleaving.
Westerlund & Johansson Expires May 21, 2009 [Page 1]
Internet-Draft RTP Payload format for G.719 Nov 2008
Table of Contents
1. Introduction
2. Definitions and Conventions
3. G.719 Description
4. Payload format Capabilities
  4.1. Multi-rate Encoding and Rate Adaptation
  4.2. Support for Multi-Channel Sessions
  4.3. Robustness against Packet Loss
    4.3.1. Use of Forward Error Correction (FEC)
    4.3.2. Use of Frame Interleaving
5. Payload format
  5.1. RTP Header Usage
  5.2. Payload Structure
    5.2.1. Basic ToC element
  5.3. Basic mode
  5.4. Interleaved mode
  5.5. Audio Data
  5.6. Implementation Considerations
    5.6.1. Receiving Redundant Frames
    5.6.2. Interleaving
    5.6.3. Decoding Validation
6. Payload Examples
  6.1. 3 mono frames with 2 different bitrates
  6.2. 2 stereo frame-blocks of the same bitrate
  6.3. 4 mono frames interleaved
7. Payload Format Parameters
  7.1. Media Type Definition
  7.2. Mapping to SDP
    7.2.1. Offer/Answer Considerations
    7.2.2. Declarative SDP Considerations
8. IANA Considerations
9. Congestion Control
10. Security Considerations
  10.1. Confidentiality
  10.2. Authentication and Integrity
11. Acknowledgements
12. References
  12.1. Informative References
  12.2. Normative References
Authors' Addresses
Intellectual Property and Copyright Statements
1. Introduction
This document specifies the payload format for packetization of the
G.719 full-band (FB) codec encoded audio signals into the Real-time
Transport Protocol (RTP) [RFC3550]. The payload format supports
transmission of multiple channels, multiple frames per payload, and
packet loss robustness methods using redundancy or interleaving.
This document starts with conventions, a brief description of the
codec, and the payload format's capabilities. The payload format is
specified in Section 5. Examples can be found in Section 6. The
media type, its mapping to SDP, and its usage in SDP offer/answer are
then specified. The document ends with considerations around
congestion control and security.
2. Definitions and Conventions
The term "frame-block" is used in this document to describe the time-
synchronized set of audio frames in a multi-channel audio session.
In particular, in an N-channel session, a frame-block will contain N
audio frames, one from each of the channels, and all N audio frames
represent exactly the same time period.
This document contains depictions of bit fields. The most
significant bit is always leftmost in the figure on each row and has
the lowest enumeration. For fields that are depicted over multiple
rows, the upper row is more significant than the next.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
3. G.719 Description
The ITU-T G.719 full-band codec is a transform coder based on the
Modulated Lapped Transform (MLT). G.719 is a low-complexity,
full-bandwidth codec for conversational speech and audio coding. The
encoder input and decoder output are sampled at 48 kHz. The codec
enables full-bandwidth encoding, from 20 Hz to 20 kHz, of speech,
music, and general audio content at rates from 32 kbit/s up to 128
kbit/s. The codec operates on 20 ms frames and has an algorithmic
delay of 40 ms.
The codec provides excellent quality for speech, music and other
types of audio. Some of the applications for which this coder is
suitable are:
o Real-time communications such as video conferencing and telephony.
o Streaming audio
o Archival and messaging
The encoding and decoding algorithm can change the bit rate at any
20 ms frame boundary. The encoder receives the audio sampled at
48 kHz. Other sampling rates can be supported by resampling the
input signal to the codec's sampling rate of 48 kHz; however, this
functionality is not part of the standard.
The encoding is performed on equally sized frames. For each frame,
the encoder decides between two encoding modes, a transient mode and
a stationary mode. The decision is based on statistics derived from
the input signal. The stationary mode uses a long MLT that leads to
a spectrum of 960 coefficients while the transient encoding mode uses
a short MLT (higher time resolution transform) which results in 4
spectra (4 x 240 = 960 coefficients). The encoding of the spectrum
is done in two steps. First, the spectral envelope is computed,
quantized, and Huffman encoded. The envelope is computed on a
non-uniform frequency subdivision. From the coded spectral envelope,
a weighted spectral envelope is derived and used for bit allocation.
This process is repeated at the decoder, so only the spectral
envelope needs to be transmitted. Second, the output of the bit
allocation is used to quantize the spectra. In addition, for
stationary frames the encoder estimates the noise level. The decoder
applies the reverse operations upon reception of the bit stream.
Coefficients that were not coded (i.e., no bits allocated) are
replaced by entries of a noise codebook, which is built from the
decoded coefficients.
4. Payload format Capabilities
This payload format has a number of capabilities, which this section
discusses in some detail.
4.1. Multi-rate Encoding and Rate Adaptation
G.719 supports multi-rate encoding, which allows the encoding rate to
vary on a per-frame basis. This enables support for bit-rate
adaptation and congestion control. The possibility to
aggregate multiple audio frames into a single RTP payload is another
dimension of adaptation. The RTP and payload format overhead can
thus be reduced by the aggregation at the cost of increased delay and
reduced packet-loss robustness.
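As a rough illustration of this trade-off, the sketch below computes
the share of each packet spent on headers when 1, 2 or 4 frames of
32 kbit/s audio are aggregated. It assumes an IPv4/UDP/RTP header
stack of 20 + 8 + 12 octets (no IP options, RTP header extensions or
CSRC entries) and a single two-octet basic-mode ToC entry as described
in Section 5.3. The function and constant names are illustrative and
not part of this specification.

```python
# Assumed header sizes in octets: IPv4 without options, UDP, and RTP
# without CSRC entries or header extensions.
IP_UDP_RTP_HEADERS = 20 + 8 + 12
TOC_ENTRY = 2    # one basic-mode ToC entry (Section 5.3)
FRAME_MS = 20    # G.719 frame duration in milliseconds


def overhead_fraction(frame_bytes: int, frames_per_packet: int) -> float:
    """Fraction of the packet that is headers and ToC rather than audio."""
    overhead = IP_UDP_RTP_HEADERS + TOC_ENTRY
    total = overhead + frame_bytes * frames_per_packet
    return overhead / total


for n in (1, 2, 4):
    share = overhead_fraction(80, n)  # 80 octets per frame = 32 kbit/s
    print(f"{n} frame(s), {n * FRAME_MS} ms of audio: {share:.1%} overhead")
```

Each additional aggregated frame adds 20 ms of packetization delay,
and a single lost packet then removes that many more frames.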
4.2. Support for Multi-Channel Sessions
The RTP payload format defined in this document supports multi-
channel audio content (e.g. stereophonic or surround audio sessions).
Although the G.719 codec itself does not support encoding of multi-
channel audio content into a single bit stream, it can be used to
separately encode and decode each of the individual channels. To
transport (or store) the separately encoded multi-channel content,
the audio frames for all channels that are framed and encoded for the
same 20 ms period are logically collected in a "frame-block".
At the session setup, out-of-band signaling must be used to indicate
the number of channels in the payload type. The order of the audio
frames within the frame-block depends on the number of the channels
and follows the definition in Section 4.1 of the RTP/AVP Profile
[RFC3551]. When using SDP for signaling, the number of channels is
specified in the rtpmap attribute.
4.3. Robustness against Packet Loss
The payload format supports several means, including forward error
correction (FEC) and frame interleaving, to increase robustness
against packet loss.
4.3.1. Use of Forward Error Correction (FEC)
Generic forward error correction within RTP is defined, for example,
in RFC 5109 [RFC5109]. Audio redundancy coding is defined in RFC
2198 [RFC2198]. Either scheme can be used to add redundant
information to the RTP packet stream and make it more resilient to
packet losses, at the expense of a higher bit rate. See either RFC
for a discussion of the implications of the higher bit rate for
network congestion.
In addition to these media-unaware mechanisms, this memo specifies an
optional G.719 specific form of audio redundancy coding, which may be
beneficial in terms of packetization overhead. Conceptually,
previously transmitted transport frames are aggregated together with
new ones. A sliding window can be used to group the frames to be
sent in each payload. However, irregular or non-consecutive patterns
are also possible by inserting NO_DATA frames between primary and
redundant transmissions. Figure 1 below shows an example.
--+--------+--------+--------+--------+--------+--------+--------+--
  | f(n-2) | f(n-1) |  f(n)  | f(n+1) | f(n+2) | f(n+3) | f(n+4) |
--+--------+--------+--------+--------+--------+--------+--------+--
  <---- p(n-1) ---->
           <----- p(n) ----->
                    <---- p(n+1) ---->
                             <---- p(n+2) ---->
                                      <---- p(n+3) ---->
                                               <---- p(n+4) ---->
Figure 1: An example of redundant transmission
Here, each frame is retransmitted once in the following RTP payload
packet. f(n-2)...f(n+4) denote a sequence of audio frames, and
p(n-1)...p(n+4) a sequence of payload packets.
The mechanism described does not strictly require signaling at
session setup. However, signaling has been defined to allow the
sender to voluntarily bound the buffering and delay requirements.
If nothing is signaled, the use of this mechanism is allowed and
unbounded. For a certain timestamp, the receiver may receive
multiple copies of a frame containing encoded audio data, even at
different encoding rates. The cost of this scheme is bandwidth and
the receiver delay necessary to allow the redundant copy to arrive.
This redundancy scheme provides a functionality similar to the one
described in RFC 2198, but it works only if both original frames and
redundant representations are G.719 frames. When the use of other
media coding schemes is desirable, one has to resort to RFC 2198.
The sender is responsible for selecting an appropriate amount of
redundancy based on feedback about the channel conditions, e.g., in
the RTP Control Protocol (RTCP) [RFC3550] receiver reports. The
sender is also responsible for avoiding congestion, which may be
exacerbated by redundancy (see Section 9 for more details).
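The sliding-window grouping of Figure 1 can be sketched as follows.
This is a minimal illustration that works on frame indices only; the
function names and the "redundancy" argument are hypothetical and do
not correspond to any signaled parameter.

```python
from typing import Iterable, List, Set


def redundant_packets(num_frames: int, redundancy: int = 1) -> List[List[int]]:
    """Packet k carries frames k-redundancy..k, oldest first (cf. Figure 1)."""
    return [list(range(max(0, k - redundancy), k + 1))
            for k in range(num_frames)]


def frames_received(packets: List[List[int]], lost: Iterable[int]) -> Set[int]:
    """Frame indices still available when the packets in `lost` are dropped."""
    lost_set = set(lost)
    received: Set[int] = set()
    for k, packet in enumerate(packets):
        if k not in lost_set:
            received.update(packet)
    return received


packets = redundant_packets(5)           # one redundant copy per frame
after_single_loss = frames_received(packets, lost=[2])
after_double_loss = frames_received(packets, lost=[2, 3])
```

With one redundant copy, an isolated packet loss leaves every frame
recoverable, whereas the loss of two consecutive packets still loses
the frame they share.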
4.3.2. Use of Frame Interleaving
To decrease protocol overhead, the payload design allows several
audio transport frames to be encapsulated into a single RTP packet.
One of the drawbacks of such an approach is that in case of packet
loss several consecutive frames are lost. Consecutive frame loss
normally renders error concealment less efficient and usually causes
clearly audible and annoying distortions in the reconstructed audio.
Interleaving of transport frames can improve the audio quality in
such cases by distributing the consecutive losses into a number of
isolated frame losses, which are easier to conceal. However,
interleaving and bundling several frames per payload also increases
end-to-end delay and sets higher buffering requirements. Therefore,
interleaving is not appropriate for all use cases or devices.
Streaming applications should most likely be able to exploit
interleaving to improve audio quality in lossy transmission
conditions.
Note that this payload design supports the use of frame interleaving
as an option. The usage of this feature needs to be negotiated in
the session setup.
The interleaving supported by this format is rather flexible. For
example, a continuous pattern can be defined, as depicted in
Figure 2.
--+--------+--------+--------+--------+--------+--------+--------+--
  | f(n-2) | f(n-1) |  f(n)  | f(n+1) | f(n+2) | f(n+3) | f(n+4) |
--+--------+--------+--------+--------+--------+--------+--------+--
           [ p(n)   ]
  [ p(n+1) ]                 [ p(n+1) ]
                    [ p(n+2) ]                 [ p(n+2) ]
                                      [ p(n+3) ]
                                                        [ p(n+4) ]
Figure 2: An example of interleaving pattern that has constant delay
In Figure 2 the consecutive frames, denoted f(n-2) to f(n+4), are
aggregated into packets p(n) to p(n+4), each packet carrying two
frames. This approach provides an interleaving pattern that allows
for constant delay in both the interleaving and de-interleaving
processes. The de-interleaving buffer needs to have room for at
least three frames, including the one that is ready to be consumed.
The storage space for three frames is needed, for example, when f(n)
is the next frame to be decoded: since frame f(n) was received in
packet p(n+2), which also carried frame f(n+3), both these frames are
stored in the buffer. Furthermore, frame f(n+1) received in the
previous packet, p(n+1), is also in the de-interleaving buffer. Note
also that in this example the buffer occupancy varies: when frame
f(n+1) is the next one to be decoded, there are only two frames,
f(n+1) and f(n+3), in the buffer.
5. Payload format
The main purpose of the payload design for G.719 is to exploit the
full potential of the codec with as little overhead as possible. In
the design both basic and interleaved modes
have been included as the codec is suitable both for conversational
and other low delay applications as well as streaming, where more
delay is acceptable.
The main structural difference between the basic and interleaved
modes is the extension of the table of content entries with frame
displacement fields in the interleaved mode. The basic mode supports
aggregation of multiple consecutive frames in a payload. The
interleaved mode supports aggregation of multiple frames that are
non-consecutive in time. In both modes it is possible to have frames
encoded with different frame types in the same payload.
The payload format also supports the usage of G.719 for carrying
multi-channel content using one discrete encoder per channel, all
using the same bit-rate. In this case a complete frame-block with
data from all channels is included in the RTP payload. The data is
the concatenation of all the encoded audio frames in the order
specified for that number of included channels. Interleaving is
also done on complete frame-blocks rather than individual audio
frames.
5.1. RTP Header Usage
The RTP timestamp corresponds to the sampling instant of the first
sample encoded for the first frame-block in the packet. The
timestamp clock frequency SHALL be 48000 Hz. The timestamp is also
used to recover the correct decoding order of the frame-blocks.
The RTP header marker bit (M) SHALL be set to 1 whenever the first
frame-block carried in the packet is the first frame-block in a
talkspurt (see definition of the talkspurt in section 4.1 of
[RFC3551]). For all other packets the marker bit SHALL be set to
zero (M=0).
The assignment of an RTP payload type for the format defined in this
memo is outside the scope of this document. The RTP profiles
currently in use mandate binding the payload type dynamically for
this payload format. This is necessary because the payload type
expresses the configuration of the payload itself, i.e., basic or
interleaved mode and the number of channels carried.
The remaining RTP header fields are used as specified in RFC 3550
[RFC3550].
5.2. Payload Structure
The payload consists of one or more table of contents (ToC) entries
followed by the audio data corresponding to the ToC entries. The
following sections describe both the basic mode and the interleaved
mode. Each ToC entry MUST be padded to a byte boundary to ensure
octet alignment. The rules regarding maximum payload size given in
Section 3.2 of [I-D.ietf-tsvwg-udp-guidelines] SHOULD be followed.
5.2.1. Basic ToC element
All the different formats and modes in this draft use a common basic
ToC which may be extended in the different options described below.
 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+
|F|    L    |R|R|
+-+-+-+-+-+-+-+-+
Figure 3: Basic ToC element
F (1 bit): If set to 1, indicates that this ToC entry is followed by
another ToC entry; if set to 0, indicates that this ToC entry is
the last one in the ToC.
L (5 bits): A field that gives the frame length of each individual
frame within the frame-block.
L        length(bytes)
============================
0        0 NO_DATA
1-7      N/A (reserved)
8-22     80+10*(L-8)
23-27    240+20*(L-23)
28-31    N/A (reserved)
Figure 4: How to map L values to frame lengths
L=0 (NO_DATA) is used to indicate an empty frame. This is useful
if frames are missing, e.g., at re-packetization, or to insert gaps
when sending redundant frames together with primary frames in the
same payload.
The value ranges [1..7] and [28..31] inclusive are reserved for
future use in this draft version. If these values occur in a ToC,
the entire packet SHOULD be treated as invalid and discarded.
A few examples are given below where the frame size and the
corresponding codec bitrate are computed based on the value L.
L     Bytes   Codec Bitrate(kbps)
===================================
8     80      32
9     90      36
10    100     40
12    120     48
16    160     64
22    220     88
23    240     96
25    280     112
27    320     128
Figure 5: Examples of L values and corresponding frame lengths
This encoding yields a granularity of 4 kbps between 32 and 88 kbps
and a granularity of 8 kbps between 88 and 128 kbps, with a defined
range of 32-128 kbps for the codec data.
R (2 bits): Reserved bits. SHALL be set to 0 on sending and SHALL be
ignored on reception.
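The mapping in Figure 4 and the bitrates in Figure 5 can be expressed
as a short sketch. The function names below are illustrative only;
returning None for reserved values is a choice made for this example,
leaving it to the caller to discard the packet.

```python
from typing import Optional

FRAME_MS = 20  # G.719 frame duration in milliseconds


def frame_length_bytes(l_value: int) -> Optional[int]:
    """Map the 5-bit L field to a frame length in octets (Figure 4).

    Returns None for the reserved ranges 1-7 and 28-31.
    """
    if not 0 <= l_value <= 31:
        raise ValueError("L is a 5-bit field")
    if l_value == 0:
        return 0                          # NO_DATA
    if 8 <= l_value <= 22:
        return 80 + 10 * (l_value - 8)
    if 23 <= l_value <= 27:
        return 240 + 20 * (l_value - 23)
    return None                           # reserved


def bitrate_kbps(l_value: int) -> float:
    """Codec bitrate implied by L: octets * 8 bits over a 20 ms frame."""
    length = frame_length_bytes(l_value)
    if length is None:
        raise ValueError("reserved L value")
    return length * 8 / FRAME_MS
```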
5.3. Basic mode
The basic ToC element (Figure 3) is followed by a one-octet field for
the number of frame-blocks (#frames) to form the ToC entry. The
#frames field tells how many frame-blocks of the same length the
ToC entry relates to.
 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+
|    #frames    |
+-+-+-+-+-+-+-+-+
Figure 6: Number of frame-blocks field
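A receiver could walk the basic-mode ToC along the following lines.
This is a minimal sketch, not a reference parser: the function name
is hypothetical, and the handling of reserved L values is left to the
caller.

```python
from typing import List, Tuple


def parse_basic_toc(payload: bytes) -> Tuple[List[Tuple[int, int]], int]:
    """Parse basic-mode ToC entries from the start of an RTP payload.

    Returns ([(L, number_of_frame_blocks), ...], offset_of_audio_data).
    """
    entries: List[Tuple[int, int]] = []
    pos = 0
    while True:
        if pos + 2 > len(payload):
            raise ValueError("truncated ToC")
        first = payload[pos]
        follow = first >> 7             # F: another ToC entry follows
        l_value = (first >> 2) & 0x1F   # L: 5-bit frame length code
        # The two low-order R bits are ignored on reception.
        count = payload[pos + 1]        # #frames
        entries.append((l_value, count))
        pos += 2
        if not follow:
            return entries, pos
```

Applied to the ToC octets 0xA0 0x02 0x30 0x01, the sketch yields two
entries, (8, 2) and (12, 1), with the audio data starting at octet 4.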
5.4. Interleaved mode
The basic ToC is followed by a one-octet field for the number of
frame-blocks (#frames) and then the DIS fields to form a ToC entry in
interleaved mode. The #frames field tells how many frame-blocks
of the same length the ToC entry relates to. The DIS fields, one for
each frame-block indicated by the #frames field, express the
interleaving distance between audio frame-blocks carried in the
payload. If necessary to achieve octet alignment, 4 bits of padding
are added.
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    #frames    |  DIS1 |  ...  |  DISi |  ...  |  DISn |  Padd |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 7: Number of frame-block + interleave fields
DIS1...DISn (4 bits each): A list of n (n=#frames) displacement
fields indicating the displacement of the i:th (i=1..n) audio
frame-block relative to the preceding frame-block in the payload, in
units of 20 ms long audio frame-blocks. The four-bit unsigned
integer displacement values may be between 0 and 15, indicating the
number of audio frame-blocks in decoding order between the (i-1):th
and the i:th frame-block in the payload. Note that for the first ToC
entry of the payload the value of DIS1 is meaningless. It SHALL be
set to zero by a sender, and SHALL be ignored by a receiver. This
frame-block's location in the decoding order is uniquely defined by
the RTP timestamp. Note that for subsequent ToC entries DIS1
indicates the number of frame-blocks between the last frame-block of
the previous group and the first frame-block of this group.
Padd (4 bits): To ensure octet alignment, four padding bits SHALL be
included at the end of the ToC entry in case there is an odd
number of frame-blocks in the group referenced by this ToC entry.
These bits SHALL be set to zero and SHALL be ignored by the
receiver. If a group containing an even number of frame-blocks is
referenced by this ToC entry, these padding bits SHALL NOT be
included in the payload.
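The following sketch shows one way to read an interleaved-mode ToC
entry and to derive the RTP timestamps of its frame-blocks. It
assumes that a displacement of DIS means the next frame-block lies
DIS + 1 frame periods later, which matches the example in Section 6.3,
and it handles only the first ToC entry of a payload. The function
names are hypothetical.

```python
from typing import List, Tuple

TICKS_PER_FRAME = 960  # 20 ms at the 48000 Hz timestamp clock


def parse_interleaved_entry(data: bytes,
                            pos: int = 0) -> Tuple[bool, int, List[int], int]:
    """Parse one interleaved-mode ToC entry starting at data[pos].

    Returns (follow, L, [DIS1..DISn], position after the entry).
    """
    if pos + 2 > len(data):
        raise ValueError("truncated ToC entry")
    follow = bool(data[pos] >> 7)
    l_value = (data[pos] >> 2) & 0x1F
    count = data[pos + 1]
    dis_octets = (count + 1) // 2    # two DIS fields per octet, plus Padd
    end = pos + 2 + dis_octets
    if end > len(data):
        raise ValueError("truncated DIS fields")
    dis = []
    for i in range(count):
        octet = data[pos + 2 + i // 2]
        dis.append(octet >> 4 if i % 2 == 0 else octet & 0x0F)
    return follow, l_value, dis, end


def first_entry_timestamps(rtp_timestamp: int, dis: List[int]) -> List[int]:
    """Timestamps of the frame-blocks in the first ToC entry of a payload."""
    if not dis:
        return []
    stamps = [rtp_timestamp]         # DIS1 of the first entry is ignored
    for d in dis[1:]:
        stamps.append((stamps[-1] + (d + 1) * TICKS_PER_FRAME) % 2**32)
    return stamps
```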
5.5. Audio Data
The audio data part follows the table of contents. All the octets
comprising an audio frame SHALL be appended to the payload as a unit.
For each frame-block the audio frames are concatenated in the order
indicated by the table in Section 4.1 of [RFC3551] for the number of
channels configured for the payload type in use. So the first
channel (leftmost) indicated comes first, followed by the next
channel. The audio frame-blocks are packetized in increasing
timestamp order within each group of frame-blocks (per ToC entry),
i.e. oldest frame-block first. The groups of frame-blocks are
packetized in the same order as their corresponding ToC entries.
The audio frames are specified in ITU recommendation [ITU-T-G719].
The G.719 bit stream is split into a sequence of octets and
transmitted in order from the leftmost (most significant, MSB) bit to
the rightmost (least significant, LSB) bit.
5.6. Implementation Considerations
An application implementing this payload format MUST understand all
the payload parameters specified in this specification. Any mapping
of the parameters to a signaling protocol MUST support all
parameters. So an implementation of this payload format in an
application using SDP is required to understand all the payload
parameters in their SDP-mapped form. This requirement ensures that
an implementation always can decide whether it is capable of
communicating when the communicating entities support this version of
the specification.
Basic mode SHALL be implemented and the interleaved mode SHOULD be
implemented. The implementation burden of both is rather small, and
supporting both ensures interoperability. However, interleaving is
not mandated as it has limited applicability for conversational
applications that require tight delay boundaries.
5.6.1. Receiving Redundant Frames
The reception of redundant audio frames, i.e. more than one audio
frame from the same source for the same time slot, MUST be supported
by the implementation. If the receiver gets multiple audio frames
at different bit-rates for the same time slot, it is RECOMMENDED
that the receiver keep the one with the highest bit-rate.
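A receiver could apply this recommendation as sketched below. Because
every G.719 frame covers 20 ms, the longest copy received for a
timestamp is also the one with the highest bit-rate, and a NO_DATA
frame (length 0) never replaces real audio. The function name and the
(timestamp, frame) tuple layout are assumptions made for this example.

```python
from typing import Dict, Iterable, Tuple


def select_frames(received: Iterable[Tuple[int, bytes]]) -> Dict[int, bytes]:
    """Keep, per RTP timestamp, the longest (highest bit-rate) frame copy."""
    best: Dict[int, bytes] = {}
    for timestamp, frame in received:
        if timestamp not in best or len(frame) > len(best[timestamp]):
            best[timestamp] = frame
    return best


copies = [
    (960, bytes(80)),     # primary copy at 32 kbit/s
    (960, bytes(160)),    # redundant copy at 64 kbit/s
    (1920, bytes(120)),   # only copy for this time slot
    (1920, b""),          # NO_DATA placeholder for the same slot
]
chosen = select_frames(copies)
```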
5.6.2. Interleaving
The use of interleaving requires further considerations. As
presented in the example in Section 4.3.2, a given interleaving
pattern requires a certain amount of de-interleaving buffer space.
This buffer space, expressed in a number of transport frame slots, is
indicated by the "interleaving" media type parameter. The number of
frame slots needed can be converted into actual memory requirements
by considering the 320 bytes per frame used by the highest bit-rate
of G.719.
The information about the frame buffer size is not always sufficient
to determine when it is appropriate to start consuming frames from
the interleaving buffer. Additional information is needed when the
interleaving pattern changes. The "int-delay" media type parameter
is defined to convey this information. It allows a sender to
indicate the minimal media time that needs to be present in the
buffer before the decoder can start consuming frames from the buffer.
Because the sender has full control over the interleaving pattern, it
can calculate this value. In certain cases (for example, if joining
a multicast session with interleaving mid-session), a receiver may
initially receive only part of the packets in the interleaving
pattern. This initial partial reception (in frame sequence order) of
frames can yield too few frames for acceptable quality from the audio
decoding. This problem also arises when encryption is used for
access control and the receiver does not have the previous key.
Although G.719 is robust and thus tolerant of a high random frame
erasure rate, it would have difficulties handling consecutive frame
losses at startup. Thus, some special implementation considerations
are described below.
In order to handle this type of startup efficiently, decoding can
start provided that:
1. There are at least two consecutive frames available.
2. At least half of the frames are available in the time period
between the point where decoding was planned to start and the most
forward received frame.
After receiving a number of packets, in the worst case as many
packets as the interleaving pattern covers, the previously described
effects disappear and normal decoding is resumed. Similar issues
arise when a receiver leaves a session or has lost access to the
stream. If the receiver leaves the session, this would be a minor
issue since playout is normally stopped. The sender can avoid this
type of problem in many sessions by starting and ending interleaving
patterns correctly when risks of losses occur. One such example is a
key-change done for access control to encrypted streams. If only
some keys are provided to clients and there is a risk that they
receive content for which they do not have the key, it is
recommended that interleaving patterns do not overlap key changes.
5.6.3. Decoding Validation
If the receiver finds a mismatch between the size of a received
payload and the size indicated by the ToC of the payload, the
receiver SHOULD discard the packet. This is recommended because
decoding a frame parsed from a payload based on erroneous ToC data
could severely degrade the audio quality.
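The size check can be sketched as follows. The example assumes that
the ToC has already been parsed into (L, #frames) pairs, that the
payload length excludes any RTP padding, and that the channel count
comes from the session signaling. The function names are
hypothetical.

```python
from typing import Iterable, Optional, Tuple


def frame_length_bytes(l_value: int) -> Optional[int]:
    """Frame length in octets for an L value, None if reserved (Figure 4)."""
    if l_value == 0:
        return 0
    if 8 <= l_value <= 22:
        return 80 + 10 * (l_value - 8)
    if 23 <= l_value <= 27:
        return 240 + 20 * (l_value - 23)
    return None


def payload_is_consistent(payload_len: int, toc_len: int,
                          entries: Iterable[Tuple[int, int]],
                          channels: int = 1) -> bool:
    """True if the payload size equals the ToC plus the audio it announces."""
    audio = 0
    for l_value, count in entries:
        length = frame_length_bytes(l_value)
        if length is None:
            return False             # reserved L value: treat as invalid
        audio += length * count * channels
    return payload_len == toc_len + audio
```

For the mono example in Section 6.1, a consistent payload is 4 + 2*80
+ 120 = 284 octets; for the stereo example in Section 6.2 it is 2 +
2*2*80 = 322 octets.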
6. Payload Examples
A few examples that illustrate the payload format follow.
6.1. 3 mono frames with 2 different bitrates
The first example is a payload consisting of 3 mono frames where the
first 2 frames correspond to a bitrate of 32 kbps (80 bytes/frame)
and the last to 48 kbps (120 bytes/frame).
The first 32 bits are ToC fields.
Bit 0 is '1' as another ToC entry follows.
Bits 1..5 are 01000 = 80 bytes/frame
Bits 8..15 are 00000010 = 2 frame-blocks with 80 bytes/frame
Bit 16 is '0', no more ToC entries follow
Bits 17..21 are 01100 = 120 bytes/frame
Bits 24..31 are 00000001 = 1 frame-block with 120 bytes/frame
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|1|0 1 0 0 0|0 0|0 0 0 0 0 0 1 0|0|0 1 1 0 0|0 0|0 0 0 0 0 0 0 1|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|d(0)                       frame 1                             |
.                                                               .
|                                                         d(639)|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|d(0)                       frame 2                             |
.                                                               .
|                                                         d(639)|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|d(0)                       frame 3                             |
.                                                               .
|                                                         d(959)|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
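The ToC octets of this example can be produced with a small builder
such as the one sketched below. The function name and the list of
(L, #frames) pairs are illustrative and not part of the format.

```python
from typing import List, Tuple


def basic_toc(groups: List[Tuple[int, int]]) -> bytes:
    """Build basic-mode ToC octets from [(L, number_of_frame_blocks), ...]."""
    out = bytearray()
    for i, (l_value, count) in enumerate(groups):
        if not 0 <= l_value <= 31:
            raise ValueError("L is a 5-bit field")
        if not 0 <= count <= 255:
            raise ValueError("#frames is a one-octet field")
        follow = 1 if i < len(groups) - 1 else 0     # F bit
        out.append((follow << 7) | (l_value << 2))   # R bits stay 0
        out.append(count)
    return bytes(out)


# Two frames with L=8 (80 octets) followed by one frame with L=12
# (120 octets), as in the bit listing above.
toc = basic_toc([(8, 2), (12, 1)])
```

The four octets are 0xA0 0x02 0x30 0x01, matching the first row of
the diagram.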
6.2. 2 stereo frame-blocks of the same bitrate
A payload consisting of 2 stereo frame-blocks corresponding to a
bitrate of 32 kbps (80 bytes/frame) per channel. The receiver
calculates the number of audio frames in the payload by multiplying
the value of the channels parameter (2) by the #frames field value
(2), which gives 4 audio frames.
The first 16 bits are the ToC entry.
Bit 0 is '0' as no further ToC entry follows.
Bits 1..5 are 01000 = 80 bytes/frame
Bits 8..15 are 00000010 = 2 frame-blocks with 80 bytes/frame
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0|0 1 0 0 0|0 0|0 0 0 0 0 0 1 0|d(0)     frame 1 left ch.      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
.                                                               .
|                         d(639)|d(0)     frame 1 right ch.     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
.                                                               .
|                         d(639)|d(0)     frame 2 left ch.      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
.                                                               .
|                         d(639)|d(0)     frame 2 right ch.     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         d(639)|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
6.3. 4 mono frames interleaved
A payload consisting of 4 mono frames corresponding to a bitrate of
32 kbps (80 bytes/frame), interleaved. The example below uses an
interleaving pattern that gives constant delay when aggregating 4
frames. The packet illustrated is packet n; the frame-block content
of the previous and following packets is shown to illustrate the
pattern.
Packet n-3: 1, 6, 11, 16
Packet n-2: 5, 10, 15, 20
Packet n-1: 9, 14, 19, 24
Packet n: 13, 18, 23, 28
Packet n+1: 17, 22, 27, 32
Packet n+2: 21, 26, 31, 36
The first 32 bits are the ToC entry.
Bit 0 is '0' as no further ToC entry follows.
Bits 1..5 are 01000 = 80 bytes/frame
Bits 8..15 are 00000100 = 4 frame-blocks with 80 bytes/frame
Bits 16..19 are 0000 = DIS1 (0)
Bits 20..23 are 0100 = DIS2 (4)
Bits 24..27 are 0100 = DIS3 (4)
Bits 28..31 are 0100 = DIS4 (4)
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0|0 1 0 0 0|0 0|0 0 0 0 0 1 0 0|0 0 0 0|0 1 0 0|0 1 0 0|0 1 0 0|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  d(0)                    frame 13                             |
.                                                               .
|                                                         d(639)|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  d(0)                    frame 18                             |
.                                                               .
|                                                         d(639)|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  d(0)                    frame 23                             |
.                                                               .
|                                                         d(639)|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  d(0)                    frame 28                             |
.                                                               .
|                                                         d(639)|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
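To make the bit positions in this example concrete, the following
Python sketch decodes the first 32 bits of the illustrated packet.
The function name and the dictionary keys are hypothetical. The
sketch assumes one 4-bit DIS field per frame-block, packed two per
octet, as in the example, and returns the frame-length code as-is
rather than converting it to a byte count.

```python
def parse_example_header(payload):
    """Decode the ToC field and the DIS fields that follow it."""
    toc = int.from_bytes(payload[:2], "big")
    more_toc = toc >> 15               # bit 0
    length_code = (toc >> 10) & 0x1F   # bits 1..5
    frame_blocks = toc & 0xFF          # bits 8..15
    # Assumption: one 4-bit DIS field per frame-block, high nibble first.
    dis = []
    for i in range(frame_blocks):
        octet = payload[2 + i // 2]
        dis.append(octet >> 4 if i % 2 == 0 else octet & 0x0F)
    return {
        "more_toc": more_toc,
        "length_code": length_code,
        "frame_blocks": frame_blocks,
        "dis": dis,
    }


# The four header octets drawn in the figure above.
header = bytes([0b00100000, 0b00000100, 0b00000100, 0b01000100])
fields = parse_example_header(header)
```

For the header shown, this yields a length code of 8 (01000), 4
frame-blocks, and DIS values 0, 4, 4, 4.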
7. Payload Format Parameters
This RTP payload format is identified using the media type audio/g719
which is registered in accordance with [RFC4855] and using the
template of [RFC4288].
7.1. Media Type Definition
The media type for the G.719 codec is allocated from the IETF tree
since G.719 has the potential to become a widely used audio codec in
general VoIP, teleconferencing, and streaming applications.
This media type registration covers real-time transfer via RTP.
Note, any unspecified parameter MUST be ignored by the receiver to
ensure that additional parameters can be added in any future revision
of this specification.
Type name: audio
Subtype name: g719
Required parameters: none
Optional parameters:
interleaving: Indicates that interleaved mode SHALL be used for the
payload. The parameter specifies the number of frame-block slots
available in a de-interleaving buffer (including the frame that is
ready to be consumed). Its value is equal to one plus the maximum
number of frames that can precede any frame in transmission order
and follow the frame in RTP timestamp order. The value MUST be
greater than zero. If this parameter is not present, interleaved
mode SHALL NOT be used.
int-delay: The minimal media time delay in milliseconds that is
needed to avoid underrun in the de-interleaving buffer before
starting decoding, i.e., the difference in RTP timestamp ticks
between the earliest and latest audio frame present in the
de-interleaving buffer expressed in milliseconds. The value is a
stream property and provided per source. The allowed values are 0
to the largest value expressible by an unsigned 16-bit integer
(65535). Note that, in practice, the largest value that can be
used is equal to the declared size of the interleaving buffer of
the receiver. If the value for some reason is larger than the
receiver buffer declared by or for the receiver, this value
defaults to the size of the receiver buffer. For sources for
which this value has not been provided, the value defaults to the
size of the receiver buffer. The format is a comma-separated list
of SSRC ":" delay in ms pairs, which in ABNF [RFC5234] is expressed
as:
int-delay    = "int-delay:" source-delay *("," source-delay)
source-delay = SSRC ":" delay-value
SSRC         = 1*8HEXDIG ; The 32-bit SSRC encoded in hex format
delay-value  = 1*5DIGIT  ; The delay value in milliseconds
Example: int-delay=ABCD1234:1000,4321DCB:640
NOTE: No white space is allowed in the parameter before the end of
all the value pairs.
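As a non-normative illustration, the value part of the parameter
(the text after "int-delay=" in the example above) could be parsed
as sketched below in Python. The function names are hypothetical.
The second helper applies the defaulting rules described above and
assumes the caller already knows the receiver buffer size expressed
in milliseconds; how that size is derived from the "interleaving"
value is not covered by this sketch.

```python
import re

# One source-delay pair: 1 to 8 hex digits, ":", 1 to 5 decimal digits.
_PAIR = re.compile(r"([0-9A-Fa-f]{1,8}):([0-9]{1,5})")


def parse_int_delay(value):
    """Parse 'SSRC:delay,SSRC:delay' into a {ssrc: delay_ms} dict."""
    delays = {}
    for item in value.split(","):
        match = _PAIR.fullmatch(item)
        if match is None:  # also rejects embedded white space
            raise ValueError(f"malformed source-delay: {item!r}")
        delay = int(match.group(2))
        if delay > 65535:
            raise ValueError(f"delay out of range: {delay}")
        delays[int(match.group(1), 16)] = delay
    return delays


def effective_int_delay(delays, ssrc, buffer_ms):
    """Missing or oversized values fall back to the receiver buffer size.

    buffer_ms is an assumed input: the receiver buffer size in ms.
    """
    declared = delays.get(ssrc)
    if declared is None or declared > buffer_ms:
        return buffer_ms
    return declared


delays = parse_int_delay("ABCD1234:1000,4321DCB:640")
```

With a receiver buffer corresponding to 800 ms, the first source in
the example would be treated as 800 ms and the second as 640 ms.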
max-red: The maximum duration in milliseconds that elapses between
the primary (first) transmission of a frame and any redundant
transmission that the sender will use. This parameter allows a
receiver to have a bounded delay when redundancy is used. Allowed
values are between 0 (no redundancy will be used) and 65535. If
the parameter is omitted, no limitation on the use of redundancy
is present.
channels: The number of audio channels. The possible values (1-6)
and their respective channel order are specified in Section 4.1 of
[RFC3551]. If omitted, it has the default value of 1.
CBR: Constant Bit Rate (CBR), indicates the exact codec bitrate in
bits per second (not including the overhead from packetization,
RTP header, or lower layers) that the codec MUST use. CBR is to be
used when dynamic rate cannot be supported (e.g., in a gateway to
H.320). CBR is mostly used for gateways to circuit-switched
networks. Therefore, the CBR rate is the rate not including any
FEC as specified in Section 4.3.1. If FEC is to be used, the b=
parameter MUST be used to allow the extra bit rate needed to send
the redundant information. It is RECOMMENDED that this parameter
only be used when necessary to establish a working communication.
The usage of this parameter has implications on congestion control
that need to be considered; see Section 9.
ptime: see [RFC4566].
maxptime: see [RFC4566].
Encoding considerations:
This media type is framed and binary; see Section 4.8 of
[RFC4288].
Security considerations:
See Section 10 of RFC XXXX.
Interoperability considerations:
The support of the interleaving mode is not mandatory and needs to
be negotiated. See Section 7.2 for how to do that for SDP-based
protocols.
Published specification:
RFC XXXX
Applications that use this media type:
Real-time audio applications like voice over IP and
teleconference, and multi-media streaming.
Additional information: none
Person & email address to contact for further information:
Payload format: Ingemar Johansson
<ingemar.s.johansson@ericsson.com>
Intended usage: COMMON
Restrictions on usage:
This media type depends on RTP framing, and hence is only defined
for transfer via RTP [RFC3550]. Transport within other framing
protocols is not defined at this time.
Author:
Ingemar Johansson <ingemar.s.johansson@ericsson.com>
Magnus Westerlund <magnus.westerlund@ericsson.com>
Change controller:
IETF Audio/Video Transport working group delegated from the IESG.
Additional Information:
File storage of G.719 encoded audio in ISO base media file format
is specified in Annex A of [ITU-T-G719]. Thus media file formats
such as MP4 (audio/mp4 or video/mp4) [RFC4337] and 3GP (audio/3GPP
and video/3GPP) [RFC3839] can contain G.719 encoded audio.
7.2. Mapping to SDP
The information carried in the media type specification has a
specific mapping to fields in the Session Description Protocol (SDP)
[RFC4566], which is commonly used to describe RTP sessions. When SDP
is used to specify sessions employing the G.719 codec, the mapping is
as follows:
o The media type ("audio") goes in SDP "m=" as the media name.
o The media subtype (payload format name) goes in SDP "a=rtpmap" as
the encoding name. The RTP clock rate in "a=rtpmap" MUST be
48000, and the encoding parameter "channels" (Section 7.1) MUST
either be explicitly set to N or omitted, implying a default value
of 1. The values of N that are allowed are specified in Section
4.1 in [RFC3551].
o The parameters "ptime" and "maxptime" go in the SDP "a=ptime" and
"a=maxptime" attributes, respectively.
o Any remaining parameters go in the SDP "a=fmtp" attribute by
copying them directly from the media type parameter string as a
semicolon-separated list of parameter=value pairs.
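The mapping rules above can be sketched in Python as follows. This
is an illustration under stated assumptions, not a complete SDP
generator: the function name, the payload type 96, the port 49170,
and the RTP/AVP profile are all chosen for the example, and the
encoding name is written exactly as the registered subtype.

```python
def g719_sdp_lines(payload_type, port, channels=1, ptime=None,
                   maxptime=None, fmtp=None):
    """Render the media-level SDP lines for one G.719 payload type."""
    # The media type "audio" goes in "m=" as the media name.
    lines = [f"m=audio {port} RTP/AVP {payload_type}"]
    # The subtype goes in "a=rtpmap"; the clock rate is always 48000.
    rtpmap = f"a=rtpmap:{payload_type} g719/48000"
    if channels != 1:          # omitted means the default of 1 channel
        rtpmap += f"/{channels}"
    lines.append(rtpmap)
    # Remaining parameters are copied into "a=fmtp" as name=value pairs.
    if fmtp:
        pairs = "; ".join(f"{name}={value}" for name, value in fmtp.items())
        lines.append(f"a=fmtp:{payload_type} {pairs}")
    # "ptime" and "maxptime" map to their own attributes.
    if ptime is not None:
        lines.append(f"a=ptime:{ptime}")
    if maxptime is not None:
        lines.append(f"a=maxptime:{maxptime}")
    return lines


sdp = g719_sdp_lines(96, 49170, channels=2, ptime=20,
                     fmtp={"interleaving": 8, "max-red": 0})
print("\n".join(sdp))
```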
7.2.1. Offer/Answer Considerations
The following considerations apply when using SDP Offer-Answer
procedures to negotiate the use of G.719 payload in RTP:
o Each combination of the RTP payload transport format configuration
parameters (interleaving, and channels) is unique in its bit-
pattern and not compatible with any other combination. When
creating an offer in an application desiring to use the more
advanced features (interleaving, or more than one channel), the
offerer is RECOMMENDED to also offer a payload type containing
only the configuration with a single channel. If multiple
configurations are of interest to the application, they may all be
offered; however, care should be taken not to offer too many
payload types. An SDP answerer MUST include, in the SDP answer
for a payload type, the following parameters unmodified from the
SDP offer (unless it removes the payload type): "interleaving";
and "channels". However, the value of the Interleaving parameter
MAY be changed. The SDP offerer and answerer MUST generate G.719
packets as described by these parameters.
o The values of the "interleaving" and "int-delay" parameters have a
specific relationship that needs to be considered. The
relationship also depends on the directionality of the streams and
their delivery method. At a high level, as can be understood from
the definitions, the value of "interleaving" declares the size of
the receiver buffer, while "int-delay" is a stream property
provided by the sender to inform how much buffer space it actually
uses for the stream it sends.
* For media streams that are sent over multicast, the value of
"interleaving" SHALL NOT be changed by the answerer. It shall
either be accepted or the payload type deleted. The value of
the "int-delay" parameter is a stream property and provided by
the offer/answer agent that intends to send media with this
payload type, and for each stream coming from that agent (one
or more). The value MUST be between 0 and what corresponds to
the buffer size declared by the value of the "interleaving"
parameter.
* For unicast streams that the offerer declares as sendonly, the
value of the "interleaving" parameter is the size that the
offerer proposes and that the answerer is RECOMMENDED to use.
The answerer MAY change it to any allowed value. The int-delay
parameter value will be the one the offerer intends to use
unless the answerer reduces the value of the interleaving
parameter below what is needed for that int-delay value. If the
interleaving value in the answer is smaller than the offer's
int-delay, the int-delay value is by default reduced to
correspond to the interleaving value. If the offerer is not
satisfied with this, it will need to perform another round of
offer/answer. As the answerer will not send any media, it does
not include any int-delay in the answer.
* For unicast streams that the offerer declares as recvonly, the
value of interleaving in the offer will be the offerer's size
of the interleaving buffer. The answerer indicates its
preferred size of the interleaving buffer for any future round
of offer/answer. The offerer will not provide any int-delay
parameter as it is not sending any media. The answerer is
recommended to include an int-delay parameter in its answer to
declare what the property is for the stream it is going to
send. As it already knows the receiver's interleaving buffer
size, there should be no issue with providing a value that is
between 0 and what corresponds to a full de-interleaving buffer.
* For unicast streams that the offerer declares as sendrecv, the
value of the interleaving parameter in the offer will be the
offerer's size of the interleaving buffer. The answerer will
in the answer indicate the size of its actual interleaving
buffer. It is recommended that this value is at least as big
as the offer's. The offerer is recommended to include an
int-delay parameter that is selected on the assumption that the
answerer has at least as much interleaving space as the
offerer, unless something else is known. As the answerer's
interleaving buffer size is not yet known, this may fail, in
which case the default rule is to downgrade the value of the
int-delay to correspond to the full size of the answerer's
interleaving buffer. If the offerer is not satisfied with this,
it will need to initiate another round of offer/answer. The
answerer is recommended to include an int-delay parameter in
its answer to declare what the property is for the stream(s) it
is going to send. As it already knows the receiver's
interleaving buffer size, there should be no issue with
providing a value that is between 0 and what corresponds to a
full de-interleaving buffer.
o In most cases, the parameters "maxptime" and "ptime" will not
affect interoperability; however, the setting of the parameters
can affect the performance of the application. The SDP offer-
answer handling of the "ptime" parameter is described in
[RFC3264]. The "maxptime" parameter MUST be handled in the same
way.
o The parameter "max-red" is a stream property parameter. For
sendonly or sendrecv unicast media streams, the parameter declares
the limitation on redundancy that the stream sender will use. For
recvonly streams, it indicates the desired value for the stream
sent to the receiver. The answerer MAY change the value, but is
RECOMMENDED to use the same limitation as the offer declares. In
the case of multicast, the offerer MAY declare a limitation; this
SHALL be answered using the same value. A media sender using this
payload format is RECOMMENDED to always include the "max-red"
parameter. This information is likely to simplify the media
stream handling in the receiver. This is especially true if no
redundancy will be used, in which case "max-red" is set to 0.
o Any unknown parameter in an offer SHALL be removed in the answer.
o The b= SDP parameter SHOULD be used to negotiate the maximum
bandwidth to be used for the audio stream. The offerer may offer
a maximum rate and the answer may contain a lower rate. If no b=
parameter is present in the offer or answer, it implies a rate up
to 128 kbps.
o The parameter "CBR" is a receiver capability, i.e., only receivers
that really require a constant bit rate should use it. Usage of
this parameter has a negative impact on the possibility to perform
congestion control; see Section 9. For recvonly and sendrecv
streams, it indicates the desired constant bit rate that the
receiver wants to accept. A sender MUST be able to send a constant
bit rate stream since it is a subset of the variable bit rate
capability. If the offer includes this parameter, the answerer
MUST send G.719 audio at the constant bit rate if it is within the
allowed session bit rate (b= parameter). If the answerer cannot
support the stated CBR, this payload type must be refused in the
answer. The answerer SHOULD only include this parameter if it
itself requires to receive at a constant bit rate, even if the
offer did not include the CBR parameter. In this case, the offerer
SHALL send at the constant bit rate but SHALL be able to accept
media at variable bit rate. An answerer is RECOMMENDED to use the
same CBR rate as in the offer, as symmetric usage is more likely
to work. If both sides require a particular CBR rate, there is
the possibility of communication failure when one or both sides
cannot transmit the requested rate. In this case, the agent
detecting this issue will have to perform a second round of
offer/answer to try to find another working configuration or end
the established session. If the offer contained a CBR parameter
but the answer does not, then the offerer is free to transmit at
any rate to the answerer, but the answerer is restricted to the
declared rate.
7.2.2. Declarative SDP Considerations
In declarative usage, like SDP in RTSP [RFC2326] or SAP [RFC2974],
the parameters SHALL be interpreted as follows:
o The payload format configuration parameters (interleaving, and
channels) are all declarative, and a participant MUST use the
configuration(s) that is provided for the session. More than one
configuration may be provided if necessary by declaring multiple
RTP payload types; however, the number of types should be kept
small.
o It might not be possible to know the SSRC values that are going to
be used by the sources at the time of sending the SDP. This is
not a major issue, as the size of the interleaving buffer can be
tailored towards the values actually going to be used, thus
ensuring that the default values for int-delay do not result in
too much extra buffering.
o Any "maxptime" and "ptime" values should be selected with care to
ensure that the session's participants can achieve reasonable
performance.
o The parameter "CBR", if included, applies to all RTP streams using
that payload type for which a particular CBR rate is declared.
Usage of this parameter has a negative impact on the possibility to
perform congestion control; see Section 9.
8. IANA Considerations
One media type (audio/g719) has been defined and needs registration
in the media types registry; see Section 7.1.
9. Congestion Control
The general congestion control considerations for transporting RTP
data apply; see RTP [RFC3550] and any applicable RTP profile like AVP
[RFC3551]. However, the multi-rate capability of G.719 audio coding
provides a mechanism that may help to control congestion, since the
bandwidth demand can be adjusted (within the limits of the codec) by
selecting a different encoding bit-rate.
The number of frames encapsulated in each RTP payload highly
influences the overall bandwidth of the RTP stream due to header
overhead constraints. Packetizing more frames in each RTP payload
can reduce the number of packets sent and hence the header overhead,
at the expense of increased delay and reduced error robustness. If
forward error correction (FEC) is used, the amount of FEC-induced
redundancy needs to be regulated such that the use of FEC itself does
not cause a congestion problem. In other words, a sender SHALL NOT
increase the total bit-rate when adding redundancy in response to
packet loss, and needs instead to adjust it down in accordance with
the congestion control algorithm being run. Thus, when adding
redundancy, the media bit-rate will generally need to be reduced to
free up the bit-rate that is used for redundancy.
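The effect of packetization on overhead can be illustrated with a
rough Python calculation. The constants are assumptions made for
this sketch: 20 ms per frame (consistent with 80 bytes per frame at
32 kbps), an IPv4/UDP/RTP header of 40 bytes without options or
extensions, and a single 16-bit ToC field per payload. The second
helper only illustrates the reasoning about redundancy and ignores
any extra headers that a redundancy scheme adds.

```python
FRAME_MS = 20                 # assumed frame duration in milliseconds
HEADER_BYTES = 20 + 8 + 12    # assumed IPv4 + UDP + RTP headers
TOC_BYTES = 2                 # one 16-bit ToC field per payload


def stream_bitrate(codec_bps, frames_per_packet):
    """Total bit rate in bits per second, including header overhead."""
    frame_bytes = codec_bps * FRAME_MS // 8000
    packet_bytes = HEADER_BYTES + TOC_BYTES + frames_per_packet * frame_bytes
    packet_ms = frames_per_packet * FRAME_MS
    return packet_bytes * 8 * 1000 // packet_ms


def codec_rate_with_redundancy(codec_budget_bps, copies):
    """Codec rate that keeps the media bit rate unchanged when every
    frame is transmitted `copies` times."""
    return codec_budget_bps // copies


for n in (1, 2, 4):
    print(n, "frame(s) per packet:", stream_bitrate(32000, n), "bit/s")
```

Under these assumptions a 32 kbps stream costs 48800 bit/s with one
frame per packet, 40400 bit/s with two, and 36200 bit/s with four. A
sender that was encoding at 64 kbps and starts sending every frame
twice would need to drop the codec rate to 32 kbps to stay within
the same media bit rate.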
The CBR signalling parameter allows a receiver to lock down an RTP
payload type to use a single encoding rate. As this prevents the
codec rate from being lowered when congestion is experienced, the
sender is constrained to either change the packetization or abort the
transmission. Since these responses to congestion are severely
limited, implementations SHOULD NOT use the CBR parameter unless they
are interacting with a device that cannot support variable bit rate
(e.g. a gateway to H.320 systems). When using CBR mode, a receiver
MUST monitor the packet loss rate to ensure congestion is not caused,
following the guidelines in Section 2 of RFC 3551.
10. Security Considerations
RTP packets using the payload format defined in this specification
are subject to the general security considerations discussed in RTP
[RFC3550] and any applicable profile such as AVP [RFC3551] or SAVP
[RFC3711]. As this format transports encoded audio, the main
security issues include confidentiality, integrity protection, and
data origin authentication of the audio itself. The payload format
itself does not have any built-in security mechanisms. Any suitable
external mechanisms, such as SRTP [RFC3711], MAY be used.
This payload format and the G.719 decoder do not exhibit any
significant non-uniformity in the receiver-side computational
complexity for packet processing, and thus are unlikely to pose a
denial-of-service threat due to the receipt of pathological data.
The payload format or the codec data does not contain any type of
active content such as scripts.
10.1. Confidentiality
In order to ensure confidentiality of the encoded audio, all audio
data bits MUST be encrypted. There is less need to encrypt the
payload header or the table of contents since they only carry
information about the frame type. This information could also be
useful to a third party, for example, for quality monitoring.
However, as no mechanism supporting differential protection
currently exists, this behavior is not expected to be supported, and
the requirements of the audio data will govern the protection of the
RTP payload.
The use of interleaving in conjunction with encryption can have a
negative impact on confidentiality, for a short period of time.
Consider the following packets (in brackets) containing frame numbers
as indicated: {10, 14, 18}, {13, 17, 21}, {16, 20, 24} (a popular
continuous diagonal interleaving pattern). The originator wishes to
deny some participants the ability to hear material starting at time
16. Simply changing the key on the packet with the timestamp at or
after 16, and denying that new key to those participants, does not
achieve this; frames 17, 18, and 21 have been supplied in prior
packets under the prior key, and error concealment may make the audio
intelligible at least as far as frame 18 or 19, and possibly further.
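The example can be checked with a few lines of Python. The sketch
assumes that a packet's timestamp is that of its earliest frame, so
a packet is sent under the old key when its first frame number is
below the re-keying point. The function name is hypothetical.

```python
packets = [[10, 14, 18], [13, 17, 21], [16, 20, 24]]
REKEY_AT = 16  # first frame the originator wants to withhold


def frames_leaked(packets, rekey_at):
    """Frames at or after rekey_at carried in packets sent under the
    old key (packets whose first frame precedes rekey_at)."""
    leaked = []
    for frames in packets:
        if frames[0] < rekey_at:   # packet still uses the prior key
            leaked += [f for f in frames if f >= rekey_at]
    return sorted(leaked)


leaked = frames_leaked(packets, REKEY_AT)
print(leaked)
```

The result is frames 17, 18, and 21, as stated above.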
10.2. Authentication and Integrity
To authenticate the sender of the audio-stream, an external mechanism
MUST be used. It is RECOMMENDED that such a mechanism protects both
the complete RTP header and the payload (audio and data bits). Data
tampering by a man-in-the-middle attacker could replace audio content
and also result in erroneous depacketization/decoding that could
lower the audio quality.
11. Acknowledgements
The authors would like to thank Roni Even and Anisse Taleb for their
help with this draft. We would also like to thank the people who
have provided feedback: Colin Perkins, Mark Baker, and Stephen
Botzko.
12. References
12.1. Informative References
[RFC2198] Perkins, C., Kouvelas, I., Hodson, O., Hardman, V.,
Handley, M., Bolot, J., Vega-Garcia, A., and S. Fosse-
Parisis, "RTP Payload for Redundant Audio Data", RFC 2198,
September 1997.
[RFC2326] Schulzrinne, H., Rao, A., and R. Lanphier, "Real Time
Streaming Protocol (RTSP)", RFC 2326, April 1998.
[RFC2974] Handley, M., Perkins, C., and E. Whelan, "Session
Announcement Protocol", RFC 2974, October 2000.
[RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
Norrman, "The Secure Real-time Transport Protocol (SRTP)",
RFC 3711, March 2004.
[RFC3839] Castagno, R. and D. Singer, "MIME Type Registrations for
3rd Generation Partnership Project (3GPP) Multimedia
files", RFC 3839, July 2004.
[RFC4288] Freed, N. and J. Klensin, "Media Type Specifications and
Registration Procedures", BCP 13, RFC 4288, December 2005.
[RFC4337] Lim, Y. and D. Singer, "MIME Type Registration for MPEG-4",
RFC 4337, March 2006.
[RFC4855] Casner, S., "Media Type Registration of RTP Payload
Formats", RFC 4855, February 2007.
[RFC5109] Li, A., "RTP Payload Format for Generic Forward Error
Correction", RFC 5109, December 2007.
12.2. Normative References
[I-D.ietf-tsvwg-udp-guidelines]
Eggert, L. and G. Fairhurst, "Unicast UDP Usage Guidelines
for Application Designers",
draft-ietf-tsvwg-udp-guidelines-11 (work in progress),
October 2008.
[ITU-T-G719]
ITU-T, "Specification : ITU-T G.719 extension for 20 kHz
fullband audio", April 2008.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
with Session Description Protocol (SDP)", RFC 3264,
June 2002.
[RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V.
Jacobson, "RTP: A Transport Protocol for Real-Time
Applications", STD 64, RFC 3550, July 2003.
[RFC3551] Schulzrinne, H. and S. Casner, "RTP Profile for Audio and
Video Conferences with Minimal Control", STD 65, RFC 3551,
July 2003.
[RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
Description Protocol", RFC 4566, July 2006.
[RFC5234] Crocker, D. and P. Overell, "Augmented BNF for Syntax
Specifications: ABNF", STD 68, RFC 5234, January 2008.
Authors' Addresses
Magnus Westerlund
Ericsson AB
Torshamnsgatan 21-23
SE-164 83 Stockholm
SWEDEN
Phone: +46 8 7190000
Email: magnus.westerlund@ericsson.com
Ingemar Johansson
Ericsson AB
Laboratoriegrand 11
SE-971 28 Lulea
SWEDEN
Phone: +46 73 0783289
Email: ingemar.s.johansson@ericsson.com
Full Copyright Statement
Copyright (C) The IETF Trust (2008).
This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors
retain all their rights.
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Intellectual Property
The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.