draft-ietf-avtext-mixer-to-client-audio-level-02.txt   draft-ietf-avtext-mixer-to-client-audio-level-03.txt 
Network Working Group E. Ivov, Ed. Network Working Group E. Ivov, Ed.
Internet-Draft Jitsi Internet-Draft Jitsi
Intended status: Informational E. Marocco, Ed. Intended status: Standards Track E. Marocco, Ed.
Expires: November 10, 2011 Telecom Italia Expires: January 6, 2012 Telecom Italia
J. Lennox J. Lennox
Vidyo, Inc. Vidyo, Inc.
May 9, 2011 July 5, 2011
A Real-Time Transport Protocol (RTP) Header Extension for Mixer-to- A Real-Time Transport Protocol (RTP) Header Extension for Mixer-to-
Client Audio Level Indication Client Audio Level Indication
draft-ietf-avtext-mixer-to-client-audio-level-02 draft-ietf-avtext-mixer-to-client-audio-level-03
Abstract Abstract
This document describes a mechanism for RTP-level mixers in audio This document describes a mechanism for RTP-level mixers in audio
conferences to deliver information about the audio level of conferences to deliver information about the audio level of
individual participants. Such audio level indicators are transported individual participants. Such audio level indicators are transported
in the same RTP packets as the audio data they pertain to. in the same RTP packets as the audio data they pertain to.
Status of this Memo Status of this Memo
skipping to change at page 1, line 37 skipping to change at page 1, line 37
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on November 10, 2011. This Internet-Draft will expire on January 6, 2012.
Copyright Notice Copyright Notice
Copyright (c) 2011 IETF Trust and the persons identified as the Copyright (c) 2011 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 2, line 13 skipping to change at page 2, line 13
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4
3. Protocol Operation . . . . . . . . . . . . . . . . . . . . . . 4 3. Protocol Operation . . . . . . . . . . . . . . . . . . . . . . 4
4. Header Format . . . . . . . . . . . . . . . . . . . . . . . . 6 4. Audio Levels . . . . . . . . . . . . . . . . . . . . . . . . . 5
5. Audio level encoding . . . . . . . . . . . . . . . . . . . . . 6 5. Signaling Information . . . . . . . . . . . . . . . . . . . . 7
6. Signaling Information . . . . . . . . . . . . . . . . . . . . 7 6. Security Considerations . . . . . . . . . . . . . . . . . . . 10
7. Security Considerations . . . . . . . . . . . . . . . . . . . 10 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 10
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 10 8. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 10
9. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 10 9. Changes From Earlier Versions . . . . . . . . . . . . . . . . 11
10. Changes From Earlier Versions . . . . . . . . . . . . . . . . 11 9.1. Changes From Draft -02 . . . . . . . . . . . . . . . . . . 11
10.1. Changes From Draft -01 . . . . . . . . . . . . . . . . . 11 9.2. Changes From Draft -01 . . . . . . . . . . . . . . . . . . 11
10.2. Changes From Draft -00 . . . . . . . . . . . . . . . . . 11 9.3. Changes From Draft -00 . . . . . . . . . . . . . . . . . . 11
11. References . . . . . . . . . . . . . . . . . . . . . . . . . . 11 10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 12
11.1. Normative References . . . . . . . . . . . . . . . . . . 11 10.1. Normative References . . . . . . . . . . . . . . . . . . . 12
11.2. Informative References . . . . . . . . . . . . . . . . . 11 10.2. Informative References . . . . . . . . . . . . . . . . . . 12
Appendix A. Reference Implementation . . . . . . . . . . . . . . 12 Appendix A. Reference Implementation . . . . . . . . . . . . . . 13
A.1. AudioLevelCalculator.java . . . . . . . . . . . . . . . . 13 A.1. AudioLevelCalculator.java . . . . . . . . . . . . . . . . 13
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 15 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 15
1. Introduction 1. Introduction
The Framework for Conferencing with the Session Initiation Protocol The Framework for Conferencing with the Session Initiation Protocol
(SIP) defined in RFC 4353 [RFC4353] presents an overall architecture (SIP) defined in RFC 4353 [RFC4353] presents an overall architecture
for multi-party conferencing. Among others, the framework borrows for multi-party conferencing. Among others, the framework borrows
from RTP [RFC3550] and extends the concept of a mixer entity from RTP [RFC3550] and extends the concept of a mixer entity
"responsible for combining the media streams that make up a "responsible for combining the media streams that make up a
conference, and generating one or more output streams that are conference, and generating one or more output streams that are
skipping to change at page 5, line 18 skipping to change at page 5, line 18
tag which binds CSRC IDs to media streams and SIP URIs. tag which binds CSRC IDs to media streams and SIP URIs.
This document describes an RTP header extension that allows mixers to This document describes an RTP header extension that allows mixers to
indicate the audio-level of every conference participant (CSRC) in indicate the audio-level of every conference participant (CSRC) in
addition to simply indicating their on/off status. This new header addition to simply indicating their on/off status. This new header
extension uses "General Mechanism for RTP Header Extensions" extension uses "General Mechanism for RTP Header Extensions"
described in [RFC5285]. described in [RFC5285].
Each instance of this header contains a list of one-octet audio Each instance of this header contains a list of one-octet audio
levels expressed in -dBov, with values from 0 to 127 representing 0 levels expressed in -dBov, with values from 0 to 127 representing 0
to -127 dBov (see Section 4 and Section 5). Appendix A provides a to -127 dBov (see Figure 2 and Figure 3). Appendix A provides a
reference implementation indicating one way of obtaining such values reference implementation indicating one way of obtaining such values
from raw audio samples. from raw audio samples.
Every audio level value pertains to the CSRC identifier located at Every audio level value pertains to the CSRC identifier located at
the corresponding position in the CSRC list. In other words, the the corresponding position in the CSRC list. In other words, the
first value would indicate the audio level of the conference first value would indicate the audio level of the conference
participant represented by the first CSRC identifier in that packet participant represented by the first CSRC identifier in that packet
and so forth. The number and order of these values MUST therefore and so forth. The number and order of these values MUST therefore
match the number and order of the CSRC IDs present in the same match the number and order of the CSRC IDs present in the same
packet. packet.
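The positional rule above can be sketched in Java as follows. This is only an illustration: the class and method names are hypothetical and do not come from the draft, and CSRC identifiers are assumed to be held as long values so that the 32-bit IDs stay unsigned.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the draft or of Appendix A.
final class CsrcLevelMapper {
    /**
     * Pairs each CSRC identifier with the audio level found at the same
     * position in the header extension payload.
     *
     * @param csrcs  CSRC IDs in the order they appear in the RTP header
     * @param levels one octet per CSRC, in the same order
     * @return map from CSRC ID to level (0..127, i.e. 0 to -127 dBov)
     */
    static Map<Long, Integer> map(long[] csrcs, byte[] levels) {
        // The draft requires the number and order of levels to match the CSRC list.
        if (csrcs.length != levels.length) {
            throw new IllegalArgumentException("level count must match CSRC count");
        }
        Map<Long, Integer> result = new LinkedHashMap<>();
        for (int i = 0; i < csrcs.length; i++) {
            // Only the seven least significant bits carry the level.
            result.put(csrcs[i], levels[i] & 0x7F);
        }
        return result;
    }
}
```

A receiver would typically call such a helper once per packet, after locating the extension element by its negotiated ID.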
When encoding audio level information, a mixer SHOULD include in a When encoding audio level information, a mixer SHOULD include in a
packet information that corresponds to the audio data being packet information that corresponds to the audio data being
transported in that same packet. It is important that these values transported in that same packet. It is important that these values
follow the actual stream as closely as possible. Therefore a mixer follow the actual stream as closely as possible. Therefore a mixer
SHOULD also calculate the values after the original contributing SHOULD also calculate the values after the original contributing
stream has undergone possible processing such as level normalization, stream has undergone possible processing such as level normalization,
and noise reduction for example. and noise reduction for example.
Note that in some cases a mixer may be sending an RTP audio stream
that only contains audio level information and no actual audio.
Updating a (web) interface conference module may be one reason for
this to happen.
It may sometimes happen that a conference involves more than a single It may sometimes happen that a conference involves more than a single
mixer. In such cases each of the mixers MAY choose to relay the CSRC mixer. In such cases each of the mixers MAY choose to relay the CSRC
list and audio-level information they receive from peer mixers (as list and audio-level information they receive from peer mixers (as
long as the total CSRC count remains below 16). Given that the long as the total CSRC count remains below 16). Given that the
maximum audio level is not precisely defined by this specification, maximum audio level is not precisely defined by this specification,
it is likely that in such situations average audio levels would be it is likely that in such situations average audio levels would be
perceptibly different for the participants located behind the perceptibly different for the participants located behind the
different mixers. different mixers.
4. Header Format 4. Audio Levels
The audio level indicators are delivered to the receivers in-band The audio level header extension carries the level of the audio in
using the "General Mechanism for RTP Header Extensions" [RFC5285]. the RTP payload of the packet it is associated with. This
The payload of this extension is an ordered sequence of 8-bit audio information is carried in an RTP header extension element as defined
level indicators encoded as per Section 5. by the "General Mechanism for RTP Header Extensions" [RFC5285].
The payload of the audio level header extension element can be
encoded using the one or the two-byte header defined in [RFC5285].
Figure 2 and Figure 3 show sample audio level encodings with each of
them.
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| ID | len |0| level 1 |0| level 2 |0| level 3 ... | ID | len=2 |0| level 1 |0| level 2 |0| level 3 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 2: Audio level indicators extension format
The 4-bit len field is the number minus one of data bytes (i.e. audio
level values) transported in this header extension element following
the one-byte header. Therefore, the value zero in this field
indicates that one byte of data follows. RFC 3550 [RFC3550] only
allows RTP packets to carry a maximum of 15 CSRC IDs. Given that
audio levels directly refer to CSRC IDs, implementations MUST NOT
include more than 15 audio level values. The maximum value allowed
in the len field is therefore 14.
Note that use of the two-byte header defined in RFC 5285 [RFC5285] Sample audio level encoding using the one-byte header format
follows the same rules the only change being the length of the ID and
len fields.
5. Audio level encoding Figure 2
The audio level header extension only carries the level of the audio 0 1 2 3
in the RTP payload of the packet it is associated with. This 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
information is carried in an RTP header extension element as defined +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
by [RFC5285]. | ID | len=3 |0| level 1 |0| level 2 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0| level 3 | 0 (pad) | ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The audio level is defined in the same manner as is audio noise level Sample audio level encoding using the two-byte header format
in the RTP Payload Comfort Noise specification [RFC3389]. The
overall magnitude of the noise level is encoded into the first byte
of the payload, with spectral information about the noise in
subsequent bytes. This specification's audio level parameter is
defined so as to be identical to the comfort noise payload's noise-
level byte.
The magnitude of the audio level is packed into the seven least Figure 3
significant bits of the single byte of the header extension, shown in
Figure 3. The least significant bit of the audio level magnitude is
packed into the least significant bit of the byte. The most
significant bit of the byte is unused and always set to 0 as shown
below in Figure 3.
0 1 2 3 4 5 6 7 In the case of the one-byte header format, the 4-bit len field is the
+-+-+-+-+-+-+-+-+ number minus one of data bytes (i.e. audio level values) transported
|0| level | in this header extension element following the one-byte header.
+-+-+-+-+-+-+-+-+ Therefore, the value zero in this field indicates that one byte of
data follows. In the case of the two-byte header format the 8-bit
len field contains the exact number of audio levels carried in the
extension. RFC 3550 [RFC3550] only allows RTP packets to carry a
maximum of 15 CSRC IDs. Given that audio levels directly refer to
CSRC IDs, implementations MUST NOT include more than 15 audio level
values. The maximum value allowed in the len field is therefore 14
for one-byte header format and 15 for two-byte header format.
Figure 3: Audio Level Encoding Audio levels in this document are defined in the same manner as is
audio noise level in the RTP Payload Comfort Noise specification
[RFC3389]. In the comfort noise specification, the overall magnitude
of the noise level in comfort noise is encoded into the first byte of
the payload, with spectral information about the noise in subsequent
bytes. This specification's audio level parameter is defined so as
to be identical to the comfort noise payload's noise-level byte.
The two-byte header defined in RFC 5285 [RFC5285] may also be used. The magnitude of the audio level itself is packed into the seven
least significant bits of the single byte of the header extension,
shown in Figure 2 and Figure 3. The least significant bit of the
audio level magnitude is packed into the least significant bit of the
byte. The most significant bit of the byte is unused and always set
to 0.
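As a minimal sketch of the one-byte header layout described above, the following Java fragment builds a single extension element: a 4-bit ID, a 4-bit len field holding the number of level bytes minus one, then one byte per level with the most significant bit cleared. The class and method names are hypothetical. The check that the ID lies between 1 and 14 reflects the one-byte header rules of RFC 5285; padding of the overall extension block to a 32-bit boundary is left out.

```java
// Hypothetical helper, not part of the draft or of Appendix A.
final class OneByteLevelElement {
    /**
     * Encodes audio levels as one RFC 5285 one-byte-header extension element.
     *
     * @param id     local extension ID negotiated via extmap (1..14)
     * @param levels audio levels in -dBov, each 0..127, one per CSRC
     * @return the element bytes: header byte followed by the level bytes
     */
    static byte[] encode(int id, int[] levels) {
        if (id < 1 || id > 14) {
            throw new IllegalArgumentException("one-byte header IDs are 1..14");
        }
        // At most 15 CSRCs fit in an RTP packet, so at most 15 levels.
        if (levels.length < 1 || levels.length > 15) {
            throw new IllegalArgumentException("between 1 and 15 levels expected");
        }
        byte[] out = new byte[1 + levels.length];
        // len is the number of data bytes minus one (0 means one byte follows).
        out[0] = (byte) ((id << 4) | (levels.length - 1));
        for (int i = 0; i < levels.length; i++) {
            if (levels[i] < 0 || levels[i] > 127) {
                throw new IllegalArgumentException("level must be 0..127");
            }
            // Seven least significant bits hold the level; the top bit stays 0.
            out[1 + i] = (byte) levels[i];
        }
        return out;
    }
}
```

With an ID of 7 and three levels, the header byte comes out as 0x72, which corresponds to the len=2 case in the one-byte sample encoding.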
The audio level is expressed in -dBov, with values from 0 to 127 The audio level is expressed in -dBov, with values from 0 to 127
representing 0 to -127 dBov. dBov is the level, in decibels, relative representing 0 to -127 dBov. dBov is the level, in decibels, relative
to the overload point of the system, i.e. the maximum-amplitude to the overload point of the system, i.e. the maximum-amplitude
signal that can be handled by the system without clipping.(Note: signal that can be handled by the system without clipping. (Note:
Representation relative to the overload point of a system is Representation relative to the overload point of a system is
particularly useful for digital implementations, since one does not particularly useful for digital implementations, since one does not
need to know the relative calibration of the analog circuitry.) For need to know the relative calibration of the analog circuitry.) For
example, in the case of u-law (audio/pcmu) audio [ITU.G.711], the 0 example, in the case of u-law (audio/pcmu) audio [ITU.G.711], the 0
dBov reference would be a square wave with values +/- 8031. (This dBov reference would be a square wave with values +/- 8031. (This
translates to 6.18 dBm0, relative to u-law's dBm0 definition in Table translates to 6.18 dBm0, relative to u-law's dBm0 definition in Table
6 of G.711.) 6 of G.711.)
The audio level for digital silence, for example for a muted audio The audio level for digital silence, for example for a muted audio
source, MAY be represented as 127 (-127 dBov), regardless of the source, MUST be represented as 127 (-127 dBov), regardless of the
dynamic range of the encoded audio format. dynamic range of the encoded audio format.
Implementations MAY choose to measure audio levels prior to encoding
them in the payload carried in the RTP payload, e.g. on raw linear
PCM input.
The audio level header extension only carries the level of the audio The audio level header extension only carries the level of the audio
in the RTP payload of the packet it is associated with, with no long- in the RTP payload of the packet it is associated with, with no long-
term averaging or smoothing applied. term averaging or smoothing applied. That level is measured as a
root mean square of all the samples in the measured range.
To simplify implementation of the encoding procedures described here, To simplify implementation of the encoding procedures described here,
this specification provides a sample Java implementation (Appendix A) this specification provides a sample Java implementation (Appendix A)
of an audio level calculator that helps obtain such values from raw of an audio level calculator that helps obtain such values from raw
linear PCM audio samples. linear PCM audio samples.
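For orientation only, the root-mean-square measurement mentioned above could look like the sketch below. It is not the Appendix A code: the class name, the method signature, the rounding to the nearest decibel and the handling of empty input are all assumptions made for this illustration. The overload argument is the maximum amplitude the system handles without clipping (for 16-bit linear PCM one might pass 32767).

```java
// Hypothetical sketch; the draft's own calculator is in Appendix A.
final class RmsLevelSketch {
    /**
     * Computes an audio level in -dBov from linear PCM samples.
     *
     * @param samples  signed 16-bit linear PCM samples of one packet
     * @param overload amplitude of the 0 dBov reference (assumed positive)
     * @return level in 0..127, where 0 is 0 dBov and 127 is -127 dBov
     */
    static int levelOf(short[] samples, double overload) {
        if (samples.length == 0) {
            return 127; // assumption: no samples is treated like silence
        }
        double sumOfSquares = 0;
        for (short s : samples) {
            sumOfSquares += (double) s * s;
        }
        double rms = Math.sqrt(sumOfSquares / samples.length);
        if (rms == 0) {
            return 127; // digital silence is represented as 127
        }
        // dBov is negative or zero for signals at or below the overload point.
        double dBov = 20.0 * Math.log10(rms / overload);
        long level = Math.round(-dBov);
        // Clamp into the range the seven-bit field can carry.
        return (int) Math.max(0, Math.min(127, level));
    }
}
```

No averaging across packets is done here, in line with the requirement that the value reflect only the audio carried in the same packet.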
6. Signaling Information 5. Signaling Information
The URI for declaring the audio level header extension in an SDP The URI for declaring the audio level header extension in an SDP
extmap attribute and mapping it to a local extension header extmap attribute and mapping it to a local extension header
identifier is "urn:ietf:params:rtp-hdrext:csrc-audio-level". There identifier is "urn:ietf:params:rtp-hdrext:csrc-audio-level". There
is no additional setup information needed for this extension (i.e. no is no additional setup information needed for this extension (i.e. no
extensionattributes). extensionattributes).
An example attribute line in the SDP, for a conference might be: An example attribute line in the SDP, for a conference might be:
a=extmap:7 urn:ietf:params:rtp-hdrext:csrc-audio-level a=extmap:7 urn:ietf:params:rtp-hdrext:csrc-audio-level
The above mapping will most often be provided per media stream (in The above mapping will most often be provided per media stream (in
the media-level section(s) of SDP, i.e., after an "m=" line) or the media-level section(s) of SDP, i.e., after an "m=" line) or
globally if there is more than one stream containing audio level globally if there is more than one stream containing audio level
indicators in a session. indicators in a session.
Presence of the above attribute in the SDP description of a media Presence of the above attribute in the SDP description of a media
stream indicates that some or all RTP packets in that stream would stream indicates that RTP packets in that stream, which contain the
contain the audio level information RTP extension header. level extension defined in this document, will be carrying them with
an ID of 7.
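A receiver needs the negotiated ID before it can find the levels in a packet. The following Java sketch extracts it from a single extmap attribute line. The class name, the method name and the regular expression are assumptions made for illustration; a real SDP stack would normally expose parsed attributes instead.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the draft.
final class ExtmapLineSketch {
    static final String CSRC_AUDIO_LEVEL_URI =
            "urn:ietf:params:rtp-hdrext:csrc-audio-level";

    // a=extmap:<id>[/<direction>] <URI>
    private static final Pattern EXTMAP = Pattern.compile(
            "^a=extmap:(\\d+)(?:/(sendonly|recvonly|sendrecv|inactive))?\\s+(\\S+)");

    /**
     * Returns the local extension ID if the line maps the csrc-audio-level
     * URI, or -1 if the line is not such a mapping.
     */
    static int csrcAudioLevelId(String line) {
        Matcher m = EXTMAP.matcher(line.trim());
        if (m.find() && CSRC_AUDIO_LEVEL_URI.equals(m.group(3))) {
            return Integer.parseInt(m.group(1));
        }
        return -1;
    }
}
```

For the example attribute line shown earlier in this section, the helper yields 7.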
Conferencing clients that support audio level indicators and have no Conferencing clients that support audio level indicators and have no
mixing capabilities would not be able to provide content for this audio level mixing capabilities would not be able to provide content for this audio level
extension and would hence have to always include the direction extension and would hence have to always include the direction
parameter in the "extmap" attribute with a value of "recvonly". parameter in the "extmap" attribute with a value of "recvonly".
Conference focus entities with mixing capabilities can omit the Conference focus entities with mixing capabilities can omit the
direction or set it to "sendrecv" in SDP offers. Such entities would direction or set it to "sendrecv" in SDP offers. Such entities would
need to set it to "sendonly" in SDP answers to offers with a need to set it to "sendonly" in SDP answers to offers with a
"recvonly" parameter and to "sendrecv" when answering other "recvonly" parameter and to "sendrecv" when answering other
"sendrecv" offers. "sendrecv" offers.
This specification does not define use of the audio level extensions This specification only defines use of the audio level extensions in
in video streams. Therefore, the extension defined in this document audio streams. They MUST NOT be advertised with other media types
SHOULD NOT be advertised in anything but audio streams. such as video or text for example.
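The direction rules given above for focus entities with mixing capabilities can be summarized in a small Java sketch. The class and method names are hypothetical. Treating an omitted direction like "sendrecv" is an assumption based on the statement that a focus can either omit the direction or set it to "sendrecv". Offers with other directions are not covered by the text above, so the sketch rejects them.

```java
// Hypothetical helper, not part of the draft.
final class MixerExtmapDirectionSketch {
    /**
     * Chooses the extmap direction a mixing focus puts in its SDP answer.
     *
     * @param offered direction from the offer, or null if none was given
     * @return "sendonly" for a "recvonly" offer, "sendrecv" for a
     *         "sendrecv" (or omitted) direction
     */
    static String answerDirection(String offered) {
        if (offered == null || offered.equals("sendrecv")) {
            return "sendrecv";
        }
        if (offered.equals("recvonly")) {
            // The offerer can only receive levels, so the mixer only sends.
            return "sendonly";
        }
        throw new IllegalArgumentException(
                "direction not covered by this sketch: " + offered);
    }
}
```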
The following Figure 4 and Figure 5 show two example offer/answer The following Figure 4 and Figure 5 show two example offer/answer
exchanges between a conferencing client and a focus, and between two exchanges between a conferencing client and a focus, and between two
conference focus entities. conference focus entities.
v=0 v=0
o=alice 2890844526 2890844526 IN IP6 host.example.com o=alice 2890844526 2890844526 IN IP6 host.example.com
c=IN IP6 host.example.com c=IN IP6 host.example.com
t=0 0 t=0 0
m=audio 49170 RTP/AVP 0 4 m=audio 49170 RTP/AVP 0 4
skipping to change at page 10, line 5 skipping to change at page 10, line 5
m=audio 52543 RTP/AVP 0 m=audio 52543 RTP/AVP 0
a=rtpmap:0 PCMU/8000 a=rtpmap:0 PCMU/8000
a=extmap:1/sendrecv urn:ietf:params:rtp-hdrext:csrc-audio-level a=extmap:1/sendrecv urn:ietf:params:rtp-hdrext:csrc-audio-level
An example SDP offer/answer exchange between two conference focus An example SDP offer/answer exchange between two conference focus
entities with mixing capabilities negotiating an audio stream with entities with mixing capabilities negotiating an audio stream with
bidirectional flow of audio level information. bidirectional flow of audio level information.
Figure 5 Figure 5
7. Security Considerations 6. Security Considerations
1. This document defines a means of attributing audio level to a 1. This document defines a means of attributing audio level to a
particular participant in a conference. An attacker may try to particular participant in a conference. An attacker may try to
modify the content of RTP packets in a way that would make audio modify the content of RTP packets in a way that would make audio
activity from one participant appear as coming from another. activity from one participant appear as coming from another.
2. Furthermore, the fact that audio level values would not be 2. Furthermore, the fact that audio level values would not be
protected even in an SRTP session might be of concern in some protected even in an SRTP session might be of concern in some
cases where the activity of a particular participant in a cases where the activity of a particular participant in a
conference is confidential. conference is confidential. Also, as discussed in
[I-D.perkins-avt-srtp-vbr-audio], an attacker might be able to
infer information about the conversation, possibly with phoneme-
level resolution.
3. Both of the above are concerns that stem from the design of the 3. Both of the above are concerns that stem from the design of the
RTP protocol itself and they would probably also apply when using RTP protocol itself and they would probably also apply when using
CSRC identifiers the way they were specified in RFC 3550 CSRC identifiers the way they were specified in RFC 3550
[RFC3550]. It is therefore important that according to the needs [RFC3550]. It is therefore important that according to the needs
of a particular scenario, implementors and deployers consider use of a particular scenario, implementors and deployers consider use
of header extension encryption of header extension encryption
[I-D.lennox-avtcore-srtp-encrypted-header-ext] or a lower level [I-D.lennox-avtcore-srtp-encrypted-header-ext] or a lower level
security and authentication mechanism. security and authentication mechanism.
8. IANA Considerations 7. IANA Considerations
This document defines a new extension URI that, if approved, would This document defines a new extension URI that, if approved, would
need to be added to the RTP Compact Header Extensions sub-registry of need to be added to the RTP Compact Header Extensions sub-registry of
the Real-Time Transport Protocol (RTP) Parameters registry, according the Real-Time Transport Protocol (RTP) Parameters registry, according
to the following data: to the following data:
Extension URI: urn:ietf:params:rtp-hdrext:csrc-audio-level Extension URI: urn:ietf:params:rtp-hdrext:csrc-audio-level
Description: Mixer-to-client audio level indicators Description: Mixer-to-client audio level indicators
Contact: emcho@jitsi.org Contact: emcho@jitsi.org
Reference: RFC XXXX Reference: RFC XXXX
Note to the RFC-Editor: please replace "RFC XXXX" by the number of Note to the RFC-Editor: please replace "RFC XXXX" by the number of
this RFC. this RFC.
9. Acknowledgments 8. Acknowledgments
Lyubomir Marinov contributed level measurement and rendering code. Lyubomir Marinov contributed level measurement and rendering code.
Roni Even, Keith Drage, Ingemar Johansson, Michael Ramalho and Keith Drage, Roni Even, Ingemar Johansson, Michael Ramalho, Magnus
several others provided helpful feedback over the dispatch mailing Westerlund and several others provided helpful feedback over the
list. dispatch mailing list.
Jitsi's participation in this specification is funded by the NLnet Jitsi's participation in this specification is funded by the NLnet
Foundation. Foundation.
10. Changes From Earlier Versions 9. Changes From Earlier Versions
Note to the RFC-Editor: please remove this section prior to Note to the RFC-Editor: please remove this section prior to
publication as an RFC. publication as an RFC.
10.1. Changes From Draft -01 9.1. Changes From Draft -02
o Removed the no-data use case that allowed sending levels in RTP
packets. Choosing the right RTP payload type for this use case
would have incurred complexity without bringing any real value.
o Merged the "Header Format" and the "Audio level encoding" sections
into a single "Audio Levels" section.
o Changed encoding related text so that it would cover both the one-
byte and the two-byte header formats.
o Clarified use of root mean square for dBov calculation
o Added a reference to [I-D.perkins-avt-srtp-vbr-audio] to better
explain some "Security Considerations".
o Other minor editorial changes.
9.2. Changes From Draft -01
o Removed code related to the AudioLevelRenderer from "APPENDIX A. o Removed code related to the AudioLevelRenderer from "APPENDIX A.
Reference Implementation" as it was considered an implementation Reference Implementation" as it was considered an implementation
matter by the working group. matter by the working group.
o Modified the AudioLevelCalculator in "APPENDIX A. Reference o Modified the AudioLevelCalculator in "APPENDIX A. Reference
Implementation" to take overload as a parameter. Implementation" to take overload as a parameter.
o Clarified non-use of audio levels in video streams o Clarified non-use of audio levels in video streams
o Closed the P.56 open issue. It was agreed on IETF 80 that P.56 is o Closed the P.56 open issue. It was agreed on IETF 80 that P.56 is
mostly about speech levels and the levels transported by the mostly about speech levels and the levels transported by the
extension defined here should also be able to serve as an extension defined here should also be able to serve as an
indication for noise. indication for noise.
o The Open Issues section has been removed as all issues that were o The Open Issues section has been removed as all issues that were
in there are now resolved or clarified. in there are now resolved or clarified.
o Editorial changes for consistency with o Editorial changes for consistency with
[I-D.ietf-avtext-client-to-mixer-audio-level]. [I-D.ietf-avtext-client-to-mixer-audio-level].
10.2. Changes From Draft -00 9.3. Changes From Draft -00
o Added code for sound pressure calculation and measurement in o Added code for sound pressure calculation and measurement in
"APPENDIX A. Reference Implementation". "APPENDIX A. Reference Implementation".
o Changed affiliation for Emil Ivov. o Changed affiliation for Emil Ivov.
o Removed "Appendix: Design choices". o Removed "Appendix: Design choices".
11. References 10. References
10.1. Normative References
11.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997. Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V.
Jacobson, "RTP: A Transport Protocol for Real-Time Jacobson, "RTP: A Transport Protocol for Real-Time
Applications", STD 64, RFC 3550, July 2003. Applications", STD 64, RFC 3550, July 2003.
[RFC5285] Singer, D. and H. Desineni, "A General Mechanism for RTP [RFC5285] Singer, D. and H. Desineni, "A General Mechanism for RTP
Header Extensions", RFC 5285, July 2008. Header Extensions", RFC 5285, July 2008.
11.2. Informative References 10.2. Informative References
[I-D.ietf-avtext-client-to-mixer-audio-level] [I-D.ietf-avtext-client-to-mixer-audio-level]
Lennox, J., Ivov, E., and E. Marocco, "A Real-Time Lennox, J., Ivov, E., and E. Marocco, "A Real-Time
Transport Protocol (RTP) Header Extension for Client-to- Transport Protocol (RTP) Header Extension for Client-to-
Mixer Audio Level Indication", Mixer Audio Level Indication",
draft-ietf-avtext-client-to-mixer-audio-level-01 (work in draft-ietf-avtext-client-to-mixer-audio-level-02 (work in
progress), March 2011. progress), June 2011.
[I-D.lennox-avtcore-srtp-encrypted-header-ext] [I-D.lennox-avtcore-srtp-encrypted-header-ext]
Lennox, J., "Encryption of Header Extensions in the Secure Lennox, J., "Encryption of Header Extensions in the Secure
Real-Time Transport Protocol (SRTP)", Real-Time Transport Protocol (SRTP)",
draft-lennox-avtcore-srtp-encrypted-header-ext-00 (work in draft-lennox-avtcore-srtp-encrypted-header-ext-00 (work in
progress), March 2011. progress), March 2011.
[I-D.perkins-avt-srtp-vbr-audio]
Perkins, C. and J. Valin, "Guidelines for the use of
Variable Bit Rate Audio with Secure RTP",
draft-perkins-avt-srtp-vbr-audio-05 (work in progress),
December 2010.
[ITU.G.711] [ITU.G.711]
International Telecommunications Union, "Pulse Code International Telecommunications Union, "Pulse Code
Modulation (PCM) of Voice Frequencies", ITU- Modulation (PCM) of Voice Frequencies", ITU-
T Recommendation G.711, November 1988. T Recommendation G.711, November 1988.
[ITU.P56.1993] [ITU.P56.1993]
International Telecommunications Union, "Objective International Telecommunications Union, "Objective
Measurement of Active Speech Level", ITU-T Recommendation Measurement of Active Speech Level", ITU-T Recommendation
P.56, March 1988. P.56, March 1988.
 End of changes. 37 change blocks. 
100 lines changed or deleted 117 lines changed or added
