draft-ietf-avtext-mixer-to-client-audio-level-05.txt   draft-ietf-avtext-mixer-to-client-audio-level-06.txt 
Network Working Group E. Ivov, Ed. Network Working Group E. Ivov, Ed.
Internet-Draft Jitsi Internet-Draft Jitsi
Intended status: Standards Track E. Marocco, Ed. Intended status: Standards Track E. Marocco, Ed.
Expires: March 8, 2012 Telecom Italia Expires: May 19, 2012 Telecom Italia
J. Lennox J. Lennox
Vidyo, Inc. Vidyo, Inc.
September 5, 2011 November 16, 2011
A Real-Time Transport Protocol (RTP) Header Extension for Mixer-to- A Real-Time Transport Protocol (RTP) Header Extension for Mixer-to-
Client Audio Level Indication Client Audio Level Indication
draft-ietf-avtext-mixer-to-client-audio-level-05 draft-ietf-avtext-mixer-to-client-audio-level-06
Abstract Abstract
This document describes a mechanism for RTP-level mixers in audio This document describes a mechanism for RTP-level mixers in audio
conferences to deliver information about the audio level of conferences to deliver information about the audio level of
individual participants. Such audio level indicators are transported individual participants. Such audio level indicators are transported
in the same RTP packets as the audio data they pertain to. in the same RTP packets as the audio data they pertain to.
Status of this Memo Status of this Memo
skipping to change at page 1, line 37 skipping to change at page 1, line 37
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on March 8, 2012. This Internet-Draft will expire on May 19, 2012.
Copyright Notice Copyright Notice
Copyright (c) 2011 IETF Trust and the persons identified as the Copyright (c) 2011 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 2, line 19 skipping to change at page 2, line 19
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4
3. Protocol Operation . . . . . . . . . . . . . . . . . . . . . . 4 3. Protocol Operation . . . . . . . . . . . . . . . . . . . . . . 4
4. Audio Levels . . . . . . . . . . . . . . . . . . . . . . . . . 5 4. Audio Levels . . . . . . . . . . . . . . . . . . . . . . . . . 5
5. Signaling Information . . . . . . . . . . . . . . . . . . . . 7 5. Signaling Information . . . . . . . . . . . . . . . . . . . . 7
6. Security Considerations . . . . . . . . . . . . . . . . . . . 10 6. Security Considerations . . . . . . . . . . . . . . . . . . . 10
7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 11 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 11
8. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 11 8. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 11
9. Changes From Earlier Versions . . . . . . . . . . . . . . . . 11 9. Changes From Earlier Versions . . . . . . . . . . . . . . . . 11
9.1. Changes From Draft -04 . . . . . . . . . . . . . . . . . . 11 9.1. Changes From Draft -05 . . . . . . . . . . . . . . . . . . 11
9.2. Changes From Draft -03 . . . . . . . . . . . . . . . . . . 11 9.2. Changes From Draft -04 . . . . . . . . . . . . . . . . . . 12
9.3. Changes From Draft -02 . . . . . . . . . . . . . . . . . . 11 9.3. Changes From Draft -03 . . . . . . . . . . . . . . . . . . 12
9.4. Changes From Draft -01 . . . . . . . . . . . . . . . . . . 12 9.4. Changes From Draft -02 . . . . . . . . . . . . . . . . . . 12
9.5. Changes From Draft -00 . . . . . . . . . . . . . . . . . . 12 9.5. Changes From Draft -01 . . . . . . . . . . . . . . . . . . 12
10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 12 9.6. Changes From Draft -00 . . . . . . . . . . . . . . . . . . 13
10.1. Normative References . . . . . . . . . . . . . . . . . . . 12 10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 13
10.1. Normative References . . . . . . . . . . . . . . . . . . . 13
10.2. Informative References . . . . . . . . . . . . . . . . . . 13 10.2. Informative References . . . . . . . . . . . . . . . . . . 13
Appendix A. Reference Implementation . . . . . . . . . . . . . . 14 Appendix A. Reference Implementation . . . . . . . . . . . . . . 14
A.1. AudioLevelCalculator.java . . . . . . . . . . . . . . . . 14 A.1. AudioLevelCalculator.java . . . . . . . . . . . . . . . . 14
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 16 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 17
1. Introduction 1. Introduction
The Framework for Conferencing with the Session Initiation Protocol The Framework for Conferencing with the Session Initiation Protocol
(SIP) defined in RFC 4353 [RFC4353] presents an overall architecture (SIP) defined in RFC 4353 [RFC4353] presents an overall architecture
for multi-party conferencing. Among others, the framework borrows for multi-party conferencing. Among others, the framework borrows
from RTP [RFC3550] and extends the concept of a mixer entity from RTP [RFC3550] and extends the concept of a mixer entity
"responsible for combining the media streams that make up a "responsible for combining the media streams that make up a
conference, and generating one or more output streams that are conference, and generating one or more output streams that are
delivered to recipients". Every participant would hence receive, in delivered to recipients". Every participant would hence receive, in
skipping to change at page 3, line 41 skipping to change at page 3, line 41
party asking them to mute their microphone). A more advanced party asking them to mute their microphone). A more advanced
scenario could involve an intense discussion between multiple scenario could involve an intense discussion between multiple
participants that the user does not personally know. Audio level participants that the user does not personally know. Audio level
information would help better recognize the speakers by associating information would help better recognize the speakers by associating
with them complex (but still human readable) characteristics like with them complex (but still human readable) characteristics like
loudness and speed for example. loudness and speed for example.
One way of presenting such information in a user friendly manner One way of presenting such information in a user friendly manner
would be for a conferencing client to attach audio level indicators would be for a conferencing client to attach audio level indicators
to the corresponding participant related components in the user to the corresponding participant related components in the user
interface as displayed in Figure 1. interface. One possible example is displayed in Figure 1 where
levels can help users determine that Alice is currently the active
speaker, Carol is mute, and Bob and Dave are sending some background
noise.
________________________ ________________________
| | | |
| 00:42 | Weekly Call | | 00:42 | Weekly Call |
|________________________| |________________________|
| | | |
| | | |
| Alice |====== | (S) | | Alice |====== | (S) |
| | | |
| Bob |= | | | Bob |= | |
skipping to change at page 7, line 20 skipping to change at page 7, line 20
The magnitude of the audio level itself is packed into the seven The magnitude of the audio level itself is packed into the seven
least significant bits of the single byte of the header extension, least significant bits of the single byte of the header extension,
shown in Figure 2 and Figure 3. The least significant bit of the shown in Figure 2 and Figure 3. The least significant bit of the
audio level magnitude is packed into the least significant bit of the audio level magnitude is packed into the least significant bit of the
byte. The most significant bit of the byte is unused and always set byte. The most significant bit of the byte is unused and always set
to 0. to 0.
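The packing rule above can be illustrated with a short sketch. This is not part of either draft revision: the class name AudioLevelByte and the methods pack and unpack are hypothetical, and the sketch assumes only what the text states, namely that the magnitude occupies the seven least significant bits and that the most significant bit is zero.

```java
/**
 * Illustrative sketch only (hypothetical names, not taken from the draft):
 * packs and unpacks the 7-bit audio level magnitude carried in the single
 * data byte of the header extension element.
 */
final class AudioLevelByte
{
    private AudioLevelByte() {}

    /**
     * Packs a level expressed in -dBov (0..127) into one byte whose most
     * significant bit is left at 0.
     */
    static byte pack(int level)
    {
        if (level < 0 || level > 127)
            throw new IllegalArgumentException(
                "level out of range: " + level);
        return (byte) (level & 0x7F);
    }

    /**
     * Extracts the 7-bit magnitude, masking off the unused most
     * significant bit.
     */
    static int unpack(byte data)
    {
        return data & 0x7F;
    }
}
```

A receiver written this way masks the unused bit rather than rejecting the byte. That is a design choice of this sketch and not something the text above requires.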
The audio level is expressed in -dBov, with values from 0 to 127 The audio level is expressed in -dBov, with values from 0 to 127
representing 0 to -127 dBov. dBov is the level, in decibels, relative representing 0 to -127 dBov. dBov is the level, in decibels, relative
to the overload point of the system, i.e. the maximum-amplitude to the overload point of the system, i.e. the highest-intensity
signal that can be handled by the system without clipping. (Note: signal encodable by the payload format. (Note: Representation
Representation relative to the overload point of a system is relative to the overload point of a system is particularly useful for
particularly useful for digital implementations, since one does not digital implementations, since one does not need to know the relative
need to know the relative calibration of the analog circuitry.) For calibration of the analog circuitry.) For example, in the case of
example, in the case of u-law (audio/pcmu) audio [ITU.G.711], the 0 u-law (audio/pcmu) audio [ITU.G.711], the 0 dBov reference would be a
dBov reference would be a square wave with values +/- 8031. (This square wave with values +/- 8031. (This translates to 6.18 dBm0,
translates to 6.18 dBm0, relative to u-law's dBm0 definition in Table relative to u-law's dBm0 definition in Table 6 of G.711.)
6 of G.711.)
The audio level for digital silence, for example for a muted audio The audio level for digital silence, for example for a muted audio
source, MUST be represented as 127 (-127 dBov), regardless of the source, MUST be represented as 127 (-127 dBov), regardless of the
dynamic range of the encoded audio format. dynamic range of the encoded audio format.
The audio level header extension only carries the level of the audio The audio level header extension only carries the level of the audio
in the RTP payload of the packet it is associated with, with no long- in the RTP payload of the packet it is associated with, with no long-
term averaging or smoothing applied. That level is measured as a term averaging or smoothing applied. That level is measured as a
root mean square of all the samples in the measured range. root mean square of all the samples in the measured range.
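As a rough illustration of the rules above, the following sketch maps the samples of one packet to the 0..127 value carried in the extension. It is a sketch under stated assumptions, not the reference implementation: the class name DbovLevelSketch and the method toExtensionValue are hypothetical, samples are assumed to be linear PCM, overload is assumed to be the largest encodable sample magnitude, and the measured range is taken to be offset through offset + length - 1.

```java
/**
 * Illustrative sketch only (hypothetical names): converts the samples of
 * a single packet to the header extension value, where 0 stands for
 * 0 dBov and 127 stands for -127 dBov or digital silence.
 */
final class DbovLevelSketch
{
    private DbovLevelSketch() {}

    static int toExtensionValue(
        int[] samples, int offset, int length, int overload)
    {
        if (length <= 0)
            return 127; // nothing to measure: treated here as silence

        // Root mean square of the samples, normalized so that the
        // overload point corresponds to 1.0.
        double sumOfSquares = 0;
        for (int i = offset; i < offset + length; i++)
        {
            double normalized = (double) samples[i] / overload;
            sumOfSquares += normalized * normalized;
        }
        double rms = Math.sqrt(sumOfSquares / length);

        if (rms <= 0)
            return 127; // digital silence is always reported as 127

        // Level in dBov is at most 0 for signals that do not exceed the
        // overload point; the extension carries its negation.
        double dbov = 20 * Math.log10(rms);
        long magnitude = Math.round(-dbov);

        // Clamp to the range representable in seven bits.
        return (int) Math.max(0, Math.min(127, magnitude));
    }
}
```

With these assumptions, a full-scale square wave yields 0 and a signal at half of full scale yields 6, since 20 * log10(0.5) is approximately -6.02.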
skipping to change at page 9, line 5 skipping to change at page 9, line 5
"sendrecv" offers. "sendrecv" offers.
This specification only defines use of the audio level extensions in This specification only defines use of the audio level extensions in
audio streams. They MUST NOT be advertised with other media types audio streams. They MUST NOT be advertised with other media types
such as video or text for example. such as video or text for example.
The following Figure 4 and Figure 5 show two example offer/answer The following Figure 4 and Figure 5 show two example offer/answer
exchanges between a conferencing client and a focus, and between two exchanges between a conferencing client and a focus, and between two
conference focus entities. conference focus entities.
v=0 SDP Offer:
o=alice 2890844526 2890844526 IN IP6 host.example.com
s=-
c=IN IP6 host.example.com
t=0 0
m=audio 49170 RTP/AVP 0 4
a=rtpmap:0 PCMU/8000
a=rtpmap:4 G723/8000
a=extmap:1/recvonly urn:ietf:params:rtp-hdrext:csrc-audio-level
v=0 v=0
i=A Seminar on the session description protocol o=alice 2890844526 2890844526 IN IP6 host.example.com
o=conf-focus 2890844730 2890844730 IN IP6 focus.example.net s=-
s=- c=IN IP6 host.example.com
c=IN IP6 focus.example.net t=0 0
t=0 0 m=audio 49170 RTP/AVP 0 4
m=audio 52544 RTP/AVP 0 a=rtpmap:0 PCMU/8000
a=rtpmap:0 PCMU/8000 a=rtpmap:4 G723/8000
a=extmap:1/sendonly urn:ietf:params:rtp-hdrext:csrc-audio-level a=extmap:1/recvonly urn:ietf:params:rtp-hdrext:csrc-audio-level
SDP Answer:
v=0
i=A Seminar on the session description protocol
o=conf-focus 2890844730 2890844730 IN IP6 focus.example.net
s=-
c=IN IP6 focus.example.net
t=0 0
m=audio 52544 RTP/AVP 0
a=rtpmap:0 PCMU/8000
a=extmap:1/sendonly urn:ietf:params:rtp-hdrext:csrc-audio-level
A client-initiated example SDP offer/answer exchange negotiating an A client-initiated example SDP offer/answer exchange negotiating an
audio stream with one-way flow of of audio level information. audio stream with one-way flow of audio level information.
Figure 4 Figure 4
v=0 SDP Offer:
i=Un seminaire sur le protocole de description des sessions
o=fr-focus 2890844730 2890844730 IN IP6 focus.fr.example.net
s=-
c=IN IP6 focus.fr.example.net
t=0 0
m=audio 49170 RTP/AVP 0
a=rtpmap:0 PCMU/8000
a=extmap:1/sendrecv urn:ietf:params:rtp-hdrext:csrc-audio-level
v=0 v=0
i=A Seminar on the session description protocol i=Un seminaire sur le protocole de description des sessions
o=us-focus 2890844526 2890844526 IN IP6 focus.us.example.net o=fr-focus 2890844730 2890844730 IN IP6 focus.fr.example.net
s=- s=-
c=IN IP6 focus.us.example.net c=IN IP6 focus.fr.example.net
t=0 0 t=0 0
m=audio 52544 RTP/AVP 0 m=audio 49170 RTP/AVP 0
a=rtpmap:0 PCMU/8000 a=rtpmap:0 PCMU/8000
a=extmap:1/sendrecv urn:ietf:params:rtp-hdrext:csrc-audio-level a=extmap:1/sendrecv urn:ietf:params:rtp-hdrext:csrc-audio-level
SDP Answer:
v=0
i=A Seminar on the session description protocol
o=us-focus 2890844526 2890844526 IN IP6 focus.us.example.net
s=-
c=IN IP6 focus.us.example.net
t=0 0
m=audio 52544 RTP/AVP 0
a=rtpmap:0 PCMU/8000
a=extmap:1/sendrecv urn:ietf:params:rtp-hdrext:csrc-audio-level
An example SDP offer/answer exchange between two conference focus An example SDP offer/answer exchange between two conference focus
entities with mixing capabilities negotiating an audio stream with entities with mixing capabilities negotiating an audio stream with
bidirectional flow of audio level information. bidirectional flow of audio level information.
Figure 5 Figure 5
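To make the role of the "a=extmap" lines in Figure 4 and Figure 5 more concrete, the sketch below reads one such line and checks whether it maps the csrc-audio-level URI. It is only an illustration: the class ExtmapLine and its members are hypothetical, only the identifier, the optional direction and the URI are handled, and any extension attributes following the URI are ignored.

```java
/**
 * Illustrative sketch only (hypothetical names): a minimal reader for
 * lines of the form "a=extmap:<id>[/<direction>] <URI>".
 */
final class ExtmapLine
{
    static final String CSRC_AUDIO_LEVEL_URI
        = "urn:ietf:params:rtp-hdrext:csrc-audio-level";

    final int id;
    final String direction; // null when the line carries no direction
    final String uri;

    private ExtmapLine(int id, String direction, String uri)
    {
        this.id = id;
        this.direction = direction;
        this.uri = uri;
    }

    static ExtmapLine parse(String line)
    {
        final String prefix = "a=extmap:";
        if (!line.startsWith(prefix))
            throw new IllegalArgumentException("not an extmap line");

        // Split "<id>[/<direction>]" from the URI on whitespace.
        String[] parts
            = line.substring(prefix.length()).trim().split("\\s+");
        if (parts.length < 2)
            throw new IllegalArgumentException("missing URI");

        String[] idAndDirection = parts[0].split("/", 2);
        int id = Integer.parseInt(idAndDirection[0]);
        String direction
            = (idAndDirection.length == 2) ? idAndDirection[1] : null;

        return new ExtmapLine(id, direction, parts[1]);
    }

    boolean isCsrcAudioLevel()
    {
        return CSRC_AUDIO_LEVEL_URI.equals(uri);
    }
}
```

Applied to the offer in Figure 4, parse would report identifier 1 with direction "recvonly". The answer in the same figure would report "sendonly" for the same URI.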
6. Security Considerations 6. Security Considerations
1. This document defines a means of attributing audio level to a 1. This document defines a means of attributing audio level to a
particular participant in a conference. An attacker may try to particular participant in a conference. An attacker may try to
modify the content of RTP packets in a way that would make audio modify the content of RTP packets in a way that would make audio
activity from one participant appear as coming from another. activity from one participant appear as coming from another.
2. Furthermore, the fact that audio level values would not be 2. Furthermore, the fact that audio level values would not be
protected even in an SRTP session might be of concern in some protected even in an SRTP session might be of concern in some
cases where the activity of a particular participant in a cases where the activity of a particular participant in a
conference is confidential. Also, as discussed in conference is confidential. Also, as discussed in
[I-D.perkins-avt-srtp-vbr-audio], an attacker might be able to [I-D.ietf-avtcore-srtp-vbr-audio], an attacker might be able to
infer information about the conversation, possibly with phoneme- infer information about the conversation, possibly with phoneme-
level resolution. level resolution.
3. Both of the above are concerns that stem from the design of the 3. Both of the above are concerns that stem from the design of the
RTP protocol itself and they would probably also apply when using RTP protocol itself and they would probably also apply when using
CSRC identifiers the way they were specified in RFC 3550 CSRC identifiers the way they were specified in RFC 3550
[RFC3550]. It is therefore important that according to the needs [RFC3550]. It is therefore important that according to the needs
of a particular scenario, implementors and deployers consider use of a particular scenario, implementors and deployers consider use
of header extension encryption of header extension encryption
[I-D.ietf-avtcore-srtp-encrypted-header-ext] or a lower level [I-D.ietf-avtcore-srtp-encrypted-header-ext] or a lower level
security and authentication mechanism. security and authentication mechanism such as IPsec [RFC4301] for
example.
7. IANA Considerations 7. IANA Considerations
This document defines a new extension URI that, if approved, would This document defines a new extension URI in the RTP Compact Header
need to be added to the RTP Compact Header Extensions sub-registry of Extensions sub-registry of the Real-Time Transport Protocol (RTP)
the Real-Time Transport Protocol (RTP) Parameters registry, according Parameters registry, according to the following data:
to the following data:
Extension URI: urn:ietf:params:rtp-hdrext:csrc-audio-level Extension URI: urn:ietf:params:rtp-hdrext:csrc-audio-level
Description: Mixer-to-client audio level indicators Description: Mixer-to-client audio level indicators
Contact: emcho@jitsi.org Contact: emcho@jitsi.org
Reference: RFC XXXX Reference: RFC XXXX
Note to the RFC-Editor: please replace "RFC XXXX" by the number of Note to the RFC-Editor: please replace "RFC XXXX" by the number of
this RFC. this RFC.
8. Acknowledgments 8. Acknowledgments
skipping to change at page 11, line 37 skipping to change at page 11, line 41
mailing lists. mailing lists.
Jitsi's participation in this specification is funded by the NLnet Jitsi's participation in this specification is funded by the NLnet
Foundation. Foundation.
9. Changes From Earlier Versions 9. Changes From Earlier Versions
Note to the RFC-Editor: please remove this section prior to Note to the RFC-Editor: please remove this section prior to
publication as an RFC. publication as an RFC.
9.1. Changes From Draft -04 9.1. Changes From Draft -05
o Added proper licensing to the code samples. (Brought up by Sean
Turner).
o Added an informative reference to RFC 4301 (IPsec). (Brought up
by Stephen Farrell)
o Briefly explained the meaning of the (S) and (M) symbols in the
introductory UI example. Added "SDP Offer" and "SDP Answer"
subtitles in the SDP examples. Fixed typos. Removed conditional
language in the IANA considerations section (Brought up by Dan
Romascanu, Lionel Morand and others).
o Clarified the meaning of "overload point of the system". (Brought
up by Robert Sparks).
o Corrected sample source code to return the level value relative to
the overload point, as specified, rather than a level relative to
a reference sound pressure level; also to use round-to-nearest
rather than truncate to calculate the integer return value.
(Brought up by Michael Ramalho.)
o Updated reference to [I-D.ietf-avtcore-srtp-vbr-audio].
9.2. Changes From Draft -04
o Fixed problems with missing "s=" attributes and odd RTP port o Fixed problems with missing "s=" attributes and odd RTP port
numbers in the SDP examples. numbers in the SDP examples.
9.2. Changes From Draft -03 9.3. Changes From Draft -03
o Addressed editorial comments made on the mailing list. o Addressed editorial comments made on the mailing list.
9.3. Changes From Draft -02 9.4. Changes From Draft -02
o Removed the no-data use case that allowed sending levels in RTP o Removed the no-data use case that allowed sending levels in RTP
packets. Choosing the right RTP payload type for this use case packets. Choosing the right RTP payload type for this use case
would have incurred complexity without bringing any real value. would have incurred complexity without bringing any real value.
o Merged the "Header Format" and the "Audio level encoding" sections o Merged the "Header Format" and the "Audio level encoding" sections
into a single "Audio Levels" section. into a single "Audio Levels" section.
o Changed encoding related text so that it would cover both the one- o Changed encoding related text so that it would cover both the one-
byte and the two-byte header formats. byte and the two-byte header formats.
o Clarified use of root mean square for dBov calculation o Clarified use of root mean square for dBov calculation
o Added a reference to [I-D.perkins-avt-srtp-vbr-audio] to better o Added a reference to [I-D.ietf-avtcore-srtp-vbr-audio] to better
explain some "Security Considerations" . explain some "Security Considerations" .
o Other minor editorial changes. o Other minor editorial changes.
9.4. Changes From Draft -01 9.5. Changes From Draft -01
o Removed code related the AudioLevelRenderer from "APPENDIX A. o Removed code related the AudioLevelRenderer from "APPENDIX A.
Reference Implementation" as it was considered an implementation Reference Implementation" as it was considered an implementation
matter by the working group. matter by the working group.
o Modified the AudioLevelCalculator in "APPENDIX A. Reference o Modified the AudioLevelCalculator in "APPENDIX A. Reference
Implementation" to take overload as a parameter. Implementation" to take overload as a parameter.
o Clarified non-use of audio levels in video streams o Clarified non-use of audio levels in video streams
o Closed the P.56 open issue. It was agreed on IETF 80 that P.56 is o Closed the P.56 open issue. It was agreed on IETF 80 that P.56 is
mostly about speech levels and the levels transported by the mostly about speech levels and the levels transported by the
extension defined here should also be able to serve as an extension defined here should also be able to serve as an
indication for noise. indication for noise.
o The Open Issues section has been removed as all issues that were o The Open Issues section has been removed as all issues that were
in there are now resolved or clarified. in there are now resolved or clarified.
o Editorial changes for consistency with o Editorial changes for consistency with
[I-D.ietf-avtext-client-to-mixer-audio-level]. [I-D.ietf-avtext-client-to-mixer-audio-level].
9.5. Changes From Draft -00 9.6. Changes From Draft -00
o Added code for sound pressure calculation and measurement in o Added code for sound pressure calculation and measurement in
"APPENDIX A. Reference Implementation". "APPENDIX A. Reference Implementation".
o Changed affiliation for Emil Ivov. o Changed affiliation for Emil Ivov.
o Removed "Appendix: Design choices". o Removed "Appendix: Design choices".
10. References 10. References
10.1. Normative References 10.1. Normative References
skipping to change at page 13, line 10 skipping to change at page 13, line 36
Applications", STD 64, RFC 3550, July 2003. Applications", STD 64, RFC 3550, July 2003.
[RFC5285] Singer, D. and H. Desineni, "A General Mechanism for RTP [RFC5285] Singer, D. and H. Desineni, "A General Mechanism for RTP
Header Extensions", RFC 5285, July 2008. Header Extensions", RFC 5285, July 2008.
10.2. Informative References 10.2. Informative References
[I-D.ietf-avtcore-srtp-encrypted-header-ext] [I-D.ietf-avtcore-srtp-encrypted-header-ext]
Lennox, J., "Encryption of Header Extensions in the Secure Lennox, J., "Encryption of Header Extensions in the Secure
Real-Time Transport Protocol (SRTP)", Real-Time Transport Protocol (SRTP)",
draft-ietf-avtcore-srtp-encrypted-header-ext-00 (work in draft-ietf-avtcore-srtp-encrypted-header-ext-01 (work in
progress), June 2011. progress), October 2011.
[I-D.ietf-avtcore-srtp-vbr-audio]
Perkins, C. and J. Valin, "Guidelines for the use of
Variable Bit Rate Audio with Secure RTP",
draft-ietf-avtcore-srtp-vbr-audio-03 (work in progress),
July 2011.
[I-D.ietf-avtext-client-to-mixer-audio-level] [I-D.ietf-avtext-client-to-mixer-audio-level]
Lennox, J., Ivov, E., and E. Marocco, "A Real-Time Lennox, J., Ivov, E., and E. Marocco, "A Real-Time
Transport Protocol (RTP) Header Extension for Client-to- Transport Protocol (RTP) Header Extension for Client-to-
Mixer Audio Level Indication", Mixer Audio Level Indication",
draft-ietf-avtext-client-to-mixer-audio-level-04 (work in draft-ietf-avtext-client-to-mixer-audio-level-05 (work in
progress), August 2011. progress), September 2011.
[I-D.perkins-avt-srtp-vbr-audio]
Perkins, C. and J. Valin, "Guidelines for the use of
Variable Bit Rate Audio with Secure RTP",
draft-perkins-avt-srtp-vbr-audio-05 (work in progress),
December 2010.
[ITU.G.711] [ITU.G.711]
International Telecommunications Union, "Pulse Code International Telecommunications Union, "Pulse Code
Modulation (PCM) of Voice Frequencies", ITU- Modulation (PCM) of Voice Frequencies", ITU-
T Recommendation G.711, November 1988. T Recommendation G.711, November 1988.
[ITU.P56.1993] [ITU.P56.1993]
International Telecommunications Union, "Objective International Telecommunications Union, "Objective
Measurement of Active Speech Level", ITU-T Recommendation Measurement of Active Speech Level", ITU-T Recommendation
P.56, March 1988. P.56, March 1988.
[RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, [RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
A., Peterson, J., Sparks, R., Handley, M., and E. A., Peterson, J., Sparks, R., Handley, M., and E.
Schooler, "SIP: Session Initiation Protocol", RFC 3261, Schooler, "SIP: Session Initiation Protocol", RFC 3261,
June 2002. June 2002.
[RFC3389] Zopf, R., "Real-time Transport Protocol (RTP) Payload for [RFC3389] Zopf, R., "Real-time Transport Protocol (RTP) Payload for
Comfort Noise (CN)", RFC 3389, September 2002. Comfort Noise (CN)", RFC 3389, September 2002.
[RFC4301] Kent, S. and K. Seo, "Security Architecture for the
Internet Protocol", RFC 4301, December 2005.
[RFC4353] Rosenberg, J., "A Framework for Conferencing with the [RFC4353] Rosenberg, J., "A Framework for Conferencing with the
Session Initiation Protocol (SIP)", RFC 4353, Session Initiation Protocol (SIP)", RFC 4353,
February 2006. February 2006.
[RFC4575] Rosenberg, J., Schulzrinne, H., and O. Levin, "A Session [RFC4575] Rosenberg, J., Schulzrinne, H., and O. Levin, "A Session
Initiation Protocol (SIP) Event Package for Conference Initiation Protocol (SIP) Event Package for Conference
State", RFC 4575, August 2006. State", RFC 4575, August 2006.
Appendix A. Reference Implementation Appendix A. Reference Implementation
skipping to change at page 14, line 21 skipping to change at page 15, line 4
The Java code contains an AudioLevelCalculator class that calculates The Java code contains an AudioLevelCalculator class that calculates
the sound pressure level of a signal with specific samples. It can the sound pressure level of a signal with specific samples. It can
be used in mixers to generate values suitable for the level extension be used in mixers to generate values suitable for the level extension
headers. headers.
The implementation is provided in Java but does not rely on any of The implementation is provided in Java but does not rely on any of
the language specific and can be easily ported to another. the language specific and can be easily ported to another.
A.1. AudioLevelCalculator.java A.1. AudioLevelCalculator.java
/*
Copyright (c) 2011 IETF Trust and the persons identified
as authors of the code. All rights reserved.
Redistribution and use in source and binary forms, with
or without modification, is permitted pursuant to, and subject
to the license terms contained in, the Simplified BSD License
set forth in Section 4.c of the IETF Trust's Legal Provisions
Relating to IETF Documents (http://trustee.ietf.org/license-info).
*/
/** /**
* Calculates the audio level of specific samples of a signal based on * Calculates the audio level of specific samples of a signal relative
* sound pressure level. * to overload.
*/ */
public class AudioLevelCalculator public class AudioLevelCalculator
{ {
/** /**
* Calculates the sound pressure level of a signal with specific * Calculates the audio level of a signal with specific
* <tt>samples</tt>. * <tt>samples</tt>.
* *
* @param samples the samples of the signal to calculate the sound * @param samples the samples of the signal to calculate the audio
* pressure level of. The samples are specified as an <tt>int</tt> * level of. The samples are specified as an <tt>int</tt>
* array starting at <tt>offset</tt>, extending <tt>length</tt> * array starting at <tt>offset</tt>, extending <tt>length</tt>
* number of elements and each <tt>int</tt> element in the specified * number of elements and each <tt>int</tt> element in the specified
* range representing a sample of the signal to calculate the sound * range representing a sample of the signal to calculate the audio
* pressure level of. Though a sample is provided in the form of an * level of. Though a sample is provided in the form of an
* <tt>int</tt> value, the sample size in bits is determined by the * <tt>int</tt> value, the sample size in bits is determined by the
* caller via <tt>overload</tt>. * caller via <tt>overload</tt>.
* *
* @param offset the offset in <tt>samples</tt> at which the samples * @param offset the offset in <tt>samples</tt> at which the samples
* start * start
* *
* @param length the length of the signal specified in * @param length the length of the signal specified in
* <tt>samples<tt> starting at <tt>offset</tt> * <tt>samples<tt> starting at <tt>offset</tt>
* *
* @param overload the overload (point) of <tt>signal</tt>. * @param overload the overload (point) of <tt>signal</tt>.
* For example, <tt>overload</tt> can be {@link Byte#MAX_VALUE} * For example, <tt>overload</tt> can be {@link Byte#MAX_VALUE}
* for 8-bit signed samples or {@link Short#MAX_VALUE} for * for 8-bit signed samples or {@link Short#MAX_VALUE} for
* 16-bit signed samples. * 16-bit signed samples.
* *
* @return the sound pressure level of the specified signal * @return the audio level of the specified signal
*/ */
public static int calculateSoundPressureLevel( public static int calculateAudioLevel(
int[] samples, int offset, int length, int[] samples, int offset, int length,
int overload) int overload)
{ {
/* /*
* Calcuate the root mean square of the signal i.e. the * Calcuate the root mean square of the signal.
* effective sound pressure.
*/ */
double rms = 0; double rms = 0;
for (; offset < length; offset++) for (; offset < length; offset++)
{ {
double sample = samples[offset]; double sample = samples[offset];
sample /= overload; sample /= overload;
rms += sample * sample; rms += sample * sample;
} }
rms = (length == 0) ? 0 : Math.sqrt(rms / length); rms = (length == 0) ? 0 : Math.sqrt(rms / length);
/* /*
* The sound pressure level is a logarithmic measure of the * The audio level is a logarithmic measure of the
* effectivesound pressure of a sound relative to a reference * rms level of an audio sample relative to a reference
* value and is measured in decibels. * value and is measured in decibels.
*/ */
double db; double db;
/* /*
* The minimum sound pressure level which matches the maximum * The minimum audio level permitted.
* of the sound meter.
*/ */
final double MIN_SOUND_PRESSURE_LEVEL = 0; final double MIN_AUDIO_LEVEL = -127;
/* /*
* The maximum sound pressure level which matches the maximum * The maximum audio level permitted.
* of the sound meter.
*/ */
final double MAX_SOUND_PRESSURE_LEVEL final double MAX_AUDIO_LEVEL = 0;
= 127 /* HUMAN TINNITUS (RINGING IN THE EARS) BEGINS */;
if (rms > 0) if (rms > 0)
{ {
/* /*
* The commonly used "zero" reference sound pressure in air * The "zero" reference level is the overload level, which
* is 20 uPa RMS, which is usually considered the threshold * corresponds to 1.0 in this calculation, because the
* of human hearing. * samples are normalized in calculating the RMS.
*/ */
db = 20 * Math.log10(rms);
final double REF_SOUND_PRESSURE = 0.00002;
db = 20 * Math.log10(rms / REF_SOUND_PRESSURE);
/* /*
* Ensure that the calculated level is within the minimum * Ensure that the calculated level is within the minimum
* and maximum sound pressure level. * and maximum range permitted.
*/ */
if (db < MIN_SOUND_PRESSURE_LEVEL) if (db < MIN_AUDIO_LEVEL)
db = MIN_SOUND_PRESSURE_LEVEL; db = MIN_AUDIO_LEVEL;
else if (db > MAX_SOUND_PRESSURE_LEVEL) else if (db > MAX_AUDIO_LEVEL)
db = MAX_SOUND_PRESSURE_LEVEL; db = MAX_AUDIO_LEVEL;
} }
else else
{ {
db = MIN_SOUND_PRESSURE_LEVEL; db = MIN_AUDIO_LEVEL;
} }
return (int) db; return (int)Math.round(db);
} }
} }
AudioLevelCalculator.java AudioLevelCalculator.java
Authors' Addresses Authors' Addresses
Emil Ivov (editor) Emil Ivov (editor)
Jitsi Jitsi
Strasbourg 67000 Strasbourg 67000
 End of changes. 46 change blocks. 
117 lines changed or deleted 153 lines changed or added
