draft-ietf-sipping-transc-3pcc-00.txt   draft-ietf-sipping-transc-3pcc-01.txt 
Internet Engineering Task Force SIP WG Internet Engineering Task Force SIP WG
Internet Draft G. Camarillo Internet Draft G. Camarillo
Ericsson Ericsson
E. Burger E. Burger
SnowShore Networks Brooktrout
H. Schulzrinne H. Schulzrinne
Columbia University Columbia University
A. van Wijk A. van Wijk
Viataal Viataal
draft-ietf-sipping-transc-3pcc-00.txt draft-ietf-sipping-transc-3pcc-01.txt
February 3, 2004 June 7, 2004
Expires: August, 2004 Expires: December, 2004
Transcoding Services Invocation in the Transcoding Services Invocation in the Session Initiation
Session Initiation Protocol Using Third Party Call Control Protocol (SIP) Using Third Party Call Control (3pcc)
STATUS OF THIS MEMO STATUS OF THIS MEMO
This document is an Internet-Draft and is in full conformance with By submitting this Internet-Draft, I certify that any applicable
all provisions of Section 10 of RFC2026. patent or other IPR claims of which I am aware have been disclosed,
and any of which I become aware will be disclosed, in accordance with
RFC 3668.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that other
other groups may also distribute working documents as Internet- groups may also distribute working documents as Internet-Drafts.
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress". material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt http://www.ietf.org/ietf/1id-abstracts.txt.
To view the list Internet-Draft Shadow Directories, see The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
Abstract Abstract
This document describes how to invoke transcoding services using SIP This document describes how to invoke transcoding services using SIP
and third party call control. This way of invocation meets the and third party call control. This way of invocation meets the
requirements for SIP regarding transcoding services invocation to requirements for SIP regarding transcoding services invocation to
support deaf, hard of hearing and speech-impaired individuals. support deaf, hard of hearing and speech-impaired individuals.
Table of Contents Table of Contents
1 Introduction ........................................ 3 1 Introduction ........................................ 3
2 General Overview .................................... 3 2 General Overview .................................... 3
3 Third Party Call Control Flows ...................... 3 3 Third Party Call Control Flows ...................... 3
3.1 Terminology ......................................... 4 3.1 Terminology ......................................... 4
3.2 Callee's Invocation ................................. 4 3.2 Callee's Invocation ................................. 4
3.3 Caller's Invocation ................................. 9 3.3 Caller's Invocation ................................. 9
3.4 Receiving the Original Stream ....................... 9 3.4 Receiving the Original Stream ....................... 9
3.5 Transcoding Services in Parallel .................... 11 3.5 Transcoding Services in Parallel .................... 11
3.6 Transcoding Services in Serial ...................... 15 3.6 Transcoding Services in Serial ...................... 15
4 Security Considerations ............................. 15 4 Security Considerations ............................. 17
5 Authors' Addresses .................................. 15 5 Authors' Addresses .................................. 17
6 Bibliography ........................................ 16 6 Normative References ................................ 17
7 Informative References .............................. 18
1 Introduction 1 Introduction
The framework for transcoding with SIP [1] describes how two SIP UAs The framework for transcoding with SIP [4] describes how two SIP [1]
can discover incompatibilities that prevent them from establishing a UAs (User Agents) can discover incompatibilities that prevent them
session (e.g., lack of support for a common codec or for a common from establishing a session (e.g., lack of support for a common codec
media type). When such incompatibilities are found, the UAs need to or for a common media type). When such incompatibilities are found,
invoke transcoding services to successfully establish the session. the UAs need to invoke transcoding services to successfully establish
3pcc (third party call control) [2] is one way to perform such the session. 3pcc (third party call control) [2] is one way to
invocation. perform such invocation.
2 General Overview 2 General Overview
In the 3pcc model for transcoding invocation, a transcoding server In the 3pcc model for transcoding invocation, a transcoding server
that provides a particular transcoding service (e.g., speech-to-text) that provides a particular transcoding service (e.g., speech-to-text)
is identified by a URI. A UA that wishes to invoke that service sends is identified by a URI. A UA that wishes to invoke that service sends
an INVITE request to that URI establishing a number of media streams. an INVITE request to that URI establishing a number of media streams.
The way the transcoder manipulates and manages the contents of those The way the transcoder manipulates and manages the contents of those
media streams (e.g., the text received over the text stream is media streams (e.g., the text received over the text stream is
transformed into speech and sent over the audio stream) is service transformed into speech and sent over the audio stream) is service
skipping to change at page 4, line 38 skipping to change at page 4, line 38
It contains, among other things, the transport address/es It contains, among other things, the transport address/es
where T wants to receive media from B. where T wants to receive media from B.
SDP TA+TB: A session description generated by T that contains, SDP TA+TB: A session description generated by T that contains,
among other things, the transport address/es where T wants among other things, the transport address/es where T wants
to receive media from A and the transport address/es where to receive media from A and the transport address/es where
T wants to receive media from B. T wants to receive media from B.
3.2 Callee's Invocation 3.2 Callee's Invocation
In this scenario, B receives an INVITE from A, and B decides to In this scenario, B receives an INVITE from A and B decides to
introduce T in the session. Figure 1 shows the call flow for this introduce T in the session. Figure 1 shows the call flow for this
scenario. scenario.
In Figure 1, A can both hear and speak and B is a deaf user with a In Figure 1, A can both hear and speak and B is a deaf user with a
speech impairment. A proposes to establish a session that consists of speech impairment. A proposes to establish a session that consists of
an audio stream (1). B wants to send and receive only text, so it an audio stream (1). B wants to send and receive only text, so it
invokes a transcoding service T that will perform both speech-to-text invokes a transcoding service T that will perform both speech-to-text
and text-to-speech conversions (2). The session descriptions of and text-to-speech conversions (2). The session descriptions of
Figure 1 are partially shown below. Figure 1 are partially shown below.
(1) INVITE SDP A
m=audio 20000 RTP/AVP 0
A T B A T B
| | | | | |
|--------------------(1) INVITE SDP A-------------------->| |--------------------(1) INVITE SDP A-------------------->|
| | | | | |
| |<---(2) INVITE SDP A+B------| | |<---(2) INVITE SDP A+B------|
| | | | | |
| |---(3) 200 OK SDP TA+TB---->| | |---(3) 200 OK SDP TA+TB---->|
| | | | | |
| |<---------(4) ACK-----------| | |<---------(4) ACK-----------|
skipping to change at page 5, line 26 skipping to change at page 5, line 27
| | | | | |
|------------------------(6) ACK------------------------->| |------------------------(6) ACK------------------------->|
| | | | | |
| ************************** | ************************** | | ************************** | ************************** |
|* MEDIA *|* MEDIA *| |* MEDIA *|* MEDIA *|
| ************************** | ************************** | | ************************** | ************************** |
| | | | | |
Figure 1: Callee's invocation of a transcoding service Figure 1: Callee's invocation of a transcoding service
c=IN IP4 A.domain.com (1) INVITE SDP A
m=audio 20000 RTP/AVP 0
c=IN IP4 A.example.com
(2) INVITE SDP A+B (2) INVITE SDP A+B
m=audio 20000 RTP/AVP 0 m=audio 20000 RTP/AVP 0
c=IN IP4 A.domain.com c=IN IP4 A.example.com
m=text 40000 RTP/AVP 96 m=text 40000 RTP/AVP 96
c=IN IP4 B.domain.com c=IN IP4 B.example.com
a=rtpmap:96 t140/1000 a=rtpmap:96 t140/1000
(3) 200 OK SDP TA+TB (3) 200 OK SDP TA+TB
m=audio 30000 RTP/AVP 0 m=audio 30000 RTP/AVP 0
c=IN IP4 T.domain.com c=IN IP4 T.example.com
m=text 30002 RTP/AVP 96 m=text 30002 RTP/AVP 96
c=IN IP4 T.domain.com c=IN IP4 T.example.com
a=rtpmap:96 t140/1000 a=rtpmap:96 t140/1000
(5) 200 OK SDP TA (5) 200 OK SDP TA
m=audio 30000 RTP/AVP 0 m=audio 30000 RTP/AVP 0
c=IN IP4 T.domain.com c=IN IP4 T.example.com
Four media streams (i.e., two bi-directional streams) have been Four media streams (i.e., two bi-directional streams) have been
established at this point: established at this point:
1. Audio from A to T.domain.com:30000 1. Audio from A to T.example.com:30000
2. Text from T to B.domain.com:40000 2. Text from T to B.example.com:40000
3. Text from B to T.domain.com:30002 3. Text from B to T.example.com:30002
4. Audio from T to A.domain.com:20000 4. Audio from T to A.example.com:20000
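The four streams listed above follow mechanically from the session descriptions (2) and (3) of Figure 1. The following Python sketch illustrates that derivation; it is not part of either draft. It assumes the m= lines of the answer appear in the same order as in the offer, and the helper names (parse_media, media_streams, owners) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Media:
    kind: str  # media type taken from the m= line, e.g. "audio" or "text"
    port: int  # port taken from the m= line
    host: str  # address taken from the c= line that follows the m= line


def parse_media(sdp):
    """Parse only the m= and c= lines of a partial session description."""
    media = []
    for line in sdp.strip().splitlines():
        line = line.strip()
        if line.startswith("m="):
            kind, port = line[2:].split()[:2]
            media.append(Media(kind, int(port), ""))
        elif line.startswith("c=") and media:
            media[-1].host = line.split()[-1]
    return media


def media_streams(offer, answer, owners):
    """List the unidirectional streams implied by an offer/answer pair.

    owners[i] names the endpoint (A or B) behind the i-th m= line of the
    offer that B sends to T; the answer carries T's addresses.
    """
    result = []
    for who, o, a in zip(owners, parse_media(offer), parse_media(answer)):
        result.append(f"{o.kind} from {who} to {a.host}:{a.port}")
        result.append(f"{o.kind} from T to {o.host}:{o.port}")
    return result


# (2) INVITE SDP A+B, as sent by B to T
SDP_A_PLUS_B = """
m=audio 20000 RTP/AVP 0
c=IN IP4 A.example.com
m=text 40000 RTP/AVP 96
c=IN IP4 B.example.com
a=rtpmap:96 t140/1000
"""

# (3) 200 OK SDP TA+TB, as returned by T
SDP_TA_PLUS_TB = """
m=audio 30000 RTP/AVP 0
c=IN IP4 T.example.com
m=text 30002 RTP/AVP 96
c=IN IP4 T.example.com
a=rtpmap:96 t140/1000
"""

established = media_streams(SDP_A_PLUS_B, SDP_TA_PLUS_TB, ["A", "B"])
for stream in established:
    print(stream)
```

The sketch lists the same four streams, grouped per m= line rather than in the order used in the list above.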
When either A or B decides to terminate the session, B will send a BYE When either A or B decides to terminate the session, they send a BYE
to T indicating that the session is over. to T indicating that the session is over.
If the first INVITE (1) received by B is empty (no session If the first INVITE (1) received by B is empty (no session
description), the call flow is slightly different. Figure 2 shows the description), the call flow is slightly different. Figure 2 shows the
messages involved. messages involved.
B may have different reasons for invoking T before knowing A's B may have different reasons for invoking T before knowing A's
session description. B may want to hide its capabilities, and session description. B may want to hide its capabilities, and
therefore it wants to return a session description with all the therefore it wants to return a session description with all the
codecs B supports plus all the codecs T supports. Or T may provide codecs B supports plus all the codecs T supports. Or T may provide
skipping to change at page 6, line 42 skipping to change at page 6, line 42
conversation, regardless of whether or not transcoding is needed. conversation, regardless of whether or not transcoding is needed.
This scenario (Figure 2) is a bit more complex than the previous one. This scenario (Figure 2) is a bit more complex than the previous one.
In INVITE (2), B still does not have SDP A, so it cannot provide T In INVITE (2), B still does not have SDP A, so it cannot provide T
with that information. When B finally receives SDP A in (6), it has with that information. When B finally receives SDP A in (6), it has
to send it to T. B sends an empty INVITE to T (7) and gets a 200 OK to send it to T. B sends an empty INVITE to T (7) and gets a 200 OK
with SDP TA+TB (8). In general, this SDP TA+TB can be different than with SDP TA+TB (8). In general, this SDP TA+TB can be different than
the one that was sent in (3). That is why B needs to send the updated the one that was sent in (3). That is why B needs to send the updated
SDP TA to A in (9). A then sends a possibly updated SDP A (10) and B SDP TA to A in (9). A then sends a possibly updated SDP A (10) and B
sends it to T in (12). On the other hand, if T happens to return the sends it to T in (12). On the other hand, if T happens to return the
same SDP TA+TB in (8) as in (3), B can skip messages (9), (10) and same SDP TA+TB in (8) as in (3), B can skip messages (9), (10), and
(11). So, implementors of transcoding services are encouraged to (11). So, implementors of transcoding services are encouraged to
return the same session description in (8) as in (3) in this type of return the same session description in (8) as in (3) in this type of
scenario. The session descriptions of this flow are shown below: scenario. The session descriptions of this flow are shown below:
(2) INVITE SDP A+B
A T B A T B
| | | | | |
|----------------------(1) INVITE------------------------>| |----------------------(1) INVITE------------------------>|
| | | | | |
| |<-----(2) INVITE SDP B------| | |<-----(2) INVITE SDP B------|
| | | | | |
| |---(3) 200 OK SDP TA+TB---->| | |---(3) 200 OK SDP TA+TB---->|
| | | | | |
| |<---------(4) ACK-----------| | |<---------(4) ACK-----------|
skipping to change at page 7, line 37 skipping to change at page 7, line 38
|<-----------------------(11) ACK-------------------------| |<-----------------------(11) ACK-------------------------|
| | | | | |
| |<-----(12) ACK SDP A+B------| | |<-----(12) ACK SDP A+B------|
| | | | | |
| ************************** | ************************** | | ************************** | ************************** |
|* MEDIA *|* MEDIA *| |* MEDIA *|* MEDIA *|
| ************************** | ************************** | | ************************** | ************************** |
Figure 2: Callee's invocation after initial INVITE without SDP Figure 2: Callee's invocation after initial INVITE without SDP
(2) INVITE SDP A+B
m=audio 20000 RTP/AVP 0 m=audio 20000 RTP/AVP 0
c=IN IP4 0.0.0.0 c=IN IP4 0.0.0.0
m=text 40000 RTP/AVP 96 m=text 40000 RTP/AVP 96
c=IN IP4 B.domain.com c=IN IP4 B.example.com
a=rtpmap:96 t140/1000 a=rtpmap:96 t140/1000
(3) 200 OK SDP TA+TB (3) 200 OK SDP TA+TB
m=audio 30000 RTP/AVP 0 m=audio 30000 RTP/AVP 0
c=IN IP4 T.domain.com c=IN IP4 T.example.com
m=text 30002 RTP/AVP 96 m=text 30002 RTP/AVP 96
c=IN IP4 T.domain.com c=IN IP4 T.example.com
a=rtpmap:96 t140/1000 a=rtpmap:96 t140/1000
(5) 200 OK SDP TA (5) 200 OK SDP TA
m=audio 30000 RTP/AVP 0 m=audio 30000 RTP/AVP 0
c=IN IP4 T.domain.com c=IN IP4 T.example.com
(6) ACK SDP A (6) ACK SDP A
m=audio 20000 RTP/AVP 0 m=audio 20000 RTP/AVP 0
c=IN IP4 A.domain.com c=IN IP4 A.example.com
(8) 200 OK SDP TA+TB (8) 200 OK SDP TA+TB
m=audio 30004 RTP/AVP 0 m=audio 30004 RTP/AVP 0
c=IN IP4 T.domain.com c=IN IP4 T.example.com
m=text 30006 RTP/AVP 96 m=text 30006 RTP/AVP 96
c=IN IP4 T.domain.com c=IN IP4 T.example.com
a=rtpmap:96 t140/1000 a=rtpmap:96 t140/1000
(9) INVITE SDP TA (9) INVITE SDP TA
m=audio 30004 RTP/AVP 0 m=audio 30004 RTP/AVP 0
c=IN IP4 T.domain.com c=IN IP4 T.example.com
(10) 200 OK SDP A (10) 200 OK SDP A
m=audio 20002 RTP/AVP 0 m=audio 20002 RTP/AVP 0
c=IN IP4 A.domain.com c=IN IP4 A.example.com
(12) ACK SDP A+B (12) ACK SDP A+B
m=audio 20002 RTP/AVP 0 m=audio 20002 RTP/AVP 0
c=IN IP4 A.domain.com c=IN IP4 A.example.com
m=text 40000 RTP/AVP 96 m=text 40000 RTP/AVP 96
c=IN IP4 B.domain.com c=IN IP4 B.example.com
a=rtpmap:96 t140/1000 a=rtpmap:96 t140/1000
Four media streams (i.e., two bi-directional streams) have been Four media streams (i.e., two bi-directional streams) have been
established at this point: established at this point:
1. Audio from A to T.domain.com:30004 1. Audio from A to T.example.com:30004
2. Text from T to B.domain.com:40000 2. Text from T to B.example.com:40000
3. Text from B to T.domain.com:30006 3. Text from B to T.example.com:30006
4. Audio from T to A.domain.com:20002 4. Audio from T to A.example.com:20002
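The shortcut described for Figure 2 (B can skip messages (9), (10), and (11) when T returns the same SDP TA+TB in (8) as in (3)) amounts to a comparison of two session descriptions. The following Python sketch shows one way B might make that decision. It is an illustration only: it compares the partial descriptions shown in this document line by line, whereas a real implementation would have to decide which SDP fields matter. The function names are hypothetical.

```python
def normalize(sdp):
    """Return the non-empty lines of a session description, stripped."""
    return [line.strip() for line in sdp.strip().splitlines() if line.strip()]


def needs_reinvite(sdp_from_3, sdp_from_8):
    """True if B must send the re-INVITE (9) to A.

    The re-INVITE is needed only when T's second answer (8) differs
    from its first answer (3).
    """
    return normalize(sdp_from_3) != normalize(sdp_from_8)


# (3) 200 OK SDP TA+TB from Figure 2
SDP_3 = """
m=audio 30000 RTP/AVP 0
c=IN IP4 T.example.com
m=text 30002 RTP/AVP 96
c=IN IP4 T.example.com
a=rtpmap:96 t140/1000
"""

# (8) 200 OK SDP TA+TB from Figure 2: T has allocated new ports
SDP_8 = """
m=audio 30004 RTP/AVP 0
c=IN IP4 T.example.com
m=text 30006 RTP/AVP 96
c=IN IP4 T.example.com
a=rtpmap:96 t140/1000
"""

reinvite_for_figure_2 = needs_reinvite(SDP_3, SDP_8)
reinvite_if_unchanged = needs_reinvite(SDP_3, SDP_3)
print(reinvite_for_figure_2, reinvite_if_unchanged)
```

In the flow of Figure 2 the ports change between (3) and (8), so messages (9), (10), and (11) are required. Had T returned the same description, B could have gone straight to (12).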
3.3 Caller's Invocation 3.3 Caller's Invocation
In this scenario, A wishes to establish a session with B using a In this scenario, A wishes to establish a session with B using a
transcoding service. A uses 3pcc to set up the session between T and transcoding service. A uses 3pcc to set up the session between T and
B. The call flow we provide here is slightly different than the ones B. The call flow we provide here is slightly different than the ones
in [2]. In [2], the controller establishes a session between two user in [2]. In [2], the controller establishes a session between two user
agents, which are the ones deciding the characteristics of the agents, which are the ones deciding the characteristics of the
streams. Here, A wants to establish a session between T and B, but A streams. Here, A wants to establish a session between T and B, but A
wants to decide how many and which types of streams are established. wants to decide how many and which types of streams are established.
That is why A sends its session description in the first INVITE (1) That is why A sends its session description in the first INVITE (1)
to T, as opposed to the media-less initial INVITE recommended by [2]. to T, as opposed to the media-less initial INVITE recommended by [2].
Figure 3 shows the call flow for this scenario. Figure 3 shows the call flow for this scenario.
We do not include the session descriptions of this flow, since they We do not include the session descriptions of this flow, since they
are very similar to the ones in Figure 2. In this flow, if T returns are very similar to the ones in Figure 2. In this flow, if T returns
the same SDP TA+TB in (8) as in (2), messages (9), (10) and (11) can the same SDP TA+TB in (8) as in (2), messages (9), (10), and (11) can
be skipped. be skipped.
3.4 Receiving the Original Stream 3.4 Receiving the Original Stream
Sometimes, as pointed out in the requirements for SIP in support of Sometimes, as pointed out in the requirements for SIP in support of
deaf, hard of hearing and speech-impaired individuals [3], a user deaf, hard of hearing, and speech-impaired individuals [5], a user
wants to receive both the original stream (e.g., audio) and the wants to receive both the original stream (e.g., audio) and the
transcoded stream (e.g., the output of the speech-to-text transcoded stream (e.g., the output of the speech-to-text
conversion). There are various possible solutions for this problem. conversion). There are various possible solutions for this problem.
One solution consists of using the SDP group attribute with FID One solution consists of using the SDP group attribute with FID
semantics [4]. FID allows requesting that a stream is sent to two semantics [3]. FID allows requesting that a stream is sent to two
different transport addresses in parallel, as shown below: different transport addresses in parallel, as shown below:
a=group:FID 1 2
A T B A T B
| | | | | |
|-------(1) INVITE SDP A---->| | |-------(1) INVITE SDP A---->| |
| | | | | |
|<----(2) 200 OK SDP TA+TB---| | |<----(2) 200 OK SDP TA+TB---| |
| | | | | |
|----------(3) ACK---------->| | |----------(3) ACK---------->| |
| | | | | |
|--------------------(4) INVITE SDP TA------------------->| |--------------------(4) INVITE SDP TA------------------->|
skipping to change at page 10, line 38 skipping to change at page 10, line 39
| | | | | |
|------(12) ACK SDP A+B----->| | |------(12) ACK SDP A+B----->| |
| | | | | |
| ************************** | ************************** | | ************************** | ************************** |
|* MEDIA *|* MEDIA *| |* MEDIA *|* MEDIA *|
| ************************** | ************************** | | ************************** | ************************** |
| | | | | |
Figure 3: Caller's invocation of a transcoding service Figure 3: Caller's invocation of a transcoding service
a=group:FID 1 2
m=audio 20000 RTP/AVP 0 m=audio 20000 RTP/AVP 0
c=IN IP4 A.domain.com c=IN IP4 A.example.com
a=mid:1 a=mid:1
m=audio 30000 RTP/AVP 0 m=audio 30000 RTP/AVP 0
c=IN IP4 T.domain.com c=IN IP4 T.example.com
a=mid:2 a=mid:2
The problem with this solution is that the majority of the SIP user The problem with this solution is that the majority of the SIP user
agents do not support FID. Moreover, only a small fraction of the few agents do not support FID. Moreover, only a small fraction of the few
UAs that do support FID can send simultaneous copies of the UAs that do support FID can send simultaneous copies of the
same media stream. In addition, FID forces both same media stream. In addition, FID forces both
copies of the stream to use the same codec. copies of the stream to use the same codec.
So, we recommend that T (instead of a user agent) replicates the So, we recommend that T (instead of one of the user agents) replicates
media stream. The transcoder T receiving the following session the media stream. The transcoder T receiving the following session
description performs speech-to-text and text-to-speech conversions description performs speech-to-text and text-to-speech conversions
between the first audio stream and the text stream. In addition, T between the first audio stream and the text stream. In addition, T
copies the first audio stream to the second audio stream and sends it copies the first audio stream to the second audio stream and sends it
to A. to A.
m=audio 40000 RTP/AVP 0 m=audio 40000 RTP/AVP 0
c=IN IP4 B.domain.com c=IN IP4 B.example.com
m=audio 20000 RTP/AVP 0 m=audio 20000 RTP/AVP 0
c=IN IP4 A.domain.com c=IN IP4 A.example.com
a=recvonly a=recvonly
m=text 20002 RTP/AVP 96 m=text 20002 RTP/AVP 96
c=IN IP4 A.domain.com c=IN IP4 A.example.com
a=rtpmap:96 t140/1000 a=rtpmap:96 t140/1000
3.5 Transcoding Services in Parallel 3.5 Transcoding Services in Parallel
Transcoding services sometimes consist of human relays (e.g., a Transcoding services sometimes consist of human relays (e.g., a
person performing speech-to-text and text-to-speech conversions for a person performing speech-to-text and text-to-speech conversions for a
session). If the same person is involved in both conversions (i.e., session). If the same person is involved in both conversions (i.e.,
from A to B and from B to A), he or she has access to all the from A to B and from B to A), he or she has access to all the
conversation. In order to provide some degree of privacy, sometimes conversation. In order to provide some degree of privacy, sometimes
two different persons are allocated to do the job (i.e., one person two different persons are allocated to do the job (i.e., one person
skipping to change at page 11, line 44 skipping to change at page 11, line 44
text to synthetic speech (text-to-speech) and a different machine text to synthetic speech (text-to-speech) and a different machine
performs voice recognition (speech-to-text). performs voice recognition (speech-to-text).
The scenario just described involves four different sessions: A-T1, The scenario just described involves four different sessions: A-T1,
T1-B, B-T2 and T2-A. Figure 4 shows the call flow where A invokes T1 T1-B, B-T2 and T2-A. Figure 4 shows the call flow where A invokes T1
and T2. and T2.
(1) INVITE SDP AT1 (1) INVITE SDP AT1
m=text 20000 RTP/AVP 96 m=text 20000 RTP/AVP 96
c=IN IP4 A.domain.com c=IN IP4 A.example.com
a=rtpmap:96 t140/1000 a=rtpmap:96 t140/1000
a=sendonly a=sendonly
m=audio 20000 RTP/AVP 0 m=audio 20000 RTP/AVP 0
c=IN IP4 0.0.0.0 c=IN IP4 0.0.0.0
a=recvonly a=recvonly
(2) INVITE SDP AT2 (2) INVITE SDP AT2
m=text 20002 RTP/AVP 96 m=text 20002 RTP/AVP 96
c=IN IP4 A.domain.com c=IN IP4 A.example.com
a=rtpmap:96 t140/1000 a=rtpmap:96 t140/1000
a=recvonly a=recvonly
m=audio 20000 RTP/AVP 0 m=audio 20000 RTP/AVP 0
c=IN IP4 0.0.0.0 c=IN IP4 0.0.0.0
a=sendonly a=sendonly
(3) 200 OK SDP T1A+T1B (3) 200 OK SDP T1A+T1B
m=text 30000 RTP/AVP 96 m=text 30000 RTP/AVP 96
c=IN IP4 T1.domain.com c=IN IP4 T1.example.com
a=rtpmap:96 t140/1000 a=rtpmap:96 t140/1000
a=recvonly a=recvonly
m=audio 30002 RTP/AVP 0 m=audio 30002 RTP/AVP 0
c=IN IP4 T1.domain.com c=IN IP4 T1.example.com
a=sendonly a=sendonly
(5) 200 OK SDP T2A+T2B (5) 200 OK SDP T2A+T2B
m=text 40000 RTP/AVP 96 m=text 40000 RTP/AVP 96
c=IN IP4 T2.domain.com c=IN IP4 T2.example.com
a=rtpmap:96 t140/1000 a=rtpmap:96 t140/1000
a=sendonly a=sendonly
m=audio 40002 RTP/AVP 0 m=audio 40002 RTP/AVP 0
c=IN IP4 T2.domain.com c=IN IP4 T2.example.com
a=recvonly a=recvonly
(7) INVITE SDP T1B+T2B (7) INVITE SDP T1B+T2B
m=audio 30002 RTP/AVP 0 m=audio 30002 RTP/AVP 0
c=IN IP4 T1.domain.com c=IN IP4 T1.example.com
a=sendonly a=sendonly
m=audio 40002 RTP/AVP 0 m=audio 40002 RTP/AVP 0
c=IN IP4 T2.domain.com c=IN IP4 T2.example.com
a=recvonly a=recvonly
A T1 T2 B A T1 T2 B
| | | | | | | |
|----(1) INVITE SDP AT1--->| | | |----(1) INVITE SDP AT1--->| | |
| | | | | | | |
|----------------(2) INVITE SDP AT2-------------->| | |----------------(2) INVITE SDP AT2-------------->| |
| | | | | | | |
|<-(3) 200 OK SDP T1A+T1B--| | | |<-(3) 200 OK SDP T1A+T1B--| | |
| | | | | | | |
skipping to change at page 14, line 7 skipping to change at page 14, line 7
| | | | | | | |
| *********************************************** *********** | *********************************************** ***********
|* MEDIA *|* MEDIA *| |* MEDIA *|* MEDIA *|
| *********************************************** | *********** | | *********************************************** | *********** |
| | | | | | | |
Figure 4: Transcoding services in parallel Figure 4: Transcoding services in parallel
(8) 200 OK SDP BT1+BT2 (8) 200 OK SDP BT1+BT2
m=audio 50000 RTP/AVP 0 m=audio 50000 RTP/AVP 0
c=IN IP4 B.domain.com c=IN IP4 B.example.com
a=recvonly a=recvonly
m=audio 50002 RTP/AVP 0 m=audio 50002 RTP/AVP 0
c=IN IP4 B.domain.com c=IN IP4 B.example.com
a=sendonly a=sendonly
(11) 200 OK SDP T1A+T1B (11) 200 OK SDP T1A+T1B
m=text 30000 RTP/AVP 96 m=text 30000 RTP/AVP 96
c=IN IP4 T1.domain.com c=IN IP4 T1.example.com
a=rtpmap:96 t140/1000 a=rtpmap:96 t140/1000
a=recvonly a=recvonly
m=audio 30002 RTP/AVP 0 m=audio 30002 RTP/AVP 0
c=IN IP4 T1.domain.com c=IN IP4 T1.example.com
a=sendonly a=sendonly
(12) 200 OK SDP T2A+T2B (12) 200 OK SDP T2A+T2B
m=text 40000 RTP/AVP 96 m=text 40000 RTP/AVP 96
c=IN IP4 T2.domain.com c=IN IP4 T2.example.com
a=rtpmap:96 t140/1000 a=rtpmap:96 t140/1000
a=sendonly a=sendonly
m=audio 40002 RTP/AVP 0 m=audio 40002 RTP/AVP 0
c=IN IP4 T2.domain.com c=IN IP4 T2.example.com
a=recvonly a=recvonly
Since T1 has returned the same SDP in (11) as in (3) and T2 has Since T1 has returned the same SDP in (11) as in (3) and T2 has
returned the same SDP in (12) as in (5), messages (13), (14) and (15) returned the same SDP in (12) as in (5), messages (13), (14) and (15)
can be skipped. can be skipped.
(16) ACK SDP AT1+BT1 (16) ACK SDP AT1+BT1
m=text 20000 RTP/AVP 96 m=text 20000 RTP/AVP 96
c=IN IP4 A.domain.com c=IN IP4 A.example.com
a=rtpmap:96 t140/1000 a=rtpmap:96 t140/1000
a=sendonly a=sendonly
m=audio 50000 RTP/AVP 0 m=audio 50000 RTP/AVP 0
c=IN IP4 B.domain.com c=IN IP4 B.example.com
a=recvonly a=recvonly
(17) ACK SDP AT2+BT2 (17) ACK SDP AT2+BT2
m=text 20002 RTP/AVP 96 m=text 20002 RTP/AVP 96
c=IN IP4 A.domain.com c=IN IP4 A.example.com
a=rtpmap:96 t140/1000 a=rtpmap:96 t140/1000
a=recvonly a=recvonly
m=audio 50002 RTP/AVP 0 m=audio 50002 RTP/AVP 0
c=IN IP4 B.domain.com c=IN IP4 B.example.com
a=sendonly a=sendonly
Four media streams have been established at this point: Four media streams have been established at this point:
1. Text from A to T1.domain.com:30000 1. Text from A to T1.example.com:30000
2. Audio from T1 to B.domain.com:50000 2. Audio from T1 to B.example.com:50000
3. Audio from B to T2.domain.com:40002 3. Audio from B to T2.example.com:40002
4. Text from T2 to A.domain.com:20002 4. Text from T2 to A.example.com:20002
Note that B, the user agent server, needs to support two media Note that B, the user agent server, needs to support two media
streams: one sendonly and the other recvonly. At present, some user streams: one sendonly and the other recvonly. At present, some user
agents support a single sendrecv media stream but do agents support a single sendrecv media stream but do
not support a different media line per direction. Implementers are not support a different media line per direction. Implementers are
encouraged to build support for this feature. encouraged to build support for this feature.
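In the session descriptions of Figure 4, every media line offered as sendonly is answered as recvonly, and vice versa. The following Python sketch checks that pairing for (7) and (8). It is a deliberately strict check written for this document's examples, not a general offer/answer validator, since the offer/answer model permits more combinations than the ones tested here. The function names are hypothetical.

```python
# Direction expected in the answer for each direction in the offer
# (strict pairing, sufficient for the examples in Figure 4).
COMPLEMENT = {
    "sendonly": "recvonly",
    "recvonly": "sendonly",
    "sendrecv": "sendrecv",
    "inactive": "inactive",
}


def directions(sdp):
    """Return the direction of each m= section (sendrecv when absent)."""
    result = []
    for line in sdp.strip().splitlines():
        line = line.strip()
        if line.startswith("m="):
            result.append("sendrecv")
        elif line.startswith("a=") and line[2:] in COMPLEMENT and result:
            result[-1] = line[2:]
    return result


def complementary(offer, answer):
    """True if each answered direction mirrors the offered one."""
    offered, answered = directions(offer), directions(answer)
    return len(offered) == len(answered) and all(
        COMPLEMENT[o] == a for o, a in zip(offered, answered)
    )


# (7) INVITE SDP T1B+T2B, sent by A to B
SDP_7 = """
m=audio 30002 RTP/AVP 0
c=IN IP4 T1.example.com
a=sendonly
m=audio 40002 RTP/AVP 0
c=IN IP4 T2.example.com
a=recvonly
"""

# (8) 200 OK SDP BT1+BT2, returned by B
SDP_8 = """
m=audio 50000 RTP/AVP 0
c=IN IP4 B.example.com
a=recvonly
m=audio 50002 RTP/AVP 0
c=IN IP4 B.example.com
a=sendonly
"""

b_answer_ok = complementary(SDP_7, SDP_8)
print(directions(SDP_7), directions(SDP_8), b_answer_ok)
```

A user agent that can only handle a single sendrecv media line would not be able to produce an answer such as (8), which is the limitation noted above.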
3.6 Transcoding Services in Serial 3.6 Transcoding Services in Serial
In a distributed environment, a complex transcoding service (e.g., In a distributed environment, a complex transcoding service (e.g.,
English text to Spanish speech) is often provided by several servers. English text to Spanish speech) is often provided by several servers.
For example, one server performs English text to Spanish text For example, one server performs English text to Spanish text
translation, and its output is fed into a server that performs translation, and its output is fed into a server that performs
text-to-speech conversion. The flow in Figure 5 shows how A invokes text-to-speech conversion. The flow in Figure 5 shows how A invokes
T1 and T2. T1 and T2.
4 Security Considerations
This document describes how to use third party call control to invoke
transcoding services. It does not introduce new security
considerations besides the ones discussed in [2].
5 Authors' Addresses
Gonzalo Camarillo
Ericsson
Advanced Signalling Research Lab.
FIN-02420 Jorvas
Finland
electronic mail: Gonzalo.Camarillo@ericsson.com
Eric W. Burger
SnowShore Networks, Inc.
Chelmsford, MA
USA
electronic mail: eburger@snowshore.com
Henning Schulzrinne
Dept. of Computer Science
Columbia University 1214 Amsterdam Avenue, MC 0401
New York, NY 10027
USA
electronic mail: schulzrinne@cs.columbia.edu
Arnoud van Wijk
Viataal
Research & Development
Afdeling RDS
Theerestraat 42
5271 GD Sint-Michielsgestel
The Netherlands
electronic mail: a.vwijk@viataal.nl
6 Bibliography
[1] G. Camarillo, "Framework for transcoding with the session
initiation protocol," Internet Draft draft-camarillo-sipping-transc-
framework-00, Internet Engineering Task Force, Aug. 2003. Work in
progress.
[2] J. Rosenberg, J. Peterson, H. Schulzrinne, and G. Camarillo,
"Best current practices for third party call control in the session
initiation protocol," Internet Draft draft-ietf-sipping-3pcc-06,
Internet Engineering Task Force, Jan. 2004. Work in progress.
[3] N. Charlton, M. Gasson, G. Gybels, M. Spanner, and A. van Wijk,
"User requirements for the session initiation protocol (SIP) in
support of deaf, hard of hearing and speech-impaired individuals,"
RFC 3351, Internet Engineering Task Force, Aug. 2002.
[4] G. Camarillo, J. Holler, G. Eriksson, and H. Schulzrinne,
"Grouping of m lines in SDP," internet draft, Internet Engineering
Task Force, Feb. 2002. Work in progress.
A                           T1                    T2            B
|                           |                     |             |
|----(1) INVITE SDP A-----> |                     |             |
|                           |                     |             |
|<-(2) 200 OK SDP T1A+T1T2- |                     |             |
|                           |                     |             |
|----------(3) ACK--------> |                     |             |
|                           |                     |             |
|-----------(4) INVITE SDP T1T2------------------>|             |
skipping to change at page 17, line 51 skipping to change at page 17, line 5
|                           |                     |             |
|----------------------------(18) ACK-------------------------->|
|                           |                     |             |
| ************************* | *******************   *********** |
|*          MEDIA          *|*       MEDIA       *|*   MEDIA   *|
| ************************* | ******************* | *********** |
|                           |                     |             |
Figure 5: Transcoding services in serial
4 Security Considerations
This document describes how to use third party call control to invoke
transcoding services. It does not introduce new security
considerations besides the ones discussed in [2].
5 Authors' Addresses
Gonzalo Camarillo
Ericsson
Advanced Signalling Research Lab.
FIN-02420 Jorvas
Finland
electronic mail: Gonzalo.Camarillo@ericsson.com
Eric Burger
Brooktrout Technology, Inc.
18 Keewaydin Way
Salem, NH 03079
USA
electronic mail: eburger@ieee.org
Henning Schulzrinne
Dept. of Computer Science
Columbia University 1214 Amsterdam Avenue, MC 0401
New York, NY 10027
USA
electronic mail: schulzrinne@cs.columbia.edu
Arnoud van Wijk
Viataal
Research & Development
Afdeling RDS
Theerestraat 42
5271 GD Sint-Michielsgestel
The Netherlands
electronic mail: a.vwijk@viataal.nl
6 Normative References
[1] J. Rosenberg, H. Schulzrinne, G. Camarillo, A. R. Johnston, J.
Peterson, R. Sparks, M. Handley, and E. Schooler, "SIP: session
initiation protocol," RFC 3261, Internet Engineering Task Force, June
2002.
[2] J. Rosenberg, J. Peterson, H. Schulzrinne, and G. Camarillo,
"Best current practices for third party call control (3pcc) in the
session initiation protocol (SIP)," RFC 3725, Internet Engineering
Task Force, Apr. 2004.
[3] G. Camarillo, G. Eriksson, J. Holler, and H. Schulzrinne,
"Grouping of media lines in the session description protocol (SDP),"
RFC 3388, Internet Engineering Task Force, Dec. 2002.
7 Informative References
[4] G. Camarillo, "Framework for transcoding with the session
initiation protocol," Internet Draft
draft-camarillo-sipping-transc-framework-00, Internet Engineering
Task Force, Aug. 2003. Work in progress.
[5] N. Charlton, M. Gasson, G. Gybels, M. Spanner, and A. van Wijk,
"User requirements for the session initiation protocol (SIP) in
support of deaf, hard of hearing and speech-impaired individuals,"
RFC 3351, Internet Engineering Task Force, Aug. 2002.
Intellectual Property Statement
draft-ietf-sipping-transc-3pcc-00.txt:
The IETF takes no position regarding the validity or scope of any
intellectual property or other rights that might be claimed to
pertain to the implementation or use of the technology described in
[...] has made any effort to identify any such rights. Information on
the IETF's procedures with respect to rights in standards-track and
standards-related documentation can be found in BCP-11. Copies of
claims of rights made available for publication and any assurances of
licenses to be made available, or the result of an attempt made to
obtain a general license or permission for the use of such
proprietary rights by implementors or users of this specification can
be obtained from the IETF Secretariat.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights which may cover technology that may be required to practice
this standard. Please address the information to the IETF Executive
Director.
Full Copyright Statement
Copyright (c) The Internet Society (2004). All Rights Reserved.
This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.
The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.
This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
draft-ietf-sipping-transc-3pcc-01.txt:
The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the IETF's procedures with respect to rights in IETF Documents can
be found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.
Disclaimer of Validity
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Copyright Statement
Copyright (C) The Internet Society (2004). This document is subject
to the rights, licenses and restrictions contained in BCP 78, and
except as set forth therein, the authors retain all their rights.
Acknowledgment
Funding for the RFC Editor function is currently provided by the
Internet Society.