draft-ietf-sipping-cc-framework-10.txt   draft-ietf-sipping-cc-framework-11.txt 
SIPPING WG R. Mahy SIPPING WG R. Mahy
Internet-Draft Plantronics Internet-Draft Plantronics
Intended status: Informational R. Sparks Intended status: Informational R. Sparks
Expires: October 18, 2008 Estacado Systems Expires: September 6, 2009 Tekelek
J. Rosenberg J. Rosenberg
Cisco Systems Cisco Systems
D. Petrie D. Petrie
SIP EZ SIP EZ
A. Johnston, Ed. A. Johnston, Ed.
Avaya Avaya
April 16, 2008 March 5, 2009
A Call Control and Multi-party usage framework for the Session A Call Control and Multi-party usage framework for the Session
Initiation Protocol (SIP) Initiation Protocol (SIP)
draft-ietf-sipping-cc-framework-10 draft-ietf-sipping-cc-framework-11
Status of this Memo Status of this Memo
By submitting this Internet-Draft, each author represents that any This Internet-Draft is submitted to IETF in full conformance with the
applicable patent or other IPR claims of which he or she is aware provisions of BCP 78 and BCP 79. This document may contain material
have been or will be disclosed, and any of which he or she becomes from IETF Documents or IETF Contributions published or made publicly
aware will be disclosed, in accordance with Section 6 of BCP 79. available before November 10, 2008. The person(s) controlling the
copyright in some of this material may not have granted the IETF
Trust the right to allow modifications of such material outside the
IETF Standards Process. Without obtaining an adequate license from
the person(s) controlling the copyright in such materials, this
document may not be modified outside the IETF Standards Process, and
derivative works of it may not be created outside the IETF Standards
Process, except to format it for publication as an RFC or to
translate it into languages other than English.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet- other groups may also distribute working documents as Internet-
Drafts. Drafts.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt. http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
This Internet-Draft will expire on October 18, 2008. This Internet-Draft will expire on September 6, 2009.
Copyright Notice
Copyright (c) 2009 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents in effect on the date of
publication of this document (http://trustee.ietf.org/license-info).
Please review these documents carefully, as they describe your rights
and restrictions with respect to this document.
Abstract Abstract
This document defines a framework and requirements for call control This document defines a framework and requirements for call control
and multi-party usage of SIP. To enable discussion of multi-party and multi-party usage of SIP. To enable discussion of multi-party
features and applications we define an abstract call model for features and applications we define an abstract call model for
describing the media relationships required by many of these. The describing the media relationships required by many of these. The
model and actions described here are specifically chosen to be model and actions described here are specifically chosen to be
independent of the SIP signaling and/or mixing approach chosen to independent of the SIP signaling and/or mixing approach chosen to
actually set up the media relationships. In addition to its dialog actually set up the media relationships. In addition to its dialog
manipulation aspect, this framework includes requirements for manipulation aspect, this framework includes requirements for
communicating related information and events such as conference and communicating related information and events such as conference and
session state, and session history. This framework also describes session state, and session history. This framework also describes
other goals that embody the spirit of SIP applications as used on the other goals that embody the spirit of SIP applications as used on the
Internet. Internet.
Table of Contents Table of Contents
1. Motivation and Background . . . . . . . . . . . . . . . . . . 4 1. Motivation and Background . . . . . . . . . . . . . . . . . . 5
2. Key Concepts . . . . . . . . . . . . . . . . . . . . . . . . . 6 2. Key Concepts . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.1. "Conversation Space" Model . . . . . . . . . . . . . . . . 6 2.1. "Conversation Space" Model . . . . . . . . . . . . . . . . 7
2.2. Relationship Between Conversation Space, SIP Dialogs, 2.2. Relationship Between Conversation Space, SIP Dialogs,
and SIP Sessions . . . . . . . . . . . . . . . . . . . . . 7 and SIP Sessions . . . . . . . . . . . . . . . . . . . . . 8
2.3. Signaling Models . . . . . . . . . . . . . . . . . . . . . 8 2.3. Signaling Models . . . . . . . . . . . . . . . . . . . . . 9
2.4. Mixing Models . . . . . . . . . . . . . . . . . . . . . . 9 2.4. Mixing Models . . . . . . . . . . . . . . . . . . . . . . 10
2.4.1. Tightly Coupled . . . . . . . . . . . . . . . . . . . 10 2.4.1. Tightly Coupled . . . . . . . . . . . . . . . . . . . 11
2.4.2. Loosely Coupled . . . . . . . . . . . . . . . . . . . 11 2.4.2. Loosely Coupled . . . . . . . . . . . . . . . . . . . 12
2.5. Conveying Information and Events . . . . . . . . . . . . . 12 2.5. Conveying Information and Events . . . . . . . . . . . . . 13
2.6. Componentization and Decomposition . . . . . . . . . . . . 14 2.6. Componentization and Decomposition . . . . . . . . . . . . 15
2.6.1. Media Intermediaries . . . . . . . . . . . . . . . . . 14 2.6.1. Media Intermediaries . . . . . . . . . . . . . . . . . 15
2.6.2. Mixer . . . . . . . . . . . . . . . . . . . . . . . . 14 2.6.2. Mixer . . . . . . . . . . . . . . . . . . . . . . . . 16
2.6.3. Transcoder . . . . . . . . . . . . . . . . . . . . . . 15 2.6.3. Transcoder . . . . . . . . . . . . . . . . . . . . . . 16
2.6.4. Media Relay . . . . . . . . . . . . . . . . . . . . . 15 2.6.4. Media Relay . . . . . . . . . . . . . . . . . . . . . 16
2.6.5. Queue Server . . . . . . . . . . . . . . . . . . . . . 15 2.6.5. Queue Server . . . . . . . . . . . . . . . . . . . . . 16
2.6.6. Parking Place . . . . . . . . . . . . . . . . . . . . 15 2.6.6. Parking Place . . . . . . . . . . . . . . . . . . . . 16
2.6.7. Announcements and Voice Dialogs . . . . . . . . . . . 15 2.6.7. Announcements and Voice Dialogs . . . . . . . . . . . 17
2.7. Use of URIs . . . . . . . . . . . . . . . . . . . . . . . 17 2.7. Use of URIs . . . . . . . . . . . . . . . . . . . . . . . 18
2.7.1. Naming Users in SIP . . . . . . . . . . . . . . . . . 18 2.7.1. Naming Users in SIP . . . . . . . . . . . . . . . . . 19
2.7.2. Naming Services with SIP URIs . . . . . . . . . . . . 19 2.7.2. Naming Services with SIP URIs . . . . . . . . . . . . 20
2.8. Invoker Independence . . . . . . . . . . . . . . . . . . . 21 2.8. Invoker Independence . . . . . . . . . . . . . . . . . . . 22
2.9. Billing issues . . . . . . . . . . . . . . . . . . . . . . 21 2.9. Billing issues . . . . . . . . . . . . . . . . . . . . . . 22
3. Catalog of call control actions and sample features . . . . . 22 3. Catalog of call control actions and sample features . . . . . 23
3.1. Remote Call Control Actions on Early Dialogs . . . . . . . 22 3.1. Remote Call Control Actions on Early Dialogs . . . . . . . 23
3.1.1. Remote Answer . . . . . . . . . . . . . . . . . . . . 23 3.1.1. Remote Answer . . . . . . . . . . . . . . . . . . . . 24
3.1.2. Remote Forward or Put . . . . . . . . . . . . . . . . 23 3.1.2. Remote Forward or Put . . . . . . . . . . . . . . . . 24
3.1.3. Remote Busy or Error Out . . . . . . . . . . . . . . . 23 3.1.3. Remote Busy or Error Out . . . . . . . . . . . . . . . 24
3.2. Remote Call Control Actions on Single Dialogs . . . . . . 23 3.2. Remote Call Control Actions on Single Dialogs . . . . . . 24
3.2.1. Remote Dial . . . . . . . . . . . . . . . . . . . . . 23 3.2.1. Remote Dial . . . . . . . . . . . . . . . . . . . . . 24
3.2.2. Remote On and Off Hold . . . . . . . . . . . . . . . . 23 3.2.2. Remote On and Off Hold . . . . . . . . . . . . . . . . 24
3.2.3. Remote Hangup . . . . . . . . . . . . . . . . . . . . 23 3.2.3. Remote Hangup . . . . . . . . . . . . . . . . . . . . 25
3.3. Call Control Actions on Multiple Dialogs . . . . . . . . . 24 3.3. Call Control Actions on Multiple Dialogs . . . . . . . . . 25
3.3.1. Transfer . . . . . . . . . . . . . . . . . . . . . . . 24 3.3.1. Transfer . . . . . . . . . . . . . . . . . . . . . . . 25
3.3.2. Take . . . . . . . . . . . . . . . . . . . . . . . . . 25 3.3.2. Take . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.3.3. Add . . . . . . . . . . . . . . . . . . . . . . . . . 26 3.3.3. Add . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.3.4. Local Join . . . . . . . . . . . . . . . . . . . . . . 27 3.3.4. Local Join . . . . . . . . . . . . . . . . . . . . . . 28
3.3.5. Insert . . . . . . . . . . . . . . . . . . . . . . . . 27 3.3.5. Insert . . . . . . . . . . . . . . . . . . . . . . . . 28
3.3.6. Split . . . . . . . . . . . . . . . . . . . . . . . . 28 3.3.6. Split . . . . . . . . . . . . . . . . . . . . . . . . 29
3.3.7. Near-fork . . . . . . . . . . . . . . . . . . . . . . 28 3.3.7. Near-fork . . . . . . . . . . . . . . . . . . . . . . 29
3.3.8. Far fork . . . . . . . . . . . . . . . . . . . . . . . 28 3.3.8. Far fork . . . . . . . . . . . . . . . . . . . . . . . 29
4. Security Considerations . . . . . . . . . . . . . . . . . . . 29 4. Security Considerations . . . . . . . . . . . . . . . . . . . 30
5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 30 5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 31
6. Appendix A: Example Features . . . . . . . . . . . . . . . . . 30 6. Appendix A: Example Features . . . . . . . . . . . . . . . . . 31
6.1. Implementation of these features . . . . . . . . . . . . . 33 6.1. Attended Transfer . . . . . . . . . . . . . . . . . . . . 31
6.1.1. Barge-in . . . . . . . . . . . . . . . . . . . . . . . 34 6.2. Auto Answer . . . . . . . . . . . . . . . . . . . . . . . 31
6.1.2. Call Monitoring . . . . . . . . . . . . . . . . . . . 34 6.3. Automatic Callback . . . . . . . . . . . . . . . . . . . . 32
6.1.3. Call Park . . . . . . . . . . . . . . . . . . . . . . 35 6.4. Barge-in . . . . . . . . . . . . . . . . . . . . . . . . . 32
6.1.4. Call Pickup . . . . . . . . . . . . . . . . . . . . . 35 6.5. Blind Transfer . . . . . . . . . . . . . . . . . . . . . . 32
6.1.5. Click-to-dial . . . . . . . . . . . . . . . . . . . . 35 6.6. Call Forwarding . . . . . . . . . . . . . . . . . . . . . 32
6.1.6. Distinctive ring . . . . . . . . . . . . . . . . . . . 36 6.7. Call Monitoring . . . . . . . . . . . . . . . . . . . . . 32
6.1.7. Intercom . . . . . . . . . . . . . . . . . . . . . . . 36 6.8. Call Park . . . . . . . . . . . . . . . . . . . . . . . . 32
6.1.8. Music on Hold . . . . . . . . . . . . . . . . . . . . 36 6.9. Call Pickup . . . . . . . . . . . . . . . . . . . . . . . 33
6.1.9. Pre-paid calling . . . . . . . . . . . . . . . . . . . 36 6.10. Call Return . . . . . . . . . . . . . . . . . . . . . . . 33
6.1.10. Single Line Extension/Multiple Line Appearance . . . . 37 6.11. Call Waiting . . . . . . . . . . . . . . . . . . . . . . . 33
6.1.11. Speakerphone paging . . . . . . . . . . . . . . . . . 37 6.12. Click-to-Dial . . . . . . . . . . . . . . . . . . . . . . 34
6.1.12. Voice message screening . . . . . . . . . . . . . . . 37 6.13. Conference Call . . . . . . . . . . . . . . . . . . . . . 34
6.1.13. Voice Portal . . . . . . . . . . . . . . . . . . . . . 38 6.14. Consultative Transfer . . . . . . . . . . . . . . . . . . 34
7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 38 6.15. Distinctive Ring . . . . . . . . . . . . . . . . . . . . . 34
8. Informative References . . . . . . . . . . . . . . . . . . . . 38 6.16. Do Not Disturb . . . . . . . . . . . . . . . . . . . . . . 34
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 41 6.17. Find-Me . . . . . . . . . . . . . . . . . . . . . . . . . 35
Intellectual Property and Copyright Statements . . . . . . . . . . 43 6.18. Hotline . . . . . . . . . . . . . . . . . . . . . . . . . 35
6.19. IM Conference Alerts . . . . . . . . . . . . . . . . . . . 35
6.20. Inbound Call Screening . . . . . . . . . . . . . . . . . . 35
6.21. Intercom . . . . . . . . . . . . . . . . . . . . . . . . . 35
6.22. Message Waiting . . . . . . . . . . . . . . . . . . . . . 35
6.23. Music on Hold . . . . . . . . . . . . . . . . . . . . . . 36
6.24. Outbound Call Screening . . . . . . . . . . . . . . . . . 36
6.25. Pre-paid Calling . . . . . . . . . . . . . . . . . . . . . 36
6.26. Presence-Enabled Conferencing . . . . . . . . . . . . . . 37
6.27. Single Line Extension/Multiple Line Appearance . . . . . . 37
6.28. Speakerphone Paging . . . . . . . . . . . . . . . . . . . 37
6.29. Speed Dial . . . . . . . . . . . . . . . . . . . . . . . . 38
6.30. Voice Message Screening . . . . . . . . . . . . . . . . . 38
6.31. Voice Portal . . . . . . . . . . . . . . . . . . . . . . . 38
6.32. Voicemail . . . . . . . . . . . . . . . . . . . . . . . . 39
6.33. Whispered Call Waiting . . . . . . . . . . . . . . . . . . 39
7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 39
8. Informative References . . . . . . . . . . . . . . . . . . . . 40
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 42
1. Motivation and Background 1. Motivation and Background
The Session Initiation Protocol [RFC3261] (SIP) was defined for the The Session Initiation Protocol [RFC3261] (SIP) was defined for the
initiation, maintenance, and termination of sessions or calls between initiation, maintenance, and termination of sessions or calls between
one or more users. However, despite its origins as a large-scale one or more users. However, despite its origins as a large-scale
multiparty conferencing protocol, SIP is used today primarily for multiparty conferencing protocol, SIP is used today primarily for
point-to-point calls. This two-party configuration is the focus of point-to-point calls. This two-party configuration is the focus of
the SIP specification and most of its extensions. the SIP specification and most of its extensions.
skipping to change at page 4, line 37 skipping to change at page 5, line 37
applications: applications:
o Define Primitives, Not Services. Allow for a handful of robust o Define Primitives, Not Services. Allow for a handful of robust
yet simple mechanisms that can be combined to deliver features and yet simple mechanisms that can be combined to deliver features and
services. Throughout this document we refer to these simple services. Throughout this document we refer to these simple
mechanisms as "primitives". Primitives should be sufficiently mechanisms as "primitives". Primitives should be sufficiently
robust so that when they are combined with each other, they can be robust so that when they are combined with each other, they can be
used to build lots of services. However, the goal is not to used to build lots of services. However, the goal is not to
define a provably complete set of primitives. Note that while the define a provably complete set of primitives. Note that while the
IETF will NOT standardize behavior or services, it may define IETF will NOT standardize behavior or services, it may define
example services for informational purposes, as in service example services for informational purposes, as in service
examples [I-D.ietf-sipping-service-examples]. examples [RFC5359].
o Participant oriented. The primitives should be designed to o Participant oriented. The primitives should be designed to
provide services that are oriented around the experience of the provide services that are oriented around the experience of the
participants. The authors observe that end users of features and participants. The authors observe that end users of features and
services usually don't care how a media relationship is set up. services usually don't care how a media relationship is set up.
Their ultimate experience is based only on the resulting media and Their ultimate experience is based only on the resulting media and
other externally visible characteristics. other externally visible characteristics.
o Signaling Model independent: Support both a central control and a o Signaling Model independent: Support both a central control and a
peer-to-peer feature invocation model (and combinations of the peer-to-peer feature invocation model (and combinations of the
two). Baseline SIP already supports a centralized control model two). Baseline SIP already supports a centralized control model
described in 3pcc [RFC3725], and the SIP community has expressed a described in 3pcc [RFC3725], and the SIP community has expressed a
skipping to change at page 5, line 13 skipping to change at page 6, line 13
Replaces [RFC3891], and Join [RFC3911]. Replaces [RFC3891], and Join [RFC3911].
o Mixing Model independent: The bulk of interesting multiparty o Mixing Model independent: The bulk of interesting multiparty
applications involve mixing or combining media from multiple applications involve mixing or combining media from multiple
participants. This mixing can be performed by one or more of the participants. This mixing can be performed by one or more of the
participants, or by a centralized mixing resource. The experience participants, or by a centralized mixing resource. The experience
of the participants should not depend on the mixing model used. of the participants should not depend on the mixing model used.
While most examples in this document refer to audio mixing, the While most examples in this document refer to audio mixing, the
framework applies to any media type. In this context a "mixer" framework applies to any media type. In this context a "mixer"
refers to combining media of the same type in an appropriate, refers to combining media of the same type in an appropriate,
media-specific way. This is consistent with model described in media-specific way. This is consistent with the model described
the SIP conferencing framework. in the SIP conferencing framework.
o Invoker oriented. Only the user who invokes a feature or a o Invoker oriented. Only the user who invokes a feature or a
service needs to know exactly which service is invoked or why. service needs to know exactly which service is invoked or why.
This is good because it allows new services to be created without This is good because it allows new services to be created without
requiring new primitives from all the participants; and it allows requiring new primitives from all the participants; and it allows
for much simpler feature authorization policies, for example, when for much simpler feature authorization policies, for example, when
participation spans organizational boundaries. As discussed in participation spans organizational boundaries. As discussed in
section 3.8, this also avoids exponential state explosion when section 2.7, this also avoids exponential state explosion when
combining features. The invoker only has to manage a user combining features. The invoker only has to manage a user
interface or API to prevent local feature interactions. All the interface or API to prevent local feature interactions. All the
other participants simply need to manage the feature interactions other participants simply need to manage the feature interactions
of a much smaller number of primitives. of a much smaller number of primitives.
o Primitives make full use of URIs. URIs are a very powerful o Primitives make full use of URIs. URIs are a very powerful
mechanism for describing users and services. They represent a mechanism for describing users and services. They represent a
plentiful resource that can be extremely expressive and easily plentiful resource that can be extremely expressive and easily
routed, translated, and manipulated--even across organizational routed, translated, and manipulated--even across organizational
boundaries. URIs can contain special parameters and informational boundaries. URIs can contain special parameters and informational
headers that need only be relevant to the owner of the namespace headers that need only be relevant to the owner of the namespace
skipping to change at page 6, line 42 skipping to change at page 7, line 42
one or more participants. one or more participants.
Participants are SIP User Agents that send original media to or Participants are SIP User Agents that send original media to or
terminate and receive media from other members of the conversation terminate and receive media from other members of the conversation
space. Logically, every participant in the conversation space has space. Logically, every participant in the conversation space has
access to all the media generated in that space (this is strictly access to all the media generated in that space (this is strictly
true if all participants share a common media type). A SIP User true if all participants share a common media type). A SIP User
Agent that does not contribute or consume any media is NOT a Agent that does not contribute or consume any media is NOT a
participant; nor is a user agent that merely forwards, transcodes, participant; nor is a user agent that merely forwards, transcodes,
mixes, or selects media originating elsewhere in the conversation mixes, or selects media originating elsewhere in the conversation
space. [Note that a conversation space consists of zero or more SIP space.
calls or SIP conferences. A conversation space is similar to the Note that a conversation space consists of zero or more SIP calls
definition of a "call" in some other call models.] or SIP conferences. A conversation space is similar to the
definition of a "call" in some other call models.
Participants may represent human users or non-human users (referred Participants may represent human users or non-human users (referred
to as robots or automatons in this document). Some participants may to as robots or automatons in this document). Some participants may
be hidden within a conversation space. Some examples of hidden be hidden within a conversation space. Some examples of hidden
participants include: robots that generate tones, images, or participants include: robots that generate tones, images, or
announcements during a conference to announce users arriving and announcements during a conference to announce users arriving and
departing, a human call center supervisor monitoring a conversation departing, a human call center supervisor monitoring a conversation
between a trainee and a customer, and robots that record media for between a trainee and a customer, and robots that record media for
training or archival purposes. training or archival purposes.
skipping to change at page 7, line 20 skipping to change at page 8, line 21
participant is obviously active.) Some robotic participants (such as participant is obviously active.) Some robotic participants (such as
a voice messaging system, an instant messaging agent, or a voice a voice messaging system, an instant messaging agent, or a voice
dialog system) may be active participants if they can leave the dialog system) may be active participants if they can leave the
conversation space when there is no human interaction. Other robots conversation space when there is no human interaction. Other robots
(for example our tone generating robot from the previous example) are (for example our tone generating robot from the previous example) are
passive participants. A human participant "on-hold" is passive. passive participants. A human participant "on-hold" is passive.
An example diagram of a conversation space can be shown as a "bubble" An example diagram of a conversation space can be shown as a "bubble"
or ovals, or as a "set" in curly or square brace notation. Each set, or ovals, or as a "set" in curly or square brace notation. Each set,
oval, or "bubble" represents a conversation space. Hidden oval, or "bubble" represents a conversation space. Hidden
participants are shown in lowercase letters. participants are shown in lowercase letters. Examples are given in
Figure 1.
Note that while the term "conversation" usually applies to oral Note that while the term "conversation" usually applies to oral
exchange of information, we apply the conversation space model to any exchange of information, we apply the conversation space model to any
media exchange between participants. media exchange between participants.
{ A , B } [ A , b, C, D ] { A , B } [ A , b, C, D ]
.-. .---. .-. .---.
/ \ / \ / \ / \
/ A \ / A b \ / A \ / A b \
( ) ( ) ( ) ( )
\ B / \ C D / \ B / \ C D /
\ / \ / \ / \ /
'-' '---' '-' '---'
Figure 1. Conversation Spaces.
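The set notation in Figure 1 can be illustrated with a short sketch. This is only an illustration under stated assumptions, not part of the framework: the `Participant` class, the `hidden` flag, and the `render` helper are hypothetical names introduced here, and the exact spacing of the output is a choice made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Participant:
    """A member of a conversation space (hypothetical helper type)."""
    name: str
    hidden: bool = False  # hidden participants are shown in lowercase


def render(space, brackets="{}"):
    """Render a conversation space in curly or square brace notation."""
    labels = [p.name.lower() if p.hidden else p.name.upper() for p in space]
    return f"{brackets[0]} " + ", ".join(labels) + f" {brackets[1]}"


two_party = [Participant("A"), Participant("B")]
with_hidden = [
    Participant("A"),
    Participant("B", hidden=True),  # e.g. a recording robot
    Participant("C"),
    Participant("D"),
]

print(render(two_party))          # curly brace form
print(render(with_hidden, "[]"))  # square brace form, hidden member in lowercase
```

A list is used rather than a set so that the rendering order stays stable; the model itself does not assign any order to participants.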
2.2. Relationship Between Conversation Space, SIP Dialogs, and SIP 2.2. Relationship Between Conversation Space, SIP Dialogs, and SIP
Sessions Sessions
In SIP, a call is "an informal term that refers to some communication In SIP, a call is "an informal term that refers to some communication
between peers, generally set up for the purposes of a multimedia between peers, generally set up for the purposes of a multimedia
conversation." Obviously we cannot discuss normative behavior based conversation." The concept of a conversation space is needed because
on such an intentionally vague definition. The concept of a the SIP definition of call is not sufficiently precise for the
conversation space is needed because the SIP definition of call is purpose of describing the user experience of multiparty features.
not sufficiently precise for the purpose of describing the user
experience of multiparty features.
Do any other definitions convey the correct meaning? SIP, and SDP Do any other definitions convey the correct meaning? SIP, and SDP
[RFC4566] both define a conference as "a multimedia session [RFC4566] both define a conference as "a multimedia session
identified by a common session description." A session is defined as identified by a common session description." A session is defined as
"a set of multimedia senders and receivers and the data streams "a set of multimedia senders and receivers and the data streams
flowing from senders to receivers." Both of these definitions are flowing from senders to receivers." The definition of "call" in some
heavily oriented toward multicast sessions with little
differentiation among participants. As such, neither is particularly
useful for our purposes. In fact, the definition of "call" in some
call models is more similar to our definition of a conversation call models is more similar to our definition of a conversation
space. space.
Some examples of the relationship between conversation spaces, SIP Some examples of the relationship between conversation spaces, SIP
dialogs, and SIP sessions are listed below. In each example, a human dialogs, and SIP sessions are listed below. In each example, a human
user will perceive that there is a single call. user will perceive that there is a single call.
o A simple two-party call is a single conversation space, a single o A simple two-party call is a single conversation space, a single
session, and a single dialog. session, and a single dialog.
o A locally mixed three-way call is two sessions and two dialogs. o A locally mixed three-way call is two sessions and two dialogs.
It is also a single conversation space. It is also a single conversation space.
skipping to change at page 9, line 36 skipping to change at page 10, line 35
controller does not, the feature will not be able to be used. controller does not, the feature will not be able to be used.
Many of the features, primitives, and actions described in this Many of the features, primitives, and actions described in this
document also require some type of media mixing, combining, or document also require some type of media mixing, combining, or
selection as described in the next section. selection as described in the next section.
2.4. Mixing Models 2.4. Mixing Models
SIP permits a variety of mixing models, which are discussed here SIP permits a variety of mixing models, which are discussed here
briefly. This topic is discussed more thoroughly in the SIP briefly. This topic is discussed more thoroughly in the SIP
conferencing framework [RFC4353] and cc-conferencing [RFC4579]. SIP conferencing framework [RFC4353] and [RFC4579]. SIP supports both
supports both tightly-coupled and loosely-coupled conferencing, tightly-coupled and loosely-coupled conferencing, although more
although more sophisticated behavior is available in tightly-coupled sophisticated behavior is available in tightly-coupled conferences.
conferences. In a tightly-coupled conference, a single SIP user In a tightly-coupled conference, a single SIP user agent (called the
agent (called the focus) has a direct dialog relationship with each focus) has a direct dialog relationship with each participant (and
participant (and may control non participant user agents as well). may control non participant user agents as well). The focus can
In a loosely-coupled conference there are no coordinated signaling authoritatively publish information about the character and
relationships among the participants. participants in a conference. In a loosely-coupled conference there
are no coordinated signaling relationships among the participants.
For brevity, only the two most popular conferencing models are For brevity, only the two most popular conferencing models are
significantly discussed in this document (local and centralized significantly discussed in this document (local and centralized
mixing). Applications of the conversation spaces model to loosely- mixing). Applications of the conversation spaces model to loosely-
coupled multicast and distributed full unicast mesh conferences are coupled multicast and distributed full unicast mesh conferences are
left as an exercise for the reader. Note that a distributed full left as an exercise for the reader. Note that a distributed full
mesh conference can be used for basic conferences, but does not mesh conference can be used for basic conferences, but does not
easily allow for more complex conferencing actions like splitting, easily allow for more complex conferencing actions like splitting,
merging, and sidebars. merging, and sidebars.
skipping to change at page 10, line 36 skipping to change at page 11, line 36
completely separate SIP call. This call uses a different Call-ID, completely separate SIP call. This call uses a different Call-ID,
different tags, etc. There is no call set up directly between B and different tags, etc. There is no call set up directly between B and
C. No SIP extension or external signaling is needed. A merely C. No SIP extension or external signaling is needed. A merely
decides to locally join two dialogs. decides to locally join two dialogs.
B C B C
\ / \ /
\ / \ /
A A
A receives media streams from both B and C, and mixes them. A sends Figure 2. End System mixing Example.
a stream containing A's and C's streams to B, and a stream containing
A's and B's streams to C. Basically, user A handles both signaling In Figure 2, A receives media streams from both B and C, and mixes
and media mixing. them. A sends a stream containing A's and C's streams to B, and a
stream containing A's and B's streams to C. Basically, user A handles
both signaling and media mixing.
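The mixing that A performs here can be sketched as follows. This is only an illustration under simplifying assumptions: each stream is a short list of integer samples, and the function name mix_minus_one is hypothetical. A real mixer would also have to deal with codecs, timing, and clipping.

```python
def mix_minus_one(streams):
    """For each party, return the sum of every OTHER party's samples.

    `streams` maps a party name to a list of integer samples of equal
    length (a simplifying assumption made for this sketch).
    """
    # Sum all streams sample by sample.
    total = [sum(frame) for frame in zip(*streams.values())]
    # Subtract each party's own contribution from the total.
    return {
        party: [t - s for t, s in zip(total, own)]
        for party, own in streams.items()
    }


streams = {"A": [1, 2, 3], "B": [10, 20, 30], "C": [100, 200, 300]}
mixed = mix_minus_one(streams)
# mixed["B"] holds A's and C's streams combined; mixed["C"] holds A's and B's.
```

The entry for A itself is what A plays out locally, since A is both a participant and the mixer.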
2.4.1.2. Centralized Mixing 2.4.1.2. Centralized Mixing
In a centralized mixing model, all participants have a pairwise SIP In a centralized mixing model, all participants have a pairwise SIP
and media relationship with the mixer. Common applications of and media relationship with the mixer. Common applications of
centralized mixing include ad-hoc conferences and scheduled dial-in centralized mixing include ad-hoc conferences and scheduled dial-in
or dial-out conferences. In the figure below, the mixer M receives or dial-out conferences. In Figure 3 below, the mixer M receives and
and sends media to participants A, B, C, D, and E. sends media to participants A, B, C, D, and E.
B C B C
\ / \ /
\ / \ /
M --- A M --- A
/ \ / \
/ \ / \
D E D E
Figure 3. Centralized Mixing Example.
2.4.1.3. Centralized Signaling, Distributed Media 2.4.1.3. Centralized Signaling, Distributed Media
In this conferencing model, there is a centralized controller, as in In this conferencing model, there is a centralized controller, as in
the dial-in and dial-out cases. However, the centralized server the dial-in and dial-out cases. However, the centralized server
handles signaling only. The media is still sent directly between handles signaling only. The media is still sent directly between
participants, using either multicast or multi-unicast. Participants participants, using either multicast or multi-unicast. Participants
perform their own mixing. Multi-unicast is when a user sends perform their own mixing. Multi-unicast is when a user sends
multiple packets (one for each recipient, addressed to that multiple packets (one for each recipient, addressed to that
recipient). This is referred to as a "Decentralized Multipoint recipient). This is referred to as a "Decentralized Multipoint
Conference" in [H.323]. Full mesh media with centralized mixing is Conference" in [H.323]. Full mesh media with centralized mixing is
skipping to change at page 12, line 37 skipping to change at page 13, line 43
server. For example, a focus involved in a conversation space may server. For example, a focus involved in a conversation space may
wish to provide URIs for conference status, and/or conference/floor wish to provide URIs for conference status, and/or conference/floor
control. control.
The SIP Events [RFC3265] architecture defines general mechanisms for The SIP Events [RFC3265] architecture defines general mechanisms for
subscription to and notification of events within SIP networks. It subscription to and notification of events within SIP networks. It
introduces the notion of a package that is a specific "instantiation" introduces the notion of a package that is a specific "instantiation"
of the events mechanism for a well-defined set of events. of the events mechanism for a well-defined set of events.
Event packages are needed to provide the status of a user's dialogs, Event packages are needed to provide the status of a user's dialogs,
provide the status of conferences and its participants, provide user provide the status of conferences and their participants, provide
presence information, provide the status of registrations, and user presence information, provide the status of registrations, and
provide the status of user's messages. While this is not an provide the status of user's messages. While this is not an
exhaustive list, these are sufficient to enable the sample features exhaustive list, these are sufficient to enable the sample features
described in this document. described in this document.
The conference event package [RFC4575] allows users to subscribe to The conference event package [RFC4575] allows users to subscribe to
information about an entire tightly-coupled SIP conference. information about an entire tightly-coupled SIP conference.
Notifications convey information about the participants such as: the Notifications convey information about the participants such as: the
SIP URI identifying each user, their status in the space (active, SIP URI identifying each user, their status in the space (active,
declined, departed), URIs to invoke other features (such as sidebar declined, departed), URIs to invoke other features (such as sidebar
conversations), links to other relevant information (such as floor conversations), links to other relevant information (such as floor
skipping to change at page 14, line 52 skipping to change at page 16, line 12
features implemented by participants. Some common media features implemented by participants. Some common media
intermediaries are described below. intermediaries are described below.
2.6.2. Mixer 2.6.2. Mixer
A SIP mixer is a component that combines media from all dialogs in A SIP mixer is a component that combines media from all dialogs in
the same conversation in a media specific way. For example, the the same conversation in a media specific way. For example, the
default combining for an audio conference might be an N-1 default combining for an audio conference might be an N-1
configuration, while a text mixer might interleave text messages on a configuration, while a text mixer might interleave text messages on a
per-line basis. More details about how to manipulate the media per-line basis. More details about how to manipulate the media
policy used by mixers are being discussed in the XCON Working Group. policy used by mixers are being discussed in [I-D.ietf-xcon-ccmp].
2.6.3. Transcoder 2.6.3. Transcoder
A transcoder translates media from one encoding or format to another A transcoder translates media from one encoding or format to another
(for example, GSM voice to G.711, MPEG2 to H.261, or text/html to (for example, GSM voice to G.711, MPEG2 to H.261, or text/html to
text/plain), or from one media type to another (for example text to text/plain), or from one media type to another (for example text to
speech). A more thorough discussion of transcoding is described in speech). A more thorough discussion of transcoding is described in
SIP transcoding services invocation SIP transcoding services invocation [RFC5369].
[I-D.ietf-sipping-transc-framework].
2.6.4. Media Relay 2.6.4. Media Relay
A media relay terminates media and simply forwards it to a new A media relay terminates media and simply forwards it to a new
destination without changing the content in any way. Sometimes media destination without changing the content in any way. Sometimes media
relays are used to provide source IP address anonymity, to facilitate relays are used to provide source IP address anonymity, to facilitate
middlebox traversal, or to provide a trusted entity where media can middlebox traversal, or to provide a trusted entity where media can
be forcefully disconnected. be forcefully disconnected.
2.6.5. Queue Server 2.6.5. Queue Server
skipping to change at page 16, line 28 skipping to change at page 17, line 38
2.6.7.1. Text-to-Speech and Automatic Speech Recognition 2.6.7.1. Text-to-Speech and Automatic Speech Recognition
Text-to-Speech (TTS) is a service that converts text into digitized Text-to-Speech (TTS) is a service that converts text into digitized
audio. TTS is frequently integrated into other applications, but audio. TTS is frequently integrated into other applications, but
when separated as a component, it provides greater opportunity for when separated as a component, it provides greater opportunity for
broad reuse. Automatic Speech Recognition (ASR) is a service that broad reuse. Automatic Speech Recognition (ASR) is a service that
attempts to decipher digitized speech based on a proposed grammar. attempts to decipher digitized speech based on a proposed grammar.
Like TTS, ASR services can be embedded, or exposed so that many Like TTS, ASR services can be embedded, or exposed so that many
applications can take advantage of such services. A standardized applications can take advantage of such services. A standardized
(decomposed) interface to access standalone TTS and ASR services is (decomposed) interface to access standalone TTS and ASR services is
currently being developed in the SPEECHSC Working Group. currently being developed in [RFC4313].
2.6.7.2. VoiceXML 2.6.7.2. VoiceXML
VoiceXML is a W3C recommendation that was designed to give authors VoiceXML is a W3C recommendation that was designed to give authors
control over the spoken dialog between users and applications. The control over the spoken dialog between users and applications. The
application and user take turns speaking: the application prompts the application and user take turns speaking: the application prompts the
user, and the user in turn responds. Its major goal is to bring the user, and the user in turn responds. Its major goal is to bring the
advantages of web-based development and content delivery to advantages of web-based development and content delivery to
interactive voice response applications. We believe that VoiceXML interactive voice response applications. We believe that VoiceXML
represents the ideal partner for SIP in the development of represents the ideal partner for SIP in the development of
distributed IVR servers. VoiceXML is an XML based scripting language distributed IVR servers. VoiceXML is an XML based scripting language
for describing IVR services at an abstract level. VoiceXML supports for describing IVR services at an abstract level. VoiceXML supports
DTMF recognition, speech recognition, text-to-speech, and playing out DTMF recognition, speech recognition, text-to-speech, and playing out
of recorded media files. The results of the data collected from the of recorded media files. The results of the data collected from the
user are passed to a controlling entity through an HTTP POST user are passed to a controlling entity through an HTTP POST
operation. The controller can then return another script, or operation. The controller can then return another script, or
terminate the interaction with the IVR server. terminate the interaction with the IVR server.
A VoiceXML server also need not be implemented as a monolithic A VoiceXML server also need not be implemented as a monolithic
server. Below is a diagram of a VoiceXML browser that is split into server. Figure 4 shows a diagram of a VoiceXML browser that is split
media and non-media handling parts. The VoiceXML interpreter handles into media and non-media handling parts. The VoiceXML interpreter
SIP dialog state and state within a VoiceXML document, and sends handles SIP dialog state and state within a VoiceXML document, and
requests to the media component over another protocol. sends requests to the media component over another protocol.
+-------------+ +-------------+
| | | |
| VoiceXML | | VoiceXML |
| Interpreter | | Interpreter |
| (signaling) | | (signaling) |
+-------------+ +-------------+
^ ^ ^ ^
| | | |
SIP | | RTSP SIP | | RTSP
| | | |
| | | |
v v v v
+-------------+ +-------------+ +-------------+ +-------------+
| | | | | | | |
| SIP UA | RTP | RTSP Server | | SIP UA | RTP | RTSP Server |
| |<------>| (media) | | |<------>| (media) |
| | | | | | | |
+-------------+ +-------------+ +-------------+ +-------------+
Figure : Decomposed VoiceXML Server Figure 4. Decomposed VoiceXML Server.
2.7. Use of URIs 2.7. Use of URIs
All naming in SIP uses URIs. URIs in SIP are used in a plethora of All naming in SIP uses URIs. URIs in SIP are used in a plethora of
contexts: the Request-URI; Contact, To, From, and *-Info headers; contexts: the Request-URI; Contact, To, From, and *-Info headers;
application/uri bodies; and embedded in email, web pages, instant application/uri bodies; and embedded in email, web pages, instant
messages, and ENUM records. The request-URI identifies the user or messages, and ENUM records. The request-URI identifies the user or
service that the call is destined for. service that the call is destined for.
SIP URIs embedded in informational SIP headers, SIP bodies, and non- SIP URIs embedded in informational SIP headers, SIP bodies, and non-
skipping to change at page 18, line 15 skipping to change at page 19, line 25
latter technique usually involves discovering the URI via a SIP event latter technique usually involves discovering the URI via a SIP event
package, a web page, a business card, or an Instant Message. Yet package, a web page, a business card, or an Instant Message. Yet
another means to acquire the URIs is to define a dictionary of another means to acquire the URIs is to define a dictionary of
primitives with well-defined semantics and provide a means to query primitives with well-defined semantics and provide a means to query
the named primitives and corresponding URIs that may be invoked on the named primitives and corresponding URIs that may be invoked on
the service or dialogs. the service or dialogs.
2.7.1. Naming Users in SIP 2.7.1. Naming Users in SIP
An address-of-record, or public SIP address, is a SIP (or SIPS) URI An address-of-record, or public SIP address, is a SIP (or SIPS) URI
that points to a domain with a location server that can map the URI that points to a domain with a location service that can map the URI
to a set of Contact URIs where the user might be available. Typically to a set of Contact URIs where the user might be available. Typically
the Contact URIs are populated via registration. the Contact URIs are populated via registration.
Address of Record Contacts Address of Record Contacts
sip:bob@biloxi.example.com -> sip:bob@babylon.biloxi.example.com:5060 sip:bob@biloxi.example.com -> sip:bob@babylon.biloxi.example.com:5060
sip:bbrown@mailbox.provider.example.net sip:bbrown@mailbox.provider.example.net
sip:+1.408.555.6789@mobile.example.net sip:+1.408.555.6789@mobile.example.net
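The mapping shown above can be thought of as a simple lookup table. The following sketch is illustrative only: the names location_service, register, and lookup are hypothetical, and a real location service would also track expiration, priorities, and authentication.

```python
# Hypothetical in-memory location service keyed by address-of-record.
location_service = {
    "sip:bob@biloxi.example.com": [
        "sip:bob@babylon.biloxi.example.com:5060",
        "sip:bbrown@mailbox.provider.example.net",
        "sip:+1.408.555.6789@mobile.example.net",
    ],
}


def register(aor, contact):
    """Add a Contact URI for an address-of-record, ignoring duplicates."""
    contacts = location_service.setdefault(aor, [])
    if contact not in contacts:
        contacts.append(contact)


def lookup(aor):
    """Return a copy of the Contact URIs currently bound to the AOR."""
    return list(location_service.get(aor, []))
```

An address-of-record with no registered contacts simply yields an empty list.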
Callee Capabilities [RFC3840] defines a set of additional parameters Callee Capabilities [RFC3840] defines a set of additional parameters
skipping to change at page 19, line 19 skipping to change at page 20, line 28
decisions based on the type of participants (active/passive, hidden, decisions based on the type of participants (active/passive, hidden,
human/robot) in a conversation space. This information is conveyed human/robot) in a conversation space. This information is conveyed
via the dialog package or in a SIP header parameter communicated via the dialog package or in a SIP header parameter communicated
using an appropriate SIP header. For example, a music on hold using an appropriate SIP header. For example, a music on hold
service may take the sensible approach that if there are two or more service may take the sensible approach that if there are two or more
unhidden participants, it should not provide hold music; or that it unhidden participants, it should not provide hold music; or that it
will not send hold music to robots. will not send hold music to robots.
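The music on hold policy in this example could be sketched as below. The Participant type and the hold_music_targets function are hypothetical names used for illustration. The sketch also assumes that hidden participants never receive hold music, which the text above does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Participant:
    uri: str
    hidden: bool = False
    robot: bool = False


def hold_music_targets(participants):
    """Return the participants that should receive hold music."""
    unhidden = [p for p in participants if not p.hidden]
    # Two or more unhidden participants can talk to each other: no music.
    if len(unhidden) >= 2:
        return []
    # Otherwise play music, but never to robots.
    return [p for p in unhidden if not p.robot]
```

The point of the sketch is that the decision depends only on the participant types, not on which feature placed the participants in the conversation space.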
Multiple participants in the same conversation space may represent Multiple participants in the same conversation space may represent
the same human user. For example, the user may use one participant the same human user. For example, the user may use one participant
for video, chat, and whiteboard media on a PC and another for audio device for video, chat, and whiteboard media on a PC and another for
media on a SIP phone. In this case, the address-of-record is the audio media on a SIP phone. In this case, the address-of-record is
same for both user agents, but the Contacts are different. In the same for both user agents, but the Contacts are different. In
addition, human users may add robot participants that act on their this case, there is really only one human participant. In addition,
behalf (for example a call recording service, or a calendar human users may add robot participants that act on their behalf (for
announcement reminder). Call Control features in SIP should continue example a call recording service, or a calendar announcement
to function as expected in such an environment. reminder). Call control features in SIP should continue to function
as expected in such an environment.
2.7.2. Naming Services with SIP URIs 2.7.2. Naming Services with SIP URIs
A critical piece of defining a session level service that can be A critical piece of defining a session level service that can be
accessed by SIP is defining the naming of the resources within that accessed by SIP is defining the naming of the resources within that
service. This point cannot be overstated. service. This point cannot be overstated.
In the context of SIP control of application components, we take In the context of SIP control of application components, we take
advantage of the fact that the left-hand-side of a standard SIP URI advantage of the fact that the left-hand-side of a standard SIP URI
is a user part. Most services may be thought of as user automatons is a user part. Most services may be thought of as user automatons
skipping to change at page 21, line 13 skipping to change at page 22, line 23
renders valid SIP URIs to be provisioned, rather than enforce only renders valid SIP URIs to be provisioned, rather than enforce only
one particular scheme. one particular scheme.
As we have shown, SIP URIs represent an ideal, flexible mechanism for As we have shown, SIP URIs represent an ideal, flexible mechanism for
describing and naming service resources, regardless of whether the resources describing and naming service resources, regardless of whether the resources
are queues, conferences, voice dialogs, announcements, voicemail are queues, conferences, voice dialogs, announcements, voicemail
treatments, or phone features. treatments, or phone features.
2.8. Invoker Independence 2.8. Invoker Independence
With functional signaling, only the invoker of features in SIP need With functional signaling, only the invoker of features in SIP needs
to know exactly which feature they are invoking. One of the primary to know exactly which feature they are invoking. One of the primary
benefits of this approach is that combinations of functional features benefits of this approach is that combinations of functional features
work in SIP call control without requiring complex feature work in SIP call control without requiring complex feature
interaction matrices. For example, let us examine the combination of interaction matrices. For example, let us examine the combination of
a "transfer" of a call that is "conferenced". a "transfer" of a call that is "conferenced".
Alice calls Bob. Alice silently "conferences in" her robotic Alice calls Bob. Alice silently "conferences in" her robotic
assistant Albert as a hidden party. Bob transfers Alice to Carol. assistant Albert as a hidden party. Bob transfers Alice to Carol.
If Bob asks Alice to Replace her leg with a new one to Carol then If Bob asks Alice to Replace her leg with a new one to Carol then
both Alice and Albert should be communicating with Carol both Alice and Albert should be communicating with Carol
skipping to change at page 22, line 46 skipping to change at page 24, line 8
early dialog. These actions can be thought of as a set of remote early dialog. These actions can be thought of as a set of remote
control operations. For example an automaton might perform the control operations. For example an automaton might perform the
operation on behalf of a user. Alternatively a user might use the operation on behalf of a user. Alternatively a user might use the
remote control in the form of an application to perform the action on remote control in the form of an application to perform the action on
the early dialog of a UA that may be out of reach. All of these the early dialog of a UA that may be out of reach. All of these
actions correspond to telling the UA how to respond to a request to actions correspond to telling the UA how to respond to a request to
establish an early dialog. These actions provide useful establish an early dialog. These actions provide useful
functionality for PDA, PC and server based applications that desire functionality for PDA, PC and server based applications that desire
the ability to control a UA. A proposed mechanism for this type of the ability to control a UA. A proposed mechanism for this type of
functionality is described in Remote Call Control functionality is described in Remote Call Control
[I-D.mahy-sip-remote-cc]. [I-D.audet-sipping-feature-ref].
3.1.1. Remote Answer 3.1.1. Remote Answer
A dialog is in some early dialog state such as 180 Ringing. It may A dialog is in some early dialog state such as 180 Ringing. It may
be desirable to tell the UA to answer the dialog. That is, tell it to be desirable to tell the UA to answer the dialog. That is, tell it to
send a 200 OK response to establish the dialog. send a 200 OK response to establish the dialog.
3.1.2. Remote Forward or Put 3.1.2. Remote Forward or Put
It may be desirable to tell the UA to respond with a 3xx class It may be desirable to tell the UA to respond with a 3xx class
skipping to change at page 23, line 31 skipping to change at page 24, line 36
3.2. Remote Call Control Actions on Single Dialogs 3.2. Remote Call Control Actions on Single Dialogs
There is another useful set of actions that operate on a single There is another useful set of actions that operate on a single
established dialog. These operations are useful in building established dialog. These operations are useful in building
productivity applications for aiding users to control their phone. productivity applications for aiding users to control their phone.
For example a Customer Relationship Management (CRM) application that For example a Customer Relationship Management (CRM) application that
sets up calls for a user eliminating the need for the user to sets up calls for a user eliminating the need for the user to
actually enter an address. These operations can also be thought of as actually enter an address. These operations can also be thought of as
remote control actions. A proposed mechanism for this type of remote control actions. A proposed mechanism for this type of
functionality is described in Remote Call Control functionality is described in Remote Call Control
[I-D.mahy-sip-remote-cc]. [I-D.audet-sipping-feature-ref].
3.2.1. Remote Dial 3.2.1. Remote Dial
This action instructs the UA to initiate a dialog. This action can This action instructs the UA to initiate a dialog. This action can
be performed using the REFER method. be performed using the REFER method.
3.2.2. Remote On and Off Hold 3.2.2. Remote On and Off Hold
This action instructs the UA to put an established dialog on hold. This action instructs the UA to put an established dialog on hold.
Though this operation can conceptually be performed with the REFER Though this operation can conceptually be performed with the REFER
skipping to change at page 24, line 49 skipping to change at page 26, line 7
- blind transfer - blind transfer
- transfer to a central mixer (some type of conference or forking) - transfer to a central mixer (some type of conference or forking)
- transfer to park server (park) - transfer to park server (park)
- transfer to music on hold or announcement server - transfer to music on hold or announcement server
- transfer to a "queue" - transfer to a "queue"
- transfer to a service (such as Voice Dialogs service) - transfer to a service (such as Voice Dialogs service)
- transition from local mixer to central mixer - transition from local mixer to central mixer
This action is frequently referred to as "completing an attended This action is frequently referred to as "completing an attended
transfer". It is described in more detail in cc-transfer transfer". It is described in more detail in
[I-D.ietf-sipping-cc-transfer]. [I-D.ietf-sipping-cc-transfer].
Note that if a transfer requires URI hiding or privacy, then the 3pcc Note that if a transfer requires URI hiding or privacy, then the 3pcc
approach can more easily implement this. For example, if the URI of approach can more easily implement this. For example, if the URI of
C needs to be hidden from B, then the use of 3pcc helps accomplish C needs to be hidden from B, then the use of 3pcc helps accomplish
this. this.
3.3.2. Take 3.3.2. Take
The conversation space changes as follows: The conversation space changes as follows:
skipping to change at page 25, line 47 skipping to change at page 27, line 6
Note that pick up of a ringing call has perhaps some interesting Note that pick up of a ringing call has perhaps some interesting
additional requirements. First of all it is an early dialog as additional requirements. First of all it is an early dialog as
opposed to an established dialog. Secondly the party which is to opposed to an established dialog. Secondly the party which is to
pick up the call may wish to do so only while it is an early pick up the call may wish to do so only while it is an early
dialog. That is, in the race condition where the ringing UA accepts dialog. That is, in the race condition where the ringing UA accepts
just before it receives signaling from the party wishing to take the just before it receives signaling from the party wishing to take the
call, the taking party wishes to yield or cancel the take. The goal call, the taking party wishes to yield or cancel the take. The goal
is to avoid yanking an answered call from the called party. is to avoid yanking an answered call from the called party.
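The race condition described in this note can be sketched as a check made by the UA that receives the take request. The names DialogState and attempt_pickup are hypothetical. The early_only argument is modeled loosely on the "early-only" flag of the Replaces header; treat that correspondence as an assumption of this sketch rather than a statement of the mechanism.

```python
from enum import Enum


class DialogState(Enum):
    EARLY = "early"
    CONFIRMED = "confirmed"
    TERMINATED = "terminated"


def attempt_pickup(state, early_only=True):
    """Return True if the take may proceed for a dialog in `state`."""
    if state is DialogState.TERMINATED:
        return False  # nothing left to take
    if early_only and state is not DialogState.EARLY:
        return False  # the call was answered first, so yield
    return True
```

With early_only set, a take that arrives just after the ringing UA answers is refused, so the answered call stays with the called party.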
This action is described in Replaces [RFC3891] and in cc-transfer This action is described in Replaces [RFC3891] and in
[I-D.ietf-sipping-cc-transfer]. [I-D.ietf-sipping-cc-transfer].
3.3.3. Add 3.3.3. Add
Note that the following 4 actions are described in cc-conferencing Note that the following 4 actions are described in [RFC4579].
[RFC4579].
This is merely adding a participant to a SIP conference. The This is merely adding a participant to a SIP conference. The
conversation space changes as follows: conversation space changes as follows:
{ A , B } --> { A , B , C } { A , B } --> { A , B , C }
A adds C to the conversation. A adds C to the conversation.
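The set notation used for conversation spaces maps directly onto ordinary set operations. The sketch below is illustrative only, and the function name add_participant is hypothetical.

```python
def add_participant(space, party):
    """Return a new conversation space that also contains `party`."""
    return frozenset(space) | {party}


before = frozenset({"A", "B"})
after = add_participant(before, "C")
# `before` is unchanged; `after` is the space { A , B , C }.
```

Returning a new set rather than modifying the old one mirrors the "before --> after" form in which the primitives are written.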
Using the peer-to-peer approach, adding a party using local mixing Using the peer-to-peer approach, adding a party using local mixing
requires no signaling. To transition from a 2-party call or a requires no signaling. To transition from a 2-party call or a
skipping to change at page 29, line 4 skipping to change at page 30, line 6
{ A, B } --> { A , B } & { B , C } { A, B } --> { A , B } & { B , C }
A requests B to be the "anchor" of two conversation spaces. A requests B to be the "anchor" of two conversation spaces.
This is easily set up by creating a conference with two This is easily set up by creating a conference with two
sub-conferences and setting the media policy appropriately such that B is sub-conferences and setting the media policy appropriately such that B is
a participant in both. Media forking can also be set up using 3pcc as a participant in both. Media forking can also be set up using 3pcc as
described in Section 5.1 of RFC3264 [RFC3264] (an offer/answer model described in Section 5.1 of RFC3264 [RFC3264] (an offer/answer model
for SDP). The session descriptions for forking are quite complex. for SDP). The session descriptions for forking are quite complex.
Controllers should verify that endpoints can handle forked-media, for Controllers should verify that endpoints can handle forked-media, for
example using prior configuration. example using prior configuration.
Features enabled: Features enabled:
- barge-in - barge-in
- voice portal services - voice portal services
- whisper - whisper
- hotword detection - key word detection
- sending DTMF somewhere else - sending DTMF somewhere else
4. Security Considerations 4. Security Considerations
Call Control primitives provide a powerful set of features that can Call Control primitives provide a powerful set of features that can
be dangerous in the hands of an attacker. To complicate matters, be dangerous in the hands of an attacker. To complicate matters,
call control primitives are likely to be automatically authorized call control primitives are likely to be automatically authorized
without direct human oversight. without direct human oversight.
The class of attacks that are possible using these tools include the The class of attacks that are possible using these tools include the
skipping to change at page 30, line 29 skipping to change at page 31, line 32
5. IANA Considerations 5. IANA Considerations
This document required no action by IANA. This document required no action by IANA.
6. Appendix A: Example Features 6. Appendix A: Example Features
Primitives are defined in terms of their ability to provide features. Primitives are defined in terms of their ability to provide features.
These example features should require an amply robust set of services These example features should require an amply robust set of services
to demonstrate a useful set of primitives. They are described here to demonstrate a useful set of primitives. They are described here
briefly. Note that the descriptions of these features are non- briefly. Note that the descriptions of these features are non-
normative. Some of these features are used as examples in section 6 normative. Note also that this document describes a mixture of both
to demonstrate how some features may require certain media features originating in the world of telephones, and features that
relationships. Note also that this document describes a mixture of are clearly Internet oriented.
both features originating in the world of telephones, and features
that are clearly Internet oriented.
Example Feature Definitions:
Attended Transfer - The transferring party establishes a session with
the transfer target before completing the transfer.
Auto Answer - Calls to a certain address or location answer
immediately via a speakerphone.
Automatic Callback: Alice calls Bob, but Bob is busy. Alice would
like Bob to call her automatically when he is available. When Bob
hangs up, Alice's phone rings. When Alice answers, Bob's phone
rings. Bob answers and they talk.
Barge-in - Carol interrupts Alice who has an in-progress call
with Bob. In some variations, Alice forcibly joins a new conversation
with Carol, in other variations, all three parties are placed in the
same conversation (basically a 3-way conference).
Blind Transfer - Alice is in a conversation with Bob. Alice asks Bob
to contact Carol, but makes no attempt to contact Carol
independently. In many implementations, Alice does not verify Bob's
success or failure in contacting Carol.
Call Forwarding - Before a dialog is accepted it is redirected to 6.1. Attended Transfer
another location, for example, because the originally intended
recipient is busy, does not answer, is disconnected from the network,
or has configured all requests to go somewhere else.
Call Monitoring - A call center supervisor joins an in-progress call In Attended Transfer [I-D.ietf-sipping-cc-transfer] the transferring
for monitoring purposes. party establishes a session with the transfer target before
completing the transfer.
Call Park - A call participant parks a call (essentially puts the 6.2. Auto Answer
call on hold), and then retrieves it at a later time (typically from
another location).
Call Pickup - A party picks up a call that was ringing at another In Auto Answer, calls to a certain address or URI answer immediately
location. One variation allows the caller to choose which location, via a speakerphone. The Answer-Mode [RFC5373] header field can be
another variation just picks up any call in that user's "pickup used for this feature.
group".
Call Return - Alice calls Bob. Bob misses the call or is disconnected 6.3. Automatic Callback
before he is finished talking to Alice. Bob invokes Call return that
calls Alice, even if Alice did not provide her real identity or
location to Bob.
Call Waiting - Alice is in a call, then receives another call. Alice In Automatic Callback [RFC5359], Alice calls Bob, but Bob is busy.
can place the first call on hold, and talk with the other caller. Alice would like Bob to call her automatically when he is available.
She can typically switch back and forth between the callers. When Bob hangs up, Alice's phone rings. When Alice answers, Bob's
phone rings. Bob answers and they talk.
Click-to-dial - Alice looks in her company directory for Bob. When 6.4. Barge-in
she finds Bob, she clicks on a URI to call him. Her phone rings (or
possibly answers automatically), and when she answers, Bob's phone
rings.
Conference Call - Three or more active, visible participants in the In Barge-in, Carol interrupts Alice who has an in-progress call
same conversation space. with Bob. In some variations, Alice forcibly joins a new conversation
with Carol, in other variations, all three parties are placed in the
same conversation (basically a 3-way conference). Barge-in works the
same as call monitoring except that it must indicate that the send
media stream is to be mixed so that all of the other parties can hear
the stream from the UA which is barging in.
Consultative transfer - the transferring party establishes a session 6.5. Blind Transfer
with the target and mixes both sessions together so that all three
parties can participate, then disconnects leaving the transferee and
transfer target with an active session.
Distinctive ring - Incoming calls have different ring cadences or In Blind Transfer [I-D.ietf-sipping-cc-transfer], Alice is in a
sample sounds depending on the From party, the To party, or other conversation with Bob. Alice asks Bob to contact Carol, but makes no
factors. attempt to contact Carol independently. In many implementations,
Alice does not verify Bob's success or failure in contacting Carol.
Do Not Disturb - Alice selects the Do Not Disturb option. Calls to 6.6. Call Forwarding
her either ring briefly or not at all and are forwarded elsewhere.
Some variations allow specially authorized callers to override this
feature and ring Alice anyway.
Find-Me - Alice sets up complicated rules for how she can be reached In call forwarding [RFC5359], before a dialog is accepted it is
(possibly using CPL (Call Processing Language) [RFC3880], presence redirected to another location, for example, because the originally
RFC3856 [RFC3264], or other factors). When Bob calls Alice, his call intended recipient is busy, does not answer, is disconnected from the
is eventually routed to a temporary Contact where Alice happens to be network, or has configured all requests to go somewhere else.
available.
Hotline - Alice picks up a phone and is immediately connected to the 6.7. Call Monitoring
technical support hotline, for example.
IM Conference Alerts: A user receives a notification as an Instant Call monitoring is a Join [RFC3911] operation. For example, a call
Message whenever someone joins a conference they are also in. center supervisor joins an in-progress call for monitoring purposes.
The monitoring UA sends a Join to the dialog it wants to listen to.
It is able to discover the dialog via the dialog state on the
monitored UA. The monitoring UA sends SDP in the INVITE that
indicates receive only media. As the UA is monitoring only it does
not matter whether the UA indicates it wishes the send stream be mixed
or point to point.
Inbound Call Screening - Alice doesn't want to receive calls from 6.8. Call Park
Matt. Inbound Screening prevents Matt from disturbing Alice. In
some variations this works even if Matt hides his identity.
Intercom - Alice typically presses a button on a phone that In Call Park [RFC5359], a participant parks a call (essentially puts
immediately connects to another user or phone and causes that phone the call on hold), and then retrieves it at a later time (typically
to play her voice over its speaker. Some variations immediately from another location). Call park requires the ability to: put a
setup two-way communications, other variations require another button dialog some place, advertise it to users in a pickup group and to
to be pressed to enable a two-way conversation. uniquely identify it in a means that can be communicated (including
human voice). The dialog can be held locally on the UA parking the
dialog or alternatively transferred to the park service for the
pickup group. The parked dialog then needs to be labeled (e.g. orbit
12) in a way that can be communicated to the party that is to pick up
the call. The UAs in the pickup group discover the parked
dialog(s) via the dialog package from the park service. If the
dialog is parked locally the park service merely aggregates the
parked call states from the set of UAs in the pickup group.
Message Waiting - Bob calls Alice when she steps away from her phone, 6.9. Call Pickup
when she returns a visible or audible indicator conveys that someone
has left her a voicemail message. The message waiting indication may
also convey how many messages are waiting, from whom, what time, and
other useful pieces of information.
Music on Hold - When Alice places a call with Bob on hold, it There are two different features that are called Call Pickup
replaces its audio with streaming content such as music, [RFC5359]. The first is the pickup of a parked dialog. The UA from
announcements, or advertisements. which the dialog is to be picked up subscribes to the dialog state of
the park service or the UA that has locally parked the dialog.
Dialogs that are parked should be labeled with an identifier. The
labels are used by the UA to allow the user to indicate which dialog
is to be picked up. The UA picking up the call invokes the URI in
the call state that is labeled as replace-remote.
Outbound Call Screening - Alice is paged and unknowingly calls a PSTN The other call pickup feature involves picking up an early dialog
pay-service telephone number in the Caribbean, but local policy (typically ringing). A party picks up a call that was ringing at
blocks her call, and possibly informs her why. another location. One variation allows the caller to choose which
location, another variation just picks up any call in that user's
"pickup group". This feature uses some of the same primitives as the
pick up of a parked call. The call state of the UA ringing phone is
advertised using the dialog package. The UA that is to pickup the
early dialog subscribes either directly to the ringing UA or to a
service aggregating the states for UAs in the pickup group. The call
state identifies early dialogs. The UA uses the call state(s) to
help the user choose which early dialog is to be picked up. The
UA then invokes the URI in the call state labeled as replace-remote.
Pre-paid calling - Alice pays for a certain currency or unit amount 6.10. Call Return
of calling value. When she places a call, she provides her account
number somehow. If her account runs out of calling value during a
call her call is disconnected or redirected to a service where she
can purchase more calling value.
Presence-Enabled Conferencing: Alice wants to set up a conference In Call Return, Alice calls Bob. Bob misses the call or is
call with Bob and Cathy when they all happen to be available (rather disconnected before he is finished talking to Alice. Bob invokes
than scheduling a predefined time). The server providing the Call return that calls Alice, even if Alice did not provide her real
application monitors their status, and calls all three when they are identity or location to Bob.
all "online", not idle, and not in another call.
Single Line Extension/Multiple Line Appearance -- A group of phones 6.11. Call Waiting
are all treated as "extensions" of a single line. A call for one
rings them all. As soon as one answers, the others stop ringing. If
any extension is actively in a conversation, another extension can
"pick up" and immediately join the conversation. This emulates the
behavior of a home telephone line with multiple phones.
Speakerphone paging - Alice calls the paging address and speaks. Her In Call Waiting, Alice is in a call, then receives another call.
voice is played on the speaker of every idle phone in a preconfigured Alice can place the first call on hold, and talk with the other
group of phones. caller. She can typically switch back and forth between the callers.
Speed dial - Alice dials an abbreviated number, or enters an alias, 6.12. Click-to-Dial
or presses a special speed dial button representing Bob. Her action
is interpreted as if she specified the full address of Bob.
Voice message screening - Bob calls Alice. Alice is screening her In Click-to-Dial [RFC5359], Alice looks in her company directory for
calls, so Bob hears Alice's voicemail greeting. Alice can hear Bob Bob. When she finds Bob, she clicks on a URI to call him. Her phone
leave his message. If she decides to talk to Bob, she can take the rings (or possibly answers automatically), and when she answers,
call back from the voicemail system, otherwise she can let Bob leave Bob's phone rings. The application or server that hosts the Click-
a message. This emulates the behavior of a home telephone answering to-Dial application captures the URI to be dialed and can setup the
machine call using 3pcc or can send a REFER request to the UA that is to dial
the address. As users sometimes change their mind or wish to give up
listening to a ringing or voicemail answered phone, this application
illustrates the need to also have the ability to remotely hangup a
call.
Voice Portal - A service that allows users to access a portal site 6.13. Conference Call
using spoken dialog interaction. For example, Alice needs to
schedule a working dinner with her co-worker Carol. Alice uses a
voice portal to check Carol's flight schedule, find a restaurant near
her hotel, make a reservation, get directions there, and page Carol
with this information.
Whispered call waiting - Alice is in a conversation with Bob. Carol In a Conference Call [RFC4579], there are three or more active,
calls Alice. Either Carol can "whisper" to Alice directly ("Can you visible participants in the same conversation space.
get lunch in 15 minutes?"), or an automaton whispers to Alice
informing her that Carol is trying to reach her.
6.1. Implementation of these features 6.14. Consultative Transfer
Example Features: In Consultative Transfer [I-D.ietf-sipping-cc-transfer], the
transferring party establishes a session with the target and mixes
both sessions together so that all three parties can participate,
then disconnects leaving the transferee and transfer target with an
active session.
Attended Transfer [I-D.ietf-sipping-cc-transfer] 6.15. Distinctive Ring
Auto Answer [I-D.ietf-sip-answermode]
Automatic Callback Two person presence-based conference
Barge-in Section 6.1.1
Blind Transfer [I-D.ietf-sipping-cc-transfer]
Call Forwarding Proxy or Local implementation
Call Hold [I-D.ietf-sipping-service-examples]
Call Monitoring Section 6.1.2
Call Park Sec 6.1.3, [I-D.ietf-sipping-service-examples]
Call Pickup Sec 6.1.4, [I-D.ietf-sipping-service-examples]
Call Return Proxy feature
Call Waiting Local Implementation
Click-to-dial Sec 6.1.5, [I-D.ietf-sipping-service-examples]
Conference Call [RFC4579]
Presence-based
Conferencing [RFC4579], [RFC3856]
Consultative transfer [I-D.ietf-sipping-cc-transfer]
Distinctive ring Section 6.1.6, Proxy or Local implementation
Do Not Disturb [RFC3856]
Find-Me Proxy service based on presence
Hotline Local Implementation
IM Conference Alerts Subscribe to conference status
Inbound Call Screening Proxy or Local implementation
Intercom Section 6.1.7, [I-D.ietf-sip-answermode]
Message Waiting [RFC3842]
Multiple Appearances Section 6.1.10
Music on Hold Sec 6.1.8, [I-D.ietf-sipping-service-examples]
Outbound Call Screening Proxy feature
Pre-Paid Calling Section 6.1.9
Single Line Extension Section 6.1.10
Speakerphone paging Section 6.1.11, Speed dial + Auto Answer
Speed dial Local Implementation
Voice Message Screening Section 6.1.12
Voice Portal Section 6.1.13
Whispered call waiting Local implementation
6.1.1. Barge-in In Distinctive Ring, incoming calls have different ring cadences or
sample sounds depending on the From party, the To party, or other
factors. The target UA either makes a local decision based on
information in an incoming INVITE (To, From, Contact, Request-URI) or
trusts an Alert-Info [RFC3261] header provided by the caller or
inserted by a trusted proxy. In the latter case, the UA fetches the
content described in the URI (typically via http) and renders it to
the user.
Barge-in works the same as call monitoring except that it must 6.16. Do Not Disturb
indicate that the send media stream is to be mixed so that all of the
other parties can hear the stream from UA barging in.
6.1.2. Call Monitoring In Do Not Disturb, Alice selects the Do Not Disturb option. Calls to
her either ring briefly or not at all and are forwarded elsewhere.
Some variations allow specially authorized callers to override this
feature and ring Alice anyway. Do Not Disturb is best implemented in
SIP using presence [RFC3856].
Call monitoring is a Join operation. The monitoring UA sends a Join 6.17. Find-Me
to the dialog it wants to listen to. It is able to discover the
dialog via the dialog state on the monitored UA. The monitoring UA
sends SDP in the INVITE that indicates receive only media. As the UA
is monitoring only it does not matter whether the UA indicates it
wishes the send stream be mixed or point to point.
6.1.3. Call Park In Find-Me, Alice sets up complicated rules for how she can be
reached (possibly using CPL (Call Processing Language) [RFC3880],
presence [RFC3856], or other factors). When Bob calls Alice, his
call is eventually routed to a temporary Contact where Alice happens
to be available.
Call park requires the ability to: put a dialog some place, advertise 6.18. Hotline
it to users in a pickup group and to uniquely identify it in a means
that can be communicated (including human voice). The dialog can be
held locally on the UA parking the dialog or alternatively
transferred to the park service for the pickup group. The parked
dialog then needs to be labeled (e.g. orbit 12) in a way that can be
communicated to the party that is to pick up the call. The UAs in
the pickup group discover the parked dialog(s) via the dialog
package from the park service. If the dialog is parked locally the
park service merely aggregates the parked call states from the set of
UAs in the pickup group.
6.1.4. Call Pickup In Hotline, Alice picks up a phone and is immediately connected to
the technical support hotline, for example. Hotline is also
sometimes known as a Ringdown line.
There are two different features that are called call pickup. The 6.19. IM Conference Alerts
first is the pickup of a parked dialog. The UA from which the dialog
is to be picked up subscribes to the dialog state of the park service
or the UA that has locally parked the dialog. Dialogs that are
parked should be labeled with an identifier. The labels are used by
the UA to allow the user to indicate which dialog is to be picked up.
The UA picking up the call invokes the URI in the call state that is
labeled as replace-remote.
The other call pickup feature involves picking up an early dialog In IM Conference Alerts, a user receives a notification as an
(typically ringing). This feature uses some of the same primitives Instant Message whenever someone joins a conference they are also in.
as the pick up of a parked call. The call state of the UA ringing
phone is advertised using the dialog package. The UA that is to
pickup the early dialog subscribes either directly to the ringing UA
or to a service aggregating the states for UAs in the pickup group.
The call state identifies early dialogs. The UA uses the call
state(s) to help the user choose which early dialog is to be
picked up. The UA then invokes the URI in the call state labeled as
replace-remote.
6.1.5. Click-to-dial 6.20. Inbound Call Screening
The application or server that hosts the click-to-dial application In Inbound Call Screening, Alice doesn't want to receive calls from
captures the URI to be dialed and can setup the call using 3pcc or Matt. Inbound Screening prevents Matt from disturbing Alice. In
can send a REFER request to the UA that is to dial the address. As some variations this works even if Matt hides his identity.
users sometimes change their mind or wish to give up listening to a
ringing or voicemail answered phone, this application illustrates the
need to also have the ability to remotely hangup a call.
6.1.6. Distinctive ring 6.21. Intercom
The target UA either makes a local decision based on information in In Intercom, Alice typically presses a button on a phone that
an incoming INVITE (To, From, Contact, Request-URI) or trusts an immediately connects to another user or phone and causes that phone
Alert-Info header provided by the caller or inserted by a trusted to play her voice over its speaker. Some variations immediately
proxy. In the latter case, the UA fetches the content described in setup two-way communications, other variations require another button
the URI (typically via http) and renders it to the user. to be pressed to enable a two-way conversation. The UA initiates a
dialog using INVITE and the Answer-Mode: Auto header field as
described in [RFC5373]. The called UA accepts the INVITE with a 200
OK and automatically enables the speakerphone.
6.1.7. Intercom Alternatively this can be a local decision for the UA to auto answer
based upon called party identification.
The UA initiates a dialog using INVITE and the Answer-Mode: Auto 6.22. Message Waiting
header field as described in [I-D.ietf-sip-answermode]. The called
UA accepts the INVITE with a 200 OK and automatically enables the
speakerphone.
Alternatively this can be a local decision for the UA to answer based In Message Waiting [RFC3842], Bob calls Alice when she steps away
upon called party identification. from her phone, when she returns a visible or audible indicator
conveys that someone has left her a voicemail message. The message
waiting indication may also convey how many messages are waiting,
from whom, what time, and other useful pieces of information.
6.1.8. Music on Hold 6.23. Music on Hold
Music on hold can be implemented a number of ways. One way is to In Music on Hold [RFC5359], when Alice places a call with Bob on
transfer the held call to a holding service. When the UA wishes to hold, it replaces its audio with streaming content such as music,
take the call off hold it basically performs a take on the call from announcements, or advertisements. Music on hold can be implemented a
the holding service. This involves subscribing to call state on the number of ways. One way is to transfer the held call to a holding
holding service and then invoking the URI in the call state labeled service. When the UA wishes to take the call off hold it basically
as replace-remote. performs a take on the call from the holding service. This involves
subscribing to call state on the holding service and then invoking
the URI in the call state labeled as replace-remote.
Alternatively music on hold can be performed as a local mixing Alternatively music on hold can be performed as a local mixing
operation. The UA holding the call can mix in the music from the operation. The UA holding the call can mix in the music from the
music service via RTP (i.e. an additional dialog) or RTSP or other music service via RTP (i.e. an additional dialog) or RTSP or other
streaming media source. This approach is simpler (i.e. the held streaming media source. This approach is simpler (i.e. the held
dialog does not move so there is less chance of losing them) from a dialog does not move so there is less chance of losing them) from a
protocol perspective, however it does use more LAN bandwidth and protocol perspective, however it does use more LAN bandwidth and
resources on the UA. resources on the UA.
6.1.9. Pre-paid calling 6.24. Outbound Call Screening
In Outbound Call Screening, Alice is paged and unknowingly calls a
PSTN pay-service telephone number in the Caribbean, but local policy
blocks her call, and possibly informs her why.
6.25. Pre-paid Calling
In Pre-paid Calling, Alice pays for a certain currency or unit amount
of calling value. When she places a call, she provides her account
number somehow. If her account runs out of calling value during a
call her call is disconnected or redirected to a service where she
can purchase more calling value.
For prepaid calling, the user's media always passes through a device For prepaid calling, the user's media always passes through a device
that is trusted by the pre-paid provider. This may be the other that is trusted by the pre-paid provider. This may be the other
endpoint (for example a PSTN gateway). In either case, an endpoint (for example a PSTN gateway). In either case, an
intermediary proxy or B2BUA can periodically verify the amount of intermediary proxy or B2BUA can periodically verify the amount of
time available on the pre-paid account, and use the session-timer time available on the pre-paid account, and use the session-timer
extension to cause the trusted endpoint (gateway) or intermediary extension to cause the trusted endpoint (gateway) or intermediary
(media relay) to send a reINVITE before that time runs out. During (media relay) to send a reINVITE before that time runs out. During
the reINVITE, the SIP intermediary can re-verify the account and the reINVITE, the SIP intermediary can re-verify the account and
insert another session-timer header. insert another session-timer header.
Note that while most pre-paid systems on the PSTN use an IVR to Note that while most pre-paid systems on the PSTN use an IVR to
collect the account number and destination, this isn't strictly collect the account number and destination, this isn't strictly
necessary for a SIP-originated prepaid call. SIP requests and SIP necessary for a SIP-originated prepaid call. SIP requests and SIP
URIs are sufficiently expressive to convey the final destination, the URIs are sufficiently expressive to convey the final destination, the
provider of the prepaid service, the location from which the user is provider of the prepaid service, the location from which the user is
calling, and the prepaid account they want to use. If a pre-paid IVR calling, and the prepaid account they want to use. If a pre-paid IVR
is used, the mechanism described below (Voice Portals) can be is used, the mechanism described below (Voice Portals) can be
combined as well. combined as well.
6.1.10. Single Line Extension/Multiple Line Appearance 6.26. Presence-Enabled Conferencing
Incoming calls ring all the extensions through basic parallel In Presence-Enabled Conferencing, Alice wants to set up a conference
forking. Each extension subscribes to dialog events from each other call with Bob and Cathy when they all happen to be available (rather
extension. While one user has an active call, any other UA extension than scheduling a predefined time). The server providing the
can insert itself into that conversation (it already knows the dialog application monitors their status, and calls all three when they are
all "online", not idle, and not in another call. This could be
implemented using conferencing [RFC4579] and presence [RFC3856]
primitives.
6.27. Single Line Extension/Multiple Line Appearance
In Single Line Extension/Multiple Line Appearances, a group of phones
are all treated as "extensions" of a single line or AOR. A call for
one rings them all. As soon as one answers, the others stop ringing.
If any extension is actively in a conversation, another extension can
"pick up" and immediately join the conversation. This emulates the
behavior of a home telephone line with multiple phones. Incoming
calls ring all the extensions through basic parallel forking. Each
extension subscribes to dialog events from each other extension.
While one user has an active call, any other UA extension can insert
itself into that conversation (it already knows the dialog
information) in the same way as barge-in. information) in the same way as barge-in.
Standardization work to allow line appearance numbers to be When implemented using SIP, this feature is known as Shared
coordinated across a group of UAs is currently underway. Appearances of an AOR [I-D.ietf-bliss-shared-appearances].
Extensions to the dialog package are used to convey appearance
numbers (line numbers).
6.1.11. Speakerphone paging 6.28. Speakerphone Paging
Speakerphone paging can be implemented using either multicast or In Speakerphone Paging, Alice calls the paging address and speaks.
through a simple multipoint mixer. In the multicast solution the Her voice is played on the speaker of every idle phone in a
paging UA sends a multicast INVITE with send only media in the SDP preconfigured group of phones. Speakerphone paging can be
(see also RFC3264). The automatic answer and enabling of the implemented using either multicast or through a simple multipoint
speakerphone is a locally configured decision on the paged UAs. The mixer. In the multicast solution the paging UA sends a multicast
paging UA sends RTP via the multicast address indicated in the SDP. INVITE with send only media in the SDP (see also RFC3264). The
automatic answer and enabling of the speakerphone is a locally
configured decision on the paged UAs. The paging UA sends RTP via
the multicast address indicated in the SDP.
The multipoint solution is accomplished by sending an INVITE to the The multipoint solution is accomplished by sending an INVITE to the
multipoint mixer. The mixer is configured to automatically answer multipoint mixer. The mixer is configured to automatically answer
the dialog. The paging UA then sends REFER requests for each of the the dialog. The paging UA then sends REFER requests for each of the
UAs that are to become paging speakers (The UA is likely to send out UAs that are to become paging speakers (The UA is likely to send out
a single REFER that is parallel forked by the proxy server). The UAs a single REFER that is parallel forked by the proxy server). The UAs
performing as paging speakers are configured to automatically answer performing as paging speakers are configured to automatically answer
based upon caller identification (e.g. To field, URI or Referred-To based upon caller identification (e.g. To field, URI or Referred-To
headers). headers).
Finally as a third option, the user agent can send a mass-invitation Finally as a third option, the user agent can send a mass-invitation
request to a conference server, which would create a conference and request to a conference server, which would create a conference and
send INVITEs containing the Answer-Mode: Auto header field to all send INVITEs containing the Answer-Mode: Auto header field to all
user agents in the paging group. user agents in the paging group.
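
The third option above can be sketched roughly as follows: a hypothetical conference server builds one INVITE per member of the paging group, each carrying the Answer-Mode: Auto header field. The function name and the example URIs are assumptions for illustration, and the output is deliberately incomplete (Via, Call-ID, CSeq, tags, and SDP are omitted), so these are not valid SIP requests as written.

```python
def paging_invite_headers(paging_uri, targets):
    """Return one list of header lines per paged UA.

    Only the lines relevant to the paging feature are produced; a real
    UA or conference server would add the remaining mandatory headers.
    """
    requests = []
    for target in targets:
        requests.append([
            "INVITE %s SIP/2.0" % target,
            "To: <%s>" % target,
            "From: <%s>" % paging_uri,
            # Asks the called UA to answer without user interaction.
            "Answer-Mode: Auto",
        ])
    return requests


if __name__ == "__main__":
    group = ["sip:phone1@example.com", "sip:phone2@example.com"]
    for lines in paging_invite_headers("sip:paging@example.com", group):
        print("\r\n".join(lines))
        print()
```

Whether a paged UA honors the request remains a local policy decision, as with the other paging approaches described above.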
6.1.12. Voice message screening 6.29. Speed Dial
At first, this is the same as call monitoring. In this case the In Speed Dial, Alice dials an abbreviated number, or enters an alias,
voicemail service is one of the UAs. The UA screening the message or presses a special speed dial button representing Bob. Her action
monitors the call on the voicemail service, and also subscribes to is interpreted as if she specified the full address of Bob.
dialog information. If the user screening their messages decides to
answer, they perform a Take from the voicemail system (for example,
send an INVITE with Replaces to the UA leaving the message).
6.1.13. Voice Portal 6.30. Voice Message Screening
A voice portal is essentially a complex collection of voice dialogs In Voice Message Screening, Bob calls Alice. Alice is screening her
used to access interesting content. One of the most desirable call calls, so Bob hears Alice's voicemail greeting. Alice can hear Bob
control features of a Voice Portal is the ability to start a new leave his message. If she decides to talk to Bob, she can take the
outgoing call from within the context of the Portal (to make a call back from the voicemail system, otherwise she can let Bob leave
restaurant reservation, or return a voicemail message for example). a message. This emulates the behavior of a home telephone answering
Once the new call is over, the user should be able to return to the machine.
Portal by pressing a special key, using some DTMF sequence (ex: a
very long pound or hash tone), or by speaking a hotword (ex: "Main At first, this is the same as Call Monitoring (Section 6.7). In this
Menu"). case the voicemail service is one of the UAs. The UA screening the
message monitors the call on the voicemail service, and also
subscribes to dialog information. If the user screening their
messages decides to answer, they perform a Take from the voicemail
system (for example, send an INVITE with Replaces to the UA leaving
the message).
6.31. Voice Portal
Voice Portal is a service that allows users to access a portal site
using spoken dialog interaction. For example, Alice needs to
schedule a working dinner with her co-worker Carol. Alice uses a
voice portal to check Carol's flight schedule, find a restaurant near
her hotel, make a reservation, get directions there, and page Carol
with this information. A voice portal is essentially a complex
collection of voice dialogs used to access interesting content. One
of the most desirable call control features of a Voice Portal is the
ability to start a new outgoing call from within the context of the
Portal (to make a restaurant reservation, or return a voicemail
message for example). Once the new call is over, the user should be
able to return to the Portal by pressing a special key, using some
DTMF sequence (ex: a very long pound or hash tone), or by speaking a
key word (ex: "Main Menu").
In order to accomplish this, the Voice Portal starts with the In order to accomplish this, the Voice Portal starts with the
following media relationship: following media relationship:
{ User , Voice Portal } { User , Voice Portal }
The user then asks to make an outgoing call. The Voice Portal asks The user then asks to make an outgoing call. The Voice Portal asks
the User to perform a Far-Fork. In other words the Voice Portal the User to perform a Far-Fork. In other words the Voice Portal
wants the following media relationship: wants the following media relationship:
{ Target , User } & { User , Voice Portal } { Target , User } & { User , Voice Portal }
The Voice Portal is now just listening for a hotword or the The Voice Portal is now just listening for a key word or the
appropriate DTMF. As soon as the user indicates they are done, the appropriate DTMF. As soon as the user indicates they are done, the
Voice Portal takes the call from the old Target, and we are back to Voice Portal takes the call from the old Target, and we are back to
the original media relationship. the original media relationship.
This feature can also be used by the account number and phone number This feature can also be used by the account number and phone number
collection menu in a pre-paid calling service. A user can press a collection menu in a pre-paid calling service. A user can press a
DTMF sequence that presents them with the appropriate menu again. DTMF sequence that presents them with the appropriate menu again.
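
The sequence of media relationships in the Voice Portal example can be traced with a small bookkeeping sketch. It models each relationship as an unordered pair and represents the Far-Fork and the later take as set operations; the function names are hypothetical, and no SIP signaling is modeled.

```python
def far_fork(relationships, user, target):
    """Add a { target, user } relationship alongside the existing ones."""
    return relationships | {frozenset((target, user))}


def take(relationships, user, target):
    """Remove the { target, user } relationship once the user is done."""
    return relationships - {frozenset((target, user))}


# { User , Voice Portal }
start = {frozenset(("User", "Voice Portal"))}

# { Target , User } & { User , Voice Portal }
during = far_fork(start, "User", "Target")

# Back to the original relationship after the Voice Portal takes the call.
after = take(during, "User", "Target")
```

The point of the sketch is only that the { User , Voice Portal } relationship persists throughout, which is what lets the portal keep listening for the key word or DTMF sequence.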
+ 6.32. Voicemail
+ In Voicemail, Alice calls Bob who does not answer or is not
+ available. The call forwards to a voicemail server which plays Bob's
+ greeting and records Alice's message for Bob. An indication is sent
+ to Bob that a new message is waiting, and he retrieves the message at
+ a later date. This feature is implemented using features such as
+ Call Forwarding (Section 6.6) and the History-Info [RFC4244] header
+ field or voicemail URI [RFC4458] convention and Message Waiting
+ [RFC3842] features.
+ 6.33. Whispered Call Waiting
+ In Whispered Call Waiting, Alice is in a conversation with Bob. Carol
+ calls Alice. Either Carol can "whisper" to Alice directly ("Can you
+ get lunch in 15 minutes?"), or an automaton whispers to Alice
+ informing her that Carol is trying to reach her.
7. Acknowledgements
The authors would like to acknowledge Ben Campbell for his
contributions to the document and thank AC Mahendran, John Elwell,
and Xavier Marjou for their detailed Working Group review of the
document.
8. Informative References
[RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
skipping to change at page 39, line 18 skipping to change at page 40, line 22
[RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
with Session Description Protocol (SDP)", RFC 3264,
June 2002.
[RFC3265] Roach, A., "Session Initiation Protocol (SIP)-Specific
Event Notification", RFC 3265, June 2002.
[RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
Description Protocol", RFC 4566, July 2006.
- [I-D.ietf-sipping-service-examples]
- Johnston, A., Sparks, R., Cunningham, C., Donovan, S., and
- K. Summers, "Session Initiation Protocol Service
- Examples", draft-ietf-sipping-service-examples-14 (work in
- progress), February 2008.
+ [RFC5359] Johnston, A., Sparks, R., Cunningham, C., Donovan, S., and
+ K. Summers, "Session Initiation Protocol Service
+ Examples", BCP 144, RFC 5359, October 2008.
[RFC3725] Rosenberg, J., Peterson, J., Schulzrinne, H., and G.
Camarillo, "Best Current Practices for Third Party Call
Control (3pcc) in the Session Initiation Protocol (SIP)",
BCP 85, RFC 3725, April 2004.
[RFC3515] Sparks, R., "The Session Initiation Protocol (SIP) Refer
Method", RFC 3515, April 2003.
[RFC3891] Mahy, R., Biggs, B., and R. Dean, "The Session Initiation
skipping to change at page 40, line 15 skipping to change at page 41, line 16
[RFC4353] Rosenberg, J., "A Framework for Conferencing with the
Session Initiation Protocol (SIP)", RFC 4353,
February 2006.
[I-D.ietf-sipping-app-interaction-framework]
Rosenberg, J., "A Framework for Application Interaction in
the Session Initiation Protocol (SIP)",
draft-ietf-sipping-app-interaction-framework-05 (work in
progress), July 2005.
- [I-D.ietf-sipping-transc-framework]
- Camarillo, G., "Framework for Transcoding with the Session
- Initiation Protocol (SIP)",
- draft-ietf-sipping-transc-framework-05 (work in progress),
- December 2006.
+ [RFC5369] Camarillo, G., "Framework for Transcoding with the Session
+ Initiation Protocol (SIP)", RFC 5369, October 2008.
+ [I-D.ietf-xcon-ccmp]
+ Barnes, M., Boulton, C., Romano, S., and H. Schulzrinne,
+ "Centralized Conferencing Manipulation Protocol",
+ draft-ietf-xcon-ccmp-01 (work in progress), November 2008.
[I-D.ietf-sipping-cc-transfer]
- Sparks, R., "Session Initiation Protocol Call Control -
- Transfer", draft-ietf-sipping-cc-transfer-09 (work in
- progress), December 2007.
+ Sparks, R. and A. Johnston, "Session Initiation Protocol
+ Call Control - Transfer",
+ draft-ietf-sipping-cc-transfer-12 (work in progress),
+ March 2009.
[RFC4579] Johnston, A. and O. Levin, "Session Initiation Protocol
(SIP) Call Control - Conferencing for User Agents",
BCP 119, RFC 4579, August 2006.
[RFC3840] Rosenberg, J., Schulzrinne, H., and P. Kyzivat,
"Indicating User Agent Capabilities in the Session
Initiation Protocol (SIP)", RFC 3840, August 2004.
[RFC3841] Rosenberg, J., Schulzrinne, H., and P. Kyzivat, "Caller
Preferences for the Session Initiation Protocol (SIP)",
RFC 3841, August 2004.
[RFC3087] Campbell, B. and R. Sparks, "Control of Service Context
using SIP Request-URI", RFC 3087, April 2001.
- [I-D.mahy-sip-remote-cc]
- Jennings, C. and R. Mahy, "Remote Call Control in the
- Session Initiation Protocol (SIP) using the REFER method
- and the session-oriented dialog package",
- draft-mahy-sip-remote-cc-05 (work in progress),
- March 2007.
+ [I-D.audet-sipping-feature-ref]
+ Audet, F., Johnston, A., Mahy, R., and C. Jennings,
+ "Feature Referral in the Session Initiation Protocol
+ (SIP)", draft-audet-sipping-feature-ref-00 (work in
+ progress), February 2008.
[RFC4240] Burger, E., Van Dyke, J., and A. Spitzer, "Basic Network
Media Services with SIP", RFC 4240, December 2005.
[RFC4458] Jennings, C., Audet, F., and J. Elwell, "Session
Initiation Protocol (SIP) URIs for Applications such as
Voicemail and Interactive Voice Response (IVR)", RFC 4458,
April 2006.
[RFC4538] Rosenberg, J., "Request Authorization through Dialog
Identification in the Session Initiation Protocol (SIP)",
RFC 4538, June 2006.
[RFC3880] Lennox, J., Wu, X., and H. Schulzrinne, "Call Processing
Language (CPL): A Language for User Control of Internet
Telephony Services", RFC 3880, October 2004.
- [I-D.ietf-sip-answermode]
- Willis, D. and A. Allen, "Requesting Answering Modes for
- the Session Initiation Protocol (SIP)",
- draft-ietf-sip-answermode-06 (work in progress),
- September 2007.
+ [RFC5373] Willis, D. and A. Allen, "Requesting Answering Modes for
+ the Session Initiation Protocol (SIP)", RFC 5373,
+ November 2008.
[RFC3842] Mahy, R., "A Message Summary and Message Waiting
Indication Event Package for the Session Initiation
Protocol (SIP)", RFC 3842, August 2004.
+ [I-D.ietf-bliss-shared-appearances]
+ Johnston, A., Soroushnejad, M., and V. Venkataramanan,
+ "Shared Appearances of a Session Initiation Protocol (SIP)
+ Address of Record (AOR)",
+ draft-ietf-bliss-shared-appearances-01 (work in progress),
+ November 2008.
+ [RFC4244] Barnes, M., "An Extension to the Session Initiation
+ Protocol (SIP) for Request History Information", RFC 4244,
+ November 2005.
+ [RFC4313] Oran, D., "Requirements for Distributed Control of
+ Automatic Speech Recognition (ASR), Speaker
+ Identification/Speaker Verification (SI/SV), and Text-to-
+ Speech (TTS) Resources", RFC 4313, December 2005.
Authors' Addresses
Rohan Mahy
Plantronics
- 345 Encincal Street
- Santa Cruz, CA
- USA
Email: rohan@ekabal.com
Robert Sparks
- Estacado Systems
+ Tekelek
Email: rjsparks@nostrum.com
Jonathan Rosenberg
Cisco Systems
Email: jdrosen@cisco.com
Dan Petrie
SIP EZ
Email: dpetrie@sipez.com
Alan Johnston (editor)
Avaya
Email: alan@sipstation.com
- Full Copyright Statement
- Copyright (C) The IETF Trust (2008).
- This document is subject to the rights, licenses and restrictions
- contained in BCP 78, and except as set forth therein, the authors
- retain all their rights.
- This document and the information contained herein are provided on an
- "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
- OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
- THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
- OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
- THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
- WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
- Intellectual Property
- The IETF takes no position regarding the validity or scope of any
- Intellectual Property Rights or other rights that might be claimed to
- pertain to the implementation or use of the technology described in
- this document or the extent to which any license under such rights
- might or might not be available; nor does it represent that it has
- made any independent effort to identify any such rights. Information
- on the procedures with respect to rights in RFC documents can be
- found in BCP 78 and BCP 79.
- Copies of IPR disclosures made to the IETF Secretariat and any
- assurances of licenses to be made available, or the result of an
- attempt made to obtain a general license or permission for the use of
- such proprietary rights by implementers or users of this
- specification can be obtained from the IETF on-line IPR repository at
- http://www.ietf.org/ipr.
- The IETF invites any interested party to bring to its attention any
- copyrights, patents or patent applications, or other proprietary
- rights that may cover technology that may be required to implement
- this standard. Please address the information to the IETF at
- ietf-ipr@ietf.org.
 End of changes. 108 change blocks. 
426 lines changed or deleted 474 lines changed or added

This html diff was produced by rfcdiff 1.35. The latest version is available from http://tools.ietf.org/tools/rfcdiff/