draft-ietf-sipping-app-interaction-framework-05.txt   rfc5629.txt 
SIP J. Rosenberg Network Working Group J. Rosenberg
Internet-Draft Cisco Systems Request for Comments: 5629 Cisco Systems
Expires: January 19, 2006 July 18, 2005 Category: Standards Track October 2009
A Framework for Application Interaction in the Session Initiation
Protocol (SIP)
draft-ietf-sipping-app-interaction-framework-05
Status of this Memo
By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on January 19, 2006.
Copyright Notice
Copyright (C) The Internet Society (2005). A Framework for Application Interaction
in the Session Initiation Protocol (SIP)
Abstract Abstract
This document describes a framework for the interaction between users This document describes a framework for the interaction between users
and Session Initiation Protocol (SIP) based applications. By and Session Initiation Protocol (SIP) based applications. By
interacting with applications, users can guide the way in which they interacting with applications, users can guide the way in which they
operate. The focus of this framework is stimulus signaling, which operate. The focus of this framework is stimulus signaling, which
allows a user agent to interact with an application without knowledge allows a user agent (UA) to interact with an application without
of the semantics of that application. Stimulus signaling can occur knowledge of the semantics of that application. Stimulus signaling
to a user interface running locally with the client, or to a remote can occur to a user interface running locally with the client, or to
user interface, through media streams. Stimulus signaling a remote user interface, through media streams. Stimulus signaling
encompasses a wide range of mechanisms, ranging from clicking on encompasses a wide range of mechanisms, ranging from clicking on
hyperlinks, to pressing buttons, to traditional Dual Tone Multi hyperlinks, to pressing buttons, to traditional Dual-Tone Multi-
Frequency (DTMF) input. In all cases, stimulus signaling is Frequency (DTMF) input. In all cases, stimulus signaling is
supported through the use of markup languages, which play a key role supported through the use of markup languages, which play a key role
in this framework. in this framework.
Status of This Memo
This document specifies an Internet standards track protocol for the
Internet community, and requests discussion and suggestions for
improvements. Please refer to the current edition of the "Internet
Official Protocol Standards" (STD 1) for the standardization state
and status of this protocol. Distribution of this memo is unlimited.
Copyright Notice
Copyright (c) 2009 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the BSD License.
RFC 5629 App Interaction Framework October 2009
This document may contain material from IETF Documents or IETF
Contributions published or made publicly available before November
10, 2008. The person(s) controlling the copyright in some of this
material may not have granted the IETF Trust the right to allow
modifications of such material outside the IETF Standards Process.
Without obtaining an adequate license from the person(s) controlling
the copyright in such materials, this document may not be modified
outside the IETF Standards Process, and derivative works of it may
not be created outside the IETF Standards Process, except to format
it for publication as an RFC or to translate it into languages other
than English.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4
2. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 4 2. Conventions Used in This Document . . . . . . . . . . . . . . 4
3. A Model for Application Interaction . . . . . . . . . . . . . 7 3. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 4
3.1 Functional vs. Stimulus . . . . . . . . . . . . . . . . . 9 4. A Model for Application Interaction . . . . . . . . . . . . . 7
3.2 Real-Time vs. Non-Real Time . . . . . . . . . . . . . . . 9 4.1. Functional vs. Stimulus . . . . . . . . . . . . . . . . . 9
3.3 Client-Local vs. Client-Remote . . . . . . . . . . . . . . 10 4.2. Real-Time vs. Non-Real-Time . . . . . . . . . . . . . . . 10
3.4 Presentation Capable vs. Presentation Free . . . . . . . . 11 4.3. Client-Local vs. Client-Remote . . . . . . . . . . . . . . 10
4. Interaction Scenarios on Telephones . . . . . . . . . . . . . 11 4.4. Presentation-Capable vs. Presentation-Free . . . . . . . . 11
4.1 Client Remote . . . . . . . . . . . . . . . . . . . . . . 12 5. Interaction Scenarios on Telephones . . . . . . . . . . . . . 11
4.2 Client Local . . . . . . . . . . . . . . . . . . . . . . . 12 5.1. Client Remote . . . . . . . . . . . . . . . . . . . . . . 12
4.3 Flip-Flop . . . . . . . . . . . . . . . . . . . . . . . . 12 5.2. Client Local . . . . . . . . . . . . . . . . . . . . . . . 12
5. Framework Overview . . . . . . . . . . . . . . . . . . . . . . 13 5.3. Flip-Flop . . . . . . . . . . . . . . . . . . . . . . . . 13
6. Deployment Topologies . . . . . . . . . . . . . . . . . . . . 16 6. Framework Overview . . . . . . . . . . . . . . . . . . . . . . 13
6.1 Third Party Application . . . . . . . . . . . . . . . . . 16 7. Deployment Topologies . . . . . . . . . . . . . . . . . . . . 16
6.2 Co-Resident Application . . . . . . . . . . . . . . . . . 17 7.1. Third-Party Application . . . . . . . . . . . . . . . . . 16
6.3 Third Party Application and User Device Proxy . . . . . . 18 7.2. Co-Resident Application . . . . . . . . . . . . . . . . . 17
6.4 Proxy Application . . . . . . . . . . . . . . . . . . . . 19 7.3. Third-Party Application and User Device Proxy . . . . . . 18
7. Application Behavior . . . . . . . . . . . . . . . . . . . . . 19 7.4. Proxy Application . . . . . . . . . . . . . . . . . . . . 19
7.1 Client Local Interfaces . . . . . . . . . . . . . . . . . 20 8. Application Behavior . . . . . . . . . . . . . . . . . . . . . 19
7.1.1 Discovering Capabilities . . . . . . . . . . . . . . . 20 8.1. Client-Local Interfaces . . . . . . . . . . . . . . . . . 20
7.1.2 Pushing an Initial Interface Component . . . . . . . . 20 8.1.1. Discovering Capabilities . . . . . . . . . . . . . . . 20
7.1.3 Updating an Interface Component . . . . . . . . . . . 22 8.1.2. Pushing an Initial Interface Component . . . . . . . . 20
7.1.4 Terminating an Interface Component . . . . . . . . . . 22 8.1.3. Updating an Interface Component . . . . . . . . . . . 22
7.2 Client Remote Interfaces . . . . . . . . . . . . . . . . . 23 8.1.4. Terminating an Interface Component . . . . . . . . . . 22
7.2.1 Originating and Terminating Applications . . . . . . . 23 8.2. Client-Remote Interfaces . . . . . . . . . . . . . . . . . 23
7.2.2 Intermediary Applications . . . . . . . . . . . . . . 23 8.2.1. Originating and Terminating Applications . . . . . . . 23
8. User Agent Behavior . . . . . . . . . . . . . . . . . . . . . 24 8.2.2. Intermediary Applications . . . . . . . . . . . . . . 24
8.1 Advertising Capabilities . . . . . . . . . . . . . . . . . 24 9. User Agent Behavior . . . . . . . . . . . . . . . . . . . . . 24
8.2 Receiving User Interface Components . . . . . . . . . . . 25 9.1. Advertising Capabilities . . . . . . . . . . . . . . . . . 24
8.3 Mapping User Input to User Interface Components . . . . . 26 9.2. Receiving User Interface Components . . . . . . . . . . . 25
8.4 Receiving Updates to User Interface Components . . . . . . 27 9.3. Mapping User Input to User Interface Components . . . . . 26
8.5 Terminating a User Interface Component . . . . . . . . . . 27 9.4. Receiving Updates to User Interface Components . . . . . . 27
9. Inter-Application Feature Interaction . . . . . . . . . . . . 27 9.5. Terminating a User Interface Component . . . . . . . . . . 27
9.1 Client Local UI . . . . . . . . . . . . . . . . . . . . . 28 10. Inter-Application Feature Interaction . . . . . . . . . . . . 27
9.2 Client-Remote UI . . . . . . . . . . . . . . . . . . . . . 29 10.1. Client-Local UI . . . . . . . . . . . . . . . . . . . . . 28
10. Intra Application Feature Interaction . . . . . . . . . . . 29 10.2. Client-Remote UI . . . . . . . . . . . . . . . . . . . . . 29
11. Example Call Flow . . . . . . . . . . . . . . . . . . . . . 30 11. Intra Application Feature Interaction . . . . . . . . . . . . 29
12. Security Considerations . . . . . . . . . . . . . . . . . . 35 12. Example Call Flow . . . . . . . . . . . . . . . . . . . . . . 30
13. IANA Considerations . . . . . . . . . . . . . . . . . . . . 36 13. Security Considerations . . . . . . . . . . . . . . . . . . . 36
14. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 36 14. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 36
15. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 36 15. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 36
16. References . . . . . . . . . . . . . . . . . . . . . . . . . 36 16. References . . . . . . . . . . . . . . . . . . . . . . . . . . 36
16.1 Normative References . . . . . . . . . . . . . . . . . . . 36 16.1. Normative References . . . . . . . . . . . . . . . . . . . 36
16.2 Informative References . . . . . . . . . . . . . . . . . . 37 16.2. Informative References . . . . . . . . . . . . . . . . . . 37
Author's Address . . . . . . . . . . . . . . . . . . . . . . . 38
Intellectual Property and Copyright Statements . . . . . . . . 39
1. Introduction 1. Introduction
The Session Initiation Protocol (SIP) [1] provides the ability for The Session Initiation Protocol (SIP) [2] provides the ability for
users to initiate, manage, and terminate communications sessions. users to initiate, manage, and terminate communications sessions.
Frequently, these sessions will involve a SIP application. A SIP Frequently, these sessions will involve a SIP application. A SIP
application is defined as a program running on a SIP-based element application is defined as a program running on a SIP-based element
(such as a proxy or user agent) that provides some value-added (such as a proxy or user agent) that provides some value-added
function to a user or system administrator. Examples of SIP function to a user or system administrator. Examples of SIP
applications include pre-paid calling card calls, conferencing, and applications include prepaid calling card calls, conferencing, and
presence-based [12] call routing. presence-based [12] call routing.
In order for most applications to properly function, they need input In order for most applications to properly function, they need input
from the user to guide their operation. As an example, a pre-paid from the user to guide their operation. As an example, a prepaid
calling card application requires the user to input their calling calling card application requires the user to input their calling
card number, their PIN code, and the destination number they wish to card number, their PIN code, and the destination number they wish to
reach. The process by which a user provides input to an application reach. The process by which a user provides input to an application
is called "application interaction". is called "application interaction".
Application interaction can be either functional or stimulus. Application interaction can be either functional or stimulus.
Functional interaction requires the user device to understand the Functional interaction requires the user device to understand the
semantics of the application, whereas stimulus interaction does not. semantics of the application, whereas stimulus interaction does not.
Stimulus signaling allows for applications to be built without Stimulus signaling allows for applications to be built without
requiring modifications to the user device. Stimulus interaction is requiring modifications to the user device. Stimulus interaction is
the subject of this framework. The framework provides a model for the subject of this framework. The framework provides a model for
how users interact with applications through user interfaces, and how how users interact with applications through user interfaces, and how
user interfaces and applications can be distributed throughout a user interfaces and applications can be distributed throughout a
network. This model is then used to describe how applications can network. This model is then used to describe how applications can
instantiate and manage user interfaces. instantiate and manage user interfaces.
2. Definitions 2. Conventions Used in This Document
SIP Application: A SIP application is defined as a program running on The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
a SIP-based element (such as a proxy or user agent) that provides "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
some value-added function to a user or system administrator. document are to be interpreted as described in [1].
Examples of SIP applications include pre-paid calling card calls,
conferencing, and presence-based [12] call routing.
Application Interaction: The process by which a user provides input 3. Definitions
SIP Application: A SIP application is defined as a program running
on a SIP-based element (such as a proxy or user agent) that
provides some value-added function to a user or system
administrator. Examples of SIP applications include prepaid
calling card calls, conferencing, and presence-based [12] call
routing.
Application Interaction: The process by which a user provides input
to an application. to an application.
Real-Time Application Interaction: Application interaction that takes
place while an application instance is executing. For example,
when a user enters their PIN number into a pre-paid calling card
application, this is real-time application interaction.
Non-Real Time Application Interaction: Application interaction that Real-Time Application Interaction: Application interaction that
takes place while an application instance is executing. For
example, when a user enters their PIN number into a prepaid
calling card application, this is real-time application
interaction.
Non-Real-Time Application Interaction: Application interaction that
takes place asynchronously with the execution of the application. takes place asynchronously with the execution of the application.
Generally, non-real time application interaction is accomplished Generally, non-real-time application interaction is accomplished
through provisioning. through provisioning.
Functional Application Interaction: Application interaction is Functional Application Interaction: Application interaction is
functional when the user device has an understanding of the functional when the user device has an understanding of the
semantics of the interaction with the application. semantics of the interaction with the application.
Stimulus Application Interaction: Application interaction is Stimulus Application Interaction: Application interaction is
considered to be stimulus when the user device has no stimulus when the user device has no understanding of the
understanding of the semantics of the interaction with the semantics of the interaction with the application.
application.
User Interface (UI): The user interface provides the user with User Interface (UI): The user interface provides the user with
context in order to make decisions about what they want. The user context to make decisions about what they want. The user
interacts with the device, which conveys the user input the the interacts with the device, which conveys the user input to the
user interface. The user interface interprets the information, user interface. The user interface interprets the information and
and passes it to the application. passes it to the application.
User Interface Component: A piece of user interface which operates User Interface Component: A piece of user interface that operates
independently of other pieces of the user interface. For example, independently of other pieces of the user interface. For example,
a user might have two separate web interfaces to a pre-paid a user might have two separate web interfaces to a prepaid calling
calling card application - one for hanging up and making another card application: one for hanging up and making another call, and
call, and another for entering the username and PIN. another for entering the username and PIN.
User Device: The software or hardware system that the user directly User Device: The software or hardware system that the user directly
interacts with in order to communicate with the application. An interacts with to communicate with the application. An example of
example of a user device is a telephone. Another example is a PC a user device is a telephone. Another example is a PC with a web
with a web browser. browser.
User Device Proxy: A software or hardware system that a user User Device Proxy: A software or hardware system that a user
indirectly interacts through in order to communicate with the indirectly interacts through to communicate with the application.
application. This indirection can be through a network. An This indirection can be through a network. An example is a
example is a gateway from IP to the Public Switched Telephone gateway from IP to the Public Switched Telephone Network (PSTN).
Network (PSTN). It acts a user device proxy, acting on behalf of It acts as a user device proxy, acting on behalf of the user on
the user on the circuit network. the circuit network.
User Input: The "raw" information passed from a user to a user User Input: The "raw" information passed from a user to a user
interface. Examples of user input include a spoken word or a interface. Examples of user input include a spoken word or a
click on a hyperlink. click on a hyperlink.
Client-Local User Interface: A user interface which is co-resident
Client-Local User Interface: A user interface that is co-resident
with the user device. with the user device.
Client-Remote User Interface: A user interface which executes Client-Remote User Interface: A user interface that executes
remotely from the user device. In this case, a standardized remotely from the user device. In this case, a standardized
interface is needed between the user device and the user interface is needed between the user device and the user
interface. Typically, this is done through media sessions - interface. Typically, this is done through media sessions: audio,
audio, video, or application sharing. video, or application sharing.
Markup Language: A markup language describes a logical flow of Markup Language: A markup language describes a logical flow of
presentation of information to the user, collection of information presentation of information to the user, collection of information
from the user, and transmission of that information to an from the user, and transmission of that information to an
application. application.
Media Interaction: A means of separating a user and a user interface Media Interaction: A means of separating a user and a user interface
by connecting them with media streams. by connecting them with media streams.
Interactive Voice Response (IVR): An IVR is a type of user interface Interactive Voice Response (IVR): An IVR is a type of user interface
that allows users to speak commands to the application, and hear that allows users to speak commands to the application, and hear
responses to those commands prompting for more information. responses to those commands prompting for more information.
Prompt-and-Collect: The basic primitive of an IVR user interface. Prompt-and-Collect: The basic primitive of an IVR user interface.
The user is presented with a voice option, and the user speaks The user is presented with a voice option, and the user speaks
their choice. their choice.
Barge-In: The act of entering information into an IVR user inteface Barge-In: The act of entering information into an IVR user interface
prior to the completion of a prompt requesting that information. prior to the completion of a prompt requesting that information.
Focus: A user interface component has focus when user input is Focus: A user interface component has focus when user input is
provided to it, as opposed to any other user interface components. provided to it, as opposed to any other user interface components.
This is not to be confused with the term focus within the SIP This is not to be confused with the term "focus" within the SIP
conferencing framework, which refers to the center user agent in a conferencing framework, which refers to the center user agent in a
conference [14]. conference [14].
Focus Determination: The process by which the user device determines Focus Determination: The process by which the user device determines
which user interface component will receive the user input. which user interface component will receive the user input.
Focusless Device: A user device which has no ability to perform focus Focusless Device: A user device that has no ability to perform focus
determination. An example of a focusless device is a telephone determination. An example of a focusless device is a telephone
with a keypad. with a keypad.
Presentation Capable UI: A user interface which can prompt the user Presentation-Capable UI: A user interface that can prompt the user
with input, collect results, and then prompt the user with new with input, collect results, and then prompt the user with new
information based on those results. information based on those results.
Presentation Free UI: A user interface which cannot prompt the user
Presentation-Free UI: A user interface that cannot prompt the user
with information. with information.
Feature Interaction: A class of problems which result when multiple Feature Interaction: A class of problems that result when multiple
applications or application components are trying to provide applications or application components are trying to provide
services to a user at the same time. services to a user at the same time.
Inter-Application Feature Interaction: Feature interactions that Inter-Application Feature Interaction: Feature interactions that
occur between applications. occur between applications.
DTMF: Dual-Tone Multi-Frequency. DTMF refer to a class of tones DTMF: Dual-Tone Multi-Frequency. DTMF refers to a class of tones
generated by circuit switched telephony devices when the user generated by circuit-switched telephony devices when the user
presses a key on the keypad. As a result, DTMF and keypad input presses a key on the keypad. As a result, DTMF and keypad input
are often used synonymously, when in fact one of them (DTMF) is are often used synonymously, when in fact one of them (DTMF) is
merely a means of conveying the other (the keypad input) to a merely a means of conveying the other (the keypad input) to a
client-remote user interface (the switch, for example). client-remote user interface (the switch, for example).
Application Instance: A single execution path of a SIP application. Application Instance: A single execution path of a SIP application.
Originating Application: A SIP application which acts as a UAC, Originating Application: A SIP application that acts as a User Agent
making a call on behalf of the user. Client (UAC), making a call on behalf of the user.
Terminating Application: A SIP application which acts as a UAS, Terminating Application: A SIP application that acts as a User Agent
answering a call generated by a user. IVR applications are Server (UAS), answering a call generated by a user. IVR
terminating applications. applications are terminating applications.
Intermediary Application: A SIP application which is neither the Intermediary Application: A SIP application that is neither the
caller nor callee, but rather, a third party involved in a call. caller nor callee, but rather a third party involved in a call.
3. A Model for Application Interaction 4. A Model for Application Interaction
+---+ +---+ +---+ +---+ +---+ +---+ +---+ +---+
| | | | | | | | | | | | | | | |
| | | U | | U | | A | | | | U | | U | | A |
| | Input | s | Input | s | Results | p | | | Input | s | Input | s | Results | p |
| | ---------> | e | ---------> | e | ----------> | p | | | ---------> | e | ---------> | e | ----------> | p |
| U | | r | | r | | l | | U | | r | | r | | l |
| s | | | | | | i | | s | | | | | | i |
| e | | D | | I | | c | | e | | D | | I | | c |
| r | Output | e | Output | f | Update | a | | r | Output | e | Output | f | Update | a |
| | <--------- | v | <--------- | a | <.......... | t | | | <--------- | v | <--------- | a | <.......... | t |
| | | i | | c | | i | | | | i | | c | | i |
| | | c | | e | | o | | | | c | | e | | o |
| | | e | | | | n | | | | e | | | | n |
| | | | | | | | | | | | | | | |
+---+ +---+ +---+ +---+ +---+ +---+ +---+ +---+
Figure 1: Model for Real-Time Interactions Figure 1: Model for Real-Time Interactions
Figure 1 presents a general model for how users interact with Figure 1 presents a general model for how users interact with
applications. Generally, users interact with a user interface applications. Generally, users interact with a user interface
through a user device. A user device can be a telephone, or it can through a user device. A user device can be a telephone, or it can
be a PC with a web browser. Its role is to pass the user input from be a PC with a web browser. Its role is to pass the user input from
the user, to the user interface. The user interface provides the the user to the user interface. The user interface provides the user
user with context in order to make decisions about what they want. with context in order to make decisions about what they want. The
The user interacts with the device, causing information to be passed user interacts with the device, causing information to be passed from
from the device to the user interface. The user interface interprets the device to the user interface. The user interface interprets the
the information, and passes it as a user interface event to the information, and passes it as a user interface event to the
application. The application may be able to modify the user application. The application may be able to modify the user
interface based on this event. Whether or not this is possible interface based on this event. Whether or not this is possible
depends on the type of user interface. depends on the type of user interface.
User interfaces are fundamentally about rendering and interpretation. User interfaces are fundamentally about rendering and interpretation.
Rendering refers to the way in which the user is provided context. Rendering refers to the way in which the user is provided context.
This can be through hyperlinks, images, sounds, videos, text, and so This can be through hyperlinks, images, sounds, videos, text, and so
on. Interpretation refers to the way in which the user interface on. Interpretation refers to the way in which the user interface
takes the "raw" data provided by the user, and returns the result to takes the "raw" data provided by the user, and returns the result to
the application as a meaningful event, abstracted from the the application as a meaningful event, abstracted from the
particulars of the user interface. As an example, consider a pre- particulars of the user interface. As an example, consider a prepaid
paid calling card application. The user interface worries about calling card application. The user interface worries about details
details such as what prompt the user is provided, whether the voice such as what prompt the user is provided, whether the voice is male
is male or female, and so on. It is concerned with recognizing the or female, and so on. It is concerned with recognizing the speech
speech that the user provides, in order to obtain the desired that the user provides, in order to obtain the desired information.
information. In this case, the desired information is the calling In this case, the desired information is the calling card number, the
card number, the PIN code, and the destination number. The PIN code, and the destination number. The application needs that
application needs that data, and it doesn't matter to the application data, and it doesn't matter to the application whether it was
whether it was collected using a male prompt or a female one. collected using a male prompt or a female one.
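The split between rendering and interpretation described above can be sketched in a few lines of Python. This is an illustration only, not part of the framework: the names CardEvent, VoiceUI, and application, and the sample digits, are hypothetical. The sketch shows two user interfaces that render differently yet hand the application the same abstract event.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CardEvent:
    """Hypothetical UI event: only what the application needs."""
    card_number: str
    pin: str
    destination: str


class VoiceUI:
    """Hypothetical user interface that owns rendering and interpretation."""

    def __init__(self, voice: str):
        self.voice = voice  # rendering detail, never passed to the application

    def render(self, field: str) -> str:
        # Rendering: how the user is given context.
        return f"[{self.voice} voice] Please say your {field}."

    def interpret(self, raw: dict) -> CardEvent:
        # Interpretation: reduce "raw" user input to an abstract event.
        digits = {k: "".join(ch for ch in v if ch.isdigit())
                  for k, v in raw.items()}
        return CardEvent(digits["card"], digits["pin"], digits["dest"])


def application(event: CardEvent) -> str:
    # The application sees the event, not the prompt that produced it.
    return f"calling {event.destination} with card {event.card_number}"


raw_input = {"card": "1234 5678", "pin": "4-3-2-1", "dest": "555 0100"}
male = VoiceUI("male").interpret(raw_input)
female = VoiceUI("female").interpret(raw_input)
```

Because both interfaces produce an equal CardEvent, the application behaves identically whichever prompt was used, which is the property the paragraph above relies on.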
User interfaces generally have real-time requirements towards the User interfaces generally have real-time requirements towards the
user. That is, when a user interacts with the user interface, the user. That is, when a user interacts with the user interface, the
user interface needs to react quickly, and that change needs to be user interface needs to react quickly, and that change needs to be
propagated to the user right away. However, the interface between propagated to the user right away. However, the interface between
the user interface and the application need not be that fast. Faster the user interface and the application need not be that fast. Faster
is better, but the user interface itself can frequently compensate is better, but the user interface itself can frequently compensate
for long latencies there. In the case of a pre-paid calling card for long latencies between the user interface and the application.
application, when the user is prompted to enter their PIN, the prompt In the case of a prepaid calling card application, when the user is
should generally stop immediately once the first digit of the PIN is prompted to enter their PIN, the prompt should generally stop
entered. This is referred to as barge-in. After the user-interface immediately once the first digit of the PIN is entered. This is
collects the rest of the PIN, it can tell the user to "please wait referred to as "barge-in". After the user interface collects the
while processing". The PIN can then be gradually transmitted to the rest of the PIN, it can tell the user to "please wait while
processing". The PIN can then be gradually transmitted to the
application. In this example, the user interface has compensated for application. In this example, the user interface has compensated for
a slow UI to application interface by asking the user to wait. a slow UI to application interface by asking the user to wait.
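The barge-in behavior in the calling card example can be sketched as a small simulation. The following Python is a minimal sketch under stated assumptions: time is modeled as discrete ticks, the prompt plays one word per tick, and the function names, the four-digit PIN length, and the slow_application callback are all hypothetical.

```python
def prompt_and_collect(prompt_words, keypresses, pin_length=4):
    """Play a prompt one word per tick and stop it at the first digit.

    keypresses maps a tick number to the digit pressed at that tick.
    Returns (words actually played, collected PIN).
    """
    played, digits = [], []
    tick = 0
    while len(digits) < pin_length:
        key = keypresses.get(tick)
        if key is not None:
            digits.append(key)  # barge-in: the prompt is silent from now on
        elif not digits and tick < len(prompt_words):
            played.append(prompt_words[tick])
        elif tick > max(keypresses, default=-1):
            break  # no further input will arrive
        tick += 1
    return played, "".join(digits)


def handle_pin(prompt_words, keypresses, slow_application):
    """Collect the PIN locally, then hide a slow application round trip."""
    played, pin = prompt_and_collect(prompt_words, keypresses)
    messages = ["please wait while processing"]  # UI masks the latency
    messages.append(slow_application(pin))
    return played, messages


words = "Please enter your PIN now".split()
presses = {2: "4", 3: "3", 5: "2", 6: "1"}
played, pin = prompt_and_collect(words, presses)
```

The real-time part (cutting the prompt) happens entirely inside the user interface; only the finished PIN crosses the slower interface to the application.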
The separation between user interface and application is absolutely The separation between user interface and application is absolutely
fundamental to the entire framework provided in this document. Its fundamental to the entire framework provided in this document. Its
importance cannot be overstated. importance cannot be overstated.
RFC 5629 App Interaction Framework October 2009
With this basic model, we can begin to taxonomize the types of With this basic model, we can begin to taxonomize the types of
systems that can be built. systems that can be built.
3.1 Functional vs. Stimulus 4.1. Functional vs. Stimulus
The first way to taxonomize the system is to consider the interface The first way to taxonomize the system is to consider the interface
between the UI and the application. There are two fundamentally between the UI and the application. There are two fundamentally
different models for this interface. In a functional interface, the different models for this interface. In a functional interface, the
user interface has detailed knowledge about the application, and is, user interface has detailed knowledge about the application and is,
in fact, specific to the application. The interface between the two in fact, specific to the application. The interface between the two
components is through a functional protocol, capable of representing components is through a functional protocol, capable of representing
the semantics which can be exposed through the user interface. the semantics that can be exposed through the user interface.
Because the user interface has knowledge of the application, it can Because the user interface has knowledge of the application, it can
be optimally designed for that application. As a result, functional be optimally designed for that application. As a result, functional
user interfaces are almost always the most user friendly, the fastest user interfaces are almost always the most user friendly, the
and the most responsive. However, in order to allow interoperability fastest, and the most responsive. However, in order to allow
between user devices and applications, the details of the functional interoperability between user devices and applications, the details
protocols need to be specified in standards. This slows down of the functional protocols need to be specified in standards. This
innovation and limits the scope of applications that can be built. slows down innovation and limits the scope of applications that can
be built.
An alternative is a stimulus interface. In a stimulus interface, the An alternative is a stimulus interface. In a stimulus interface, the
user interface is generic; totally ignorant of the details of the user interface is generic -- that is, totally ignorant of the details
application. Indeed, the application may pass instructions to the of the application. Indeed, the application may pass instructions to
user interface describing how it should operate. The user interface the user interface describing how it should operate. The user
translates user input into "stimulus" - which are data understood interface translates user input into "stimulus", which are data
only by the application, and not by the user interface. Because they understood only by the application, and not by the user interface.
are generic, and because they require communications with the Because they are generic, and because they require communications
application in order to change the way in which they render with the application in order to change the way in which they render
information to the user, stimulus user interfaces are usually slower, information to the user, stimulus user interfaces are usually slower,
less user friendly, and less responsive than a functional less user friendly, and less responsive than a functional
counterpart. However, they allow for substantial innovation in counterpart. However, they allow for substantial innovation in
applications, since no standardization activity is needed to build a applications, since no standardization activity is needed to build a
new application, as long as it can interact with the user within the new application, as long as it can interact with the user within the
confines of the user interface mechanism. The web is an example of a confines of the user interface mechanism. The web is an example of a
stimulus user interface to applications. stimulus user interface to applications.
In SIP systems, functional interfaces are provided by extending the In SIP systems, functional interfaces are provided by extending the
SIP protocol to provide the needed functionality. For example, the SIP protocol to provide the needed functionality. For example, the
SIP caller preferences specification [15] provides a functional SIP caller preferences specification [15] provides a functional
interface that allows a user to request applications to route the interface that allows a user to request applications to route the
call to specific types of user agents. Functional interfaces are call to specific types of user agents. Functional interfaces are
important, but are not the subject of this framework. The primary important, but are not the subject of this framework. The primary
goal of this framework is to address the role of stimulus interfaces goal of this framework is to address the role of stimulus interfaces
to SIP applications. to SIP applications.
3.2 Real-Time vs. Non-Real Time 4.2. Real-Time vs. Non-Real-Time
Application interaction systems can also be real-time or non-real- Application interaction systems can also be real-time or non-real-
time. Non-real interaction allows the user to enter information time. Non-real-time interaction allows the user to enter information
about application operation asynchronously with its invocation. about application operation asynchronously with its invocation.
Frequently, this is done through provisioning systems. As an Frequently, this is done through provisioning systems. As an
example, a user can set up the forwarding number for a call-forward example, a user can set up the forwarding number for a call-forward
on no-answer application using a web page. Real-time interaction on no-answer application using a web page. Real-time interaction
requires the user to interact with the application at the time of its requires the user to interact with the application at the time of its
invocation. invocation.
3.3 Client-Local vs. Client-Remote 4.3. Client-Local vs. Client-Remote
Another axis in the taxonomization is whether the user interface is Another axis in the taxonomization is whether the user interface is
co-resident with the user device (which we refer to as a client-local co-resident with the user device (which we refer to as a client-local
user interface), or the user interface runs in a host separated from user interface), or the user interface runs in a host separated from
the client (which we refer to as a client-remote user interface). In the client (which we refer to as a client-remote user interface). In
a client-remote user interface, there exists some kind of protocol a client-remote user interface, there exists some kind of protocol
between the client device and the UI that allows the client to between the client device and the UI that allows the client to
interact with the user interface over a network. interact with the user interface over a network.
The most important way to separate the UI and the client device is The most important way to separate the UI and the client device is
through media interaction. In media interaction, the interface through media interaction. In media interaction, the interface
between the user and the user interface is through media - audio, between the user and the user interface is through media: audio,
video, messaging, and so on. This is the classic mode of operation video, messaging, and so on. This is the classic mode of operation
for VoiceXML [4], where the user interface (also referred to as the for VoiceXML [5], where the user interface (also referred to as the
voice browser) runs on a platform in the network. Users communicate voice browser) runs on a platform in the network. Users communicate
with the voice browser through the telephone network (or using a SIP with the voice browser through the telephone network (or using a SIP
session). The voice browser interacts with the application using session). The voice browser interacts with the application using
HTTP to convey the information collected from the user. HTTP to convey the information collected from the user.
In the case of a client-local user interface, the user interface runs In the case of a client-local user interface, the user interface runs
co-located with the user device. The interface between them is co-located with the user device. The interface between them is
through the software that interprets the users input and passes them through the software that interprets the user's input and passes it
to the user interface. The classic example of this is the web. In to the user interface. The classic example of this is the Web. In
the web, the user interface is a web browser, and the interface is the Web, the user interface is a web browser, and the interface is
defined by the HTML document that it's rendering. The user interacts defined by the HTML document that it's rendering. The user interacts
directly with the user interface running in the browser. The results directly with the user interface running in the browser. The results
of that user interface are sent to the application (running on the of that user interface are sent to the application (running on the
web server) using HTTP. web server) using HTTP.
It is important to note that whether or not the user interface is It is important to note that whether or not the user interface is
local or remote (in the case of media interaction) is not a property local or remote (in the case of media interaction) is not a property
of the modality of the interface, but rather a property of the of the modality of the interface, but rather a property of the
system. As an example, it is possible for a web-based user interface system. As an example, it is possible for a Web-based user interface
to be provided with a client-remote user interface. In such a to be provided with a client-remote user interface. In such a
scenario, video and application sharing media sessions can be used scenario, video- and application-sharing media sessions can be used
between the user and the user interface. The user interface, still between the user and the user interface. The user interface, still
guided by HTML, now runs "in the network", remote from the client. guided by HTML, now runs "in the network", remote from the client.
Similarly, a VoiceXML document can be interpreted locally by a client Similarly, a VoiceXML document can be interpreted locally by a client
device, with no media streams at all. Indeed, the VoiceXML document device, with no media streams at all. Indeed, the VoiceXML document
can be rendered using text, rather than media, with no impact on the can be rendered using text, rather than media, with no impact on the
interface between the user interface and the application. interface between the user interface and the application.
It is also important to note that systems can be hybrid. In a hybrid It is also important to note that systems can be hybrid. In a hybrid
user interface, some aspects of it (usually those associated with a user interface, some aspects of it (usually those associated with a
particular modality) run locally, and others run remotely. particular modality) run locally, and others run remotely.
3.4 Presentation Capable vs. Presentation Free 4.4. Presentation-Capable vs. Presentation-Free
A user interface can be capable of presenting information to the user A user interface can be capable of presenting information to the user
(a presentation capable UI), or it can be capable only of collecting (a presentation-capable UI), or it can be capable only of collecting
user input (a presentation free UI). These are very different types user input (a presentation-free UI). These are very different types
of user interfaces. A presentation capable UI can provide the user of user interfaces. A presentation-capable UI can provide the user
with feedback after every input, providing the context for collecting with feedback after every input, providing the context for collecting
the next input. As a result, presentation capable user interfaces the next input. As a result, presentation-capable user interfaces
require an update to the information provided to the user after each require an update to the information provided to the user after each
input. The web is a classic example of this. After every input input. The Web is a classic example of this. After every input
(i.e., a click), the browser provides the input to the application (i.e., a click), the browser provides the input to the application
and fetches the next page to render. In a presentation free user and fetches the next page to render. In a presentation-free user
interface, this is not the case. Since the user is not provided with interface, this is not the case. Since the user is not provided with
feedback, these user interfaces tend to merely collect information as feedback, these user interfaces tend to merely collect information as
its entered, and pass it to the application. it's entered, and pass it to the application.
Another difference is that a presentation-free user interface cannot Another difference is that a presentation-free user interface cannot
easily support the concept of a focus. Selection of a focus usually easily support the concept of a focus. Selection of a focus usually
requires a means for informing the user of the available requires a means for informing the user of the available
applications, allowing the user to choose, and then informing them applications, allowing the user to choose, and then informing them
about which one they have chosen. Without the first and third steps about which one they have chosen. Without the first and third steps
(which a presentation-free UI cannot provide), focus selection is (which a presentation-free UI cannot provide), focus selection is
very difficult. Without a selected focus, the input provided to very difficult. Without a selected focus, the input provided to
applications through presentation-free user interfaces is more of a applications through presentation-free user interfaces is more of a
broadcast or notification operation, as a result. broadcast or notification operation.
4. Interaction Scenarios on Telephones 5. Interaction Scenarios on Telephones
In this section, we applied the model of Section 3 to telephones. In this section, we apply the model of Section 4 to telephones.
In a traditional telephone, the user interface consists of a 12-key In a traditional telephone, the user interface consists of a 12-key
keypad, a speaker, and a microphone. Indeed, from here forward, the keypad, a speaker, and a microphone. Indeed, from here forward, the
term "telephone" is used to represent any device that meets, at a term "telephone" is used to represent any device that meets, at a
minimum, the characteristics described in the previous sentence. minimum, the characteristics described in the previous sentence.
Circuit-switched telephony applications are almost universally Circuit-switched telephony applications are almost universally
client-remote user interfaces. In the Public Switched Telephone client-remote user interfaces. In the Public Switched Telephone
Network (PSTN), there is usually a circuit interface between the user Network (PSTN), there is usually a circuit interface between the user
and the user interface. The user input from the keypad is conveyed and the user interface. The user input from the keypad is conveyed
used Dual-Tone Multi-Frequency (DTMF), and the microphone input as using Dual-Tone Multi-Frequency (DTMF), and the microphone input as
Pulse Code Modulated (PCM) encoded voice. Pulse Code Modulated (PCM) encoded voice.
In an IP-based system, there is more variability in how the system In an IP-based system, there is more variability in how the system
can be instantiated. Both client-remote and client-local user can be instantiated. Both client-remote and client-local user
interfaces to a telephone can be provided. interfaces to a telephone can be provided.
In this framework, a PSTN gateway can be considered a User Device In this framework, a PSTN gateway can be considered a User Device
Proxy. It is a proxy for the user because it can provide, to a user Proxy. It is a proxy for the user because it can provide, to a user
interface on an IP network, input taken from a user on a circuit interface on an IP network, input taken from a user on a circuit-
switched telephone. The gateway may be able to run a client-local switched telephone. The gateway may be able to run a client-local
user interface, just as an IP telephone might. user interface, just as an IP telephone might.
4.1 Client Remote 5.1. Client Remote
The most obvious instantiation is the "classic" circuit-switched The most obvious instantiation is the "classic" circuit-switched
telephony model. In that model, the user interface runs remotely telephony model. In that model, the user interface runs remotely
from the client. The interface between the user and the user from the client. The interface between the user and the user
interface is through media, set up by SIP and carried over the Real interface is through media, which is set up by SIP and carried over
Time Transport Protocol (RTP) [18]. The microphone input can be the Real Time Transport Protocol (RTP) [18]. The microphone input
carried using any suitable voice encoding algorithm. The keypad can be carried using any suitable voice-encoding algorithm. The
input can be conveyed in one of two ways. The first is to convert keypad input can be conveyed in one of two ways. The first is to
the keypad input to DTMF, and then convey that DTMF using a suitance convert the keypad input to DTMF, and then convey that DTMF using a
encoding algorithm for it (such as PCMU). An alternative, and suitable encoding algorithm (such as PCMU). An alternative, and
generally the preferred approach, is to transmit the keypad input generally the preferred approach, is to transmit the keypad input
using RFC 2833 [19], which provides an encoding mechanism for using RFC 4733 [19], which provides an encoding mechanism for
carrying keypad input within RTP. carrying keypad input within RTP.
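As an illustration of the second approach, the sketch below packs one keypad press into the 4-byte telephone-event payload (event code, end bit, reserved bit, volume, duration) that RFC 2833 introduced and RFC 4733 retains. The function name and default values are assumptions for the example; the RTP header, payload type negotiation, and retransmission of the final packet are not shown.

```python
import struct

# DTMF event codes: digits 0-9, then "*", "#", and A-D.
DTMF_EVENTS = {str(d): d for d in range(10)}
DTMF_EVENTS.update({"*": 10, "#": 11, "A": 12, "B": 13, "C": 14, "D": 15})


def encode_telephone_event(key, duration, volume=10, end=False):
    """Return the 4-byte telephone-event payload for one key.

    duration is in RTP timestamp units; volume is the power level
    expressed as 0..63 (-dBm0). The reserved bit is left at zero.
    """
    event = DTMF_EVENTS[key]
    if not 0 <= volume <= 63:
        raise ValueError("volume must fit in 6 bits")
    if not 0 <= duration <= 0xFFFF:
        raise ValueError("duration must fit in 16 bits")
    flags_and_volume = (0x80 if end else 0x00) | volume
    # Network byte order: 8-bit event, 8-bit E/R/volume, 16-bit duration.
    return struct.pack("!BBH", event, flags_and_volume, duration)
```

For example, a pound key lasting 800 timestamp units with the end bit set would be encoded by encode_telephone_event("#", 800, end=True).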
In this classic model, the user interface would run on a server in In this classic model, the user interface would run on a server in
the IP network. It would perform speech recognition and DTMF the IP network. It would perform speech recognition and DTMF
recognition to derive the user intent, feed them through the user recognition to derive the user intent, feed them through the user
interface, and provide the result to an application. interface, and provide the result to an application.
4.2 Client Local 5.2. Client Local
An alternative model is for the entire user interface to reside on An alternative model is for the entire user interface to reside on
the telephone. The user interface can be a VoiceXML browser, running the telephone. The user interface can be a VoiceXML browser, running
speech recognition on the microphone input, and feeding the keypad speech recognition on the microphone input, and feeding the keypad
input directly into the script. As discussed above, the VoiceXML input directly into the script. As discussed above, the VoiceXML
script could be rendered using text instead of voice, if the script could be rendered using text instead of voice, if the
telephone had a textual display. telephone has a textual display.
For simpler phones without a display, the user interface can be For simpler phones without a display, the user interface can be
described by a Keypad Markup Language request document [7]. As the described by a Keypad Markup Language request document [8]. As the
user enters digits in the keypad, they are passed to the user user enters digits in the keypad, they are passed to the user
interface, which generates user interface events that can be interface, which generates user interface events that can be
transported to the application. transported to the application.
4.3 Flip-Flop 5.3. Flip-Flop
A middle-ground approach is to flip back and forth between a client- A middle-ground approach is to flip back and forth between a client-
local and client-remote user interface. Many voice applications are local and client-remote user interface. Many voice applications are
of the type which listen to the media stream and wait for some of the type that listen to the media stream and wait for some
specific trigger that kicks off a more complex user interaction. The specific trigger that kicks off a more complex user interaction. The
long pound in a pre-paid calling card application is one example. long pound in a prepaid calling card application is one example.
Another example is a conference recording application, where the user Another example is a conference recording application, where the user
can press a key at some point in the call to begin recording. When can press a key at some point in the call to begin recording. When
the key is pressed, the user hears a whisper to inform them that the key is pressed, the user hears a whisper to inform them that
recording has started. recording has started.
The ideal way to support such an application is to install a client- The ideal way to support such an application is to install a client-
local user interface component that waits for the trigger to kick off local user interface component that waits for the trigger to kick off
the real interaction. Once the trigger is received, the application the real interaction. Once the trigger is received, the application
connects the user to a client-remote user interface that can play connects the user to a client-remote user interface that can play
announements, collect more information, and so on. announcements, collect more information, and so on.
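A minimal sketch of this flip-flop behavior follows. The one-second threshold for a "long pound", the class name, and the method names are assumptions made for illustration; the framework itself does not define them.

```python
LONG_PRESS_MS = 1000  # assumed threshold for a "long pound"


class FlipFlopInterface:
    """Hypothetical controller that alternates between UI modes."""

    def __init__(self):
        # Start with the cheap client-local component waiting for a trigger.
        self.mode = "client-local"

    def on_key(self, key, held_ms):
        """Return True when the key press triggers the remote interaction."""
        is_trigger = key == "#" and held_ms >= LONG_PRESS_MS
        if self.mode == "client-local" and is_trigger:
            # The application now connects the user to a client-remote UI
            # that can play announcements and collect more information.
            self.mode = "client-remote"
            return True
        return False

    def on_remote_interaction_done(self):
        # Flip back and wait for the next trigger without sending media.
        self.mode = "client-local"
```

While the controller is in the client-local mode, no media stream toward a network-based user interface is needed.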
The benefit of flip-flopping between a client-local and client-remote The benefit of flip-flopping between a client-local and client-remote
user interface is cost. The client-local user interface will user interface is cost. The client-local user interface will
eliminate the need to send media streams into the network just to eliminate the need to send media streams into the network just to
wait for the user to press the pound key on the keypad. wait for the user to press the pound key on the keypad.
The Keypad Markup Language (KPML) was designed to support exactly The Keypad Markup Language (KPML) was designed to support exactly
this kind of need [7]. It models the keypad on a phone, and allows this kind of need [8]. It models the keypad on a phone and allows an
an application to be informed when any sequence of keys have been application to be informed when any sequence of keys has been
pressed. However, KPML has no presentation component. Since user pressed. However, KPML has no presentation component. Since user
interfaces generally require a response to user input, the interfaces generally require a response to user input, the
presentation will need to be done using a client-remote user presentation will need to be done using a client-remote user
interface that gets instantiated as a result of the trigger. interface that gets instantiated as a result of the trigger.
It is tempting to use a hybrid model, where a prompt-and-collect It is tempting to use a hybrid model, where a prompt-and-collect
application is implemented by using a client-remote user interface application is implemented by using a client-remote user interface
that plays the prompts, and a client-local user interface, described that plays the prompts, and a client-local user interface, described
by KPML, that collects digits. However, this only complicates the by KPML, that collects digits. However, this only complicates the
application. Firstly, the keypad input will be sent to both the application. Firstly, the keypad input will be sent to both the
media stream and the KPML user interface. This requires the media stream and the KPML user interface. This requires the
application to sort out which user inputs are duplicates, a process application to sort out which user inputs are duplicates, a process
that is very complicated. Secondly, the primary benefit of KPML is that is very complicated. Secondly, the primary benefit of KPML is
to avoid having a media stream towards a user interface. However, to avoid having a media stream towards a user interface. However,
there is already a media stream for the prompting, so there is no there is already a media stream for the prompting, so there is no
real savings. real savings.
5. Framework Overview 6. Framework Overview
In this framework, we use the term "SIP application" to refer to a In this framework, we use the term "SIP application" to refer to a
broad set of functionality. A SIP application is a program running broad set of functionality. A SIP application is a program running
on a SIP-based element (such as a proxy or user agent) that provides on a SIP-based element (such as a proxy or user agent) that provides
some value-added function to a user or system administrator. SIP some value-added function to a user or system administrator. SIP
applications can execute on behalf of a caller, a called party, or a applications can execute on behalf of a caller, a called party, or a
multitude of users at once. multitude of users at once.
Each application has a number of instances that are executing at any Each application has a number of instances that are executing at any
given time. An instance represents a single execution path for an given time. An instance represents a single execution path for an
application. It is established as a result of some event. That application. It is established as a result of some event. That
event can be a SIP event, such as the reception of a SIP INVITE event can be a SIP event, such as the reception of a SIP INVITE
request, or it can be a non-SIP event, such as a web form post or request, or it can be a non-SIP event, such as a web form post or
even a timer. Application instances also have an end time. Some even a timer. Application instances also have an end time. Some
instances have a lifetime that is coupled with a SIP transaction or instances have a lifetime that is coupled with a SIP transaction or
dialog. For example, a proxy application might begin when an INVITE dialog. For example, a proxy application might begin when an INVITE
arrives, and terminate when the call is answered. Other applications arrives, and terminate when the call is answered. Other applications
have a lifetime that spans multiple dialogs or transactions. For have a lifetime that spans multiple dialogs or transactions. For
example, a conferencing application instance may exist so long as example, a conferencing application instance may exist so long as
there are any dialogs connected to it. When the last dialog there are dialogs connected to it. When the last dialog terminates,
terminates, the application instance terminates. Other applications the application instance terminates. Other applications have a
have a liftime that is completely decoupled from SIP events. lifetime that is completely decoupled from SIP events.
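The conferencing example can be sketched as follows: an instance whose lifetime spans multiple dialogs and ends when the last one terminates. The class and method names are hypothetical, and dialogs are represented only by opaque identifiers.

```python
class ConferenceInstance:
    """Hypothetical application instance spanning several dialogs."""

    def __init__(self, first_dialog_id):
        # The instance is established by some event, here the first dialog.
        self.dialogs = {first_dialog_id}
        self.active = True

    def dialog_created(self, dialog_id):
        if not self.active:
            raise RuntimeError("instance has already terminated")
        self.dialogs.add(dialog_id)

    def dialog_terminated(self, dialog_id):
        self.dialogs.discard(dialog_id)
        if not self.dialogs:
            # When the last dialog terminates, so does the instance.
            self.active = False
```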
It is fundamental to the framework described here that multiple It is fundamental to the framework described here that multiple
application instances may interact with a user during a single SIP application instances may interact with a user during a single SIP
transaction or dialog. Each instance may be for the same transaction or dialog. Each instance may be for the same
application, or different applications. Each of the applications may application, or different applications. Each of the applications may
be completely independent, in that they may be owned by different be completely independent, in that each may be owned by a different
providers, and may not be aware of each others existence. Similarly, provider, and may not be aware of each other's existence. Similarly,
there may be application instances interacting with the caller, and there may be application instances interacting with the caller, and
instances interacting with the callee, both within the same instances interacting with the callee, both within the same
transaction or dialog. transaction or dialog.
The first step in the interaction with the user is to instantiate one The first step in the interaction with the user is to instantiate one
or more user interface components for the application instance. A or more user interface components for the application instance. A
user interface component is a single piece of the user interface that user interface component is a single piece of the user interface that
is defined by a logical flow that is not synchronously coupled with is defined by a logical flow that is not synchronously coupled with
any other component. In other words, each component runs any other component. In other words, each component runs
independently. independently.
A user interface component can be instantiated in one of the user A user interface component can be instantiated in one of the user
agents in a dialog (for a client-local user interface), or within a agents in a dialog (for a client-local user interface), or within a
network element (for a client-remote user interface). If a client- network element (for a client-remote user interface). If a client-
local user interface is to be used, the application needs to local user interface is to be used, the application needs to
determine whether or not the user agent is capable of supporting a determine whether or not the user agent is capable of supporting a
client-local user interface, and in what format. In this framework, client-local user interface, and in what format. In this framework,
all client-local user interface components are described by a markup all client-local user interface components are described by a markup
language. A markup language describes a logical flow of presentation language. A markup language describes a logical flow of presentation
of information to the user, collection of information from the user, of information to the user, a collection of information from the
and transmission of that information to an application. Examples of user, and a transmission of that information to an application.
markup languages include HTML, WML, VoiceXML, and the Keypad Markup Examples of markup languages include HTML, Wireless Markup Language
Language (KPML) [7]. (WML), VoiceXML, and the Keypad Markup Language (KPML) [8].
Unlike an application instance, which has very flexible lifetimes, a Unlike an application instance, which has a very flexible lifetime, a
user interface component has a very fixed lifetime. A user interface user interface component has a very fixed lifetime. A user interface
component is always associated with a dialog. The user interface component is always associated with a dialog. The user interface
component can be created at any point after the dialog (or early component can be created at any point after the dialog (or early
dialog) is created. However, the user interface component terminates dialog) is created. However, the user interface component terminates
when the dialog terminates. The user interface component can be when the dialog terminates. The user interface component can be
terminated earlier by the user agent, and possibly by the terminated earlier by the user agent, and possibly by the
application, but its lifetime never exceeds that of its associated application, but its lifetime never exceeds that of its associated
dialog. dialog.
There are two ways to create a client local interface component. For There are two ways to create a client-local interface component. For
interface components that are presentation capable, the application interface components that are presentation capable, the application
sends a REFER [6] request to the user agent. The Refer-To header sends a REFER [7] request to the user agent. The Refer-To header
field contains an HTTP URI that points to the markup for the user field contains an HTTP URI that points to the markup for the user
interface, and the REFER contains a Target-Dialog header field [9] interface, and the REFER contains a Target-Dialog header field [10]
identifying the dialog associated with the user interface component. which identifies the dialog associated with the user interface
For user interface components that are presentation free (such as component. For user interface components that are presentation free
those defined by KPML), the application sends a SUBSCRIBE request to (such as those defined by KPML), the application sends a SUBSCRIBE
the user agent. The body of the SUBSCRIBE request contains a filter, request to the user agent. The body of the SUBSCRIBE request
which, in this case, is the markup that defines when information is contains a filter, which, in this case, is the markup that defines
to be sent to the application in a NOTIFY. The SUBSCRIBE does not when information is to be sent to the application in a NOTIFY. The
contain the Target-Dialog header field, since equivalent information SUBSCRIBE does not contain the Target-Dialog header field, since
is conveyed in the Event header field. equivalent information is conveyed in the Event header field.
If a user interface component is to be instantiated in the network, If a user interface component is to be instantiated in the network,
there is no need to determine the capabilities of the device on which there is no need to determine the capabilities of the device on which
the user interface is instantiated. Presumably, it is on a device on the user interface is instantiated. Presumably, it is on a device on
which the application knows a UI can be created. However, the which the application knows a UI can be created. However, the
application does need to connect the user device to the user application does need to connect the user device to the user
interface. This will require manipulation of media streams in order interface. This will require manipulation of media streams in order
to establish that connection. to establish that connection.
The interface between the user interface component and the The interface between the user interface component and the
application depends on the type of user interface. For presentation application depends on the type of user interface. For presentation-
capable user interfaces, such as those described by HTML and capable user interfaces, such as those described by HTML and
VoiceXML, HTTP form POST operations are used. For presentation free VoiceXML, HTTP form POST operations are used. For presentation-free
user interfaces, a SIP NOTIFY is used. The differing needs and user interfaces, a SIP NOTIFY is used. The differing needs and
capabilities of these two user interfaces, as described in capabilities of these two user interfaces, as described in
Section 3.4, is what drives the different choices for the Section 4.4, are what drives the different choices for the
interactions. Since presentation capable user interfaces require an interactions. Since presentation-capable user interfaces require an
update to the presentation every time user data is entered, they are update to the presentation every time user data is entered, they are
a good match for HTTP. Since presentation free user interfaces a good match for HTTP. Since presentation-free user interfaces
merely transmit user input to the application, a NOTIFY is more merely transmit user input to the application, a NOTIFY is more
appropriate. appropriate.
Indeed, for presentation free user interfaces, there are two Indeed, for presentation-free user interfaces, there are two
different modalities of operation. The first is called "one shot". different modalities of operation. The first is called "one shot".
In the one-shot role, the markup waits for a user to enter some In the one-shot role, the markup waits for a user to enter some
information, and when they do, reports this event to the application. information and, when they do, reports this event to the application.
The application then does something, and the markup is no longer The application then does something, and the markup is no longer
used. In the other modality, called "monitor", the markup stays used. In the other modality, called "monitor", the markup stays
permanently resident, and reports information back to an application permanently resident, and reports information back to an application
until termination of the associated dialog. until termination of the associated dialog.
6. Deployment Topologies 7. Deployment Topologies
This section presents some of the network topologies in which this This section presents some of the network topologies in which this
framework can be instantiated. framework can be instantiated.
6.1 Third Party Application 7.1. Third-Party Application
+-------------+ +-------------+
/---| Application | /---| Application |
/ +-------------+ / +-------------+
/ /
SUB/ / REFER/ SUB/ / REFER/
NOT / HTTP NOT / HTTP
/ /
+--------+ SIP (INVITE) +-----+ +--------+ SIP (INVITE) +-----+
| UI A--------------------X | | UI A--------------------X |
|........| | SIP | |........| | SIP |
| User | RTP | UA | | User | RTP | UA |
| Device B--------------------Y | | Device B--------------------Y |
+--------+ +-----+ +--------+ +-----+
Figure 2: Third Party Topology Figure 2: Third-Party Topology
In this topology, the application that is interested in interacting In this topology, the application that is interested in interacting
with the users exists outside of the SIP dialog between the user with the users exists outside of the SIP dialog between the user
agents. In that case, the application learns about the initiation agents. In that case, the application learns about the initiation
and termination of the dialog, along with the dialog identifiers, and termination of the dialog, along with the dialog identifiers,
through some out of band means. One such possibility is the dialog through some out-of-band means. One such possibility is the dialog
event package [16]. Dialog information is only revealed to trusted event package [16]. Dialog information is only revealed to trusted
parties, so the application would need to be trusted by one of the parties, so the application would need to be trusted by one of the
users in order to obtain this information. users in order to obtain this information.
At any point during the dialog, the application can instantiate user At any point during the dialog, the application can instantiate user
interface components on the user device of the caller or callee. It interface components on the user device of the caller or callee. It
can do this either using SUBSCRIBE or REFER, depending on the type of can do this using either SUBSCRIBE or REFER, depending on the type of
user interface (presentation capable or presentation free). user interface (presentation capable or presentation free).
6.2 Co-Resident Application 7.2. Co-Resident Application
+--------+ SIP (INVITE) +-----+ +--------+ SIP (INVITE) +-----+
| User A--------------------X SIP | | User A--------------------X SIP |
| Device | RTP | UA | | Device | RTP | UA |
|........B--------------------Y | |........B--------------------Y |
| | SUB/NOT | App)| | | SUB/NOT | App)|
| UI A'-------------------X' | | UI A'-------------------X' |
+--------+ REFER/HTTP +-----+ +--------+ REFER/HTTP +-----+
Figure 3: Co-Resident Topology Figure 3: Co-Resident Topology
In this deployment topology, the application is co-resident with one In this deployment topology, the application is co-resident with one
of the user agents (the one on the right in the picture above). This of the user agents (the one on the right in the picture above). This
application can install client-local user interface components on the application can install client-local user interface components on the
other user agent, which is acting as the user device. These other user agent, which is acting as the user device. These
components can be installed using either SUBSCRIBE, for presentation components can be installed using either SUBSCRIBE, for presentation-
free user interfaces, or REFER, for presentation capable ones. This free user interfaces, or REFER, for presentation-capable ones. This
situation typically arises when the application wishes to install UI situation typically arises when the application wishes to install UI
components on a presentation capable user interface. If the only components on a presentation-capable user interface. If the only
user input is via keypad input, the framework is not needed per se, user input is via keypad input, the framework is not needed per se,
because the UA/application will receive the input via RFC 2833 in the because the UA/application will receive the input via RFC 4733 in the
RTP stream. RTP stream.
If the application resides in the called party, it is called a If the application resides in the called party, it is called a
terminating application. If it resides in the calling party, it is "terminating application". If it resides in the calling party, it is
called an originating application. called an "originating application".
This kind of topology is common in protocol converter and gateway This kind of topology is common in protocol converter and gateway
applications. applications.
6.3 Third Party Application and User Device Proxy 7.3. Third-Party Application and User Device Proxy
+-------------+ +-------------+
/---| Application | /---| Application |
/ +-------------+ / +-------------+
/ /
SUB/ / REFER/ SUB/ / REFER/
NOT / HTTP NOT / HTTP
/ /
+-----+ SIP +---M----+ SIP +-----+ +-----+ SIP +---M----+ SIP +-----+
| V--------------------C A--------------------X | | V--------------------C A--------------------X |
| SIP | | UI | | SIP | | SIP | | UI | | SIP |
| UAa | RTP | | RTP | UAb | | UAa | RTP | | RTP | UAb |
| W--------------------D B--------------------Y | | W--------------------D B--------------------Y |
+-----+ +--------+ +-----+ +-----+ +--------+ +-----+
User User User User
Device Device Device Device
Proxy Proxy
Figure 4: User Device Proxy Topology Figure 4: User Device Proxy Topology
In this deployment topology, there is a third party application as in In this deployment topology, there is a third-party application as in
Section 6.1. However, instead of installing a user interface Section 7.1. However, instead of installing a user interface
component on the end user device, the component is installed in an component on the end user device, the component is installed in an
intermediate device, known as a User Device Proxy. From the intermediate device, known as a User Device Proxy. From the
perspective of the actual user device (on the left), the User Device perspective of the actual user device (on the left), the User Device
Proxy is a client remote user interface. As such, media, typically Proxy is a client remote user interface. As such, media, typically
transported using RTP (including RFC 2833 for carrying user input), transported using RTP (including RFC 4733 for carrying user input),
is sent from the user device to the client remote user interface on is sent from the user device to the client remote user interface on
the User Device Proxy. As far as the application is concerned, it is the User Device Proxy. As far as the application is concerned, it is
installing what it thinks is a client local user interface on the installing what it thinks is a client-local user interface on the
user device, but it happens to be on a user device proxy which looks user device, but it happens to be on a user device proxy that looks
like the user device to the application. like the user device to the application.
The user device proxy will need to terminate and re-originate both The user device proxy will need to terminate and re-originate both
signaling (SIP) and media traffic towards the actual peer in the signaling (SIP) and media traffic towards the actual peer in the
conversation. The User Device Proxy is a media relay in the conversation. The User Device Proxy is a media relay in the
terminology of RFC 3550 [18]. The User Device Proxy will need to terminology of RFC 3550 [18]. The User Device Proxy will need to
monitor the media streams associated with each dialog, in order to monitor the media streams associated with each dialog, in order to
convert user input received in the media stream to events reported to convert user input received in the media stream to events reported to
the user interface. This can pose a challenge in multi-media the user interface. This can pose a challenge in multi-media
systems, where it may be unclear on which media stream the user input systems, where it may be unclear on which media stream the user input
is being sent. As discussed in RFC 3264 [20], if a user agent has a is being sent. As discussed in RFC 3264 [20], if a user agent has a
single media source and is supporting multiple streams, it is single media source and is supporting multiple streams, it is
supposed to send that source to all streams. In cases where there supposed to send that source to all streams. In cases where there
are multiple sources, the mapping is a matter of local policy. In are multiple sources, the mapping is a matter of local policy. In
the absence of a way to explicitly identify or request which sources the absence of a way to explicitly identify or request which sources
map to which streams, the user device proxy will need to do the best map to which streams, the user device proxy will need to do the best
job it can. This specification RECOMMENDS that the User Device Proxy job it can. This specification RECOMMENDS that the User Device Proxy
monitor the first stream (defined in terms of ordering of media monitor the first stream (defined in terms of ordering of media
sessions within a session description). As such, user agents SHOULD sessions within a session description). As such, user agents SHOULD
send their user input on the first stream, absent a policy to direct send their user input on the first stream, absent a policy to direct
it otherwise. it otherwise.
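The recommendation to monitor the first stream can be illustrated with a short sketch. It assumes the User Device Proxy already holds the session description as text, and it selects the first "m=" line in document order. The function name and the sample session description are hypothetical and are not defined by this framework.

```python
def first_media_stream(sdp):
    """Return (media, port, proto) for the first m= line, or None if absent."""
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("m="):
            fields = line[2:].split()
            if len(fields) >= 3:
                # The port field may carry a "/count" suffix; keep the base port.
                port = int(fields[1].split("/")[0])
                return fields[0], port, fields[2]
    return None


# Hypothetical session description with two media sessions.
EXAMPLE_SDP = "\r\n".join([
    "v=0",
    "o=alice 2890844526 2890844526 IN IP4 host.example.com",
    "s=-",
    "c=IN IP4 192.0.2.1",
    "t=0 0",
    "m=audio 49170 RTP/AVP 0 101",
    "a=rtpmap:101 telephone-event/8000",
    "m=video 51372 RTP/AVP 31",
])

stream = first_media_stream(EXAMPLE_SDP)
```

In this sample the audio session is listed first, so a proxy following the recommendation would watch that stream for user input.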
6.4 Proxy Application 7.4. Proxy Application
+----------+ +----------+
SUB/NOT | App | SUB/NOT SUB/NOT | App | SUB/NOT
+--------------->| |<-----------------+ +--------------->| |<-----------------+
| REFER/HTTP |..........| REFER/HTTP | | REFER/HTTP |..........| REFER/HTTP |
| | SIP | | | | SIP | |
| | Proxy | | | | Proxy | |
| +----------+ | | +----------+ |
V ^ | V V ^ | V
+----------+ | | +----------+ +----------+ | | +----------+
skipping to change at page 19, line 39 skipping to change at page 19, line 42
User Device User Device User Device User Device
Figure 5: Proxy Application Topology Figure 5: Proxy Application Topology
In this topology, the application is co-resident with a transaction In this topology, the application is co-resident with a transaction
stateful, record-routing proxy server on the call path between two stateful, record-routing proxy server on the call path between two
user devices. The application uses SUBSCRIBE or REFER to install user devices. The application uses SUBSCRIBE or REFER to install
user interface components on one or both user devices. user interface components on one or both user devices.
This topology is common in routing applications, such as a web- This topology is common in routing applications, such as a web-
assisted call routing application. assisted call-routing application.
7. Application Behavior 8. Application Behavior
The behavior of an application within this framework depends on The behavior of an application within this framework depends on
whether it seeks to use a client-local or client-remote user whether it seeks to use a client-local or client-remote user
interface. interface.
7.1 Client Local Interfaces 8.1. Client-Local Interfaces
One key component of this framework is support for client local user One key component of this framework is support for client-local user
interfaces. interfaces.
7.1.1 Discovering Capabilities 8.1.1. Discovering Capabilities
A client local user interface can only be instantiated on a user A client-local user interface can only be instantiated on a user
agent if the user agent supports that type of user interface agent if the user agent supports that type of user interface
component. Support for client local user interface components is component. Support for client-local user interface components is
declared by both the UAC and a UAS in its Allow, Accept, Supported, declared by both the UAC and UAS in their Allow, Accept, Supported,
and Allow-Event header fields of dialog-initiating requests and and Allow-Event header fields of dialog-initiating requests and
responses. If the Allow header field indicates support for the SIP responses. If the Allow header field indicates support for the SIP
SUBSCRIBE method, and the Allow-Event header field indicates support SUBSCRIBE method, and the Allow-Event header field indicates support
for the kpml package [7], and the Supported header field indicates for the KPML package [8], and the Supported header field indicates
support for the GRUU GRUU [8] specification (which, in turn, means support for the Globally Routable UA URI (GRUU) [9] specification
that the Contact header field contains a GRUU), it means that the UA (which, in turn, means that the Contact header field contains a
can instantiate presentation free user interface components. In this GRUU), it means that the UA can instantiate presentation-free user
case, the application can push presentation free user interface interface components. In this case, the application can push
components according to the rules of Section 7.1.2. The specific presentation-free user interface components according to the rules of
markup languages that can be supported are indicated in the Accept Section 8.1.2. The specific markup languages that can be supported
header field. are indicated in the Accept header field.
If the Allow header field indicates support for the SIP REFER method, If the Allow header field indicates support for the SIP REFER method,
and the Supported header field indicates support for the Target- and the Supported header field indicates support for the Target-
Dialog header field [9], and the Contact header field contains UA Dialog header field [10], and the Contact header field contains UA
capabilities [5] that indicate support for the HTTP URI scheme, it capabilities [6] that indicate support for the HTTP URI scheme, it
means that the UA supports presentation capable user interface means that the UA supports presentation-capable user interface
components. In this case, the application can push presentation components. In this case, the application can push presentation-
capable user interface components to the client according to the capable user interface components to the client according to the
rules of Section 7.1.2. The specific markups that are supported are rules of Section 8.1.2. The specific markups that are supported are
indicated in the Accept header field. indicated in the Accept header field.
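The two capability checks described above can be sketched as a single function. This is a minimal illustration, not a normative procedure. It assumes the header fields have already been parsed into a dictionary, that the event list is carried under the name Allow-Events (the spelling used by the SIP events specification), that GRUU and Target-Dialog support appear as the option tags "gruu" and "tdialog", and that the HTTP URI scheme is advertised through a "schemes" Contact parameter. The function and variable names are hypothetical.

```python
import re


def _tokens(value):
    """Split a comma-separated header value into a lowercase token set."""
    return {t.strip().lower() for t in value.split(",") if t.strip()}


def ui_capabilities(headers):
    """Report which client-local UI component types a UA appears to support."""
    allow = _tokens(headers.get("Allow", ""))
    events = _tokens(headers.get("Allow-Events", ""))
    supported = _tokens(headers.get("Supported", ""))

    # Assumption: URI scheme support is advertised as ;schemes="..." on Contact.
    contact = headers.get("Contact", "")
    match = re.search(r';\s*schemes\s*=\s*"([^"]*)"', contact, re.IGNORECASE)
    schemes = _tokens(match.group(1)) if match else set()

    return {
        # SUBSCRIBE method + kpml event package + GRUU support.
        "presentation_free": (
            "subscribe" in allow and "kpml" in events and "gruu" in supported
        ),
        # REFER method + Target-Dialog support + HTTP URI scheme.
        "presentation_capable": (
            "refer" in allow and "tdialog" in supported and "http" in schemes
        ),
    }


# Hypothetical header fields taken from a dialog-initiating request.
EXAMPLE_HEADERS = {
    "Allow": "INVITE, ACK, BYE, SUBSCRIBE, REFER",
    "Allow-Events": "kpml",
    "Supported": "gruu, tdialog",
    "Contact": '<sip:ua@example.com;gr=abc>;schemes="http,sip"',
}

caps = ui_capabilities(EXAMPLE_HEADERS)
```

An application would still need to consult the Accept header field to learn which specific markups the UA can process.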
A third party application that is not present on the call path will A third-party application that is not present on the call path will
not be privy to these header fields in the dialog initiating requests not be privy to these header fields in the dialog-initiating requests
that pass by. As such, it will need to obtain this capability that pass by. As such, it will need to obtain this capability
information in other ways. One way is through the registration event information in other ways. One way is through the registration event
package [21], which can contain user agent capability information package [21], which can contain user agent capability information
provided in REGISTER requests [5]. provided in REGISTER requests [6].
7.1.2 Pushing an Initial Interface Component 8.1.2. Pushing an Initial Interface Component
Generally, we anticipate that interface components will need to be Generally, we anticipate that interface components will need to be
created at various different points in a SIP session. Clearly, they created at various different points in a SIP session. Clearly, they
will need to be pushed during session setup, or after the session is will need to be pushed during session setup, or after the session is
established. A user interface component is always associated with a established. A user interface component is always associated with a
specific dialog, however. specific dialog, however.
An application MUST NOT attempt to push a user interface component to An application MUST NOT attempt to push a user interface component to
a user agent until it has determined that the user agent has the a user agent until it has determined that the user agent has the
neccesary capabilities and a dialog has been created. In the case of necessary capabilities and a dialog has been created. In the case of
a UAC, this means that an application MUST NOT push a user interface a UAC, this means that an application MUST NOT push a user interface
component for an INVITE initiated dialog until the application has component for an INVITE-initiated dialog until the application has
seen a request confirming the receipt of a dialog-creating response. seen a request confirming the receipt of a dialog-creating response.
This could be an ACK for a 200 OK, or a PRACK for a provisional This could be an ACK for a 200 OK, or a PRACK for a provisional
response [2]. For SUBSCRIBE initiated dialogs, it MUST NOT push a response [3]. For SUBSCRIBE-initiated dialogs, the application MUST
user interface component until the application has seen a 200 OK to NOT push a user interface component until the application has seen a
the NOTIFY request. For a user interface component on a UAS, the 200 OK to the NOTIFY request. For a user interface component on a
application MUST NOT push a user interface component for an INVITE UAS, the application MUST NOT push a user interface component for an
initiated dialog until it has seen a dialog-creating response from INVITE-initiated dialog until it has seen a dialog-creating response
the UAS. For a SUBSCRIBE initiated dialog, it MUST NOT push a user from the UAS. For a SUBSCRIBE-initiated dialog, it MUST NOT push a
interface component until it has seen a NOTIFY request from the user interface component until it has seen a NOTIFY request from the
notifier. notifier.
To create a presentation capable UI component on the UA, the To create a presentation-capable UI component on the UA, the
application sends a REFER request to the UA. This REFER MUST be sent application sends a REFER request to the UA. This REFER MUST be sent
to the Globally Routable UA URI (GRUU) [8] advertised by that UA in to the GRUU [9] advertised by that UA in the Contact header field of
the Contact header field of the dialog initiating request or response the dialog-initiating request or response sent by that UA. Note that
sent by that UA. Note that this REFER request creates a separate this REFER request creates a separate dialog between the application
dialog between the application and the UA. The Refer-To header field and the UA. The Refer-To header field of the REFER request MUST
of the REFER request MUST contain an HTTP URI that references the contain an HTTP URI that references the markup document to be
markup document to be fetched. fetched.
Furthermore, it is essential for the REFER request to be correlated Furthermore, it is essential for the REFER request to be correlated
with the dialog to which the user interface component will be with the dialog to which the user interface component will be
associated. This is necessary for authorization and for terminating associated. This is necessary for authorization and for terminating
the user interface components when the dialog terminates. To provide the user interface components when the dialog terminates. To provide
this context, the REFER request MUST contain a Target-Dialog header this context, the REFER request MUST contain a Target-Dialog header
field identifying the dialog with which the user interface component field identifying the dialog with which the user interface component
is associated. As discussed in [9], this request will also contain a is associated. As discussed in [10], this request will also contain
Require header field with the tdialog option tag. a Require header field with the tdialog option tag.
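As an illustration of the REFER just described, the following sketch assembles the header fields that this framework singles out: the Refer-To header field with an HTTP URI, the Target-Dialog header field, and the Require header field with the tdialog option tag. It is deliberately incomplete: header fields needed by any SIP request (Via, Max-Forwards, Call-ID, CSeq, Contact) are omitted for brevity. The function name, the From tag, and all sample values are hypothetical.

```python
def build_refer(gruu, app_from, markup_uri, call_id, local_tag, remote_tag):
    """Assemble a partial REFER that pushes a presentation-capable component.

    gruu       -- GRUU advertised by the UA in its Contact header field
    markup_uri -- HTTP URI of the markup document to be fetched
    call_id, local_tag, remote_tag -- identifiers of the associated dialog
    """
    lines = [
        f"REFER {gruu} SIP/2.0",
        f"To: <{gruu}>",
        f"From: {app_from};tag=app-1",  # hypothetical tag value
        f"Refer-To: <{markup_uri}>",
        f"Target-Dialog: {call_id};local-tag={local_tag};remote-tag={remote_tag}",
        "Require: tdialog",
        "Content-Length: 0",
    ]
    return "\r\n".join(lines) + "\r\n\r\n"


# Hypothetical values for illustration only.
refer = build_refer(
    gruu="sip:ua@example.com;gr=urn:uuid:f81d4fae",
    app_from='"Prepaid Calling Card" <sip:prepaid@example.com>',
    markup_uri="http://app.example.com/ui/start.html",
    call_id="a84b4c76e66710@host.example.com",
    local_tag="kkaz-",
    remote_tag="6544",
)
```

Because the REFER creates its own dialog with the UA, the Target-Dialog values identify the dialog the component is bound to, not the dialog of the REFER itself.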
To create a presentation free user interface component, the To create a presentation-free user interface component, the
application sends a SUBSCRIBE request to the UA. The SUBSCRIBE MUST application sends a SUBSCRIBE request to the UA. The SUBSCRIBE MUST
be sent to the GRUU advertised by the UA. This SUBSCRIBE request be sent to the GRUU advertised by the UA. This SUBSCRIBE request
creates a separate dialog. The SUBSCRIBE request MUST use the KPML creates a separate dialog. The SUBSCRIBE request MUST use the KPML
[7] event package. The body of the SUBSCRIBE request contains the [8] event package. The body of the SUBSCRIBE request contains the
markup document that defines the conditions under which the markup document that defines the conditions under which the
application wishes to be notified of user input. application wishes to be notified of user input.
In both cases, the REFER or SUBSCRIBE request SHOULD include a In both cases, the REFER or SUBSCRIBE request SHOULD include a
display name in the From header field which identifies the name of display name in the From header field that identifies the name of the
the application. For example, a prepaid calling card might include a application. For example, a prepaid calling card might include a
From header field which looks like: From header field that looks like:
From: "Prepaid Calling Card" <sip:prepaid@example.com> From: "Prepaid Calling Card" <sip:prepaid@example.com>
Any of the SIP identity assertion mechanisms that have been defined, Any of the SIP identity assertion mechanisms that have been defined,
such as [11] and [13] are applicable to these requests as well. such as [11] and [13], are applicable to these requests as well.
7.1.3 Updating an Interface Component 8.1.3. Updating an Interface Component
Once a user interface component has been created on a client, it can Once a user interface component has been created on a client, it can
be updated. The means for updating it depends on the type of UI be updated. The means for updating it depends on the type of UI
component. component.
Presentation capable UI components are updated using techniques Presentation-capable UI components are updated using techniques
already in place for those markups. In particular, user input will already in place for those markups. In particular, user input will
cause an HTTP POST operation to push the user input to the cause an HTTP POST operation to push the user input to the
application. The result of the POST operation is a new markup that application. The result of the POST operation is a new markup that
the UI is supposed to use. This allows the UI to be updated in the UI is supposed to use. This allows the UI to be updated in
response to user action. Some markups, such as HTML, provide the response to user action. Some markups, such as HTML, provide the
ability to force a refresh after a certain period of time, so that ability to force a refresh after a certain period of time, so that
the UI can be updated without user input. Those mechanisms can be the UI can be updated without user input. Those mechanisms can be
used here as well. However, there is no support for an asynchronous used here as well. However, there is no support for an asynchronous
push of an updated UI component from the appliciation to the user push of an updated UI component from the application to the user
agent. A new REFER request to the same GRUU would create a new UI agent. A new REFER request to the same GRUU would create a new UI
component rather than updating any components already in place. component rather than update any components already in place.
For presentation free UI, the story is different. The application For presentation-free UI, the story is different. The application
MAY update the filter at any time by generating a SUBSCRIBE refresh MAY update the filter at any time by generating a SUBSCRIBE refresh
with the new filter. The UA will immediately begin using this new with the new filter. The UA will immediately begin using this new
filter. filter.
7.1.4 Terminating an Interface Component 8.1.4. Terminating an Interface Component
User interface components have a well defined lifetime. They are User interface components have a well-defined lifetime. They are
created when the component is first pushed to the client. User created when the component is first pushed to the client. User
interface components are always associated with the SIP dialog on interface components are always associated with the SIP dialog on
which they were pushed. As such, their lifetime is bound by the which they were pushed. As such, their lifetime is bound by the
lifetime of the dialog. When the dialog ends, so does the interface lifetime of the dialog. When the dialog ends, so does the interface
component. component.
However, there are some cases where the application would like to However, there are some cases where the application would like to
terminate the user interface component before its natural termination terminate the user interface component before its natural termination
point. For presentation capable user interfaces, this is not point. For presentation-capable user interfaces, this is not
possible. For presentation free user interfaces, the application MAY possible. For presentation-free user interfaces, the application MAY
terminate the component by sending a SUBSCRIBE with Expires equal to terminate the component by sending a SUBSCRIBE with Expires equal to
zero. This terminates the subscription, which removes the UI zero. This terminates the subscription, which removes the UI
component. component.
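The early termination of a presentation-free component can be sketched as a SUBSCRIBE sent within the existing subscription dialog with an Expires value of zero. The example below is a partial message only: Via, Max-Forwards, and Contact are omitted, and any dialog-identifying parameters that the KPML event package places on the Event header field are left out. The function name and all sample values are hypothetical.

```python
def build_unsubscribe(gruu, app_from, call_id, from_tag, to_tag, cseq):
    """Assemble a partial in-dialog SUBSCRIBE that ends a KPML subscription."""
    lines = [
        f"SUBSCRIBE {gruu} SIP/2.0",
        f"To: <{gruu}>;tag={to_tag}",
        f"From: {app_from};tag={from_tag}",
        f"Call-ID: {call_id}",
        f"CSeq: {cseq} SUBSCRIBE",
        "Event: kpml",
        "Expires: 0",  # zero terminates the subscription and removes the component
        "Content-Length: 0",
    ]
    return "\r\n".join(lines) + "\r\n\r\n"


# Hypothetical values for illustration only.
unsubscribe = build_unsubscribe(
    gruu="sip:ua@example.com;gr=urn:uuid:f81d4fae",
    app_from='"Prepaid Calling Card" <sip:prepaid@example.com>',
    call_id="sub-3848276298220188511@app.example.com",
    from_tag="app-7",
    to_tag="ua-42",
    cseq=3,
)
```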
A client can remove a UI component at any time. For presentation A client can remove a UI component at any time. For presentation-
capable UI, this is analagous to the user dismissing the web form capable UI, this is analogous to the user dismissing the web form
window. There is no mechanism provided for reporting this kind of window. There is no mechanism provided for reporting this kind of
event to the application. The application MUST be prepared to time event to the application. The application MUST be prepared to time
out, and never receive input from a user. The duration of this out and never receive input from a user. The duration of this
timeout is application dependent. For presentation free user timeout is application dependent. For presentation-free user
interfaces, the UA can explicitly terminate the subscription. This interfaces, the UA can explicitly terminate the subscription. This
will result in the generation of a NOTIFY with a Subscription-State will result in the generation of a NOTIFY with a Subscription-State
header field equal to "terminated". header field equal to "terminated".
7.2 Client Remote Interfaces 8.2. Client-Remote Interfaces
As an alternative to, or in conjunction with client local user As an alternative to, or in conjunction with client-local user
interfaces, an application can make use of client remote user interfaces, an application can make use of client-remote user
interfaces. These user interfaces can execute co-resident with the interfaces. These user interfaces can execute co-resident with the
application itself (in which case no standardized interfaces between application itself (in which case no standardized interfaces between
the UI and the application need to be used), or it can run the UI and the application need to be used), or they can run
separately. This framework assumes that the user interface runs on a separately. This framework assumes that the user interface runs on a
host that has a sufficient trust relationship with the application. host that has a sufficient trust relationship with the application.
As such, the means for instantiating the user interface is not As such, the means for instantiating the user interface is not
considered here. considered here.
The primary issue is to connect the user device to the remote user The primary issue is to connect the user device to the remote user
interface. Doing so requires the manipulation of media streams interface. Doing so requires the manipulation of media streams
between the client and the user interface. Such manipulation can between the client and the user interface. Such manipulation can
only be done by user agents. There are two types of user agent only be done by user agents. There are two types of user agent
applications within this framework - originating/terminating applications within this framework: originating/terminating
applications, and intermediary applications. applications, and intermediary applications.
7.2.1 Originating and Terminating Applications 8.2.1. Originating and Terminating Applications
Originating and terminating applications are applications which are Originating and terminating applications are applications that are
themselves the originator or the final recipient of a SIP invitation. themselves the originator or the final recipient of a SIP invitation.
They are "pure" user agent applications - not back-to-back user They are "pure" user agent applications, not back-to-back user
agents. The classic example of such an application is an interactive agents. The classic example of such an application is an interactive
voice response (IVR) application, which is typically a terminating voice response (IVR) application, which is typically a terminating
application. It is a terminating application because the user application. It is a terminating application because the user
explicitly calls it; i.e., it is the actual called party. An example explicitly calls it; i.e., it is the actual called party. An example
of an originating application is a wakeup call application, which of an originating application is a wakeup call application, which
calls a user at a specified time in order to wake them up. calls a user at a specified time in order to wake them up.
Because originating and terminating applications are a natural Because originating and terminating applications are a natural
termination point of the dialog, manipulation of the media session by termination point of the dialog, manipulation of the media session by
the application is trivial. Traditional SIP techniques for adding the application is trivial. Traditional SIP techniques for adding
and removing media streams, modifying codecs, and changing the and removing media streams, modifying codecs, and changing the
address of the recipient of the media streams, can be applied. address of the recipient of the media streams can be applied.
7.2.2 Intermediary Applications 8.2.2. Intermediary Applications
Intermediary applications are, at the same time, more common than Intermediary applications are, at the same time, more common than
originating/terminating applications, and more complex. Intermediary originating/terminating applications and more complex. Intermediary
applications are applications that are neither the actual caller or applications are applications that are neither the actual caller nor
called party. Rather, they represent a "third party" that wishes to the called party. Rather, they represent a "third party" that wishes
interact with the user. The classic example is the ubiquitous pre- to interact with the user. The classic example is the ubiquitous
paid calling card application. prepaid calling card application.
In order for the intermediary application to add a client remote user In order for the intermediary application to add a client-remote user
interface, it needs to manipulate the media streams of the user agent interface, it needs to manipulate the media streams of the user agent
to terminate on that user interface. This also introduces a to terminate on that user interface. This also introduces a
fundamental feature interaction issue. Since the intermediary fundamental feature interaction issue. Since the intermediary
application is not an actual participant in the call, the user will application is not an actual participant in the call, the user will
need to interact with both the intermediary application and its peer need to interact with both the intermediary application and its peer
in the dialog. Doing both at the same time is complicated, and is in the dialog. Doing both at the same time is complicated and is
discussed in more detail in Section 9. discussed in more detail in Section 10.
8. User Agent Behavior 9. User Agent Behavior
8.1 Advertising Capabilities 9.1. Advertising Capabilities
In order to participate in applications that make use of stimulus In order to participate in applications that make use of stimulus
interfaces, a user agent needs to advertise its interaction interfaces, a user agent needs to advertise its interaction
capabilities. capabilities.
If a user agent supports presentation capable user interfaces, it If a user agent supports presentation-capable user interfaces, it
MUST support the REFER method. It MUST include, in all dialog MUST support the REFER method. It MUST include, in all dialog-
initiating requests and responses, an Allow header field that initiating requests and responses, an Allow header field that
includes the REFER method. The user agent MUST support the target includes the REFER method. The user agent MUST support the target
dialog specification [9], and MUST include the "tdialog" option tag dialog specification [10], and MUST include the "tdialog" option tag
in the Supported header field of dialog forming requests and in the Supported header field of dialog-forming requests and
responses. Furthermore, the UA MUST support the SIP user agent responses. Furthermore, the UA MUST support the SIP user agent
capabilities specification [5]. The UA MUST be capable of being capabilities specification [6]. The UA MUST be capable of being
REFER'd to an HTTP URI. It MUST include, in the Contact header field REFERed to an HTTP URI. It MUST include, in the Contact header field
of its dialog initiating requests and responses, a "schemes" Contact of its dialog-initiating requests and responses, a "schemes" Contact
header field parameter that includes the http URI scheme. The UA header field parameter that includes the HTTP URI scheme. The UA
MUST include, in all dialog initiating requests and responses, an MUST include, in all dialog-initiating requests and responses, an
Accept header field listing all of those markups supported by the UA. Accept header field listing all of those markups supported by the UA.
It is RECOMMENDED that all user agents that support presentation It is RECOMMENDED that all user agents that support presentation-
capable user interfaces support HTML. capable user interfaces support HTML.
If a user agent supports presentation free user interfaces, it MUST If a user agent supports presentation-free user interfaces, it MUST
support the SUBSCRIBE [3] method. It MUST support the KPML [7] event support the SUBSCRIBE [4] method. It MUST support the KPML [8] event
package. It MUST include, in all dialog initiating requests and package. It MUST include, in all dialog-initiating requests and
responses, an Allow header field that includes the SUBSCRIBE method. responses, an Allow header field that includes the SUBSCRIBE method.
It MUST include, in all dialog initiating requests and responses, an It MUST include, in all dialog-initiating requests and responses, an
Allow-Events header field that lists the KPML event package. The UA Allow-Events header field that lists the KPML event package. The UA
MUST include, in all dialog initiating requests and responses, an MUST include, in all dialog-initiating requests and responses, an
Accept header field listing those event filters it supports. At a Accept header field listing those event filters it supports. At a
minimum, a UA MUST support the "application/kpml-request+xml" MIME minimum, a UA MUST support the "application/kpml-request+xml" MIME
type. type.
For either presentation free or presentation capable user interfaces, For either presentation-free or presentation-capable user interfaces,
the user agent MUST support the GRUU [8] specification. The Contact the user agent MUST support the GRUU [9] specification. The Contact
header field in all dialog initiating requests and responses MUST header field in all dialog-initiating requests and responses MUST
contain a GRUU. The UA MUST include a Supported header field which contain a GRUU. The UA MUST include a Supported header field that
contains the "gruu" option tag and the "tdialog" option tag. contains the "gruu" option tag and the "tdialog" option tag.
Because these headers are examined by proxies which may be executing Because these headers are examined by proxies that may be executing
applications, a UA that wishes to support client local user applications, a UA that wishes to support client-local user
interfaces should not encrypt them. interfaces should not encrypt them.
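
The advertising rules above can be sketched as a small helper that assembles the capability-related header fields for a dialog-initiating request or response. This is only an illustration, not part of either document version: the function name and its parameters are hypothetical, the exact spelling of the "schemes" Contact parameter and the "kpml" event package token are assumptions, and the example GRUU is made up.

```python
def capability_headers(gruu, presentation_capable=False, presentation_free=False,
                       markups=("text/html",)):
    """Return a dict of header name -> value for a dialog-initiating message.

    Hypothetical helper: it mirrors the MUST-level rules of the section above
    but does not build a complete SIP message.
    """
    allow = []
    accept = []
    contact = f"<{gruu}>"                     # Contact carries a GRUU in both cases
    headers = {"Supported": "gruu, tdialog"}  # option tags required in both cases

    if presentation_capable:
        allow.append("REFER")
        accept.extend(markups)                # markups the UA can render
        contact += ';schemes="http"'          # assumed parameter spelling

    if presentation_free:
        allow.append("SUBSCRIBE")
        headers["Allow-Events"] = "kpml"      # assumed KPML package token
        accept.append("application/kpml-request+xml")

    headers["Contact"] = contact
    if allow:
        headers["Allow"] = ", ".join(allow)
    if accept:
        headers["Accept"] = ", ".join(accept)
    return headers


if __name__ == "__main__":
    for name, value in capability_headers(
            "sip:alice@example.com;gr=abc", True, True).items():
        print(f"{name}: {value}")
```

Returning a plain dictionary keeps the sketch independent of any particular SIP stack; a real user agent would merge these values into the header fields it already generates.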
8.2 Receiving User Interface Components 9.2. Receiving User Interface Components
Once the UA has created a dialog (in either the early or confirmed Once the UA has created a dialog (in either the early or confirmed
states), it MUST be prepared to receive a SUBSCRIBE or REFER request states), it MUST be prepared to receive a SUBSCRIBE or REFER request
against its GRUU. If the UA receives such a request prior to the against its GRUU. If the UA receives such a request prior to the
establishment of a dialog, the UA MUST reject the request. establishment of a dialog, the UA MUST reject the request.
A user agent SHOULD attempt to authenticate the sender of the A user agent SHOULD attempt to authenticate the sender of the
request. The sender will generally be an application, and therefore request. The sender will generally be an application; therefore, the
the user agent is unlikely to ever have a shared secret with it, user agent is unlikely to ever have a shared secret with it, making
making digest authentication useless. However, authenticated digest authentication useless. However, authenticated identities can
identities can be obtained through other means, such as [11]. be obtained through other means, such as the Identity mechanism [11].
A user agent MAY have pre-defined authorization policies which permit A user agent MAY have pre-defined authorization policies that permit
applications which have authenticated themselves with a particular applications which have authenticated themselves with a particular
identity, to push user interface components. If such a set of identity to push user interface components. If such a set of
policies are present, they are checked first. If the application is policies is present, it is checked first. If the application is
authorized, processing proceeds. authorized, processing proceeds.
If the application has authenticated itself, but it is not explicitly If the application has authenticated itself but is not explicitly
authorized or blocked, this specification RECOMMENDS that the authorized or blocked, this specification RECOMMENDS that the
application be automatically authorized if it can prove that it was application be automatically authorized if it can prove that it was
either on the call path, or is trusted by one of the elements on the either on the call path, or is trusted by one of the elements on the
call path. An application proves this to the user agent by call path. An application proves this to the user agent by
demonstrating that it knows the dialog identifiers. That occurs by demonstrating that it knows the dialog identifiers. That occurs by
including them in a Target-Dialog header field for REFER requests, or including them in a Target-Dialog header field for REFER requests, or
in the Event header field parameters of the KPML SUBSCRIBE request. in the Event header field parameters of the KPML SUBSCRIBE request.
Because of the dialog identifiers serve as a tool for authorization, Because the dialog identifiers serve as a tool for authorization, a
a user agent compliant to this framework SHOULD use dialog user agent compliant to this framework SHOULD use dialog identifiers
identifiers that are cryptographically random, with at least 128 bits that are cryptographically random, with at least 128 bits of
of randomness. It is recommended that this randomness be split randomness. It is recommended that this randomness be split between
between the Call-ID and From header field tag in the case of a UAC. the Call-ID and From header field tags in the case of a UAC.
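
As a minimal sketch of the recommendation above, a UAC could draw 64 random bits for the Call-ID and 64 for the From tag, giving 128 bits in total. The function name and the host suffix are hypothetical; the only library call used is Python's standard `secrets.token_hex`.

```python
import secrets


def new_dialog_identifiers(host="client.example.com"):
    """Return (call_id, from_tag) with 64 bits of randomness in each part.

    Hypothetical helper; the even 64/64 split is one possible way to divide
    the 128 bits between the two identifiers.
    """
    call_id = f"{secrets.token_hex(8)}@{host}"   # 8 random bytes = 64 bits
    from_tag = secrets.token_hex(8)              # another 64 bits
    return call_id, from_tag


if __name__ == "__main__":
    call_id, from_tag = new_dialog_identifiers()
    print("Call-ID:", call_id)
    print("From tag:", from_tag)
```

The `secrets` module is used rather than `random` because the identifiers double as an authorization token and therefore need a cryptographically strong source.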
Furthermore, to ensure that only applications resident in or trusted Furthermore, to ensure that only applications resident in or trusted
by on-path elements can instantiate a user interface component, a by on-path elements can instantiate a user interface component, a
user agent compliant to this specification SHOULD use the sips URI user agent compliant to this specification SHOULD use the Session
scheme for all dialogs it initiates. This will guarantee secure Initiation Protocol Secure (SIPS) URI scheme for all dialogs it
links between all of the elements on the signaling path. initiates. This will guarantee secure links between all the elements
on the signaling path.
If the dialog was not established with a sips URI, or the user agent If the dialog was not established with a SIPS URI, or the user agent
did not choose cryptographically random dialog identifiers, then the did not choose cryptographically random dialog identifiers, then the
application MUST NOT automatically be authorized, even if it application MUST NOT automatically be authorized, even if it
presented valid dialog identifiers. A user agent MAY apply any other presented valid dialog identifiers. A user agent MAY apply any other
policies in addition to (but not instead of) the ones specified here policies in addition to (but not instead of) the ones specified here
in order to authorize the creation of the user interface component. in order to authorize the creation of the user interface component.
One such mechanism would be to prompt the user, informing them of the One such mechanism would be to prompt the user, informing them of the
identity of the application and the dialog it is associated with. If identity of the application and the dialog it is associated with. If
an authorization policy requires user interaction, the user agent an authorization policy requires user interaction, the user agent
SHOULD respond to the SUBSCRIBE or REFER request with a 202. In the SHOULD respond to the SUBSCRIBE or REFER request with a 202. In the
case of SUBSCRIBE, if authorization is not granted, the user agent case of SUBSCRIBE, if authorization is not granted, the user agent
SHOULD generate a NOTIFY to terminate the subscription. In the case SHOULD generate a NOTIFY to terminate the subscription. In the case
of REFER, the user agent MUST NOT act upon the URI in the Refer-To of REFER, the user agent MUST NOT act upon the URI in the Refer-To
header field until user authorization was obtained. header field until user authorization is obtained.
If an application does not present a valid dialog identifier in its If an application does not present a valid dialog identifier in its
REFER or SUBSCRIBE request, the user agent MUST reject the request REFER or SUBSCRIBE request, the user agent MUST reject the request
with a 403 response. with a 403 response.
If a REFER request to an HTTP URI was authorized, the UA executes the If a REFER request to an HTTP URI is authorized, the UA executes the
URI and fetches the content to be rendered to the user. This URI and fetches the content to be rendered to the user. This
instantiates a presentation capable user interface component. If a instantiates a presentation-capable user interface component. If a
SUBSCRIBE was authorized, a presentation free user interface SUBSCRIBE was authorized, a presentation-free user interface
component is instantiated. component is instantiated.
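
The authorization steps in this section can be summarized as a decision function. The sketch below is one reading of the text, not a normative algorithm: the names are hypothetical, the ordering of the checks is an interpretation, and falling back to a user prompt (answered with a 202) when automatic authorization does not apply is an assumption, since the text only offers prompting as one possible additional policy.

```python
from enum import Enum
from typing import Optional


class Decision(Enum):
    REJECT = "reject"        # no dialog exists yet, or local policy blocks the sender
    FORBIDDEN = "403"        # dialog identifiers missing or invalid
    AUTHORIZE = "authorize"  # instantiate the user interface component
    ASK_USER = "202"         # accept provisionally, then prompt the user


def authorize_push(dialog_exists: bool, ids_valid: bool, authenticated: bool,
                   policy: Optional[bool], sips_dialog: bool,
                   random_ids: bool) -> Decision:
    """Decide how to treat an incoming SUBSCRIBE or REFER sent to the GRUU.

    policy is True (explicitly permitted), False (explicitly blocked), or
    None (no pre-defined entry for this authenticated identity).
    """
    if not dialog_exists:
        return Decision.REJECT
    if not ids_valid:
        return Decision.FORBIDDEN
    # Pre-defined policies are keyed on an authenticated identity and are
    # consulted before automatic authorization.
    if authenticated and policy is not None:
        return Decision.AUTHORIZE if policy else Decision.REJECT
    # Automatic authorization needs a SIPS dialog and random identifiers.
    if authenticated and sips_dialog and random_ids:
        return Decision.AUTHORIZE
    # Assumed fallback: defer to the user rather than authorize silently.
    return Decision.ASK_USER


if __name__ == "__main__":
    print(authorize_push(True, True, True, None, True, True))
```

Note that valid dialog identifiers alone never yield automatic authorization here: without a SIPS dialog and random identifiers, the function defers to the user.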
8.3 Mapping User Input to User Interface Components 9.3. Mapping User Input to User Interface Components
Once the user interface components are instantiated, the user agent Once the user interface components are instantiated, the user agent
must direct user input to the appropriate component. In the case of must direct user input to the appropriate component. In the case of
presentation capable user interfaces, this process is known as focus presentation-capable user interfaces, this process is known as focus
selection. It is done by means that are specific to the user selection. It is done by means that are specific to the user
interface on the device. In the case of a PC, for example, the interface on the device. In the case of a PC, for example, the
window manager would allow the user to select the appropriate user window manager would allow the user to select the appropriate user
interface component that their input is directed to. interface component to which their input is directed.
For presentation free user interfaces, the situation is more For presentation-free user interfaces, the situation is more
complicated. In some cases, the device may support a mechanism that complicated. In some cases, the device may support a mechanism that
allows the user to select a "line", and thus the associated dialog. allows the user to select a "line", and thus the associated dialog.
Any user input on the keypad while this line is selected are fed to Any user input on the keypad while this line is selected are fed to
the user interface components associated with that dialog. the user interface components associated with that dialog.
Otherwise, for client local user interfaces, the user input is Otherwise, for client-local user interfaces, the user input is
assumed to be associated with all user interface components. For assumed to be associated with all user interface components. For
client remote user interfaces, the user device converts the user client-remote user interfaces, the user device converts the user
input to media, typically conveyed using RFC 2833, and sends this to input to media, typically conveyed using RFC 4733, and sends this to
the client remote user interface. This user interface then needs to the client-remote user interface. This user interface then needs to
map user input from potentially many media streams into user map user input from potentially many media streams into user
interface events. The process for doing this is described in interface events. The process for doing this is described in
Section 6.3. Section 7.3.
8.4 Receiving Updates to User Interface Components 9.4. Receiving Updates to User Interface Components
For presentation capable user interfaces, updates to the user For presentation-capable user interfaces, updates to the user
interface occur in ways specific to that user interface component. interface occur in ways specific to that user interface component.
In the case of HTML, for example, the document can tell the client to In the case of HTML, for example, the document can tell the client to
fetch a new document periodically. However, this framework does not fetch a new document periodically. However, this framework does not
provide any additional machinery to asynchronously push a new user provide any additional machinery to asynchronously push a new user
interface component to the client. interface component to the client.
For presentation free user interfaces, an application can push an For presentation-free user interfaces, an application can push an
update to a component by sending a SUBSCRIBE refresh with a new update to a component by sending a SUBSCRIBE refresh with a new
filter. The user agent will process these according to the rules of filter. The user agent will process these according to the rules of
the event package. the event package.
8.5 Terminating a User Interface Component 9.5. Terminating a User Interface Component
Termination of a presentation capable user interface component is a Termination of a presentation-capable user interface component is a
trivial procedure. The user agent merely dismisses the window (or trivial procedure. The user agent merely dismisses the window (or
equivalent). The fact that the component is dismissed is not its equivalent). The fact that the component is dismissed is not
communicated to the application. As such, it is purely a local communicated to the application. As such, it is purely a local
matter. matter.
In the case of a presentation free user interface, the user might In the case of a presentation-free user interface, the user might
wish to cease interacting with the application. However, most wish to cease interacting with the application. However, most
presentation free user interfaces will not have a way for the user to presentation-free user interfaces will not have a way for the user to
signal this through the device. If such a mechanism did exist, the signal this through the device. If such a mechanism did exist, the
UA SHOULD generate a NOTIFY request with a Subscription-State equal UA SHOULD generate a NOTIFY request with a Subscription-State header
to "terminated" and a reason of "rejected". This tells the field equal to "terminated" and a reason of "rejected". This tells
application that the component has been removed, and that it should the application that the component has been removed and that it
not attempt to re-subscribe. should not attempt to re-subscribe.
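
The two termination cases above differ only in whether anything is signaled to the application. A minimal sketch follows, assuming a hypothetical `on_user_dismiss` hook that returns the header line to place in a NOTIFY, or `None` when dismissal is purely local.

```python
def on_user_dismiss(component_kind):
    """Return the Subscription-State header line to send, or None.

    Hypothetical hook: component_kind is "presentation-capable" or
    "presentation-free".
    """
    if component_kind == "presentation-capable":
        # Dismissing the window is a local matter; nothing is sent.
        return None
    if component_kind == "presentation-free":
        # Tells the application the component is gone and that it should
        # not attempt to re-subscribe.
        return "Subscription-State: terminated;reason=rejected"
    raise ValueError(f"unknown component kind: {component_kind}")


if __name__ == "__main__":
    print(on_user_dismiss("presentation-free"))
```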
9. Inter-Application Feature Interaction 10. Inter-Application Feature Interaction
The inter-application feature interaction problem is inherent to The inter-application feature interaction problem is inherent to
stimulus signaling. Whenever there are multiple applications, there stimulus signaling. Whenever there are multiple applications, there
are multiple user interfaces. The system has to determine to which are multiple user interfaces. The system has to determine to which
user interface any particular input is destined. That question is user interface any particular input is destined. That question is
the essence of the inter-application feature interaction problem. the essence of the inter-application feature interaction problem.
Inter-application feature interaction is not an easy problem to Inter-application feature interaction is not an easy problem to
resolve. For now, we consider separately the issues for client-local resolve. For now, we consider separately the issues for client-local
and client-remote user interface components. and client-remote user interface components.
9.1 Client Local UI 10.1. Client-Local UI
When the user interface itself resides locally on the client device, When the user interface itself resides locally on the client device,
the feature interaction problem is actually much simpler. The end the feature interaction problem is actually much simpler. The end
device knows explicitly about each application, and therefore can device knows explicitly about each application, and therefore can
present the user with each one separately. When the user provides present the user with each one separately. When the user provides
input, the client device can determine to which user interface the input, the client device can determine to which user interface the
input is destined. The user interface to which input is destined is input is destined. The user interface to which input is destined is
referred to as the application in focus, and the means by which the referred to as the "application in focus", and the means by which the
focused application is selected is called focus determination. focused application is selected is called "focus determination".
Generally speaking, focus determination is purely a local operation. Generally speaking, focus determination is purely a local operation.
In the PC universe, focus determination is provided by window In the PC universe, focus determination is provided by window
managers. Each application does not know about focus, it merely managers. Each application does not know about focus; it merely
receives the user input that has been targeted to it when its in receives the user input that has been targeted to it when it's in
focus. This basic concept applies to SIP-based applications as well. focus. This basic concept applies to SIP-based applications as well.
Focus determination will frequently be trivial, depending on the user Focus determination will frequently be trivial, depending on the user
interface type. Consider a user that makes a call from a PC. The interface type. Consider a user that makes a call from a PC. The
call passes through a pre-paid calling card application, and a call call passes through a prepaid calling card application and a call-
recording application. Both of these wish to interact with the user. recording application. Both of these wish to interact with the user.
Both push an HTML-based user interface to the user. On the PC, each Both push an HTML-based user interface to the user. On the PC, each
user interface would appear as a separate window. The user interacts user interface would appear as a separate window. The user interacts
with the call recording application by selecting its window, and with with the call-recording application by selecting its window, and with
the pre-paid calling card application by selecting its window. Focus the prepaid calling card application by selecting its window. Focus
determination is literally provided by the PC window manager. It is determination is literally provided by the PC window manager. It is
clear to which application the user input is targeted. clear to which application the user input is targeted.
As another example, consider the same two applications, but on a As another example, consider the same two applications, but on a
"smart phone" that has a set of buttons, and next to each button, an "smart phone" that has a set of buttons, and next to each button,
LCD display that can provide the user with an option. This user there is an LCD display that can provide the user with an option.
interface can be represented using the Wireless Markup Language This user interface can be represented using the Wireless Markup
(WML), for example. Language (WML), for example.
The phone would allocate some number of buttons to each application. The phone would allocate some number of buttons to each application.
The prepaid calling card would get one button for its "hangup" The prepaid calling card would get one button for its "hangup"
command, and the recording application would get one for its "start/ command, and the recording application would get one for its "start/
stop" command. The user can easily determine which application to stop" command. The user can easily determine which application to
interact with by pressing the appropriate button. Pressing a button interact with by pressing the appropriate button. Pressing a button
determines focus and provides user input, both at the same time. determines focus and provides user input, both at the same time.
Unfortunately, not all devices will have these advanced displays. A Unfortunately, not all devices will have these advanced displays. A
PSTN gateway, or a basic IP telephone, may only have a 12-key keypad. PSTN gateway, or a basic IP telephone, may only have a 12-key keypad.
The user interfaces for these devices are provided through the Keypad The user interfaces for these devices are provided through the Keypad
Markup Language (KPML). Considering once again the feature Markup Language (KPML). Considering once again the feature
interaction case above, the pre-paid calling card application and the interaction case above, the prepaid calling card application and the
call recording application would both pass a KPML document to the call-recording application would both pass a KPML document to the
device. When the user presses a button on the keypad, to which device. When the user presses a button on the keypad, to which
document does the input apply? The device does not allow the user to document does the input apply? The device does not allow the user to
select. A device where the user cannot provide focus is called a select. A device where the user cannot provide focus is called a
focusless device. This is quite a hard problem to solve. This "focusless device". This is quite a hard problem to solve. This
framework does not make any explicit normative recommendation, but framework does not make any explicit normative recommendation, but it
concludes that the best option is to send the input to both user concludes that the best option is to send the input to both user
interfaces unless the markup in one interface has indicated that it interfaces unless the markup in one interface has indicated that it
should be suppressed from others. This is a sensible choice by should be suppressed from others. This is a sensible choice by
analogy - its exactly what the existing circuit switched telephone analogy -- it's exactly what the existing circuit-switched telephone
network will do. It is an explicit non-goal to provide a better network will do. It is an explicit non-goal to provide a better
mechanism for feature interaction resolution than the PSTN on devices mechanism for feature interaction resolution than the PSTN on devices
which have the same user interface as they do on the PSTN. Devices that have the same user interface as they do on the PSTN. Devices
with better displays, such as PCs or screen phones, can benefit from with better displays, such as PCs or screen phones, can benefit from
the capabilities of this framework, allowing the user to determine the capabilities of this framework, allowing the user to determine
which application they are interacting with. which application they are interacting with.
Indeed, when a user provides input on a focusless device, the input Indeed, when a user provides input on a focusless device, the input
must be passed to all client local user interfaces, AND all client must be passed to all client-local user interfaces AND all client-
remote user interfaces, unless the markup tells the UI to suppress remote user interfaces, unless the markup tells the UI to suppress
the media. In the case of KPML, key events are passed to remote user the media. In the case of KPML, key events are passed to remote user
interfaces by encoding them in RFC 2833 [19]. Of course, since a interfaces by encoding them as described in RFC 4733 [19]. Of
client cannot determine if a media stream terminates in a remote user course, since a client cannot determine whether or not a media stream
interface or not, these key events are passed in all audio media terminates in a remote user interface, these key events are passed in
streams unless the KPML request document is used to suppress. all audio media streams unless the KPML request document is used to
suppress them.
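The delivery rule for focusless devices described above can be sketched roughly as follows. This is only an illustration: the `UserInterface` class, its `suppresses_others` flag, and `dispatch_focusless_input` are hypothetical names introduced here, and the framework itself makes no normative recommendation on this point.

```python
from dataclasses import dataclass, field


@dataclass
class UserInterface:
    """Hypothetical model of one user interface component."""
    name: str
    # True when this interface's markup asked that input be hidden from others.
    suppresses_others: bool = False
    received: list = field(default_factory=list)


def dispatch_focusless_input(key, interfaces):
    """Deliver one key press on a focusless device.

    If any interface's markup requests suppression, only the requesting
    interfaces see the key; otherwise every interface sees it.
    """
    suppressors = [ui for ui in interfaces if ui.suppresses_others]
    targets = suppressors if suppressors else list(interfaces)
    for ui in targets:
        ui.received.append(key)
    return [ui.name for ui in targets]
```

With two interfaces and no suppression, both receive the key, which mirrors what the circuit-switched telephone network does today.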
9.2 Client-Remote UI 10.2. Client-Remote UI
When the user interfaces run remotely, the determination of focus can When the user interfaces run remotely, the determination of focus can
be much, much harder. There are many architectures that can be be much, much harder. There are many architectures that can be
deployed to handle the interaction. None are ideal. However, all deployed to handle the interaction. None are ideal. However, all
are beyond the scope of this specification. are beyond the scope of this specification.
10. Intra Application Feature Interaction 11. Intra Application Feature Interaction
An application can instantiate a multiplicity of user interface An application can instantiate a multiplicity of user interface
components. For example, a single application can instantiate two components. For example, a single application can instantiate two
separate HTML components and one WML component. Furthermore, an separate HTML components and one WML component. Furthermore, an
application can instantiate both client local and client remote user application can instantiate both client-local and client-remote user
interfaces. interfaces.
The feature interaction issues between these components within the The feature interaction issues between these components within the
same application are less severe. If an application has multiple same application are less severe. If an application has multiple
client user interface components, their interaction is resolved client user interface components, their interaction is resolved
identically to the inter-application case - through focus identically to the inter-application case -- through focus
determination. However, the problems in focusless user devices (such determination. However, the problems in focusless user devices (such
as a keypad on a telephone) generally won't exist, since the as a keypad on a telephone) generally won't exist, since the
application can generate user interfaces which do not overlap in application can generate user interfaces that do not overlap in their
their usage of an input. usage of an input.
The real issue is that the optimal user experience frequently The real issue is that the optimal user experience frequently
requires some kind of coupling between the differing user interface requires some kind of coupling between the differing user interface
components. This is a classic problem in multi-modal user components. This is a classic problem in multi-modal user
interfaces, such as those described by Speech Application Language interfaces, such as those described by Speech Application Language
Tags (SALT). As an example, consider a user interface where a user Tags (SALT). As an example, consider a user interface where a user
can either press a labeled button to make a selection, or listen to a can either press a labeled button to make a selection, or listen to a
prompt, and speak the desired selection. Ideally, when the user prompt, and speak the desired selection. Ideally, when the user
presses the button, the prompt should cease immediately, since both presses the button, the prompt should cease immediately, since both
of them were targeted at collecting the same information in parallel. of them were targeted at collecting the same information in parallel.
Such interactions are best handled by markups which natively support Such interactions are best handled by markups that natively support
such interactions, such as SALT, and thus require no explicit support such interactions, such as SALT, and thus require no explicit support
from this framework. from this framework.
11. Example Call Flow 12. Example Call Flow
This section shows the operation of a call recording application. This section shows the operation of a call-recording application.
This application allows a user to record the media in their call by This application allows a user to record the media in their call by
clicking on a button in a web form. The application uses a clicking on a button in a web form. The application uses a
presentation capable user interface component that is pushed to the presentation-capable user interface component that is pushed to the
caller. The conventions of [17] are used to describe representation caller. The conventions of [17] are used to describe representation
of long message lines. of long message lines.
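As a reading aid, the long-line convention used in the messages below can be undone with a few lines of code. The sketch assumes that the lines between the `<allOneLine>` tags are concatenated directly, with line terminators discarded; stripping indentation as well is an assumption made here for indented listings. The function name `unfold_all_one_line` is hypothetical.

```python
def unfold_all_one_line(lines):
    """Join the lines between <allOneLine> and </allOneLine> into one line."""
    out, buffer, inside = [], [], False
    for line in lines:
        stripped = line.strip()
        if stripped == "<allOneLine>":
            inside, buffer = True, []
        elif stripped == "</allOneLine>":
            # Concatenate the pieces directly, without inserting any spaces.
            out.append("".join(buffer))
            inside = False
        elif inside:
            buffer.append(stripped)
        else:
            out.append(line)
    return out
```

Applied to the Contact header field of message 1, this yields a single Contact line carrying the complete GRUU.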
A Recording App B A Recording App B
|(1) INVITE | | |(1) INVITE | |
|----------------------->| | |----------------------->| |
| |(2) INVITE | | |(2) INVITE |
| |----------------------->| | |----------------------->|
| |(3) 200 OK | | |(3) 200 OK |
| |<-----------------------| | |<-----------------------|
|(4) 200 OK | | |(4) 200 OK | |
|<-----------------------| | |<-----------------------| |
|(5) ACK | | |(5) ACK | |
skipping to change at page 31, line 39 skipping to change at page 31, line 41
|<-----------------------| | |<-----------------------| |
|(13) NOTIFY | | |(13) NOTIFY | |
|----------------------->| | |----------------------->| |
|(14) 200 OK | | |(14) 200 OK | |
|<-----------------------| | |<-----------------------| |
|(15) HTTP POST | | |(15) HTTP POST | |
|----------------------->| | |----------------------->| |
|(16) 200 OK | | |(16) 200 OK | |
|<-----------------------| | |<-----------------------| |
Figure 7 Figure 6
First, the caller, A, sends an INVITE to setup a call (message 1). First, the caller, A, sends an INVITE to set up a call (message 1).
Since the caller supports the framework, and can handle presentation Since the caller supports the framework and can handle presentation-
capable user interface components, it includes the Supported header capable user interface components, it includes the Supported header
field indicating that the GRUU extension and the Target-Dialog header field indicating that the GRUU extension and the Target-Dialog header
field are understood, Allow indicating that REFER is understood, and field are understood, the Allow header field indicating that REFER is
a Contact header field that includes the "schemes" header field understood, and the Contact header field that includes the "schemes"
parameter. header field parameter.
INVITE sips:B@example.com SIP/2.0 INVITE sip:B@example.com SIP/2.0
Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9zz8 Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9zz8
From: Caller <sip:A@example.com>;tag=kkaz- From: Caller <sip:A@example.com>;tag=kkaz-
To: Callee <sip:B@example.org> To: Callee <sip:B@example.org>
Call-ID: fa77as7dad8-sd98ajzz@host.example.com Call-ID: fa77as7dad8-sd98ajzz@host.example.com
CSeq: 1 INVITE CSeq: 1 INVITE
Max-Forwards: 70 Max-Forwards: 70
Supported: gruu, tdialog Supported: gruu, tdialog
Allow: INVITE, OPTIONS, BYE, CANCEL, ACK, REFER Allow: INVITE, OPTIONS, BYE, CANCEL, ACK, REFER
Accept: application/sdp, text/html Accept: application/sdp, text/html
<allOneLine> <allOneLine>
Contact: <sips:A@example.com;opaque=urn:uuid:f81d4f Contact: <sip:A@example.com;gr=urn:uuid:f81d4fae
ae-7dec-11d0-a765-00a0c91e6bf6;grid=99a>;schemes="http,sip,sips" -7dec-11d0-a765-00a0c91e6bf6>;schemes="http,sip"
</allOneLine> </allOneLine>
Content-Length: ... Content-Length: ...
Content-Type: application/sdp Content-Type: application/sdp
--SDP not shown-- --SDP not shown--
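An application deciding whether it can push a presentation-capable component might inspect the header fields of message 1 along the following lines. This is a simplified sketch, not a full SIP parser: the function name `supports_presentation_ui`, the dictionary representation of the header fields, and the regular expression for the "schemes" parameter are all assumptions made for illustration.

```python
import re


def supports_presentation_ui(headers, scheme="http"):
    """Return True if the headers advertise what message 1 advertises.

    `headers` maps header-field names to their unfolded values.
    """
    supported = {t.strip().lower() for t in headers.get("Supported", "").split(",")}
    allowed = {m.strip().upper() for m in headers.get("Allow", "").split(",")}
    # Pull the quoted list out of the Contact "schemes" header field parameter.
    match = re.search(r';schemes="([^"]*)"', headers.get("Contact", ""))
    schemes = {s.strip().lower() for s in match.group(1).split(",")} if match else set()
    return (
        {"gruu", "tdialog"} <= supported
        and "REFER" in allowed
        and scheme in schemes
    )
```

For the header fields of message 1 the check succeeds; for a response such as B's 200 OK, which carries none of these header fields, it fails.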
The proxy acts as a recording server, and forwards the INVITE to the The proxy acts as a recording server, and forwards the INVITE to the
called party (message 2). It strips the Record-Route it would called party (message 2). It strips the Record-Route it would
normally insert due to the presence of the GRUU in the INVITE: normally insert due to the presence of the GRUU in the INVITE:
INVITE sips:B@pc.example.com SIP/2.0 INVITE sip:B@pc.example.com SIP/2.0
Via: SIP/2.0/TLS app.example.com;branch=z9hG4bK97sh Via: SIP/2.0/TLS app.example.com;branch=z9hG4bK97sh
Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9zz8 Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9zz8
From: Caller <sip:A@example.com>;tag=kkaz- From: Caller <sip:A@example.com>;tag=kkaz-
To: Callee <sip:B@example.org> To: Callee <sip:B@example.org>
Call-ID: fa77as7dad8-sd98ajzz@host.example.com Call-ID: fa77as7dad8-sd98ajzz@host.example.com
CSeq: 1 INVITE CSeq: 1 INVITE
Max-Forwards: 70 Max-Forwards: 70
Supported: gruu, tdialog Supported: gruu, tdialog
Allow: INVITE, OPTIONS, BYE, CANCEL, ACK, REFER Allow: INVITE, OPTIONS, BYE, CANCEL, ACK, REFER
Accept: application/sdp, text/html Accept: application/sdp, text/html
<allOneLine> <allOneLine>
Contact: <sips:A@example.com;opaque=urn:uuid:f81d4f Contact: <sip:A@example.com;gr=urn:uuid:f81d4fae
ae-7dec-11d0-a765-00a0c91e6bf6;grid=99a>;schemes="http,sip,sips" -7dec-11d0-a765-00a0c91e6bf6>;schemes="http,sip"
</allOneLine> </allOneLine>
Content-Length: ... Content-Length: ...
Content-Type: application/sdp Content-Type: application/sdp
--SDP not shown-- --SDP not shown--
B accepts the call with a 200 OK (message 3). It does not support B accepts the call with a 200 OK (message 3). It does not support
the framework, and so the various header fields are not present. the framework, so the various header fields are not present.
SIP/2.0 200 OK SIP/2.0 200 OK
Via: SIP/2.0/TLS app.example.com;branch=z9hG4bK97sh Via: SIP/2.0/TLS app.example.com;branch=z9hG4bK97sh
Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9zz8 Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9zz8
From: Caller <sip:A@example.com>;tag=kkaz- From: Caller <sip:A@example.com>;tag=kkaz-
To: Callee <sip:B@example.com>;tag=7777 To: Callee <sip:B@example.com>;tag=7777
Call-ID: fa77as7dad8-sd98ajzz@host.example.com Call-ID: fa77as7dad8-sd98ajzz@host.example.com
CSeq: 1 INVITE CSeq: 1 INVITE
Contact: <sips:B@pc.example.com> Contact: <sip:B@pc.example.com>
Content-Length: ... Content-Length: ...
Content-Type: application/sdp Content-Type: application/sdp
--SDP not shown-- --SDP not shown--
This 200 OK is passed back to the caller (message 4): This 200 OK is passed back to the caller (message 4):
SIP/2.0 200 OK SIP/2.0 200 OK
Record-Route: <sips:app.example.com;lr> Record-Route: <sip:app.example.com;lr>
Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9zz8 Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9zz8
From: Caller <sip:A@example.com>;tag=kkaz- From: Caller <sip:A@example.com>;tag=kkaz-
To: Callee <sip:B@example.com>;tag=7777 To: Callee <sip:B@example.com>;tag=7777
Call-ID: fa77as7dad8-sd98ajzz@host.example.com Call-ID: fa77as7dad8-sd98ajzz@host.example.com
CSeq: 1 INVITE CSeq: 1 INVITE
Contact: <sips:B@pc.example.com> Contact: <sip:B@pc.example.com>
Content-Length: ... Content-Length: ...
Content-Type: application/sdp Content-Type: application/sdp
--SDP not shown-- --SDP not shown--
The caller generates an ACK (message 5). The caller generates an ACK (message 5).
ACK sips:B@pc.example.com ACK sip:B@pc.example.com
Route: <sips:app.example.com;lr> Route: <sip:app.example.com;lr>
Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9zz9 Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9zz9
From: Caller <sip:A@example.com>;tag=kkaz- From: Caller <sip:A@example.com>;tag=kkaz-
To: Callee <sip:B@example.com>;tag=7777 To: Callee <sip:B@example.com>;tag=7777
Call-ID: fa77as7dad8-sd98ajzz@host.example.com Call-ID: fa77as7dad8-sd98ajzz@host.example.com
CSeq: 1 ACK CSeq: 1 ACK
The ACK is forwarded to the called party (message 6). The ACK is forwarded to the called party (message 6).
ACK sips:B@pc.example.com ACK sip:B@pc.example.com
Via: SIP/2.0/TLS app.example.com;branch=z9hG4bKh7s Via: SIP/2.0/TLS app.example.com;branch=z9hG4bKh7s
Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9zz9 Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9zz9
From: Caller <sip:A@example.com>;tag=kkaz- From: Caller <sip:A@example.com>;tag=kkaz-
To: Callee <sip:B@example.com>;tag=7777 To: Callee <sip:B@example.com>;tag=7777
Call-ID: fa77as7dad8-sd98ajzz@host.example.com Call-ID: fa77as7dad8-sd98ajzz@host.example.com
CSeq: 1 ACK CSeq: 1 ACK
Now, the application decides to push a user interface component to Now, the application decides to push a user interface component to
user A. So, it sends it a REFER request (message 7): user A. So, it sends it a REFER request (message 7):
<allOneLine> <allOneLine>
REFER sips:A@example.com;opaque=urn:uuid:f81d4f REFER sip:A@example.com;gr=urn:uuid:f81d4fae
ae-7dec-11d0-a765-00a0c91e6bf6;grid=99a SIP/2.0 -7dec-11d0-a765-00a0c91e6bf6 SIP/2.0
</allOneLine> </allOneLine>
Refer-To: https://app.example.com/script.pl Refer-To: https://app.example.com/script.pl
Target-Dialog: fa77as7dad8-sd98ajzz@host.example.com Target-Dialog: fa77as7dad8-sd98ajzz@host.example.com
;remote-tag=7777;local-tag=kkaz- ;remote-tag=7777;local-tag=kkaz-
Require: tdialog Require: tdialog
Via: SIP/2.0/TLS app.example.com;branch=z9hG4bK9zh6 Via: SIP/2.0/TLS app.example.com;branch=z9hG4bK9zh6
Max-Forwards: 70 Max-Forwards: 70
From: Recorder Application <sip:app.example.com>;tag=jhgf From: Recorder Application <sip:app.example.com>;tag=jhgf
<allOneLine> <allOneLine>
To: Caller <sips:A@example.com;opaque=urn:uuid:f81d4f To: Caller <sip:A@example.com;gr=urn:uuid:f81d4fae
ae-7dec-11d0-a765-00a0c91e6bf6;grid=99a> -7dec-11d0-a765-00a0c91e6bf6>
</allOneLine> </allOneLine>
Require: tdialog Require: tdialog
Allow: INVITE, OPTIONS, BYE, CANCEL, ACK, REFER Allow: INVITE, OPTIONS, BYE, CANCEL, ACK, REFER
Call-ID: 66676776767@app.example.com Call-ID: 66676776767@app.example.com
CSeq: 1 REFER CSeq: 1 REFER
Event: refer Event: refer
Contact: <sips:app.example.com> Contact: <sip:app.example.com>
The REFER request goes to itself, where the Request URI is resolved Since the recording application is the same as the authoritative
to the registered contact of A, and then sent there. The REFER is proxy for the domain, it resolves the Request URI to the registered
answered by a 200 OK (message 8). contact of A, and then sends the request there. The REFER is answered by a 200 OK
(message 8).
SIP/2.0 200 OK SIP/2.0 200 OK
Via: SIP/2.0/TLS app.example.com;branch=z9hG4bK9zh6 Via: SIP/2.0/TLS app.example.com;branch=z9hG4bK9zh6
From: Recorder Application <sip:app.example.com>;tag=jhgf From: Recorder Application <sip:app.example.com>;tag=jhgf
To: Caller <sip:A@example.com>;tag=pqoew To: Caller <sip:A@example.com>;tag=pqoew
Call-ID: 66676776767@app.example.com Call-ID: 66676776767@app.example.com
Supported: gruu, tdialog Supported: gruu, tdialog
Allow: INVITE, OPTIONS, BYE, CANCEL, ACK, REFER Allow: INVITE, OPTIONS, BYE, CANCEL, ACK, REFER
<allOneLine> <allOneLine>
Contact: <sips:A@example.com;opaque=urn:uuid:f81d4f Contact: <sip:A@example.com;gr=urn:uuid:f81d4fae
ae-7dec-11d0-a765-00a0c91e6bf6;grid=99a>;schemes="http,sip,sips" -7dec-11d0-a765-00a0c91e6bf6>;schemes="http,sip"
</allOneLine> </allOneLine>
CSeq: 1 REFER CSeq: 1 REFER
User A sends a NOTIFY (message 9): User A sends a NOTIFY (message 9):
NOTIFY sips:app.example.com SIP/2.0 NOTIFY sip:app.example.com SIP/2.0
Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9320394238995 Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9320394238995
To: Recorder Application <sip:app.example.com>;tag=jhgf To: Recorder Application <sip:app.example.com>;tag=jhgf
From: Caller <sip:A@example.com>;tag=pqoew From: Caller <sip:A@example.com>;tag=pqoew
Call-ID: 66676776767@app.example.com Call-ID: 66676776767@app.example.com
CSeq: 1 NOTIFY CSeq: 1 NOTIFY
Max-Forwards: 70 Max-Forwards: 70
<allOneLine> <allOneLine>
Contact: <sips:A@example.com;opaque=urn:uuid:f81d4f Contact: <sip:A@example.com;gr=urn:uuid:f81d4fae
ae-7dec-11d0-a765-00a0c91e6bf6;grid=99a>;schemes="http,sip,sips" -7dec-11d0-a765-00a0c91e6bf6>;schemes="http,sip"
</allOneLine> </allOneLine>
Event: refer;id=93809824 Event: refer;id=93809824
Subscription-State: active;expires=3600 Subscription-State: active;expires=3600
Content-Type: message/sipfrag;version=2.0 Content-Type: message/sipfrag;version=2.0
Content-Length: 20 Content-Length: 20
SIP/2.0 100 Trying SIP/2.0 100 Trying
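The Content-Length of 20 in message 9 can be checked by counting the bytes of the sipfrag body. The sketch assumes the body is the single status line terminated by one CRLF; the helper name `sipfrag_body` is hypothetical.

```python
def sipfrag_body(status_line):
    """Return the message/sipfrag body bytes for one status line."""
    return (status_line + "\r\n").encode("ascii")


body = sipfrag_body("SIP/2.0 100 Trying")
content_length = len(body)  # 18 characters of status line plus CR and LF
```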
And the recording server responds with a 200 OK (message 10) And the recording server responds with a 200 OK (message 10).
SIP/2.0 200 OK SIP/2.0 200 OK
Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9320394238995 Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9320394238995
To: Recorder Application <sip:app.example.com>;tag=jhgf To: Recorder Application <sip:app.example.com>;tag=jhgf
From: Caller <sip:A@example.com>;tag=pqoew From: Caller <sip:A@example.com>;tag=pqoew
Call-ID: 66676776767@app.example.com Call-ID: 66676776767@app.example.com
CSeq: 1 NOTIFY CSeq: 1 NOTIFY
The REFER request contained a Target-Dialog header field parameter The REFER request contained a Target-Dialog header field parameter
with a valid dialog identifier. Furthermore, all of the signaling with a valid dialog identifier. Furthermore, all of the signaling
skipping to change at page 35, line 46 skipping to change at page 36, line 5
application. It then acts on the Refer-To URI, fetching the script application. It then acts on the Refer-To URI, fetching the script
from app.example.com (message 11). The response, message 12, from app.example.com (message 11). The response, message 12,
contains a web application that the user can click on to enable contains a web application that the user can click on to enable
recording. Because the client executed the URL in the Refer-To, it recording. Because the client executed the URL in the Refer-To, it
generates another NOTIFY to the application, informing it of the generates another NOTIFY to the application, informing it of the
successful response (message 13). This is answered with a 200 OK successful response (message 13). This is answered with a 200 OK
(message 14). When the user clicks on the link (message 15), the (message 14). When the user clicks on the link (message 15), the
results are posted to the server, and an updated display is provided results are posted to the server, and an updated display is provided
(message 16). (message 16).
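The Target-Dialog check that user A applies to the REFER (message 7) can be sketched as a comparison against the dialogs the UA already holds. The sketch assumes that local-tag and remote-tag are expressed from the point of view of the UA receiving the request, which matches the values shown in message 7. The function names and the tuple representation of a dialog are hypothetical, and the parsing is deliberately minimal.

```python
def parse_target_dialog(value):
    """Split a Target-Dialog value into (call_id, local_tag, remote_tag)."""
    parts = [p.strip() for p in value.split(";")]
    call_id, params = parts[0], {}
    for p in parts[1:]:
        name, _, val = p.partition("=")
        params[name.strip().lower()] = val.strip()
    return call_id, params.get("local-tag"), params.get("remote-tag")


def matches_existing_dialog(value, dialogs):
    """Return True if the header identifies a dialog in `dialogs`.

    `dialogs` is a set of (call_id, local_tag, remote_tag) tuples held by the UA.
    """
    return parse_target_dialog(value) in dialogs
```

In the example call flow, A holds the dialog with Call-ID fa77as7dad8-sd98ajzz@host.example.com, local tag kkaz-, and remote tag 7777, so the REFER is matched to that dialog.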
12. Security Considerations 13. Security Considerations
There are many security considerations associated with this There are many security considerations associated with this
framework. It allows applications in the network to instantiate user framework. It allows applications in the network to instantiate user
interface components on a client device. Such instantiations need to interface components on a client device. Such instantiations need to
be from authenticated applications, and also need to be authorized to be from authenticated applications, and also need to be authorized to
place a UI into the client. Indeed, the stronger requirement is place a UI into the client. Indeed, the stronger requirement is
authorization. It is not so important to know that name of the authorization. It is not as important to know the name of the
provider of the application, but rather, that the provider is provider of the application, as it is to know that the provider is
authorized to instantiate components. authorized to instantiate components.
This specification defines specific authorization techniques and This specification defines specific authorization techniques and
requirements. Automatic authorization is granted if the application requirements. Automatic authorization is granted if the application
can prove that it is on the call path, or is trusted by an element on can prove that it is on the call path, or is trusted by an element on
the call path. As documented above, this can be accompished by the the call path. As documented above, this can be accomplished by the
use of cryptographically random dialog identifiers and the usage of use of cryptographically random dialog identifiers and the usage of
sips for message confidentiality. It is RECOMMENDED that sips be SIPS for message confidentiality. It is RECOMMENDED that SIPS be
implemented by user agents compliant to this specification. This implemented by user agents compliant to this specification. This
does not represent a change from the requirements in RFC 3261. does not represent a change from the requirements in RFC 3261.
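As an illustration of cryptographically random dialog identifiers, a user agent could draw its Call-ID and local tag from the operating system's secure random source. This is a sketch only: the function name `new_dialog_identifiers` and the chosen sizes (128 bits for the Call-ID, 64 bits for the tag) are assumptions, not values taken from this specification.

```python
import secrets


def new_dialog_identifiers(host):
    """Return a (call_id, local_tag) pair that is hard for an attacker to guess."""
    # secrets draws from the operating system's cryptographic random source.
    call_id = f"{secrets.token_hex(16)}@{host}"
    local_tag = secrets.token_hex(8)
    return call_id, local_tag
```

Because an off-path application cannot guess these values, presenting them in a Target-Dialog header field is evidence that the sender is on the call path or is trusted by an element that is, provided the signaling itself was kept confidential.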
13. IANA Considerations
There are no IANA considerations associated with this specification.
14. Contributors 14. Contributors
This document was produced as a result of discussions amongst the This document was produced as a result of discussions amongst the
application interaction design team. All members of this team application interaction design team. All members of this team
contributed significantly to the ideas embodied in this document. contributed significantly to the ideas embodied in this document.
The members of this team were: The members of this team were:
Eric Burger Eric Burger
Cullen Jennings Cullen Jennings
Robert Fairlie-Cuninghame Robert Fairlie-Cuninghame
15. Acknowledgements 15. Acknowledgements
The authors would like to thank Martin Dolly and Rohan Mahy for their The authors would like to thank Martin Dolly and Rohan Mahy for their
input and comments. Thanks to Allison Mankin for her support of this input and comments. Thanks to Allison Mankin for her support of this
work. work.
16. References 16. References
16.1 Normative References 16.1. Normative References
[1] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., [1] Bradner, S., "Key words for use in RFCs to Indicate Requirement
Levels", BCP 14, RFC 2119, March 1997.
[2] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A.,
Peterson, J., Sparks, R., Handley, M., and E. Schooler, "SIP: Peterson, J., Sparks, R., Handley, M., and E. Schooler, "SIP:
Session Initiation Protocol", RFC 3261, June 2002. Session Initiation Protocol", RFC 3261, June 2002.
[2] Rosenberg, J. and H. Schulzrinne, "Reliability of Provisional [3] Rosenberg, J. and H. Schulzrinne, "Reliability of Provisional
Responses in Session Initiation Protocol (SIP)", RFC 3262, Responses in Session Initiation Protocol (SIP)", RFC 3262,
June 2002. June 2002.
[3] Roach, A., "Session Initiation Protocol (SIP)-Specific Event [4] Roach, A., "Session Initiation Protocol (SIP)-Specific Event
Notification", RFC 3265, June 2002. Notification", RFC 3265, June 2002.
[4] McGlashan, S., Lucas, B., Porter, B., Rehor, K., Burnett, D., [5] McGlashan, S., Lucas, B., Porter, B., Rehor, K., Burnett, D.,
Carter, J., Ferrans, J., and A. Hunt, "Voice Extensible Markup Carter, J., Ferrans, J., and A. Hunt, "Voice Extensible Markup
Language (VoiceXML) Version 2.0", W3C CR CR-voicexml20- Language (VoiceXML) Version 2.0", W3C CR CR-voicexml20-
20030220, February 2003. 20030220, February 2003.
[5] Rosenberg, J., Schulzrinne, H., and P. Kyzivat, "Indicating [6] Rosenberg, J., Schulzrinne, H., and P. Kyzivat, "Indicating
User Agent Capabilities in the Session Initiation Protocol User Agent Capabilities in the Session Initiation Protocol
(SIP)", RFC 3840, August 2004. (SIP)", RFC 3840, August 2004.
[6] Sparks, R., "The Session Initiation Protocol (SIP) Refer [7] Sparks, R., "The Session Initiation Protocol (SIP) Refer
Method", RFC 3515, April 2003. Method", RFC 3515, April 2003.
[7] Burger, E., "A Session Initiation Protocol (SIP) Event Package [8] Burger, E. and M. Dolly, "A Session Initiation Protocol (SIP)
for Key Press Stimulus (KPML)", draft-ietf-sipping-kpml-07 Event Package for Key Press Stimulus (KPML)", RFC 4730,
(work in progress), December 2004. November 2006.
[8] Rosenberg, J., "Obtaining and Using Globally Routable User
Agent (UA) URIs (GRUU) in the Session Initiation Protocol
(SIP)", draft-ietf-sip-gruu-03 (work in progress),
February 2005.
[9] Rosenberg, J., "Request Authorization through Dialog [9] Rosenberg, J., "Obtaining and Using Globally Routable User
Identification in the Session Initiation Protocol (SIP)", Agent URIs (GRUUs) in the Session Initiation Protocol (SIP)",
draft-ietf-sip-target-dialog-00 (work in progress), April 2005. RFC 5627, October 2009.
[10] Camarillo, G., "The Internet Assigned Number Authority (IANA) [10] Rosenberg, J., "Request Authorization through Dialog
Header Field Parameter Registry for the Session Initiation Identification in the Session Initiation Protocol (SIP)",
Protocol (SIP)", BCP 98, RFC 3968, December 2004. RFC 4538, June 2006.
16.2 Informative References 16.2. Informative References
[11] Peterson, J. and C. Jennings, "Enhancements for Authenticated [11] Peterson, J. and C. Jennings, "Enhancements for Authenticated
Identity Management in the Session Initiation Protocol (SIP)", Identity Management in the Session Initiation Protocol (SIP)",
draft-ietf-sip-identity-05 (work in progress), May 2005. RFC 4474, August 2006.
[12] Day, M., Rosenberg, J., and H. Sugano, "A Model for Presence [12] Day, M., Rosenberg, J., and H. Sugano, "A Model for Presence
and Instant Messaging", RFC 2778, February 2000. and Instant Messaging", RFC 2778, February 2000.
[13] Jennings, C., Peterson, J., and M. Watson, "Private Extensions [13] Jennings, C., Peterson, J., and M. Watson, "Private Extensions
to the Session Initiation Protocol (SIP) for Asserted Identity to the Session Initiation Protocol (SIP) for Asserted Identity
within Trusted Networks", RFC 3325, November 2002. within Trusted Networks", RFC 3325, November 2002.
[14] Rosenberg, J., "A Framework for Conferencing with the Session [14] Rosenberg, J., "A Framework for Conferencing with the Session
Initiation Protocol", Initiation Protocol (SIP)", RFC 4353, February 2006.
draft-ietf-sipping-conferencing-framework-05 (work in
progress), May 2005.
[15] Rosenberg, J., Schulzrinne, H., and P. Kyzivat, "Caller [15] Rosenberg, J., Schulzrinne, H., and P. Kyzivat, "Caller
Preferences for the Session Initiation Protocol (SIP)", Preferences for the Session Initiation Protocol (SIP)",
RFC 3841, August 2004. RFC 3841, August 2004.
[16] Rosenberg, J., "An INVITE Inititiated Dialog Event Package for [16] Rosenberg, J., Schulzrinne, H., and R. Mahy, "An INVITE-
the Session Initiation Protocol (SIP)", Initiated Dialog Event Package for the Session Initiation
draft-ietf-sipping-dialog-package-06 (work in progress), Protocol (SIP)", RFC 4235, November 2005.
April 2005.
[17] Sparks, R., "Session Initiation Protocol Torture Test [17] Sparks, R., Hawrylyshen, A., Johnston, A., Rosenberg, J., and
Messages", draft-ietf-sipping-torture-tests-07 (work in H. Schulzrinne, "Session Initiation Protocol (SIP) Torture Test
progress), May 2005. Messages", RFC 4475, May 2006.
[18] Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson, [18] Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson,
"RTP: A Transport Protocol for Real-Time Applications", "RTP: A Transport Protocol for Real-Time Applications", STD 64,
RFC 3550, July 2003. RFC 3550, July 2003.
[19] Schulzrinne, H. and S. Petrack, "RTP Payload for DTMF Digits, [19] Schulzrinne, H. and T. Taylor, "RTP Payload for DTMF Digits,
Telephony Tones and Telephony Signals", RFC 2833, May 2000. Telephony Tones, and Telephony Signals", RFC 4733, December
2006.
[20] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with [20] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with
Session Description Protocol (SDP)", RFC 3264, June 2002. Session Description Protocol (SDP)", RFC 3264, June 2002.
[21] Rosenberg, J., "A Session Initiation Protocol (SIP) Event [21] Rosenberg, J., "A Session Initiation Protocol (SIP) Event
Package for Registrations", RFC 3680, March 2004. Package for Registrations", RFC 3680, March 2004.
Author's Address Author's Address
Jonathan Rosenberg Jonathan Rosenberg
Cisco Systems Cisco Systems
600 Lanidex Plaza 600 Lanidex Plaza
Parsippany, NJ 07054 Parsippany, NJ 07054
US US
Phone: +1 973 952-5000 Phone: +1 973 952-5000
Email: jdrosen@cisco.com EMail: jdrosen@cisco.com
URI: http://www.jdrosen.net URI: http://www.jdrosen.net
Intellectual Property Statement
The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.
Disclaimer of Validity
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Copyright Statement
Copyright (C) The Internet Society (2005). This document is subject
to the rights, licenses and restrictions contained in BCP 78, and
except as set forth therein, the authors retain all their rights.
Acknowledgment
Funding for the RFC Editor function is currently provided by the
Internet Society.