CLUE WG R. Even
Internet-Draft Huawei Technologies
Intended status: Standards Track J. Lennox
Expires: January 23, 2015 Vidyo
July 22, 2014
Mapping RTP streams to CLUE media captures
draft-ietf-clue-rtp-mapping-02.txt
Abstract
This document describes mechanisms and recommended practice for
mapping RTP media streams defined in SDP to CLUE media captures.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on January 23, 2015.
Copyright Notice
Copyright (c) 2014 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
   1.  Introduction
   2.  Terminology
   3.  RTP topologies for CLUE
   4.  Mapping CLUE Capture Encodings to RTP streams
     4.1.  Review of current directions in MMUSIC, AVText and AVTcore
     4.2.  Requirements of a solution
     4.3.  Static Mapping
     4.4.  Dynamic mapping
     4.5.  Recommendations
   5.  Application to CLUE Media Requirements
   6.  Examples
     6.1.  Static mapping
     6.2.  Dynamic Mapping
   7.  Acknowledgements
   8.  IANA Considerations
   9.  Security Considerations
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Authors' Addresses
1. Introduction
Telepresence systems can send and receive multiple media streams.
The CLUE framework [I-D.ietf-clue-framework] defines a Media Capture
as a source of media, such as from one or more Capture Devices. A
Media Capture (MC) may be the source of one or more Media streams. A
Media Capture may also be constructed from other Media streams. A
middlebox can express conceptual Media Captures that it constructs
from Media streams it receives.
SIP offer/answer [RFC3264] uses SDP [RFC4566] to describe the
RTP [RFC3550] media streams. Each RTP stream has a unique SSRC within
its RTP session. The content of an RTP stream is created by an
encoder in the endpoint. This may be original content from a camera
or content created by an intermediary device such as an MCU.
This document makes recommendations, for the telepresence
architecture, about how RTP and RTCP streams should be encoded and
transmitted, and how their relation to CLUE Media Captures should be
communicated. The proposed solution supports multiple RTP
topologies.
2. Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119] and
indicate requirement levels for compliant RTP implementations.
3. RTP topologies for CLUE
The typical RTP topologies used by Telepresence systems specify
different behaviors for RTP and RTCP distribution. A number of RTP
topologies are described in
[I-D.westerlund-avtcore-rtp-topologies-update]. For telepresence,
the relevant topologies include point-to-point, as well as media
mixers, media-switching mixers, and source-projection middleboxes.
In the point-to-point topology, one peer communicates directly with a
single peer over unicast. There can be one or more RTP sessions, and
each RTP session can carry multiple RTP streams identified by their
SSRCs. All SSRCs are recognized by the peers based on the
information in the RTCP SDES reports, which include the CNAME and
SSRC of the sent RTP streams. There are different point-to-point use
cases, as specified in the CLUE use cases [RFC7205]. The symmetric
and asymmetric use cases may differ: while in the symmetric use case
the typical mapping is from a media capture device to a render device
(e.g., camera to monitor), in the asymmetric case a render device may
receive different capture information (an RTP stream from a different
camera) if it has fewer rendering devices (monitors). In some cases,
a CLUE session which, at a high level, is point-to-point may
nonetheless have RTP that is best described by one of the mixer
topologies below. For example, a CLUE endpoint can produce
composited or switched captures for use by a receiving system with
fewer displays than the sender has cameras.
In the Media Mixer topology, the peers communicate only with the
mixer. The mixer provides mixed or composited media streams, using
its own SSRC for the sent streams. There are two cases here. In the
first case, the mixer may have separate RTP sessions with each peer
(similar to the point-to-point topology), terminating the RTCP
sessions on the mixer; this is known as Topo-RTCP-Terminating MCU in
[RFC5117]. In the second case, the mixer can use a conference-wide
RTP session similar to RFC 5117's Topo-mixer or Topo-Video-switching.
The major difference is that for the second case, the mixer uses
conference-wide RTP sessions, and distributes the RTCP reports to all
the RTP session participants, enabling them to learn all the CNAMEs
and SSRCs of the participants and know the contributing source or
sources (CSRCs) of the original streams from the RTP header. In the
first case, the Mixer terminates the RTCP and the participants cannot
know all the available sources based on the RTCP information. The
conference roster information, including conference participants,
endpoints, media, and media-id (SSRC), can be made available using
the conference event package [RFC4575] element.
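As an informal illustration of how, in the conference-wide case, a
participant can learn the original sources from the RTP header, the
following Python sketch extracts the SSRC and the CSRC list from the
RTP fixed header [RFC3550]. It only illustrates the header layout
and is not part of any CLUE mechanism.
   import struct

   def parse_ssrc_and_csrcs(packet: bytes):
       """Return (ssrc, [csrc, ...]) from an RTP packet (RFC 3550)."""
       if len(packet) < 12:
           raise ValueError("too short for an RTP fixed header")
       first_octet = packet[0]
       if first_octet >> 6 != 2:          # version field must be 2
           raise ValueError("not RTP version 2")
       csrc_count = first_octet & 0x0F    # CC field: number of CSRCs
       ssrc = struct.unpack_from("!I", packet, 8)[0]
       csrcs = [struct.unpack_from("!I", packet, 12 + 4 * i)[0]
                for i in range(csrc_count)]
       return ssrc, csrcs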
In the Media-Switching Mixer topology, the peer-to-mixer
communication is unicast with mixer RTCP feedback. It is
conceptually similar to the compositing mixer described in the
previous paragraph, except that rather than compositing or mixing
multiple sources, the mixer provides one or more conceptual sources,
selecting one source at a time from the original sources. The Mixer
creates a conference-wide RTP session by sharing remote SSRC values
as CSRCs with all conference participants.
In the Source-Projection middlebox topology, the peer-to-mixer
communication is unicast with mixer RTCP feedback. Every potential
sender in the conference has a source which is "projected" by the
mixer into every other session in the conference; thus, every
original source is maintained with an independent RTP identity to
every receiver, keeping separate decoding state and its original
RTCP SDES information. However, RTCP is terminated at the mixer,
which might also perform reliability, repair, rate adaptation, or
transcoding on the stream. Senders' SSRCs may be renumbered by the
mixer. The mixer may turn the projected sources on and off at any
time, depending on which sources it thinks are most relevant for the
receiver; this is the primary reason why this topology must act as an
RTP mixer rather than as a translator, as otherwise the disabled
sources would appear to have enormous packet loss. Source switching
is accomplished through this process of enabling and disabling
projected sources, with the higher-level semantics (the reason each
RTP stream is being sent) assigned externally.
The above topologies demonstrate two major RTP/RTCP behaviors:
1. The mixer may either use the source SSRC when forwarding RTP
packets or use its own created SSRC. Either way, the mixer
distributes all RTCP information to all participants, creating one or
more conference-wide RTP sessions. This allows the participants to
learn the available RTP sources in each RTP session. The original
source information will be in the SSRC or in the CSRC, depending on
the topology. The point-to-point case behaves like this.
2. The mixer terminates the RTCP from the source, creating separate
RTP sessions with the peers. In this case the participants will
not receive the source SSRC in the CSRC. Since this is usually a
mixer topology, the source information is available from the SIP
conference event package [RFC4575]. Subscribing to the
conference event package allows each participant to know the
SSRCs of all sources in the conference.
4. Mapping CLUE Capture Encodings to RTP streams
The different topologies described in Section 3 support different
SSRC distribution models and RTP stream multiplexing points.
Most video conferencing systems today can separate multiple RTP
sources by placing them into separate RTP sessions using the SDP
description. For example, main and slides video sources are
separated into separate RTP sessions based on the content attribute
[RFC4796]. This solution works straightforwardly if the multiplexing
point is at the UDP transport level, where each RTP stream uses a
separate RTP session. This will also be true for mapping the RTP
streams to Media Capture Encodings if each media capture encoding
uses a separate RTP session and the consumer can identify it based on
the receiving RTP port. In this case, SDP only needs to label the
RTP session with an identifier that identifies the media capture in
the CLUE description, and the mapping does not change even if the
stream within the RTP session is switched using the same or a
different SSRC (the multiplexing is not at the SSRC level).
Even though session multiplexing is supported by CLUE, for scaling
reasons CLUE recommends using SSRC multiplexing in a single session
or in multiple sessions. We therefore need to look at how to map RTP
streams to Media Capture Encodings when SSRC multiplexing is used.
When looking at SSRC multiplexing we can see that in various
topologies, the SSRC behavior may be different:
1. The SSRCs are static (assigned by the MCU/Mixer), and there is an
SSRC for each media capture encoding defined in the CLUE
protocol. Source information may be conveyed using CSRC, or, in
the case of topo-RTCP-Terminating MCU, is not conveyed.
2. The SSRCs are dynamic, representing the original source and are
relayed by the Mixer/MCU to the participants.
In the above two cases the MCU/Mixer may create an advertisement with
a virtual room capture scene.
Another case we can envision is that the MCU / Mixer relays all the
capture scenes from all advertisements to all consumers. This means
that the advertisement will include multiple capture scenes, each
representing a separate TelePresence room with its own coordinate
system.
4.1. Review of current directions in MMUSIC, AVText and AVTcore
Editor's note: This section provides an overview of the RFCs and
drafts that can be used as a base for a mapping solution. This section
is for information only, and if the WG thinks that it is the right
direction, the authors will bring the required work to the relevant
WGs.
The solution needs to also support the simulcast case where more than
one RTP session may be advertised for a Media Capture. Support of
such simulcast is out of scope for CLUE.
When looking at the available tools for supporting SSRC multiplexing,
based on current work in MMUSIC, AVTcore, and AVText, the following
documents are considered relevant.
The SDP source attribute [RFC5576] provides mechanisms to describe
specific attributes of RTP sources based on their SSRC.
Negotiation of generic image attributes in SDP [RFC6236] provides the
means to negotiate the image size. The image attribute can be used
to offer different image parameters such as size, but in order to
offer multiple RTP streams with different resolutions it uses a
separate RTP session for each image option.
[I-D.westerlund-avtcore-max-ssrc] proposes a signaling solution for
how to use multiple SSRCs within one RTP session.
[I-D.westerlund-avtext-rtcp-sdes-srcname] provides an extension that
may be sent in SDP, as RTCP SDES information, or as an RTP header
extension, and that uniquely identifies a single media source. It
defines a hierarchical order for the SRCNAME parameter that can be
used, for example, to describe multiple resolutions from the same
source (see Section 5.1 of [I-D.westerlund-avtcore-rtp-simulcast]).
Still, all the examples use RTP session multiplexing.
Other documents reviewed by the authors but not currently used in a
proposed solution include:
[I-D.lennox-mmusic-sdp-source-selection] specifies how participants
in a multimedia session can request a specific source from a remote
party.
[I-D.westerlund-avtext-codec-operation-point](expired) extends the
codec control messages by specifying messages that let participants
communicate a set of codec configuration parameters.
Using the above documents, it is possible to negotiate the maximum
number of received and sent RTP streams inside an RTP session (m-line
or bundled m-line). This also allows offering allowed combinations
of codec configurations using different payload type numbers.
Examples: max-recv-ssrc:{96:2 & 97:3}, where 96 and 97 are different
payload type numbers, or max-send-ssrc:{*:4}.
In the next sections, the document proposes mechanisms to map the RTP
streams to media captures.
4.2. Requirements of a solution
This section lists, more briefly, the requirements a media
architecture for Clue telepresence needs to achieve, summarizing the
discussion of previous sections. In this section, RFC 2119 [RFC2119]
language refers to requirements on a solution, not an implementation;
thus, requirements keywords are not written in capital letters.
Media-1: It must not be necessary for a Clue session to use more than
a single transport flow for transport of a given media type (video or
audio).
Media-2: It must, however, be possible for a Clue session to use
multiple transport flows for a given media type where it is
considered valuable (for example, for distributed media, or
differential quality-of-service).
Media-3: It must be possible for a Clue endpoint or MCU to
simultaneously send sources corresponding to static, to composited,
and to switched captures, in the same transport flow. (Any given
device might not necessarily be able to send all of these source types;
but for those that can, it must be possible for them to be sent
simultaneously.)
Media-4: It must be possible for an original source to move among
switched captures (i.e. at one time be sent for one switched capture,
and at a later time be sent for another one).
Media-5: It must be possible for a source to be placed into a
switched capture even if the source is a "late joiner", i.e. was
added to the conference after the receiver requested the switched
source.
Media-6: Whenever a given source is assigned to a switched capture,
it must be immediately possible for a receiver to determine the
switched capture it corresponds to, and thus that any previous source
is no longer being mapped to that switched capture.
Media-7: It must be possible for a receiver to identify the actual
source that is currently being mapped to a switched capture, and
correlate it with out-of-band (non-Clue) information such as rosters.
Media-8: It must be possible for a source to move among switched
captures without requiring a refresh of decoder state (e.g., for
video, a fresh I-frame), when this is unnecessary. However, it must
also be possible for a receiver to indicate when a refresh of decoder
state is in fact necessary.
Media-9: If a given source is being sent on the same transport flow
for more than one reason (e.g. if it corresponds to more than one
switched capture at once, or to a static capture), it should be
possible for a sender to send only one copy of the source.
Media-10: On the network, media flows should, as much as possible,
look and behave like currently-defined usages of existing protocols;
established semantics of existing protocols must not be redefined.
Media-11: The solution should seek to minimize the processing burden
for boxes that distribute media to decoding hardware.
Media-12: If multiple sources from a single synchronization context
are being sent simultaneously, it must be possible for a receiver to
associate and synchronize them properly, even for sources that are
mapped to switched captures.
4.3. Static Mapping
Static mapping is widely used in current MCU implementations. It is
also common in the point-to-point symmetric use case, when both
endpoints have the same capabilities. For capture encodings with
static SSRCs, it is most straightforward to indicate this mapping
outside the media stream, in the CLUE or SDP signaling. An SDP
source attribute [RFC5576] can be used to associate CLUE encodings,
identified by appIds [I-D.even-mmusic-application-token], with SSRCs
in SDP. Each SSRC will have an appId value that is also specified as
an attribute of the CLUE media capture. The provider advertisement
could, if it wished, use the same SSRC for media capture encodings
that are mutually exclusive. (This would be natural, for example, if
two advertised captures are implemented as different configurations
of the same physical camera, zoomed in or out.) Section 6 provides
an example of an SDP offer and CLUE advertisement.
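As a rough, non-normative illustration of this association, the
following Python sketch builds an SSRC-to-capture table from the
"a=ssrc ... appId" lines of an SDP offer (as in Section 6.1) and from
an appId-to-capture mapping taken from the CLUE advertisement; the
function and parameter names are illustrative only.
   import re

   def ssrc_to_capture(sdp: str, adv: dict) -> dict:
       """Map SSRC -> capture name for the static case.

       'adv' is assumed to hold appId -> capture name pairs taken
       from the CLUE advertisement, e.g. {1: "VC0", 2: "VC1"}.
       """
       pattern = r"^a=ssrc:(\d+)\s+appId:(\d+)"
       ssrc_to_appid = {}
       for m in re.finditer(pattern, sdp, re.MULTILINE):
           ssrc_to_appid[int(m.group(1))] = int(m.group(2))
       # Keep only SSRCs whose appId appears in the advertisement.
       return {ssrc: adv[appid]
               for ssrc, appid in ssrc_to_appid.items()
               if appid in adv}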
4.4. Dynamic mapping
Dynamic mapping is done by tagging each media packet with the appId.
This means that a receiver immediately knows how to interpret
received media, even when an unknown SSRC is seen. As long as the
media carries a known appId, it can be assumed that this media stream
will replace the stream currently being received with that appId.
This gives significant advantages to switching latency, as a switch
between sources can be achieved without any form of negotiation with
the receiver.
However, the disadvantage of using an appId in the stream is that it
introduces additional processing costs for every media packet, as
appIds are scoped only within one hop (i.e., within a cascaded
conference an appId that is used from the source to the first MCU is
not meaningful between two MCUs, or between an MCU and a receiver),
and so they may need to be added or modified at every stage.
If the appIds are chosen by the media sender, offering a particular
capture encoding to multiple recipients with the same ID requires the
sender to produce only one version of the stream (assuming outgoing
payload type numbers match). This reduces the cost in the multicast
case, although it does not necessarily help in the switching case.
An additional issue with putting appIds in the RTP packets comes from
cases where a non-CLUE aware endpoint is being switched by an MCU to
a CLUE endpoint. In this case, we may require up to an additional 12
bytes in the RTP header, which may push a media packet over the MTU.
However, as the MTU on either side of the switch may not match, it is
possible that this could happen even without adding extra data into
the RTP packet. The 12 additional bytes per packet could also be a
significant bandwidth increase in the case of very low bandwidth
audio codecs.
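For illustration only, the following Python sketch shows how a
receiver might extract an appId carried as a one-byte RTP header
extension [RFC5285]. The extension ID (1, matching the a=extmap line
in Section 6.1) and the single-byte appId encoding are assumptions of
this sketch, not something defined by this document.
   import struct

   ONE_BYTE_PROFILE = 0xBEDE   # RFC 5285 one-byte header extensions
   APPID_EXT_ID = 1            # assumed negotiated a=extmap ID

   def extract_appid(packet: bytes):
       """Return the appId from the header extension, or None."""
       first = packet[0]
       if first >> 6 != 2 or not (first & 0x10):  # version 2, X bit
           return None
       offset = 12 + 4 * (first & 0x0F)   # skip fixed header + CSRCs
       profile, length = struct.unpack_from("!HH", packet, offset)
       if profile != ONE_BYTE_PROFILE:
           return None
       data = packet[offset + 4:offset + 4 + 4 * length]
       i = 0
       while i < len(data):
           if data[i] == 0:                # padding byte
               i += 1
               continue
           ext_id = data[i] >> 4
           ext_len = (data[i] & 0x0F) + 1
           if ext_id == APPID_EXT_ID:
               return data[i + 1]          # assume a one-byte appId
           i += 1 + ext_len
       return None
When such an appId is found, the receiver would treat the packet's
SSRC as the stream currently carrying that appId, replacing any
previously mapped SSRC, as described above.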
4.5. Recommendations
The recommendation is that endpoints MUST support both the static
declaration of capture encoding SSRCs and the appId in every media
packet. For low-bandwidth situations this may be considered
excessive overhead, in which case endpoints MAY support an approach
where appIds are sent selectively. The SDP offer MAY specify the
SSRC mapping to capture encoding. In the case of static mapping
topologies there will be no need to use the header extensions in the
media, since the SSRC for the RTP stream will remain the same during
the call unless a collision is detected and handled according to
[RFC5576]. If the topology uses dynamic mapping, then the appId will
be used to indicate the RTP stream switch for the media capture. In
this case the SDP description may be used to negotiate the initial
SSRC, but this is left to the implementation. Note that if the SSRC
is defined explicitly in the SDP, SSRC collisions should be handled
as in [RFC5576].
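A minimal, illustrative sketch of the combined receiver behavior
recommended here (the helper names and table layout are hypothetical,
not defined by this document): if a packet carries an appId header
extension the dynamic mapping is updated; otherwise the static SSRC
mapping from the SDP is consulted.
   def capture_for_packet(ssrc, appid, static_map, dynamic_map, adv):
       """Return the media capture a packet belongs to.

       static_map:  SSRC -> appId (from SDP a=ssrc lines, Section 4.3)
       dynamic_map: appId -> current SSRC (updated from extensions)
       adv:         appId -> capture name (from the advertisement)
       """
       if appid is not None:            # dynamic mapping (Section 4.4)
           dynamic_map[appid] = ssrc
           return adv.get(appid)
       if ssrc in static_map:           # static mapping (Section 4.3)
           return adv.get(static_map[ssrc])
       return None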
5. Application to CLUE Media Requirements
The requirements section (Section 4.2) lists a number of requirements
that are believed to be necessary for a CLUE RTP mapping. The
solutions described in this document are believed to meet these
requirements, though some of them are only possible for some of the
topologies. (Since the requirements are generally of the form "it
must be possible for a sender to do something", this is adequate; a
sender which wishes to perform that action needs to choose a topology
which allows the behavior it wants.)
In this section we address only those requirements where the
topologies or the association mechanisms treat the requirements
differently.
Media-4: It must be possible for an original source to move among
switched captures (i.e. at one time be sent for one switched capture,
and at a later time be sent for another one).
This applies naturally for static sources with a Switched Mixer. For
dynamic sources with a Source-Projecting middlebox, this just
requires the appId in the header extension element to be updated
appropriately.
Media-6: Whenever a given source is transmitted for a switched
capture, it must be immediately possible for a receiver to determine
the switched capture it corresponds to, and thus that any previous
source is no longer being mapped to that switched capture.
For a Switched Mixer, this applies naturally. For a Source-
Projecting middlebox, this is done based on the appId.
Media-7: It must be possible for a receiver to identify the original
source that is currently being mapped to a switched capture, and
correlate it with out-of-band (non-Clue) information such as rosters.
For a Switched Mixer, this is done based on the CSRC, if the mixer is
providing CSRCs; for a Source-Projecting middlebox, this is done
based on the SSRC.
Media-8: It must be possible for a source to move among switched
captures without requiring a refresh of decoder state (e.g., for
video, a fresh I-frame), when this is unnecessary. However, it must
also be possible for a receiver to indicate when a refresh of decoder
state is in fact necessary.
This can be done by a Source-Projecting middlebox, but not by a
Switching Mixer. The last requirement can be accomplished through an
FIR message [RFC5104], though potentially a faster mechanism (not
requiring a round-trip time from the receiver) would be preferable.
Media-9: If a given source is being sent on the same transport flow
to satisfy more than one capture (e.g. if it corresponds to more than
one switched capture at once, or to a static capture as well as a
switched capture), it should be possible for a sender to send only
one copy of the source.
For a Source-Projecting middlebox, this can be accomplished by
sending multiple dynamic appIds for the same source; this can also be
done for an environment with a hybrid of mixer topologies and static
and dynamic captures, described below in Section 6. It is not
possible for static captures from a Switched Mixer.
Media-12: If multiple sources from a single synchronization context
are being sent simultaneously, it must be possible for a receiver to
associate and synchronize them properly, even for sources that are
mapped to switched captures.
For a Mixed or Switched Mixer topology, receivers will see only a
single synchronization context (CNAME), corresponding to the mixer.
For a Source-Projecting middlebox, separate projecting sources keep
separate synchronization contexts based on their original CNAMEs,
thus allowing independent synchronization of sources from independent
rooms without needing global synchronization. In hybrid cases,
however (e.g. if audio is mixed), all sources which need to be
synchronized with the mixed audio must get the same CNAME (and thus a
mixer-provided timebase) as the mixed audio.
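As an informal illustration of how a receiver can use synchronization
contexts, the short Python sketch below groups incoming SSRCs by
their RTCP CNAME; each resulting group shares one synchronization
context. The input format is an assumption of this sketch.
   from collections import defaultdict

   def group_by_cname(sdes_reports):
       """Group SSRCs by CNAME (one synchronization context each).

       'sdes_reports' is assumed to be an iterable of (ssrc, cname)
       pairs learned from RTCP SDES.
       """
       contexts = defaultdict(list)
       for ssrc, cname in sdes_reports:
           contexts[cname].append(ssrc)
       return dict(contexts)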
6. Examples
It is possible for a CLUE device to send multiple instances of the
topologies in Section 3 simultaneously. For example, an MCU which
uses a traditional audio bridge with switched video would be a Mixer
topology for audio, but a Switched Mixer or a Source-Projecting
middlebox for video. In the latter case, the audio could be sent as
a static source, whereas the video could be dynamic.
More notably, it is possible for an endpoint to send the same sources
both for static and for dynamic captures. Consider the example in
[I-D.ietf-clue-framework], where an endpoint can provide three
cameras (VC0, VC1, and VC2) for left, center, and right views, as
well as a switched view (VC3) of the loudest panel.
It is possible for a consumer to request both the (VC0 - VC2) set and
VC3. It is worth noting that the content of VC3 is, at all times,
exactly the content of one of VC0, VC1, or VC2. Thus, if the sender
uses the Source-Projecting middlebox topology for VC3, providing VC3
to a consumer that receives these three sources would not require any
additional media traffic beyond just sending (VC0 - VC2).
In this case, the advertiser could describe VC0, VC1, and VC2 in its
initial advertisement or SDP with static SSRCs, whereas VC3 would
need to be dynamic. The role of VC3 would move among VC0, VC1, or
VC2, indicated by the appId RTP header extension on those streams'
RTP packets.
6.1. Static mapping
Using the video capture example from the framework document
[I-D.ietf-clue-framework] for a three-camera system with four
monitors, one of which is for the presentation stream:
o VC0 - (the camera-left camera stream), purpose=main, switched:no
o VC1 - (the center camera stream), purpose=main, switched:no
o VC2 - (the camera-right camera stream), purpose=main, switched:no
o VC3 - (the loudest panel stream), purpose=main, switched:yes
o VC4 - (the loudest panel stream with PiPs), purpose=main,
composed=true, switched:yes
o VC5 - (the zoomed-out view of all people in the room),
purpose=main, composed=no, switched:no
o VC6 - (presentation stream), purpose=presentation, switched:no
Where the physical simultaneity information is:
{VC0, VC1, VC2, VC3, VC4, VC6}
{VC0, VC2, VC5, VC6}
In this case the provider can send up to six simultaneous streams and
receive four, one for each monitor. This is the maximum case, but it
can be further limited by the capture scene entries, which may
propose sending only three camera streams and one presentation.
Still, since the consumer can select any media captures that can be
sent simultaneously, the offer will specify six streams, where VC5
and VC1 use the same resource and are mutually exclusive.
In the Advertisement there may be two capture scenes:
The first capture scene may have four entries:
{VC0, VC1, VC2}
{VC3}
{VC4}
{VC5}
The second capture scene will have the following single entry:
{VC6}
We assume that an intermediary will need to look at CLUE if it wants
to make better decisions on handling specific RTP streams, for
example based on them being part of the same capture scene, so the
SDP will not group streams by capture scene.
The SIP offer may be (the extmap attribute is included to support
dynamic mapping):
m=video 49200 RTP/AVP 99
a=extmap:1 urn:ietf:params:rtp-hdrext:appId
a=rtpmap:99 H264/90000
a=max-send-ssrc:{*:6}
a=max-recv-ssrc:{*:4}
a=ssrc:11111 appId:1
a=ssrc:22222 appId:2
a=ssrc:33333 appId:3
a=ssrc:44444 appId:4
a=ssrc:55555 appId:5
a=ssrc:66666 appId:6
In the above example the provider can send up to five main streams
and one presentation stream.
Note that VC1 and VC5 have the same SSRC since they are using the
same resource.
o VC0 - (the camera-left camera stream), purpose=main, switched:no,
appId=1
o VC1 - (the center camera stream), purpose=main, switched:no,
appId=2
o VC2 - (the camera-right camera stream), purpose=main, switched:no,
appId=3
o VC3 - (the loudest panel stream), purpose=main, switched:yes,
appId=4
o VC4 - (the loudest panel stream with PiPs), purpose=main,
composed=true, switched:yes, appId=5
o VC5 - (the zoomed-out view of all people in the room),
purpose=main, composed=no, switched:no, appId=2
o VC6 - (presentation stream), purpose=presentation, switched:no,
appId=6
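Summarizing this example as data (illustrative only, using Python
syntax), a consumer would derive the following two lookup tables from
the SDP offer and the advertisement above:
   # From the SDP offer (SSRC -> appId).
   ssrc_to_appid = {11111: 1, 22222: 2, 33333: 3,
                    44444: 4, 55555: 5, 66666: 6}

   # From the advertisement (appId -> capture).  VC1 and VC5 share
   # appId 2 (and SSRC 22222) because they are mutually exclusive.
   appid_to_capture = {1: "VC0", 2: "VC1 or VC5", 3: "VC2",
                       4: "VC3", 5: "VC4", 6: "VC6"}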
Note: we could allocate an SSRC for each MC, which would not require
the indirection of using an appId. However, if a switch to dynamic
mapping is done, this would require providing information about which
SSRC is being replaced by the new one.
6.2. Dynamic Mapping
For topologies that use dynamic mapping there is no need to provide
the SSRCs in the offer (they may not be available if the offers from
the sources do not include them when connecting to the mixer or
remote endpoint). In this case the appId will be specified first in
the advertisement.
The SIP offer may be:
m=video 49200 RTP/AVP 99
a=extmap:1 urn:ietf:params:rtp-hdrext:appId
a=rtpmap:99 H264/90000
a=max-send-ssrc:{*:4}
a=max-recv-ssrc:{*:4}
This will work for SSRC multiplexing. It is not clear how it will
work when RTP streams of the same media type are not multiplexed in a
single RTP session, i.e., how to know which encoding will be in which
of the different RTP sessions.
7. Acknowledgements
The authors would like to thank Allyn Romanow and Paul Witty for
contributing text to this work.
8. IANA Considerations
TBD
9. Security Considerations
TBD.
10. References
10.1. Normative References
[I-D.even-mmusic-application-token]
Even, R., Lennox, J., and Q. Wu, "The Session Description
Protocol (SDP) Application Token Attribute", draft-even-
mmusic-application-token-03 (work in progress), April
2014.
[I-D.ietf-clue-framework]
Romanow, A., Duckworth, M., Pepperell, A., and B. Baldino,
"Framework for Telepresence Multi-Streams", draft-ietf-
clue-framework-16 (work in progress), June 2014.
[I-D.lennox-clue-rtp-usage]
Lennox, J., Witty, P., and A. Romanow, "Real-Time
Transport Protocol (RTP) Usage for Telepresence Sessions",
draft-lennox-clue-rtp-usage-04 (work in progress), June
2012.
[I-D.westerlund-avtcore-max-ssrc]
Westerlund, M., Burman, B., and F. Jansson, "Multiple
Synchronization sources (SSRC) in RTP Session Signaling",
draft-westerlund-avtcore-max-ssrc-02 (work in progress),
July 2012.
[I-D.westerlund-avtext-rtcp-sdes-srcname]
Westerlund, M., Burman, B., and P. Sandgren, "RTCP SDES
Item SRCNAME to Label Individual Sources", draft-
westerlund-avtext-rtcp-sdes-srcname-01 (work in progress),
July 2012.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
10.2. Informative References
[I-D.lennox-mmusic-sdp-source-selection]
Lennox, J. and H. Schulzrinne, "Mechanisms for Media
Source Selection in the Session Description Protocol
(SDP)", draft-lennox-mmusic-sdp-source-selection-04 (work
in progress), March 2012.
[I-D.westerlund-avtcore-rtp-simulcast]
Westerlund, M., Burman, B., Lindqvist, M., and F. Jansson,
"Using Simulcast in RTP sessions", draft-westerlund-
avtcore-rtp-simulcast-04 (work in progress), July 2014.
[I-D.westerlund-avtcore-rtp-topologies-update]
Westerlund, M. and S. Wenger, "RTP Topologies", draft-
westerlund-avtcore-rtp-topologies-update-01 (work in
progress), October 2012.
[I-D.westerlund-avtext-codec-operation-point]
Westerlund, M., Burman, B., and L. Hamm, "Codec Operation
Point RTCP Extension", draft-westerlund-avtext-codec-
operation-point-00 (work in progress), March 2012.
[RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
with Session Description Protocol (SDP)", RFC 3264, June
2002.
[RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V.
Jacobson, "RTP: A Transport Protocol for Real-Time
Applications", STD 64, RFC 3550, July 2003.
[RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
Description Protocol", RFC 4566, July 2006.
[RFC4575] Rosenberg, J., Schulzrinne, H., and O. Levin, "A Session
Initiation Protocol (SIP) Event Package for Conference
State", RFC 4575, August 2006.
[RFC4796] Hautakorpi, J. and G. Camarillo, "The Session Description
Protocol (SDP) Content Attribute", RFC 4796, February
2007.
[RFC5104] Wenger, S., Chandra, U., Westerlund, M., and B. Burman,
"Codec Control Messages in the RTP Audio-Visual Profile
with Feedback (AVPF)", RFC 5104, February 2008.
[RFC5117] Westerlund, M. and S. Wenger, "RTP Topologies", RFC 5117,
January 2008.
[RFC5285] Singer, D. and H. Desineni, "A General Mechanism for RTP
Header Extensions", RFC 5285, July 2008.
[RFC5576] Lennox, J., Ott, J., and T. Schierl, "Source-Specific
Media Attributes in the Session Description Protocol
(SDP)", RFC 5576, June 2009.
[RFC6236] Johansson, I. and K. Jung, "Negotiation of Generic Image
Attributes in the Session Description Protocol (SDP)", RFC
6236, May 2011.
[RFC7205] Romanow, A., Botzko, S., Duckworth, M., and R. Even, "Use
Cases for Telepresence Multistreams", RFC 7205, April
2014.
Authors' Addresses
Roni Even
Huawei Technologies
Tel Aviv
Israel
Email: roni.even@mail01.huawei.com
Jonathan Lennox
Vidyo, Inc.
433 Hackensack Avenue
Seventh Floor
Hackensack, NJ 07601
US
Email: jonathan@vidyo.com