draft-ietf-avtext-rtp-grouping-taxonomy-03.txt   draft-ietf-avtext-rtp-grouping-taxonomy-04.txt 
Network Working Group J. Lennox Network Working Group J. Lennox
Internet-Draft Vidyo Internet-Draft Vidyo
Intended status: Informational K. Gross Intended status: Informational K. Gross
Expires: May 18, 2015 AVA Expires: July 20, 2015 AVA
S. Nandakumar S. Nandakumar
G. Salgueiro G. Salgueiro
Cisco Systems Cisco Systems
B. Burman B. Burman
Ericsson Ericsson
November 14, 2014 January 16, 2015
A Taxonomy of Grouping Semantics and Mechanisms for Real-Time Transport A Taxonomy of Grouping Semantics and Mechanisms for Real-Time Transport
Protocol (RTP) Sources Protocol (RTP) Sources
draft-ietf-avtext-rtp-grouping-taxonomy-03 draft-ietf-avtext-rtp-grouping-taxonomy-04
Abstract Abstract
The terminology about, and associations among, Real-Time Transport The terminology about, and associations among, Real-Time Transport
Protocol (RTP) sources can be complex and somewhat opaque. This Protocol (RTP) sources can be complex and somewhat opaque. This
document describes a number of existing and proposed relationships document describes a number of existing and proposed relationships
among RTP sources, and attempts to define common terminology for among RTP sources, and attempts to define common terminology for
discussing protocol entities and their relationships. discussing protocol entities and their relationships.
Status of This Memo Status of This Memo
skipping to change at page 1, line 41 skipping to change at page 1, line 41
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on May 18, 2015. This Internet-Draft will expire on July 20, 2015.
Copyright Notice Copyright Notice
Copyright (c) 2014 IETF Trust and the persons identified as the Copyright (c) 2015 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
skipping to change at page 2, line 22 skipping to change at page 2, line 22
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 4 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 4
2. Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . 4 2. Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.1. Media Chain . . . . . . . . . . . . . . . . . . . . . . . 4 2.1. Media Chain . . . . . . . . . . . . . . . . . . . . . . . 4
2.1.1. Physical Stimulus . . . . . . . . . . . . . . . . . . 8 2.1.1. Physical Stimulus . . . . . . . . . . . . . . . . . . 8
2.1.2. Media Capture . . . . . . . . . . . . . . . . . . . . 8 2.1.2. Media Capture . . . . . . . . . . . . . . . . . . . . 8
2.1.3. Raw Stream . . . . . . . . . . . . . . . . . . . . . 8 2.1.3. Raw Stream . . . . . . . . . . . . . . . . . . . . . 8
2.1.4. Media Source . . . . . . . . . . . . . . . . . . . . 8 2.1.4. Media Source . . . . . . . . . . . . . . . . . . . . 8
2.1.5. Source Stream . . . . . . . . . . . . . . . . . . . . 9 2.1.5. Source Stream . . . . . . . . . . . . . . . . . . . . 9
2.1.6. Media Encoder . . . . . . . . . . . . . . . . . . . . 9 2.1.6. Media Encoder . . . . . . . . . . . . . . . . . . . . 10
2.1.7. Encoded Stream . . . . . . . . . . . . . . . . . . . 10 2.1.7. Encoded Stream . . . . . . . . . . . . . . . . . . . 11
2.1.8. Dependent Stream . . . . . . . . . . . . . . . . . . 11 2.1.8. Dependent Stream . . . . . . . . . . . . . . . . . . 11
2.1.9. Media Packetizer . . . . . . . . . . . . . . . . . . 11 2.1.9. Media Packetizer . . . . . . . . . . . . . . . . . . 11
2.1.10. RTP Stream . . . . . . . . . . . . . . . . . . . . . 11 2.1.10. RTP Stream . . . . . . . . . . . . . . . . . . . . . 12
2.1.11. Media Redundancy . . . . . . . . . . . . . . . . . . 12 2.1.11. RTP-based Redundancy . . . . . . . . . . . . . . . . 13
2.1.12. Redundancy RTP Stream . . . . . . . . . . . . . . . . 13 2.1.12. Redundancy RTP Stream . . . . . . . . . . . . . . . . 13
2.1.13. Media Transport . . . . . . . . . . . . . . . . . . . 13 2.1.13. Media Transport . . . . . . . . . . . . . . . . . . . 13
2.1.14. Media Transport Sender . . . . . . . . . . . . . . . 14 2.1.14. Media Transport Sender . . . . . . . . . . . . . . . 14
2.1.15. Sent RTP Stream . . . . . . . . . . . . . . . . . . . 14 2.1.15. Sent RTP Stream . . . . . . . . . . . . . . . . . . . 15
2.1.16. Network Transport . . . . . . . . . . . . . . . . . . 15 2.1.16. Network Transport . . . . . . . . . . . . . . . . . . 15
2.1.17. Transported RTP Stream . . . . . . . . . . . . . . . 15 2.1.17. Transported RTP Stream . . . . . . . . . . . . . . . 15
2.1.18. Media Transport Receiver . . . . . . . . . . . . . . 15 2.1.18. Media Transport Receiver . . . . . . . . . . . . . . 15
2.1.19. Received RTP Stream . . . . . . . . . . . . . . . . . 15 2.1.19. Received RTP Stream . . . . . . . . . . . . . . . . . 15
2.1.20. Received Redundancy RTP Stream . . . . . . . . . . . 15 2.1.20. Received Redundancy RTP Stream . . . . . . . . . . . 16
2.1.21. Media Repair . . . . . . . . . . . . . . . . . . . . 15 2.1.21. RTP-based Repair . . . . . . . . . . . . . . . . . . 16
2.1.22. Repaired RTP Stream . . . . . . . . . . . . . . . . . 16 2.1.22. Repaired RTP Stream . . . . . . . . . . . . . . . . . 16
2.1.23. Media Depacketizer . . . . . . . . . . . . . . . . . 16 2.1.23. Media Depacketizer . . . . . . . . . . . . . . . . . 16
2.1.24. Received Encoded Stream . . . . . . . . . . . . . . . 16 2.1.24. Received Encoded Stream . . . . . . . . . . . . . . . 16
2.1.25. Media Decoder . . . . . . . . . . . . . . . . . . . . 16 2.1.25. Media Decoder . . . . . . . . . . . . . . . . . . . . 16
2.1.26. Received Source Stream . . . . . . . . . . . . . . . 17 2.1.26. Received Source Stream . . . . . . . . . . . . . . . 17
2.1.27. Media Sink . . . . . . . . . . . . . . . . . . . . . 17 2.1.27. Media Sink . . . . . . . . . . . . . . . . . . . . . 17
2.1.28. Received Raw Stream . . . . . . . . . . . . . . . . . 17 2.1.28. Received Raw Stream . . . . . . . . . . . . . . . . . 17
2.1.29. Media Render . . . . . . . . . . . . . . . . . . . . 17 2.1.29. Media Render . . . . . . . . . . . . . . . . . . . . 17
2.2. Communication Entities . . . . . . . . . . . . . . . . . 17 2.2. Communication Entities . . . . . . . . . . . . . . . . . 18
2.2.1. End Point . . . . . . . . . . . . . . . . . . . . . . 18 2.2.1. Endpoint . . . . . . . . . . . . . . . . . . . . . . 19
2.2.2. RTP Session . . . . . . . . . . . . . . . . . . . . . 19 2.2.2. RTP Session . . . . . . . . . . . . . . . . . . . . . 19
2.2.3. Participant . . . . . . . . . . . . . . . . . . . . . 20 2.2.3. Participant . . . . . . . . . . . . . . . . . . . . . 20
2.2.4. Multimedia Session . . . . . . . . . . . . . . . . . 20 2.2.4. Multimedia Session . . . . . . . . . . . . . . . . . 20
2.2.5. Communication Session . . . . . . . . . . . . . . . . 21 2.2.5. Communication Session . . . . . . . . . . . . . . . . 21
3. Concept Inter-Relations . . . . . . . . . . . . . . . . . . . 21 3. Concepts of Inter-Relations . . . . . . . . . . . . . . . . . 21
3.1. Synchronization Context . . . . . . . . . . . . . . . . . 22 3.1. Synchronization Context . . . . . . . . . . . . . . . . . 21
3.1.1. RTCP CNAME . . . . . . . . . . . . . . . . . . . . . 22 3.1.1. RTCP CNAME . . . . . . . . . . . . . . . . . . . . . 22
3.1.2. Clock Source Signaling . . . . . . . . . . . . . . . 22 3.1.2. Clock Source Signaling . . . . . . . . . . . . . . . 22
3.1.3. Implicitly via RtcMediaStream . . . . . . . . . . . . 23 3.1.3. Implicitly via RtcMediaStream . . . . . . . . . . . . 22
3.1.4. Explicitly via SDP Mechanisms . . . . . . . . . . . . 23 3.1.4. Explicitly via SDP Mechanisms . . . . . . . . . . . . 22
3.2. End Point . . . . . . . . . . . . . . . . . . . . . . . . 23 3.2. Endpoint . . . . . . . . . . . . . . . . . . . . . . . . 22
3.3. Participant . . . . . . . . . . . . . . . . . . . . . . . 23 3.3. Participant . . . . . . . . . . . . . . . . . . . . . . . 23
3.4. RtcMediaStream . . . . . . . . . . . . . . . . . . . . . 24 3.4. RtcMediaStream . . . . . . . . . . . . . . . . . . . . . 23
3.5. Single- and Multi-Session Transmission of Dependent 3.5. Single- and Multi-Session Transmission of Dependent
Streams . . . . . . . . . . . . . . . . . . . . . . . . . 24 Streams . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.6. Multi-Channel Audio . . . . . . . . . . . . . . . . . . . 25 3.6. Multi-Channel Audio . . . . . . . . . . . . . . . . . . . 24
3.7. Simulcast . . . . . . . . . . . . . . . . . . . . . . . . 25 3.7. Simulcast . . . . . . . . . . . . . . . . . . . . . . . . 24
3.8. Layered Multi-Stream . . . . . . . . . . . . . . . . . . 26 3.8. Layered Multi-Stream . . . . . . . . . . . . . . . . . . 25
3.9. RTP Stream Duplication . . . . . . . . . . . . . . . . . 28 3.9. RTP Stream Duplication . . . . . . . . . . . . . . . . . 27
3.10. Redundancy Format . . . . . . . . . . . . . . . . . . . . 28 3.10. Redundancy Format . . . . . . . . . . . . . . . . . . . . 27
3.11. RTP Retransmission . . . . . . . . . . . . . . . . . . . 29 3.11. RTP Retransmission . . . . . . . . . . . . . . . . . . . 28
3.12. Forward Error Correction . . . . . . . . . . . . . . . . 30 3.12. Forward Error Correction . . . . . . . . . . . . . . . . 29
3.13. RTP Stream Separation . . . . . . . . . . . . . . . . . . 32 3.13. RTP Stream Separation . . . . . . . . . . . . . . . . . . 31
3.14. Multiple RTP Sessions over one Media Transport . . . . . 33 3.14. Multiple RTP Sessions over one Media Transport . . . . . 32
4. Mapping from Existing Terms . . . . . . . . . . . . . . . . . 33 4. Mapping from Existing Terms . . . . . . . . . . . . . . . . . 32
4.1. Telepresence Terms . . . . . . . . . . . . . . . . . . . 33 4.1. Telepresence Terms . . . . . . . . . . . . . . . . . . . 32
4.1.1. Audio Capture . . . . . . . . . . . . . . . . . . . . 33 4.1.1. Audio Capture . . . . . . . . . . . . . . . . . . . . 32
4.1.2. Capture Device . . . . . . . . . . . . . . . . . . . 33 4.1.2. Capture Device . . . . . . . . . . . . . . . . . . . 32
4.1.3. Capture Encoding . . . . . . . . . . . . . . . . . . 33 4.1.3. Capture Encoding . . . . . . . . . . . . . . . . . . 32
4.1.4. Capture Scene . . . . . . . . . . . . . . . . . . . . 34 4.1.4. Capture Scene . . . . . . . . . . . . . . . . . . . . 33
4.1.5. Endpoint . . . . . . . . . . . . . . . . . . . . . . 34 4.1.5. Endpoint . . . . . . . . . . . . . . . . . . . . . . 33
4.1.6. Individual Encoding . . . . . . . . . . . . . . . . . 34 4.1.6. Individual Encoding . . . . . . . . . . . . . . . . . 33
4.1.7. Media Capture . . . . . . . . . . . . . . . . . . . . 34 4.1.7. Media Capture . . . . . . . . . . . . . . . . . . . . 33
4.1.8. Media Consumer . . . . . . . . . . . . . . . . . . . 34 4.1.8. Media Consumer . . . . . . . . . . . . . . . . . . . 33
4.1.9. Media Provider . . . . . . . . . . . . . . . . . . . 34 4.1.9. Media Provider . . . . . . . . . . . . . . . . . . . 33
4.1.10. Stream . . . . . . . . . . . . . . . . . . . . . . . 34 4.1.10. Stream . . . . . . . . . . . . . . . . . . . . . . . 33
4.1.11. Video Capture . . . . . . . . . . . . . . . . . . . . 34 4.1.11. Video Capture . . . . . . . . . . . . . . . . . . . . 33
4.2. Media Description . . . . . . . . . . . . . . . . . . . . 34 4.2. Media Description . . . . . . . . . . . . . . . . . . . . 33
4.3. Media Stream . . . . . . . . . . . . . . . . . . . . . . 35 4.3. Media Stream . . . . . . . . . . . . . . . . . . . . . . 34
4.4. Multimedia Conference . . . . . . . . . . . . . . . . . . 35 4.4. Multimedia Conference . . . . . . . . . . . . . . . . . . 34
4.5. Multimedia Session . . . . . . . . . . . . . . . . . . . 35 4.5. Multimedia Session . . . . . . . . . . . . . . . . . . . 34
4.6. Multipoint Control Unit (MCU) . . . . . . . . . . . . . . 35 4.6. Multipoint Control Unit (MCU) . . . . . . . . . . . . . . 34
4.7. Recording Device . . . . . . . . . . . . . . . . . . . . 35 4.7. Recording Device . . . . . . . . . . . . . . . . . . . . 34
4.8. RtcMediaStream . . . . . . . . . . . . . . . . . . . . . 36 4.8. RtcMediaStream . . . . . . . . . . . . . . . . . . . . . 35
4.9. RtcMediaStreamTrack . . . . . . . . . . . . . . . . . . . 36 4.9. RtcMediaStreamTrack . . . . . . . . . . . . . . . . . . . 35
4.10. RTP Sender . . . . . . . . . . . . . . . . . . . . . . . 36 4.10. RTP Sender . . . . . . . . . . . . . . . . . . . . . . . 35
4.11. RTP Session . . . . . . . . . . . . . . . . . . . . . . . 36 4.11. RTP Session . . . . . . . . . . . . . . . . . . . . . . . 35
4.12. SSRC . . . . . . . . . . . . . . . . . . . . . . . . . . 36 4.12. SSRC . . . . . . . . . . . . . . . . . . . . . . . . . . 35
5. Security Considerations . . . . . . . . . . . . . . . . . . . 36 5. Security Considerations . . . . . . . . . . . . . . . . . . . 35
6. Acknowledgement . . . . . . . . . . . . . . . . . . . . . . . 37 6. Acknowledgement . . . . . . . . . . . . . . . . . . . . . . . 36
7. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 37 7. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 36
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 37 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 36
9. Informative References . . . . . . . . . . . . . . . . . . . 37 9. Informative References . . . . . . . . . . . . . . . . . . . 36
Appendix A. Changes From Earlier Versions . . . . . . . . . . . 39 Appendix A. Changes From Earlier Versions . . . . . . . . . . . 38
A.1. Modifications Between WG Version -02 and -03 . . . . . . 39 A.1. Modifications Between WG Version -03 and -04 . . . . . . 38
A.2. Modifications Between WG Version -01 and -02 . . . . . . 39 A.2. Modifications Between WG Version -02 and -03 . . . . . . 39
A.3. Modifications Between WG Version -00 and -01 . . . . . . 40 A.3. Modifications Between WG Version -01 and -02 . . . . . . 39
A.4. Modifications Between Version -02 and -03 . . . . . . . . 41 A.4. Modifications Between WG Version -00 and -01 . . . . . . 40
A.5. Modifications Between Version -01 and -02 . . . . . . . . 41 A.5. Modifications Between Version -02 and -03 . . . . . . . . 40
A.6. Modifications Between Version -00 and -01 . . . . . . . . 41 A.6. Modifications Between Version -01 and -02 . . . . . . . . 41
A.7. Modifications Between Version -00 and -01 . . . . . . . . 41
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 41 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 41
1. Introduction 1. Introduction
The existing taxonomy of sources in RTP is often regarded as The existing taxonomy of sources in RTP is often regarded as
confusing and inconsistent. Consequently, a deep understanding of confusing and inconsistent. Consequently, a deep understanding of
how the different terms relate to each other becomes a real how the different terms relate to each other becomes a real
challenge. Frequently cited examples of this confusion are (1) how challenge. Frequently cited examples of this confusion are (1) how
different protocols that make use of RTP use the same terms to different protocols that make use of RTP use the same terms to
signify different things and (2) how the complexities addressed at signify different things and (2) how the complexities addressed at
skipping to change at page 6, line 5 skipping to change at page 6, line 5
It is also important to remember that this is a conceptual model. It is also important to remember that this is a conceptual model.
Thus real-world implementations may look different and have different Thus real-world implementations may look different and have different
structure. structure.
To provide a basic understanding of the relationships in the chain we To provide a basic understanding of the relationships in the chain we
below first introduce the concepts for the sender side (Figure 1). below first introduce the concepts for the sender side (Figure 1).
This covers physical stimulus until media packets are emitted onto This covers physical stimulus until media packets are emitted onto
the network. the network.
Physical Stimulus Physical Stimulus
| |
V V
+--------------------+ +--------------------+
| Media Capture | | Media Capture |
+--------------------+ +--------------------+
| |
Raw Stream Raw Stream
V V
+--------------------+ +--------------------+
| Media Source |<- Synchronization Timing | Media Source |<- Synchronization Timing
+--------------------+ +--------------------+
| |
Source Stream Source Stream
V V
+--------------------+ +--------------------+
| Media Encoder | | Media Encoder |
+--------------------+ +--------------------+
| |
Encoded Stream +-----------+ Encoded Stream +------------+
V | V V | V
+--------------------+ | +--------------------+ +--------------------+ | +----------------------+
| Media Packetizer | | | Media Redundancy | | Media Packetizer | | | RTP-based Redundancy |
+--------------------+ | +--------------------+ +--------------------+ | +----------------------+
| | | | | |
+------------+ Redundancy RTP Stream +------------+ Redundancy RTP Stream
Source RTP Stream | Source RTP Stream |
V V V V
+--------------------+ +--------------------+ +--------------------+ +--------------------+
| Media Transport | | Media Transport | | Media Transport | | Media Transport |
+--------------------+ +--------------------+ +--------------------+ +--------------------+
Figure 1: Sender Side Concepts in the Media Chain Figure 1: Sender Side Concepts in the Media Chain
In Figure 1 we have included a branched chain to cover the concepts In Figure 1 we have included a branched chain to cover the concepts
for using redundancy to improve the reliability of the transport. for using redundancy to improve the reliability of the transport.
The Media Transport concept is an aggregate that is decomposed below The Media Transport concept is an aggregate that is decomposed below
in Section 2.1.13. in Section 2.1.13.
Below we review a receiver media chain (Figure 2) matching the sender Below we review a receiver media chain (Figure 2) matching the sender
side, to look at the inverse transformations and their attempts to side, to look at the inverse transformations and their attempts to
skipping to change at page 7, line 8 skipping to change at page 7, line 8
be lossy compression and imperfect Media Transport. Note that the be lossy compression and imperfect Media Transport. Note that the
streams out of a reverse transformation, like the Source Stream out streams out of a reverse transformation, like the Source Stream out
the Media Decoder are in many cases not the same as the corresponding the Media Decoder are in many cases not the same as the corresponding
ones on the sender side, thus they are prefixed with a "Received" to ones on the sender side, thus they are prefixed with a "Received" to
denote a potentially modified version. The reason for not being the denote a potentially modified version. The reason for not being the
same lies in the transformations that can be of irreversible type. same lies in the transformations that can be of irreversible type.
For example, lossy source coding in the Media Encoder prevents the For example, lossy source coding in the Media Encoder prevents the
Source Stream out of the Media Decoder to be the same as the one fed Source Stream out of the Media Decoder to be the same as the one fed
into the Media Encoder. Other reasons include packet loss or late into the Media Encoder. Other reasons include packet loss or late
loss in the Media Transport transformation that even Media Repair, if loss in the Media Transport transformation that even RTP-based
used, fails to repair. It should be noted that some transformations Repair, if used, fails to repair. It should be noted that some
are not always present, like Media Repair that cannot operate without transformations are not always present, like RTP-based Repair that
Redundancy RTP Streams. cannot operate without Redundancy RTP Streams.
+--------------------+ +--------------------+ +--------------------+ +--------------------+
| Media Transport | | Media Transport | | Media Transport | | Media Transport |
+--------------------+ +--------------------+ +--------------------+ +--------------------+
| | | |
Received RTP Stream Received Redundancy RTP Stream Received RTP Stream Received Redundancy RTP Stream
| | | |
| +-------------------+ | +-------------------+
V V V V
+--------------------+ +--------------------+
| Media Repair | | RTP-based Repair |
+--------------------+ +--------------------+
| |
Repaired RTP Stream Repaired RTP Stream
V V
+--------------------+ +--------------------+
| Media Depacketizer | | Media Depacketizer |
+--------------------+ +--------------------+
| |
Received Encoded Stream Received Encoded Stream
V V
skipping to change at page 8, line 35 skipping to change at page 8, line 35
Characteristics: Characteristics:
o A Media Capture is identified either by hardware/manufacturer ID o A Media Capture is identified either by hardware/manufacturer ID
or via a session-scoped device identifier as mandated by the or via a session-scoped device identifier as mandated by the
application usage. application usage.
o A Media Capture can generate an Encoded Stream (Section 2.1.7) if o A Media Capture can generate an Encoded Stream (Section 2.1.7) if
the capture device support such a configuration. the capture device support such a configuration.
o The nature of the Media Capture may impose constraints on the
clock handling in some of the subsequent steps. For example, many
audio or video capture devices are not completely free in
selecting the sample rate.
2.1.3. Raw Stream 2.1.3. Raw Stream
The time progressing stream of digitally sampled information, usually The time progressing stream of digitally sampled information, usually
periodically sampled and provided by a Media Capture (Section 2.1.2). periodically sampled and provided by a Media Capture (Section 2.1.2).
A Raw Stream can also contain synthesized Media that may not require A Raw Stream can also contain synthesized Media that may not require
any explicit Media Capture, since it is already in an appropriate any explicit Media Capture, since it is already in an appropriate
digital form. digital form.
2.1.4. Media Source 2.1.4. Media Source
A Media Source is the logical source of a reference clock A Media Source is the logical source of a reference clock
synchronized, time progressing, digital media stream, called a Source synchronized, time progressing, digital media stream, called a Source
Stream (Section 2.1.5). This transformation takes one or more Raw Stream (Section 2.1.5). This transformation takes one or more Raw
Streams (Section 2.1.3) and provides a Source Stream as output. The Streams (Section 2.1.3) and provides a Source Stream as output. The
output is synchronized with a reference clock, which can be as simple output is synchronized with a reference clock (Section 3.1), which
as a system local wall clock or as complex as NTP synchronized. can be as simple as a system local wall clock or as complex as NTP
synchronized.
The output can be of different types. One type is directly The output can be of different types. One type is directly
associated with a particular Media Capture's Raw Stream. Others are associated with a particular Media Capture's Raw Stream. Others are
more conceptual sources, like an audio mix of multiple Raw Streams more conceptual sources, like an audio mix of multiple Source Streams
(Figure 3), a mixed selection of the three loudest inputs regarding (Figure 3). Mixing multiple streams typically requires that the
speech activity, a selection of a particular video based on the input streams are possible to relate in time, meaning that they have
current speaker, i.e. typically based on other Media Sources. to be Source Streams (Section 2.1.5) rather than Raw Streams. In the
below example, the generated Source Stream is a mix of the three
input Source Streams.
Raw Raw Raw Source Source Source
Stream Stream Stream Stream Stream Stream
| | | | | |
V V V V V V
+--------------------------+ +--------------------------+
| Media Source |<-- Reference Clock | Media Source |<-- Reference Clock
| Mixer | | Mixer |
+--------------------------+ +--------------------------+
| |
V V
Source Stream Source Stream
Figure 3: Conceptual Media Source in form of Audio Mixer Figure 3: Conceptual Media Source in form of Audio Mixer
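As an illustration only, and not taken from either draft revision, the mixer in Figure 3 can be sketched in Python. The sketch assumes that each input Source Stream is a mapping from a reference-clock timestamp to a sample value. The names SourceStream and mix_source_streams are hypothetical. It shows why the inputs need to be relatable in time: samples are combined only when they share a timestamp on the common reference clock.

```python
from typing import Dict, Iterable

# Hypothetical representation: reference-clock timestamp -> sample value.
SourceStream = Dict[int, float]


def mix_source_streams(inputs: Iterable[SourceStream]) -> SourceStream:
    """Sum the samples that share a reference-clock timestamp.

    A timestamp missing from one input contributes silence (0.0) for that input.
    """
    mixed: SourceStream = {}
    for stream in inputs:
        for timestamp, sample in stream.items():
            mixed[timestamp] = mixed.get(timestamp, 0.0) + sample
    # Return the mixed Source Stream ordered by timestamp.
    return dict(sorted(mixed.items()))


# Three input Source Streams, as in Figure 3.
a = {0: 0.5, 1: 0.25}
b = {0: 0.25, 2: 0.5}
c = {1: 0.25, 2: 0.25}
mixed = mix_source_streams([a, b, c])
```

Raw Streams would not fit this model, because their samples carry no timestamp on a shared reference clock and so could not be lined up before summing.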
Another possible example of a conceptual Media Source is a video
surveillance switch, where the input is multiple Source Streams from
different cameras, and the output is one of those Source Streams
based on some selection criteria, like a round-robin or based on some
video activity measure.
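A minimal sketch of the round-robin variant of such a switch follows, again as an illustration rather than anything defined by the draft. The function name round_robin_switch and the dwell parameter (the number of output intervals spent on each camera) are assumptions made for the example.

```python
from itertools import cycle, islice
from typing import Iterator, List


def round_robin_switch(camera_ids: List[str], dwell: int) -> Iterator[str]:
    """Yield the camera whose Source Stream is selected for each output interval.

    Each camera is selected for `dwell` consecutive intervals before the
    switch moves on to the next one.
    """
    if dwell < 1:
        raise ValueError("dwell must be at least 1")
    for camera in cycle(camera_ids):
        for _ in range(dwell):
            yield camera


# First seven selections for three cameras with a dwell of two intervals.
selection = list(islice(round_robin_switch(["cam1", "cam2", "cam3"], 2), 7))
```

A switch driven by a video activity measure would replace the cycle with a choice based on that measure, while the output would still be exactly one of the input Source Streams at any time.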
Characteristics: Characteristics:
o At any point, it can represent a physical captured source or o At any point, it can represent a physical captured source or
conceptual source. conceptual source.
2.1.5. Source Stream 2.1.5. Source Stream
A time progressing stream of digital samples that has been A time progressing stream of digital samples that has been
synchronized with a reference clock and comes from particular Media synchronized with a reference clock and comes from particular Media
Source (Section 2.1.4). Source (Section 2.1.4).
skipping to change at page 9, line 47 skipping to change at page 10, line 15
2.1.6. Media Encoder 2.1.6. Media Encoder
A Media Encoder is a transform that is responsible for encoding the A Media Encoder is a transform that is responsible for encoding the
media data from a Source Stream (Section 2.1.5) into another media data from a Source Stream (Section 2.1.5) into another
representation, usually more compact, that is output as an Encoded representation, usually more compact, that is output as an Encoded
Stream (Section 2.1.7). Stream (Section 2.1.7).
The Media Encoder step commonly includes pre-encoding The Media Encoder step commonly includes pre-encoding
transformations, such as scaling, resampling etc. The Media Encoder transformations, such as scaling, resampling etc. The Media Encoder
can have a significant number of configuration options that affects can have a significant number of configuration options that affects
the properties of the encoded stream. This include properties such the properties of the Encoded Stream. This include properties such
as bit-rate, start points for decoding, resolution, bandwidth or as bit-rate, start points for decoding, resolution, bandwidth or
other fidelity affecting properties. The actually used codec is also other fidelity affecting properties. The actually used codec is also
an important factor in many communication systems. an important factor in many communication systems.
Scalable Media Encoders need special attention as they produce Scalable Media Encoders need special attention as they produce
multiple outputs that are potentially of different types. A scalable multiple outputs that are potentially of different types. A scalable
Media Encoder takes one input Source Stream and encodes it into Media Encoder takes one input Source Stream and encodes it into
multiple output streams of two different types; at least one Encoded multiple output streams of two different types; at least one Encoded
Stream that is independently decodable and one or more Dependent Stream that is independently decodable and one or more Dependent
Streams (Section 2.1.8). Decoding requires at least one Encoded Streams (Section 2.1.8). Decoding requires at least one Encoded
skipping to change at page 10, line 38 skipping to change at page 11, line 7
There are also other variants of encoders, like so-called Multiple There are also other variants of encoders, like so-called Multiple
Description Coding (MDC). Such Media Encoder produce multiple Description Coding (MDC). Such Media Encoder produce multiple
independent and thus individually decodable Encoded Streams. independent and thus individually decodable Encoded Streams.
However, (logically) combining multiple of these Encoded Streams into However, (logically) combining multiple of these Encoded Streams into
a single Received Source Stream during decoding leads to an a single Received Source Stream during decoding leads to an
improvement in perceptual reproduced quality when compared to improvement in perceptual reproduced quality when compared to
decoding a single Encoded Stream. decoding a single Encoded Stream.
Creating multiple Encoded Streams from the same Source Stream, where Creating multiple Encoded Streams from the same Source Stream, where
the Encoded Streams are neither in a scalable nor in an MDC the Encoded Streams are neither in a scalable nor in an MDC
relationship is commonly utilized in simulcast environments. relationship is commonly utilized in Simulcast environments.
Characteristics: Characteristics:
o A Media Source can be multiply encoded by different Media Encoders o A Media Source can be multiply encoded by different Media Encoders
to provide various encoded representations. to provide various encoded representations.
2.1.7. Encoded Stream 2.1.7. Encoded Stream
A stream of time synchronized encoded media that can be independently A stream of time synchronized encoded media that can be independently
decoded. decoded.
skipping to change at page 11, line 20 skipping to change at page 11, line 36
2.1.8. Dependent Stream 2.1.8. Dependent Stream
A stream of time synchronized encoded media fragments that are A stream of time synchronized encoded media fragments that are
dependent on one or more Encoded Streams (Section 2.1.7) and zero or dependent on one or more Encoded Streams (Section 2.1.7) and zero or
more Dependent Streams to be possible to decode. more Dependent Streams to be possible to decode.
Characteristics: Characteristics:
o Each Dependent Stream has a set of dependencies. These o Each Dependent Stream has a set of dependencies. These
dependencies must be understood by the parties in a multi-media dependencies must be understood by the parties in a Multimedia
session that intend to use a Dependent Stream. Session that intend to use a Dependent Stream.
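The dependency relationship can be sketched as follows. This is a hypothetical model, not a mechanism defined by the draft: each stream name maps to the names of the streams it directly depends on, an Encoded Stream has an empty set, and is_decodable is an assumed helper that follows the dependencies transitively.

```python
from typing import Dict, FrozenSet, Set

# Hypothetical model: stream name -> names of streams it directly depends on.
# An Encoded Stream has no dependencies; a Dependent Stream has at least one.
Dependencies = Dict[str, FrozenSet[str]]


def is_decodable(name: str, deps: Dependencies, received: Set[str]) -> bool:
    """Return True if `name` and all its transitive dependencies were received."""
    pending = [name]
    visited: Set[str] = set()
    while pending:
        current = pending.pop()
        if current in visited:
            continue  # already checked; also guards against cyclic input
        visited.add(current)
        if current not in received or current not in deps:
            return False
        pending.extend(deps[current])
    return True


deps: Dependencies = {
    "base": frozenset(),                  # Encoded Stream
    "enh1": frozenset({"base"}),          # Dependent Stream
    "enh2": frozenset({"base", "enh1"}),  # Dependent Stream
}
```

With this model, a receiver that has "base" and "enh2" but not "enh1" cannot decode "enh2", which is why the parties need a shared understanding of each Dependent Stream's dependency set.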
2.1.9. Media Packetizer 2.1.9. Media Packetizer
The transformation of taking one or more Encoded (Section 2.1.7) or The transformation of taking one or more Encoded (Section 2.1.7) or
Dependent Streams (Section 2.1.8) and putting their content into one or Dependent Streams (Section 2.1.8) and putting their content into one or
more sequences of packets, normally RTP packets, and outputting Source more sequences of packets, normally RTP packets, and outputting Source
RTP Streams (Section 2.1.10). This step includes both generating RTP RTP Streams (Section 2.1.10). This step includes both generating RTP
payloads as well as RTP packets. payloads as well as RTP packets.
The Media Packetizer can use multiple inputs when producing a single The Media Packetizer can use multiple inputs when producing a single
skipping to change at page 11, line 43 skipping to change at page 12, line 10
(Section 3.5). (Section 3.5).
The Media Packetizer can also produce multiple RTP Streams, for The Media Packetizer can also produce multiple RTP Streams, for
example when Encoded and/or Dependent Streams are distributed over example when Encoded and/or Dependent Streams are distributed over
multiple RTP Streams. One example of this is MRMT packetization when multiple RTP Streams. One example of this is MRMT packetization when
using SVC (Section 3.5). using SVC (Section 3.5).
Characteristics: Characteristics:
o The Media Packetizer selects which Synchronization source(s) o The Media Packetizer selects which Synchronization source(s)
(SSRC) [RFC3550] are used in which RTP sessions. (SSRC) [RFC3550] are used in which RTP Sessions.
o Media Packetizer can combine multiple Encoded or Dependent Streams o Media Packetizer can combine multiple Encoded or Dependent Streams
into one or more RTP Streams. into one or more RTP Streams.
2.1.10. RTP Stream 2.1.10. RTP Stream
A stream of RTP packets containing media data, source or redundant. A stream of RTP packets containing media data, source or redundant.
The RTP Stream is identified by an SSRC belonging to a particular RTP The RTP Stream is identified by an SSRC belonging to a particular RTP
session. The RTP session is identified as discussed in Session. The RTP Session is identified as discussed in
Section 2.2.2. Section 2.2.2.
A Source RTP Stream is an RTP Stream containing at least some content A Source RTP Stream is an RTP Stream containing at least some content
from an Encoded Stream. Source material is any media material that from an Encoded Stream (Section 2.1.7). Source material is any media
is produced for transport over RTP without any additional redundancy material that is produced for transport over RTP without any
applied (outside what is generally there in the media format of the additional RTP-based redundancy applied. Note that RTP-based
Encoded Stream) to cope with network transport losses. Compare this redundancy excludes the type of redundancy that most suitable Media
with the Redundancy RTP Stream (Section 2.1.12). Encoders (Section 2.1.6) may add to the media format of the Encoded
Stream that makes it cope better with inevitable RTP packet losses.
This is further described in RTP-based Redundancy (Section 2.1.11)
and Redundancy RTP Stream (Section 2.1.12).
Characteristics: Characteristics:
o Each RTP Stream is identified by a Synchronization source (SSRC) o Each RTP Stream is identified by a Synchronization source (SSRC)
[RFC3550] that is carried in every RTP and RTP Control Protocol [RFC3550] that is carried in every RTP and RTP Control Protocol
(RTCP) packet header. The SSRC is unique in a specific RTP (RTCP) packet header. The SSRC is unique in a specific RTP
session context. Session context.
o At any given point in time, an RTP Stream can have one and only one o At any given point in time, an RTP Stream can have one and only one
SSRC, but SSRCs for a given RTP Stream can change over time. SSRC SSRC, but SSRCs for a given RTP Stream can change over time. SSRC
collision and clock rate change [RFC7160] are examples of valid collision and clock rate change [RFC7160] are examples of valid
reasons to change SSRC for an RTP Stream. In those cases, the RTP reasons to change SSRC for an RTP Stream. In those cases, the RTP
Stream itself is not changed in any significant way, only the Stream itself is not changed in any significant way, only the
identifying SSRC number. identifying SSRC number.
o Each RTP Stream defines a unique RTP sequence numbering and timing o Each SSRC defines a unique RTP sequence numbering and timing
space. space.
o Several RTP Streams may represent a single Media Source. o Several RTP Streams, each with their own SSRC, may represent a
single Media Source.
o Several RTP Streams can be carried in a single RTP Session. o Several RTP Streams, each with their own SSRC, can be carried in a
single RTP Session.
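The characteristics above can be illustrated with a minimal sketch that reads the SSRC and sequence number from the 12-byte RTP fixed header [RFC3550] and groups packets into RTP Streams by SSRC. The function names and the make_packet helper are hypothetical and exist only for this illustration; header extensions, CSRC lists and padding are ignored.

```python
import struct


def parse_rtp_fixed_header(packet: bytes) -> dict:
    """Parse the 12-byte RTP fixed header (RFC 3550, Section 5.1)."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    # Network byte order: two flag bytes, 16-bit sequence number,
    # 32-bit timestamp, 32-bit SSRC.
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": b0 >> 6,
        "csrc_count": b0 & 0x0F,
        "payload_type": b1 & 0x7F,
        "sequence_number": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


def group_by_ssrc(packets):
    """Map each SSRC to the sequence numbers seen for that RTP Stream."""
    streams = {}
    for packet in packets:
        header = parse_rtp_fixed_header(packet)
        streams.setdefault(header["ssrc"], []).append(header["sequence_number"])
    return streams


def make_packet(ssrc, seq, timestamp=0, payload_type=96):
    """Hypothetical helper: build a version-2 RTP packet for the example."""
    header = struct.pack("!BBHII", 0x80, payload_type, seq, timestamp, ssrc)
    return header + b"payload"
```

Each key of the returned mapping corresponds to one RTP Stream within a single RTP Session; the sequence numbers of different SSRCs are unrelated to each other.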
2.1.11. Media Redundancy 2.1.11. RTP-based Redundancy
Media redundancy is defined here as a transformation that generates RTP-based Redundancy is defined here as a transformation that
redundant or repair packets sent out as a Redundancy RTP Stream to generates redundant or repair packets sent out as a Redundancy RTP
mitigate network transport impairments, like packet loss and delay. Stream (Section 2.1.12) to mitigate network transport impairments,
like packet loss and delay.
The Media Redundancy exists in many flavors; they may be generating The RTP-based Redundancy exists in many flavors; they may be
independent Repair Streams that are used in addition to the Source generating independent Repair Streams that are used in addition to
Stream (RTP Retransmission [RFC4588] and some FEC [RFC5109]), they the Source Stream (like RTP Retransmission (Section 3.11) and some
may generate a new Source Stream by combining redundancy information special types of Forward Error Correction, like RTP stream
with source information (Using XOR FEC [RFC5109] as a redundancy duplication (Section 3.9)), they may generate a new Source Stream by
payload [RFC2198]), or completely replace the source information with combining redundancy information with source information (Using XOR
only redundancy packets. FEC (Section 3.12) as a redundancy payload (Section 3.10)), or
completely replace the source information with only redundancy
packets.
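As an illustration of the XOR-based flavor mentioned above, the following minimal sketch builds one repair payload from a group of source payloads and recovers a single lost payload from it. It is a simplification under stated assumptions: it works on bare payload bytes, pads shorter payloads with zeros, and does not implement the packet format of any FEC specification. All function names are hypothetical.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, zero-padding the shorter one."""
    length = max(len(a), len(b))
    a = a.ljust(length, b"\x00")
    b = b.ljust(length, b"\x00")
    return bytes(x ^ y for x, y in zip(a, b))


def make_repair_payload(source_payloads):
    """XOR all source payloads together into one repair payload."""
    repair = b""
    for payload in source_payloads:
        repair = xor_bytes(repair, payload)
    return repair


def recover_missing(received_payloads, repair_payload):
    """Recover the one missing payload of a protected group.

    Assumes exactly one payload of the group was lost. If the lost
    payload was shorter than the longest one, the result carries
    trailing zero padding, since this sketch does not protect lengths.
    """
    missing = repair_payload
    for payload in received_payloads:
        missing = xor_bytes(missing, payload)
    return missing
```

In the terms used here, the repair payload would travel in a Redundancy RTP Stream, while the source payloads travel in the Source RTP Stream.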
2.1.12. Redundancy RTP Stream 2.1.12. Redundancy RTP Stream
An RTP Stream (Section 2.1.10) that contains no original source data, An RTP Stream (Section 2.1.10) that contains no original source data,
only redundant data that may be combined with one or more Received only redundant data that may be combined with one or more Received
RTP Streams (Section 2.1.19) to produce Repaired RTP Streams RTP Streams (Section 2.1.19) to produce Repaired RTP Streams
(Section 2.1.22). (Section 2.1.22).
2.1.13. Media Transport 2.1.13. Media Transport
A Media Transport defines the transformation that the RTP Streams A Media Transport defines the transformation that the RTP Streams
(Section 2.1.10) are subjected to by the end-to-end transport from (Section 2.1.10) are subjected to by the end-to-end transport from
one RTP sender to one specific RTP receiver (an RTP session may one RTP sender to one specific RTP receiver (an RTP Session
contain multiple RTP receivers per sender). Each Media Transport is (Section 2.2.2) may contain multiple RTP receivers per sender). Each
defined by a transport association that is identified by a 5-tuple Media Transport is defined by a transport association that is
(source address, source port, destination address, destination port, normally identified by a 5-tuple (source address, source port,
transport protocol). Each transport association normally contains destination address, destination port, transport protocol), but a
only a single RTP session, although a proposal exists for sending proposal exists for sending multiple transport associations on a
multiple RTP sessions over one transport association single 5-tuple [I-D.westerlund-avtcore-transport-multiplexing].
[I-D.westerlund-avtcore-transport-multiplexing].
Characteristics: Characteristics:
o Media Transport transmits RTP Streams of RTP Packets from a source o Media Transport transmits RTP Streams of RTP Packets from a source
transport address to a destination transport address. transport address to a destination transport address.
o Each Media Transport contains only a single RTP Session.
o A single RTP Session can span multiple Media Transports.
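The relation between Media Transports and RTP Sessions described above can be sketched as a small lookup table. This is only an illustration: the FiveTuple and TransportTable names are hypothetical, a Media Transport is assumed to be identified by its 5-tuple, and RTP Sessions are represented by plain labels.

```python
from collections import namedtuple

# A transport association identified by a 5-tuple.
FiveTuple = namedtuple(
    "FiveTuple", "src_addr src_port dst_addr dst_port protocol"
)


class TransportTable:
    """Tracks which RTP Session each Media Transport carries."""

    def __init__(self):
        self._session_of = {}

    def bind(self, transport, rtp_session):
        """Associate a Media Transport with exactly one RTP Session."""
        existing = self._session_of.get(transport)
        if existing is not None and existing != rtp_session:
            raise ValueError("a Media Transport carries only one RTP Session")
        self._session_of[transport] = rtp_session

    def transports_of(self, rtp_session):
        """Return all Media Transports that an RTP Session spans."""
        return [t for t, s in self._session_of.items() if s == rtp_session]
```

Binding two transports to the same label models an RTP Session that spans several Media Transports, while binding one transport to two different labels is rejected.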
The Media Transport concept sometimes needs to be decomposed into The Media Transport concept sometimes needs to be decomposed into
more steps to enable discussion of what a sender emits that gets more steps to enable discussion of what a sender emits that gets
transformed by the network before it is received by the receiver. transformed by the network before it is received by the receiver.
Thus we also provide this Media Transport decomposition (Figure 5). Thus we also provide this Media Transport decomposition (Figure 5).
RTP Stream RTP Stream
| |
V V
+--------------------------+ +--------------------------+
| Media Transport Sender | | Media Transport Sender |
+--------------------------+ +--------------------------+
| |
Sent RTP Stream Sent RTP Stream
V V
+--------------------------+ +--------------------------+
| Network Transport | | Network Transport |
+--------------------------+ +--------------------------+
| |
Transported RTP Stream Transported RTP Stream
V V
+--------------------------+ +--------------------------+
| Media Transport Receiver | | Media Transport Receiver |
+--------------------------+ +--------------------------+
| |
V V
Received RTP Stream Received RTP Stream
Figure 5: Decomposition of Media Transport Figure 5: Decomposition of Media Transport
2.1.14. Media Transport Sender 2.1.14. Media Transport Sender
The first transformation within the Media Transport (Section 2.1.13) The first transformation within the Media Transport (Section 2.1.13)
is the Media Transport Sender. The sending End Point (Section 2.2.1) is the Media Transport Sender. The sending Endpoint (Section 2.2.1)
takes an RTP Stream and emits the packets onto the network using the takes an RTP Stream and emits the packets onto the network using the
transport association established for this Media Transport, thereby transport association established for this Media Transport, thereby
creating a Sent RTP Stream (Section 2.1.15). In the process, it creating a Sent RTP Stream (Section 2.1.15). In the process, it
transforms the RTP Stream in several ways. First, it generates the transforms the RTP Stream in several ways. First, it generates the
necessary protocol headers for the transport association, for example necessary protocol headers for the transport association, for example
IP and UDP headers, thus forming IP/UDP/RTP packets. In addition, IP and UDP headers, thus forming IP/UDP/RTP packets. In addition,
the Media Transport Sender may queue, pace or otherwise affect how the Media Transport Sender may queue, pace or otherwise affect how
the packets are emitted onto the network, thereby potentially the packets are emitted onto the network, thereby potentially
introducing delay, jitter and inter packet spacings that characterize introducing delay, jitter and inter packet spacings that characterize
the Sent RTP Stream. the Sent RTP Stream.
skipping to change at page 15, line 23 skipping to change at page 15, line 31
the exit of the network path. the exit of the network path.
2.1.17. Transported RTP Stream 2.1.17. Transported RTP Stream
The RTP Stream that is emitted out of the network path at the The RTP Stream that is emitted out of the network path at the
destination, subjected to the Network Transport's transformation destination, subjected to the Network Transport's transformation
(Section 2.1.16). (Section 2.1.16).
2.1.18. Media Transport Receiver 2.1.18. Media Transport Receiver
The receiver End Point's (Section 2.2.1) transformation of the The receiver Endpoint's (Section 2.2.1) transformation of the
Transported RTP Stream (Section 2.1.17) by its reception process, Transported RTP Stream (Section 2.1.17) by its reception process,
which results in the Received RTP Stream (Section 2.1.19). This which results in the Received RTP Stream (Section 2.1.19). This
transformation includes transport checksums being verified and, if transformation includes transport checksums being verified. Sensible
non-matching, may cause discarding of the corrupted packet. Other system designs typically either discard packets with mis-matching
checksums, or pass them on while somehow marking them in the
resulting Received RTP Stream so as to alert subsequent transformations
about the possible corrupt state. In this context it is worth noting
that there is typically some probability for corrupt packets to pass
through undetected (with a seemingly correct checksum). Other
transformations can compensate for delay variations in receiving a transformations can compensate for delay variations in receiving a
packet on the network interface and providing it to the application packet on the network interface and providing it to the application
(de-jitter buffer). (de-jitter buffer).
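The two ways of handling checksum failures, discarding the packet or passing it on with a marking, can be sketched as follows. This is a minimal illustration under stated assumptions: the ReceivedPacket type and the receive function are hypothetical, and the checksum result is assumed to be supplied by the lower transport layer rather than computed here.

```python
from dataclasses import dataclass


@dataclass
class ReceivedPacket:
    data: bytes
    checksum_ok: bool               # result reported by the transport layer
    possibly_corrupt: bool = False  # marking for later transformations


def receive(transported, policy="discard"):
    """Turn a Transported RTP Stream into a Received RTP Stream.

    policy "discard": drop packets whose checksum did not match.
    policy "mark":    keep them, flagged as possibly corrupt.
    """
    if policy not in ("discard", "mark"):
        raise ValueError("policy must be 'discard' or 'mark'")
    received = []
    for packet in transported:
        if packet.checksum_ok:
            received.append(packet)
        elif policy == "mark":
            received.append(
                ReceivedPacket(packet.data, packet.checksum_ok,
                               possibly_corrupt=True)
            )
    return received
```

Neither policy catches a corrupt packet that happens to carry a matching checksum, so later transformations still need to tolerate some undetected corruption.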
2.1.19. Received RTP Stream 2.1.19. Received RTP Stream
The RTP Stream (Section 2.1.10) resulting from the Media Transport's The RTP Stream (Section 2.1.10) resulting from the Media Transport's
transformation, i.e. subjected to packet loss, packet corruption, transformation, i.e. subjected to packet loss, packet corruption,
packet duplication and varying transmission delay from sender to packet duplication and varying transmission delay from sender to
receiver. receiver.
2.1.20. Received Redundancy RTP Stream 2.1.20. Received Redundancy RTP Stream
The Redundancy RTP Stream (Section 2.1.12) resulting from the Media The Redundancy RTP Stream (Section 2.1.12) resulting from the Media
Transport transformation, i.e. subjected to packet loss, packet Transport transformation, i.e. subjected to packet loss, packet
corruption, and varying transmission delay from sender to receiver. corruption, and varying transmission delay from sender to receiver.
2.1.21. Media Repair 2.1.21. RTP-based Repair
A Transformation that takes as input one or more Received RTP Streams RTP-based Repair is a Transformation that takes as input one or more
(Section 2.1.19) as well as Redundancy RTP Streams (Section 2.1.20) Received RTP Streams (Section 2.1.19) and Received Redundancy RTP
and attempts to combine them to counter the transformations Streams (Section 2.1.20), and produces one or more Repaired RTP
introduced by the Media Transport (Section 2.1.13) to minimize the Streams (Section 2.1.22) that are as close to the corresponding sent
difference between the Source RTP Stream (Section 2.1.10) and the Source RTP Streams (Section 2.1.10) as possible, using different RTP-
Repaired RTP Stream (Section 2.1.22). The output is a Repaired RTP based repair methods, for example the ones referred to in RTP-based
Stream (Section 2.1.22). Redundancy (Section 2.1.11).
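A minimal sketch of this transformation, assuming a retransmission-style redundancy where the Received Redundancy RTP Stream carries copies of lost source packets, is shown below. Streams are modeled as mappings from sequence number to payload, sequence number wrap-around is ignored, and the repair_stream name is hypothetical; an actual retransmission payload format carries the original sequence number differently.

```python
def repair_stream(received, retransmitted, first_seq, last_seq):
    """Combine a received stream with retransmitted copies.

    received and retransmitted map sequence number -> payload.
    Returns the repaired mapping and the list of sequence numbers
    that could not be recovered.
    """
    repaired = {}
    for seq in range(first_seq, last_seq + 1):
        if seq in received:
            repaired[seq] = received[seq]
        elif seq in retransmitted:
            # Gap in the received stream filled from redundancy data.
            repaired[seq] = retransmitted[seq]
    still_missing = [
        seq for seq in range(first_seq, last_seq + 1) if seq not in repaired
    ]
    return repaired, still_missing
```

The list of unrecovered sequence numbers reflects that repair is only an attempt: losses that the redundancy data does not cover remain visible to the later depacketizing and decoding steps.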
2.1.22. Repaired RTP Stream 2.1.22. Repaired RTP Stream
A Received RTP Stream (Section 2.1.19) for which Received Redundancy A Received RTP Stream (Section 2.1.19) for which Received Redundancy
RTP Stream (Section 2.1.20) information has been used to try to re- RTP Stream (Section 2.1.20) information has been used to try to
create the RTP Stream (Section 2.1.10) as it was before Media recover the Source RTP Stream (Section 2.1.10) as it was before Media
Transport (Section 2.1.13). Transport (Section 2.1.13).
2.1.23. Media Depacketizer 2.1.23. Media Depacketizer
A Media Depacketizer takes one or more RTP Streams (Section 2.1.10), A Media Depacketizer takes one or more RTP Streams (Section 2.1.10),
depacketizes them, and attempts to reconstitute the Encoded Streams depacketizes them, and attempts to reconstitute the Encoded Streams
(Section 2.1.7) or Dependent Streams (Section 2.1.8) present in those (Section 2.1.7) or Dependent Streams (Section 2.1.8) present in those
RTP Streams. RTP Streams.
It should be noted that in practical implementations, the Media It should be noted that in practical implementations, the Media
skipping to change at page 16, line 47 skipping to change at page 17, line 14
It should be noted that in practical implementations, the Media It should be noted that in practical implementations, the Media
Decoder and the Media Depacketizer may be tightly coupled and share Decoder and the Media Depacketizer may be tightly coupled and share
information to improve or optimize the overall decoding process in information to improve or optimize the overall decoding process in
various ways. It is however not expected that there would be any various ways. It is however not expected that there would be any
benefit in defining a taxonomy for those detailed (and likely very benefit in defining a taxonomy for those detailed (and likely very
implementation-dependent) steps. implementation-dependent) steps.
Characteristics: Characteristics:
o A Media Decoder has to deal with any errors in the encoded streams o A Media Decoder has to deal with any errors in the Encoded Streams
that resulted from corruption or failure to repair packet losses. that resulted from corruption or failure to repair packet losses.
Therefore, it is commonly robust to errors and losses, and includes Therefore, it is commonly robust to errors and losses, and includes
concealment methods. concealment methods.
2.1.26. Received Source Stream 2.1.26. Received Source Stream
The received version of a Source Stream (Section 2.1.5). The received version of a Source Stream (Section 2.1.5).
2.1.27. Media Sink 2.1.27. Media Sink
The Media Sink receives a Source Stream (Section 2.1.5) that The Media Sink receives a Source Stream (Section 2.1.5) that
contains, usually periodically, sampled media data together with contains, usually periodically, sampled media data together with
associated synchronization information. Depending on application, associated synchronization information. Depending on application,
this Source Stream then needs to be transformed into a Raw Stream this Source Stream then needs to be transformed into a Raw Stream
(Section 2.1.3) that is conveyed to the Media Render (Section 2.1.3) that is conveyed to the Media Render
(Section 2.1.29), synchronized with the output from other Media (Section 2.1.29), synchronized with the output from other Media
Sinks. The media sink may also be connected with a Media Source Sinks. The Media Sink may also be connected with a Media Source
(Section 2.1.4) and be used as part of a conceptual Media Source. (Section 2.1.4) and be used as part of a conceptual Media Source.
Characteristics: Characteristics:
o The Media Sink can further transform the Source Stream into a o The Media Sink can further transform the Source Stream into a
representation that is suitable for rendering on the Media Render representation that is suitable for rendering on the Media Render
as defined by the application or system-wide configuration. This as defined by the application or system-wide configuration. This
includes sample scaling, level adjustments etc. includes sample scaling, level adjustments etc.
2.1.28. Received Raw Stream 2.1.28. Received Raw Stream
skipping to change at page 17, line 40 skipping to change at page 18, line 5
2.1.29. Media Render 2.1.29. Media Render
A Media Render takes a Raw Stream (Section 2.1.3) and converts it A Media Render takes a Raw Stream (Section 2.1.3) and converts it
into Physical Stimulus (Section 2.1.1) that a human user can into Physical Stimulus (Section 2.1.1) that a human user can
perceive. Examples of such devices are screens, and D/A converters perceive. Examples of such devices are screens, and D/A converters
connected to amplifiers and loudspeakers. connected to amplifiers and loudspeakers.
Characteristics: Characteristics:
o An End Point can potentially have multiple Media Renders for each o An Endpoint can potentially have multiple Media Renders for each
media type. media type.
2.2. Communication Entities 2.2. Communication Entities
This section contains concepts for entities involved in the This section contains concepts for entities involved in the
communication. communication.
+----------------------------------------------------------+ +------------------------------------------------------------+
| Communication Session | | Communication Session |
| | | |
| +----------------+ +----------------+ | | +----------------+ +----------------+ |
| | Participant A | +------------+ | Participant B | | | | Participant A | +------------+ | Participant B | |
| | | | Multimedia | | | | | | | | Multimedia | | | |
| | +-------------+|<=>| Session |<=>|+-------------+ | | | | +-------------+|<==>| Session |<==>|+-------------+ | |
| | | End Point A || | | || End Point B | | | | | | Endpoint A || | | || Endpoint B | | |
| | | || +------------+ || | | | | | | || +------------+ || | | |
| | | +-----------++--------------------++-----------+ | | | | | | +-----------++----------------------++-----------+ | | |
| | | | RTP Session| | | | | | | | | | | | | | | |
| | | | Audio |--Media Transport-->| | | | | | | | | RTP Session|---Media Transport--->| | | | |
| | | | |<--Media Transport--| | | | | | | | | Audio |<---Media Transport---| | | | |
| | | +-----------++--------------------++-----------+ | | | | | | | | ^ | | | | |
| | | || || | | | | | | +-----------++----------|-----------++-----------+ | | |
| | | +-----------++--------------------++-----------+ | | | | | | || v || | | |
| | | | RTP Session| | | | | | | | | || +-----------------+ || | | |
| | | | Video |--Media Transport-->| | | | | | | | || | Synchronization | || | | |
| | | | |<--Media Transport--| | | | | | | | || | Context | || | | |
| | | +-----------++--------------------++-----------+ | | | | | | || +-----------------+ || | | |
| | +-------------+| |+-------------+ | | | | | || ^ || | | |
| +----------------+ +----------------+ | | | | +-----------++----------|-----------++-----------+ | | |
+----------------------------------------------------------+ | | | | | v | | | | |
| | | | RTP Session|<---Media Transport---| | | | |
| | | | Video |---Media Transport--->| | | | |
| | | | | | | | | |
| | | +-----------++----------------------++-----------+ | | |
| | +-------------+| |+-------------+ | |
| +----------------+ +----------------+ |
+------------------------------------------------------------+
Figure 6: Example Point to Point Communication Session with two RTP Figure 6: Example Point to Point Communication Session with two RTP
Sessions Sessions
The figure above shows a high-level example representation of a very The figure above shows a high-level example representation of a very
basic point-to-point Communication Session between Participants A and basic point-to-point Communication Session between Participants A and
B. It uses two different audio and video RTP Sessions between A's B. It uses two different audio and video RTP Sessions between A's
and B's End Points, using separate Media Transports for those RTP and B's Endpoints, using separate Media Transports for those RTP
Sessions. The Multimedia Session shared by the participants can, for Sessions. The Multimedia Session shared by the Participants can, for
example, be established using SIP (i.e., there is a SIP Dialog example, be established using SIP (i.e., there is a SIP Dialog
between A and B). The terms used in that figure are further between A and B). The terms used in that figure are further
elaborated in the sub-sections below. elaborated in the sub-sections below.
2.2.1. End Point 2.2.1. Endpoint
Editor's note: Consider if a single word, "Endpoint", is
preferable
A single addressable entity sending or receiving RTP packets. It may A single addressable entity sending or receiving RTP packets. It may
be decomposed into several functional blocks, but as long as it be decomposed into several functional blocks, but as long as it
behaves as a single RTP stack entity it is classified as a single behaves as a single RTP stack entity it is classified as a single
"End Point". "Endpoint".
Characteristics: Characteristics:
o End Points can be identified in several different ways. While o Endpoints can be identified in several different ways. While RTCP
RTCP Canonical Names (CNAMEs) [RFC3550] provide a globally unique Canonical Names (CNAMEs) [RFC3550] provide a globally unique and
and stable identification mechanism for the duration of the stable identification mechanism for the duration of the
Communication Session (see Section 2.2.5), their validity applies Communication Session (see Section 2.2.5), their validity applies
exclusively within a Synchronization Context (Section 3.1). Thus exclusively within a Synchronization Context (Section 3.1). Thus
one End Point can handle multiple CNAMEs, each of which can be one Endpoint can handle multiple CNAMEs, each of which can be
shared among a set of End Points belonging to the same Participant shared among a set of Endpoints belonging to the same Participant
(Section 2.2.3). Therefore, mechanisms outside the scope of RTP, (Section 2.2.3). Therefore, mechanisms outside the scope of RTP,
such as application defined mechanisms, must be used to ensure End such as application defined mechanisms, must be used to ensure
Point identification when outside this Synchronization Context. Endpoint identification when outside this Synchronization Context.
o An End Point can be associated with at most one Participant o An Endpoint can be associated with at most one Participant
(Section 2.2.3) at any single point in time. (Section 2.2.3) at any single point in time.
o In some contexts, an End Point would typically correspond to a o In some contexts, an Endpoint would typically correspond to a
single "host", for example a computer using a single network single "host", for example a computer using a single network
interface and being used by a single human user. interface and being used by a single human user.
2.2.2. RTP Session 2.2.2. RTP Session
Editor's note: Re-consider if this is really a Communication An RTP Session is an association among a group of Participants
Entity, or if it is rather an existing concept that should be
described in Section 4.
An RTP session is an association among a group of participants
communicating with RTP. It is a group communications channel which communicating with RTP. It is a group communications channel which
can potentially carry a number of RTP Streams. Within an RTP can potentially carry a number of RTP Streams. Within an RTP
session, every participant can find meta-data and control information Session, every Participant can find meta-data and control information
(over RTCP) about all the RTP Streams in the RTP session. The (over RTCP) about all the RTP Streams in the RTP Session. The
bandwidth of the RTCP control channel is shared between all bandwidth of the RTCP control channel is shared between all
participants within an RTP Session. Participants within an RTP Session.
Characteristics: Characteristics:
o An RTP Session can carry one or more RTP Streams. o An RTP Session can carry one or more RTP Streams.
o An RTP Session shares a single SSRC space as defined in RFC3550 o An RTP Session shares a single SSRC space as defined in RFC3550
[RFC3550]. That is, the End Points participating in an RTP [RFC3550]. That is, the Endpoints participating in an RTP Session
Session can see an SSRC identifier transmitted by any of the other can see an SSRC identifier transmitted by any of the other
End Points. An End Point can receive an SSRC either as SSRC or as Endpoints. An Endpoint can receive an SSRC either as SSRC or as a
a Contributing source (CSRC) in RTP and RTCP packets, as defined Contributing source (CSRC) in RTP and RTCP packets, as defined by
by the endpoints' network interconnection topology. the Endpoints' network interconnection topology.
o An RTP Session uses at least two Media Transports o An RTP Session uses at least two Media Transports
(Section 2.1.13), one for sending and one for receiving. (Section 2.1.13), one for sending and one for receiving.
Commonly, the receiving Media Transport is the reverse direction Commonly, the receiving Media Transport is the reverse direction
of the Media Transport used for sending. An RTP Session may use of the Media Transport used for sending. An RTP Session may use
many Media Transports and these define the session's network many Media Transports and these define the session's network
interconnection topology. A single Media Transport can normally interconnection topology.
not transport more than one RTP Session, unless a solution for
multiplexing multiple RTP sessions over a single Media Transport
is used. One example of such a scheme is Multiple RTP Sessions on
a Single Lower-Layer Transport
[I-D.westerlund-avtcore-transport-multiplexing].
o Multiple RTP Sessions can be related. o A single Media Transport always carries a single RTP Session.
o Multiple RTP Sessions can be conceptually related, for example
originating from or targeted for the same Participant
(Section 2.2.3) or Endpoint (Section 2.2.1), or by containing RTP
Streams that are somehow related (Section 3).
2.2.3. Participant 2.2.3. Participant
A Participant is an entity reachable by a single signaling address, A Participant is an entity reachable by a single signaling address,
and is thus related more to the signaling context than to the media and is thus related more to the signaling context than to the media
context. context.
Characteristics: Characteristics:
o A single signaling-addressable entity, using an application- o A single signaling-addressable entity, using an application-
specific signaling address space, for example a SIP URI. specific signaling address space, for example a SIP URI.
o A Participant can have several Multimedia Sessions o A Participant can participate in several Multimedia Sessions
(Section 2.2.4). (Section 2.2.4).
o A Participant can have several associated End Points o A Participant can be comprised of several associated Endpoints
(Section 2.2.1). (Section 2.2.1).
2.2.4. Multimedia Session 2.2.4. Multimedia Session
A multimedia session is an association among a group of participants A Multimedia Session is an association among a group of Participants
engaged in the communication via one or more RTP Sessions (Section 2.2.3) engaged in the communication via one or more RTP
(Section 2.2.2). It defines logical relationships among Media Sessions (Section 2.2.2). It defines logical relationships among
Sources (Section 2.1.4) that appear in multiple RTP Sessions. Media Sources (Section 2.1.4) that appear in multiple RTP Sessions.
Characteristics: Characteristics:
o A Multimedia Session can be composed of several RTP Sessions with o A Multimedia Session can be composed of several RTP Sessions with
potentially multiple RTP Streams per RTP Session. potentially multiple RTP Streams per RTP Session.
o Each participant in a Multimedia Session can have a multitude of o Each Participant in a Multimedia Session can have a multitude of
Media Captures and Media Rendering devices. Media Captures and Media Rendering devices.
o A single Multimedia Session can contain media from one or more o A single Multimedia Session can contain media from one or more
Synchronization Contexts (Section 3.1). An example of that is a Synchronization Contexts (Section 3.1). An example of that is a
Multimedia Session containing one set of audio and video for Multimedia Session containing one set of audio and video for
communication purposes belonging to one Synchronization Context, communication purposes belonging to one Synchronization Context,
and another set of audio and video for presentation purposes (like and another set of audio and video for presentation purposes (like
playing a video file) with a separate Synchronization Context that playing a video file) with a separate Synchronization Context that
has no strong timing relationship and need not be strictly has no strong timing relationship and need not be strictly
synchronized with the audio and video used for communication. synchronized with the audio and video used for communication.
2.2.5. Communication Session 2.2.5. Communication Session
A Communication Session is an association among group of participants A Communication Session is an association among two or more
communicating with each other via a set of Multimedia Sessions. Participants (Section 2.2.3) communicating with each other via one or
more Multimedia Sessions (Section 2.2.4).
Characteristics: Characteristics:
o Each participant in a Communication Session is identified via an o Each Participant in a Communication Session is identified via an
application-specific signaling address. application-specific signaling address.
o A Communication Session is composed of at least one Multimedia o A Communication Session is composed of Participants that share at
Session per participant, involving one or more parallel RTP least one Multimedia Session, involving one or more parallel RTP
Sessions with potentially multiple RTP Streams per RTP Session. Sessions with potentially multiple RTP Streams per RTP Session.
For example, in a full mesh communication, the Communication Session For example, in a full mesh communication, the Communication Session
consists of a set of separate Multimedia Sessions between each pair consists of a set of separate Multimedia Sessions between each pair
of Participants. Another example is a centralized conference, where of Participants. Another example is a centralized conference, where
the Communication Session consists of a set of Multimedia Sessions the Communication Session consists of a set of Multimedia Sessions
between each Participant and the conference handler. between each Participant and the conference handler.
3. Concept Inter-Relations 3. Concepts of Inter-Relations
This section uses the concepts from previous sections, and looks at This section uses the concepts from previous sections, and looks at
different types of relationships among them. These relationships different types of relationships among them. These relationships
occur at different abstraction levels and for different purposes. occur at different abstraction levels and for different purposes, but
The section is organized such as to look at the level where a the reason for the needed relationship at a certain step in the media
relation is required. The reason for the relationship may exist at handling chain may exist at another step. For example, the use of
another step in the media handling chain. For example, the use of Simulcast (Section 3.7) implies a need to determine relations at RTP
Simulcast (discussed in Section 3.7) implies a need to determine Stream level, but the underlying reason is that multiple Media
relations at RTP Stream level. However the reason to relate RTP Encoders use the same Media Source, i.e. to be able to identify a
Streams in this context is not bound to RTP Streams, but is that common Media Source.
multiple Media Encoders use the same Media Source, i.e. to be able to
identify a common Media Source.
Media Sources (Section 2.1.4) are commonly grouped and related to an
End Point (Section 2.2.1) or a Participant (Section 2.2.3) for a
number of reasons, for example application logic and media handling
purposes.
At RTP Packetization time, a Media Packetizer has options to
packetize according to a number of different types of relationships
between Encoded Streams (Section 2.1.7), Dependent Streams
(Section 2.1.8) and RTP Streams (Section 2.1.10). These are caused
by grouping together or distributing these different types of streams
into RTP Streams.
While RTP Streams are generally separate, with independent sequence
number and timestamp spaces, they may have underlying relationships
that come from a different level of abstraction.
RTP Streams may be protected by Redundancy RTP Streams during
transport. Several approaches listed below can be used to create
Redundancy RTP Streams;
o Duplication of the original RTP Stream
o Duplication of the original RTP Stream with a time offset,
o Forward Error Correction (FEC) techniques, and
o Retransmission of lost packets (either globally or selectively).
The different RTP Streams can be transported within the same RTP
Session or in different RTP Sessions to accomplish different
transport goals. This explicit separation of RTP Streams is further
discussed in Section 3.13.
3.1. Synchronization Context 3.1. Synchronization Context
A Synchronization Context defines a requirement on a strong timing A Synchronization Context defines a requirement on a strong timing
relationship between the Media Sources, typically requiring alignment relationship between the Media Sources, typically requiring alignment
of clock sources. Such a relationship can be identified in multiple of clock sources. Such a relationship can be identified in multiple
ways as listed below. A single Media Source can only belong to a ways as listed below. A single Media Source can only belong to a
single Synchronization Context, since it is assumed that a single single Synchronization Context, since it is assumed that a single
Media Source can only have a single media clock and requiring Media Source can only have a single media clock and requiring
alignment to several Synchronization Contexts (and thus reference alignment to several Synchronization Contexts (and thus reference
skipping to change at page 23, line 14 skipping to change at page 22, line 34
3.1.3. Implicitly via RtcMediaStream 3.1.3. Implicitly via RtcMediaStream
The WebRTC WG defines "RtcMediaStream" with one or more The WebRTC WG defines "RtcMediaStream" with one or more
"RtcMediaStreamTracks". All tracks in a "RtcMediaStream" are "RtcMediaStreamTracks". All tracks in a "RtcMediaStream" are
intended to be synchronized when rendered, implying that they must be intended to be synchronized when rendered, implying that they must be
generated such that synchronization is possible. generated such that synchronization is possible.
3.1.4. Explicitly via SDP Mechanisms 3.1.4. Explicitly via SDP Mechanisms
RFC5888 [RFC5888] defines m=line grouping mechanism called "Lip The SDP Grouping Framework [RFC5888] defines an m= line (Section 4.2)
Synchronization (LS)" for establishing the synchronization grouping mechanism called "Lip Synchronization (LS)" for establishing
requirement across m=lines when they map to individual sources. the synchronization requirement across m= lines when they map to
individual sources.
RFC5576 [RFC5576] extends the above mechanism when multiple media Source-Specific Media Attributes in SDP [RFC5576] extends the above
sources are described by a single m=line. mechanism when multiple Media Sources are described by a single m=
line.
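As an illustration of the LS grouping described above, the following is a minimal sketch that extracts "a=group:LS" groups from an SDP description and resolves each "a=mid" value to the media type of its m= line. The SDP text, the addresses and the function name lip_sync_groups are illustrative assumptions, not part of this memo or of [RFC5888]; only the "a=group:LS" and "a=mid" attribute syntax is taken from that RFC.

```python
# Hypothetical SDP offer: one audio and one video m= line placed in a
# single Lip Synchronization (LS) group through their mid values.
SDP_EXAMPLE = """\
v=0
o=alice 289083124 289083124 IN IP4 one.example.com
s=-
c=IN IP4 192.0.2.1
t=0 0
a=group:LS 1 2
m=audio 30000 RTP/AVP 0
a=mid:1
m=video 30002 RTP/AVP 31
a=mid:2
"""


def lip_sync_groups(sdp):
    """Return one list per LS group, holding (mid, media type) pairs."""
    groups = []          # mid values of each a=group:LS line, in order
    mid_to_media = {}    # mid value -> media type of the enclosing m= line
    current_media = None
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            current_media = line[2:].split()[0]
        elif line.startswith("a=mid:") and current_media is not None:
            mid_to_media[line[len("a=mid:"):]] = current_media
        elif line.startswith("a=group:"):
            tokens = line[len("a=group:"):].split()
            if tokens and tokens[0] == "LS":
                groups.append(tokens[1:])
    # Resolve mids only after the whole description has been read, because
    # the session-level a=group line precedes the m= lines it refers to.
    return [[(mid, mid_to_media.get(mid)) for mid in group]
            for group in groups]


print(lip_sync_groups(SDP_EXAMPLE))
```

In this sketch each returned group corresponds to one Synchronization Context signaled at the m= line level.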
3.2. End Point 3.2. Endpoint
Some applications require knowledge of what Media Sources originate Some applications require knowledge of what Media Sources originate
from a particular End Point (Section 2.2.1). This can include such from a particular Endpoint (Section 2.2.1). This can include such
decisions as packet routing between parts of the topology, knowing decisions as packet routing between parts of the topology, knowing
the End Point origin of the RTP Streams. the Endpoint origin of the RTP Streams.
In RTP, this identification has been overloaded with the In RTP, this identification has been overloaded with the
Synchronization Context (Section 3.1) through the usage of the RTCP Synchronization Context (Section 3.1) through the usage of the RTCP
source description CNAME (Section 3.1.1). This works for some source description CNAME (Section 3.1.1). This works for some
usages, but in others it breaks down. For example, if an End Point usages, but in others it breaks down. For example, if an Endpoint
has two sets of Media Sources that have different Synchronization has two sets of Media Sources that have different Synchronization
Contexts, like the audio and video of the human participant as well Contexts, like the audio and video of the human Participant as well
as a set of Media Sources of audio and video for a shared movie, as a set of Media Sources of audio and video for a shared movie,
CNAME would not be an appropriate identification for that End Point. CNAME would not be an appropriate identification for that Endpoint.
Therefore, an End Point may have multiple CNAMEs. The CNAMEs or the Therefore, an Endpoint may have multiple CNAMEs. The CNAMEs or the
Media Sources themselves can be related to the End Point. Media Sources themselves can be related to the Endpoint.
3.3. Participant 3.3. Participant
In communication scenarios, it is commonly needed to know which Media In communication scenarios, it is commonly needed to know which Media
Sources originate from which Participant (Section 2.2.3). One reason Sources originate from which Participant (Section 2.2.3). One reason
is, for example, to enable the application to display Participant is, for example, to enable the application to display Participant
Identity information correctly associated with the Media Sources. Identity information correctly associated with the Media Sources.
This association is handled through the signaling solution to point This association is handled through the signaling solution to point
at a specific Multimedia Session where the Media Sources may be at a specific Multimedia Session where the Media Sources may be
explicitly or implicitly tied to a particular End Point. explicitly or implicitly tied to a particular Endpoint.
Participant information becomes more problematic due to Media Sources Participant information becomes more problematic due to Media Sources
that are generated through mixing or other conceptual processing of that are generated through mixing or other conceptual processing of
Raw Streams or Source Streams that originate from different Raw Streams or Source Streams that originate from different
Participants. This type of Media Sources can thus have a dynamically Participants. This type of Media Sources can thus have a dynamically
varying set of origins and Participants. RTP contains the concept of varying set of origins and Participants. RTP contains the concept of
Contributing Sources (CSRC) that carry information about the previous Contributing Sources (CSRC) that carry information about the previous
step origin of the included media content on RTP level. step origin of the included media content on RTP level.
3.4. RtcMediaStream 3.4. RtcMediaStream
skipping to change at page 24, line 38 skipping to change at page 24, line 14
SST denotes one RTP Stream (SSRC) per Media Source in a single RTP SST denotes one RTP Stream (SSRC) per Media Source in a single RTP
Session. MST denotes one or more RTP Streams (SSRC) per Media Source Session. MST denotes one or more RTP Streams (SSRC) per Media Source
in each of multiple RTP Sessions. The above is not unambiguously in each of multiple RTP Sessions. The above is not unambiguously
specified in the SVC payload format text [RFC6190], but it is what specified in the SVC payload format text [RFC6190], but it is what
existing deployments of that RFC have implemented. existing deployments of that RFC have implemented.
The use of the term "RTP Session" in the SST/MST definition is The use of the term "RTP Session" in the SST/MST definition is
somewhat misleading, since a single RTP Session can contain multiple somewhat misleading, since a single RTP Session can contain multiple
RTP Streams. Also, it is sometimes useful to make a distinction RTP Streams. Also, it is sometimes useful to make a distinction
between using a single Transport or multiple separate Transports when between using a single Media Transport or multiple separate Media
(in both cases) using multiple RTP Streams to carry Encoded Streams Transports when (in both cases) using multiple RTP Streams to carry
and Dependent Streams for a Media Source. Therefore, herein the Encoded Streams and Dependent Streams for a Media Source. Therefore,
following new terminology is defined: herein the following new terminology is defined:
SRST: Single RTP stream on a Single Transport SRST: Single RTP Stream on a Single Media Transport
MRST: Multiple RTP streams on a Single Transport MRST: Multiple RTP Streams on a Single Media Transport
MRMT: Multiple RTP streams on Multiple Transports MRMT: Multiple RTP Streams on Multiple Media Transports
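The three terms above can be told apart by counting RTP Streams and Media Transports for one Media Source. The sketch below assumes a hypothetical input, a list holding one Media Transport identifier per RTP Stream; the function name and the identifiers are illustrative only. A single RTP Stream cannot be spread over several Media Transports in this model, so that case does not arise.

```python
def classify_transmission(transports_per_stream):
    """Classify how the RTP Streams of one Media Source are transported.

    transports_per_stream holds one (hypothetical) Media Transport
    identifier for each RTP Stream carrying the Media Source.
    """
    if not transports_per_stream:
        raise ValueError("at least one RTP Stream is required")
    stream_count = len(transports_per_stream)
    transport_count = len(set(transports_per_stream))
    if stream_count == 1:
        return "SRST"   # Single RTP Stream on a Single Media Transport
    if transport_count == 1:
        return "MRST"   # Multiple RTP Streams on a Single Media Transport
    return "MRMT"       # Multiple RTP Streams on Multiple Media Transports


print(classify_transmission(["t1"]))
print(classify_transmission(["t1", "t1", "t1"]))
print(classify_transmission(["t1", "t2"]))
```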
3.6. Multi-Channel Audio 3.6. Multi-Channel Audio
There exist a number of RTP payload formats that can carry multi- There exist a number of RTP payload formats that can carry multi-
channel audio, despite the codec being a mono encoder. Multi-channel channel audio, despite the codec being a mono encoder. Multi-channel
audio can be viewed as multiple Media Sources sharing a common audio can be viewed as multiple Media Sources sharing a common
Synchronization Context. These are independently encoded by a Media Synchronization Context. These are independently encoded by a Media
Encoder and the different Encoded Streams are packetized together in Encoder and the different Encoded Streams are packetized together in
a time synchronized way into a single Source RTP Stream, using the a time synchronized way into a single Source RTP Stream, using the
used codec's RTP Payload format. Example of such codecs are, PCMA used codec's RTP Payload format. Examples of codecs that support
and PCMU [RFC3551], AMR [RFC4867], and G.719 [RFC5404]. multi-channel audio are PCMA and PCMU [RFC3551], AMR [RFC4867], and
G.719 [RFC5404].
3.7. Simulcast 3.7. Simulcast
A Media Source represented as multiple independent Encoded Streams A Media Source represented as multiple independent Encoded Streams
constitutes a simulcast or Multiple Description Coding of that Media constitutes a Simulcast or Multiple Description Coding of that Media
Source. Figure 7 below shows an example of a Media Source that is Source. Figure 7 below shows an example of a Media Source that is
encoded into three separate Simulcast streams, that are in turn sent encoded into three separate Simulcast streams, that are in turn sent
on the same Media Transport flow. When using Simulcast, the RTP on the same Media Transport flow. When using Simulcast, the RTP
Streams may be sharing RTP Session and Media Transport, or be Streams may be sharing RTP Session and Media Transport, or be
separated on different RTP Sessions and Media Transports, or any separated on different RTP Sessions and Media Transports, or any
combination of these two. It is other considerations that affect combination of these two. It is other considerations that affect
which usage is desirable, as discussed in Section 3.13. which usage is desirable, as discussed in Section 3.13.
+----------------+ +----------------+
| Media Source | | Media Source |
skipping to change at page 26, line 33 skipping to change at page 25, line 33
| Stream | Stream | Stream | Stream | Stream | Stream
+-----------------+ | +-----------------+ +-----------------+ | +-----------------+
| | | | | |
V V V V V V
+-------------------+ +-------------------+
| Media Transport | | Media Transport |
+-------------------+ +-------------------+
Figure 7: Example of Media Source Simulcast Figure 7: Example of Media Source Simulcast
The simulcast relation between the RTP Streams is the common Media The Simulcast relation between the RTP Streams is the common Media
Source. In addition, to be able to identify the common Media Source, Source. In addition, to be able to identify the common Media Source,
a receiver of the RTP Stream may need to know which configuration or a receiver of the RTP Stream may need to know which configuration or
encoding goals lie behind the produced Encoded Stream and its encoding goals lie behind the produced Encoded Stream and its
properties. This is to enable selection of the stream that is most properties. This is to enable selection of the stream that is most
useful in the application at that moment. useful in the application at that moment.
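A receiver-side selection among Simulcast RTP Streams could be sketched as follows. The record fields (ssrc, media_source, bitrate_kbps), the sample values and the bitrate-budget selection rule are assumptions made for illustration; this memo does not define how the common Media Source or the encoding goals are signaled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimulcastStream:
    ssrc: int            # identifies the RTP Stream
    media_source: str    # hypothetical identifier of the common Media Source
    bitrate_kbps: int    # hypothetical encoding target of the Encoded Stream


# Three Simulcast versions of one Media Source plus an unrelated stream.
STREAMS = [
    SimulcastStream(ssrc=1111, media_source="camera-1", bitrate_kbps=150),
    SimulcastStream(ssrc=2222, media_source="camera-1", bitrate_kbps=600),
    SimulcastStream(ssrc=3333, media_source="camera-1", bitrate_kbps=2000),
    SimulcastStream(ssrc=4444, media_source="slides", bitrate_kbps=300),
]


def select_stream(streams, media_source, budget_kbps):
    """Pick the highest-bitrate stream of a Media Source within a budget."""
    candidates = [s for s in streams
                  if s.media_source == media_source
                  and s.bitrate_kbps <= budget_kbps]
    if not candidates:
        return None
    return max(candidates, key=lambda s: s.bitrate_kbps)


print(select_stream(STREAMS, "camera-1", 800))
```

The grouping key is the Media Source, not the SSRC, which mirrors the point that the Simulcast relation between the RTP Streams is their common Media Source.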
3.8. Layered Multi-Stream 3.8. Layered Multi-Stream
Layered Multi-Stream (LMS) is a mechanism by which different portions Layered Multi-Stream (LMS) is a mechanism by which different portions
of a layered encoding of a Source Stream are sent using separate RTP of a layered encoding of a Source Stream are sent using separate RTP
Streams (sometimes in separate RTP Sessions). LMSs are useful for Streams (sometimes in separate RTP Sessions). LMSs are useful for
receiver control of layered media. receiver control of layered media.
A Media Source represented as an Encoded Stream and multiple A Media Source represented as an Encoded Stream and multiple
Dependent Streams constitutes a Media Source that has layered Dependent Streams constitutes a Media Source that has layered
dependencies. The figure below represents an example of a Media dependencies. The figure below represents an example of a Media
Source that is encoded into three dependent layers, where two layers Source that is encoded into three dependent layers, where two layers
are sent on the same Media Transport using different RTP Streams, are sent on the same Media Transport using different RTP Streams,
i.e. SSRCs, and the third layer is sent on a separate Media i.e. SSRCs, and the third layer is sent on a separate Media
Transport, i.e. a different RTP Session. Transport.
+----------------+ +----------------+
| Media Source | | Media Source |
+----------------+ +----------------+
| |
| |
V V
+---------------------------------------------------------+ +---------------------------------------------------------+
| Media Encoder | | Media Encoder |
+---------------------------------------------------------+ +---------------------------------------------------------+
skipping to change at page 30, line 52 skipping to change at page 29, line 52
original sequence number in the payload of any new Redundancy RTP original sequence number in the payload of any new Redundancy RTP
Stream using the RTX payload format. In cases where the Redundancy Stream using the RTX payload format. In cases where the Redundancy
RTP Stream is sent in a separate RTP Session from the Source RTP RTP Stream is sent in a separate RTP Session from the Source RTP
Stream, these sessions are related, which is signaled by using the Stream, these sessions are related, which is signaled by using the
SDP Media Grouping's [RFC5888] FID semantics. SDP Media Grouping's [RFC5888] FID semantics.
3.12. Forward Error Correction 3.12. Forward Error Correction
The figure below (Figure 12) shows an example where two Media The figure below (Figure 12) shows an example where two Media
Sources' Source RTP Streams are protected by FEC. Source RTP Stream Sources' Source RTP Streams are protected by FEC. Source RTP Stream
A has a Media Redundancy transformation in FEC Encoder 1. This A has an RTP-based Redundancy transformation in FEC Encoder 1. This
produces a Redundancy RTP Stream 1, that is only related to Source produces a Redundancy RTP Stream 1, that is only related to Source
RTP Stream A. The FEC Encoder 2, however, takes two Source RTP RTP Stream A. The FEC Encoder 2, however, takes two Source RTP
Streams (A and B) and produces a Redundancy RTP Stream 2 that Streams (A and B) and produces a Redundancy RTP Stream 2 that
protects them jointly, i.e. Redundancy RTP Stream 2 relates to two protects them jointly, i.e. Redundancy RTP Stream 2 relates to two
Source RTP Streams (a FEC group). FEC decoding, when needed due to Source RTP Streams (a FEC group). FEC decoding, when needed due to
packet loss or packet corruption at the receiver, requires knowledge packet loss or packet corruption at the receiver, requires knowledge
about which Source RTP Streams the FEC encoding was based on. about which Source RTP Streams the FEC encoding was based on.
In Figure 12 all RTP Streams are sent on the same Media Transport. In Figure 12 all RTP Streams are sent on the same Media Transport.
This is however not the only possible choice. Numerous combinations This is however not the only possible choice. Numerous combinations
skipping to change at page 32, line 6 skipping to change at page 31, line 6
+----------------------------------------------------------+ +----------------------------------------------------------+
Figure 12: Example of FEC Redundancy RTP Streams Figure 12: Example of FEC Redundancy RTP Streams
As FEC Encoding exists in various forms, the methods for relating FEC As FEC Encoding exists in various forms, the methods for relating FEC
Redundancy RTP Streams with their source information in Source RTP Redundancy RTP Streams with their source information in Source RTP
Streams are many. The XOR based RTP FEC Payload format [RFC5109] is Streams are many. The XOR based RTP FEC Payload format [RFC5109] is
defined in such a way that a Redundancy RTP Stream has a one to one defined in such a way that a Redundancy RTP Stream has a one to one
relation with a Source RTP Stream. In fact, the RFC requires the relation with a Source RTP Stream. In fact, the RFC requires the
Redundancy RTP Stream to use the same SSRC as the Source RTP Stream. Redundancy RTP Stream to use the same SSRC as the Source RTP Stream.
This requires to either use a separate RTP session or to use the This requires to either use a separate RTP Session or to use the
Redundancy RTP Payload format [RFC2198]. The underlying relation Redundancy RTP Payload format [RFC2198]. The underlying relation
requirement for this FEC format and a particular Redundancy RTP requirement for this FEC format and a particular Redundancy RTP
Stream is to know the related Source RTP Stream, including its SSRC. Stream is to know the related Source RTP Stream, including its SSRC.
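The need to know which source packets a repair packet protects can be illustrated with a simplified XOR parity sketch. This is not the [RFC5109] packet format: headers, protection masks and length recovery are omitted, the length of the lost payload is assumed to be known, and the function names are hypothetical.

```python
def xor_parity(payloads):
    """XOR all payloads together; shorter payloads are zero-padded."""
    parity = bytearray(max(len(p) for p in payloads))
    for payload in payloads:
        for i, byte in enumerate(payload):
            parity[i] ^= byte
    return bytes(parity)


def recover_missing(received, parity, missing_length):
    """Rebuild the single lost payload of a FEC group.

    received must hold every other payload that went into the parity;
    this is why the receiver has to know the FEC group membership.
    """
    rebuilt = bytearray(parity)
    for payload in received:
        for i, byte in enumerate(payload):
            rebuilt[i] ^= byte
    return bytes(rebuilt[:missing_length])


source_payloads = [b"abc", b"de", b"fghi"]
parity = xor_parity(source_payloads)
# Assume the second payload was lost in transit.
print(recover_missing([source_payloads[0], source_payloads[2]], parity, 2))
```

If a payload outside the FEC group were fed to recover_missing, the result would be wrong, which is the reason the relation between Redundancy RTP Streams and Source RTP Streams has to be known.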
3.13. RTP Stream Separation 3.13. RTP Stream Separation
RTP Streams can be separated exclusively based on their SSRCs, at the RTP Streams can be separated exclusively based on their SSRCs, at the
RTP Session level, or at the Multimedia Session level. RTP Session level, or at the Multimedia Session level.
When the RTP Streams that have a relationship are all sent in the When the RTP Streams that have a relationship are all sent in the
same RTP Session and are uniquely identified based on their SSRC same RTP Session and are uniquely identified based on their SSRC
only, it is termed an SSRC-Only Based Separation. Such streams can only, it is termed an SSRC-Only Based Separation. Such streams can
be related via RTCP CNAME to identify that the streams belong to the be related via RTCP CNAME to identify that the streams belong to the
same End Point. SSRC-based approaches [RFC5576], when used, can same Endpoint. SSRC-based approaches [RFC5576], when used, can
explicitly relate various such RTP Streams. explicitly relate various such RTP Streams.
On the other hand, when RTP Streams that are related are sent in On the other hand, when RTP Streams that are related are sent in
the context of different RTP Sessions to achieve separation, it is the context of different RTP Sessions to achieve separation, it is
known as RTP Session-based separation. This is commonly used when known as RTP Session-based separation. This is commonly used when
the different RTP Streams are intended for different Media the different RTP Streams are intended for different Media
Transports. Transports.
Several mechanisms that use RTP Session-based separation rely on it Several mechanisms that use RTP Session-based separation rely on it
to enable an implicit grouping mechanism expressing the relationship. to enable an implicit grouping mechanism expressing the relationship.
skipping to change at page 32, line 44 skipping to change at page 31, line 44
level relations have been established using semantics from Grouping level relations have been established using semantics from Grouping
of Media lines framework [RFC5888]. Examples of this are RTP of Media lines framework [RFC5888]. Examples of this are RTP
Retransmission [RFC4588], SVC Multi-Session Transmission [RFC6190] Retransmission [RFC4588], SVC Multi-Session Transmission [RFC6190]
and XOR Based FEC [RFC5109]. RTCP CNAME explicitly relates RTP and XOR Based FEC [RFC5109]. RTCP CNAME explicitly relates RTP
Streams across different RTP Sessions, as explained in the previous Streams across different RTP Sessions, as explained in the previous
section. Such a relationship can be used to perform inter-media section. Such a relationship can be used to perform inter-media
synchronization. synchronization.
RTP Streams that are related and need to be associated can be part of RTP Streams that are related and need to be associated can be part of
different Multimedia Sessions, rather than just different RTP different Multimedia Sessions, rather than just different RTP
sessions within the same Multimedia Session context. This puts Sessions within the same Multimedia Session context. This puts
further demand on the scope of the mechanism(s) and its handling of further demand on the scope of the mechanism(s) and its handling of
identifiers used for expressing the relationships. identifiers used for expressing the relationships.
3.14. Multiple RTP Sessions over one Media Transport 3.14. Multiple RTP Sessions over one Media Transport
[I-D.westerlund-avtcore-transport-multiplexing] describes a mechanism [I-D.westerlund-avtcore-transport-multiplexing] describes a mechanism
that allows several RTP Sessions to be carried over a single that allows several RTP Sessions to be carried over a single
underlying Media Transport. The main reasons for doing this are underlying Media Transport. The main reasons for doing this are
related to the impact of using one or more Media Transports (using a related to the impact of using one or more Media Transports (using a
common network path or potentially have different ones). The fewer common network path or potentially have different ones). The fewer
skipping to change at page 34, line 11 skipping to change at page 33, line 11
Describes an Encoded Stream (Section 2.1.7) related to CLUE specific Describes an Encoded Stream (Section 2.1.7) related to CLUE specific
semantic information. semantic information.
4.1.4. Capture Scene 4.1.4. Capture Scene
Describes a set of spatially related Media Sources (Section 2.1.4). Describes a set of spatially related Media Sources (Section 2.1.4).
4.1.5. Endpoint 4.1.5. Endpoint
Describes exactly one Participant (Section 2.2.3) and one or more End Describes exactly one Participant (Section 2.2.3) and one or more
Points (Section 2.2.1). Endpoints (Section 2.2.1).
4.1.6. Individual Encoding 4.1.6. Individual Encoding
Describes the configuration information needed to perform a Media Describes the configuration information needed to perform a Media
Encoder (Section 2.1.6) transformation. Encoder (Section 2.1.6) transformation.
4.1.7. Media Capture 4.1.7. Media Capture
Describes either a Media Capture (Section 2.1.2) or a Media Source Describes either a Media Capture (Section 2.1.2) or a Media Source
(Section 2.1.4), depending on in which context the term is used. (Section 2.1.4), depending on in which context the term is used.
4.1.8. Media Consumer 4.1.8. Media Consumer
Describes the media receiving part of an End Point (Section 2.2.1). Describes the media receiving part of an Endpoint (Section 2.2.1).
4.1.9. Media Provider 4.1.9. Media Provider
Describes the media sending part of an End Point (Section 2.2.1). Describes the media sending part of an Endpoint (Section 2.2.1).
4.1.10. Stream 4.1.10. Stream
Describes an RTP Stream (Section 2.1.10). Describes an RTP Stream (Section 2.1.10).
4.1.11. Video Capture 4.1.11. Video Capture
Describes a video Media Source (Section 2.1.4). Describes a video Media Source (Section 2.1.4).
4.2. Media Description 4.2. Media Description
skipping to change at page 35, line 22 skipping to change at page 34, line 22
stream of (RTP) packets interchangeably, which are all RTP Streams. stream of (RTP) packets interchangeably, which are all RTP Streams.
4.4. Multimedia Conference 4.4. Multimedia Conference
A Multimedia Conference is a Communication Session (Section 2.2.5) A Multimedia Conference is a Communication Session (Section 2.2.5)
between two or more Participants (Section 2.2.3), along with the between two or more Participants (Section 2.2.3), along with the
software they are using to communicate. software they are using to communicate.
4.5. Multimedia Session 4.5. Multimedia Session
SDP [RFC4566] defines a multimedia session as a set of multimedia SDP [RFC4566] defines a Multimedia Session as a set of multimedia
senders and receivers and the data streams flowing from senders to senders and receivers and the data streams flowing from senders to
receivers, which would correspond to a set of End Points and the RTP receivers, which would correspond to a set of Endpoints and the RTP
Streams that flow between them. In this memo, Multimedia Session Streams that flow between them. In this memo, Multimedia Session
(Section 2.2.4) also assumes those End Points belong to a set of (Section 2.2.4) also assumes those Endpoints belong to a set of
Participants that are engaged in communication via a set of related Participants that are engaged in communication via a set of related
RTP Streams. RTP Streams.
RTP [RFC3550] defines a multimedia session as a set of concurrent RTP RTP [RFC3550] defines a Multimedia Session as a set of concurrent RTP
Sessions among a common group of participants. For example, a video Sessions among a common group of Participants. For example, a video
conference may contain an audio RTP Session and a video RTP Session. conference may contain an audio RTP Session and a video RTP Session.
This would correspond to a group of Participants (each using one or This would correspond to a group of Participants (each using one or
more End Points) sharing a set of concurrent RTP Sessions. In this more Endpoints) sharing a set of concurrent RTP Sessions. In this
memo, Multimedia Session also defines those RTP Sessions to have some memo, Multimedia Session also defines those RTP Sessions to have some
relation and be part of a communication among the Participants. relation and be part of a communication among the Participants.
4.6. Multipoint Control Unit (MCU) 4.6. Multipoint Control Unit (MCU)
This term is commonly used to describe the central node in any type This term is commonly used to describe the central node in any type
of star topology [I-D.ietf-avtcore-rtp-topologies-update] conference. of star topology [I-D.ietf-avtcore-rtp-topologies-update] conference.
It describes a device that includes one Participant (Section 2.2.3) It describes a device that includes one Participant (Section 2.2.3)
(usually corresponding to a so-called conference focus) and one or (usually corresponding to a so-called conference focus) and one or
more related End Points (Section 2.2.1) (sometimes one or more per more related Endpoints (Section 2.2.1) (sometimes one or more per
conference participant). conference Participant).
4.7. Recording Device 4.7. Recording Device
WebRTC specifications use this term to refer to locally available WebRTC specifications use this term to refer to locally available
entities performing a Media Capture (Section 2.1.2) transformation. entities performing a Media Capture (Section 2.1.2) transformation.
4.8. RtcMediaStream 4.8. RtcMediaStream
A WebRTC RtcMediaStream is a set of Media Sources A WebRTC RtcMediaStream is a set of Media Sources
(Section 2.1.4) sharing the same Synchronization Context (Section 2.1.4) sharing the same Synchronization Context
skipping to change at page 36, line 22 skipping to change at page 35, line 22
A WebRTC RtcMediaStreamTrack is a Media Source (Section 2.1.4). A WebRTC RtcMediaStreamTrack is a Media Source (Section 2.1.4).
4.10. RTP Sender 4.10. RTP Sender
RTP [RFC3550] uses this term, which can be seen as the RTP protocol RTP [RFC3550] uses this term, which can be seen as the RTP protocol
part of a Media Packetizer (Section 2.1.9). part of a Media Packetizer (Section 2.1.9).
4.11. RTP Session 4.11. RTP Session
Within the context of SDP, a single m=line can map to a single RTP Within the context of SDP, a single m= line can map to a single RTP
Session or multiple m=lines can map to a single RTP Session. The Session (Section 2.2.2) or multiple m= lines can map to a single RTP
latter is enabled via multiplexing schemes such as BUNDLE Session. The latter is enabled via multiplexing schemes such as
[I-D.ietf-mmusic-sdp-bundle-negotiation], for example, which allows BUNDLE [I-D.ietf-mmusic-sdp-bundle-negotiation], for example, which
mapping of multiple m=lines to a single RTP Session. allows mapping of multiple m= lines to a single RTP Session.
Editor's note: Consider if the contents of Section 2.2.2 should be
moved here, or if this section should be kept and refer to the
above.
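The m= line to RTP Session mapping described in Section 4.11 can be sketched as follows. This is a minimal illustration, not part of the draft: the function name, the example SDP, and the simplification that every m= line outside a BUNDLE group gets its own RTP Session are all assumptions made for this sketch. It only reads "a=group:BUNDLE" and "a=mid:" lines and ignores everything else a real SDP parser would need to handle.

```python
# Hypothetical example SDP: two bundled m= lines and one unbundled m= line.
EXAMPLE_SDP = """\
v=0
o=- 0 0 IN IP4 192.0.2.1
s=-
t=0 0
a=group:BUNDLE audio video
m=audio 49170 RTP/AVP 0
a=mid:audio
m=video 49170 RTP/AVP 96
a=mid:video
m=video 49172 RTP/AVP 97
a=mid:slides
"""


def map_mlines_to_rtp_sessions(sdp):
    """Return {mid: session_index}; m= lines in one BUNDLE group share an index.

    Assumption for this sketch: an m= line that is not in any BUNDLE
    group maps to its own RTP Session.
    """
    bundle_groups = []  # each entry is the list of mids in one BUNDLE group
    mids = []           # one entry per m= line, in order of appearance
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("a=group:BUNDLE"):
            bundle_groups.append(line.split()[1:])
        elif line.startswith("m="):
            mids.append(None)  # mid not seen yet for this m= line
        elif line.startswith("a=mid:") and mids:
            mids[-1] = line[len("a=mid:"):]

    sessions = {}
    group_session = {}  # BUNDLE group position -> RTP Session index
    next_index = 0
    for position, mid in enumerate(mids):
        key = mid if mid is not None else "#%d" % position
        group = next(
            (i for i, g in enumerate(bundle_groups) if mid in g), None
        )
        if group is None:
            # Unbundled m= line: one m= line, one RTP Session.
            sessions[key] = next_index
            next_index += 1
        else:
            # Bundled m= lines: multiple m= lines, one RTP Session.
            if group not in group_session:
                group_session[group] = next_index
                next_index += 1
            sessions[key] = group_session[group]
    return sessions
```

For the example SDP above, the "audio" and "video" m= lines map to the same RTP Session index, while "slides" maps to a separate one.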
4.12. SSRC 4.12. SSRC
RTP [RFC3550] defines this as "the source of a stream of RTP RTP [RFC3550] defines this as "the source of a stream of RTP
packets", which indicates that an SSRC is not only a unique packets", which indicates that an SSRC is not only a unique
identifier for the Encoded Stream (Section 2.1.7) carried in those identifier for the Encoded Stream (Section 2.1.7) carried in those
packets, but is also effectively used as a term to denote a Media packets, but is also effectively used as a term to denote a Media
Packetizer (Section 2.1.9). Packetizer (Section 2.1.9).
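As a concrete illustration of the SSRC as an identifier carried in RTP packets, the sketch below reads the SSRC field from the 12-byte fixed RTP header defined in RFC 3550. The function name and the sample packet are hypothetical and chosen only for this example. The sketch does not handle CSRC lists, header extensions, or RTCP.

```python
import struct


def parse_rtp_ssrc(packet):
    """Return the 32-bit SSRC from the fixed RTP header of `packet`."""
    if len(packet) < 12:
        raise ValueError("RTP fixed header is 12 bytes")
    # Fixed header layout (network byte order):
    #   1 byte  V/P/X/CC, 1 byte M/PT, 2 bytes sequence number,
    #   4 bytes timestamp, 4 bytes SSRC
    first, _m_pt, _seq, _timestamp, ssrc = struct.unpack(
        "!BBHII", packet[:12]
    )
    if first >> 6 != 2:
        raise ValueError("not RTP version 2")
    return ssrc


# Hypothetical packet: version 2, payload type 96, arbitrary payload bytes.
sample_packet = struct.pack("!BBHII", 0x80, 96, 1, 1000, 0x12345678) + b"data"
sample_ssrc = parse_rtp_ssrc(sample_packet)
```

All packets of one RTP Stream carry the same SSRC value, which is why the value doubles as a handle for the Media Packetizer that produced them.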
5. Security Considerations 5. Security Considerations
skipping to change at page 37, line 46 skipping to change at page 36, line 46
October 2014. October 2014.
[I-D.ietf-avtcore-rtp-topologies-update] [I-D.ietf-avtcore-rtp-topologies-update]
Westerlund, M. and S. Wenger, "RTP Topologies", draft- Westerlund, M. and S. Wenger, "RTP Topologies", draft-
ietf-avtcore-rtp-topologies-update-05 (work in progress), ietf-avtcore-rtp-topologies-update-05 (work in progress),
November 2014. November 2014.
[I-D.ietf-clue-framework] [I-D.ietf-clue-framework]
Duckworth, M., Pepperell, A., and S. Wenger, "Framework Duckworth, M., Pepperell, A., and S. Wenger, "Framework
for Telepresence Multi-Streams", draft-ietf-clue- for Telepresence Multi-Streams", draft-ietf-clue-
framework-18 (work in progress), October 2014. framework-19 (work in progress), December 2014.
[I-D.ietf-mmusic-sdp-bundle-negotiation] [I-D.ietf-mmusic-sdp-bundle-negotiation]
Holmberg, C., Alvestrand, H., and C. Jennings, Holmberg, C., Alvestrand, H., and C. Jennings,
"Negotiating Media Multiplexing Using the Session "Negotiating Media Multiplexing Using the Session
Description Protocol (SDP)", draft-ietf-mmusic-sdp-bundle- Description Protocol (SDP)", draft-ietf-mmusic-sdp-bundle-
negotiation-12 (work in progress), October 2014. negotiation-14 (work in progress), December 2014.
[I-D.ietf-rtcweb-overview] [I-D.ietf-rtcweb-overview]
Alvestrand, H., "Overview: Real Time Protocols for Alvestrand, H., "Overview: Real Time Protocols for
Browser-based Applications", draft-ietf-rtcweb-overview-12 Browser-based Applications", draft-ietf-rtcweb-overview-13
(work in progress), October 2014. (work in progress), November 2014.
[I-D.westerlund-avtcore-transport-multiplexing] [I-D.westerlund-avtcore-transport-multiplexing]
Westerlund, M. and C. Perkins, "Multiplexing Multiple RTP Westerlund, M. and C. Perkins, "Multiplexing Multiple RTP
Sessions onto a Single Lower-Layer Transport", draft- Sessions onto a Single Lower-Layer Transport", draft-
westerlund-avtcore-transport-multiplexing-07 (work in westerlund-avtcore-transport-multiplexing-07 (work in
progress), October 2013. progress), October 2013.
[RFC2198] Perkins, C., Kouvelas, I., Hodson, O., Hardman, V., [RFC2198] Perkins, C., Kouvelas, I., Hodson, O., Hardman, V.,
Handley, M., Bolot, J., Vega-Garcia, A., and S. Fosse- Handley, M., Bolot, J., Vega-Garcia, A., and S. Fosse-
Parisis, "RTP Payload for Redundant Audio Data", RFC 2198, Parisis, "RTP Payload for Redundant Audio Data", RFC 2198,
skipping to change at page 39, line 31 skipping to change at page 38, line 31
7198, April 2014. 7198, April 2014.
[RFC7273] Williams, A., Gross, K., van Brandenburg, R., and H. [RFC7273] Williams, A., Gross, K., van Brandenburg, R., and H.
Stokking, "RTP Clock Source Signalling", RFC 7273, June Stokking, "RTP Clock Source Signalling", RFC 7273, June
2014. 2014.
Appendix A. Changes From Earlier Versions Appendix A. Changes From Earlier Versions
NOTE TO RFC EDITOR: Please remove this section prior to publication. NOTE TO RFC EDITOR: Please remove this section prior to publication.
A.1. Modifications Between WG Version -02 and -03 A.1. Modifications Between WG Version -03 and -04
o Changed "Media Redundancy" and "Media Repair" to "RTP-based
Redundancy" and "RTP-based Repair", since those terms are more
specific and correct.
o Changed "End Point" to "Endpoint" and removed Editor's Note on
this.
o Clarified that a Media Capture may impose constraints on clock
handling.
o Clarified that mixing multiple Raw Streams into a Source Stream is
not possible, since that requires mixed streams to have a timing
relation, requiring them to be Source Streams, and added an
example.
o Clarified that RTP-based Redundancy excludes the type of encoding
redundancy found within the encoded media format in an Encoded
Stream.
o Clarified that a Media Transport contains only a single RTP
Session, but a single RTP Session can span multiple Media
Transports.
o Clarified that packets with seemingly correct checksum that are
received by a Media Transport Receiver may still be corrupt.
o Clarified that a corrupt packet in a Media Transport Receiver is
typically either discarded or somehow marked and passed on in the
Received RTP Stream.
o Added Synchronization Context to Figure 6.
o Editorial improvements and clarifications.
A.2. Modifications Between WG Version -02 and -03
o Changed section 3.5, removing SST-SS/MS and MST-SS/MS, replacing o Changed section 3.5, removing SST-SS/MS and MST-SS/MS, replacing
them with SRST, MRST, and MRMT. them with SRST, MRST, and MRMT.
o Updated section 3.8 to align with terminology changes in section o Updated section 3.8 to align with terminology changes in section
3.5. 3.5.
o Added a new section 4.12, describing the term Multimedia o Added a new section 4.12, describing the term Multimedia
Conference. Conference.
o Changed reference from I-D to now published RFC 7273. o Changed reference from I-D to now published RFC 7273.
o Editorial improvements and clarifications. o Editorial improvements and clarifications.
A.2. Modifications Between WG Version -01 and -02 A.3. Modifications Between WG Version -01 and -02
o Major re-structure o Major re-structure
o Moved media chain Media Transport detailing up one section level o Moved media chain Media Transport detailing up one section level
o Collapsed level 2 sub-sections of section 3 and thus moved level 3 o Collapsed level 2 sub-sections of section 3 and thus moved level 3
sub-sections up one level, gathering some introductory text into sub-sections up one level, gathering some introductory text into
the beginning of section 3 the beginning of section 3
o Added that not only SSRC collision, but also a clock rate change o Added that not only SSRC collision, but also a clock rate change
[RFC7160] is a valid reason to change SSRC value for an RTP stream [RFC7160] is a valid reason to change SSRC value for an RTP stream
o Added a sub-section on clock source signaling o Added a sub-section on clock source signaling
o Added a sub-section on RTP stream duplication o Added a sub-section on RTP stream duplication
skipping to change at page 40, line 42 skipping to change at page 40, line 31
section per term, mainly by moving text from sections 2 and 3 section per term, mainly by moving text from sections 2 and 3
o Changed all occurrences of Packet Stream to RTP Stream o Changed all occurrences of Packet Stream to RTP Stream
o Moved all normative references to informative, since this is an o Moved all normative references to informative, since this is an
informative document informative document
o Added references to RFC 7160, RFC 7197 and RFC 7198, and removed o Added references to RFC 7160, RFC 7197 and RFC 7198, and removed
unused references unused references
A.3. Modifications Between WG Version -00 and -01 A.4. Modifications Between WG Version -00 and -01
o WG version -00 text is identical to individual draft -03 o WG version -00 text is identical to individual draft -03
o Amended description of SVC SST and MST encodings with respect to o Amended description of SVC SST and MST encodings with respect to
concepts defined in this text concepts defined in this text
o Removed UML as normative reference, since the text no longer uses o Removed UML as normative reference, since the text no longer uses
any UML notation any UML notation
o Removed a number of level 4 sections and moved out text to the o Removed a number of level 4 sections and moved out text to the
level above level above
A.4. Modifications Between Version -02 and -03 A.5. Modifications Between Version -02 and -03
o Section 4 rewritten (and new communication topologies added) to o Section 4 rewritten (and new communication topologies added) to
reflect the major updates to Sections 1-3 reflect the major updates to Sections 1-3
o Section 8 removed (carryover from initial -00 draft) o Section 8 removed (carryover from initial -00 draft)
o General clean up of text, grammar and nits o General clean up of text, grammar and nits
A.5. Modifications Between Version -01 and -02 A.6. Modifications Between Version -01 and -02
o Section 2 rewritten to add both streams and transformations in the o Section 2 rewritten to add both streams and transformations in the
media chain. media chain.
o Section 3 rewritten to focus on exposing relationships. o Section 3 rewritten to focus on exposing relationships.
A.6. Modifications Between Version -00 and -01 A.7. Modifications Between Version -00 and -01
o Too many to list o Too many to list
o Added new authors o Added new authors
o Updated content organization and presentation o Updated content organization and presentation
Authors' Addresses Authors' Addresses
Jonathan Lennox Jonathan Lennox
skipping to change at page 42, line 23 skipping to change at page 42, line 15
Cisco Systems Cisco Systems
7200-12 Kit Creek Road 7200-12 Kit Creek Road
Research Triangle Park, NC 27709 Research Triangle Park, NC 27709
US US
Email: gsalguei@cisco.com Email: gsalguei@cisco.com
Bo Burman Bo Burman
Ericsson Ericsson
Kistavagen 25 Kistavagen 25
SE-164 80 Kista SE-164 80 Stockholm
Sweden Sweden
Phone: +46 10 714 13 11 Phone: +46 10 714 13 11
Email: bo.burman@ericsson.com Email: bo.burman@ericsson.com
 End of changes. 114 change blocks. 
338 lines changed or deleted 371 lines changed or added