--- 1/draft-ietf-avtext-splicing-for-rtp-09.txt 2012-10-11 10:14:16.225425908 +0200 +++ 2/draft-ietf-avtext-splicing-for-rtp-10.txt 2012-10-11 10:14:16.261426294 +0200 @@ -1,18 +1,18 @@ AVTEXT Working Group J. Xia Internet-Draft Huawei -Intended status: Informational August 13, 2012 -Expires: February 14, 2013 +Intended status: Informational October 10, 2012 +Expires: April 12, 2013 Content Splicing for RTP Sessions - draft-ietf-avtext-splicing-for-rtp-09 + draft-ietf-avtext-splicing-for-rtp-10 Abstract Content splicing is a process that replaces the content of a main multimedia stream with other multimedia content, and delivers the substitutive multimedia content to the receivers for a period of time. Splicing is commonly used for local advertisement insertion by cable operators, replacing a national advertisement content with a local advertisement. @@ -29,21 +29,21 @@ Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." - This Internet-Draft will expire on February 14, 2013. + This Internet-Draft will expire on April 12, 2013. Copyright Notice Copyright (c) 2012 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents @@ -54,23 +54,24 @@ described in the Simplified BSD License. Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 2. 
System Model and Terminology . . . . . . . . . . . . . . . . . 3 3. Requirements for RTP Splicing . . . . . . . . . . . . . . . . 6 4. Content Splicing for RTP sessions . . . . . . . . . . . . . . 7 4.1. RTP Processing in RTP Mixer . . . . . . . . . . . . . . . 7 4.2. RTCP Processing in RTP Mixer . . . . . . . . . . . . . . . 8 - 4.3. Media Clipping Considerations . . . . . . . . . . . . . . 10 + 4.3. Considerations for Handling Media Clipping at the RTP + Layer . . . . . . . . . . . . . . . . . . . . . . . . . . 10 4.4. Congestion Control Considerations . . . . . . . . . . . . 11 - 4.5. Processing Splicing in User Invisibility Case . . . . . . 12 + 4.5. Considerations for Implementing Undetectable Splicing . . 12 5. Implementation Considerations . . . . . . . . . . . . . . . . 13 6. Security Considerations . . . . . . . . . . . . . . . . . . . 13 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 14 8. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 14 9. 10. Appendix- Why Mixer Is Chosen . . . . . . . . . . . . . . 14 10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 15 10.1. Normative References . . . . . . . . . . . . . . . . . . . 15 10.2. Informative References . . . . . . . . . . . . . . . . . . 15 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 16 @@ -262,33 +263,31 @@ sender to the receiver. REQ-6: The splicer must not affect other RTP sessions running between the RTP sender and the RTP receiver, and must be transparent for the RTP sessions it does not splice. REQ-7: - The splicer should be able to modify the RTP stream across a - splicing in or splicing out point such that the splicing point is - not easy to be detected in the RTP stream. For the advertisement - insertion use case, it is important to make it difficult for the - receiver to detect where an advertisement insertion is starting or - ending from the RTP packets. 
Ensuring the splicing point is not - visible in the media content may be easy with some codecs, but - extremely difficult with others; in the worst case, the splicer - may need to perform full media transcoding if it has to hide the - splicing point in the media content. This memo only focuses on - making the splicing invisible at the RTP layer. How (or if) the - splicing is made invisible in the media stream is outside the - scope of this memo. + The splicer should be able to modify the RTP stream such that the + splicing point is not easy to be detected by the RTP receiver at + the RTP layer. For the advertisement insertion use case, it is + important to make it difficult for the RTP receiver to detect + where an advertisement insertion is starting or ending from the + RTP packets, and thus avoiding the RTP receiver from filtering out + the advertisement content. This memo only focuses on making the + splicing undetectable at the RTP layer. How (or if) the splicing + is made undetectable in the media stream is outside the scope of + this memo. The corresponding processing is depicted in section + 4.5. 4. Content Splicing for RTP sessions The RTP specification [RFC3550] defines two types of middlebox: RTP translators and RTP mixers. Splicing is best viewed as a mixing operation. The splicer generates a new RTP stream that is a mix of the main RTP stream and the substitutive RTP stream. An RTP mixer is therefore an appropriate model for a content splicer. In next four subsections (from subsection 4.1 to subsection 4.4), the document analyzes how the mixer handles RTP splicing and how it satisfies the @@ -296,27 +295,27 @@ document looks at REQ-7 in order to hide the fact that splicing take place. 4.1. RTP Processing in RTP Mixer A splicer could be implemented as a mixer that receives the main RTP stream and the substitutive content (possibly via a substitutive RTP stream), and sends a single output RTP stream to the receiver(s). 
That output RTP stream will contain either the main content or the substitutive content. The output RTP stream will come from the - mixer, and will have the SSRC of the mixer rather than the main RTP - sender or the substitutive RTP sender. + mixer, and will have the synchronization source (SSRC) of the mixer + rather than the main RTP sender or the substitutive RTP sender. The mixer uses its own SSRC, sequence number space and timing model when generating the output stream. Moreover, the mixer may insert - the SSRC of main RTP stream into CSRC list in the output media - stream. + the SSRC of main RTP stream into contributing source (CSRC) list in + the output media stream. At the splicing in point, when the substitutive content becomes active, the mixer chooses the substitutive RTP stream as input stream at splicing in point, and extracts the payload data (i.e., substitutive content). If the substitutive content comes from local media file storage, the mixer directly fetches the substitutive content. After that, the mixer encapsulates substitutive content instead of main content as the payload of the output media stream, and then sends the output RTP media stream to receiver. The mixer may insert the SSRC of substitutive RTP stream into CSRC list in the @@ -419,28 +418,33 @@ packet into two separate feedback packets and process the information in the feedback control information (FCI) in the two feedback packets, just as the RTCP report process described above. If the substitutive content comes from local media file storage (i.e., the mixer can be regarded as the substitutive RTP sender), any RTCP packets received from downstream relate to the substitutive content must be terminated on the mixer without any further processing. -4.3. Media Clipping Considerations +4.3. Considerations for Handling Media Clipping at the RTP Layer This section provides informative guideline about how media clipping - is shaped and how the mixer deal with the media clipping. 
+ is shaped and how the mixer deal with the media clipping only at the + RTP layer. Dealing with the media clipping at the RTP layer just do + a good quality implementation, perfectly erasing the media clipping + needs more considerations in the higher layers, how to realize it is + outside of the scope of this memo. If the time slot for substitutive content mismatches (is shorter or longer than) the duration of the main content to be replaced, then - media clipping may occur at the splicing point. + media clipping may occur at the splicing point and thus impact the + user's experience. If the substitutive content has shorter duration from the main content, then there will be a gap in the output RTP stream. The RTP sequence number will be contiguous across this gap, but there will be an unexpected jump in the RTP timestamp. This gap will cause the receiver to have nothing to play. This is unavoidable, unless the mixer adjusts the splice in or splice out point to compensate, sending more of the main RTP stream in place of the shorter substitutive stream, or unless the mixer can vary the length of the substitutive content. It is the responsibility of the higher layer @@ -530,54 +534,54 @@ RTP sender. From above analysis, to reduce the risk of congestion and remain the bandwidth consumption stable over time, the substitutive RTP stream is recommended to be encoded at an appropriate bitrate to match that of main RTP stream. If the substitutive RTP stream comes from the substitutive RTP sender, this sender had better has some knowledge about the media encoding bitrate of main content in advance. How it knows that is out of scope in this draft. -4.5. Processing Splicing in User Invisibility Case +4.5. 
Considerations for Implementing Undetectable Splicing If it is desirable to prevent receivers from detecting that splicing is occurring at the RTP layer, the mixer must not include a CSRC list in outgoing RTP packets, and must not forward RTCP messages from the main RTP sender or from the substitutive RTP sender. Due to the absence of CSRC list in the output RTP stream, the RTP receiver only initiates SDES, BYE and APP packets to the mixer without any knowledge of the main RTP sender and the substitutive RTP sender. CSRC list identifies the contributing sources, these SSRC identifiers of contributing sources are kept globally unique for each RTP session. The uniqueness of SSRC identifier is used to resolve collisions and detecting RTP-level forwarding loops as defined in section 8.2 of [RFC3550]. The absence of CSRC list in this case will create a danger that loops involving those contributing sources could - not be detected. The Loops could occur if either the mixer is + not be detected. The loops could occur if either the mixer is misconfigured to form a loop, or a second mixer/translator is added, - causing packets to loop back to upstream of the original mixer. So - Non-RTP means must be used to detect and resolve loops if the mixer - does not add a CSRC list. + causing packets to loop back to upstream of the original mixer and + hence wasting the network bandwidth. So Non-RTP means must be used + to detect and resolve loops if the mixer does not add a CSRC list. 5. Implementation Considerations When the mixer is used to handle RTP splicing, RTP receiver does not need any RTP/RTCP extension for splicing. As a trade-off, additional overhead could be induced on the mixer which uses its own sequence number space and timing model. So the mixer will rewrite RTP sequence number and timestamp whatever splicing is active or not, and generate RTCP flows for both sides. 
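The rewriting overhead mentioned above can be illustrated with a minimal sketch. It assumes that input packets arrive in order, that the main and substitutive streams share one RTP clock rate, and that the caller supplies the timestamp step to use at a splicing point. The names RtpHeader, MixerRewriter and gap_on_switch are hypothetical and do not come from the draft.

```python
from dataclasses import dataclass

SEQ_MOD = 1 << 16   # RTP sequence numbers are 16 bits wide
TS_MOD = 1 << 32    # RTP timestamps are 32 bits wide


@dataclass
class RtpHeader:
    """Only the header fields this sketch needs (hypothetical type)."""
    ssrc: int
    seq: int
    timestamp: int


class MixerRewriter:
    """Maps input headers into the mixer's own SSRC, sequence and timing space."""

    def __init__(self, mixer_ssrc, first_seq=0, first_ts=0):
        self.mixer_ssrc = mixer_ssrc
        self.next_seq = first_seq % SEQ_MOD
        self.out_ts = first_ts % TS_MOD
        self.last_in = None  # (ssrc, timestamp) of the previous input packet

    def rewrite(self, pkt, gap_on_switch=0):
        """Return the output header for one input packet.

        gap_on_switch is the timestamp step (in clock ticks) applied when the
        input source changes, i.e. at a splicing in or splicing out point.
        """
        if self.last_in is not None:
            last_ssrc, last_ts = self.last_in
            if pkt.ssrc == last_ssrc:
                # Same input stream: carry over its timestamp increment.
                delta = (pkt.timestamp - last_ts) % TS_MOD
            else:
                # Input switched: the two timelines are unrelated, so use
                # the caller-supplied step instead of the raw difference.
                delta = gap_on_switch
            self.out_ts = (self.out_ts + delta) % TS_MOD
        self.last_in = (pkt.ssrc, pkt.timestamp)

        out = RtpHeader(self.mixer_ssrc, self.next_seq, self.out_ts)
        # The output sequence number advances by one per packet sent,
        # regardless of which input stream the payload came from.
        self.next_seq = (self.next_seq + 1) % SEQ_MOD
        return out
```

Because every output packet passes through rewrite() whether or not splicing is active, the per-packet cost is paid for the whole lifetime of the session.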
In case the mixer serves multiple main RTP streams simultaneously, this may lead to more overhead on the mixer. - If User Invisibility Requirement is required, CSRC list is not + If undetectable splicing requirement is required, CSRC list is not included in outgoing RTP packet, this brings a potential issue with loop detection as briefly described in section 4.5. 6. Security Considerations The splicing application is subject to the general security considerations of the RTP specification [RFC3550]. The mixer acting as splicer replaces some content with other content in RTP packets, thus breaking any RTP level end-to-end security, such @@ -589,22 +593,22 @@ security services, the splicer can modify and re-protect the RTP packets without enabling the receiver to detect if the data comes from the original source or from the splicer. Security goals to have source authentication all the way from the RTP main sender to the receiver through the splicer is not possible with splicing. The nature of this RTP service offered by a network operator employing a content splicer is that the RTP layer security relationship is between the receiver and the splicer, and between the senders and the splicer, are not end-to-end. This appears to - invalidate the invisibility goal, but in the common case the receiver - will consider the splicer as the main media source. + invalidate the undetectability goal, but in the common case the + receiver will consider the splicer as the main media source. Commonly no RTP level security mechanism is employed. Instead only payload security mechanisms (e.g., ISMACryp [ISMACryp]) are used. If any payload internal security mechanisms are used, only the RTP sender and the RTP receiver can learn the security keying material generated by such internal security mechanism, in which case, any middlebox (e.g., splicer) between the RTP sender and the RTP receiver can't get such keying material, and thus fail to perform splicing. 7. IANA Considerations