--- 1/draft-ietf-avtext-splicing-notification-02.txt 2015-11-26 19:15:06.897924346 -0800 +++ 2/draft-ietf-avtext-splicing-notification-03.txt 2015-11-26 19:15:06.933925214 -0800 @@ -1,21 +1,21 @@ AVTEXT Working Group J. Xia INTERNET-DRAFT R. Even Intended Status: Standards Track R. Huang -Expires: October 31, 2015 Huawei +Expires: May 30, 2016 Huawei L. Deng China Mobile - April 29, 2015 + November 27, 2015 RTP/RTCP extension for RTP Splicing Notification - draft-ietf-avtext-splicing-notification-02 + draft-ietf-avtext-splicing-notification-03 Abstract Content splicing is a process that replaces the content of a main multimedia stream with other multimedia content, and delivers the substitutive multimedia content to the receivers for a period of time. The splicer is designed to handle RTP splicing and needs to know when to start and end the splicing. This memo defines two RTP/RTCP extensions to indicate the splicing @@ -84,175 +84,184 @@ 10.1 Normative References . . . . . . . . . . . . . . . . . . . 15 10.2 Informative References . . . . . . . . . . . . . . . . . . 15 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 16 1 Introduction Splicing is a process that replaces some multimedia content with other multimedia content and delivers the substitutive multimedia content to the receivers for a period of time. In some predictable splicing cases, e.g., advertisement insertion, the splicing duration - MUST be inside of the specific, pre-designated time slot. Certain - timing information about when to start and end the splicing must be - first acquired by the splicer in order to start the splicing. This - document refers to this information as Splicing Interval. + needs to be inside of the specific, pre-designated time slot. + Certain timing information about when to start and end the splicing + must be first acquired by the splicer in order to start the splicing. + This document refers to this information as the Splicing Interval. 
[SCTE35] provides a method that encapsulates the Splicing Interval inside the MPEG2-TS layer in cable TV systems. But in the RTP - splicing scenario described in [RFC6828], the RTP mixer designed as - the splicer has to decode the RTP packets and search for the Splicing + splicing scenario described in [RFC6828], the mixer designed as the + splicer has to decode the RTP packets and search for the Splicing Interval inside the payloads. The need for such processing increases the workload of the mixer and limits the number of RTP sessions the mixer can support. The document defines an RTP header extension [RFC5285] used by the main RTP sender to provide the Splicing Interval by including it in the RTP packets. Nevertheless, the Splicing Interval conveyed in the RTP header - extension might not reach the mixer successfully, any splicing un- - aware middlebox on the path between the RTP sender and the mixer - might strip this RTP header extension. + extension might not reach the splicer successfully. Any splicing un- + aware middlebox on the path between the RTP sender might strip this + RTP header extension. To increase robustness against such case, the document also defines a - new RTCP packet type in a complementary fashion to carry the same - Splicing Interval to the mixer. + complementary RTCP packet type to carry the same Splicing Interval to + the splicer. 1.1 Terminology The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119]. The terminology defined in "Content Splicing for RTP Sessions" [RFC6828] applies to this document and in addition, we define: Splicing Interval: The NTP timestamps for the Splicing-In point and Splicing-Out - point per [RFC6828] allowing the mixer to know when to start and + point per [RFC6828] allowing the splicer to know when to start and end the RTP splicing. 
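As an illustration of the Splicing Interval term defined above, the following is a minimal Python sketch of a pair of 64-bit NTP-format timestamps (32-bit seconds, 32-bit fraction). The names `SplicingInterval`, `unix_to_ntp64` and `duration_seconds` are hypothetical and do not appear in the draft. The sketch assumes the sender's clock is available as Unix time and uses the usual 2208988800-second offset between the NTP epoch (1900) and the Unix epoch (1970).

```python
from dataclasses import dataclass

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_UNIX_OFFSET = 2208988800


def unix_to_ntp64(unix_seconds: int, microseconds: int = 0) -> int:
    """Convert Unix time to a 64-bit NTP-format timestamp (hypothetical helper)."""
    seconds = (unix_seconds + NTP_UNIX_OFFSET) & 0xFFFFFFFF
    # The fraction is expressed in units of 1/2^32 of a second.
    fraction = ((microseconds << 32) // 1_000_000) & 0xFFFFFFFF
    return (seconds << 32) | fraction


@dataclass(frozen=True)
class SplicingInterval:
    """Pair of 64-bit NTP timestamps: Splicing-In point and Splicing-Out point."""
    splice_in: int
    splice_out: int

    def __post_init__(self) -> None:
        # The draft assumes the splicing-out time is after the splicing-in time.
        if not (0 <= self.splice_in < self.splice_out < (1 << 64)):
            raise ValueError("splice_out must be later than splice_in")

    def duration_seconds(self) -> float:
        return (self.splice_out - self.splice_in) / (1 << 32)


# Usage: a 30-second slot starting at an arbitrary sample instant.
start = unix_to_ntp64(1448582400)
interval = SplicingInterval(start, start + (30 << 32))
print(interval.duration_seconds())
```

Integer arithmetic is used for the fraction so that the conversion does not depend on floating-point rounding.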
2 Overview of RTP Splicing Notification - According to [RFC6828], a mixer is designed to handle splicing on the - RTP layer at the reserved time slots set by the main RTP sender. This - implies that the mixer must first know the Splicing Interval from the - main RTP sender before it can start splicing. + A splicer is designed to handle splicing on the RTP layer at the + reserved time slots set by the main RTP sender. This implies that the + splicer must first know the Splicing Interval from the main RTP + sender before it can start splicing. The splicer can be a mixer as + described in [RFC6828]. - When a new splicing is forthcoming, the main RTP sender MUST send the - Splicing Interval to the mixer. Usually, the Splicing Interval SHOULD - be sent more than once to mitigate the possible packet loss. To - enable the mixer to get the substitutive content before the splicing + When a new splicing is forthcoming, the main RTP sender needs to send + the Splicing Interval to the splicer. The Splicing Interval SHOULD be + sent more than once to mitigate the possible packet loss. To enable + the splicer to get the substitutive content before the splicing starts, the main RTP sender MUST send the Splicing Interval far ahead. For example, the main RTP sender can estimate when to send the Splicing Interval based on the round-trip time (RTT) following the - mechanisms in section 6.4.1 of [RFC3550] when the mixer sends RTCP RR - to the main sender. + mechanisms in section 6.4.1 of [RFC3550] when the splicer sends RTCP + RR to the main sender. The substitutive sender also needs to learn the Splicing Interval from the main RTP sender in advance, and thus estimates when to - transfer the substitutive content to the mixer. The Splicing Interval - could be transmitted from the main RTP sender to the substitutive - content using some out-of-band mechanisms, the details how to achieve - that are beyond the scope of this memo. 
To ensure the Splicing - Interval is valid for both the main RTP sender and the substitutive - RTP sender, the two senders MUST share a common reference clock, so - the mixer can achieve accurate splicing. + transfer the substitutive content to the splicer. The Splicing + Interval could be transmitted from the main RTP sender to the + substitutive content using some out-of-band mechanisms, for example, + a proprietary mechanism to exchange the Splicing Interval, or the + substitutive sender is implemented together with the main RTP sender + inside a single device. To ensure the Splicing Interval is valid for + both the main RTP sender and the substitutive RTP sender, the two + senders MUST share a common reference clock so that the splicer can + achieve accurate splicing. The common reference clock depends on the + codec the media content using. In this document, the main RTP sender uses a pair of NTP-format - timestamps, derived from the common reference clock, to indicate when - to start and end the splicing to the mixer: the timestamp of the - first substitutive RTP packet at the splicing in point, and the - timestamp of the first main RTP packet at the splicing out point. + timestamps, to indicate when to start and end the splicing to the + splicer: the timestamp of the first substitutive RTP packet at the + splicing in point, and the timestamp of the first main RTP packet at + the splicing out point. When the substitutive RTP sender gets the Splicing Interval, it must - prepare the substitutive stream. The mixer MUST ensure that the RTP - timestamp of the first substitutive RTP packet that would be - presented to the receivers corresponds to the same time instant as - the former NTP timestamp in the Splicing Interval. 
To enable the - mixer to know the first substitutive RTP packet it needs to send, the - substitutive RTP sender MUST send the substitutive RTP packet ahead - of the Splicing In point, allowing the mixer to find out the - timestamp of this first RTP packet in the substitutive RTP stream, - e.g., using a prior RTCP SR message. + prepare the substitutive stream. The main and the substitutive + content providers MUST ensure that the RTP timestamp of the first + substitutive RTP packet that would be presented to the receivers + corresponds to the same time instant as the former NTP timestamp in + the Splicing Interval. To enable the splicer to know the first + substitutive RTP packet it needs to send, the substitutive RTP sender + MUST send the substitutive RTP packet ahead of the Splicing In point, + allowing the splicer to find out the timestamp of this first RTP + packet in the substitutive RTP stream, e.g., using a prior RTCP SR + message. - When the splicing will end, the mixer MUST ensure that the RTP - timestamp of the first main RTP packet that would be presented on the - receivers corresponds to the same time instant as the latter NTP - timestamp in the Splicing Interval. + When the splicing will end, the main content provider and the + substitutive content provider MUST ensure the RTP timestamp of the + first main RTP packet that would be presented on the receivers + corresponds to the same time instant as the latter NTP timestamp in + the Splicing Interval. 3 Conveying Splicing Interval in RTP/RTCP extensions This memo defines two backwards compatible RTP extensions to convey - the Splicing Interval to the mixer: an RTP header extension and an + the Splicing Interval to the splicer: an RTP header extension and an RTCP splicing notification message. 3.1 RTP Header Extension The RTP header extension mechanism defined in [RFC5285] can be adapted to carry the Splicing Interval consisting of a pair of NTP- format timestamps. 
One variant is defined for this header extension. It carries the 7 octets splicing-out NTP timestamp (lower 24-bit part of the Seconds of a NTP-format timestamp and the 32 bits of the Fraction of a NTP- format timestamp as defined in [RFC5905]), followed by the 8 octets splicing-in NTP timestamp (64-bit NTP-format timestamp as defined in [RFC5905]). The top 8 bits of the splicing-out NTP timestamp are - referred from the top 8 bits of the splicing-in NTP timestamp. This + inferred from the top 8 bits of the splicing-in NTP timestamp. This is unambiguous, under the assumption that the splicing-out time is after the splicing-in time, and the splicing interval is less than - 2^25 seconds. + 2^25 seconds. If the 7 octets splicing-out NTP timestamp is smaller + than the lower 7 octets splicing-in NTP timestamp, it implies a wrap + of the 64-bit splicing-out NTP timestamp which will then be + calculated by the 7 octets splicing-out NTP timestamp plus + 0x100000000. Otherwise, the top 8 octets of splicing-out NTP + timestamp is equal to the top 8 octets of splicing-in NTP timestamp. The format is shown in Figures 1. 
     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |     0xBE      |     0xDE      |           length=4            |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+E
    |  ID   | L=15  | OUT NTP timestamp format - Seconds (bit 8-31) |x
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+t
    |        OUT NTP timestamp format - Fraction (bit 0-31)         |e
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+n
    |         IN NTP timestamp format - Seconds (bit 0-31)          |s
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+i
    |         IN NTP timestamp format - Fraction (bit 0-31)         |o
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+n

    Figure 1: Sample hybrid NTP Encoding Using the One-Byte Header Format

-   Note that the inclusion of an RTP header extension will reduce the
-   efficiency of RTP header compression. It is RECOMMENDED that the main
-   sender begins to insert the RTP header extensions into a number of
-   RTP packets prior to the splicing in, while leaving the remaining RTP
-   packets unmarked.
+   Since the inclusion of an RTP header extension will reduce the
+   efficiency of RTP header compression, it is RECOMMENDED that the main
+   sender inserts the RTP header extensions into only a number of RTP
+   packets, instead of all the RTP packets, prior to the splicing in.

-   After the mixer intercepts the RTP header extension and derives the
+   After the splicer intercepts the RTP header extension and derives the
    Splicing Interval, it will generate its own stream and SHOULD NOT
    include the RTP header extension in outgoing packets to reduce header
    overhead.

-   Furthermore, whether the in-band NTP-format timestamps are included
-   or not, RTCP splicing notification message, specified in the next
-   section, MUST be sent to provide robustness in case of any splicing-
-   unaware middlebox that might strip RTP header extensions.
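The hybrid encoding in Section 3.1 (7 octets of splicing-out followed by 8 octets of splicing-in) can be sketched in Python as below. This is an illustrative sketch only: `pack_interval` and `unpack_interval` are hypothetical names, and only the 15 data octets are handled, not the ID and L fields of the one-byte header element. The wrap handling rests on an assumption: when the 56-bit splicing-out value is smaller than the low 56 bits of splicing-in, the sketch takes the top 8 bits of splicing-out to be those of splicing-in plus one. That is one reading of the text added in -03, not a statement of what the draft mandates.

```python
from typing import Tuple

MASK56 = (1 << 56) - 1  # low 7 octets of a 64-bit NTP timestamp


def pack_interval(splice_in: int, splice_out: int) -> bytes:
    """Return the 15 data octets: OUT (low 56 bits) followed by IN (64 bits)."""
    return (splice_out & MASK56).to_bytes(7, "big") + splice_in.to_bytes(8, "big")


def unpack_interval(data: bytes) -> Tuple[int, int]:
    """Recover (splice_in, splice_out) as full 64-bit NTP timestamps."""
    if len(data) != 15:
        raise ValueError("expected 7 + 8 octets")
    out56 = int.from_bytes(data[:7], "big")
    splice_in = int.from_bytes(data[7:], "big")
    top8 = splice_in >> 56
    if out56 < (splice_in & MASK56):
        # ASSUMPTION: the 56-bit OUT value wrapped, so carry into the top 8 bits.
        top8 = (top8 + 1) & 0xFF
    return splice_in, (top8 << 56) | out56


# Usage: round-trip a 30-second interval.
t_in = (0xDA012345 << 32) | 0x80000000
t_out = t_in + (30 << 32)
assert unpack_interval(pack_interval(t_in, t_out)) == (t_in, t_out)
```

The reconstruction only works when splicing-out is later than splicing-in and the interval is short enough for the inference to be unambiguous, which is the assumption the draft itself states.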
-
 3.2 RTCP Splicing Notification Message

    In addition to the RTP header extension, the main RTP sender includes
    the Splicing Interval in an RTCP splicing notification message.
+   Whether the in-band NTP-format timestamps are included or not, the
+   main RTP sender MUST send the RTCP splicing notification message to
+   provide robustness in case of any splicing-unaware middlebox that
+   might strip RTP header extensions.

    The RTCP splicing notification message is a new RTCP packet type. It
-   has a fix header followed by a pair of NTP-format timestamps:
+   has a fixed header followed by a pair of NTP-format timestamps:

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |V=2|P|reserved |    PT=TBA     |            length             |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                             SSRC                              |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |           IN NTP Timestamp (most significant word)            |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
@@ -275,106 +284,120 @@
    SSRC: 32 bits

       The SSRC of the Main RTP Sender.

    Timestamp: 64 bits

       Indicates the wallclock time when this splicing starts and ends.
       The full-resolution NTP timestamp is used, which is a 64-bit,
       unsigned, fixed-point number with the integer part in the first 32
       bits and the fractional part in the last 32 bits. This format is
-      similar to RTCP Sender Report (Section 6.4.1 of [RFC3550]).
+      similar to the RTCP Sender Report (Section 6.4.1 of [RFC3550]).

    The RTCP splicing notification message can be appended to RTCP SR the
    main RTP sender generates in compound RTCP packets, and hence follows
    the compound RTCP rules defined in Section 6.1 in [RFC3550].

    If the use of non-compound RTCP [RFC5506] was previously negotiated
-   between the sender and the mixer, the RTCP splicing notification
+   between the sender and the splicer, the RTCP splicing notification
    message may be sent as non-compound RTCP packets.
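A rough Python sketch of building and parsing the RTCP splicing notification message described in Section 3.2 follows. Several points are assumptions, since the packet diagram is cut by the hunk boundary and the packet type is still "TBA" in this revision: the OUT NTP timestamp is assumed to follow the IN NTP timestamp, the length field is assumed to follow the usual RTCP convention (32-bit words minus one), and `HYPOTHETICAL_PT` is a placeholder, not an assigned value. The function names are likewise hypothetical.

```python
import struct
from typing import Tuple

# Placeholder only: the draft lists the packet type as "TBA".
HYPOTHETICAL_PT = 250

# V/P/reserved (1 octet), PT (1), length (2), SSRC (4), IN NTP (8), OUT NTP (8).
_FORMAT = "!BBHIQQ"


def build_splicing_notification(pt: int, ssrc: int,
                                splice_in: int, splice_out: int) -> bytes:
    """Serialize the message; V=2, P=0, reserved bits set to zero."""
    first_octet = 2 << 6
    # ASSUMPTION: length counts 32-bit words minus one (24 octets -> 5).
    length = struct.calcsize(_FORMAT) // 4 - 1
    return struct.pack(_FORMAT, first_octet, pt, length, ssrc,
                       splice_in, splice_out)


def parse_splicing_notification(packet: bytes) -> Tuple[int, int, int, int]:
    """Return (pt, ssrc, splice_in, splice_out) from a serialized message."""
    first_octet, pt, length, ssrc, splice_in, splice_out = struct.unpack(
        _FORMAT, packet)
    if first_octet >> 6 != 2:
        raise ValueError("unexpected RTP/RTCP version")
    if (length + 1) * 4 != len(packet):
        raise ValueError("length field does not match packet size")
    return pt, ssrc, splice_in, splice_out


# Usage: round-trip a message for a hypothetical main sender SSRC.
pkt = build_splicing_notification(HYPOTHETICAL_PT, 0x12345678, 100, 200)
print(parse_splicing_notification(pkt))
```

In a compound RTCP packet this message would be appended after the SR, as the text above describes. The sketch covers only the single message.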
- When the mixer intercepts the RTCP splicing notification message, it - SHOULD NOT forward the message to the receivers in order to reduce - RTCP bandwidth consumption. And it MUST NOT forward the message to - the downstream receivers to avoid them from detecting splicing - defined in Section 4.5 in [RFC6828]. + When the splicer intercepts the RTCP splicing notification message, + it SHOULD NOT forward the message to the down-stream receivers in + order to reduce RTCP bandwidth consumption. And if the splicer wishes + to prevent the downstream receivers from detecting splicing, it MUST + NOT forward the message. 4 Reducing Splicing Latency - When splicing starts or ends, the mixer outputs the multimedia + When splicing starts or ends, the splicer outputs the multimedia content from another sender to the receivers. Given that the receivers must first acquire certain information ([RFC6285] refers to this information as Reference Information) to start processing the multimedia data, either the main RTP sender or the substitutive - sender SHOULD provide the Reference Information align with its + sender SHOULD provide the Reference Information together with its multimedia content to reduce the delay caused by acquiring the Reference Information. The methods by which the Reference Information is distributed to the receivers is out of scope of this memo. Another latency element is synchronization caused delay. The receivers must receive enough synchronization metadata prior to synchronizing the separate components of the multimedia streams when splicing starts or ends. Either the main RTP sender or the substitutive sender SHOULD send the synchronization metadata early enough so that the receivers can play out the multimedia in a - synchronized fashion. The mechanisms defined in [RFC6051] are - RECOMMENDED to be adopted to reduce the possible synchronization - delay. + synchronized fashion. 
The main RTP sender and the substitutive sender + can be coordinated by some proprietary out-of-band mechanisms to + decide when and whom to send the metadata. If both send the + information, the splicer SHOULD pick one based on the current + situation, e.g., choosing media sender when synchronizing the main + media content while choosing the information from the substitutive + sender when synchronizing the spliced content. The mechanisms defined + in [RFC6051] are RECOMMENDED to be adopted to reduce the possible + synchronization delay. 5 Failure Cases This section examines the implications of losing RTCP splicing - notification message and other failure case, e.g., the RTP header + notification message and the other failure case, e.g., the RTP header extension is stripped on the path. - Given that there may be splicing un-aware middlebox on the path - between the main RTP sender and the mixer, one heuristics will be - used to verify whether or not the Splicing Interval reaches the - mixers. + Given that there may be a splicing un-aware middlebox on the path + between the main RTP sender and the splicer, the main and the + substitutive RTP senders can use one heuristic to verify whether or + not the Splicing Interval reaches the splicer. - If the mixer does not get the Splicing Interval when the splicing - starts, it will still output the main content to the downstream - receivers and forward the RTCP RR packets sent from downstream - receivers to the main RTP sender (see section 4.2 of [RFC6828]). In - such case, the main RTP sender can learn that splicing failed. 
+ If a mixer works as the splicer [RFC6828] and it follows [RFC3550], + the RTP sender whose content is being passed to a downstream receiver + will see the reception quality of its stream as received by the mixer + and the reception quality of the processed stream as received by the + receiver; The RTP sender whose content is not being passed to a + downstream receiver will only see the reception quality of its stream + as received by the mixer. In such a case, the main RTP sender can + learn that splicing failed if it still sees the RTCP RR packets sent + from downstream receivers when the splicing starts; In a similar + manner, the substitutive sender can also learn that splicing failed + if it does not receive any RTCP RR packets from downstream receivers + when the splicing starts. - In a similar manner, the substitutive sender can learn that splicing - failed if it does not receive any RTCP RR packets from downstream - receivers when the splicing starts. + Other cases where senders and receivers are in different RTCP domains + may require translation of RTCP reports, or additional reporting, if + the senders want to detect splicing problems. - Upon the detection of a failure, the main RTP sender or the - substitutive sender SHOULD check the path to the failed mixer, or - fallback to the payload specific mechanisms, e.g., MPEG-TS splicing - solution defined in [SCTE35]. + Upon the detection of a failure, the splicer can communicate with the + main sender and the substitutive sender in some out of band signaling + to fall back to the payload specific mechanisms it supports, e.g., + MPEG-TS splicing solution defined in [SCTE35], or just abandon the + splicing. -6 SDP Signaling +6 Session Description Protocol (SDP) Signaling This document defines the URI for declaring this header extension in an extmap attribute to be "urn:ietf:params:rtp-hdrext:splicing- interval". 
This document extends the standard semantics defined in SDP Grouping Framework [RFC5888] with a new semantic: SPLICE to represent the relationship between the main RTP stream and the substitutive RTP stream. Only 2 m-lines are allowed in the SPLICE group. The main RTP stream is the one with the extended extmap attribute, and the other one is substitutive stream. A single m-line MUST NOT be included in different SPLICE groups at the same time. The main RTP sender provides the information about both main and substitutive sources. The extended SDP attribute specified in this document is applicable for offer/answer content [RFC3264] and do not affect any rules when - negotiating offer and answer. When used with multiple media, + negotiating offer and answer. When used with multiple m-lines, substitutive RTP MUST be applied only to the RTP packets whose SDP m- line is in the same group with the substitutive stream using SPLICE - and has the extended splicing extmap attribute. This semantics is - also applicable for BUNDLE cases. + and has the extended splicing extmap attribute. This semantic is also + applicable for BUNDLE cases. The following examples show how SDP signaling could be used for splicing in different cases. 6.1 Declarative SDP v=0 o=xia 1122334455 1122334466 IN IP4 splicing.example.com s=RTP Splicing Example t=0 0 @@ -387,33 +410,32 @@ a=mid:1 m=video 30002 RTP/AVP 100 i=Substitutive RTP Stream c=IN IP4 233.252.0.2/127 a=sendonly a=rtpmap:100 MP2T/90000 a=mid:2 Figure 3: Example SDP for a single-channel splicing scenario - The mixer receiving the SDP message above receives one MPEG2-TS + The splicer receiving the SDP message above receives one MPEG2-TS stream (payload 100) from the main RTP sender (with multicast destination address of 233.252.0.1) on port 30000, and/or receives another MPEG2-TS stream from the substitutive RTP sender (with multicast destination address of 233.252.0.2) on port 30002. 
But at - a particular point in time, the mixer only selects one stream and + a particular point in time, the splicer only selects one stream and outputs the content from the chosen stream to the downstream receivers. 6.2 Offer/Answer without BUNDLE SDP Offer - from main RTP sender - v=0 o=xia 1122334455 1122334466 IN IP4 splicing.example.com s=RTP Splicing Example t=0 0 a=group:SPLICE 1 2 m=video 30000 RTP/AVP 31 100 i=Main RTP Stream c=IN IP4 splicing.example.com a=rtpmap:31 H261/90000 a=rtpmap:100 MP2T/90000 @@ -444,24 +466,23 @@ a=mid:1 m=video 40000 RTP/AVP 100 i=Substitutive RTP Stream c=IN IP4 splicer.example.com a=rtpmap:100 MP2T/90000 a=recvonly a=mid:2 Only codecs that are supported both by the main RTP stream and the substitutive RTP stream could be negotiated with SDP O/A. And the - mixer MUST choose the same codec for both of these two streams. + splicer MUST choose the same codec for both of these two streams. 6.3 Offer/Answer with BUNDLE: All Media are spliced - In this example, the bundled audio and video media have their own substitutive media for splicing: 1. An Offer, in which the offerer assigns a unique address and a substitutive media to each bundled "m="line for splicing within the BUNDLE group. 2. An answer, in which the answerer selects its own BUNDLE address, and leave the substitutive media untouched. @@ -606,41 +627,40 @@ a=recvonly m=video 30004 RTP/AVP 32 i=Substitutive video RTP Stream c=IN IP4 splicer.example.com a=rtpmap:32 MPV/90000 a=mid:2 a=recvonly 7 Security Considerations - The security considerations of the RTP specification [RFC3550], the - general mechanism for RTP header extensions [RFC5285] and the - security considerations of the RTP splicing specification [RFC6828] - apply. + The security considerations of the RTP specification [RFC3550] and + the general mechanism for RTP header extensions [RFC5285] apply. If + the RTP splicing mechanism described in [RFC6828] is in use, its + security considerations also apply. 
- The RTP header extension defined in Section 4.1 include two NTP- - format timestamps. In the Secure Real-time Transport Protocol - (SRTP)[RFC3711], RTP header extensions are authenticated but not - encrypted. For a malicious endpoint without the key, it can observe - the splicing time in the RTP header, and it can intercept the - substitutive content and even replace it with a different one if the - splicer does not use any security like SRTP and authenticate the main - and substitutive content sources. + In Secure Real-time Transport Protocol (SRTP)[RFC3711], RTP header + extensions are authenticated but not encrypted. For a malicious + endpoint without the key, it can observe the splicing time in the RTP + header, and it can intercept the substitutive content and even + replace it with a different one if the substitutive stream does not + use any security like SRTP and the splicer does not authenticate the + main and substitutive content sources. If there is a concern about the confidentiality of the splicing time information, header extension encryption [RFC6904] SHOULD be used. - However, the malicious endpoint can get the splicing time information - by other means, e.g., observing the RTP timestamp of the substitutive - stream. To protect from different substitutive contents are inserted, - the splicer MUST have some mechanisms to authenticate the - substitutive stream source. + However, the malicious endpoint may get the splicing time information + by other means, e.g., inferring from the communication between the + main and substitutive content sources. To avoid invalid substitutive + contents are inserted, the splicer MUST have some mechanisms to + authenticate the substitutive stream source. For cases that the splicing time information is changed by a malicious endpoint, the splicing may fail since it will not be available at the right time for the substitutive media to arrive, which may also break an undetectable splicing. 
To mitigate this effect, the splicer SHOULD NOT forward the splicing time information RTP header extension defined in Section 4.1 to the receivers. And it MUST NOT forward this header extension when considering an undetectable splicing. @@ -679,33 +698,34 @@ extension called "SPLICE". Semantics: Splice Token:SPLICE Reference: This document Contact: Jinwei Xia -9 Acknowledges +9 Acknowledgement - TBD + The authors would like to thank the following individuals who help to + review this document and provide very valuable comments: Colin + Perkins, Bo Burman, Stephen Botzko, Ben Campbell. 10 References 10.1 Normative References [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. - Jacobson, "RTP: A Transport Protocol for Real-Time Applications", STD 64, RFC 3550, July 2003. [RFC3264] Rosenberg, J., and H. Schulzrinne, "An Offer/Answer Model with the Session Description Protocol (SDP)", RFC 3264, June 2002. [RFC5285] Singer, D. and H. Desineni, "A General Mechanism for RTP Header Extensions", RFC 5285, July 2008.