draft-ietf-rmcat-sbd-03.txt   draft-ietf-rmcat-sbd-04.txt 
RTP Media Congestion Avoidance Techniques D. Hayes, Ed. RTP Media Congestion Avoidance Techniques D. Hayes, Ed.
Internet-Draft University of Oslo Internet-Draft University of Oslo
Intended status: Experimental S. Ferlin Intended status: Experimental S. Ferlin
Expires: April 21, 2016 Simula Research Laboratory Expires: September 22, 2016 Simula Research Laboratory
M. Welzl M. Welzl
K. Hiorth K. Hiorth
University of Oslo University of Oslo
October 19, 2015 March 21, 2016
Shared Bottleneck Detection for Coupled Congestion Control for RTP Shared Bottleneck Detection for Coupled Congestion Control for RTP
Media. Media.
draft-ietf-rmcat-sbd-03 draft-ietf-rmcat-sbd-04
Abstract Abstract
This document describes a mechanism to detect whether end-to-end data This document describes a mechanism to detect whether end-to-end data
flows share a common bottleneck. It relies on summary statistics flows share a common bottleneck. It relies on summary statistics
that are calculated by a data receiver based on continuous that are calculated by a data receiver based on continuous
measurements and regularly fed to a grouping algorithm that runs measurements and regularly fed to a grouping algorithm that runs
wherever the knowledge is needed. This mechanism complements the wherever the knowledge is needed. This mechanism complements the
coupled congestion control mechanism in draft-welzl-rmcat-coupled-cc. coupled congestion control mechanism in draft-ietf-rmcat-coupled-cc.
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on April 21, 2016. This Internet-Draft will expire on September 22, 2016.
Copyright Notice Copyright Notice
Copyright (c) 2015 IETF Trust and the persons identified as the Copyright (c) 2016 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1. The signals . . . . . . . . . . . . . . . . . . . . . . . 3 1.1. The signals . . . . . . . . . . . . . . . . . . . . . . . 3
1.1.1. Packet Loss . . . . . . . . . . . . . . . . . . . . . 3 1.1.1. Packet Loss . . . . . . . . . . . . . . . . . . . . . 3
1.1.2. Packet Delay . . . . . . . . . . . . . . . . . . . . 3 1.1.2. Packet Delay . . . . . . . . . . . . . . . . . . . . 3
1.1.3. Path Lag . . . . . . . . . . . . . . . . . . . . . . 4 1.1.3. Path Lag . . . . . . . . . . . . . . . . . . . . . . 4
2. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 4 2. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.1. Parameters and their Effect . . . . . . . . . . . . . . . 6 2.1. Parameters and their Effect . . . . . . . . . . . . . . . 7
2.2. Recommended Parameter Values . . . . . . . . . . . . . . 7 2.2. Recommended Parameter Values . . . . . . . . . . . . . . 8
3. Mechanism . . . . . . . . . . . . . . . . . . . . . . . . . . 7 3. Mechanism . . . . . . . . . . . . . . . . . . . . . . . . . . 8
3.1. Key metrics and their calculation . . . . . . . . . . . . 9 3.1. SBD feedback requirements . . . . . . . . . . . . . . . . 9
3.1.1. Mean delay . . . . . . . . . . . . . . . . . . . . . 9 3.1.1. Feedback when all the logic is placed at
3.1.2. Skewness Estimate . . . . . . . . . . . . . . . . . . 9 the sender . . . . . . . . . . . . . . . . . . . . . 10
3.1.3. Variability Estimate . . . . . . . . . . . . . . . . 10 3.1.2. Feedback when the statistics are
3.1.4. Oscillation Estimate . . . . . . . . . . . . . . . . 11 calculated at the receiver and SBD at
3.1.5. Packet loss . . . . . . . . . . . . . . . . . . . . . 11 the sender . . . . . . . . . . . . . . . . . . . . . 10
3.2. Flow Grouping . . . . . . . . . . . . . . . . . . . . . . 12 3.1.3. Feedback when bottlenecks can be
3.2.1. Flow Grouping Algorithm . . . . . . . . . . . . . . . 12 determined at both senders and
3.2.2. Using the flow group signal . . . . . . . . . . . . . 13 receivers . . . . . . . . . . . . . . . . . . . . . . 11
3.3. Removing Noise from the Estimates . . . . . . . . . . . . 13 3.2. Key metrics and their calculation . . . . . . . . . . . . 11
3.3.1. Oscillation noise . . . . . . . . . . . . . . . . . . 14 3.2.1. Mean delay . . . . . . . . . . . . . . . . . . . . . 11
3.3.2. Clock skew . . . . . . . . . . . . . . . . . . . . . 14 3.2.2. Skewness Estimate . . . . . . . . . . . . . . . . . . 11
3.4. Reducing lag and Improving Responsiveness . . . . 14 3.2.3. Variability Estimate . . . . . . . . . . . . . . . . 12
3.4.1. Improving the response of the skewness estimate . 15 3.2.4. Oscillation Estimate . . . . . . . . . . . . . . . . 12
3.4.2. Improving the response of the variability estimate 17 3.2.5. Packet loss . . . . . . . . . . . . . . . . . . . . . 13
4. Measuring OWD . . . . . . . . . . . . . . . . . . . . . . . . 17 3.3. Flow Grouping . . . . . . . . . . . . . . . . . . . . . . 13
4.1. Time stamp resolution . . . . . . . . . . . . . . . . . . 17 3.3.1. Flow Grouping Algorithm . . . . . . . . . . . . . . . 13
5. Implementation status . . . . . . . . . . . . . . . . . . . . 18 3.3.2. Using the flow group signal . . . . . . . . . . . . . 15
6. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 18 3.4. Removing Noise from the Estimates . . . . . . . . . . . . 15
7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 18 3.4.1. Oscillation noise . . . . . . . . . . . . . . . . . . 15
8. Security Considerations . . . . . . . . . . . . . . . . . . . 18 3.4.2. Clock skew . . . . . . . . . . . . . . . . . . . . . 16
9. Change history . . . . . . . . . . . . . . . . . . . . . . . 18 3.5. Reducing lag and Improving Responsiveness . . . . 16
10. References . . . . . . . . . . . . . . . . . . . . . . . . . 19 3.5.1. Improving the response of the skewness estimate . 17
10.1. Normative References . . . . . . . . . . . . . . . . . . 19 3.5.2. Improving the response of the variability estimate 19
10.2. Informative References . . . . . . . . . . . . . . . . . 19 4. Measuring OWD . . . . . . . . . . . . . . . . . . . . . . . . 19
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 20 4.1. Time stamp resolution . . . . . . . . . . . . . . . . . . 19
5. Implementation status . . . . . . . . . . . . . . . . . . . . 20
6. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 20
7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 20
8. Security Considerations . . . . . . . . . . . . . . . . . . . 20
9. Change history . . . . . . . . . . . . . . . . . . . . . . . 20
10. References . . . . . . . . . . . . . . . . . . . . . . . . . 21
10.1. Normative References . . . . . . . . . . . . . . . . . . 21
10.2. Informative References . . . . . . . . . . . . . . . . . 21
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 22
1. Introduction 1. Introduction
In the Internet, it is not normally known if flows (e.g., TCP In the Internet, it is not normally known if flows (e.g., TCP
connections or UDP data streams) traverse the same bottlenecks. Even connections or UDP data streams) traverse the same bottlenecks. Even
flows that have the same sender and receiver may take different paths flows that have the same sender and receiver may take different paths
and share a bottleneck or not. Flows that share a bottleneck link and share a bottleneck or not. Flows that share a bottleneck link
usually compete with one another for their share of the capacity. usually compete with one another for their share of the capacity.
This competition has the potential to increase packet loss and This competition has the potential to increase packet loss and
delays. This is especially relevant for interactive applications delays. This is especially relevant for interactive applications
that communicate simultaneously with multiple peers (such as multi- that communicate simultaneously with multiple peers (such as multi-
party video). For RTP media applications such as RTCWEB, party video). For RTP media applications such as RTCWEB,
[I-D.welzl-rmcat-coupled-cc] describes a scheme that combines the [I-D.ietf-rmcat-coupled-cc] describes a scheme that combines the
congestion controllers of flows in order to honor their priorities congestion controllers of flows in order to honor their priorities
and avoid unnecessary packet loss as well as delay. This mechanism and avoid unnecessary packet loss as well as delay. This mechanism
relies on some form of Shared Bottleneck Detection (SBD); here, a relies on some form of Shared Bottleneck Detection (SBD); here, a
measurement-based SBD approach is described. measurement-based SBD approach is described.
1.1. The signals 1.1. The signals
The current Internet is unable to explicitly inform endpoints as to The current Internet is unable to explicitly inform endpoints as to
which flows share bottlenecks, so endpoints need to infer this from which flows share bottlenecks, so endpoints need to infer this from
whatever information is available to them. The mechanism described whatever information is available to them. The mechanism described
skipping to change at page 3, line 50 skipping to change at page 4, line 7
device. The noise is often significantly increased if the round-trip device. The noise is often significantly increased if the round-trip
time is used. The cleanest signal is obtained by using One-Way-Delay time is used. The cleanest signal is obtained by using One-Way-Delay
(OWD). (OWD).
Measuring absolute OWD is difficult since it requires both the sender Measuring absolute OWD is difficult since it requires both the sender
and receiver clocks to be synchronised. However, since the and receiver clocks to be synchronised. However, since the
statistics being collected are relative to the mean OWD, a relative statistics being collected are relative to the mean OWD, a relative
OWD measurement is sufficient. Clock skew is not usually significant OWD measurement is sufficient. Clock skew is not usually significant
over the time intervals used by this SBD mechanism (see [RFC6817] A.2 over the time intervals used by this SBD mechanism (see [RFC6817] A.2
for a discussion on clock skew and OWD measurements). However, in for a discussion on clock skew and OWD measurements). However, in
circumstances where it is significant, Section 3.3.2 outlines a way circumstances where it is significant, Section 3.4.2 outlines a way
of adjusting the calculations to cater for it. of adjusting the calculations to cater for it.
Each packet arriving at the bottleneck buffer may experience very Each packet arriving at the bottleneck buffer may experience very
different queue lengths, and therefore different waiting times. A different queue lengths, and therefore different waiting times. A
single OWD sample does not, therefore, characterize the path well. single OWD sample does not, therefore, characterize the path well.
However, multiple OWD measurements do reflect the distribution of However, multiple OWD measurements do reflect the distribution of
delays experienced at the bottleneck. delays experienced at the bottleneck.
1.1.3. Path Lag 1.1.3. Path Lag
skipping to change at page 4, line 43 skipping to change at page 4, line 48
SBD -- Shared Bottleneck Detection SBD -- Shared Bottleneck Detection
Conventions used in this document: Conventions used in this document:
T -- the base time interval over which measurements are T -- the base time interval over which measurements are
made. made.
N -- the number of base time, T, intervals used in some N -- the number of base time, T, intervals used in some
calculations. calculations.
M -- the number of base time, T, intervals used in some
calculations.
sum_T(...) -- summation of all the measurements of the variable sum_T(...) -- summation of all the measurements of the variable
in parentheses taken over the interval T in parentheses taken over the interval T
sum(...) -- summation of terms of the variable in parentheses sum(...) -- summation of terms of the variable in parentheses
sum_N(...) -- summation of N terms of the variable in parentheses sum_N(...) -- summation of N terms of the variable in parentheses
sum_NT(...) -- summation of all measurements taken over the sum_NT(...) -- summation of all measurements taken over the
interval N*T interval N*T
skipping to change at page 7, line 9 skipping to change at page 8, line 9
is a compromise between false grouping of flows that do not is a compromise between false grouping of flows that do not
share a bottleneck and false splitting of flows that do. share a bottleneck and false splitting of flows that do.
Making them larger can help if the measures are very noisy, Making them larger can help if the measures are very noisy,
but reducing the noise in the statistical measures by but reducing the noise in the statistical measures by
adjusting T and N|M may be a better solution. adjusting T and N|M may be a better solution.
2.2. Recommended Parameter Values 2.2. Recommended Parameter Values
Reference [Hayes-LCN14] uses T=350ms, N=50, p_l=0.1. The other Reference [Hayes-LCN14] uses T=350ms, N=50, p_l=0.1. The other
parameters have been tightened to reflect minor enhancements to the parameters have been tightened to reflect minor enhancements to the
algorithm outlined in Section 3.3: c_s=-0.01, p_f=p_d=0.1, p_s=0.15, algorithm outlined in Section 3.4: c_s=-0.01, p_f=p_d=0.1, p_s=0.15,
p_mad=0.1, p_v=0.7. M=30, F=20, and c_h = 0.3 are additional p_mad=0.1, p_v=0.7. M=30, F=20, and c_h = 0.3 are additional
parameters defined in the document. These are values that seem to parameters defined in the document. These are values that seem to
work well over a wide range of practical Internet conditions. work well over a wide range of practical Internet conditions.
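
As an illustration only, the recommended values above could be collected in one configuration object. This is a minimal sketch: the class name SbdParams and the field name T_ms are hypothetical, and the only constraint encoded is M <= N, which the document states for the mean delay calculation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SbdParams:
    """Recommended parameter values (hypothetical container, not part of the draft)."""
    T_ms: int = 350      # base measurement interval T, in milliseconds
    N: int = 50          # number of T intervals used in some calculations
    M: int = 30          # number of T intervals used in some calculations, M <= N
    F: int = 20          # additional parameter defined elsewhere in the document
    c_s: float = -0.01   # threshold defined elsewhere in the document
    c_h: float = 0.3     # additional parameter defined elsewhere in the document
    p_f: float = 0.1
    p_d: float = 0.1
    p_s: float = 0.15
    p_mad: float = 0.1
    p_v: float = 0.7
    p_l: float = 0.1

    def __post_init__(self):
        if self.M > self.N:
            raise ValueError("M must not exceed N")


PARAMS = SbdParams()
```

Freezing the dataclass keeps the values constant for the lifetime of a session; a deployment that tunes T, N, or M would construct a new instance instead.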
3. Mechanism 3. Mechanism
The mechanism described in this document is based on the observation The mechanism described in this document is based on the observation
that the distributions of delay measurements of packets that traverse that the distributions of delay measurements of packets that traverse
a common bottleneck have similar shape characteristics. These shape a common bottleneck have similar shape characteristics. These shape
characteristics are described using 3 key summary statistics: characteristics are described using 3 key summary statistics:
variability (estimate var_est, see Section 3.1.3) variability (estimate var_est, see Section 3.2.3)
skewness (estimate skew_est, see Section 3.1.2) skewness (estimate skew_est, see Section 3.2.2)
oscillation (estimate freq_est, see Section 3.1.4) oscillation (estimate freq_est, see Section 3.2.4)
with packet loss (estimate pkt_loss, see Section 3.1.5) used as a with packet loss (estimate pkt_loss, see Section 3.2.5) used as a
supplementary statistic. supplementary statistic.
Summary statistics help to address both the noise and the path lag Summary statistics help to address both the noise and the path lag
problems by describing the general shape over a relatively long problems by describing the general shape over a relatively long
period of time. Each summary statistic portrays a "view" of the period of time. Each summary statistic portrays a "view" of the
bottleneck link characteristics, and when used together, they provide bottleneck link characteristics, and when used together, they provide
a robust discrimination for grouping flows. They can be signalled a robust discrimination for grouping flows. They can be signalled
from a receiver, which measures the OWD and calculates the summary from a receiver, which measures the OWD and calculates the summary
statistics, to a sender, which is the entity that is transmitting the statistics, to a sender, which is the entity that is transmitting the
media stream. An RTP Media device may be both a sender and a media stream. An RTP Media device may be both a sender and a
skipping to change at page 8, line 19 skipping to change at page 9, line 19
| L2 | L2
| |
+----+ L1 | L3 +----+ +----+ L1 | L3 +----+
| H1 |------|------| H3 | | H1 |------|------| H3 |
+----+ +----+ +----+ +----+
A network with 3 hosts (H1, H2, H3) and 3 links (L1, L2, L3). A network with 3 hosts (H1, H2, H3) and 3 links (L1, L2, L3).
Figure 1 Figure 1
In Figure 1, there are two possible cases for shared bottleneck In Figure 1, there are two possible locations for shared bottleneck
detection: a sender-based and a receiver-based case. detection: sender-side and receiver-side.
1. Sender-based: consider a situation where host H1 sends media 1. Sender-side: consider a situation where host H1 sends media
streams to hosts H2 and H3, and L1 is a shared bottleneck. H2 streams to hosts H2 and H3, and L1 is a shared bottleneck. H2
and H3 measure the OWD and calculate summary statistics, which and H3 measure the OWD and packet loss and either send back this
they send to H1 every T. H1, having this knowledge, can raw data, or the calculated summary statistics, periodically to
determine the shared bottleneck and accordingly control the send H1 every T. H1, having this knowledge, can determine the shared
rates. bottleneck and accordingly control the send rates.
2. Receiver-based: consider that H2 is also sending media to H3, and 2. Receiver-side: consider that H2 is also sending media to H3, and
L3 is a shared bottleneck. If H3 sends summary statistics to H1 L3 is a shared bottleneck. If H3 sends summary statistics to H1
and H2, neither H1 nor H2 alone obtains enough knowledge to detect and H2, neither H1 nor H2 alone obtains enough knowledge to detect
this shared bottleneck; H3 can however determine it by combining this shared bottleneck; H3 can however determine it by combining
the summary statistics related to H1 and H2, respectively. This the summary statistics related to H1 and H2, respectively.
case is applicable when send rates are controlled by the
receiver; then, the signal from H3 to the senders contains the
sending rate.
A discussion of the required signalling for the receiver-based case 3.1. SBD feedback requirements
is beyond the scope of this document. For the sender-based case, the
messages and their data format will be defined here in future
versions of this document.
We envisage the following exchange during initialisation: There are three possible scenarios, each with different feedback
requirements:
1. Both summary statistic calculations and SBD are performed at
senders only.
2. Summary statistics calculated on the receivers and SBD at the
senders.
3. Summary statistic calculations on receivers, and SBD performed at
both senders and receivers (beyond the current scope, but allows
cooperative detection of bottlenecks).
3.1.1. Feedback when all the logic is placed at the sender
Having the sender calculate the summary statistics and determine the
shared bottlenecks based on them has the advantage of placing most of
the functionality in one place -- the sender.
The sender requires precise and accurate OWD measurements for every
packet, along with the proportion of packets lost over the interval
T, to be sent from the receivers to the senders every T.
An initialisation message may be required to agree on the feedback
interval.
3.1.2. Feedback when the statistics are calculated at the receiver and
SBD at the sender
This scenario minimises feedback, but requires receivers to send
selected summary statistics at an agreed regular interval. We
envisage the following exchange of information to initialise the
system:
o An initialization message from the sender to the receiver will o An initialization message from the sender to the receiver will
contain the following information: contain the following information:
* A protocol identifier (SBD=01). This is to future proof the * A protocol identifier (SBD=01). This is to future proof the
message exchange so that potential advances in SBD technology message exchange so that potential advances in SBD technology
can be easily deployed. All following initialisation elements can be easily deployed. All following initialisation elements
relate to the mechanism outlined in this document which will relate to the mechanism outlined in this document which will
have the identifier SBD=01. have the identifier SBD=01.
skipping to change at page 9, line 20 skipping to change at page 10, line 50
may be able to exploit other metrics (e.g. metrics based on may be able to exploit other metrics (e.g. metrics based on
explicit network signals). explicit network signals).
* The values of T, N, M, and the necessary resolution and * The values of T, N, M, and the necessary resolution and
precision of the relayed statistics. precision of the relayed statistics.
o A response message from the receiver acknowledges this message o A response message from the receiver acknowledges this message
with a list of key metrics it supports (subset of the sender's with a list of key metrics it supports (subset of the sender's
list) and is able to relay back to the sender. list) and is able to relay back to the sender.
o This initialisation exchange may be repeated to finalize the This initialisation exchange may be repeated to finalize the agreed
agreed metrics if not all of them are supported by all receivers. metrics if not all of them are supported by all receivers.
3.1. Key metrics and their calculation After initialisation the agreed summary statistics will be fed back
to the sender every T.
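
The initialisation exchange described above can be sketched as follows. The draft does not define a message format, so every name here (InitRequest, build_response, the field names, and the metric strings) is a hypothetical choice for illustration; the resolution and precision fields are left out for brevity.

```python
from dataclasses import dataclass

SBD_PROTOCOL_ID = "01"  # protocol identifier given in the text (SBD=01)


@dataclass(frozen=True)
class InitRequest:
    """Hypothetical sender-to-receiver initialisation message."""
    protocol_id: str
    metrics: tuple   # key metrics the sender would like relayed
    T_ms: int        # base interval T, in milliseconds
    N: int
    M: int


def build_response(request, supported):
    """Return the metrics the receiver agrees to relay.

    The result is the subset of the sender's list that the receiver
    supports, in the sender's order. An unknown protocol identifier
    yields an empty list (an assumption; the draft does not say).
    """
    if request.protocol_id != SBD_PROTOCOL_ID:
        return ()
    return tuple(m for m in request.metrics if m in supported)
```

If the returned subset differs between receivers, the sender could repeat the exchange with a reduced list until all receivers agree.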
3.1.3. Feedback when bottlenecks can be determined at both senders and
receivers
This type of mechanism is currently beyond the scope of SBD in RMCAT.
It is mentioned here to ensure more advanced sender/receiver
cooperative shared bottleneck determination mechanisms remain
possible in the future.
It is envisaged that such a mechanism would be initialised in a
similar manner to that described in Section 3.1.2.
After initialisation both summary statistics and shared bottleneck
determinations will need to be exchanged every T.
3.2. Key metrics and their calculation
Measurements are calculated over a base interval, T, and summarized Measurements are calculated over a base interval, T, and summarized
over N or M such intervals. All summary statistics can be calculated over N or M such intervals. All summary statistics can be calculated
incrementally. incrementally.
3.1.1. Mean delay 3.2.1. Mean delay
The mean delay is not a useful signal for comparisons between flows The mean delay is not a useful signal for comparisons between flows
since flows may traverse quite different paths and clocks will not since flows may traverse quite different paths and clocks will not
necessarily be synchronized. However, it is a base measure for the 3 necessarily be synchronized. However, it is a base measure for the 3
summary statistics. The mean delay, E_T(OWD), is the average one way summary statistics. The mean delay, E_T(OWD), is the average one way
delay measured over T. delay measured over T.
To facilitate the other calculations, the last N E_T(OWD) values will To facilitate the other calculations, the last N E_T(OWD) values will
need to be stored in a cyclic buffer along with the moving average of need to be stored in a cyclic buffer along with the moving average of
E_T(OWD): E_T(OWD):
mean_delay = E_M(E_T(OWD)) = sum_M(E_T(OWD)) / M mean_delay = E_M(E_T(OWD)) = sum_M(E_T(OWD)) / M
where M <= N. Setting M to be less than N allows the mechanism to be where M <= N. Setting M to be less than N allows the mechanism to be
more responsive to changes, but potentially at the expense of a more responsive to changes, but potentially at the expense of a
higher error rate (see Section 3.4 for a discussion on improving the higher error rate (see Section 3.5 for a discussion on improving the
responsiveness of the mechanism.) responsiveness of the mechanism.)
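
A minimal sketch of the cyclic buffer and moving average described above. The class and method names are hypothetical. One assumption is made for the warm-up phase: until M intervals have been recorded, the average is taken over the intervals available so far.

```python
from collections import deque


class MeanDelay:
    """Keeps the last N values of E_T(OWD) and averages the last M of them."""

    def __init__(self, n, m):
        if m > n:
            raise ValueError("M must not exceed N")
        self.m = m
        self.history = deque(maxlen=n)  # cyclic buffer of E_T(OWD)

    def end_of_interval(self, owd_samples):
        """Record E_T(OWD) for the interval just ended and return mean_delay."""
        if owd_samples:
            self.history.append(sum(owd_samples) / len(owd_samples))
        recent = list(self.history)[-self.m:]
        # Assumption: divide by the number of stored intervals during warm-up.
        return sum(recent) / len(recent) if recent else None
```

Only relative OWD values are needed, so the samples may carry an arbitrary constant clock offset.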
3.1.2. Skewness Estimate 3.2.2. Skewness Estimate
Skewness is difficult to calculate efficiently and accurately. Skewness is difficult to calculate efficiently and accurately.
Ideally it should be calculated over the entire period (M * T) from Ideally it should be calculated over the entire period (M * T) from
the mean OWD over that period. However this would require storing the mean OWD over that period. However this would require storing
every delay measurement over the period. Instead, an estimate is every delay measurement over the period. Instead, an estimate is
made over M * T based on a calculation every T using the previous T's made over M * T based on a calculation every T using the previous T's
calculation of mean_delay. calculation of mean_delay.
The base for the skewness calculation is estimated using a counter The base for the skewness calculation is estimated using a counter
initialised every T. It increments for one way delay samples (OWD) initialised every T. It increments for one way delay samples (OWD)
skipping to change at page 10, line 28 skipping to change at page 12, line 27
enable it to be calculated iteratively. enable it to be calculated iteratively.
skew_est = sum_MT(skew_base_T)/num_MT(OWD) skew_est = sum_MT(skew_base_T)/num_MT(OWD)
where skew_est is a number between -1 and 1 where skew_est is a number between -1 and 1
Note: Care must be taken when implementing the comparisons to ensure Note: Care must be taken when implementing the comparisons to ensure
that rounding does not bias skew_est. It is important that the mean that rounding does not bias skew_est. It is important that the mean
is calculated with a higher precision than the samples. is calculated with a higher precision than the samples.
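
The skewness estimate could be computed incrementally along these lines. The full definition of skew_base_T is not visible in this excerpt, so the sketch assumes the counter adds one for each OWD sample below the mean and subtracts one for each sample above it; the class name and method signature are also hypothetical.

```python
from collections import deque


class SkewEstimator:
    """Incremental skew_est over the last M intervals (illustrative sketch)."""

    def __init__(self, m):
        self.bases = deque(maxlen=m)   # skew_base_T for each interval
        self.counts = deque(maxlen=m)  # number of OWD samples in each interval

    def end_of_interval(self, owd_samples, prev_mean_delay):
        # Assumption: +1 for a sample below the mean, -1 for one above it,
        # 0 for a sample exactly equal to the mean.
        base = sum((owd < prev_mean_delay) - (owd > prev_mean_delay)
                   for owd in owd_samples)
        self.bases.append(base)
        self.counts.append(len(owd_samples))
        total = sum(self.counts)
        # skew_est = sum_MT(skew_base_T) / num_MT(OWD), a value in [-1, 1]
        return sum(self.bases) / total if total else 0.0
```

Passing prev_mean_delay as a float while samples are integer timestamps is one way to keep the mean at higher precision than the samples, as the note above requires.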
3.1.3. Variability Estimate 3.2.3. Variability Estimate
Mean Absolute Deviation (MAD) delay is a robust variability measure Mean Absolute Deviation (MAD) delay is a robust variability measure
that copes well with different send rates. It can be implemented in that copes well with different send rates. It can be implemented in
an online manner as follows: an online manner as follows:
var_base_T = sum_T(|OWD - E_T(OWD)|) var_base_T = sum_T(|OWD - E_T(OWD)|)
where where
|x| is the absolute value of x |x| is the absolute value of x
E_T(OWD) is the mean OWD calculated in the previous T E_T(OWD) is the mean OWD calculated in the previous T
var_est = MAD_MT = sum_MT(var_base_T)/num_MT(OWD) var_est = MAD_MT = sum_MT(var_base_T)/num_MT(OWD)
For calculation of freq_est p_v=0.7 For calculation of freq_est p_v=0.7
For the grouping threshold p_mad=0.1 For the grouping threshold p_mad=0.1
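
The online MAD calculation above can be sketched as follows. Names are hypothetical; prev_interval_mean stands for E_T(OWD) from the previous interval, as stated in the definition of var_base_T.

```python
from collections import deque


class VarEstimator:
    """Incremental var_est (MAD) over the last M intervals (illustrative sketch)."""

    def __init__(self, m):
        self.bases = deque(maxlen=m)   # var_base_T for each interval
        self.counts = deque(maxlen=m)  # number of OWD samples in each interval

    def end_of_interval(self, owd_samples, prev_interval_mean):
        # var_base_T = sum_T(|OWD - E_T(OWD)|), using the previous T's mean
        base = sum(abs(owd - prev_interval_mean) for owd in owd_samples)
        self.bases.append(base)
        self.counts.append(len(owd_samples))
        total = sum(self.counts)
        # var_est = sum_MT(var_base_T) / num_MT(OWD)
        return sum(self.bases) / total if total else 0.0
```

Because each interval contributes one sum and one count, no individual delay sample needs to be stored beyond the interval in which it arrives.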
3.1.4. Oscillation Estimate 3.2.4. Oscillation Estimate
An estimate of the low frequency oscillation of the delay signal is An estimate of the low frequency oscillation of the delay signal is
calculated by counting and normalising the significant mean, calculated by counting and normalising the significant mean,
E_T(OWD), crossings of mean_delay: E_T(OWD), crossings of mean_delay:
freq_est = number_of_crossings / N freq_est = number_of_crossings / N
where we define a significant mean crossing as a crossing that where we define a significant mean crossing as a crossing that
extends p_v * var_est from mean_delay. In our experiments we extends p_v * var_est from mean_delay. In our experiments we
have found that p_v = 0.7 is a good value. have found that p_v = 0.7 is a good value.
skipping to change at page 11, line 38 skipping to change at page 13, line 32
The counter, number_of_crossings, is incremented when there is a The counter, number_of_crossings, is incremented when there is a
significant mean crossing and decremented when a non-zero value is significant mean crossing and decremented when a non-zero value is
removed from the last_N_crossings. removed from the last_N_crossings.
This approximation of freq_est was not used in [Hayes-LCN14], which This approximation of freq_est was not used in [Hayes-LCN14], which
calculated freq_est every T using the current E_N(E_T(OWD)). Our calculated freq_est every T using the current E_N(E_T(OWD)). Our
tests show that this approximation of freq_est yields results that tests show that this approximation of freq_est yields results that
are almost identical to when the full calculation is performed every are almost identical to when the full calculation is performed every
T. T.
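
A possible shape for the incremental oscillation estimate is sketched below. The precise crossing rule is not fully visible in this excerpt, so the sketch assumes a simple hysteresis: E_T(OWD) counts as having changed side only once it lies more than p_v * var_est beyond mean_delay on the opposite side from the last recorded side. All names other than last_N_crossings and number_of_crossings are hypothetical.

```python
from collections import deque


class FreqEstimator:
    """Incremental freq_est = number_of_crossings / N (illustrative sketch)."""

    def __init__(self, n, p_v=0.7):
        self.n = n
        self.p_v = p_v
        self.last_n_crossings = deque()  # 1 for a significant crossing, else 0
        self.number_of_crossings = 0
        self.side = 0  # assumed state: +1 above the band, -1 below, 0 unknown

    def end_of_interval(self, interval_mean, mean_delay, var_est):
        band = self.p_v * var_est
        new_side = self.side
        if interval_mean > mean_delay + band:
            new_side = 1
        elif interval_mean < mean_delay - band:
            new_side = -1
        crossed = 1 if (self.side != 0 and new_side != self.side) else 0
        self.side = new_side
        # Decrement when a non-zero value leaves the buffer.
        if len(self.last_n_crossings) == self.n:
            self.number_of_crossings -= self.last_n_crossings.popleft()
        self.last_n_crossings.append(crossed)
        self.number_of_crossings += crossed
        return self.number_of_crossings / self.n
```

Values of E_T(OWD) that stay inside the band around mean_delay leave the state unchanged, which keeps small fluctuations from being counted as oscillation.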
3.1.5. Packet loss 3.2.5. Packet loss
The proportion of packets lost over the period NT is used as a The proportion of packets lost over the period NT is used as a
supplementary measure: supplementary measure:
pkt_loss = sum_NT(lost packets) / sum_NT(total packets) pkt_loss = sum_NT(lost packets) / sum_NT(total packets)
Note: When pkt_loss is small it is very variable; however, when Note: When pkt_loss is small it is very variable; however, when
pkt_loss is high it becomes a stable measure for making grouping pkt_loss is high it becomes a stable measure for making grouping
decisions. decisions.
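
The loss proportion over N intervals can be maintained with two per-interval counters, as in this sketch. The names are hypothetical, and total_packets is assumed to include the lost packets.

```python
from collections import deque


class LossEstimator:
    """pkt_loss over the last N intervals (illustrative sketch)."""

    def __init__(self, n):
        self.lost = deque(maxlen=n)   # lost packets per interval
        self.total = deque(maxlen=n)  # total packets per interval (assumed to include losses)

    def end_of_interval(self, lost_packets, total_packets):
        self.lost.append(lost_packets)
        self.total.append(total_packets)
        total = sum(self.total)
        # pkt_loss = sum_NT(lost packets) / sum_NT(total packets)
        return sum(self.lost) / total if total else 0.0
```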
3.2. Flow Grouping 3.3. Flow Grouping
3.2.1. Flow Grouping Algorithm 3.3.1. Flow Grouping Algorithm
The following grouping algorithm is RECOMMENDED for SBD in the RMCAT The following grouping algorithm is RECOMMENDED for SBD in the RMCAT
context and is sufficient and efficient for small to moderate numbers context and is sufficient and efficient for small to moderate numbers
of flows. For very large numbers of flows (e.g. hundreds), a more of flows. For very large numbers of flows (e.g. hundreds), a more
complex clustering algorithm may be substituted. complex clustering algorithm may be substituted.
Since no single metric is precise enough to group flows (due to Since no single metric is precise enough to group flows (due to
noise), the algorithm uses multiple metrics. Each metric offers a noise), the algorithm uses multiple metrics. Each metric offers a
different "view" of the bottleneck link characteristics, and used different "view" of the bottleneck link characteristics, and used
together they enable a more precise grouping of flows than would together they enable a more precise grouping of flows than would
skipping to change at page 13, line 30 skipping to change at page 15, line 22
diff(pkt_loss) < (p_d * pkt_loss) diff(pkt_loss) < (p_d * pkt_loss)
The threshold, (p_d * pkt_loss), is with respect to the highest The threshold, (p_d * pkt_loss), is with respect to the highest
value in the difference. value in the difference.
This procedure involves sorting estimates from highest to lowest. It This procedure involves sorting estimates from highest to lowest. It
is simple to implement, and efficient for small numbers of flows (up is simple to implement, and efficient for small numbers of flows (up
to 10-20). to 10-20).
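The sort-and-threshold step for pkt_loss can be sketched as follows. This covers only the pkt_loss comparison shown above, not the complete grouping algorithm. The function name is hypothetical, and p_d is passed in as a parameter because its value is not restated in this part of the text.

```python
def group_by_pkt_loss(pkt_loss, p_d):
    """Split flows into groups of similar pkt_loss.

    pkt_loss maps a flow identifier to its pkt_loss estimate.
    Flows are sorted from highest to lowest estimate; adjacent flows
    stay in the same group while
        diff(pkt_loss) < p_d * pkt_loss
    where the threshold uses the higher of the two values compared.
    """
    ordered = sorted(pkt_loss.items(), key=lambda kv: kv[1], reverse=True)
    groups = []
    prev = None
    for flow, loss in ordered:
        if prev is not None and (prev - loss) < p_d * prev:
            groups[-1].append(flow)
        else:
            groups.append([flow])
        prev = loss
    return groups
```

One sort followed by a single pass over the sorted list is all that is needed, which is why the procedure suits small numbers of flows. In this sketch two flows with zero loss are never grouped, since the strict inequality cannot hold when the threshold is zero.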
3.2.2. Using the flow group signal 3.3.2. Using the flow group signal
Grouping decisions can be made every T from the second T; however, Grouping decisions can be made every T from the second T; however,
they will not attain their full design accuracy until after the they will not attain their full design accuracy until after the
2*N'th T interval. We recommend that grouping decisions are not made 2*N'th T interval. We recommend that grouping decisions are not made
until 2*M T intervals. until 2*M T intervals.
Network conditions, and even the congestion controllers, can cause Network conditions, and even the congestion controllers, can cause
bottlenecks to fluctuate. A coupled congestion controller MAY decide bottlenecks to fluctuate. A coupled congestion controller MAY decide
only to couple groups that remain stable, say grouped together 90% of only to couple groups that remain stable, say grouped together 90% of
the time, depending on its objectives. Recommendations concerning the time, depending on its objectives. Recommendations concerning
this are beyond the scope of this draft and will be specific to the this are beyond the scope of this draft and will be specific to the
coupled congestion controller's objectives. coupled congestion controller's objectives.
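One way a coupled congestion controller might apply such a stability rule is sketched below. This is purely illustrative: the draft leaves the policy out of scope, and the class name, the window length and the use of a per-pair history are all assumptions.

```python
from collections import deque


class PairStability:
    """Tracks how often two flows were placed in the same group."""

    def __init__(self, window, min_fraction=0.9):
        # window: number of recent grouping decisions to remember (assumed).
        self.history = deque(maxlen=window)
        self.min_fraction = min_fraction

    def record(self, grouped_together):
        """Store the outcome of one grouping decision for this pair."""
        self.history.append(bool(grouped_together))

    def is_stable(self):
        """True once the window is full and the pair was grouped often enough."""
        if len(self.history) < self.history.maxlen:
            return False
        return sum(self.history) / len(self.history) >= self.min_fraction
```

A controller would call record() after each grouping decision and couple the two flows only while is_stable() returns True.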
3.3. Removing Noise from the Estimates 3.4. Removing Noise from the Estimates
The following describes small changes to the calculation of the key The following describes small changes to the calculation of the key
metrics that help remove noise from them. Currently these "tweaks" metrics that help remove noise from them. Currently these "tweaks"
are described separately to keep the main description succinct. In are described separately to keep the main description succinct. In
future revisions of the draft these enhancements may replace the future revisions of the draft these enhancements may replace the
original key metric calculations. original key metric calculations.
3.3.1. Oscillation noise 3.4.1. Oscillation noise
When a path has no bottleneck, var_est will be very small and the When a path has no bottleneck, var_est will be very small and the
recorded significant mean crossings will be the result of path noise. recorded significant mean crossings will be the result of path noise.
Thus up to N-1 meaningless mean crossings can be a source of error at Thus up to N-1 meaningless mean crossings can be a source of error at
the point a link becomes a bottleneck and flows traversing it begin the point a link becomes a bottleneck and flows traversing it begin
to be grouped. to be grouped.
To remove this source of noise from freq_est: To remove this source of noise from freq_est:
1. Set the current var_base_T = NaN (a value representing an invalid 1. Set the current var_base_T = NaN (a value representing an invalid
record, i.e. Not a Number) for flows that are deemed to not be record, i.e. Not a Number) for flows that are deemed to not be
transiting a bottleneck by the first skew_est based grouping test transiting a bottleneck by the first skew_est based grouping test
(see Section 3.2.1). (see Section 3.3.1).
2. Then var_est = sum_MT(var_base_T != NaN) / num_MT(OWD) 2. Then var_est = sum_MT(var_base_T != NaN) / num_MT(OWD)
3. For freq_est, only record a significant mean crossing if the 3. For freq_est, only record a significant mean crossing if the
flow is deemed to be transiting a bottleneck. flow is deemed to be transiting a bottleneck.
These three changes can help to remove the non-bottleneck noise from These three changes can help to remove the non-bottleneck noise from
freq_est. freq_est.
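Step 2 above can be sketched as follows, reading the formula literally: invalid (NaN) var_base_T records are left out of the sum, while the divisor remains the total number of OWD samples over MT. The function name and the list-based history are assumptions.

```python
import math


def var_est_excluding_invalid(var_base_hist, numsamp_hist):
    """var_est = sum_MT(var_base_T != NaN) / num_MT(OWD).

    var_base_hist: var_base_T for each of the last M intervals; NaN
                   marks intervals where the flow was deemed not to be
                   transiting a bottleneck.
    numsamp_hist:  number of OWD samples in each of those intervals.
    """
    valid_sum = sum(v for v in var_base_hist if not math.isnan(v))
    total_samples = sum(numsamp_hist)
    # Assumption: return NaN when there are no samples at all.
    return valid_sum / total_samples if total_samples else math.nan
```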
3.3.2. Clock skew 3.4.2. Clock skew
Generally sender and receiver clock skew will be too small to cause Generally sender and receiver clock skew will be too small to cause
significant errors in the estimators. Skew_est is most sensitive to significant errors in the estimators. Skew_est and freq_est are the
this type of noise. In circumstances where clock skew is high, most sensitive to this type of noise due to their use of a mean OWD
basing skew_est only on the previous T's mean provides a noisier but calculated over a longer interval. In circumstances where clock skew
reliable signal. is high, basing skew_est only on the previous T's mean and ignoring
freq_est provides a noisier but reliable signal.
A better method is to estimate the effect the clock skew is having on A more sophisticated method is to estimate the effect the clock skew
the summary statistics, and then adjust statistics accordingly. A is having on the summary statistics, and then adjust statistics
simple online method of doing this based on min_T(OWD) will be accordingly. There are a number of techniques in the literature,
described here in a subsequent version of the draft. including [Zhang-Infocom02].
3.4. Reducing lag and Improving Responsiveness 3.5. Reducing lag and Improving Responsiveness
Measurement based shared bottleneck detection makes decisions in the Measurement based shared bottleneck detection makes decisions in the
present based on what has been measured in the past. This means that present based on what has been measured in the past. This means that
there is always a lag in responding to changing conditions. This there is always a lag in responding to changing conditions. This
mechanism is based on summary statistics taken over (N*T) seconds. mechanism is based on summary statistics taken over (N*T) seconds.
This mechanism can be made more responsive to changing conditions by: This mechanism can be made more responsive to changing conditions by:
1. Reducing N and/or M -- but at the expense of having less accurate 1. Reducing N and/or M -- but at the expense of having less accurate
metrics, and/or metrics, and/or
skipping to change at page 15, line 21 skipping to change at page 17, line 10
exponentially weighted moving average weights drop off too quickly exponentially weighted moving average weights drop off too quickly
for our requirements and have an infinite tail. A simple linearly for our requirements and have an infinite tail. A simple linearly
declining weighted moving average also does not provide enough weight declining weighted moving average also does not provide enough weight
to the most recent measurements. We propose a piecewise linear to the most recent measurements. We propose a piecewise linear
distribution of weights, such that the first section (samples 1:F) is distribution of weights, such that the first section (samples 1:F) is
flat as in a simple moving average, and the second section (samples flat as in a simple moving average, and the second section (samples
F+1:M) is linearly declining weights to the end of the averaging F+1:M) is linearly declining weights to the end of the averaging
window. We choose integer weights, which allows incremental window. We choose integer weights, which allows incremental
calculation without introducing rounding errors. calculation without introducing rounding errors.
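The piecewise linear weights can be generated as in the sketch below. It takes the flat weight to be (M-F+1) and the declining weights to run from (M-F) down to 1, matching the weighted average formulas for skew_est and var_est. The function name is hypothetical.

```python
def piecewise_linear_weights(m, f):
    """Integer weights for samples 1..M, most recent first.

    Samples 1:F share the flat weight (M-F+1); samples F+1:M decline
    linearly from (M-F) down to 1.
    """
    if not 1 <= f <= m:
        raise ValueError("require 1 <= F <= M")
    flat = [m - f + 1] * f
    declining = list(range(m - f, 0, -1))
    return flat + declining
```

For example, M=6 and F=2 (illustrative values only) give the weights 5, 5, 4, 3, 2, 1. Setting F=M reduces the scheme to a simple moving average.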
3.4.1. Improving the response of the skewness estimate 3.5.1. Improving the response of the skewness estimate
The weighted moving average for skew_est, based on skew_est in The weighted moving average for skew_est, based on skew_est in
Section 3.1.2, can be calculated as follows: Section 3.2.2, can be calculated as follows:
skew_est = ((M-F+1)*sum(skew_base_T(1:F)) skew_est = ((M-F+1)*sum(skew_base_T(1:F))
+ sum([(M-F):1].*skew_base_T(F+1:M))) + sum([(M-F):1].*skew_base_T(F+1:M)))
/ ((M-F+1)*sum(numsampT(1:F)) / ((M-F+1)*sum(numsampT(1:F))
+ sum([(M-F):1].*numsampT(F+1:M))) + sum([(M-F):1].*numsampT(F+1:M)))
where numsampT is an array of the number of OWD samples in each T where numsampT is an array of the number of OWD samples in each T
skipping to change at page 17, line 5 skipping to change at page 19, line 5
11. sum_skewbase = sum_skewbase + skewbase_hist(F+1) - old_skewbase 11. sum_skewbase = sum_skewbase + skewbase_hist(F+1) - old_skewbase
12. sum_numsamp = sum_numsamp + numsampT(1) - old_numsampT 12. sum_numsamp = sum_numsamp + numsampT(1) - old_numsampT
13. skew_est = ((M-F+1)*F_skewbase + W_D_skewbase) / 13. skew_est = ((M-F+1)*F_skewbase + W_D_skewbase) /
((M-F+1)*F_numsamp+W_D_numsamp) ((M-F+1)*F_numsamp+W_D_numsamp)
Where cycle(....) refers to the operation on a cyclic buffer where Where cycle(....) refers to the operation on a cyclic buffer where
the start of the buffer is now the next element in the buffer. the start of the buffer is now the next element in the buffer.
3.4.2. Improving the response of the variability estimate 3.5.2. Improving the response of the variability estimate
Similarly the weighted moving average for var_est can be calculated Similarly the weighted moving average for var_est can be calculated
as follows: as follows:
var_est = ((M-F+1)*sum(var_base_T(1:F)) var_est = ((M-F+1)*sum(var_base_T(1:F))
+ sum([(M-F):1].*var_base_T(F+1:M))) + sum([(M-F):1].*var_base_T(F+1:M)))
/ ((M-F+1)*sum(numsampT(1:F)) / ((M-F+1)*sum(numsampT(1:F))
+ sum([(M-F):1].*numsampT(F+1:M))) + sum([(M-F):1].*numsampT(F+1:M)))
where numsampT is an array of the number of OWD samples in each T where numsampT is an array of the number of OWD samples in each T
(i.e. num_T(OWD)), and numsampT(1) is the most recent; var_base_T(1) (i.e. num_T(OWD)), and numsampT(1) is the most recent; var_base_T(1)
is the most recent calculation of var_base_T; 1:F refers to the is the most recent calculation of var_base_T; 1:F refers to the
integer values 1 through to F, and [(M-F):1] refers to an array of integer values 1 through to F, and [(M-F):1] refers to an array of
the integer values (M-F) declining through to 1; and ".*" is the the integer values (M-F) declining through to 1; and ".*" is the
array scalar dot product operator. When removing oscillation noise array scalar dot product operator. When removing oscillation noise
(see Section 3.3.1) this calculation must be adjusted to allow for (see Section 3.4.1) this calculation must be adjusted to allow for
invalid var_base_T records. invalid var_base_T records.
Var_est can be calculated incrementally in the same way as skew_est Var_est can be calculated incrementally in the same way as skew_est
in Section 3.4.1. However, note that the buffer numsampT is used for in Section 3.5.1. However, note that the buffer numsampT is used for
both calculations so the operations on it should not be repeated. both calculations so the operations on it should not be repeated.
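A direct, non-incremental sketch of the weighted moving average is given below. The same function serves skew_est (with skew_base_T values) and var_est (with var_base_T values). The function name and the list layout, with index 0 holding the most recent T, are assumptions, and the adjustment for invalid var_base_T records is not handled.

```python
def weighted_estimate(base_hist, numsamp_hist, f):
    """Piecewise linear weighted moving average over M intervals.

    base_hist:    skew_base_T or var_base_T per interval, most recent first.
    numsamp_hist: num_T(OWD) per interval, in the same order.
    f:            length F of the flat section of the weights.
    """
    m = len(base_hist)
    if len(numsamp_hist) != m or not 1 <= f <= m:
        raise ValueError("histories must have equal length M with 1 <= F <= M")
    # Flat weight (M-F+1) for samples 1:F, then (M-F) declining to 1.
    weights = [m - f + 1] * f + list(range(m - f, 0, -1))
    numerator = sum(w * b for w, b in zip(weights, base_hist))
    denominator = sum(w * n for w, n in zip(weights, numsamp_hist))
    # Assumption: report 0.0 when the window holds no OWD samples.
    return numerator / denominator if denominator else 0.0
```

This version recomputes both sums every T. The incremental form in Section 3.5.1 avoids that cost by updating running sums.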
4. Measuring OWD 4. Measuring OWD
This section discusses the OWD measurements required for this This section discusses the OWD measurements required for this
algorithm to detect shared bottlenecks. algorithm to detect shared bottlenecks.
The SBD mechanism described in this draft relies on differences The SBD mechanism described in this draft relies on differences
between OWD measurements to avoid the practical problems with between OWD measurements to avoid the practical problems with
measuring absolute OWD (see [Hayes-LCN14] section IIIC). Since all measuring absolute OWD (see [Hayes-LCN14] section IIIC). Since all
skipping to change at page 18, line 38 skipping to change at page 20, line 38
Non-authenticated RTCP packets carrying shared bottleneck indications Non-authenticated RTCP packets carrying shared bottleneck indications
and summary statistics could allow attackers to alter the bottleneck and summary statistics could allow attackers to alter the bottleneck
sharing characteristics for private gain or disruption of other sharing characteristics for private gain or disruption of other
parties' communication. parties' communication.
9. Change history 9. Change history
Changes made to this document: Changes made to this document:
WG-03->WG-04 : Add M to terminology table, suggest skew_est based
on previous T and no freq_est in clock skew
section, feedback requirements as a separate sub
section.
WG-02->WG-03 : Correct misspelled author WG-02->WG-03 : Correct misspelled author
WG-01->WG-02 : Removed ambiguity associated with the term WG-01->WG-02 : Removed ambiguity associated with the term
"congestion". Expanded the description of "congestion". Expanded the description of
initialisation messages. Removed PDV metric. initialisation messages. Removed PDV metric.
Added description of incremental weighted metric Added description of incremental weighted metric
calculations for skew_est. Various clarifications calculations for skew_est. Various clarifications
based on implementation work. Fixed typos and based on implementation work. Fixed typos and
tuned parameters. tuned parameters.
skipping to change at page 19, line 23 skipping to change at page 21, line 28
notation to make it clearer. Some tightening of notation to make it clearer. Some tightening of
the thresholds. the thresholds.
00->01 : Revisions to terminology for clarity 00->01 : Revisions to terminology for clarity
10. References 10. References
10.1. Normative References 10.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/ Requirement Levels", BCP 14, RFC 2119,
RFC2119, March 1997, DOI 10.17487/RFC2119, March 1997,
<http://www.rfc-editor.org/info/rfc2119>. <http://www.rfc-editor.org/info/rfc2119>.
10.2. Informative References 10.2. Informative References
[Hayes-LCN14] [Hayes-LCN14]
Hayes, D., Ferlin, S., and M. Welzl, "Practical Passive Hayes, D., Ferlin, S., and M. Welzl, "Practical Passive
Shared Bottleneck Detection using Shape Summary Shared Bottleneck Detection using Shape Summary
Statistics", Proc. the IEEE Local Computer Networks (LCN) Statistics", Proc. the IEEE Local Computer Networks
p150-158, September 2014, <http://heim.ifi.uio.no/davihay/ (LCN) pp150-158, September 2014,
<http://heim.ifi.uio.no/davihay/
hayes14__pract_passiv_shared_bottl_detec-abstract.html>. hayes14__pract_passiv_shared_bottl_detec-abstract.html>.
[I-D.welzl-rmcat-coupled-cc] [I-D.ietf-rmcat-coupled-cc]
Welzl, M., Islam, S., and S. Gjessing, "Coupled congestion Islam, S., Welzl, M., and S. Gjessing, "Coupled congestion
control for RTP media", draft-welzl-rmcat-coupled-cc-04 control for RTP media", draft-ietf-rmcat-coupled-cc-00
(work in progress), October 2014. (work in progress), September 2015.
[ITU-Y1540]
ITU-T, "Internet Protocol Data Communication Service - IP
Packet Transfer and Availability Performance Parameters",
Series Y: Global Information Infrastructure, Internet
Protocol Aspects and Next-Generation Networks , March
2011, <http://www.itu.int/rec/T-REC-Y.1540-201103-I/en>.
[RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V.
Jacobson, "RTP: A Transport Protocol for Real-Time Jacobson, "RTP: A Transport Protocol for Real-Time
Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550, Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550,
July 2003, <http://www.rfc-editor.org/info/rfc3550>. July 2003, <http://www.rfc-editor.org/info/rfc3550>.
[RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey, [RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey,
"Extended RTP Profile for Real-time Transport Control "Extended RTP Profile for Real-time Transport Control
Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585, DOI Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585,
10.17487/RFC4585, July 2006, DOI 10.17487/RFC4585, July 2006,
<http://www.rfc-editor.org/info/rfc4585>. <http://www.rfc-editor.org/info/rfc4585>.
[RFC5124] Ott, J. and E. Carrara, "Extended Secure RTP Profile for [RFC5124] Ott, J. and E. Carrara, "Extended Secure RTP Profile for
Real-time Transport Control Protocol (RTCP)-Based Feedback Real-time Transport Control Protocol (RTCP)-Based Feedback
(RTP/SAVPF)", RFC 5124, DOI 10.17487/RFC5124, February (RTP/SAVPF)", RFC 5124, DOI 10.17487/RFC5124, February
2008, <http://www.rfc-editor.org/info/rfc5124>. 2008, <http://www.rfc-editor.org/info/rfc5124>.
[RFC5481] Morton, A. and B. Claise, "Packet Delay Variation
Applicability Statement", RFC 5481, DOI 10.17487/RFC5481,
March 2009, <http://www.rfc-editor.org/info/rfc5481>.
[RFC6817] Shalunov, S., Hazel, G., Iyengar, J., and M. Kuehlewind, [RFC6817] Shalunov, S., Hazel, G., Iyengar, J., and M. Kuehlewind,
"Low Extra Delay Background Transport (LEDBAT)", RFC 6817, "Low Extra Delay Background Transport (LEDBAT)", RFC 6817,
DOI 10.17487/RFC6817, December 2012, DOI 10.17487/RFC6817, December 2012,
<http://www.rfc-editor.org/info/rfc6817>. <http://www.rfc-editor.org/info/rfc6817>.
[Zhang-Infocom02]
Zhang, L., Liu, Z., and H. Xia, "Clock synchronization
algorithms for network measurements", Proc. the IEEE
International Conference on Computer Communications
(INFOCOM) pp160-169, September 2002,
<http://dx.doi.org/10.1109/INFCOM.2002.1019257>.
Authors' Addresses Authors' Addresses
David Hayes (editor) David Hayes (editor)
University of Oslo University of Oslo
PO Box 1080 Blindern PO Box 1080 Blindern
Oslo N-0316 Oslo N-0316
Norway Norway
Phone: +47 2284 5566 Phone: +47 2284 5566
Email: davihay@ifi.uio.no Email: davihay@ifi.uio.no
 End of changes. 52 change blocks. 
110 lines changed or deleted 168 lines changed or added
