Shared Bottleneck Detection for Coupled Congestion Control for
RTP Media.
University of Oslo, PO Box 1080 Blindern, Oslo, N-0316, Norway, +47 2284 5566, davihay@ifi.uio.no
Simula Research Laboratory, P.O. Box 134, Lysaker, 1325, Norway, +47 4072 0702, ferlin@simula.no
University of Oslo, PO Box 1080 Blindern, Oslo, N-0316, Norway, +47 2285 2420, michawe@ifi.uio.no
General
RTP Media Congestion Avoidance Techniques
SBD

This document describes a mechanism to detect whether
end-to-end data flows
share a common bottleneck. It relies on summary statistics that are calculated by
a data receiver based on continuous measurements and regularly fed to a grouping algorithm that
runs wherever the knowledge is needed. This mechanism complements the coupled congestion
control mechanism in draft-welzl-rmcat-coupled-cc.

In the Internet, it is not normally known if flows (e.g., TCP connections or UDP data streams)
traverse the same bottlenecks. Even flows that have the same sender and receiver may take
different paths and share a bottleneck or not. Flows that share a bottleneck link usually
compete with one another for their share of the capacity. This competition has the potential
to increase packet loss and delays. This is especially relevant for interactive applications
that communicate simultaneously with multiple peers (such as multi-party video). For RTP
media applications such as RTCWEB, draft-welzl-rmcat-coupled-cc describes
a scheme that combines
the congestion controllers of flows in order to honor their priorities and avoid unnecessary
packet loss as well as delay.
This mechanism relies on some form of Shared Bottleneck Detection (SBD); here, a
measurement-based SBD approach is described.

The current Internet is unable to explicitly inform
endpoints as to which flows share bottlenecks, so endpoints
need to infer this from packet loss and packet delay.

Packet loss is often a relatively rare
signal. On its own it is therefore of limited use for
SBD; however, it is a valuable supplementary measure when
it is more prevalent.

End-to-end delay measurements include noise from every
device along the path in addition to the delay
perturbation at the bottleneck device. The noise is
often significantly increased if the round-trip time is used. The
cleanest signal is obtained by using One-Way-Delay
(OWD).

Measuring absolute OWD is difficult since it
requires both the sender and receiver clocks to be
synchronised. However, since the statistics being
collected are relative to the mean OWD, a relative OWD
measurement is sufficient. Clock drift is not usually
significant over the time intervals used by this SBD
mechanism (see RFC 6817, Appendix A.2, for a
discussion on clock drift and OWD measurements).

Each packet arriving at the bottleneck buffer may
experience very different queue lengths, and therefore different
waiting times. A single OWD sample therefore does not
characterize the actual OWD of a path well. However,
multiple OWD measurements do reflect the distribution of
delays experienced at the bottleneck.

Flows that share a common bottleneck may traverse
different paths, and these paths will often have different
base delays. This makes it difficult to correlate changes
in delay or loss. This technique uses the long term shape
of the delay distribution as a base for comparison to
counter this.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119.

Acronyms used in this document:

   OWD -- One Way Delay
   RTT -- Round Trip Time
   SBD -- Shared Bottleneck Detection

Conventions used in this document:

   T -- the base time interval over which measurements
      are made.
   N -- the number of base time, T, intervals
      used in some calculations.
   sum_T(...) -- summation of all the
      measurements of the variable in parentheses taken over the
      interval T
   sum_N(...) -- summation of N terms of the variable in parentheses
   sum_NT(...) -- summation of all
      measurements taken over the interval N*T
   E_T(...) -- the expectation or mean of the
      measurements of the variable in parentheses over T
   E_N(...) -- the expectation or mean of the last N values of
      the variable in parentheses
   max_T(...) -- the maximum recorded measurement
      of the variable in parentheses taken over the interval T
   min_T(...) -- the minimum recorded measurement
      of the variable in parentheses taken over the interval T
   num_T(...) -- the count of measurements of the
      variable in parentheses taken in the interval T
   c_s, p_l, p_f, p_pdv, p_s, p_d, p_v -- various
      thresholds used in the mechanism.

The reference "Practical Passive Shared Bottleneck Detection using Shape
Summary Statistics" uses T=350ms,
N=50, p_l = 0.1,
p_f = 0.2, p_pdv = 0.3, c_s = 0.0, p_s = p_d = p_v = 0.2. These are
values that seem to work well over a wide range of practical
Internet conditions.

The mechanism described in this document is based on the
observation that the distributions of delay measurements of
packets from flows that share a
common bottleneck have similar shape characteristics. These
shape characteristics are described using 3 key summary
statistics:

   o  variance (estimate PDV)
   o  skewness (estimate skew_est)
   o  oscillation (estimate freq_est)

Summary statistics help to address both the noise and the
path lag problems by describing the general shape over a
relatively long period of time. This is sufficient for their
application in coupled congestion control for RTP Media. They
can be signalled from a receiver, which measures the OWD and calculates
the summary statistics, to a sender, which is the entity that is transmitting
the media stream. An RTP Media device may
be both a sender and a receiver. SBD can be performed at either
the sender or the receiver, or at both.

In a topology with three hosts, H1, H2 and H3, there are two possible cases
for shared bottleneck detection: a sender-based and a
receiver-based case.
Sender-based: consider a situation where host H1 sends media
streams to hosts H2 and H3, and L1 is a shared bottleneck.
H2 and H3 measure the OWD and calculate summary statistics,
which they send to H1 every T. H1, having this knowledge,
can determine the shared bottleneck and accordingly control
the send rates.

Receiver-based: consider that H2 is also sending media to
H3, and L3 is a shared bottleneck. If H3 sends summary
statistics to H1 and H2, neither H1 nor H2 alone obtains
enough knowledge to detect this shared bottleneck; H3 can
however determine it by combining the summary statistics
related to H1 and H2, respectively. This case is applicable
when send rates are controlled by the receiver; then, the
signal from H3 to the senders contains the sending rate.

A discussion of the required signaling for the receiver-based
case is beyond the scope of this document. For the sender-based
case, the messages and their data format will be defined here in
future versions of this document. We envision that an
initialization message from the sender to the receiver could
specify which key metrics are requested out of a possibly
extensible set (PL_NT, PDV, skew_est, freq_est).
The grouping algorithm described in this
document requires all four of these metrics, and receivers MUST be
able to provide them,
but future algorithms may be able to exploit other metrics
(e.g. metrics based on explicit network signals).
Moreover, the initialization message could
specify T, N, and the necessary resolution and precision (number of bits
per field).
Measurements are calculated over a base interval,
T. T should be long enough to provide enough samples
for a good estimate of skewness, but short enough so that
a measure of the oscillation can be made from N of these
estimates. The reference "Practical Passive Shared Bottleneck Detection using
Shape Summary Statistics"
uses T = 350ms and N = 50,
which are values that seem to work well over a wide range
of practical Internet conditions.

The mean delay is not a useful signal for comparisons
between flows since flows may traverse quite different paths
and clocks will not necessarily be synchronized. However, it
is a base measure for the 3 summary statistics. The mean
delay, E_T(OWD), is the average one way delay measured over
T.

To facilitate the other calculations, the last N
E_T(OWD) values will need to be stored in a cyclic buffer
along with the moving
average of E_T(OWD):

   E_N(E_T(OWD)) = sum_N(E_T(OWD)) / N

Skewness is difficult to calculate efficiently and
accurately. Ideally it should be calculated over the entire
period (N * T) from the mean OWD over that period. However this
would require storing every delay measurement over the
period. Instead, an estimate is made over T using the
previous calculation of E_T(OWD). Comparisons are made
using the mean of N skew estimates.

The skewness is estimated using two counters, counting
the number of one way delay samples (OWD) above and below the
mean:

   skew_est = (sum_T(OWD < E_NT(OWD)) - sum_T(OWD > E_NT(OWD))) / num_T(OWD)

where

   if (OWD < E_NT(OWD)) 1 else 0
   if (OWD > E_NT(OWD)) 1 else 0
   skew_est is a number between -1 and 1

   E_N(skew_est) = sum_N(skew_est) / N

For implementation ease, E_NT(OWD) does not include the mean
of the current T interval. Care must be taken when implementing the
comparisons to ensure that rounding does not bias
skew_est.

Packet Delay Variation (PDV) (RFC 5481
and the ITU-T document "Internet protocol data communication service - IP
packet transfer and availability performance parameters")
is used as an estimator of
the variance of the delay signal. We define PDV
as follows:

   PDV = (max_T(OWD) - E_T(OWD))

   E_N(PDV) = sum_N(PDV) / N

This modifies PDV as outlined in RFC 5481
to provide a summary statistic version that best
aids the grouping decisions of the algorithm (see section IVB of "Practical
Passive Shared Bottleneck Detection using Shape Summary Statistics").

The use of PDV = (E_T(OWD) - min_T(OWD)) is currently
being investigated as an alternative that is less sensitive
to noise.

An estimate of the low frequency oscillation of the delay
signal is calculated by counting and normalising the significant mean,
E_T(OWD), crossings of E_N(E_T(OWD)):

   freq_est = number_of_crossings / N

where
we define a significant mean crossing as a crossing
that extends p_v * E_N(PDV) from E_N(E_T(OWD)). In our
experiments we have found that p_v = 0.2 is a good
value.
Freq_est is a number between 0 and 1. Freq_est
can be approximated incrementally as follows:

   o  With each new calculation of E_T(OWD) a decision is
      made as to whether this value of E_T(OWD) significantly
      crosses the current long term mean, E_N(E_T(OWD)), with respect to
      the previous significant mean crossing.

   o  A cyclic buffer, last_N_crossings, records a 1 if there is a significant
      mean crossing, otherwise a 0.

   o  The counter, number_of_crossings, is incremented when there
      is a significant mean crossing and decremented when a
      non-zero value is removed from last_N_crossings.
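
The incremental bookkeeping above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the class name FreqEstimator, its method names, and the choice to count a crossing only when a significant excursion lands on the opposite side of the previous significant one are assumptions made here to interpret "with respect to the previous significant mean crossing". The caller is assumed to supply the current E_N(E_T(OWD)) and E_N(PDV).

```python
from collections import deque


class FreqEstimator:
    """Incremental approximation of freq_est over the last N intervals."""

    def __init__(self, n=50, p_v=0.2):
        self.n = n
        self.p_v = p_v
        self.last_n_crossings = deque()  # cyclic buffer of 0/1 flags
        self.number_of_crossings = 0
        self.last_side = 0  # +1 above, -1 below, 0 = nothing significant yet

    def update(self, e_t_owd, long_term_mean, mean_pdv):
        """Feed one new E_T(OWD) and return the updated freq_est."""
        margin = self.p_v * mean_pdv
        if e_t_owd > long_term_mean + margin:
            side = 1
        elif e_t_owd < long_term_mean - margin:
            side = -1
        else:
            side = 0  # inside the band around the mean: not significant

        # Assumption: a crossing is counted only when the new significant
        # excursion is on the opposite side of the previous significant one.
        crossed = 1 if side != 0 and side == -self.last_side else 0
        if side != 0:
            self.last_side = side

        # Drop the oldest flag once N flags are stored, adjusting the counter.
        if len(self.last_n_crossings) == self.n:
            self.number_of_crossings -= self.last_n_crossings.popleft()
        self.last_n_crossings.append(crossed)
        self.number_of_crossings += crossed
        return self.number_of_crossings / self.n


est = FreqEstimator(n=4, p_v=0.2)
# A long term mean of 10.0 and E_N(PDV) of 5.0 give a margin of 1.0.
samples = [12.0, 8.0, 10.5, 12.0, 12.0, 12.0]
history = [est.update(x, 10.0, 5.0) for x in samples]
print(history)
```

The small N = 4 in the usage lines is chosen only to make the eviction of old flags visible; the document's suggested value is N = 50.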
This approximation of freq_est was not used in "Practical Passive Shared
Bottleneck Detection using Shape Summary Statistics", which calculated freq_est every T
using the current E_N(E_T(OWD)). Our tests show that
this approximation of freq_est yields results that are almost
identical to when the full calculation is performed every T.

The proportion of packets lost is used as a supplementary
measure:

   PL_NT = sum_NT(lost packets) / sum_NT(total packets)

The following grouping algorithm is RECOMMENDED for SBD
in this context and is sufficient and efficient for small to
moderate numbers of flows. For very large numbers of flows
(e.g. hundreds), a more complex clustering algorithm may be
substituted.

Since no single metric is precise enough to group flows
(due to noise), the algorithm uses multiple metrics. Each
metric offers a different "view" of the bottleneck link
characteristics, and used together they enable a more precise
grouping of flows than would otherwise be possible.

Flows determined to be experiencing congestion are
successively divided into groups based on freq_est, PDV, and
skew_est.

The first step is to determine which flows are
experiencing congestion. This is important, since if a flow
is not experiencing congestion its delay based metrics will
not describe the bottleneck, but the "noise" from the rest
of the path. Skewness, with the proportion of packets lost as a
supplementary measure, is used to do this:

Grouping will be performed on flows where:

   E_N(skew_est) < c_s || PL_NT > p_l

The parameter c_s controls how sensitive the mechanism is
in detecting congestion. c_s = 0.0 was used in "Practical Passive Shared
Bottleneck Detection using Shape Summary Statistics". A value of c_s = 0.05 is a little
more sensitive, and c_s = -0.05 is a little less sensitive.

These flows, that is, the flows experiencing congestion, are then
progressively divided into groups based on the freq_est, PDV,
and skew_est summary statistics. The process proceeds
according to the following steps:

   1. Group flows whose difference in sorted freq_est is less than a
      threshold:

         diff(freq_est) < p_f

   2. Group flows whose difference in sorted E_N(PDV) is less than a
      threshold:

         diff(E_N(PDV)) < (p_pdv * E_N(PDV))

   3. Group flows whose difference in sorted E_N(skew_est) or
      PL_NT is less than a threshold:

         if PL_NT < p_l
            diff(E_N(skew_est)) < p_s
         otherwise
            diff(PL_NT) < p_d

This procedure involves sorting the groups, according to
the measure being used to divide them. It is simple to
implement, and efficient for small numbers of flows, such as
are expected in RTCWEB.

A grouping decision is made every T from the second T,
though the decisions will not attain their full design accuracy until
after the N'th T interval.

Network
conditions can cause bottlenecks to fluctuate. A coupled
congestion controller MAY decide only to couple groups that
remain stable, say grouped together 90% of the time,
depending on its objectives. Recommendations concerning this are
beyond the scope of this draft and will be specific to the
coupled congestion controller's objectives.

This section discusses the OWD measurements required for this
algorithm to detect shared bottlenecks.
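
As a minimal Python sketch of why a relative OWD measurement is enough, the code below computes E_T(OWD), skew_est and PDV for one interval from receive-minus-send timestamp differences, and checks that a constant receiver clock offset changes neither skew_est nor PDV. The function names, the sample timestamps, the offset value, and the stand-in long term mean are all hypothetical and chosen only for illustration.

```python
def interval_stats(owd_samples, long_term_mean):
    """Return (E_T(OWD), skew_est, PDV) for the samples of one interval T."""
    count = len(owd_samples)
    mean_t = sum(owd_samples) / count
    below = sum(1 for owd in owd_samples if owd < long_term_mean)
    above = sum(1 for owd in owd_samples if owd > long_term_mean)
    skew_est = (below - above) / count
    pdv = max(owd_samples) - mean_t
    return mean_t, skew_est, pdv


def relative_owd(send_times, recv_times):
    """Receive time minus send time, each read from its own local clock."""
    return [r - s for s, r in zip(send_times, recv_times)]


# Hypothetical timestamps in integer microseconds.
send = [0, 20_000, 40_000, 60_000]
true_delay = [20_000, 21_000, 20_500, 26_500]
offset = 5_000_000  # unknown but constant receiver clock offset

recv_synced = [s + d for s, d in zip(send, true_delay)]
recv_offset = [r + offset for r in recv_synced]

# 21_000 stands in for a long term mean taken from earlier intervals;
# measured with the offset clock, that mean carries the same offset.
_, skew_a, pdv_a = interval_stats(relative_owd(send, recv_synced), 21_000)
_, skew_b, pdv_b = interval_stats(relative_owd(send, recv_offset), 21_000 + offset)
print(skew_a, pdv_a)
print(skew_b, pdv_b)
```

Both statistics depend only on differences between OWD samples and a mean taken with the same clocks, so the offset never needs to be known.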
The SBD mechanism described in
this draft relies on differences between OWD measurements to avoid the
practical problems with measuring absolute OWD (see section IIIC of "Practical
Passive Shared Bottleneck Detection using Shape Summary Statistics"). Since all summary statistics are
relative to the mean OWD and sender/receiver clock offsets
are approximately constant over the measurement periods, the
offset is subtracted out in the calculation.

The SBD mechanism requires timing information precise enough
to be able to make comparisons. As a rule of thumb, the time
resolution should be less than one hundredth of a typical path's range
of delays. In general, the coarser the time resolution, the more
care needs to be taken to ensure rounding errors don't bias the
skewness calculation.

Typical RTP media flows use sub-millisecond timers,
which should be adequate in most situations.

This work was part-funded by the European Community under its
Seventh Framework Programme through the Reducing Internet
Transport Latency (RITE) project (ICT-317700). The views
expressed are solely those of the authors.

This memo includes no request to IANA.

The security considerations of RFC
3550, RFC 4585, and RFC 5124 are
expected to apply.

Non-authenticated RTCP packets carrying shared bottleneck indications and summary
statistics could allow attackers to alter the bottleneck sharing
characteristics for private gain or disruption of other parties'
communication.
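
RFC 2119: Key words for use in RFCs to Indicate Requirement Levels
RFC 3550: RTP: A Transport Protocol for Real-Time Applications
RFC 4585: Extended RTP Profile for Real-time Transport Control Protocol (RTCP)-Based Feedback (RTP/AVPF)
RFC 5124: Extended Secure RTP Profile for Real-time Transport Control Protocol (RTCP)-Based Feedback (RTP/SAVPF)
RFC 5481: Packet Delay Variation Applicability Statement
RFC 6817: Low Extra Delay Background Transport (LEDBAT)
draft-welzl-rmcat-coupled-cc: Coupled congestion control for RTP media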
Practical Passive Shared Bottleneck Detection using Shape
Summary Statistics. University of Oslo; Simula Research Laboratory; University of Oslo.
Internet protocol data communication service - IP
packet transfer and availability performance
parameters. ITU-T.