
Internet Engineering Task Force                              Sally Floyd
INTERNET-DRAFT                                                      ICIR
draft-ietf-dccp-ccid3-05.txt                                Eddie Kohler
Expires: August 2004                                                UCLA
                                                         Jitendra Padhye
                                                      Microsoft Research
                                                        16 February 2004


               Profile for DCCP Congestion Control ID 3:
                        TFRC Congestion Control


Status of this Memo

    This document is an Internet-Draft and is in full conformance with
    all provisions of Section 10 of RFC2026.

    Internet-Drafts are working documents of the Internet Engineering
    Task Force (IETF), its areas, and its working groups.  Note that
    other groups may also distribute working documents as Internet-
    Drafts.

    Internet-Drafts are draft documents valid for a maximum of six
    months and may be updated, replaced, or obsoleted by other documents
    at any time. It is inappropriate to use Internet-Drafts as reference
    material or to cite them other than as "work in progress."

    The list of current Internet-Drafts can be accessed at
    http://www.ietf.org/ietf/1id-abstracts.txt

    The list of Internet-Draft Shadow Directories can be accessed at
    http://www.ietf.org/shadow.html

Copyright Notice

    Copyright (C) The Internet Society (2004). All Rights Reserved.

Abstract

    This document contains the profile for Congestion Control Identifier
    3, TCP-Friendly Rate Control (TFRC), in the Datagram Congestion
    Control Protocol (DCCP).  CCID 3 should be used by senders that want
    a TCP-friendly send rate, possibly with Explicit Congestion
    Notification (ECN), while minimizing abrupt rate changes.



Floyd/Kohler/Padhye                                             [Page 1]

INTERNET-DRAFT            Expires: August 2004             February 2004


    TO BE DELETED BY THE RFC EDITOR UPON PUBLICATION:

    Changes from draft-ietf-dccp-ccid3-03.txt:

    * Added more text to the section on Congestion Control on Data
    Packets to make it more readable, and to summarize the key
    mechanisms specified in the TFRC spec.

    * Said that it is OK to use an initial sending rate of 2-4 pkts/RTT,
    based on RFC 3390.  And that in the future an initial sending rate
    of up to 8 pkts/RTT might be specified, for very small packets.

    * Receive Rate is measured in bytes per second, as RFC 3448
    specifies.

    * New definition of Loss Intervals option, because old definition
    was 24-bit-sequence-number specific; and add an example.

    Changes from draft-ietf-dccp-ccid3-02.txt:

    * Added to the section on Application Requirements.

    * Added a section on Packet Sizes.

    Changes from draft-ietf-dccp-ccid3-01.txt:

    * Added "Security Considerations" and "IANA Considerations"
    sections.

    * Store Window Counter in the DCCP header's CCVal field, not a
    separate option.

    * Add to the description of a loss interval in the Loss Intervals
    option: a loss interval includes at most one round-trip time's worth
    of possibly-marked packets, and at least one round-trip time's worth
    of packets in all.

    * Added a description of when the loss event rate calculated by the
    sender could differ from that calculated by the receiver.

    * Window counter fixups.

    * Add Use Loss Intervals and Use Loss Event Rate features, and
    explain their interaction.

    * Move Elapsed Time option to DCCP's main specification (and
    simultaneously change its units to tenths of milliseconds). Allow
    the use of either Elapsed Time or Timestamp Echo.

    * Clarify the definition of quiescence.

    * Change calculations for determining loss events to take window
    counter wrapping into account.

    Changes from draft-ietf-dccp-ccid3-00.txt:

    * Changed the guidelines to say that required acknowledgement
    packets should include one or more of the following:  The Loss Event
    Rate, Loss Intervals, or the Ack Vector.

    * Added a separate section on "The Use of Ack Vectors".  This
    section says that Ack-of-acks must be used when the Ack Vector is
    used.

    * Renamed the "ECN Nonce Option" to the "Loss Intervals" option, and
    extended this option to include up to eight loss intervals.  This is
    to enable more precise verification by the sender of the receiver's
    feedback.

    * Added a section about "When should Ack Vector or Loss Intervals be
    used?"  In progress.

    * Added a section about using the ECN Nonce to verify the receiver's
    feedback.

    * Said that the ECN-Nonce feedback must be returned in every
    required acknowledgement.

    * Added a sentence saying that the TFRC spec "separately specifies
    the minimum sending rate from rate reductions during an idle
    period."


                             Table of Contents

    1. Introduction. . . . . . . . . . . . . . . . . . . . . . . . .   5
    2. Conventions . . . . . . . . . . . . . . . . . . . . . . . . .   5
    3. Usage . . . . . . . . . . . . . . . . . . . . . . . . . . . .   6
       3.1. Example Half-Connection. . . . . . . . . . . . . . . . .   6
       3.2. Updates. . . . . . . . . . . . . . . . . . . . . . . . .   7
    4. Connection Establishment. . . . . . . . . . . . . . . . . . .   7
    5. Congestion Control on Data Packets. . . . . . . . . . . . . .   7
       5.1. Response to Data Dropped . . . . . . . . . . . . . . . .   9
       5.2. Packet Sizes . . . . . . . . . . . . . . . . . . . . . .   9
    6. Acknowledgements. . . . . . . . . . . . . . . . . . . . . . .   9
       6.1. Congestion Control on Acknowledgements . . . . . . . . .  10
       6.2. Quiescence . . . . . . . . . . . . . . . . . . . . . . .  10
       6.3. Acknowledgements of Acknowledgements . . . . . . . . . .  11
    7. Explicit Congestion Notification. . . . . . . . . . . . . . .  11
    8. Options and Features. . . . . . . . . . . . . . . . . . . . .  12
       8.1. Window Counter Value . . . . . . . . . . . . . . . . . .  13
       8.2. Elapsed Time Options . . . . . . . . . . . . . . . . . .  14
       8.3. Receive Rate Option. . . . . . . . . . . . . . . . . . .  14
       8.4. Send Loss Event Rate Feature . . . . . . . . . . . . . .  15
       8.5. Loss Event Rate Option . . . . . . . . . . . . . . . . .  15
       8.6. Send Loss Intervals Feature. . . . . . . . . . . . . . .  15
       8.7. Loss Intervals Option. . . . . . . . . . . . . . . . . .  16
          8.7.1. Loss Interval Definition. . . . . . . . . . . . . .  16
          8.7.2. Option Details. . . . . . . . . . . . . . . . . . .  17
          8.7.3. Example . . . . . . . . . . . . . . . . . . . . . .  18
    9. Verifying Congestion Control Compliance With
    ECN. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  19
       9.1. Verifying the ECN Nonce Echo . . . . . . . . . . . . . .  20
       9.2. Verifying the Reported Loss Event Rate . . . . . . . . .  20
    10. Design Considerations. . . . . . . . . . . . . . . . . . . .  21
       10.1. Possible Changes to the Initial Window. . . . . . . . .  21
       10.2. Determining Loss Events at the Receiver . . . . . . . .  22
       10.3. Sending Feedback Packets. . . . . . . . . . . . . . . .  23
       10.4. When Should Ack Vector And Loss Intervals
       Be Used?. . . . . . . . . . . . . . . . . . . . . . . . . . .  24
    11. Security Considerations. . . . . . . . . . . . . . . . . . .  25
    12. IANA Considerations. . . . . . . . . . . . . . . . . . . . .  25
    13. Thanks . . . . . . . . . . . . . . . . . . . . . . . . . . .  25
    Normative References . . . . . . . . . . . . . . . . . . . . . .  25
    Informative References . . . . . . . . . . . . . . . . . . . . .  26
    Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . .  26
    Intellectual Property Notice . . . . . . . . . . . . . . . . . .  27

1.  Introduction

    This document contains the profile for Congestion Control Identifier
    3, TCP-Friendly Rate Control (TFRC), in the Datagram Congestion
    Control Protocol (DCCP) [DCCP]. DCCP uses Congestion Control
    Identifiers, or CCIDs, to specify the congestion control mechanism
    in use on a half-connection.  (A half-connection might consist of
    data packets sent from DCCP A to DCCP B, plus acknowledgements sent
    from DCCP B to DCCP A. DCCP A is the HC-Sender, and DCCP B the HC-
    Receiver, for this half-connection.  In this document, we abbreviate
    HC-Sender and HC-Receiver as "sender" and "receiver", respectively.
    These terms are defined more fully in [DCCP].)

    TFRC is a receiver-based congestion control mechanism that provides
    a TCP-friendly send rate, while minimizing abrupt rate changes [RFC
    3448].

    The basic TFRC protocol is as follows.  The sender sends a stream of
    data packets to the receiver at some rate.  The receiver sends a
    feedback packet to the sender roughly once every round-trip time.
    Based on the information contained in the feedback packets, the
    sender adjusts its sending rate in accordance with the TCP
    throughput equation [PFTK98], to maintain TCP-friendliness.  If no
    feedback is received from the receiver in several round-trip times
    (four, in the current TFRC specification), the sender halves its
    sending rate.

    The values of the round-trip time "RTT", the loss event rate "p" and
    the base timeout value "TO" are needed by the sender to calculate
    the send rate using the TCP throughput equation.  The sender
    calculates the values of RTT and TO, and the receiver calculates the
    value of p.  (If it prefers, the sender can also calculate p based
    on loss intervals provided by the receiver.)
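
    For reference, the throughput equation given in Section 3.1 of [RFC
    3448] can be sketched as follows.  This is an illustration only:
    the function and parameter names are hypothetical, and the defaults
    b = 1 and TO = 4*RTT are the values recommended in [RFC 3448], not
    requirements introduced by this document.

```python
import math

def tfrc_send_rate(s, rtt, p, b=1, t_o=None):
    """Allowed send rate X in bytes/second from the RFC 3448 equation.

    s:   packet size in bytes
    rtt: round-trip time in seconds
    p:   loss event rate, 0 < p <= 1
    b:   packets acknowledged per ack (assumed 1)
    t_o: timeout value TO in seconds (assumed 4*rtt when omitted)
    """
    if t_o is None:
        t_o = 4 * rtt
    denominator = (rtt * math.sqrt(2 * b * p / 3)
                   + t_o * (3 * math.sqrt(3 * b * p / 8))
                   * p * (1 + 32 * p ** 2))
    return s / denominator
```

    A higher loss event rate or a longer round-trip time both lower the
    computed rate, which is what keeps the flow TCP-friendly.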

2.  Conventions

    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
    "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and "OPTIONAL" in
    this document are to be interpreted as described in [RFC 2119].

    All multi-byte numerical quantities in CCID 3, such as arguments to
    options, are transmitted in network byte order (most significant
    byte first).

    For simplicity, we occasionally refer to DCCP-Data packets sent by
    the sender and DCCP-Ack packets sent by the receiver.  Both of these
    categories are meant to include DCCP-DataAck packets.

3.  Usage

    DCCP with TFRC congestion control is intended to provide congestion
    control for applications that do not require fully reliable data
    transmission, or that desire to implement reliability on top of
    DCCP.  CCID 3's TFRC congestion control is appropriate for flows
    that would prefer to minimize abrupt changes in the sending rate.
    Applications that prefer a relatively smooth sending rate include
    some streaming media applications with small or moderate buffering
    at the receive application before the playback time.  TCP-like
    congestion control, which halves the sending rate in response to a
    congestion event, cannot satisfy this preference for a relatively
    smooth sending rate.

    As explained in [RFC 3448], the penalty of having smoother
    throughput than TCP while competing fairly for bandwidth is that the
    TFRC mechanism in CCID 3 responds more slowly than TCP or TCP-like
    mechanisms to changes in available bandwidth.  Thus, CCID 3 should
    only be used when the application has a requirement for smooth
    throughput, in particular avoiding TCP's halving of the sending rate
    in response to a single packet drop.  For applications that simply
    need to transfer as much data as possible in as short a time as
    possible, we recommend using TCP-like congestion control.

    As described in the TFRC specification [RFC 3448], this CCID should
    also not be used by applications that change their sending rate by
    varying the packet size, rather than varying the rate at which
    packets are sent.  A new CCID will be required for these
    applications.

3.1.  Example Half-Connection

    This example shows the typical progress of a half-connection using
    TFRC Congestion Control specified by CCID 3, not including
    connection initiation and termination.  Again, the "sender" is the
    HC-Sender, and the "receiver" is the HC-Receiver.  The example is
    informative, not normative.

    (1) The sender sends DCCP-Data packets, where the number of packets
        sent is governed by an allowed transmit rate, as specified in
        [RFC 3448]. Each DCCP-Data packet has a sequence number, and the
        DCCP header's CCVal field contains the window counter value.

        If the use of Explicit Congestion Notification (ECN) has been
        negotiated, each DCCP-Data and DCCP-DataAck packet is sent as
        ECN-Capable, with either the ECT(0) or the ECT(1) codepoint set.
        The use of the ECN Nonce with TFRC is described below.

    (2) The receiver sends DCCP-Ack packets at least once per round-trip
        time acknowledging the data packets, unless the sender is
        sending at a rate of less than one packet per RTT, as indicated
        by the TFRC specification [RFC 3448]. Each DCCP-Ack packet uses
        a sequence number and identifies the most recent packet received
        from the sender.  Each DCCP-Ack packet includes feedback about
        the loss event rate calculated by the receiver, as specified
        below.

    (3) The sender continues sending DCCP-Data packets as controlled by
        the allowed transmit rate.  Upon receiving DCCP-Ack packets, the
        sender updates its allowed transmit rate as specified in [RFC
        3448].

    (4) The sender estimates round-trip times and calculates a TimeOut
        value TO as specified in [RFC 3448].

3.2.  Updates

    The congestion control mechanisms described here follow the TFRC
    mechanism standardized by the IETF.  Conformant CCID 3
    implementations MAY track updates to the TCP throughput equation
    directly, as updates are standardized in the IETF, rather than
    waiting for revisions of this document.  However, conformant
    implementations SHOULD wait for explicit updates to CCID 3 before
    implementing other changes to TFRC congestion control.

4.  Connection Establishment

    The connection is initiated by the client using mechanisms described
    in the DCCP specification [DCCP]. During or after CCID 3
    negotiation, the client and/or server may want to negotiate the
    values of the Send Ack Vector, Send Loss Intervals, and Send Loss
    Event Rate features.

    CCID 3 requires CCID-specific feedback from the receiver, and thus
    MUST NOT masquerade as CCID 1.

5.  Congestion Control on Data Packets

    CCID 3 uses the congestion control mechanisms of TFRC, from RFC
    3448.

    As specified in RFC 3448, the sender starts in a slow-start phase,
    roughly doubling its allowed sending rate each round-trip time.  The
    feedback packets from the receiver contain a Receive Rate option
    specifying the rate at which data packets were received by the
    receiver since the last feedback packet.  The allowed sending rate
    is never more than twice the rate that the receiver received in the
    last round-trip time, as specified in detail in RFC 3448.

    RFC 3448 specifies an initial sending rate of one packet per RTT, as
    follows: The sender initializes the allowed sending rate to one
    packet per second.  However, as soon as a feedback packet is
    received from the receiver, the sender has a measurement of the
    round-trip time, and sets the allowed sending rate to one packet per
    RTT.

    The sender's measurement of the round-trip time uses the Elapsed
    Time or Timestamp Echo option contained in feedback packets.  The
    sender maintains an average round-trip time heavily weighted on the
    most recent measurements.

    We note that [RFC 2581] has allowed an initial TCP window of 2
    segments since April 1999, and [RFC 3390] has allowed an initial TCP
    window of three or four segments (up to 4380 bytes) since October
    2002.  Therefore, this document allows initial sending rate X of up
    to four packets per RTT, as follows:

           X = min (4*s, max (2*s, 4380 bytes))/RTT,

    for s the segment size in bytes.
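
    As an illustration, the formula above can be evaluated as follows.
    The function name is hypothetical, and the result is in bytes per
    second under the assumption that RTT is given in seconds.

```python
def initial_send_rate(s, rtt):
    """Initial allowed sending rate X in bytes/second.

    s:   segment size in bytes
    rtt: measured round-trip time in seconds
    """
    # At most four segments per RTT; between two segments and
    # 4380 bytes otherwise.
    return min(4 * s, max(2 * s, 4380)) / rtt
```

    For s = 1460 bytes this yields 4380 bytes (three segments) per RTT;
    for small segments the 4*s term caps the rate at four packets per
    RTT, and for large segments the 2*s term allows two packets per RTT.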

    As specified in RFC 3448, after the slow-start phase is ended by the
    receiver's report of a packet drop or mark, the sender calculates an
    allowed sending rate based on the round-trip time and on the loss
    event rate or equivalent information reported by the receiver.  Each
    DCCP-Data packet contains a sequence number.  Each DCCP-Data packet
    also contains a Window Counter Value, as described in Section 8.1
    below.  The Window Counter Value is incremented by one every quarter
    round-trip time, and is used by the receiver in the calculation of
    the loss event rate.  In particular, the Window Counter Value is
    used as a coarse-grained timestamp to determine when a packet loss
    should be counted as part of an existing loss event.

    Because TFRC is a rate-based instead of a window-based congestion
    control mechanism, and because feedback packets can be dropped in
    the network, the sender needs some mechanism to reduce its sending
    rate in the absence of positive feedback from the receiver.  As
    described in the section below, the receiver sends feedback packets
    roughly once per round-trip time.  As specified in RFC 3448, the
    sender sets a nofeedback timer to at least four round-trip times, or
    to twice the interval between data packets, whichever is larger.
    RFC 3448 specifies that if the sender hasn't received a feedback
    packet from the receiver when the nofeedback timer expires, then the
    sender halves its allowed sending rate.  The allowed sending rate is
    never reduced below one packet per 64 seconds.
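
    The nofeedback behavior summarized above can be sketched as
    follows.  This is a simplified illustration with hypothetical
    names, with rates expressed in bytes per second; RFC 3448 remains
    the authoritative description of the procedure.

```python
def nofeedback_timeout(rtt, s, x):
    """Nofeedback timer interval in seconds.

    rtt: round-trip time in seconds
    s:   packet size in bytes
    x:   current allowed sending rate in bytes/second
    """
    # Four round-trip times, or twice the interval between data
    # packets (s / x), whichever is larger.
    return max(4 * rtt, 2 * s / x)

def rate_after_nofeedback_expiry(x, s):
    """Halve the allowed rate, with a floor of one packet per 64 s."""
    return max(x / 2, s / 64)
```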

    As mentioned in RFC 3448, one consequence of the nofeedback timer is
    that the sender reduces the allowed sending rate when the sender has
    been idle for a significant period of time.  As specified in RFC
    3448, the allowed sending rate is never reduced to less than two
    packets per round-trip time as the result of an idle period.

5.1.  Response to Data Dropped

    CCID 3 senders respond to packets acknowledged as Data Dropped as
    described in [DCCP], with the following further clarifications.

    o Drop Code 2 ("receive buffer drop"). The sending rate is reduced
      by one packet per round-trip time for each packet newly
      acknowledged as Drop Code 2, except that it is never reduced below
      one packet per round-trip time.
      This can be achieved by manipulating the loss event rate, or by
      maintaining a separate parameter that determines how much the
      sending rate should be reduced.

5.2.  Packet Sizes

    CCID 3 is intended for applications that use a fixed packet size,
    and that vary their sending rate in packets per second in response
    to congestion.   CCID 3 is not appropriate for applications that
    require a fixed interval of time between packets, and vary their
    packet size instead of their packet rate in response to congestion.
    However, some attention might be required for applications using
    CCID 3 that vary their packet size not in response to congestion,
    but in response to other application-level requirements.

    Since the average packet size "s" is used in the TCP throughput
    equation, a CCID 3 implementation SHOULD keep a running average of
    recent packet sizes.  This MAY be augmented by an expected packet
    size provided by the application.

    CCID 3 implementations MAY check for applications that appear to be
    manipulating the packet size inappropriately.  For example, an
    application might send small packets for a while, building up a fast
    rate, then switch to large packets to take advantage of the fast
    rate.  However, preliminary simulations indicate that applications
    may not be able to increase their overall transfer rates this way,
    so it is not clear this manipulation will occur in practice.

6.  Acknowledgements

    The receiver sends an acknowledgement packet to the sender roughly
    once per round-trip time, if the sender is sending packets that
    frequently.  This rate is determined by details of the TFRC
    protocol, as specified in [RFC 3448].

    As specified in [DCCP], the acknowledgement number acknowledges the
    greatest valid sequence number received so far on this connection.
    ("Greatest" is, of course, measured in circular sequence space.)
    Each acknowledgement required by TFRC also includes at least the
    following options:

    (1) An Elapsed Time and/or Timestamp Echo option specifying the
        amount of time elapsed since the receiver received the packet
        whose sequence number appears in the Acknowledgement Number
        field.  These options are described in Sections 6.8 and 6.7 of
        [DCCP].

    (2) A Receive Rate option (Section 8.3) specifying the rate at which
        the receiver received data since the last DCCP-Ack was sent.

    (3) One or more options concerning the loss event rate p experienced
        by the receiver, as described in [RFC 3448]. Relevant options
        include Loss Event Rate, which gives the loss event rate
        calculated by the receiver (Section 8.5); Loss Intervals, which
        specifies the beginning and end of each loss interval, from
        which the sender can easily calculate and/or verify the loss
        event rate (Section 8.7); and Ack Vector, which says exactly
        which packets were lost or marked, again allowing the sender to
        calculate and/or verify the loss event rate (see Section 8.5 of
        [DCCP]).

    If the HC-Receiver is also sending data packets to the HC-Sender,
    then it MAY piggyback acknowledgement information on those data
    packets more frequently than TFRC's specified acknowledgement rate
    allows.

6.1.  Congestion Control on Acknowledgements

    The rate and timing for generating acknowledgements are determined by
    the TFRC algorithm [RFC 3448]. The sending rate for acknowledgements
    is relatively low, and there is no explicit congestion control on
    the acknowledgements.

6.2.  Quiescence

    This section refers to quiescence in the DCCP sense (see section 8.1
    of [DCCP]): How does a CCID 3 receiver determine that the
    corresponding sender is not sending any data?

    Let T equal the greater of 0.2 seconds and two round-trip times.  (A
    CCID 3 receiver measures the round-trip time, so that it can
    appropriately pace its acknowledgements.)  The receiver detects that
    the sender has gone quiescent after T seconds have passed without
    receiving any additional data from the sender.
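
    A minimal sketch of this test, with hypothetical function names and
    all times in seconds:

```python
def quiescence_threshold(rtt):
    """T: the greater of 0.2 seconds and two round-trip times."""
    return max(0.2, 2 * rtt)

def sender_is_quiescent(now, last_data_time, rtt):
    """True once T seconds have passed without data from the sender."""
    return now - last_data_time >= quiescence_threshold(rtt)
```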

6.3.  Acknowledgements of Acknowledgements

    TFRC acknowledgements don't generally need to be reliable, so the
    sender generally need not acknowledge the receiver's
    acknowledgements.  When Ack Vector is used, however, the sender,
    DCCP A, MUST occasionally acknowledge the receiver's
    acknowledgements so that the receiver can free up Ack Vector state.
    When both half-connections are active, the necessary
    acknowledgements will be contained in A's acknowledgements to B's
    data.  If the B-to-A half-connection goes quiescent, however, DCCP A
    must send these acknowledgements proactively.

    When Ack Vector is used, therefore, an active sender MUST
    acknowledge the receiver's acknowledgements approximately once per
    round-trip time, within a factor of two or three, probably by
    sending a DCCP-DataAck packet.  No acknowledgement options are
    necessary, just the relevant Acknowledgement Number in the DCCP-
    DataAck header.

    The sender MAY choose to acknowledge the receiver's acknowledgements
    even if they do not contain Ack Vectors.  For instance, regular
    acknowledgements can shrink the size of the Loss Intervals option.
    Unlike the Ack Vector, however, the Loss Intervals option is bounded
    in size (and receiver state), so acks-of-acks are not required.

7.  Explicit Congestion Notification

    Explicit Congestion Notification (ECN) [RFC 3168] MAY be used with
    CCID 3.  If ECN is enabled, then the ECN Nonce will automatically be
    used following the specification for the ECN Nonce for TCP [RFC
    3540]. For the data sub-flow, the sender sets either the ECT(0) or
    ECT(1) codepoint on DCCP-Data packets.

    If ECN is used, then the receiver MUST use at least one of Ack
    Vector and Loss Intervals to return ECN Nonce information to the
    sender.

    If the Ack Vector option is being used, then it will include the ECN
    Nonce Sum.  The sender can maintain a table with the ECN nonce sum
    for each packet, and use this information to probabilistically
    verify the ECN nonce sum returned in each DCCP-Ack packet, as
    described in Appendix A of [DCCP].

    If the Ack Vector option is not being used, the information about
    the ECN Nonce is returned by the receiver using the Loss Intervals
    option described below.  An ECN-capable receiver MUST include this
    option on every required acknowledgement.

8.  Options and Features

    CCID 3 can make use of DCCP's Ack Vector, Timestamp, Timestamp Echo,
    and Elapsed Time options, and its Send Ack Vector and ECN Capable
    features.  In addition, the following CCID-specific options are
    defined for use with CCID 3:

              Option                           Section
     Type     Length     Meaning               Reference
     -----    ------     -------               ---------
    128-191              Reserved
      192        6       Loss Event Rate         8.5
      193        6       Loss Intervals          8.7
      194        6       Receive Rate            8.3
    195-255              Reserved

    The following CCID-specific features are also defined.  The Rec'n
    Rule column defines each feature's reconciliation rule; both are
    server-priority.

                                      Rec'n Initial  Section
    Number   Meaning                  Rule   Value  Reference
    ------   -------                  -----  -----  ---------
    128-191  Reserved
      192    Send Loss Event Rate      SP      1      8.4
      193    Send Loss Intervals       SP      0      8.6
    194-255  Reserved

    The Reserved CCID-specific option types and feature numbers should
    be allocated by IANA.

    Although the use of Ack Vector, Loss Intervals, and Loss Event Rate
    is controlled by separate features, only some combinations of these
    features make sense.  In particular, if ECN Capable is true, then
    every required acknowledgement MUST include at least one of Ack
    Vector and Loss Intervals; otherwise, every required acknowledgement
    MUST include at least one of Ack Vector, Loss Intervals, and Loss
    Event Rate.  This may impel the receiver to send certain options
    even when their corresponding Send features are false.  A sender
    that receives several invalid acknowledgements---that include only
    Loss Event Rate on an ECN-capable connection, for example---MAY
    respond by resetting the connection with Reason set to "Option
    Error".
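
    The rule for which loss-related options a required acknowledgement
    must carry can be sketched as a simple check.  The function and
    argument names are hypothetical; each flag says whether the
    corresponding option is present on the acknowledgement.

```python
def required_ack_is_valid(ecn_capable, has_ack_vector,
                          has_loss_intervals, has_loss_event_rate):
    """Check the loss-feedback options on a required acknowledgement."""
    if ecn_capable:
        # Loss Event Rate alone cannot carry ECN Nonce information.
        return has_ack_vector or has_loss_intervals
    return has_ack_vector or has_loss_intervals or has_loss_event_rate
```

    A sender might count acknowledgements that fail this check and
    reset the connection only after several have been received.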



8.1.  Window Counter Value

    The data sender stores a 4-bit window counter value in the DCCP
    generic header's CCVal field on every data packet it sends.  This
    value is set to 0 at the beginning of the transmission, and
    generally increased by 1 every quarter of a round-trip time, as
    described in [RFC 3448]. For reference, the DCCP generic header is
    as follows (diagram repeated from [DCCP]):

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |          Source Port          |           Dest Port           |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  Data Offset  | CCVal | CsCov |           Checksum            |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | Type  |X| Res |              Sequence Number                  |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    The CCVal field has enough space to express 4 round-trip times at
    quarter-RTT granularity.  The sender MUST avoid wrapping CCVal on
    adjacent packets, as might happen, for example, if two data-carrying
    packets were sent 4 round-trip times apart with no packets
    intervening.  Therefore, the sender SHOULD use the following
    algorithm for setting CCVal.  The algorithm uses three variables:
    "last_WC" holds the last window counter value sent, "last_WC_time"
    is the time at which the first packet with window counter value
    "last_WC" was sent, and "RTT" is the current round-trip time
    estimate.  last_WC is initialized to zero, and last_WC_time to the
    time of the first packet sent.  Then, before sending a new packet,
    proceed like this:

       Let quarter_RTTs = floor( (current_time - last_WC_time) / (RTT/4) ).
       If quarter_RTTs > 0, then:
           Set last_WC := (last_WC + min(quarter_RTTs, 5)) mod 16, and
           Set last_WC_time := current_time.
       Set the packet header's CCVal field to last_WC.

    When this algorithm is used, adjacent data-carrying packets' CCVal
    counters never differ by more than five, modulo 16.
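
    The algorithm above can be sketched in code as follows.  The class
    and method names are hypothetical, and times are assumed to be in
    seconds; the sketch does not include any adjustment made when
    feedback packets arrive.

```python
import math

class WindowCounter:
    """Sender-side state for setting the 4-bit CCVal window counter."""

    def __init__(self, first_send_time):
        self.last_wc = 0                     # last window counter sent
        self.last_wc_time = first_send_time  # first packet with last_wc

    def next_ccval(self, current_time, rtt):
        """Return the CCVal for a packet sent at current_time."""
        quarter_rtts = math.floor(
            (current_time - self.last_wc_time) / (rtt / 4))
        if quarter_rtts > 0:
            # Advance by at most 5 so adjacent packets never wrap.
            self.last_wc = (self.last_wc + min(quarter_rtts, 5)) % 16
            self.last_wc_time = current_time
        return self.last_wc
```

    With an RTT of 4 seconds, a packet sent 1.5 seconds after the first
    carries CCVal 1, and a packet sent after a long idle gap advances
    the counter by only 5.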

    The window counter value may also change as feedback packets arrive.
    In particular, after receiving an acknowledgement for a packet sent
    with window counter WC, the sender SHOULD increase its window
    counter, if necessary, so that subsequent packets have window
    counter value at least (WC + 4) mod 16.

    The receiver can use the CCVal counters to estimate the round-trip
    time if there is no better information available.  For example, say
    that packets arrived as follows:

    Time:       T1  T2  T3 T4  T5           T6  T7   T8  T9
           ------*---*---*-*----*------------*---*----*--*---->
    CCVal:      K-1 K-1  K K   K+1          K+3 K+4  K+3 K+4

    Then T7 - T3, the difference between the receive times of the first
    packet received with window counter K+4 and the first packet
    received with window counter K, is a reasonable round-trip time
    estimate.  When estimating the round-trip time in this way, the
    receiver MUST limit itself to packet pairs whose CCVals differ by 1,
    2, 3, or 4 (representing intervals of 1/4, 1/2, 3/4, and 1 RTT,
    respectively); differences of 4 SHOULD be preferred.
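
    The receiver-side estimate can be sketched as follows.  This is a
    minimal illustration, not a required implementation: the function
    name is hypothetical, arrivals are (time, CCVal) pairs in arrival
    order, only differences of exactly 4 are used, and packets whose
    counter is not 1 to 7 steps ahead of the newest counter seen are
    assumed to be duplicates or reordered and are ignored.

```python
def estimate_rtt(arrivals):
    """Estimate the RTT from (receive_time, ccval) pairs.

    Returns the time between the first packet seen with counter K and
    the first packet seen with counter K+4, for the most recent such
    pair, or None if no pair is available.
    """
    first_seen = {}   # counter value -> first receive time in this cycle
    newest = None     # most advanced counter value seen so far
    estimate = None
    for when, ccval in arrivals:
        if newest is not None:
            ahead = (ccval - newest) % 16
            if ahead == 0 or ahead >= 8:
                continue  # same counter again, or an old/reordered packet
            # Forget counter values that were skipped, so a record from
            # the previous trip around the 4-bit space is never reused.
            for skipped in range(1, ahead):
                first_seen.pop((newest + skipped) % 16, None)
        first_seen[ccval] = when
        newest = ccval
        base = first_seen.get((ccval - 4) % 16)
        if base is not None:
            estimate = when - base
    return estimate
```

    Applied to the arrival pattern in the diagram above, the final
    estimate is T7 - T3.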

8.2.  Elapsed Time Options

    The data receiver MUST include an elapsed time value on every
    required acknowledgement.  This helps the sender distinguish between
    network round-trip time, which it must include in its rate
    equations, and delay at the receiver due to TFRC's infrequent
    acknowledgement rate.  The elapsed time value is included in one of
    two ways:

    (1) If at least one recent data packet (i.e., a packet received
        after the previous DCCP-Ack was sent) included a Timestamp
        option, then the receiver SHOULD include the corresponding
        Timestamp Echo option, with Elapsed Time value.

    (2) Otherwise, the receiver MUST include an Elapsed Time option.

    All these option types are defined in the main DCCP specification
    [DCCP].

8.3.  Receive Rate Option

    +--------+--------+--------+--------+--------+--------+
    |11000010|00000110|            Receive Rate           |
    +--------+--------+--------+--------+--------+--------+
     Type=194   Len=6

    This option MUST be sent by the data receiver on all required
    acknowledgements.  Its four data bytes indicate the rate at which
    the receiver has received data since it last sent an
    acknowledgement, in bytes per second.  The Receive Rate is
    calculated as the number of bytes received in the most recent t
    seconds, divided by t, where t is the larger of the following:  the
    time since the last Receive Rate Option was sent, and the estimated
    round-trip time.  The receiver has an estimate of the round-trip
    time from the Window Counter Value in received data packets.
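
    A minimal sketch of this calculation and of the six-byte option
    layout follows.  The function names are hypothetical, packets are
    (receive_time, size_in_bytes) pairs, and encoding the four data
    bytes in network byte order is an assumption of this sketch.

```python
import struct


def receive_rate(packets, now, last_option_time, rtt):
    """Bytes per second received over the most recent t seconds."""
    t = max(now - last_option_time, rtt)
    received = sum(size for when, size in packets if now - t < when <= now)
    return received / t


def encode_receive_rate(rate_bytes_per_sec):
    """Build the option: Type=194, Len=6, then a 32-bit rate."""
    value = min(int(rate_bytes_per_sec), 0xFFFFFFFF)
    return struct.pack("!BBI", 194, 6, value)
```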

8.4.  Send Loss Event Rate Feature

    The Send Loss Event Rate feature lets CCID 3 endpoints negotiate
    whether the receiver MUST provide Loss Event Rate options on its
    acknowledgements.  DCCP A sends a "Change R(Send Loss Event Rate,
    1)" option to ask DCCP B to send Loss Event Rate options as part of
    its acknowledgement traffic.

    Send Loss Event Rate has feature number 192, and is server-priority.
    It takes one-byte Boolean values.  DCCP B MUST send Loss Event Rate
    options on its acknowledgements when Send Loss Event Rate/B is one,
    although it MAY send Loss Event Rate options even when Send Loss
    Event Rate/B is zero.  Values of two or more are reserved.  A CCID 3
    half-connection starts with Send Loss Event Rate equal to one.

8.5.  Loss Event Rate Option

    +--------+--------+--------+--------+--------+--------+
    |11000000|00000110|          Loss Event Rate          |
    +--------+--------+--------+--------+--------+--------+
     Type=192   Len=6

    The option value indicates the inverse of the loss event rate,
    rounded UP, as calculated by the receiver.  Its units are packets
    per loss interval.  See [RFC 3448] for a normative calculation of
    loss event rate.
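
    For illustration, the option value could be produced as below.  The
    function name is hypothetical, the loss event rate is assumed to be
    a number in (0, 1], and network byte order for the four data bytes
    is an assumption of this sketch.  How a receiver that has seen no
    loss reports its state is outside the sketch.

```python
import math
import struct


def encode_loss_event_rate(loss_event_rate):
    """Build the option: Type=192, Len=6, then ceil(1 / rate)."""
    if not 0 < loss_event_rate <= 1:
        raise ValueError("loss event rate must be in (0, 1]")
    inverse = min(math.ceil(1 / loss_event_rate), 0xFFFFFFFF)
    return struct.pack("!BBI", 192, 6, inverse)
```

    For example, a loss event rate of 0.3 corresponds to an average of
    about 3.33 packets per loss interval, which is reported as 4.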

8.6.  Send Loss Intervals Feature

    The Send Loss Intervals feature lets CCID 3 endpoints negotiate
    whether the receiver MUST provide Loss Intervals options on its
    acknowledgements.  DCCP A sends a "Change R(Send Loss Intervals, 1)"
    option to ask DCCP B to send Loss Intervals options as part of its
    acknowledgement traffic.

    Send Loss Intervals has feature number 193, and is server-priority.
    It takes one-byte Boolean values.  DCCP B MUST send Loss Intervals
    options on its acknowledgements when Send Loss Intervals/B is one,
    although it MAY send Loss Intervals options even when Send Loss
    Intervals/B is zero.  Values of two or more are reserved.  A CCID 3
    half-connection starts with Send Loss Intervals equal to zero.






8.7.  Loss Intervals Option


                                 ___ Loss Interval ___
                                /                     \
    +--------+--------+--------+----...----+----...----+--------+---
    |11000001| Length |  Skip  | Lossless  |E|  Loss   | Up to 7 Loss
    |        |        | Length |  Length   | | Length  | Intervals...
    +--------+--------+--------+----...----+----...----+--------+---
     Type=193                     3 bytes     3 bytes

    This option MAY be sent by the data receiver on acknowledgements.
    (If ECN is enabled and Ack Vector is off, or if the Send Loss
    Intervals feature is true, it MUST be sent with every required
    acknowledgement.)  The option reports up to 8 loss intervals seen by
    the receiver, allowing the sender to calculate a loss event rate and
    to probabilistically verify the receiver's ECN Nonce Echo.

8.7.1.  Loss Interval Definition

    As described in [RFC 3448] (Section 5.2), a loss interval begins
    with a lost or ECN-marked packet; continues with at most one round
    trip time's worth of packets that may or may not be lost or marked;
    and completes with an arbitrarily-long series of non-dropped, non-
    marked packets.  Call these the lossy part and the lossless part of
    the loss interval.  For example, here is a single loss interval,
    assuming that sequence numbers increase as you move right:

               Lossy Part
                <= 1 RTT   __________ Lossless Part __________
              /          \/                                   \
              *----*--*--*-------------------------------------
              ^    ^  ^  ^
             losses or marks


    The Loss Event Rate, reported by option 192, is the weighted average
    of the last 8 loss interval lengths, inverted.  Note that a loss
    interval's lossless part might be empty.

    The length of the lossy part must be <= 1 RTT; however, if the
    packet that starts a loss interval was actually lost, the receiver
    cannot know its receive time.  The TFRC specification gives in
    Section 5.2 a calculation whereby the receiver interpolates a likely
    receive time for each lost packet.  CCID 3 implementations SHOULD
    use this calculation.  As a slightly simpler alternative, they MAY
    instead calculate loss intervals to satisfy the following invariant.
    Take any two distinct loss intervals L[i] and L[j] with nonempty
    lossless parts.  Assume i < j.  Let Ti be the time when the last
    packet in L[i]'s lossless part was received, and let Tj be the time
    when the first packet in L[j]'s lossless part was received.  Then we
    must have Tj - Ti <= (j - i)*RTT.

    Note that a missing packet doesn't begin a new loss interval until 3
    packets have been seen after the "hole" (see Section 5.1 of [RFC
    3448]). Thus, up to three of the most recent sequence numbers
    (including the sequence numbers of any "holes") might temporarily
    not be part of any loss interval, while the implementation waits to
    see whether a "hole" will be filled.

8.7.2.  Option Details

    The Loss Intervals option describes between one and eight
    consecutive loss intervals, always including the most recent loss
    interval.  Intervals are listed in reverse chronological order.
    The option MUST contain information about the most recent 8 loss
    intervals unless (1) there have not yet been 8 loss intervals, in
    which case the receiver SHOULD send information about all the loss
    intervals it has experienced; or (2) the receiver knows, because of
    acknowledgements from the sender, that information about older loss
    intervals has been received by the sender, in which case the
    receiver MUST send at least information about the loss intervals the
    sender has not acknowledged.  In any case, the Loss Intervals option
    MUST contain the most recent loss interval.

    Loss interval sequence numbers are delta-encoded starting from the
    Acknowledgement Number.  Therefore, Loss Intervals options MUST NOT
    be sent on packets without an Acknowledgement Number.

    The first byte of option data is Skip Length, which indicates the
    number of packets up to and including the Acknowledgement Number
    that are not part of any Loss Interval.  As discussed above, Skip
    Length must be less than or equal to three.

    Up to eight Loss Interval structures follow Skip Length.  Each Loss
    Interval consists of a Lossless Length, a Loss Length, and an ECN
    Nonce Echo (E).

    Lossless Length, a 24-bit number, specifies the number of packets in
    the loss interval's lossless part.

    Loss Length, a 23-bit number, specifies the number of packets in the
    loss interval's lossy part.

    The ECN Nonce Echo, stored in the high-order bit of the 3-byte field
    containing Loss Length, equals the one-bit sum (exclusive-or, or
    parity) of nonces received over the loss interval's lossless part
    (which is Lossless Length packets long).  If Lossless Length is 0,
    or if the receiver is ECN-incapable, the ECN Nonce Echo MUST be
    reported as 0.

    The Loss Intervals option serves several purposes.

    o The sender can use the Loss Intervals to easily calculate the Loss
      Event Rate, perhaps using a later version of the TFRC algorithm
      than that deployed at the receiver.

    o Loss Intervals information is easily checked for consistency
      against previous Loss Intervals options, and against any Loss
      Event Rate calculated by the receiver.

    o The sender can probabilistically verify the ECN Nonce Echo for
      each Loss Interval, reducing the likelihood of misbehavior.

8.7.3.  Example

    Consider the following sequence of packets, where "-" represents a
    safely delivered packet and "*" represents a lost or marked packet.

    Sequence
     Numbers: 0         10        20        30        40  44
              |         |         |         |         |   |
              --*-*-----*--------***-*--------*----------*-

    Assuming that packet 43 was lost, not marked, this sequence might be
    divided into loss intervals as follows:

              0         10        20        30        40  44
              |         |         |         |         |   |
              --*-*-----*--------***-*--------*----------*-
              \/\______/\_______/\___________/\_________/
              L0   L1      L2         L3           L4

    A Loss Intervals option sent to acknowledge this set of loss
    intervals, on a packet with Acknowledgement Number 44, might contain
    the bytes 193,33,2, 0,0,10, 128,0,1, 0,0,8, 0,0,5, 0,0,8, 0,0,1,
    0,0,5, 128,0,3, 0,0,2, 128,0,0.  This option is interpreted as
    follows.

    193 The Loss Intervals option number.

    33  The length of the option, including option type and length
        bytes.  This option contains information about (33 - 3)/6 = 5
        loss intervals.



    2   The Skip Length is 2 packets.  Thus, the most recent loss
        interval, L4, ends immediately before sequence number 44 - 2 + 1
        = 43.

    0,0,10, 128,0,1
        These bytes define L4.  L4 consists of a 10-packet lossless part
        (0,0,10), preceded by a 1-packet lossy part.  Continuing to
        subtract, the lossless part begins with sequence number 43 - 10
        = 33, and the lossy part begins with sequence number 33 - 1 =
        32.  The ECN Nonce Echo for the lossless part, namely packets 33
        through 42, inclusive, equals 1.

    0,0,8, 0,0,5
        This defines L3, whose lossless part begins with sequence number
        32 - 8 = 24; whose lossy part begins with sequence number 24 - 5
        = 19; and whose ECN Nonce Echo (for packets [24,31]) equals 0.

    0,0,8, 0,0,1
        L2's lossless part begins with sequence number 11, its lossy
        part begins with sequence number 10, and its ECN Nonce Echo (for
        packets [11,18]) equals 0.

    0,0,5, 128,0,3
        L1's lossless part begins with sequence number 5, its lossy part
        begins with sequence number 2, and its ECN Nonce Echo (for
        packets [5,9]) equals 1.

    0,0,2, 128,0,0
        L0's lossless part begins with sequence number 0, it has no
        lossy part, and its ECN Nonce Echo (for packets [0,1]) equals 1.
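
    The interpretation above can be reproduced mechanically.  The
    following Python sketch is illustrative only: the function name and
    the tuple layout are hypothetical, and it ignores sequence number
    wraparound for simplicity.

```python
def parse_loss_intervals(option, ack_number):
    """Decode a Loss Intervals option into sequence number ranges.

    Returns a list, most recent interval first, of tuples
    (lossy_start, lossless_start, end, nonce_echo).  The lossy part is
    [lossy_start, lossless_start) and the lossless part is
    [lossless_start, end).
    """
    if len(option) < 3 or option[0] != 193:
        raise ValueError("not a Loss Intervals option")
    length = option[1]
    if length != len(option) or (length - 3) % 6 != 0:
        raise ValueError("bad option length")
    if (length - 3) // 6 > 8:
        raise ValueError("more than 8 loss intervals")
    end = ack_number - option[2] + 1     # Skip Length is the first data byte
    intervals = []
    for pos in range(3, length, 6):
        lossless = int.from_bytes(option[pos:pos + 3], "big")
        field = int.from_bytes(option[pos + 3:pos + 6], "big")
        nonce_echo = field >> 23         # high-order bit of the 3-byte field
        loss = field & 0x7FFFFF          # remaining 23 bits: Loss Length
        lossless_start = end - lossless
        lossy_start = lossless_start - loss
        intervals.append((lossy_start, lossless_start, end, nonce_echo))
        end = lossy_start                # the next-older interval ends here
    return intervals


EXAMPLE = bytes([193, 33, 2, 0, 0, 10, 128, 0, 1, 0, 0, 8, 0, 0, 5,
                 0, 0, 8, 0, 0, 1, 0, 0, 5, 128, 0, 3, 0, 0, 2, 128, 0, 0])
```

    Calling parse_loss_intervals(EXAMPLE, 44) yields five intervals,
    the first with its lossy part starting at 32 and its lossless part
    covering [33, 43), matching the walk-through above.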

9.  Verifying Congestion Control Compliance With ECN

    If ECN is used, the sender can use Ack Vector or the Loss Intervals
    option to probabilistically verify that the receiver is not lying in
    reporting packets received undropped and unmarked.  The sender could
    then use the information in acknowledgement packets to roughly
    verify the Loss Event Rate reported by the receiver, if it so
    desired.

    We note that if ECN is not used, the sender could still check on the
    receiver by occasionally not sending a packet, or sending a packet
    out-of-order, to catch the receiver in an error in Ack Vector or
    Loss Intervals information.  Similarly, the sender would still use
    the Ack Vector or Loss Intervals information to verify the loss
    event rate reported by the receiver.  However, this is not as robust
    or as non-intrusive as the verification provided by the ECN Nonce.




9.1.  Verifying the ECN Nonce Echo

    To verify the ECN Nonce Echo included with an Ack Vector option, the
    sender maintains a table with the ECN nonce value sent for each
    packet.  The Ack Vector option explicitly says which packets were
    received non-marked; the sender just adds up the nonces for those
    packets using a one-bit sum (exclusive-or, or parity), and compares
    the result to the Nonce Echo encoded in the Ack Vector's option
    type.

    To verify the ECN Nonce Echo included with a Loss Intervals option,
    the sender maintains a table with the ECN nonce *sum* for each
    packet.  As defined in [RFC 3540], the nonce sum for sequence number
    S is the one-bit sum of nonces over the sequence number range [I,S]
    (where I is the initial sequence number).  Let NonceSum(S) represent
    this nonce sum for sequence number S, and let NonceSum(I - 1) equal
    0.  Then the Nonce Echo for a loss interval [Left Edge, Left Edge +
    Offset) should equal the following one-bit sum:

       NonceSum(Left Edge - 1) + NonceSum(Left Edge + Offset - 1).

    An Ack Vector's ECN Nonce Echo may also be calculated from a table
    of ECN nonce sums, rather than ECN nonces.  If the Ack Vector
    contains many long runs of non-marked, non-dropped packets, the
    nonce sum-based calculation will probably be faster than a
    straightforward nonce-based calculation.

    In either of these cases, a misbehaving receiver---meaning a
    receiver that reports a lost or marked packet as "received non-
    marked", to avoid rate reductions---has only a 50% chance of
    guessing the correct Nonce Echo.
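
    The nonce-sum check for a loss interval can be sketched as follows.
    The function names and the list-based table are hypothetical; the
    sketch assumes the sender keeps the nonce (0 or 1) of every packet
    sent since the initial sequence number I, and it ignores sequence
    number wraparound.

```python
def nonce_sums(nonces):
    """Prefix sums: entry k is NonceSum(I + k - 1); entry 0 is 0."""
    sums = [0]
    for nonce in nonces:
        sums.append(sums[-1] ^ nonce)   # one-bit sum is exclusive-or
    return sums


def expected_nonce_echo(sums, initial_seq, left_edge, offset):
    """Expected Nonce Echo for packets [left_edge, left_edge + offset)."""
    low = sums[left_edge - initial_seq]            # NonceSum(Left Edge - 1)
    high = sums[left_edge + offset - initial_seq]  # NonceSum(Left Edge + Offset - 1)
    return low ^ high
```

    The sender compares this value with the Nonce Echo reported for the
    interval; a mismatch indicates a misreported packet.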

9.2.  Verifying the Reported Loss Event Rate

    Once the sender has probabilistically verified the ECN Nonce Echoes
    reported by the receiver, the sender can calculate for itself the
    number of packets in each loss interval, to roughly verify the loss
    event rate reported by the receiver, if it so desires.  We note that
    DCCP's Loss Event Rate Option reports the average loss interval
    size, which is the inverse of the loss event rate.

    If the Ack Vector is used, the sender can identify the packet that
    begins each new loss interval from the Ack Vector in each DCCP-Ack
    packet.  If the sender saves information about the window counter
    for each data packet, then the sender also can tell when two lost or
    marked packets would have been interpreted by the receiver as
    separate loss events.




    The Loss Intervals option explicitly reports the size of each loss
    interval, as seen by the receiver.  The sender can, using saved
    information about window counters, verify that the receiver is not
    falsely combining two loss events into one reported loss interval.

    Once the sender has reconstructed or verified Loss Intervals, it can
    easily calculate the expected loss event rate, and compare against
    the receiver's reported loss event rate.

    We note that in some cases the loss event rate calculated by the
    sender could differ from that calculated by the receiver.  In
    particular, when a number of successive packets are dropped, the
    receiver does not know the sending times for these packets, and
    interprets these losses as a single loss event.  In contrast, if the
    sender has saved the sending times or the window counter information
    for these packets, then the sender can determine if these losses
    constitute a single loss event, or several successive loss events.
    Thus, with its knowledge of the sending times of dropped packets,
    the sender is able to make a more accurate calculation of the loss
    event rate.

10.  Design Considerations

    CCID 3 data packets need not carry Timestamp options.  The sender
    can store the times at which recent packets were sent.  Then the
    Acknowledgement Number and Elapsed Time option contained on each
    required acknowledgement provide sufficient information to compute
    the round trip time.  Alternatively, the sender MAY include
    Timestamp options on a limited subset of its data packets; the
    receiver will respond with Timestamp Echo options including Elapsed
    Times, allowing the sender to calculate round-trip times without
    storing timestamps at all.

10.1.  Possible Changes to the Initial Window

    In the future, it is possible that an initial sending rate of up to
    eight small packets per RTT would be allowed, for connections with
    sufficiently-small packets.  That is, we are evaluating the
    possibility of an initial sending rate X as follows:

           X = min (8*s, max (2*s, 4380 bytes)) / RTT.

    Because the packets would be rate-paced out over a round-trip time,
    instead of sent back-to-back as they would be in TCP, an initial
    sending rate of eight small packets per RTT with TFRC-based
    congestion control would be considerably milder than the impact of
    an initial window of eight small packets in TCP.  We note that with
    CCID 3, the sender is in slow-start in the beginning, and responds
    promptly to the report of a packet loss or mark.  However, in the
    absence of feedback from the receiver, the sender can maintain its
    old sending rate for up to four round-trip times.
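
    To make the candidate formula concrete, here is a small Python
    sketch.  The function name is hypothetical; s is the packet size in
    bytes and rtt the round-trip time in seconds, so the result is in
    bytes per second.

```python
def proposed_initial_rate(s, rtt):
    """Candidate initial sending rate X, in bytes per second."""
    return min(8 * s, max(2 * s, 4380)) / rtt
```

    With 100-byte packets the 8*s term applies (eight packets per
    RTT); with 1460-byte packets the 4380-byte term applies (three
    packets per RTT); with 3000-byte packets the 2*s term applies.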

10.2.  Determining Loss Events at the Receiver

    The window counter is used by the receiver to determine if multiple
    lost packets belong to the same loss event.  The sender increases
    the window counter by 1 every quarter round trip time.  To determine
    whether two lost packets, with sequence numbers X and Y (Y > X in
    circular sequence space), belong to different loss events, the
    receiver proceeds as follows:

    o Let X_prev be the greatest sequence number which was received with
      X_prev < X.

    o Let Y_prev be the greatest sequence number which was received with
      Y_prev < Y.

    o Given a sequence number N, let C(N) be the window counter value
      associated with that packet.

    o Packets X and Y belong to different loss events if there exists a
      packet with sequence number S so that X_prev < S <= Y_prev, and
      the distance from C(X_prev) to C(S) is greater than 4.  (The
      distance is the number D so that C(X_prev) + D = C(S) (mod
      WCTRMAX), where WCTRMAX is the maximum value for the window
      counter---in our case, 16.)

      This complex calculation is necessary to handle the case where
      window counter space wrapped completely between X and Y.
      Generally, the receiver can simply check whether the distance from
      C(X_prev) to C(Y_prev) is greater than 4.
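
    The procedure above can be sketched in Python as follows.  This is
    an illustration under simplifying assumptions: the names are
    hypothetical, received maps each received sequence number to its
    window counter, at least one packet was received before X, and
    sequence numbers are compared as plain integers rather than in
    circular sequence space.

```python
WCTRMAX = 16


def distance(c_from, c_to):
    """The number D such that c_from + D = c_to (mod WCTRMAX)."""
    return (c_to - c_from) % WCTRMAX


def different_loss_events(received, x, y):
    """Decide whether lost packets x and y (x < y) are separate events."""
    x_prev = max(seq for seq in received if seq < x)
    y_prev = max(seq for seq in received if seq < y)
    return any(
        distance(received[x_prev], received[seq]) > 4
        for seq in received
        if x_prev < seq <= y_prev
    )
```

    Scanning every S between X_prev and Y_prev is what catches the
    wrapped case: with counters 0, 9, 1 on received packets 1, 3, 4 and
    losses at 2 and 5, the distance from C(1) to C(4) is only 1, yet
    packet 3 shows that more than one round-trip time has passed.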

    Window counters can help the receiver to disambiguate multiple
    losses after a sudden decrease in the actual round-trip time.  When
    the sender receives an acknowledgement acknowledging a data packet
    with window counter i, the sender increases its window counter, if
    necessary, so that subsequent data packets are sent with window
    counter values of at least i+4.  This can help minimize errors on
    the part of the receiver of incorrectly interpreting multiple loss
    events as a single loss event.

    We note that if all of the packets between X and Y are lost in the
    network, then X_prev and Y_prev are both set to X-1, and the series
    of consecutive losses is treated by the receiver as a single loss
    event.  However, the sender will receive no DCCP-Ack packets during
    a period of consecutive losses, and the sender will reduce its
    sending rate accordingly.

    As an alternative to the window counter, the sender could have sent
    its estimate of the round-trip time to the receiver directly in a
    round-trip time option, and the receiver could use the sender's
    round-trip time estimate to infer when multiple lost or marked
    packets belong in the same loss event.  In some respects, a round-
    trip time option gives a more precise encoding of the sender's
    round-trip time estimate than does the window counter.  However, the
    window counter conveys information about the relative *sending*
    times for packets, while the receiver could only use the round-trip
    time option to distinguish between the relative *receive* times (in
    the absence of timestamps).  That is, the window counter will give
    more robust performance in some cases when there is a large
    variation in delay for packets sent within a window of data.  As a
    slightly more speculative consideration, the round-trip time option
    could possibly be used more easily by middleboxes attempting to
    verify that a flow was using conformant end-to-end congestion
    control.

10.3.  Sending Feedback Packets

    The window counter is also used by the receiver to decide when to
    send feedback packets.  Feedback packets should normally be sent at
    least once per round-trip time, if the sender is sending at least
    one data packet per round-trip time.  Whenever the receiver sends a
    feedback message, the receiver sets a local variable last_counter to
    the greatest received value of the window counter since the last
    feedback message was sent, if any data packets have been received
    since the last feedback message was sent.  If the receiver receives
    a data packet with a window counter value greater than or equal to
    last_counter + 4, then the receiver sends a new feedback packet.
    ("Greater" and "greatest" are measured in circular window counter
    space.)
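
    A minimal sketch of this trigger follows.  The names are
    hypothetical, and so is the reading of "greater" in circular
    window counter space: a value 1 to 12 steps forward is taken as
    greater, while 13 to 15 steps forward is taken as a reordered
    packet slightly behind.  This document does not fix that boundary.

```python
def steps_ahead(a, b):
    """Window counter steps from b forward to a, modulo 16."""
    return (a - b) % 16


class FeedbackTrigger:
    def __init__(self, last_counter):
        self.last_counter = last_counter  # set when feedback was last sent
        self.greatest = None              # greatest counter seen since then

    def on_data(self, ccval):
        """Record a data packet; return True if feedback is now due."""
        ref = self.last_counter if self.greatest is None else self.greatest
        if 1 <= steps_ahead(ccval, ref) <= 12:
            self.greatest = ccval
        # Due when the counter is at least last_counter + 4 (assumed bound 12).
        return 4 <= steps_ahead(ccval, self.last_counter) <= 12

    def on_feedback_sent(self):
        """Update last_counter if any data arrived since the last feedback."""
        if self.greatest is not None:
            self.last_counter = self.greatest
            self.greatest = None
```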

    The TFRC protocol [RFC 3448] specifies that the receiver uses a
    feedback timer to decide when to send feedback packets.  In the TFRC
    protocol, when the feedback timer expires, the receiver resets the
    timer to expire after R_m seconds, where R_m is the most recent
    estimate of the round-trip time received by the receiver from the
    sender.  However, when the window counter is used, the receiver can
    use its information in deciding when to send feedback packets.

    When the sender is sending less than one packet per round-trip time,
    then the receiver sends a feedback packet after each data packet,
    and the feedback timer is not required.  Similarly, when the sender
    is sending several packets per round-trip time, then the receiver
    will send a feedback packet each time that a data packet arrives
    with a window counter at least four greater than the window counter
    when the last feedback packet was sent, and again the feedback timer
    is not required.  Similarly, the receiver always sends a
    feedback packet after the detection of a loss event.  Thus, the
    feedback timer is not absolutely necessary when the window counter
    is used.

    However, the feedback timer still could be useful in some rare cases
    to prevent the sender from unnecessarily halving its sending rate.
    Consider the case when the receiver receives data soon after the
    most recent feedback packet has been sent, but has received no data
    packets with a window counter sufficiently large to trigger sending
    a new feedback packet.  The TFRC protocol specifies that after a
    feedback packet is received, the sender sets a nofeedback timer to
    at least four times the round-trip time estimate.  If the sender
    doesn't receive any feedback packets before the nofeedback timer
    expires, then the sender halves its sending rate.  One could
    construct scenarios where the use of a feedback timer at the
    receiver would prevent the unnecessary expiration of the nofeedback
    timer at the sender.

    For implementors who wish to implement a feedback timer for the data
    receiver, we suggest estimating the round-trip time from the most
    recent data packet as follows: Let K be the window counter from the
    most recent data packet, and let T_k be the time that that packet
    was received, as in the table below.  Let J be the highest window
    counter received that was less than K-4, and let T_j be the most
    recent time that such a packet was received.  Then the round-trip
    time can be very roughly estimated as 4*(T_k-T_j)/(K-J).

      Time  |           Event                 |   Window Counter
     -----------------------------------------------------------
       T_j  |  packet received with WC < K-4  |   J   (J<K-4)
       T_k  |  most recent packet received    |   K
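
    As a worked form of this estimate, the following sketch computes
    4*(T_k-T_j)/(K-J).  The function name is hypothetical, and taking
    K-J modulo 16 to allow for counter wraparound is an assumption of
    this sketch.

```python
def rough_rtt(t_k, k, t_j, j):
    """Very rough RTT estimate from two window counter observations."""
    quarters = (k - j) % 16      # elapsed quarter round-trip times
    if quarters <= 4:
        raise ValueError("J must be less than K-4")
    return 4 * (t_k - t_j) / quarters
```

    For instance, if K - J = 5 and the two packets arrived 2.5 seconds
    apart, the estimate is 4 * 2.5 / 5 = 2 seconds.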


10.4.  When Should Ack Vector And Loss Intervals Be Used?

    If the use of ECN has not been negotiated, then the receiver is not
    required to use either Ack Vector or Loss Intervals.  Essentially,
    in this case the sender is completely relying on the Loss Event Rate
    reported by the receiver.  If the Ack Vector or Loss Intervals is
    used, however, then the sender could test that the receiver is
    correctly reporting dropped and marked packets by conducting a test
    and skipping a packet in its transmissions.

    In the common case, it is assumed that the use of ECN will be
    negotiated with CCID 3.  However, it is possible that either the
    sender or the receiver will want to negotiate the use of CCID 3
    without ECN, e.g., if there happens to be a known broken middlebox
    along the path that blocks the use of ECN in the IP packet header.

    If ECN is used, then the receiver is required to use at least one of
    Ack Vector and Loss Intervals to return ECN Nonce information to the
    sender.  The Ack Vector returns more information about which packets
    were lost or marked during a loss event.  The sender uses more
    computation and state for verifying receiver feedback with the Ack
    Vector than with Loss Intervals, because then it must reconstruct
    loss intervals from the Ack Vector.  The Ack Vector also requires
    that the sender occasionally acknowledge the receiver's
    acknowledgements; this is optional with Loss Intervals.

11.  Security Considerations

    Security considerations for DCCP have been discussed in [DCCP], and
    security considerations for TFRC have been discussed in [RFC 3448].
    The security considerations for TFRC include the need to protect
    against spoofed feedback, and the need to protect the congestion
    control mechanisms against incorrect information from the receiver.

    In this document we have extensively discussed the mechanisms the
    sender can use to verify the information sent by the receiver.

12.  IANA Considerations

    This specification assigns the following value in a namespace
    managed by IANA:

    o DCCP Congestion Control Identifier, value 3.

    It also creates two namespaces which should be managed by IANA,
    namely the namespaces for CCID 3-specific options and features.  Of
    these, several values have already been assigned.

13.  Thanks

    We thank Mark Handley for his help in defining CCID 3.  We also
    thank Sara Karlberg, Greg Minshall, Arun Venkataramani, Yufei Wang,
    and Magnus Westerlund for feedback on earlier versions of this
    document.

Normative References

    [CCID 2 PROFILE] S. Floyd and E. Kohler. Profile for DCCP Congestion
        Control ID 2: TCP-like Congestion Control,
        draft-ietf-dccp-ccid2-05.txt, work in progress, February 2004.

    [DCCP] E. Kohler, M. Handley, and S. Floyd.  Datagram Congestion
        Control Protocol, draft-ietf-dccp-spec-06.txt, work in progress,
        February 2004.

    [RFC 2119] S. Bradner. Key Words For Use in RFCs to Indicate
        Requirement Levels. RFC 2119.

    [RFC 3168] K.K. Ramakrishnan, S. Floyd, and D. Black. The Addition
        of Explicit Congestion Notification (ECN) to IP. RFC 3168.
        September 2001.

    [RFC 3448] M. Handley, S. Floyd, J. Padhye, and J. Widmer, TCP
        Friendly Rate Control (TFRC): Protocol Specification, RFC 3448,
        Proposed Standard, January 2003.

    [RFC 3540] N. Spring, D. Wetherall, and D. Ely.  Robust Explicit
        Congestion Notification (ECN) Signaling with Nonces.  RFC 3540.

Informative References

    [PFTK98] J. Padhye, V. Firoiu, D. Towsley, and J. Kurose.  Modeling
        TCP Throughput: A Simple Model and its Empirical Validation.
        Proc ACM SIGCOMM 1998.

    [RFC 2581] M. Allman, V. Paxson, and W. Stevens.  TCP Congestion
        Control.  RFC 2581.

    [RFC 3390] M. Allman, S. Floyd, and C. Partridge.  Increasing TCP's
        Initial Window.  RFC 3390.

Authors' Addresses

    Sally Floyd <floyd@icir.org>
    ICSI Center for Internet Research
    1947 Center Street, Suite 600
    Berkeley, CA 94704 USA

    Eddie Kohler <kohler@cs.ucla.edu>
    4531C Boelter Hall
    UCLA Computer Science Department
    Los Angeles, CA 90095 USA

    Jitendra Padhye <padhye@microsoft.com>
    Microsoft Research
    One Microsoft Way
    Redmond, WA 98052 USA


Intellectual Property Notice

    The IETF takes no position regarding the validity or scope of any
    intellectual property or other rights that might be claimed to
    pertain to the implementation or use of the technology described in
    this document or the extent to which any license under such rights
    might or might not be available; neither does it represent that it
    has made any effort to identify any such rights.  Information on the
    IETF's procedures with respect to rights in standards-track and
    standards-related documentation can be found in BCP-11.  Copies of
    claims of rights made available for publication and any assurances
    of licenses to be made available, or the result of an attempt made
    to obtain a general license or permission for the use of such
    proprietary rights by implementors or users of this specification
    can be obtained from the IETF Secretariat.

Full Copyright Statement

    Copyright (C) The Internet Society (2004).  All Rights Reserved.

    This document and translations of it may be copied and furnished to
    others, and derivative works that comment on or otherwise explain it
    or assist in its implementation may be prepared, copied, published
    and distributed, in whole or in part, without restriction of any
    kind, provided that the above copyright notice and this paragraph
    are included on all such copies and derivative works.  However, this
    document itself may not be modified in any way, such as by removing
    the copyright notice or references to the Internet Society or other
    Internet organizations, except as needed for the purpose of
    developing Internet standards in which case the procedures for
    copyrights defined in the Internet Standards process must be
    followed, or as required to translate it into languages other than
    English.

    The limited permissions granted above are perpetual and will not be
    revoked by the Internet Society or its successors or assigns.

    This document and the information contained herein is provided on an
    "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
    TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
    BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
    HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
    MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.