Network Working Group                                         Y. Nishida
Internet-Draft                                              WIDE Project
Intended status: Standards Track                          April 15, 2011
Expires: October 17, 2011


      Rescue Retransmission for SACK-based Loss Recovery Algorithm
              draft-nishida-tcpm-rescue-retransmission-00

Abstract

   This memo describes an issue in the loss recovery algorithm specified
   in RFC3517 and proposes a simple modification that avoids unnecessary
   retransmission timeouts and thereby improves performance.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on October 17, 2011.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.





Nishida                 Expires October 17, 2011                [Page 1]

Internet-Draft   Rescue Retransmission for SACK Recovery      April 2011


Table of Contents

   1.  Introduction
   2.  Conventions and Terminology
   3.  Problem Description
   4.  Possible Scenario
   5.  Proposed Fix
   6.  Discussion
   7.  Acknowledgements
   8.  Security Considerations
   9.  IANA Considerations
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Author's Address

1.  Introduction

   RFC3517 [RFC3517] defines a conservative loss recovery algorithm
   based on the selective acknowledgment (SACK) TCP option [RFC2018].
   It is designed to follow the guidelines set in RFC2581 [RFC2581] so
   that it can be used safely in TCP implementations.  However, in some
   situations, the loss recovery algorithm in RFC3517 fails to
   retransmit segments even though the congestion window has room for
   them (that is, cwnd - pipe is at least 1 SMSS).  This failure to
   retransmit can cause unnecessary timeouts, which can lead to
   performance degradation.  This document describes the issue and
   proposes a simple modification to solve it.  The proposed solution
   allows SACK-based TCP to attain the same performance as NewReno
   [RFC3782].

2.  Conventions and Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

3.  Problem Description

   In RFC3517, a sender enters the loss recovery phase when it receives
   DupThresh duplicate ACKs.  During loss recovery, whenever the sender
   receives an ACK, it recalculates pipe by calling Update() and
   SetPipe(), and then determines which segments should be sent by
   calling NextSeg().  However, there are situations where NextSeg()
   returns no segment even though cwnd - pipe is at least 1 SMSS.  This
   behavior results from the following logic in NextSeg().  When
   NextSeg() looks for segments to retransmit, it uses IsLost() to
   identify the segments that are most likely lost.  To increase
   accuracy, IsLost() treats the segment at 'SeqNum' as lost only when
   DupThresh discontiguous SACKed sequences have arrived above 'SeqNum'
   or (DupThresh * SMSS) bytes with sequence numbers greater than
   'SeqNum' have been SACKed.  If IsLost() identifies no lost segment,
   NextSeg() falls back to sending new, previously unsent data.
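
   The loss test described above can be sketched in Python as follows.
   This is a minimal illustration, not the RFC3517 reference code: the
   scoreboard is assumed to be a list of half-open (start, end) octet
   ranges, and the names is_lost, SMSS and DUP_THRESH are chosen here
   for illustration only.  The values SMSS = 1000 and DupThresh = 3
   match the scenario in Section 4.

```python
SMSS = 1000        # octets; assumed value, same as the Section 4 example
DUP_THRESH = 3     # DupThresh; assumed value, same as the Section 4 example


def is_lost(seq_num, sacked_blocks, smss=SMSS, dup_thresh=DUP_THRESH):
    """Return True if the octet at seq_num is presumed lost.

    sacked_blocks is a list of half-open (start, end) octet ranges that
    the receiver has SACKed (a hypothetical scoreboard representation).
    """
    # Keep only the SACKed octets whose sequence numbers exceed seq_num.
    above = sorted(
        (max(start, seq_num + 1), end)
        for start, end in sacked_blocks
        if end > seq_num + 1
    )

    # Merge touching or overlapping ranges so that each remaining range
    # is one discontiguous SACKed sequence.
    merged = []
    for start, end in above:
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))

    sacked_octets = sum(end - start for start, end in merged)
    return len(merged) >= dup_thresh or sacked_octets >= dup_thresh * smss
```

   With the SACK block {2000:5000} from the Section 4 log,
   is_lost(1000, [(2000, 5000)]) is true because 3000 octets above 1000
   have been SACKed; with only {2000:3000} it is false.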

   With this logic, a problem can arise when a sender has no new data to
   send.  In this case, if IsLost() identifies no lost segment,
   NextSeg() cannot find a segment for the next transmission, and
   transmission is delayed until one of the following events occurs:

   o  Further ACKs arrive and IsLost() identifies newly lost segments

   o  The application hands new data to TCP

   o  The retransmission timer expires

   However, in some situations, such as when the window size is small,
   the number of ACKs that arrive might not be enough to identify lost
   segments.  In addition, applications might supply data only
   intermittently or might have no more data to send.  In these cases,
   TCP has to wait for the retransmission timer to expire before it can
   retransmit, even though the congestion window has room for another
   segment.

4.  Possible Scenario

   This section describes a possible scenario in which the issue
   described in this document occurs.

   The following is a hypothetical tcpdump log.


     1  10:41:00.000001 A > B: . 1000:2000(1000) ack 1 win 32768
     2  10:41:00.001001 A > B: . 2000:3000(1000) ack 1 win 32768
     3  10:41:00.002001 A > B: . 3000:4000(1000) ack 1 win 32768
     4  10:41:00.003001 A > B: . 4000:5000(1000) ack 1 win 32768
     5  10:41:00.004001 A > B: . 5000:6000(1000) ack 1 win 32768
     6  10:41:00.010001 B > A: . ack 1000 win 16384 < sack {2000:3000} >
     7  10:41:00.011001 B > A: . ack 1000 win 16384 < sack {2000:4000} >
     8  10:41:00.012001 B > A: . ack 1000 win 16384 < sack {2000:5000} >
     9  10:41:00.015001 A > B: . 1000:2000(1000) ack 1 win 32768
    10  10:41:00.018001 B > A: . ack 5000 win 16384


   In this example, A sends data segments to B.  At the beginning of the
   log, the cwnd of A is 5 SMSS (SMSS = 1000 octets), hence A sends 5
   segments to B (lines 1-5).  Suppose that the segments sent in line 1
   (segment 1000:2000) and line 5 (segment 5000:6000) are lost.  B then
   sends 3 duplicate ACKs (lines 6-8) to request retransmission of
   segment 1000:2000.  At line 8, A has received DupThresh duplicate
   ACKs, so it enters the loss recovery phase, sets cwnd to 2.5 SMSS
   (half of the 5 SMSS in flight), and retransmits the lost segment
   (line 9).  At line 10, A receives the ACK triggered by the arrival
   of the retransmitted segment 1000:2000.  Upon receiving this ACK, A
   performs the following steps to determine whether any segments can
   be sent.

   1.  Update the pipe size by calling Update() and SetPipe().  Since
       HighACK is 5000, HighData is 6000, and IsLost(5000) returns
       false, pipe is set to 1000.

   2.  Because cwnd - pipe >= 1 SMSS, A decides to send one or more
       segments.

   3.  Call NextSeg() to determine which segments to send.
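
   The three steps above can be traced with a short Python sketch.  It
   assumes a per-octet SetPipe() loop in the style of RFC3517, half-open
   octet ranges matching the tcpdump notation, cwnd = 2.5 SMSS (half of
   the 5 SMSS that were in flight when recovery started), and HighRxt =
   1999 after the retransmission in line 9.  The function and parameter
   names are illustrative only.

```python
SMSS = 1000  # octets, as in the example above


def set_pipe(high_ack, high_data, high_rxt, is_sacked, is_lost):
    """Estimate the number of octets outstanding in the network.

    is_sacked and is_lost are callables taking a sequence number; they
    stand in for the sender's scoreboard (hypothetical interface).
    """
    pipe = 0
    for s1 in range(high_ack, high_data):
        if is_sacked(s1):
            continue
        if not is_lost(s1):
            pipe += 1      # octet is presumed to still be in flight
        if s1 <= high_rxt:
            pipe += 1      # octet has also been retransmitted
    return pipe


# State at line 10 of the log: the cumulative ACK covers everything up
# to 5000, no SACK blocks remain, and IsLost() is false for 5000:6000.
pipe = set_pipe(
    high_ack=5000,
    high_data=6000,
    high_rxt=1999,
    is_sacked=lambda s: False,
    is_lost=lambda s: False,
)
cwnd = 2500                       # 2.5 SMSS, assumed as described above
can_send = cwnd - pipe >= SMSS    # step 2: 2500 - 1000 >= 1000
```

   Under these assumptions pipe evaluates to 1000 and can_send is true,
   so the sender proceeds to NextSeg() in step 3.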

   Now, if A has no unsent data, the only segment that can be sent is
   segment 5000:6000.  NextSeg() checks whether this segment can be sent
   by applying the following rules, but none of them applies.

   1.  Rule (1) cannot be applied to this segment because conditions
       (1.b) and (1.c) are false.

   2.  Rule (2) cannot be applied because there is no unsent data
       available.

   3.  Rule (3) cannot be applied to this segment because condition
       (1.b) is false.

   Hence NextSeg() returns no segment in this case, which means that TCP
   has nothing to send until the retransmission timer expires.  TCP can
   fall into this situation when multiple segments are lost within a
   window and there is no new data to send at that moment.
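
   The rule evaluation above can be reproduced with the following
   Python sketch of NextSeg() rules (1) to (3).  It is a simplified
   model under stated assumptions: the scoreboard is a list of half-open
   (start, end) ranges, the function returns only the rule number and
   the starting sequence number, HighRxt is taken to be 1999 after the
   retransmission in line 9, and all names are illustrative.

```python
def next_seg(high_ack, high_data, high_rxt, sacked, unsent, is_lost):
    """Return (rule, start_seq) for the next segment, or None.

    sacked is a list of half-open (start, end) SACKed ranges, unsent is
    the number of octets of new data waiting to be sent, and is_lost is
    a callable standing in for IsLost() (hypothetical interface).
    """
    # Highest octet covered by any SACK block, used by condition (1.b).
    highest_sacked = max((end - 1 for _, end in sacked), default=None)

    def unsacked(seq):
        return not any(start <= seq < end for start, end in sacked)

    def meets_1a_1b(seq):
        return (seq > high_rxt
                and highest_sacked is not None
                and seq < highest_sacked)

    outstanding = [s for s in range(high_ack, high_data) if unsacked(s)]

    # Rule (1): smallest unSACKed octet meeting (1.a), (1.b) and (1.c).
    for seq in outstanding:
        if meets_1a_1b(seq) and is_lost(seq):
            return (1, seq)

    # Rule (2): previously unsent data.
    if unsent > 0:
        return (2, high_data)

    # Rule (3): an unSACKed octet meeting (1.a) and (1.b) only.
    for seq in outstanding:
        if meets_1a_1b(seq):
            return (3, seq)

    return None


# State at line 10 of the log, with no new data from the application.
result = next_seg(high_ack=5000, high_data=6000, high_rxt=1999,
                  sacked=[], unsent=0, is_lost=lambda s: False)
```

   Here result is None: with no SACK blocks above 5000, condition (1.b)
   can never hold, so rules (1) and (3) fail, and rule (2) fails because
   unsent is 0.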

5.  Proposed Fix

   To solve the problem described above, this document proposes
   introducing one new variable, RescueRxt, at the TCP sender and adding
   the following logic to NextSeg() as a fourth rule.

   (4) If the conditions for rules (1), (2) and (3) fail, but there
       exists unSACKed data, one segment of up to SMSS octets MAY be
       returned if RescueRxt is not set.  The returned segment MUST
       include the highest unSACKed sequence number.

       When a segment is returned by this rule, RescueRxt MUST be set to
       the highest octet of the segment.  Also, HighRxt MUST NOT be
       updated.

   In addition to this rule, a TCP sender MUST reset RescueRxt when it
   receives a cumulative ACK for a sequence number greater than
   RescueRxt.
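
   One possible reading of rule (4) and the RescueRxt reset is sketched
   below in Python.  The Sender class, the function names, the use of
   None for "not set", and the half-open (start, end) ranges are
   assumptions of this sketch, not part of the proposal.  rule4() is
   meant to be called only after rules (1), (2) and (3) have failed.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

SMSS = 1000  # octets; assumed value, same as the Section 4 example


@dataclass
class Sender:
    high_ack: int                      # cumulative ACK value, as in the log
    high_data: int                     # one past the last octet sent
    high_rxt: int                      # HighRxt, never changed by rule (4)
    rescue_rxt: Optional[int] = None   # RescueRxt; None means "not set"


def _sacked(seq: int, blocks: List[Tuple[int, int]]) -> bool:
    return any(start <= seq < end for start, end in blocks)


def rule4(snd: Sender, blocks: List[Tuple[int, int]],
          smss: int = SMSS) -> Optional[Tuple[int, int]]:
    """Return a half-open rescue segment (start, end), or None."""
    if snd.rescue_rxt is not None:
        return None                    # a rescue retransmission is pending
    # Find the highest unSACKed octet; the segment must include it.
    top = next((s for s in range(snd.high_data - 1, snd.high_ack - 1, -1)
                if not _sacked(s, blocks)), None)
    if top is None:
        return None                    # no unSACKed data exists
    # Extend the segment downwards, up to SMSS octets, without crossing
    # SACKed data or the cumulative ACK point.
    start = top
    while (start > snd.high_ack and top - start + 1 < smss
           and not _sacked(start - 1, blocks)):
        start -= 1
    snd.rescue_rxt = top               # highest octet of the segment
    return (start, top + 1)


def on_cumulative_ack(snd: Sender, ack: int) -> None:
    """Process a cumulative ACK whose value is the next expected octet."""
    snd.high_ack = max(snd.high_ack, ack)
    if snd.rescue_rxt is not None and ack > snd.rescue_rxt:
        snd.rescue_rxt = None          # allow a new rescue retransmission
```

   For the state at line 10 of the Section 4 log,
   rule4(Sender(5000, 6000, 1999), []) returns (5000, 6000), that is,
   segment 5000:6000, and sets RescueRxt to 5999.  A second call returns
   None until a cumulative ACK above 5999 arrives.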

6.  Discussion

   The simplest approach to this issue is to send unSACKed data whenever
   the conditions for rules (1), (2) and (3) fail, as long as the
   congestion window permits.  A similar approach is proposed in
   [I-D.scheffenegger-tcpm-sack-loss-recovery].  However, this approach
   can cause many unnecessary retransmissions when segments are
   reordered but not lost.

   The fix proposed in this document allows TCP to retransmit one
   segment per RTT when all the data available to TCP is unSACKed and it
   is uncertain whether that data has been lost.  Since the objective of
   this algorithm is to avoid retransmission timeouts and maintain ACK
   clocking, not to use up the unused congestion window, sending one
   segment per RTT is enough.  By sending this one segment, the sending
   TCP has a good chance of receiving additional ACKs from the
   receiver, which can trigger further retransmissions in the next RTT.
   The variable RescueRxt ensures that a retransmission by this
   algorithm happens only once per RTT.  This logic can drastically
   reduce the number of unnecessary retransmissions in case of
   reordering.

7.  Acknowledgements

   The author gratefully acknowledges Richard Scheffenegger, who
   originally identified the issue described in this document and gave
   insightful comments.  The author would also like to thank Mark Allman
   and Ethan Blanton for their careful review of the initial idea of the
   logic and for their valuable feedback.

8.  Security Considerations

   This document proposes only a simple modification to the algorithm in
   RFC3517.  There are no known additional security concerns for this
   algorithm.

9.  IANA Considerations

   This document does not create any new registries or modify the rules
   for any existing registries managed by IANA.

10.  References

10.1.  Normative References

   [RFC2018]  Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP
              Selective Acknowledgment Options", RFC 2018, October 1996.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2581]  Allman, M., Paxson, V., and W. Stevens, "TCP Congestion
              Control", RFC 2581, April 1999.

   [RFC3517]  Blanton, E., Allman, M., Fall, K., and L. Wang, "A
              Conservative Selective Acknowledgment (SACK)-based Loss
              Recovery Algorithm for TCP", RFC 3517, April 2003.

   [RFC3782]  Floyd, S., Henderson, T., and A. Gurtov, "The NewReno
              Modification to TCP's Fast Recovery Algorithm", RFC 3782,
              April 2004.

10.2.  Informative References

   [I-D.scheffenegger-tcpm-sack-loss-recovery]
              Scheffenegger, R., "Improving SACK-based loss recovery for
              TCP", draft-scheffenegger-tcpm-sack-loss-recovery-00 (work
              in progress), November 2010.

Author's Address

   Yoshifumi Nishida
   WIDE Project
   Endo 5322
   Fujisawa, Kanagawa  252-8520
   Japan

   Email: nishida@wide.ad.jp