draft-ietf-sigtran-mdtp-03.txt   draft-ietf-sigtran-mdtp-04.txt 
Network Working Group R. R. Stewart Network Working Group R. R. Stewart
INTERNET-DRAFT Motorola INTERNET-DRAFT Q. Xie
Q. Xie
Motorola Motorola
Expires in six months 1 April 1999 T. Bova
S Hussain
T Krivoruchka
R. Revis
Cisco
expires in six months April 19 1999
MULTI_NETWORK DATAGRAM TRANSMISSION PROTOCOL MULTI_NETWORK DATAGRAM TRANSMISSION PROTOCOL
<draft-ietf-sigtran-mdtp-03.txt> <draft-ietf-sigtran-mdtp-04.txt>
Status of This Memo Status of This Memo
This document is an Internet-Draft and is in full conformance This document is an Internet-Draft and is in full conformance with
with all provisions of Section 10 of RFC2026. Internet-Drafts are working all provisions of Section 10 of RFC2026. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas, documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups. Note that other groups may also distribute and its working groups. Note that other groups may also distribute
working documents as Internet-Drafts. working documents as Internet-Drafts.
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
Abstract Abstract
This Internet Draft discusses an experimental call control protocol, This Internet Draft discusses an experimental call control signaling
namely the Multi-network Datagram Transmission Protocol (MDTP), that is transport protocol, namely the Multi-network Datagram Transmission
intended to provide fault-tolerant reliable/unreliable data transfer Protocol (MDTP), that is intended to provide fault-tolerant reliable
between communicating processes over IP networks [1]. MDTP is proposed data transfer between communicating entities over IP networks [1].
as an application-level protocol which is designed with a high emphasis
on supporting redundant networks and transparent fault management. MDTP
also gives the application a great degree of timing control and
configuration flexibilities. The motivation of developing MDTP is to
establish a framework for supporting Internet-based high reliability
real-time commercial applications such as signaling and call control
for Internet telephony.
Stewart & Xie [Page 1]
TABLE OF CONTENTS
1. Introduction..............................................3 MDTP is proposed as an application-level protocol which is designed
1.1 Multi-network Datagram Transmission Protocol.........3 with a high emphasis on supporting redundant networks and transparent
1.2 Interfaces to MDTP...................................4 fault management. MDTP also gives the user a great degree of timing
1.3 Operation of MDTP....................................5 control and configuration flexibilities in order to meet the stringent
2. Design Principles.........................................5 time constraints often found in telephony signaling protocols. The
3. Header Format.............................................6 motivation of developing MDTP is to establish a framework for
3.1 MDTP Header Format Description.......................9 supporting Internet-based high reliability real-time commercial
3.2 Notes on Multicast Header format....................12 applications such as signaling and call control for Internet
4. Transmission Initialization..............................12 telephony.
4.1 Normal Initialization...............................12
4.2 Multiple Network Addresses..........................14
4.3 Initialization Collision............................15
4.4 Re-initialization...................................16
4.5 Link rotation.......................................16
5. Reliable Transfer Mode...................................17
5.1 Timer Control.......................................19
5.2 Gap Acknowledgments.................................21
5.3 Congestion Control..................................23
5.4 Sequence Number Reset...............................26
5.5 Retransmission on Multiple Networks.................27
5.5.1 Randomization of the T3-Send timer at resend ...28
5.6 Termination of an Endpoint..........................28
5.7 Endpoint Drain......................................29
5.8 Advisory Acknowledgments...........................29
5.9 RTT Measurement.....................................30
5.10 Heart Beat Ack.....................................32
6. Unreliable Transfer Mode.................................33
6.1 Ordered reception..................................34
7. Reliable flows...........................................35
7.1 Initiating a flow...................................36
7.2 Flow acknowledgments................................37
7.3 Flow session closing................................41
8. Mixed Mode Data Transmission.............................42
9. Bundled Messages.........................................43
9.1 Format of Bundled Datagram..........................44
9.2 Bundled Transfer....................................45
10. Fragmented Messages......................................46
11. Non-protocol Datagrams...................................47
12. Broadcast and Multicast..................................48
12.1 Multicast/Broadcast Initialization.................48
12.2 Transmission of Broadcast Datagrams................48
12.3 Transmission of Multicast Datagrams................49
12.4 Reset of the Multicast Datagram Sequence Number....50
13. Interface with upper level protocols.....................51
13.1 Init.MDTP primitive.....................................52
13.2 Send.Data primitive.....................................52
13.3 Receive.Data primitive..................................52
13.4 Data.Arrive notification................................53
13.5 Send.Failure notification...............................53
13.5 Link.Status.Change notification.........................53
TABLE OF CONTENTS
13.6 Communication.Lost notification.........................53 1. Introduction
14. Suggested Timer and Protocol Parameter Values............54 1.1 Design Requirements of MDTP
15. Acknowledgments.........................................54 1.2 Interfaces to MDTP
16. Author's Addresses.......................................54 2. MDTP Datagram Format
17. References...............................................55 2.1 Header Field Descriptions
2.2 Data Field
3. Transmission Initialization
3.1 Endpoint Association Initialization
3.1.1 Choice of Tag Value
3.2 Data Field Format of Initiation Datagrams
3.3 Initialization Collision
3.4 Association Re-initialization
4. Reliable Transfer of Datagrams
4.1 Timer Management Rules
4.1.1 Link Rotation
4.2 Gap Acknowledgment for Missing Datagrams
4.3 Congestion Control
4.3.1 Sending with Window Control
4.3.2 Window Length Adjustment
4.3.3 Flow Control using In-Queue Information
4.3.4 T3-send Timer Adjustment with RTT
4.4 Sequence Number Reset
4.5 Datagram Re-transmission
4.5.1 Re-transmission on Redundant networks
4.6 RTT Measurement
4.6.1 RTT Datagram Header Format
4.6.2 Measure RTT
4.7 Link Heart Beat
4.8 Advisory Acknowledgment
4.9 Termination of an Association
4.10 Draining of an Association
5. Interface with upper level protocols
6. Suggested MDTP Protocol Parameter Values
7. Acknowledgments
8. Author's Addresses
9. References
Appendix A: Stream-based Reliable and Ordered Delivery
A.1 Stream Initiation
A.2 Stream Termination
A.3 Stream Datagram Transfer
A.3.1 Header Format in Stream Datagrams with User Data
A.3.2 Transmission of Stream Datagrams
A.3.3 Extended Stream Ack
A.4 Other Issues with Stream Transfer
Appendix B: Bundled Message Transfer
B.1 Format of Bundled Datagram
B.2 Bundled Datagram Transfer
Appendix C: Fragmented Message Transfer
Appendix D: Multicast Datagram Transfer
D.1 Multicast Datagram Header Format
D.2 Transmission of Multicast Datagrams
Appendix E: Unreliable Delivery
E.1 Ordered Unreliable Delivery
1. Introduction 1. Introduction
This Internet Draft discusses an experimental protocol, namely the This Internet Draft discusses an experimental protocol, namely the
Multi-network Datagram Transmission Protocol (MDTP), that is intended Multi-network Datagram Transmission Protocol (MDTP). The intention of
to provide fault-tolerant reliable/unreliable data transfer between developing MDTP is to provide a fault-tolerant, real-time reliable
communicating processes over IP networks [1]. data transfer mechanism between communicating endpoints over IP
networks [1].
MDTP is proposed as an application-level protocol which is designed MDTP is proposed as an application-level protocol which is designed
with a high emphasis on supporting redundant networks and transparent with a high emphasis on supporting redundant networks and transparent
fault management. MDTP also gives the application a great degree of fault management. MDTP also gives the user a great degree of timing
timing control and configuration flexibilities. The motivation of control and configuration flexibilities in order to meet the stringent
developing MDTP is to establish a framework for supporting time constraints often found in telephony signaling protocols. The
Internet-based high reliability real-time commercial applications motivation of developing MDTP is to establish a framework for
such as signaling and call control for Internet telephony. supporting Internet-based high reliability real-time commercial
applications such as signaling and call control for Internet
This document describes the functional interface and the details telephony.
necessary to implement MDTP.
1.1 Multi-network Datagram Transmission Protocol (MDTP)
The Multi-network Datagram Transmission Protocol (MDTP) presented in MDTP is also designed to be scalable in order to support different
this Internet Draft is designed to meet the following critical signaling transport requirements for different interfaces in a
requirements common to real-time call control environments employing telephony network.
redundant networks:
A) A process may need to be in simultaneous communication with For example, the transportation of signaling protocols such as PRI
thousands of endpoints performing various call processing ISDN may not require redundant links, and hence only a subset of MDTP
functions. These endpoints may be codec converters, SS7 to IP will need to be implemented. On the other hand, redundant networks
translation applications, or, in the case of mobile networks, data may be mandated when transporting SS7 signaling messages amongst
selector and combiner applications. different components in a carrier-grade telephony core network. In
such cases, the transparent support for redundant networks, load
sharing, and fault management defined in MDTP become essential and
likely need to be fully supported in an implementation.
B) A process needs to have a very fine control over the timing for Many of the fundamental concepts that have made TCP such a useful
delivering a datagram. The timing should be easily adjusted protocol are reused in MDTP, and some of the advantages of UDP are
depending on the message type and the destination. For example, also merged into the design. This has led to a highly effective,
after a few seconds of non-delivery the call which the message robust protocol for fault tolerant data communications.
is about may not exist anymore.
C) A process communicating with a peer should be able to take This document describes the functional interface and the details
advantage of the redundant networks in a transparent way. This necessary for implementing MDTP. The main body of this document
means that the application or upper level protocols need not be contains the minimal set of functionalities of MDTP that must be
involved in the network fault management. Instead, when network implemented. In the Appendices, a set of additional MDTP functions,
failure occurs the transmission protocol should be able to such as reliable stream, multicast, message bundling, message
automatically re-route the out-bound datagram to the alternate fragmentation, are defined. Those additional functionalities are
optional to implementation.
1.1 Design Requirements of MDTP
network without intervention from the application.
D) Datagrams may arrive out of order, or may arrive in duplicate The following are some of the design requirements of MDTP, in order to
copies. This is especially true in a redundant network make MDTP capable of supporting real-time call control environments
environment. The transmission protocol should be strong enough to which potentially may employ redundant networks:
properly handle both situations with little intervention from the
upper level protocol or application.
To accomplish the above objectives we have defined MDTP to reside in A) High communication fan-out: an endpoint may need to be in
user-space, i.e., it is not intended to be implemented as a module in simultaneous communication with hundreds or thousands of endpoints
an operating system. This gives the application or upper level performing various call processing functions. These endpoints may
protocols that use MDTP outstanding flexibility in controlling the be codec converters, SS7 to IP translation applications, or, in the
timing and other operational characteristics for the data case of mobile networks, data selector and combiner applications.
transmissions.
MDTP is also made multi-network aware. This means that if more than B) Stringent timer control: an endpoint needs to have a very fine
one path exists between two endpoints (such as redundant LANs), MDTP control over the timing for delivering a datagram. The timing
will take advantage of the multiple networks by automatically should be easily adjusted depending on the message type and the
switching to the alternate LAN if the datagram delivery becomes destination. For example, after a few seconds of non-delivery the
unavailable or inefficient (e.g., too many re-transmissions) on the call which the message is about may not exist anymore.
current LAN. The ability to handle multiple networks by MDTP can also
greatly facilitate the implementation of various traffic balancing
schemes in the application or upper level protocols.
In the redundant network setting, out-of-order or duplicate datagrams C) Support redundant links: an endpoint communicating with a peer
are proven to be most harmful during MDTP transmission initiations and should be able to take advantage of the redundant networks in a
re-initiations. To cope with the problem, MDTP utilizes a very transparent way. This means that the application or upper layer
efficient tag mechanism to guard against out-of-order or duplicate protocols need not be involved in the network fault
datagrams. management. Instead, when network failure occurs MDTP should be
able to automatically re-route the out-bound datagram to the
alternate network (if one exists) without intervention from the
application.
MDTP assumes that a UDP-like [2] transport protocol is available at the D) Orderly delivery: datagrams may arrive out of order, or may arrive
operating system level for data transport. We have successfully in duplicate copies. This is especially true if redundant networks
implemented and tested MDTP over UDP and Sun Microsystem's CLTS are used. MDTP should be strong enough to properly handle both
transport layers. situations with little intervention from the upper layer protocols
or applications.
Compared with traditional TCP [3], the MDTP design is more tuned towards a E) Support stream sequencing: on the demand of the upper layer
special set of applications, that is the time critical fault tolerant protocols or applications, MDTP should be able to support sequenced
applications using redundant LANs. It is not designed to replace TCP delivery with regard to each individual stream, i.e., the delay caused
as a general purpose transmission protocol. by the loss and retransmission of a datagram should be isolated to
only the stream to which the datagram belongs. This is particularly
important in some call control applications, where a loss of a
message should only affect the call to which the message belongs.
1.2 Interfaces to MDTP 1.2 Interfaces to MDTP
MDTP interfaces with the application programs or higher level The application programs or upper layer protocols interface with MDTP
protocols through a set of function calls. Due to the fact that MDTP through a set of primitives (see section 5 for details).
is an application level protocol, these calls are not executed within
the operating system, but within the user process (i.e., in the user
space). The application or higher level protocols pass data to MDTP by
making calls to MDTP, which then enqueues the data for transmission.
When data arrives, MDTP will distribute the data to the application or
higher level protocols via mechanisms predefined by the application.
The application also has an interface to change the operational mode
of an MDTP endpoint and the default operational mode of the MDTP
endpoint. The default operational mode is used in the absence of any
specific direction from the application. More details on the MDTP
interface to the upper level protocol/application can be found in
section 13.
As noted above, it is assumed that a UDP-like data transport protocol
will provide the interface between MDTP and the operating system. No
other special interfaces or changes are assumed within the operating
system; all queuing and internal pseudo-connection information is
maintained inside the MDTP endpoint.
1.3 Operation of MDTP
MDTP operates in three different modes.
A) Reliable transfer mode
B) Unreliable transfer mode
C) Raw UDP transfer mode
The two ends in a communication connection can operate in different
modes with respect to each other, with the exception of the raw UDP
mode. For example, suppose two endpoints A and B are communicating with
each other. Endpoint A may be sending information to B in reliable
transfer mode, while B, on the other hand, may be sending information
to A in unreliable transfer mode. All communications from A to B will
be acknowledged by B, but A will not need to acknowledge data received
from B.
Raw UDP transfer is used when one of the endpoints in communication
does not support MDTP. This allows compatibility with non-MDTP
endpoints. Two MDTP capable endpoints are also allowed to engage in
communications in raw UDP transfer mode. However, both sides will have
to be in raw UDP mode once one of them indicates to use raw UDP
transfer mode.
MDTP also provides a bundling option for both the reliable and
unreliable transfer modes. This allows each side to hold the data
before transmission for some period of time, so that small datagrams
can be combined and sent in a single larger datagram to improve
network utilization efficiency.
2. Design Principles
One of the major objectives which dictates the design of MDTP is to
provide a data transmission protocol that transparently supports highly
fault tolerant implementations. To accomplish this, provisions for two
endpoints engaging in communication to use multiple networks are
essential. MDTP is therefore designed to yield the best fault
tolerance when the application shares the load over multiple network
connections.
In cases of failed original transmission, MDTP provides the ability of
attempting retransmissions using an alternate network connection even
when the upper level protocol or the application is completely
ignorant of the existence of the alternate route.
Many of the fundamental concepts that have made TCP such a useful
protocol are reused, and some of the advantages of UDP are also merged
into the design of MDTP. This has led to a highly effective, robust
protocol for fault tolerant data communications.
3. Header Format
MDTP inserts at the beginning of every datagram a header. This header Towards the networks, it is assumed that a UDP-like data transport
is composed of various flags and integers. The integers are always kept protocol will provide the interface between MDTP and the operating
in network byte order. The following table illustrates the common system. No special interfaces or changes are assumed within the
MDTP header overlay. Note that one tick mark represents one bit operating system, all queuing and endpoint association information are
position. maintained inside MDTP layer.
MDTP Header Format - Non Multicast 2. MDTP Datagram Format
0 1 2 3 MDTP inserts the following protocol header at the beginning of every
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 user datagram. The integer fields shall be transmitted in network byte
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ order.
| MDTP Protocol Identifier 1 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| MDTP Protocol Identifier 2 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Acknowledgment Number (Seen) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sequence Number (Send) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data Size | Part | Of |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Flags | Mode | Version | In Queue |
|N N W I F R D A|B S W R R B G U| | |
|O O I S I E A C|R H N E E U A N| | |
|G B N B R S T K|O U R 1 2 N R R| | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
\ \
/ data /
\ \
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
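The -03 non-multicast header above can be read as a fixed 24-octet prefix. The following is a minimal sketch in Python, not part of either draft, of how a receiver might unpack it. It assumes that bit 0 in the diagram is the most significant bit of each octet (the usual convention for such diagrams). The names parse_header_03 and bit_names are hypothetical helpers introduced only for illustration.

```python
import struct

# Fixed identifier values quoted in the -03 field descriptions.
MDTP_ID_1 = 0xF7873072
MDTP_ID_2 = 0x17074012

# "!" selects network byte order with no padding: four 32-bit words,
# Data Size (16 bits), then Part, Of, Flags, Mode, Version and
# In Queue (8 bits each), 24 octets in total.
HEADER_03 = struct.Struct("!IIIIHBBBBBB")

# Bit names read left to right from the diagram. Assumption: bit 0 is
# the most significant bit of its octet.
FLAG_BITS = ("NOG", "NOB", "WIN", "ISB", "FIR", "RES", "DAT", "ACK")
MODE_BITS = ("BRO", "SHU", "WNR", "RE1", "RE2", "BUN", "GAR", "UNR")


def bit_names(octet, names):
    """Return the set of names whose bit is set in an 8-bit field."""
    return {name for i, name in enumerate(names) if octet & (0x80 >> i)}


def parse_header_03(datagram):
    """Return (header fields as a dict, data octets) for one datagram."""
    if len(datagram) < HEADER_03.size:
        raise ValueError("datagram shorter than the 24-octet header")
    (id1, id2, ack, seq, size, part, of,
     flags, mode, version, in_queue) = HEADER_03.unpack_from(datagram)
    # Both identifiers are examined jointly before anything else is trusted.
    if (id1, id2) != (MDTP_ID_1, MDTP_ID_2):
        raise ValueError("protocol identifiers do not match MDTP")
    data = datagram[HEADER_03.size:HEADER_03.size + size]
    if len(data) != size:
        raise ValueError("Data Size exceeds the octets received")
    header = {
        "ack": ack, "seq": seq, "part": part, "of": of,
        "flags": bit_names(flags, FLAG_BITS),
        "mode": bit_names(mode, MODE_BITS),
        "version": version, "in_queue": in_queue,
    }
    return header, data
```

Using struct with the "!" prefix keeps the integer fields in network byte order, as the text requires, without relying on the host's native alignment.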
MDTP Header Format
MDTP Header Format - Multicast Format
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| MDTP Protocol Identifier 1 | | MDTP Protocol Identifier |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| MDTP Protocol Identifier 2 | | Version | Flags | In Queue |
| |N N W I F R D A M S W R R F G U| |
| |O O I S I T A C U H N E T L A N| |
| |M B N B R M T K L U R 1 C O R R| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Acknowledgment Number (Seen) | | Acknowledgment Number (Seen) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sequence Number (Send) | | Sequence Number (Send) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data Size | Part | Of | | Data Size | Part | Of |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Flags | Mode | Version | In Queue |
|N N W I F R D A|B S W R R B G U| | |
|O O I S I E A C|R H N E E U A N| | |
|G B N B R S T K|O U R 1 2 N R R| | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Multicast To Transmit address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Multicast From - senders base address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
\ \ \ \
/ data / / data /
\ \ \ \
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
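The -04 header shown in the right-hand column collapses the two identifiers into one and moves Version, Flags and In Queue into the second word. The following is a minimal sketch in Python, not part of either draft, of building and parsing that 20-octet layout. It rests on several assumptions: bit 0 of the diagram is the most significant bit, In Queue occupies the 8 bits left over after the 8-bit Version and 16-bit Flags, and the flag names (including RTC and FLO) are read column by column from the diagram. The Version value is TBD in the draft, so the caller supplies it. The function names are hypothetical.

```python
import struct

MDTP_PROTOCOL_ID = 0xF7873072  # fixed value given in the -04 text

# Protocol Identifier (32) | Version (8), Flags (16), In Queue (8) |
# Acknowledgment Number (32) | Sequence Number (32) |
# Data Size (16), Part (8), Of (8); 20 octets in network byte order.
HEADER_04 = struct.Struct("!IBHBIIHBB")

# Names read column by column from the diagram (assumed bit 0 = MSB).
FLAG_BITS_04 = ("NOM", "NOB", "WIN", "ISB", "FIR", "RTM", "DAT", "ACK",
                "MUL", "SHU", "WNR", "RE1", "RTC", "FLO", "GAR", "UNR")


def encode_flags(names):
    """Turn a collection of flag names into the 16-bit Flags value."""
    value = 0
    for name in names:
        value |= 0x8000 >> FLAG_BITS_04.index(name)
    return value


def decode_flags(value):
    """Return the set of flag names set in a 16-bit Flags value."""
    return {n for i, n in enumerate(FLAG_BITS_04) if value & (0x8000 >> i)}


def build_datagram(version, flags, in_queue, ack, seq, data=b"",
                   part=0, of=1):
    """Prefix data with a -04 style header; Part/Of default to unfragmented."""
    header = HEADER_04.pack(MDTP_PROTOCOL_ID, version, encode_flags(flags),
                            in_queue, ack, seq, len(data), part, of)
    return header + data


def parse_datagram(raw):
    """Verify the Protocol Identifier first, then decode the other fields."""
    if len(raw) < HEADER_04.size:
        raise ValueError("datagram shorter than the 20-octet header")
    (ident, version, flags, in_queue,
     ack, seq, size, part, of) = HEADER_04.unpack_from(raw)
    if ident != MDTP_PROTOCOL_ID:
        raise ValueError("Protocol Identifier does not match MDTP")
    data = raw[HEADER_04.size:HEADER_04.size + size]
    if len(data) != size:
        raise ValueError("Data Size exceeds the octets received")
    return {"version": version, "flags": decode_flags(flags),
            "in_queue": in_queue, "ack": ack, "seq": seq,
            "part": part, "of": of}, data
```

For example, build_datagram(1, {"NOM", "NOB", "DAT", "ACK"}, 0, 7, 42, b"hello") yields a 25-octet datagram that parse_datagram splits back into the same fields; the version value 1 here is only a placeholder.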
MDTP Header Format - RTT Ack 2.1 Header Field Descriptions
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| MDTP Protocol Identifier 1 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| MDTP Protocol Identifier 2 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Acknowledgment Number (Seen) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sequence Number (Send) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data Size | Part | Of |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Flags | Mode | Version | In Queue |
|N N W I F R D A|B S W R R B G U| | |
|O O I S I E A C|R H N E E U A N| | |
|G B N B R S T K|O U R 1 2 N R R| | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Transparent Time Int-1 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Transparent Time Int-2 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Flow Initiate/Close Message
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| MDTP Protocol Identifier 1 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| MDTP Protocol Identifier 2 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Acknowledgment Number (Seen/flow num) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sequence Number (Send) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data Size | Part | Of |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Flags | Mode | Version | In Queue |
|N N W I F R D A|B S W R R B G U| | |
|O O I S I E A C|R H N E E U A N| | |
|G B N B R S T K|O U R 1 2 N R R| | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Ack Flow (opening) | Ack datagram number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Flow Extended Acknowledgment
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| MDTP Protocol Identifier 1 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| MDTP Protocol Identifier 2 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Ack Flow (Seen) | Ack datagram number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Number of flow Acks |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data Size | Part | Of |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Flags | Mode | Version | In Queue |
|N N W I F R D A|B S W R R B G U| | |
|O O I S I E A C|R H N E E U A N| | |
|G B N B R S T K|O U R 1 2 N R R| | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Ack Flow (Seen) | Ack datagram number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
\ /
/ one for each 'Number of flow Acks' \
\ /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Ack Flow (Seen) | Ack datagram number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
3.1 MDTP Header Format
MDTP Protocol Identifier 1: 32 bits
This is a fixed long value of 0xf7873072.
MDTP Protocol Identifier 2: 32 bits
This is a fixed long value of 0x17074012. MDTP Protocol
Identifier 1 and 2 are jointly examined to determine whether a
received datagram is an MDTP protocol datagram.
Acknowledgment Number (or Seen): 32 bits
If the flag ACK is set this value is the next sequence number
that the sender of this datagram expects to receive from the
receiver of this datagram.
However, during initialization negotiation, multicast and
broadcast transmissions, this field will have special meanings
(see 4 and 11).
Sequence Number (or Send): 32 bits
If DAT flag is set, this value represents the sequence number of
the first data octet that follows this header. Otherwise, this
value will be the sequence number of the first octet of the next
data unit that will be sent.
However, during initialization negotiation, multicast and
broadcast transmissions, this field will have special meanings
(see 4 and 11).
Part: 8 bits
This value represents the Part number of a fragmented message. The
first fragment of a message is always part '0'.
Of: 8 bits
This value represents the total number of fragments in a MDTP Protocol Identifier: 32 bits
fragmented message. The valid range for this value is from '1'
to '255'. For broadcast and multicast datagrams this value is
set to '1' to indicate that no fragmentation should occur.
Data Size: 16 bits This shall be a fixed long value of 0xf7873072. The receiver
shall always verify this Protocol Identifier before it proceeds
any further in interpreting the header fields.
This value represents, in number of octets, the size of the data Version: 8 bits
field that follows this header in the current datagram.
Flags: 8 bits This field represents the version number of the MDTP protocol
(value TBD).
NOG - No Guaranteed delivery. This bit is used in negotiation Flags: 16 bits
NOM - shall be set to 1 (reserved for fragmentation, see
and is set to indicate that the sender does not wish to use Appendix C)
reliable delivery. When this bit has been set in negotiation,
the receiver should prevent its application from putting
communication with this endpoint in reliable mode.
In normal data transfer (after the initiate sequence) this
bit should be set to 0, except when responding to a RTT Ack
request.
NOB - No Bundling. This bit is used in negotiation and NOB - shall be set to 1 (reserved for bundling, see Appendix B)
is set to indicate that the sender does not wish to perform
bundling or un-bundling of datagrams. When this bit has been set
in negotiation, the receiver should prevent its application from
putting communication with this endpoint in bundled mode.
In normal data transfer this bit should be set to 0, if this
bit is set to 1 then this message is part of a flow.
WIN - Window Up. This bit is set by the sender of this datagram WIN - Window Up. This bit is set by the sender of this datagram
to indicate that the sender needs the receiver to acknowledge on to indicate that the sender needs the receiver to acknowledge on
previously received datagrams before it can send more datagrams. previously received datagrams before it can send more datagrams.
ISB - Is Bundled. This bit is set by the sender to indicate that ISB - shall be set to 0 (reserved for bundling, see Appendix B)
this datagram is bundled. This bit should never be set if during
negotiation either end set the NOB bit.
FIR - First Datagram. This flag is set to indicate that this is a FIR - First Datagram. This flag is set to indicate that this is a
negotiation datagram. Initiation datagram.
RES - Reset Sequence Number. This bit is set to indicate that the RTM - normally set to 0 (used for Link Heart Beat and RTT
sequence number is being reset. The sequence number should be reset measurement, see sections 4.6 and 4.7)
whenever the sending count is greater than 0x7fffffff.
DAT - Data Present. This bit is set to indicate that, following DAT - Data Present. This bit is set to indicate that, following
this header, application data is present in this datagram. this header, application data is present in this datagram.
ACK - Acknowledge. This bit is set to indicate that the sender is ACK - Acknowledge. This bit is set to indicate that the sender is
acknowledging receipt of the specified Acknowledgment Number. acknowledging the reception of the specified Acknowledgment Number.
Mode: 8 bits
BRO - Broadcast. This bit is set to indicate a broadcast or MUL - shall be set to 0 (reserved for multicast, see Appendix D)
multicast datagram. When this bit is set, bit SHU, WNR, BUN, and
GAR are not used and should be set to '0'. This datagram is a
multicast datagram if the UNR bit is also set. Otherwise, this
datagram is a broadcast datagram.
SHU - Shutdown. This bit is set when the sender initiates its SHU - Shutdown. This bit is set when the sender initiates its
closing procedure and indicates to the receiver that the sender closing procedure and indicates to the receiver that the sender
is no longer a valid destination. If the UNR bit is set in is no longer a valid destination. If the UNR bit is set in
conjunction with the SHU bit, an incomplete shutdown is conjunction with the SHU bit, an incomplete shutdown is
specified. After an incomplete shutdown, the receiver can still specified. After an incomplete shutdown, the receiver can still
re-establish the communication with the sender by re-initiating re-establish the communication with the sender by re-initiating
with the sender (see 5.7). with the sender (see 4.7).
WNR - Window Up Response. This bit is set in the acknowledgment WNR - Window Up Response. This bit is set in the acknowledgment
reply to a Window Up flag. reply to a Window Up flag.
RE1 - This bit will represent one of two things. If the GAR RE1 - normally set to 0 (used for advisory ACK, see section 4.8)
bit is set to one, then setting the RE1 bit indicates to the
receiver that the sender is requesting an advisory ACK. This
is normally sent in a datagram when 1/2 of the current window
has been sent. If this bit is set to 0 (when the GAR bit is
set) then the sender is NOT requesting an advisory ACK.
If the UNR bit is set and the RE1 bit is set, then the receiver
is requested to order the datagrams (if more than one have
not been read). If the receiver has already delivered a datagram
of higher sequence, then the receiver should discard lower-numbered
datagrams that arrive late.
RE2 - This bit will represent one of two things. If the GAR
bit is set to one, the DAT bit is set to 0 and the ACK bit is
set to 1 then this is an ACK with a Round Trip Time Request
format. This also indicates that the RTT Ack header format is
in place. If the UNR bit is set to 1 and DAT bit is set to 0,
then this datagram is used in an implementation-specific way but
carries no data. The datagram can be safely ignored and discarded.
BUN - Bundled Mode. This bit is set to indicate that bundled
mode is in effect for the sender. This bit should never be set
if during negotiation either endpoint set the NOB flag.
GAR - Guaranteed Mode. This bit is set to indicate that the RTC - normally set to 0, (used for RTT, see section 4.6)
reliable mode is in effect for the sender, i.e., the sender
expects an acknowledgment. This bit should never be set if
either endpoint set the NOG flag during negotiation.
UNR - Unreliable Mode. This bit is set to indicate that FLO - shall be set to 0 (reserved for reliable stream, see
unreliable mode is in effect for the sender and the sender does Appendix A)
not expect an acknowledgment. This bit has special meanings if
BRO or SHU bit is set (see above).
Version: 8 bits GAR - shall be set to 1 (reserved for unreliable mode, see
Appendix E)
This field represents the version number of the MDTP UNR - shall be set to 0 (reserved for unreliable mode see
protocol. If these bits are set to 1, then the sender does Appendix E)
not support Round Trip Time (RTT) calculation or Heart
Beat of reliable protocol. If these bits are set to 2 then
this version does support RTT and Heartbeat. If the Version
is set to 3 then the sender/receiver supports reliable flows.
In Queue: 8 bits In Queue: 8 bits
This field contains the number of messages the sender has on its This field contains the number of messages the sender has on its
incoming queue, waiting to be read by the application. This gives incoming queue, waiting to be read by the application. This
the receiver an indication of the flow control conditions within gives the receiver an indication of the flow control conditions
the sender. within the sender.
The message header is always followed by the data field. If there are Acknowledgment Number (or Seen): 32 bits
fewer than 4 octets of application data to send with the datagram, the
data field of the datagram should be padded with all '0' to make it
four (4) octets. The padded all '0' octets, if there are any, are not
counted in the Data Size.
The maximal Data Size for a single MDTP datagram is the MTU size of If the flag ACK is set this value is the last sequence number
the underlying transport protocol (e.g., UDP) minus the MDTP header that the sender of this datagram received from the
size that is twenty four (24) octets. The combination of the maximal receiver of this datagram.
'Of' value, which is 255, and the maximal Data Size will determine
the maximal size of a single message that the MDTP can send or
receive.
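The size limits stated in the -03 text can be worked through as a short calculation. The following is a minimal sketch, not part of either draft: the function names are hypothetical, and the 1472-octet figure in the usage note is only an assumed UDP payload size on a typical Ethernet path.

```python
MDTP_HEADER_SIZE = 24  # octets, as stated in the -03 text
MAX_OF = 255           # largest value the 8-bit 'Of' field can carry


def max_data_size(mtu: int) -> int:
    """Largest Data Size for one datagram: transport MTU minus the MDTP header."""
    if mtu <= MDTP_HEADER_SIZE:
        raise ValueError("MTU too small to carry an MDTP header")
    return mtu - MDTP_HEADER_SIZE


def max_message_size(mtu: int) -> int:
    """Largest single message: maximal 'Of' value times maximal Data Size."""
    return MAX_OF * max_data_size(mtu)
```

With an assumed MTU of 1472 octets, max_data_size() gives 1448 octets per datagram and max_message_size() gives 369240 octets per message.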
3.2 MDTP Multicast Header Format Sequence Number (or Send): 32 bits
The multicast header format is identical to the standard MDTP header If DAT flag is set, this value represents the sequence number of
format, as discussed above, except for the following extensions. the current data unit following this header. Otherwise, this
value will be the sequence number of the next data unit that
will be sent.
Multicast To Transmit address - This is the multicast address, in Data Size: 16 bits
network byte order, that the sender transmitted the data to. The
receiver can use this information for internal tracking purposes.
Multicast From - This is the base address (address 0 in the initiate This value represents, in number of octets, the size of the data
message, see below) of the sender. Since a multicast sender may not field that follows this header in the current datagram.
have gone through the initiate procedures this address is the base
reference that the receiver is to use to lookup the sender. This
network byte order address should be used to reference any internal
cache rather than the arriving network from address.
4. Transmission Initialization Part: 8 bits
4.1 Normal Initialization shall have value '0' (reserved for fragmentation, see Appendix C)
Before the first data transmission can take place from one endpoint Of: 8 bits
(A) to another endpoint (Z), the two endpoints will need to complete
an initialization process.
The initialization process consists of the following steps. shall have value '1' (reserved for fragmentation, see Appendix C)
A) Endpoint A should first send an initiation datagram, while 2.2 Data Field
withholding the application data from transmission.
Endpoint A Endpoint Z When the DAT flag is set to 1, the MDTP datagram header will be
[Header Flags=FIR|RES followed by a data field. An implementation may choose to pad some
Mode=options '0's at the end of the data field so as to align with certain memory
Seen=0,Send=Tag_A] -----------------------> boundaries. However, the padded '0' octets, if there are any, shall
(Start T1-init timer) not be counted in the Data Size.
(Enter Tag_A-lock mode)
The initiation datagram is identified by setting FIR and RES bits in The maximal Data Size for a single MDTP datagram is the MTU size of
the Flags field. No user data should be carried in the initiation the underlying transport protocol (e.g., UDP) minus the MDTP header
datagram. size.
The Endpoint A should fill in the appropriate options, e.g., BUN, 3. Transmission Initialization
GAR, or UNR, in the Mode field to indicate the transmission type it
has chosen. It may also use NOB and NOG bits in the Flags field to
specify whether or not its peer is allowed to use bundling or
reliable transfer mode.
The Seen field will be set to '0', but an initiation tag, Tag_A, 3.1 Endpoint Association Initialization
generated by Endpoint A, will be carried in the Send field, as
shown in the above diagram. If re-initializations are needed
between two endpoints subsequently (see 4.3), a different tag with
a unique value should be used for each re-initialization.
After sending the initiation datagram, Endpoint A shall start T1-init Before the first data transmission can take place from one endpoint
timer and enter a Tag_A-lock mode. ("A") to another endpoint ("Z"), the two endpoints will need to
complete an initialization process in order to set up an association
between them.
During the Tag_A-lock mode, Endpoint A will wait for the initiation The initialization procedure should be made transparent to the upper
Ack datagram with the Seen value set to Tag_A. Any other incoming layer protocol, i.e., it should take place automatically whenever the
datagrams from Endpoint Z, except for new initiation datagrams, upper layer tries to send a datagram to an endpoint which has never
will be discarded. The arrival of new initiation datagrams during the been sent to before. The user datagram shall be withheld by MDTP from
Tag_A-lock mode indicates an initialization collision that will be transmission till the completion of the initialization.
discussed in 4.3.
If T1-init timer expires, the same initiation datagram will be A tag-and-lock mechanism is employed during the initialization in
retransmitted and the timer restarted. This will be repeated order to guard against erroneous or stale datagrams (this is
Max.Init.Retransmit times before Endpoint A considers Endpoint Z especially true if redundant networks are deployed).
unreachable and optionally reports the failure.
B) Upon the receipt of the above initiation datagram from Endpoint A, The initialization process consists of the following steps (assuming
Endpoint Z should respond immediately with an initiation Ack as shown the upper layer at "A" tries to send data to "Z" for the first time):
below:
Endpoint A Endpoint Z A) "A" first sends an Initiation (FIR) to "Z", with Seen field set
[Header Flags=FIR|RES|ACK to 0 and Send field set to Tag_A, and then enters the Tag-lock mode
Mode=Options (see below).
/---------- Seen=Tag_A,Send=Tag_Z]
/ (Enter Tag_Z-lock mode)
(Cancel T1-init timer)<-------/
The initiation Ack datagram is specified with FIR, RES, ACK bits set B) "Z" responds immediately with an Initiation Ack (FIR|ACK), with
to '1' in the Flags field. Similarly, Endpoint Z will specify its Seen set to Tag_A and Send set to Tag_Z, and then enters the
preferred transmission mode and type by setting proper bits in the Tag-lock mode, too (see below).
Mode and Flags fields.
In addition, in the out-bound initiation Ack datagram, Endpoint Z Note that no user data should be carried in the Initiation or
should set the Seen field to Tag_A and supply its own initiation Initiation Ack datagram.
tag, Tag_Z, in the Send field.
Once the initiation Ack is transmitted, Endpoint Z should enter the At this point "Z" is ready to send user data to "A". And upon the
Tag_Z-lock mode. In the Tag_Z-lock mode Endpoint Z will ignore any receipt of the above Initiation Ack from "Z", "A" can also start
incoming initiation Ack datagrams and also discard any other incoming sending user data to "Z".
datagram whose Seen field is not equal to Tag_Z, except for new
initiation datagrams.
If a new initiation datagram is received when Endpoint Z is in However, the first datagram with user data transmitted by "A" to "Z"
Tag_Z-lock mode, Endpoint Z will acknowledge the initiation datagram shall have the Seen value set to Tag_Z, which is obtained from the
only when the tag carried in the Send field matches Tag_A previously Initiation Ack. And similarly, the first datagram with user data
recorded by Endpoint Z. Otherwise, Endpoint Z will send an initiation transmitted by "Z" to "A" shall have the Seen value set to Tag_A,
datagram with Send field set to Tag_Z back to Endpoint A to elicit an which comes from the Initiation datagram.
initiation Ack.
C) After transmitting the initiation Ack, Endpoint Z can start In the Tag-lock mode, each side will silently discard any datagrams
transmitting datagrams with user data. However, the Seen field in the with user data from the other side until it receives the first
datagram with user data and with a Seen value that matches its own
Tag. Once that datagram is received, that endpoint will leave the
Tag-lock mode and immediately send back a data acknowledgment, and
start using the sequence numbers to filter out missing and duplicate
datagrams.
first out-bound datagram with user data must be set to Tag_A. If another Initiation from "A" is received by "Z" after it sent out
the Initiation Ack, "Z" will acknowledge this Initiation by re-sending
the Initiation Ack only when the Send field of this new Initiation has
the same tag as that of the original Initiation. Otherwise, "Z" will
send an Initiation of its own with Send field set to Tag_Z back to "A"
to elicit an Initiation Ack from "A".
D) Upon the receipt of the initiation Ack with Seen equal to Tag_A, In the following example, "A" initiates the association first and then
Endpoint A can start transmitting datagrams with user data. However, sends a datagram with user data to "Z":
the first datagram with application data transmitted by Endpoint A
should have the Seen value set to Tag_Z, which is obtained from the
initiation Ack.
Endpoint A Endpoint Z Endpoint A Endpoint Z
{first app message}
{first app message to Z}
[Header Flags=FIR
& other options
Seen=0,Send=Tag_A] ----------------------->
(Start T1-init timer)
(Enter Tag_A-lock mode)
[Header Flags=FIR|ACK
& other options
/---------- Seen=Tag_A,Send=Tag_Z]
/ (Enter Tag_Z-lock mode)
(Cancel T1-init timer)<-------/
[Header Flags=ACK|DAT [Header Flags=ACK|DAT
Mode=options & other options
Seen=Tag_Z,Send=1] Seen=Tag_Z,Send=1]
[data field] -----------\ [data field] -----------\
\ (Start T3-send timer) \
\-------> (Leave Tag_Z-lock mode) \----> (Leave Tag_Z-lock mode)
E) Upon the receipt of the first datagram with user data from Endpoint
A and with the Seen value set to Tag_Z, Endpoint Z should leave the
Tag_Z-lock mode.
F) Similarly, upon the receipt of the first datagram with user data
and the Seen value set to Tag_A from Endpoint Z, Endpoint A
should leave the Tag_A-lock mode.
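The tag-lock behaviour described above can be illustrated with a small state holder. This is a sketch under stated assumptions, not the drafts' implementation: the class and function names are hypothetical, the tag range is the one recommended in the -04 text (0x80000000 to 0xffffffff), and sequence-number filtering after the lock is released is deliberately left out.

```python
import secrets

TAG_MIN, TAG_MAX = 0x80000000, 0xFFFFFFFF  # tag range recommended in the -04 text


def new_tag() -> int:
    """Pick an initiation tag from the recommended range (hypothetical helper)."""
    return TAG_MIN + secrets.randbelow(TAG_MAX - TAG_MIN + 1)


class TagLock:
    """Tracks whether an endpoint is still in Tag-lock mode."""

    def __init__(self, own_tag: int):
        self.own_tag = own_tag
        self.locked = True

    def accept_data(self, seen: int) -> bool:
        """Decide whether a datagram with user data should be accepted.

        While locked, only a datagram whose Seen value equals our own tag
        is accepted, and accepting it releases the lock. After that this
        sketch accepts everything; a real endpoint would then filter on
        sequence numbers.
        """
        if not self.locked:
            return True
        if seen == self.own_tag:
            self.locked = False
            return True
        return False
```

An endpoint would create one TagLock per association when it sends its Initiation or Initiation Ack, and consult accept_data() for each incoming datagram that carries user data.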
The upper level protocol or application can predefine a set of default
transmission modes, which will be used by the endpoint for
initialization. However, it should be pointed out that the
transmission modes between two endpoints are allowed to change on a
datagram-by-datagram basis, as illustrated in later sections.
4.2 Multiple Network Addresses
In order to support multiple networks, both endpoints need to have If T1-init timer expires at "A" after the Initiation sent, the same
knowledge of all network addresses available to each other. This Initiation datagram with the same Tag_A value will be retransmitted
information needs to be passed to the other end during the and the timer restarted. This will be repeated Max.Init.Retransmit
initialization. The data field of the initiation and initiation Ack times before "A" considers "Z" unreachable and optionally reports the
datagrams is used for this purpose. failure.
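The T1-init retransmission rule can be sketched as a bounded retry loop. This is only an illustration: the function and its two callbacks are hypothetical, and it assumes that "repeated Max.Init.Retransmit times" means one original transmission followed by up to Max.Init.Retransmit retransmissions of the same Initiation datagram.

```python
def run_initiation(send, wait_for_ack, max_init_retransmit: int) -> bool:
    """Send an Initiation and retransmit it on each T1-init expiry.

    send():          transmits the (unchanged) Initiation datagram.
    wait_for_ack():  blocks for at most one T1-init period and returns True
                     if a matching Initiation Ack arrived, False on expiry.
    Returns True when the peer answered, False when it is considered
    unreachable.
    """
    for _ in range(1 + max_init_retransmit):
        send()
        if wait_for_ack():
            return True
    return False
```

A False result corresponds to the point where the initiating endpoint considers its peer unreachable and may report the failure to the upper layer.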
Depending on the underlying network configuration, the data field will 3.1.1 Choice of Tag Value
be filled in one of the two following ways:
A) If the sending endpoint of the initiation or initiation Ack Tag values should be selected from the range of 0x80000000 to
datagram does not have access to multiple networks, the data field 0xffffffff.
will be set to the pad value of 4 octets of '0's.
B) If the sending endpoint has access to multiple networks (for 3.2 Data Field Format of Initiation Datagrams
example two redundant LANs), the first 4 octets of the data field will
be an unsigned long integer (in network order) specifying how many
networks the endpoint has access to. Following these 4 octets will be
a list of network addresses. Each address begins with a header of 4
octets followed by the actual address. The first 2 octets of the
header is an unsigned integer indicating the size of the actual
address. The next 2 octets of the header is the type of the address.
For an IPv4 address, the address header will have the size set to 8 If redundant networks exist between two endpoints, the data field of
and the type set to AF_INET (2). Of the 8 octets used by the actual the Initiation and Initiation Ack datagrams will carry the redundant
IPv4 address, the first 4 octets will contain the IP address (in network information.
network order) of the path. The next two octets will contain the UDP
port number (in network byte order). The last two octets will be
padded with 0's.
The data field of the initiation or initiation Ack datagram from an The following shows the data field format carrying N IPv4 redundant
endpoint with access to two IPv4 networks would look like the following: network information:
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Number of Networks = 2 | | Number of Networks = N |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Size of address=8 | Type of Address=AF_INET (2)| | Size of address=8 | Type of Address=AF_INET (2)|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| IP Address of Network 1 = 0x88b68108 | | IP Address of Network 1 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Port = 52212 | Padding = 0 | | Port # 1 | Padding = 0 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
/ /
\ ... \
/ /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Size of address=8 | Type of Address=AF_INET (2)| | Size of address=8 | Type of Address=AF_INET (2)|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| IP Address of Network 2 = 0x0a100001 | | IP Address of Network N |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Port = 52212 | Padding = 0 | | Port # N | Padding = 0 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Any data following the initiate network list can be ignored. Implementations Additional implementation-specific data is allowed after the redundant
may optionally use additional data sent in subsequent locations for network information. No user data, however, is allowed to be
implementation specific data exchanges. No user data, however, is allowed transported in Initiation or Initiation Ack datagrams.
to be transported in this datagram.
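The data field layout shown above can be encoded and decoded with a few lines of code. The following is a minimal sketch assuming IPv4 entries only; the function names are hypothetical, and the sample addresses and port in the usage note are taken from the -03 diagram (0x88b68108, 0x0a100001, port 52212).

```python
import socket
import struct

AF_INET_TYPE = 2     # "Type of Address" value for IPv4 in the diagram
IPV4_ENTRY_SIZE = 8  # "Size of address": 4 (IP) + 2 (port) + 2 (padding)


def pack_network_list(networks):
    """Build the data field from a list of (dotted_ip, udp_port) pairs."""
    out = struct.pack("!I", len(networks))  # Number of Networks
    for ip, port in networks:
        out += struct.pack("!HH4sHH", IPV4_ENTRY_SIZE, AF_INET_TYPE,
                           socket.inet_aton(ip), port, 0)
    return out


def unpack_network_list(data):
    """Parse the data field back into (dotted_ip, udp_port) pairs."""
    (count,) = struct.unpack_from("!I", data, 0)
    offset = 4
    networks = []
    for _ in range(count):
        size, addr_type = struct.unpack_from("!HH", data, offset)
        offset += 4
        if addr_type != AF_INET_TYPE or size != IPV4_ENTRY_SIZE:
            raise ValueError("only IPv4 entries are handled in this sketch")
        raw_ip, port = struct.unpack_from("!4sH", data, offset)
        networks.append((socket.inet_ntoa(raw_ip), port))
        offset += size  # skips the two padding octets as well
    return networks
```

Packing [("136.182.129.8", 52212), ("10.16.0.1", 52212)] yields a 28-octet field: 4 octets for the count plus 12 octets per network. Any octets after the last entry are left unparsed, matching the rule that trailing data can be ignored.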
4.3 Initialization Collision
If both endpoints attempt to initialize the communication at about the
same instant, a collision will occur. In a collision each endpoint
will receive an initiation datagram from the other side after it
transmitted its own. Both sides must acknowledge the initiation
datagram in the normal procedure as described in 4.1.
The following is an example of initialization collision:
Endpoint A Endpoint Z
[Header Flags=FIR|RES [Header Flags=FIR|RES
Mode=options Mode=options
Seen=0,Send=Tag_A] --------\ /----- Seen=0, Send=Tag_Z]
(Start T1-init timer) \ / (Start T1-init timer)
/
/ \
/ \
[Header Flags=FIR|RES|ACK <------/ \
Mode=options \---> [Header Flags=FIR|RES|ACK
Seen=Tag_Z,Send=Tag_A]----\ Mode=options
\ /------- Seen=Tag_A,Send=Tag_Z]
\
/ \-------> (Cancel T1-init timer)
(Cancel T1-init timer) <------/
..
[Header Flags=ACK|DAT
Mode=options
Seen=Tag_Z,Send=1] ------------------>
..
[Header Flags=ACK|DAT
Mode=options
<----------------- Seen=Tag_A,Send=1]
4.4 Re-initialization
An endpoint is allowed to re-initialize an established communication.
In the case of re-initialization, the endpoint which initiates the 3.3 Initialization Collision
re-initialization (i.e., the initiator) should use a tag different
from the one used in the previous initialization. The initiator should
follow the standard initialization procedure as stated in 4.1.
Upon the arrival of the initiation datagram, the peer of the initiator If two endpoints attempt to initialize an association with each other
should also follow the procedure stated in 4.1 to respond. Note that at about the same instant, a collision will occur, i.e., each side
any outstanding flows that were open are considered closed once will receive an Initiation datagram from the other side after it
re-initialized. transmitted its own. In such a case, both sides shall acknowledge the
Initiation datagram of the other side in the normal procedure as
described above.
4.5 Link Rotation 3.4 Association Re-initialization
When multiple networks exist between two communicating endpoints, An endpoint shall be allowed to re-initialize an established
every time the application transmits a datagram, the MDTP association with another endpoint.
implementation MUST keep track of which network the transmission was
sent on (if more than one network exists) in the MDTP protocol variable
'last.sent.intf'. If the user does not specifically override rotation,
each send should be rotated in a round robin fashion amongst all In such a case, the endpoint that initiates the re-initialization
available networks and the protocol variable 'last.sent.intf' should (i.e., the initiator) shall use a tag different from the one used in
be updated to indicate which interface was used last. The MDTP the previous initialization. And the initiator shall follow the normal
implementation should consider the rules defined in "5.5 initialization procedure as stated in section 3.1.
Retransmission on Multiple Networks" to determine whether a network is
"available".
The MDTP implementation MUST allow a user to override this rotation, Once it has left the Tag-lock mode of the current association initialization,
defeating MDTP's rotation upon each send. an endpoint shall treat any new incoming Initiation from its peer as a
re-initialization event. Upon the arrival of the new Initiation
datagram from the peer, the receiving endpoint shall also follow the
procedure stated in section 3.1 to respond.
5. Reliable Transfer Mode 4. Reliable Transfer of Datagrams
Reliable transfer mode is indicated if the sending endpoint sets the Reliable transfer is indicated if the datagram being transferred has
GAR option on the current datagram. GAR bit set to 1 and the UNR bit set to 0. The receiver of a
reliable datagram shall always send an acknowledgment to the sender.
If the sending endpoint was previously transmitting in unreliable mode Normally, delayed acknowledgment is used, and the acknowledgment can
(by setting UNR bit in each previous datagram), the receiver must either be sent separately or piggy-backed on a datagram traveling
reset its Seen counter to the Send value of this current datagram in the opposite direction.
upon receiving it.
The following example illustrates both piggy-backed and non-piggy-backed The following example illustrates both separate and piggy-backed
acknowledgments with both ends transmitting in reliable mode: acknowledgments with both ends transmitting in reliable mode:
Endpoint A Endpoint Z Endpoint A Endpoint Z
{App sends 3 messages} {App sends 3 messages}
[Header Flags=DAT|ACK [Header Flags=DAT|ACK|GAR
Mode=GAR
Part=0,Of=1 Part=0,Of=1
Seen=1,Send=1,Size=100]-------------> (Start T2-receive timer) Seen=0,Send=1,Size=100]-------------> (Start T2-receive timer)
(Start T3-send timer) (Start T3-send timer)
[Header Flags=DAT|ACK [Header Flags=DAT|ACK|GAR
Mode=GAR
Part=0,Of=1 Part=0,Of=1
Seen=1,Send=101,Size=100]-----------> Seen=0,Send=2,Size=100]----------->
(Restart T3-send timer) (Restart T3-send timer)
[Header Flags=DAT|ACK [Header Flags=DAT|ACK|GAR
Mode=GAR
Part=0,Of=1 Part=0,Of=1
Seen=1,Send=201,Size=100]-----------> Seen=0,Send=3,Size=100]----------->
(Stop and restart T3-send timer) (Restart T3-send timer)
...
{Timer T2 expires} {Timer T2 expires}
<---------------------------- [Header Flags=ACK /----------- [Header Flags=ACK
Mode=0 / Part=0,Of=0
Part=0,Of=0 / Seen=3,Send=1]
Seen=301,Send=1] /
(cancel T3-send timer) <------
(cancel T3-send timer) ...
.. ...
{App sends 1 message} {App sends 1 message}
[Header Flags=DAT|ACK [Header Flags=DAT|ACK|GAR
Mode=GAR
Part=0,Of=1 Part=0,Of=1
Seen=1,Send=301,Size=100]-----------> (Start T2-receive timer) Seen=1,Send=4,Size=100]-----------> (Start T2-receive timer)
(Start T3-send timer) (Start T3-send timer)
...
{App sends 1 message} {App sends 1 message}
(cancel T2-receive timer) (cancel T2-receive timer)
<---------------------------- [Header Flags=DAT|ACK /----------- [Header Flags=DAT|ACK|GAR
Mode=GAR / Part=0,Of=1
Part=0,Of=1 / Seen=4,Send=1,Size=45]
Seen=401,Send=1,Size=45] / (Start T3-send timer)
(Start T3-send timer) (cancel T3-send timer) <------
(cancel T3-send timer)
(Start T2-receive timer) (Start T2-receive timer)
.. ..
{Timer T2 Expires} {Timer T2 Expires}
[Header Flags=ACK [Header Flags=ACK
Part=0,Of=0 Part=0,Of=0
Seen=46,Send=401]------------------> (cancel T3-send timer) Seen=1,Send=5]------------------> (cancel T3-send timer)
In the above example, the first series of 3 messages of 100 octets each
are sent by Endpoint A. The messages are unbundled in this example,
i.e., each message will be transmitted in a single datagram. Endpoint
A starts its send timer T3 after sending the first datagram, and each
subsequent send will stop and restart the send timer T3, extending the
life of the send timer. Endpoint Z upon receiving the first datagram
starts the receive timer T2. When timer T2 in Endpoint Z expires,
Endpoint Z transmits an Ack. Upon receipt of this Ack by Endpoint A,
it stops timer T3 and discards the first 3 datagrams (held for
possible retransmissions).
After the first three messages were transmitted successfully, the
application at Endpoint A sends another message of 100 octets. After
sending this datagram, Endpoint A starts timer T3 again. Upon
receipt of the datagram, Endpoint Z starts Timer T2. Before
Endpoint Z's T2 timer expires, the application at Endpoint Z sends a
message of 45 octets to Endpoint A. This causes Endpoint Z to cancel
the T2 timer and to piggyback an Ack on the out-bound datagram being
transmitted to Endpoint A. After the transmission, Endpoint Z then
starts its T3 timer. Upon receipt of this datagram Endpoint A
cancels its T3 timer (since all data it has sent is acknowledged), and
starts a receive timer T2. At the expiration of the T2 timer Endpoint
A acks the receipt of the last datagram from Endpoint Z. This Ack
causes Endpoint Z to cancel its T3-send timer.
It is very important to notice in the above example that the
acknowledgments to the received datagrams are always delayed by timer
T2. This delay gives the receiving endpoint a window to piggyback the
Acks onto subsequent datagrams traveling in the opposite direction,
and thus avoid sending the Acks in separate datagrams.
5.1 Timer Control
The basic rules for timer control are as follows:
A) When all outstanding datagrams are acknowledged, the T3-send timer
shall be stopped, if one is running.
B) When a datagram with application data (i.e., with DAT flag set) is
received, the endpoint shall start a T2-receive timer if no timer is
running.
C) Upon the expiration of the T2-receive timer, the endpoint shall
ack to the sender all the un-acked data it has received.
D) When a datagram with application data is sent out, the sending
endpoint shall start a T3-send timer. If the T3-send timer is already
running, the endpoint shall first stop the old T3 timer and then
start a new one. If the T2-receive timer is running, the endpoint
shall first stop the T2 timer, piggyback an Ack onto the out-bound
datagram, and then start a T3-send timer.
E) If the T3-send timer expires, the endpoint shall attempt
re-transmission according to the rules described in 5.5.
F) No more than one timer of any type should be running on an
endpoint at any given moment.
G) When a T2-receive timer expires, any bundled data waiting to be
transmitted should be sent immediately with a piggy-backed Ack to
acknowledge all un-acked data previously received.
H) Whenever a T3-send timer is to be started, any running timer should Note that if the datagrams previously received from the same sending
be stopped and supplanted by the T3-send timer. endpoint were transmitted in Unreliable transfer mode (see Appendix E
for details on Unreliable transfer), the receiving endpoint must
reset its Seen counter to the value of the Send field in the current
reliable datagram.
I) In bundling mode, if the total size of all application messages 4.1 Timer Management Rules
pending to be sent is less than the bundle size, the messages should
be withheld and the T4-bundle timer should be started.
J) If the total size of all application messages pending to be sent The following rules shall be used to manage the timers during
exceeds the bundle size, the T4-bundle timer should be stopped and normal Reliable transfer, unless otherwise stated for some special
the message(s) should be immediately sent. cases:
K) If a T4-bundle timer is running and data arrives, the T2-receive A) When a reliable datagram with user data (i.e., with DAT flag set) is
timer should not be started. received, the endpoint shall start a T2-receive timer if no other
timer is running, and upon the expiration of the T2-receive timer,
the endpoint shall ack to the sender all the un-acked datagrams
it has received.
L) A T4-bundle timer should never be canceled unless it is being B) When a reliable datagram with user data is sent out, the sending
supplanted by a T3-send timer. endpoint shall start a T3-send timer. If the T3-send timer is
already running, the endpoint shall first stop the old T3 timer
and then start a new one. If the T2-receive timer is running, the
endpoint shall first stop the T2 timer, piggyback an Ack onto the
out-bound datagram, and then start a T3-send timer. Upon the
expiration of the T3-send timer, the endpoint shall follow the rules
described in 4.5 for possible re-transmission of the un-acked
datagrams. Whenever the T3-send timer is started the RTT estimate
last calculated for that network should be added to the base
T3-send timer value (if a RTT value is measured, see section 4.6).
M) When the first datagram with the Tag which unlocks the initiation C) When all outstanding datagrams are acknowledged, the T3-send timer
is received, no T2-receive timer should be started, instead an shall be stopped if one is still running.
acknowledgment must be sent without delay.
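The -04 timer management rules A) to C) can be modelled as a small state holder that only records which timers are running. This is a sketch, not a definitive implementation: the class and method names are hypothetical, real timer durations and the RTT adjustment are omitted, and rule A) is read literally, so T2 is not started while any other timer is running.

```python
class ReliableTimers:
    """Records T2-receive / T3-send timer state for one association."""

    def __init__(self):
        self.t2_running = False
        self.t3_running = False

    def on_data_received(self):
        # Rule A: start T2-receive only if no other timer is running.
        if not (self.t2_running or self.t3_running):
            self.t2_running = True

    def on_t2_expired(self) -> str:
        # Rule A: on expiry, ack all un-acked datagrams received so far.
        self.t2_running = False
        return "ACK"

    def on_data_sent(self) -> bool:
        """Rule B: returns True when an Ack must be piggy-backed."""
        piggyback = self.t2_running
        self.t2_running = False   # stop T2 if it was running
        self.t3_running = True    # start, or stop and restart, T3-send
        return piggyback

    def on_all_acked(self):
        # Rule C: stop T3-send once nothing is outstanding.
        self.t3_running = False
```

A caller would invoke on_data_sent() just before building an out-bound datagram and set the ACK flag when it returns True, which reproduces the piggy-backed acknowledgment shown in the examples.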
The following example shows the use of various timers. The following example shows the use of various timers.
Endpoint A Endpoint Z Endpoint A Endpoint Z
{App sends 2 messages} {App sends 2 messages}
[Header Flags=DAT|ACK [Header Flags=DAT|ACK|GAR
Mode=GAR
Part=0,Of=1 Part=0,Of=1
Seen=1,Send=501,Size=100]-----------> (Start T2-receive timer) Seen=1,Send=6,Size=100]-----------> (Start T2-receive timer)
(Start T3-send timer) (Start T3-send timer)
[Header Flags=DAT|ACK [Header Flags=DAT|ACK|GAR
Mode=GAR
Part=0,Of=1 {App sends 1 message} Part=0,Of=1 {App sends 1 message}
Seen=1,Send=601,Size=100]-\ /-- (cancel T2-receive timer) Seen=1,Send=7,Size=100]---\ /--- (cancel T2-receive timer)
(stop and restart T3-send timer) \ / [Header Flags=DAT|ACK (Restart T3-send timer) \ / [Header Flags=DAT|ACK|GAR
\ / Mode=GAR
\ / Part=0,Of=1 \ / Part=0,Of=1
\ Seen=601,Send=1,Size=100] \/ Seen=6,Send=2,Size=100]
/ \ (Start T3-send timer) / \ (Start T3-send timer)
/ \ / \
<----/ \--> <----/ ---->
.. ...
...
{T3-send timer expires} {T3-send timer expires}
[Header Flags=DAT|ACK (re-transmit 2nd datagram)
Mode=GAR [Header Flags=DAT|ACK|GAR
Part=0,Of=1 Part=0,Of=1
Seen=101,Send=601,Size=100]---------> (Cancel T3-send timer) Seen=2,Send=7,Size=100]---------> (Cancel T3-send timer)
(Restart T3-send timer) (Start T2-receive timer) (Restart T3-send timer) (Start T2-receive timer)
.. ..
{Timer T2 expires} {Timer T2 expires}
(Cancel T3-send timer) <-------------- [Header Flags=ACK (Cancel T3-send timer) <-------------- [Header Flags=ACK
Mode=0
Part=0,Of=0 Part=0,Of=0
Seen=701,Send=101] Seen=7,Send=3]
In this example, the application at Endpoint A sends 2 messages to 4.1.1 Link Rotation
Endpoint Z. Both messages are 100 octets in length. Before the second
datagram arrives at Endpoint Z, Endpoint Z's application sends a
message to Endpoint A. This causes Endpoint Z to cancel its T2-receive
timer and piggyback the Ack to the first received datagram on the
out-bound datagram destined to Endpoint A. After transmitting the
datagram Endpoint Z starts its T3-send timer. When the T3-send timer
at Endpoint A expires, it will re-send its earlier datagram. The
retransmitted datagram is the same except for now it acknowledges all
outstanding packets that Endpoint Z has sent. After retransmitting the
datagram Endpoint A restarts its T3-send timer.
The arrival of the retransmitted datagram causes Endpoint Z to cancel When multiple networks exist between two communicating endpoints,
its T3-send timer and discard the duplicate datagram, and it now every time the application transmits a datagram, the MDTP
implementation MUST keep track of which network the transmission was
sent on (if more than one network exists) in the MDTP protocol variable
'last.sent.intf'. If the user does not specifically override rotation,
each send should be rotated in a round robin fashion amongst all
available networks and the protocol variable 'last.sent.intf' should
be updated to indicate which interface was used last.
starts its T2-receive timer. At the expiration of the T2-receive timer The MDTP implementation MUST allow a user to override this rotation
Endpoint Z sends the Ack to Endpoint A. Endpoint A upon receipt of the defeating MDTP's rotation upon each send. The implementation must also
Ack cancels its T3 timer. provide an interface to add and remove a link from rotation eligibility.
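The rotation rule of section 4.1.1 can be sketched in Python as follows. This is only an illustration: the class name LinkRotator, its method names, and the 'override' attribute are assumptions made for the example and are not defined by MDTP. Only the variable 'last.sent.intf' comes from the text.

```python
class LinkRotator:
    """Round-robin choice of the sending network (hypothetical helper)."""

    def __init__(self, networks):
        self.eligible = list(networks)  # networks currently in rotation
        self.last_sent_intf = None      # mirrors 'last.sent.intf'
        self.override = None            # set by the user to defeat rotation

    def add_link(self, net):
        # Make a network eligible for rotation.
        if net not in self.eligible:
            self.eligible.append(net)

    def remove_link(self, net):
        # Take a network out of rotation.
        if net in self.eligible:
            self.eligible.remove(net)

    def next_network(self):
        # Pick the network for the next send and record it.
        if self.override is not None:
            chosen = self.override
        elif not self.eligible:
            raise RuntimeError("no network is eligible for rotation")
        elif self.last_sent_intf in self.eligible:
            i = self.eligible.index(self.last_sent_intf)
            chosen = self.eligible[(i + 1) % len(self.eligible)]
        else:
            chosen = self.eligible[0]
        self.last_sent_intf = chosen
        return chosen
```

Restarting from the first eligible network when the last used one has been removed is a design choice of this sketch; the draft does not say what should happen in that case.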
5.2 Gap Acknowledgments 4.2 Gap Acknowledgment for Missing Datagrams
If a datagram becomes missing during a series of transmissions, a If reliable datagrams become missing during a series of transmissions,
special type of acknowledgment known as the gap Ack will be sent. The a special type of acknowledgment known as the Gap Ack will be sent
gap Ack tells the sender of the missing datagram that retransmission back to inform the sender to re-transmit the missing datagrams.
is needed.
The following example shows the use of gap Ack. The following example shows the use of Gap Ack.
Endpoint A Endpoint Z Endpoint A Endpoint Z
{App sends 3 messages} {App sends 3 messages}
[Header Flags=DAT|ACK [Header Flags=DAT|ACK|GAR
Mode=GAR
Part=0,Of=1 Part=0,Of=1
Seen=146,Send=701,Size=100]--------> (Start T2-receive timer) Seen=3,Send=8,Size=100]-----------> (Start T2-receive timer)
(Start T3-send timer) (Start T3-send timer)
[Header Flags=DAT|ACK [Header Flags=DAT|ACK|GAR
Mode=GAR
Part=0,Of=1 Part=0,Of=1
Seen=146,Send=801,Size=100]-----X (lost) Seen=3,Send=9,Size=100]-----X (lost)
(Restart T3-send timer) (Restart T3-send timer)
[Header Flags=DAT|ACK [Header Flags=DAT|ACK|GAR
Mode=GAR
Part=0,Of=1 Part=0,Of=1
Seen=146,Send=901,Size=100]--------> (A gap detected in data) Seen=3,Send=10,Size=100]-----------> (A gap detected in data)
(Restart T3-send timer) (Restart T3-send timer)
.. ..
{T2-receive timer expires} {T2-receive timer expires}
/------ [Header Flags=ACK /------- [Header Flags=ACK
/ Mode=0 / Seen=9,Send=3,
/ Seen=801,Send=146,
/ Part=1,Of=1 / Part=1,Of=1
/ data=(long integer)901] / data=(long integer)10]
(Prepare retransmit) <--------/ (Prepare retransmit) <--------/
In this example, when Endpoint Z received the third datagram from In this example, when "Z" receives the third datagram from "A" it
Endpoint A it realizes that a gap exists in the received data. At the realizes that a gap exists in the received data. At the expiration of
expiration of T2-receive timer, Endpoint Z sends a gap Ack, in place T2-receive timer, "Z" sends a Gap Ack, in place of a normal Ack, to
of a normal Ack, to Endpoint A to indicate the missing data. "A" to indicate the missing datagram.
In the gap Ack, the Part and Of fields are both set to '1', as opposed
to '0' as in a normal Ack. The data field of the gap Ack is a four (4)
octet long integer containing the sequence number of the last octet of
the gap (which is 901 in this example). The Seen field in the gap Ack
will contain the sequence number of the first octet of the gap.
Using these two values, Endpoint A should be able to calculate the In the Gap Ack, the Part and Of fields are both set to '1', as opposed
position and size of the missing data (which is 801-900 in this to '0' as in a normal Ack. The data field of the Gap Ack is a four (4)
octet long integer containing the sequence number of the next datagram
after the Gap (which is 10 in this example). The Seen field in
the Gap Ack will contain the sequence number of the first missing
datagram of the gap. Using these two values, "A" should be able to
calculate the missing datagram numbers (only 9 in this
example) and thus determine which datagrams will need to be example) and thus determine which datagrams will need to be
retransmitted. retransmitted.
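As an illustration of the calculation described above, the following Python sketch derives the missing sequence numbers from the two values carried in a Gap Ack. The function name and argument names are assumptions of this example, and sequence number roll-over is deliberately ignored.

```python
def missing_from_gap_ack(seen, data):
    """Return the sequence numbers a Gap Ack reports as missing.

    seen: Seen field of the Gap Ack (first sequence number of the gap)
    data: value in the data field (first sequence number after the gap)
    """
    if data <= seen:
        raise ValueError("data field must be greater than Seen")
    return list(range(seen, data))
```

With the values of the example above (Seen=9, data=10) the result is the single datagram 9. The same arithmetic covers the octet-based numbering of the earlier revision, where Seen=801 and data=901 describe octets 801 through 900.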
Gap Acks cannot be piggy-backed with application data. The following is Note that Gap Acks cannot be piggy-backed with user data; if there is
another example of using gap Ack: user data to be sent when a gap is detected, the Gap Ack must be sent
out first before the datagram carrying user data can be sent.
Endpoint A Endpoint Z
{App sends 3 messages}
[Header Flags=DAT|ACK
Mode=GAR
Part=0,Of=1
Seen=146,Send=701,Size=100]--------> (Start T2-receive timer)
(Start T3-send timer)
[Header Flags=DAT|ACK
Mode=GAR
Part=0,Of=1
Seen=146,Send=801,Size=100]-----X (lost)
(Restart T3-send timer)
[Header Flags=DAT|ACK
Mode=GAR
Part=0,Of=1
Seen=146,Send=901,Size=100]--------> (A gap is detected)
(Restart T3-send timer)
..
{App sends a message}
(Cancel T2-receive timer)
/------ [Header Flags=ACK
/ Mode=0
/ Seen=801,Send=146,
/ Part=1,Of=1
/ data=(network long)901]
(Retransmit missing data) <-----/
[Header Flags=DAT|ACK - [Header Flags=DAT|ACK
Mode=GAR / Mode=GAR
Part=0,Of=1 / Part=0,Of=1
Seen=146,Send=801,Size=100]- / Seen=801,Send=146,Size=100]
(Restart T3-send timer) \ / (Start T3-send timer)
\/
/\
<---------/ \
\
\-->
..
{T3-Send timer expires}
(Retransmit app data)
(Cancel T3-send timer) <--------------- [Header Flags=DAT|ACK
(Start T2-receive timer) Mode=GAR
Part=0,Of=1
Seen=1001,Send=146,Size=100]
(Restart T3-send timer)
..
{T2-receive timer expires}
[Header Flags=ACK
Part=0,Of=0
Seen=246,Send=1001]----------------> (Cancel T3-send timer)
In this example, Endpoint Z detected the missing data when it received
the second datagram. However, before the T2-receive timer expired, the
application at Endpoint Z requested to send a message (of 100 octets
in length). This caused Endpoint Z to cancel its T2-receive timer and
send the gap Ack before it sent out the datagram containing the
application message. After transmitting the application message
Endpoint Z started its T3-send timer. When Endpoint Z's T3-send timer
expired it retransmitted the previous datagram and at the same time
acked all of Endpoint A's outstanding datagrams. Upon the receipt of
the retransmission from Endpoint Z, Endpoint A started its own
T2-receive timer. At the expiration of its T2-receive timer Endpoint A
sent an Ack to Endpoint Z and resolved the outstanding datagram at
Endpoint Z.
5.3 Congestion Control 4.3 Flow and Congestion Controls
Three different mechanisms should be used jointly to achieve flow Several different mechanisms shall be used jointly to achieve
and congestion control in MDTP. flow and congestion controls in MDTP.
First, a limit should be set on the number of out-bound messages 4.3.1 Sending with Window Control
queued up at an endpoint. If the limit is reached, new send requests
from the application should be rejected until the number of messages
in the queue drops back.
Secondly, MDTP uses a transmission window to control the number of The sending endpoint shall use a transmission window to control the
outstanding datagrams, i.e., datagrams that have been sent but have yet to number of outstanding datagrams, i.e., datagrams that have been sent
be acknowledged. The length of the window is defined as the maximal but have yet to be acknowledged. The length of the window is defined as the
number of outstanding datagrams a sending endpoint can allow. This maximal number of outstanding datagrams a sending endpoint can
length is adjusted dynamically, depending on the current number of allow. This length is adjusted dynamically, depending on the current
successful transmissions as well as the number of lost datagrams. number of successful transmissions as well as the number of lost
datagrams.
When the number of outstanding datagrams reaches the current window When the number of outstanding datagrams reaches the current window
length, the endpoint may still accept send requests from the length, the endpoint shall still accept send requests from its upper
application, but will transmit no more datagrams until an Ack is layer, but shall transmit no more datagrams until an Ack is received.
received.
Also, when the window length is reached, the next send request from the
application will trigger the sending endpoint to transmit a special Moreover, when the window length is reached, the next send request
Window Up message. Upon receiving this Window Up message the receiver from the upper layer will trigger the sending endpoint to transmit a
must respond with a Window Up Response message, as illustrated by the special Window Up message. Upon receiving this Window Up (WIN|ACK) the
following diagram (assume current window length is 3): receiver must respond with a Window Up Response (WNR|ACK), as
illustrated by the following example (assuming current window length
is 3):
Endpoint A Endpoint Z Endpoint A Endpoint Z
{App sends 3 messages} {App sends 3 messages}
[Header Flags=DAT|ACK [Header Flags=DAT|GAR|ACK
Mode=GAR
Part=0,Of=1 Part=0,Of=1
Seen=146,Send=1001,Size=100]--------> (Start T2-receive timer) Seen=0,Send=11,Size=100]-----------> (Start T2-recv timer)
(Start T3-send timer) (Start T3-send timer)
[Header Flags=DAT|ACK [Header Flags=DAT|GAR|ACK
Mode=GAR
Part=0,Of=1 Part=0,Of=1
Seen=146,Send=1101,Size=100]--------> Seen=0,Send=12,Size=100]----------->
(Restart T3-send timer) (Restart T3-send timer)
[Header Flags=DAT|ACK [Header Flags=DAT|GAR|ACK
Mode=GAR
Part=0,Of=1 Part=0,Of=1
Seen=146,Send=1201,Size=100]--------> Seen=0,Send=13,Size=100]----------->
(Restart T3-send timer) (Restart T3-send timer)
{App sends 1 message} {App sends a new message}
{ queue 100 byte message } (queue new message and send Win Up)
[Header Flags=WIN|ACK [Header Flags=WIN|ACK
Seen=146,Send=1301]-----------------> (cancel T2-receive timer) Seen=0,Send=14]--------------------> (cancel T2-recv timer)
/--- [Header Flags=ACK /----- [Header Flags=WNR|ACK
/ Mode=WNR
/ Part=0,Of=0 / Part=0,Of=0
/ Seen=1301,Send=146] / Seen=14,Send=0]
[Header Flags=DAT|ACK <---------/ [Header Flags=DAT|GAR|ACK <--------/
Mode=GAR
Part=0,Of=1 Part=0,Of=1
Seen=146,Send=1301,Size=100]--------> (Start T2-receive timer) Seen=0,Send=15,Size=100]-----------> (Start T2-recv timer)
(Restart T3-send timer)
In this example, after the transmission of the first three datagrams, In the above example, after the transmission of the first three
Endpoint A reached its window length. The next message from the datagrams, "A" reached its window length. The next message from the
application triggered a Window Up message that was sent to Endpoint user triggered a Window Up that was sent to "Z". The Window Up shall
Z. The Window Up message always contains no data and has its WIN flag contain no user data. In response, "Z" cancelled timer T2 and
set. In response, Endpoint Z cancelled timer T2 and immediately sent immediately sent a Window Up Response. The arrival of this Window Up
an Ack with the WNR set in the Mode field. The arrival of this Ack Response effectively resolved all the outstanding datagrams at "A",
from Endpoint Z effectively resolved all the outstanding datagrams at thus allowing "A" to send out the next datagram.
Endpoint A, thus allowing Endpoint A to send out the next datagram.
The window length is initially set to 2, and is then dynamically 4.3.2 Window Length Adjustment
adjusted based on the performance of the underlying networks.
If the current window length is equal to or greater than 4, every time The window length shall be initially set to 2, and shall then be
dynamically adjusted based on the datagram loss and acknowledgment
conditions of the underlying network.
when 4 consecutive outstanding datagrams are acknowledged at once by When 4 consecutive outstanding datagrams are acknowledged at once by
the receiver, the sender's window length will be raised by 1 until it the receiver, the sender's window length will be raised by 1 until it
reaches 20. reaches the protocol parameter 'Max.Outstanding.dg' (which should be a
user configurable parameter).
If the length is less than 4, every time when the number of If the current window length is less than 4, every time when the
consecutively acknowledged outstanding datagrams is equal to or number of consecutively outstanding datagrams acknowledged in a single
greater than the current window length, the sender's window will be Ack is equal to or greater than the current window length, the
raised by 1 until it reaches 20. sender's window length shall be raised by 1, until it reaches
'Max.Outstanding.dg'.
The sender's window length will be decreased if datagram loss In the following circumstances, the sender's window length shall be
occurs. If between 1 to 3 consecutive datagrams are lost, the window decreased. However, when the window length reaches 2 it shall not be
length will be decreased by 1. If between 4 to 7 datagrams are lost, decreased any further.
the window length will be decreased by 2. If 8 or more datagrams are
lost, the window length will be decreased by 4. When the window length If between 1 to 3 consecutive datagrams are lost, the window length
reaches 2 it will not be decreased any further. will be decreased by 1. If between 4 to 7 datagrams are lost, the
window length will be decreased by 2. If 8 or more datagrams are lost,
the window length will be decreased by 4.
Moreover, any time a Window Up is sent to the receiving endpoint the Moreover, any time a Window Up is sent to the receiving endpoint the
sender's window length will be decreased by 1. Also, if a timeout sender's window length will be decreased by 1. Also, if a timeout
forces a retransmission the sender's window length will be decreased forces a retransmission the sender's window length will be reduced
by 1. Moreover, if a duplicate Ack is received by a sender, this should to half of its current value.
indicate a network congestion situation and the number of outstanding
packets allowed should be decreased by 4.
The following table summarizes these rules: The following table summarizes these rules:
- -----------------------------------------------------------------------
Duplicate Ack received by sender | Adjust down by 4 Duplicate Ack received by sender | Adjust down by 4
- -----------------------------------------------------------------------
8 or more datagrams lost | Adjust down by 4 8 or more datagrams lost | Adjust down by 4
- -----------------------------------------------------------------------
4 to 7 datagrams lost | Adjust down by 2 4 to 7 datagrams lost | Adjust down by 2
- -----------------------------------------------------------------------
1 to 3 datagrams lost | Adjust down by 1 1 to 3 datagrams lost | Adjust down by 1
Timeout forces retransmission | Adjust down by 1 - -----------------------------------------------------------------------
Timeout forces retransmission | Adjust down by 1/2 of the current
| window.
- -----------------------------------------------------------------------
Window Up sent | Adjust down by 1 Window Up sent | Adjust down by 1
- -----------------------------------------------------------------------
4 or more consecutive datagrams | Adjust up by 1 4 or more consecutive datagrams | Adjust up by 1
acknowledged (window length > 4) | acknowledged (window length > 4) |
- -----------------------------------------------------------------------
1/2 Window length or more acked | Adjust up by 1 1/2 Window length or more acked | Adjust up by 1
(window length <=4) | (window length <=4) |
- -----------------------------------------------------------------------
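The window adjustment rules summarized in the table can be sketched in Python as below. This is a minimal illustration under stated assumptions: the class and method names are invented for the example, the default upper bound of 20 is the fixed limit of the earlier revision standing in for 'Max.Outstanding.dg', halving uses integer division, and the loss thresholds follow the prose (1 to 3, 4 to 7, 8 or more).

```python
MIN_WINDOW = 2  # the window length is never decreased below 2


class SendWindow:
    """Sender window length bookkeeping (hypothetical helper)."""

    def __init__(self, max_outstanding=20):
        self.max_outstanding = max_outstanding  # stands in for 'Max.Outstanding.dg'
        self.length = MIN_WINDOW                # initial window length

    def _set(self, value):
        # Clamp the new length between the floor and the configured maximum.
        self.length = max(MIN_WINDOW, min(self.max_outstanding, value))

    def on_ack(self, acked_at_once):
        # Raise by 1 when 4 datagrams are acked at once, or, for a window
        # shorter than 4, when a full window is acked at once.
        if acked_at_once >= min(4, self.length):
            self._set(self.length + 1)

    def on_loss(self, lost):
        # Decrease according to the number of lost datagrams.
        if lost >= 8:
            self._set(self.length - 4)
        elif lost >= 4:
            self._set(self.length - 2)
        elif lost >= 1:
            self._set(self.length - 1)

    def on_window_up_sent(self):
        self._set(self.length - 1)

    def on_duplicate_ack(self):
        self._set(self.length - 4)

    def on_timeout_retransmit(self):
        # Newer revision: reduce to half of the current value.
        self._set(self.length // 2)
```

Routing every change through one clamping helper keeps the floor of 2 and the configured ceiling in a single place, so no individual rule can push the window out of range.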
Finally, the third flow control mechanism is to exchange incoming 4.3.3 Flow Control using In-Queue Information
queue information between the two communicating endpoints. By using the
In Queue field in the MDTP header, the sender can inform the receiver of
the number of pending datagrams which the sender has received but has yet
to deliver to its application. The following example shows how the
endpoints use In Queue value to accomplish flow control.
Assume that Endpoint A sent Endpoint Z 20 datagrams, and when Endpoint By using the In Queue field in the MDTP header, the sender can inform
the receiver of the number of pending datagrams which the sender has
received but has yet to deliver to its application. The following example
shows how the endpoints use In Queue value to accomplish Flow control.
Z acked the receipt of all the 20 datagrams, only the first one of Assume that Endpoint A has sent Endpoint Z 20 datagrams, and when
the 20 datagrams was delivered to the application at Endpoint Z. In Endpoint Z sends an Ack on the reception of these 20 datagrams, only
the last Ack sent by Endpoint Z, the In Queue field would then have a the first one of them has been delivered to the upper layer at
Endpoint Z.
In the Ack sent by Endpoint Z, the In Queue field would then have a
value of 19, indicating the number of datagrams pending for delivery value of 19, indicating the number of datagrams pending for delivery
to its application. This value would be checked by Endpoint A before to its upper layer. This value would be checked by Endpoint A before
it sent the next datagram to Endpoint Z. If this value was found to be it sent the next datagram to Endpoint Z. If this value was found to be
greater than its current window length, Endpoint A would not send the greater than its current window length, Endpoint A would not send the
next datagram. Instead, Endpoint A would start its T3-send timer and next datagram. Instead, Endpoint A would start its T3-send timer and
send a Window Up message to Endpoint Z at the expiration of the timer. send a Window Up message to Endpoint Z at the expiration of the timer.
This would force Endpoint Z to send an Ack with an updated In Queue This would force Endpoint Z to send another Ack with an updated In
value. If the new In Queue value was still greater than its window Queue value. If the new In Queue value was still greater than its
length, Endpoint A would restart its T3-send timer, repeating this window length, Endpoint A would re-start its T3-send timer, and repeat
procedure until the In Queue value of Endpoint Z dropped below the this procedure until the In Queue value of Endpoint Z dropped below
current window length of Endpoint A. Then, the transmission at the current window length of Endpoint A. Then, the transmission at
Endpoint A would resume. Endpoint A would resume.
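The In Queue procedure above can be illustrated with a short Python sketch that counts how many Window Up messages Endpoint A sends before it may resume. The function name and its list argument are assumptions of this example. The sketch treats an In Queue value equal to the window length as acceptable, following the "greater than" test in the text; the draft is not explicit about that boundary.

```python
def window_ups_until_resume(in_queue_reports, window_length):
    """Count the Window Up messages sent before transmission resumes.

    in_queue_reports: In Queue values seen by the sender, starting with
        the one in the original Ack, followed by the value returned in
        response to each Window Up.
    Returns the number of Window Ups sent, or None if the sender is
    still blocked after the last report.
    """
    probes = 0
    for value in in_queue_reports:
        if value <= window_length:
            return probes   # peer has drained enough; resume sending
        probes += 1         # T3-send expires, one more Window Up goes out
    return None
```

For instance, with a window length of 5 and successive In Queue values of 19, 12 and 3, two Window Up messages are sent before transmission resumes.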
5.4 Sequence Number Reset 4.3.4 T3-send Timer Adjustment with RTT
It may become necessary for an endpoint to reset the sequence number If the RTT measurement is available on a specific network, the sender
while it is sending data to a peer. However, the endpoint must inform shall adjust the T3-send timer each time it sends a datagram using
the peer about this event by: this network. The calculation and adjustment of the timer should
follow the method described in [4]. RTT measurement shall be tracked
for each network if redundant networks are in use.
1) sending a Window Up message to force the peer to acknowledge all MDTP defines two optional methods to obtain RTT measurements, see
received datagrams which have not been acknowledged, and sections 4.6 and 4.7.
2) sending the next datagram with RES bit set in the Flags field. 4.4 Sequence Number Reset
3) A sending endpoint should always reset its sequence counter before When the datagram sequence number reaches the value 0x7fffffff, the
the counter reaches 0x7fffffff. When the counter reaches this next sequence number shall be set to 1.
value the sending endpoint is required to reset its sequence
counter.
4) A sending endpoint should never reset its sequence counter until 4.5 Datagram Re-transmission
after reaching 0x7fff05ff.
Note: This section will be obsoleted in a future version of the Whenever a T3-send timer expires, the endpoint shall re-transmit the
draft and be replaced by a deterministic roll-over algorithm. un-acked datagram that has the lowest Send value, unless:
The following example illustrates the sequence number reset procedure A) If the current window length is reached, a Window Up message will
(assume that Endpoint A opts to do a reset when the data sequence be sent out (see 4.3 Congestion Control), or
number becomes greater than 0x7ffff000).
Endpoint A Endpoint Z B) If the current window length is not reached and there is still
user data pending for transmission, a new datagram with user data
shall be sent out and T3-send timer shall be restarted.
{App sends 2 messages} When a T3-send timer is started at a re-transmission, the length of
[Header Flags=DAT|ACK the next T3-send timer for this destination should be doubled and the
Mode=GAR last estimated RTT value for that network should be added to the timer.
Part=0,Of=1
Seen=46,Send=0x7ffff000,Size=100]----> (Start T2-receive timer)
(Start T3-send timer)
(Reset sequence number)
[Header Flags=WIN|ACK
Seen=146,Send=0x7ffff100]------------> (cancel T2-receive timer)
/------- [Header Flags=ACK
/ Mode=WNR
/ Part=0,Of=0
/ Seen=0x7ffff100,Send=46]
(Cancel T3-send timer) <------/
[Header Flags=DAT|ACK|RES
Mode=GAR
Part=0,Of=1
Seen=46,Send=2,Size=100]-------------> (Start T2-receive timer)
(Restart T3-send timer)
.. 4.5.1 Re-transmission on Redundant networks
{App sends 1 message}
(cancel T2-receive timer)
(Cancel T3-send timer) <---------------- [Header Flags=DAT|ACK
(Start T2-receive timer) Mode=GAR
Part=0,Of=1
Seen=102,Send=46,Size=100]
(Start T3-send timer)
In the above example, after transmitting the first datagram Endpoint A When redundant networks exist between two communicating endpoints, the
determines that its data sequence number needs to be reset before it re-transmission shall be attempted on the network specified in the
transmits the next datagram. It first sends out a Window Up message to MDTP protocol variable 'last.good.intf'. The value of 'last.good.intf'
force Endpoint Z to send back a Window Up Response to ack all the is always updated to refer to the network on which the last datagram
outstanding received data. Then, it transmits the datagram from the peer endpoint arrived.
it has been withholding, with the new sequence number and the RES flag
set. Upon detecting the RES flag in the header of the incoming datagram,
Endpoint Z resets its data sequence counter on Endpoint A.
5.5 Retransmission on Multiple Networks Moreover, the number of consecutive re-transmissions is also recorded
in a variable 'retran.count' for each network. Every time a datagram
is received on a network, the corresponding 'retran.count' shall be
reset to 0.
Whenever a T3-send timer expires, the endpoint will take one of the If the value in the 'retran.count' of the current network exceeds
following three actions: half of the value of the protocol parameter 'Max.Retransmit', the
'last.good.intf' will be changed, so as to force the next
re-transmission to be directed to an alternate network and
optionally report a failure condition.
A) If the current window length is not reached (see 5.3) and there is The total number of consecutive re-transmissions across all the
application data pending, a new datagram will be sent out. networks in an association is also recorded. If this value exceeds the
limit defined by 'Max.Retransmit', the sending endpoint shall consider
the peer endpoint unreachable and stop transmitting data to it, and
optionally report the failure.
B) If the current window length is reached, a Window Up message will 4.6 RTT Measurement
be sent out.
C) If the window length is not reached, but there is no pending This defines the mechanism for round-trip-time (RTT) measurement in
application data to send, The datagram with the lowest Send value MDTP.
that is still outstanding (i.e., not been acked) will be
retransmitted.
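The choice an endpoint makes when its T3-send timer expires (send a Window Up, send new data, or retransmit the outstanding datagram with the lowest Send value) can be sketched in Python as follows. The function name, the returned action labels, and the "IDLE" case for an endpoint with nothing outstanding are assumptions of this example. Sequence number roll-over is ignored when picking the lowest Send value.

```python
def on_t3_expiry(outstanding, window_length, pending_user_data):
    """Decide what to do when the T3-send timer expires.

    outstanding: Send values of datagrams sent but not yet acked
    window_length: current window length of the sender
    pending_user_data: True if user data is waiting to be sent
    Returns a tuple (action, sequence_number_or_None).
    """
    if len(outstanding) >= window_length:
        return ("WINDOW_UP", None)        # window reached: probe the peer
    if pending_user_data:
        return ("SEND_NEW", None)         # room in the window: send new data
    if outstanding:
        return ("RETRANSMIT", min(outstanding))
    return ("IDLE", None)                 # nothing outstanding (assumed case)
```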
When multiple networks exist between two communicating endpoints, the On occasions either side of an association may need to perform an RTT
re-transmission should be attempted on the network specified measurement of the network (or one of the redundant networks) between
in the MDTP protocol variable 'last.good.intf'. The value of them.
'last.good.intf' is always updated to refer to the network on which
the last datagram from the peer endpoint arrived.
Moreover, the number of consecutive re-transmissions is also recorded 4.6.1 RTT Datagram Header Format
in a variable 'retran.count' for each network. Every time a datagram is
received from a network, the corresponding retran.count is reset to '0'.
If the value in the retran.count of the current network exceeds a half The following shows the header format an endpoint shall use for RTT
of the value of the protocol parameter 'Max.Retransmit', the measurement:
'last.good.intf' will be changed, so as to force the next
re-transmission to be directed to an alternate network.
The total number of consecutive re-transmissions across all the MDTP Header Format - RTT measurement
networks is also recorded. If this value exceeds the limit defined by
'Max.Retransmit', the sending endpoint should consider the peer
endpoint unreachable and stop transmitting data to it, and optionally
report the failure.
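The bookkeeping for re-transmission over redundant networks can be sketched in Python as below. The variables 'last.good.intf', 'retran.count' and 'Max.Retransmit' come from the text; everything else is an assumption of this example, in particular the class name, the choice of the next network in list order as the "alternate" network, and the resetting of the association-wide counter whenever any datagram arrives.

```python
class RetransmitTracker:
    """Per-network re-transmission counters (hypothetical helper)."""

    def __init__(self, networks, max_retransmit):
        self.networks = list(networks)
        self.max_retransmit = max_retransmit            # 'Max.Retransmit'
        self.retran_count = {n: 0 for n in self.networks}
        self.total = 0                                  # across all networks
        self.last_good_intf = self.networks[0]          # 'last.good.intf'
        self.peer_unreachable = False

    def on_datagram_received(self, net):
        # A datagram from the peer arrived on 'net'.
        self.retran_count[net] = 0
        self.total = 0              # assumption: any arrival ends the run
        self.last_good_intf = net

    def on_retransmit(self):
        """Record one re-transmission.

        Returns the network to use, or None once the peer is
        considered unreachable.
        """
        net = self.last_good_intf
        self.retran_count[net] += 1
        self.total += 1
        if self.total > self.max_retransmit:
            self.peer_unreachable = True
            return None
        if self.retran_count[net] > self.max_retransmit / 2:
            # Direct the next re-transmission to an alternate network.
            i = self.networks.index(net)
            self.last_good_intf = self.networks[(i + 1) % len(self.networks)]
        return net
```

With two networks and 'Max.Retransmit' set to 4, three consecutive re-transmissions go out on the first network, the fourth is directed to the second network, and the fifth attempt marks the peer unreachable.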
5.5.1 Randomization of the T3-send timer at retransmission 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| MDTP Protocol Identifier |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Version | Flags | In Queue |
| |N N W I F R D A M S W R R F G U| |
| |O O I S I T A C U H N E T L A N| |
| |M B N B R M T K L U R 1 C O R R| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Acknowledgment Number (Seen) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sequence Number (Send) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data Size | Part | Of |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Transparent Time Int-1 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Transparent Time Int-2 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
When a T3-send timer is started after retransmitting a packet, the Two long integers are used in the data field to carry the time value.
value of the next T3-send timer for this destination should be The RTT datagram is identified by setting the RTC or RTM bit to 1.
extended by a random amount. The amount must be bounded so that the
application can predict with some reasonable degree of precision when
the destination endpoint is declared unreachable.
For performance considerations, this can be implemented by 4.6.2 Measure RTT
pre-calculating a set of random values and then using a different
value to extend the T3-send timer for each re-transmission to the
same destination endpoint.
5.6 Termination of an Endpoint At the request of its upper layer, an endpoint shall initiate an RTT
measurement by sending an RTT datagram with GAR, ACK, and RTC bits set
to 1 (to a specific network if redundant networks exist). No
user data shall be carried. The sender shall also place in Time Int-1
and Time Int-2 the value of the current time of day in seconds and
microseconds.
When an endpoint terminates, it should send a shutdown message Upon the reception of this RTT datagram, the recipient shall
to each of the peer endpoints it has ever initiated for a immediately return the datagram to the sender (over the same network
communication. The shutdown message is sent in unreliable transfer on which the datagram arrives if redundant networks exist), with the
mode and need not to be acknowledged. When an endpoint receives a RTM and ACK bits set to 1.
shutdown message from its peer, it will remove the sender from its
record, and optionally report the termination of that peer.
The following sequence shows an example of the termination of an Upon the reception of this reply, the sender shall use the Time Int-1
endpoint (Endpoint A). and Time Int-2 in the reply datagram to calculate the RTT (of the
specific network if redundant networks exist).
Endpoint A Endpoint A Endpoint Z
RTT - Request Now=x.y
[Header Flags=ACK|GAR|RTC
Part=0,Of=1
Seen=1,Send=31,Size=0
Time-Int1=x
Time-Int2=y] ----------------------->
------ [Header Flags=ACK|RTM
/ Part=0,Of=0
/ Seen=31,Send=1
/ Time-Int1=x
/ Time-Int2=y]
/
(Endpoint A subtracts <-----------
x.y from the current time
to calculate RTT)
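The arithmetic behind the RTT exchange above can be sketched in Python as follows. The function names are assumptions of this example, and so is the use of a nanosecond clock reading as input; the draft only states that Time Int-1 and Time Int-2 carry the time of day in seconds and microseconds.

```python
def split_time(now_ns):
    """Split a clock reading in nanoseconds into (seconds, microseconds),
    the two values placed in Time Int-1 and Time Int-2."""
    sec, rem_ns = divmod(now_ns, 1_000_000_000)
    return sec, rem_ns // 1000


def rtt_microseconds(sent_sec, sent_usec, now_sec, now_usec):
    """RTT in microseconds: current time minus the echoed send time."""
    return (now_sec - sent_sec) * 1_000_000 + (now_usec - sent_usec)
```

A sender would call split_time(time.time_ns()) when building the request and again when the reply arrives, then pass the echoed pair and the fresh pair to rtt_microseconds. Working in integer microseconds avoids floating point rounding when the microsecond part of the reply time is smaller than that of the send time.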
{App indicates termination} 4.7 Link Heart Beat
[Header Flags=FIR
Mode=SHU
Seen=146,Send=1301,------------------------> to Endpoint X
[Header Flags=FIR This defines the mechanism for activating and transmitting of link
Mode=SHU heart beats in MDTP.
Seen=1496,Send=101,------------------------> to Endpoint Y
[Header Flags=FIR At request by its upper layer, an endpoint shall enable heart beat on
Mode=SHU a specific peer with which it has an established association in the
Seen=1460,Send=201-------------------------> to Endpoint Z Reliable transfer mode.
As shown in this example, the shutdown message is indicated by having The RTT datagram defined in section 4.6.1 shall be used as the Heart
both FIR flag and SHU mode bit set. Also, notice that no Beat.
acknowledgment is sent back by Endpoint X, Y, or Z.
5.7 Endpoint Drain After having heart beat enabled, the endpoint shall transmit a Heart
Beat to that specific peer and start a T5-heartBeat timer. The peer
shall immediately respond to the Heart Beat in the same manner as an
RTT as described in section 4.6. This response shall be stored by the
first endpoint (also can be used to update its RTT measurement).
An endpoint may decide to "drain" a connection without completely When the T5-heartBeat timer expires, the endpoint shall first check if
shutting it down. By draining a connection, both endpoints will remove the previous heart beat has been responded to (on the same network it was
any record and pending datagrams associated with the connection. sent in the case of redundant networks). If not, the network that the
Further communications between the two endpoints can be resumed by last Heart Beat was sent upon shall be counted as a transmission
going through a re-initialization procedure. failure, and be handled following the rules described in section 4.5.
Then, the endpoint shall send another Heart Beat and re-start the
T5-heartBeat timer.
A "drain" message is specified with the UNR bit set in a shutdown In the case where redundant networks exist, the sending of Heart beats
message. No Ack is required for a "drain" message. shall follow the link rotation rules outlined in section 4.1.1.
The following sequence shows an example. If, before the expiration of T5-heartBeat timer, a datagram is
transmitted or received by the endpoint, the T5-heartBeat timer shall
be stopped and the appropriate T2-T4 timer shall be started. In other
words, the T5-heartBeat timer has the lowest precedence.
Endpoint A When there is no datagram to send and no other timers are running, the
T5-heartBeat timer shall be started and the above procedure shall
continue.
{App indicates termination} The suggested interval for T5-heartBeat timer is 4000 ms.
[Header Flags=FIR|UNR
Mode=SHU
Seen=146,Send=1301]------------------------> to Endpoint X
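The Heart Beat procedure above can be sketched as a small state
holder that the caller drives from its own timer loop. This is a
minimal sketch, not part of the protocol definition: the class and
method names are hypothetical, link rotation is assumed to be simple
round-robin, and the transmission-failure handling of section 4.5 is
reduced to a per-network counter.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

T5_HEARTBEAT_MS = 4000  # suggested T5-heartBeat interval from the text


@dataclass
class HeartBeatMonitor:
    """Tracks Heart Beats toward one peer (hypothetical helper)."""
    networks: List[str]                    # rotation order (assumed round-robin)
    next_index: int = 0
    pending_network: Optional[str] = None  # network of the unanswered Heart Beat
    failures: Dict[str, int] = field(default_factory=dict)

    def send_heart_beat(self) -> str:
        """Pick the next network in rotation and mark a Heart Beat pending."""
        network = self.networks[self.next_index % len(self.networks)]
        self.next_index += 1
        self.pending_network = network
        return network

    def on_response(self, network: str) -> None:
        """Only a response on the same network clears the pending Heart Beat."""
        if network == self.pending_network:
            self.pending_network = None

    def on_t5_expired(self) -> str:
        """Count an unanswered Heart Beat as a failure, then send another."""
        if self.pending_network is not None:
            count = self.failures.get(self.pending_network, 0)
            self.failures[self.pending_network] = count + 1
        return self.send_heart_beat()
```

The caller would stop its T5 timer whenever a datagram is sent or
received and restart it once no other timer is running, since T5 has
the lowest precedence.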
5.8 Advisory Acknowledgments. 4.8 Advisory Acknowledgment
To increase bandwidth utilization a sending endpoint may (at its option) This defines the mechanism for sending and handling of the Advisory
request an advisory acknowledgment. A endpoint would typically do this Acknowledgments in MDTP.
when 1/2 of its window is unacknowledged and upon its last datagram
that will fill its window. Upon reception of a advisory Acknowledgment An endpoint may use Advisory Acks to increase bandwidth utilization
request the receiver shall with no delay transmit an acknowledgment of when transmitting over a reliable association.
all received packets canceling any T2-Receive timer that may be running.
The sequence would look as follows: An Advisory Ack shall be indicated by setting RE1 flag to 1 in the
datagram.
The endpoint shall send an Advisory Ack to its peer when it reaches
half of its current window length, and also when it detects that the
next send will reach the full window length.
Upon the reception of an Advisory Ack, the peer endpoint shall
immediately acknowledge all the datagrams it has received but not yet
acknowledged, and then cancel the T2-recv timer if one is still
running.
The following shows an example of using Advisory Ack:
Endpoint A Endpoint Z Endpoint A Endpoint Z
{App sends 3 messages} {App sends 3 messages}
[Header Flags=DAT|ACK [Header Flags=DAT|GAR|ACK
Mode=GAR
Part=0,Of=1 Part=0,Of=1
Seen=1,Send=1,Size=100]-------------> (Start T2-receive timer) Seen=0,Send=1,Size=100]-------------> (Start T2-recv timer)
(Start T3-send timer) (Start T3-send timer)
[Header Flags=DAT|ACK [Header Flags=DAT|GAR|ACK
Mode=GAR
Part=0,Of=1 Part=0,Of=1
Seen=1,Send=101,Size=100]-----------> Seen=0,Send=2,Size=100]----------->
(Restart T3-send timer) (Restart T3-send timer)
{detects window half full, use Advisory Ack}
[Header Flags=DAT|ACK [Header Flags=DAT|GAR|ACK|RE1
Mode=GAR|RE1
Part=0,Of=1 Part=0,Of=1
Seen=1,Send=201,Size=100]-----------> Seen=0,Send=3,Size=100]------\
(Stop and restart T3-send timer) (Stop and restart T3-send timer) \
\----> (cancel T2-receive timer)
(cancel T2-receive timer) <---------------------- [Header Flags=ACK
<---------------------------- [Header Flags=ACK
Mode=0
Part=0,Of=0 Part=0,Of=0
Seen=301,Send=1] Seen=3,Send=1]
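The rule for when a sender sets RE1 can be sketched as a single
predicate. This is an illustrative sketch under stated assumptions:
the window is counted in datagrams, "half" is rounded up for odd
window sizes, and the function name is hypothetical.

```python
def wants_advisory_ack(outstanding_before: int, window: int) -> bool:
    """Return True if the datagram about to be sent should carry RE1.

    outstanding_before: unacknowledged datagrams before this send.
    window: current window length in datagrams (assumed unit).
    """
    after = outstanding_before + 1
    half = (window + 1) // 2  # assumption: round half up for odd windows
    # RE1 is set when this send reaches half the window, and again when
    # it fills the window.
    return after == half or after == window
```

With a window of 20 datagrams and no acknowledgments arriving, this
predicate marks the 10th and the 20th datagram.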
5.9 RTT Measurement 4.9 Termination of an Association
On occasion either end may wish to do a Round Trip Time measurement of When an endpoint terminates, it shall send a Shutdown datagram
a network. There are two methods of measuring Round Trip Time. (FIR|SHU) to each of the peer endpoints in all its existing
Method 1 involves a ping-pong using a special ACK, Method 2 involves a associations. The Shutdown datagram itself is sent in unreliable
rider on top of a datagram. If Method 2 is invoked then the Round Trip transfer mode and thus need not be acknowledged.
Time includes the T2-Receive timer (this actually may be more useful
than pure RTT time since each endpoint may have a different T2-Receive
timer value).
Method 1: When a endpoint wishes a RTT measurement it shall send a ACK When a peer endpoint receives the Shutdown, it will remove the sender
datagram with RE2 set to 1, GAR set to 1 and DAT set to 0. The sender from its record, and optionally report the termination of the sender
should place in Time Int 1 and Time int 2 the value of the current to the upper layer.
time of day in seconds/microseconds.
Upon receipt of a datagram with RE2 set to 1, GAR set to 1 and DAT set The following shows an example of the termination of Endpoint A:
to 0, the recipient should return the datagram to the sender over the
arriving network with the NOG bit set. The sender can then use the
Time Int 1 and Time Int 2 to calculate the current RTT.
Endpoint A Endpoint Z Endpoint A
RTT - Request Now=x.y {App indicates termination}
[Header Flags=ACK [Header Flags=FIR|SHU
Mode=GAR|RE2 Seen=3,Send=14, ------------------------> to Endpoint X
Part=0,Of=1
Seen=1,Send=301,Size=0
Time-Int1=x
Time-Int2=y]------------->
<---------------------------- [Header Flags=ACK|NOG
Mode=0
Part=0,Of=0
Seen=301,Send=1
Time-Int1=x
Time-Int2=y]
Endpoint A uses [Header Flags=FIR|SHU
current time subtracted from Seen=1496,Send=101,------------------------> to Endpoint Y
X.y (in arriving Datagram) to
calculate the RTT.
Method 2: [Header Flags=FIR|SHU
Seen=14,Send=2 -------------------------> to Endpoint Z
If a endpoint wishes to piggyback a RTT test including the T2-Timer at 4.10 Draining of an Association
the remote endpoint the sending endpoint fills out the datagram in the
normal way for reliable communication but also sets the RE2 flag, and
places at the end of the datagram (outside the length of the data) two
long integers as a trailer.
When the receiving endpoint recognizes the RE2 flag, it should extract An endpoint in an association may decide to "drain" the association
the two integers and place them in internal storage until the next without completely shutting it down. By draining an association, both
datagram is scheduled to be returned (i.e. at the expiration of the endpoints will remove any record and pending datagrams associated with
T2-Recv timer). If the T2-Recv timer expires the receiving the association. Further communications between the two endpoints can
endpoint should send the acknowledgment as above with the addition of be resumed by going through a re-initialization procedure (see
the NOB flag as well. If the receiving endpoints upper layer sends a section 3).
datagram causing the T2-Recv timer to be canceled then the datagram
should include the Trailing integers and have the NOB flag set. In
cases where a intervening Window UP is received the receiving endpoint
should respond with a window Up Response (per the window up procedure)
but NOT cancel its T2-Recv timer.
Example 1 - T2-Recv timer expires In such a case, a Drain datagram (FIR|SHU|UNR) is sent to the peer
endpoint of the association, and no Ack is required.
Endpoint A Endpoint Z The following sequence shows an example of Draining:
RTT - Request Now=x.y
[Header Flags=ACK|DAT
Mode=GAR|RE2
Part=0,Of=1
Seen=1,Send=301,Size=100
{data of 100 octets}
Time-Int1=x
Time-Int2=y]-------------> (started T2-Recv)
{T2-Recv Expires }
<---------------------------- [Header Flags=ACK|NOG|NOB
Mode=0
Part=0,Of=0
Seen=301,Send=1
Time-Int1=x
Time-Int2=y]
Example 2 - Datagram causes T2-Recv timer cancel
Endpoint A Endpoint Z Endpoint A
RTT - Request Now=x.y {App indicates draining}
[Header Flags=ACK|DAT [Header Flags=FIR|SHU|UNR
Mode=GAR|RE2 Seen=146,Send=1301]------------------------> to Endpoint X
Part=0,Of=1
Seen=1,Send=301,Size=100
{data of 100 octets}
Time-Int1=x
Time-Int2=y]-------------> (started T2-Recv)
{datagram sent by application}
(cancel T2-Recv)
<---------------------------- [Header Flags=DAT|ACK|NOG
Mode=GAR
Part=0,Of=1,Size=100
Seen=401,Send=1
{data of 100 octets}
Time-Int1=x
Time-Int2=y]
5.10 Heart Beat Ack 5. Interface with upper level protocols
At request by the application, the user may wish a Heart Beat The upper layer protocols (ULP) shall request services by passing
acknowledgment sent. The Heart Beat should only be allowed to be primitives to MDTP and shall receive notifications from MDTP for
enabled if the senders Mod is Gar (reliable delivery) and version is various events.
2. Once enabled when no datagrams are being transmitted, a T5-Heart
Beat timer should be started. When the T5 timer expires a ACK should
be sent using the next available link, following the link rotation
procedure outlined in "4.5 Link Rotation". After sending the Ack
another T5-Heart Beat timer should be started. If, before the
expiration of T5-Heart Beat, a datagram is transmitted or received,
the T5 timer should be stopped and the appropriate T2-T4 timer should The primitives and notifications described in this section should be
be started. The T5 timer has the lowest precedence of all timers. used as a guideline for implementing MDTP.
When sending a Heart Beat Ack, the format should be that of a RTT time A) Init.MDTP primitive
test. This will require the receiver to respond on the network. If
the sender does not get a response on the network the heartbeat
arrived on by the time a next heartbeat is to be sent, then the
network that the last heartbeat was sent upon should be counted as a
transmission failure has described in section "5.5 Retransmission on
Multiple Networks", and should counted against the 'retran.count' and
protocol parameter 'Max.Retransmit'.
6. Unreliable Transfer Mode This primitive allows MDTP to initialize its internal data structures
and allocate necessary resources for setting up its operation
environment. Note that once MDTP is initialized, ULP can communicate
directly with any other endpoints without re-invoking this primitive.
The unreliable transfer mode allows two endpoints to send to each Mandatory attributes:
other without acknowledging the receiving. This can usually achieve
higher data throughput than the reliable transfer mode. To indicate the
unreliable transfer mode the sender of a datagram simply sets the UNR
in the mode field. The following sequence illustrates unreliable data
transfer.
Endpoint A Endpoint Z None.
{App sends 2 messages}
[Header Flags=DAT|ACK
Mode=UNR
Part=0,Of=1
Seen=1,Send=11001,Size=100]-------->
[Header Flags=DAT|ACK Optional attributes:
Mode=UNR
Part=0,Of=1
Seen=1,Send=11101,Size=100]-------->
{App sends 1 message} The following types of attributes may be passed along with
<------- [Header Flags=DAT|ACK the primitive:
Mode=UNR
Part=0,Of=1
Seen=11201,Send=1,Size=450]
{App sends 2 more messages} o Timer selection and its operation syntax -- to indicate to MDTP
[Header Flags=DAT|ACK an alternative timer the MDTP should use for its operation.
Mode=UNR o Initial MDTP operation mode;
Part=0,Of=1 o IP port number, if ULP wants it to be specified;
Seen=451,Send=11201,Size=100]------>
[Header Flags=DAT|ACK B) Send.Data primitive
Mode=UNR
Part=0,Of=1
Seen=451,Send=11301,Size=100]------>
Note that no timers are started by either end. Also note that even This is the main method to send datagrams via MDTP.
though both ends are in UNR mode, the ACK flag is still set by the Mandatory attributes:
sender of the datagram. This means that the Seen field in the datagram
header is still valid to indicating the sequence number of the last
octet received by the sender. However, the sender makes no claim as to
whether pieces of data are missing. The upper application can use this
information to help detecting missing or duplicated pieces. In
unreliable mode, MDTP makes no effort to re-transmit missing data or
to screen out duplicated datagrams.
6.1 Ordered reception o data - This is the payload ULP wants to transmit;
o size - The size of the payload in number of octets;
o to-address - The IP address and port number of the intended
receiver. In case of redundant networks, to-address can be any one
of the multiple IP addresses of the receiver. The network through which
the datagram is actually sent will be determined by MDTP according to
link rotation, unless the current mode prohibits MDTP link
rotation; in such a case the datagram will be sent through the network
specified by to-address (see section 4.5).
In unreliable transfer if the sender sets the RE1 bit the receiver Optional attributes:
should order the datagrams upon arrival. Any datagrams that have not
been read by the receivers application should be ordered so that the
datagrams will be received in order the datagrams were transmitted
(using the sendStartsAt field). If a datagram arrives after a
new datagram then the datagram should be discarded. The sequence would
look as follows:
Endpoint A Endpoint Z o mode-flags - This indicates a new MDTP operation mode, taking effect
{App sends 4 messages} immediately including the current datagram send;
[Header Flags=DAT|ACK
Mode=UNR|RE1
Part=0,Of=1
Seen=1,Send=11001,Size=100]-------->
[Header Flags=DAT|ACK o context - optional information that will be carried in the
Mode=UNR|RE1 Send.Failure notification to the ULP if the transportation of
Part=0,Of=1 this datagram fails.
Seen=1,Send=11101,Size=100]\ /-->
\ /
\ / (User reads/Receives all
[Header Flags=DAT|ACK \ / datagrams 11001 & 11201)
Mode=UNR|RE1 \
Part=0,Of=1 / \
Seen=451,Send=11201,Size=100]/ \---> { Datagram is discarded }
[Header Flags=DAT|ACK C) Receive.Data primitive
Mode=UNR|RE1
Part=0,Of=1
Seen=1,Send=11301,Size=100]\ /-->
\ /
\ /
[Header Flags=DAT|ACK \ /
Mode=UNR|RE1 \
Part=0,Of=1 / \
Seen=451,Send=11401,Size=100]/ \--->(User reads/Receives all
datagrams in order
11301 & 11401)
7. Reliable flows This primitive shall return the first datagram in the MDTP in-queue to
ULP, if there is one available. It may, depending on the specific
implementation, also return other information such as the sender's
address, whether there are more datagrams available for retrieval,
etc. The behavior is undefined if no datagram is available when this
primitive is invoked.
A flow is a ordered reliable sequence of datagrams that is delivered Mandatory attributes:
to the receiver in order without constraint to other flows. There is a
set way to initiate (open) a flow and close a flow. Each flow is o buffer - the memory location indicated by the ULP to store the
initiated by the sender. Multiple flows may be initiated between two received datagram and other information.
endpoints at the same time. Once initiated a flow will follow the same
retransmission and link rotation schema's has the rest of MDTP. However Optional attributes:
each flow is independent of any other flow, so if datagram 1 and 2 of
flow 5 arrives, but datagram 1 of flow 4 is lost (having been sent None.
ahead of flow 5's datagrams), flow 5's datagrams are delivered to the
application without blocking for retransmission of the lost datagram D) Data.Arrive notification
from flow 4 (datagram 1 of flow 4). All flow related datagrams will
have the NOB bit set. Each flow will also have a separate timer MDTP shall invoke this notification on the ULP when a datagram is
associated with it that is unique and different from any non-flow successfully received and ready for retrieval.
related timers that are running. The Seen and Send fields will be
broken down and interpreted in the following manner. E) Send.Failure notification
If a datagram cannot be delivered, MDTP shall invoke this notification
on the ULP.
The following may be optionally passed with the notification:
o data - the location where the ULP can find the undelivered datagram.
o context - optional information associated with this datagram (see
13.2).
F) Link.Status.Change notification
When a link is marked down (e.g., when MDTP detects a link failure),
or marked up (e.g., when MDTP detects a link recovery), MDTP shall
invoke this notification on the ULP.
The following shall be passed with the notification:
o link-address - This indicates the IP address of the affected link;
o new-status - This indicates the new status of the link;
G) Communication.Up notification
This notification is used when MDTP becomes ready to send or receive
datagrams, or when a lost communication to an endpoint is restored.
The following shall be passed with the notification:
o status - This indicates what type of event has occurred;
o endpoint-id - The IP address and port number to identify the
endpoint;
H) Communication.Lost notification
When MDTP loses communication to an endpoint completely or detects
that the endpoint has performed a shut-down operation, it shall invoke
this notification on the ULP.
The following shall be passed with the notification:
o status - This indicates what type of event has occurred;
o endpoint-id - The IP address and port number to identify the
endpoint;
o packets-enqueue - The number and location of un-sent datagrams
still held by MDTP;
o last-acked - the sequence number last acked by that peer endpoint;
o last-sent - the sequence number last sent to that peer endpoint;
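The attributes of the Link.Status.Change, Communication.Up, and
Communication.Lost notifications can be sketched as plain records
handed to the ULP. This is only an illustration: the Python names,
the "up"/"down" status strings, and the UlpView bookkeeping class are
assumptions, not part of MDTP.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

Endpoint = Tuple[str, int]  # (IP address, port number)


@dataclass
class LinkStatusChange:
    link_address: str  # IP address of the affected link
    new_status: str    # "up" or "down" (value names are assumptions)


@dataclass
class CommunicationUp:
    status: str
    endpoint_id: Endpoint


@dataclass
class CommunicationLost:
    status: str
    endpoint_id: Endpoint
    packets_enqueue: List[bytes]  # un-sent datagrams still held by MDTP
    last_acked: int
    last_sent: int


@dataclass
class UlpView:
    """Hypothetical ULP-side bookkeeping driven by MDTP notifications."""
    reachable: Set[Endpoint] = field(default_factory=set)
    links_down: Set[str] = field(default_factory=set)
    unsent: List[bytes] = field(default_factory=list)

    def notify(self, event) -> None:
        if isinstance(event, LinkStatusChange):
            if event.new_status == "down":
                self.links_down.add(event.link_address)
            else:
                self.links_down.discard(event.link_address)
        elif isinstance(event, CommunicationUp):
            self.reachable.add(event.endpoint_id)
        elif isinstance(event, CommunicationLost):
            self.reachable.discard(event.endpoint_id)
            # Keep the un-sent datagrams so the ULP can decide what to do.
            self.unsent.extend(event.packets_enqueue)
        else:
            raise TypeError(f"unknown notification: {event!r}")
```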
I) Change.Link.Rotation primitive
When the upper layer wants to inform MDTP to make a specific network
eligible or ineligible for link rotation, the upper layer will send
this primitive to MDTP.
Mandatory attributes:
o action - This indicates if the network is to be made eligible or
ineligible for link rotation.
o network-id - This is the IP address and port of the network to be
added or removed from link rotation consideration.
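The effect of Change.Link.Rotation can be sketched as a rotation set
that skips ineligible networks. This is a sketch under stated
assumptions: rotation is taken to be round-robin, the action values
"eligible" and "ineligible" are illustrative strings, and the class
name is hypothetical.

```python
class LinkRotation:
    """Round-robin selection over the networks currently eligible."""

    def __init__(self, networks):
        self._networks = list(networks)
        self._eligible = set(self._networks)
        self._cursor = 0

    def change(self, action, network_id):
        """Apply a Change.Link.Rotation request from the upper layer."""
        if network_id not in self._networks:
            raise KeyError(network_id)
        if action == "eligible":
            self._eligible.add(network_id)
        elif action == "ineligible":
            self._eligible.discard(network_id)
        else:
            raise ValueError(f"unknown action: {action}")

    def next_network(self):
        """Return the next eligible network, skipping ineligible ones."""
        for _ in range(len(self._networks)):
            candidate = self._networks[self._cursor % len(self._networks)]
            self._cursor += 1
            if candidate in self._eligible:
                return candidate
        raise RuntimeError("no eligible network")
```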
J) Open.Stream primitive
This shall be used by the upper layer to open a new stream.
Mandatory attributes:
o endpoint-id - The IP address and port number to identify the
peer endpoint to which the stream is to be opened. An association
with that endpoint must already exist when the stream is opened.
Returned attributes:
o The stream number that is opened.
K) Close.Stream primitive
This shall be used by the upper layer to request to close a stream.
Mandatory attributes:
o endpoint-id - The IP address and port number to identify the
peer endpoint to which the stream is to be closed.
o stream number - The stream number to identify the stream to be
closed (this should be the number returned by the Open.Stream
primitive on this stream).
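The Open.Stream and Close.Stream primitives can be sketched as
operations on a per-endpoint stream table. This is a minimal sketch:
the class is hypothetical, and handing out the lowest free non-zero
stream number is an assumption (the draft only states that stream
number 0 is reserved).

```python
class StreamTable:
    """Tracks open streams per peer endpoint (hypothetical helper)."""

    def __init__(self):
        self._associations = set()
        self._open = {}  # endpoint-id -> set of open stream numbers

    def add_association(self, endpoint_id):
        self._associations.add(endpoint_id)

    def open_stream(self, endpoint_id):
        """Open.Stream: requires an existing association; returns the number."""
        if endpoint_id not in self._associations:
            raise LookupError("no association with endpoint")
        streams = self._open.setdefault(endpoint_id, set())
        number = 1  # stream number 0 is reserved
        while number in streams:
            number += 1
        streams.add(number)
        return number

    def close_stream(self, endpoint_id, stream_number):
        """Close.Stream: the number must be one returned by open_stream."""
        self._open[endpoint_id].remove(stream_number)
```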
6. Suggested MDTP Protocol Parameter Values
The following are suggested timer values for MDTP:
T1-init Timer - 160 ms
T2-receive Timer - 20 ms
T3-send Timer - 160 ms + Last calculated RTT for that network.
The following protocol parameters are recommended:
Max.Outstanding.dg - 20 messages
Max.Retransmit - 10 attempts
Max.Init.Retransmit - 8 attempts
Min.Mcast.Time.To.Reset - 5 seconds
Num.Of.Mcast.Reset.Msg - 5 messages
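For illustration, the suggested values can be collected as constants,
with T3-send computed per network from the last measured RTT. The
constant and function names below are hypothetical; only the values
come from the lists above.

```python
# Suggested timer values (milliseconds)
T1_INIT_MS = 160
T2_RECEIVE_MS = 20
T3_SEND_BASE_MS = 160

# Recommended protocol parameters
MAX_OUTSTANDING_DG = 20          # messages
MAX_RETRANSMIT = 10              # attempts
MAX_INIT_RETRANSMIT = 8          # attempts
MIN_MCAST_TIME_TO_RESET_S = 5    # seconds
NUM_OF_MCAST_RESET_MSG = 5       # messages


def t3_send_ms(last_rtt_ms: float) -> float:
    """T3-send = 160 ms + last calculated RTT for that network."""
    return T3_SEND_BASE_MS + last_rtt_ms
```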
7. Acknowledgments
The authors wish to thank Brian Wyld, A. Sankar, Henry Houh, Gary
Lehecka, Ken Morneault, Lyndon Ong, and others for their very valuable
comments.
8. Authors' Addresses
Randall R. Stewart Tel: +1-847-632-7438
Cellular Infrastructure Group EMail: stewrtrs@cig.mot.com
Motorola, Inc.
1475 W. Shure Drive, #2C-6
Arlington Heights, IL 60004
USA
Qiaobing Xie Tel: +1-847-632-3028
Cellular Infrastructure Group EMail: xieqb@cig.mot.com
Motorola, Inc.
1501 W. Shure Drive, #2309
Arlington Heights, IL 60004
USA
Tom Bova Tel: +1-703-484-3331
Cisco Systems Inc. EMail: tbova@cisco.com
13615 Dulles Technology Drive
Herndon, VA 20171
Suheel Hussain Tel: +1-919-472-2312
Cisco Systems Inc. EMail: ssh@cisco.com
7025 Kit Creek Road
Research Triangle Park, NC 27709
Ted Krivoruchka Tel: +1-703-484-3331
Cisco Systems Inc. EMail: tedk@cisco.com
13615 Dulles Technology Drive
Herndon, VA 20171
Renee Revis Tel: +1-703-472-5681
Cisco Systems Inc. EMail: drrevis@cisco.com
7025 Kit Creek Road
Research Triangle Park, NC 27709
9. References
[1] Postel, J. (ed.), "Internet Protocol - DARPA Internet Program
Protocol Specification", RFC 791, USC/Information Sciences Institute,
September 1981.
[2] Postel, J., "User Datagram Protocol", RFC 768, USC/Information Sciences
Institute, August 1980.
[3] Postel, J. (ed.), "Transmission Control Protocol", RFC 793, USC/
Information Sciences Institute, September 1981.
[4] Jacobson, V., "Congestion Avoidance and Control", Proceedings of
SIGCOMM '88, pp 314-329, August, 1988.
[5] Seth, T., et al., "Performance Requirements for Signaling in Internet
Telephony", Internet-Draft <draft-seth-sigtran-req-00.txt>, May, 1999.
Appendix A: Stream-based Reliable and Ordered Delivery
This defines a reliable and ordered stream mechanism for MDTP. It is
optional for implementation.
A stream in MDTP is defined as a sequence of user datagrams that needs
to be reliably delivered while preserving its own sequence. In
other words, the delivery of a stream shall not be delayed because of
losses or re-transmissions that occur in other streams within the
same MDTP association. This capability is a critical requirement of
some telephony call signaling protocols [5].
Stream datagrams are identified by setting the FLO bit to 1.
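The independence of streams can be sketched with one reorder buffer
per stream, so that a gap in one stream never holds back another.
This is an illustrative sketch only: the class is hypothetical,
sequence numbers are assumed to start at 1 and wrap from 0xFFFF to 1,
and duplicate detection and window limits are omitted.

```python
from collections import defaultdict


class StreamReceiver:
    """Delivers each stream in order, independently of other streams."""

    def __init__(self):
        self._expected = defaultdict(lambda: 1)  # next sequence per stream
        self._held = defaultdict(dict)           # out-of-order datagrams

    def receive(self, stream, seq, payload):
        """Store one datagram; return payloads now deliverable, in order."""
        delivered = []
        self._held[stream][seq] = payload
        while self._expected[stream] in self._held[stream]:
            current = self._expected[stream]
            delivered.append(self._held[stream].pop(current))
            # Sequence number 0 is reserved, so wrap from 0xFFFF to 1.
            self._expected[stream] = 1 if current == 0xFFFF else current + 1
        return delivered
```

In this sketch, datagrams 1 and 2 of stream 5 are delivered on
arrival even while datagram 1 of stream 4 is still missing; stream 4
resumes once its missing datagram arrives.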
A.1 Stream Initiation
First, an MDTP association between the two endpoints must be initiated
before any stream operation.
A stream shall be initiated (opened) by the sender before datagrams
can be sent in the stream, and after the stream is complete it shall
be terminated (closed) by the user. Also, both sides of the
association shall be able to initiate or terminate streams
independently.
The sender initiates a stream by sending a Stream Initiation
(NOB|UNR), using the following header format:
Stream Initiation
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Flow Number | Datagram number in flow | (Seen) | MDTP Protocol Identifier |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Flow Number | Datagram number in flow | (Send) | Version | Flags | In Queue |
| |N N W I F R D A M S W R R F G U| |
| |O O I S I T A C U H N E T L A N| |
| |M B N B R M T K L U R 1 C O R R| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Seen = 0x0 (or Tag) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Send = 0x0 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data Size | Part | Of |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| New Stream Number | 0x0 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The Send field will contain the flow number of this datagram, flow 0 Note that in the Stream Initiation, the Seen and Send shall be set to 0,
is always reserved and is NOT used. The datagram number is the and the number of the new stream being initiated shall be indicated
sequential number of the datagram. The Seen field is used to in the first two octets of the data field.
acknowledge receipt of the indicated datagram for the specified
flow. The flow number in the acknowledgment does NOT need to be the
same as the flow number in the Send field. This format is only used
for flow datagrams.
A flow can have bundled data (see section 9) but cannot have
fragmented messages. The reason fragmented messages are not supported
is two fold, to attempt to simplify the flows a little bit. And flows
are thought of has call control related limiting there size to be no
larger than one datagram per message.
If a flow packet number reaches 0xffff, then the next packet number
should wrap to 1.
Before a flow can be used it must be initiated, after the flow is However, if this is the first datagram sent out after receiving the
complete it should be closed. Note it is assumed that before any flows Initiation Ack from the peer (see section 3.1), the Seen field of
can be opened the MDTP initiate sequence has taken place (see section above Stream Initiation shall be set to the Tag value carried in the
4). When a MDTP initiate sequence occurs, any endpoint being Initiation Ack.
re-initialized will cause a closing of all outstanding flows during
that re-initialization. Before opening a flow the opening end should
verify that the version number of the receiving MDTP endpoint is at
least 3. If the version number is less than 3 then the MDTP endpoint
must NOT attempt to open a flow.
7.1 Initiating a flow. Upon the reception of the Stream Initiation, the peer shall respond
immediately with a Stream Initiation Ack (NOB|UNR|ACK), using the
following header format:
A flow is initiated by sending a Flow Initiate/Close Message. In all Stream Initiation Ack
flow datagram the NOB bit is set. For the Flow Initiate Message the
UNR mode bit set as well. The Acknowledgment number (Seen) and the
Sequence Number (Send) is set to 0 unless this is the first message in
which case the TAG unlock value is set in the Send (see section 4.1).
Until a flow is open successfully a receiver of a non-opened flow 0 1 2 3
datagram will silently discard the datagram. Upon sending a flow 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
initiation a T3-Send timer will be started on flow 0. The timer will +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
follow the same rules for retransmission and timing as outlined in | MDTP Protocol Identifier |
section 5. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Version | Flags | In Queue |
| |N N W I F R D A M S W R R F G U| |
| |O O I S I T A C U H N E T L A N| |
| |M B N B R M T K L U R 1 C O R R| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Seen = Stream Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Send = 0x0 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data Size | Part | Of |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The following illustration demonstrates the opening of flow 5: The following example shows the opening of stream 5 by "A":
Endpoint A Endpoint Z Endpoint A Endpoint Z
{App Initiates flow 5} {App Initiates stream 5}
[Header Flags=NOB [Header Flags=FLO|UNR
Mode=UNR
Part=0,Of=1 Part=0,Of=1
Seen=00000000,Send=0x0000 0000,Size=0, Seen=0,Send=0,Size=0,
flow=0x0005 dg=0000 ]------> Stream=5 ]--------------------------->
(Start T3-send timer f=5) (Start T3-send timer)
(Cancel T3-send timer f=5) <----------------- [Header Flags=NOB|ACK (Cancel T3-send timer) <--------------------- [Header Flags=FLO|UNR|ACK
Mode=UNR Mode=UNR
Part=0,Of=1 Part=0,Of=1
Seen=0x00000005,Send=0x00000000, Seen=5,Send=0]
Size=0, flow=0000 dg=0000]
In the above example note that for flow 0, unlike all others, no T2-Recv A.2 Stream Termination
timer is ever started. Each flow open/close must be independently
acknowledged. Note also that in the reply acknowledgment the ACK bit is For an existing stream, either side shall be allowed to terminate the
set. If unlikely event that Endpoint-Z wished to piggy back the open of stream by sending a Stream Termination (FLO|UNR|SHU) to the other side.
flow 5 with a flow open of its own the sequence would look as follows:
Besides flag RES, the Stream Termination shall use the same header
format as that used in the Stream Initiation datagram (see A.1).
A Stream Termination Ack (FLO|UNR|SHU|ACK) shall be sent by the peer
endpoint in response.
The following example shows the termination of stream 5 by "A":
Endpoint A Endpoint Z Endpoint A Endpoint Z
{App Initiates flow 5} {App terminates stream 5}
[Header Flags=NOB [Header Flags=FLO|UNR|SHU
Mode=UNR
Part=0,Of=1
Seen=0,Send=0,
Size=0,
flow=5, dg=0 ]------>
(Start T3-send timer f-5) {App Initiates flow 8}
(Cancel T3-send timer f-5) <----------------- [Header Flags=NOB|ACK
Mode=UNR
Part=0,Of=1 Part=0,Of=1
Seen=5, Seen=0,Send=0,Size=0,
Send=0, Stream=5 ]--------------------->
Size=0,flow=0008 dg=0000] (Start T3-send timer s-5)
(Start T3-send timer - f8) (Cancel T3-send timer s-5) <------------ [Header Flags=FLO|UNR|SHU|ACK
[Header Flags=NOB|ACK
Mode=0
Part=0,Of=1 Part=0,Of=1
Seen=8,Send=0,Size=0, Seen=5,Send=0]
flow=0, dg=0]-------------------------------->(Cancel T3-send timer - f8)
Note that at the initiate of a flow, the timer started is considered Datagrams associated to a terminated stream received by either side
the first timer for the flow, but it is sent over flow 0. Note also should be silently discarded. It is up to the side which terminates
that a piggyback open is not allowed if the TAG sequences have not the stream to assure that all outstanding user datagrams in the stream
been exchanged. are acknowledged before the termination.
7.2 Flow acknowledgments A.3 Stream Datagram Transfer
Normal dataflow's follow the normal MDTP transmission formats (see A.3.1 Header Format in Stream Datagrams with User Data
section 5) Acknowledgments when possible are piggy-backed on
datagrams. Each flow maintains its own send timer. When no piggyback
of data and acknowledgments is possible, more than one flow can be be
acknowledged at the same time by using the Flow Extend Acknowledgment
format. The Send field (now considered the number of extended
acknowledgments) will contain the number of acknowledgments in the
array.
During data transfer if the when the datagram number reaches 0xffff The MDTP header in a stream datagram with user data shall have the
the next packet should be labeled 1. Pkt 0 is never used for datagram following format:
transfer.
One T2-Recv timer is maintained for all flows. If more than one flow 0 1 2 3
is being timed and a datagram is to be transmitted then one of the 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
flows will be acknowledged and the T2-Recv timer will be left running +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
until expiration, which will then cause the Flow Extended | MDTP Protocol Identifier |
Acknowledgment to be sent, acknowledging all remaining flows. The +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
following examples illustrate examples of flow acknowledgments. For | Version | Flags | In Queue |
this example we assume that Endpoint A has 3 flows open 5,7 and | |N N W I F R D A M S W R R F G U| |
9. Endpoint Z has 4 flows open 0x11, 8 4 and 1. | |O O I S I T A C U H N E T L A N| |
| |M B N B R M T K L U R 1 C O R R| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Seen |
| Stream Number | Sequence Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Send |
| Stream Number | Sequence Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data Size | Part | Of |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
\ \
/ data /
\ \
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Example 1: Endpoint A sends to Endpoint Z T2-Recv timer expires The stream number and sequence number in the Send field shall be used
by the sender to identify the current stream datagram. The
stream number and sequence number in the Seen field shall be used
by the sender to acknowledge stream datagrams it has received.
Endpoint A Endpoint Z Stream number 0 and sequence number 0 are reserved for special
{ App sends first datagram on flow 5} purposes and are not valid stream number or sequence number.
[Header Flags=NOB|DAT
Mode=REL
Part=0,Of=1
Seen=0x0000 0000,Send=0x0005 0001,Size=20]------>(Start T2-Recv)
(Start T3-send timer-f5)
{ T2-Recv Timer Expires }
(Cancel T3-send timer) <--------------- [Header Flags=NOB|ACK
Mode=REL
Part=0,Of=1
Seen=0x00050001,Send=0x00000000,
Size=0]
(Start T3-send timer)
Example 1: Endpoint A sends to Endpoint Z T2-Recv timer expires A.3.2 Transmission of Stream Datagrams
Endpoint A Endpoint Z The rules of using the Seen Sequence Number and Send Sequence Number
{ App sends first datagram on flow 5} are similar to those defined for normal MDTP non-stream datagram
[Header Flags=NOB|DAT transmissions (see section 4), except that for stream transfer the
Mode=REL sequence numbers shall roll-over to 1 after 0xFFFF.
Part=0,Of=1
Seen=0x0000 0000,Send=0x0005 0001,Size=20]------>(Start T2-Recv)
(Start T3-send timer-f5)
{ T2-Recv Timer Expires }
(Cancel T3-send timer) <--------------- [Header Flags=NOB|ACK
Mode=REL
Part=0,Of=1
Seen=0x00050001,Send=0x00000000,
Size=0]
(Start T3-send timer)
Example 2: Endpoint A sends multiple messages to Endpoint Z and Moreover, each stream maintains its individual T3-send timer, but only
T2-Recv timer expires one global T2-receive timer is maintained for all existing streams.
Acknowledgment of a stream datagram shall either be sent separately
or be piggy-backed with a stream datagram (not necessarily belonging
to the same stream) traveling in the opposite direction. For a
separate Stream Ack, the Send field will be set to 0000:0000.
The following shows an example of transmitting a stream datagram
(FLO|REL|DAT) and a separate Stream Ack (FLO|REL|ACK):
Endpoint A Endpoint Z Endpoint A Endpoint Z
{ App sends 1 datagram on flow 5} {App sends first data on stream 5}
[Header Flags=NOB|DAT [Header Flags=FLO|REL|DAT
Mode=REL
Part=0,Of=1
Seen=0x0000 0000,Send=0x0005 0002,Size=20]------>(Start T2-Recv)
(Start T3-send timer-f5)
{ App sends 1 datagram on flow 9}
[Header Flags=NOB|DAT
Mode=REL
Part=0,Of=1
Seen=0x0000 0000,Send=0x0009 0004,Size=20]------>
(Start T3-send timer-f9)
{ App sends 1 datagram on flow 5}
[Header Flags=NOB|DAT
Mode=REL
Part=0,Of=1 Part=0,Of=1
Seen=0x0000 0000,Send=0x0005 0003,Size=20]------> Seen=0-0,Send=5-1,Size=20]----\
{ App sends 1 datagram on flow 7} (Start T3-send timer-s5) \--->(Start T2-recv)
[Header Flags=NOB|DAT ...
Mode=REL {T2-recv Timer Expires}
(Cancel T3-send timer-s5) <--------------- [Header Flags=FLO|REL|ACK
Part=0,Of=1 Part=0,Of=1
Seen=0x0000 0000,Send=0x0007 0011,Size=20]------> Seen=5-1,Send=0-0,Size=0]
{ T2-Recv Timer Expires }
(Cancel T3-send timer-f5) <-------------- [Header Flags=NOB|ACK
(Cancel T3-send timer-f9) Mode=REL
(Cancel T3-send timer-f7) Part=0,Of=1
Seen=0x00050003,
Send=0x00000002,
Size=0,
ex[0]=0x00090004,
ex[1]=0x00070011
]
Example 3: Endpoint A sends a message to Endpoint Z, Endpoint Z The following example shows the use of a piggy-backed Stream Ack.
piggy-backs an ack.
{ App sends 1 datagram on flow 5} {App sends new data on stream 5}
[Header Flags=NOB|DAT [Header Flags=FLO|REL|DAT
Mode=REL
Part=0,Of=1
Seen=0x0000 0000,Send=0x0005 0004,Size=20]------>(Start T2-Recv)
(Start T3-send timer-f5) { App sends 1 message flow 0x11}
( cancel T2-Recv Timer )
(Cancel T3-send timer-f5) <----------------- [Header Flags=NOB|DAT|ACK
(Start T2-Recv timer) Mode=REL
Part=0,Of=1 Part=0,Of=1
Seen=0x0005 0004, Seen=0-0,Send=5-4,Size=20]--------->(Start T2-recv)
Send=0x0011 0008, (Start T3-send timer-s5) ...
Size=10] {App sends data on stream 11}
(Start T3-send timer-f0x11) (cancel T2-recv Timer)
{ T2-Recv Timer Expires } /----- [Header Flags=FLO|REL|DAT|ACK
[Header Flags=NOB|ACK / Part=0,Of=1
Mode=REL / Seen=5-4,Send=11-8,Size=10]
/ (Start T3-send timer-s11)
(Cancel T3-send timer-s5) <-----/
(Start T2-recv timer)
...
{T2-recv Timer Expires}
[Header Flags=FLO|REL|ACK
Part=0,Of=1 Part=0,Of=1
Seen=0x0000 0000,Send=0x0011 0008,Size=0]------>(Cancel T3-send-f0x11) Seen=11-8,Send=0-0,Size=0]--------->(Cancel T3-send-s11)
Example 4: Endpoint A sends multiple messages to Endpoint Z, Endpoint Z Note that when piggy-backing a Stream Ack on an out-bound stream
piggy-backs an ack and sends an Extended flow Ack. datagram while more than one stream has un-acked datagrams, the
endpoint shall choose one stream and piggy-back a Stream Ack on one of
the datagrams, and shall leave the T2-recv timer running.
{ App sends 1 datagram on flow 5} A.3.3 Extended Stream Ack
[Header Flags=NOB|DAT
Mode=REL
Part=0,Of=1
Seen=0x0000 0000,Send=0x0005 0005,Size=20]------>(Start T2-Recv)
(Start T3-send timer-f5)
{ App sends 1 datagram on flow 9}
[Header Flags=NOB|DAT
Mode=REL
Part=0,Of=1
Seen=0x0000 0000,Send=0x0009 0004,Size=20]------>
(Start T3-send timer-f9)
{ App sends 1 message flow 0x4}
(Cancel T3-send timer-f5) <-------------- [Header Flags=NOB|DAT|ACK
(Start T2-Recv timer) Mode=REL
Part=0,Of=1
Seen=0x00050005,Send=0x00040004,
Size=10]
(Start T3-send timer-f0x4)
{ T2-Recv Timer Expires }
(Start T3-send timer)
(Cancel T3-send timer) <-------------- [Header Flags=NOB|ACK
Mode=REL
Part=0,Of=1
Seen=0x00090004,Send=0x00000000,
Size=0]
{ T2-Recv Timer Expires }
[Header Flags=NOB|ACK
Mode=REL
Part=0,Of=1
Seen=0x0000 0000,Send=0x0004 0004,Size=0]------>(Cancel T3-send-f0x4)
Retransmissions and resends are handled per section 5 but using the Upon the expiration of the T2-recv timer, if more than one stream
flow formats (i.e. the NOB bit set) as described above. The rules for datagram has been received but not yet acked by the endpoint, an
retransmission, windowing, flow control and declaration of endpoint Extended Stream Ack shall be used.
death are applied as defined in section 5.
Note that messages to the different flows are handed up ordered The following defines the header format of the Extended Stream Ack
correctly within the flow but not delayed with respect to any other that acknowledges N stream datagrams received:
flow's transmission or retransmission.
7.3 Flow session closing 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| MDTP Protocol Identifier |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Version | Flags | In Queue |
| |N N W I F R D A M S W R R F G U| |
| |O O I S I T A C U H N E T L A N| |
| |M B N B R M T K L U R 1 C O R R| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Seen |
| Stream Number #0 | Sequence Number #0 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Number of Extra Acks = N-1 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data Size | Part | Of |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Stream Number #1 | Sequence Number #1 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
\ /
/ \
\ /
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Stream Number #N-1 | Sequence Number #N-1 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The application may signal a closing of a flow. If this occurs the Note that an Extended Stream Ack is identified by setting the Send
implementation will inform its peer of the closing so that resources field to the number of extra acks carried in its data field, as shown
used to track and maintain the flow can be reused/freed. The following above. Also, Extended Stream Acks shall not be piggy-backed.
sequence is used to release a flow in this example we see the closing
of flow 5. Note it is up to the sender to assure that all outstanding The following example shows the use of an Extended Stream Ack
datagrams are acknowledged before closing a flow: (FLO|REL|ACK) by "Z":
Endpoint A Endpoint Z Endpoint A Endpoint Z
{App Initiates flow 5} {App sends data on stream 5}
[Header Flags=NOB|RES [Header Flags=FLO|REL|DAT
Mode=UNR
Part=0,Of=1 Part=0,Of=1
Seen=0,Send=0,Size=0, Seen=0-0,Send=5-2,Size=20]----------> (Start T2-recv)
flow=5, dg=0 ]------> (Start T3-send timer-s5)
(Start T3-send timer f-5) {App sends data on stream 9}
(Cancel T3-send timer f-5) <----------------- [Header Flags=NOB|ACK|RES [Header Flags=FLO|REL|DAT
Mode=UNR
Part=0,Of=1 Part=0,Of=1
Seen=5,Send=0, Seen=0-0,Send=9-4,Size=20]---------->
(Start T3-send timer-s9)
{App sends more data on stream 5}
[Header Flags=FLO|REL|DAT
Part=0,Of=1
Seen=0-0,Send=5-3,Size=20]---------->
(Restart T3-send timer-s5)
{App sends data on stream 7}
[Header Flags=FLO|REL|DAT
Part=0,Of=1
Seen=0-0,Send=7-11,Size=20]--------->
(Start T3-send timer-s7)
...
{T2-recv Timer Expires}
(Cancel T3-send timer-s5) <-------------- [Header Flags=FLO|REL|ACK
(Cancel T3-send timer-s7) Part=0,Of=1
(Cancel T3-send timer-s9) Seen=5-3,NumExtAck=2,
Size=0, Size=0,
flow=0, dg=0] ext[0]=9-4,
ext[1]=7-11]
Datagrams received by an endpoint directed to a closed flow should be
silently discarded.
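The Extended Stream Ack shown above can be sketched in Python as follows. This is only an illustration under stated assumptions: datagrams are taken to arrive in order within each stream, so the most recent sequence number per stream is the one acknowledged, and the stream/sequence pair is assumed to occupy the high/low 16 bits of each 32-bit word. The function names are hypothetical.

```python
import struct


def build_extended_stream_ack(received):
    """Return (seen, num_extra, extra) for an Extended Stream Ack.

    `received` is a list of (stream, sequence) pairs in arrival order.
    Assumes in-order arrival within each stream, so the most recent
    sequence number of a stream acknowledges everything before it.
    """
    latest = {}
    for stream, seq in received:
        latest[stream] = seq            # dicts keep first-insertion order
    if not latest:
        raise ValueError("nothing to acknowledge")
    words = [(stream << 16) | seq for stream, seq in latest.items()]
    return words[0], len(words) - 1, words[1:]


def pack_extra_acks(extra):
    """Pack the extra acks as 32-bit words in network byte order."""
    return struct.pack("!%dI" % len(extra), *extra)
```

Fed with the datagrams of the example (5-2, 9-4, 5-3, 7-11, read as hexadecimal), the sketch yields Seen=5-3, two extra acks, and the extra entries 9-4 and 7-11.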
8. Mixed Mode Data Transmission
An endpoint can switch between reliable and unreliable transfer modes A.4 Other Issues with Stream Transfer
at any time during the data transfer.
The following sequence illustrates such a transfer mode change, in -- Congestion control, including the rules for timer management and window
which both endpoints start with the unreliable transfer mode, and management, shall apply to Stream Transfer the same way as it does to
then Endpoint A switches to reliable transfer mode. non-Stream based transfer, as defined in section 4.3.
Endpoint A Endpoint Z -- When an association is re-initialized (see section 3.4), all existing
{App send 1 message} streams within that association will be automatically terminated.
<------------------ [Header Flags=DAT|ACK
Mode=UNR
Part=0,Of=1
Seen=11201,Send=1,Size=450]
..
{App send 1 message}
[Header Flags=DAT|ACK
Mode=UNR
Part=0,Of=1
Seen=451,Send=11201,Size=100]------>
.. -- The receiver shall silently discard any datagrams associated
{App send 1 message} with a stream which has not been initiated or has already been
[Header Flags=DAT|ACK terminated.
Mode=GAR
Part=0,Of=1
Seen=451,Send=11301,Size=100]------> (Start T2-receive timer)
(Start T3-send timer)
{App sends 1 message}
(Cancel T2-receive timer)
/------- [Header Flags=DAT|ACK
/ Mode=UNR
/ Part=0,Of=1
/ Seen=11401,Send=1,Size=450]
(Cancel T3-send timer) <-------/
.. -- The same re-transmission and link rotation rules as defined in
{App sends 1 message} section 4 shall apply to Stream Transfer.
[Header Flags=DAT|ACK
Mode=GAR
Part=0,Of=1
Seen=451,Send=11401,Size=100]------> (Start T2-receive timer)
(Start T3-send timer)
.. -- Bundled Message (see Appendix B) may be allowed in Stream Transfer,
{Timer T2 Expires} but fragmentation (see Appendix C) shall not be allowed.
(Cancel T3-send timer) <------------------- [Header Flags=ACK
Mode=0
Part=0,Of=0
Seen=11501,Send=146]
Note that in the second datagram sent by Endpoint A the mode is Appendix B: Bundled Message Transfer
switched to reliable transfer mode (with GAR bit set). This causes
Endpoint A to start its T3-send timer. When Endpoint Z receives the
datagram and realizes the mode change, it starts its T2-receive timer.
At this point, Endpoint Z also must update its Seen value to 11301.
This will allow Endpoint Z to align its Seen counter
to the Seen value of this first reliable datagram from Endpoint
A. This prevents Endpoint Z from requesting retransmission of data
that Endpoint A may not have.
9. Bundled Messages This defines the mechanism for bundled datagram transport in MDTP. It
is optional for implementation.
In order to increase network utilization, MDTP allows an endpoint to Bundling is sometimes desired by the user when transferring small
bundle small application messages into one single datagram for datagrams, as a way of improving network utilization.
transmission. This bundled mode can be applied to both reliable and
unreliable datagrams.
An endpoint indicates to its peer that it is currently in bundled In bundled transfer, MDTP allows an endpoint to bundle small
application messages into one single datagram for transmission. This
bundled mode can be applied to both reliable and unreliable datagrams
(see Appendix E for Unreliable Delivery).
mode by setting the BUN bit in the mode field. Note that an endpoint shall never send bundled messages to a peer if
that peer endpoint set the NOB bit to 1 during their association
initialization (see section 3).
9.1 Format of Bundled Datagram B.1 Format of Bundled Datagram
The ISB bit in the flag field is set to indicate the current datagram is The ISB bit in the flag field is set to indicate the current datagram
bundled, i.e., it contains multiple messages. The format of a bundled is bundled, i.e., it contains multiple messages. The format of a
datagram is defined as follow: bundled datagram is defined as follows:
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| MDTP Protocol Identifier 1 | | MDTP Protocol Identifier |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| MDTP Protocol Identifier 2 | | Version | Flags | In Queue |
| |N N W I F R D A M S W R R F G U| |
| |O O I S I T A C U H N E T L A N| |
| |M B N B R M T K L U R 1 C O R R| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Acknowledgment Number (Seen) | | Acknowledgment Number (Seen) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sequence Number (Send) | | Sequence Number (Send) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data Size | Part | Of | | Data Size | Part | Of |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Flags | Mode | Version | Num On Queue | | Total Number Of Messages=N | Message #1 Size = B1 |
|N N W I F R D A|B S W R R B G U| | |
|O O I S I E A C|R H N E E U A N| | |
|G B N B R S T K|O U R 1 2 N R R| | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Number Of Messages | Size of first message B1 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| | | |
| B1 octets of data | | B1 octets of data |
| | | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Size of second message B2 | | | Message #2 Size = B2 | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
| B2 octets of data | | B2 octets of data |
| | | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
\ \ \ \
/ / / /
\ \ \ \
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Size of last message BL | | | Message #N Size = BN | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
| BL octets of data | | BN octets of data |
| | | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Data Size in a bundled datagram indicates the actual size of the Data_Size in a bundled datagram indicates the actual size of the
data field of the datagram, including both the bundling overhead and data field of the datagram, including both the bundling overhead and
the actual application data. Since no fragmentation is allowed in a the actual user data. Since no fragmentation is allowed in a bundled
bundled datagram, the Part field will always be '0' and the Of field datagram, the Part field will always be '0' and the Of field will always be
will always be '1'. '1'.
The first two octets of the data field are a 16-bit integer indicating The first two octets of the data field are a 16-bit integer indicating
the number of messages bundled in the datagram. This is followed the number of messages bundled in the current datagram. This is
immediately by a list of bundled messages. Each bundled message starts followed immediately by a list of bundled messages. Each bundled
with an integer of two octets indicating the size of the data in the message starts with an integer of two octets indicating the size of
message, followed by the data itself. the data in the message, followed by the data itself.
All integers in the datagram should be transmitted in the network byte All integers in the datagram should be transmitted in the network byte
order. order.
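The data-field layout described above (a 16-bit message count, then a 16-bit size before each message, all in network byte order) can be sketched in Python as follows. Only the data field is handled; the MDTP header is left out, and the function names are hypothetical.

```python
import struct


def bundle(messages):
    """Build the data field of a bundled datagram from a list of bytes."""
    parts = [struct.pack("!H", len(messages))]      # number of messages
    for msg in messages:
        parts.append(struct.pack("!H", len(msg)))   # 2-octet message size
        parts.append(msg)
    return b"".join(parts)


def unbundle(data):
    """Split the data field of a bundled datagram back into messages."""
    (count,) = struct.unpack_from("!H", data, 0)
    offset = 2
    messages = []
    for _ in range(count):
        (size,) = struct.unpack_from("!H", data, offset)
        offset += 2
        if offset + size > len(data):
            raise ValueError("bundled message runs past the data field")
        messages.append(data[offset:offset + size])
        offset += size
    return messages
```

Bundling three 100-octet messages this way gives a data field of 2 + 3 * (2 + 100) = 308 octets, which is the value the Data Size field would carry.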
9.2 Bundled Transfer B.2 Bundled Datagram Transfer
Two protocol parameters, namely the Min.Bundle and Max.Bundle, are The T4-bundling timer and two protocol parameters, namely the
used to control the assembly of bundled datagrams. If the current size Min.Bundle and Max.Bundle, are used to control the bundling of user
of a bundled datagram is smaller than Min.Bundle, the endpoint will datagrams.
withhold the datagram from transmission and start T4-bundle timer. If
new out-bound data becomes available for transmission, the endpoint The endpoint will withhold the datagram from transmission and start
will attempt to bundle the new data with the current withheld datagram T4-bundle timer, if the combined size of all user datagrams currently
by using the following rules: pending for transmission in the out-bound buffer is smaller than
'Min.Bundle'.
Each time new out-bound user data becomes available for
transmission, the endpoint will attempt to bundle the new data with
the current withheld datagram by using the following rules:
A) If the size of the new data is greater than or equal to A) If the size of the new data is greater than or equal to
Min.Bundle, the current withheld datagram will be transmitted and 'Min.Bundle', the current withheld datagram will be transmitted and
T4-bundle timer will be canceled. Then, the new data will be T4-bundle timer will be canceled. Then, the new data will be
transmitted in a separate datagram. transmitted in a separate datagram.
B) If the size of the new data is less than Min.Bundle, but the B) If the size of the new data is less than 'Min.Bundle', but the
combined size of the current datagram and the new data is greater combined size of the current datagram and the new data is greater
than or equal to Max.Bundle, the current datagram will be sent and than or equal to 'Max.Bundle', the current datagram will be sent and
the new data will be withheld as the new current datagram. the new data will be withheld as the new current datagram.
C) If the size of the new data is less than Min.Bundle, and the C) If the size of the new data is less than 'Min.Bundle', and the
combined size of the current datagram and the new data is less than combined size of the current datagram and the new data is greater
Max.Bundle, the new data will be bundled into the current than 'Min.Bundle', but less than 'Max.Bundle', the new data will be
datagram and the bundled datagram will be immediately transmitted. bundled into the current datagram and the bundled datagram will be
immediately transmitted, and the T4-bundle timer will be canceled.
D) If the size of the new data is less than Min.Bundle, and the D) If the size of the new data is less than 'Min.Bundle', and the
combined size of the current datagram and the new data is less than combined size of the current datagram and the new data is less than
Min.Bundle, the new data will be bundled into the current Min.Bundle, the new data will be bundled into the current
datagram. And the T4-bundle timer will be restarted. datagram. And the T4-bundle timer will be restarted.
E) If T4-bundle timer expires, the current datagram will be sent E) If T4-bundle timer expires, the current datagram will be sent
immediately. immediately.
F) If the size of the new data is greater than the Max.Bundle, the F) When a T2-receive timer expires, any bundled data waiting to be
current datagram will be sent. Then, the new data will be fragmented transmitted should be sent immediately with a piggy-backed Ack to
for transmission (see 9). acknowledge all un-acked data previously received.
The following is an example of bundled data transfer, assuming
Max.Bundle=4096 and Min.Bundle=1700:
Endpoint A Endpoint Z G) If a T4-bundle timer is running and data arrives, the T2-receive
timer should not be started.
{App sends 1 messages of 100 octets} H) A T4-bundle timer should never be canceled unless it is being
(withhold and Start T4-Bundle timer) supplanted by a T3-send timer.
.. When a bundled datagram arrives at the receiving endpoint, each
{App sends 1 messages of 100 octets} message is unbundled and delivered separately to the upper layer.
(bundling into current datagram)
.. The following are the suggested protocol parameter values for bundled
{App sends 1 messages of 100 octets} datagram transfer:
(bundling into current datagram)
.. T4-bundle Timer - 40 ms
{T4-bundle timer expires} Min.Bundle - 1000 octets
[Header Flags=DAT|ACK Max.Bundle - 1432 octets
Mode=GAR|BUN
Part=0,Of=1
Seen=146,Send=1001,Size=308]--------> (Start T2-receive timer)
(T3-send timer starts)
..
{Timer T2 Expires}
(cancel T3-send) <---------------- [Header Flags=ACK
Mode=0
Part=0,Of=0
Seen=1309,Send=146]
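Bundling rules A) through E) can be summarized by the following Python sketch. It is a simplified model, not a definitive implementation: sizes count user data only (no bundling overhead), a combined size exactly equal to Min.Bundle is treated as rule C, rule B is assumed to leave the T4-bundle timer running for the newly withheld data, and the interaction with the T2-receive and T3-send timers is not modeled. The class and method names are hypothetical; the default limits are the values suggested for the -04 revision.

```python
class Bundler:
    def __init__(self, min_bundle=1000, max_bundle=1432):
        self.min_bundle = min_bundle
        self.max_bundle = max_bundle
        self.pending = []         # messages held in the withheld datagram
        self.t4_running = False   # state of the T4-bundle timer
        self.sent = []            # one list of messages per datagram sent

    def _pending_size(self):
        return sum(len(m) for m in self.pending)

    def _flush(self):
        """Send the withheld datagram, if any, and stop the T4 timer."""
        if self.pending:
            self.sent.append(self.pending)
            self.pending = []
        self.t4_running = False

    def on_new_data(self, msg):
        size = len(msg)
        combined = self._pending_size() + size
        if size >= self.min_bundle:          # rule A: send both separately
            self._flush()
            self.sent.append([msg])
        elif combined >= self.max_bundle:    # rule B: send current, hold new
            self._flush()
            self.pending = [msg]
            self.t4_running = True
        elif combined >= self.min_bundle:    # rule C: bundle and send now
            self.pending.append(msg)
            self._flush()
        else:                                # rule D: bundle, (re)start T4
            self.pending.append(msg)
            self.t4_running = True

    def on_t4_expired(self):                 # rule E: send what is held
        self._flush()
```

With these defaults, three 100-octet messages are withheld together and leave in one datagram only when on_t4_expired() is called.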
Notice that the Data Size in the datagram sent by Endpoint A is not Appendix C: Fragmented Message Transfer
300 but 308. This is due to the fact that this size reflects the
size of the data field of the datagram including the bundling overhead.
When the bundled datagram arrives at the receiving endpoint, each This defines the mechanism for fragmented datagram transport in
message is unbundled and delivered separately to the upper level MDTP. It is optional for implementation.
application.
10. Fragmented Messages When the size of an out-bound user message exceeds the value defined
in the protocol parameter Max.Bundle, the endpoint shall fragment the
message into smaller pieces of size equal to or smaller than
'Max.Bundle' and send each piece out in a separate datagram.
When the size of an out-bound message exceeds the value defined in the The "Part" and "Of" fields are used to disassemble and reassemble the
protocol parameter Max.Bundle, the endpoint will fragment the message fragmented message. The combination of the maximal 'Of' value, which
into smaller pieces of sizes equal to or smaller than Max.Bundle and is 255, and the maximal Data Size (see section 2.2) will determine
send each piece out in a separate datagram. the maximal size of a single user message that the MDTP can send or
receive in fragmented message transfer mode.
The Part and Of fields are used to disassemble and reassemble the However, an endpoint shall never send fragmented datagrams to a peer if
fragmented message. that peer set the NOM bit to 1 during their association
initialization.
The following example shows the transmission of a fragmented message The following example shows the transmission of a fragmented message
(assuming Max.Bundle=4096, Min.Bundle=1700): (assuming Max.Bundle=1432, Min.Bundle=1000):
Endpoint A Endpoint Z Endpoint A Endpoint Z
{App sends message size=3300 octets}
{App sends 1 messages 8544 octets long} [Header Flags=DAT|ACK|GAR
[Header Flags=DAT|ACK
Mode=GAR|BUN
Part=0,Of=3 Part=0,Of=3
Seen=146,Send=1001,Size=4072]-------> (Start T2-receive timer) Seen=3,Send=16,Size=1432]-------> (Start T2-receive timer)
[Header Flags=DAT|ACK [Header Flags=DAT|ACK|GAR
Mode=GAR|BUN
Part=1,Of=3 Part=1,Of=3
Seen=146,Send=5073,Size=4072]-------> Seen=3,Send=17,Size=1432]------->
[Header Flags=DAT|ACK [Header Flags=DAT|ACK|GAR
Mode=GAR|BUN
Part=2,Of=3 Part=2,Of=3
Seen=146,Send=9145,Size=400]--------> Seen=3,Send=18,Size=436]-------->
(Start T3-send timer) (Start T3-send timer)
.. ..
{Timer T2 Expires} {Timer T2 Expires}
/----------- [Header Flags=ACK /----------- [Header Flags=ACK
/ Mode=0 / Mode=0
/ Part=0,Of=0 / Part=0,Of=0
(cancel timer T3) <-----------/ Seen=9545,Send=146] (cancel timer T3) <-----------/ Seen=18,Send=4]
Notice that Endpoint A is using the reliable transfer mode to send the Notice that "A" is using the reliable transfer mode to send the
fragmented message. In this mode, Endpoint Z will hold the fragments fragmented message, therefore "Z" will hold the fragments and request
and request retransmission if a fragment is found missing, i.e., a gap retransmission if a fragment is found missing, i.e., if a gap is found
is found in the received data (see 5). When all the parts of the in the received data (see ). When all the parts of the fragmented
fragmented message are received, the endpoint will re-assemble the message are received, the receiving endpoint will re-assemble the
message and dispatch it to the upper level application. message and dispatch it to the upper layer.
It is also allowed in MDTP to send a fragmented message using unreliable It is also allowed in MDTP to send a fragmented message using Unreliable
transfer mode. However, in unreliable mode, each fragment datagram Transfer mode (see section 4.5). However, in unreliable mode, each
will be dispatched to the application upon its arrival, and no fragment will be dispatched to the application upon its arrival, and no
retransmission will be requested even if a fragment is found missing. retransmission will be requested even if a fragment is found missing.
Bundling is prohibited if the current datagram contains a fragment of Bundling is prohibited if the current datagram contains a fragment of
a fragmented message. a fragmented message.
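The use of the Part and Of fields can be illustrated with a short Python sketch. It assumes reliable transfer, so that every fragment eventually arrives, and it takes the default Max.Bundle of 1432 octets from the example above. The function names are hypothetical, and the limit of 255 parts follows from the 8-bit Of field.

```python
MAX_PARTS = 255   # largest value the 'Of' field can hold


def fragment(message, max_bundle=1432):
    """Split a message into (part, of, chunk) tuples of at most max_bundle octets."""
    chunks = [message[i:i + max_bundle]
              for i in range(0, len(message), max_bundle)] or [b""]
    if len(chunks) > MAX_PARTS:
        raise ValueError("message needs more than 255 fragments")
    return [(part, len(chunks), chunk) for part, chunk in enumerate(chunks)]


def reassemble(fragments):
    """Rebuild a message once every part has arrived, in any order."""
    of = fragments[0][1]
    by_part = {part: chunk for part, _, chunk in fragments}
    if sorted(by_part) != list(range(of)):
        raise ValueError("fragments missing")
    return b"".join(by_part[p] for p in range(of))
```

A 3300-octet message is split into fragments of 1432, 1432, and 436 octets with Part=0,1,2 and Of=3, matching the sizes in the example.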
11. Non-protocol Datagrams Appendix D: Multicast Datagram Transfer
The MDTP protocol allows an endpoint to send and receive non-protocol
datagrams such as the traditional UDP datagrams. Non-protocol
datagrams are detected by the absence of the MDTP protocol
identifiers at the beginning of the datagram. A non-protocol
transmission received by an MDTP endpoint is termed as a "raw"
datagram. When a raw datagram arrives, the receiving endpoint will set
itself into raw mode and start sending back to its peer in raw mode
as well.
Once an endpoint is in raw mode with a peer, only a change of
operational mode by the application or reception of an MDTP datagram
will bring the endpoint out of raw mode. In the latter case, the
endpoint will use the default MDTP operational mode predefined by the
application for MDTP transmissions. When an endpoint changes from raw
mode into MDTP mode, the normal MDTP initiation messages must be
exchanged between the two endpoints, as described in 4.
12. Broadcast and Multicast
Broadcast and multicast are supported by MDTP when the underlying
transport layer supports them. Both types of transmissions are carried
out in unreliable transfer mode.
For broadcast datagrams, the BRO bit will be set to '1' and the UNR
bit will be set to '0' in the mode field. For multicast datagrams,
both the BRO bit and the UNR bit will be set to '1'.
For multicast datagrams, the value in the Send field will indicate This defines the mechanism for unreliable transportation of multicast
the number of multicast datagrams transmitted by the sender. This datagrams in MDTP. It is optional for implementation.
information makes it possible for the receiver of the multicast to
detect duplicated multicast datagrams and also to detect lost
multicast datagrams. A multicast datagram transmission MUST use
the alternate multicast header filling in both the multicast transmit
to address as well as its lowest network address in the multicast
from address.
Bundling and fragmentation are not allowed in either multicast or D.1 Multicast Datagram Header Format
broadcast datagrams.
12.1 Multicast/Broadcast initialization. Multicast datagrams are identified by setting MUL, UNR, and DAT bits
to 1.
No initiation is needed for an endpoint to transmit multicast or Two new fields are added to the standard MDTP datagram header to
broadcast datagrams. However, caution should be taken when support multicast:
transmitting non-protocol datagrams (i.e., datagrams with no MDTP
protocol header) in multicast or broadcast transmission. This is
because the non-protocol datagrams may inadvertently force all the
receiving endpoints of the multicast or broadcast transmission into
raw mode (see 10).
12.2 Transmission of Broadcast Datagrams. Multicast To Transmit address - This is the multicast address, in
network byte order, that the sender transmitted the data to. The
receiver can use this information for internal tracking purposes.
When sending a broadcast datagram, the endpoint will not take effort Multicast From - This is the network address (or the IP Address of
to prevent duplicate transmissions (this is likely to occur Network 1 as described in 3.2, if redundant networks exist) of the
especially when multiple networks exist). The application at the sender, in network byte order.
receiving end must be prepared to handle duplicate
broadcast messages.
The following is an example of broadcast datagram transmission: MDTP Header Format - Multicast Format
Endpoint A Endpoint Z 0 1 2 3
{application sends 2 messages } 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
[Header Flags=DAT +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Mode=BRO | MDTP Protocol Identifier |
Part=0,Of=1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Seen=0,Send=0,Size=200]--------------> (Datagram may appear | Version | Flags | In Queue |
more than once.) | |N N W I F R D A M S W R R F G U| |
[Header Flags=DAT | |O O I S I T A C U H N E T L A N| |
Mode=BRO | |M B N B R M T K L U R 1 C O R R| |
Part=0,Of=1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Seen=0,Send=0,Size=100]--------------> | Acknowledgment Number (Seen) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sequence Number (Send) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data Size | Part | Of |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Multicast To Transmit address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Multicast From - senders base address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
\ \
/ data /
\ \
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Notice that no timers are used on either end, and Seen and Send values For multicast datagrams, the value in the Send field shall indicate
in the datagrams are always '0'. the sequence number of multicast datagrams transmitted by the
sender. This information helps the receiver of the multicast to detect
duplicated multicast datagrams and also to detect lost multicast
datagrams from the same sender. The Seen field shall normally be
set to 0, except in the special cases stated below.
12.3 Transmission of Multicast Datagrams. Bundling and fragmentation are not allowed in either multicast or
broadcast datagrams.
Unlike the broadcast transmission, when multicast datagrams are No initiation shall be needed for an endpoint to transmit to a
transmitted the receiving endpoints should take effort to prevent multicast address.
duplicate copies of datagrams from being distributed to their
applications.
This is possible because the transmission of multicast datagrams is D.2 Transmission of Multicast Datagrams
usually addressed to a special multicast network address. The receiving
endpoints can thus use this multicast address in combination with the
sender's address to detect duplicate transmissions of a multicast
datagram.
The following example illustrates multicast transmissions between two The following example illustrates multicast transmissions between two
endpoints. endpoints.
Endpoint A Endpoint Z Endpoint A Endpoint Z
{app multicasts a message} {App multicasts a message}
[Header Flags=DAT [Header Flags=MUL|UNR|DAT
Mode=BRO|UNR
Part=0,Of=1 Part=0,Of=1
Seen=0,Send=5,Size=250]--------------> (may receive more Seen=0,Send=5,Size=250]--------------> (no Ack necessary)
than one copy)
..
{app multicasts a message} ...
[Header Flags=DAT {App multicasts a message}
Mode=BRO|UNR [Header Flags=MUL|UNR|DAT
Part=0,Of=1 Part=0,Of=1
Seen=0,Send=6,Size=500]--------------> (may receive more Seen=0,Send=6,Size=500]--------------> (no Ack necessary)
than one copy)
Notice the values of the Send field in the multicast datagrams (which
are 5 and 6, respectively). They represent the sequence numbers of the
multicast datagrams Endpoint A has sent out. Endpoint Z should use the
Send value found in the incoming multicast datagrams to detect any Notice the values of the Send field in the multicast datagrams
missing or duplicate datagrams. (which are 5 and 6, respectively). They represent the sequence numbers
of the multicast datagrams "A" has sent out. Endpoint Z should use
this value to detect missing or duplicate datagrams.
Duplicate datagrams will be discarded and no effort will be made to Duplicate datagrams will be discarded and no effort will be made to
retransmit lost multicast datagrams. retransmit lost multicast datagrams.
For example, each endpoint can track the last 32 datagrams received by D.3 Reset of the Multicast Datagram Sequence Number
using a sliding window of 32 bits. Each time a new datagram with a
sequence number higher than the current window head is received, the
window can be moved up. If a datagram received has a sequence number
below the current window head, then a check of the last 32 received
datagrams' sequence numbers can determine whether the new datagram is a
duplicate. If the sequence number of the new datagram is below the
current window tail then the datagram should be considered a duplicate
and discarded.
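The sliding-window check described above can be sketched as follows.
This is a minimal illustration in Python and is not part of the draft:
the class name McastDuplicateFilter and its accept() method are
hypothetical, and the sketch assumes that sequence numbers increase
without wrapping, which the text does not state.

```python
class McastDuplicateFilter:
    """Tracks the last 32 multicast sequence numbers seen from one sender.

    Hypothetical helper; assumes sequence numbers never wrap around.
    """

    WINDOW = 32

    def __init__(self):
        self.head = None  # highest sequence number accepted so far
        self.bits = 0     # bit i set means (head - i) was already received

    def accept(self, seq):
        """Return True if seq is new, False if it should be discarded."""
        if self.head is None:
            # First datagram from this sender: start the window here.
            self.head, self.bits = seq, 1
            return True
        if seq > self.head:
            # Above the window head: move the window up.
            shift = seq - self.head
            if shift >= self.WINDOW:
                self.bits = 1
            else:
                full = (1 << self.WINDOW) - 1
                self.bits = ((self.bits << shift) | 1) & full
            self.head = seq
            return True
        offset = self.head - seq
        if offset >= self.WINDOW:
            # Below the window tail: considered a duplicate.
            return False
        mask = 1 << offset
        if self.bits & mask:
            return False  # already recorded within the last 32
        self.bits |= mask
        return True
```

A receiver would keep one such filter per multicast sender and pass
each incoming Send value to accept() before handing the datagram to
the upper layer.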
12.4 Reset of the Multicast Datagram Sequence Number
If the Seen field in a multicast datagram is set to '1', it is an If the Seen field of a received multicast datagram equals '1', this
indication that the sender has reset its multicast datagram sequence indicates that the sender has reset its multicast datagram sequence
number. The receiving endpoint, upon detecting this reset indicator in number. The receiving endpoint, upon detecting this reset indicator in
the incoming multicast datagram, should start a procedure to adopt the the incoming multicast datagram, should start a procedure to adopt the
new sequence number for error detection. However, caution new sequence number for error detection. However, caution
should be taken to prevent false resets due to duplicated datagrams should be taken to prevent false resets due to duplicated datagrams
with reset indicator propagating through multiple networks. with reset indicator propagating through multiple networks.
To guarantee that all receivers of the multicast group adopt the new To guarantee that all receivers of the multicast group adopt the new
sequence number, the reset indicator should be repeated within the sequence number, the reset indicator should be repeated within the
first N multicast datagrams sent out after the reset. N is predefined first N multicast datagrams sent out after the reset. N is predefined
by the protocol parameter Num.Of.Mcast.Reset.Msg. by the protocol parameter 'Num.Of.Mcast.Reset.Msg'.
At the receiving endpoint, when the reset indicator is detected the At the receiving endpoint, when the reset indicator is detected the
new sequence number will be adopted. However, if two reset events are new sequence number will be adopted. However, if two reset events are
detected within a predefined time interval (Min.Mcast.Time.To.Reset), detected within a predefined time interval (Min.Mcast.Time.To.Reset),
the second reset indicator will be ignored. the second reset indicator will be ignored.
The suggested values for these two protocol parameters are:
Min.Mcast.Time.To.Reset - 5 seconds
Num.Of.Mcast.Reset.Msg - 5 messages
Appendix E: Unreliable Delivery
This defines the support for sending Unreliable datagrams in MDTP. It
is optional for implementation.
The unreliable transfer mode allows two endpoints to send to each
other without acknowledging receipt. This can usually achieve
higher data throughput than the reliable transfer mode. To indicate
the unreliable transfer mode the sender of a datagram with user data
simply sets the UNR flag to 1. The following sequence illustrates
unreliable data transfer.
Endpoint A Endpoint Z
{App sends 2 messages}
[Header Flags=UNR|DAT|ACK
Part=0,Of=1
Seen=0,Send=4,Size=100]-------->
[Header Flags=UNR|DAT|ACK
Part=0,Of=1
Seen=0,Send=5,Size=100]-------->
{App sends 1 message}
<------- [Header Flags=UNR|DAT|ACK
Part=0,Of=1
Seen=5,Send=1,Size=450]
...
{App sends 2 more messages}
[Header Flags=UNR|DAT|ACK
Part=0,Of=1
Seen=1,Send=6,Size=100]------>
The following is an example (assuming Num.Of.Mcast.Reset.Msg = 4):
Endpoint A Endpoint Z
[Header Flags=DAT
Mode=BRO|UNR
Part=0,Of=1
Seen=0,Send=17859,Size=300]---------->
{reset message sequence number indicated}
[Header Flags=DAT
Mode=BRO|UNR
Part=0,Of=1
Seen=1,Send=1,Size=250]--------------> (record new sequence
number, datagram may
appear more than once)
[Header Flags=DAT
Mode=BRO|UNR
Part=0,Of=1
Seen=1,Send=2,Size=250]--------------> (may appear more than
once)
[Header Flags=DAT
Mode=BRO|UNR
Part=0,Of=1
Seen=1,Send=3,Size=500]--------------> (may appear more than
once)
[Header Flags=DAT
Mode=BRO|UNR
Part=0,Of=1
Seen=1,Send=4,Size=500]--------------> (may appear more than
once)
[Header Flags=DAT
Mode=BRO|UNR
Part=0,Of=1
Seen=0,Send=5,Size=100]--------------> (may appear more than
once)
In the above example Endpoint Z would detect the reset indicator in
the second multicast datagram and adopt the new sequence number which
is 1. Then, it would ignore the reset indicator in the subsequent three
(3) datagrams since they arrived within a very short time interval.
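One possible way for a receiver to apply this rule is sketched below
in Python. The class McastResetTracker, its method names, and the
explicit 'now' argument are assumptions made for illustration; only
the Seen=1 reset indicator and the Min.Mcast.Time.To.Reset interval
come from the text.

```python
MIN_MCAST_TIME_TO_RESET = 5.0  # seconds; the suggested value in this draft


class McastResetTracker:
    """Hypothetical per-sender state for handling multicast resets."""

    def __init__(self, min_reset_interval=MIN_MCAST_TIME_TO_RESET):
        self.min_reset_interval = min_reset_interval
        self.last_reset_at = None  # time the last reset was adopted
        self.adopted_seq = None    # sequence number adopted at that reset

    def on_datagram(self, seen, send, now):
        """Return True if this datagram caused a new sequence number
        to be adopted, False otherwise.

        'now' is a timestamp in seconds supplied by the caller.
        """
        if seen != 1:
            return False  # no reset indicator present
        if (self.last_reset_at is not None
                and now - self.last_reset_at < self.min_reset_interval):
            # Second reset indicator inside the guard interval: ignore it.
            return False
        self.last_reset_at = now
        self.adopted_seq = send
        return True
```

Replaying Seen/Send pairs 0/17859, 1/1, 1/2, 1/3, 1/4 and 0/5 a
fraction of a second apart would adopt sequence number 1 once and
ignore the three repeated reset indicators.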
13. Interface with upper level protocols
The upper level protocols (ULP) shall request services by passing
primitives to MDTP and shall receive notifications from MDTP for
various events.
The primitives and notifications described in this section should be
used as a guideline for implementing MDTP.
13.1 Init.MDTP primitive
This primitive allows MDTP to initialize its internal data structures
and allocate necessary resources for setting up its operation
environment. Note that once MDTP is initialized, ULP can communicate
directly with any other endpoints without re-invoking this primitive.
Mandatory attributes:
None.
Optional attributes:
The following types of attributes may be passed along with
the primitive:
o Timer selection and its operation syntax -- to indicate to MDTP
an alternative timer the MDTP should use for its operation.
o Initial MDTP operation mode;
o IP port number, if ULP wants it to be specified;
13.2 Send.Data primitive
This is the main method to send datagrams via MDTP.
Mandatory attributes:
o data - This is the payload ULP wants to transmit;
o size - The size of the payload in number of octets;
o to-address - The IP address and port number of the intended
receiver. In case of redundant networks, to-address can be any one
of the multiple IP addresses of the receiver. The network which the
datagram will actually be sent through will be determined by MDTP due
to the link rotation, unless the current mode prohibits MDTP link
rotation; in such case the datagram will be sent through the network
specified by to-address (see section 4.5).
Optional attributes:
o mode-flags - This indicates a new MDTP operation mode, taking effect
immediately including the current datagram send;
o context - optional information that will be carried in the
Send.Failure notification to the ULP if the transportation of
this datagram fails.
13.3 Receive.Data primitive
This primitive shall return the first datagram in the MDTP in-queue to
ULP, if there is one available. It may, depending on the specific
implementation, also return other information such as the sender's
address, whether there are more datagrams available for retrieval,
etc. The behavior is undefined if no datagram is available when this
primitive is invoked.
Mandatory attributes:
o buffer - the memory location indicated by the ULP to store the
received datagram and other information.
Optional attributes:
None.
13.4 Data.Arrive notification
MDTP shall invoke this notification on the ULP when a datagram is
successfully received and ready for retrieval.
13.5 Send.Failure notification
If a datagram cannot be delivered, MDTP shall invoke this notification
on the ULP.
The following may be optionally passed with the notification:
o data - the location where the ULP can find the undelivered datagram.
o context - optional information associated with this datagram (see
13.2).
13.6 Link.Status.Change notification
When a link is marked down (e.g., when MDTP detects a link failure),
or marked up (e.g., when MDTP detects a link recovery), MDTP shall
invoke this notification on the ULP.
The following shall be passed with the notification:
o link-address - This indicates the IP address of the affected link;
o new-status - This indicates the new status of the link;
13.7 Communication.Lost notification
When MDTP loses communication to an endpoint completely or detects
that the endpoint has performed a shut-down operation, it shall invoke
this notification on the ULP.
The following shall be passed with the notification:
o status - This indicates what type of event has occurred;
o endpoint-id - The IP address and port number to identify the
endpoint;
o packets-enqueue - The number and location of un-sent datagrams
still held by MDTP;
o last-acked - the sequence number last acked by that peer endpoint;
o last-sent - the sequence number last sent to that peer endpoint;
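To make the shape of this interface more concrete, the sketch below
shows how the Init.MDTP, Send.Data and Receive.Data primitives and the
Data.Arrive notification might look to a ULP written in Python. All
names here (LoopbackMdtp, init_mdtp, send_data, receive_data) are
hypothetical, and the class only loops datagrams back locally; it is
not an MDTP implementation.

```python
from collections import deque
from typing import Callable, Optional, Tuple

Address = Tuple[str, int]  # (IP address, port number)


class LoopbackMdtp:
    """Hypothetical stand-in for MDTP that loops datagrams back locally."""

    def __init__(self, on_data_arrive: Callable[[], None]):
        self._on_data_arrive = on_data_arrive  # Data.Arrive notification
        self._in_queue = deque()
        self._initialized = False
        self._port: Optional[int] = None

    def init_mdtp(self, port: Optional[int] = None) -> None:
        """Init.MDTP: no mandatory attributes; the port is optional."""
        self._port = port
        self._initialized = True

    def send_data(self, data: bytes, size: int, to_address: Address,
                  context: object = None) -> None:
        """Send.Data: data, size and to-address are mandatory.

        context would accompany a Send.Failure notification; this
        sketch never fails a send, so it is unused.
        """
        if not self._initialized:
            raise RuntimeError("Init.MDTP has not been invoked")
        # A real implementation would transmit; here we loop back.
        self._in_queue.append((to_address, data[:size]))
        self._on_data_arrive()

    def receive_data(self):
        """Receive.Data: return (peer, payload, more_available) or None."""
        if not self._in_queue:
            # The text leaves this case undefined; returning None is a
            # choice made for this sketch only.
            return None
        peer, payload = self._in_queue.popleft()
        return peer, payload, bool(self._in_queue)
```

A ULP would typically call receive_data() from within, or shortly
after, its Data.Arrive callback, so that the undefined empty-queue
case is never reached.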
14. Suggested timer and MTU values.
The following are suggested timer values for MDTP:
T1-init Timer - 160 ms
T2-receive Timer - 20 ms
T3-send Timer - 160 ms
T4-bundle Timer - 40 ms
T5-Heart Beat - 4000 ms
The following protocol parameters are recommended:
Min.Bundle - 1000 octets
Max.Bundle - 1432 octets
Max.Retransmit - 10 attempts
Max.Init.Retransmit - 8 attempts
Min.Mcast.Time.To.Reset - 5 seconds
Num.Of.Mcast.Reset.Msg - 5 messages
15. Acknowledgments
The authors wish to thank Brian Wyld, Sankar A, Henry Houh, Gary
Lehecka, Ken Morneault, Lyndon Ong, and others for their very valuable
comments.
16. Authors' Addresses
Randall R. Stewart Tel: +1-847-632-7438
Cellular Infrastructure Group EMail: stewrtrs@cig.mot.com
Motorola, Inc.
1475 W. Shure Drive, #2C-6
Arlington Heights, IL 60004
USA
Qiaobing Xie Tel: +1-847-632-3028
Cellular Infrastructure Group EMail: xieqb@cig.mot.com
Motorola, Inc.
1501 W. Shure Drive, #2309
Arlington Heights, IL 60004
USA
17. References
[1] Postel, J. (ed.), "Internet Protocol - DARPA Internet Program
Protocol Specification", RFC 791, USC/Information Sciences Institute,
September 1981.
[2] Postel, J., "User Datagram Protocol", RFC 768, USC/Information Sciences
Institute, August 1980.
[3] Postel, J. (ed.), "Transmission Control Protocol", RFC 793, USC/
Information Sciences Institute, September 1981.
[Header Flags=UNR|DAT|ACK
Part=0,Of=1
Seen=451,Send=7,Size=100]------>
Note that no timers shall be started by either end, and that even
though both ends are in Unreliable transfer mode, the ACK flag is
still set by the sender of the datagram. This means that the Seen
field in the datagram header is still valid for indicating the sequence
number of the last datagram received by the sender. The upper layer
can use this information to help detect missing or duplicated
datagrams. However, MDTP shall make no effort to detect or retransmit
missing data or to screen out duplicated datagrams.
E.1 Ordered Unreliable Delivery
In unreliable transfer, the sender should be allowed to request
ordered delivery by setting the RE1 flag to 1.
When Ordered Unreliable Delivery is indicated, the receiver shall
order the newly arrived datagram with any datagrams it has received
but not yet passed to its upper layer.
If it receives a datagram which is older than the last datagram it has
passed to the upper layer, that datagram shall be silently discarded.
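The receiver behavior for Ordered Unreliable Delivery could be
sketched as follows. This Python fragment is illustrative only: the
class OrderedUnreliableReceiver and its methods are hypothetical, a
datagram is treated as "older" when its Send value is lower than that
of the last datagram delivered, and sequence number wrap-around is not
handled.

```python
import heapq


class OrderedUnreliableReceiver:
    """Hypothetical receiver-side queue for datagrams with RE1 set."""

    def __init__(self):
        self._pending = []           # min-heap of (send_seq, payload)
        self._last_delivered = None  # Send value last passed to the ULP

    def on_datagram(self, send_seq, payload):
        """Queue a datagram; return False if it was silently discarded."""
        if (self._last_delivered is not None
                and send_seq < self._last_delivered):
            return False  # older than what the upper layer already has
        # Order the new arrival among datagrams not yet passed upward.
        heapq.heappush(self._pending, (send_seq, payload))
        return True

    def deliver_next(self):
        """Pass the oldest pending datagram to the upper layer, or None."""
        if not self._pending:
            return None
        send_seq, payload = heapq.heappop(self._pending)
        self._last_delivered = send_seq
        return payload
```

Duplicates are deliberately not filtered here, since the text says
MDTP makes no effort to screen out duplicated datagrams in unreliable
mode.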
This Internet Draft expires in 6 months from April 1999. This Internet Draft expires in 6 months from April 1999.