Internet Engineering Task Force          Audio-Video Transport Working Group
INTERNET-DRAFT                                                H. Schulzrinne
                                                      AT&T Bell Laboratories
                                                           December 15, 1992
                                                            Expires:  5/1/93

   Sample Profile for the Use of RTP for Audio and Video Conferences with
                              Minimal Control

     This  note  describes  a  profile for  the  use  of the  real-time
    transport protocol  (RTP) within  audio and  video multiparticipant
    conferences with minimal  control.  It  provides interpretations of
    generic fields within the RTP  specification suitable for audio and
    video conferences.   In particular, this document  defines a set of
    default mappings from content index to encodings.

1 Introduction

This profile defines  aspects of RTP  left unspecified in  the RTP  protocol
definition.  It is intended for use within audio and video conferences with
minimal session control.  In particular, no support for the negotiation of
parameters or for admission control is provided.  Other
profiles may  make different  choices for  the items  specified here.    The
profile specifies the use of RTP over  unicast and multicast UDP as well  as
ST-II. For unicast UDP and ST-II,  references to multicast addresses are  to
be ignored.  The use of this profile is indicated by the use of a well-known
port number.

2 Multiplexing and Demultiplexing

Packets sharing the same multicast group address, the same destination  port
number and the  same flow value  belong to the  same conference.   Within  a
conference, a packet is mapped to a site (state) through its synchronization
address and network source port.
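
The two-level lookup described above can be sketched as follows.  This is
only an illustration, not part of the profile:  the class and field names
are hypothetical, and the addresses and port numbers in the usage note are
made-up values.

```python
from typing import Dict, NamedTuple


class ConferenceKey(NamedTuple):
    """Identifies a conference (names are illustrative assumptions)."""
    group_address: str   # multicast group address
    dest_port: int       # destination port number
    flow: int            # flow value of the packet


class SiteKey(NamedTuple):
    """Identifies a site within a conference."""
    sync_address: str    # synchronization address
    source_port: int     # network source port


class Demultiplexer:
    def __init__(self) -> None:
        # conference -> (site -> per-site state)
        self.conferences: Dict[ConferenceKey, Dict[SiteKey, dict]] = {}

    def site_state(self, conference: ConferenceKey, site: SiteKey) -> dict:
        """Return the state for a site, creating it on first use."""
        sites = self.conferences.setdefault(conference, {})
        return sites.setdefault(site, {"packets": 0})
```

A receiver would call site_state() once per arriving packet, for example
with ConferenceKey("224.2.0.1", 5004, 0) and SiteKey("10.0.0.1", 4000), and
update the returned state.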


The content  field within  the  CDESC option  describes the  media  encoding
used.  The four octets contain one of the encodings defined by the  Internet
Assigned Numbers  Authority (IANA)  or  an encoding  agreed upon  by  mutual
consent of all conference participants.  The names are defined in Figures  1
and 2 and encoded in US-ASCII. Case is significant.  If the name  is shorter
than four characters, it is padded with one or more space characters  (ASCII
32 decimal).
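
The padding rule can be illustrated with a short sketch.  The function
names encode_name and decode_name are hypothetical and chosen for this
example only.

```python
def encode_name(name: str) -> bytes:
    """Return the four-octet, space-padded US-ASCII form of a name."""
    raw = name.encode("ascii")          # raises for non-ASCII names
    if not 1 <= len(raw) <= 4:
        raise ValueError("encoding name must be one to four characters")
    return raw.ljust(4, b" ")           # pad with ASCII 32 (space)


def decode_name(field: bytes) -> str:
    """Strip the trailing space padding from a four-octet name field."""
    if len(field) != 4:
        raise ValueError("name field must be exactly four octets")
    return field.decode("ascii").rstrip(" ")
```

Because case is significant, encode_name("dvc") and encode_name("DVC")
yield different fields and name different encodings.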

The encodings are identified as follows:

Bolt: refers to the proprietary Bolter video coding algorithm.

dvc: the BBN video coding algorithm.

DVI: refers  to  the  Intel  DVI/ADPCM  audio  encoding,  specified  in  the
    ``Recommended  Practices  for   Enhancing  Digital  Audio   Compatibility
    in  Multimedia  Systems'',  published  by  the   Interactive  Multimedia
    Association (IMA), Annapolis, MD.

1016: refers to  the Federal Standard 1016,  which uses code-excited  linear
    prediction (CELP).

G721: refers to the ADPCM encoding defined by CCITT Recommendation  G.721 at
    a rate of 32 kb/s.

G723: refers to the ADPCM encoding defined by CCITT Recommendation  G.723 at
    a rate of 24 kb/s.

G722: is defined in  CCITT Recommendation G.722 and denotes a subband  coded
    ADPCM algorithm with an audio bandwidth of 7 kHz.

GSM: denotes  the European  GSM  06.10  provisional standard  for  full-rate
    speech transcoding, prI-ETS 300 036, based on residual  pulse excitation
    with long term prediction (RPE/LTP).

H261: refers to CCITT  Recommendation H.261 and defines a video codec  based
    on discrete-cosine transforms.

nv: Xerox Parc video coding algorithm.

PCMU: is  a subset  of CCITT  Recommendation G.711,  referring  to a  mu-law
    companded PCM encoding.

PCMA: is  a subset  of CCITT  Recommendation G.711,  referring  to an  A-law
    companded PCM encoding.

  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 |F|    CDESC    |    length     |0|0|  content  | MBZ           |
 | return port number            | clock quality | MBZ           |
 | name of encoding                                              |
 |   channels    | sampling rate (Hz)                            |
...  encoding specific parameters                               ...

                         Figure 1:  CDESC for Audio

For audio encodings, the name of the encoding is followed by a one-octet
channel count and a three-octet sampling rate field, measured in samples
per second.(1)  A channel count of zero is invalid.
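
As an illustration of the layout in Figure 1, the following sketch packs
and unpacks the four fixed 32-bit words of an audio CDESC option.  Several
points are assumptions that this document does not settle:  the numeric
option type is passed in by the caller, the length field is taken to count
32-bit words, network byte order is used, and the F bit is treated as an
opaque flag.  Encoding-specific parameters are omitted.

```python
import struct

# Layout from Figure 1; "!" selects network byte order without padding.
_AUDIO_CDESC = struct.Struct("!BBBBHBB4sI")


def pack_audio_cdesc(option_type, content, return_port, clock_quality,
                     name, channels, sampling_rate, final=False):
    """Build the four fixed words of an audio CDESC option (sketch)."""
    if not 0 <= option_type <= 127:
        raise ValueError("option type must fit in 7 bits")
    if not 32 <= content <= 63:
        raise ValueError("CDESC may only define content values 32..63")
    if not 1 <= channels <= 255:
        raise ValueError("channel count must be 1..255; zero is invalid")
    if not 0 <= sampling_rate < (1 << 24):
        raise ValueError("sampling rate must fit in 24 bits")
    raw = name.encode("ascii")
    if not 1 <= len(raw) <= 4:
        raise ValueError("encoding name must be one to four characters")
    length_words = _AUDIO_CDESC.size // 4   # ASSUMPTION: 32-bit word units
    return _AUDIO_CDESC.pack(
        (int(final) << 7) | option_type,    # F bit, then 7-bit option type
        length_words,
        content,                            # two zero bits, 6-bit content
        0,                                  # MBZ
        return_port,
        clock_quality,
        0,                                  # MBZ
        raw.ljust(4, b" "),                 # space-padded encoding name
        (channels << 24) | sampling_rate,   # 8-bit count, 24-bit rate in Hz
    )


def unpack_audio_cdesc(data):
    """Parse the fixed part of an audio CDESC option into a dict."""
    (first, length, content, _mbz1, return_port, clock_quality, _mbz2,
     name_field, chan_rate) = _AUDIO_CDESC.unpack(data[:_AUDIO_CDESC.size])
    return {
        "final": bool(first >> 7),
        "option_type": first & 0x7F,
        "length": length,
        "content": content & 0x3F,
        "return_port": return_port,
        "clock_quality": clock_quality,
        "name": name_field.decode("ascii").rstrip(" "),
        "channels": chan_rate >> 24,
        "sampling_rate": chan_rate & 0xFFFFFF,
    }
```

The packer rejects a channel count of zero and content values below 32,
following the rules stated in this profile.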

For  video  encodings,  a  one-octet  numeric  version  identifier   further
describes the encoding.

 1. Specifying fractional samples per second was considered excessive, as the
typical crystal accuracy of 100 ppm translates into about one Hz or more of
sampling rate inaccuracy.

  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 |F|    CDESC    |    length     |0|0|  content  | MBZ           |
 | return port number            | clock quality | MBZ           |
 | name of encoding                                              |
 |   version     | encoding-specific parameters                  |
... encoding-specific parameters                                ...

                         Figure 2:  CDESC for Video

4 Standard Encodings

Unless specified with  the CDESC  option,  the mapping  between the  content
field in an RTP packet and  encodings, sampling rates and channel counts  is
specified by Tables 1 and 2.  Values of 31 and below cannot be  redefined by
CDESC options.  In other words, only values of 32 and above are valid in the
content field within a CDESC option.   The receiver is expected to  discard
RTP packets containing media data with unknown content field values.   Sites
are expected to keep the mapping  between content and encoding constant,  so
that lost  packets containing  CDESC options  do not  lead the  receiver  to
misinterpret media data.

             index  encoding  sampling rate (kHz)  channels
                 0  PCMU      8                           1
                 1  1016      8                           1
                 2  G721      8                           1
                 3  GSM       8                           1
                 4  G723      8                           1
                 5  DVI       8                           1
                 6  L16       16                          1

                     Table 1:  Default Audio Encodings

                             index      encoding
                                31      H261
                                30      Bolt
                                29      dvc
                                28      nv

                     Table 2:  Default Video Encodings
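
A receiver's handling of the content field might be sketched as below.
The class and method names are hypothetical.  The default entries copy
Tables 1 and 2, with sampling rates kept exactly as listed in Table 1, and
the upper bound of 63 follows from the six-bit content field shown in the
figures.

```python
from typing import Dict, NamedTuple, Optional


class Encoding(NamedTuple):
    name: str
    sampling_rate: Optional[int] = None   # as listed in Table 1
    channels: Optional[int] = None


# Fixed defaults from Tables 1 and 2 (values 0..31 cannot be redefined).
DEFAULTS: Dict[int, Encoding] = {
    0: Encoding("PCMU", 8, 1),
    1: Encoding("1016", 8, 1),
    2: Encoding("G721", 8, 1),
    3: Encoding("GSM", 8, 1),
    4: Encoding("G723", 8, 1),
    5: Encoding("DVI", 8, 1),
    6: Encoding("L16", 16, 1),
    28: Encoding("nv"),
    29: Encoding("dvc"),
    30: Encoding("Bolt"),
    31: Encoding("H261"),
}


class ContentMap:
    """Per-site mapping from content field value to encoding (sketch)."""

    def __init__(self) -> None:
        self.dynamic: Dict[int, Encoding] = {}

    def define(self, content: int, encoding: Encoding) -> None:
        """Record a mapping announced in a CDESC option."""
        if not 32 <= content <= 63:
            raise ValueError("CDESC may only define content values 32..63")
        self.dynamic[content] = encoding

    def lookup(self, content: int) -> Optional[Encoding]:
        """Return the encoding, or None if the packet should be discarded."""
        if content in DEFAULTS:
            return DEFAULTS[content]
        return self.dynamic.get(content)
```

A result of None from lookup() corresponds to an unknown content value,
for which the receiver is expected to discard the packet.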

5 Port Assignments and Miscellaneous

UDP port [TBD] is to be used as the destination for multicast real-time data
carried by RTP.  Unicast connections may use this port or a set of mutually
agreed-upon port numbers.  ST-II connections use port 3456.

The framing  field is  to be  used only  when RTP  protocol data  units  are
carried over a network or transport  protocol that does not provide  framing
(e.g., TCP).

6 Address of Author

Henning Schulzrinne
AT&T Bell Laboratories
MH 2A244
600 Mountain Avenue
Murray Hill, NJ 07974
telephone:  908 582-2262
electronic mail:  hgs@research.att.com

