Network Working Group                                           Kutscher
Internet-Draft                                                       Ott
Expires: January 18, 2002                                        Bormann
                                                TZI, Universitaet Bremen
                                                           July 20, 2001

             Session Description and Capability Negotiation
                     draft-ietf-mmusic-sdpng-01.txt

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on January 18, 2002.

Copyright Notice

   Copyright (C) The Internet Society (2001). All Rights Reserved.

Abstract

   This document defines a language for describing multimedia sessions
   with respect to configuration parameters and capabilities of end
   systems.

   This document is a product of the Multiparty Multimedia Session
   Control (MMUSIC) working group of the Internet Engineering Task
   Force. Comments are solicited and should be addressed to the working
   group's mailing list at confctrl@isi.edu and/or the authors.

Document Revision

   $Revision: 2.0 $

Table of Contents

   1.      Introduction . . . . . . . . . . . . . . . . . . . . . . .  4
   2.      Terminology and System Model . . . . . . . . . . . . . . .  6
   3.      SDPng  . . . . . . . . . . . . . . . . . . . . . . . . . .  9
   3.1     Conceptual Outline . . . . . . . . . . . . . . . . . . . .  9
   3.1.1   Definitions  . . . . . . . . . . . . . . . . . . . . . . .  9
   3.1.2   Components & Configurations  . . . . . . . . . . . . . . . 11
   3.1.3   Constraints  . . . . . . . . . . . . . . . . . . . . . . . 13
   3.1.4   Session Attributes . . . . . . . . . . . . . . . . . . . . 14
   3.1.4.1 Owner  . . . . . . . . . . . . . . . . . . . . . . . . . . 15
   3.1.4.2 Session Identification . . . . . . . . . . . . . . . . . . 15
   3.1.4.3 Time Specification (SDP 't=', 'r=', and 'z=' lines)  . . . 16
   3.1.4.4 Component Semantic Specification . . . . . . . . . . . . . 17
   3.2     Syntax Definition Mechanisms . . . . . . . . . . . . . . . 18
   3.3     External Definition Packages . . . . . . . . . . . . . . . 20
   3.3.1   Profile Definitions  . . . . . . . . . . . . . . . . . . . 20
   3.3.2   Library Definitions  . . . . . . . . . . . . . . . . . . . 21
   3.4     Mappings . . . . . . . . . . . . . . . . . . . . . . . . . 22
   4.      Formal Specification . . . . . . . . . . . . . . . . . . . 24
   5.      Use of SDPng in conjunction with other IETF Signaling
           Protocols  . . . . . . . . . . . . . . . . . . . . . . . . 25
   5.1     The Session Announcement Protocol (SAP)  . . . . . . . . . 25
   5.2     Session Initiation Protocol (SIP)  . . . . . . . . . . . . 26
   5.3     Real-Time Streaming Protocol (RTSP)  . . . . . . . . . . . 26
   5.4     Media Gateway Control Protocol (MEGACOP) . . . . . . . . . 27
   6.      Open Issues  . . . . . . . . . . . . . . . . . . . . . . . 28
           References . . . . . . . . . . . . . . . . . . . . . . . . 29
           Authors' Addresses . . . . . . . . . . . . . . . . . . . . 30
   A.      Base SDPng Specifications for Audio Codec Descriptions . . 31
   A.1     DVI4 . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
   A.2     G.722  . . . . . . . . . . . . . . . . . . . . . . . . . . 32
   A.3     G.726  . . . . . . . . . . . . . . . . . . . . . . . . . . 32
   A.4     G.728  . . . . . . . . . . . . . . . . . . . . . . . . . . 32
   A.5     G.729  . . . . . . . . . . . . . . . . . . . . . . . . . . 32
   A.6     G.729 Annex D and E  . . . . . . . . . . . . . . . . . . . 33
   A.7     GSM  . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
   A.7.1   GSM Full Rate  . . . . . . . . . . . . . . . . . . . . . . 33
   A.7.2   GSM Half Rate  . . . . . . . . . . . . . . . . . . . . . . 33
   A.7.3   GSM Enhanced Full Rate . . . . . . . . . . . . . . . . . . 33
   A.8     L8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
   A.9     L16  . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
   A.10    LPC  . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
   A.11    MPA  . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
   A.12    PCMA and PCMU  . . . . . . . . . . . . . . . . . . . . . . 34
   A.13    QCELP  . . . . . . . . . . . . . . . . . . . . . . . . . . 34
   A.14    VDVI . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
           Full Copyright Statement . . . . . . . . . . . . . . . . . 35

1. Introduction

   Multiparty multimedia conferencing is one of the applications that
   require dynamic interchange of end system capabilities and the
   negotiation of a parameter set that is appropriate for all sending
   and receiving end systems in a conference. For some applications,
   e.g. for loosely coupled conferences or for broadcast scenarios, it
   may be sufficient to simply have session parameters be fixed by the
   initiator of a conference. In such a scenario no negotiation is
   required because only those participants with media tools that
   support the predefined settings can join a media session and/or a
   conference.

   This approach is applicable for conferences that are announced some
   time ahead of the actual start date of the conference. Potential
   participants can check the availability of media tools in advance
   and tools like session directories can configure media tools on
   startup. This procedure however fails to work for conferences
   initiated spontaneously like Internet phone calls or ad-hoc
   multiparty conferences. Fixed settings for parameters like media
   types, their encoding etc. can easily inhibit the initiation of
   conferences, for example in situations where a caller insists on a
   fixed audio encoding that is not available at the callee's end
   system.

   To allow for spontaneous conferences, the process of defining a
   conference's parameter set must therefore be performed either at
   conference start (for closed conferences) or potentially
   even repeatedly every time a new participant joins an active
   conference. The latter approach may not be appropriate for every
   type of conference without applying certain policies: For
   conferences with TV-broadcast or lecture characteristics (one main
   active source) it is usually not desired to re-negotiate parameters
   every time a new participant with an exotic configuration joins
   because it may inconvenience existing participants or even exclude
   the main source from media sessions. Conferences with equal "rights"
   for participants that are open to new participants, on the other
   hand, require a different model of dynamic capability negotiation,
   for example a telephone call that is extended to a 3-party
   conference at some point during the session.

   SDP [2] allows the specification of multimedia sessions (i.e.
   conferences; "session" as used here is not to be confused with
   "RTP session"!)
   by providing general information about the session as a whole and
   specifications for all the media streams (RTP sessions and others)
   to be used to exchange information within the multimedia session.

   Currently, media descriptions in SDP are used for two purposes:

   o  to describe session parameters for announcements and invitations
      (the original purpose of SDP) and

   o  to describe the capabilities of a system and possibly provide a
      choice between a number of alternatives (which SDP was not
      designed for).

   A distinction between these two "sets of semantics" is only made
   implicitly.

   This document is based upon a set of requirements specified in a
   companion document [1]. In the following we first introduce a model
   for session description and capability negotiation and introduce the
   basic terms used throughout this specification (section 2). Then we
   outline the concepts underlying SDPng and introduce the syntactical
   components step by step in section 3. In section 4,
   we provide a formal definition of the SDPng session description
   language. Finally, we overview aspects of using SDPng with various
   IETF signaling protocols in section 5. In Appendix A, we introduce
   basic audio codec and payload type definitions.

2. Terminology and System Model

   Any (computer) system has, at a time, a number of rather fixed
   hardware as well as software resources. These resources ultimately
   define the limitations on what can be captured, displayed, rendered,
   replayed, etc. with this particular device. We term features enabled
   and restricted by these resources "system capabilities".

      Example: System capabilities may include: a limitation of the
      screen resolution for true color by the graphics board; available
      audio hardware or software may offer only certain media encodings
      (e.g. G.711 and G.723.1 but not GSM); and CPU processing power
      and quality of implementation may constrain the possible video
      encoding algorithms.

   In multiparty multimedia conferences, participants employ different
   "components" in conducting the conference.

      Example: In lecture multicast conferences one component might be
      the voice transmission for the lecturer, another the transmission
      of video pictures showing the lecturer and the third the
      transmission of presentation material.

   Depending on system capabilities, user preferences and other
   technical and political constraints, different configurations can be
   chosen to accomplish the "deployment" of these components.

   Each component can be characterized at least by (a) its intended use
   (i.e. the function it shall provide) and (b) one or more possible
   ways to realize this function. Each way of realizing a particular
   function is referred to as a "configuration".

      Example: A conference component's intended use may be to make
      transparencies of a presentation visible to the audience on the
      Mbone. This can be achieved either by a video camera capturing
      the image and transmitting a video stream via some video tool or
      by loading a copy of the slides into a distributed electronic
      whiteboard. For each of these cases, additional parameters may
      exist, variations of which lead to additional configurations (see
      below).

   Two configurations are considered different regardless of whether
   they employ entirely different mechanisms and protocols (as in the
   previous example) or whether they use the same mechanism and differ
   only in a single parameter.

      Example: In case of video transmission, a JPEG-based still image
      protocol may be used, H.261 encoded CIF images could be sent as
      could H.261 encoded QCIF images. All three cases constitute
      different configurations. Of course there are many more detailed
      protocol parameters.

   Each component's configurations are limited by the participating
   system's capabilities. In addition, the intended use of a component
   may constrain the possible configurations further to a subset
   suitable for the particular component's purpose.

      Example: In a system for highly interactive audio communication
      the component responsible for audio may decide not to use the
      available G.723.1 audio codec to avoid the additional latency but
      only use G.711. This would be reflected in this component only
      showing configurations based upon G.711. Still, multiple
      configurations are possible, e.g. depending on the use of A-law
      or u-Law, packetization and redundancy parameters, etc.

   In this system model, we distinguish two types of configurations:

   o  potential configurations
      (a set of any number of configurations per component) indicating
      a system's functional capabilities as constrained by the intended
      use of the various components;

   o  actual configurations
      (exactly one per instance of a component) reflecting the mode of
      operation of this component's particular instantiation.

      Example: The potential configuration of the aforementioned video
      component may indicate support for JPEG, H.261/CIF, and
      H.261/QCIF. A particular instantiation for a video conference may
      use the actual configuration of H.261/CIF for exchanging video
      streams.

   In summary, the key terms of this model are:

   o  A multimedia session (streaming or conference) consists of one or
      more conference components for multimedia "interaction".

   o  A component describes a particular type of interaction (e.g.
      audio conversation, slide presentation) that can be realized by
      means of different applications (possibly using different
      protocols).

   o  A configuration is a set of parameters that are required to
      implement a certain variation (realization) of a certain
      component. There are actual and potential configurations.

      *  Potential configurations describe possible configurations that
         are supported by an end system.

      *  An actual configuration is an "instantiation" of one of the
         potential configurations, i.e. a decision how to realize a
         certain component.

      In less abstract words, potential configurations describe what a
      system can do ("capabilities") and actual configurations describe
      how a system is configured to operate at a certain point in time
      (media stream spec).

   To decide on a certain actual configuration, a negotiation process
   needs to take place between the involved peers:

   1.  to determine which potential configuration(s) they have in
       common, and

   2.  to select one of this shared set of common potential
       configurations to be used for information exchange (e.g. based
       upon preferences, external constraints, etc.).
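   As a non-normative illustration, the two-step negotiation above can
   be sketched in a few lines of code. The function name, the plain
   string identifiers and the preference ordering are assumptions made
   for this sketch only; SDPng does not define them:

```python
# Sketch of the two-step negotiation: (1) intersect the peers'
# potential configurations, (2) select one by a preference order.
# All names here are illustrative, not defined by SDPng.

def negotiate(local_potential, remote_potential, preference):
    """Return the most preferred configuration both peers support."""
    common = [cfg for cfg in preference
              if cfg in local_potential and cfg in remote_potential]
    if not common:
        return None  # no common configuration; session cannot be set up
    return common[0]  # most preferred common configuration

# Example: caller supports PCMU and L16, callee supports PCMU and GSM
# (names follow the static payload type numbering of the RTP A/V profile).
caller = {"rtp-avp-0", "rtp-avp-11"}
callee = {"rtp-avp-0", "rtp-avp-3"}
chosen = negotiate(caller, callee, ["rtp-avp-11", "rtp-avp-0"])
```

   A real negotiation would operate on structured potential
   configurations rather than plain identifiers; the sketch only shows
   the intersect-then-select logic.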

   In SAP-based [11] session announcements on the Mbone, for which SDP
   was originally developed, the negotiation procedure is non-existent.
   Instead, the announcement contains the media stream description sent
   out (i.e. the actual configurations) which implicitly describes what
   a receiver must understand to participate.

   In point-to-point scenarios, the negotiation procedure is typically
   carried out implicitly: each party informs the other about what it
   can receive and the respective sender chooses from this set a
   configuration that it can transmit.

   Capability negotiation must not only work for 2-party conferences
   but is also required for multi-party conferences. Especially for the
   latter case it is required that the process of determining the
   subset of allowable potential configurations is deterministic to
   reduce the number of required round trips before a session can be
   established.

   The requirements for the SDPng specification, subdivided into
   general requirements and requirements for session descriptions,
   potential and actual configurations as well as negotiation rules,
   are captured in a companion document [1].

3. SDPng

   This section introduces the underlying concepts of the Session
   Description Protocol - next generation (SDPng) that is to meet most
   of the above requirements. The focus of this section is on the
   concepts of the capability description language with a stepwise
   introduction of the various syntactical elements; a full formal
   specification is provided in section 4.

3.1 Conceptual Outline

   The description language follows the system model introduced in the
   beginning of this document. We use a rather abstract language to
   avoid misinterpretations due to different intuitive understanding of
   terms as far as possible.

   The concept of a capability description language addresses various
   pieces of a full description of system and application capabilities
   in four separate "sections":

      Definitions (elementary and compound); see Section 3.1.1.

      Potential or Actual Configurations; see Section 3.1.2.

      Constraints; see Section 3.1.3.

      Session attributes; see Section 3.1.4.

3.1.1 Definitions

   The definition section specifies a number of basic abstractions that
   are later referenced to avoid repetitions in more complex
   specifications and to allow for a concise representation. Definition
   elements are labelled with an identifier by which they may be
   referenced. They may be elementary or compound (i.e. combinations of
   elementary entities). Examples of definitions in this section
   include (but are not limited to) codec definitions, redundancy
   schemes, transport mechanisms and payload formats.

   Elementary definition elements do not reference other elements. Each
   elementary entity consists only of one or more attributes and their
   values. Default values specified in the definition section may be
   overridden in descriptions for potential (and later actual)
   configurations. A mechanism for overriding definitions is specified
   below.

   For the moment, elementary elements are defined for media types
   (i.e. codecs) and for media transports. For each transport and for
   each codec to be used, the respective attributes need to be defined.

   This definition may either be provided within the "Definitions"
   section itself or in an external document (similar to the
   audio-video profile or an IANA registry) that defines payload types
   and media stream identifiers.

   It is not required to define all codecs and transport mechanisms in
   a definitions section and reference them in the definition of
   potential and actual configurations. Instead, a syntactic mechanism
   is defined that allows specifying some definitions directly in a
   configurations section.

   Examples for elementary definitions:

   <audio-codec name="audio-basic" encoding="PCMU"
                sampling="8000" channels="1"/>

   <audio-codec name="audio-L16-mono" encoding="L16"
                sampling="44100" channels="1"/>

   The element type "audio-codec" is used in these examples to define
   audio codec configurations. The configuration parameters are given
   as attribute values.

   Definitions may have default values specified along with them for
   each attribute (as well as for their contents). Some of these
   default values may be overridden so that a codec definition can
   easily be re-used in a different context (e.g. by specifying a
   different sampling rate) without the need for a large number of base
   specifications. In the following example the definition of
   audio-L16-mono is re-used for the definition of the corresponding
   stereo codec. Appendix A provides a complete set of corresponding
   audio-codec definitions for the codecs used in RFC 1890 [4].

   <audio-codec name="audio-L16-stereo" ref="audio-L16-mono"
                channels="2"/>
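   As an illustration only, the "ref" mechanism can be read as: the
   referenced definition supplies defaults, and attributes given in the
   referencing element override them. The resolution procedure sketched
   below is an assumption made for this example, not a normative part
   of SDPng:

```python
# Illustrative sketch: resolve a "ref" attribute by copying the
# referenced definition and overriding the locally given attributes.
import xml.etree.ElementTree as ET

definitions = {}

def register(fragment):
    elem = ET.fromstring(fragment)
    attrs = dict(elem.attrib)
    name = attrs.pop("name")
    ref = attrs.pop("ref", None)
    resolved = dict(definitions[ref]) if ref else {}
    resolved.update(attrs)  # local attributes override the defaults
    definitions[name] = resolved
    return resolved

register('<audio-codec name="audio-L16-mono" encoding="L16" '
         'sampling="44100" channels="1"/>')
stereo = register('<audio-codec name="audio-L16-stereo" '
                  'ref="audio-L16-mono" channels="2"/>')
# stereo now carries encoding "L16", sampling "44100", channels "2"
```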

   The example shows how existing definitions can be referenced in new
   definitions. This approach allows simple as well as more complex
   definitions that are commonly used to be available in an extensible
   set of reference documents. Section 3.3 specifies the mechanisms for
   external references.

   Besides definitions of audio codecs there will be other definitions,
   such as RTP payload formats and specific transport mechanisms, that
   are suitable to be defined in a definition section for later
   referencing. The following example shows how RTP payload types are
   defined using a pre-defined codec.

   <rtp-pt name="rtp-avp-0" pt="0" format="audio-basic"/>
   <rtp-pt name="rtp-avp-11" pt="11" format="audio-L16-mono"/>

   In this example, the payload type "rtp-avp-11" is defined with
   payload type number 11, referencing the codec "audio-L16-mono".
   Instead of referencing an existing definition it is also possible to
   define the format "inline":

   <rtp-pt name="rtp-avp-10" pt="10">
    <audio-codec encoding="L16" sampling="44100" channels="2"/>
   </rtp-pt>

   Note: For negotiation between endpoints, it may be helpful to define
   two modes of operation: explicit and implicit. Implicit
   specifications may refer to externally defined entities to minimize
   traffic volume; explicit specifications would list all external
   definitions used in a description in the "Definitions" section.
   Again, see Section 3.3 for a complete discussion of external
   definitions.

   The "Definitions" section may be empty if all transports, codecs,
   and other pieces needed to specify Potential and Actual
   Configurations (as detailed below) are either included by
   referencing external definitions or are explicitly described within
   the Configurations themselves.

3.1.2 Components & Configurations

   The "Configurations" section contains all the components that
   constitute the multimedia conference (IP telephone call, multiplayer
   gaming session, etc.). For each of these components, the potential
   and, later, the actual configurations are given. Potential
   configurations are used during capability exchange and/or
   negotiation, actual configurations to configure media streams after
   negotiation (e.g. with RTSP) or in session announcements (e.g. via
   SAP). A potential and the actual configuration of a component may be
   identical.

   Each component is labelled with an identifier so that it can be
   referenced, e.g. to associate semantics with a media stream. For
   such a component, any number of configurations may be given with
   each configuration describing an alternate way to realize the
   functionality of the respective component.

   Each configuration (potential as well as actual) is labelled with an
   identifier. A configuration combines one or more (elementary and/or
   compound) entities from the "Definitions" section to describe a
   potential or an actual configuration. Within the specification of
   the configuration, default values from the referenced entities may
   be overwritten.

   Note: Not all protocol environments and their respective operation
   allow explicitly distinguishing between Potential and Actual
   Configurations. Therefore, SDPng so far does not provide for
   syntactical identification of a Configuration as being a Potential
   or an Actual one.

   The following example shows how RTP sessions can be described by
   referencing payload definitions.

   <cfg>
     <component name="interactive-audio" media="audio">
       <alt name="AVP-audio-0">
         <rtp format="rtp-avp-0">
          <udp addr="224.2.0.53" rtp-port="7800" rtcp-port="7801"/>
         </rtp>
       </alt>

        <alt name="AVP-audio-11">
         <rtp format="rtp-avp-11">
          <udp addr="224.2.0.53" rtp-port="7800" rtcp-port="7801"/>
         </rtp>
       </alt>
      </component>
   </cfg>
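   Since such a description is well-formed XML, the example above can
   be processed with any stock XML parser. The following sketch (the
   extraction logic is an assumption of this illustration, not part of
   SDPng) lists each component together with its alternative
   configurations:

```python
# Sketch only: enumerate each component's alternative configurations
# from a <cfg> fragment, using Python's standard-library XML parser.
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<cfg>
  <component name="interactive-audio" media="audio">
    <alt name="AVP-audio-0">
      <rtp format="rtp-avp-0">
        <udp addr="224.2.0.53" rtp-port="7800" rtcp-port="7801"/>
      </rtp>
    </alt>
    <alt name="AVP-audio-11">
      <rtp format="rtp-avp-11">
        <udp addr="224.2.0.53" rtp-port="7800" rtcp-port="7801"/>
      </rtp>
    </alt>
  </component>
</cfg>
""")

alternatives = {comp.get("name"): [alt.get("name")
                                   for alt in comp.findall("alt")]
                for comp in doc.findall("component")}
```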

   For example, an IP telephone call may require just a single
   component "name=interactive-audio" with two possible ways of
   implementing it. The two corresponding configurations are
   "AVP-audio-0", which uses PCMU audio without modification, and
   "AVP-audio-11", which uses linear 16-bit encoding. Typically,
   transport address parameters such as the port number would also be
   provided. In this example, this information is given by the "udp"
   element. Of course, it must be possible to specify other transport
   mechanisms as well. See Section 3.2 for a discussion of extension
   mechanisms that allow applications to use non-standard transport
   (or other) specifications.

   During/after the negotiation phase, an actual configuration is
   chosen out of a number of alternative potential configurations;
   the actual configuration may refer to the potential configuration
   just by its "id", possibly allowing for some parameter
   modifications. Alternatively, the full actual configuration may be
   given.

   Instead of referencing existing payload type definitions it is
   also possible to provide the required information "inline". The
   following example illustrates this:

   <cfg>
     <component name="audio1" media="audio">
       <alt name="AVP-audio-0">
         <rtp>
          <rtp-pt pt="0">
            <audio-codec name="audio-basic" encoding="PCMU"
                         sampling="8000" channels="1"/>
          </rtp-pt>
          <udp addr="224.2.0.53" rtp-port="7800" rtcp-port="7801"/>
         </rtp>
       </alt>
      </component>
   </cfg>
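To illustrate how such alternatives might be used during negotiation (an illustration only; the data model below is invented and is not SDPng syntax), a peer can intersect the configuration identifiers it supports with those offered and pick one by preference:

```python
# Sketch: choosing an actual configuration from potential ones.
# Hypothetical data model: each peer advertises, per component,
# the set of alternative configurations ("alt" names) it supports.
def choose_configuration(offered, supported, preference):
    """Intersect the alternatives of two peers and pick one.

    offered, supported: sets of configuration ids (e.g. "AVP-audio-0")
    preference: ordered list, most preferred first
    """
    common = offered & supported
    for cfg in preference:
        if cfg in common:
            return cfg
    return None  # no shared configuration -> negotiation fails

chosen = choose_configuration(
    offered={"AVP-audio-0", "AVP-audio-11"},
    supported={"AVP-audio-0"},
    preference=["AVP-audio-11", "AVP-audio-0"])
# chosen == "AVP-audio-0"
```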

   The UDP/IPv4 multicast transport that is used in the examples is a
   simple variant of a transport specification. More complex ones are
   conceivable. For example, it could be possible to specify the
   usage of source filters (inclusion and exclusion), Source Specific
   Multicast, the usage of multi-unicast, or other parameters.
   Therefore it is possible to extend the definition of transport
   mechanisms by providing the required information in the element
   content. An example:

   <cfg>
     <component name="audio1" media="audio">
       <alt name="AVP-audio-0">
         <rtp format="rtp-avp-0">
          <udp addr="224.2.0.53" rtp-port="7800" rtcp-port="7801">
           <option name="ssm" sender="sender.example.com"/>
          </udp>
         </rtp>
       </alt>
      </component>
   </cfg>

   More transport mechanisms and options will be defined in future
   versions of this document.

3.1.3 Constraints

   Definitions specify media, transport, and other capabilities,
   whereas configurations indicate which combinations of these could
   be used to provide the desired functionality in a certain setting.

   There may, however, be further constraints within a system (such
   as CPU cycles, DSP available, dedicated hardware, etc.) that limit
   which of these configurations can be instantiated in parallel (and
   how many instances of these may exist). We deliberately do not
   couple this aspect of system resource limitations to the various
   application semantics as the constraints exist across application
   boundaries. Also, in many cases, expressing such constraints is
   simply not necessary (as many uses of the current SDP show), so
   additional overhead can be avoided where this is not needed.

   Therefore, we introduce a "Constraints" section to contain these
   additional limitations. Constraints refer to potential and actual
   configurations as well as to entity definitions and use simple
   logic to express mutual exclusion, limit the number of
   instantiations, and allow only certain combinations. The following
   example shows the definition of a constraint that restricts the
   maximum number of instantiations of two alternatives (that would
   have to be defined in the configuration section before) when they
   are used in parallel:

   <constraints>
     <par>
       <use-alt ref="AVP-audio-11" max="5"/>
       <use-alt ref="AVP-video-32" max="1"/>
     </par>
   </constraints>

   As the example shows, constraints are defined by defining limits
   on simultaneous instantiations of alternatives. They are not
   defined by expressing abstract endsystem resources, such as CPU
   speed or memory size.

   By default, the "Constraints" section is empty (or missing) which
   means that no further restrictions apply.
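   A receiver of a description could enforce such limits as sketched
   below; the dictionary model of the "use-alt" limits is an invented
   illustration, not part of the SDPng syntax:

```python
# Sketch: enforcing a "Constraints" section. The limits taken from
# <use-alt ref="..." max="..."/> become a dict, and a proposed set
# of simultaneous instantiations is checked against it.
from collections import Counter

def satisfies_constraints(instantiations, limits):
    """instantiations: list of alternative ids currently in use;
    limits: {alternative id: maximum simultaneous instances}."""
    counts = Counter(instantiations)
    return all(counts[ref] <= max_n for ref, max_n in limits.items())

limits = {"AVP-audio-11": 5, "AVP-video-32": 1}
ok = satisfies_constraints(["AVP-audio-11"] * 3 + ["AVP-video-32"], limits)
too_many = satisfies_constraints(["AVP-video-32"] * 2, limits)
# ok is True, too_many is False
```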

3.1.4 Session Attributes

   The fourth and final section of the SDPng syntax addresses session
   layer attributes. These attributes largely include those defined
   by SDP [RFC2327] (which are explicitly indicated in the following
   specification) to describe originator, purpose, and timing of a
   multimedia session among other characteristics. Furthermore, SDPng
   includes attributes indicating the semantics of the various
   Components in a teleconference or other session. This part of the
   specification is open ended with an IANA registry to be set up to
   register further types of components; only a few examples are
   listed here.

   A session-level specification for connection information (SDP "c="
   line), bandwidth information (SDP "b=" line), and encryption keys
   (SDP "k=" line) is deliberately not provided for in SDPng.

   Session-level attributes as defined by SDP still have to be
   examined and adopted for SDPng in a future revision of this
   specification.

3.1.4.1 Owner

   The owner refers to the creator of a session as defined in RFC2327
   ("o=" line). The syntax is as follows:

   <owner user="username" id="session-id" version="version" nettype="IN"
                        addrtype="IP4" addr="130.149.25.97"/>

   The owner field MUST be present if SDPng is used with SAP. For all
   other protocols, the owner field MAY be specified. The attributes
   listed above match those from the SDP specification; all
   attributes MUST be present and they MUST be created following the
   rules of RFC2327.

   Note: There are several possible ways ahead on this part: "owner"
   could stand as it is right now, but the various values of the
   various attributes could be concatenated (separated by blanks),
   the result being identical to the contents of the SDP "o=" line --
   which then could be represented as either a single attribute or as
   contents of the "owner" element. Alternatively, the owner element
   could become part of the "session" element described below. Or the
   contents of the owner element could become an attribute of the
   "session" element below.
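   The concatenation alternative mentioned in the note can be
   sketched as follows (illustration only; the attribute set is taken
   from the example above):

```python
# Sketch: concatenating the "owner" attributes (separated by blanks)
# into the equivalent of an SDP "o=" line, per the note above.
def owner_to_o_line(owner: dict) -> str:
    keys = ("user", "id", "version", "nettype", "addrtype", "addr")
    return " ".join(owner[k] for k in keys)

o = owner_to_o_line({"user": "username", "id": "session-id",
                     "version": "version", "nettype": "IN",
                     "addrtype": "IP4", "addr": "130.149.25.97"})
# o == "username session-id version IN IP4 130.149.25.97"
```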

3.1.4.2 Session Identification

   The "session" element is used to identify the session and to
   provide a description and possible further references. The
   following attributes are defined:

   name: The session name as it is to appear e.g. in a session
      directory. This is equivalent to the SDP "s=" line. This
      attribute MUST be present.

   info: A pointer to further information about the session; this
      attribute MUST contain a URI. The attribute itself is OPTIONAL.

   The session element MAY contain arbitrary text of any length (but
   authors are encouraged to keep the inline description brief and
   provide additional information via URLs). This text is used to
   provide a description of the session; it is the equivalent of the
   SDP "i=" lines.

   Furthermore, the session element MAY contain other elements of the
   following types to provide further information about the session
   and its creator:

   info: The info element is intended to provide a pointer to further
      information on the session itself. Its contents MUST be exactly
      one URI. If both the info attribute and one or more info
      elements are present, the union of the respective values is
      used. Info elements are OPTIONAL, they MAY be repeated any
      number of times.

   contact: The contact element provides contact information on the
      creator of the session; its contents MUST be exactly one URI.
      Any URI scheme suitable to reach a person or a group of persons
      is acceptable (e.g. sip:, mailto:, tel:). Contact elements are
      OPTIONAL, they MAY be repeated any number of times.

   <session name="An SDPng seminar" info="http://www.dmn.tzi.org/ietf/mmusic/">
       And here comes a long description of the seminar indicating what
       this might be about and so forth. But we also include further
       information -- as additional elements:
       <info>http://www.ietf.org/</info>
       <contact>mailto:joe@example.com</contact>
       <contact>mailto:bob@example.com</contact>
       <contact>tel:+49421281</contact>
       <contact>sip:joe@example.com</contact>
       <contact>sip:bob@example.com</contact>
   </session>
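   The union rule for the "info" attribute and "info" elements
   described above can be sketched as follows (illustration only,
   using a trimmed version of the example):

```python
# Sketch: gathering session information pointers. Per the rules
# above, the union of the "info" attribute and the contents of any
# "info" elements is used. Parsed with ElementTree for illustration.
import xml.etree.ElementTree as ET

doc = """<session name="An SDPng seminar"
                  info="http://www.dmn.tzi.org/ietf/mmusic/">
  <info>http://www.ietf.org/</info>
  <contact>mailto:joe@example.com</contact>
</session>"""

session = ET.fromstring(doc)
info = set()
if session.get("info"):
    info.add(session.get("info"))
info.update(e.text for e in session.findall("info"))
contacts = [e.text for e in session.findall("contact")]
# info holds both URIs; contacts holds the single mailto: URI
```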

3.1.4.3 Time Specification (SDP 't=', 'r=', and 'z=' lines)

   The time specification for a session follows the same rules as in
   SDP. Time specifications are usually only meaningful when used in
   conjunction with SAP and hence are OPTIONAL. SDPng uses the
   following elements and attributes to specify timing:

   The element "time" is used to indicate a schedule for the session;
   time has two optional attributes:

   start: The starting time of the first occurrence of the session as
      defined in RFC2327.

   end: The ending time of the last occurrence of the session as
      defined in RFC2327.

   The time element MAY contain the following elements but otherwise
   MUST be empty:

   repeat: This element specifies the repetition pattern for the
      schedule. There MAY be zero or more occurrences of this element
      within the time element. "repeat" has two MANDATORY and one
      OPTIONAL attribute and no further contents; the attributes are
      as defined in SDP:

      interval: The duration between two start times of the session.
         This attribute MUST be present.

      duration: The duration for which the session will be active
         starting at each repetition interval. This attribute MUST be
         present.

      offset: The offset relative to the "start" attribute at which
         this repetition of the session is to start. This attribute
         is OPTIONAL; if it is absent, a default value of "0" is
         assumed.

      Formatting of the attribute values MUST follow the rules
      defined in RFC2327.

   zone: The zone element specifies time zone adjustments as defined
      in RFC2327. This element MAY have zero or more occurrences in
      the time element. It has two attributes as defined in SDP:

      adjtime: The time at which the next adjustment will take place.

      delta: The adjustment offset (typically +/- 1 hour).

   The example from RFC2327, page 16, expressed in SDPng:

    <time start="3034423619" end="3042462419">
     <repeat interval="7d" duration="1h"/>
     <repeat interval="7d" duration="1h" offset="25h"/>
   </time>
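   Since the attribute values follow the RFC2327 formatting rules,
   the compact duration notation ("7d", "1h", "25h") and the repeat
   schedule can be expanded as sketched below (illustration only):

```python
# Sketch: expanding a repeat specification. Attribute values follow
# the SDP/RFC2327 typed-time shorthand (d=days, h=hours, m=minutes,
# s=seconds); start/end are NTP timestamps as in the example above.
UNITS = {"d": 86400, "h": 3600, "m": 60, "s": 1}

def seconds(value):
    """Convert "7d", "1h", "25h" or a bare number to seconds."""
    if value[-1] in UNITS:
        return int(value[:-1]) * UNITS[value[-1]]
    return int(value)

def occurrences(start, end, interval, offset="0"):
    """Start times of the repetitions within [start, end]."""
    t = int(start) + seconds(offset)
    step = seconds(interval)
    result = []
    while t <= int(end):
        result.append(t)
        t += step
    return result

weekly = occurrences("3034423619", "3042462419", "7d")
# one start time per week between the start and end timestamps
```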

3.1.4.4 Component Semantic Specification

   Another important session parameter is to specify - ideally in a
   machine-readable way but at least understandable for humans - the
   function of the various components in a session. Typically, the
   semantics of the streams are implicitly assumed (e.g. a video
   stream goes together with the only audio stream in a session).
   There are, however, scenarios in which such intuitive
   understanding is not sufficient and the semantics must be made
   explicit.

    <info name="interactive-audio" function="speaker">
        Audio stream for the different speakers
    </info>

   The above example shows a simple definition of the semantics for a
   component "interactive-audio". Further options may be added to
   provide additional information, e.g. language, and other functions
   may be specified (e.g. "panel", "audience", "chair", etc.).

3.2 Syntax Definition Mechanisms

   In order to provide allow for the desired functionality possibility to validate session
   descriptions and in order to allow for structured extensibility it
   is proposed to rely on a certain setting.

   There may, however, be further constraints within a system (such syntax framework that provides concepts as
   CPU cycles, DSP available, dedicated hardware, etc.)
   well as concrete procedures for document validation and extending
   the set of allowed syntax elements.

   SGML/XML technologies allow for the preparation of Document Type
   Definitions (DTDs) that limit
   which can define the allowed content models for
   the elements of these configurations conforming documents. Documents can be instantiated in parallel (and
   how many instances of these may exist). We deliberately do formally
   validated against a given DTD to check their conformance and
   correctness. XML DTDs however, cannot easily be extended. It is not
   couple this aspect
   possible to alter to content models of system resource limitations element types or to add new
   element types after the various
   application semantics as the constraints exist across application
   boundaries. Also, in many cases, expressing such constraints DTD has been specified.

   For SDPng a mechanism is
   simply not necessary (as many uses needed that allows the specification of a
   base syntax -- for example basic elements for the current SDP show), so
   additional overhead can high level
   structure of description documents -- while allowing extensions, for
   example elements and attributes for new transport mechanisms, new
   media types etc. to added on demand. Still, it has to be avoided where this is ensured
   that extensions do not needed.

   Therefore, we introduce result in name collisions. Furthermore, it
   must be possible for applications that process descriptios documents
   to disinguish extensions from base definitions.

   For XML, mechanisms have been defined that allow for structured
   extensibility of a "Constraints" section model of allowed syntax: XML Namespace and XML
   Schema.

   XML Schema mechanisms allow to constrain the allowed document
   content, e.g. for documents that contain structured data, and also
   provide the possibility that document instances can conform to
   several XML Schema definitions at the same time, while allowing
   Schema validators to check the conformance of these documents.

   Extensions of the session description language, say for allowing
   to express the parameters of a new media type, would require the
   creation of a corresponding XML Schema definition that contains
   the specification of element types that can be used to describe
   configurations of components for the new media type. Session
   description documents have to reference the non-standard Schema
   module, thus enabling parsers and validators to identify the
   elements of the new extension module and to either ignore them (if
   they are not supported) or to consider them for processing the
   session/capability description.

   It is used important to describe general meta-information
   parameters of note that the communication relationship functionality of validating
   capability and session description documents is not necessarily
   required to be invoked generate or
   modified. It contains most (if not all) process them. For example, endpoints would
   be configured to understand only those parts of description
   documents that are conforming to the general parameters baseline specification and
   simply ignore extensions they cannot support. The usage of
   SDP (and XML and
   XML Schema is thus will easily be usable with SAP rather motivated by the need to allow for session
   announcements).

   In addition
   extensions being defined and added to the session description parameters, language in a structured
   way that does not preclude the "Session"
   section also ties possibility to have applications to
   identify and process the various components extensions elements they might support. The
   baseline specification of XML Schema definitions and profiles must
   be well-defined and targeted to certain semantics. If,
   in current SDP, two audio streams were specified (possibly even
   using the same codecs), there was little way to differentiate
   between their uses (e.g. live set of parameters that are
   relevant for the protocols and algorithms of the Internet Multimedia
   Conferencing Architecture, i.e. transport over RTP/UDP/IP, the audio from an event broadcast vs.
   video profile of RFC1890 etc.

   Section 3.3 describes profile definitions and library definition. A
   detailed definition of how the
   commentary from formal SDPng syntax and the TV studio).

   This section also allows
   corresponding extension mechanisms is to tie together different media streams or
   provide a more elaborate description be provided in future
   versions of alternatives (e.g. subtitles
   or not, which language, etc.). this document.
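   The intended "ignore what you do not support" behaviour can be
   sketched as follows; the namespace URIs below are invented
   placeholders, not registered SDPng namespaces:

```python
# Sketch: ignoring unsupported extensions. Elements of a description
# that live in an unknown XML namespace are skipped; everything in
# the (hypothetical) baseline namespace is processed.
import xml.etree.ElementTree as ET

BASELINE = "http://example.com/sdpng/baseline"   # assumed URI
doc = """<cfg xmlns="http://example.com/sdpng/baseline"
              xmlns:x="http://example.com/new-transport">
  <component name="audio1"/>
  <x:fancy-transport port="7800"/>
</cfg>"""

def baseline_children(root):
    for child in root:
        ns = child.tag.split("}")[0].lstrip("{")
        if ns == BASELINE:
            yield child            # understood: process it
        # else: element from an unknown extension namespace -> ignored

names = [c.tag.split("}")[1] for c in baseline_children(ET.fromstring(doc))]
# names == ["component"]
```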

   The example below shows how the definition of codecs,
   transport-variants and configuration of components could be
   realized. Please note that this is not a complete example and that
   identifiers have been chosen arbitrarily.

   <def>
     <audio-codec name="audio-basic" encoding="PCMU"
                  sampling="8000" channels="1"/>

     <audio-codec name="audio-L16-mono" encoding="L16"
                  sampling="44100" channels="1"/>

    <rtp-pt name="rtp-avp-0" pt="0" format="audio-basic"/>
    <rtp-pt name="rtp-avp-11" pt="11" format="audio-L16-mono"/>

   </def>

   <cfg>
     <component name="interactive-audio" media="audio">
       <alt name="AVP-audio-0">
         <rtp format="rtp-avp-0">
          <udp addr="224.2.0.53" rtp-port="7800" rtcp-port="7801"/>
         </rtp>
       </alt>

       <alt name="AVP-audio-11">
         <rtp format="rtp-avp-11">
          <udp addr="224.2.0.53" rtp-port="7800" rtcp-port="7801"/>
         </rtp>
       </alt>
      </component>
   </cfg>

   <constraints>
     <par>
       <use-alt ref="AVP-audio-11" max="1"/>
     </par>
   </constraints>

   <conf>
    <owner user="joe@example.com" id="foobar" version="1" nettype="IN"
                         addrtype="IP4" addr="130.149.25.97"/>
    <session name="An SDPng seminar" info="http://www.dmn.tzi.org/ietf/mmusic/">
     This seminar is about SDPng...
     <info>http://www.ietf.org/</info>
     <contact>mailto:joe@example.com</contact>
     <contact>sip:joe@example.com</contact>
    </session>

    <time start="3034423619" end="3042462419">
     <repeat interval="7d" duration="1h"/>
     <repeat interval="7d" duration="1h" offset="25h"/>
    </time>

    <info name="interactive-audio" function="speaker">
       Audio stream for the different speakers
    </info>

   </conf>

   The example also does not include specifications of XML Schema
   definitions or references to such definitions. This will be
   provided in a future version of this draft.


   A real-world capability description would likely be shorter than
   the presented example because the codec and transport definitions
   can be factored out to profile definition documents that would
   only be referenced in capability description documents.

3.3 External Definition Packages

3.3.1 Profile Definitions

   In order to allow for structured extensibility it must be possible
   to define extensions to the basic SDPng configuration options.

   For example, if some application requires the use of a new
   esoteric transport protocol, endpoints must be able to describe
   their configuration with respect to the parameters of that
   transport protocol. The mandatory and optional parameters that can
   be configured and negotiated when using the transport protocol
   will be specified in a definition document. Such a definition
   document is called a "profile".

   A profile contains rules that specify how SDPng is used to
   describe conferences or endsystem capabilities with respect to the
   parameters of the profile. The concrete properties of the profile
   definitions mechanism are still to be defined.

   An example of such a new media type, profile would require be the
   creation RTP profile that defines
   how to specify RTP parameters. Another example would be the audio
   codec profiles that defines how specify audio codec parameters.

   SDPng documents can reference profiles and provide concrete
   definitions, for example the definition for the GSM audio codec.
   (This would be done in the "Definitions" section of an SDPng
   document.) An SDPng document that references a profile and
   provides concrete definitions of configurations can be validated
   against the profile definition.

3.3.2 Library Definitions

   While profile definitions specify the allowed parameters for a
   given profile, SDPng definition sections refer to profile
   definitions and define concrete configurations based on a specific
   profile.

   In order for such definitions to be imported into SDPng documents,
   there will be the notion of "SDPng libraries". A library is a set
   of definitions that is conforming to a certain profile definition
   (or to more than one profile definition -- this needs to be
   defined).

   The purpose of validating
   capability and session description documents the library concept is not necessarily
   required to generate or process them. For example, endpoints would allow certain common
   definitions to be configured factored-out so that not every SDPng document has
   to understand only those parts of description
   documents include the basic definitions, for example the PCMU codec
   definition. SDP [2] uses a similar concept by relying on the well
   known static payload types (defined in RFC1890 [4]) that are conforming also
   just referenced but never defined in SDP documents.

   An SPDng document that references definitions from an external
   library has to declare the baseline specification and
   simply ignore extensions they cannot support. use of the external library. The usage external
   library, being a set of XML and
   XML Schema is thus rather motivated by the need to allow configuration definitions for
   extensions being defined and added to the language in a structured
   way that does not preclude the possibility to have applications given
   profile, again needs to
   identify and process declare the extensions elements they might support. The
   baseline specification use of XML Schema the profile that it is
   conformant to.

   There are different possibilities of how profiles definitions and profiles must
   libraries can be well-defined used in SDPng documents:

   o  In an SPDng document a profile definition can be referenced and targeted to
      all the set of parameters that configuration definitions are
   relevant for provided within the protocols and algorithms of
      document itself. The SDPng document is self-contained with
      respect to the Internet Multimedia
   Conferencing Architecture, i.e. transport over RTP/UDP/IP, definitions it uses.

   o  In an SPDng document the audio
   video profile use of RFC1890 etc. an external library can be
      declared. The example below shows library references a profile definition and the
      SDPng document references the library. There are two alternatives
      how external libraries can be referenced:

      by name: Referencing libraries by names implies the definition use of codecs,
   transport-variants a
         registration authority where definitions and configuration of components could reference names
         can be
   realized. Please note that this registered with. It is not a complete example and that
   identifiers have been chosen arbitrarily.

   <def>
     <audio-codec name="audio-basic" encoding="PCMU" sampling_rate="8000" channels="1"/>

     <audio-codec name="audio-L16-mono" encoding="L16" sampling_rate="44100" channels="1"/>

     <fec name="parityfec"/>

     <audio-red name="red-pcm-gsm-fec">
       <use ref="audio-basic"/> <use ref="audio-gsm"/> <use ref="parityfec"/>
     </audio-red>
   </def>
   <cfg>
     <component name="audio1" media="audio">
       <alt name="AVP-audio-0">
         <rtp transport="udp-ip" format="audio-basic">
           <addr type="mc">
             <ipv4>239.239.239.239</ipv4> <port>30000</port>
           </addr>
         </rtp>
       </alt>

       <alt name="AVP-audio-11">
         <rtp transport="udp-ip" format="audio-L16-mono">
           <addr type="mc">
             <ipv4>239.239.239.239</ipv4> <port>30000</port>
           </addr>
         </rtp>
       </alt>
     </component>
   </cfg>

   <constraints>
     <par>
       <use ref="AVP-audio-11" max="5"/> <use ref="AVP-video-32" max="1"/>
     </par>
   </constraints>

   <conf>
     <subject>SDPng test</subject>
     <originator>joe@example.com</originator>
     <about>A test conference</about>
     <info name="audio1" function="speaker">
       Audio stream for the different speakers
     </info>
   </conf>

   The example also does not include specifications of XML Schema
   definitions or references to such definitions. These will be
   provided in a future version of this draft.

   A real-world capability description would likely be shorter than the
   presented example because the codec and transport definitions can be
   factored-out to profile definition documents that would only be
   referenced in capability description documents.
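
   To make the two alternatives for referencing external libraries (by
   name and by address) more concrete, a library declaration could
   hypothetically look as follows. The element and attribute names are
   purely illustrative; the precise syntax for referencing profiles
   and libraries is listed as an open issue in Section 6.

   <!-- by name, assuming a registered library identifier -->
   <lib name="rtp-avp-base"/>

   <!-- by address, assuming the library is published at a URL -->
   <lib href="http://example.com/sdpng/rtp-avp-base.xml"/>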


3.4 Mappings

   A mapping needs to be defined in particular to SDP that allows
   final session descriptions (i.e. the result of capability
   negotiation processes) to be translated to SDP documents. In
   principle, this can be done in a rather schematic fashion.

   Furthermore, to accommodate SIP-H.323 gateways, a mapping from SDPng
   to H.245 needs to be specified at some point.
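
   As a schematic illustration of such a mapping (reusing identifiers
   and addresses from the earlier example), an SDPng configuration such
   as

     <rtp transport="udp-ip" format="audio-basic">
       <addr type="mc">
         <ipv4>239.239.239.239</ipv4> <port>30000</port>
       </addr>
     </rtp>

   could translate to the following SDP lines, using the static payload
   type 0 for PCMU as defined in RFC1890 [4] (the "/127" denotes the
   multicast TTL required in SDP connection lines):

     c=IN IP4 239.239.239.239/127
     m=audio 30000 RTP/AVP 0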

4. Formal Specification

   To be provided.

5. Use of SDPng in conjunction with other IETF Signaling Protocols

   SDPng defines the notion of Components to indicate the intended
   types of collaboration between the users in e.g. a teleconferencing
   scenario.

   For the conceivable means of realizing a particular Component, SDPng
   conceptually distinguishes three levels of support:

      a Capability refers to the fact that one of the involved parties
      supports one particular way of exchanging media -- defined in
      terms of transport, codec, and other parameters -- as part of the
      teleconference.

      a Potential Configuration denotes a set of matching Capabilities
      from all those involved parties required to successfully realize
      one particular Component.

      an Actual Configuration indicates the Potential Configuration
      which was chosen by the involved parties to realize a certain
      Component at one particular point in time.

   As mentioned before, this abstract notion of the interactions
   between a number of communicating systems needs to be mapped to the
   application scenarios of SDPng in conjunction with the various IETF
   signaling protocols: SAP, SIP, RTSP, and MEGACO.

5.1 The Session Announcement Protocol (SAP)

   SAP is used to disseminate a previously created (and typically
   fixed) session description to a potentially large audience. An
   interested member of the audience will use the SDPng description
   contained in SAP to join the announced media sessions.

   This means that a SAP announcement contains the Actual
   Configurations of all Components that are part of the overall
   teleconference or broadcast.

   A SAP announcement may contain multiple Actual Configurations for
   the same Component. In this case, the "same" (i.e. semantically
   equivalent) media data from one configuration must be available from
   each of the Actual Configurations. In practice, this limits the use
   of multiple Actual Configurations to single-source multicast or
   broadcast scenarios.

   Each receiver of a SAP announcement with SDPng compares its locally
   stored Capabilities to realize a certain Component against the
   Actual Configurations contained in the announcement. If the
   intersection yields one or more Potential Configurations for the
   receiver, it chooses the one it considers best. If the intersection
   is empty, the receiver cannot participate in the announced session.
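
   In terms of the syntax introduced earlier, a SAP announcement would
   thus carry, per Component, one or more fully specified alternatives
   such as the following (identifiers and addresses are arbitrary
   examples, not normative syntax):

   <component name="audio1" media="audio">
     <alt name="AVP-audio-0">
       <rtp transport="udp-ip" format="audio-basic">
         <addr type="mc">
           <ipv4>224.2.0.1</ipv4> <port>9456</port>
         </addr>
       </rtp>
     </alt>
   </component>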

   SAP may be substituted by HTTP (in the general case, at least),
   SMTP, NNTP, or other IETF protocols suitable for conveying a media
   description from one entity to one or more others without the
   intent of further negotiation of the session parameters.

   Example from the SAP spec. to be provided.

5.2 Session Initiation Protocol (SIP)

   SIP is used to establish and modify multimedia sessions, and SDPng
   may be carried at least in SIP INVITE and ACK messages as well as in
   a number of responses. From dealing with legacy SDP (and its
   essential non-suitability for capability negotiation), a particular
   use and interpretation of SDP has been defined for SIP.

   One of the important flexibilities introduced by SIP's usage of SDP
   is that a sender can change dynamically between all codecs that a
   receiver has indicated support (and has provided an address) for.
   Codec changes are not signaled out-of-band but only indicated by the
   payload type within the media stream. From this arises one important
   consequence for the conceptual view of a Component within SDPng.

   There is no clear distinction between Potential and Actual
   Configurations. There need not be a single Actual Configuration
   chosen at setup time within the SIP signaling. Instead, a number of
   Potential Configurations is signaled in SIP (with all transport
   parameters required for carrying media streams) and the Actual
   Configuration is only identified by the payload type which is
   actually being transmitted at any point in time.

   Note that since SDPng does not explicitly distinguish between
   Potential and Actual Configurations, this has no implications on the
   SDPng signaling itself.

   SIP Examples to be accessed defined.

5.3 Real-Time Streaming Protocol (RTSP)

   In contrast to SIP, RTSP has, from its intended usage, a clear
   distinction between offering Potential Configurations (typically by
   the server) and choosing one out of these (by the client), and, in
   some cases, some parameters (such as multicast addresses) may be
   dictated by the server. Hence with RTSP, there is a clear
   distinction between Potential Configurations during the negotiation
   phase and a finally chosen Actual Configuration according to which
   streaming will take place.

   Example from the RTSP spec to be provided.

5.4 Media Gateway Control Protocol (MEGACO)

   The MEGACO architecture also follows the SDPng model of a clear
   separation between Potential and Actual Configurations. Upon
   startup, a Media Gateway (MG) will "register" with its Media Gateway
   Controller (MGC) and the latter will audit the MG for its
   Capabilities. Those will be provided as Potential Configurations,
   possibly with extensive Constraints specifications. Whenever a media
   path needs to be set up by the MGC between two MGs or an MG needs to
   be reconfigured internally, the MGC will use (updated) Actual
   Configurations.

   Details and examples to be defined.

6. Open Issues

      Overriding

      The precise syntax for referencing profiles and libraries needs
      to be worked out.

      A registry (reuse of SDP mechanisms and names etc.) needs to be
      set up.

      Transport and Payload type specifications need to be defined as
      additional appendices.

      Negotiation mechanisms for multiparty conferencing need to be
      formalized.

      Further details on the signaling protocols need to be filled in.

      Mapping to other media description formats (SDP, H.245, ...)
      should be provided. For H.245, this is probably a different
      document (belonging to the SIP-H.323 interworking group).

References

   [1]  Kutscher, D., Ott, J., Bormann, C. and I. Curcio, "Requirements
        for Session Description and Capability Negotiation", Internet
        Draft draft-ietf-mmusic-sdpng-req-01.txt, April 2001.

   [2]  Handley, M. and V. Jacobson, "SDP: Session Description
        Protocol", RFC 2327, April 1998.

   [3]  Schulzrinne, H., Casner, S., Frederick, R. and V. Jacobson,
        "RTP: A Transport Protocol for Real-Time Applications", RFC
        1889, January 1996.

   [4]  Schulzrinne, H., "RTP Profile for Audio and Video Conferences
        with Minimal Control", RFC 1890, January 1996.

   [5]  Schulzrinne, H. and S. Casner, "RTP Profile for Audio and Video
        Conferences with Minimal Control", Internet-Draft
        draft-ietf-avt-profile-new-10.txt , March 2001.

   [6]  Perkins, C., Kouvelas, I., Hodson, O., Hardman, V., Handley,
        M., Bolot, J., Vega-Garcia, A. and S. Fosse-Parisis, "RTP
        Payload for Redundant Audio Data", RFC 2198, September 1997.

   [7]  Klyne, G., "A Syntax for Describing Media Feature Sets", RFC
        2533, March 1999.

   [8]  Klyne, G., "Protocol-independent Content Negotiation
        Framework", RFC 2703, September 1999.

   [9]  Rosenberg, J. and H. Schulzrinne, "An RTP Payload Format for
        Generic Forward Error Correction", RFC 2733, December 1999.

   [10]  Perkins, C. and O. Hodson, "Options for Repair of Streaming
         Media", RFC 2354, June 1998.

   [11]  Handley, M., Perkins, C. and E. Whelan, "Session Announcement
         Protocol", RFC 2974, October 2000.

Authors' Addresses

   Dirk Kutscher
   TZI, Universitaet Bremen
   Bibliothekstr. 1
   Bremen  28359
   Germany

   Phone: +49.421.218-7595, sip:dku@tzi.org
   Fax:   +49.421.218-7000
   EMail: dku@tzi.uni-bremen.de

   Joerg Ott
   TZI, Universitaet Bremen
   Bibliothekstr. 1
   Bremen  28359
   Germany

   Phone: +49.421.201-7028, sip:jo@tzi.org
   Fax:   +49.421.218-7000
   EMail: jo@tzi.uni-bremen.de

   Carsten Bormann
   TZI, Universitaet Bremen
   Bibliothekstr. 1
   Bremen  28359
   Germany

   Phone: +49.421.218-7024, sip:cabo@tzi.org
   Fax:   +49.421.218-7000
   EMail: cabo@tzi.org

Appendix A. Base SDPng Specifications for Audio Codec Descriptions

   [5] specifies a number of audio codecs including short names to be
   used as reference by session description protocols such as SDP and
   SDPng. Those codec names, as listed in the first column of the above
   table, are used to identify codecs in SDPng.

   The following sections indicate the default values that are assumed
   if nothing other than the codec reference is specified.

   The following audio-codec attributes are defined for audio codecs:

   name: the identifier to be later used for referencing the codec spec

   encoding: the RTP/AVP profile identifier as registered with IANA

   mime: the MIME type; may alternatively be specified instead of
      "encoding"

   channels: the number of independent media channels

   pattern: the media channel pattern for mapping channels to payload

   sampling: the sample rate for the codec (which in most cases equals
      the RTP clock)

   Furthermore, options may be defined in the following format:

   <option id="name">value</option>

   if a value is associated with the option (note that arbitrarily
   complex values are allowed), or alternatively:

   <option id="name"/>

   if the option is just a boolean indicator.

   Attributes for the "option" tag are the following:

   id: the identifier for the option (variable name)

   collaps: the collapsing rules for this optional element, defined as
      follows:

      min: for numeric values only

      max: for numeric values only

      x: intersection of enumerated values, value lists
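
   As a brief illustration of these collapsing rules (the option name
   "maxptime" is invented for this example): if one party states

   <option id="maxptime">40</option>

   and another party states

   <option id="maxptime">20</option>

   then, with collaps="min" declared for this option, the collapsed
   (negotiated) value would be 20.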

A.1 DVI4

   <audio-codec name="dvi4" encoding="DVI4" channels="1" sampling="8000"/>

   <rtp-pt name="rtp-avp-5" pt="5" format="dvi4"/>
   <rtp-pt name="rtp-avp-6" pt="6">
       <audio-codec encoding="DVI4" channels="1" sampling="16000"/>
   </rtp-pt>

   Note that there is no default sampling rate specified for DVI4 and
   hence a sampling rate MUST be specified.

A.2 G.722

   <audio-codec name="g722" encoding="G722" channels="1" sampling="16000"/>
   <rtp-pt name="rtp-avp-9" pt="9" format="g722"/>

   Note as per [5] that the RTP clock rate is 8000 Hz rather than
   16000 Hz.

A.3 G.726

   <audio-codec name="g726-40" encoding="G726-40" channels="1" sampling="8000"/>
   <audio-codec name="g726-32" encoding="G726-32" channels="1" sampling="8000"/>
   <audio-codec name="g726-24" encoding="G726-24" channels="1" sampling="8000"/>
   <audio-codec name="g726-16" encoding="G726-16" channels="1" sampling="8000"/>

   <rtp-pt name="rtp-avp-5" pt="5" format="g726-32"/>

A.4 G.728

   <audio-codec name="g728" encoding="G728" channels="1" sampling="8000"/>
   <rtp-pt name="rtp-avp-15" pt="15" format="g728"/>

A.5 G.729

   G.729 Annex A: reduced complexity of G.729
   G.729 Annex B: comfort noise

   <audio-codec name="g729" encoding="G729" channels="1" sampling="8000"/>
   <rtp-pt name="rtp-avp-18" pt="18" format="g729"/>

   For further codec description, the following options (which carry no
   values associated with them) MAY be included:

   <option id="annexA"/>
   <!-- to indicate the use of Annex A reduced complexity -->

   <option id="annexB"/>
   <!-- to indicate the use of Annex B comfort noise -->

   As stated in [5], the use of these options can be detected within
   the media stream.

A.6 G.729 Annex D and E

   <audio-codec name="g729d" encoding="G729D" channels="1" sampling="8000"/>
   <audio-codec name="g729e" encoding="G729E" channels="1" sampling="8000"/>

   The following option MAY be used with both Annexes D and E:

   <option id="annexB"/>
   <!-- to indicate the use of Annex B comfort noise -->

A.7 GSM

A.7.1 GSM Full Rate

   The GSM Full Rate codec is indicated as follows:

   <audio-codec name="gsm" encoding="GSM" channels="1" sampling="8000"/>
   <rtp-pt name="rtp-avp-3" pt="3" format="gsm"/>

A.7.2 GSM Half Rate

   The GSM Half Rate codec is indicated as follows:

   <audio-codec name="gsm-hr" encoding="GSM-HR" channels="1" sampling="8000"/>

A.7.3 GSM Enhanced Full Rate

   The GSM Enhanced Full Rate codec is indicated as follows:

   <audio-codec name="gsm-efr" encoding="GSM-EFR" channels="1" sampling="8000"/>

A.8 L8

   <audio-codec name="l8" encoding="L8" channels="1" sampling="8000"/>

A.9 L16

   <audio-codec name="l16" encoding="L16" channels="1" sampling="44100"/>

   <rtp-pt name="rtp-avp-11" pt="11" format="l16"/>
   <rtp-pt name="rtp-avp-10" pt="10">
     <audio-codec encoding="L16" channels="2" sampling="44100"/>
   </rtp-pt>

A.10 LPC

   <audio-codec name="lpc" encoding="LPC" channels="1" sampling="8000"/>

A.11 MPA

   <audio-codec name="mpa" encoding="MPA" channels="1" sampling="8000"/>
   <rtp-pt name="rtp-avp-14" pt="14" format="mpa"/>

A.12 PCMA and PCMU

   <audio-codec name="pcmu" encoding="PCMU" channels="1" sampling="8000"/>
   <audio-codec name="pcma" encoding="PCMA" channels="1" sampling="8000"/>

   <rtp-pt name="rtp-avp-0" pt="0" format="pcmu"/>
   <rtp-pt name="rtp-avp-8" pt="8" format="pcma"/>

A.13 QCELP

   <audio-codec name="qcelp" encoding="QCELP" channels="1" sampling="8000"/>
   <rtp-pt name="rtp-avp-12" pt="12" format="qcelp"/>

A.14 VDVI

   <audio-codec name="vdvi" encoding="VDVI" channels="1" sampling="8000"/>

Full Copyright Statement

   Copyright (C) The Internet Society (2001). All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph
   are included on all such copies and derivative works. However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

   Funding for the RFC editor function is currently provided by the
   Internet Society.