Robust Header Compression                                     R. Finking
Internet-Draft                                        Siemens/Roke Manor
Expires: April 28, 2005                                     G. Pelletier
                                                             Ericsson AB
                                                                R. Price
                                             Cogent Defence and Security
                                                                Networks
                                                        October 28, 2004


        Formal Notation for Robust Header Compression (ROHC-FN)
                 draft-ietf-rohc-formal-notation-04.txt

Status of this Memo

   This document is an Internet-Draft and is subject to all provisions
   of section 3 of RFC 3667.  By submitting this Internet-Draft, each
   author represents that any applicable patent or other IPR claims of
   which he or she is aware have been or will be disclosed, and any of
   which he or she become aware will be disclosed, in accordance with
   RFC 3668.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on April 28, 2005.

Copyright Notice

   Copyright (C) The Internet Society (2004).

Abstract

   This document defines ROHC-FN: a formal notation for specifying how
   to compress and decompress fields from an arbitrary protocol stack.
   ROHC-FN is intended to simplify the creation of new compression



Finking, et al.          Expires April 28, 2005                 [Page 1]

Internet-Draft                  ROHC-FN                     October 2004


   profiles to fit within the ROHC (RFC3095 [4]) framework.

Table of Contents

   1.   Introduction . . . . . . . . . . . . . . . . . . . . . . . .   4
   2.   Terminology  . . . . . . . . . . . . . . . . . . . . . . . .   4
   3.   Overview of ROHC-FN  . . . . . . . . . . . . . . . . . . . .   5
     3.1  Scope of ROHC-FN . . . . . . . . . . . . . . . . . . . . .   5
     3.2  Fundamentals of ROHC-FN  . . . . . . . . . . . . . . . . .   6
       3.2.1  Fields and Encodings . . . . . . . . . . . . . . . . .   6
       3.2.2  Structures . . . . . . . . . . . . . . . . . . . . . .   7
     3.3  Example using IPv4 . . . . . . . . . . . . . . . . . . . .   9
   4.   Normative Definition of ROHC-FN  . . . . . . . . . . . . . .  11
     4.1  Overall Structure of a Specification . . . . . . . . . . .  11
     4.2  Constant Definitions . . . . . . . . . . . . . . . . . . .  12
     4.3  Field Attributes . . . . . . . . . . . . . . . . . . . . .  12
     4.4  Expressions  . . . . . . . . . . . . . . . . . . . . . . .  13
     4.5  Expressions: NOTE:Merge+Remove . . . . . . . . . . . . . .  15
     4.6  Field References . . . . . . . . . . . . . . . . . . . . .  16
     4.7  Reserved Keywords  . . . . . . . . . . . . . . . . . . . .  16
       4.7.1  "let"  . . . . . . . . . . . . . . . . . . . . . . . .  16
       4.7.2  "this" . . . . . . . . . . . . . . . . . . . . . . . .  17
     4.8  Comments . . . . . . . . . . . . . . . . . . . . . . . . .  17
       4.8.1  End of line comments . . . . . . . . . . . . . . . . .  17
       4.8.2  Block comments . . . . . . . . . . . . . . . . . . . .  18
     4.9  Encoding Methods . . . . . . . . . . . . . . . . . . . . .  18
       4.9.1  uncompressed_value . . . . . . . . . . . . . . . . . .  18
       4.9.2  compressed_value . . . . . . . . . . . . . . . . . . .  19
       4.9.3  irregular  . . . . . . . . . . . . . . . . . . . . . .  20
       4.9.4  static . . . . . . . . . . . . . . . . . . . . . . . .  20
       4.9.5  lsb  . . . . . . . . . . . . . . . . . . . . . . . . .  20
       4.9.6  crc  . . . . . . . . . . . . . . . . . . . . . . . . .  21
     4.10   Profile-specific Encoding Methods  . . . . . . . . . . .  22
     4.11   Structures . . . . . . . . . . . . . . . . . . . . . . .  23
       4.11.1   Simple Structures  . . . . . . . . . . . . . . . . .  23
       4.11.2   Arguments and Structures . . . . . . . . . . . . . .  25
       4.11.3   Multiple Formats . . . . . . . . . . . . . . . . . .  26
       4.11.4   Recursive Structures . . . . . . . . . . . . . . . .  29
     4.12   Lists  . . . . . . . . . . . . . . . . . . . . . . . . .  30
       4.12.1   Notation . . . . . . . . . . . . . . . . . . . . . .  31
       4.12.2   List Encoding  . . . . . . . . . . . . . . . . . . .  34
   5.   Security considerations  . . . . . . . . . . . . . . . . . .  38
   6.   Acknowledgements . . . . . . . . . . . . . . . . . . . . . .  39
   7.   References . . . . . . . . . . . . . . . . . . . . . . . . .  39
        Authors' Addresses . . . . . . . . . . . . . . . . . . . . .  39
   A.   Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . .  40
     A.1  Reserved Keywords  . . . . . . . . . . . . . . . . . . . .  40
     A.2  Characters . . . . . . . . . . . . . . . . . . . . . . . .  41
     A.3  Literals . . . . . . . . . . . . . . . . . . . . . . . . .  43
     A.4  Identifiers  . . . . . . . . . . . . . . . . . . . . . . .  43
     A.5  Operators  . . . . . . . . . . . . . . . . . . . . . . . .  43
     A.6  Expressions  . . . . . . . . . . . . . . . . . . . . . . .  43
     A.7  Constants  . . . . . . . . . . . . . . . . . . . . . . . .  44
     A.8  Field Names  . . . . . . . . . . . . . . . . . . . . . . .  44
     A.9  Attributes . . . . . . . . . . . . . . . . . . . . . . . .  44
     A.10   Encoding Methods . . . . . . . . . . . . . . . . . . . .  44
     A.11   Structures . . . . . . . . . . . . . . . . . . . . . . .  45
   B.   Bit-level Worked Example . . . . . . . . . . . . . . . . . .  46
     B.1  Example Packet Format  . . . . . . . . . . . . . . . . . .  46
     B.2  Initial Encoding . . . . . . . . . . . . . . . . . . . . .  46
     B.3  Basic Compression  . . . . . . . . . . . . . . . . . . . .  47
     B.4  Inter-packet compression . . . . . . . . . . . . . . . . .  49
     B.5  Variable Length Discriminators . . . . . . . . . . . . . .  52
     B.6  Default encoding . . . . . . . . . . . . . . . . . . . . .  55
        Intellectual Property and Copyright Statements . . . . . . .  57

1.  Introduction

   ROHC-FN is a formal notation designed to help with the definition of
   ROHC (RFC3095 [4]) header compression profiles.  ROHC-FN offers a
   library of encoding methods that are often used in ROHC profiles, so
   new profiles can be specified without the need to redefine this
   library from scratch.

   Informally, an encoding method is a function that maps between
   uncompressed data and compressed data.  The simplest encoding methods
   only have one input and one output: the input is an uncompressed
   field and the output is the compressed version of the field.  More
   complex encoding methods can compress multiple fields at the same
   time, e.g.  "list" encoding from RFC3095 [4], which is designed to
   compress an ordered list of fields.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC2119 [2].

   o  Control field

      Control fields are transmitted from a ROHC compressor to a ROHC
      decompressor, but are not part of the uncompressed header itself.

   o  Encoding method

      Encoding methods are functions that can be applied to compress
      fields in a protocol header.

   o  Field

      ROHC-FN divides the protocol header to be compressed into a set of
      contiguous bit patterns known as fields.

   o  Library of encoding methods

      The library of encoding methods contains a number of commonly used
      encoding methods for compressing header fields.

   o  Profile

      A ROHC (RFC 3095 [4]) profile is a description of how to compress
      a certain protocol stack over a certain type of link.  Each
      profile includes packet formats to compress the headers and a
      state machine to control the actions of each endpoint.

3.  Overview of ROHC-FN

   This section gives an overview of ROHC-FN and explains how it can be
   used to specify how to compress header fields as part of a ROHC
   profile.

3.1  Scope of ROHC-FN

   This section describes the scope of the ROHC-FN.  It explains how the
   formal notation relates to the ROHC framework and to specific ROHC
   profiles.

   The ROHC framework is common to all profiles: it defines the general
   principles for performing ROHC compression.  It defines the concept
   of a profile, which makes ROHC a general platform for different
   compression schemes.  It sets link layer requirements, and in
   particular negotiation requirements for all ROHC profiles.  It
   defines a set of common functions such as Context Identifiers (CIDs),
   padding and segmentation.  It also defines common packet formats (IR,
   IR-DYN, Feedback, Short-CID expander, etc.), and finally it defines a
   generic, profile independent, feedback mechanism.

   A ROHC profile is a description of how to compress a certain protocol
   stack over a certain type of link.  For example, ROHC profiles are
   available for RTP/UDP/IP and many other protocol stacks.

   Each ROHC profile can be further subdivided into the following two
   components:

   1.  Packet formats, for compressing and decompressing headers; and
   2.  State machine, for maintaining synchronisation of the context

   The purpose of the packet formats is to define how to compress and
   decompress headers.  The packet formats must define one or more
   compressed versions of each uncompressed header; inversely, the
   packet formats define how to relate a compressed header back to the
   original uncompressed header.

   The packet formats will typically compress headers relative to a
   context of field values from previous headers in a flow.  This
   improves the overall compression ratio, because this takes into
   account redundancies between successive headers.

   The purpose of the state machine is to ensure that the profile is
   robust against bit errors and dropped packets.  The state machine
   manages the context, providing feedback and other mechanisms to
   ensure that the compressor and decompressor contexts are kept
   synchronised.

   The ROHC-FN is designed to help in the specification of the packet
   formats for use in ROHC profiles.  It offers a library of encoding
   methods for compressing fields, and a mechanism for combining these
   encoding methods to create packet formats tailored to a specific
   protocol stack.  The state machine for the profiles is beyond the
   scope of ROHC-FN, and it must be provided separately as part of a
   complete profile specification.

3.2  Fundamentals of ROHC-FN

   There are two fundamental elements to the formal notation:

   1.  Fields and their encodings, which define the mapping between a
       field's uncompressed and compressed values.
   2.  Structures, which define lists of uncompressed fields and the
       lists of compressed fields they map onto.

   These two fundamental elements are at the core of the notation and
   are outlined below.

3.2.1  Fields and Encodings

   The creation of bindings between fields and encoding methods is
   indicated as follows:

     field   ::=   encoding_method

   In the above statement, the symbol "::=" means "is encoded as".  It
   does not represent an assignment from the right hand side to the left
   hand side.  Instead, it is a two-way mapping: a single statement
   represents both the compression and the decompression operation, and
   variables take on values through the process of two-way matching.
   Two-way matching is a binary operation that attempts to make the
   operands the same (similar to the unification process in logic).  The
   operands represent one unspecified data object, and values can be
   matched from either operand.

   More specifically, this statement creates a reversible binding
   between the attributes of a field and the encoding method (including
   the parameters specified with the method).  At the compressor, a
   packet format can be used if a set of bindings that is successful for
   all fields can be found.  At the decompressor, the operation is
   reversed using the same bindings and the fields are filled according
   to the specified bindings.

   For example, the 'static' encoding method creates a binding between
   the attribute corresponding to the uncompressed value of the field
   and the attribute corresponding to the value of the field in the
   context.

   o  For the compressor, this binding is successful when both values
      are the same for a packet format that sends no bits for that
      field.  Otherwise, a packet format using another encoding method
      that is successful when the parameters are not equal is used (such
      as a method that would send the field uncompressed).
   o  For the decompressor, the same binding succeeds for a packet type
      that sends no bits for that field if a valid context entry
      containing the value of the uncompressed field exists.  Otherwise,
      the binding will fail decompression for that packet type.
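
   As a minimal sketch of the two-way binding described above, the
   following Python fragment models "static" as a pair of functions
   that test the same condition from either side.  The function names
   and the use of None to signal a failed binding are assumptions made
   for illustration only; they are not part of ROHC-FN.

```python
from typing import Optional


def static_compress(uncomp_value: int,
                    context_value: Optional[int]) -> Optional[str]:
    """Return the compressed bits, or None if the binding fails."""
    if context_value is not None and uncomp_value == context_value:
        return ""   # binding succeeds and no bits are sent
    return None     # binding fails; another packet format is needed


def static_decompress(comp_bits: str,
                      context_value: Optional[int]) -> Optional[int]:
    """Return the uncompressed value, or None if the binding fails."""
    if comp_bits == "" and context_value is not None:
        return context_value   # the value is taken from the context
    return None                # no valid context entry exists
```

   With a context value of 64, static_compress(64, 64) yields an empty
   bit string, while static_compress(64, 63) yields None, so a packet
   format using a different encoding method would have to be chosen.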

   Fields have attributes.  Attributes describe various properties of
   the field, including its length and where it appears in the header.
   For example:

     field:has_context

   indicates whether or not a context entry exists for this field.

3.2.2  Structures

   Structures provide a mechanism for combining fields and their
   encoding methods into larger units.  Structures are defined using the
   "===" operator.  These can then be used as encoding methods in other
   structures.

     structure ===
     {
       uncompressed_format = field_1,
                             field_2,
                                  :
                                  :
                             field_n;

       compressed_format_0 = field_a,
                                  :
                                  :
                             field_b
       {
         field_a      ::= encoding_method_1;
            :                     :
            :                     :
         field_b      ::= encoding_method_2;
       };

       compressed_format_1 = field_c,
                                  :
                                  :
                             field_d
       {
         field_c      ::= encoding_method_3;
            :                     :
            :                     :
         field_d      ::= encoding_method_4;
       };

       :
       :

       compressed_format_n = field_y,
                                  :
                                  :
                             field_z
       {
         field_y      ::= encoding_method_foo;
            :                     :
            :                     :
         field_z      ::= encoding_method_bar;
       };
   };

   In the example above, the comma separated list "uncompressed_format"
   indicates the order of fields in the uncompressed header.  This list
   is followed by several packet formats for the compressed data, each
   beginning with the keyword "compressed_format".

   Packet formats defined by "compressed_format" also indicate an
   ordered list of fields.  Items in this list consist either of:

   o  a compressed representation of fields that occur in the
      uncompressed header; or
   o  "control fields", that are additional information added to the
      compressed packet during compression.

   Fields from the uncompressed header have the same name in the
   compressed header.  So in the example above, "field_a" would be a
   control field if it did not appear in the uncompressed field order
   list.
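
   The relationship between a structure, its compressed formats and
   the per-field bindings can be sketched in Python as follows.  This
   is only an illustration under stated assumptions: the names
   "try_format" and "compress", the rule of trying formats in the
   order listed, and the use of None for a failed binding are all
   hypothetical.  How the decompressor identifies which format was
   used is outside the scope of this sketch.

```python
from typing import Callable, Dict, List, Optional, Tuple

# An encoder maps (uncompressed value, context value) to the compressed
# bits as a string of "0" and "1", or to None if the binding fails.
Encoder = Callable[[int, Optional[int]], Optional[str]]
Format = List[Tuple[str, Encoder]]


def static(value: int, ctx: Optional[int]) -> Optional[str]:
    """Send nothing when the value equals the context value."""
    return "" if ctx is not None and value == ctx else None


def irregular(length: int) -> Encoder:
    """Send the field unchanged in 'length' bits."""
    def encode(value: int, ctx: Optional[int]) -> Optional[str]:
        if not 0 <= value < (1 << length):
            return None   # value does not fit in the field
        return format(value, "0{}b".format(length))
    return encode


def try_format(fmt: Format, header: Dict[str, int],
               context: Dict[str, int]) -> Optional[str]:
    """Return the compressed bits if every binding in fmt succeeds."""
    parts = []
    for name, encoder in fmt:
        bits = encoder(header[name], context.get(name))
        if bits is None:
            return None
        parts.append(bits)
    return "".join(parts)


def compress(formats: List[Format], header: Dict[str, int],
             context: Dict[str, int]) -> Tuple[int, str]:
    """Use the first format for which all bindings succeed."""
    for index, fmt in enumerate(formats):
        bits = try_format(fmt, header, context)
        if bits is not None:
            return index, bits
    raise ValueError("no compressed format matches this header")


FORMATS = [
    [("ttl", static), ("id", irregular(16))],        # format 0
    [("ttl", irregular(8)), ("id", irregular(16))],  # format 1
]
HEADER = {"ttl": 64, "id": 7}
```

   With a context of {"ttl": 64} the first format succeeds and only the
   16 bits of "id" are produced.  With a context of {"ttl": 63} the
   "static" binding fails, so the second format is used and 24 bits are
   produced.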

   Following the compressed field order list, encoding methods are
   defined inside braces for all the compressed and uncompressed fields.
   Fields that have no encoding methods will be handled using
   "default_methods" (see TBAref below).



3.3  Example using IPv4

   This section gives an overview of how the notation is used by means
   of an example.  The example will develop the formal notation for an
   encoding method capable of compressing a single, well-known header:
   the IPv4 header.

   The first step is to specify the overall structure of the IPv4
   header.  To do this, we use a structure (defined in Section 4.11),
   which we will call "ipv4_header".  This is notated as follows:

     ipv4_header           ===
     {

   This defines the encoding method "ipv4_header" as a structure, the
   definition of which follows the opening brace.

   Definitions within the pair of braces are local to "ipv4_header".
   This scoping mechanism helps to clarify which fields belong to which
   headers: it is also useful when compressing complex protocol stacks
   with several headers and fields, often sharing the same names.

   The next step is to specify the fields contained in the uncompressed
   IPv4 header, which is accomplished using ROHC-FN as follows:

       uncompressed_format   =   version        , % [  4 ]
                                 header_length  , % [  4 ]
                                 tos            , % [  6 ]
                                 ecn            , % [  2 ]
                                 length         , % [ 16 ]
                                 id             , % [ 16 ]
                                 reserved       , % [  1 ]
                                 dont_frag      , % [  1 ]
                                 more_fragments , % [  1 ]
                                 offset         , % [ 13 ]
                                 ttl            , % [  8 ]
                                 protocol       , % [  8 ]
                                 checksum       , % [ 16 ]
                                 src_addr       , % [ 32 ]
                                 dest_addr      ; % [ 32 ]

   The numbers in square brackets give the field width in bits.  Note
   that these are mere comments that do not have any formal meaning.

   The fields contained in the compressed header can then be specified.
   Exactly what appears in this list of fields depends on the encoding
   methods used to encode the uncompressed fields -- it may be possible
   to compress certain fields down to 0 bits, in which case they do not
   need to be sent in the compressed header at all.

       compressed_format =  src_addr       , % [ 32 ]
                            dest_addr      , % [ 32 ]
                            length         , % [ 16 ]
                            id             , % [ 16 ]
                            ttl            , % [  8 ]
                            protocol       , % [  8 ]
                            tos            , % [  6 ]
                            ecn            , % [  2 ]
                            dont_frag        % [  1 ]
       {

   Note that the order of the fields in the compressed header is
   independent of the order of the fields in the uncompressed header.

   The next step is to specify the encoding methods for each field in
   the IPv4 header.  These are taken from encoding methods in the
   ROHC-FN library as well as additional encoding methods defined in the
   profile specification itself.  Since the intention here is to
   illustrate the use of the notation, rather than to describe the
   optimum method of compressing IPv4 headers, this example uses only
   three predefined encoding methods.

   The "uncompressed_value" encoding method (defined in Section 4.9.1)
   can compress any field whose uncompressed length and value are fixed.
   No compressed bits need to be sent because the uncompressed field can
   be reconstructed using its known size and value.  The
   "uncompressed_value" encoding method is used to compress five fields
   in the IPv4 header, as described below:

         version             ::=   uncompressed_value (4, 4);
         header_length       ::=   uncompressed_value (4, 5);
         reserved            ::=   uncompressed_value (1, 0);
         more_fragments      ::=   uncompressed_value (1, 0);
         offset              ::=   uncompressed_value (13, 0);

   The first parameter indicates the length of the uncompressed field in
   bits, and the second parameter gives its integer value.
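
   A rough Python sketch of this behaviour is given below.  It assumes
   that fields are handled as strings of "0" and "1" and that a failed
   binding is reported as None; both choices, and the function names,
   are illustrative only.

```python
from typing import Optional


def uncompressed_value_compress(length: int, value: int,
                                field_bits: str) -> Optional[str]:
    """Succeed with zero compressed bits if the field matches."""
    expected = format(value, "0{}b".format(length))
    return "" if field_bits == expected else None


def uncompressed_value_decompress(length: int, value: int) -> str:
    """Rebuild the uncompressed field from its known size and value."""
    return format(value, "0{}b".format(length))
```

   For instance, the binding for "version" corresponds to
   uncompressed_value_decompress(4, 4), which rebuilds the four bits
   "0100" without any compressed bits having been sent.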

   The "irregular" encoding method (defined in Section 4.9.3) can be
   used to encode any field whose length is fixed, or can be calculated
   using an expression.  It is a general encoding method that can be
   used for fields to which no other encoding method applies.  All of
   the bits in the uncompressed field are present in the compressed
   format as well; hence this encoding does not achieve any compression.

         tos                 ::=   irregular (6);
         ecn                 ::=   irregular (2);
         length              ::=   irregular (16);
         id                  ::=   irregular (16);
         dont_frag           ::=   irregular (1);
         ttl                 ::=   irregular (8);
         protocol            ::=   irregular (8);
         src_addr            ::=   irregular (32);
         dest_addr           ::=   irregular (32);

   Finally, the third encoding method is specific only to IPv4 headers,
   "inferred_ip_v4_header_checksum":

         checksum            ::=   inferred_ip_v4_header_checksum;
     };

   This is a specific encoding method for calculating the IP checksum
   from the rest of the header values.  Like the "uncompressed_value"
   encoding method, no compressed bits need to be sent, since the field
   value can be reconstructed at the decompressor.  However, unlike
   "uncompressed_value", the meaning of "inferred_ip_v4_header_checksum"
   is not defined in the ROHC-FN library of encoding methods, nor is it
   defined by another structure elsewhere in the formal notation given
   as an example above.  Its definition can be given either in the
   English language or using the formal notation as part of the profile
   definition itself.
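
   As an illustration of what such a profile-specific definition might
   compute, the sketch below implements the standard IPv4 header
   checksum (the 16-bit one's complement of the one's complement sum
   of the header, with the checksum field itself taken as zero).  The
   function name is hypothetical, and this is not the definition of
   "inferred_ip_v4_header_checksum" from any profile.

```python
def ipv4_header_checksum(header: bytes) -> int:
    """Compute the IPv4 header checksum over a raw header."""
    if len(header) < 20 or len(header) % 2:
        raise ValueError("expected an even-length header of >= 20 octets")
    total = 0
    for i in range(0, len(header), 2):
        if i == 10:
            continue   # octets 10-11 hold the checksum; count as zero
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)   # fold the carries
    return ~total & 0xFFFF


EXAMPLE_HEADER = bytes.fromhex(
    "4500007300004000" "40110000c0a80001" "c0a800c7")
```

   Because the value depends only on the other header fields, the
   decompressor can recompute it after rebuilding the rest of the
   header, which is why no compressed bits are needed.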

   Finally the definition of the structure is closed with a closing
   brace.  At this point, the above example has defined the format of
   the compressed IPv4 header, and provided enough information to allow
   an implementation to construct the compressed header from an
   uncompressed header and vice versa.
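
   The effect of the example on header size can be checked with a
   short calculation.  The Python table below restates the field
   widths and encoding methods from the example; the rule that only
   "irregular" contributes compressed bits follows from the
   descriptions above, while the variable and function names are
   chosen for illustration.

```python
# (field name, uncompressed length in bits, encoding method)
IPV4_FIELDS = [
    ("version",         4, "uncompressed_value"),
    ("header_length",   4, "uncompressed_value"),
    ("tos",             6, "irregular"),
    ("ecn",             2, "irregular"),
    ("length",         16, "irregular"),
    ("id",             16, "irregular"),
    ("reserved",        1, "uncompressed_value"),
    ("dont_frag",       1, "irregular"),
    ("more_fragments",  1, "uncompressed_value"),
    ("offset",         13, "uncompressed_value"),
    ("ttl",             8, "irregular"),
    ("protocol",        8, "irregular"),
    ("checksum",       16, "inferred_ip_v4_header_checksum"),
    ("src_addr",       32, "irregular"),
    ("dest_addr",      32, "irregular"),
]


def compressed_length(length: int, method: str) -> int:
    """Only "irregular" sends bits; the other two methods send none."""
    return length if method == "irregular" else 0


uncompressed_bits = sum(length for _, length, _ in IPV4_FIELDS)
compressed_bits = sum(compressed_length(length, method)
                      for _, length, method in IPV4_FIELDS)
```

   Under these assumptions the uncompressed header occupies 160 bits
   and the compressed format 121 bits, a saving of 39 bits.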

4.  Normative Definition of ROHC-FN

   This section gives the normative definition of ROHC-FN.

4.1  Overall Structure of a Specification

   A ROHC-FN specification consists of a sequence of zero or more
   constant definitions (Section 4.2) and one or more encoding method
   definitions, given in the form of structures (Section 4.11).

   Structures define an encoding method by giving one or more formats
   for uncompressed packets and one or more formats for compressed
   packets.  These formats are linked by so-called fields, each of which
   describes a certain part of an uncompressed and/or a compressed
   format.

   The properties of a field are defined by defining an encoding method
   for it, typically in the compressed format.  This encoding method can
   be one defined in a structure or it can be a predefined encoding
   method.  Predefined encoding methods can be defined in the text
   accompanying a formal specification, or they can be defined in the
   present document.

4.2  Constant Definitions

   Constant values can be defined using the "=" operator.  Identifiers
   for constants must be all upper case.  For example:


      SOME_CONSTANT = 3;

   Constants can be defined by any expression on the right hand side of
   the "=" operator (see Section 4.4).

4.3  Field Attributes

   In ROHC-FN, the properties of a field are defined by an encoding
   method.  The encoding method's formal semantics are specified using
   a set of attributes.  This set of attributes entirely characterises
   the relationship between the uncompressed and compressed
   representation of a field.  Both of these representations are bit
   strings.  The notation defines seven attributes, three for the
   uncompressed field, three for the compressed field and one to assert
   the existence of a context entry for the field.  The attributes
   available for each field are:

   o  "uncomp_value", "uncomp_length" and "uncomp_hdr_start" --
      uncompressed attributes of the field
   o  "comp_value", "comp_length" and "comp_hdr_start" -- compressed
      attributes of the field
   o  "has_context" -- context information

   Attributes of a particular field are referred to formally by using
   the field's reference (see Section 4.6), followed by a ":" and the
   attribute's identifier.  For example:

     tcp_ip.options.list_length:uncomp_value

   gives the numerical uncompressed value of the field referenced.  The
   attributes are explained in more detail below.

   The two value attributes contain the respective numeric values of the
   field as a non-negative integer by encoding the bit string
   most-significant bit first, i.e.  "uncomp_value" gives the numerical
   value of the uncompressed aspect of the field, and the attribute
   "comp_value" gives the numerical value of the compressed aspect of
   the field.

   The two length attributes indicate the length in bits of the
   associated bit string; "uncomp_length" for the uncompressed
   representation, and "comp_length" for the compressed representation.

   Finally, the "has_context" attribute indicates whether there is any
   "context" available for the field.  The context kept for a particular
   field contains information about previous value(s) of the field.
   This information is needed for encoding methods such as "static" and
   "lsb" (see Section 4.9), which refer back to the previous value of
   the field.  This attribute is particularly useful for list encoding,
   as it can be necessary for the notator to find out whether context
   information is available (see Section 4.12.2).
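
   One possible way to picture the seven attributes is as a record
   with one slot per attribute, as in the Python sketch below.  The
   class and helper names are hypothetical.  Representing an undefined
   attribute as None, and giving an empty bit string the value 0, are
   assumptions of the sketch rather than rules of the notation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FieldAttributes:
    """The seven attributes of a field; None means undefined."""
    uncomp_value: Optional[int] = None
    uncomp_length: Optional[int] = None
    uncomp_hdr_start: Optional[int] = None
    comp_value: Optional[int] = None
    comp_length: Optional[int] = None
    comp_hdr_start: Optional[int] = None
    has_context: bool = False


def bits_to_value(bits: str) -> int:
    """Read a bit string as a non-negative integer, MSB first."""
    # Assumption: an empty bit string is given the value 0.
    return int(bits, 2) if bits else 0


def set_uncompressed(attrs: FieldAttributes, bits: str) -> None:
    """Fill in the uncompressed value and length from a bit string."""
    attrs.uncomp_value = bits_to_value(bits)
    attrs.uncomp_length = len(bits)


ttl = FieldAttributes()
set_uncompressed(ttl, "01000000")
```

   After the call, ttl.uncomp_value is 64 and ttl.uncomp_length is 8,
   while the compressed attributes remain undefined.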

4.4  Expressions

   Expressions can be made up of any of the following components:

      Integers

         Integers can be expressed as decimal values, binary values
         (prefixed by 0b), or hex values (prefixed by 0x).  Negative
         integers are prefixed by a "-" sign.

      Integer operations

         The operators +, -, *, / and ^ are available, along with ( and
         ) for grouping.  Note that k / v is undefined if k is not an
         integer multiple of v (i.e.  if it does not evaluate to an
         integer).  However, k // v is always defined.  The precedence
         for each of the operators, along with parentheses is given
         below (higher precedence first):

            (, )
            ^
            *, /
            +, -

      x ^ y
         Evaluates to x raised to the power of y.

      x / y
         Evaluates to the integer division of x by y, i.e.  x divided by
         y, rounded down to the nearest integer.  It is undefined when y
         is zero.

      mod (k, v)
         Evaluates to k - v * (k / v).

      log2 (w)
         Evaluates to the smallest integer k where w <= 2^k, i.e.  it
         returns the smallest number of bits in which value w can be
         stored.

      Boolean operations

         The following boolean operators are available:

            &&, for logical and
            ||, for logical or
            !, for logical not
         The boolean values are 0 (false) and 1 (true).

         boolean1 && boolean2
            Returns true if both boolean1 and boolean2 are true.
            Returns false otherwise.

         boolean1 || boolean2
            Returns true if at least one of boolean1 or boolean2 is
            true.  Returns false otherwise.

         !boolean
            Returns true if boolean is false.  Returns false otherwise.

      Comparison operations

         The following comparison operators are available:

            ==, != for equality ("is equal" and "is not equal",
            respectively)
            <, >, <=, >= for comparison ("is smaller than", "is larger
            than", "is smaller than or equal to" and "is larger than or
            equal to" respectively)

         x == y
            Returns true if x is equal to y.  Returns false otherwise.

         x != y
            Returns true if x is not equal to y.  Returns false
            otherwise.

         x < y
            Returns true if x is less than y.  Returns false otherwise.

         x <= y
            Returns true if x is less than or equal to y.  Returns false
            otherwise.

         x > y
            Returns true if x is greater than y.  Returns false
            otherwise.

         x >= y
            Returns true if x is greater than or equal to y.  Returns
            false otherwise.
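
   The integer functions above can be mirrored in Python roughly as
   follows.  This sketch takes the "rounded down" description of
   integer division at face value and models it with Python's floor
   division.  The "fn_" names are hypothetical, and restricting log2
   to arguments of at least 1 is an assumption of the sketch.

```python
def fn_pow(x: int, y: int) -> int:
    """x ^ y: x raised to the power of y."""
    return x ** y


def fn_mod(k: int, v: int) -> int:
    """mod (k, v) = k - v * (k / v), with the division rounded down."""
    return k - v * (k // v)


def fn_log2(w: int) -> int:
    """Smallest integer k such that w <= 2^k (w >= 1 assumed)."""
    if w < 1:
        raise ValueError("log2 is only sketched for w >= 1")
    return (w - 1).bit_length()


def fn_and(a: int, b: int) -> int:
    """a && b, using 0 for false and 1 for true."""
    return 1 if a and b else 0
```

   For example, fn_mod(7, 3) gives 1, and fn_log2(256) gives 8 whereas
   fn_log2(257) gives 9.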

   Expressions may refer to any of the attributes of each field (as
   described in Section 4.3), and also to any defined constant (see
   Section 4.2).

   If any of the attributes or constants used in the expression are
   undefined, the value of the expression is undefined.  Undefined
   expressions are illegal.
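
   The rule on undefined operands can be illustrated with a small
   Python sketch.  The environment layout (a dictionary keyed by
   constant name or "field:attribute", with None for an undefined
   entry), the exception type and the function name are all
   assumptions made for the example.

```python
from typing import Dict, List, Optional


class UndefinedExpression(Exception):
    """Raised when an expression refers to an undefined operand."""


def evaluate_sum(names: List[str],
                 env: Dict[str, Optional[int]]) -> int:
    """Evaluate name_1 + name_2 + ..., rejecting undefined operands."""
    total = 0
    for name in names:
        value = env.get(name)
        if value is None:
            raise UndefinedExpression("'{}' is undefined".format(name))
        total += value
    return total


ENV = {
    "SOME_CONSTANT": 3,
    "ttl:uncomp_length": 8,
    "ttl:comp_length": None,   # not yet defined
}
```

   Here the sum of "SOME_CONSTANT" and "ttl:uncomp_length" evaluates
   to 11, whereas any sum involving "ttl:comp_length" is rejected.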

   Expressions cannot be used as encoding methods.  This is because they
   cannot completely characterise an uncompressed field; in particular,
   the length of the uncompressed field would be undefined for the
   decompressor.

4.5  Expressions: NOTE:Merge+Remove

   ROHC-FN includes the usual infix style of expressions, with
   parentheses "(" and ")" used for grouping.  Expressions can be made
   up of any of the following components:

      Integers

         Integers can be expressed as decimal values, binary values
         (prefixed by 0b), or hexadecimal values (prefixed by 0x).
         Negative integers are prefixed by a "-" sign (note that there
         is no unary minus operator).

      Integer, comparison and boolean operations

         The following operators are defined on integers.  Their
         precedence and semantics are generally as in the C programming
         language, with the following exceptions:

            There is no limit on the range of integers.
            The expression div(k,v) is only defined if k is an integer
            multiple of v (i.e.  it always evaluates to an integer, with
            no residue).
            The expression k/v is always defined (for v != 0) and is
            evaluated as in C.
            The expression mod(k,v) is used instead of C language k % v,
            as the "%" character is the comment character.
            x ^ y evaluates to x raised to the power of y.
            log2(v) evaluates to the smallest integer k where v <= 2^k,
            i.e., it returns the smallest number of bits in which value
            v can be stored.

         field/attribute reference syntax ("." and ":")
         ! (unary), function application f(x)
         ^
         * /
         + -
         < <= > >=
         == !=
         &
         |
         &&
         ||
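
   The integer exceptions listed above can be illustrated with a
   minimal Python sketch.  The helper names (c_div, fn_div, fn_mod,
   fn_log2) are hypothetical and are not part of ROHC-FN.  The sketch
   assumes that "as in C" means truncation toward zero for both the
   quotient and the remainder, and it applies the log2 definition to
   positive arguments only.

```python
def c_div(k, v):
    """k/v 'as in C': the quotient truncated toward zero (v != 0)."""
    q = abs(k) // abs(v)
    return q if (k >= 0) == (v >= 0) else -q


def fn_div(k, v):
    """div(k,v): defined only when k is an integer multiple of v."""
    if v == 0 or k % v != 0:
        return None  # undefined
    return k // v


def fn_mod(k, v):
    """mod(k,v): remainder matching C's k % v (sign follows k)."""
    return k - v * c_div(k, v)


def fn_log2(v):
    """log2(v): smallest integer k such that v <= 2^k, for v >= 1."""
    if v < 1:
        return None  # assumption: not defined for non-positive values
    return (v - 1).bit_length()
```

   For instance, c_div(-7, 2) gives -3, whereas Python's own floor
   division would give -4, and fn_div(7, 2) is undefined.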

   Expressions may refer to any of the attributes of each field (as
   described in Section 4.3), and also to any defined constant (see
   Section 4.2).

   If any of the attributes or constants used in an expression is
   undefined, the value of the expression is undefined.  An undefined
   expression causes the environment in which it is used (e.g., the
   packet format) to fail.  It is possible to test whether an
   expression has an undefined value by comparing it to the keyword
   "null".  For example:
      field == null

4.6  Field References

   A field reference followed by a dot and a field name refers to the
   named field that is an immediate child within the referenced field.
   [needs fixing]

4.7  Reserved Keywords

4.7.1  "let"

   The reserved keyword "let" takes a boolean expression as a parameter.
   It can be used to assert that the expression has a specific value, in
   order to choose a particular packet format from a list of possible
   formats:

     let (<boolean expression>)

   When the boolean expression evaluates to false, the packet format
   containing the expression fails, i.e.  this packet format cannot be
   selected by the compressor.

   A "let" statement is always part of a field encoding list.

4.7.2  "this"

   Within a structure it is possible to refer to the field it is
   encoding, using the keyword "this".  This is useful for gaining
   access to the attributes of the field being encoded.  For example, it
   is often useful to know the total length of the uncompressed header
   which is being encoded.

4.8  Comments

   Comments do not affect the formal meaning of what is notated, but can
   be used to improve readability.  Their use is optional.

   Free English text can be inserted into a profile definition to
   explain why something has been done a particular way, to clarify the
   intended meaning of the notation, or to elaborate on some point.  To
   this end, the two commenting styles described in the subsections
   below can be used.

   Comments help clarify the definition for the reader and can serve
   different purposes for implementers.  They should therefore not be
   considered of lesser importance when inserting them into the formal
   definition of a profile; they should be consistent with the
   normative part of the profile.

4.8.1  End of line comments

   The end of line comment style makes use of the "%" comment character.
   Any text between the "%" character and the end of the line has no
   formal meaning.  For example:

     %-----------------------------------------------------------------
     %    IR-REPLICATE packet formats
     %-----------------------------------------------------------------
     % The following fields are included in all of the IR-REPLICATE
     % packet formats:
     %
     uncompressed_format   =   discriminator,    % [  8 ] bits
                               tcp.seq_number,   % [ 32 ] bits
                               tcp.flags.ecn,    % [  2 ] bits


4.8.2  Block comments

   The block comment style makes use of the "/*" and "*/" delimiters.
   Any text between the "/*" and the "*/" has no formal meaning.  For
   example:

     /******************************************************************
      *   IR-REPLICATE packet formats
      *****************************************************************/

     /* The following fields are included in all of the IR-REPLICATE
      * packet formats:
      */
     uncompressed_format   =   discriminator,    /*   8 bits */
                               tcp.seq_number,   /*  32 bits */
                               tcp.flags.ecn,    /*   2 bits */

   The block comment style allows comments to be nested, unlike some
   programming languages such as C, C++ or Java.
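
   The two comment styles, including nesting of block comments, can be
   sketched as a small Python pre-processing step.  The function name
   strip_comments is hypothetical.  How the two styles interact is not
   specified above, so the sketch assumes that "%" has no special
   meaning inside a block comment and that block delimiters following
   a "%" on the same line are ignored.

```python
def strip_comments(text):
    """Remove '%' end-of-line comments and nested '/* */' comments."""
    out = []
    depth = 0  # current nesting level of block comments
    i = 0
    while i < len(text):
        two = text[i:i + 2]
        if two == "/*":
            depth += 1
            i += 2
        elif two == "*/" and depth > 0:
            depth -= 1
            i += 2
        elif depth > 0:
            # Inside a block comment: drop text but keep line breaks.
            if text[i] == "\n":
                out.append("\n")
            i += 1
        elif text[i] == "%":
            # End-of-line comment: skip up to (not including) newline.
            while i < len(text) and text[i] != "\n":
                i += 1
        else:
            out.append(text[i])
            i += 1
    if depth:
        raise ValueError("unterminated block comment")
    return "".join(out)
```

   A counter is used rather than a flag because a nested comment must
   only end when every opening delimiter has been matched.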

4.9  Encoding Methods

   ROHC (RFC 3095 [4]) contains a number of different techniques for
   compressing header fields (LSB encoding, value encoding, etc.).  Most
   of these techniques are part of the ROHC-FN library so that they can
   be reused when creating new ROHC profiles.  The notation for these is
   described below.  Encoding methods can be defined using structures
   (see Section 4.11).  It is also possible for a profile to
   define its own set of encoding methods using the formal notation or
   using a textual definition.

4.9.1  uncompressed_value

   The "uncompressed_value" encoding method is used to encode header
   fields for which the uncompressed value can be defined using a
   mathematical expression (including constant values):

     field     ::= uncompressed_value (uncomp_length_param, <expression>);

   where the "uncomp_length_param" binds with the field's
   "uncomp_length" attribute, and where <expression> is a mathematical
   expression.  The value of <expression> binds with the field's
   "uncomp_value" attribute.

   For example, the IPv6 header version number is a four-bit field that
   always has the value 6:

     version             ::=   uncompressed_value (4, 6);

   Another example of value encoding, using an expression:

     data_offset     ::= uncompressed_value (4,
                           (uncomp_value(tcp_ip.options.list_length)
                            + 160) / 32);

   In both examples above, since the value is either fixed or described
   entirely in terms of a known expression, it is omitted from the
   compressed header.
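
   As a worked illustration, the following Python sketch evaluates the
   data offset expression and shows a field of this kind compressing
   to zero bits.  All function names are hypothetical.  The sketch
   assumes that the options list length is counted in bits (as the
   constants 160 and 32 suggest) and that encoding fails when the
   actual field value does not match the expression.

```python
def data_offset(options_list_length_bits):
    """Evaluate (list_length + 160) / 32 for a TCP header."""
    return (options_list_length_bits + 160) // 32


def compress_uncompressed_value(actual, uncomp_length, expected):
    """Return the compressed bits ('' on success) or None on failure."""
    if actual != expected or not 0 <= expected < (1 << uncomp_length):
        return None  # assumption: a mismatch makes the encoding fail
    return ""        # nothing is sent in the compressed header


def decompress_uncompressed_value(uncomp_length, expected):
    """Rebuild the field as a bit string from the expression alone."""
    return format(expected, "0{}b".format(uncomp_length))
```

   With no TCP options the expression yields 5, the usual data offset
   of a 20-octet TCP header, and 96 bits of options yield 8.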

4.9.2  compressed_value

   The "compressed_value" encoding method is used to define fields in
   the compressed header for which there is no counterpart in the
   uncompressed header.  It can be used to set compressed fields whose
   value can be defined using a mathematical expression (including
   constant values):

     field     ::= compressed_value (comp_length_param, <expression>);

   where the "comp_length_param" binds with the field's "comp_length"
   attribute, and where <expression> is a mathematical expression.  The
   value of <expression> binds with the field's "comp_value" attribute.

   One possible use of this encoding method is to define padding in the
   compressed header:

     pad_to_octet_boundary      ::=   compressed_value (3, 0);

   Another is to define a discriminator field to make it possible to
   differentiate between different packet formats within a structure.
   For convenience, the notation provides syntax for specifying value
   encoding in the form of a binary string.  The binary string to be
   encoded is simply given in single quotes.  For example:

     discriminator     ::=   '01101';

   This has exactly the same meaning as:

     discriminator     ::=   compressed_value(5, 13);
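
   The equivalence between the two forms can be checked mechanically.
   The Python sketch below, with the hypothetical function name
   binary_string_to_compressed_value, maps a quoted binary string to
   the corresponding length and value parameters.

```python
def binary_string_to_compressed_value(literal):
    """Map a quoted binary string to (comp_length, comp_value)."""
    bits = literal.strip("'")
    if not bits or set(bits) - {"0", "1"}:
        raise ValueError("not a binary string: " + literal)
    # The length is the number of digits; the value is read base 2.
    return len(bits), int(bits, 2)
```

   Applied to '01101' this gives a length of 5 and a value of 13,
   matching the example above.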


4.9.3  irregular

   The "irregular" encoding method is used to encode a field in the
   compressed packet with a bit pattern identical to the original field
   in the uncompressed packet:

     field         ::=   irregular (<expression>);

   where <expression> binds with the "uncomp_length" attribute of the
   field.

   For example, the checksum field of the TCP header is a 16-bit field
   that does not follow any pattern:

     tcp_checksum  ::=   irregular (16);

   The expression can be used to derive the length of the field from the
   value of another field, and the length does not have to be constant.

4.9.4  static

   The "static" encoding method compresses a field whose length and
   value are the same as for a previous header in the flow, i.e.  where
   the field completely matches an existing entry in the context:

     field            ::=   static;

   The field's "uncomp_value" and "uncomp_length" attributes bind with
   their respective values in the context.

   Since the field value is the same as a previous field value, the
   entire field can be reconstructed from the context, so it is
   compressed to zero bits and does not appear in the compressed header.

   For example, the source port of the TCP header is a field whose value
   does not change from one packet to the other for a given flow:

     src_port  ::=   static;


4.9.5  lsb

   The Least Significant Bit encoding method, "lsb", compresses a field
   whose value differs by a small amount from the value stored in the
   context.

     field            ::=   lsb (num_lsbs_param, offset_param);

   Here, "num_lsbs_param" is the number of least significant bits to
   use, and "offset_param" is the interpretation interval offset.  The
   parameter "num_lsbs_param" binds with the "comp_length" attribute,
   and the "uncomp_value" attribute binds with (context_value -
   offset_param + comp_value).

   The "lsb" encoding method can compress a field whose value lies
   between (context_value - offset_param) and (context_value -
   offset_param + 2^num_lsbs_param - 1) inclusive.  In particular, if
   offset_param = 0 then the field value can only stay the same or
   increase relative to the previous header in the flow.  If
   offset_param = -1 then it can only increase, whereas if offset_param
   = 2^num_lsbs_param then it can only decrease.

   The compressed field takes up the specified number of bits in the
   compressed header (i.e.  num_lsbs_param).

   For example, a sequence number used as a control field that can only
   increase:

     msn               ::=   lsb (2, -1);

   See the ROHC specification (RFC 3095 [4]) for additional details on
   LSB encoding, where the parameter "k" corresponds to the parameter
   "num_lsbs_param" and where interpretation interval offset "p"
   corresponds to the parameter "offset_param".
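
   The interpretation interval can be illustrated with a short Python
   sketch.  It follows the RFC 3095 description, in which the
   transmitted bits are the low-order bits of the field value and the
   decompressor picks the one value in the interval that ends in those
   bits; that this is how the attribute binding above is realized is
   an assumption.  The function names are hypothetical.

```python
def lsb_interval(context_value, num_lsbs, offset):
    """Return (low, high) of the interpretation interval."""
    low = context_value - offset
    return low, low + (1 << num_lsbs) - 1


def lsb_encode(value, context_value, num_lsbs, offset):
    """Return the low-order bits of value, or None if out of range."""
    low, high = lsb_interval(context_value, num_lsbs, offset)
    if not low <= value <= high:
        return None  # encoding fails; another format must be used
    return value & ((1 << num_lsbs) - 1)


def lsb_decode(bits, context_value, num_lsbs, offset):
    """Return the unique value in the interval ending in bits."""
    low, _ = lsb_interval(context_value, num_lsbs, offset)
    mask = (1 << num_lsbs) - 1
    return low + ((bits - low) & mask)
```

   For the "msn" example, lsb (2, -1) with a context value of 5 gives
   the interval 6 to 9, so the value can only increase.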

4.9.6  crc

   The "crc" encoding method provides a CRC calculated over a block of
   data.  The block of data is represented using either the
   "uncomp_value" or "comp_value" attribute of a field.  The "crc"
   method takes a number of parameters:

   o  the number of bits for the CRC (num_bits);

   o  the bit-pattern for the polynomial (bit_pattern);

   o  the initial value for the CRC register (initial_value);

   o  the value of the block of data (block_data_value); and

   o  the size in octets of the block of data (block_data_length).

   I.e.:

     field   ::=   crc (num_bits, bit_pattern, initial_value,
                        block_data_value, block_data_length)

   The CRC is calculated in least significant bit (LSB) order.

   The following CRC polynomials are defined in RFC 3095 [4], in
   Sections 5.9.1 and 5.9.2:

      8-bit
         C(x) = x^0 + x^1 + x^2 + x^8
         bit_pattern = 0xe0
      7-bit
         C(x) = x^0 + x^1 + x^2 + x^3 + x^6 + x^7
         bit_pattern = 0x79
      3-bit
         C(x) = x^0 + x^1 + x^3
         bit_pattern = 0x06

   For example:

        crc_field   ::=   crc (3, 0x6, 0xF, 0x3, 40)  % 3 bits
                                                      % C(x) = x^0 + x^1 + x^3
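
   A bit-serial CRC in LSB order can be sketched in Python as follows.
   The function name crc_lsb_first is hypothetical, and the data block
   is passed as a bytes object rather than as a separate value and
   length.  Processing the bits of each octet least significant bit
   first is an assumption consistent with the LSB ordering stated
   above.

```python
def crc_lsb_first(num_bits, bit_pattern, initial_value, data):
    """Compute a num_bits-wide CRC over data, LSB first."""
    crc = initial_value & ((1 << num_bits) - 1)
    for byte in data:
        for i in range(8):
            bit = (byte >> i) & 1
            if (crc ^ bit) & 1:
                # Feedback bit set: shift and apply the polynomial.
                crc = (crc >> 1) ^ bit_pattern
            else:
                crc >>= 1
    return crc
```

   The bit patterns listed above (0xe0, 0x79 and 0x06) can be passed
   directly as bit_pattern together with the matching num_bits.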


4.10  Profile-specific Encoding Methods

   The library of encoding methods provides a basic, generic set of
   field encoding methods.  Some additional encodings specific to a
   particular protocol may however be needed, such as methods that
   infer the value of a field from other values.  These methods are
   defined based on the properties of the protocol being compressed.

   Profiles may define additional encoding methods; the scope of these
   methods is then local to the profile definition itself, and they can
   be used as part of the formal definition of the profile like any
   other method from the library (see Section 4.9).

   Profile-specific encoding methods must be rigorously defined, either
   in ROHC-FN syntax or in plain text, such that each definition
   provides enough information to implement the encoding method
   unambiguously in both the compressor and the decompressor.  These
   methods should be no less complete than the methods provided herein.

4.11  Structures

   Structures are used for defining new encoding methods in a formal
   specification.  They can compose groups of individual fields into
   contiguous blocks.  Structures can be thought of as compound encoding
   methods; they have names and may have parameters and can be used in
   the same way as any other encoding method.  Since structures can
   contain references to other structures, complicated headers can be
   broken down into manageable pieces.

   This section describes the various features of structures, starting
   out with the simplest.

4.11.1  Simple Structures

   A structure can be used to specify a single fixed encoding.  This is
   its simplest form.  For example:

     compound_encoding_method ===
     {
       uncompressed_format   =   field_1, %  [  4 ]
                                 field_2; %  [ 12 ]

       compressed_format  =   field_2,    %  [  0 ]
                              field_1     %  [  4 ]
       {
         field_1   ::=   irregular (4);
         field_2   ::=   uncompressed_value (12, 9);
       };
     };

   The above begins with the structure name, "compound_encoding_method".
   This name is followed by "===", which indicates that this is a
   structure definition.  The definition of the structure then follows
   inside curly braces, "{" and "}".  The first item in the definition
   is the "uncompressed_format" field order list, which gives the order
   of the fields in the uncompressed header.  This is followed by the
   compressed header field order list.  This list is in turn followed by
   the field encodings list for the compressed header, which gives the
   encoding method for each field.  The different components of this
   example are described in more detail below.

   The encoding methods defined for the fields must define the
   "uncomp_length" attribute so there is an unambiguous mapping from the
   bits in the uncompressed header to the fields listed in the field
   order list.

4.11.1.1  control_fields

   Control fields are defined using the "control_fields" list, which
   specifies control fields that do not appear in the uncompressed
   header but that are used for compression.  [Editor - write more here
   + include in example]

4.11.1.2  uncompressed_format

   The uncompressed field order list is defined by
   "uncompressed_format", which specifies the fields of the uncompressed
   header in the order in which they appear in the uncompressed header.
   In the example, this is "field_1" followed by "field_2".

   Note that the arrangement of fields specified in the uncompressed
   field order list is up to the notator.  Any arrangement of fields
   that correctly describes the content of the uncompressed header may
   be chosen -- this need not be the same as the one described in the
   specifications for the protocol header being compressed.  However,
   the bits of the uncompressed format must remain in the same order.

   For example, there may be a protocol whose header contains a 16-bit
   sequence number, but whose sessions tend to be short-lived.  This
   would mean that the high bits of the sequence number are almost
   always constant.  The "uncompressed_format" could reflect this by
   splitting the original uncompressed field into two fields, one field
   to represent the almost-always-zero part of the sequence number, and
   a second field to represent the significant part.

   An uncompressed format may contain a field encodings list.  Encoding
   methods specified therein are used whenever a packet with this
   uncompressed format is being encoded, regardless of the selected
   compressed format.  If an uncompressed format contains
   let-statements, the encoding of a packet with this uncompressed
   format can only succeed if the specified expressions evaluate to true
   (see Section [TBA]).

4.11.1.3  compressed_format

   Similar to the uncompressed field order list, the compressed data
   will appear in the order specified by the compressed field order list
   given for a compressed format.  Each individual field is encoded in
   the manner given for that field in the field encodings list, which is
   in braces and follows immediately after the compressed field order
   list.  The total length of the compressed data will be the total of
   the compressed lengths of all the individual fields.  In the example,
   the annotations indicate that the two fields are 0 and 4 bits long,
   making a total of 4 bits.

   Note that the order of the fields specified in the
   "compressed_format" field order list does not have to match the
   order in which they appear in the "uncompressed_format" field order
   list.  It may be desirable to reorder the fields in the compressed
   header, for example to align the compressed header to an octet
   boundary.  In the above example, the order is in fact the opposite
   of that in the uncompressed header.

   The field encodings list specifies that the encoding for "field_1"
   is "irregular", which takes up four bits in both the compressed
   header and uncompressed header.  The encoding for "field_2" is
   "uncompressed_value", which means that the field has a fixed value,
   so it can be compressed to zero bits.  The value it takes is 9, and
   it is 12 bits wide in the uncompressed header.

   Fields like "field_2", which compress to zero bits in length, may be
   omitted from the compressed field order list.  This is because their
   position in the list is not significant.  So, without changing the
   meaning, the above example could be notated as follows:

     compound_encoding_method ===
     {
       uncompressed_format   =   field_1, % [  4 ]
                                 field_2; % [ 12 ]

       compressed_format  =   field_1     % [  4 ]
       {
         field_1   ::=   irregular (4);
         field_2   ::=   uncompressed_value (12, 9);
       };
     };
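
   The behaviour of this structure can be mirrored by a minimal Python
   sketch that works on headers given as strings of '0' and '1'.  The
   function names compress and decompress are hypothetical, and the
   sketch assumes that encoding fails when "field_2" does not carry
   the value 9.

```python
FIELD_2_VALUE = 9  # from uncompressed_value (12, 9)


def compress(header_bits):
    """Compress a 16-bit header given as a string of '0'/'1'."""
    if len(header_bits) != 16:
        return None
    field_1, field_2 = header_bits[:4], header_bits[4:]
    if int(field_2, 2) != FIELD_2_VALUE:
        return None  # assumption: value mismatch makes encoding fail
    return field_1   # field_2 contributes zero bits


def decompress(compressed_bits):
    """Rebuild the 16-bit header from the 4 compressed bits."""
    if len(compressed_bits) != 4:
        return None
    # field_1 is copied as is; field_2 is regenerated from its value.
    return compressed_bits + format(FIELD_2_VALUE, "012b")
```

   Since "field_2" occupies no bits in the compressed header, its
   position in the compressed field order list has no effect on the
   output.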


4.11.2  Arguments and Structures

   Structures may take arguments, which have some control over the
   mapping between compressed and uncompressed fields.  These are
   specified immediately after the structure name, in parentheses, as a
   comma-separated list.  For example:

     poor_mans_lsb(variable_length) ===
     {
       uncompressed_format   =   constant_bits,
                                 variable_bits;

       compressed_format  =   variable_bits
       {
         constant_bits  ::=   static;
         variable_bits  ::=   irregular(variable_length);
       };
     };

   As with any encoding method, all arguments are values, rather than
   fields.  Although entire fields cannot be passed as arguments, it is
   possible to pass their attributes instead.

4.11.3  Multiple Formats

   Structures can also define multiple formats for a given header.  This
   allows different compression methods to be used depending on what is
   the most efficient way of compressing a particular header.

   For example, a field may have a fixed value most of the time, but the
   fixed value may occasionally change.  Using a single format for the
   structure, this field would have to be encoded using "irregular" (see
   Section 4.9.3), even though the value only changes rarely.  However,
   by using the structure to define multiple formats, we can provide two
   alternative encodings; one for when the value remains fixed and
   another for when the value changes.

   This is the topic of the following sub-sections.

4.11.3.1  Naming Convention

   When multiple compressed formats are defined, they must be defined
   using names beginning with "compressed_format", and each name must be
   unique.  In fact this is true even if only one format is given, but
   in that case, simply "compressed_format" will do, since there are no
   alternatives to differentiate between.

   Similarly, if multiple uncompressed formats are defined, they must be
   defined using names beginning with "uncompressed_format".

4.11.3.2  Format Discrimination

   Each of the compressed formats has its own field order list and field
   encodings list.  A compressor may pick any of these alternative
   formats to compress a header, as long as the field encodings it
   employs can be used with the uncompressed header.  For example, the
   compressor could not choose to use a compressed format that had a
   "static" encoding for a field whose value had just changed.

   More formally, the compressor can choose any combination of an
   uncompressed format and a compressed format for which all fields
   "succeed", i.e.  the encoding methods succeed and there are solutions
   for all the let-statements (see Section 4.7.1).  If there is no such
   combination, the encoding method defined by the structure "fails".
   If there are multiple such combinations, the compressor can choose
   one.

   On the other hand, it must be possible for the decompressor to
   discriminate between the different packet formats that the compressor
   may choose from.  A simple approach to this problem is for each
   compressed format to include a "discriminator" that uniquely
   identifies that particular format.  A discriminator is a control
   field; it is not derived from any of the uncompressed field values
   (see Section 4.9.2).
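
   One way to check that a set of discriminators lets the decompressor
   tell the formats apart is sketched below in Python.  The sketch
   assumes that every compressed format starts with its discriminator
   and that discriminators may differ in length, in which case no
   discriminator may be a prefix of another; neither assumption is
   stated above.  The function name is hypothetical.

```python
def discriminators_are_unambiguous(discriminators):
    """True if no discriminator bit string is a prefix of another."""
    items = sorted(discriminators)
    # After sorting, a string and any string it prefixes are adjacent
    # or separated only by other strings sharing that prefix.
    for a, b in zip(items, items[1:]):
        if b.startswith(a):
            return False
    return True
```

   For example, the discriminators '0' and '1' are unambiguous, while
   '0' and '01' are not, since a packet starting with '0' would match
   both.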

4.11.3.3  Default Encoding Methods - default_methods

   When using multiple compressed packet formats, default encoding
   methods can be specified for each field.  The default encoding
   methods specify the encoding method to use for a field if a given
   compressed format does not specify the encoding method for that
   field.  This is helpful to keep the definition of the packet formats
   concise, as the same encoding method need not be repeated for every
   compressed format.

   The syntax for specifying default encoding methods is similar to that
   used to specify a compressed format, except that there is no need to
   specify a field order list for the default encoding methods, since
   the field order is specified individually for each format; only the
   field encodings list is given.  For example:

     default_methods =
     {
       field_1           ::=   uncompressed_value (4,1);
       field_2           ::=   uncompressed_value (4,2);
       field_3           ::=   lsb(3,-1);
     };

   Fields for which there is a default encoding method do not need to be
   specified in the field encodings list of any compressed format which
   wishes to use the default encoding method for that field.  The
   default encoding method may however be overridden by specifying an
   explicit encoding method for that field.  If a default encoding
   method is not overridden, and that encoding method always compresses
   the field down to zero bits, then the field can also be omitted from
   the compressed format field order list, since, like any other zero
   bit field, its position in the field order list is not significant.

   The field encodings list of default_methods may also contain
   let-statements.  In this case every compressed format of the
   structure can only succeed if the specified expressions evaluate to
   true.  Note that let-statements cannot be overridden in compressed
   formats.
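
   The overriding rule for encoding methods can be pictured as a
   dictionary merge, as in the following Python sketch.  The function
   name effective_encodings and the overriding format shown are
   hypothetical; encoding methods are represented as plain strings,
   and let-statements are not modelled.

```python
def effective_encodings(default_methods, format_encodings):
    """Encodings for one compressed format: explicit entries win."""
    merged = dict(default_methods)   # start from the defaults
    merged.update(format_encodings)  # explicit encodings override
    return merged


# Defaults taken from the default_methods example above.
defaults = {
    "field_1": "uncompressed_value (4,1)",
    "field_2": "uncompressed_value (4,2)",
    "field_3": "lsb(3,-1)",
}

# Hypothetical compressed format that overrides field_3 only.
overrides = {"field_3": "irregular(3)"}

merged = effective_encodings(defaults, overrides)
```

   Here "field_1" and "field_2" keep their default encodings, while
   "field_3" uses the explicit one.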

4.11.3.4  Example of Multiple Formats

   Putting this all together, here is a complete example of a structure
   with multiple compressed formats:

     test_multiple_formats  ===
     {
       uncompressed_format   =   field_1,    % [  4 ]
                                 field_2,    % [  4 ]
                                 field_3;    % [ 24 ]

       default_methods =
       {
         field_1           ::=   static;
         field_2           ::=   uncompressed_value(4, 2);
         field_3           ::=   lsb(4, 0);
       };


       compressed_format_0   =   discriminator,    % [ 1 ]
                                 field_3           % [ 4 ]
       {
         discriminator     ::=   '0';
       };


       compressed_format_1   =   discriminator,    % [  1 ]
                                 field_1,          % [  4 ]
                                 field_3           % [ 24 ]
       {
         discriminator     ::=   '1';
         field_1           ::=   irregular(4);
         field_3           ::=   irregular(24);
       };
     };

   Note the following:

   o  "field_1" and "field_3" both have default encoding methods
      specified for them, which are used in "compressed_format_0" but
      are overridden in "compressed_format_1"; "field_2" however is not
      overridden.  Overriding one of the default encoding methods does
      not imply that all default encoding methods must be overridden.
   o  "field_1" and "field_2" have default encoding methods which
      compress to zero bits.  When these are used in
      "compressed_format_0", the field names do not appear in either the
      field order list or the field encodings list.
   o  "field_3" has an encoding method which does not compress to zero
      bits, so whilst "field_3" is absent from "compressed_format_0"'s
      field encodings list, it still needs to appear in the field order
      list to specify whereabouts it goes in the compressed packet.
   o  In the example, all the uncompressed header fields have default
      encoding methods specified for them, but this is not a
      requirement.  It is perfectly allowable to specify default
      encodings for only some, or even none, of the uncompressed header
      fields.
   o  In the example, all the default encoding methods are on fields
      from the uncompressed header, but this is not a requirement.  It
      is perfectly allowable to specify default encoding methods for
      control fields.
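
   The "test_multiple_formats" structure can be mirrored by the
   following Python sketch.  Headers and the context are represented
   as dictionaries of integers and packets as strings of '0' and '1';
   these representations and the function names are hypothetical.
   The sketch assumes RFC 3095 style LSB decoding for "field_3", a
   compressor that prefers the shorter format, and field values that
   fit their declared widths.

```python
FIELD_2_VALUE = 2            # uncompressed_value(4, 2)
LSB_BITS, LSB_OFFSET = 4, 0  # lsb(4, 0)


def bits(value, width):
    """Format an integer as a fixed-width bit string."""
    return format(value, "0{}b".format(width))


def compress(header, context):
    """Pick a compressed format for header, or None if both fail."""
    f1, f2, f3 = header["field_1"], header["field_2"], header["field_3"]
    if f2 != FIELD_2_VALUE:
        return None  # field_2 has no override, so no format succeeds
    low = context["field_3"] - LSB_OFFSET
    if f1 == context["field_1"] and low <= f3 < low + (1 << LSB_BITS):
        # compressed_format_0: all default encoding methods succeed
        return "0" + bits(f3 & ((1 << LSB_BITS) - 1), LSB_BITS)
    # compressed_format_1: field_1 and field_3 are sent in full
    return "1" + bits(f1, 4) + bits(f3, 24)


def decompress(packet, context):
    """Use the discriminator to select the format and decode it."""
    if packet[0] == "0":
        low = context["field_3"] - LSB_OFFSET
        mask = (1 << LSB_BITS) - 1
        f3 = low + ((int(packet[1:5], 2) - low) & mask)
        return {"field_1": context["field_1"],
                "field_2": FIELD_2_VALUE, "field_3": f3}
    return {"field_1": int(packet[1:5], 2),
            "field_2": FIELD_2_VALUE, "field_3": int(packet[5:29], 2)}
```

   A header whose "field_1" is unchanged and whose "field_3" lies
   within 15 of the context value compresses to 5 bits; any other
   header with "field_2" equal to 2 needs 29 bits.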

4.11.4  Recursive Structures

   It is possible to define structures recursively, by having one or
   more of the compressed formats of a structure encode a field using
   the structure itself.  For example:

     static_32_list(num_bits) ===
     {
       uncompressed_format_end = field_1;       %   [ 32 ] bits

       uncompressed_format_mid = field_1,       %   [ 32 ] bits
                                 tail;          %   [ num_bits - 32 ] bits


       compressed_format_end_of_list = field_1  %   [ 32 ] bits
       {
         field_1 ::= static_or_irregular(32);
       };

       compressed_format_mid_list = field_1,    %   [ 32 ] bits
                                    tail        %   [ num_bits - 32 ] bits
       {
         field_1 ::= static_or_irregular(32);
         tail    ::= static_32_list(num_bits - 32);
       };
     };

     static_or_irregular(length) ===
     {
       uncompressed_format = field;

       compressed_format_irregular = discriminator,  %  [ 1 ] bits
                                     field           %  [ length ] bits
       {
         discriminator ::= '0';
         field         ::= irregular(length);
       };

       compressed_format_static = discriminator,     %  [ 1 ] bits
                                  field              %  [ 0 ] bits
       {
         discriminator ::= '1';
         field         ::= static;
       };
     };


   The "static_or_irregular" structure will encode a field as either
   irregular, or static if there is a context value to refer back to.
   The "static_32_list" uses the "static_or_irregular" structure and one
   other encoding method, itself.  It encodes an arbitrary-length
   sequence of 32-bit fields as "irregular (32)", or static where
   possible.  We could use this, for example, to encode a CSRC list:

     csrc_list ::= static_32_list(96);   % 32 bits per item, 96 bits = 3 items

   This is exactly equivalent to notating the following:

     csrc_list_1 ::= static_or_irregular(32);
     csrc_list_2 ::= static_or_irregular(32);
     csrc_list_3 ::= static_or_irregular(32);

   In this case the recursive notation simply provides a mechanism to
   choose the number of list items at run time; the literal "96" in the
   above example could easily have been an expression.
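
   The recursion can be sketched in Python as follows.  The list is
   given as a non-empty sequence of 32-bit integers together with a
   context sequence of the same length, where None marks an item with
   no context value; this representation is an assumption made for
   illustration, and the output is a string of '0' and '1'.

```python
def static_or_irregular(word, context_word):
    """Encode one 32-bit word: '1' if static, else '0' + 32 bits."""
    if context_word is not None and word == context_word:
        return "1"
    return "0" + format(word, "032b")


def static_32_list(words, context_words):
    """Recursively encode a non-empty list of 32-bit words."""
    head = static_or_irregular(words[0], context_words[0])
    if len(words) == 1:
        return head  # the "end of list" format
    # The "mid list" format: the tail is encoded by the same method.
    return head + static_32_list(words[1:], context_words[1:])
```

   Encoding three items this way produces the same bits as applying
   static_or_irregular to each item in turn.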

   It is possible to notate extremely complicated list structures using
   the above technique.  However, special syntax is provided which
   simplifies the notating of lists considerably.  This is discussed in
   the next section.

4.12  Lists

   The above section has described how structures can be used to build
   individual fields into larger units, but largely for a fixed order of
   uncompressed fields.  Through the use of multiple uncompressed
   formats it is possible to cater for variable field order/presence
   using the above notation, but it quickly becomes cumbersome.  This
   section presents notation aimed specifically at encoding lists of
   fields, where the number and even the type of fields may vary from
   header to header in the flow.

4.12.1  Notation

   List notation is similar to that for structures above; the sections
   below describe it.

4.12.1.1  List name

   The notation for naming a list structure is the same as for ordinary
   structures, except that the structure name must begin with "list",
   and the first argument of the structure is always
   "list_length_in_bytes" and is the list length in bytes.  This
   parameter must always be present, even if not used explicitly, since
   it is used implicitly when encoding the list.

4.12.1.2  List body

   The notation for the body of the list structure definition is
   different from that described previously for ordinary structures.
   The body only contains definitions for the formats of all the
   possible list entries, and so the whole structure has an appearance
   similar to that of the compressed format field encodings, in an
   ordinary structure.  There must be at least one such entry.  For
   example:

     list_csrc(list_length_in_bytes) ===
     {
       list_item ::= irregular(32);
     };

   This defines a list of 32 bit irregular values.  The number of items
   in the list is determined by the length of the list in bytes.  Since
   this list structure defines no padding mechanism, the list length
   must be a multiple of 4 bytes (the size of one 32 bit item),
   otherwise list_csrc encoding will fail.  For example:

     csrc_list ::= list_csrc(32);   % OK: eight 4 byte items

     csrc_list ::= list_csrc(30);   % Not OK: not a multiple of 4 bytes

   End of list padding is discussed in the next section.

4.12.1.3  List Termination

   With the wide variety of protocols in use today, there are a number
   of different mechanisms used which indicate the end of a list.  Most
   commonly the list length is specified, and when that length is
   reached then the end of the list is reached.  However, other lists
   are terminated by an end-of-list "sentinel".

   In order to indicate the use of a sentinel in the uncompressed list,
   list structures have a reserved field, "end_of_list_sentinel", which
   specifies the list item that acts as the end-of-list marker.  In
   addition to this, the notator may define an "end_of_list_pad", which
   specifies how to encode the pad bytes which occur after the end of
   the list is reached, but before the total list length is reached.
   The pad is only needed if the end of the list may be reached with
   bytes to spare.  This is the case with the TCP options list for
   example:

     list_tcp_options(list_length_in_bytes) ===
     {
       end_of_list_sentinel ::= value(8, 0);
       end_of_list_pad      ::= value(8, 1);
         :
         :
       etc.
     };

   Note that if a list is always terminated by an end of list sentinel,
   with no padding afterwards, the "list_length_in_bytes" parameter can
   be passed to the list structure unbound, since it will be bound to
   the length of the list (inclusive of the end-of-list sentinel).  It
   is an error for the "list_length_in_bytes" to be passed unbound to a
   list structure which specifies an end_of_list_pad; the use of the pad
   requires "list_length_in_bytes" to have a bound value.  Conversely,
   if no "end_of_list_pad" is specified and "list_length_in_bytes" is
   bound, then it must match the list size (regardless of whether the
   list uses an end-of-list sentinel), or else the encoding will fail.
   Finally, if no "end_of_list_sentinel" is specified, the
   list_length_in_bytes must be passed to the list structure bound,
   otherwise there is no way to tell when the end of the list has been
   reached.
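
   The rules in the paragraph above can be summarised as a small check.
   The following Python sketch is only illustrative: the function name,
   its parameters and the returned messages are assumptions made for
   this example, and "bound_length" is None when "list_length_in_bytes"
   is passed unbound:

```python
def check_list_length(has_sentinel, has_pad, bound_length, actual_size):
    """Return None if the combination is acceptable, else a reason.

    bound_length is None when "list_length_in_bytes" is passed unbound;
    actual_size is the size of the list items, including any sentinel.
    """
    if bound_length is None:
        if has_pad:
            return "end_of_list_pad needs a bound list_length_in_bytes"
        if not has_sentinel:
            return "no sentinel, so list_length_in_bytes must be bound"
        return None  # the length becomes bound to the list size
    if not has_pad and bound_length != actual_size:
        return "list_length_in_bytes does not match the list size"
    return None


print(check_list_length(True, True, None, 5))
```

   The final call reports an error, since a pad is specified but the
   length is unbound.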

4.12.1.4  Use of the has_context attribute

   For each list item, a boolean flag is defined, called "has_context",
   which is available as an attribute of the field (see Section 4.3 for
   more information on attributes).  When the list item is encoded,
   this flag is set to false if the item is new, and to true otherwise.
   New list items have no context, and so cannot use encoding methods
   such as "lsb" or "static", which rely on context being present.  The
   purpose of this flag is to enable a list item to be compressed in
   different ways depending on the availability of context information
   for that list item.

   Typical behaviour for a list item is to be encoded as "irregular"
   when there is no context available, and "static" once the context
   becomes available.  However this does not suit all list items.  For
   example in TCP options, the no operation (or NOP) item has a fixed
   value of 1, so there is no need to specify two alternative encodings
   for it.  The timestamp field on the other hand is constantly
   changing, so static encoding would always fail, meaning it would have
   to be resent in full every time - better instead to use "lsb"
   encoding, or even a struct with several alternative "lsb" encodings.
   For example:

     list_tcp_options(list_length_in_bytes) ===
     {
       end_of_list_sentinel ::= value(8, 0);
       end_of_list_pad      ::= value(8, 1);
          :                        :
       timestamp            ::= tcp_timestamp_list_item();
          :                        :
          :                        :
       etc.
     };

     tcp_timestamp_list_item() ===
     {
       uncompressed_format = type,
                             length,
                             timestamp_value,
                             timestamp_echo_reply;

       default_methods =
       {
         type        ::= value(8, 8);
         length      ::= value(8, 10);
       };

       compressed_format_first = timestamp_value,
                                 timestamp_echo_reply
       {
         let (this:has_context == false);
         timestamp_value      ::= irregular(32);
         timestamp_echo_reply ::= irregular(32);
       };

       compressed_format_subsequent = timestamp_value,
                                      timestamp_echo_reply
       {
         let (this:has_context == true);
         timestamp_value      ::= lsb(16, 0);
         timestamp_echo_reply ::= lsb(16, 0);
       };
     };

   Note that no discriminator has been used to differentiate between the
   two compressed formats, since the "has_context" flag fulfils that
   role.  It would be redundant to add a discriminator here since the
   "has_context" flag is automatically included in the encoding of the
   list (see the next section for details of how lists are encoded).
   This is NOT the case for fields which are not list items; non-list
   fields must encode a discriminator explicitly.
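
   The way "has_context" selects between the two timestamp formats can
   be sketched in Python as follows.  This is an illustration under
   stated assumptions, not a definition: "lsb(16, 0)" is taken to mean
   that the 16 least significant bits are sent and that the new value
   lies between the context value and the context value plus 2^16 - 1,
   and wrap-around of the 32 bit fields is ignored.  The function name
   and the use of a tuple for the context are hypothetical:

```python
def encode_timestamp_item(ts_value, ts_echo_reply, context):
    """Return the compressed item as a bit string.

    context is None for a new item (has_context == false), otherwise a
    (previous_value, previous_echo_reply) tuple (has_context == true).
    """
    if context is None:
        # compressed_format_first: both fields sent as irregular(32)
        return format(ts_value, "032b") + format(ts_echo_reply, "032b")
    # compressed_format_subsequent: both fields sent as lsb(16, 0)
    bits = ""
    for new, old in zip((ts_value, ts_echo_reply), context):
        if not 0 <= new - old < 2 ** 16:
            raise ValueError("value outside the assumed lsb(16, 0) window")
        bits += format(new & 0xFFFF, "016b")
    return bits


first = encode_timestamp_item(1000, 900, None)
later = encode_timestamp_item(1005, 905, (1000, 900))
print(len(first), len(later))
```

   No discriminator bit appears in either output; the choice of format
   is driven entirely by whether a context exists.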

4.12.2  List Encoding

   The way a list structure is encoded is referred to as "type 0 list
   encoding".  It is defined as a generic list compression scheme in
   RFC 3095 [4], Section 5.8 ("List compression").

   Type 0 list encoding includes features which make it highly efficient
   at encoding the sorts of behaviour that occur in real protocols, such
   as items disappearing from the middle of a list and perhaps
   reappearing later in the flow.  Notating this sort of behaviour
   directly would require a large amount of notation, which would be
   hard to test and therefore error prone.

   The strength of type 0 list encoding is that it separates out the
   items that occur in the list from the order in which they occur.  The
   list items are stored in a table at both the compressor and the
   decompressor.  Updates to the table are transmitted if new list items
   are seen, otherwise, only references into the table (known as index
   items) need to be transmitted, rather than the list items themselves.
   Moreover, even the index items only need to be transmitted when there
   is a change to the list.
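
   The separation of items from their order can be illustrated with a
   short Python sketch.  It is a deliberate simplification: the class
   name and return convention are assumptions for this example, and it
   ignores everything else that a real compressor must do (such as
   tracking what the decompressor has actually received):

```python
class ListCompressorSketch:
    """Keeps an item table and sends indices instead of items."""

    def __init__(self):
        self.table = []           # item table, mirrored at the decompressor
        self.last_indices = None  # index items sent for the previous list

    def compress(self, items):
        """Return (index_items or None, new_items) for one header's list."""
        new_items = []
        indices = []
        for item in items:
            if item not in self.table:
                self.table.append(item)  # table update must be transmitted
                new_items.append(item)
            indices.append(self.table.index(item))
        if indices == self.last_indices:
            return None, new_items  # list unchanged: nothing to send
        self.last_indices = indices
        return indices, new_items


compressor = ListCompressorSketch()
print(compressor.compress(["a", "b"]))  # new items and their indices
print(compressor.compress(["a", "b"]))  # unchanged list
print(compressor.compress(["b"]))       # "a" has disappeared
```

   When an item disappears and later reappears, only its index needs to
   be sent again, since the item itself is still in the table.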

4.12.2.1  Formal Notation For List Encoding

   This section specifies the encoding of lists, using the formal
   notation.  Note that the notation given here is given only for the
   purposes of defining how lists are encoded.  It is not necessary for
   a notator to reproduce this notation every time he/she wishes to
   encode a list; it all happens automatically.

   The type 0 list encoding starts with a header, which specifies the
   format of the index item list.  In particular it specifies whether
   short or long index items are to be used, and how many of them there
   are:

   list_header(num_list_items_param, highest_index_param) ===
   {
     uncompressed_format = ;

     compressed_format = encoding_type           [  2 ],   % ET
                         generation_id_present   [  1 ],   % GP
                         xi_field_size           [  1 ],   % PS
                         list_item_count         [  4 ]    % CC
     {
       encoding_type         ::= '00';
       generation_id_present ::= '0';
       xi_field_size         ::= xi_size_encoding(highest_index_param);
       list_item_count       ::= compressed_value(4, num_list_items_param);
     };
   };


   % calculates the index item (xi) size flag
   xi_size_encoding(highest_index_param) ===
   {
     uncompressed_format = ;

     compressed_format_4_bit_field = xi_field_size
     {
       let(highest_index_param < 2^3);
       xi_field_size ::= '0';
     };

     compressed_format_8_bit_field = xi_field_size
     {
       let(highest_index_param < 2^7);
       xi_field_size ::= '1';
     };
   };

   Note that "num_list_items_param" must be derived from the header
   being compressed, and that "highest_index_param" comes from the
   compressor's knowledge of the items in the "xi list" (see below).
   "highest_index_param" is set to the maximum table index in the list.
   This means that even if the table currently contains more than 8
   items, the "xi_field_size" flag could still be zero, as long as the
   highest index referred to in the list is below 8 (note items are
   indexed from zero upwards).
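
   As a rough illustration, the header octet described above can be
   assembled in Python as shown below.  The function name and the use
   of an integer for the result are assumptions made for this sketch;
   the bit layout follows the "list_header" notation (ET, GP, PS, CC):

```python
def list_header(num_list_items, highest_index):
    """Build the one octet type 0 list header as an integer."""
    if not 0 <= num_list_items < 2 ** 4:
        raise ValueError("list_item_count must fit in 4 bits")
    if highest_index < 2 ** 3:
        xi_field_size = 0  # short (4 bit) index items
    elif highest_index < 2 ** 7:
        xi_field_size = 1  # long (8 bit) index items
    else:
        raise ValueError("index does not fit in a long index item")
    encoding_type = 0b00        # type 0 list encoding
    generation_id_present = 0   # no generation identifier
    return (
        (encoding_type << 6)
        | (generation_id_present << 5)
        | (xi_field_size << 4)
        | num_list_items
    )


print(format(list_header(3, 7), "08b"))
print(format(list_header(3, 8), "08b"))
```

   The two calls differ only in the highest index referred to, which is
   what flips the "xi_field_size" flag.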




   Immediately following the header is the index item list (or "xi
   list").  This is a contiguous list of index items, which specify what
   table indices to look up to find out what is in the list.  Each index
   item (or "xi") starts with a single bit flag, which indicates whether
   context is available for this item or not, followed by either three
   or seven bits to indicate the index into the table where the item is
   stored, depending on the index item size selected in the header:

   xi_list(xi_count_param, xi_size_param) ===
   {
     uncompressed_format = ;

     default_methods =
     {
       xi_1 ::= xi(xi_size_param);
       xi_2 ::= xi(xi_size_param);
     };


     compressed_format_mid = xi_1 [ xi_size_param ],
                             xi_2 [ xi_size_param ],
                             tail [ (xi_count_param - 2) * xi_size_param ]
     {
       let(xi_count_param > 2);
       tail ::= xi_list(xi_count_param - 2, xi_size_param);
     };

     compressed_format_even_end = xi_1 [ xi_size_param ],
                                  xi_2 [ xi_size_param ]
     {
       let(xi_count_param == 2);
     };

     % need a four bit pad at the end of the xi list if there are an
     % odd number in the list, and xi size is four bits
     compressed_format_odd_end = xi_1 [ xi_size_param ],
                                 pad  [ 8 - xi_size_param ]
     {
       let(xi_count_param == 1);
       pad ::= compressed_value(8 - xi_size_param, 0);
     };
   };



   xi(xi_size_param)  ===
   {
     uncompressed_format = ;

     compressed_format_new = new,
                             table_index
     {
       let(this:has_context == false);
       new         ::= '1';
       table_index ::= irregular(xi_size_param - 1);
     };

     compressed_format_old = new,
                             table_index
     {
       let(this:has_context == true);
       new         ::= '0';
       table_index ::= irregular(xi_size_param - 1);
     };
   };

   Note that "xi_size_param" must be calculated by the compressor to be
   either 4 or 8 (it can be derived by multiplying the "xi_field_size"
   flag by 4 and adding 4), and that the "has_context" attribute of each
   xi must be bound by the compressor to the value of the "has_context"
   attribute of the corresponding list item.
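
   The packing of index items can be illustrated with the following
   Python sketch.  It follows the prose description (one flag bit
   followed by three or seven index bits, giving index items of 4 or 8
   bits, with a pad to an octet boundary); the function names and the
   bit string representation are assumptions made for this example:

```python
def xi_size_from_flag(xi_field_size):
    """Derive the index item size in bits from the header flag."""
    return xi_field_size * 4 + 4


def xi_list(entries, xi_size):
    """Pack (has_context, table_index) pairs into a bit string."""
    if xi_size not in (4, 8):
        raise ValueError("xi size must be 4 or 8 bits")
    index_bits = xi_size - 1  # one bit is used by the "new" flag
    bits = ""
    for has_context, table_index in entries:
        if not 0 <= table_index < 2 ** index_bits:
            raise ValueError("table index too large for this xi size")
        new_flag = "0" if has_context else "1"
        bits += new_flag + format(table_index, "0{}b".format(index_bits))
    if len(bits) % 8:
        bits += "0" * (8 - len(bits) % 8)  # pad an odd 4 bit xi count
    return bits


print(xi_list([(True, 1), (False, 2), (True, 3)], xi_size_from_flag(0)))
```

   With three short index items the output is 16 bits long: 12 bits of
   index items followed by a four bit pad.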

   Immediately following the xi list is the encodings list.  This is
   the actual list of encodings for all the items referred to by the xi
   list.  The items occur in the same order as they do in the xi list,
   each encoded in the manner specified by the notator.  Assuming the
   following generic notation for lists:

     list (list_length_in_bytes) ===
     {
       end_of_list_sentinel ::= sentinel_encoding;
       end_of_list_pad      ::= pad_encoding;

       item_type_1          ::= item_type_1_encoding;
       item_type_2          ::= item_type_2_encoding;
          :                        :
          :                        :
       item_type_n          ::= item_type_n_encoding;
     };

   The item list is encoded as follows:

     item_list_encoding(list_length_in_bytes, list_end_reached) ===
     {
       uncompressed_format = item, tail;

       default_methods =
       {
         let (bytes_left == list_length_in_bytes - item:uncomp_length);
         tail ::= item_list_encoding(bytes_left, bytes_left == 0);
       };

       compressed_format_list_end =
       {
         let(list_length_in_bytes == 0);
         let(list_end_reached     == true);
       };

       compressed_format_sentinel = item, tail
       {
         item ::= sentinel_encoding;
         tail ::= item_list_encoding(bytes_left, true);
       };

       compressed_format_pad = item, tail
       {
         let (list_end_reached == true);
         item ::= pad_encoding;
         tail ::= item_list_encoding(bytes_left, true);
       };

       compressed_format_type_1 = item, tail
       {
         let (list_end_reached == false);
         item ::= item_type_1_encoding;
       };

       compressed_format_type_2 = item, tail
       {
         let (list_end_reached == false);
         item ::= item_type_2_encoding;
       };
          :                        :
          :                        :
       compressed_format_type_n = item, tail
       {
         let (list_end_reached == false);
         item ::= item_type_n_encoding;
       };
     };
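
   The recursion in "item_list_encoding" can be read as a walk over the
   uncompressed list: ordinary items until the sentinel (if any), then
   only pad items until the list length is used up.  The Python sketch
   below illustrates that reading.  It is an approximation written for
   this example only: "item_length" is a hypothetical helper supplied
   by the caller, and a sentinel is recognised simply by comparing
   bytes, which would misclassify an ordinary item that happened to
   start with the same bytes:

```python
def walk_item_list(data, item_length, sentinel=None, pad=None):
    """Classify the bytes of a list as items, sentinel and pad.

    item_length(data, pos) returns the length in bytes of the ordinary
    item starting at pos (hypothetical helper supplied by the caller).
    """
    out = []
    pos = 0
    list_end_reached = False
    while pos < len(data):
        if list_end_reached:
            # After the sentinel only pad items are allowed.
            if pad is None or data[pos:pos + len(pad)] != pad:
                raise ValueError("bytes after end of list are not padding")
            out.append(("pad", pad))
            pos += len(pad)
        elif sentinel is not None and data[pos:pos + len(sentinel)] == sentinel:
            out.append(("sentinel", sentinel))
            pos += len(sentinel)
            list_end_reached = True
        else:
            n = item_length(data, pos)
            if n <= 0 or pos + n > len(data):
                raise ValueError("item does not fit in the list length")
            out.append(("item", data[pos:pos + n]))
            pos += n
    return out


two_byte_items = lambda data, pos: 2
print(walk_item_list(b"\x05\x06\x00\x01\x01", two_byte_items,
                     sentinel=b"\x00", pad=b"\x01"))
```

   Reaching the end of the data corresponds to the
   "compressed_format_list_end" format, where no bytes are left.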


5.  Security considerations

   This draft describes a formal notation similar to ABNF (RFC 2234
   [3]), and hence is not believed to raise any security issues.

6.  Acknowledgements

   A number of important concepts and ideas have been borrowed from ROHC
   RFC 3095 [4].

   Thanks to Mark West, Eilert Brinkmann and Kristofer Sandlund for
   their cooperation and feedback from notating the TCP profile.

   Thanks to Rob Hancock and Stephen McCann for putting up with the
   authors' arguments and making helpful suggestions, frequently against
   the tide!

   The authors would also like to thank Carsten Bormann, Christian
   Schmidt, Qian Zhang, Hongbin Liao, Max Riegel and Lars-Erik Jonsson
   for their comments and encouragement.  We haven't always agreed, but
   the arguments have been fun!

7.  References

   [1]  Bradner, S., "The Internet Standards Process -- Revision 3", BCP
        9, RFC 2026, October 1996.

   [2]  Bradner, S., "Key words for use in RFCs to Indicate Requirement
        Levels", BCP 14, RFC 2119, March 1997.

   [3]  Crocker, D. and P. Overell, "Augmented BNF for Syntax
        Specifications: ABNF", RFC 2234, November 1997.

   [4]  Bormann, C., Burmeister, C., Degermark, M., Fukushima, H.,
        Hannu, H., Jonsson, L-E., Hakenberg, R., Koren, T., Le, K., Liu,
        Z., Martensson, A., Miyazaki, A., Svanbro, K., Wiebke, T.,
        Yoshimura, T. and H. Zheng, "RObust Header Compression (ROHC):
        Framework and four profiles: RTP, UDP, ESP, and uncompressed",
        RFC 3095, July 2001.

Authors' Addresses

   Robert Finking
   Siemens/Roke Manor
   Roke Manor Research Ltd.
   Romsey, Hampshire  SO51 0ZN
   UK

   Phone: +44 (0)1794 833189
   EMail: robert.finking@roke.co.uk
   URI:   http://www.roke.co.uk


   Ghyslain Pelletier
   Ericsson AB
   Box 920
   Luleå  SE-971 28
   Sweden

   Phone: +46 (0) 8 404 29 43
   EMail: ghyslain.pelletier@ericsson.com


   Richard Price
   Cogent Defence and Security Networks
   Queensway Meadows Industrial Estate
   Meadows Road
   Newport, Gwent  NP19 4SS

   Phone: +44 (0)1794 833681
   EMail: richard.price@cogent-dsn.com
   URI:   http://www.cogent-dsn.com

Appendix A.  Syntax

   This section gives a formal definition of the ROHC-FN syntax in ABNF
   (see RFC 2234 [3]).

A.1  Reserved Keywords

   Some keywords are defined and reserved in ROHC-FN.  These keywords
   cannot be reused as identifiers by the notator.

   o  comp_hdr_start - attribute
   o  comp_length - attribute
   o  comp_value - attribute
   o  compressed_format - struct syntax
   o  compressed_value - primitive encoding method
   o  default_methods - struct syntax
   o  irregular - primitive encoding method
   o  let - primitive encoding method
   o  lsb - primitive encoding method
   o  static - primitive encoding method
   o  uncomp_hdr_start - attribute
   o  uncomp_length - attribute
   o  uncomp_value - attribute
   o  uncompressed_format - struct syntax
   o  uncompressed_value - primitive encoding method

   reserved-word = primitive-encoding-method-name /
   attribute-identifier / struct-reserved-words

A.2  Characters

   Because ABNF [3] symbols are case insensitive, it is necessary to
   define explicit symbols for each of the lower case characters which
   we use in the reserved words of our grammar.  Fortunately there are
   no fundamental components of the FN syntax which are in upper case,
   otherwise we would have to define each capital letter separately
   also.

   a = %x61

   b = %x62

   c = %x63

   d = %x64

   e = %x65

   f = %x66

   g = %x67

   h = %x68

   i = %x69

   j = %x6a

   k = %x6b

   l = %x6c

   m = %x6d

   n = %x6e

   o = %x6f

   p = %x70

   q = %x71

   r = %x72

   s = %x73

   t = %x74

   u = %x75

   v = %x76

   w = %x77

   x = %x78

   y = %x79

   z = %x7a

   lower-case-letter = %x61-7a ; a-z

   upper-case-letter = %x41-5a ; A-Z

   binary-digit = "0" / "1"

   octal-digit = binary-digit / "2" / "3" / "4" / "5" / "6" / "7"

   decimal-digit = octal-digit / "8" / "9"

   hexadecimal-digit = decimal-digit / %x61-66

   open-bracket = "("

   close-bracket = ")"

   open-brace = "{"

   close-brace = "}"

   equals-sign = "="

   underscore = "_"

   comma = ","

   semi-colon = ";"

   single-quote = "'"

A.3  Literals

   decimal-literal = 1*decimal-digit

   binary-literal = "0".b 1*binary-digit

   octal-literal = "0".o 1*octal-digit

   hexadecimal-literal = "0".x 1*hexadecimal-digit

   numeric-literal = decimal-literal / binary-literal / octal-literal /
   hexadecimal-literal

A.4  Identifiers

   lower-case-identifier = (lower-case-letter *(lower-case-letter /
   decimal-digit / underscore))
   ; The original EBNF had "- reserved-word" here, meaning "except
   ; reserved words", but ABNF has no equivalent construct.
   ; Notwithstanding this fact, any automated tool should enforce the
   ; reservation of reserved words in this fashion.

   upper-case-identifier = upper-case-letter *(upper-case-letter /
   decimal-digit / underscore)

A.5  Operators

   exponential-operator = "^"

   multiplicative-operator = "*" / "/"

   additive-operator = "+" / "-"

   unary-minus = "-"

A.6  Expressions

   parenthesised-expression = open-bracket arithmetic-expression
   close-bracket

   primitive-expression = numeric-literal / constant-name /
   field-attribute / parenthesised-expression / (unary-minus
   primitive-expression)

   exponential-expression = primitive-expression *(exponential-operator
   primitive-expression)

   multiplicative-expression = exponential-expression
   *(multiplicative-operator exponential-expression)

   additive-expression = multiplicative-expression *(additive-operator
   multiplicative-expression)

   arithmetic-expression = additive-expression

A.7  Constants

   constant-name = upper-case-identifier

   constant-value = constant-name / arithmetic-expression

   constant-definition = constant-name equals-sign constant-value

A.8  Field Names

   field-name = lower-case-identifier

   annotated-field-name = field-name [ "[" constant-value "]" ]

A.9  Attributes

   attribute-category = (c.o.m.p) / (u.n.c.o.m.p)

   attribute-name = (l.e.n.g.t.h) / (v.a.l.u.e) /
   (h.d.r.underscore.s.t.a.r.t)

   attribute-identifier = attribute-category underscore attribute-name

   field-attribute = field-name ":" attribute-identifier

A.10  Encoding Methods

   primitive-encoding-method-name =
   (c.o.m.p.r.e.s.s.e.d.underscore.v.a.l.u.e) / (i.r.r.e.g.u.l.a.r) /
   (l.s.b) / (s.t.a.t.i.c) /
   (u.n.c.o.m.p.r.e.s.s.e.d.underscore.v.a.l.u.e)

   uncompressed-value-shorthand = single-quote *binary-digit
   single-quote

   external-encoding-method-name = underscore lower-case-identifier

   non-primitive-encoding-method-name = structure-name /
   external-encoding-method-name

   encoding-method-parameter-list = open-bracket arithmetic-expression
   *(comma arithmetic-expression) close-bracket

   encoding-method = uncompressed-value-shorthand /
   (encoding-method-name [encoding-method-parameter-list])

   field-encoding = field-name "::=" encoding-method

A.11  Structures

   structure-name = lower-case-identifier

   field-order-list = [ annotated-field-name *(comma
   annotated-field-name) ]

   field-encodings-list = open-brace *(field-encoding semi-colon)
   close-brace

   uncompressed-format-prefix =
   (u.n.c.o.m.p.r.e.s.s.e.d.underscore.f.o.r.m.a.t)

   uncompressed-format = uncompressed-format-prefix [underscore
   lower-case-identifier] equals-sign field-order-list semi-colon

   compressed-format-prefix =
   (c.o.m.p.r.e.s.s.e.d.underscore.f.o.r.m.a.t)

   compressed-format = compressed-format-prefix [underscore
   lower-case-identifier] equals-sign field-order-list
   field-encodings-list semi-colon

   default-methods-id = (d.e.f.a.u.l.t.underscore.m.e.t.h.o.d.s)

   default-methods = default-methods-id equals-sign field-encodings-list
   semi-colon

   uncompressed-format-list = *uncompressed-format

   compressed-format-list = 1*compressed-format

   structure-body = open-brace uncompressed-format-list
   [default-methods] compressed-format-list close-brace

   structure-definition = structure-name "===" structure-body semi-colon

   struct-reserved-words = uncompressed-format-prefix /
   compressed-format-prefix / default-methods-id

Appendix B.  Bit-level Worked Example

   This section gives a worked example at the bit level, showing how a
   simple profile describes the compression of real data from an
   imaginary packet format.  The example used has been kept fairly
   simple, whilst still aiming to illustrate some of the intricacies
   that arise in use of the notation.  In particular fields have been
   kept short to make it possible to read the binary representation of
   the headers by eye, without too much difficulty.

B.1  Example Packet Format

   Our imaginary header is just 16 bits long, and consists of the
   following fields:

   1.  version number - 2 bits
   2.  type - 2 bits
   3.  flow id - 4 bits
   4.  sequence number - 4 bits
   5.  flag bits - 4 bits

   So for example 0101000100010000 indicates a packet with a version
   number of one, a type of one, a flow id of one, a sequence number of
   one, and all flag bits set to zero.
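
   For readers who prefer to check the bit positions mechanically, the
   following Python sketch splits such a header into its fields.  The
   function name, the field names and the use of a bit string as input
   are choices made for this illustration only:

```python
FIELDS = (
    ("version_no", 2),
    ("type", 2),
    ("flow_id", 4),
    ("sequence_no", 4),
    ("flag_bits", 4),
)


def parse_header(bits):
    """Split a 16 character string of '0'/'1' into named field values."""
    if len(bits) != 16 or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly 16 binary digits")
    values = {}
    pos = 0
    for name, width in FIELDS:
        values[name] = int(bits[pos:pos + width], 2)
        pos += width
    return values


print(parse_header("0101000100010000"))
```

   Applied to the example header, every field has the value one except
   the flag bits, which are zero.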

B.2  Initial Encoding

   An initial definition based solely on the above information is:

   eg_header ===
   {
     uncompressed_format   =   version_no     [ 2 ],
                               type           [ 2 ],
                               flow_id        [ 4 ],
                               sequence_no    [ 4 ],
                               flag_bits      [ 4 ];

     compressed_format     =   version_no     [ 2 ],
                               type           [ 2 ],
                               flow_id        [ 4 ],
                               sequence_no    [ 4 ],
                               flag_bits      [ 4 ]
     {
       version_no          ::=   irregular(2);
       type                ::=   irregular(2);
       flow_id             ::=   irregular(4);
       sequence_no         ::=   irregular(4);
       flag_bits           ::=   irregular(4);
     };
   };

   This defines the packet nicely, but doesn't actually offer any
   compression.  If we use it to encode the above header, we get:

     Uncompressed header: 0101000100010000
     Compressed header:   0101000100010000

   This is because we have stated that all fields are irregular - i.e.
   we don't know anything about their behaviour.

B.3  Basic Compression

   In order to achieve any compression we need to notate our knowledge
   about the header, and its behaviour in a flow.  For example, we may
   know the following facts about the header:

   1.  version number - indicates which version of the protocol this is,
       always one for this version of the protocol.
   2.  type - may take any value.
   3.  flow id - may take any value.
   4.  sequence number - may take any value.
   5.  flag bits - contains three flags, a, b and c, each of which may
       be set or clear, and a reserved flag bit, which is always clear
       (i.e. zero).

   We could notate this knowledge as follows:

   eg_header ===
   {
     uncompressed_format   =   version_no     [ 2 ],
                               type           [ 2 ],
                               flow_id        [ 4 ],
                               sequence_no    [ 4 ],
                               abc_flag_bits  [ 3 ],
                               reserved_flag  [ 1 ];

     compressed_format     =   version_no     [ 0 ],
                               type           [ 2 ],
                               flow_id        [ 4 ],
                               sequence_no    [ 4 ],
                               abc_flag_bits  [ 3 ],
                               reserved_flag  [ 0 ]
     {
       version_no          ::=   uncompressed_value(2,1);
       type                ::=   irregular(2);
       flow_id             ::=   irregular(4);
       sequence_no         ::=   irregular(4);
       abc_flag_bits       ::=   irregular(3);
       reserved_flag       ::=   uncompressed_value(1,0);
     };
   };

   Using this simple scheme, we have successfully encoded the fact that
   the version number field has a permanently fixed value of one, and
   therefore contains no useful information.  We have also encoded the
   fact that the final flag bit is always zero, which again contains no
   useful information.  Both of these facts have been notated using the
   uncompressed_value encoding method (see Section 4.9.1).

   Note that we could just as well have omitted the "0 bits" fields from
   the definition of the compressed_format if we so wished, since the
   only purpose of that list is to indicate the order of fields in the
   compressed header - zero bit fields don't actually appear and so can
   be omitted.

   Using this new encoding on the above header, we get:

     Uncompressed header: 0101000100010000
     Compressed header:   0100010001000

   This reduces the amount of data we need to transmit by roughly 20%
   (from 16 bits to 13 bits).  However, this encoding fails to take any
   advantage of relationships between values of a field in one packet
   and its value in subsequent packets.  For example, every packet in
   the following sequence is compressed by the same amount despite the
   similarities between them:

     Uncompressed header: 0101000100010000
     Compressed header:   0100010001000


     Uncompressed header: 0101000100100000
     Compressed header:   0100010010000


     Uncompressed header: 0111000100110000
     Compressed header:   1100010011000
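
   The three compressed headers above can be reproduced with a short
   Python sketch of the basic profile.  The function names and the use
   of bit strings are assumptions made for this illustration; the field
   positions and fixed values are those of the "eg_header" notation:

```python
def compress_basic(bits):
    """Drop the fixed version_no and reserved_flag fields (16 -> 13 bits)."""
    version_no, type_ = bits[0:2], bits[2:4]
    flow_id, sequence_no = bits[4:8], bits[8:12]
    abc_flag_bits, reserved_flag = bits[12:15], bits[15:16]
    if version_no != "01" or reserved_flag != "0":
        raise ValueError("header does not match the profile")
    return type_ + flow_id + sequence_no + abc_flag_bits


def decompress_basic(compressed):
    """Restore the two fixed fields around the 13 transmitted bits."""
    return "01" + compressed + "0"


for header in ("0101000100010000", "0101000100100000", "0111000100110000"):
    compressed = compress_basic(header)
    assert decompress_basic(compressed) == header
    print(header, "->", compressed)
```

   Every header costs 13 bits regardless of how similar it is to the
   previous one, which is the limitation addressed next.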


B.4  Inter-packet compression

   The profile we have defined so far has not compressed the sequence
   number or flow ID fields at all, since they can take any value.
   However the value of these fields in one header has a very simple
   relationship to their value in previous headers:

   o  the sequence number increases by one each time;
   o  the flow_id stays the same: it always has the same value that it
      did in the previous header in the flow;
   o  the abc_flag_bits stay the same most of the time: they usually
      have the same value that they did in the previous header in the
      flow.

   An obvious way of notating this is as follows:

   % This obvious encoding will not work (correct encoding below)
   eg_header  ===
   {
     uncompressed_format   =   version_no     [ 2 ],
                               type           [ 2 ],
                               flow_id        [ 4 ],
                               sequence_no    [ 4 ],
                               abc_flag_bits  [ 3 ],
                               reserved_flag  [ 1 ];

     compressed_format     =   type           [ 2 ],
                               abc_flag_bits  [ 3 ]
     {
       version_no          ::=   uncompressed_value(2,1);
       type                ::=   irregular(2);
       flow_id             ::=   static;
       sequence_no         ::=   lsb(0,-1);
       abc_flag_bits       ::=   irregular(3);
       reserved_flag       ::=   uncompressed_value(1,0);
     };
   };

   This dependency on previous packets is notated using the static and
   LSB encoding methods (see Section 4.9.4 and Section 4.9.5
   respectively).

   However there are a few problems with the above notation.  Firstly,
   and most importantly, the flow_id field is notated as "static" which
   means that it doesn't change from packet to packet.  However, the
   notation does not indicate how to communicate the value of the field
   initially.  It's all very well saying "it's the same value as last
   time", but there must have been a first time, where we define what
   that value is, so that it can be referred back to.  The above
   notation provides no way of communicating that.  Similarly with the
   sequence number - there needs to be a way of communicating its
   initial value.

   Secondly, the sequence number field is communicated very efficiently
   in zero bits, but it is not at all robust against packet loss.  If a
   packet is lost then there is no way to fill in the missing sequence
   number.
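
   The robustness problem can be seen in a minimal Python sketch.  It
   assumes that the decompressor somehow knows the initial sequence
   number (which the notation above does not provide for) and that the
   4-bit field wraps modulo 16; the function name is hypothetical.

```python
def decompress_sequence(initial, received_count):
    """Rebuild sequence numbers when zero bits are sent per packet.

    With no bits on the wire, the decompressor can only assume
    "previous value plus one" (modulo 16 for the 4-bit field).
    """
    seq, out = initial, []
    for _ in range(received_count):
        seq = (seq + 1) % 16
        out.append(seq)
    return out


received = [2, 3, 5]   # true values; the packet carrying 4 was lost
rebuilt = decompress_sequence(1, len(received))
print("true values:   ", received)
print("rebuilt values:", rebuilt)   # the last value is wrong
```

   After the loss, every later sequence number is rebuilt incorrectly,
   and nothing in the compressed header lets the decompressor notice.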

   Finally, although the flag bits are usually the same as in the
   previous header in the flow, the profile doesn't make any use of this
   fact; since they are sometimes not the same as those in the previous
   header, it is not safe to say that they are always the same, so
   static encoding can't be used all the time.  We solve all three of
   these problems below, robustness first, since it is simplest.

   When communicating sequence numbers a very important consideration
   for the notator is how robust the compressed protocol needs to be
   against packet loss.  This will vary a lot from protocol to protocol.
   For example RTP has a high setup cost, so the compressed stream needs
   to be robust against fairly high packet loss.  Things are different
   with TCP, where robustness to loss of just a few packets is
   sufficient.  For the example protocol we'll assume short, low
   overhead flows and say we need to be robust to the loss of just one
   packet, which we can achieve with a single bit of LSB encoding (see
   Section 4.9.5).
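
   The following Python sketch illustrates why one bit of LSB encoding
   tolerates the loss of a single packet.  It assumes the interpretation
   interval commonly used for LSB encoding in ROHC, that is, the 2^k
   consecutive values starting at (reference - offset), and that the
   4-bit sequence number wraps modulo 16.  The function names are
   hypothetical.

```python
def lsb_encode(value, k):
    """Return the k least significant bits of value as a bit string."""
    if k == 0:
        return ""
    return format(value % (1 << k), "0{}b".format(k))


def lsb_decode(bits, ref, p, width=4):
    """Return the value in the interpretation interval that matches bits.

    Assumed interval: the 2**k consecutive values starting at ref - p,
    taken modulo 2**width, where k is the number of bits received.
    """
    k = len(bits)
    target = int(bits, 2) if k else 0
    for candidate in range(ref - p, ref - p + (1 << k)):
        value = candidate % (1 << width)
        if value % (1 << k) == target:
            return value
    raise ValueError("no candidate matches")  # not reached for k <= width


# Reference value 1 (the last header the decompressor saw), lsb(1,-1).
print(lsb_decode(lsb_encode(2, 1), ref=1, p=-1))  # 2: no loss
print(lsb_decode(lsb_encode(3, 1), ref=1, p=-1))  # 3: one packet lost
print(lsb_decode(lsb_encode(4, 1), ref=1, p=-1))  # 2: two lost, wrong
```

   With k = 0 the interval contains only (reference + 1), which is why
   the zero-bit encoding cannot survive any loss at all.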

   To communicate initial values for the sequence number and flow ID
   fields, and to take advantage of the fact that the flag bits are
   usually the same as in the previous header, we need to depart from
   the single packet format encoding we are currently using and instead
   use multiple packet formats:
   eg_header  ===
   {
     uncompressed_data     =   version_no     [ 2 ],
                               type           [ 2 ],
                               flow_id        [ 4 ],
                               sequence_no    [ 4 ],
                               abc_flag_bits  [ 3 ],
                               reserved_flag  [ 1 ];

     compressed_format_0   =   discriminator      [ 1 ],
                               type               [ 2 ],
                               flow_id            [ 4 ],
                               sequence_no        [ 4 ],
                               abc_flag_bits      [ 3 ]
     {
       discriminator       ::=   '0';
       version_no          ::=   uncompressed_value(2,1);
       type                ::=   irregular(2);
       flow_id             ::=   irregular(4);
       sequence_no         ::=   irregular(4);
       abc_flag_bits       ::=   irregular(3);
       reserved_flag       ::=   uncompressed_value(1,0);
     };

     compressed_format_1   =   discriminator        [ 1 ],
                               type                 [ 2 ],
                               sequence_no          [ 1 ]
     {
       discriminator       ::=   '1';
       version_no          ::=   uncompressed_value(2,1);
       type                ::=   irregular(2);
       flow_id             ::=   static;
       sequence_no         ::=   lsb(1,-1);
       abc_flag_bits       ::=   static;
       reserved_flag       ::=   uncompressed_value(1,0);
     };
   };

   Note that we have had to add a discriminator field, so that the
   decompressor can tell which packet format has been used by the
   compressor.  The format with a static flow ID and LSB-encoded
   sequence number is now 4 bits long, less than a third of the size of
   the single packet format, and a quarter of the size of the
   uncompressed header.  Note that despite having to add the
   discriminator field, this format is even smaller than the original
   incorrect naive notation, which was 5 bits long, because this
   notation takes advantage of the fact that the abc flag bits rarely
   change.

   However, the original packet format (with an irregular flow ID and
   sequence number) has also grown by one bit due to the addition of the
   discriminator.  An important consideration when creating multiple
   packet formats is whether the extra format occurs frequently enough
   that the average compressed header length is shorter as a result.
   For example, if in fact the sequence number in the example protocol
   counted up in steps of three, not one, then the LSB encoding could
   never be used; all we would have achieved is to lengthen the
   irregular packet format by one bit.
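
   This trade-off can be made concrete with a small calculation,
   sketched here in Python.  The fraction p_short of headers that can
   use the 4-bit format is a hypothetical parameter; the lengths 4, 14
   and 13 bits are those of the formats discussed above.

```python
from fractions import Fraction


def average_length(p_short, short_bits=4, long_bits=14):
    """Mean header length when a fraction p_short of headers is short."""
    return p_short * short_bits + (1 - p_short) * long_bits


SINGLE_FORMAT_BITS = 13  # length of the earlier single-format header

for p in (Fraction(0), Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)):
    avg = average_length(p)
    verdict = "shorter" if avg < SINGLE_FORMAT_BITS else "not shorter"
    print("p = {}: {} bits on average ({})".format(p, float(avg), verdict))
```

   Under these assumptions the two-format profile only wins once more
   than one header in ten can use the short format.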

   Using the above notation, we now get:

     Uncompressed header: 0101000100010000
     Compressed header:   00100010001000


     Uncompressed header: 0101000100100000
     Compressed header:   1010 ; 00100010010000


     Uncompressed header: 0111000100110000
     Compressed header:   1111 ; 01100010011000

   The first header in the stream is compressed the same way as before,
   except that it now has the extra 1 bit discriminator at the start
   (0).  When a second header arrives, with the same flow ID as the
   first and its sequence number one higher, it can now be compressed in
   two possible ways, either using format_1 or in the same way as
   previously, using format_0.

   Note that we show all possible encodings of a packet as defined by a
   given profile, separated by semi-colons.  Either of the above
   encodings for the packet could be produced by a valid implementation,
   although of course a good implementation would always aim to make the
   compressed size as small as possible and an optimum implementation
   would pick the encoding which led to the best compression of the
   packet stream (which is not necessarily the smallest encoding for a
   particular packet).
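
   As a rough illustration, the Python sketch below lists every valid
   encoding of a header under the two-format profile.  It is not a
   ROHC-FN implementation: headers are modelled as dictionaries of
   integers, the names bits and encodings are our own, and lsb(1,-1) is
   assumed to cover the reference value plus one or plus two.

```python
def bits(value, width):
    """Format a non-negative integer as a fixed-width bit string."""
    return format(value, "0{}b".format(width))


def encodings(header, context):
    """List every valid compressed form of header, shortest first.

    header and context are dicts of integer field values; context is the
    previous header in the flow, or None for the first header.
    version_no and reserved_flag are assumed to hold their constant
    values (1 and 0), so they never appear in the output.
    """
    result = []
    if context is not None:
        # lsb(1,-1) is assumed to cover reference + 1 and reference + 2.
        step = (header["sequence_no"] - context["sequence_no"]) % 16
        if (header["flow_id"] == context["flow_id"]
                and header["abc_flag_bits"] == context["abc_flag_bits"]
                and step in (1, 2)):
            result.append("1" + bits(header["type"], 2)
                          + bits(header["sequence_no"] % 2, 1))
    result.append("0" + bits(header["type"], 2) + bits(header["flow_id"], 4)
                  + bits(header["sequence_no"], 4)
                  + bits(header["abc_flag_bits"], 3))
    return result


first = {"type": 1, "flow_id": 1, "sequence_no": 1, "abc_flag_bits": 0}
second = {"type": 1, "flow_id": 1, "sequence_no": 2, "abc_flag_bits": 0}
# Hypothetical header whose flag bits differ from the previous header.
changed = {"type": 1, "flow_id": 1, "sequence_no": 3, "abc_flag_bits": 5}

print(" ; ".join(encodings(first, None)))      # 00100010001000
print(" ; ".join(encodings(second, first)))    # 1010 ; 00100010010000
print(" ; ".join(encodings(changed, second)))  # 00100010011101
```

   The last header is hypothetical: because its flag bits have changed,
   only format_0 can represent it.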

B.5  Variable Length Discriminators

   Suppose we do some analysis on flows of our example protocol and
   discover that whilst it is usual for successive packets to have the
   same flags, on the occasions when they don't, the packet is almost
   always a "flags set" packet, in which all three of the abc flags are
   set.  To encode the flow more efficiently a packet format needs to be
   written to reflect this.

   This now gives a total of three packet formats, which means we need
   three discriminators to differentiate between them.  The obvious
   solution here is to increase the number of bits in the discriminator
   from one to two and, for example, use discriminators 00, 01, and 10.
   However, we can do slightly better than this.

   Any set of discriminators that can be uniquely identified will
   suffice, so we can use 00, 01 and 1.  If the discriminator starts
   with 1, that's the whole thing.  If it starts with 0, the
   decompressor knows it has to check one more bit to determine the
   packet format.

   Note that it would be erroneous to use e.g.  0, 01 and 10 as
   discriminators since after reading an initial 0, the decompressor
   would have no way of knowing if the next bit was a second bit of
   discriminator, or the first bit of the next field in the packet
   stream.  0, 10 and 11 however would be OK as the first bit again
   indicates whether or not there are further discriminator bits to
   follow.
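
   The rule at work here is that no discriminator may be a prefix of
   another.  A minimal Python check of that property, using a
   hypothetical helper name, might look like this:

```python
from itertools import permutations


def is_prefix_free(discriminators):
    """Return True if no discriminator is a prefix of another one."""
    return not any(b.startswith(a)
                   for a, b in permutations(discriminators, 2))


print(is_prefix_free(["00", "01", "1"]))   # True
print(is_prefix_free(["0", "01", "10"]))   # False: "0" starts "01"
print(is_prefix_free(["0", "10", "11"]))   # True
```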

   This gives us the following:
   eg_header  ===
   {
     uncompressed_data     =   version_no     [ 2 ],
                               type           [ 2 ],
                               flow_id        [ 4 ],
                               sequence_no    [ 4 ],
                               abc_flag_bits  [ 3 ],
                               reserved_flag  [ 1 ];

     compressed_format_0   =   discriminator      [ 2 ],
                               type               [ 2 ],
                               flow_id            [ 4 ],
                               sequence_no        [ 4 ],
                               abc_flag_bits      [ 3 ]
     {
       discriminator       ::=   '00';
       version_no          ::=   uncompressed_value(2,1);
       type                ::=   irregular(2);
       flow_id             ::=   irregular(4);
       sequence_no         ::=   irregular(4);
       abc_flag_bits       ::=   irregular(3);
       reserved_flag       ::=   uncompressed_value(1,0);
     };

     compressed_format_1   =   discriminator      [ 2 ],
                               type               [ 2 ],
                               sequence_no        [ 1 ]
     {
       discriminator       ::=   '01';
       version_no          ::=   uncompressed_value(2,1);
       type                ::=   irregular(2);
       flow_id             ::=   static;
       sequence_no         ::=   lsb(1,-1);
       abc_flag_bits       ::=   uncompressed_value(3,7);
       reserved_flag       ::=   uncompressed_value(1,0);
     };

     compressed_format_2   =   discriminator      [ 1 ],
                               type               [ 2 ],
                               sequence_no        [ 1 ]
     {
       discriminator       ::=   '1';
       version_no          ::=   uncompressed_value(2,1);
       type                ::=   irregular(2);
       flow_id             ::=   static;
       sequence_no         ::=   lsb(1,-1);
       abc_flag_bits       ::=   static;
       reserved_flag       ::=   uncompressed_value(1,0);
     };
   };

   Here is some example output:

     Uncompressed header: 0101000100010000
     Compressed header:   000100010001000


     Uncompressed header: 0101000100100000
     Compressed header:   1010 ; 000100010010000


     Uncompressed header: 0111000100110000
     Compressed header:   1111 ; 001100010011000


     Uncompressed header: 0111000101001110
     Compressed header:   01110 ; 001100010100111

   Here we have a very similar sequence to last time, except that there
   is now an extra message on the end which has the flag bits set.  The
   encoding for the first message in the stream is now one bit larger.
   The encoding for the next two messages is the same as before, since
   that packet format has not grown, thanks to the use of variable
   length discriminators.  Finally, the packet that comes through with
   all the flag bits set can be encoded in just five bits, only one bit
   more than the most common packet format.
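
   To show how a decompressor might use the variable length
   discriminators, here is a Python sketch that decodes a stream under
   the three-format profile.  It is an illustration under stated
   assumptions rather than a ROHC-FN implementation: the function names
   are our own, lsb(1,-1) is assumed to select between the reference
   value plus one and plus two, and the stream mixes encodings taken
   from the example output above.

```python
def decompress(packet, context):
    """Decode one compressed header into a dict of integer fields.

    context is the previously decompressed header (None for the first).
    """
    if packet.startswith("1"):
        fmt, body = 2, packet[1:]
    elif packet.startswith("00"):
        fmt, body = 0, packet[2:]
    elif packet.startswith("01"):
        fmt, body = 1, packet[2:]
    else:
        raise ValueError("truncated packet")
    header = {"version_no": 1, "type": int(body[0:2], 2), "reserved_flag": 0}
    if fmt == 0:
        header["flow_id"] = int(body[2:6], 2)
        header["sequence_no"] = int(body[6:10], 2)
        header["abc_flag_bits"] = int(body[10:13], 2)
    else:
        header["flow_id"] = context["flow_id"]
        # lsb(1,-1), assumed to select from reference + 1 and + 2.
        ref, lsb = context["sequence_no"], int(body[2])
        header["sequence_no"] = next(
            v % 16 for v in (ref + 1, ref + 2) if v % 2 == lsb)
        header["abc_flag_bits"] = 7 if fmt == 1 else context["abc_flag_bits"]
    return header


def to_bits(h):
    """Serialise a header dict in uncompressed field order."""
    layout = [("version_no", 2), ("type", 2), ("flow_id", 4),
              ("sequence_no", 4), ("abc_flag_bits", 3), ("reserved_flag", 1)]
    return "".join(format(h[name], "0{}b".format(w)) for name, w in layout)


context = None
for packet in ("000100010001000", "1010", "001100010011000", "01110"):
    context = decompress(packet, context)
    print(packet, "->", to_bits(context))
```

   Reading the first bit is enough to recognise the most common format;
   only the two rarer formats cost a second discriminator bit.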




B.6  Default encoding

   There is some redundancy in the notation used to define the profile
   in that the same encoding method is used for the same fields several
   times in different formats, but the field is redefined explicitly
   each time.  If the encoding for any of these fields changed in the
   future (e.g.  if the reserved flag became permanently set to 1
   instead of 0), then every packet format would have to be modified to
   reflect this change.

   This problem can be avoided by specifying a default encoding for
   these fields, which also leads to a more concisely notated profile:
   eg_header  ===
   {
     uncompressed_data     =   version_no     [ 2 ],
                               type           [ 2 ],
                               flow_id        [ 4 ],
                               sequence_no    [ 4 ],
                               abc_flag_bits  [ 3 ],
                               reserved_flag  [ 1 ];

     default_methods       =
     {
       version_no          ::=   uncompressed_value(2,1);
       type                ::=   irregular(2);
       flow_id             ::=   static;
       sequence_no         ::=   lsb(1,-1);
       reserved_flag       ::=   uncompressed_value(1,0);
     };

     compressed_format_0   =   discriminator      [ 2 ],
                               type               [ 2 ],
                               flow_id            [ 4 ],
                               sequence_no        [ 4 ],
                               abc_flag_bits      [ 3 ]
     {
       discriminator       ::=   '00';
       flow_id             ::=   irregular(4);  % overrides default
       sequence_no         ::=   irregular(4);  % overrides default
       abc_flag_bits       ::=   irregular(3);
     };

     compressed_format_1   =   discriminator      [ 2 ],
                               type               [ 2 ],
                               sequence_no        [ 1 ]
     {
       discriminator       ::=   '01';
       abc_flag_bits       ::=   uncompressed_value(3,7);
     };

     compressed_format_2   =   discriminator      [ 1 ],
                               type               [ 2 ],
                               sequence_no        [ 1 ]
     {
       discriminator       ::=   '1';
       abc_flag_bits       ::=   static;
     };
   };

   The above profile behaves in exactly the same way as the one notated
   previously, since it has the same meaning.  Note that the purposes
   behind the different formats become clearer with the default encoding
   methods factored out; all that remains are the encodings which are
   relevant to each specific format.  Note also that default encoding
   methods which compress down to zero bits have become completely
   implicit.  For example, none of the compressed formats mentions
   "version_no" explicitly, either in the field order list (no need,
   it's zero bits long) or in the field encodings list (no need, it's
   specified in the default encoding methods).
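
   The way default encodings combine with per-format encodings can be
   pictured as a dictionary merge in which the per-format entry wins.
   The Python sketch below models each encoding method as a plain
   string; this is a simplification for illustration only, and the
   names DEFAULTS, FORMATS and resolve are our own.

```python
DEFAULTS = {
    "version_no": "uncompressed_value(2,1)",
    "type": "irregular(2)",
    "flow_id": "static",
    "sequence_no": "lsb(1,-1)",
    "reserved_flag": "uncompressed_value(1,0)",
}

FORMATS = {
    "compressed_format_0": {
        "discriminator": "'00'",
        "flow_id": "irregular(4)",       # overrides default
        "sequence_no": "irregular(4)",   # overrides default
        "abc_flag_bits": "irregular(3)",
    },
    "compressed_format_1": {
        "discriminator": "'01'",
        "abc_flag_bits": "uncompressed_value(3,7)",
    },
    "compressed_format_2": {
        "discriminator": "'1'",
        "abc_flag_bits": "static",
    },
}


def resolve(overrides):
    """Per-format encodings take precedence over the defaults."""
    return {**DEFAULTS, **overrides}


for name, overrides in FORMATS.items():
    print(name)
    for field, method in sorted(resolve(overrides).items()):
        print("   ", field, "::=", method)
```

   Each resolved format lists the same seven encodings as the explicit
   profile in the previous section.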

Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.


Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


Copyright Statement

   Copyright (C) The Internet Society (2004).  This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.


Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.