[Docs] [txt|pdf] [Tracker] [WG] [Email] [Diff1] [Diff2] [Nits]

Versions: (draft-ietf-rohc-epic-lite) 00 01 02 03 04 05 06 07 08 09 10 11 12 13 RFC 4997

Network Working Group                  Richard Price, Siemens/Roke Manor
INTERNET-DRAFT                       Abigail Surtees, Siemens/Roke Manor
Expires: April 2003                      Mark A West, Siemens/Roke Manor

                                                        October 28, 2002


               Protocol-Enabled BNF-Based LanguagE (PEBBLE)
                 <draft-ietf-rohc-formal-notation-00.txt>


Status of this memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that other
   groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or cite them other than as "work in progress".

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/lid-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This document is a submission of the IETF ROHC WG. Comments should be
   directed to its mailing list, rohc@ietf.org.


Abstract

   This draft defines PEBBLE: a formal notation for describing the
   behavior of protocols or protocol stacks. The notation is designed to
   provide information useful in "protocol-aware" compression
   algorithms, for example the header compression profiles compatible
   with the Robust Header Compression (ROHC) framework.












Price et al.                                                    [Page 1]

INTERNET-DRAFT                   PEBBLE                 October 28, 2002


Table of contents

   1.  Introduction..................................................2
   2.  Terminology...................................................2
   3.  Overview of PEBBLE............................................3
   4.  Normative definition of PEBBLE................................7
   5.  Improving readability.........................................15
   6.  Pre-defined rules and integer operators.......................17
   7.  Library of commonly used rules................................22
   8.  Security considerations.......................................34
   9.  Acknowledgements..............................................34
   10. Authors' addresses............................................34
   11. References....................................................35
   Appendix A. ABNF description of PEBBLE............................35


1.  Introduction

   This document defines PEBBLE: a version of the well-known Backus-Naur
   Form (BNF) notation which can be used to describe the behavior of
   protocols or protocol stacks. It is intended to simplify the creation
   of new compression profiles to fit within the ROHC [RFC-3095]
   framework.

   The notation in itself does not define the "bits on the wire" of the
   compressed data. To achieve this an encoding algorithm must be chosen
   that can map the uncompressed packets onto a set of compressed
   packets and vice versa. The specification of such an encoder is
   beyond the scope of this draft: each ROHC profile created using
   PEBBLE can select its own based on the requirements of the profile.


2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC-2119 [RFC-2119].

   BNF (Backus Naur Form)

     BNF is a notation used to describe the syntax of many well-known
     protocols.

   Control field

     A control field is a field described by PEBBLE but which is not
     part of the protocol header itself. For example, PEBBLE may define
     a checksum field over the header to ensure robustness against bit
     errors and dropped packets.






Price et al.                                                    [Page 2]

INTERNET-DRAFT                   PEBBLE                 October 28, 2002


   Encoding algorithm

     An encoding algorithm defines a reversible mapping between a
     set of uncompressed packets and a set of compressed packets.
     Protocol-aware encoding algorithms can make use of the
     information provided in the PEBBLE description of a protocol.

   Field

     The PEBBLE description of a protocol divides the protocol into a
     set of contiguous bit patterns known as fields.

   Label

     Each field can be assigned a "label name" which allows it to be
     referenced by different PEBBLE rules.

   Library of rules

     The library of rules contains a number of commonly used rules for
     describing the behavior of a protocol. More rules can be added to
     the library as and when they are needed.

   Pre-defined rules

     Four pre-defined rules are provided as standard by PEBBLE; these
     can be combined to create new rules with more complex behavior.

   Profile

     A ROHC [RFC-3095] profile is a description of how to compress a
     certain protocol stack over a certain type of link. Each profile
     includes an encoding algorithm to compress the headers and a state
     machine to control the actions of each endpoint.

   Rule

     Rules are the basis for the PEBBLE description of a protocol. Each
     rule defines the behavior of one or more fields in the protocol
     header. Rules can be taken from the set of pre-defined rules or can
     be defined in terms of existing rules.


3.  Overview of PEBBLE

   This chapter gives an overview of the PEBBLE notation, including its
   relation to existing versions of BNF such as "Augmented BNF" (ABNF)
   [RFC-2234].







Price et al.                                                    [Page 3]

INTERNET-DRAFT                   PEBBLE                 October 28, 2002


3.1. What is BNF?

   BNF (Backus Naur Form) is a notation used to describe the syntax of
   many well-known protocols, for example SIP [RFC-3261] and RTSP [RFC-
   2960]. All versions of BNF are based around the concept of a "BNF
   rule". Each BNF rule describes the syntax of a portion of the
   protocol, and can be defined in terms of existing BNF rules which
   makes it easy to build new descriptions out of existing ones (e.g.
   when adding new layers to a protocol stack). The primary or "top-
   level" BNF rule describes the syntax of the entire protocol stack in
   question.

   When the BNF rule which describes the syntax of a certain protocol
   stack is applied to a packet that may or may not contain a header
   from the protocol stack, there are two possible outcomes:

   1. The packet successfully matches the BNF rule, indicating that it
      contains a valid header from the corresponding protocol stack.

   2. The packet fails to match the BNF rule, indicating that it does
      not contain a header from the corresponding protocol stack.

   So each BNF rule can be considered to be an operator, whose input is
   a packet and whose output is a Boolean value specifying whether the
   packet contains a header from the chosen protocol stack.

   In order to create new BNF rules from existing rules, it is necessary
   to provide at least one pre-defined BNF rule as part of the notation.
   Different variants of BNF offer different sets of pre-defined BNF
   rules from which new rules can be constructed. For example, the
   variant of BNF known as "Augmented BNF" [RFC-2234] includes the
   following pre-defined rules:

   rule_1 rule_2 ... rule_n               Concatenation of n BNF rules

   rule_1 | rule_2 | ... | rule_n         Choice between n BNF rules

   "%d" m-n                               Character with ASCII code
                                          between m and n inclusive

   [rule]                                 Optional BNF rule

   x*y rule                               List of between x and y
                                          occurrences of a BNF rule

   Note that each of the pre-defined BNF rules have operands that can
   affect the outcome of the rule. For example, "x*y rule" matches n
   consecutive instances of an existing BNF rule where n lies between
   the integers x and y inclusive. So if the BNF rule "foo" matches the
   bit patterns "00" or "11" then the BNF rule 1*2(foo) will match the
   bit patterns "00", "11", "0000", "0011", "1100" or "1111".




Price et al.                                                    [Page 4]

INTERNET-DRAFT                   PEBBLE                 October 28, 2002


   New BNF rules can be created by assigning a name for the rule,
   followed by the symbol "::=" and an expression which evaluates to the
   new rule and is written in terms of existing rules. For example:

   bar     ::=   1*2(foo)

   The following BNF description illustrates this process in more
   detail. Using the pre-defined BNF rules from ABNF [RFC-2234], a new
   BNF rule "list" is created which describes a list of arbitrary
   positive or negative integers:

   list    ::=   1*(number)

   number  ::=   ["+" | "-"] 1*(digit)

   digit   ::=   %d48-57

3.2. What is PEBBLE?

   PEBBLE is a version of BNF designed to be useful when creating
   "protocol-aware" compression algorithms. Knowledge of the protocol to
   be compressed typically improves the compression ratio and reduces
   the processing and memory overhead compared to the generic "protocol-
   unaware" case.

   Note that PEBBLE describes the behavior of the uncompressed headers
   only, so descriptions can be written without requiring any knowledge
   of the algorithm that will generate the actual compressed data. In
   particular the notation is designed to be both human-readable and
   machine-readable. If only one protocol stack needs to be compressed,
   a suitable encoder can be designed by hand based on the PEBBLE
   description of the header. Conversely, it is possible to implement a
   protocol-independent encoder that can take an arbitrary PEBBLE
   description and automatically generate a set of compressed headers
   for the relevant protocol.

   The PEBBLE notation is a version of BNF with four pre-defined rules:
   FIELD, DISTRIBUTION, MAP and CONCATENATION (see Chapter 6 for a
   detailed description of these rules). PEBBLE also includes the
   following unique features not found in simpler versions of BNF:

   1. As well as describing the set of valid headers for the chosen
      protocol, PEBBLE also outputs some additional information that may
      be useful to encoders.

   2. New rules can be created with operands that modify the outcome of
      the rule.

   Each of these features is discussed in more detail in the following
   sections.





Price et al.                                                    [Page 5]

INTERNET-DRAFT                   PEBBLE                 October 28, 2002


3.3. Useful information for encoders

   The rules defined by PEBBLE behave in the same manner as ordinary BNF
   rules: the input is a packet (and optionally some additional operands
   that affect the outcome of the rule), and the output is a Boolean
   value specifying whether the packet contains a valid header from the
   chosen protocol stack.

   However the rules provided by PEBBLE also output some additional
   information that may be useful to encoders. As an example, for each
   header that matches a certain PEBBLE rule, the rule also outputs the
   probability that the header will occur in a typical data flow. This
   extra information allows an encoding algorithm to encode more common
   headers using fewer bits than headers expected to occur rarely.

   The following diagram shows the inputs and outputs for a PEBBLE rule:

    +-----------------------+                 +-----------------------+
    |  packet to be tested  |   PEBBLE rule   | valid/invalid header? |
    +-----------------------+ --------------> +-----------------------+
    | operands for the rule |                 |  useful information   |
    +-----------------------+                 +-----------------------+

              Figure 1: Inputs and outputs for a PEBBLE rule

   More generally, the aim of the PEBBLE notation is to provide
   sufficient information to support a wide variety of encoders.
   Particular encoders are of course at liberty to ignore this
   information if they wish. If it is not clear whether or not a piece
   of information will be useful, it is generally considered that it
   should be included since it is easier to ignore unnecessary
   information than to infer missing information.

3.4. Creating new rules in PEBBLE

   All versions of BNF allow new rules to be defined in terms of
   existing rules. However, most variants of BNF only allow simple rules
   with no operands to be defined in terms of existing rules; BNF rules
   with one or more operands must be pre-defined by hand.

   Since the PEBBLE notation provides a large library of rules for
   describing new protocols, it is useful to extend the mechanism for
   creating new rules so that it can handle rules with operands. As a
   result the description of rules in the library is greatly simplified.

   For example, suppose that a rule RANGE(#n,#v,#k) has already been
   defined which describes a single n-bit field that can take values
   from v to v+k-1 inclusive.

   The RANGE rule can be used to create a new rule called VALUE with two
   integer operands n and v:




Price et al.                                                    [Page 6]

INTERNET-DRAFT                   PEBBLE                 October 28, 2002


   VALUE(#n,#v)   ::=   RANGE(n,v,1)

   The VALUE rule describes an n-bit field that always takes the value
   v. In turn, this rule is useful for describing fields such as the IP
   Version:

   IPv4_Version   ::=   VALUE(4,4)

   The mechanism for defining new rules in terms of existing rules forms
   the basis of PEBBLE, and is discussed in detail in the following
   chapter.


4.  Normative definition of PEBBLE

   This chapter defines the PEBBLE notation for describing the behavior
   of protocols or protocol stacks.

4.1. Inputs and outputs of a rule

   As explained in Section 3.1, a rule can be considered to be an
   operator which takes a packet as its input and returns a Boolean
   value specifying whether or not the packet contains a valid header.
   In PEBBLE each rule can also take one or more additional operands
   that modify its behavior, and can return extra information that may
   be of use to certain encoders.

   As well as the operands declared explicitly when the rule is created,
   there are three "implicit operands" common to all rules. Implicit
   operands do not need to be declared explicitly because they are
   always required by a rule. The first implicit operand is the packet
   to be tested for a valid header; the other two implicit operands (the
   context and the list of labels) are described in Section 4.4 and
   Section 4.5 respectively.

   As well as the Boolean value indicating whether the packet contains a
   valid header or not, each rule provides three outputs that may be
   useful to encoders. The first is the probability that the header will
   occur in a typical flow, which is discussed in greater detail in
   Section 4.3. The other two outputs (an updated context and an updated
   list of labels) are described in Section 4.4 and Section 4.5
   respectively. Note that if the packet does not contain a valid header
   then the remaining three outputs are not provided.

   In summary, a complete list of the inputs and outputs for each rule
   is given below:

   Inputs:   Packet to be tested for a valid header
             Operands explicitly declared by the rule
             Length and value for fields in the context
             List of labels and their values




Price et al.                                                    [Page 7]

INTERNET-DRAFT                   PEBBLE                 October 28, 2002


   Outputs:  Boolean value stating if the packet contains a valid header
             Probability value that header will occur in a typical flow
             Updated length and value for fields in the context
             Updated list of labels and their values

4.2. Creating new PEBBLE rules

   In PEBBLE the behavior of a protocol is defined by a single "primary"
   rule, which may itself be defined in terms of simpler rules. For
   example:

   IPv6_Header   ::=   Version
                       Traffic_Class
                       ECT_Flag
                       CE_Flag
                       Flow_Label
                       Payload_Length
                       Next_Header
                       Hop_Limit
                       Source_Address
                       Destination_Address

   The general format for creating a new rule in terms of existing rules
   is as follows:

              new_rule whitespace "::=" whitespace expression

   The term "new_rule" describes the name of the new rule together with
   the names of any operands needed by the rule. The term "expression"
   refers to an expression which evaluates to the desired behavior of
   the new rule. The expression can include one or more existing rules
   and integer operators, as well as any of the operands declared for
   the new rule.

   For example, using the RANGE rule discussed previously it is possible
   to create a new IRREGULAR rule that describes a single n-bit field
   capable of taking all values from 0 to 2^n-1 inclusive:

   IRREGULAR(#n)   ::=   RANGE(n,0,2^n)

   The syntax of each of these terms is described in detail in the
   following sections.

4.2.1. Syntax of rule names and operands

   When creating a new rule in PEBBLE it is necessary to give a name for
   the new rule, and to declare any operands that are required by the
   rule.







Price et al.                                                    [Page 8]

INTERNET-DRAFT                   PEBBLE                 October 28, 2002


   The name of the new rule can be any combination of letters, digits or
   the symbol "_", provided that it doesn't coincide with the name of an
   existing rule or integer operator. Names are case-sensitive.

   The name of the new rule is followed by a list of the operands
   expected by the rule (enclosed in parentheses and separated by
   commas). A name is assigned to each operand so that it can be used in
   the expression which defines the new rule (see Section 4.2.2 for more
   on this). Operand names are also a combination of letters, digits and
   the symbol "_". Operand names within the definition of a rule should
   be unique, and should not overlap with the names of any existing
   rules or integer operators. However, the operand names from one rule
   can be reused when defining a different rule.

   Three types of operand are available: rules, text names and (signed)
   integers. Text name operands are preceded by a "$" whilst integer
   operands are preceded by a "#". If no symbol precedes the operand
   then it is assumed to be a rule operand. For example:

   RANGE(#n,#v,#k)
   IRREGULAR(#n)
   INFERRED_OFFSET(#n,$base)
   LOCAL($label,rule)

   It is also possible to declare a list of operands by appending a pair
   of square braces to the operand name. If the square braces enclose
   the name of a previously declared integer operand k then the list of
   operands must contain entries from 0 to k-1 inclusive.

   For example, RULE(#n,#list[n]) ::= ... creates a new rule which
   requires a single integer operand n followed by n integer operands
   list[0] to list[n-1].

4.2.2. Syntax of expressions

   The behavior of a new rule is defined by giving an expression in
   terms of one or more existing rules and integer operators. For
   example, using the IRREGULAR rule defined above it is possible to
   create the more complex UNCOMPRESSIBLE rule as follows:

   UNCOMPRESSIBLE($label,#d,#m,#p)   ::=   IRREGULAR(floor(label,d)*m+p)

   If the overall expression includes one or more existing rules then
   values must be given for any operands needed by the rules. These
   values can in turn be defined using expressions. For example, when
   defining the UNCOMPRESSIBLE rule in terms of the IRREGULAR rule the
   expression "floor(label,d)*m+p" gives the value of the operand n
   needed by the IRREGULAR rule.

   Expressions can include any of the operands that have been declared
   for the new rule. If the operand is a list then it must be followed




Price et al.                                                    [Page 9]

INTERNET-DRAFT                   PEBBLE                 October 28, 2002


   by an integer expression (enclosed in square braces) that specifies
   which value from the list is to be used.

   For example, the following BRANCH rule applies one of n existing
   rules depending on the value contained in the operand "indicator":

   BRANCH(#indicator,#n,rule[n])   ::=   rule[indicator]

   The set of integer operators available to PEBBLE is described in
   Section 6.1, and the set of four pre-defined rules is described in
   Section 6.2.

   For simplicity, PEBBLE does not currently provide any operators which
   generate new text names from existing ones. The only way to specify a
   text name is to write the name explicitly or to use a text name
   operand.

4.2.3. Generating lists of expressions

   In some cases it is useful to specify a list of one or more values in
   a single step. This can be accomplished by specifying an expression
   in terms of one or more indices, which are then declared together
   with their range. The list of values is generated by allowing each of
   the indices to take every value in the specified range. The
   expression for generating a list of values is enclosed in square
   braces, and the indices are declared using the keywords "for" and
   "in" as illustrated below:

   [2*j for j in 0..5]
   [rule[j] for j in 0..4]
   [k-j for j in 0..3, k in 0..3]

   The lower bound for the range is inclusive whereas the upper bound is
   exclusive, so the expression "for j in 0..5" declares that j takes
   the values 0, 1, 2, 3 and 4.

   Suppose that the n indices are j[0],...,j[n-1]. Note that the range
   of the ith index j[i] can be defined in terms of any of the indices
   j[0] to j[i-1]. For example:

   [k for j in 0..4, k in 0..j]

   The expression to be evaluated can include any of the n indices, so
   it can be considered to be a function of j[0],...,j[n-1]. Denote the
   output of this function by list_item[j[0],...,j[n-1]].

   The function is then evaluated by replacing each index j[i] with all
   of the integers within the range of the index. The final output of
   the list expression is equivalent to writing a list of these values
   separated by commas.





Price et al.                                                   [Page 10]

INTERNET-DRAFT                   PEBBLE                 October 28, 2002


   To define the order in which each item appears in the list, it is
   specified that list_item[j[0],...,j[n-1]] precedes
   list_item[k[0],...,k[n-1]] whenever there is an integer i such that
   j[0] = k[0], j[1] = k[1] up to j[i-1] = k[i-1], and j[i] < k[i].

   If any of the indices is given a range containing 0 values then the
   resulting list must be empty.

   Some example lists of expressions are given below, using both the
   square brace form and an expanded form that evaluates to give the
   same result:

   Square brace form:                   Equivalent expanded form:

   [2*j for j in 0..5]                  0,2,4,6,8
   [rule[j] for j in 0..4]              rule[0],rule[1],rule[2],rule[3]
   [k-j for j in 0..3, k in 0..3]       0,1,2,-1,0,1,-2,-1,0
   [k for j in 0..4, k in 0..j]         0,0,1,0,1,2

4.3. Probability distribution and robustness

   One of the pieces of information provided by PEBBLE is a probability
   distribution for the headers in the chosen protocol. For a given
   protocol header, the PEBBLE rule describing the protocol outputs the
   probability that the header will occur in a typical flow. This
   information can be useful to encoders, as more common headers can be
   encoded using fewer bits than headers expected to occur rarely.

   Note that the choice of how to partition a set of headers into flows
   must be made when writing the description of the protocol, and the
   resulting description will reflect this choice. For example, if flow
   classification is expected to be performed using IP addresses and
   ports then these fields will take a fixed value with probability 100%
   in the PEBBLE description of the protocol.

   The probability distribution should take into account not only the
   behavior of the headers themselves but also the characteristics of
   the link over which they are transmitted. For example, if the link
   can suffer from dropped or misordered packets then the behavior of
   certain header fields may change as a result. In particular a
   sequence number field that increases monotonically over a reliable
   link may occasionally jump forwards in the presence of packet drops.

   In general a wide range of possible error conditions may be
   introduced by the link: dropped, misordered or duplicated packets,
   isolated bit errors, fades etc. From the perspective of PEBBLE
   however, all error conditions can be classed as either "expected
   errors" or "unexpected errors".

   An expected error occurs regularly in the header flow, and hence is
   taken into account in the PEBBLE description of the protocol. As a




Price et al.                                                   [Page 11]

INTERNET-DRAFT                   PEBBLE                 October 28, 2002


   result, an encoding algorithm based on the protocol description will
   be robust against the error (i.e. headers subject to the error can
   still be successfully decoded). ROHC profiles typically class packet
   drops or misordering up to a certain limit as expected errors.

   An unexpected error only occurs for a negligible proportion of
   headers in the flow, and so does not need to be taken into account
   when describing the behavior of the protocol. As a result, an
   encoding algorithm based on the protocol description may fail to
   decode any headers subject to an unexpected error. ROHC profiles
   typically class bit errors and excessive packet drops or misordering
   as unexpected errors.

4.4. Context values

   The headers generated by a typical protocol stack will change over
   time, so the behavior of one header often depends on the behavior of
   previous headers output by the stack. For example, the RTP Sequence
   Number field increases by 1 for each consecutive header in an RTP
   stream.

   PEBBLE takes into account the dependency between successive headers
   by storing a "context" of field values from each header. These values
   can then be used when describing the next header in the flow. The
   context is accessed using the integer operators context_length(#n)
   and context_value(#n), which return the length and value of the nth
   field in the context.

   Rules can also update the context by defining updated_length(#n) and
   updated_value(#n), which specify the updated length and value of the
   nth field in the context. These updated lists of values are used as
   the context for the next header in the flow. Updated values for the
   context are defined as part of the FIELD rule (see Chapter 6 for
   further details).

   Note that as discussed in Section 4.3, when a header flow contains an
   unexpected error then it is acceptable for an encoder to fail to
   reconstruct the relevant header. However, it is not acceptable for
   the decoding of subsequent valid headers to be adversely affected. In
   particular, it is necessary to identify the headers that fail to
   successfully decompress due to an unexpected error so that corruption
   of the context can be prevented.

   As part of the description of a header, PEBBLE can define the length
   of a checksum that should be provided over the original uncompressed
   header in order to detect the occurrence of unexpected errors and
   prevent them from mistakenly being used to update the context. This
   situation is illustrated in Figure 2:







Price et al.                                                   [Page 12]

INTERNET-DRAFT                   PEBBLE                 October 28, 2002


                                      Checksum failure
    +--------------+ +--------------+ ================ +--------------+
   -| Valid header |-| Valid header |-|Invalid header|-| Valid header |-
    +--------------+ +--------------+ ================ +--------------+
          |  |             |  |                              |  |
   >------+  +------>------+  +--------------->--------------+  +------>
                 context                   context

         Figure 2: Preventing accidental corruption of the context

   The process for creating the checksum is described in greater detail
   in Section 4.6.

4.5. Labels

   When describing the behavior of a protocol stack it is often the case
   that two fields in separate parts of the header have related
   behavior. For example, the Protocol field in an IP header is related
   to the type of header that follows.

   A standard version of BNF can capture this dependency by creating a
   new BNF rule to describe each possible case. As an example, suppose
   that a 1-bit flag field is followed by some intermediate fields that
   behave independently from the flag, and then by some fields whose
   behavior changes depending on whether the flag field takes value 0 or
   value 1:

   header   ::=   case_0 | case_1

   case_0   ::=   flag_is_0
                  intermediate_fields
                  dependent_behavior_0

   case_1   ::=   flag_is_1
                  intermediate_fields
                  dependent_behavior_1

   The drawback of this technique is that a new BNF rule must be created
   to capture each dependent behavior, which can result in extremely
   complex BNF descriptions when the number of dependencies is high.

   An alternative technique for capturing the above field dependency
   would be to label the flag field with a unique name, and then
   reference this name when defining the behavior of the dependent
   fields. For example:

   header   ::=   flag_field              ; store field value as "flag"
                  intermediate_fields
                  dependent_fields        ; define as function of "flag"






Price et al.                                                   [Page 13]

INTERNET-DRAFT                   PEBBLE                 October 28, 2002


   PEBBLE allows field values to be stored for later reference by means
   of a label. A label name can be any combination of letters, digits or
   the symbol "_", provided that it doesn't coincide with the name of a
   rule, integer operator, or any of the operand names used when
   defining new rules.

   Each label can contain an integer value or can be undefined. When a
   rule is invoked it is passed a list of the labels which are currently
   defined together with their integer values. This list is treated as
   an implicit operand for the rule: it can alter the behavior of the
   rule, but since the list of labels is always supplied it doesn't need
   to be included when the operands for the rule are declared.

   Each rule also returns a list of labels together with their integer
   values. The input and output lists may be different (in particular
   the output list may include integer values for labels that were
   undefined in the input list, or vice versa).

   Label names can be used in integer expressions, in which case they
   are replaced by the input label value for the rule using the integer
   expression as one of its operands.

4.6. Control fields

   A control field is a field whose value is created by an encoder and
   not taken from the uncompressed protocol header itself. New control
   fields are defined in PEBBLE by means of three well-known labels:
   choice, msn and checksum. If these labels occur in an integer
   expression but their values are undefined, the encoder chooses a
   value for the label as follows:

   The "choice" label is assigned a value chosen by the particular
   implementation of an encoder. Different implementations may choose
   different values for the control field, allowing a certain amount of
   implementation flexibility when programming an encoder that makes use
   of the PEBBLE notation. For example, an implementation that wishes to
   obtain the highest possible compression efficiency may choose to try
   several different values for the control field (seeking to maximize
   the overall compression ratio for the header flow), whereas an
   implementation that wishes to reduce its processing requirements may
   use a simpler mechanism for choosing the value of the control field.

   The "msn" label is assigned the Master Sequence Number (MSN). The MSN
   is a counter which returns the position of the header in the flow: if
   the kth header in the flow is currently being described then the msn
   label is assigned the value k-1.

   The "checksum" label is assigned a checksum over the header to
   prevent unexpected errors from corrupting the context. A sufficiently
   long checksum should be provided to ensure the probability that an
   unexpected error will not be spotted is negligible. The algorithm for




Price et al.                                                   [Page 14]

INTERNET-DRAFT                   PEBBLE                 October 28, 2002


   calculating the checksum is chosen by the encoder and hence is beyond
   the scope of PEBBLE.

   Note that once the three labels have been assigned values they behave
   in exactly the same manner as other labels. In particular the encoder
   cannot change the value for the label until a rule sets the label to
   be undefined.


5.  Improving readability

   The following chapter defines some extensions to the PEBBLE notation
   that are designed to offer more convenient alternatives for existing
   constructs. Whilst the extensions do not affect the expressive power
   of PEBBLE, they are considered useful for improving its overall
   readability.

5.1. Shorthand form of operators

   The standard form of the rules and integer operators can be
   cumbersome (particularly for simple operators), so it is possible to
   define operators using a special shorthand form.

   In the shorthand form of an operator the operands are given without
   parentheses or commas, but are instead separated by a well-known
   symbol such as "+", "-", "<", "|", <whitespace> etc. For example:

       CONCATENATION(foo,bar) can be rewritten as foo whitespace bar

   Using operators in shorthand form can lead to ambiguity because the
   operands are not necessarily enclosed in parentheses. For example the
   expression a+b*c could be interpreted as a+(b*c) or as (a+b)*c. To
   resolve this ambiguity it is necessary to define a precedence order
   for operators in shorthand form. The operator with the highest
   precedence takes the shortest possible expressions as its operands.
   The operator with the second-highest precedence takes the shortest
   possible expressions given that the first criterion holds, and so on.
   Since the operator "*" has a higher precedence than "+", the
   expression a+b*c must be interpreted as a+(b*c).

   Given two operators with the same precedence, the leftmost operator
   takes the shorter expressions as its operands. So 2^3^4 is
   interpreted as (2^3)^4.

   The operators that can be expressed in shorthand form are defined in
   Chapter 6 together with their precedence.

5.2. Lists of undefined size

   For notational convenience it is possible to specify a list of
   operands by appending an empty pair of square braces to the operand




Price et al.                                                   [Page 15]

INTERNET-DRAFT                   PEBBLE                 October 28, 2002


   name. In this case the number of entries in the list is not
   specified. Instead, when the rule is encountered in an expression the
   number of entries in the list is derived by counting the total number
   of operands that follow the rule. For example, the operands for the
   CONCATENATION rule are declared as follows:

   CONCATENATION(rule[])

   If the term CONCATENATION(foo,bar) is encountered in an expression
   then the list rule[] must have two entries, with rule[0] = foo and
   rule[1] = bar.

   If more than one list of operands is created with an unspecified size
   then the size of each list must be equal. For example:

   INFERRED_TRANSLATE(#n,#a[],#b[])

   If the term INFERRED_TRANSLATE(1,2,3,4,5) is encountered in an
   expression then each of the lists a[] and b[] must contain two items,
   where a[0] = 2, a[1] = 3, b[0] = 4 and b[1] = 5. If
   INFERRED_TRANSLATE has an even number of operands then the rule must
   fail.

   If a new rule declares a list of undefined size then the well-known
   operand "end" is provided to access the length of the list in
   expressions. For example:

   PICK(rule[],#p[])   ::=   CHOICE(field_value)
                             LOCAL(field_value,rule[field_value])
                             DISTRIBUTION([p[j] for j in 0..end])

   In order to avoid forward referencing, all lists of operands with
   undefined size must be declared after the single operands and the
   lists of operands with known size. So defining a rule
   INFERRED_TRANSLATE(#a[],#b[],#n) would be incorrect.

5.3. Local definitions

   To improve the readability of the notation, the definition of a new
   rule can be given in terms of one or more "local definitions". For
   example:

   UNCOMPRESSIBLE($label,#d,#m,#p)   ::=   IRREGULAR(field_length)

   where field_length is (floor(label,d)*m+p)

   The keyword "where" indicates that a list of local definitions
   follows the description of a new rule. Each local definition is given
   a name, which can be any combination of letters, digits or the symbol
   "_" provided that it doesn't coincide with the name of a rule,
   integer operator, label, or any of the operands or local definitions




Price et al.                                                   [Page 16]

INTERNET-DRAFT                   PEBBLE                 October 28, 2002


   already specified for the rule. Note however that the names of local
   definitions from one rule can be reused when making local definitions
   for a different rule.

   The name of each local definition is followed by the keyword "is" and
   then by an arbitrary expression. The expression can include any of
   the operands for the new rule and any labels defined for the rule.
   Note that if the expression contains labels then its value may change
   depending on where it occurs in the description of the new rule. In
   particular, if a local definition appears several times when creating
   a new rule then its value should be recalculated each time it occurs.


6.  Pre-defined rules and integer operators

   The following chapter defines a number of integer operators and rules
   that can be used when creating new PEBBLE rules as per Chapter 4.

6.1. Integer operators

   This section provides a set of integer operators for use when
   creating new rules. The following operators are currently available:

   Integers              Integers can be expressed as decimal values,
                         binary values prefixed by 0b, or hex values
                         prefixed by 0x. Negative integers are prefixed
                         by a "-" sign.

   Label names           Label names can be used in integer expressions,
                         where they are replaced by the input label
                         value for the rule which called the integer
                         expression.

   NULL                  The keyword "NULL" can be used to specify an
                         undefined integer value. This is useful for
                         deleting the value contained within a label.

   Parentheses           Parentheses can be used to enclose expressions,
                         which can be useful in conjunction with
                         operators in shorthand form.

   Arithmetic            The operators +, -, *, / and ^ can be used.
                         Note that k/v is undefined if it does not
                         evaluate to an integer.

   Percentages           A value can be expressed as a percentage with
                         up to 2 decimal places (i.e. abc.xy%). This is
                         interpreted as an integer from 0 to 10000
                         inclusive (v% is interpreted as the integer
                         v*100, so in particular 0% is interpreted as 0
                         and 100% is interpreted as 10000).




Price et al.                                                   [Page 17]

INTERNET-DRAFT                   PEBBLE                 October 28, 2002


   Inequalities          Inequalities (<, =<, ==, !=, >=, >) return 1 if
                         they are true and 0 if they are false.

   floor(#k,#v)          Returns the largest integer not larger than k/v
                         (undefined when v = 0).

   mod(#k,#v)            Returns k-v*floor(k,v).

   log2(#v)              Returns the smallest integer k where v =< 2^k.

   min(#k[])             Returns j such that k[j] is minimal (taking the
                         smallest value of j in the event of a tie).

   checksum16(#n)        Returns the integer value of an IP checksum
                         over the last n bits in the packet.

   switch(#j,#k[])       Returns k[j].

   byte_swap(#n,#k)      Returns the byte-swapped value of k when
                         treated as an n-bit field (undefined if n is
                         not a multiple of 8).

   permutation(#n,#k,#x) Checks if the integer x is a concatenation of
                         the integers 0 to n-1 (each interpreted as a
                         bit pattern of length k) in any order. Returns
                         1 if this is the case and 0 otherwise.

   context_length(#n)    Returns the length of the nth field in the
                         context.

   context_value(#n)     Returns the value of the nth field in the
                         context.

   Note that if any of the operands are undefined, the value returned by
   the operator is also undefined.

   The precedence of each operator in shorthand form is given below
   (higher precedence first):

   Integers, percentages
   ^
   *, /
   +, -
   <, =<, ==, !=, >=, >

6.2. Pre-defined rules

   The following sections describe the four pre-defined rules provided
   by PEBBLE.






Price et al.                                                   [Page 18]

INTERNET-DRAFT                   PEBBLE                 October 28, 2002


6.2.1. FIELD

   The FIELD rule indicates the existence of a new field to be described
   by the PEBBLE notation. Note that the actual behaviour of the field
   is defined separately (for example by using the DISTRIBUTION rule).

   The syntax of the FIELD rule is given below:

   FIELD(#n)

   The operand n specifies the number of bits in the field. The FIELD
   rule uses the well-known label "remaining_data" to determine the
   position of the field in the packet. The remaining_data label
   specifies the start of the field as an bit offset relative to the end
   of the packet (so if remaining_data is 16 then the rule FIELD(16)
   refers to the last 16 bits in the packet), as illustrated in Figure
   3. The integer value of the field can then be determined, taking MSBs
   to precede LSBs. If n or remaining_data are undefined, or if
   remaining_data < n, then the FIELD rule must fail.

                         <-------------- remaining_data --------------->
   +---------------------+-------------+-------------------------------+
   |                     | n-bit field |                               |
   +---------------------+-------------+-------------------------------+

   <----------------------- header and payload ------------------------>

      Figure 3: The location of the n-bit field described by FIELD(n)

   If n = 0 then the integer value of the field is always taken to be 0.

   If the FIELD rule successfully locates a field, in the list of labels
   output by the FIELD rule the value of remaining_data is set to n less
   than its input value. Also, if the well-known label "field_value" is
   undefined in the input labels then for the output labels it is set
   equal to the integer value of the field. If field_value already has
   an integer value however, the value is left unchanged in the output
   list of labels and the value of remaining_data is not modified.

   The FIELD rule may also provide an updated value for one context
   entry: the location of the context entry to be updated is given by
   the well-known label "index". The FIELD rule first checks the well-
   known label "no_update" to determine whether an updated context value
   must be provided. If no_update = 1 then the values for
   updated_length(index) and updated_value(index) are set equal to
   context_length(index) and context_value(index) respectively.
   Otherwise they are set equal to the length and value of the field.
   The output value for index is set to 1 higher than its input value.

   All other labels are left unchanged by the FIELD rule, and the
   probability value output by the FIELD rule is always 100%.




Price et al.                                                   [Page 19]

INTERNET-DRAFT                   PEBBLE                 October 28, 2002


6.2.2. DISTRIBUTION

   The DISTRIBUTION rule gives the probability distribution of the well-
   known label "field_value". The syntax of the DISTRIBUTION rule is
   given below:

   DISTRIBUTION(#p[])

   The rule has a list of k operands p[0],...,p[k-1] specifying the
   probability distribution of the label. If probability[i] is the
   chance that the label field_value will take value i then:

   probability[i]   =           p[i]             for i from 0 to k-1
                        --------------------
                        p[0]+p[1]+...+p[k-1]

   The probability that the label will take a value outside the range 0
   to k-1 is zero.

   The DISTRIBUTION rule is successful provided that all of the k
   operands are positive integers and that the input value for the well-
   known label field_value lies between 0 and k-1 inclusive. If this is
   the case then the value of the label field_value and the probability
   distribution can be passed to an encoder that requires this
   information. The output value for the label field_value is undefined;
   all other labels are left unchanged by the rule.

   The DISTRIBUTION rule does not make use of the context and does not
   define any updated context entries.

6.2.3. MAP

   The MAP rule updates the value contained in a label. The syntax of
   the rule is given below:

   $label = #old_value ? #new_value

   Note that the syntax is in shorthand form, using the separators "="
   and "?".

   The MAP rule checks whether the input value for the specified label
   is equal to old_value. If this is the case then the output value for
   the label is set equal to new_value. If it is not the case then the
   MAP rule fails. All other labels are left unchanged by the rule.

   The expressions for old_value and new_value can contain any label
   names except for the label that is being updated by the MAP rule.

   The MAP rule does not define any updated context entries, and the
   probability value output by the MAP rule is always 100%.





Price et al.                                                   [Page 20]

INTERNET-DRAFT                   PEBBLE                 October 28, 2002


6.2.4. CONCATENATION

   The CONCATENATION rule matches a header which is the concatenation of
   the headers matched by n existing rules. The syntax of the
   CONCATENATION rule is given below:

   CONCATENATION(rule[])

   The CONCATENATION rule also has the following shorthand form:

   rule[0] whitespace ... whitespace rule[n-1]

   The shorthand version of the CONCATENATION rule has the lowest
   possible precedence (lower than any integer operators).

   The operands for the CONCATENATION rule consist of n existing rules
   denoted rule[0],...,rule[n-1]. The CONCATENATION rule applies each of
   the existing rules to the input packet, and is considered to match
   only if all of the n existing rules match.

   When the CONCATENATION rule is invoked it is passed an input list of
   labels and their corresponding integer values. This list is passed
   unmodified to rule[0], which returns a (possibly updated) list of
   labels to the CONCATENATION rule. The output list from rule[i-1] is
   then used as the input list for rule[i] for all i from 1 to n-1
   inclusive. Finally the output list from rule[n-1] is used unmodified
   as the output list for the CONCATENATION rule. This situation is
   illustrated in Figure 4:

                   |                                 ^
            Labels |                                 | Labels
                   v                                 |
                 +-------------------------------------+
                 | CONCATENATION(rule_1,rule_2,rule_3) |
                 +-------------------------------------+
                   | +-------------+ +-------------+ ^
            Labels | |             | |             | | Labels
                   v |             v |             v |
               +--------+      +--------+      +--------+
               | rule_1 |      | rule_2 |      | rule_3 |
               +--------+      +--------+      +--------+

                  Figure 4: Passing labels between rules

   Note that this avoids the need for an implementation to store more
   than one copy of each label.

   Any updated context entries defined by the n existing rules are
   passed unmodified as the output of the CONCATENATION rule (unless two
   existing rules define contradictory values for the same context
   entry, in which case the CONCATENATION rule fails). Note that the




Price et al.                                                   [Page 21]

INTERNET-DRAFT                   PEBBLE                 October 28, 2002


   updated entries for the context do not overwrite the original context
   entries while the current header is being parsed: instead they are
   buffered and only used to replace the original context entries for
   the next header in the flow.

   The probability value output by the CONCATENATION rule is the product
   of the probabilities output by each of the n existing rules (i.e. the
   probabilities are assumed to be independent).


7.  Library of commonly used rules

   The ROHC [RFC-3095] standard contains a number of different
   techniques for compressing header fields (LSB encoding, scaled
   timestamp encoding, list-based compression etc.). Each of these
   techniques can be added to a library of rules describing several
   common field behaviors. The rules contained in the library can be
   concatenated to give a description of a complete protocol stack. For
   example, a rule describing the behavior of an IPv6 header can be
   constructed from the rules available in the library as shown below:

   IPv6_Header           ::=       Version
                                   Traffic_Class
                                   ECT_Flag
                                   CE_Flag
                                   Flow_Label
                                   Payload_Length
                                   Next_Header
                                   Hop_Limit
                                   Source_Address
                                   Destination_Address

   Version               ::=       VALUE(4,6)

   Traffic_Class         ::=       STATIC

   ECT_Flag              ::=       STATIC

   CE_Flag               ::=       VALUE(1,0) 99% | VALUE(1,1) 1%

   Flow_Label            ::=       STATIC

   Payload_Length        ::=       INFERRED_SIZE(16,288)

   Next_Header           ::=       LABEL(8,next_header)

   Hop_Limit             ::=       STATIC

   Source_Address        ::=       STATIC

   Destination_Address   ::=       STATIC




Price et al.                                                   [Page 22]

INTERNET-DRAFT                   PEBBLE                 October 28, 2002


   The rules in the library are specified in terms of the four pre-
   defined rules, so new rules can be added to the library as and when
   they are needed.

7.1. Numeric rules

   This section defines the simplest set of rules that are typically
   used to handle numeric fields.

7.1.1. RANGE

   The RANGE rule describes an n-bit field that takes a known range of
   values. The definition of the rule is given below:

   RANGE(#n,#v,#k)   ::=   FIELD(n)
                           tmp = NULL ? field_value
                           field_value = tmp ? mod(tmp,k)
                           tmp = mod(mod(field_value-v,k)+v,2^n) ? NULL
                           DISTRIBUTION([1 for j in 0..k])

   The RANGE rule matches whenever the next n bits in the header (i.e.
   the n bits following the well-known label "remaining_data") lie
   between v and v+k-1 inclusive. The RANGE rule also assigns a
   probability distribution to the field, specifying that each of the k
   possible field values occurs with equal probability.

   The remaining rules in this section are all defined in terms of the
   RANGE rule.

7.1.2. VALUE

   The VALUE rule describes an n-bit field that takes a known value v.
   The definition of the rule is given below:

   VALUE(#n,#v)   ::=   RANGE(n,v,1)


7.1.3. IRREGULAR

   The IRREGULAR rule describes an n-bit field that behaves randomly
   (i.e. each of the 2^n possible bit patterns occurs with equal
   probability). The definition of the rule is given below:

   IRREGULAR(#n)   ::=   RANGE(n,0,2^n)

7.1.4. IRREGULAR_PADDED

   The IRREGULAR_PADDED rule describes an n-bit field with k least
   significant bits that behave randomly and with n-k most significant
   bits that are always zero. The definition of the rule is given below:





Price et al.                                                   [Page 23]

INTERNET-DRAFT                   PEBBLE                 October 28, 2002


   IRREGULAR_PADDED(#n,#k)   ::=   RANGE(n,0,2^k)

7.1.5. STATIC

   The STATIC rule describes a field with the same length and value as
   the corresponding field stored in the context. The definition of the
   rule is given below:

   STATIC   ::=   RANGE(context_length(index),context_value(index),1)

   Note that the rule is defined in terms of the integer operators
   context_length(#n) and context_value(#n), which respectively return
   the length and value of the nth field stored in the context.

7.1.6. LSB

   The LSB rule describes a field whose value differs by a small amount
   from the field value stored in the context. The definition of the
   rule is given below:

   LSB(#k,#p) ::= RANGE(context_length(index),context_value(index)-
                   p,2^k)

   The LSB rule can be applied when the field value lies between
   (context_value-p) and (context_value-p+2^k-1) inclusive. In
   particular, if p = 0 then the field value can only stay the same or
   increase relative to the previous header in the flow. If p = -1 then
   it can only increase, whereas if p = 2^k then it can only decrease.

7.2. Labeling rules

   This section lists some general rules that can be used to define new
   labels in terms of existing labels.

7.2.1. LABEL

   The LABEL rule assigns a label name to an n-bit field. The definition
   of the rule is given below:

   LABEL(#n,$label)   ::=   FIELD(n)
                            label = NULL ? field_value
                            field_value = label ? NULL

   The LABEL rule can be used in conjunction with the NEXT_FIELD rule
   specified below to change the order in which fields are defined in a
   PEBBLE description. This is particularly important for capturing
   dependencies between fields that occur in different parts of the
   header.







Price et al.                                                   [Page 24]

INTERNET-DRAFT                   PEBBLE                 October 28, 2002


7.2.2. NEXT_FIELD

   The NEXT_FIELD rule indicates that the next field to be defined is
   not the next field in the header, but instead is a value contained in
   a label. The definition of the rule is given below:

   NEXT_FIELD($label)   ::=   field_value = NULL ? label
                              label = field_value ? NULL

   The NEXT_FIELD rule copies the value contained in the named label to
   the well-known label field_value. When a FIELD rule is next
   encountered it will use this value rather than taking a field value
   from the header.

7.2.3. LOCAL

   The LOCAL rule temporarily deletes the value of a label. The rule
   takes as its operands the name of a label and an existing rule: the
   named label is set to "undefined" and then the existing rule is
   invoked. The original label value is restored in the output of the
   LOCAL rule.

   The definition of the rule is given below:

   LOCAL($label,rule)   ::=   CALL(label,label,rule)

   Note that the definition of the LOCAL rule is given in terms of
   another rule CALL which is defined as follows:

   CALL(#k,$label,rule)   ::=   label = k ? NULL
                                rule
                                label = NULL ? k

   The LOCAL rule can be useful when defining a new rule in terms of an
   existing rule, both of which use the same name for a label. It
   prevents the case whereby two values are assigned to the same label
   at the same time (which results in the rule failing to match any
   header).

7.3. INFERRED rules

   The INFERRED rules describe fields whose values can be inferred from
   other fields in the header. The following INFERRED rules are
   available:

7.3.1. INFERRED_TRANSLATE

   The INFERRED_TRANSLATE rule translates a field value under a certain
   mapping. The definition of the rule is given below:






Price et al.                                                   [Page 25]

INTERNET-DRAFT                   PEBBLE                 October 28, 2002


   INFERRED_TRANSLATE(#n,#a[],#b[]) ::= FIELD(n)
                tmp = NULL ? field_value
                field_value = tmp ? b[min([tmp!=a[j] for j in 0..end])]
                tmp = a[min([field_value!=b[j] for j in 0..end])] ? NULL

   The first operand specifies the length of the field before the
   translation. This is followed by additional lists of integers,
   respectively giving the value of the field before and after it is
   translated. For example:

   GRE_protocol   ::=   INFERRED_TRANSLATE(16,0x86dd,0x0800,41,4)

   The GRE Protocol field behaves in the same manner as the Next Header
   field in other extension headers, except that it indicates that the
   subsequent header is IPv6 or IPv4 using the values 0x86dd and 0x0800
   instead of 41 and 4. The INFERRED_TRANSLATE rule can translate the
   values required by the GRE Protocol field into the standard values,
   whose behavior can then be described by another rule such as LIST.

7.3.2. INFERRED_SIZE

   The INFERRED_SIZE rule infers the value of a field from the total
   amount of remaining data. The definition of the rule is given below:

   INFERRED_SIZE(#n,#p) ::= VALUE(n,(remaining_data-n-p)/8)

   The first operand specifies the length of the field in bits, and the
   second operand specifies an offset that will be subtracted from the
   total data length when deriving the value of the INFERRED_SIZE field.

7.3.3. INFERRED_OFFSET

   The INFERRED_OFFSET rule describes a field that takes a known offset
   relative to a certain base value. The definition of the rule is given
   below:

   INFERRED_OFFSET(#n,$base) ::= FIELD(n)
                           tmp = NULL ? field_value
                           field_value = tmp ? mod(field_value-base,2^n)
                           tmp = mod(field_value+base,2^n) ? NULL

   The first operand defines the length of the field. The second operand
   gives the name of a label where the base value can be found. The
   value of this label is defined by another rule (e.g. the LABEL rule).

   The INFERRED_OFFSET rule subtracts the base value from the field
   value. The resulting offset is then placed in the well-known label
   field_value so that it can be described by the next rule. For
   example, a typical sequence number can be described as follows:






Price et al.                                                   [Page 26]

INTERNET-DRAFT                   PEBBLE                 October 28, 2002


   sequence_number   ::=   INFERRED_OFFSET(32,msn)

                           STATIC 99% |
                           LSB(8,-1) 0.7% |
                           LSB(16,-1) 0.2% |
                           IRREGULAR(32) 0.1%

   In this case the offset value is expected to change rarely and only
   by small amounts, and hence it is defined using the STATIC and LSB
   rules.

7.3.4. INFERRED_SEQUENCE

   The INFERRED_SEQUENCE rule describes a field that takes a known
   offset relative to a base value, in the same way as INFERRED_OFFSET.
   However, INFERRED_SEQUENCE additionally defines a new base value
   equal to the value of the field. This is useful for describing
   sequences of incrementing values.

   The definition of the rule is given below:

   INFERRED_SEQUENCE(#n,$base) ::= FIELD(n)
                                   tmp = NULL ? field_value
                                   field_value = tmp ? mod(tmp-base,2^n)
                                   base = mod(tmp-field_value,2^n) ? tmp
                                   tmp = base ? NULL

   The first operand defines the length of the field. The second operand
   gives the name of a label where the base value can be found. The
   value of this label is defined by another rule (e.g. the LABEL rule).

   The INFERRED_SEQUENCE rule subtracts the base value from the field
   value. The resulting offset is then placed in the well-known label
   field_value so that it can be described by the next rule.
   Additionally, the base label is updated to contain the original value
   of the field being described. For example, a sequence of values such
   as 10, 20, 30 can be described as follows:

   incrementing_sequence   ::=   LABEL(16,base_label)

                                 INFERRED_SEQUENCE(16,base_label)
                                 STATIC 90% | IRREGULAR(16) 10%
                                          ; describes (2nd - 1st) entry

                                 INFERRED_SEQUENCE(16,base_label)
                                 STATIC 90% | IRREGULAR(16) 10%
                                          ; describes (3rd - 2nd) entry

                                 NEXT_FIELD(base_label)
                                 STATIC 90% | IRREGULAR(16) 10%
                                          ; describes 3rd entry




Price et al.                                                   [Page 27]

INTERNET-DRAFT                   PEBBLE                 October 28, 2002


7.3.5. INFERRED_IP_CHECKSUM

   The INFERRED_IP_CHECKSUM rule is used to describe the IP checksum
   field. The definition of the rule is given below:

   INFERRED_IP_CHECKSUM   ::=   tmp = NULL ? remaining_data
                                remaining_data = tmp ? tmp+80
                                FIELD(16)
                                remaining_data = tmp+96 ? tmp
                                field_value = checksum16(tmp) ? NULL
                                tmp = remaining_data ? NULL

   The rule matches whenever the 2-byte field at an offset of 10 bytes
   from the current position in the packet contains a 16-bit one's
   complement checksum over the remaining data.

7.4. Control field rules

   This section provides a selection of rules for defining new control
   fields.

7.4.1. CHOICE

   The CHOICE rule creates a label containing an implementation-specific
   value. The definition of the rule is given below:

   CHOICE($label)   ::=   label = NULL ? choice
                          choice = label ? NULL

7.4.2. PICK

   The PICK rule allows an implementation to pick one rule from a list
   of existing rules. The definition of the rule is given below:

   PICK(rule[],#p[])   ::=   CHOICE(field_value)
                             LOCAL(field_value,rule[field_value])
                             DISTRIBUTION([p[j] for j in 0..end])

   The operands for the PICK rule include a list of existing rules,
   followed by a list of integers specifying the relative probability
   with which each rule will be applied in a typical header flow.

   The PICK rule also has the following shorthand form using the symbol
   "|":

   rule[0] whitespace p[0] | ... | rule[n-1] whitespace p[n-1]

   The precedence for this shorthand form is higher than that for the
   CONCATENATION rule, but lower than the precedence of any integer
   operators.





Price et al.                                                   [Page 28]

INTERNET-DRAFT                   PEBBLE                 October 28, 2002


7.4.3. FORMAT

   The FORMAT rule describes a field which displays the same behavior as
   the corresponding field in the previous header. The definition of the
   rule is given below:

   FORMAT(rule[])   ::=   CHOICE(field_value)
                          LOCAL(field_value,rule[field_value])
                          STATIC

   The FORMAT rule is followed by a list of existing rules, each
   describing a different behavior for the field. The rule that is
   chosen is the same rule as used for the preceding header.

   For example:

   IP_ID   ::=   FORMAT(SEQUENTIAL_IP_ID,RANDOM_IP_ID)

   Two behaviors for the IP-ID are provided: one for an IP-ID that
   increases sequentially, and one for a randomly behaving IP-ID.

7.4.4. NBO

   The NBO rule describes a field that may be in reverse byte order from
   the rest of the data, as is sometimes seen with the IP-ID.

   The definition of the rule is given below:

   NBO(#n)   ::=   LABEL(n,base_field)
                   CHOICE(nbo_flag)
                   nbo_field = NULL ? switch(nbo_flag,
                    byte_swap(n,base_field),base_field)
                   base_field = switch(nbo_flag,
                    byte_swap(n,nbo_field),nbo_field) ? NULL

   The rule examines the n-bit field (where n is typically 16 or 32) and
   generates a label nbo_flag indicating whether or not the field is in
   network byte order. Determining whether a field is in network byte
   order is a complex task (typically requiring the encoder to examine a
   history of field values) so the value of the nbo_flag label is left
   as a choice for the implementation.

   If the field is in network_byte_order then the nbo_flag is set to 1
   and a label nbo_field is created containing the original field value.
   If the field is not in network byte order then the nbo_flag is set to
   0 and the label nbo_field is given the byte-swapped value of the
   original field. For example:

   IP_ID   ::=   NBO(16)                           ; Checks byte order
                 NEXT_FIELD(nbo_flag)              ; Describes nbo_flag
                 STATIC 99% | IRREGULAR(1) 1%




Price et al.                                                   [Page 29]

INTERNET-DRAFT                   PEBBLE                 October 28, 2002


                 NEXT_FIELD(nbo_field)             ; Describes nbo_field
                 INFERRED_OFFSET(16,msn)
                 STATIC 80% | LSB(5,-1) 15% | IRREGULAR(16) 5%

7.4.5. SCALE

   The SCALE rule describes a field which changes by a regular (usually
   large) amount in consecutive headers. The definition of the rule is
   given below:

   SCALE(#n) ::= LABEL(n,base_field)
                 CHOICE(scale)
                 scaled_field = NULL ? floor(base_field,scale)
                 remainder_field = NULL ? mod(base_field,scale)
                 base_field = scale*scaled_field+remainder_field ? NULL

   The number of bits specified are examined and a suitable scale factor
   is chosen (typically using a history of field values). The original
   field value is then mapped to two values: the scaled value obtained
   by dividing the field by the chosen scale factor, and the remainder
   value from this division. These two values plus the scale factor
   itself are sufficient to reconstruct the original field value.

7.4.6. CHECKSUM

   The CHECKSUM rule describes a checksum calculated across the header.
   The size of the checksum can be altered depending on the
   characteristics of the link over which the protocol is to be
   transmitted.

   The definition of the rule is given below:

   CHECKSUM(#n)   ::=   NEXT_FIELD(checksum)
                        IRREGULAR(n)

   Note that it is possible to offer different checksum lengths, so
   extra checksum bits can be allocated to protect important headers.
   This is illustrated in the example below:

   checksum_field   ::=   CHECKSUM(3) 99% | CHECKSUM(7) 1%

7.5. List rules

   The behavior of many protocol fields and headers can be described by
   a single rule built up from lists of simpler rules. This section
   defines a set of rules that can be used to describe the behavior of
   these lists.








Price et al.                                                   [Page 30]

INTERNET-DRAFT                   PEBBLE                 October 28, 2002


7.5.1. LIST

   The LIST rule describes a list of items that do not necessarily occur
   in the same order for every header. Example applications for the LIST
   rule include TCP options and TCP SACK blocks.

   The definition of the rule is given below:

   LIST($label,#d,#m,#p,rule[]) ::= start = NULL ? remaining_data
                           order = NULL ? 0
                           presence = NULL ? 0
                           CONCATENATION([LIST_ITEM for j in 0..end])
                           start = switch(permutation(end,k,order),
                            -1,remaining_data+floor(label,d)*m+p) ? NULL

   where k is log2(end)

         LIST_ITEM is CHOICE(next_rule)
                      CHOICE(optional)
                      rule[next_rule]

                      tmp = NULL ? order
                      order = tmp ? tmp*2^k+next_rule
                      next_rule = mod(order,2^k) ? NULL
                      tmp = floor(order,2^k) ? NULL

                      tmp = NULL ? presence
                      presence = tmp ? tmp*2+optional
                      optional = mod(presence,2) ? NULL
                      tmp = floor(presence,2) ? NULL

   The first operand gives the name of a label, whilst the next three
   operands specify a divisor, multiplier and offset which scale the
   value of the label so that it specifies the exact size of the list in
   bits.

   These operands are followed by a set of n existing rules that can be
   used to describe individual items in the list.

   Once the total list size is reached, the LIST rule outputs well-known
   labels "order" and "presence". The order consists of a string of n *
   k-bit entries (where k is the least number of bits required to index
   each of the n entries). The order list contains the indices in the
   order in which they occurred. The presence data is a list a 1-bit
   flags, one per entry in the list, which are set to "1" to indicate
   that the list item is present or "0" if not. These flags occur in the
   order in which the entries appear in the LIST rule.

   It is expected that the OPTIONAL rule (see Section 7.6.1) might be
   used within LIST. The well-known label "optional" is thus set





Price et al.                                                   [Page 31]

INTERNET-DRAFT                   PEBBLE                 October 28, 2002


   automatically within the LIST rule, so that it can be used in
   conjunction with the OPTIONAL rule as illustrated below:

   TCP_Options   ::=   LIST(tcp_data_offset,1,32,0,
                        OPTIONAL(TCP_SACK),
                        OPTIONAL(TCP_TIMESTAMP),
                        OPTIONAL(TCP_END),
                        OPTIONAL(TCP_GENERIC))

                       STATIC 99.9% |             ; TCP Options order
                       IRREGULAR(8) 0.1%

                       STATIC 75% |               ; TCP Options presence
                       IRREGULAR(4) 25%

   The LIST rule implicitly describes the behavior of the label that is
   specified in the first operand, so there is no need to explicitly
   describe this label using a NEXT_FIELD rule.

7.5.2. LIST_N

   The LIST_N rule is similar to the LIST rule, with the addition that
   each of the rules comprising the list can be applied multiple times.

   The definition of the rule is given below:

   LIST_N($label,#d,#m,#p,rule[],#number[]) ::=
          LIST(label,d,m,p,[rule[j] for j in 0..end, k in 0..number[j]])

   LIST_N is a useful shorthand in the case that a rule in the list may
   occur a large number of times. For parsing purposes, a count of
   number[k] is equivalent to having written out rule[k] a total of
   number[k] times.

   This form of specification is chosen over a truly open list (i.e. one
   allowing an unbounded number of repeats) since it is clear that it
   should always be possible to set an upper bound, and moreover this
   information may be useful to certain encoders. For example:

   TCP_Options   ::=   LIST_N(tcp_data_offset,1,32,0,
                        OPTIONAL(TCP_SACK),
                        OPTIONAL(TCP_TIMESTAMP),
                        OPTIONAL(TCP_END),
                        OPTIONAL(TCP_PAD),
                        OPTIONAL(TCP_GENERIC),
                        1,1,1,32,3)

7.6. Miscellaneous rules

   This section introduces some miscellaneous rules that can be used to
   describe fields in a protocol header.




Price et al.                                                   [Page 32]

INTERNET-DRAFT                   PEBBLE                 October 28, 2002


7.6.1. OPTIONAL

   The OPTIONAL rule is used to describe fields that are optionally
   present in the header. The definition of the rule is given below:

   OPTIONAL(rule)   ::=   CONCATENATION([rule for j in 0..optional-1])

   The OPTIONAL rule uses the well-known label "optional" to specify
   whether or not the optional field is present in the header. The value
   of the label can be set by another rule such as LABEL or LIST. For
   example:

   GRE_Key_Flag   ::=   LABEL(1,optional)

   GRE_Key        ::=   OPTIONAL(Key_Present)

   In this case the Key_Present rule is called to describe the GRE_Key
   field, but only if the GRE_Key Flag field is set to 1.

7.6.2. UNCOMPRESSIBLE

   The UNCOMPRESSIBLE rule is a special case of the IRREGULAR rule,
   where the length of the field can be derived from the value contained
   in a label. The definition of the rule is given below:

   UNCOMPRESSIBLE($label,#d,#m,#p)   ::=   IRREGULAR(floor(label,d)*m+p)

   The first operand gives the name of the label, whilst the next three
   operands specify a divisor, multiplier and offset which scale the
   value of the label so that it specifies the exact size of the
   UNCOMPRESSIBLE field in bits.

   The UNCOMPRESSIBLE rule is typically used in conjunction with the
   LABEL rule.

7.6.3. SKIP

   The SKIP rule skips past a certain field in the protocol header
   without defining its behavior. The definition of the rule is given
   below:

   SKIP(#n)   ::=   tmp = NULL ? remaining_data
                    remaining_data = tmp ? tmp+n
                    tmp = remaining_data-n ? NULL

7.6.4. NO_UPDATE

   The NO_UPDATE rule behaves as the rule specified by its operand, with
   the exception that it does not update the context. This is useful
   when a field exhibits an unusual behavior for one header and then
   reverts back to its original behavior in subsequent headers.




Price et al.                                                   [Page 33]

INTERNET-DRAFT                   PEBBLE                 October 28, 2002


   The definition of the rule is given below:

   NO_UPDATE(rule)   ::=   no_update = NULL ? 1
                           rule
                           no_update = 1 ? NULL

   An example of the NO_UPDATE rule in use is given below:

   TCP_Window   ::=   STATIC 99% |
                      NO_UPDATE(LSB(11,2048)) 0.9% |
                      IRREGULAR(16) 0.1%

   In the above example the NO_UPDATE rule is applied to the TCP Window
   field. The field can behave as defined by the LSB rule, but when this
   behavior occurs it does not update the context.


8.  Security considerations

   This draft describes an abstract notation similar to ABNF [RFC-2234],
   and hence is not believed to raise any security issues.


9.  Acknowledgements

   A number of important concepts and ideas have been borrowed from ROHC
   [RFC-3095]. Updates to the LIST rules owe much to discussions with
   Qian Zhang and Hongbin Liao.

   Thanks to Paul Ollis for field labeling; and to Rob Hancock and
   Stephen McCann for putting up with the authors' arguments and making
   helpful suggestions, frequently against the tide!

   The authors would also like to thank Carsten Bormann, Christian
   Schmidt, Max Riegel and Lars-Erik Jonsson for their comments and
   encouragement. We haven't always agreed, but the arguments have been
   fun!


10. Authors' addresses

   Richard Price         Tel: +44 1794 833681
   Email:                richard.price@roke.co.uk

   Abigail Surtees       Tel: +44 1794 833131
   Email:                abigail.surtees@roke.co.uk

   Mark A West           Tel: +44 1794 833311
   Email:                mark.a.west@roke.co.uk






Price et al.                                                   [Page 34]

INTERNET-DRAFT                   PEBBLE                 October 28, 2002


   Roke Manor Research Ltd
   Romsey, Hants, SO51 0ZN
   United Kingdom
   http://www.roke.co.uk


11. References

   [RFC-2026]  "The Internet Standards Process - Revision 3", Scott
               Bradner, RFC 2026, Internet Engineering Task Force,
               October 1996

   [RFC-2119]  "Key words for use in RFCs to Indicate Requirement
               Levels", Scott Bradner, RFC 2119, Internet Engineering
               Task Force, March 1997

   [RFC-2234]  "Augmented BNF for Syntax Specifications: ABNF",
               D. Crocker and P. Overell, RFC 2234, Internet Engineering
               Task Force, November 1997

   [RFC-2960]  "Stream Control Transmission Protocol", Stewart et al,
               RFC 2960, Internet Engineering Task Force, October 2000

   [RFC-3095]  "RObust Header Compression (ROHC)", Carsten Bormann et
               al, RFC3095, Internet Engineering Task Force, July 2001

   [RFC-3261]  "SIP: Session Initiation Protocol", J. Rosenberg et al,
               RFC 3261, Internet Engineering Task Force, June 2002


Appendix A. ABNF description of PEBBLE

   definition          ::=   new_rule ws "::=" ws expression
                              [ws "where" 1*(ws local_definition)]

   new_rule            ::=   name [ws] ["(" operand
                              *("," [ws] operand) ")"]

   name                ::=   *(letter | digit | "_")

   letter              ::=   %x41-5a | %x61-7a

   digit               ::=   %x30-39

   operand             ::=   [type] name ["[" [name] "]"]

   type                ::=   "#" | "$"

   expression          ::=   operator | reference | repetition

   operator            ::=   longhand_form | shorthand_form




Price et al.                                                   [Page 35]

INTERNET-DRAFT                   PEBBLE                 October 28, 2002


   longhand_form       ::=   name [ws] ["(" expression
                              *("," [ws] expression) ")"]

   shorthand_form      ::=   [expression] *(reserved
                              expression) [reserved]

   reserved            ::=   1*(%x00-ff)

   reference           ::=   name ["[" expression "]"]

   repetition          ::=   "[" expression ws "for" ws index
                              *("," [ws] index "]"

   index               ::=   name ws "in" ws expression ".." expression

   local_definition    ::=   name ws "is" ws expression







































Price et al.                                                   [Page 36]


Html markup produced by rfcmarkup 1.108, available from http://tools.ietf.org/tools/rfcmarkup/