draft-ietf-rohc-sigcomp-user-guide-00.txt   draft-ietf-rohc-sigcomp-user-guide-01.txt 
Network Working Group Richard Price, Siemens/Roke Manor Robust Header Compression A. Surtees
INTERNET-DRAFT Abigail Surtees, Siemens/Roke Manor Internet-Draft M. West
Expires: December 2003 Mark A West, Siemens/Roke Manor Expires: August 19, 2005 Siemens/Roke Manor Research
February 18, 2005
June 16, 2003
SigComp User Guide SigComp Users' Guide
<draft-ietf-rohc-sigcomp-user-guide-00.txt> draft-ietf-rohc-sigcomp-user-guide-01.txt
Status of this memo Status of this Memo
This document is an Internet-Draft and is in full conformance with This document is an Internet-Draft and is subject to all provisions
all provisions of Section 10 of RFC2026. of section 3 of RFC 3667. By submitting this Internet-Draft, each
author represents that any applicable patent or other IPR claims of
which he or she is aware have been or will be disclosed, and any of
which he or she become aware will be disclosed, in accordance with
RFC 3668.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that other Task Force (IETF), its areas, and its working groups. Note that
groups may also distribute working documents as Internet-Drafts. other groups may also distribute working documents as
Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or cite them other than as "work in progress". material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/lid-abstracts.txt http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html http://www.ietf.org/shadow.html.
This document is a submission of the IETF ROHC WG. Comments should be This Internet-Draft will expire on August 19, 2005.
directed to its mailing list, rohc@ietf.org.
Copyright Notice
Copyright (C) The Internet Society (2005).
Abstract Abstract
This document provides an informational guide for users of the This document provides an informational guide for users of the
SigComp protocol. The aim of the document is to assist users when SigComp protocol. The aim of the document is to assist users when
making SigComp implementation decisions; for example the choice of making SigComp implementation decisions; for example the choice of
compression algorithm and the level of robustness against lost or compression algorithm and the level of robustness against lost or
misordered packets. misordered packets.
Table of contents Table of Contents
1. Introduction..................................................2 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Overview of the User Guide....................................2 2. Overview of the User Guide . . . . . . . . . . . . . . . . . . 3
3. UDVM assembly language........................................3 3. UDVM assembly language . . . . . . . . . . . . . . . . . . . . 4
4. Compression algorithms........................................10 3.1 Lexical level . . . . . . . . . . . . . . . . . . . . . . 4
5. Additional SigComp mechanisms.................................28 3.2 Syntactic level . . . . . . . . . . . . . . . . . . . . . 5
6. Security considerations.......................................35 3.2.1 Expressions . . . . . . . . . . . . . . . . . . . . . 6
7. Acknowledgements..............................................35 3.2.2 Instructions . . . . . . . . . . . . . . . . . . . . . 7
8. Authors' addresses............................................35 3.2.3 Directives . . . . . . . . . . . . . . . . . . . . . . 8
9. Intellectual Property Right Considerations....................35 3.2.4 Labels . . . . . . . . . . . . . . . . . . . . . . . . 9
10. References....................................................36 3.3 Uploading the bytecode to the UDVM . . . . . . . . . . . . 10
Appendix A: UDVM bytecode for the compression algorithms..........37 4. Compression algorithms . . . . . . . . . . . . . . . . . . . . 11
4.1 Simplified LZ77 . . . . . . . . . . . . . . . . . . . . . 11
4.2 LZSS . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
4.3 LZW . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
4.4 DEFLATE . . . . . . . . . . . . . . . . . . . . . . . . . 20
4.5 LZJH . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
4.6 EPIC . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
5. Additional SigComp mechanisms . . . . . . . . . . . . . . . . 31
5.1 Acknowledging a state item . . . . . . . . . . . . . . . . 31
5.2 Static dictionary . . . . . . . . . . . . . . . . . . . . 33
5.3 CRC checksum . . . . . . . . . . . . . . . . . . . . . . . 33
5.4 Announcing additional resources . . . . . . . . . . . . . 34
5.5 Shared compression . . . . . . . . . . . . . . . . . . . . 35
6. Security considerations . . . . . . . . . . . . . . . . . . . 38
7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 38
8. Intellectual Property Right Considerations . . . . . . . . . . 38
9. References . . . . . . . . . . . . . . . . . . . . . . . . . . 39
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . 40
A. UDVM bytecode for the compression algorithms . . . . . . . . . 40
A.1 Simplified LZ77 . . . . . . . . . . . . . . . . . . . . . 40
A.2 LZSS . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
A.3 LZW . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
A.4 DEFLATE . . . . . . . . . . . . . . . . . . . . . . . . . 41
A.5 LZJH . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
A.6 EPIC . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
Intellectual Property and Copyright Statements . . . . . . . . 42
1. Introduction 1. Introduction
This document provides an informational guide for users of the This document provides an informational guide for users of the
SigComp protocol [RFC-3320]. The idea behind SigComp is to SigComp protocol RFC-3320 [4]. The idea behind SigComp is to
standardize a Universal Decompressor Virtual Machine (UDVM) that can standardize a Universal Decompressor Virtual Machine (UDVM) that can
be programmed to understand the output of many well-known compressors be programmed to understand the output of many well-known compressors
including DEFLATE [DEFLATE] and LZW [LZW]. The bytecode for the including DEFLATE DEFLATE [10] and LZW LZW [9]. The bytecode for the
choice of compression algorithm is uploaded to the UDVM as part of choice of compression algorithm is uploaded to the UDVM as part of
the compressed data. the compressed data.
The basic SigComp RFC describes the actions that an endpoint must The basic SigComp RFC describes the actions that an endpoint must
take upon receiving a SigComp message. However the entity responsible take upon receiving a SigComp message. However the entity
for generating new SigComp messages (the SigComp compressor) is left responsible for generating new SigComp messages (the SigComp
as an implementation decision; any compressor can be used provided compressor) is left as an implementation decision; any compressor can
that it generates SigComp messages that can be successfully be used provided that it generates SigComp messages that can be
decompressed by the receiving endpoint. successfully decompressed by the receiving endpoint.
This document offers a number of different compressors that can be This document offers a number of different compressors that can be
used by the SigComp protocol. It also describes how standard stream- used by the SigComp protocol. It also describes how standard stream-
based compressors can be modified for robustness against lost and/or based compressors can be modified for robustness against lost and/or
misordered packets over an unreliable transport such as UDP. misordered packets over an unreliable transport such as UDP.
2. Overview of the User Guide 2. Overview of the User Guide
When implementing a SigComp compressor the first step is to choose a When implementing a SigComp compressor the first step is to choose a
compression algorithm that can encode the application messages into a compression algorithm that can encode the application messages into a
skipping to change at page 3, line 20 skipping to change at page 3, line 52
6. EPIC 6. EPIC
Any of the above algorithms may be useful depending on the desired Any of the above algorithms may be useful depending on the desired
compression ratio, processing and memory requirements, code size, compression ratio, processing and memory requirements, code size,
implementation complexity and Intellectual Property (IPR) implementation complexity and Intellectual Property (IPR)
considerations. considerations.
As well as encoding the application messages using the chosen As well as encoding the application messages using the chosen
algorithm, the SigComp compressor is responsible for ensuring that algorithm, the SigComp compressor is responsible for ensuring that
messages can be correctly decompressed even if packets are lost or messages can be correctly decompressed even if packets are lost or
misordered during transmission. The SigComp feedback mechanism can be misordered during transmission. The SigComp feedback mechanism can
used to acknowledge successful decompression over an unreliable be used to acknowledge successful decompression over an unreliable
transport such as UDP. transport such as UDP.
The following robustness techniques and other mechanisms specific to The following robustness techniques and other mechanisms specific to
the SigComp environment are covered in this document: the SigComp environment are covered in this document:
1. Acknowledgements using the SigComp feedback mechanism 1. Acknowledgements using the SigComp feedback mechanism
2. Static dictionary 2. Static dictionary
3. CRC checksum 3. CRC checksum
4. Announcing additional resources 4. Announcing additional resources
5. Shared compression 5. Shared compression
Any or all of the above mechanisms can be implemented in conjunction Any or all of the above mechanisms can be implemented in conjunction
with the chosen compression algorithm. A subroutine of UDVM bytecode with the chosen compression algorithm. A subroutine of UDVM bytecode
is provided for each of the mechanisms; these subroutines can be is provided for each of the mechanisms; these subroutines can be
added to the bytecode for one of the basic compression algorithms. added to the bytecode for one of the basic compression algorithms.
3. UDVM assembly language 3. UDVM assembly language
Writing UDVM programs directly in bytecode would be a daunting task, Writing UDVM programs directly in bytecode would be a daunting task,
skipping to change at page 4, line 5 skipping to change at page 4, line 33
so a simple assembly language is provided to facilitate the creation so a simple assembly language is provided to facilitate the creation
of new decompression algorithms. The assembly language includes of new decompression algorithms. The assembly language includes
mnemonic codes for each of the UDVM instructions, as well as simple mnemonic codes for each of the UDVM instructions, as well as simple
directives for evaluating integer expressions, padding the bytecode directives for evaluating integer expressions, padding the bytecode
and so forth. and so forth.
The syntax of the UDVM assembly language uses the customary two-level The syntax of the UDVM assembly language uses the customary two-level
description technique, partitioning the grammar into a lexical and a description technique, partitioning the grammar into a lexical and a
syntactical level. syntactical level.
3.1. Lexical level 3.1 Lexical level
On a lexical level, a string of assembly consists of zero or more On a lexical level, a string of assembly consists of zero or more
tokens optionally separated by whitespace. Each token can be a text tokens optionally separated by whitespace. Each token can be a text
name, an instruction opcode, a delimiter, or an integer (specified as name, an instruction opcode, a delimiter, or an integer (specified as
decimal, binary or hex). decimal, binary or hex).
The following ABNF description [RFC-2234] specifies the syntax of a The following ABNF description RFC-2234 [3] specifies the syntax of a
token: token:
token = (name / opcode / delimiter / dec / bin / hex) token = (name / opcode / delimiter / dec / bin / hex)
name = (lowercase / "_") 1*(lowercase / digit / "_") name = (lowercase / "_") 1*(lowercase / digit / "_")
opcode = uppercase *(uppercase / digit / "-") opcode = uppercase *(uppercase / digit / "-")
delimiter = "." / "!" / "$" / ":" / "(" / ")" / operator delimiter = "." / "!" / "$" / ":" / "(" / ")" / operator
dec = 1*(digit) dec = 1*(digit)
bin = "0b" 1*("0" / "1") bin = "0b" 1*("0" / "1")
hex = "0x" 1*(hex_digit) hex = "0x" 1*(hex_digit)
hex_digit = digit / %x41-46 / %x61-66 hex_digit = digit / %x41-46 / %x61-66
digit = %x30-39 digit = %x30-39
uppercase = %x41-5a uppercase = %x41-5a
lowercase = %x61-7a lowercase = %x61-7a
operator = "+" / "-" / "*" / "/" / "%" / "&" / "|" / operator = "+" / "-" / "*" / "/" / "%" / "&" / "|" /
"^" / "~" / "<<" / ">>" "^" / "~" / "<<" / ">>"
When parsing for tokens a longest match is applied, i.e. a token is When parsing for tokens a longest match is applied, i.e. a token is
the longest string that matches the <token> rule specified above. the longest string that matches the <token> rule specified above.
The syntax of whitespace and comments is specified by the following The syntax of whitespace and comments is specified by the following
ABNF: ABNF:
ws = *(%x09 / %x0a / %x0d / %x20 / comment) ws = *(%x09 / %x0a / %x0d / %x20 / comment)
skipping to change at page 5, line 14 skipping to change at page 5, line 35
LOAD (temp, 1) ; This is a comment. LOAD (temp, 1) ; This is a comment.
Any other input is a syntax error. Any other input is a syntax error.
When parsing on the lexical level the string of assembly should be When parsing on the lexical level the string of assembly should be
divided up into a list of successive tokens. The whitespace and divided up into a list of successive tokens. The whitespace and
comments should also be deleted. The assembly should then be parsed comments should also be deleted. The assembly should then be parsed
on the syntactic level as explained in Section 3.2. on the syntactic level as explained in Section 3.2.
3.2. Syntactic level 3.2 Syntactic level
Once the string of assembly has been divided into tokens as per Once the string of assembly has been divided into tokens as per
Section 3.1, the next step is to convert the assembly into a string Section 3.1, the next step is to convert the assembly into a string
of UDVM bytecode. of UDVM bytecode.
On a syntactic level, a string of assembly consists of zero or more On a syntactic level, a string of assembly consists of zero or more
instructions, directives or labels, each of which is itself built up instructions, directives or labels, each of which is itself built up
from one or more lexical tokens. from one or more lexical tokens.
The following ABNF description specifies the syntax of the assembly The following ABNF description specifies the syntax of the assembly
skipping to change at page 5, line 46 skipping to change at page 6, line 20
; compiler, e.g. a literal (#) operand ; compiler, e.g. a literal (#) operand
; that specifies the total number of ; that specifies the total number of
; operands for the instruction. ; operands for the instruction.
; When "$" is prepended to an operand, ; When "$" is prepended to an operand,
; the corresponding integer is an ; the corresponding integer is an
; address rather than the actual operand ; address rather than the actual operand
; value. This symbol is mandatory for ; value. This symbol is mandatory for
; reference operands ($), optional for ; reference operands ($), optional for
; multitypes (%) and addresses (@), and ; multitypes (%) and addresses (@), and
; disallowed for literals (#). ; disallowed for literals (#).
label = ":" name label = ":" name
; abbreviation for set(<name>, .)
directive = padding / data / set directive = padding / data / set
; note that directive names are ; note that directive names are
; syntactically of category <name>; all ; syntactically of category <name>; all
; directives are intended to syntactically ; directives are intended to syntactically
; match: name ["(" expression *("," ; match: name ["(" expression *(","
; expression) ")"] ; expression) ")"]
padding = ("pad" / "align" / "at") "(" expression ")" padding = ("pad" / "align" / "at") "(" expression ")"
data = ("byte" / "word") "(" expression *("," data = ("byte" / "word") "(" expression *(","
expression) ")" expression) ")"
set = "set" "(" name "," expression ")" set = "set" "(" name "," expression ")"
expression = value / "(" expression operator expression ")" expression = value / "(" expression operator expression ")"
value = dec / bin / hex / name / "." / "!" value = dec / bin / hex / name / "." / "!"
; "." is the location of this ; "." is the location of this
; instruction/directive, whereas "!" is ; instruction/directive, whereas "!" is
; the location of the closest ; the location of the closest
; DECOMPRESSION-FAILURE ; DECOMPRESSION-FAILURE
The following sections define how to convert the instructions, labels The following sections define how to convert the instructions, labels
and directives into UDVM bytecode: and directives into UDVM bytecode:
3.2.1. Expressions 3.2.1 Expressions
The operand values needed by particular instructions or directives The operand values needed by particular instructions or directives
can be given in the form of expressions. An expression can include can be given in the form of expressions. An expression can include
one or more values specified as decimal, binary or hex (binary values one or more values specified as decimal, binary or hex (binary values
are preceded by "0b" and hex values are preceded by "0x"). The are preceded by "0b" and hex values are preceded by "0x"). The
expression may also include one or more of the following operators: expression may also include one or more of the following operators:
"+" Addition "+" Addition
"-" Subtraction "-" Subtraction
"*" Multiplication "*" Multiplication
skipping to change at page 6, line 48 skipping to change at page 7, line 19
"~" Binary XNOR "~" Binary XNOR
"<<" Binary LSHIFT "<<" Binary LSHIFT
">>" Binary RSHIFT ">>" Binary RSHIFT
The operands for each operator must always be surrounded by The operands for each operator must always be surrounded by
parentheses so that the order in which the operators should be parentheses so that the order in which the operators should be
evaluated is clear. For example: evaluated is clear. For example:
((1 + (2 * 3)) & (0xabcd - 0b00101010)) gives the result 3. ((1 + (2 * 3)) & (0xabcd - 0b00101010)) gives the result 3.
Expressions can also include the special values "." and "!". When the Expressions can also include the special values "." and "!". When
symbol "." is encountered, it is replaced by the location in the the symbol "." is encountered, it is replaced by the location in the
bytecode of the current instruction/directive. When the symbol "!" is bytecode of the current instruction/directive. When the symbol "!"
encountered it is replaced by the location in the bytecode of the is encountered it is replaced by the location in the bytecode of the
closest DECOMPRESSION-FAILURE instruction (i.e. the closest zero closest DECOMPRESSION-FAILURE instruction (i.e. the closest zero
byte). This can be useful when writing UDVM instructions that call a byte). This can be useful when writing UDVM instructions that call a
decompression failure, for example: decompression failure, for example:
INPUT-BYTES (1, temp, !) INPUT-BYTES (1, temp, !)
The above instruction causes a decompression failure to occur if it The above instruction causes a decompression failure to occur if it
tries to input data from beyond the end of the compressed message. tries to input data from beyond the end of the compressed message.
It is also possible to assign integer values to text names: when a It is also possible to assign integer values to text names: when a
text name is encountered in an expression it is replaced by the text name is encountered in an expression it is replaced by the
integer value assigned to it. Section 3.2.3 explains how to assign integer value assigned to it. Section 3.2.3 explains how to assign
integer values to text names. integer values to text names.
3.2.2. Instructions 3.2.2 Instructions
A UDVM instruction is specified by the instruction opcode followed by A UDVM instruction is specified by the instruction opcode followed by
zero or more operands. The instruction operands are enclosed in zero or more operands. The instruction operands are enclosed in
parentheses and separated by commas, for example: parentheses and separated by commas, for example:
ADD (3, 4) ADD (3, 4)
When generating the bytecode the parser should replace the When generating the bytecode the parser should replace the
instruction opcode with the corresponding 1-byte value as per Figure instruction opcode with the corresponding 1-byte value as per Figure
11 of SigComp [RFC-3320]. 11 of SigComp RFC-3320 [4].
Each operand consists of an expression which evaluates to an integer, Each operand consists of an expression which evaluates to an integer,
optionally preceded by the symbol "$". This symbol indicates that the optionally preceded by the symbol "$". This symbol indicates that
supplied integer value must be interpreted as the memory address at the supplied integer value must be interpreted as the memory address
which the operand value can be found, rather than the actual operand at which the operand value can be found, rather than the actual
value itself. operand value itself.
When converting each instruction operand to bytecode, the parser When converting each instruction operand to bytecode, the parser
first determines whether the instruction expects the operand to be a first determines whether the instruction expects the operand to be a
literal, a reference, a multitype or an address. If the operand is a literal, a reference, a multitype or an address. If the operand is a
literal then as per Figure 8 of SigComp, the parser inserts the literal then as per Figure 8 of SigComp, the parser inserts the
shortest bytecode capable of encoding the supplied operand value. shortest bytecode capable of encoding the supplied operand value.
Since literal operands are used to indicate the total number of Since literal operands are used to indicate the total number of
operands for an instruction, it is possible to leave a literal operands for an instruction, it is possible to leave a literal
operand blank and allow its value to be inferred automatically by the operand blank and allow its value to be inferred automatically by the
skipping to change at page 8, line 6 skipping to change at page 8, line 30
The missing operand should be given the value 4 because it is The missing operand should be given the value 4 because it is
followed by a total of 4 operands. followed by a total of 4 operands.
If the operand is a reference then as per Figure 9 of SigComp, the If the operand is a reference then as per Figure 9 of SigComp, the
parser inserts the shortest bytecode capable of encoding the supplied parser inserts the shortest bytecode capable of encoding the supplied
memory address. Note that reference operands will always be preceded memory address. Note that reference operands will always be preceded
by the symbol "$" in assembly because they always encode memory by the symbol "$" in assembly because they always encode memory
addresses rather than actual operand values. addresses rather than actual operand values.
If the operand is a multitype then the parser first checks whether If the operand is a multitype then the parser first checks whether
the symbol "$" is present. If so then as per Figure 10 of SigComp, it the symbol "$" is present. If so then as per Figure 10 of SigComp,
inserts the shortest bytecode capable of encoding the supplied it inserts the shortest bytecode capable of encoding the supplied
integer as a memory address. If not then it inserts the shortest integer as a memory address. If not then it inserts the shortest
bytecode that encodes the supplied integer as an operand value. bytecode that encodes the supplied integer as an operand value.
If the operand is an address then the parser checks whether the If the operand is an address then the parser checks whether the
symbol "$" is present. If so then the supplied integer is encoded as symbol "$" is present. If so then the supplied integer is encoded as
a memory address, just as for the multitype instruction above. If not a memory address, just as for the multitype instruction above. If
then the byte position of the opcode is subtracted from the supplied not then the byte position of the opcode is subtracted from the
integer modulo 16, and the result is encoded as an operand value as supplied integer modulo 16, and the result is encoded as an operand
per Figure 10 of SigComp. value as per Figure 10 of SigComp.
3.2.3. Directives 3.2.3 Directives
The assembly language provides a number of directives for evaluating The assembly language provides a number of directives for evaluating
expressions, moving instructions to a particular memory address etc. expressions, moving instructions to a particular memory address etc.
The directives "pad", "align" and "at" can be used to add padding to The directives "pad", "align" and "at" can be used to add padding to
the bytecode. the bytecode.
The directive "pad (n)" appends n consecutive padding bytes to the The directive "pad (n)" appends n consecutive padding bytes to the
bytecode. The actual value of the padding bytes is unimportant, so bytecode. The actual value of the padding bytes is unimportant, so
when the bytecode is uploaded to the UDVM the padding bytes can be when the bytecode is uploaded to the UDVM the padding bytes can be
skipping to change at page 8, line 54 skipping to change at page 9, line 30
The directive "byte (n[0],..., n[k-1])" appends k consecutive bytes The directive "byte (n[0],..., n[k-1])" appends k consecutive bytes
to the bytecode. The byte string is supplied as expressions which to the bytecode. The byte string is supplied as expressions which
evaluate to give integers n[0],..., n[k-1] from 0 to 255. evaluate to give integers n[0],..., n[k-1] from 0 to 255.
The directive "word (n[0],..., n[k-1])" appends k consecutive 2-byte The directive "word (n[0],..., n[k-1])" appends k consecutive 2-byte
words to the bytecode. The word string is supplied as expressions words to the bytecode. The word string is supplied as expressions
which evaluate to give integers n[0],..., n[k-1] from 0 to 65535. which evaluate to give integers n[0],..., n[k-1] from 0 to 65535.
The directive "set (name, n)" assigns an integer value n to a The directive "set (name, n)" assigns an integer value n to a
specified text name. The integer value can be supplied in the form of specified text name. The integer value can be supplied in the form
an expression. of an expression.
3.2.4. Labels 3.2.4 Labels
A label is a special directive used to assign memory addresses to A label is a special directive used to assign memory addresses to
text names. text names.
Labels are specified by giving a single colon followed by the text Labels are specified by giving a single colon followed by the text
name to be defined. The (absolute) position of the byte immediately name to be defined. The (absolute) position of the byte immediately
following the label is evaluated and assigned to the text name. For following the label is evaluated and assigned to the text name. For
example: example:
:start :start
LOAD (temp, 1) LOAD (temp, 1)
Since the label "start" occurs at the beginning of the bytecode, it Since the label "start" occurs at the beginning of the bytecode, it
is assigned the integer value 0. is assigned the integer value 0.
Note that writing the label ":name" has exactly the same behavior as Note that writing the label ":name" has exactly the same behavior as
writing the directive "set (name, .)". writing the directive "set (name, .)".
3.3. Uploading the bytecode to the UDVM 3.3 Uploading the bytecode to the UDVM
Once the parser has converted a string of assembly into the Once the parser has converted a string of assembly into the
corresponding bytecode, it must be copied to the UDVM memory corresponding bytecode, it must be copied to the UDVM memory
beginning at Address 0 and then executed beginning from the first beginning at Address 0 and then executed beginning from the first
UDVM instruction in the bytecode. UDVM instruction in the bytecode.
SigComp provides the following message format for uploading bytecode SigComp provides the following message format for uploading bytecode
to the UDVM: to the UDVM:
0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
skipping to change at page 10, line 34 skipping to change at page 11, line 23
either padding, set to 0 by the bytecode, or are beyond the total either padding, set to 0 by the bytecode, or are beyond the total
length of the bytecode. length of the bytecode.
The "uploaded UDVM bytecode" should be set to contain the segment of The "uploaded UDVM bytecode" should be set to contain the segment of
bytecode that lies between Address (destination) and Address bytecode that lies between Address (destination) and Address
(destination + code_len - 1) inclusive. (destination + code_len - 1) inclusive.
4. Compression algorithms 4. Compression algorithms
This chapter describes a number of compression algorithms that can be This chapter describes a number of compression algorithms that can be
used by a SigComp compressor. In each case the document provides UDVM used by a SigComp compressor. In each case the document provides
bytecode for the corresponding decompression algorithm, which can be UDVM bytecode for the corresponding decompression algorithm, which
uploaded to the receiving endpoint as part of a SigComp message. can be uploaded to the receiving endpoint as part of a SigComp
message.
Section 4.1 covers a simple algorithm in some detail, including the Section 4.1 covers a simple algorithm in some detail, including the
steps required to compress and decompress a SigComp message. The steps required to compress and decompress a SigComp message. The
remaining sections cover well-known compression algorithms that can remaining sections cover well-known compression algorithms that can
be adapted for use in SigComp with minimal modification. be adapted for use in SigComp with minimal modification.
4.1. Simplified LZ77 4.1 Simplified LZ77
This section describes how to implement a very simple compression This section describes how to implement a very simple compression
algorithm based on LZ77 [LZ77]. algorithm based on LZ77 LZ77 [7].
A compressed message generated by the simplified LZ77 scheme consists A compressed message generated by the simplified LZ77 scheme consists
of a sequence of 4-byte characters, where each character contains a of a sequence of 4-byte characters, where each character contains a
2-byte position value followed by a 2-byte length value. Each pair of 2-byte position value followed by a 2-byte length value. Each pair
integers identifies a byte string in the UDVM memory; when of integers identifies a byte string in the UDVM memory; when
concatenated these byte strings form the decompressed message. concatenated these byte strings form the decompressed message.
When implementing a bytecode decompressor for the simplified LZ77 When implementing a bytecode decompressor for the simplified LZ77
scheme, the UDVM memory is partitioned into five distinct areas as scheme, the UDVM memory is partitioned into five distinct areas as
shown below: shown below:
0 64 128 256 512 0 64 128 256 512
| scratch-pad | variables | bytecode | dictionary | circular buffer | | scratch-pad | variables | bytecode | dictionary | circular buffer |
+-------------+-----------+----------+------------+-----------------+ +-------------+-----------+----------+------------+-----------------+
<-----------> <---------> <--------> <----------> <---------------> <-----------> <---------> <--------> <----------> <--------------->
64 bytes 64 bytes 128 bytes 256 bytes 512+ bytes 64 bytes 64 bytes 128 bytes 256 bytes 512+ bytes
The first 128 bytes are used to hold the 2-byte variables needed by The first 128 bytes are used to hold the 2-byte variables needed by
the LZ77 decompressor. Within this memory the first 64 bytes is used the LZ77 decompressor. Within this memory the first 64 bytes is used
as a scratch-pad, holding the 2-byte variables that can be discarded as a scratch-pad, holding the 2-byte variables that can be discarded
between SigComp messages. In contrast the next 64 bytes (and in fact between SigComp messages. In contrast the next 64 bytes (and in fact
all of the UDVM memory starting from Address 64) should be saved all of the UDVM memory starting from Address 64) should be saved
after decompressing a SigComp message to improve the compression after decompressing a SigComp message to improve the compression
ratio of subsequent messages. ratio of subsequent messages.
skipping to change at page 11, line 38 skipping to change at page 12, line 28
additional features to the decompressor at a later stage. additional features to the decompressor at a later stage.
The next 256 bytes are initialized by the bytecode to contain the The next 256 bytes are initialized by the bytecode to contain the
integers 0 to 255 inclusive. The purpose of this memory area is to integers 0 to 255 inclusive. The purpose of this memory area is to
provide a dictionary of all possible uncompressed characters; this is provide a dictionary of all possible uncompressed characters; this is
important to ensure that the compressor can always generate a important to ensure that the compressor can always generate a
sequence of position/length pairs that encode a given message. For sequence of position/length pairs that encode a given message. For
example, a byte with value 0x41 (corresponding to the ASCII character example, a byte with value 0x41 (corresponding to the ASCII character
"A") can be found at Address 0x0141 of the UDVM memory, so the "A") can be found at Address 0x0141 of the UDVM memory, so the
compressed character 0x0141 0001 will decompress to give this ASCII compressed character 0x0141 0001 will decompress to give this ASCII
character. Note that encoding each byte in the application message as character. Note that encoding each byte in the application message
a separate 4-byte compressed character is not recommended however, as as a separate 4-byte compressed character is not recommended however,
the resulting "compressed" message is four times as large as the as the resulting "compressed" message is four times as large as the
original uncompressed message. original uncompressed message.
The compression ratio of LZ77 is improved by the remaining UDVM The compression ratio of LZ77 is improved by the remaining UDVM
memory, which is used to store a history buffer containing the memory, which is used to store a history buffer containing the
previously decompressed messages. Compressed characters can point to previously decompressed messages. Compressed characters can point to
strings that have previously been decompressed and stored in the strings that have previously been decompressed and stored in the
buffer; so the overall compression ratio of the LZ77 algorithm buffer; so the overall compression ratio of the LZ77 algorithm
improves as the decompressor "learns" more text strings and is able improves as the decompressor "learns" more text strings and is able
to encode longer strings using a single compressed character. The to encode longer strings using a single compressed character. The
buffer is circular, so older messages are overwritten by new data buffer is circular, so older messages are overwritten by new data
when the buffer becomes full. when the buffer becomes full.
Note that the actual size of this circular buffer depends on the Note that the actual size of this circular buffer depends on the
total amount of memory available to the UDVM. The minimum size of the total amount of memory available to the UDVM. The minimum size of
the
UDVM memory is 1K, so the circular buffer will always contain at UDVM memory is 1K, so the circular buffer will always contain at
least 512 bytes. least 512 bytes.
The steps required to implement an LZ77 compressor and decompressor The steps required to implement an LZ77 compressor and decompressor
are similar, although compression is more processor-intensive as it are similar, although compression is more processor-intensive as it
requires a searching operation to be performed. Assembly for the requires a searching operation to be performed. Assembly for the
simplified LZ77 decompressor is given below: simplified LZ77 decompressor is given below:
; Variables that do not need to be stored after decompressing each ; Variables that do not need to be stored after decompressing each
; SigComp message are stored here: ; SigComp message are stored here:
skipping to change at page 14, line 42 skipping to change at page 15, line 46
+---+---+---+---+---+---+---+---+ +---+---+---+---+---+---+---+---+
| | | |
: remaining SigComp message : : remaining SigComp message :
| | | |
+---+---+---+---+---+---+---+---+ +---+---+---+---+---+---+---+---+
The partial_state_identifier field must contain the first 6 bytes of The partial_state_identifier field must contain the first 6 bytes of
the state identifier for the state item to be accessed (see [RFC- the state identifier for the state item to be accessed (see [RFC-
3320] for details of how state identifiers are derived). 3320] for details of how state identifiers are derived).
4.2. LZSS 4.2 LZSS
This section provides UDVM bytecode for the simple but effective LZSS This section provides UDVM bytecode for the simple but effective LZSS
compression algorithm [LZSS]. compression algorithm LZSS [8].
The principal improvement offered by LZSS over LZ77 is that each The principal improvement offered by LZSS over LZ77 is that each
compressed character begins with a 1-bit indicator flag to specify compressed character begins with a 1-bit indicator flag to specify
whether the character is a literal or an offset/length pair. A whether the character is a literal or an offset/length pair. A
literal value is simply a single uncompressed byte that is appended literal value is simply a single uncompressed byte that is appended
directly to the decompressed message. directly to the decompressed message.
An offset/length pair contains a 12-bit offset value from 1 to 4096 An offset/length pair contains a 12-bit offset value from 1 to 4096
inclusive, followed by a 4-bit length value from 3 to 18 inclusive. inclusive, followed by a 4-bit length value from 3 to 18 inclusive.
Taken together these values specify one of the previously received Taken together these values specify one of the previously received
skipping to change at page 16, line 24 skipping to change at page 17, line 40
returned_parameters_location, state_length, 64, returned_parameters_location, state_length, 64,
decompress_sigcomp_message, 6, 0) decompress_sigcomp_message, 6, 0)
:circular_buffer :circular_buffer
An example message compressed using the LZSS algorithm is given An example message compressed using the LZSS algorithm is given
below: below:
0x279a 0406 e378 b200 6074 1018 4ce6 1349 b842 0x279a 0406 e378 b200 6074 1018 4ce6 1349 b842
4.3. LZW 4.3 LZW
This section provides UDVM bytecode for the well-known LZW This section provides UDVM bytecode for the well-known LZW
compression algorithm [LZW]. This algorithm is used in a number of compression algorithm LZW [9]. This algorithm is used in a number of
standards including the GIF image format. standards including the GIF image format.
LZW compression operates in a similar manner to LZ77 in that it LZW compression operates in a similar manner to LZ77 in that it
maintains a circular buffer of previously received decompressed data, maintains a circular buffer of previously received decompressed data,
and each compressed character references exactly one byte string from and each compressed character references exactly one byte string from
the circular buffer. However, LZW also maintains a "codebook" the circular buffer. However, LZW also maintains a "codebook"
containing 1024 position/length pairs that point to byte strings containing 1024 position/length pairs that point to byte strings
which LZW believes are most likely to occur in the uncompressed data. which LZW believes are most likely to occur in the uncompressed data.
The byte strings stored in the LZW codebook can be referenced by The byte strings stored in the LZW codebook can be referenced by
skipping to change at page 18, line 28 skipping to change at page 20, line 4
; position/length pair: ; position/length pair:
OUTPUT ($position_value, $length_value) OUTPUT ($position_value, $length_value)
JUMP (next_character) JUMP (next_character)
:end_of_message :end_of_message
END-MESSAGE (requested_feedback_location, END-MESSAGE (requested_feedback_location,
returned_parameters_location, state_length, 64, returned_parameters_location, state_length, 64,
decompress_sigcomp_message, 6, 0) decompress_sigcomp_message, 6, 0)
:static_dictionary pad (256) :static_dictionary pad (256)
:circular_buffer :circular_buffer
at (4492) at (4492)
:codebook :codebook
An example message compressed using the LZW algorithm is given below: An example message compressed using the LZW algorithm is given below:
0x14c6 f080 6c1b c6e1 9c20 1846 e190 201d 0684 206b 1cc2 0198 6f1c 0x14c6 f080 6c1b c6e1 9c20 1846 e190 201d 0684 206b 1cc2 0198 6f1c
0x9071 b06c 42c6 8195 111a 4731 a021 02bf f0 0x9071 b06c 42c6 8195 111a 4731 a021 02bf f0
4.4. DEFLATE 4.4 DEFLATE
This section provides UDVM bytecode for the DEFLATE compression This section provides UDVM bytecode for the DEFLATE compression
algorithm. DEFLATE is the algorithm used in the well-known "gzip" algorithm. DEFLATE is the algorithm used in the well-known "gzip"
file format. file format.
The following bytecode will decompress the DEFLATE compressed data The following bytecode will decompress the DEFLATE compressed data
format [DEFLATE] with the following modifications: format DEFLATE [10] with the following modifications:
1. The DEFLATE compressed data format separates blocks of compressed 1. The DEFLATE compressed data format separates blocks of compressed
data by transmitting 7 consecutive zero bits. Each SigComp message data by transmitting 7 consecutive zero bits. Each SigComp
is assumed to contain a separate block of compressed data, so the message is assumed to contain a separate block of compressed
end-of-block bits are implicit and do not need to be transmitted data, so the end-of-block bits are implicit and do not need to be
at the end of a SigComp message. transmitted at the end of a SigComp message.
2. The bytecode supports only DEFLATE block type 01 (data compressed 2. The bytecode supports only DEFLATE block type 01 (data compressed
with fixed Huffman codes). with fixed Huffman codes).
Assembly for the DEFLATE decompressor is given below: Assembly for the DEFLATE decompressor is given below:
at (32) at (32)
:index pad (2) :index pad (2)
:extra_length_bits pad (2) :extra_length_bits pad (2)
:length_value pad (2) :length_value pad (2)
skipping to change at page 21, line 25 skipping to change at page 23, line 9
decompress_sigcomp_message, 6, 0) decompress_sigcomp_message, 6, 0)
:circular_buffer :circular_buffer
An example message compressed using the DEFLATE algorithm is given An example message compressed using the DEFLATE algorithm is given
below: below:
0xf3c9 4c4b d551 28c9 4855 08cd cb2c 4b2d 2a4e 5548 cc4b 5170 0532 0xf3c9 4c4b d551 28c9 4855 08cd cb2c 4b2d 2a4e 5548 cc4b 5170 0532
0x2b4b 3232 f3d2 b900 0000 00ff ff00 0x2b4b 3232 f3d2 b900 0000 00ff ff00
4.5. LZJH 4.5 LZJH
This section provides UDVM bytecode for the LZJH compression This section provides UDVM bytecode for the LZJH compression
algorithm. LZJH is the algorithm adopted by the International algorithm. LZJH is the algorithm adopted by the International
Telecommunication Union (ITU-T) Recommendation V.44 [LZJH]. Telecommunication Union (ITU-T) Recommendation V.44 LZJH [11].
Assembly for the LZJH decompressor is given below: Assembly for the LZJH decompressor is given below:
at (32) at (32)
; The following 2-byte variables are stored in the scratch-pad memory ; The following 2-byte variables are stored in the scratch-pad memory
; area because they do not need to be saved after decompressing a ; area because they do not need to be saved after decompressing a
; SigComp message: ; SigComp message:
:length_value pad (2) :length_value pad (2)
skipping to change at page 25, line 4 skipping to change at page 26, line 49
; The STEPUP control character increases the number of bits used to ; The STEPUP control character increases the number of bits used to
; encode an ordinal value or a codeword: ; encode an ordinal value or a codeword:
INPUT-BITS (1, index, !) INPUT-BITS (1, index, !)
COMPARE ($index, 1, stepup_ordinal, stepup_codeword, 0) COMPARE ($index, 1, stepup_ordinal, stepup_codeword, 0)
:stepup_ordinal :stepup_ordinal
ADD ($ordinal_length, 1) ADD ($ordinal_length, 1)
JUMP (ordinal) JUMP (ordinal)
:stepup_codeword
:stepup_codeword
ADD ($codeword_length, 1) ADD ($codeword_length, 1)
JUMP (codeword_control) JUMP (codeword_control)
:end_of_message :end_of_message
END-MESSAGE (requested_feedback_location, END-MESSAGE (requested_feedback_location,
returned_parameters_location, state_length, 64, returned_parameters_location, state_length, 64,
decompress_sigcomp_message, 6, 0) decompress_sigcomp_message, 6, 0)
:circular_buffer :circular_buffer
An example message compressed using the LZJH algorithm is given An example message compressed using the LZJH algorithm is given
below: below:
0x5c09 e6e0 cadc c8d2 dcce 40c2 40f2 cac2 e440 c825 c840 ccde 29e8 0x5c09 e6e0 cadc c8d2 dcce 40c2 40f2 cac2 e440 c825 c840 ccde 29e8
0xc2f0 40e0 eae4 e0de e6ca e65c 1403 0xc2f0 40e0 eae4 e0de e6ca e65c 1403
4.6. EPIC 4.6 EPIC
This section provides bytecode for a version of the Efficient This section provides bytecode for a version of the Efficient
Protocol Independent Compression (EPIC) scheme. Protocol Independent Compression (EPIC) scheme.
The basic EPIC scheme is designed to compress protocol headers such The basic EPIC scheme is designed to compress protocol headers such
as TCP/IP, but the underlying algorithm (known as Hierarchical as TCP/IP, but the underlying algorithm (known as Hierarchical
Huffman) can be applied to the compression of arbitrary data. In Huffman) can be applied to the compression of arbitrary data. In
particular the compression algorithm used by EPIC obtains a very high particular the compression algorithm used by EPIC obtains a very high
compression ratio on data with a known structure, so it is ideally compression ratio on data with a known structure, so it is ideally
suited for compressing the messages generated by SIP or other suited for compressing the messages generated by SIP or other
skipping to change at page 26, line 4 skipping to change at page 28, line 7
to cope with generic data by adding the ability to store a circular to cope with generic data by adding the ability to store a circular
buffer of previously received text strings as per LZ77 or DEFLATE. buffer of previously received text strings as per LZ77 or DEFLATE.
The resulting hybrid algorithm offers the best of both worlds: a very The resulting hybrid algorithm offers the best of both worlds: a very
high compression ratio for the "well-behaved" parts of the high compression ratio for the "well-behaved" parts of the
application message, and a good compression ratio even for the application message, and a good compression ratio even for the
portions of the message that cannot be pre-programmed into the portions of the message that cannot be pre-programmed into the
compression algorithm. compression algorithm.
The following bytecode implements a decompressor for a hybrid of EPIC The following bytecode implements a decompressor for a hybrid of EPIC
and DEFLATE. The tables of compressed characters are generated using and DEFLATE. The tables of compressed characters are generated using
the Hierarchical Huffman algorithm from EPIC, and are designed to the Hierarchical Huffman algorithm from EPIC, and are designed to
give a very high compression ratio for a typical SIP/SDP message. The give a very high compression ratio for a typical SIP/SDP message.
ability to store and retrieve text strings from a buffer of The ability to store and retrieve text strings from a buffer of
previously received messages is taken from DEFLATE. previously received messages is taken from DEFLATE.
To illustrate the performance of the hybrid algorithm, the following To illustrate the performance of the hybrid algorithm, the following
results have been generated for the call flow in Section 3.1.2 of results have been generated for the call flow in Section 3.2.1 of
"SIP Call Flow Examples" [FLOWS]. Note that to improve the overall "SIP Call Flow Examples" FLOWS [1]. Note that to improve the overall
compression ratio, all algorithms employ a static dictionary (see compression ratio, all algorithms employ a static dictionary (see
Section 5.2) and the shared compression mechanism (see Section 5.5): Section 5.2) and the shared compression mechanism (see Section 5.5):
Algorithm: Total compressed message size: Algorithm: Total compressed message size:
DEFLATE with static Huffman codes 660 bytes DEFLATE with static Huffman codes 660 bytes
DEFLATE with adaptive Huffman codes 625 bytes DEFLATE with adaptive Huffman codes 625 bytes
EPIC 560 bytes EPIC 560 bytes
Assembly for the EPIC algorithm is given below. A compressor to Assembly for the EPIC algorithm is given below. A compressor to
skipping to change at page 29, line 26 skipping to change at page 31, line 43
item indicating whether the state can safely be accessed or not. item indicating whether the state can safely be accessed or not.
State items should not be accessed until they have been acknowledged State items should not be accessed until they have been acknowledged
(e.g. by using the SigComp feedback mechanism as per Section 5.1). (e.g. by using the SigComp feedback mechanism as per Section 5.1).
State items are deleted from the list when the total State items are deleted from the list when the total
state_memory_size for the compartment is used up by states of a state_memory_size for the compartment is used up by states of a
higher priority. The SigComp compressor should not attempt to access higher priority. The SigComp compressor should not attempt to access
any state items that have been deleted in this manner, as they may no any state items that have been deleted in this manner, as they may no
longer be available at the receiving endpoint. longer be available at the receiving endpoint.
5.1. Acknowledging a state item 5.1 Acknowledging a state item
The simplest method for acknowledging a SigComp state item is to The simplest method for acknowledging a SigComp state item is to
employ a reliable transport layer such as TCP or SCTP. Alternatively, employ a reliable transport layer such as TCP or SCTP.
over an unreliable transport such as UDP the SigComp feedback Alternatively, over an unreliable transport such as UDP the SigComp
mechanism can be used to acknowledge that a state item has been feedback mechanism can be used to acknowledge that a state item has
successfully created at the receiving endpoint. been successfully created at the receiving endpoint.
As explained in SigComp [RFC-3320], in order to invoke the feedback As explained in SigComp RFC-3320 [4], in order to invoke the feedback
mechanism the following fields must be reserved in the UDVM memory: mechanism the following fields must be reserved in the UDVM memory:
0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
+---+---+---+---+---+---+---+---+ +---+---+---+---+---+---+---+---+
| reserved | Q | S | I | requested_feedback_location | reserved | Q | S | I | requested_feedback_location
+---+---+---+---+---+---+---+---+ +---+---+---+---+---+---+---+---+
| 1 | requested_feedback_length | if Q = 1 | 1 | requested_feedback_length | if Q = 1
+---+---+---+---+---+---+---+---+ +---+---+---+---+---+---+---+---+
| | | |
: requested_feedback_field : if Q = 1 : requested_feedback_field : if Q = 1
skipping to change at page 30, line 28 skipping to change at page 33, line 8
decompress_sigcomp_message, 6) decompress_sigcomp_message, 6)
SHA-1 (hash_start, hash_length, requested_feedback_field) SHA-1 (hash_start, hash_length, requested_feedback_field)
The receiving endpoint then returns the state identifier in the The receiving endpoint then returns the state identifier in the
"returned feedback field" of the next SigComp message to be "returned feedback field" of the next SigComp message to be
transmitted in the reverse direction. transmitted in the reverse direction.
When the state identifier is returned, the compressor can set the When the state identifier is returned, the compressor can set the
availability flag for the corresponding state to 1. availability flag for the corresponding state to 1.
5.2. Static dictionary 5.2 Static dictionary
Certain protocols that can be compressed using SigComp offer a fixed, Certain protocols that can be compressed using SigComp offer a fixed,
mandatory state item known as a static dictionary. This dictionary mandatory state item known as a static dictionary. This dictionary
contains a number of text strings that commonly occur in messages contains a number of text strings that commonly occur in messages
generated by the protocol in question. The overall compression ratio generated by the protocol in question. The overall compression ratio
can often be improved by accessing the text phrases from this static can often be improved by accessing the text phrases from this static
dictionary rather than by uploading them as part of the compressed dictionary rather than by uploading them as part of the compressed
message. message.
As an example, a static dictionary is provided for the protocols SIP As an example, a static dictionary is provided for the protocols SIP
and SDP [RFC-3485]. This dictionary is designed for use by a wide and SDP RFC-3485 [6]. This dictionary is designed for use by a wide
range of compression algorithms including all of the ones covered in range of compression algorithms including all of the ones covered in
Chapter 4. Chapter 4.
In any of the compression algorithms of Chapter 4, the static In any of the compression algorithms of Chapter 4, the static
dictionary can be accessed by inserting the following instruction dictionary can be accessed by inserting the following instruction
immediately after the ":initialize_memory" label: immediately after the ":initialize_memory" label:
STATE-ACCESS (dictionary_id, 6, 0, 0, 1024, 0) STATE-ACCESS (dictionary_id, 6, 0, 0, 1024, 0)
The following lines should also be inserted immediately after the The following lines should also be inserted immediately after the
skipping to change at page 31, line 4 skipping to change at page 33, line 35
immediately after the ":initialize_memory" label: immediately after the ":initialize_memory" label:
STATE-ACCESS (dictionary_id, 6, 0, 0, 1024, 0) STATE-ACCESS (dictionary_id, 6, 0, 0, 1024, 0)
The following lines should also be inserted immediately after the The following lines should also be inserted immediately after the
END-MESSAGE instruction: END-MESSAGE instruction:
:dictionary_id :dictionary_id
byte (0xfb, 0xe5, 0x07, 0xdf, 0xe5, 0xe6) byte (0xfb, 0xe5, 0x07, 0xdf, 0xe5, 0xe6)
The text strings contained in the static dictionary can then be The text strings contained in the static dictionary can then be
accessed in exactly the same manner as the text strings from accessed in exactly the same manner as the text strings from
previously decompressed messages (see Section 5.1 for further previously decompressed messages (see Section 5.1 for further
details). details).
Note that in some cases it is sufficient to only load part of the Note that in some cases it is sufficient to only load part of the
static dictionary into the UDVM memory. Further information on the static dictionary into the UDVM memory. Further information on the
contents of the SIP and SDP static dictionary can be found in the contents of the SIP and SDP static dictionary can be found in the
relevant document [RFC-3485]. relevant document RFC-3485 [6].
5.3. CRC checksum 5.3 CRC checksum
Whilst the acknowledgement scheme of Section 5.1 is designed to Whilst the acknowledgement scheme of Section 5.1 is designed to
ensure that SigComp does not propagate errors introduced by the ensure that SigComp does not propagate errors introduced by the
underlying transport layer, in some cases it may be useful to add an underlying transport layer, in some cases it may be useful to add an
extra CRC check over the UDVM memory. For example, if the transport extra CRC check over the UDVM memory. For example, if the transport
layer fails to discard a damaged SigComp message then a CRC check can layer fails to discard a damaged SigComp message then a CRC check can
ensure that the corresponding decompressed message is not forwarded ensure that the corresponding decompressed message is not forwarded
to the application. to the application.
If an additional CRC check is required then the following bytecode If an additional CRC check is required then the following bytecode
can be inserted after the ":end_of_message" label: can be inserted after the ":end_of_message" label:
INPUT-BYTES (2, index, !) INPUT-BYTES (2, index, !)
CRC ($index, 64, state_length, !) CRC ($index, 64, state_length, !)
The bytecode extracts a 2-byte CRC from the end of the SigComp The bytecode extracts a 2-byte CRC from the end of the SigComp
message and compares it with a CRC calculated over the UDVM memory. message and compares it with a CRC calculated over the UDVM memory.
Decompression failure occurs if the two CRC values do not match. Decompression failure occurs if the two CRC values do not match.
A definition of the CRC polynomial used by the CRC instruction can be A definition of the CRC polynomial used by the CRC instruction can be
found in SigComp [RFC-3320]. found in SigComp RFC-3320 [4].
5.4. Announcing additional resources 5.4 Announcing additional resources
If a particular endpoint is able to offer more processing or memory If a particular endpoint is able to offer more processing or memory
resources than the mandatory minimum, the SigComp feedback mechanism resources than the mandatory minimum, the SigComp feedback mechanism
can be used to announce that these resources are available to the can be used to announce that these resources are available to the
remote endpoint. This may help to improve the overall compression remote endpoint. This may help to improve the overall compression
ratio between the two endpoints. ratio between the two endpoints.
The values of the following SigComp parameters can be announced using The values of the following SigComp parameters can be announced using
the SigComp feedback mechanism: the SigComp feedback mechanism:
cycles_per_bit cycles_per_bit decompression_memory_size state_memory_size
decompression_memory_size
state_memory_size
SigComp_version SigComp_version
As explained in SigComp, in order to announce the values of these As explained in SigComp, in order to announce the values of these
parameters the following fields must be reserved in the UDVM memory: parameters the following fields must be reserved in the UDVM memory:
0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
+---+---+---+---+---+---+---+---+ +---+---+---+---+---+---+---+---+
| cpb | dms | sms | returned_parameters_location | cpb | dms | sms | returned_parameters_location
+---+---+---+---+---+---+---+---+ +---+---+---+---+---+---+---+---+
| SigComp_version | | SigComp_version |
skipping to change at page 32, line 28 skipping to change at page 35, line 18
When a SigComp message is successfully decompressed and saved as When a SigComp message is successfully decompressed and saved as
state, the following bytecode announces to the receiving endpoint state, the following bytecode announces to the receiving endpoint
that additional resources are available at the sending endpoint: that additional resources are available at the sending endpoint:
:end_of_message :end_of_message
LOAD (returned_parameters_location, N) LOAD (returned_parameters_location, N)
Note that the integer value "N" should be set equal to the amount of Note that the integer value "N" should be set equal to the amount of
resources available at the sending endpoint. N should be expressed as resources available at the sending endpoint. N should be expressed
a 2-byte integer with the most significant bits corresponding to the as a 2-byte integer with the most significant bits corresponding to
cycles_per_bit parameter and the least significant bits corresponding the cycles_per_bit parameter and the least significant bits
to the SigComp_version parameter. corresponding to the SigComp_version parameter.
5.5. Shared compression 5.5 Shared compression
This section provides bytecode for implementing the SigComp shared This section provides bytecode for implementing the SigComp shared
compression mechanism [RFC-3321]. If two endpoints A and B are compression mechanism RFC-3321 [5]. If two endpoints A and B are
communicating via SigComp, shared compression allows the messages communicating via SigComp, shared compression allows the messages
sent from Endpoint A to Endpoint B to be compressed relative to the sent from Endpoint A to Endpoint B to be compressed relative to the
messages sent from Endpoint B to Endpoint A (and vice versa). This messages sent from Endpoint B to Endpoint A (and vice versa). This
may improve the overall compression ratio by reducing the need to may improve the overall compression ratio by reducing the need to
transmit the same information in both directions. transmit the same information in both directions.
As described in [RFC-3321], two steps must be taken to implement As described in RFC-3321 [5], two steps must be taken to implement
shared compression at an endpoint. Firstly, it is necessary to shared compression at an endpoint. Firstly, it is necessary to
announce to the remote endpoint that shared compression is available. announce to the remote endpoint that shared compression is available.
Secondly, assuming that such an announcement is received from the Secondly, assuming that such an announcement is received from the
remote endpoint, then the state created by shared compression must be remote endpoint, then the state created by shared compression must be
accessed to improve the overall compression ratio. accessed to improve the overall compression ratio.
In order to announce that shared compression is available the In order to announce that shared compression is available the
following fields must be reserved in the UDVM memory: following fields must be reserved in the UDVM memory:
0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
skipping to change at page 35, line 8 skipping to change at page 38, line 18
LOAD ($decompressed_start, $decompressed_length) LOAD ($decompressed_start, $decompressed_length)
SHA-1 ($decompressed_start, $shared_hash_length, SHA-1 ($decompressed_start, $shared_hash_length,
$announcement_location) $announcement_location)
:end :end
6. Security considerations 6. Security considerations
This draft describes implementation options for the SigComp protocol This draft describes implementation options for the SigComp protocol
[RFC-3320]. Consequently the security considerations for this draft RFC-3320 [4]. Consequently the security considerations for this
match those of SigComp. draft match those of SigComp.
7. Acknowledgements 7. Acknowledgements
Thanks to Thanks to
Carsten Bormann Carsten Bormann
Adam Roach Adam Roach
Lawrence Conroy Lawrence Conroy
Christian Schmidt Christian Schmidt
Max Riegel Max Riegel
Lars-Erik Jonsson Lars-Erik Jonsson
Jonathan Rosenberg Jonathan Rosenberg
Stefan Forsgren Stefan Forsgren
Krister Svanbro Krister Svanbro
Miguel Garcia Miguel Garcia
skipping to change at page 35, line 32 skipping to change at page 38, line 41
Stefan Forsgren Stefan Forsgren
Krister Svanbro Krister Svanbro
Miguel Garcia Miguel Garcia
Christopher Clanton Christopher Clanton
Khiem Le Khiem Le
Ka Cheong Leung Ka Cheong Leung
Zoltan Barczikay Zoltan Barczikay
for valuable input and review. for valuable input and review.
8. Authors' addresses 8. Intellectual Property Right Considerations
Richard Price Tel: +44 1794 833681
Email: richard.price@roke.co.uk
Abigail Surtees Tel: +44 1794 833131
Email: abigail.surtees@roke.co.uk
Mark A West Tel: +44 1794 833311
Email: mark.a.west@roke.co.uk
Roke Manor Research Ltd
Romsey, Hants, SO51 0ZN
United Kingdom
9. Intellectual Property Right Considerations
The IETF has been notified of intellectual property rights claimed in The IETF has been notified of intellectual property rights claimed in
regard to some or all of the specification contained in this regard to some or all of the specification contained in this
document. For more information consult the online list of claimed document. For more information consult the online list of claimed
rights. rights.
10. References 9 References
[FLOWS] "Session Initiation Protocol Basic Call Flow Examples", [1] Johnston, A., Donovan, S., Sparks, R., Cunningham, C. and K.
A. Johnston et al, <draft-ietf-sipping-basic-call-flows- Summers, "Session Initiation Protocol (SIP) Basic Call Flow
02.txt>, April 2003 Examples", RFC 3665, December 2003.
[RFC-2026] "The Internet Standards Process - Revision 3", Scott [2] Bradner, S., "The Internet Standards Process -- Revision 3",
Bradner, RFC 2026, Internet Engineering Task Force, RFC 3667, February 2004.
October 1996
[RFC-2119] "Key words for use in RFCs to Indicate Requirement [3] Crocker, D. and P. Overell, "Augmented BNF for Syntax
Levels", Scott Bradner, RFC 2119, Internet Engineering Specifications: ABNF", RFC 2234, November 1997.
Task Force, March 1997
[RFC-2234] "Augmented BNF for Syntax Specifications: ABNF", [4] Price, R., Borman, C., Christoffersson, J., Hannu, H., Liu, Z.
D. Crocker and P. Overell, RFC 2234, Internet Engineering and J. Rosenberg, "Signaling Compression (SigComp)", RFC 3320,
Task Force, November 1997 January 2003.
[RFC-2326] "Real Time Streaming Protocol (RTSP)", H. Schulzrinne, A. [5] Hannu, H., Christoffersson, J., Forsgren, S., Leung, K., Liu,
Rao and R. Lanphier, RFC 2326, Internet Engineering Task Z. and R. Price, "Signaling Compression (SigComp)", RFC 3321,
Force, April 1998 January 2003.
[RFC-3261] "SIP: Session Initiation Protocol", J. Rosenberg et al, [6] Garcia-Martin, M., Borman, C., Ott, J., Price, R. and A. Roach,
RFC 3261, Internet Engineering Task Force, June 2002 "The Session Initiation Protocol (SIP) and Session Description
Protocol (SDP) Statc Dictionary for Signaling Compression
(SigComp)", RFC 3485, February 2003.
[RFC-3320] "Signaling Compression (SigComp)", R. Price et al, RFC [7] Ziv, J. and A. Lempel, "A universal algorithm for sequential
3320, Internet Engineering Task Force, January 2003 data compression", IEEE 23:337-343, 1977.
[RFC-3321] "SigComp - Extended Operations", Hannu et al, RFC 3321, [8] Storer, J., "Data Compression: Methods and Theory", Computer
Internet Engineering Task Force, January 2003 Science Press ISBN 0-88175-161-8, 1998.
[RFC-3485] "The Session Initiation Protocol (SIP) and Session [9] Nelson, M., "LZW Data Compression", Dr Dobb's Journal, October
Description Protocol (SDP) Static Dictionary for 1989.
Signaling Compression (SigComp)", M. Garcia-Martin et al,
RFC 3485, Internet Engineering Task Force, February 2003
[LZ77] "A universal algorithm for sequential data compression", [10] Deutsch, P., "DEFLATE Compressed Data Format Specification
J. Ziv and A. Lempel, IEEE 23:337-343, 1977 version 1.3", RFC 1951, May 1996.
[LZSS] "Data Compression: Methods and Theory", J. Storer, [11] "Data Compression Procedures", ITU-T Recommendation V.44,
Computer Science Press, ISBN 0-88175-161-8, 1988 November 2000.
[LZW] "LZW Data Compression", Mark Nelson, Dr. Dobb's Authors' Addresses
Journal, October 1989
[DEFLATE] "DEFLATE Compressed Data Format Specification version Abigail Surtees
1.3", P. Deutsch, RFC 1951, Internet Engineering Task Siemens/Roke Manor Research
Force, May 1996 Roke Manor Research Ltd.
Romsey, Hants SO51 0ZN
UK
[LZJH] "Data Compression Procedures", ITU-T Recommendation V.44, Phone: +44 (0)1794 833131
November 2000 EMail: abigail.surtees@roke.co.uk
URI: http://www.roke.co.uk
Appendix A: UDVM bytecode for the compression algorithms Mark A. West
Siemens/Roke Manor Research
Roke Manor Research Ltd.
Romsey, Hants SO51 0ZN
UK
Phone: +44 (0)1794 833311
EMail: mark.a.west@roke.co.uk
URI: http://www.roke.co.uk
Appendix A. UDVM bytecode for the compression algorithms
The following sections list the UDVM bytecode generated for each The following sections list the UDVM bytecode generated for each
compression algorithm of Chapter 4. compression algorithm of Section 4.
Note that the different assemblers can output different bytecode for Note that the different assemblers can output different bytecode for
the same piece of assembly code, so a valid assembler can produce the same piece of assembly code, so a valid assembler can produce
results different from those presented below. However, the following results different from those presented below. However, the following
bytecode should always generate the same decompressed messages on any bytecode should always generate the same decompressed messages on any
UDVM. UDVM.
A.1. Simplified LZ77 A.1 Simplified LZ77
0x0f86 0389 8d89 1588 8800 011c 0420 0d13 5051 2222 5051 16f5 2300 0x0f86 0389 8d89 1588 8800 011c 0420 0d13 5051 2222 5051 16f5 2300
0x00bf c086 a08b 06 0x00bf c086 a08b 06
A.2. LZSS A.2 LZSS
0x0f86 04a0 c48d 00a0 c41e 2031 0209 00a0 ff8e 048c bfff 0117 508d 0x0f86 04a0 c48d 00a0 c41e 2031 0209 00a0 ff8e 048c bfff 0117 508d
0x0f23 0622 2101 1321 0123 16e5 1d04 22e8 0611 030e 2463 1450 5123 0x0f23 0622 2101 1321 0123 16e5 1d04 22e8 0611 030e 2463 1450 5123
0x2252 5116 9fd2 2300 00bf c086 a089 06 0x2252 5116 9fd2 2300 00bf c086 a089 06
A.3. LZW A.3 LZW
0x0f86 06a1 ce8d 00b1 8f01 a0ce 13a0 4903 2313 2501 2506 1201 1752 0x0f86 06a1 ce8d 00b1 8f01 a0ce 13a0 4903 2313 2501 2506 1201 1752
0x88f4 079f 681d 0a24 2508 1203 0612 b18f 1252 0321 0ea0 4801 0624 0x88f4 079f 681d 0a24 2508 1203 0612 b18f 1252 0321 0ea0 4801 0624
0x5013 a049 0323 1351 5025 2251 5016 9fde 2300 00bf c086 a09f 06 0x5013 a049 0323 1351 5025 2251 5016 9fde 2300 00bf c086 a09f 06
A.4. DEFLATE A.4 DEFLATE
0x0f86 7aa2 528d 05a2 5200 0300 0400 0500 0600 0700 0800 0900 0a01 0x0f86 7aa2 528d 05a2 5200 0300 0400 0500 0600 0700 0800 0900 0a01
0x0b01 0d01 0f01 1102 1302 1702 1b02 1f03 2303 2b03 3303 3b04 a043 0x0b01 0d01 0f01 1102 1302 1702 1b02 1f03 2303 2b03 3303 3b04 a043
0x04a0 5304 a063 04a0 7305 a083 05a0 a305 a0c3 05a0 e300 a102 0001 0x04a0 5304 a063 04a0 7305 a083 05a0 a305 a0c3 05a0 e300 a102 0001
0x0002 0003 0004 0105 0107 0209 020d 0311 0319 0421 0431 05a0 4105 0x0002 0003 0004 0105 0107 0209 020d 0311 0319 0421 0431 05a0 4105
0xa061 06a0 8106 a0c1 07a1 0107 a181 08a2 0108 a301 09a4 0109 a601 0xa061 06a0 8106 a0c1 07a1 0107 a181 08a2 0108 a301 09a4 0109 a601
0x0aa8 010a ac01 0bb0 010b b801 0c80 2001 0c80 3001 0d80 4001 0d80 0x0aa8 010a ac01 0bb0 010b b801 0c80 2001 0c80 3001 0d80 4001 0d80
0x6001 1d03 229f b41e 20a0 6504 0700 1780 4011 0130 a0bf 0000 a0c0 0x6001 1d03 229f b41e 20a0 6504 0700 1780 4011 0130 a0bf 0000 a0c0
0xa0c7 8040 2901 a190 a1ff a090 1750 8040 1109 a046 1322 2101 1321 0xa0c7 8040 2901 a190 a1ff a090 1750 8040 1109 a046 1322 2101 1321
0x0123 169f d108 1004 1250 0422 1d51 229f d706 1251 1e20 9fcf 0105 0x0123 169f d108 1004 1250 0422 1d51 229f d706 1251 1e20 9fcf 0105
0x001f 2f08 1004 1250 0426 1d53 26f6 0614 530e 2063 1454 5223 2250 0x001f 2f08 1004 1250 0426 1d53 26f6 0614 530e 2063 1454 5223 2250
0x5216 9f9e 2300 00bf c086 a1de 06 0x5216 9f9e 2300 00bf c086 a1de 06
A.5. LZJH A.5 LZJH
0x0f86 08a1 5b8d 0700 a15b 0706 b18f 1d01 24a0 c317 5201 1a31 311e 0x0f86 08a1 5b8d 0700 a15b 0706 b18f 1d01 24a0 c317 5201 1a31 311e
0x24a0 b802 0101 0102 0100 0100 1752 0107 a04e 1e1d 6524 f822 2501 0x24a0 b802 0101 0102 0100 0100 1752 0107 a04e 1e1d 6524 f822 2501
0x0ea0 4602 13a0 4703 2713 2501 2416 9fcd 1d66 24e1 1752 03a0 639f 0x0ea0 4602 13a0 4703 2713 2501 2416 9fcd 1d66 24e1 1752 03a0 639f
0xb808 0812 0306 12b1 8312 5203 210e a046 0106 2350 0e28 6713 a047 0xb808 0812 0306 12b1 8312 5203 210e a046 0106 2350 0e28 6713 a047
0x0327 1351 5024 2251 5016 9fa8 1e24 9fb1 0401 0101 0102 0103 0201 0x0327 1351 5024 2251 5016 9fa8 1e24 9fb1 0401 0101 0102 0103 0201
0x0101 0d03 0007 0517 520d 0d06 061d 0826 f706 1253 1351 5011 1351 0x0101 0d03 0007 0517 520d 0d06 061d 0826 f706 1253 1351 5011 1351
0x5224 2251 5206 1250 1225 0154 169f 6617 5201 9fdb 070f 1c00 009e 0x5224 2251 5206 1250 1225 0154 169f 6617 5201 9fdb 070f 1c00 009e
0xce16 9f57 1d01 24fa 1752 0107 0d9e c206 2501 169f 6506 2601 169f 0xce16 9f57 1d01 24fa 1752 0107 0d9e c206 2501 169f 6506 2601 169f
0x7623 0000 bfc0 86a0 8e06 0x7623 0000 bfc0 86a0 8e06
A.6. EPIC A.6 EPIC
0x0f86 04a1 d38d 00a1 d31e 20a1 4010 0500 0b2e 000c 0c88 011a 20a1 0x0f86 04a1 d38d 00a1 d31e 20a1 4010 0500 0b2e 000c 0c88 011a 20a1
0x0101 a042 a044 2000 a045 a05e a061 00a0 5fa0 66a1 0800 a067 a067 0x0101 a042 a044 2000 a045 a05e a061 00a0 5fa0 66a1 0800 a067 a067
0xa1ff 02a1 a0a1 aa23 00a1 aba1 d13a 00a1 d2a1 e1a1 1001 a3c4 a3e3 0xa1ff 02a1 a0a1 aa23 00a1 aba1 d13a 00a1 d2a1 e1a1 1001 a3c4 a3e3
0xa120 03bf 20bf 34a0 7b00 bf35 bfb3 a180 0180 3f68 803f 8700 0080 0xa120 03bf 20bf 34a0 7b00 bf35 bfb3 a180 0180 3f68 803f 8700 0080
0x3f88 803f c7a1 4001 807f 9080 7fff a090 1750 88a0 79a0 83a0 831e 0x3f88 803f c7a1 4001 807f 9080 7fff a090 1750 88a0 79a0 83a0 831e
0x20a0 c810 0400 00a1 ff01 0209 8801 1416 2000 171e a108 013e a049 0x20a0 c810 0400 00a1 ff01 0209 8801 1416 2000 171e a108 013e a049
0x2e00 a04a a059 a110 02a1 68a1 81a0 6100 a182 a1a1 a120 01a3 44a3 0x2e00 a04a a059 a110 02a1 68a1 81a0 6100 a182 a1a1 a120 01a3 44a3
0x6a3a 00a3 6ba3 aaa1 4001 a756 a760 2300 a761 a7df a180 01af c0af 0x6a3a 00a3 6ba3 aaa1 4001 a756 a760 2300 a761 a7df a180 01af c0af
0xd4a0 7b01 bfaa bfc9 0001 803f 9480 3ffb a090 0180 7ff8 807f ffa0 0xd4a0 7b01 bfaa bfc9 0001 803f 9480 3ffb a090 0180 7ff8 807f ffa0
0xf817 5088 0610 1022 2101 1321 0123 169f 1107 10a0 fd1e 229f d909 0xf817 5088 0610 1022 2101 1321 0123 169f 1107 10a0 fd1e 229f d909
0x0900 0709 0008 3fa0 8101 87a0 8701 00a0 88a0 f711 00a0 f8a1 3fa0 0x0900 0709 0008 3fa0 8101 87a0 8701 00a0 88a0 f711 00a0 f8a1 3fa0
0xb901 a280 a57f a101 02b6 00b9 ffa4 0101 8034 0080 3bff a801 0290 0xb901 a280 a57f a101 02b6 00b9 ffa4 0101 8034 0080 3bff a801 0290
0x00ff b001 0e24 6314 5150 2322 5250 169f 3b23 0000 bfc0 86a0 8906 0x00ff b001 0e24 6314 5150 2322 5250 169f 3b23 0000 bfc0 86a0 8906
Intellectual Property Statement
The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.
Disclaimer of Validity
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Copyright Statement
Copyright (C) The Internet Society (2005). This document is subject
to the rights, licenses and restrictions contained in BCP 78, and
except as set forth therein, the authors retain all their rights.
Acknowledgment
Funding for the RFC Editor function is currently provided by the
Internet Society.
This Internet-Draft will expire on August 19, 2005.
 End of changes. 

This html diff was produced by rfcdiff 1.23, available from http://www.levkowetz.com/ietf/tools/rfcdiff/