--- 1/draft-ietf-httpbis-header-compression-02.txt 2013-08-27 16:14:22.723220115 -0700 +++ 2/draft-ietf-httpbis-header-compression-03.txt 2013-08-27 16:14:22.767221277 -0700 @@ -1,19 +1,19 @@ HTTPbis Working Group R. Peon Internet-Draft Google, Inc Intended status: Informational H. Ruellan -Expires: February 22, 2014 Canon CRF - August 21, 2013 +Expires: February 28, 2014 Canon CRF + August 27, 2013 - HPACK - draft-ietf-httpbis-header-compression-02 + HPACK - Header Compression for HTTP/2.0 + draft-ietf-httpbis-header-compression-03 Abstract This document describes HPACK, a format adapted to efficiently represent HTTP headers in the context of HTTP/2.0. Status of This Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. @@ -21,21 +21,21 @@ Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." - This Internet-Draft will expire on February 22, 2014. + This Internet-Draft will expire on February 28, 2014. Copyright Notice Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents @@ -43,92 +43,87 @@ to this document. 
Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 2. Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 2.1. Outline . . . . . . . . . . . . . . . . . . . . . . . . . 3 - 3. Header Encoding . . . . . . . . . . . . . . . . . . . . . . . 4 - 3.1. Encoding Concepts . . . . . . . . . . . . . . . . . . . . 4 + 3. Header Encoding . . . . . . . . . . . . . . . . . . . . . . . 3 + 3.1. Encoding Concepts . . . . . . . . . . . . . . . . . . . . 3 3.1.1. Encoding Context . . . . . . . . . . . . . . . . . . . 4 3.1.2. Header Table . . . . . . . . . . . . . . . . . . . . . 4 3.1.3. Reference Set . . . . . . . . . . . . . . . . . . . . 5 3.1.4. Header set . . . . . . . . . . . . . . . . . . . . . . 6 3.1.5. Header Representation . . . . . . . . . . . . . . . . 6 3.1.6. Header Emission . . . . . . . . . . . . . . . . . . . 6 3.2. Header Set Processing . . . . . . . . . . . . . . . . . . 7 3.2.1. Header Representation Processing . . . . . . . . . . . 7 - 3.2.2. Reference Set Emission . . . . . . . . . . . . . . . . 8 + 3.2.2. Reference Set Emission . . . . . . . . . . . . . . . . 7 3.2.3. Header Set Completion . . . . . . . . . . . . . . . . 8 3.2.4. Header Table Management . . . . . . . . . . . . . . . 8 - 3.2.5. Specific Use Cases . . . . . . . . . . . . . . . . . . 8 - 4. Detailed Format . . . . . . . . . . . . . . . . . . . . . . . 9 - 4.1. Low-level representations . . . . . . . . . . . . . . . . 9 - 4.1.1. Integer representation . . . . . . . . . . . . . . . . 9 - 4.1.2. Header Name Representation . . . . . . . . . . . . . . 11 + 4. Detailed Format . . . . . . . . . . . . . . . . . . . . . . . 8 + 4.1. Low-level representations . . . . . . . . . . . . . . . . 8 + 4.1.1. Integer representation . . . . . . . . 
. . . . . . . . 8 + 4.1.2. Header Name Representation . . . . . . . . . . . . . . 10 4.1.3. Header Value Representation . . . . . . . . . . . . . 11 4.2. Indexed Header Representation . . . . . . . . . . . . . . 11 - 4.3. Literal Header Representation . . . . . . . . . . . . . . 12 - 4.3.1. Literal Header without Indexing . . . . . . . . . . . 12 - 4.3.2. Literal Header with Incremental Indexing . . . . . . . 13 - 4.3.3. Literal Header with Substitution Indexing . . . . . . 14 + 4.3. Literal Header Representation . . . . . . . . . . . . . . 11 + 4.3.1. Literal Header without Indexing . . . . . . . . . . . 11 + 4.3.2. Literal Header with Incremental Indexing . . . . . . . 12 + 4.3.3. Literal Header with Substitution Indexing . . . . . . 13 5. Parameter Negotiation . . . . . . . . . . . . . . . . . . . . 15 6. Security Considerations . . . . . . . . . . . . . . . . . . . 15 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 16 - 8. Informative References . . . . . . . . . . . . . . . . . . . . 16 + 8. References . . . . . . . . . . . . . . . . . . . . . . . . . . 16 + 8.1. Normative References . . . . . . . . . . . . . . . . . . . 16 + 8.2. Informative References . . . . . . . . . . . . . . . . . . 16 Appendix A. Change Log (to be removed by RFC Editor before - publication . . . . . . . . . . . . . . . . . . . . . 16 - A.1. Since draft-ietf-httpbis-header-compression-01 . . . . . . 16 - Appendix B. Initial Header Tables . . . . . . . . . . . . . . . . 17 - B.1. Requests . . . . . . . . . . . . . . . . . . . . . . . . . 17 - B.2. Responses . . . . . . . . . . . . . . . . . . . . . . . . 18 - Appendix C. Example . . . . . . . . . . . . . . . . . . . . . . . 19 - C.1. First header set . . . . . . . . . . . . . . . . . . . . . 19 - C.2. Second header set . . . . . . . . . . . . . . . . . . . . 21 + publication . . . . . . . . . . . . . . . . . . . . . 17 + A.1. Since draft-ietf-httpbis-header-compression-01 . . . . . . 17 + A.2. 
Since draft-ietf-httpbis-header-compression-01 . . . . . . 17 + Appendix B. Initial Header Tables . . . . . . . . . . . . . . . . 18 + B.1. Requests . . . . . . . . . . . . . . . . . . . . . . . . . 18 + B.2. Responses . . . . . . . . . . . . . . . . . . . . . . . . 19 + Appendix C. Example . . . . . . . . . . . . . . . . . . . . . . . 20 + C.1. First header set . . . . . . . . . . . . . . . . . . . . . 20 + C.2. Second header set . . . . . . . . . . . . . . . . . . . . 22 1. Introduction This document describes HPACK, a format adapted to efficiently represent HTTP headers in the context of HTTP/2.0. 2. Overview - In HTTP/1.X, HTTP headers, which are necessary for the functioning of - the protocol, are transmitted with no transformations. - Unfortunately, the amount of redundancy in both the keys and the - values of these headers is high, and is the cause of increased - latency on lower bandwidth links. This indicates that an alternate - more compact encoding for headers would be beneficial to latency, and - that is what is proposed here. + In HTTP/1.X, headers are sent without any form of compression. As + web pages have grown to include dozens to hundreds of requests, the + redundant headers in these requests now pose a problem of measurable + latency and unnecessary bandwidth. 1 [PERF1] 2 [PERF2] - As shown by SPDY [SPDY], Deflate compresses HTTP very effectively. - However, the use of a compression scheme which allows for arbitrary - matches against the previously encoded data (such as Deflate) exposes - users to security issues. In particular, the compression of - sensitive data, together with other data controlled by an attacker, - may lead to leakage of that sensitive data, even when the resultant - bytes are transmitted over an encrypted channel. + SPDY [SPDY] initially addressed this redundancy by compressing + headers with Deflate, which proved very effective at eliminating the + redundant headers. 
However, that approach exposed a security risk as + demonstrated by the CRIME attack [CRIME]. - Another consideration is that processing and memory costs of a - compressor such as Deflate may also be too high for some classes of - devices, for example when doing forward or reverse proxying. + In this document, we propose a new header compressor which eliminates + the redundant headers, is not vulnerable to CRIME-style attacks, + and which also has a bounded memory cost for use in small constrained + environments. 2.1. Outline The HTTP header encoding described in this document is based on a - header table that map (name, value) pairs to index values. This - scheme is believed to be safe for all known attacks against the - compression context today. Header tables are incrementally updated - during the HTTP/2.0 session. + header table that maps (name, value) pairs to index values. Header + tables are incrementally updated during the HTTP/2.0 session. The encoder is responsible for deciding which headers to insert as new entries in the header table. The decoder then does exactly what the encoder prescribes, ending in a state that exactly matches the encoder's state. This enables decoders to remain simple and understand a wide variety of encoders. As two consecutive sets of headers often have headers in common, each set of headers is coded as a difference from the previous set of headers. The goal is to only encode the changes (headers present in @@ -154,22 +149,22 @@ Header Set: A header set (see Section 3.1.4) is a group of headers that are encoded jointly. A complete set of key-value pairs as encoded in an HTTP request or response is a header set. Header Representation: A header can be represented in encoded form either as a literal or as an index (see Section 3.1.5). The indexed representation is based on the header table. Header Emission: When decoding a set of headers, some operations emit a header (see Section 3.1.6). 
An emitted header is added to - the set of headers. Once emitted, a header can't be removed from - the set of headers. + the set of headers that form the HTTP request or response. Once + emitted, a header can't be removed from the set of headers. 3.1.1. Encoding Context The set of components used to encode or decode a header set forms an encoding context: an encoding context contains a header table and a reference set. Using HTTP, messages are exchanged between a client and a server in both directions. To keep the encoding of headers in each direction independent from the other direction, there is one encoding context @@ -178,45 +173,45 @@ The headers contained in a PUSH_PROMISE frame sent by a server to a client are encoded within the same context as the headers contained in the HEADERS frame corresponding to a response sent from the server to the client. 3.1.2. Header Table A header table consists of an ordered list of (name, value) pairs. The first entry of a header table is assigned the index 0. - A header can be represented by an entry of the header table if they - match. A header and an entry match if both their name and their - value match. A header name and an entry name match if they are equal - using a character-based, _case insensitive_ comparison (the case - insensitive comparison is used because HTTP header names are defined - in a case insensitive way). A header value and an entry value match - if they are equal using a character-based, _case sensitive_ - comparison. + A header can be represented by an entry from the header table. + Rather than encoding a literal value for the header field name and + value, the encoder can select an entry from the header table. - Generally, the header table will not contain duplicate entries. - However, implementations MUST be prepared to accept duplicates - without signalling an error. + Literal header names MUST be translated to lowercase before encoding + and transmission. 
This enables an encoder to perform direct bitwise + comparisons on names and values when determining if an entry already + exists in the header table. + + There is no need for the header table to contain duplicate entries. + However, duplicate entries MUST NOT be treated as an error by a + decoder. Initially, a header table contains a list of common headers. Two initial lists of headers are provided in Appendix B. One list is for headers transmitted from a client to a server, the other for the reverse direction. A header table is modified by either adding a new entry at the end of the table, or by replacing an existing entry. The encoder decides how to update the header table and as such can control how much memory is used by the header table. To limit the memory requirements on the decoder side, the header table size is - bounded (see the SETTINGS_MAX_BUFFER_SIZE in Section 5). + bounded (see the SETTINGS_HEADER_TABLE_SIZE in Section 5). The size of an entry is the sum of its name's length in bytes (as defined in Section 4.1.2), of its value's length in bytes (Section 4.1.3) and of 32 bytes. The 32 bytes account for the entry structure overhead. For example, an entry structure using two 64-bit pointers to reference the name and the value and the entry, and two 64-bit integers for counting the number of references to this name and value would use 32 bytes. The size of a header table is the sum of the size of its entries. @@ -340,63 +335,42 @@ Once all of the header representations have been processed, and the remaining items in the reference set have been emitted, the header set is complete. 3.2.4. Header Table Management The header table can be modified by either adding a new entry to it or by replacing an existing one. Before doing such a modification, it has to be ensured that the header table size will stay lower than - or equal to the SETTINGS_MAX_BUFFER_SIZE limit (see Section 5). 
To + or equal to the SETTINGS_HEADER_TABLE_SIZE limit (see Section 5). To achieve this, repeatedly, the first entry of the header table is removed, until enough space is available for the modification. A consequence of removing one or more entries at the beginning of the header table is that the remaining entries are renumbered. The first entry of the header table is always associated to the index 0. When the modification of the header table is the replacement of an existing entry, the replaced entry is the one indicated in the literal representation before any entry is removed from the header table. If the entry to be replaced is removed from the header table when performing the size adjustment, the replacement entry is inserted at the beginning of the header table. The addition of a new entry with a size greater than the - SETTINGS_MAX_BUFFER_SIZE limit causes all the entries from the header - table to be dropped and the new entry not to be added to the header - table. The replacement of an existing entry with a new entry with a - size greater than the SETTINGS_MAX_BUFFER_SIZE has the same + SETTINGS_HEADER_TABLE_SIZE limit causes all the entries from the + header table to be dropped and the new entry not to be added to the + header table. The replacement of an existing entry with a new entry + with a size greater than the SETTINGS_HEADER_TABLE_SIZE has the same consequences. -3.2.5. Specific Use Cases - - Three occurrences of the same indexed representation, corresponding - to an entry not present in the reference set, emit the associated - header twice: - - o The first occurrence emits the header a first time and adds the - corresponding entry to the reference set. - - o The second occurrence removes the header's entry from the - reference set. - - o The third occurrence emits the header a second time and adds again - its entry to the reference set. 
- - This allows for headers sets which include duplicate header entries - to be encoded efficiently and faithfully. - - The first occurrence of the indexed representation can be replaced by - a literal representation creating an entry for the header. - 4. Detailed Format 4.1. Low-level representations 4.1.1. Integer representation Integers are used to represent name indexes, pair indexes or string lengths. To allow for optimized processing, an integer representation always finishes at the end of a byte. @@ -409,20 +383,21 @@ small enough (strictly less than 2^N-1), it is encoded within the N-bit prefix. Otherwise all the bits of the prefix are set to 1 and the value is encoded using an unsigned variable length integer [1] representation. The algorithm to represent an integer I is as follows: If I < 2^N - 1, encode I on N bits Else encode 2^N - 1 on N bits + I = I - (2^N - 1) While I >= 128 Encode (I % 128 + 128) on 8 bits I = I / 128 encode (I) on 8 bits 4.1.1.1. Example 1: Encoding 10 using a 5-bit prefix The value 10 is to be encoded with a 5-bit prefix. o 10 is less than 31 (= 2^5 - 1) and is represented using the 5-bit @@ -496,35 +471,50 @@ storage required to store the text, represented as a variable- length-quantity (Section 4.1.1). 2. The specific sequence of octets representing the UTF-8 text. Invalid UTF-8 octet sequences, "over-long" UTF-8 encodings, and UTF-8 octets that represent invalid Unicode Codepoints MUST NOT be used. 4.2. Indexed Header Representation + An indexed header representation identifies an entry in the header + table. The entry is emitted and added to the reference set if it is + not currently in the reference set. The entry is removed from the + reference set if it is present in the reference set. 
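As a minimal sketch, and not part of the draft itself, the integer representation of Section 4.1.1 (including the "I = I - (2^N - 1)" step added in this revision) could be written in Python as follows. The names encode_integer and encode_indexed_header are illustrative assumptions; the second function combines the '1' pattern of the indexed header representation with an index carried in a 7-bit prefix.

```python
from typing import List


def encode_integer(value: int, prefix_bits: int) -> List[int]:
    """Encode a non-negative integer with an N-bit prefix (Section 4.1.1).

    The first element of the result is the N-bit prefix value; any further
    elements are the continuation octets.
    """
    limit = (1 << prefix_bits) - 1  # 2^N - 1
    if value < limit:
        return [value]  # small enough to fit within the prefix
    octets = [limit]  # all prefix bits set to 1
    value -= limit  # the step added in this revision
    while value >= 128:
        octets.append(value % 128 + 128)  # 7 payload bits plus continuation bit
        value //= 128
    octets.append(value)  # final octet, continuation bit clear
    return octets


def encode_indexed_header(index: int) -> bytes:
    """Indexed header: the '1' 1-bit pattern followed by a 7-bit prefix index."""
    first, *rest = encode_integer(index, 7)
    return bytes([0x80 | first, *rest])


print(encode_integer(10, 5))    # -> [10], as in Example 1
print(encode_integer(1337, 5))  # -> [31, 154, 10]
print(encode_indexed_header(3).hex())  # -> 83
```

Returning the prefix value separately from the opcode bits keeps encode_integer reusable for the 5-bit and 6-bit prefixes used by the literal representations.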
+ 0 1 2 3 4 5 6 7 +---+---+---+---+---+---+---+---+ | 1 | Index (7+) | +---+---------------------------+ Indexed Header This representation starts with the '1' 1-bit pattern, followed by the index of the matching pair, represented as an integer with a 7-bit prefix. 4.3. Literal Header Representation + Literal header representations contain a literal header field value. + Header field names are either provided as a literal or by reference + to an existing header table entry. + + Literal representations all result in the emission of a header when + decoded. + 4.3.1. Literal Header without Indexing + A literal header without indexing causes the emission of a header + without altering the header table. + 0 1 2 3 4 5 6 7 +---+---+---+---+---+---+---+---+ | 0 | 1 | 1 | Index (5+) | +---+---+---+-------------------+ | Value Length (8+) | +-------------------------------+ | Value String (Length octets) | +-------------------------------+ Literal Header without Indexing - Indexed Name @@ -537,36 +527,38 @@ +-------------------------------+ | Name String (Length octets) | +-------------------------------+ | Value Length (8+) | +-------------------------------+ | Value String (Length octets) | +-------------------------------+ Literal Header without Indexing - New Name - This representation, which does not involve updating the header - table, starts with the '011' 3-bit pattern. + This representation starts with the '011' 3-bit pattern. If the header name matches the header name of a (name, value) pair stored in the Header Table, the index of the pair increased by one (index + 1) is represented as an integer with a 5-bit prefix. Note that if the index is strictly below 31, one byte is used. If the header name does not match a header name entry, the value 0 is represented on 5 bits followed by the header name (Section 4.1.2). Header name representation is followed by the header value represented as a literal string as described in Section 4.1.3. 4.3.2. 
Literal Header with Incremental Indexing + A literal header with incremental indexing adds a new entry to the + header table. + 0 1 2 3 4 5 6 7 +---+---+---+---+---+---+---+---+ | 0 | 1 | 0 | Index (5+) | +---+---+---+-------------------+ | Value Length (8+) | +-------------------------------+ | Value String (Length octets) | +-------------------------------+ Literal Header with Incremental Indexing - @@ -596,20 +588,23 @@ that if the index is strictly below 31, one byte is used. If the header name does not match a header name entry, the value 0 is represented on 5 bits followed by the header name (Section 4.1.2). Header name representation is followed by the header value represented as a literal string as described in Section 4.1.3. 4.3.3. Literal Header with Substitution Indexing + A literal header with substitution indexing replaces an existing + header table entry. + 0 1 2 3 4 5 6 7 +---+---+---+---+---+---+---+---+ | 0 | 0 | Index (6+) | +---+---+-----------------------+ | Substituted Index (8+) | +-------------------------------+ | Value Length (8+) | +-------------------------------+ | Value String (Length octets) | +-------------------------------+ @@ -633,21 +628,21 @@ +-------------------------------+ Literal Header with Substitution Indexing - New Name This representation starts with the '00' 2-bit pattern. If the header name matches the header name of a (name, value) pair stored in the Header Table, the index of the pair increased by one (index + 1) is represented as an integer with a 6-bit prefix. Note - that if the index is strictly below 62, one byte is used. + that if the index is strictly below 63, one byte is used. If the header name does not match a header name entry, the value 0 is represented on 6 bits followed by the header name (Section 4.1.2). The index of the substituted (name, value) pair is inserted after the header name representation as a 0-bit prefix integer. 
The index of the substituted pair MUST correspond to a position in the header table containing a non-void entry. An index for the substituted pair that corresponds to an empty position in the header @@ -657,30 +652,31 @@ string as described in Section 4.1.3. 5. Parameter Negotiation A few parameters can be used to accommodate client and server processing and memory requirements. [[anchor3: These settings are currently not supported as they have not been integrated in the main specification. Therefore, the maximum buffer size for the header table is fixed at 4096 bytes.]] - SETTINGS_MAX_BUFFER_SIZE: Allows the sender to inform the remote - endpoint of the maximum size it accepts for the header table. + SETTINGS_HEADER_TABLE_SIZE (TBD): Allows the sender to inform the + remote endpoint of the maximum size it accepts for the header + table. The default value is 4096 bytes. [[anchor4: Is this default value OK? Do we need a maximum size? Do we want to allow an infinite buffer?]] When the remote endpoint receives a SETTINGS frame containing a - SETTINGS_MAX_BUFFER_SIZE setting with a value smaller than the one - currently in use, it MUST send as soon as possible a HEADER frame - with a stream identifier of 0x0 containing a value smaller than or - equal to the received setting value. + SETTINGS_HEADER_TABLE_SIZE setting with a value smaller than the + one currently in use, it MUST send as soon as possible a HEADERS + frame with a stream identifier of 0x0 containing a value smaller + than or equal to the received setting value. [[anchor5: This changes slightly the behaviour of the HEADERS frame, which should be updated as follows:]] A HEADERS frame with a stream identifier of 0x0 indicates that the sender has reduced the maximum size of the header table. The new maximum size of the header table is encoded on 32 bits. The decoder MUST reduce its own header table by dropping entries from it until the size of the header table is lower than or equal to the transmitted maximum size. 6. 
Security Considerations @@ -710,27 +706,53 @@ by considering overhead in the state size calculation. Implementors must still be careful in the creation of APIs to an implementation of this compressor by ensuring that header keys and values are either emitted as a stream, or that the compression implementation has a limit on the maximum size of a key or value. Failure to implement these kinds of safeguards may still result in a scenario where the local endpoint exhausts its memory. 7. IANA Considerations - This memo includes no request to IANA. + This document registers the SETTINGS_HEADER_TABLE_SIZE setting in the + "HTTP/2.0 Settings" registry established by [HTTP2]. The assigned + code for this setting is TBD. -8. Informative References +8. References + +8.1. Normative References + + [HTTP2] Belshe, M., Peon, R., Thomson, M., and A. Melnikov, + "Hypertext Transfer Protocol version 2.0", + draft-ietf-httpbis-http2-06 (work in progress), + February 2013. + +8.2. Informative References + + [CRIME] Rizzo, J. and T. Duong, "The Crime Attack", September 2012, + . + + [PERF1] Belshe, M., "IETF83: SPDY and What to Consider for + HTTP/2.0", March 2012, . + + [PERF2] McManus, P., "SPDY What I Like About You", September 2011, < + http://bitsup.blogspot.com/2011/09/ + spdy-what-i-like-about-you.html>. [SPDY] Belshe, M. and R. Peon, "SPDY Protocol", February 2012, . +URIs + [1] Appendix A. Change Log (to be removed by RFC Editor before publication A.1. Since draft-ietf-httpbis-header-compression-01 o Refactored the Header Encoding Section: split definitions and processing rules. o Backward incompatible change: Updated reference set management as @@ -744,26 +766,47 @@ o Added example of 32 bytes entry structure (issue #191). o Added Header Set Completion section. Reflowed some text. Clarified some writing which was awkward. Added text about duplicate header entry encoding. Clarified some language w.r.t Header Set. Changed x-my-header to mynewheader. 
Added text in the Header Emission section indicating that the application may also be able to free up memory more quickly. Added information in Security Considerations section. +A.2. Since draft-ietf-httpbis-header-compression-01 + + Fixed bug/omission in integer representation algorithm. + + Changed the document title. + + Header matching text rewritten. + + Changed the definition of header emission. + + Changed the name of the setting which dictates how much memory the + compression context should use. + + Removed "specific use cases" section + Corrected erroneous statement about what index can be contained in + one byte + + Added descriptions of opcodes + + Removed security claims from introduction. + Appendix B. Initial Header Tables - [[anchor9: The tables in this section should be updated based on + [[anchor11: The tables in this section should be updated based on statistical analysis of header names frequency and specific HTTP 2.0 header rules (like removal of some headers).]] - [[anchor10: These tables are not adapted for headers contained in + [[anchor12: These tables are not adapted for headers contained in PUSH_PROMISE frames. Either the tables can be merged, or the table for responses can be updated.]] B.1. Requests The following table lists the pre-defined headers that make up the initial header table used to represent requests sent from a client to a server. +-------+---------------------+--------------+ @@ -843,21 +886,21 @@ | 27 | strict-transport-security | | | 28 | transfer-encoding | | | 29 | www-authenticate | | +-------+-----------------------------+--------------+ Table 2: Initial Header Table for Responses Appendix C. Example Here is an example that illustrates different representations and how - tables are updated. [[anchor13: This section needs to be updated to + tables are updated. [[anchor15: This section needs to be updated to better reflect the new processing of header fields, and include more examples.]] C.1. 
First header set The first header set to represent is the following: :path: /my-example/index.html user-agent: my-user-agent mynewheader: first
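To relate this header set to the size accounting of Section 3.1.2 and the eviction rule of Section 3.2.4, here is a minimal Python sketch. It is not part of the draft. For simplicity it assumes an initially empty table (the draft's tables start with the entries of Appendix B) and takes an entry's name and value lengths to be their UTF-8 byte counts. The names entry_size, table_size and add_entry, and the 110-byte limit, are hypothetical choices made for illustration.

```python
from typing import List, Tuple

ENTRY_OVERHEAD = 32  # per-entry accounting overhead from Section 3.1.2

Entry = Tuple[str, str]


def entry_size(name: str, value: str) -> int:
    """Size of one entry: name bytes + value bytes + 32 bytes of overhead."""
    return len(name.encode("utf-8")) + len(value.encode("utf-8")) + ENTRY_OVERHEAD


def table_size(table: List[Entry]) -> int:
    """Size of a header table: the sum of the sizes of its entries."""
    return sum(entry_size(n, v) for n, v in table)


def add_entry(table: List[Entry], name: str, value: str, max_size: int) -> None:
    """Append an entry, evicting from the front of the table as needed."""
    new_size = entry_size(name, value)
    if new_size > max_size:
        # An oversized entry empties the table and is not added.
        table.clear()
        return
    # Remove the first entry repeatedly until the new entry fits.
    while table_size(table) + new_size > max_size:
        table.pop(0)
    table.append((name, value))


header_set = [
    (":path", "/my-example/index.html"),
    ("user-agent", "my-user-agent"),
    ("mynewheader", "first"),
]
sizes = [entry_size(n, v) for n, v in header_set]  # [59, 55, 48]

table: List[Entry] = []
for n, v in header_set:
    add_entry(table, n, v, max_size=110)  # hypothetical, deliberately small limit
# ":path" has been evicted; the table holds the last two headers (103 bytes).
```

With the 4096-byte size mentioned in Section 5, all three entries (162 bytes in total) would fit without any eviction.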