--- 1/draft-ietf-hybi-thewebsocketprotocol-11.txt 2011-08-24 21:16:05.000000000 +0200 +++ 2/draft-ietf-hybi-thewebsocketprotocol-12.txt 2011-08-24 21:16:05.000000000 +0200 @@ -1,19 +1,19 @@ HyBi Working Group I. Fette Internet-Draft Google, Inc. Intended status: Standards Track A. Melnikov -Expires: February 24, 2012 Isode Ltd - August 23, 2011 +Expires: February 25, 2012 Isode Ltd + August 24, 2011 The WebSocket protocol - draft-ietf-hybi-thewebsocketprotocol-11 + draft-ietf-hybi-thewebsocketprotocol-12 Abstract The WebSocket protocol enables two-way communication between a client running untrusted code running in a controlled environment to a remote host that has opted-in to communications from that code. The security model used for this is the Origin-based security model commonly used by Web browsers. The protocol consists of an opening handshake followed by basic message framing, layered over TCP. The goal of this technology is to provide a mechanism for browser-based @@ -31,21 +31,21 @@ Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." - This Internet-Draft will expire on February 24, 2012. + This Internet-Draft will expire on February 25, 2012. Copyright Notice Copyright (c) 2011 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. 
Please review these documents @@ -63,38 +63,38 @@ 1.3. Opening Handshake . . . . . . . . . . . . . . . . . . . . 6 1.4. Closing Handshake . . . . . . . . . . . . . . . . . . . . 9 1.5. Design Philosophy . . . . . . . . . . . . . . . . . . . . 9 1.6. Security Model . . . . . . . . . . . . . . . . . . . . . . 10 1.7. Relationship to TCP and HTTP . . . . . . . . . . . . . . . 11 1.8. Establishing a Connection . . . . . . . . . . . . . . . . 11 1.9. Subprotocols Using the WebSocket protocol . . . . . . . . 11 2. Conformance Requirements . . . . . . . . . . . . . . . . . . . 13 2.1. Terminology . . . . . . . . . . . . . . . . . . . . . . . 13 3. WebSocket URIs . . . . . . . . . . . . . . . . . . . . . . . . 15 - 4. Data Framing . . . . . . . . . . . . . . . . . . . . . . . . . 16 - 4.1. Overview . . . . . . . . . . . . . . . . . . . . . . . . . 16 - 4.2. Base Framing Protocol . . . . . . . . . . . . . . . . . . 16 - 4.3. Client-to-Server Masking . . . . . . . . . . . . . . . . . 20 - 4.4. Fragmentation . . . . . . . . . . . . . . . . . . . . . . 21 - 4.5. Control Frames . . . . . . . . . . . . . . . . . . . . . . 22 - 4.5.1. Close . . . . . . . . . . . . . . . . . . . . . . . . 23 - 4.5.2. Ping . . . . . . . . . . . . . . . . . . . . . . . . . 24 - 4.5.3. Pong . . . . . . . . . . . . . . . . . . . . . . . . . 24 - 4.6. Data Frames . . . . . . . . . . . . . . . . . . . . . . . 24 - 4.7. Examples . . . . . . . . . . . . . . . . . . . . . . . . . 25 - 4.8. Extensibility . . . . . . . . . . . . . . . . . . . . . . 25 - 5. Opening Handshake . . . . . . . . . . . . . . . . . . . . . . 27 - 5.1. Client Requirements . . . . . . . . . . . . . . . . . . . 27 - 5.2. Server-side Requirements . . . . . . . . . . . . . . . . . 32 - 5.2.1. Reading the Client's Opening Handshake . . . . . . . . 32 - 5.2.2. Sending the Server's Opening Handshake . . . . . . . . 33 - 5.3. Collected ABNF for new header fields used in handshake . . 36 + 4. Opening Handshake . . . . . . . . . . . . . . . . 
. . . . . . 16 + 4.1. Client Requirements . . . . . . . . . . . . . . . . . . . 16 + 4.2. Server-side Requirements . . . . . . . . . . . . . . . . . 21 + 4.2.1. Reading the Client's Opening Handshake . . . . . . . . 21 + 4.2.2. Sending the Server's Opening Handshake . . . . . . . . 22 + 4.3. Collected ABNF for new header fields used in handshake . . 25 + 5. Data Framing . . . . . . . . . . . . . . . . . . . . . . . . . 27 + 5.1. Overview . . . . . . . . . . . . . . . . . . . . . . . . . 27 + 5.2. Base Framing Protocol . . . . . . . . . . . . . . . . . . 27 + 5.3. Client-to-Server Masking . . . . . . . . . . . . . . . . . 31 + 5.4. Fragmentation . . . . . . . . . . . . . . . . . . . . . . 32 + 5.5. Control Frames . . . . . . . . . . . . . . . . . . . . . . 33 + 5.5.1. Close . . . . . . . . . . . . . . . . . . . . . . . . 34 + 5.5.2. Ping . . . . . . . . . . . . . . . . . . . . . . . . . 35 + 5.5.3. Pong . . . . . . . . . . . . . . . . . . . . . . . . . 35 + 5.6. Data Frames . . . . . . . . . . . . . . . . . . . . . . . 35 + 5.7. Examples . . . . . . . . . . . . . . . . . . . . . . . . . 36 + 5.8. Extensibility . . . . . . . . . . . . . . . . . . . . . . 36 6. Sending and Receiving Data . . . . . . . . . . . . . . . . . . 38 6.1. Sending Data . . . . . . . . . . . . . . . . . . . . . . . 38 6.2. Receiving Data . . . . . . . . . . . . . . . . . . . . . . 38 7. Closing the connection . . . . . . . . . . . . . . . . . . . . 40 7.1. Definitions . . . . . . . . . . . . . . . . . . . . . . . 40 7.1.1. Close the WebSocket Connection . . . . . . . . . . . . 40 7.1.2. Start the WebSocket Closing Handshake . . . . . . . . 40 7.1.3. The WebSocket Closing Handshake is Started . . . . . . 40 7.1.4. The WebSocket Connection is Closed . . . . . . . . . . 41 7.1.5. The WebSocket Connection Close Code . . . . . . . . . 
41 @@ -195,21 +195,21 @@ Upgrade: websocket Connection: Upgrade Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo= Sec-WebSocket-Protocol: chat The leading line from the client follows the Request-Line format. The leading line from the server follows the Status-Line format. The Request-Line and Status-Line productions are defined in [RFC2616]. After the leading line in both cases come an unordered set of header - fields. The meaning of these header fields is specified in Section 5 + fields. The meaning of these header fields is specified in Section 4 of this document. Additional header fields may also be present, such as cookies [RFC6265]. The format and parsing of headers is as defined in [RFC2616]. Once the client and server have both sent their handshakes, and if the handshake was successful, then the data transfer part starts. This is a two-way communication channel where each side can, independently from the other, send data at will. Clients and servers, after a successful handshake, transfer data back @@ -367,21 +367,21 @@ cookies, as described in [RFC6265]. 1.4. Closing Handshake _This section is non-normative._ The closing handshake is far simpler than the opening handshake. Either peer can send a control frame with data containing a specified control sequence to begin the closing handshake (detailed in - Section 4.5.1). Upon receiving such a frame, the other peer sends a + Section 5.5.1). Upon receiving such a frame, the other peer sends a close frame in response, if it hasn't already sent one. Upon receiving _that_ control frame, the first peer then closes the connection, safe in the knowledge that no further data is forthcoming. After sending a control frame indicating the connection should be closed, a peer does not send any further data; after receiving a control frame indicating the connection should be closed, a peer discards any further data received. 
@@ -600,531 +600,38 @@ port = path = query = The port component is OPTIONAL; the default for "ws" is port 80, while the default for "wss" is port 443. The URI is called "secure" (and it said that "the secure flag is set") if the scheme component matches "wss" case-insensitively. - The "resource-name" (also known as /resource name/ in Section 5.1) + The "resource-name" (also known as /resource name/ in Section 4.1) can be constructed by concatenating o "/" if the path component is empty o the path component o "?" if the query component is non-empty o the query component Fragment identifiers are meaningless in the context of WebSocket URIs, and MUST NOT be used on these URIs. The character "#" in URIs MUST be escaped as %23 if used as part of the query component. -4. Data Framing - -4.1. Overview - - In the WebSocket protocol, data is transmitted using a sequence of - frames. Frames sent from the client to the server are masked to - avoid confusing network intermediaries, such as intercepting proxies. - Frames sent from the server to the client are not masked. - - The base framing protocol defines a frame type with an opcode, a - payload length, and designated locations for extension and - application data, which together define the _payload_ data. Certain - bits and opcodes are reserved for future expansion of the protocol. - - A data frame MAY be transmitted by either the client or the server at - any time after opening handshake completion and before that endpoint - has sent a close frame (Section 4.5.1). - -4.2. Base Framing Protocol - - This wire format for the data transfer part is described by the ABNF - [RFC5234] given in detail in this section. A high level overview of - the framing is given in the following figure. 
- - 0 1 2 3 - 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 - +-+-+-+-+-------+-+-------------+-------------------------------+ - |F|R|R|R| opcode|M| Payload len | Extended payload length | - |I|S|S|S| (4) |A| (7) | (16/63) | - |N|V|V|V| |S| | (if payload len==126/127) | - | |1|2|3| |K| | | - +-+-+-+-+-------+-+-------------+ - - - - - - - - - - - - - - - + - | Extended payload length continued, if payload len == 127 | - + - - - - - - - - - - - - - - - +-------------------------------+ - | |Masking-key, if MASK set to 1 | - +-------------------------------+-------------------------------+ - | Masking-key (continued) | Payload Data | - +-------------------------------- - - - - - - - - - - - - - - - + - : Payload Data continued ... : - + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + - | Payload Data continued ... | - +---------------------------------------------------------------+ - - FIN: 1 bit - - Indicates that this is the final fragment in a message. The first - fragment MAY also be the final fragment. - - RSV1, RSV2, RSV3: 1 bit each - - MUST be 0 unless an extension is negotiated which defines meanings - for non-zero values. If a nonzero value is received and none of - the negotiated extensions defines the meaning of such a nonzero - value, the receiving endpoint MUST _Fail the WebSocket - Connection_. - - Opcode: 4 bits - - Defines the interpretation of the payload data. If an unknown - opcode is received, the receiving endpoint MUST _Fail the - WebSocket Connection_. The following values are defined. - - * %x0 denotes a continuation frame - - * %x1 denotes a text frame - - * %x2 denotes a binary frame - - * %x3-7 are reserved for further non-control frames - - * %x8 denotes a connection close - - * %x9 denotes a ping - - * %xA denotes a pong - - * %xB-F are reserved for further control frames - - Mask: 1 bit - - Defines whether the payload data is masked. 
If set to 1, a - masking key is present in masking-key, and this is used to unmask - the payload data as per Section 4.3. All frames sent from client - to server have this bit set to 1. - - Payload length: 7 bits, 7+16 bits, or 7+64 bits - - The length of the payload data, in bytes: if 0-125, that is the - payload length. If 126, the following 2 bytes interpreted as a 16 - bit unsigned integer are the payload length. If 127, the - following 8 bytes interpreted as a 64-bit unsigned integer (the - most significant bit MUST be 0) are the payload length. Multibyte - length quantities are expressed in network byte order. The - payload length is the length of the extension data + the length of - the application data. The length of the extension data may be - zero, in which case the payload length is the length of the - application data. - - Masking-key: 0 or 4 bytes - - All frames sent from the client to the server are masked by a 32- - bit value that is contained within the frame. This field is - present if the mask bit is set to 1, and is absent if the mask bit - is set to 0. See Section 4.3 for further information on client- - to-server masking. - - Payload data: (x+y) bytes - - The payload data is defined as extension data concatenated with - application data. - - Extension data: x bytes - - The extension data is 0 bytes unless an extension has been - negotiated. Any extension MUST specify the length of the - extension data, or how that length may be calculated, and how the - extension use MUST be negotiated during the opening handshake. If - present, the extension data is included in the total payload - length. - - Application data: y bytes - - Arbitrary application data, taking up the remainder of the frame - after any extension data. The length of the application data is - equal to the payload length minus the length of the extension - data. 
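The field layout described above can be decoded as in the following minimal sketch. It is an illustration only and is not part of either draft revision. The function name `parse_frame_header` and the keys of the returned dictionary are hypothetical. The sketch assumes the caller passes a buffer that starts at a frame boundary, and it enforces only the constraints stated above, namely that the most significant bit of a 64-bit length is 0.

```python
import struct


def parse_frame_header(buf: bytes) -> dict:
    """Decode the frame header at the start of buf.

    Hypothetical helper for illustration. Raises ValueError if buf is too
    short or if a 64-bit length has its most significant bit set.
    """
    if len(buf) < 2:
        raise ValueError("need at least 2 bytes for a frame header")
    first, second = buf[0], buf[1]
    header = {
        "fin": bool(first & 0x80),     # FIN: 1 bit
        "rsv1": bool(first & 0x40),    # RSV1..RSV3: 1 bit each
        "rsv2": bool(first & 0x20),
        "rsv3": bool(first & 0x10),
        "opcode": first & 0x0F,        # Opcode: 4 bits
        "masked": bool(second & 0x80), # Mask: 1 bit
    }
    length = second & 0x7F             # Payload length: 7 bits
    offset = 2
    if length == 126:
        # The following 2 bytes are a 16-bit unsigned integer, network order.
        if len(buf) < offset + 2:
            raise ValueError("truncated 16-bit payload length")
        (length,) = struct.unpack_from("!H", buf, offset)
        offset += 2
    elif length == 127:
        # The following 8 bytes are a 64-bit unsigned integer, network order.
        if len(buf) < offset + 8:
            raise ValueError("truncated 64-bit payload length")
        (length,) = struct.unpack_from("!Q", buf, offset)
        if length >> 63:
            raise ValueError("most significant bit of 64-bit length must be 0")
        offset += 8
    masking_key = None
    if header["masked"]:
        # Masking-key: 4 bytes, present only when the mask bit is 1.
        if len(buf) < offset + 4:
            raise ValueError("truncated masking key")
        masking_key = buf[offset:offset + 4]
        offset += 4
    header["payload_length"] = length
    header["masking_key"] = masking_key
    header["header_size"] = offset     # payload data starts at this index
    return header
```

For instance, calling the sketch on the bytes 0x81 0x05 followed by "Hello" would report a final text frame, unmasked, with a payload length of 5 and payload data starting at index 2.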
- - The base framing protocol is formally defined by the following ABNF - [RFC5234]: - - ws-frame = frame-fin - frame-rsv1 - frame-rsv2 - frame-rsv3 - frame-opcode - frame-masked - frame-payload-length - [ frame-masking-key ] - frame-payload-data - - frame-fin = %x0 ; more frames of this message follow - / %x1 ; final frame of this message - - frame-rsv1 = %x0 - ; 1 bit, MUST be 0 unless negotiated otherwise - - frame-rsv2 = %x0 - ; 1 bit, MUST be 0 unless negotiated otherwise - - frame-rsv3 = %x0 - ; 1 bit, MUST be 0 unless negotiated otherwise - - frame-opcode = %x0 ; continuation frame - / %x1 ; text frame - / %x2 ; binary frame - / %x3-7 ; reserved for further non-control frames - / %x8 ; connection close - / %x9 ; ping - / %xA ; pong - / %xB-F ; reserved for further control frames - - frame-masked = %x0 ; frame is not masked, no frame-masking-key - / %x1 ; frame is masked, frame-masking-key present - - frame-payload-length = %x00-7D - / %x7E frame-payload-length-16 - / %x7F frame-payload-length-63 - - frame-payload-length-16 = %x0000-FFFF - - frame-payload-length-63 = %x0000000000000000-7FFFFFFFFFFFFFFF - - frame-masking-key = 4( %0x00-FF ) ; present only if frame-masked is 1 - - frame-payload-data = (frame-masked-extension-data - frame-masked-application-data) ; frame-masked 1 - / (frame-unmasked-extension-data - frame-unmasked-application-data) ; frame-masked 0 - - frame-masked-extension-data = *( %x00-FF ) ; to be defined later - - frame-masked-application-data = *( %x00-FF ) - - frame-unmasked-extension-data = *( %x00-FF ) ; to be defined later - - frame-unmasked-application-data = *( %x00-FF ) - -4.3. Client-to-Server Masking - - The client MUST mask all frames sent to the server. A server MUST - close the connection upon receiving a frame with the MASK bit set to - 0. In this case, a server MAY send a close frame with a status code - of 1002 (protocol error) as defined in Section 7.4.1. 
- - A masked frame MUST have the field frame-masked set to 1, as defined - in Section 4.2. - - The masking key is contained completely within the frame, as defined - in Section 4.2 as frame-masking-key. It is used to mask the payload - data defined in the same section as frame-payload-data, which - includes extension and application data. - - The masking key is a 32-bit value chosen at random by the client. - The masking key MUST be derived from a strong source of entropy, and - the masking key for a given frame MUST NOT make it simple for a - server to predict the masking key for a subsequent frame. RFC 4086 - [RFC4086] discusses what entails a suitable source of entropy for - security-sensitive applications. - - The masking does not affect the length of the payload data. To - convert masked data into unmasked data, or vice versa, the following - algorithm is applied. The same algorithm applies regardless of the - direction of the translation - e.g. the same steps are applied to - mask the data as to unmask the data. - - Octet i of the transformed data ("transformed-octet-i") is the XOR of - octet i of the original data ("original-octet-i") with octet at index - i modulo 4 of the masking key ("masking-key-octet-j"): - - j = i MOD 4 - transformed-octet-i = original-octet-i XOR masking-key-octet-j - - When preparing a masked frame, the client MUST pick a fresh masking - key from the set of allowed 32-bit values. The masking key must be - unpredictable. The unpredictability of the masking key is essential - to prevent the author of malicious applications from selecting the - bytes that appear on the wire. - - The payload length, indicated in the framing as frame-payload-length, - does NOT include the length of the masking key. It is the length of - the payload data, e.g. the number of bytes following the masking key. - -4.4. 
Fragmentation - - The primary purpose of fragmentation is to allow sending a message - that is of unknown size when the message is started without having to - buffer that message. If messages couldn't be fragmented, then an - endpoint would have to buffer the entire message so its length could - be counted before first byte is sent. With fragmentation, a server - or intermediary may choose a reasonable size buffer, and when the - buffer is full write a fragment to the network. - - A secondary use-case for fragmentation is for multiplexing, where it - is not desirable for a large message on one logical channel to - monopolize the output channel, so the MUX needs to be free to split - the message into smaller fragments to better share the output - channel. - - The following rules apply to fragmentation: - - o An unfragmented message consists of a single frame with the FIN - bit set and an opcode other than 0. - - o A fragmented message consists of a single frame with the FIN bit - clear and an opcode other than 0, followed by zero or more frames - with the FIN bit clear and the opcode set to 0, and terminated by - a single frame with the FIN bit set and an opcode of 0. A - fragmented message is conceptually equivalent to a single larger - message whose payload is equal to the concatenation of the - payloads of the fragments in order, however in the presence of - extensions this may not hold true as the extension defines the - interpretation of the extension data present. For instance, - extension data may only be present at the beginning of the first - fragment and apply to subsequent fragments, or there may be - extension data present in each of the fragments that applies only - to that particular fragment. In absence of extension data, the - following example demonstrates how fragmentation works. 
- - EXAMPLE: For a text message sent as three fragments, the first - fragment would have an opcode of 0x1 and a FIN bit clear, the - second fragment would have an opcode of 0x0 and a FIN bit clear, - and the third fragment would have an opcode of 0x0 and a FIN bit - that is set. - - o Control frames MAY be injected in the middle of a fragmented - message. Control frames themselves MUST NOT be fragmented. - - o Message fragments MUST be delivered to the recipient in the order - sent by the sender. - - o The fragments of one message MUST NOT be interleaved between the - fragments of another message unless an extension has been - negotiated that can interpret the interleaving. - - o An endpoint MUST be capable of handling control frames in the - middle of a fragmented message. - - o A sender MAY create fragments of any size for non-control - messages. - - o Clients and servers MUST support receiving both fragmented and - unfragmented messages. - - o As control frames cannot be fragmented, an intermediary MUST NOT - attempt to change the fragmentation of a control frame. - - o An intermediary MUST NOT change the fragmentation of a message if - any reserved bit values are used and the meaning of these values - is not known to the intermediary. - - o An intermediary MUST NOT change the fragmentation of any message - in the context of a connection where extensions have been - negotiated and the intermediary is not aware of the semantics of - the negotiated extensions. - - o As a consequence of these rules, all fragments of a message are of - the same type, as set by the first fragment's opcode. Since - Control frames cannot be fragmented, the type for all fragments in - a message MUST be either text or binary, or one of the reserved - opcodes. - - _Note: if control frames could not be interjected, the latency of a - ping, for example, would be very long if behind a large message. 
- Hence, the requirement of handling control frames in the middle of a - fragmented message._ - -4.5. Control Frames - - Control frames are identified by opcodes where the most significant - bit of the opcode is 1. Currently defined opcodes for control frames - include 0x8 (Close), 0x9 (Ping), and 0xA (Pong). Opcodes 0xB-0xF are - reserved for further control frames yet to be defined. - - Control frames are used to communicate state about the WebSocket. - Control frames can be interjected in the middle of a fragmented - message. - - All control frames MUST have a payload length of 125 bytes or less - and MUST NOT be fragmented. - -4.5.1. Close - - The Close frame contains an opcode of 0x8. - - The Close frame MAY contain a body (the "application data" portion of - the frame) that indicates a reason for closing, such as an endpoint - shutting down, an endpoint having received a frame too large, or an - endpoint having received a frame that does not conform to the format - expected by the other endpoint. If there is a body, the first two - bytes of the body MUST be a 2-byte unsigned integer (in network byte - order) representing a status code with value /code/ defined in - Section 7.4. Following the 2-byte integer the body MAY contain UTF-8 - encoded data with value /reason/, the interpretation of which is not - defined by this specification. This data is not necessarily human - readable, but may be useful for debugging or passing information - relevant to the script that opened the connection. As the data is - not guaranteed to be human readable, clients MUST NOT show it to end - users. - - Close frames sent from client to server must be masked as per - Section 4.3. - - The application MUST NOT send any more data frames after sending a - close frame. - - If an endpoint receives a Close frame and that endpoint did not - previously send a Close frame, the endpoint MUST send a Close frame - in response. It SHOULD do so as soon as is practical. 
An endpoint - MAY delay sending a close frame until its current message is sent - (for instance, if the majority of a fragmented message is already - sent, an endpoint MAY send the remaining fragments before sending a - Close frame). However, there is no guarantee that the endpoint which - has already sent a Close frame will continue to process data. - - After both sending and receiving a close message, an endpoint - considers the WebSocket connection closed, and MUST close the - underlying TCP connection. The server MUST close the underlying TCP - connection immediately; the client SHOULD wait for the server to - close the connection but MAY close the connection at any time after - sending and receiving a close message, e.g. if it has not received a - TCP close from the server in a reasonable time period. - - If a client and server both send a Close message at the same time, - both endpoints will have sent and received a Close message and should - consider the WebSocket connection closed and close the underlying TCP - connection. - -4.5.2. Ping - - The Ping frame contains an opcode of 0x9. - - Upon receipt of a Ping frame, an endpoint MUST send a Pong frame in - response. It SHOULD do so as soon as is practical. Pong frames are - discussed in Section 4.5.3. - - An endpoint MAY send a Ping frame any time after the connection is - established and before the connection is closed. NOTE: A ping frame - may serve either as a keepalive, or to verify that the remote - endpoint is still responsive. - -4.5.3. Pong - - The Pong frame contains an opcode of 0xA. - - Section 4.5.2 details requirements that apply to both Ping and Pong - frames. - - A Pong frame sent in response to a Ping frame must have identical - Application Data as found in the message body of the Ping frame being - replied to. 
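The Ping/Pong rule above (a Pong carries the same Application Data as the Ping it answers) can be sketched as follows. This is an illustration only and is not part of either draft revision. The function name `build_pong` is hypothetical, and the sketch assumes the sender is a server, so the frame is left unmasked; a client would additionally have to mask the frame.

```python
def build_pong(ping_payload: bytes) -> bytes:
    """Build an unmasked Pong frame echoing the payload of a received Ping.

    Hypothetical helper for illustration; assumes a server-side sender.
    """
    # Control frames must have a payload length of 125 bytes or less.
    if len(ping_payload) > 125:
        raise ValueError("control frame payload must be 125 bytes or less")
    first = 0x80 | 0x0A         # FIN set (control frames are not fragmented), opcode 0xA
    second = len(ping_payload)  # mask bit 0, 7-bit payload length
    return bytes([first, second]) + ping_payload
```

Because the payload limit keeps the length within the 7-bit field, no extended payload length is ever needed for a control frame.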
- - If an endpoint receives a Ping frame and has not yet sent Pong - frame(s) in response to previous Ping frame(s), the endpoint MAY - elect to send a Pong frame for only the most recently processed Ping - frame. - - A Pong frame MAY be sent unsolicited. This serves as a - unidirectional heartbeat. A response to an unsolicited pong is not - expected. - -4.6. Data Frames - - Data frames (e.g. non-control frames) are identified by opcodes where - the most significant bit of the opcode is 0. Currently defined - opcodes for data frames include 0x1 (Text), 0x2 (Binary). Opcodes - 0x3-0x7 are reserved for further non-control frames yet to be - defined. - - Data frames carry application-layer and/or extension-layer data. The - opcode determines the interpretation of the data: - - Text - - The payload data is text data encoded as UTF-8. Note that a - particular text frame might include a partial UTF-8 sequence, - however the whole message MUST contain valid UTF-8. - - Binary - - The payload data is arbitrary binary data whose interpretation is - solely up to the application layer. - -4.7. 
Examples - - _This section is non-normative._ - - o A single-frame unmasked text message - - * 0x81 0x05 0x48 0x65 0x6c 0x6c 0x6f (contains "Hello") - - o A single-frame masked text message - - * 0x81 0x85 0x37 0xfa 0x21 0x3d 0x7f 0x9f 0x4d 0x51 0x58 - (contains "Hello") - - o A fragmented unmasked text message - - * 0x01 0x03 0x48 0x65 0x6c (contains "Hel") - - * 0x80 0x02 0x6c 0x6f (contains "lo") - - o Unmasked Ping request and masked Ping response - - * 0x89 0x05 0x48 0x65 0x6c 0x6c 0x6f (contains a body of "Hello", - but the contents of the body are arbitrary) - - * 0x8a 0x85 0x37 0xfa 0x21 0x3d 0x7f 0x9f 0x4d 0x51 0x58 - (contains a body of "Hello", matching the body of the ping) - - o 256 bytes binary message in a single unmasked frame - - * 0x82 0x7E 0x0100 [256 bytes of binary data] - - o 64KiB binary message in a single unmasked frame - - * 0x82 0x7F 0x0000000000010000 [65536 bytes of binary data] - -4.8. Extensibility - - The protocol is designed to allow for extensions, which will add - capabilities to the base protocols. The endpoints of a connection - MUST negotiate the use of any extensions during the opening - handshake. This specification provides opcodes 0x3 through 0x7 and - 0xB through 0xF, the extension data field, and the frame-rsv1, frame- - rsv2, and frame-rsv3 bits of the frame header for use by extensions. - The negotiation of extensions is discussed in further detail in - Section 9.1. Below are some anticipated uses of extensions. This - list is neither complete nor prescriptive. - - o Extension data may be placed in the payload data before the - application data. - - o Reserved bits can be allocated for per-frame needs. - - o Reserved opcode values can be defined. - - o Reserved bits can be allocated to the opcode field if more opcode - values are needed. - - o A reserved bit or an "extension" opcode can be defined which - allocates additional bits out of the payload data to define larger - opcodes or more per-frame bits. - -5. 
Opening Handshake +4. Opening Handshake -5.1. Client Requirements +4.1. Client Requirements To _Establish a WebSocket Connection_, a client opens a connection and sends a handshake as defined in this section. A connection is defined to initially be in a CONNECTING state. A client will need to supply a /host/, /port/, /resource name/, and a /secure/ flag, which are the components of a WebSocket URI as discussed in Section 3, along with a list of /protocols/ and /extensions/ to be used. Additionally, if the client is a web browser, an /origin/ MUST be supplied. @@ -1356,56 +863,56 @@ 5. If the response includes a "Sec-WebSocket-Extensions" header field, and this header field indicates the use of an extension that was not present in the client' handshake (the server has indicated an extension not requested by the client), the client MUST _Fail the WebSocket Connection_. (The parsing of this header field to determine which extensions are requested is discussed in Section 9.1.) If the server's response does not conform to the requirements for the - server's handshake as defined in this section and in Section 5.2.2, + server's handshake as defined in this section and in Section 4.2.2, the client MUST _Fail the WebSocket Connection_. If the server's response is validated as provided for above, it is said that _The WebSocket Connection is Established_ and that the WebSocket Connection is in the OPEN state. The _Extensions In Use_ is defined to be a (possibly empty) string, the value of which is equal to the value of the |Sec-WebSocket-Extensions| header field supplied by the server's handshake, or the null value if that header field was not present in the server's handshake. The _Subprotocol In Use_ is defined to be the value of the |Sec-WebSocket-Protocol| header field in the server's handshake, or the null value if that header field was not present in the server's handshake. 
Additionally, if any header fields in the server's handshake indicate that cookies should be set (as defined by [RFC6265]), these cookies are referred to as _Cookies Set During the Server's Opening Handshake_. -5.2. Server-side Requirements +4.2. Server-side Requirements _This section only applies to servers._ Servers MAY offload the management of the connection to other agents on the network, for example load balancers and reverse proxies. In such a situation, the server for the purposes of conformance is considered to include all parts of the server-side infrastructure from the first device to terminate the TCP connection all the way to the server that processes requests and sends responses. EXAMPLE: For example, a data center might have a server that responds to WebSocket requests with an appropriate handshake, and then passes the connection to another server to actually process the data frames. For the purposes of this specification, the "server" is the combination of both computers. -5.2.1. Reading the Client's Opening Handshake +4.2.1. Reading the Client's Opening Handshake When a client starts a WebSocket connection, it sends its part of the opening handshake. The server must parse at least part of this handshake in order to obtain the necessary information to generate the server part of the handshake. The client's opening handshake consists of the following parts. If the server, while reading the handshake, finds that the client did not send a handshake that matches the description below, including but not limited to any violations of the grammar (ABNF) specified for @@ -1435,21 +942,21 @@ speak, ordered by preference. 7. Optionally, a "Sec-WebSocket-Extensions" header field, with a list of values indicating which extensions the client would like to speak. The interpretation of this header field is discussed in Section 9.1. 8. Optionally, other header fields, such as those used to send cookies to a server. Unknown header fields MUST be ignored. -5.2.2. 
Sending the Server's Opening Handshake +4.2.2. Sending the Server's Opening Handshake When a client establishes a WebSocket connection to a server, the server MUST complete the following steps to accept the connection and send the server's opening handshake. 1. If the server supports encryption, perform a TLS handshake over the connection. If this fails (e.g. the client indicated a host name in the extended client hello "server_name" extension that the server does not host), then close the connection; otherwise, all further communication for the connection (including the @@ -1536,21 +1043,21 @@ [RFC2616]. Such a response could look like "HTTP/1.1 101 Switching Protocols" 2. An "Upgrade" header field with value "websocket" as per RFC 2616 [RFC2616]. 3. A "Connection" header field with value "Upgrade" 4. A "Sec-WebSocket-Accept" header field. The value of this header field is constructed by concatenating /key/, defined - above in Paragraph 2 of Section 5.2.2, with the string + above in Paragraph 2 of Section 4.2.2, with the string "258EAFA5-E914-47DA-95CA-C5AB0DC85B11", taking the SHA-1 hash of this concatenated value to obtain a 20-byte value, and base64-encoding (see Section 4 of [RFC4648]) this 20-byte hash. The ABNF of this header field is defined as follows: Sec-WebSocket-Accept = base64-value base64-value = *base64-data [ base64-padding ] base64-data = 4base64-character @@ -1564,35 +1071,35 @@ string "dGhlIHNhbXBsZSBub25jZQ==258EAFA5-E914-47DA-95CA- C5AB0DC85B11". The server would then take the SHA-1 hash of this string, giving the value 0xb3 0x7a 0x4f 0x2c 0xc0 0x62 0x4f 0x16 0x90 0xf6 0x46 0x06 0xcf 0x38 0x59 0x45 0xb2 0xbe 0xc4 0xea. This value is then base64-encoded, to give the value "s3pPLMBiTxaQ9kYGzzhZRbK+xOo=", which would be returned in the "Sec-WebSocket-Accept" header field. 5. Optionally, a "Sec-WebSocket-Protocol" header field, with a value /subprotocol/ as defined in Paragraph 2 of - Section 5.2.2. + Section 4.2.2. 6. 
Optionally, a "Sec-WebSocket-Extensions" header field, with a value /extensions/ as defined in Paragraph 2 of - Section 5.2.2. If multiple extensions are to be used, they + Section 4.2.2. If multiple extensions are to be used, they must all be listed in a single Sec-WebSocket-Extensions header field. This header field MUST NOT be repeated. This completes the server's handshake. If the server finishes these steps without aborting the WebSocket handshake, the server considers the WebSocket connection to be established and that the WebSocket connection is in the OPEN state. At this point, the server may begin sending (and receiving) data. -5.3. Collected ABNF for new header fields used in handshake +4.3. Collected ABNF for new header fields used in handshake Unlike other section of the document this section is using ABNF syntax/rules from [RFC2616], including "implied WSP rule". The following new header field can be sent during the handshake from the client to the server: Sec-WebSocket-Key = base64-value Sec-WebSocket-Extensions = extension-list Sec-WebSocket-Protocol-Client = 1#token @@ -1612,86 +1119,579 @@ ; 0-255 The following new header field can be sent during the handshake from the server to the client: Sec-WebSocket-Extensions = extension-list Sec-WebSocket-Accept = base64-value Sec-WebSocket-Protocol-Server = token Sec-WebSocket-Version-Server = 1#version +5. Data Framing + +5.1. Overview + + In the WebSocket protocol, data is transmitted using a sequence of + frames. Frames sent from the client to the server are masked to + avoid confusing network intermediaries, such as intercepting proxies. + Frames sent from the server to the client are not masked. + + The base framing protocol defines a frame type with an opcode, a + payload length, and designated locations for extension and + application data, which together define the _payload_ data. Certain + bits and opcodes are reserved for future expansion of the protocol. 
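The fields named in this overview are laid out on the wire as defined in Section 5.2. As a non-normative sketch only, the following Python function decodes the header of a single frame from a byte string. The names `FrameHeader` and `parse_frame_header` are hypothetical and do not appear in this specification. The sketch also assumes the whole header is already buffered, and it does not enforce the RSV or opcode checks that a conforming endpoint must apply.

```python
import struct
from dataclasses import dataclass
from typing import Optional


@dataclass
class FrameHeader:
    """Illustrative container for the header fields of Section 5.2."""
    fin: bool
    rsv: int                      # RSV1..RSV3 packed into three bits
    opcode: int
    masked: bool
    payload_length: int
    masking_key: Optional[bytes]  # 4 bytes if masked, otherwise None
    header_size: int              # bytes consumed by the header


def parse_frame_header(data: bytes) -> FrameHeader:
    """Decode one frame header; raise ValueError if it is incomplete."""
    if len(data) < 2:
        raise ValueError("a frame header is at least 2 bytes long")
    first, second = data[0], data[1]
    fin = bool(first & 0x80)
    rsv = (first >> 4) & 0x07
    opcode = first & 0x0F
    masked = bool(second & 0x80)
    length = second & 0x7F
    offset = 2

    if length == 126:
        # The next 2 bytes hold the length, in network byte order.
        if len(data) < offset + 2:
            raise ValueError("truncated 16-bit extended payload length")
        (length,) = struct.unpack_from("!H", data, offset)
        offset += 2
    elif length == 127:
        # The next 8 bytes hold the length; the top bit must be 0.
        if len(data) < offset + 8:
            raise ValueError("truncated 64-bit extended payload length")
        (length,) = struct.unpack_from("!Q", data, offset)
        if length >> 63:
            raise ValueError("most significant bit of length must be 0")
        offset += 8

    masking_key = None
    if masked:
        if len(data) < offset + 4:
            raise ValueError("truncated masking key")
        masking_key = data[offset:offset + 4]
        offset += 4

    return FrameHeader(fin, rsv, opcode, masked, length, masking_key, offset)
```

For instance, calling `parse_frame_header(bytes.fromhex("827e0100"))` on the header of the 256-byte binary example in Section 5.7 reports a payload length of 256 and a header size of 4 bytes.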
+
+   A data frame MAY be transmitted by either the client or the server at
+   any time after opening handshake completion and before that endpoint
+   has sent a close frame (Section 5.5.1).
+
+5.2.  Base Framing Protocol
+
+   This wire format for the data transfer part is described by the ABNF
+   [RFC5234] given in detail in this section.  A high level overview of
+   the framing is given in the following figure.
+
+      0                   1                   2                   3
+      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+     +-+-+-+-+-------+-+-------------+-------------------------------+
+     |F|R|R|R| opcode|M| Payload len |    Extended payload length    |
+     |I|S|S|S|  (4)  |A|     (7)     |             (16/63)           |
+     |N|V|V|V|       |S|             |   (if payload len==126/127)   |
+     | |1|2|3|       |K|             |                               |
+     +-+-+-+-+-------+-+-------------+ - - - - - - - - - - - - - - - +
+     |     Extended payload length continued, if payload len == 127  |
+     + - - - - - - - - - - - - - - - +-------------------------------+
+     |                               |Masking-key, if MASK set to 1  |
+     +-------------------------------+-------------------------------+
+     | Masking-key (continued)       |          Payload Data         |
+     +-------------------------------- - - - - - - - - - - - - - - - +
+     :                     Payload Data continued ...                :
+     + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +
+     |                     Payload Data continued ...                |
+     +---------------------------------------------------------------+
+
+   FIN:  1 bit
+
+      Indicates that this is the final fragment in a message.  The first
+      fragment MAY also be the final fragment.
+
+   RSV1, RSV2, RSV3:  1 bit each
+
+      MUST be 0 unless an extension is negotiated which defines meanings
+      for non-zero values.  If a nonzero value is received and none of
+      the negotiated extensions defines the meaning of such a nonzero
+      value, the receiving endpoint MUST _Fail the WebSocket
+      Connection_.
+
+   Opcode:  4 bits
+
+      Defines the interpretation of the payload data.  If an unknown
+      opcode is received, the receiving endpoint MUST _Fail the
+      WebSocket Connection_.  The following values are defined.
+
+      *  %x0 denotes a continuation frame
+
+      *  %x1 denotes a text frame
+
+      *  %x2 denotes a binary frame
+
+      *  %x3-7 are reserved for further non-control frames
+
+      *  %x8 denotes a connection close
+
+      *  %x9 denotes a ping
+
+      *  %xA denotes a pong
+
+      *  %xB-F are reserved for further control frames
+
+   Mask:  1 bit
+
+      Defines whether the payload data is masked.  If set to 1, a
+      masking key is present in masking-key, and this is used to unmask
+      the payload data as per Section 5.3.  All frames sent from client
+      to server have this bit set to 1.
+
+   Payload length:  7 bits, 7+16 bits, or 7+64 bits
+
+      The length of the payload data, in bytes: if 0-125, that is the
+      payload length.  If 126, the following 2 bytes interpreted as a 16
+      bit unsigned integer are the payload length.  If 127, the
+      following 8 bytes interpreted as a 64-bit unsigned integer (the
+      most significant bit MUST be 0) are the payload length.  Multibyte
+      length quantities are expressed in network byte order.  The
+      payload length is the length of the extension data + the length of
+      the application data.  The length of the extension data may be
+      zero, in which case the payload length is the length of the
+      application data.
+
+   Masking-key:  0 or 4 bytes
+
+      All frames sent from the client to the server are masked by a 32-
+      bit value that is contained within the frame.  This field is
+      present if the mask bit is set to 1, and is absent if the mask bit
+      is set to 0.  See Section 5.3 for further information on client-
+      to-server masking.
+
+   Payload data:  (x+y) bytes
+
+      The payload data is defined as extension data concatenated with
+      application data.
+
+   Extension data:  x bytes
+
+      The extension data is 0 bytes unless an extension has been
+      negotiated.  Any extension MUST specify the length of the
+      extension data, or how that length may be calculated, and how the
+      extension use MUST be negotiated during the opening handshake.  If
+      present, the extension data is included in the total payload
+      length.
+
+   Application data:  y bytes
+
+      Arbitrary application data, taking up the remainder of the frame
+      after any extension data.  The length of the application data is
+      equal to the payload length minus the length of the extension
+      data.
+
+   The base framing protocol is formally defined by the following ABNF
+   [RFC5234]:
+
+   ws-frame                = frame-fin
+                             frame-rsv1
+                             frame-rsv2
+                             frame-rsv3
+                             frame-opcode
+                             frame-masked
+                             frame-payload-length
+                             [ frame-masking-key ]
+                             frame-payload-data
+
+   frame-fin               = %x0 ; more frames of this message follow
+                           / %x1 ; final frame of this message
+
+   frame-rsv1              = %x0
+                             ; 1 bit, MUST be 0 unless negotiated otherwise
+
+   frame-rsv2              = %x0
+                             ; 1 bit, MUST be 0 unless negotiated otherwise
+
+   frame-rsv3              = %x0
+                             ; 1 bit, MUST be 0 unless negotiated otherwise
+
+   frame-opcode            = %x0 ; continuation frame
+                           / %x1 ; text frame
+                           / %x2 ; binary frame
+                           / %x3-7 ; reserved for further non-control frames
+                           / %x8 ; connection close
+                           / %x9 ; ping
+                           / %xA ; pong
+                           / %xB-F ; reserved for further control frames
+
+   frame-masked            = %x0 ; frame is not masked, no frame-masking-key
+                           / %x1 ; frame is masked, frame-masking-key present
+
+   frame-payload-length    = %x00-7D
+                           / %x7E frame-payload-length-16
+                           / %x7F frame-payload-length-63
+
+   frame-payload-length-16 = %x0000-FFFF
+
+   frame-payload-length-63 = %x0000000000000000-7FFFFFFFFFFFFFFF
+
+   frame-masking-key       = 4( %x00-FF ) ; present only if frame-masked is 1
+
+   frame-payload-data      = (frame-masked-extension-data
+                              frame-masked-application-data)   ; frame-masked 1
+                           / (frame-unmasked-extension-data
+                              frame-unmasked-application-data) ; frame-masked 0
+
+   frame-masked-extension-data     = *( %x00-FF ) ; to be defined later
+
+   frame-masked-application-data   = *( %x00-FF )
+
+   frame-unmasked-extension-data   = *( %x00-FF ) ; to be defined later
+
+   frame-unmasked-application-data = *( %x00-FF )
+
+5.3.  Client-to-Server Masking
+
+   The client MUST mask all frames sent to the server.  A server MUST
+   close the connection upon receiving a frame with the MASK bit set to
+   0.  In this case, a server MAY send a close frame with a status code
+   of 1002 (protocol error) as defined in Section 7.4.1.
+
+   A masked frame MUST have the field frame-masked set to 1, as defined
+   in Section 5.2.
+
+   The masking key is contained completely within the frame, as defined
+   in Section 5.2 as frame-masking-key.  It is used to mask the payload
+   data defined in the same section as frame-payload-data, which
+   includes extension and application data.
+
+   The masking key is a 32-bit value chosen at random by the client.
+   The masking key MUST be derived from a strong source of entropy, and
+   the masking key for a given frame MUST NOT make it simple for a
+   server to predict the masking key for a subsequent frame.  RFC 4086
+   [RFC4086] discusses what entails a suitable source of entropy for
+   security-sensitive applications.
+
+   The masking does not affect the length of the payload data.  To
+   convert masked data into unmasked data, or vice versa, the following
+   algorithm is applied.  The same algorithm applies regardless of the
+   direction of the translation - i.e., the same steps are applied to
+   mask the data as to unmask the data.
+
+   Octet i of the transformed data ("transformed-octet-i") is the XOR of
+   octet i of the original data ("original-octet-i") with octet at index
+   i modulo 4 of the masking key ("masking-key-octet-j"):
+
+     j                   = i MOD 4
+     transformed-octet-i = original-octet-i XOR masking-key-octet-j
+
+   When preparing a masked frame, the client MUST pick a fresh masking
+   key from the set of allowed 32-bit values.  The masking key must be
+   unpredictable.  The unpredictability of the masking key is essential
+   to prevent the author of malicious applications from selecting the
+   bytes that appear on the wire.
+
+   The payload length, indicated in the framing as frame-payload-length,
+   does NOT include the length of the masking key.  It is the length of
+   the payload data, i.e., the number of bytes following the masking key.
+
+5.4.  Fragmentation
+
+   The primary purpose of fragmentation is to allow sending a message
+   that is of unknown size when the message is started without having to
+   buffer that message.  If messages couldn't be fragmented, then an
+   endpoint would have to buffer the entire message so its length could
+   be counted before the first byte is sent.  With fragmentation, a server
+   or intermediary may choose a reasonable size buffer, and when the
+   buffer is full write a fragment to the network.
+
+   A secondary use-case for fragmentation is for multiplexing, where it
+   is not desirable for a large message on one logical channel to
+   monopolize the output channel, so the MUX needs to be free to split
+   the message into smaller fragments to better share the output
+   channel.
+
+   The following rules apply to fragmentation:
+
+   o  An unfragmented message consists of a single frame with the FIN
+      bit set and an opcode other than 0.
+
+   o  A fragmented message consists of a single frame with the FIN bit
+      clear and an opcode other than 0, followed by zero or more frames
+      with the FIN bit clear and the opcode set to 0, and terminated by
+      a single frame with the FIN bit set and an opcode of 0.  A
+      fragmented message is conceptually equivalent to a single larger
+      message whose payload is equal to the concatenation of the
+      payloads of the fragments in order, however in the presence of
+      extensions this may not hold true as the extension defines the
+      interpretation of the extension data present.  For instance,
+      extension data may only be present at the beginning of the first
+      fragment and apply to subsequent fragments, or there may be
+      extension data present in each of the fragments that applies only
+      to that particular fragment.  In absence of extension data, the
+      following example demonstrates how fragmentation works.
+
+      EXAMPLE: For a text message sent as three fragments, the first
+      fragment would have an opcode of 0x1 and a FIN bit clear, the
+      second fragment would have an opcode of 0x0 and a FIN bit clear,
+      and the third fragment would have an opcode of 0x0 and a FIN bit
+      that is set.
+
+   o  Control frames MAY be injected in the middle of a fragmented
+      message.  Control frames themselves MUST NOT be fragmented.
+
+   o  Message fragments MUST be delivered to the recipient in the order
+      sent by the sender.
+
+   o  The fragments of one message MUST NOT be interleaved between the
+      fragments of another message unless an extension has been
+      negotiated that can interpret the interleaving.
+
+   o  An endpoint MUST be capable of handling control frames in the
+      middle of a fragmented message.
+
+   o  A sender MAY create fragments of any size for non-control
+      messages.
+
+   o  Clients and servers MUST support receiving both fragmented and
+      unfragmented messages.
+
+   o  As control frames cannot be fragmented, an intermediary MUST NOT
+      attempt to change the fragmentation of a control frame.
+
+   o  An intermediary MUST NOT change the fragmentation of a message if
+      any reserved bit values are used and the meaning of these values
+      is not known to the intermediary.
+
+   o  An intermediary MUST NOT change the fragmentation of any message
+      in the context of a connection where extensions have been
+      negotiated and the intermediary is not aware of the semantics of
+      the negotiated extensions.
+
+   o  As a consequence of these rules, all fragments of a message are of
+      the same type, as set by the first fragment's opcode.  Since
+      Control frames cannot be fragmented, the type for all fragments in
+      a message MUST be either text or binary, or one of the reserved
+      opcodes.
+
+   _Note: if control frames could not be interjected, the latency of a
+   ping, for example, would be very long if behind a large message.
+   Hence, the requirement of handling control frames in the middle of a
+   fragmented message._
+
+5.5.  Control Frames
+
+   Control frames are identified by opcodes where the most significant
+   bit of the opcode is 1.  Currently defined opcodes for control frames
+   include 0x8 (Close), 0x9 (Ping), and 0xA (Pong).  Opcodes 0xB-0xF are
+   reserved for further control frames yet to be defined.
+
+   Control frames are used to communicate state about the WebSocket.
+   Control frames can be interjected in the middle of a fragmented
+   message.
+
+   All control frames MUST have a payload length of 125 bytes or less
+   and MUST NOT be fragmented.
+
+5.5.1.  Close
+
+   The Close frame contains an opcode of 0x8.
+
+   The Close frame MAY contain a body (the "application data" portion of
+   the frame) that indicates a reason for closing, such as an endpoint
+   shutting down, an endpoint having received a frame too large, or an
+   endpoint having received a frame that does not conform to the format
+   expected by the other endpoint.  If there is a body, the first two
+   bytes of the body MUST be a 2-byte unsigned integer (in network byte
+   order) representing a status code with value /code/ defined in
+   Section 7.4.  Following the 2-byte integer the body MAY contain UTF-8
+   encoded data with value /reason/, the interpretation of which is not
+   defined by this specification.  This data is not necessarily human
+   readable, but may be useful for debugging or passing information
+   relevant to the script that opened the connection.  As the data is
+   not guaranteed to be human readable, clients MUST NOT show it to end
+   users.
+
+   Close frames sent from client to server must be masked as per
+   Section 5.3.
+
+   The application MUST NOT send any more data frames after sending a
+   close frame.
+
+   If an endpoint receives a Close frame and that endpoint did not
+   previously send a Close frame, the endpoint MUST send a Close frame
+   in response.  It SHOULD do so as soon as is practical.  An endpoint
+   MAY delay sending a close frame until its current message is sent
+   (for instance, if the majority of a fragmented message is already
+   sent, an endpoint MAY send the remaining fragments before sending a
+   Close frame).  However, there is no guarantee that the endpoint which
+   has already sent a Close frame will continue to process data.
+
+   After both sending and receiving a close message, an endpoint
+   considers the WebSocket connection closed, and MUST close the
+   underlying TCP connection.  The server MUST close the underlying TCP
+   connection immediately; the client SHOULD wait for the server to
+   close the connection but MAY close the connection at any time after
+   sending and receiving a close message, e.g. if it has not received a
+   TCP close from the server in a reasonable time period.
+
+   If a client and server both send a Close message at the same time,
+   both endpoints will have sent and received a Close message and should
+   consider the WebSocket connection closed and close the underlying TCP
+   connection.
+
+5.5.2.  Ping
+
+   The Ping frame contains an opcode of 0x9.
+
+   Upon receipt of a Ping frame, an endpoint MUST send a Pong frame in
+   response.  It SHOULD do so as soon as is practical.  Pong frames are
+   discussed in Section 5.5.3.
+
+   An endpoint MAY send a Ping frame any time after the connection is
+   established and before the connection is closed.  NOTE: A ping frame
+   may serve either as a keepalive, or to verify that the remote
+   endpoint is still responsive.
+
+5.5.3.  Pong
+
+   The Pong frame contains an opcode of 0xA.
+
+   Section 5.5.2 details requirements that apply to both Ping and Pong
+   frames.
+
+   A Pong frame sent in response to a Ping frame must have identical
+   Application Data as found in the message body of the Ping frame being
+   replied to.
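As a non-normative sketch of the echo requirement above, the following Python code builds a Pong frame that carries the same application data as a received Ping, applying the masking algorithm of Section 5.3 when the sender is a client. The function names `apply_mask`, `build_control_frame`, and `pong_for` are hypothetical and are not defined by this specification. The use of `os.urandom` as the entropy source is an assumption of this sketch, and extension data is assumed to be absent.

```python
import os
from typing import Optional

OPCODE_PING = 0x9
OPCODE_PONG = 0xA


def apply_mask(data: bytes, key: bytes) -> bytes:
    """XOR octet i of data with octet (i MOD 4) of the masking key.

    The same call both masks and unmasks, as described in Section 5.3.
    """
    return bytes(octet ^ key[i % 4] for i, octet in enumerate(data))


def build_control_frame(opcode: int, payload: bytes = b"",
                        masking_key: Optional[bytes] = None) -> bytes:
    """Encode a single control frame, masked if a key is supplied."""
    if not 0x8 <= opcode <= 0xF:
        raise ValueError("control opcodes have the most significant bit set")
    if len(payload) > 125:
        raise ValueError("control frame payload is limited to 125 bytes")
    first = 0x80 | opcode  # FIN set: control frames are never fragmented
    if masking_key is None:
        return bytes([first, len(payload)]) + payload
    if len(masking_key) != 4:
        raise ValueError("masking key must be exactly 4 bytes")
    header = bytes([first, 0x80 | len(payload)])  # MASK bit set
    return header + masking_key + apply_mask(payload, masking_key)


def pong_for(ping_payload: bytes, sender_is_client: bool) -> bytes:
    """Build the Pong that answers a Ping carrying ping_payload."""
    # Clients pick a fresh, unpredictable key for every frame.
    key = os.urandom(4) if sender_is_client else None
    return build_control_frame(OPCODE_PONG, ping_payload, key)
```

With the fixed key 0x37 0xfa 0x21 0x3d used in Section 5.7, `build_control_frame(OPCODE_PONG, b"Hello", bytes.fromhex("37fa213d"))` yields the masked Ping response shown there. A fixed key is suitable only for reproducing that example; a real client would let `pong_for` draw a new key each time.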
+
+   If an endpoint receives a Ping frame and has not yet sent Pong
+   frame(s) in response to previous Ping frame(s), the endpoint MAY
+   elect to send a Pong frame for only the most recently processed Ping
+   frame.
+
+   A Pong frame MAY be sent unsolicited.  This serves as a
+   unidirectional heartbeat.  A response to an unsolicited pong is not
+   expected.
+
+5.6.  Data Frames
+
+   Data frames (i.e., non-control frames) are identified by opcodes where
+   the most significant bit of the opcode is 0.  Currently defined
+   opcodes for data frames include 0x1 (Text), 0x2 (Binary).  Opcodes
+   0x3-0x7 are reserved for further non-control frames yet to be
+   defined.
+
+   Data frames carry application-layer and/or extension-layer data.  The
+   opcode determines the interpretation of the data:
+
+   Text
+
+      The payload data is text data encoded as UTF-8.  Note that a
+      particular text frame might include a partial UTF-8 sequence,
+      however the whole message MUST contain valid UTF-8.
+
+   Binary
+
+      The payload data is arbitrary binary data whose interpretation is
+      solely up to the application layer.
+
+5.7.  Examples
+
+   _This section is non-normative._
+
+   o  A single-frame unmasked text message
+
+      *  0x81 0x05 0x48 0x65 0x6c 0x6c 0x6f (contains "Hello")
+
+   o  A single-frame masked text message
+
+      *  0x81 0x85 0x37 0xfa 0x21 0x3d 0x7f 0x9f 0x4d 0x51 0x58
+         (contains "Hello")
+
+   o  A fragmented unmasked text message
+
+      *  0x01 0x03 0x48 0x65 0x6c (contains "Hel")
+
+      *  0x80 0x02 0x6c 0x6f (contains "lo")
+
+   o  Unmasked Ping request and masked Ping response
+
+      *  0x89 0x05 0x48 0x65 0x6c 0x6c 0x6f (contains a body of "Hello",
+         but the contents of the body are arbitrary)
+
+      *  0x8a 0x85 0x37 0xfa 0x21 0x3d 0x7f 0x9f 0x4d 0x51 0x58
+         (contains a body of "Hello", matching the body of the ping)
+
+   o  256 bytes binary message in a single unmasked frame
+
+      *  0x82 0x7E 0x0100 [256 bytes of binary data]
+
+   o  64KiB binary message in a single unmasked frame
+
+      *  0x82 0x7F 0x0000000000010000 [65536 bytes of binary data]
+
+5.8.  Extensibility
+
+   The protocol is designed to allow for extensions, which will add
+   capabilities to the base protocol.  The endpoints of a connection
+   MUST negotiate the use of any extensions during the opening
+   handshake.  This specification provides opcodes 0x3 through 0x7 and
+   0xB through 0xF, the extension data field, and the frame-rsv1, frame-
+   rsv2, and frame-rsv3 bits of the frame header for use by extensions.
+   The negotiation of extensions is discussed in further detail in
+   Section 9.1.  Below are some anticipated uses of extensions.  This
+   list is neither complete nor prescriptive.
+
+   o  Extension data may be placed in the payload data before the
+      application data.
+
+   o  Reserved bits can be allocated for per-frame needs.
+
+   o  Reserved opcode values can be defined.
+
+   o  Reserved bits can be allocated to the opcode field if more opcode
+      values are needed.
+
+   o  A reserved bit or an "extension" opcode can be defined which
+      allocates additional bits out of the payload data to define larger
+      opcodes or more per-frame bits.
+
 6.
Sending and Receiving Data 6.1. Sending Data To _Send a WebSocket Message_ comprising of /data/ over a WebSocket connection, an endpoint MUST perform the following steps. 1. The endpoint MUST ensure the WebSocket connection is in the OPEN - state (cf. Section 5.1 and Section 5.2.2.) If at any point the + state (cf. Section 4.1 and Section 4.2.2.) If at any point the state of the WebSocket connection changes, the endpoint MUST abort the following steps. 2. An endpoint MUST encapsulate the /data/ in a WebSocket frame as - defined in Section 4.2. If the data to be sent is large, or if + defined in Section 5.2. If the data to be sent is large, or if the data is not available in its entirety at the point the endpoint wishes to begin sending the data, the endpoint MAY alternately encapsulate the data in a series of frames as defined - in Section 4.4. + in Section 5.4. 3. The opcode (frame-opcode) of the first frame containing the data - MUST be set to the appropriate value from Section 4.2 for data + MUST be set to the appropriate value from Section 5.2 for data that is to be interpreted by the recipient as text or binary data. 4. The FIN bit (frame-fin) of the last frame containing the data - MUST be set to 1 as defined in Section 4.2. + MUST be set to 1 as defined in Section 5.2. 5. If the data is being sent by the client, the frame(s) MUST be - masked as defined in Section 4.3. + masked as defined in Section 5.3. 6. If any extensions (Section 9) have been negotiated for the WebSocket connection, additional considerations may apply as per the definition of those extensions. 7. The frame(s) that have been formed MUST be transmitted over the underlying network connection. 6.2. Receiving Data To receive WebSocket data, an endpoint listens on the underlying network connection. Incoming data MUST be parsed as WebSocket frames - as defined in Section 4.2. If a control frame (Section 4.5) is - received, the frame MUST be handled as defined by Section 4.5. 
Upon - receiving a data frame (Section 4.6), the endpoint MUST note the + as defined in Section 5.2. If a control frame (Section 5.5) is + received, the frame MUST be handled as defined by Section 5.5. Upon + receiving a data frame (Section 5.6), the endpoint MUST note the /type/ of the data as defined by the Opcode (frame-opcode) from - Section 4.2. The _Application Data_ from this frame is defined as + Section 5.2. The _Application Data_ from this frame is defined as the /data/ of the message. If the frame comprises an unfragmented - message (Section 4.4), it is said that _A WebSocket Message Has Been + message (Section 5.4), it is said that _A WebSocket Message Has Been Received_ with type /type/ and data /data/. If the frame is part of a fragmented message, the _Application Data_ of the subsequent data frames is concatenated to form the /data/. When the last fragment is received as indicated by the FIN bit (frame-fin), it is said that _A WebSocket Message Has Been Received_ with data /data/ (comprised of the concatenation of the _Application Data_ of the fragments) and type /type/ (noted from the first frame of the fragmented message). Subsequent data frames MUST be interpreted as belonging to a new WebSocket Message. Extensions (Section 9) MAY change the semantics of how data is read, specifically including what comprises a message boundary. Extensions, in addition to adding "Extension data" before the "Application data" in a payload, MAY also modify the "Application data" (such as by compressing it). A server MUST remove masking for data frames received from a client - as described in Section 4.3. + as described in Section 5.3. 7. Closing the connection 7.1. Definitions 7.1.1. Close the WebSocket Connection To _Close the WebSocket Connection_, an endpoint closes the underlying TCP connection. 
An endpoint SHOULD use a method that cleanly closes the TCP connection, as well as the TLS session, if @@ -1715,21 +1715,21 @@ sockets, one would call shutdown() with SHUT_WR on the socket, call recv() until obtaining a return value of 0 indicating that the peer has also performed an orderly shutdown, and finally calling close() on the socket. 7.1.2. Start the WebSocket Closing Handshake To _Start the WebSocket Closing Handshake_ with a status code (Section 7.4) /code/ and an optional close reason (Section 7.1.6) /reason/, an endpoint MUST send a Close control frame, as described - in Section 4.5.1 whose status code is set to /code/ and whose close + in Section 5.5.1 whose status code is set to /code/ and whose close reason is set to /reason/. Once an endpoint has both sent and received a Close control frame, that endpoint SHOULD _Close the WebSocket Connection_ as defined in Section 7.1.1. 7.1.3. The WebSocket Closing Handshake is Started Upon either sending or receiving a Close control frame, it is said that _The WebSocket Closing Handshake is Started_ and that the WebSocket connection is in the CLOSING state. @@ -1739,21 +1739,21 @@ WebSocket Connection is Closed_ and that the WebSocket connection is in the CLOSED state. If the tcp connection was closed after the WebSocket closing handshake was completed, the WebSocket connection is said to have been closed _cleanly_. If the WebSocket connection could not be established, it is also said that _The WebSocket Connection is Closed_, but not cleanly. 7.1.5. The WebSocket Connection Close Code - As defined in Section 4.5.1 and Section 7.4, a Close control frame + As defined in Section 5.5.1 and Section 7.4, a Close control frame may contain a status code indicating a reason for closure. A closing of the WebSocket connection may be initiated by either endpoint, potentially simultaneously. 
_The WebSocket Connection Close Code_ is defined as the status code (Section 7.4) contained in the first Close control frame received by the application implementing this protocol. If this Close control frame contains no status code, _The WebSocket Connection Close Code_ is considered to be 1005. If _The WebSocket Connection is Closed_ and no Close control frame was received by the endpoint (such as could occur if the underlying transport connection is lost), _The WebSocket Connection Close Code_ is considered to be @@ -1767,21 +1767,21 @@ send a Close frame, both endpoints will have sent and received a Close frame, and will not send further Close frames. Each endpoint will see the Connection Close Code sent by the other end as the _WebSocket Connection Close Code_. As such, it is possible that the two endpoints may not agree on the value of _The WebSocket Connection Close Code_ in the case that both endpoints _Start the WebSocket Closing Handshake_ independently and at roughly the same time. 7.1.6. The WebSocket Connection Close Reason - As defined in Section 4.5.1 and Section 7.4, a Close control frame + As defined in Section 5.5.1 and Section 7.4, a Close control frame may contain a status code indicating a reason for closure, followed by UTF-8 encoded data, the interpretation of said data being left to the endpoints and not defined by this protocol. A closing of the WebSocket connection may be initiated by either endpoint, potentially simultaneously. _The WebSocket Connection Close Reason_ is defined as the UTF-8 encoded data following the status code (Section 7.4) contained in the first Close control frame received by the application implementing this protocol. If there is no such data in the Close control frame, _The WebSocket Connection Close Reason_ is the empty string. @@ -2449,21 +2449,21 @@ WebSocket Subprotocol names to be used with the WebSocket protocol in accordance with the principles set out in RFC 5226 [RFC5226]. 
As part of this registry IANA will maintain the following information: Subprotocol Identifier The identifier of the subprotocol, as will be used in the Sec- WebSocket-Protocol header field registered in Section 11.3.4 of this specification. The value must conform to the requirements - given in Paragraph 10 of Section 5.1 of this specification, namely + given in Paragraph 10 of Section 4.1 of this specification, namely the value must be a token as defined by RFC 2616 [RFC2616]. Subprotocol Common Name The name of the subprotocol, as the subprotocol is generally referred to. Subprotocol Definition A reference to the document in which the subprotocol being used with the WebSocket protocol is defined. @@ -2474,21 +2474,21 @@ This specification requests the creation of a new IANA registry for WebSocket Version Numbers to be used with the WebSocket protocol in accordance with the principles set out in RFC 5226 [RFC5226]. As part of this registry IANA will maintain the following information: Version Number The version number to be used in the Sec-WebSocket-Version as - specified in Section 5.1 of this specification. The value must be + specified in Section 4.1 of this specification. The value must be a non negative integer in the range between 0 and 255 (inclusive). Reference The RFC requesting a new version number. WebSocket Version Numbers are to be subject to "IETF Review" IANA registration policy [RFC5226]. In order to improve interoperability with intermediate versions published in Internet Drafts, version numbers associated with such drafts might be registered in this registry. Note that "IETF Review" applies to registrations @@ -2594,21 +2594,21 @@ This specification requests the creation of a new IANA registry for WebSocket Opcodes in accordance with the principles set out in RFC 5226 [RFC5226]. As part of this registry IANA will maintain the following information: Opcode The opcode denotes the frame type of the WebSocket frame, as - defined in Section 4.2. 
The opcode is an integer number
+      defined in Section 5.2.  The opcode is an integer number
       between 0 and 15, inclusive.

    Meaning
       The meaning of the opcode.

    Reference
       The specification requesting the opcode.

    WebSocket Opcode numbers are subject to "Standards Action" IANA
    registration policy [RFC5226].
@@ -2630,21 +2630,21 @@
      | 9      | Ping Frame                          | RFC XXXX  |
     -+--------+-------------------------------------+-----------|
      | 10     | Pong Frame                          | RFC XXXX  |
     -+--------+-------------------------------------+-----------|

 11.9.  WebSocket Framing Header Bits Registry

    This specification requests the creation of a new IANA registry for
    WebSocket Framing Header Bits in accordance with the principles set
    out in RFC 5226 [RFC5226].  This registry controls assignment of the
-   bits marked RSV1, RSV2, and RSV3 in Section 4.2.
+   bits marked RSV1, RSV2, and RSV3 in Section 5.2.

    These bits are reserved for future versions or extensions of this
    specification.

    WebSocket Framing Header Bits assignments are subject to "Standards
    Action" IANA registration policy [RFC5226].

 12.  Using the WebSocket protocol from Other Specifications

    The WebSocket protocol is intended to be used by another