Network Working Group                                         I. Hickson
Internet-Draft                                              Google, Inc.
Intended status: Standards Track                        October 23, 2009
Expires: April 26, 2010


                        The Web Socket protocol
                  draft-hixie-thewebsocketprotocol-54

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on April 26, 2010.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.









Hickson                  Expires April 26, 2010                 [Page 1]


Internet-Draft           The Web Socket protocol            October 2009


Abstract

   The Web Socket protocol enables two-way communication between a user
   agent running untrusted code in a controlled environment and a remote
   host that has opted in to communications from that code.  The
   protocol consists of an initial handshake followed by basic message
   framing, layered over TCP.  The goal of this technology is to provide
   a mechanism for browser-based applications that need two-way
   communication with servers that does not rely on opening multiple
   HTTP connections (e.g. using XMLHttpRequest or <iframe>s and long
   polling).

Author's note

   This document is automatically generated from the same source
   document as the HTML5 specification.  [HTML5]

   Please send feedback to either the hybi@ietf.org list or the
   whatwg@whatwg.org list.


Table of Contents

   1.  Introduction
     1.1.  Background
     1.2.  Protocol overview
     1.3.  Design philosophy
     1.4.  Security model
     1.5.  Relationship to TCP/IP and HTTP
     1.6.  Establishing a connection
   2.  Conformance requirements
     2.1.  Terminology
   3.  Web Socket URLs
     3.1.  Parsing Web Socket URLs
     3.2.  Constructing Web Socket URLs
   4.  Client-side requirements
     4.1.  Handshake
     4.2.  Data framing
     4.3.  Closing the connection
     4.4.  Handling errors in UTF-8
   5.  Server-side requirements
     5.1.  Minimal handshake
     5.2.  Handshake details
     5.3.  Data framing
   6.  Closing the connection
   7.  Security considerations
   8.  IANA considerations
     8.1.  Registration of ws: scheme
     8.2.  Registration of wss: scheme
     8.3.  Registration of the "WebSocket" HTTP Upgrade keyword
     8.4.  WebSocket-Origin
     8.5.  WebSocket-Protocol
     8.6.  WebSocket-Location
   9.  Using the Web Socket protocol from other specifications
   10. Normative References
   Author's Address








1.  Introduction

1.1.  Background

   _This section is non-normative._

   Historically, creating an instant messenger chat client as a Web
   application has required an abuse of HTTP to poll the server for
   updates while sending upstream notifications as distinct HTTP calls.

   This results in a variety of problems:

   o  The server is forced to use a number of different underlying TCP
      connections for each client: one for sending information to the
      client, and a new one for each incoming message.

   o  The wire protocol has a high overhead, with each client-to-server
      message having an HTTP header.

   o  The client-side script is forced to maintain a mapping from the
      outgoing connections to the incoming connection to track replies.

   A simpler solution would be to use a single TCP connection for
   traffic in both directions.  This is what the Web Socket protocol
   provides.  Combined with the Web Socket API, it provides an
   alternative to HTTP polling for two-way communication from a Web page
   to a remote server.  [WSAPI]

   The same technique can be used for a variety of Web applications:
   games, stock tickers, multiuser applications with simultaneous
   editing, user interfaces exposing server-side services in real time,
   etc.

1.2.  Protocol overview

   _This section is non-normative._

   The protocol has two parts: a handshake, and then the data transfer.

   The handshake from the client looks as follows:

        GET /demo HTTP/1.1
        Upgrade: WebSocket
        Connection: Upgrade
        Host: example.com
        Origin: http://example.com
        WebSocket-Protocol: sample

   The handshake from the server looks as follows:

        HTTP/1.1 101 Web Socket Protocol Handshake
        Upgrade: WebSocket
        Connection: Upgrade
        WebSocket-Origin: http://example.com
        WebSocket-Location: ws://example.com/demo
        WebSocket-Protocol: sample

   The first three lines in each case are hard-coded (the exact case and
   order matter); the remainder are an unordered ASCII case-insensitive
   set of fields, one per line, that match the following non-normative
   ABNF: [RFC5234]

     field         = 1*name-char colon [ space ] *any-char cr lf
     colon         = %x003A ; U+003A COLON (:)
     space         = %x0020 ; U+0020 SPACE
     cr            = %x000D ; U+000D CARRIAGE RETURN (CR)
     lf            = %x000A ; U+000A LINE FEED (LF)
     name-char     = %x0000-0009 / %x000B-000C / %x000E-0039 / %x003B-10FFFF
                     ; a Unicode character other than U+000A LINE FEED (LF),
                     ; U+000D CARRIAGE RETURN (CR), or U+003A COLON (:)
     any-char      = %x0000-0009 / %x000B-000C / %x000E-10FFFF
                     ; a Unicode character other than U+000A LINE FEED (LF)
                     ; or U+000D CARRIAGE RETURN (CR)

   Lines that don't match the above production cause the connection to
   be aborted.
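
   As a non-normative illustration, the check described above might be
   sketched in Python as follows.  The names parse_fields and
   parse_server_handshake are hypothetical and not part of this
   specification.  The sketch assumes the handshake has already been
   decoded to a string containing only the lines shown in the example;
   how the end of the handshake is detected is not covered in this
   overview.

```python
import re
from typing import Dict

# The three hard-coded lines of the server's handshake, as shown above.
# Case and order are significant, so they are compared byte for byte.
SERVER_PREFIX = (
    "HTTP/1.1 101 Web Socket Protocol Handshake\r\n"
    "Upgrade: WebSocket\r\n"
    "Connection: Upgrade\r\n"
)

# field = 1*name-char colon [ space ] *any-char cr lf
FIELD_RE = re.compile(r"([^\n\r:]+):( ?)([^\n\r]*)\r\n")

# Maps U+0041..U+005A to U+0061..U+007A and leaves everything else alone,
# so that field names can be compared ASCII case-insensitively.
_ASCII_LOWER = {cp: cp + 0x20 for cp in range(0x41, 0x5B)}


def parse_fields(block: str) -> Dict[str, str]:
    """Parse CRLF-terminated field lines into a dict keyed by lowercased name.

    Raises ValueError on a line that does not match the production; a real
    endpoint would abort the connection at that point.  If a name repeats,
    the last value wins (an assumption of this sketch).
    """
    fields: Dict[str, str] = {}
    pos = 0
    while pos < len(block):
        match = FIELD_RE.match(block, pos)
        if match is None:
            raise ValueError("malformed field at offset %d" % pos)
        name, _space, value = match.groups()
        fields[name.translate(_ASCII_LOWER)] = value
        pos = match.end()
    return fields


def parse_server_handshake(text: str) -> Dict[str, str]:
    """Check the three fixed lines, then parse the remaining fields."""
    if not text.startswith(SERVER_PREFIX):
        raise ValueError("leading handshake lines do not match")
    return parse_fields(text[len(SERVER_PREFIX):])
```

   Because the fields are unordered, the result is a mapping rather than
   a list; a caller would look up "websocket-origin" and the other
   fields by their lowercased names.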

   Once the client and server have both sent their handshakes, and if
   the handshake was successful, then the data transfer part starts.
   This is a two-way communication channel where each side can,
   independently from the other, send data at will.

   Data is sent in the form of UTF-8 text.  Each frame of data starts
   with a 0x00 byte and ends with a 0xFF byte, with the UTF-8 text in
   between.
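
   A minimal, non-normative Python sketch of this text framing is given
   here.  The function names are hypothetical.  The scheme works
   because the byte 0xFF never occurs in valid UTF-8, so it cannot be
   confused with message content.

```python
def encode_text_frame(text: str) -> bytes:
    """Wrap a string as 0x00, the UTF-8 bytes of the text, then 0xFF."""
    return b"\x00" + text.encode("utf-8") + b"\xff"


def decode_text_frame(frame: bytes) -> str:
    """Unwrap a single complete text frame and return its text."""
    if len(frame) < 2 or frame[0] != 0x00 or frame[-1] != 0xFF:
        raise ValueError("not a complete 0x00 ... 0xFF frame")
    # A strict decode raises UnicodeDecodeError on invalid UTF-8; how such
    # errors ought to be handled is outside the scope of this sketch.
    return frame[1:-1].decode("utf-8")
```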

   The Web Socket protocol uses this framing so that specifications that
   use the Web Socket protocol can expose such connections using an
   event-based mechanism instead of requiring users of those
   specifications to implement buffering and piecing together of
   messages manually.


   The protocol is designed to support other frame types in future.
   Instead of the 0x00 byte, other bytes might in future be defined.
   Frames denoted by bytes that do not have the high bit set (0x00 to
   0x7F) are treated as described above (a stream of bytes terminated by
   0xFF).  Frames denoted by bytes that have the high bit set (0x80 to
   0xFF) have a leading length indicator, which is encoded as a series
   of 7-bit bytes stored in octets with the 8th bit being set for all
   but the last byte.  The remainder of the frame is then as much data
   as was specified.


   The following diagram summarises the protocol:

        Handshake
           |
          \|/
        Frame type byte <-------------------------------------.
           |      |                                           |
           |      `-- (0x00 to 0x7F) --> Data... --> 0xFF -->-+
           |                                                  |
           `-- (0x80 to 0xFF) --> Length --> Data... ------->-'
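
   The loop in the diagram might be sketched, non-normatively, as the
   Python generator below.  The name read_frames is hypothetical.  The
   sketch assumes that the 7-bit groups of the length indicator are
   sent most significant group first, which this overview does not
   state, and that read(n) returns n bytes unless the stream has ended
   (true of io.BytesIO, not of a raw socket).

```python
import io
from typing import BinaryIO, Iterator, Tuple


def read_frames(stream: BinaryIO) -> Iterator[Tuple[int, bytes]]:
    """Yield (frame type byte, payload) pairs until the stream ends."""
    while True:
        first = stream.read(1)
        if not first:
            return  # clean end of stream between frames
        frame_type = first[0]
        if frame_type & 0x80:
            # High bit set: a length indicator precedes the data.
            length = 0
            while True:
                b = stream.read(1)
                if not b:
                    raise EOFError("stream ended inside a length indicator")
                # Assumption: most significant 7-bit group comes first.
                length = length * 128 + (b[0] & 0x7F)
                if (b[0] & 0x80) == 0:
                    break  # the last length byte has its 8th bit clear
            payload = stream.read(length)
            if len(payload) != length:
                raise EOFError("stream ended inside frame data")
        else:
            # High bit clear: data runs up to the terminating 0xFF byte.
            buf = bytearray()
            while True:
                b = stream.read(1)
                if not b:
                    raise EOFError("stream ended before 0xFF terminator")
                if b[0] == 0xFF:
                    break
                buf.append(b[0])
            payload = bytes(buf)
        yield frame_type, payload
```

   For example, list(read_frames(io.BytesIO(data))) collects every
   frame in a byte string; an event-based API would instead dispatch
   each pair as it is produced.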

1.3.  Design philosophy

   _This section is non-normative._

   The Web Socket protocol is designed on the principle that there
   should be minimal framing (the only framing that exists is to make
   the protocol frame-based instead of stream-based, and to support a
   distinction between Unicode text and binary frames).  It is expected
   that metadata would be layered on top of Web Socket by the
   application layer, in the same way that metadata is layered on top of
   TCP/IP by the application layer (HTTP).

   Conceptually, Web Socket is really just a layer on top of TCP/IP that
   adds a Web "origin"-based security model for browsers; adds an
   addressing and protocol naming mechanism to support multiple services
   on one port and multiple host names on one IP address; and layers a
   framing mechanism on top of TCP to get back to the IP packet
   mechanism that TCP is built on, but without length limits.  Other
   than that, it adds nothing.  Basically it is intended to be as close
   to just exposing raw TCP/IP to script as possible given the
   constraints of the Web.  It's also designed in such a way that its
   servers can share a port with HTTP servers, by having its handshake
   also be a valid HTTP Upgrade handshake.

1.4.  Security model

   _This section is non-normative._

   The Web Socket protocol uses the origin model used by Web browsers to
   restrict which Web pages can contact a Web Socket server when the Web
   Socket protocol is used from a Web page.  Naturally, when the Web
   Socket protocol is used directly (not from a Web page), the origin
   model is not useful, as the client can provide any arbitrary origin
   string.

   This protocol is intended to fail to establish a connection with
   servers of pre-existing protocols like SMTP or HTTP, while allowing
   HTTP servers to opt-in to supporting this protocol if desired.  This
   is achieved by having a strict and elaborate handshake, and by
   limiting the data that can be inserted into the connection before the
   handshake is finished (thus limiting how much the server can be
   influenced).

1.5.  Relationship to TCP/IP and HTTP

   _This section is non-normative._

   The Web Socket protocol is an independent TCP-based protocol.  Its
   only relationship to HTTP is that its handshake is interpreted by
   HTTP servers as an Upgrade request.

   Based on the expert recommendation of the IANA, the Web Socket
   protocol by default uses port 80 for regular Web Socket connections
   and port 443 for Web Socket connections tunneled over TLS.
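
   As a non-normative sketch, a client might choose the port as shown
   below.  The name endpoint is hypothetical, and Python's
   urllib.parse.urlsplit is used purely for brevity; it is an
   assumption of this sketch and not the Web address resolution
   algorithm that the URL parsing rules of this specification call for.

```python
from typing import Tuple
from urllib.parse import urlsplit

# Default ports stated above: 80 for regular connections, 443 over TLS.
DEFAULT_PORTS = {"ws": 80, "wss": 443}


def endpoint(url: str) -> Tuple[str, int, bool]:
    """Return (host, port, secure) for a ws: or wss: URL."""
    parts = urlsplit(url)
    scheme = parts.scheme  # urlsplit already lowercases the scheme
    if scheme not in DEFAULT_PORTS:
        raise ValueError("not a Web Socket URL: %r" % url)
    if not parts.hostname:
        raise ValueError("URL has no host: %r" % url)
    # Fall back to the scheme's default only when no port is given.
    port = parts.port if parts.port is not None else DEFAULT_PORTS[scheme]
    return parts.hostname, port, scheme == "wss"
```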

1.6.  Establishing a connection

   _This section is non-normative._

   There are several options for establishing a Web Socket connection.

   On the face of it, the simplest method would seem to be to use port
   80 to get a direct connection to a Web Socket server.  Port 80
   traffic, however, will often be intercepted by HTTP proxies, which
   can lead to the connection failing to be established.

   The most reliable method, therefore, is to use TLS encryption and
   port 443 to connect directly to a Web Socket server.  This has the
   advantage of being more secure; however, TLS encryption can be
   computationally expensive.

   When a connection is to be made to a port that is shared by an HTTP
   server (a situation that is quite likely to occur with traffic to
   ports 80 and 443), the connection will appear to the HTTP server to
   be a regular GET request with an Upgrade offer.  In relatively simple
   setups with just one IP address and a single server for all traffic
   to a single hostname, this might be a practical way to deploy systems
   based on the Web Socket protocol.  In more elaborate
   setups (e.g. with load balancers and multiple servers), a dedicated
   set of hosts for Web Socket connections separate from the HTTP
   servers is probably easier to manage.

2.  Conformance requirements

   All diagrams, examples, and notes in this specification are non-
   normative, as are all sections explicitly marked non-normative.
   Everything else in this specification is normative.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHOULD", "SHOULD NOT",
   "RECOMMENDED", "MAY", and "OPTIONAL" in the normative parts of this
   document are to be interpreted as described in RFC2119.  For
   readability, these words do not appear in all uppercase letters in
   this specification.  [RFC2119]

   Requirements phrased in the imperative as part of algorithms (such as
   "strip any leading space characters" or "return false and abort these
   steps") are to be interpreted with the meaning of the key word
   ("must", "should", "may", etc.) used in introducing the algorithm.

   Conformance requirements phrased as algorithms or specific steps may
   be implemented in any manner, so long as the end result is
   equivalent.  (In particular, the algorithms defined in this
   specification are intended to be easy to follow, and not intended to
   be performant.)

   Implementations may impose implementation-specific limits on
   otherwise unconstrained inputs, e.g. to prevent denial of service
   attacks, to guard against running out of memory, or to work around
   platform-specific limitations.

   The conformance classes defined by this specification are user agents
   and servers.

2.1.  Terminology

   *Converting a string to ASCII lowercase* means replacing all
   characters in the range U+0041 to U+005A (i.e. LATIN CAPITAL LETTER A
   to LATIN CAPITAL LETTER Z) with the corresponding characters in the
   range U+0061 to U+007A (i.e. LATIN SMALL LETTER A to LATIN SMALL
   LETTER Z).
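
   A non-normative Python sketch of this conversion follows; the name
   to_ascii_lowercase is hypothetical.  A translation table is used
   instead of str.lower() because the latter also changes characters
   outside the ASCII range, which this definition leaves untouched.

```python
# Maps U+0041..U+005A (A-Z) to U+0061..U+007A (a-z); nothing else is listed,
# so every other character passes through unchanged.
_ASCII_UPPER_TO_LOWER = {cp: cp + 0x20 for cp in range(0x41, 0x5B)}


def to_ascii_lowercase(s: str) -> str:
    """Lowercase only the 26 ASCII capital letters in s."""
    return s.translate(_ASCII_UPPER_TO_LOWER)
```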

   The term "URL" is used in this section in a manner consistent with
   the terminology used in HTML, namely, to denote a string that might
   or might not be a valid URI or IRI and to which certain error
   handling behaviors will be applied when the string is parsed.
   [HTML5]

3.  Web Socket URLs

3.1.  Parsing Web Socket URLs

   The steps to *parse a Web Socket URL's components* from a string
   /url/ are as follows.  These steps return either a /host/, a /port/,
   a /resource name/, and a /secure/ flag, or they fail.

   1.   If /protocol/ is specified but is either the empty string or
        contains characters that are not in the range U+0021 to U+007E,
        then fail this algorithm.

   2.   If the /url/ string is not an absolute URL, then fail this
        algorithm.  [WEBADDRESSES]

   3.   Resolve the /url/ string using the resolve a Web address
        algorithm defined by the Web addresses specification, with the
        URL character encoding set to UTF-8.  [WEBADDRESSES] [RFC3629]

        NOTE: It doesn't matter what it is resolved relative to, since
        we already know it is an absolute URL at this point.

   4.   If /url/ does not have a <scheme> component whose value is
        either "ws" or "wss", when compared in an ASCII case-insensitive
        manner, then fail this algorithm.

   5.   If the <scheme> component of /url/ is "ws", set /secure/ to
        false; otherwise (the <scheme> component is "wss"), set
        /secure/ to true.

   6.   Let /host/ be the value of the <host> component of /url/,
        converted to ASCII lowercase.

   7.   If /url/ has a <port> component, t