draft-ietf-rtcweb-security-07.txt   draft-ietf-rtcweb-security-08.txt 
RTC-Web E. Rescorla RTC-Web E. Rescorla
Internet-Draft RTFM, Inc. Internet-Draft RTFM, Inc.
Intended status: Standards Track July 04, 2014 Intended status: Standards Track February 26, 2015
Expires: January 5, 2015 Expires: August 30, 2015
Security Considerations for WebRTC Security Considerations for WebRTC
draft-ietf-rtcweb-security-07 draft-ietf-rtcweb-security-08
Abstract Abstract
The Real-Time Communications on the Web (RTCWEB) working group is The Real-Time Communications on the Web (RTCWEB) working group is
tasked with standardizing protocols for real-time communications tasked with standardizing protocols for real-time communications
between Web browsers, generally called "WebRTC". The major use cases between Web browsers, generally called "WebRTC". The major use cases
for WebRTC technology are real-time audio and/or video calls, Web for WebRTC technology are real-time audio and/or video calls, Web
conferencing, and direct data transfer. Unlike most conventional conferencing, and direct data transfer. Unlike most conventional
real-time systems (e.g., SIP-based soft phones) WebRTC communications real-time systems (e.g., SIP-based soft phones) WebRTC communications
are directly controlled by a Web server, which poses new security are directly controlled by a Web server, which poses new security
challenges. For instance, a Web browser might expose a JavaScript challenges. For instance, a Web browser might expose a JavaScript
API which allows a server to place a video call. Unrestricted access API which allows a server to place a video call. Unrestricted access
to such an API would allow any site which a user visited to "bug" a to such an API would allow any site which a user visited to "bug" a
user's computer, capturing any activity which passed in front of user's computer, capturing any activity which passed in front of
their camera. This document defines the WebRTC threat model and their camera. This document defines the WebRTC threat model and
analyzes the security threats of WebRTC in that model. analyzes the security threats of WebRTC in that model.
Status of this Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on January 5, 2015. This Internet-Draft will expire on August 30, 2015.
Copyright Notice Copyright Notice
Copyright (c) 2014 IETF Trust and the persons identified as the Copyright (c) 2015 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
skipping to change at page 3, line 7 skipping to change at page 2, line 29
modifications of such material outside the IETF Standards Process. modifications of such material outside the IETF Standards Process.
Without obtaining an adequate license from the person(s) controlling Without obtaining an adequate license from the person(s) controlling
the copyright in such materials, this document may not be modified the copyright in such materials, this document may not be modified
outside the IETF Standards Process, and derivative works of it may outside the IETF Standards Process, and derivative works of it may
not be created outside the IETF Standards Process, except to format not be created outside the IETF Standards Process, except to format
it for publication as an RFC or to translate it into languages other it for publication as an RFC or to translate it into languages other
than English. than English.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 5 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4
3. The Browser Threat Model . . . . . . . . . . . . . . . . . . . 5 3. The Browser Threat Model . . . . . . . . . . . . . . . . . . 4
3.1. Access to Local Resources . . . . . . . . . . . . . . . . 6 3.1. Access to Local Resources . . . . . . . . . . . . . . . . 5
3.2. Same Origin Policy . . . . . . . . . . . . . . . . . . . . 6 3.2. Same Origin Policy . . . . . . . . . . . . . . . . . . . 6
3.3. Bypassing SOP: CORS, WebSockets, and consent to 3.3. Bypassing SOP: CORS, WebSockets, and consent to
communicate . . . . . . . . . . . . . . . . . . . . . . . 7 communicate . . . . . . . . . . . . . . . . . . . . . . . 6
4. Security for WebRTC Applications . . . . . . . . . . . . . . . 7 4. Security for WebRTC Applications . . . . . . . . . . . . . . 7
4.1. Access to Local Devices . . . . . . . . . . . . . . . . . 8 4.1. Access to Local Devices . . . . . . . . . . . . . . . . . 7
4.1.1. Threats from Screen Sharing . . . . . . . . . . . . . 9 4.1.1. Threats from Screen Sharing . . . . . . . . . . . . . 8
4.1.2. Calling Scenarios and User Expectations . . . . . . . 9 4.1.2. Calling Scenarios and User Expectations . . . . . . . 9
4.1.2.1. Dedicated Calling Services . . . . . . . . . . . . 9 4.1.2.1. Dedicated Calling Services . . . . . . . . . . . 9
4.1.2.2. Calling the Site You're On . . . . . . . . . . . . 10 4.1.2.2. Calling the Site You're On . . . . . . . . . . . 9
4.1.3. Origin-Based Security . . . . . . . . . . . . . . . . 10 4.1.3. Origin-Based Security . . . . . . . . . . . . . . . . 10
4.1.4. Security Properties of the Calling Page . . . . . . . 12 4.1.4. Security Properties of the Calling Page . . . . . . . 11
4.2. Communications Consent Verification . . . . . . . . . . . 13 4.2. Communications Consent Verification . . . . . . . . . . . 12
4.2.1. ICE . . . . . . . . . . . . . . . . . . . . . . . . . 13 4.2.1. ICE . . . . . . . . . . . . . . . . . . . . . . . . . 13
4.2.2. Masking . . . . . . . . . . . . . . . . . . . . . . . 14 4.2.2. Masking . . . . . . . . . . . . . . . . . . . . . . . 13
4.2.3. Backward Compatibility . . . . . . . . . . . . . . . . 14 4.2.3. Backward Compatibility . . . . . . . . . . . . . . . 14
4.2.4. IP Location Privacy . . . . . . . . . . . . . . . . . 15 4.2.4. IP Location Privacy . . . . . . . . . . . . . . . . . 15
4.3. Communications Security . . . . . . . . . . . . . . . . . 16 4.3. Communications Security . . . . . . . . . . . . . . . . . 15
4.3.1. Protecting Against Retrospective Compromise . . . . . 17 4.3.1. Protecting Against Retrospective Compromise . . . . . 16
4.3.2. Protecting Against During-Call Attack . . . . . . . . 17 4.3.2. Protecting Against During-Call Attack . . . . . . . . 17
4.3.2.1. Key Continuity . . . . . . . . . . . . . . . . . . 18 4.3.2.1. Key Continuity . . . . . . . . . . . . . . . . . 17
4.3.2.2. Short Authentication Strings . . . . . . . . . . . 18 4.3.2.2. Short Authentication Strings . . . . . . . . . . 18
4.3.2.3. Third Party Identity . . . . . . . . . . . . . . . 19 4.3.2.3. Third Party Identity . . . . . . . . . . . . . . 19
4.3.2.4. Page Access to Media . . . . . . . . . . . . . . . 20 4.3.2.4. Page Access to Media . . . . . . . . . . . . . . 19
4.3.3. Malicious Peers . . . . . . . . . . . . . . . . . . . 20 4.3.3. Malicious Peers . . . . . . . . . . . . . . . . . . . 20
4.4. Privacy Considerations . . . . . . . . . . . . . . . . . . 21 4.4. Privacy Considerations . . . . . . . . . . . . . . . . . 20
4.4.1. Correlation of Anonymous Calls . . . . . . . . . . . . 21 4.4.1. Correlation of Anonymous Calls . . . . . . . . . . . 20
4.4.2. Browser Fingerprinting . . . . . . . . . . . . . . . . 21 4.4.2. Browser Fingerprinting . . . . . . . . . . . . . . . 21
5. Security Considerations . . . . . . . . . . . . . . . . . . . 21 5. Security Considerations . . . . . . . . . . . . . . . . . . . 21
6. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 21 6. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 21
7. Changes Since -04 . . . . . . . . . . . . . . . . . . . . . . 21 7. Changes Since -04 . . . . . . . . . . . . . . . . . . . . . . 21
8. References . . . . . . . . . . . . . . . . . . . . . . . . . . 22 8. References . . . . . . . . . . . . . . . . . . . . . . . . . 21
8.1. Normative References . . . . . . . . . . . . . . . . . . . 22 8.1. Normative References . . . . . . . . . . . . . . . . . . 22
8.2. Informative References . . . . . . . . . . . . . . . . . . 22 8.2. Informative References . . . . . . . . . . . . . . . . . 22
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 25 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 24
1. Introduction 1. Introduction
The Real-Time Communications on the Web (RTCWEB) working group is The Real-Time Communications on the Web (RTCWEB) working group is
tasked with standardizing protocols for real-time communications tasked with standardizing protocols for real-time communications
between Web browsers, generally called "WebRTC" between Web browsers, generally called "WebRTC"
[I-D.ietf-rtcweb-overview]. The major use cases for WebTC technology [I-D.ietf-rtcweb-overview]. The major use cases for WebRTC
are real-time audio and/or video calls, Web conferencing, and direct technology are real-time audio and/or video calls, Web conferencing,
data transfer. Unlike most conventional real-time systems, (e.g., and direct data transfer. Unlike most conventional real-time
SIP-based[RFC3261] soft phones) WebRTC communications are directly systems, (e.g., SIP-based[RFC3261] soft phones) WebRTC communications
controlled by some Web server. A simple case is shown below. are directly controlled by some Web server. A simple case is shown
below.
+----------------+ +----------------+
| | | |
| Web Server | | Web Server |
| | | |
+----------------+ +----------------+
^ ^ ^ ^
/ \ / \
HTTP / \ HTTP HTTP / \ HTTP
or / \ or or / \ or
skipping to change at page 5, line 34 skipping to change at page 4, line 52
3. The Browser Threat Model 3. The Browser Threat Model
The security requirements for WebRTC follow directly from the The security requirements for WebRTC follow directly from the
requirement that the browser's job is to protect the user. Huang et requirement that the browser's job is to protect the user. Huang et
al. [huang-w2sp] summarize the core browser security guarantee as: al. [huang-w2sp] summarize the core browser security guarantee as:
Users can safely visit arbitrary web sites and execute scripts Users can safely visit arbitrary web sites and execute scripts
provided by those sites. provided by those sites.
It is important to realize that this includes sites hosting arbitrary It is important to realize that this includes sites hosting arbitrary
malicious scripts. The motivation for this requirement is simple: malicious scripts. The motivation for this requirement is simple: it
it is trivial for attackers to divert users to sites of their choice. is trivial for attackers to divert users to sites of their choice.
For instance, an attacker can purchase display advertisements which For instance, an attacker can purchase display advertisements which
direct the user (either automatically or via user clicking) to their direct the user (either automatically or via user clicking) to their
site, at which point the browser will execute the attacker's scripts. site, at which point the browser will execute the attacker's scripts.
Thus, it is important that it be safe to view arbitrarily malicious Thus, it is important that it be safe to view arbitrarily malicious
pages. Of course, browsers inevitably have bugs which cause them to pages. Of course, browsers inevitably have bugs which cause them to
fall short of this goal, but any new WebRTC functionality must be fall short of this goal, but any new WebRTC functionality must be
designed with the intent to meet this standard. The remainder of designed with the intent to meet this standard. The remainder of
this section provides more background on the existing Web security this section provides more background on the existing Web security
model. model.
skipping to change at page 6, line 25 skipping to change at page 5, line 40
3.1. Access to Local Resources 3.1. Access to Local Resources
While the browser has access to local resources such as keying While the browser has access to local resources such as keying
material, files, the camera and the microphone, it strictly limits or material, files, the camera and the microphone, it strictly limits or
forbids web servers from accessing those same resources. For forbids web servers from accessing those same resources. For
instance, while it is possible to produce an HTML form which will instance, while it is possible to produce an HTML form which will
allow file upload, a script cannot do so without user consent and in allow file upload, a script cannot do so without user consent and in
fact cannot even suggest a specific file (e.g., /etc/passwd); the fact cannot even suggest a specific file (e.g., /etc/passwd); the
user must explicitly select the file and consent to its upload. user must explicitly select the file and consent to its upload.
[Note: in many cases browsers are explicitly designed to avoid [Note: in many cases browsers are explicitly designed to avoid
dialogs with the semantics of "click here to screw yourself", as dialogs with the semantics of "click here to screw yourself", as
extensive research shows that users are prone to consent under such extensive research shows that users are prone to consent under such
circumstances.] circumstances.]
Similarly, while Flash programs (SWFs) [SWF] can access the camera Similarly, while Flash programs (SWFs) [SWF] can access the camera
and microphone, they explicitly require that the user consent to that and microphone, they explicitly require that the user consent to that
access. In addition, some resources simply cannot be accessed from access. In addition, some resources simply cannot be accessed from
the browser at all. For instance, there is no real way to run the browser at all. For instance, there is no real way to run
specific executables directly from a script (though the user can of specific executables directly from a script (though the user can of
course be induced to download executable files and run them). course be induced to download executable files and run them).
3.2. Same Origin Policy 3.2. Same Origin Policy
Many other resources are accessible but isolated. For instance, Many other resources are accessible but isolated. For instance,
while scripts are allowed to make HTTP requests via the while scripts are allowed to make HTTP requests via the
XMLHttpRequest() API those requests are not allowed to be made to any XMLHttpRequest() API those requests are not allowed to be made to any
server, but rather solely to the same ORIGIN from whence the script server, but rather solely to the same ORIGIN from whence the script
came xref target="RFC6454"/> (although CORS [CORS] and WebSockets came [RFC6454] (although CORS [CORS] and WebSockets [RFC6455] provide
[RFC6455] provide an escape hatch from this restriction, as described an escape hatch from this restriction, as described below.) This SAME
below.) This SAME ORIGIN POLICY (SOP) prevents server A from ORIGIN POLICY (SOP) prevents server A from mounting attacks on server
mounting attacks on server B via the user's browser, which protects B via the user's browser, which protects both the user (e.g., from
both the user (e.g., from misuse of his credentials) and the server B misuse of his credentials) and the server B (e.g., from DoS attack).
(e.g., from DoS attack).
More generally, SOP forces scripts from each site to run in their More generally, SOP forces scripts from each site to run in their
own, isolated, sandboxes. While there are techniques to allow them own, isolated, sandboxes. While there are techniques to allow them
to interact, those interactions generally must be mutually consensual to interact, those interactions generally must be mutually consensual
(by each site) and are limited to certain channels. For instance, (by each site) and are limited to certain channels. For instance,
multiple pages/browser panes from the same origin can read each multiple pages/browser panes from the same origin can read each
other's JS variables, but pages from the different origins--or even other's JS variables, but pages from the different origins--or even
iframes from different origins on the same page--cannot. iframes from different origins on the same page--cannot.
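
As a rough illustration of the isolation rule above, the following sketch compares two URLs by the (scheme, host, port) triple that [RFC6454] uses to identify an origin. The helper names `origin` and `same_origin` and the default-port table are assumptions made for this example; real browsers apply additional rules (e.g., for opaque origins) that are not modeled here.

```python
from typing import Tuple
from urllib.parse import urlsplit

# Default ports for the schemes handled in this sketch (assumed, not exhaustive).
DEFAULT_PORTS = {"http": 80, "https": 443, "ws": 80, "wss": 443}


def origin(url: str) -> Tuple[str, str, int]:
    """Return the (scheme, host, port) triple identifying the URL's origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # An absent port means the scheme's default port.
    port = parts.port if parts.port is not None else DEFAULT_PORTS[scheme]
    return (scheme, host, port)


def same_origin(a: str, b: str) -> bool:
    """Two URLs share an origin only if scheme, host, and port all match."""
    return origin(a) == origin(b)
```

Under this comparison, http://example.com/ and https://example.com/ are different origins, which is why a page loaded from one cannot read script state belonging to the other.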
3.3. Bypassing SOP: CORS, WebSockets, and consent to communicate 3.3. Bypassing SOP: CORS, WebSockets, and consent to communicate
While SOP serves an important security function, it also makes it While SOP serves an important security function, it also makes it
inconvenient to write certain classes of applications. In inconvenient to write certain classes of applications. In
particular, mash-ups, in which a script from origin A uses resources particular, mash-ups, in which a script from origin A uses resources
from origin B, can only be achieved via a certain amount of hackery. from origin B, can only be achieved via a certain amount of hackery.
The W3C Cross-Origin Resource Sharing (CORS) spec [CORS] is a The W3C Cross-Origin Resource Sharing (CORS) spec [CORS] is a
response to this demand. In CORS, when a script from origin A response to this demand. In CORS, when a script from origin A
executes what would otherwise be a forbidden cross-origin request, executes what would otherwise be a forbidden cross-origin request,
the browser instead contacts the target server to determine whether the browser instead contacts the target server to determine whether
it is willing to allow cross-origin requests from A. If it is so it is willing to allow cross-origin requests from A. If it is so
willing, the browser then allows the request. This consent willing, the browser then allows the request. This consent
verification process is designed to safely allow cross-origin verification process is designed to safely allow cross-origin
requests. requests.
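
The consent check described above can be sketched in a deliberately simplified form. The sketch assumes the server signals consent through the Access-Control-Allow-Origin response header and ignores preflight requests and credentials. The function names and the allow-list policy are hypothetical and serve only to illustrate the idea.

```python
from typing import Dict, Set


def server_cors_headers(request_origin: str, allowed: Set[str]) -> Dict[str, str]:
    """Hypothetical server policy: name the requesting origin only if allow-listed."""
    if request_origin in allowed:
        return {"Access-Control-Allow-Origin": request_origin}
    # No header means the server has not consented to this origin.
    return {}


def browser_allows(request_origin: str, response_headers: Dict[str, str]) -> bool:
    """Browser-side check: the server must name the requesting origin or "*"."""
    granted = response_headers.get("Access-Control-Allow-Origin")
    return granted == "*" or granted == request_origin
```

The point of the design is that the decision belongs to the target server (origin B), not to the script from origin A: the browser only enforces whatever B has stated.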
While CORS is designed to allow cross-origin HTTP requests, While CORS is designed to allow cross-origin HTTP requests,
WebSockets [RFC6455] allows cross-origin establishment of transparent WebSockets [RFC6455] allows cross-origin establishment of transparent
channels. Once a WebSockets connection has been established from a channels. Once a WebSockets connection has been established from a
script to a site, the script can exchange any traffic it likes script to a site, the script can exchange any traffic it likes
without being required to frame it as a series of HTTP request/ without being required to frame it as a series of HTTP request/
response transactions. As with CORS, a WebSockets transaction starts response transactions. As with CORS, a WebSockets transaction starts
skipping to change at page 8, line 20 skipping to change at page 7, line 31
to initiate calls to arbitrary locations without user consent. This to initiate calls to arbitrary locations without user consent. This
immediately raises the question, however, of what should be the scope immediately raises the question, however, of what should be the scope
of user consent. of user consent.
In order for the user to make an intelligent decision about whether In order for the user to make an intelligent decision about whether
to allow a call (and hence his camera and microphone input to be to allow a call (and hence his camera and microphone input to be
routed somewhere), he must understand either who is requesting routed somewhere), he must understand either who is requesting
access, where the media is going, or both. As detailed below, there access, where the media is going, or both. As detailed below, there
are two basic conceptual models: are two basic conceptual models:
You are sending your media to entity A because you want to talk to 1. You are sending your media to entity A because you want to talk
Entity A (e.g., your mother). to Entity A (e.g., your mother).
Entity A (e.g., a calling service) asks to access the user's
devices with the assurance that it will transfer the media to 2. Entity A (e.g., a calling service) asks to access the user's
entity B (e.g., your mother) devices with the assurance that it will transfer the media to
entity B (e.g., your mother)
In either case, identity is at the heart of any consent decision. In either case, identity is at the heart of any consent decision.
Moreover, identity is all that the browser can meaningfully enforce; Moreover, identity is all that the browser can meaningfully enforce;
if you are calling A, A can simply forward the media to C. Similarly, if you are calling A, A can simply forward the media to C.
if you authorize A to place a call to B, A can call C instead. In Similarly, if you authorize A to place a call to B, A can call C
either case, all the browser is able to do is verify and check instead. In either case, all the browser is able to do is verify and
authorization for whoever is controlling where the media goes. The check authorization for whoever is controlling where the media goes.
target of the media can of course advertise a security/privacy The target of the media can of course advertise a security/privacy
policy, but this is not something that the browser can enforce. Even policy, but this is not something that the browser can enforce. Even
so, there are a variety of different consent scenarios that motivate so, there are a variety of different consent scenarios that motivate
different technical consent mechanisms. We discuss these mechanisms different technical consent mechanisms. We discuss these mechanisms
in the sections below. in the sections below.
It's important to understand that consent to access local devices is It's important to understand that consent to access local devices is
largely orthogonal to consent to transmit various kinds of data over largely orthogonal to consent to transmit various kinds of data over
the network (see Section 4.2). Consent for device access is largely a the network (see Section 4.2). Consent for device access is largely a
matter of protecting the user's privacy from malicious sites. By matter of protecting the user's privacy from malicious sites. By
contrast, consent to send network traffic is about preventing the contrast, consent to send network traffic is about preventing the
skipping to change at page 9, line 11 skipping to change at page 8, line 21
accessing the user's camera and microphone even if the data is to be accessing the user's camera and microphone even if the data is to be
sent back to the site via conventional HTTP-based network mechanisms sent back to the site via conventional HTTP-based network mechanisms
such as HTTP POST. such as HTTP POST.
4.1.1. Threats from Screen Sharing 4.1.1. Threats from Screen Sharing
In addition to camera and microphone access, there has been demand In addition to camera and microphone access, there has been demand
for screen and/or application sharing functionality. Unfortunately, for screen and/or application sharing functionality. Unfortunately,
the security implications of this functionality are much harder for the security implications of this functionality are much harder for
users to intuitively analyze than for camera and microphone access. users to intuitively analyze than for camera and microphone access.
(See (See http://lists.w3.org/Archives/Public/public-
http://lists.w3.org/Archives/Public/public-webrtc/2013Mar/0024.html webrtc/2013Mar/0024.html for a full analysis.)
for a full analysis.)
The most obvious threats are simply those of "oversharing". I.e., The most obvious threats are simply those of "oversharing". I.e.,
the user may believe they are sharing a window when in fact they are the user may believe they are sharing a window when in fact they are
sharing an application, or may forget they are sharing their whole sharing an application, or may forget they are sharing their whole
screen, icons, notifications, and all. This is already an issue with screen, icons, notifications, and all. This is already an issue with
existing screen sharing technologies and is made somewhat worse if a existing screen sharing technologies and is made somewhat worse if a
partially trusted site is responsible for asking for the resource to partially trusted site is responsible for asking for the resource to
be shared rather than having the user propose it. be shared rather than having the user propose it.
A less obvious threat involves the impact of screen sharing on the A less obvious threat involves the impact of screen sharing on the
skipping to change at page 10, line 18 skipping to change at page 9, line 33
With any kind of service where the user may use the same service to With any kind of service where the user may use the same service to
talk to many different people, there is a question about whether the talk to many different people, there is a question about whether the
user can know who they are talking to. If I grant permission to user can know who they are talking to. If I grant permission to
calling service A to make calls on my behalf, then I am implicitly calling service A to make calls on my behalf, then I am implicitly
granting it permission to bug my computer whenever it wants. This granting it permission to bug my computer whenever it wants. This
suggests another consent model in which a site is authorized to make suggests another consent model in which a site is authorized to make
calls but only to certain target entities (identified via media-plane calls but only to certain target entities (identified via media-plane
cryptographic mechanisms as described in Section 4.3.2 and especially cryptographic mechanisms as described in Section 4.3.2 and especially
Section 4.3.2.3.) Note that the question of consent here is related Section 4.3.2.3.) Note that the question of consent here is related
to but distinct from the question of peer identity: I might be to but distinct from the question of peer identity: I might be
willing to allow a calling site to in general initiate calls on my willing to allow a calling site to in general initiate calls on my
behalf but still have some calls via that site where I can be sure behalf but still have some calls via that site where I can be sure
that the site is not listening in. that the site is not listening in.
4.1.2.2. Calling the Site You're On 4.1.2.2. Calling the Site You're On
Another simple scenario is calling the site you're actually visiting. Another simple scenario is calling the site you're actually visiting.
The paradigmatic case here is the "click here to talk to a The paradigmatic case here is the "click here to talk to a
representative" windows that appear on many shopping sites. In this representative" windows that appear on many shopping sites. In this
case, the user's expectation is that they are calling the site case, the user's expectation is that they are calling the site
skipping to change at page 11, line 47 skipping to change at page 11, line 20
The other two options are designed to restrict calls to a given The other two options are designed to restrict calls to a given
target. Callee-oriented consent provided by the calling site does not target. Callee-oriented consent provided by the calling site does not
work well because a malicious site can claim that the user is calling work well because a malicious site can claim that the user is calling
any user of his choice. One fix for this is to tie calls to a any user of his choice. One fix for this is to tie calls to a
cryptographically established identity. While not suitable for all cryptographically established identity. While not suitable for all
cases, this approach may be useful for some. If we consider the case cases, this approach may be useful for some. If we consider the case
of advertising, it's not particularly convenient to require the of advertising, it's not particularly convenient to require the
advertiser to instantiate an iframe on the hosting site just to get advertiser to instantiate an iframe on the hosting site just to get
permission; a more convenient approach is to cryptographically tie permission; a more convenient approach is to cryptographically tie
the advertiser's certificate to the communication directly. We're the advertiser's certificate to the communication directly. We're
still tying permissions to origin here, but to the media origin still tying permissions to origin here, but to the media origin (and-
(and-or destination) rather than to the Web origin. or destination) rather than to the Web origin.
[I-D.ietf-rtcweb-security-arch] describes mechanisms which facilitate [I-D.ietf-rtcweb-security-arch] describes mechanisms which facilitate
this sort of consent. this sort of consent.
Another case where media-level cryptographic identity makes sense is Another case where media-level cryptographic identity makes sense is
when a user really does not trust the calling site. For instance, I when a user really does not trust the calling site. For instance, I
might be worried that the calling service will attempt to bug my might be worried that the calling service will attempt to bug my
computer, but I also want to be able to conveniently call my friends. computer, but I also want to be able to conveniently call my friends.
If consent is tied to particular communications endpoints, then my If consent is tied to particular communications endpoints, then my
risk is limited. Naturally, it is somewhat challenging to design UI risk is limited. Naturally, it is somewhat challenging to design UI
primitives which express this sort of policy. The problem becomes primitives which express this sort of policy. The problem becomes
even more challenging in multi-user calling cases. even more challenging in multi-user calling cases.
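
One way to picture consent tied to particular endpoints is a permission table keyed by the calling origin together with a cryptographically verified peer identity. The sketch below is an illustration under stated assumptions, not a mechanism defined by this document: the class `ConsentStore`, its methods, and the identity strings are all hypothetical, and the verification of the peer identity itself is assumed to happen elsewhere.

```python
from typing import Optional, Set, Tuple


class ConsentStore:
    """Hypothetical table of (origin, peer) grants; peer=None means any peer."""

    def __init__(self) -> None:
        self._grants: Set[Tuple[str, Optional[str]]] = set()

    def grant(self, origin: str, peer: Optional[str] = None) -> None:
        """Record consent for an origin, optionally limited to one peer identity."""
        self._grants.add((origin, peer))

    def may_call(self, origin: str, verified_peer: Optional[str]) -> bool:
        """Allow the call if the origin has a blanket grant or a grant for this peer."""
        if (origin, None) in self._grants:
            return True
        return verified_peer is not None and (origin, verified_peer) in self._grants
```

With a peer-limited grant, a calling site that tries to route media to any other endpoint, or to an endpoint whose identity cannot be verified, fails the check, which is the sense in which the user's risk is limited.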
4.1.4. Security Properties of the Calling Page 4.1.4. Security Properties of the Calling Page
Origin-based security is intended to secure against web attackers. Origin-based security is intended to secure against web attackers.
However, we must also consider the case of network attackers. However, we must also consider the case of network attackers.
Consider the case where I have granted permission to a calling Consider the case where I have granted permission to a calling
service by an origin that has the HTTP scheme, e.g., service by an origin that has the HTTP scheme, e.g., http://calling-
http://calling-service.example.com. If I ever use my computer on an service.example.com. If I ever use my computer on an unsecured
unsecured network (e.g., a hotspot or if my own home wireless network network (e.g., a hotspot or if my own home wireless network is
is insecure), and browse any HTTP site, then an attacker can bug my insecure), and browse any HTTP site, then an attacker can bug my
computer. The attack proceeds like this: computer. The attack proceeds like this:
1. I connect to http://anything.example.org/. Note that this site 1. I connect to http://anything.example.org/. Note that this site is
is unaffiliated with the calling service. unaffiliated with the calling service.
2. The attacker modifies my HTTP connection to inject an IFRAME (or 2. The attacker modifies my HTTP connection to inject an IFRAME (or
a redirect) to http://calling-service.example.com a redirect) to http://calling-service.example.com
3. The attacker forges the response apparently from 3. The attacker forges the response apparently from
http://calling-service.example.com/ to inject JS to initiate a http://calling-service.example.com/ to inject JS to initiate a
call to himself. call to himself.
Note that this attack does not depend on the media being insecure. Note that this attack does not depend on the media being insecure.
Because the call is to the attacker, it is also encrypted to him. Because the call is to the attacker, it is also encrypted to him.
Moreover, it need not be executed immediately; the attacker can Moreover, it need not be executed immediately; the attacker can
"infect" the origin semi-permanently (e.g., with a web worker or a "infect" the origin semi-permanently (e.g., with a web worker or a
popped-up window that is hidden under the main window.) and thus be popped-up window that is hidden under the main window.) and thus be
able to bug me long after I have left the infected network. This able to bug me long after I have left the infected network. This
risk is created by allowing calls at all from a page fetched over risk is created by allowing calls at all from a page fetched over
HTTP. HTTP.
Even if calls are only possible from HTTPS sites, if the site embeds Even if calls are only possible from HTTPS sites, if the site embeds
active content (e.g., JavaScript) that is fetched over HTTP or from active content (e.g., JavaScript) that is fetched over HTTP or from
an untrusted site, then that JavaScript is executed in the an untrusted site, then that JavaScript is executed in the
security context of the page [finer-grained]. Thus, it is also security context of the page [finer-grained]. Thus, it is also
dangerous to allow WebRTC functionality from HTTPS origins that embed dangerous to allow WebRTC functionality from HTTPS origins that embed
mixed content. Note: this issue is not restricted to PAGES which mixed content. Note: this issue is not restricted to PAGES which
contain mixed content. If a page from a given origin ever loads contain mixed content. If a page from a given origin ever loads
mixed content then it is possible for a network attacker to infect mixed content then it is possible for a network attacker to infect
the browser's notion of that origin semi-permanently. the browser's notion of that origin semi-permanently.
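The two conditions above (the page's scheme and whether its origin has ever loaded mixed content) can be sketched as a single policy check. This is only an illustration under stated assumptions: the function name and the loaded_mixed_content flag are hypothetical and do not come from this document or from any browser API.

```python
from urllib.parse import urlsplit


def may_grant_persistent_consent(page_url: str, loaded_mixed_content: bool) -> bool:
    """Return True only for an HTTPS origin that has never loaded mixed content.

    Hypothetical helper for illustration; the name and the flag are assumptions.
    """
    scheme = urlsplit(page_url).scheme.lower()
    if scheme != "https":
        # A network attacker can forge any response from an HTTP origin,
        # so consent granted to it is effectively granted to the attacker.
        return False
    # Mixed active content runs in the page's security context, so it
    # taints the origin even though the page itself arrived over HTTPS.
    return not loaded_mixed_content
```

Under this sketch, http://calling-service.example.com/ is always refused, and an HTTPS origin is refused once it has been observed loading mixed content.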
4.2. Communications Consent Verification 4.2. Communications Consent Verification
As discussed in Section 3.3, allowing web applications unrestricted As discussed in Section 3.3, allowing web applications unrestricted
network access via the browser introduces the risk of using the network access via the browser introduces the risk of using the
browser as an attack platform against machines which would not browser as an attack platform against machines which would not
otherwise be accessible to the malicious site, for instance because otherwise be accessible to the malicious site, for instance because
skipping to change at page 16, line 4 skipping to change at page 15, line 29
suppress the start of ICE negotiation until the callee has answered. suppress the start of ICE negotiation until the callee has answered.
In addition, either side may wish to hide their location entirely by In addition, either side may wish to hide their location entirely by
forcing all traffic through a TURN server. forcing all traffic through a TURN server.
In ordinary operation, the site learns the browser's IP address, In ordinary operation, the site learns the browser's IP address,
though it may be hidden via mechanisms like Tor though it may be hidden via mechanisms like Tor
[http://www.torproject.org] or a VPN. However, because sites can [http://www.torproject.org] or a VPN. However, because sites can
cause the browser to provide IP addresses, this provides a mechanism cause the browser to provide IP addresses, this provides a mechanism
for sites to learn about the user's network environment even if the for sites to learn about the user's network environment even if the
user is behind a VPN that masks their IP address. Implementations user is behind a VPN that masks their IP address. Implementations
wish to provide settings which suppress all non-VPN candidates if the may wish to provide settings which suppress all non-VPN candidates if
user is on certain kinds of VPN, especially privacy-oriented systems the user is on certain kinds of VPN, especially privacy-oriented
such as Tor. systems such as Tor.
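As a rough sketch of the two mitigations just described, the filters below operate on ICE candidate attribute lines in the RFC 5245 text form (foundation, component, transport, priority, address, port, "typ", type). The function names and the vpn_addresses parameter are hypothetical; a real implementation would apply such a policy inside the browser's ICE agent rather than on strings.

```python
def _address_and_type(candidate_line: str):
    """Return (address, type) from a candidate line, or None if malformed."""
    fields = candidate_line.split()
    if len(fields) < 8 or fields[6] != "typ":
        return None
    return fields[4], fields[7]


def relay_only(candidates):
    """Keep only relayed candidates, forcing all traffic through a TURN server."""
    kept = []
    for line in candidates:
        parsed = _address_and_type(line)
        if parsed is not None and parsed[1] == "relay":
            kept.append(line)
    return kept


def vpn_only(candidates, vpn_addresses):
    """Keep only candidates whose address is in the assumed set of VPN addresses."""
    kept = []
    for line in candidates:
        parsed = _address_and_type(line)
        # Malformed lines are dropped rather than risk leaking an address.
        if parsed is not None and parsed[0] in vpn_addresses:
            kept.append(line)
    return kept
```

Both filters fail closed: a line that cannot be parsed is suppressed, which seems the safer default when the goal is to avoid revealing addresses.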
4.3. Communications Security 4.3. Communications Security
Finally, we consider a problem familiar from the SIP world: Finally, we consider a problem familiar from the SIP world:
communications security. For obvious reasons, it MUST be possible communications security. For obvious reasons, it MUST be possible
for the communicating parties to establish a channel which is secure for the communicating parties to establish a channel which is secure
against both message recovery and message modification. (See against both message recovery and message modification. (See
[RFC5479] for more details.) This service must be provided for both [RFC5479] for more details.) This service must be provided for both
data and voice/video. Ideally the same security mechanisms would be data and voice/video. Ideally the same security mechanisms would be
used for both types of content. Technology for providing this used for both types of content. Technology for providing this
skipping to change at page 18, line 42 skipping to change at page 18, line 18
[cranor-wolf], it seems extremely unlikely that any key continuity [cranor-wolf], it seems extremely unlikely that any key continuity
mechanism will be effective rather than simply annoying. mechanism will be effective rather than simply annoying.
Moreover, it is trivial to bypass even this kind of mechanism. Moreover, it is trivial to bypass even this kind of mechanism.
Recall that unlike the case of SSH, the browser never directly gets Recall that unlike the case of SSH, the browser never directly gets
the peer's identity from the user. Rather, it is provided by the the peer's identity from the user. Rather, it is provided by the
calling service. Even enabling a mechanism of this type would calling service. Even enabling a mechanism of this type would
require an API to allow the calling service to tell the browser "this require an API to allow the calling service to tell the browser "this
is a call to user X". All the calling service needs to do to avoid is a call to user X". All the calling service needs to do to avoid
triggering a key continuity warning is to tell the browser that "this triggering a key continuity warning is to tell the browser that "this
is a call to user Y" where Y is close to X. Even if the user actually is a call to user Y" where Y is close to X. Even if the user
checks the other side's name (which all available evidence indicates actually checks the other side's name (which all available evidence
is unlikely), this would require (a) the browser to use trusted UI to indicates is unlikely), this would require (a) the browser to use trusted
provide the name and (b) the user to not be fooled by similar UI to provide the name and (b) the user to not be fooled by similar
appearing names. appearing names.
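The bypass described above can be illustrated with a toy key continuity store. This is a minimal sketch, assuming a trust-on-first-use table keyed by whatever name the calling service supplies; the class, its method, and the example names and fingerprints are all hypothetical.

```python
class KeyContinuityStore:
    """Toy trust-on-first-use table keyed by the name the calling service claims."""

    def __init__(self):
        self._pins = {}

    def check(self, claimed_name: str, fingerprint: str) -> str:
        # The first time a name is seen, its fingerprint is pinned.
        pinned = self._pins.setdefault(claimed_name, fingerprint)
        return "ok" if pinned == fingerprint else "warning"


store = KeyContinuityStore()
first_call = store.check("alice@example.com", "AA:BB")    # pins the key
mitm_same_name = store.check("alice@example.com", "EE:FF")  # key changed
mitm_near_name = store.check("a1ice@example.com", "EE:FF")  # new name, no pin yet
```

Because the name comes from the calling service rather than from the user, the third call raises no warning: the attacker's key is simply pinned under the look-alike name.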
4.3.2.2. Short Authentication Strings 4.3.2.2. Short Authentication Strings
ZRTP [RFC6189] uses a "short authentication string" (SAS) which is ZRTP [RFC6189] uses a "short authentication string" (SAS) which is
derived from the key agreement protocol. This SAS is designed to be derived from the key agreement protocol. This SAS is designed to be
compared by the users (e.g., read aloud over the voice channel or compared by the users (e.g., read aloud over the voice channel or
transmitted via an out of band channel) and if confirmed by both transmitted via an out of band channel) and if confirmed by both
sides precludes MITM attack. The intention is that the SAS is used sides precludes MITM attack. The intention is that the SAS is used
once and then key continuity (though a different mechanism from that once and then key continuity (though a different mechanism from that
skipping to change at page 20, line 12 skipping to change at page 19, line 36
third-party authenticated transactions. It is possible to use third-party authenticated transactions. It is possible to use
systems of this type to authenticate WebRTC calls, linking them to systems of this type to authenticate WebRTC calls, linking them to
existing user notions of identity (e.g., Facebook adjacencies). existing user notions of identity (e.g., Facebook adjacencies).
Specifically, the third-party identity system is used to bind the Specifically, the third-party identity system is used to bind the
user's identity to cryptographic keying material which is then used user's identity to cryptographic keying material which is then used
to authenticate the calling endpoints. Calls which are authenticated to authenticate the calling endpoints. Calls which are authenticated
in this fashion are naturally resistant even to active MITM attack by in this fashion are naturally resistant even to active MITM attack by
the calling site. the calling site.
Note that there is one special case in which PKI-style certificates Note that there is one special case in which PKI-style certificates
do provide a practical solution: calls from end-users to large do provide a practical solution: calls from end-users to large sites.
sites. For instance, if you are making a call to Amazon.com, then For instance, if you are making a call to Amazon.com, then Amazon can
Amazon can easily get a certificate to authenticate their media easily get a certificate to authenticate their media traffic, just as
traffic, just as they get one to authenticate their Web traffic. they get one to authenticate their Web traffic. This does not
This does not provide additional security value in cases in which the provide additional security value in cases in which the calling site
calling site and the media peer are one and the same, but might be and the media peer are one and the same, but might be useful in cases
useful in cases in which third parties (e.g., ad networks or in which third parties (e.g., ad networks or retailers) arrange for
retailers) arrange for calls but do not participate in them. calls but do not participate in them.
4.3.2.4. Page Access to Media 4.3.2.4. Page Access to Media
Identifying the identity of the far media endpoint is a necessary but Identifying the identity of the far media endpoint is a necessary but
not sufficient condition for providing media security. In WebRTC, not sufficient condition for providing media security. In WebRTC,
media flows are rendered into HTML5 MediaStreams which can be media flows are rendered into HTML5 MediaStreams which can be
manipulated by the calling site. Obviously, if the site can modify manipulated by the calling site. Obviously, if the site can modify
or view the media, then the user is not getting the level of or view the media, then the user is not getting the level of
assurance they would expect from being able to authenticate their assurance they would expect from being able to authenticate their
peer. In many cases, this is acceptable because the user values peer. In many cases, this is acceptable because the user values
skipping to change at page 22, line 7 skipping to change at page 21, line 30
Johnston, Hadriel Kaplan (S 4.2.1), Matthew Kaufman, Martin Thomson, Johnston, Hadriel Kaplan (S 4.2.1), Matthew Kaufman, Martin Thomson,
Magnus Westerlund. Magnus Westerlund.
7. Changes Since -04 7. Changes Since -04
o Replaced RTCWEB and RTC-Web with WebRTC, except when referring to o Replaced RTCWEB and RTC-Web with WebRTC, except when referring to
the IETF WG the IETF WG
o Removed discussion of the IFRAMEd advertisement case, since we o Removed discussion of the IFRAMEd advertisement case, since we
decided not to treat it specially. decided not to treat it specially.
o Added a privacy considerations section. o Added a privacy considerations section.
o Significant edits to the SAS section to reflect Alan Johnston's o Significant edits to the SAS section to reflect Alan Johnston's
comments. comments.
o Added some discussion of IP location privacy and Tor. o Added some discussion of IP location privacy and Tor.
o Updated the "communications consent" section to reflect o Updated the "communications consent" section to reflect
draft-ietf. draft-ietf.
o Added a section about "malicious peers". o Added a section about "malicious peers".
o Added a section describing screen sharing threats. o Added a section describing screen sharing threats.
o Assorted editorial changes. o Assorted editorial changes.
8. References 8. References
8.1. Normative References 8.1. Normative References
[I-D.ietf-rtcweb-overview] [I-D.ietf-rtcweb-overview]
Alvestrand, H., "Overview: Real Time Protocols for Alvestrand, H., "Overview: Real Time Protocols for
Browser-based Applications", draft-ietf-rtcweb-overview-10 Browser-based Applications", draft-ietf-rtcweb-overview-13
(work in progress), June 2014. (work in progress), November 2014.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997. Requirement Levels", BCP 14, RFC 2119, March 1997.
8.2. Informative References 8.2. Informative References
[CORS] van Kesteren, A., "Cross-Origin Resource Sharing". [CORS] van Kesteren, A., "Cross-Origin Resource Sharing", January
2014.
[I-D.ietf-avtcore-6222bis] [I-D.ietf-avtcore-6222bis]
Begen, A., Perkins, C., Wing, D., and E. Rescorla, Begen, A., Perkins, C., Wing, D., and E. Rescorla,
"Guidelines for Choosing RTP Control Protocol (RTCP) "Guidelines for Choosing RTP Control Protocol (RTCP)
Canonical Names (CNAMEs)", draft-ietf-avtcore-6222bis-06 Canonical Names (CNAMEs)", draft-ietf-avtcore-6222bis-06
(work in progress), July 2013. (work in progress), July 2013.
[I-D.ietf-rtcweb-security-arch] [I-D.ietf-rtcweb-security-arch]
Rescorla, E., "WebRTC Security Architecture", Rescorla, E., "WebRTC Security Architecture", draft-ietf-
draft-ietf-rtcweb-security-arch-09 (work in progress), rtcweb-security-arch-10 (work in progress), July 2014.
February 2014.
[I-D.ietf-rtcweb-stun-consent-freshness] [I-D.ietf-rtcweb-stun-consent-freshness]
Perumal, M., Wing, D., R, R., Reddy, T., and M. Thomson, Perumal, M., Wing, D., R, R., Reddy, T., and M. Thomson,
"STUN Usage for Consent Freshness", "STUN Usage for Consent Freshness", draft-ietf-rtcweb-
draft-ietf-rtcweb-stun-consent-freshness-04 (work in stun-consent-freshness-11 (work in progress), December
progress), June 2014. 2014.
[I-D.kaufman-rtcweb-security-ui] [I-D.kaufman-rtcweb-security-ui]
Kaufman, M., "Client Security User Interface Requirements Kaufman, M., "Client Security User Interface Requirements
for RTCWEB", draft-kaufman-rtcweb-security-ui-00 (work in for RTCWEB", draft-kaufman-rtcweb-security-ui-00 (work in
progress), June 2011. progress), June 2011.
[RFC2818] Rescorla, E., "HTTP Over TLS", RFC 2818, May 2000. [RFC2818] Rescorla, E., "HTTP Over TLS", RFC 2818, May 2000.
[RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, [RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
A., Peterson, J., Sparks, R., Handley, M., and E. A., Peterson, J., Sparks, R., Handley, M., and E.
Schooler, "SIP: Session Initiation Protocol", RFC 3261, Schooler, "SIP: Session Initiation Protocol", RFC 3261,
June 2002. June 2002.
[RFC3552] Rescorla, E. and B. Korver, "Guidelines for Writing RFC [RFC3552] Rescorla, E. and B. Korver, "Guidelines for Writing RFC
Text on Security Considerations", BCP 72, RFC 3552, Text on Security Considerations", BCP 72, RFC 3552, July
July 2003. 2003.
[RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K. [RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
Norrman, "The Secure Real-time Transport Protocol (SRTP)", Norrman, "The Secure Real-time Transport Protocol (SRTP)",
RFC 3711, March 2004. RFC 3711, March 2004.
[RFC3760] Gustafson, D., Just, M., and M. Nystrom, "Securely [RFC3760] Gustafson, D., Just, M., and M. Nystrom, "Securely
Available Credentials (SACRED) - Credential Server Available Credentials (SACRED) - Credential Server
Framework", RFC 3760, April 2004. Framework", RFC 3760, April 2004.
[RFC4251] Ylonen, T. and C. Lonvick, "The Secure Shell (SSH) [RFC4251] Ylonen, T. and C. Lonvick, "The Secure Shell (SSH)
skipping to change at page 23, line 38 skipping to change at page 23, line 25
[RFC4347] Rescorla, E. and N. Modadugu, "Datagram Transport Layer [RFC4347] Rescorla, E. and N. Modadugu, "Datagram Transport Layer
Security", RFC 4347, April 2006. Security", RFC 4347, April 2006.
[RFC4568] Andreasen, F., Baugher, M., and D. Wing, "Session [RFC4568] Andreasen, F., Baugher, M., and D. Wing, "Session
Description Protocol (SDP) Security Descriptions for Media Description Protocol (SDP) Security Descriptions for Media
Streams", RFC 4568, July 2006. Streams", RFC 4568, July 2006.
[RFC5245] Rosenberg, J., "Interactive Connectivity Establishment [RFC5245] Rosenberg, J., "Interactive Connectivity Establishment
(ICE): A Protocol for Network Address Translator (NAT) (ICE): A Protocol for Network Address Translator (NAT)
Traversal for Offer/Answer Protocols", RFC 5245, Traversal for Offer/Answer Protocols", RFC 5245, April
April 2010. 2010.
[RFC5479] Wing, D., Fries, S., Tschofenig, H., and F. Audet, [RFC5479] Wing, D., Fries, S., Tschofenig, H., and F. Audet,
"Requirements and Analysis of Media Security Management "Requirements and Analysis of Media Security Management
Protocols", RFC 5479, April 2009. Protocols", RFC 5479, April 2009.
[RFC5763] Fischl, J., Tschofenig, H., and E. Rescorla, "Framework [RFC5763] Fischl, J., Tschofenig, H., and E. Rescorla, "Framework
for Establishing a Secure Real-time Transport Protocol for Establishing a Secure Real-time Transport Protocol
(SRTP) Security Context Using Datagram Transport Layer (SRTP) Security Context Using Datagram Transport Layer
Security (DTLS)", RFC 5763, May 2010. Security (DTLS)", RFC 5763, May 2010.
[RFC6189] Zimmermann, P., Johnston, A., and J. Callas, "ZRTP: Media [RFC6189] Zimmermann, P., Johnston, A., and J. Callas, "ZRTP: Media
Path Key Agreement for Unicast Secure RTP", RFC 6189, Path Key Agreement for Unicast Secure RTP", RFC 6189,
April 2011. April 2011.
[RFC6222] Begen, A., Perkins, C., and D. Wing, "Guidelines for [RFC6222] Begen, A., Perkins, C., and D. Wing, "Guidelines for
Choosing RTP Control Protocol (RTCP) Canonical Names Choosing RTP Control Protocol (RTCP) Canonical Names
(CNAMEs)", RFC 6222, April 2011. (CNAMEs)", RFC 6222, April 2011.
[RFC6454] Barth, A., "The Web Origin Concept", RFC 6454, [RFC6454] Barth, A., "The Web Origin Concept", RFC 6454, December
December 2011. 2011.
[RFC6455] Fette, I. and A. Melnikov, "The WebSocket Protocol", [RFC6455] Fette, I. and A. Melnikov, "The WebSocket Protocol", RFC
RFC 6455, December 2011. 6455, December 2011.
[SWF] Adobe, "SWF File Format Specification Version 19". [SWF] Adobe, , "SWF File Format Specification Version 19", April
2013.
[abarth-rtcweb] [abarth-rtcweb]
Barth, A., "Prompting the user is security failure", RTC- Barth, A., "Prompting the user is security failure", RTC-
Web Workshop. Web Workshop, September 2010.
[cranor-wolf] [cranor-wolf]
Sunshine, J., Egelman, S., Almuhimedi, H., Atri, N., and Sunshine, J., Egelman, S., Almuhimedi, H., Atri, N., and
L. Cranor, "Crying Wolf: An Empirical Study of SSL Warning L. Cranor, "Crying Wolf: An Empirical Study of SSL Warning
Effectiveness", Proceedings of the 18th USENIX Security Effectiveness", Proceedings of the 18th USENIX Security
Symposium, 2009. Symposium, 2009, August 2009.
[farus-conversion] [farus-conversion]
Farrus, M., Erro, D., and J. Hernando, "Speaker Farrus, M., Erro, D., and J. Hernando, "Speaker
Recognition Robustness to Voice Conversion". Recognition Robustness to Voice Conversion", January 2008.
[finer-grained] [finer-grained]
Barth, A. and C. Jackson, "Beware of Finer-Grained Barth, A. and C. Jackson, "Beware of Finer-Grained
Origins", W2SP, 2008. Origins", W2SP, 2008, July 2008.
[huang-w2sp] [huang-w2sp]
Huang, L-S., Chen, E., Barth, A., Rescorla, E., and C. Huang, L-S., Chen, E., Barth, A., Rescorla, E., and C.
Jackson, "Talking to Yourself for Fun and Profit", W2SP, Jackson, "Talking to Yourself for Fun and Profit", W2SP,
2011. 2011, May 2011.
[kain-conversion] [kain-conversion]
Kain, A. and M. Macon, "Design and Evaluation of a Voice Kain, A. and M. Macon, "Design and Evaluation of a Voice
Conversion Algorithm based on Spectral Envelope Mapping Conversion Algorithm based on Spectral Envelope Mapping
and Residual Prediction", Proceedings of ICASSP, May and Residual Prediction", Proceedings of ICASSP, May 2001,
2001. May 2001.
[whitten-johnny] [whitten-johnny]
Whitten, A. and J. Tygar, "Why Johnny Can't Encrypt: A Whitten, A. and J. Tygar, "Why Johnny Can't Encrypt: A
Usability Evaluation of PGP 5.0", Proceedings of the 8th Usability Evaluation of PGP 5.0", Proceedings of the 8th
USENIX Security Symposium, 1999. USENIX Security Symposium, 1999, August 1999.
Author's Address Author's Address
Eric Rescorla Eric Rescorla
RTFM, Inc. RTFM, Inc.
2064 Edgewood Drive 2064 Edgewood Drive
Palo Alto, CA 94303 Palo Alto, CA 94303
USA USA
Phone: +1 650 678 2350 Phone: +1 650 678 2350
Email: ekr@rtfm.com Email: ekr@rtfm.com
 End of changes. 49 change blocks. 
130 lines changed or deleted 138 lines changed or added
